CN103796015A

CN103796015A - Quantization parameter differential coding adaptive to the number of coefficients

Info

Publication number: CN103796015A
Application number: CN201210425765.0A
Authority: CN
Inventors: 朱洪波
Original assignee: Individual
Current assignee: Individual
Priority date: 2012-10-31
Filing date: 2012-10-31
Publication date: 2014-05-14

Abstract

The invention discloses a method for coding the quantization parameter (QP) of prediction-residual transform coefficients in a video signal. In the method, the presence of a dQP selection signal or a quantization-matrix selection signal is determined by the number of quantized transform coefficients, and the signal is transmitted after those coefficients. The encoder and the decoder agree on a constant threshold. If the number of non-zero quantized coefficients in several consecutive transform blocks reaches the threshold, the encoder, after encoding those non-zero quantized coefficients, writes into the bitstream a selection signal for the dQP or the quantization matrix, which indicates the QP or the quantization matrix used by those consecutive transform blocks. The decoder can therefore count the non-zero quantized coefficients and correctly decode the selection signal for the dQP or the quantization matrix.

Description

Quantization parameter differential coding adaptive to the number of coefficients

Technical field

The invention belongs to the field of digital video compression and specifically relates to coding the quantization parameter of the prediction-residual transform coefficients of a video signal.

Background technology

Digital video is obtained by sampling a natural scene, which is continuous in time and space, in both the temporal and spatial domains. As shown in Fig. 1, digital video consists of a series of video frames in the time domain. Each frame is a spatial sampling of the natural scene at a given instant and is made up of uniformly sampled two-dimensional pixels. Each pixel is described by a set of numbers giving its brightness and color. The most widely used format in video coding is YUV, in which each pixel consists of one luminance component Y and two chrominance components U and V. The U and V components are usually downsampled once in both the horizontal and vertical directions, so that every 4 adjacent pixels share 1 U and 1 V component. This is the YUV 4:2:0 format.
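As a small illustration of the 4:2:0 layout described above, the following sketch counts the samples in one frame. The function name `yuv420_sample_counts` is illustrative, and the sketch assumes even frame dimensions so that the chroma planes divide exactly.

```python
def yuv420_sample_counts(width: int, height: int) -> dict:
    """Return the number of Y, U and V samples in one YUV 4:2:0 frame.

    Assumption: width and height are even, so each 2x2 group of luma
    samples maps to exactly one U and one V sample.
    """
    if width % 2 or height % 2:
        raise ValueError("this sketch requires even width and height")
    luma = width * height
    # U and V are each downsampled by 2 horizontally and vertically.
    chroma = (width // 2) * (height // 2)
    return {"Y": luma, "U": chroma, "V": chroma}


counts = yuv420_sample_counts(1920, 1080)
total = counts["Y"] + counts["U"] + counts["V"]
```

The total is 1.5 times the luma sample count, which is why 4:2:0 halves the raw data volume compared with keeping U and V at full resolution.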

The most widely used video coding technique is block-based hybrid motion-compensated DCT transform coding, of which the most representative example is HEVC, the international standard being jointly developed by MPEG and VCEG. As shown in Fig. 2, an HEVC encoder first partitions the input frame into NxN blocks (N is a power of 2, with a minimum of 8 and a maximum of 64) called largest coding units (LCUs), and then encodes the LCUs in order from left to right and top to bottom. In HEVC the basic unit of prediction and transform coding is the coding unit (CU). An LCU of size 2Nx2N can be prediction-transform coded directly as a single CU, or it can be split in quadtree fashion into 4 units of size NxN. Each NxN unit can in turn either be coded as a CU or be split further, again in quadtree fashion, into 4 smaller units. Fig. 3 shows an example partition of an LCU: the LCU is split into 4 CUs of equal size, and the first, third and fourth of these are split further. The minimum CU size is 8x8 and the maximum equals the LCU size. For each CU to be encoded, a prediction of the current block is first computed from already-encoded reconstructed frames and subtracted from the current block. The residual is DCT-transformed and quantized. It is then inverse-quantized and inverse-DCT-transformed to obtain the reconstructed block, which is stored in the reconstructed frame sequence and used to generate prediction signals for CUs encoded later. Because the exact DCT is a floating-point transform, in practice it is generally replaced by an integer approximation of the DCT or of the KLT.
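The recursive quadtree partition of an LCU into CUs can be sketched as follows. This is only an illustration: `split_quadtree` and the `should_split` callback are hypothetical names, and a real encoder would make the split decision by rate-distortion search rather than by a simple predicate.

```python
def split_quadtree(x, y, size, should_split, min_size=8):
    """Recursively partition a square unit into leaf CUs.

    should_split(x, y, size) -> bool is a hypothetical stand-in for the
    encoder's split decision. Returns a list of (x, y, size) leaves,
    visiting the four quadrants in top-left, top-right, bottom-left,
    bottom-right order.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_quadtree(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]


# Split a 64x64 LCU once, then split only its first 32x32 quadrant again.
def demo_decision(x, y, size):
    return size == 64 or (size == 32 and (x, y) == (0, 0))


cus = split_quadtree(0, 0, 64, demo_decision)
```

Whatever decisions are made, the leaves always tile the LCU exactly, so their areas sum to the LCU area.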

There are 2 kinds of CU prediction. The first is intra prediction, which uses only reconstructed pixels of the frame currently being encoded to predict the current CU. The most popular intra prediction technique at present is directional intra prediction, which is performed directly on a CU basis. The angular intra prediction used in HEVC is shown in Fig. 4. In Fig. 4, the white block with a black border is the CU being encoded, and the gray bands to its left and above it are the reconstructed pixels to the upper left of the current CU, which are used to generate the prediction signal for the current CU. Each directional intra prediction mode specifies a prediction direction. For a given row or column of the CU being encoded (shown with a vertical grid in Fig. 4), the corresponding pixels in the upper-left reconstructed blocks are located along the prediction direction, shown as black pixels in Fig. 4, and used as the prediction of the current row or column. For the prediction shown on the right side of Fig. 4, the reconstructed pixels above the left column do not exist, so before the actual prediction starts they must be mapped over from the reconstructed pixels above the current block according to the current intra prediction direction, as shown in Fig. 5. HEVC has 33 different directional prediction modes, as shown in Fig. 6.

The second kind of CU prediction is inter prediction, in which the prediction of the current block is taken from a reconstructed frame that precedes or follows the current frame in time. In HEVC, prediction uses the prediction unit (PU) as its basic unit. A CU of size 2Nx2N has 4 PU partition modes: it can be motion-compensated as a single PU, or be divided into several PUs that are motion-compensated separately, as shown in Fig. 7. For a PU of arbitrary shape, the motion compensation process is shown in Fig. 8, Fig. 9, Fig. 10 and Fig. 11. The frame with display time t is the frame being encoded, and the black block is the block being encoded. The frames with display times t-t0, t-2*t0 and t+t0 are reconstructed frames, and the gray block with a dotted border in each of them is the block at the same spatial position as the current block. The frames with display times t-t0, t-2*t0 and t+t0 can all serve as prediction sources for the current block. In Fig. 8, motion vector MV0 points to the forward prediction block BLK0 of the current block. In Fig. 9, motion vector MV1 points to the backward prediction block BLK1 of the current block. In Fig. 10, motion vector MV points to the forward prediction block BLK0, the opposite direction of MV points to the backward prediction block BLK1, and the average of BLK0 and BLK1 is used as the prediction of the current block. In Fig. 11, motion vector MV0 points to the forward prediction block BLK0, motion vector MV1 points to the backward prediction block BLK1, and the average of BLK0 and BLK1 is used as the prediction of the current block. In forward, backward and symmetric prediction only one set of motion parameters (motion vector and reference picture) needs to be coded, whereas in bi-predictive mode 2 sets must be coded. For each temporal compensation mode, the encoder obtains the optimal motion parameters through rate-distortion-optimized motion estimation and writes them into the bitstream.
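The averaging of BLK0 and BLK1 in symmetric and bidirectional prediction can be sketched as below. The names `average_prediction` and `mirrored_mv` are illustrative, and the round-half-up integer average is an assumption, since the text only says that the average of the two blocks is used.

```python
def mirrored_mv(mv):
    """Symmetric prediction: the backward vector is the opposite of MV."""
    mvx, mvy = mv
    return (-mvx, -mvy)


def average_prediction(blk0, blk1):
    """Average two equally sized prediction blocks sample by sample.

    Assumption: integer samples and rounding half up via (a + b + 1) >> 1.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(blk0, blk1)
    ]


blk0 = [[10, 20], [30, 40]]   # forward prediction block BLK0
blk1 = [[20, 21], [30, 43]]   # backward prediction block BLK1
pred = average_prediction(blk0, blk1)
```

In symmetric mode only MV is coded and the second vector is derived with `mirrored_mv`, which is why that mode needs one set of motion parameters while bidirectional prediction needs two.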

Once the prediction has been obtained, the prediction block is subtracted from the current block to give the residual block. The residual block is then DCT-transformed (or transformed with another orthogonal transform), quantized and entropy-coded. It is subsequently inverse-quantized and inverse-transformed and added to the prediction block to obtain the reconstructed block, which is used for predicting subsequently encoded blocks. In HEVC, transforms are performed on a transform unit (TU) basis. For intra prediction, the TU has the same size as the CU: a CU of size 2Nx2N is transformed directly with a 2Nx2N two-dimensional separable transform. For temporal prediction, variable block-size transforms are used, and the shape of the TU depends on the shape of the PU. In HEVC a TU is never larger than its PU, so a transform never crosses a prediction boundary, although this also reduces the flexibility of the transform. The quadtree-structured variable block-size transform used in HEVC is shown in Fig. 12. A 2Nx2N unit to be transformed has 4 transform partition modes: a single 2Nx2N transform, 2 2NxN transforms, 2 Nx2N transforms, or a split into 4 NxN units. Each of the 4 NxN units can independently apply the quadtree-structured variable block-size transform again. The minimum TU size is 4x4.

Quantizing the transform coefficients requires a quantization parameter (QP), and sometimes a weighting quantization matrix as well. In current HEVC coding, the quantization parameter is transmitted for CUs of a certain size, and this size is coded at the picture or slice level. In other words, when a CU is of size MxM or larger and contains non-zero quantized coefficients, the encoder transmits a quantization parameter difference (dQP). From the previous QP and this dQP, the decoder computes the common QP used by all transform blocks inside that CU. For example, when M is 16, if a CU is 32x32 and contains non-zero transform coefficients, the encoder must transmit a dQP for that CU. If a CU is 8x8 and contains non-zero transform coefficients, the encoder must transmit a dQP for the 16x16 CU that contains it, and all 8x8 CUs inside that 16x16 CU use this dQP. For a CU of size MxM or larger, the dQP always precedes the non-zero transform coefficients in the bitstream. In HEVC, quantization matrices are transmitted at the picture or slice level and are not adaptive at the LCU or CU level.
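To make the link between QP and the number of non-zero coefficients concrete, here is a minimal sketch. It is not HEVC's integer quantizer: it assumes the commonly cited approximation that the step size doubles every 6 QP values, and it truncates toward zero. The names `qstep`, `quantize` and `count_nonzero` are illustrative.

```python
def qstep(qp: int) -> float:
    """Approximate quantization step size (assumed: doubles every 6 QP)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Uniform scalar quantization with truncation toward zero (assumption)."""
    step = qstep(qp)
    return [int(c / step) for c in coeffs]


def count_nonzero(levels):
    """Number of non-zero quantized coefficients in a block."""
    return sum(1 for v in levels if v != 0)


coeffs = [10.0, -3.0, 0.5, 1.9]
fine = quantize(coeffs, 4)     # step size 1.0
coarse = quantize(coeffs, 10)  # step size 2.0
```

A larger QP gives a coarser step, so fewer coefficients survive quantization. The number of non-zero coefficients in a block therefore depends on the QP chosen for it.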

Because the complexity of video content varies greatly, forcing an entire picture or slice to change QP only at one fixed block size is not efficient. The invention describes a more flexible dQP transmission method that can provide higher coding efficiency.

Summary of the invention

The first part of the invention is that the encoder and decoder share a common integer value T, T >= 2, which represents a number of non-zero transform coefficients. This value can be coded at the picture or slice level, or it can be a default value agreed on directly by the encoder and decoder.

The encoder applies the DCT to a number of consecutive blocks of different sizes B_0, B_1, ..., B_{n-1} (n > 1) and then quantizes them. Let CfN(B_i) denote the number of non-zero quantized transform coefficients of block B_i. When the number of non-zero transform coefficients produced by quantizing several consecutive blocks B_0, B_1, ..., B_{m-1} (m > 0) first reaches T, that is, when

$$\sum_{k=0}^{m-1} \mathrm{CfN}(B_k) \ge T$$

and, if m is greater than 1, also

$$\sum_{k=0}^{m-2} \mathrm{CfN}(B_k) < T,$$

the encoder writes a dQP into the bitstream to indicate the QP used by these consecutive blocks. In the bitstream this dQP is coded after all quantized transform coefficients of these blocks, so the decoder can determine exactly whether a dQP is present, and decode it correctly, by accumulating the number of non-zero transform coefficients. In a practical encoder, the encoder can try different QPs when encoding several consecutive blocks and, by computing the rate-distortion cost, obtain the best coding combination that satisfies the condition above and write it into the bitstream.
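The grouping rule for the first part can be sketched as follows, taking the per-block non-zero counts CfN(B_i) as given. In a real encoder these counts depend on the QP being tried, so this is only the bookkeeping step, not the rate-distortion search. The name `group_blocks` is illustrative, and the handling of trailing blocks that never reach T is not specified in the text, so they are simply returned separately.

```python
def group_blocks(nonzero_counts, T):
    """Group consecutive blocks by accumulated non-zero coefficient count.

    A group closes at the first block where the running total reaches T;
    a dQP would be written after that block's coefficients.
    Returns (groups, trailing): groups is a list of lists of block indices,
    trailing holds blocks after the last closed group (their handling is
    an open point in this sketch).
    """
    if T < 2:
        raise ValueError("the text requires T >= 2")
    groups, current, total = [], [], 0
    for idx, count in enumerate(nonzero_counts):
        current.append(idx)
        total += count
        if total >= T:
            groups.append(current)
            current, total = [], 0  # restart the count for the next group
    return groups, current


groups, trailing = group_blocks([1, 0, 2, 5, 0, 1], T=3)
```

With T = 3, blocks 0 to 2 form the first group (counts 1 + 0 + 2), block 3 alone forms the second, and blocks 4 and 5 are left over with a running total of 1.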

In the decoder, the non-zero transform coefficients are decoded block by block. When their accumulated number reaches T, the decoder decodes a dQP and adds it to the previous QP to obtain the common QP used by these blocks. It can then apply inverse quantization and inverse transform to these blocks and add the prediction blocks to obtain the reconstructed blocks. The decoder then goes on to decode the transform coefficients of the next group of blocks and restarts the count to determine where the next dQP is decoded.
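The decoder-side counting can be sketched in the same spirit. Entropy decoding is abstracted away: the sketch assumes the quantized coefficient levels of each block and the dQP values are already available in bitstream order. `decode_group_qps` and `initial_qp` are illustrative names, and treating the previous group's QP as the "previous QP" is an assumption.

```python
def decode_group_qps(block_levels, dqp_values, T, initial_qp):
    """Assign a QP to each group of consecutive blocks.

    block_levels: per-block lists of quantized coefficient levels.
    dqp_values:   dQP values in the order they appear in the bitstream.
    Returns (assignments, pending): assignments is a list of
    (block_indices, qp) and pending holds blocks still waiting for a dQP.
    """
    dqps = iter(dqp_values)
    qp = initial_qp
    assignments, pending, count = [], [], 0
    for idx, levels in enumerate(block_levels):
        pending.append(idx)
        count += sum(1 for v in levels if v != 0)
        if count >= T:
            qp += next(dqps)  # dQP follows the coefficients of this block
            assignments.append((pending, qp))
            pending, count = [], 0
    return assignments, pending


blocks = [[3, 0, 0, 0], [0, -1, 0, 2], [1, 1, 1, 0], [0, 0, 0, 0]]
assignments, pending = decode_group_qps(blocks, [2, -1], T=3, initial_qp=26)
```

Because the dQP arrives after the coefficients, inverse quantization of a group has to wait until the group closes, which is the buffering cost of this scheme.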

The second part of the invention is that the encoder and decoder also share a common integer value Q, Q >= 2, which represents a number of non-zero transform coefficients. This value can be coded at the picture or slice level, or it can be a default value agreed on directly by the encoder and decoder. The encoder and decoder also share several quantization matrices QM_0, QM_1, ..., QM_{t-1} (t > 1). These matrices can be partly or entirely coded at the picture or slice level, or partly or entirely agreed on directly by the encoder and decoder.

The encoder applies the DCT to a number of consecutive blocks of different sizes B_0, B_1, ..., B_{n-1} (n > 1) and then quantizes these blocks with one weighting quantization matrix QM_i and one QP. Let CfN(B_i) denote the number of non-zero quantized transform coefficients of block B_i. When the number of non-zero transform coefficients produced by quantizing several consecutive blocks B_0, B_1, ..., B_{m-1} (m > 0) first reaches Q, that is, when

$$\sum_{k=0}^{m-1} \mathrm{CfN}(B_k) \ge Q$$

and, if m is greater than 1, also

$$\sum_{k=0}^{m-2} \mathrm{CfN}(B_k) < Q,$$

the encoder writes into the bitstream an index identifying QM_i, to indicate that the weighting quantization matrix used by these consecutive blocks is QM_i. In the bitstream this QM_i index is coded after all quantized transform coefficients of these blocks, so the decoder can determine exactly whether a QM_i index is present, and decode it correctly, by accumulating the number of non-zero transform coefficients. In a practical encoder, the encoder can try different QM_i on several consecutive blocks and, by computing the rate-distortion cost, obtain the best coding combination that satisfies the condition above and write it into the bitstream.

In the decoder, the non-zero transform coefficients are decoded block by block. When the accumulated number of non-zero transform coefficients of several consecutive transform blocks first becomes greater than or equal to Q, the decoder decodes a QM_i index to obtain the common weighting quantization matrix used by these blocks. It can then apply inverse quantization and inverse transform to these blocks and add the prediction blocks to obtain the reconstructed blocks. Decoding of the next group of transform blocks then begins.

The first part and the second part of the invention can each be applied independently in a codec, or both can be applied in the same codec at once. The latter yields a larger performance gain.
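The text does not spell out how the two counts interact when both parts are used together. One plausible reading, assumed in the sketch below, is two independent running counts, one compared with T and one with Q, each restarted when its own signal is coded. The name `signal_positions` and the choice to list the dQP before the matrix index when both fall on the same block are assumptions.

```python
def signal_positions(nonzero_counts, T, Q):
    """List where a dQP or a quantization-matrix index would be coded.

    Assumption: the dQP count and the QM count run independently.
    Returns a list of (block_index, kind) with kind in {"dQP", "QM"},
    meaning the signal follows the coefficients of that block.
    """
    events = []
    t_count = q_count = 0
    for idx, count in enumerate(nonzero_counts):
        t_count += count
        q_count += count
        if t_count >= T:
            events.append((idx, "dQP"))
            t_count = 0
        if q_count >= Q:
            events.append((idx, "QM"))
            q_count = 0
    return events


# T = 3 and Q = 8 are the values used in the embodiment.
events = signal_positions([2, 1, 4, 0, 3, 9], T=3, Q=8)
```

With these thresholds the QP can change roughly every few coefficients while the matrix changes less often, which matches the idea that the matrix index is the more expensive signal.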

Description of the drawings

Fig. 1 is an example of digital video, where label 1 denotes temporal sampling and label 2 denotes spatial sampling.

Fig. 2 is a block-based motion-compensated DCT transform video encoder, where label 1 denotes partitioning into LCUs, label 2 denotes intra prediction information, label 3 denotes motion parameter information, label 4 denotes control information, label 5 denotes quantized DCT coefficients, and label 6 denotes the coded bitstream.

Fig. 3 shows an LCU partitioned into CUs by recursive quadtree splitting.

Fig. 4 is a schematic diagram of angular intra prediction.

Fig. 5 is a schematic diagram of the mapping used to compute unavailable upper-left pixels in angular intra prediction.

Fig. 6 is a schematic diagram of all directional modes of angular intra prediction.

Fig. 7 shows all the ways a CU can be partitioned into PUs.

Fig. 8 is a schematic diagram of temporal forward prediction.

Fig. 9 is a schematic diagram of temporal backward prediction.

Fig. 10 is a schematic diagram of temporal symmetric prediction.

Fig. 11 is a schematic diagram of temporal bidirectional prediction.

Fig. 12 is a schematic diagram of transform partitioning in HEVC, in which each NxN block can be partitioned further in the same way.

Fig. 13 is a schematic diagram of a Laplacian quantization matrix.

Fig. 14 is the video encoder of the invention.

Fig. 15 is the video decoder of the invention.

Embodiments

Fig. 13 shows a schematic diagram of a Laplacian quantization matrix. The element at coordinate (i, j) is ρ^t, where t = min(7, i+j). In this embodiment the encoder and decoder agree on 8 fixed quantization matrices, numbered 0 to 7, where matrix number k has ρ = 1.03041 × 1.02123^k. The invention does not, however, restrict the form of the quantization matrices. The encoder and decoder further agree on T = 3 and Q = 8. The invention is not limited to these values. It only requires that T and Q be greater than or equal to 2.
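The 8 Laplacian quantization matrices of this embodiment can be generated directly from the formula above. In the sketch below, `laplacian_matrix` is an illustrative name and the 8x8 size is an assumption suggested by t = min(7, i+j). How the entries are applied during quantization is not detailed in the text and is left out.

```python
def laplacian_matrix(number: int, size: int = 8):
    """Build matrix `number` (0 to 7) with entries rho ** min(7, i + j).

    rho = 1.03041 * 1.02123 ** number, as given in the embodiment.
    Assumption: the matrix is size x size with size = 8 by default.
    """
    if not 0 <= number <= 7:
        raise ValueError("the embodiment defines matrices 0 to 7")
    rho = 1.03041 * 1.02123 ** number
    return [[rho ** min(7, i + j) for j in range(size)] for i in range(size)]


matrices = [laplacian_matrix(k) for k in range(8)]
qm0 = matrices[0]
```

Entries are constant along each anti-diagonal and grow toward higher frequencies, and the exponent is capped at 7, so every position with i+j >= 7 shares the same weight.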

Fig. 14 shows a video encoder incorporating the invention. The encoder first divides the input video frame into LCUs and then encodes the LCUs sequentially from left to right and top to bottom. Each LCU is partitioned into CUs, which are then encoded. Through the selection of prediction and transform modes, the encoder obtains a series of DCT-transformed blocks that need to be quantized. Following the algorithm described in this patent, it then tries different QPs and quantization matrices on these blocks, quantizing with the weighting quantization matrices. By computing the rate-distortion cost, it selects the best combination that satisfies the decoding condition and writes it into the bitstream.

Fig. 15 shows a video decoder incorporating the invention. It is the inverse of Fig. 14. The decoder decodes each LCU from left to right and top to bottom. It first obtains the information about block sizes and order and then decodes the blocks one by one. When the accumulated number of non-zero quantized coefficients reaches the corresponding fixed threshold, the decoder decodes the corresponding dQP or quantization matrix index and obtains the quantization parameters of those blocks. It then applies inverse quantization and inverse transform and adds the prediction signal to obtain the reconstructed blocks.

Claims

1. A video encoder quantizes a group of transform blocks of different sizes B_0, B_1, ..., B_{n-1} (n > 1) using one QP. Let CfN(B_i) denote the number of non-zero quantized coefficients of block B_i. When $\sum_{k=0}^{m-1} \mathrm{CfN}(B_k) \ge T$ and, if m is greater than 1, also $\sum_{k=0}^{m-2} \mathrm{CfN}(B_k) < T$, then after the first m blocks have been coded into the bitstream the encoder writes a dQP into the bitstream, indicating that the preceding m consecutive blocks used this QP. The encoder applies different QPs to consecutive transform blocks and then, by computing the rate-distortion cost, selects the best combination that satisfies the decoding condition and codes it into the bitstream.

2. As claimed in claim 1, T is a value shared by the encoder and decoder, with T >= 2. T can be a default convention or can be coded at the picture or slice level.

3. A video decoder corresponding to claim 1 decodes transform blocks consecutively. Once the accumulated number of non-zero quantized coefficients is greater than or equal to T, the decoder reads a dQP from the bitstream and then computes the QP of this group of coefficients. The decoder then starts decoding the next group of transform blocks and restarts the count of non-zero quantized coefficients to determine where the next dQP is decoded.

4. The encoder and decoder share several weighting quantization matrices QM_0, QM_1, ..., QM_{t-1} (t > 1). The quantization matrices can be partly or entirely coded at the picture or slice level, or partly or entirely set by a default convention between the encoder and decoder. A video encoder quantizes a group of transform blocks of different sizes B_0, B_1, ..., B_{n-1} (n > 1) using one QM_i. Let CfN(B_i) denote the number of non-zero quantized coefficients of block B_i. When

$$\sum_{k=0}^{m-1} \mathrm{CfN}(B_k) \ge Q$$

and, if m is greater than 1, also

$$\sum_{k=0}^{m-2} \mathrm{CfN}(B_k) < Q,$$

then after the first m blocks have been coded into the bitstream the encoder writes into the bitstream an index identifying QM_i, indicating that the preceding m consecutive blocks used this QM_i. The encoder applies different quantization matrices to consecutive transform blocks and then, by computing the rate-distortion cost, selects the best combination that satisfies the decoding condition and codes it into the bitstream.

5. As claimed in claim 4, Q is a value shared by the encoder and decoder, with Q >= 2. Q can be a default convention or can be coded at the picture or slice level.

6. A video decoder corresponding to claim 4 decodes transform blocks consecutively. Once the accumulated number of non-zero quantized coefficients is greater than or equal to Q, the decoder reads an index from the bitstream to determine the weighting quantization matrix used. The decoder then starts decoding the next group of transform blocks and restarts the count of non-zero quantized coefficients to determine where the next weighting quantization matrix index is decoded.