WO2017093188A1 - Encoding and decoding of pictures in a video - Google Patents


Info

Publication number
WO2017093188A1
Authority
WO
WIPO (PCT)
Prior art keywords
samples
block
color component
sample values
quantization parameter
Prior art date
Application number
PCT/EP2016/079007
Other languages
French (fr)
Inventor
Kenneth Andersson
Martin Pettersson
Per Hermansson
Jacob STRÖM
Jonatan Samuelsson
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2017093188A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/124: Quantisation (adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding)
    • H04N 19/176: adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H04N 19/186: adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N 19/463: embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H04N 19/503: predictive coding involving temporal prediction
    • H04N 19/593: predictive coding involving spatial prediction techniques
    • H04N 19/61: transform coding in combination with predictive coding

Abstract

There are provided mechanisms for encoding a picture of a video sequence in a video bitstream. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The method comprises determining a frequency distribution of the sample values of at least one color component. The method comprises calculating a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values of at least one color component in a second block of samples. The second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples, and predicted sample values for the first block of samples. The method comprises quantizing at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.

Description

ENCODING AND DECODING OF PICTURES IN A VIDEO TECHNICAL FIELD
Embodiments herein relate to the field of video coding, such as High Efficiency Video Coding (HEVC) or the like. In particular, embodiments herein relate to a method and an encoder for encoding a picture of a video sequence in a video bitstream. Embodiments herein also relate to a method and a decoder for decoding a video bitstream comprising an encoded video sequence. Corresponding computer programs are also disclosed. BACKGROUND
High Dynamic Range (HDR) with Wide Color Gamut (WCG) has become an increasingly hot topic within the TV and multimedia industry in the last couple of years. While screens capable of displaying HDR video signals are emerging on the consumer market, over-the-top (OTT) players such as Netflix have announced that HDR content will be delivered to end-users. Standardization bodies are working on specifying the requirements for HDR. For instance, in the roadmap for DVB, UHDTV1 phase 2 will include HDR support. MPEG is currently exploring how HDR video could be compressed.
HDR imaging is a set of techniques within photography that allows for a greater dynamic range of luminosity compared to standard digital imaging. Dynamic range in digital cameras is typically measured in f-stops, where one f-stop means a doubling of the amount of light. A standard LCD HDTV using Standard Dynamic Range (SDR) can display 10 f-stops or fewer, whereas HDR is defined by MPEG to have a dynamic range of over 16 f-stops. WCG increases the color fidelity from ITU-R BT.709 towards ITU-R BT.2020 so that more of the visible colors can be captured and displayed.
HDR is defined for UHDTV in ITU-R Recommendation BT.2020, while SDR is defined for HDTV in ITU-R Recommendation BT.709.
A color model is a mathematical model that defines the possible colors that can be represented using a predefined number of components. Examples of color models are RGB, Y'CbCr 4:2:0 (also called YUV 4:2:0) and CIE 1931.
A picture element (pixel for short) is the smallest element of a digital image and holds the luminance and color information of that element. The luminance and color can be expressed in different ways depending on the use case. Displays usually have three color elements, red, green and blue, which are lit at different intensities depending on what color and luminance is to be displayed. It is therefore convenient to send the pixel information in RGB pixel format to the display. Since the signal is digital, the intensity of each component of the pixel must be represented with a fixed number of bits, referred to as the bit depth of the component. A bit depth of n can represent 2^n different values, e.g. 256 values per component for 8 bits and 1024 values per component for 10 bits. When video needs to be compressed, it is convenient to express the luminance and color information of the pixel with one luminance component and two color components. This is done because the human visual system (HVS) is more sensitive to luminance than to color, meaning that luminance can be represented with higher accuracy than color. One commonly used format that allows for this separation is Y'CbCr 4:2:0 (also called YUV 4:2:0), where the Cb and Cr components have quarter resolution compared to the Y' component. When encoding video, a non-linear gamma transfer function is typically applied to the linear RGB samples to obtain the non-linear R'G'B' representation, and then a 3x3 matrix multiplication is applied to get to Y'CbCr. The resulting Y' component is referred to as luma, which is roughly equal to luminance. The true luminance is instead obtained by converting the linear RGB samples to XYZ in the CIE 1931 color space using a 3x3 matrix operation; the luminance is the Y coordinate of this XYZ vector. Sometimes a function of the Y coordinate is also referred to as luminance, for instance when a transfer function has been applied to Y.
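The transfer-function-plus-matrix chain described above can be sketched as follows. This is a simplified illustration, not any standard's exact arithmetic: it uses a plain power-law in place of the piecewise BT.709 OETF, and the BT.709 luma coefficients are an assumption of this sketch.

```python
def rgb_to_ycbcr(r, g, b, gamma=1 / 2.4):
    # Apply a simple power-law transfer function to linear RGB in [0, 1]
    # (a stand-in for the actual piecewise OETF).
    rp, gp, bp = r ** gamma, g ** gamma, b ** gamma
    # Luma is a weighted sum of the non-linear components (BT.709 weights);
    # Cb and Cr are scaled blue- and red-difference signals.
    y = 0.2126 * rp + 0.7152 * gp + 0.0722 * bp
    cb = (bp - y) / 1.8556
    cr = (rp - y) / 1.5748
    return y, cb, cr
```

For a neutral gray input the chroma components are zero, which is why near-gray blocks carry almost all their information in the luma component.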
Likewise, the Cb and Cr components of Y'CbCr 4:2:0 together are called chroma, which is similar to but different from chrominance. To get the chrominance, the X and Z coordinates of CIE 1931 are used. One chrominance representation is the coordinates (x, y), where x = X/(X+Y+Z) and y = Y/(X+Y+Z). Y'CbCr is not the only representation that attempts to separate luminance from chrominance; other formats exist as well, such as YDzDx, which is based on XYZ. However, Y'CbCr is the most commonly used representation. Before displaying samples, the chroma components are first upsampled to 4:4:4, i.e. to the same resolution as the luma, and then the luma and chroma are converted to R'G'B' and further to the linear domain before being displayed.

High Efficiency Video Coding (HEVC) is a block-based video codec standardized by ITU-T and MPEG that utilizes both temporal and spatial prediction. Spatial prediction is achieved using intra (I) prediction within the current frame. Temporal prediction is achieved using inter (P) or bi-directional inter (B) prediction on block level from previously decoded reference pictures. The difference between the original pixel data and the predicted pixel data, referred to as a residual, is transformed into the frequency domain and quantized before being entropy coded and transmitted together with necessary prediction parameters such as mode selections and motion vectors. By quantizing the transformed residuals, the tradeoff between bitrate and quality of the video may be controlled. The level of quantization is determined by the quantization parameter (QP), which is a key tool for controlling the quality/bitrate of the residual in video coding. It is applied such that it controls the fidelity of the residual (typically the transform coefficients) and thus also controls the amount of coding artifacts.
When QP is high, the transform coefficients are quantized coarsely, resulting in fewer bits but also possibly more coding artifacts than when QP is low and the transform coefficients are quantized finely. A low QP thus generally results in high quality and a high QP in low quality. In HEVC v1 (and similarly in H.264/AVC), the quantization parameter can be controlled at picture, slice or block level. At picture and slice level it can be controlled individually for each color component. In HEVC v2, the quantization parameter for chroma can additionally be controlled individually for the chroma components at block level.
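To make the coarse/fine relationship concrete: in HEVC the quantization step size roughly doubles for every increase of 6 in QP. The sketch below uses that relationship with a bare scalar quantizer; real encoders add rounding offsets and rate-distortion-optimized quantization, so this is illustrative only.

```python
def quant_step(qp):
    # Step size approximately doubles every 6 QP: Qstep ~ 2^((QP - 4) / 6),
    # so QP 4 gives a step of 1 and QP 10 gives a step of 2.
    return 2 ** ((qp - 4) / 6)

def quantize(coeff, qp):
    # Plain scalar quantization of a single transform coefficient.
    return round(coeff / quant_step(qp))
```

At QP 22 the step size is 8, so a coefficient of 96 is sent as level 12; at QP 28 the step is 16 and the same coefficient is sent as level 6, using fewer bits at the cost of a larger reconstruction error.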
It is known from the state of the art that the QP can be controlled based on the local luma level, such that a finer quantization, i.e. a lower QP, is used for blocks with high local luma levels or small variations in local luma levels than for blocks with low local luma levels or large variations in local luma levels. The reason is that it is better to spend bits in smooth areas, where errors are more visible, than in highly textured areas, where errors are masked. Similarly, it is easier to spot errors at high luminance levels than at low luminance levels, and since luma is often a good predictor for luminance, this works.
HEVC by default uses a uniform reconstruction quantization (URQ) scheme that quantizes all frequencies equally. HEVC also has the option of using quantization scaling matrices (also referred to as scaling lists), either default ones or quantization scaling matrices that are signaled as scaling list data in the SPS or PPS. To reduce the memory needed for storage, scaling matrices can only be specified for 4x4 and 8x8 transforms. For the larger transform sizes of 16x16 and 32x32, the signaled 8x8 matrix is applied by having blocks of 2x2 and 4x4 coefficients, respectively, share the same scaling value, except at the DC positions.
A scaling matrix, with an individual scaling factor for each transform coefficient, can be used to achieve a different quantization effect per transform coefficient by scaling the transform coefficients individually with their respective scaling factors as part of the quantization. This enables, for example, stronger quantization of higher-frequency transform coefficients than of lower-frequency transform coefficients. In HEVC, default scaling matrices are defined for each transform size and can be invoked by flags in the Sequence Parameter Set (SPS) and/or the Picture Parameter Set (PPS). Scaling matrices also exist in H.264. In HEVC it is also possible to define custom scaling matrices in the SPS or PPS for each combination of color component, transform size and prediction type (intra or inter mode). SUMMARY
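A minimal sketch of frequency-dependent quantization with a scaling matrix might look like this. The normalization to 16 mirrors HEVC, where a flat matrix of all 16s reproduces uniform quantization, but the function is an illustration of the idea rather than the standard's exact integer arithmetic.

```python
def quantize_block(coeffs, scaling, step):
    # Each coefficient is divided by the base step size multiplied by its
    # individual scaling factor (normalized so that a factor of 16 leaves
    # the base step unchanged). Larger factors -> coarser quantization.
    return [[round(c / (step * s / 16)) for c, s in zip(row, srow)]
            for row, srow in zip(coeffs, scaling)]
```

With a flat matrix every coefficient sees the same step; raising the factors in the high-frequency corner suppresses those coefficients more aggressively.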
The problem with current solutions is that they lack the flexibility to address fine-granular changes and adaptations of the coding/decoding with respect to the statistical characteristics of a video sequence. With the introduction of High Dynamic Range content, more variation of sample values is present than for Standard Dynamic Range.
The basic idea of the invention is to provide more flexibility to coding/decoding to be able to address a larger range of variations of sample values that exist in High Dynamic Range content. However, this could also be beneficial to coding of Standard Dynamic Range content.
This and other objectives are met by embodiments as disclosed herein. A first aspect of the embodiments defines a method for encoding a picture of a video sequence in a video bitstream. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The method comprises calculating a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples. The second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. The method comprises quantizing at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter. A second aspect of the embodiments defines an encoder for encoding a picture of a video sequence in a video bitstream. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The encoder comprises processing means operative to calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples, wherein the second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. 
The encoder comprises processing means operative to quantize at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter. A third aspect of the embodiments defines a computer program for encoding a picture of a video sequence in a video bitstream. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The computer program comprises code means which, when run on a computer, causes the computer to calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples, wherein the second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. The computer program comprises code means which, when run on a computer, causes the computer to quantize at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.
A fourth aspect of the embodiments defines a computer program product comprising computer readable means and a computer program, according to the third aspect, stored on the computer readable means.
A fifth aspect of the embodiments defines a method for decoding a video bitstream comprising an encoded video sequence. The encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component. The method comprises calculating a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples. The second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. The method comprises inverse quantizing at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
A sixth aspect of the embodiments defines a decoder for decoding a video bitstream comprising an encoded video sequence. The encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples, wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component. The decoder comprises processing means operative to calculate a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples. The second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. The decoder comprises processing means operative to inverse quantize at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
A seventh aspect of the embodiments defines a computer program for decoding a video bitstream comprising an encoded video sequence. The encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples, wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component. The computer program comprises code means which, when run on a computer, causes the computer to calculate a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples. The second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. The computer program comprises code means which, when run on a computer, causes the computer to inverse quantize at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
An eighth aspect of the embodiments defines a computer program product comprising computer readable means and a computer program, according to the seventh aspect, stored on the computer readable means.
Advantageously, at least some of the embodiments provide higher compression efficiency. It is to be noted that any feature of the first, second, third, fourth, fifth, sixth, seventh and eighth aspects may be applied to any other aspect, whenever appropriate. Likewise, any advantage of the first aspect may equally apply to the second, third, fourth, fifth, sixth, seventh and eighth aspects respectively, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims and from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:
Figs. 1 (A) and (B) illustrate first and second blocks of samples for inter and intra prediction respectively, according to embodiments of the present invention.
Fig. 2 illustrates a flowchart of a method of encoding a picture of a video sequence, according to embodiments of the present invention.
Fig. 3 illustrates a flowchart of a method of decoding a video bitstream comprising an encoded video sequence, according to embodiments of the present invention. Fig. 4 depicts a schematic block diagram illustrating functional units of an encoder for encoding a picture of a video sequence according to embodiments of the present invention.
Fig. 5 illustrates a schematic block diagram illustrating a computer comprising a computer program product with a computer program for encoding a picture of a video sequence, according to embodiments of the present invention. Fig. 6 depicts a schematic block diagram illustrating functional units of a decoder for decoding a video bitstream comprising an encoded video sequence, according to an embodiment of the present invention.
Fig. 7 illustrates a schematic block diagram illustrating a computer comprising a computer program product with a computer program for decoding a video bitstream comprising an encoded video sequence, according to an embodiment of the present invention.
DETAILED DESCRIPTION
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the art to make and use the invention. Throughout the drawings, the same reference numbers are used for similar or corresponding elements.
Throughout the description, the terms "video", "video sequence", "input video" and "source video" are used interchangeably.
Even though the description of the invention is based on the HEVC codec, it is to be understood by a person skilled in the art that the invention could be applied to any other state-of-the-art or future video coding standard.
The present embodiments generally relate to a method and an encoder for encoding a picture of a video sequence in a bitstream, as well as a method and a decoder for decoding a video bitstream comprising an encoded video sequence.
According to one aspect, a method for encoding a picture of a video sequence in a video bitstream is provided, as shown in Fig. 2. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The method comprises a step S1 of calculating a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples. The second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. The statistics used for calculating a quantization parameter for at least one color component in the first block of samples may be based on at least one of: the average, median, minimum, maximum or a quantile of previously reconstructed or predicted sample values, and the color component variation of reconstructed or predicted sample values, calculated from the sample values of at least one color component in the second block of samples. According to a first embodiment of the present invention, a quantization parameter (QP) for at least one color component in the first block of samples is derived from an average and/or variation of sample values from at least the same color component in the second block of samples. For example, a quantization parameter for a luma color component in the first block of samples may be calculated from the statistics of the sample values of a luma color component in the second block of samples.
Moreover, according to this embodiment, a quantization parameter for a chroma color component in the first block of samples may be calculated from the statistics from the sample values from any of the: luma color component, (same) chroma color component or both luma and (same) chroma color component in the second block of samples. Taking into account the sample values from the luma color component when calculating the quantization parameter for the chroma color component may be especially advantageous when there is a risk of chrominance artifacts. This may happen when the sample values in a neighboring block of samples are very close to white or light gray, in which case it is important to have the chroma component well preserved. In this situation, the encoder may encode the first block of samples using a lower chroma QP parameter, which results in better preserving the chroma and avoiding the chrominance artifacts in the current (first) block of samples. This lower chroma QP can be obtained by decreasing a default QP, defined e.g. by a decoding process specification, by a value that is calculated from the luma color component.
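The white/light-gray heuristic above can be sketched as follows. The brightness threshold, the near-gray tolerance and the QP offset of 6 are all hypothetical choices made for this illustration; the embodiment only requires that the chroma QP be derived by decreasing a default QP by a value calculated from the luma (and possibly chroma) statistics of the second block.

```python
def chroma_qp_for_block(luma, cb, cr, default_qp, bit_depth=10):
    # `luma`, `cb`, `cr` are sample lists from the second (neighboring or
    # reference) block. If the block is bright and its chroma sits near the
    # mid-level (i.e. it is close to white or light gray), lower the chroma
    # QP so chrominance is better preserved in the current block.
    mid = 1 << (bit_depth - 1)
    avg_luma = sum(luma) / len(luma)
    avg_cb = sum(cb) / len(cb)
    avg_cr = sum(cr) / len(cr)
    near_gray = abs(avg_cb - mid) < 16 and abs(avg_cr - mid) < 16
    bright = avg_luma > 0.8 * ((1 << bit_depth) - 1)
    if near_gray and bright:
        return max(0, default_qp - 6)  # offset of 6 is an arbitrary example
    return default_qp
```

Because the decision is computed from already reconstructed samples, a decoder can repeat the same derivation without any per-block signaling.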
For this purpose, a flag may be provided in the video bitstream to enable/disable this approach for each color component or for all color components. In another variant of this embodiment, one may look at the RGB representation of a previously coded/decoded block of samples, the R'G'B' representation, or the chrominance coordinates (x, y) or (u', v'), where u' = 4x/(-2x+12y+3) and v' = 9y/(-2x+12y+3). Here it may be beneficial to decrease or increase the chroma QP or luma QP either when one of these representations indicates that the chrominance is close to the white point, or when it indicates that the chrominance is close to the gamut edge. This lower chroma or luma QP can be determined by decreasing a default chroma or luma QP, defined by a decoding process specification or as given in other embodiments.
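The (u', v') coordinates quoted above are the CIE 1976 uniform chromaticity coordinates, and computing them from (x, y) is direct:

```python
def xy_to_upvp(x, y):
    # CIE 1976 u'v' from CIE 1931 xy, using the formulas given in the text.
    d = -2 * x + 12 * y + 3
    return 4 * x / d, 9 * y / d
```

For example, the D65 white point (x, y) = (0.3127, 0.3290) maps to approximately (u', v') = (0.1978, 0.4683), so a block whose average chromaticity lands near that point would be a candidate for a lowered chroma QP under this variant.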
In case the first block of samples is inter predicted, as shown in Fig. 1 (A), the second block of samples is from a reference picture used for inter prediction of the picture (the reference block in a previously decoded picture). The position of the second block of samples, which in this case is the reference block in the reference picture, can be determined from a reference index, which indicates the reference picture, and a motion vector, which indicates the displacement between the first and the second block of samples. If the motion vector has sub-sample accuracy, the second block of samples may be filtered (interpolated) before, e.g., its average is calculated. In case the first block of samples is intra predicted, the second block of samples is from the same picture as the first block of samples.
In case of intra picture prediction, shown in Fig. 1 (B), the second block of samples may be used before or after in-loop filtering compared to what is described in a decoding process specification, whereas in case of inter picture prediction the second block of samples is typically taken after in-loop filtering has been performed.
The actual QP, i.e. the QP for at least one color component in the first block of samples, is a modification of a default QP known in the prior art. The default QP could be defined in a standard specification (e.g. in the HEVC specification), it may be signaled to the decoder, or it may be a combination of both. One way to determine the actual quantization parameter would be to have a default mapping between luma level and local QP adjustment (and/or between luma level variation and local QP adjustment) defined in a standard specification, i.e. a process that describes how to derive the actual QP for a block from the default QP, where the default QP is the QP that would be used without the mapping, for example a picture or slice QP, and to allow overriding this default mapping by signaling a new mapping, e.g. in a picture parameter set (PPS). The new mapping describes a QP adjustment for each range of luma values and/or each range of luma level variations, to be applied instead of the default mapping. In this way the actual QP can vary locally in the picture without the need to signal the difference from the default QP, thus avoiding the overhead of such signaling. If an adjustment from this mapping is still needed, a delta QP can be signaled. One advantage of this approach is that an encoder designed in this way may perform subjectively better.
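The default mapping and its override could be sketched as follows; the luma ranges and QP adjustments in the table are hypothetical examples, standing in for a mapping that a standard specification would define and that e.g. a PPS could replace.

```python
# Hypothetical default mapping from average luma level (10-bit range) to a
# local QP adjustment; a bitstream (e.g. a PPS) could signal a replacement
# table with the same (low, high, adjustment) structure.
DEFAULT_QP_MAPPING = [
    (0, 300, 3),      # dark samples: coarser quantization
    (300, 700, 0),    # mid-range: no adjustment
    (700, 1024, -2),  # bright samples: finer quantization
]

def actual_qp(default_qp, avg_luma, mapping=DEFAULT_QP_MAPPING, delta_qp=0):
    """Sketch: derive the actual QP from the default QP (e.g. a picture or
    slice QP) via the luma-level mapping; delta_qp is an optional signaled
    adjustment on top of the mapping."""
    for low, high, adjustment in mapping:
        if low <= avg_luma < high:
            return default_qp + adjustment + delta_qp
    return default_qp + delta_qp
```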
According to a second embodiment of the present invention, the quantization parameters for the DC component and for the AC components of the transform coefficients for the at least one color component of the first block of samples are calculated differently. As is well known from the prior art, the DC component and the AC components are obtained after applying a frequency transform (such as a Discrete Cosine Transform (DCT)) to a residual signal, where the residual signal is typically the difference between the sample values and the predicted (i.e. inter predicted or intra predicted) sample values.
According to this embodiment, a quantization parameter is first calculated from the second block of samples, as e.g. described in the first embodiment. This quantization parameter is then used to dequantize (inverse quantize) the DC component of the transform coefficients for the at least one color component of the first block of samples. The inversely quantized DC coefficient will, after inverse frequency transform, give the average luma value of the residual for the first block of samples for the at least one color component. The average luma value of the residual is then added to the predicted luma sample value for the first block of samples. A new average (and variance) of the luma sample values is then calculated and further used to derive a new quantization parameter. The new quantization parameter is then finally used to dequantize (inverse quantize) the AC components of the transform coefficients for the at least one color component of the first block of samples. Thus, the AC components can, according to this embodiment, be quantized more coarsely if e.g. the average luma level is low, and more finely quantized if the average luma level is high.
The same process as above can be iteratively repeated for each AC coefficient. Namely, after each iteration, both the average (and variance) of the residual and reconstructed sample values will change, resulting in an updated quantization parameter that better reflects the original luma level.
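The two-stage procedure above can be sketched in simplified form, assuming a scalar step size that doubles every 6 QP units (as in HEVC-style QP scales) and ignoring transform normalization factors; the function mapping an average luma value to a QP is supplied by the caller and stands in for the mapping of the earlier embodiments.

```python
def dequantize(level, qp):
    # Simplified scalar dequantization: step size doubles every 6 QP units.
    return level * (2 ** (qp / 6.0))

def two_stage_dequantize(dc_level, ac_levels, pred_avg_luma,
                         qp_from_neighbors, qp_for_luma):
    """Sketch of the second embodiment: dequantize the DC coefficient with
    the neighbor-derived QP, update the block average, then derive a new
    QP for the AC coefficients. qp_for_luma is a caller-supplied mapping
    from average luma to QP (a stand-in for the earlier embodiments)."""
    # Stage 1: DC component with the QP calculated from the second block.
    dc = dequantize(dc_level, qp_from_neighbors)
    # The DC term gives (up to an omitted transform scale factor) the
    # average of the residual; add it to the prediction average.
    new_avg = pred_avg_luma + dc
    # Stage 2: new QP from the updated average, used for the AC components.
    qp_ac = qp_for_luma(new_avg)
    return [dequantize(a, qp_ac) for a in ac_levels], qp_ac
```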
According to another aspect of the second embodiment, a first QP value, which does not vary with the average luma value, is used to encode/decode the DC component of the transform coefficients for the at least one color component of the first block of samples. The decoder first decodes the DC component using the first QP value. After having decoded the DC component to a pixel value, this value is added to the average of the prediction, giving the actual luma average value in the first block of samples. (This value will be equal to the average value of the finally decoded first block of samples.) This value may then be used to select a second QP value that will be used for the remaining (AC) coefficients in the first block of samples. In a variant of this embodiment, the first QP value may be predicted from surrounding blocks of samples, whereas the second QP value is selected using the actual average luma value in the second block of samples.
In yet another variant of this embodiment, the actual QP for the at least one color component of the first block of samples is determined based on the average luminance level of already coded/decoded blocks of samples. This is different from the average luma level, since luminance and luma are not the same thing. Alternatively, the actual QP is determined based on the average luminance level of already coded/decoded second blocks of samples and the variance of the luminance level of already coded/decoded second blocks of samples.
According to a third embodiment of the present invention, the calculated quantization parameter for the first block of samples has a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream. The explicitly signaled quantization parameter is normally an integer number. However, the calculated quantization parameter for the first block of samples according to the embodiments of the present invention may not be an integer, i.e., it may be a floating point value. As such, the calculated quantization parameter may indeed give a higher accuracy than the quantization parameter that has to be explicitly signaled. Denote by QP_delta the difference between the calculated quantization parameter and the explicitly signaled quantization parameter. The calculated quantization parameter according to the embodiments of the present invention, e.g. based on the average of luma samples in the second (i.e., a spatially or temporally neighboring) block of samples, may have a non-integer value and accordingly a higher accuracy than integer precision. Prior to signaling, the calculated quantization parameter needs to be converted to an integer value, e.g. by rounding to the nearest integer. The QP_delta value is normally not signaled or subsequently used. However, QP_delta may be determined at both the encoder and the decoder even if it is not signaled to the decoder. The QP_delta may therefore be added to the explicitly signaled QP, which gives the actual calculated QP with non-integer precision. This actual QP is then used to inverse quantize at least one transform coefficient, which is then inverse transformed to determine a residual block of samples, which in turn is added to the corresponding area of the prediction block of samples.
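The split between the signaled integer QP and the non-integer remainder QP_delta can be sketched as follows; this is an illustrative model of the relationship described above, not a normative derivation.

```python
def split_qp(calculated_qp):
    """Sketch: round the high-accuracy calculated QP to the integer that
    would be signaled, and keep the fractional remainder as QP_delta.
    Both sides can recompute QP_delta, so only the integer is signaled."""
    signaled = round(calculated_qp)
    qp_delta = calculated_qp - signaled
    return signaled, qp_delta
```

At the decoder, adding the locally derived QP_delta back to the signaled integer QP recovers the calculated QP with non-integer precision, as the text describes.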
According to a fourth embodiment of the present invention, the calculated quantization parameter is a scaling factor for a DC component of the transform coefficients. This basically means that the quantization scaling matrices are calculated from the sample values from at least one color component in a second block of samples. The most prominent example is having luma level dependent quantization scaling matrices.
Luma level dependent quantization scaling matrices may be applied to at least the largest transform block sizes for luma and/or chroma color components. This approach may be used when decoding a transform block for the largest transform block sizes where the average luma sample level, for example calculated from previously coded/decoded luma samples or as otherwise defined in other embodiments, is within a specific range of luma sample values. In that case, the quantized transform coefficients are scaled using the luma level dependent quantization scaling matrix. Then, the inverse transform is applied to derive a residual block of samples to be added to a prediction block of samples. The range may for example comprise luma sample values below a certain threshold where it is difficult to see fine details. The approach may specifically be applied to the chroma color component in order to reduce the amount of bits spent for the chroma color component when the luma levels are low.
A variant of this embodiment is to apply the luma level quantization scaling matrix to the lower frequency coefficients only, for example only to the top-left 4x4 transform coefficients of the full transform coefficient block, where the lowest-frequency transform coefficient is at the top-left position of the 4x4 block. For the higher frequency transform coefficients, either no scaling is applied or a non-luma-level dependent quantization scaling matrix is used.

The method further comprises a step S2 of quantizing at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter, i.e. the calculated quantization parameter from step S1.
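The variant above, where a luma level dependent scaling matrix is applied only to the top-left 4x4 coefficients, might be sketched as follows; the flat scale value used for the higher frequency coefficients is an illustrative assumption.

```python
def scale_low_frequency(coeffs, luma_matrix, flat_scale=16):
    """Sketch: apply a luma-level-dependent 4x4 scaling matrix to the
    top-left (low-frequency) coefficients only, and a flat scale to the
    rest. coeffs is an NxN block of quantized transform coefficients as
    lists of lists, with the DC coefficient at coeffs[0][0]; luma_matrix
    is a 4x4 matrix selected beforehand from the average luma level.
    All values are illustrative."""
    n = len(coeffs)
    out = []
    for r in range(n):
        row = []
        for c in range(n):
            scale = luma_matrix[r][c] if r < 4 and c < 4 else flat_scale
            row.append(coeffs[r][c] * scale)
        out.append(row)
    return out
```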
The method may further comprise a step S3 of sending instructions to a decoder on how to calculate the quantization parameter for the at least one color component in the first block of samples. For example, an instruction may be which of the following measures is used to calculate the quantization parameter: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values, or color component variation of reconstructed sample values or predicted sample values. The instruction may be which color components are used for calculating the quantization parameter for the first block of samples for the at least one color component. As already mentioned, a flag may be provided in the video bitstream to indicate this. The instruction may also be to use the calculated quantization parameter as a scaling factor for a DC component of the transform coefficients. The instruction may additionally comprise the value QP_delta described in the third embodiment above. The instruction may also additionally comprise which transform sizes and ranges of luma levels are used according to the fourth embodiment described above.
This may be done by signaling the instructions in a sequence parameter set (SPS), picture parameter set (PPS), slice header, slice data etc. The instructions may also be specified in a decoder process specification.
According to another aspect, a method for decoding a video bitstream comprising an encoded video sequence is provided, as shown in Fig. 3. The encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples. Each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component.
The method comprises a step S5 of calculating a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples. The second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. This step is very similar to step S1 described above, except that it is performed at the decoder. Thus, the statistics are calculated from decoded sample values, rather than from the original sample values. The statistics, used for calculating the quantization parameter for the at least one color component in the first block of samples, may be based on at least one of: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values, or color component variation of reconstructed sample values or predicted sample values, from the decoded sample values from at least the same color component in the second decoded block of samples.
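The candidate measures listed above can be sketched as follows; treating "quantile" as the upper quartile is an illustrative choice, since the text does not fix a particular quantile, and "variation" is implemented here as the population variance.

```python
import statistics

def block_statistic(samples, measure):
    """Sketch: compute one of the measures named in the text from the
    decoded sample values of the second block. The concrete quantile
    (upper quartile) and variation (population variance) are
    illustrative choices."""
    if measure == "average":
        return sum(samples) / len(samples)
    if measure == "median":
        return statistics.median(samples)
    if measure == "minimum":
        return min(samples)
    if measure == "maximum":
        return max(samples)
    if measure == "quantile":
        return statistics.quantiles(samples, n=4)[2]  # upper quartile
    if measure == "variation":
        return statistics.pvariance(samples)
    raise ValueError("unknown measure: " + measure)
```

Because the encoder and decoder use the same measure on the same decoded samples, the resulting quantization parameter matches on both sides without being signaled.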
The same statistics are used at the encoder and the decoder, i.e., the same measure is used by both the encoder and the decoder to calculate the quantization parameter for the at least one color component in the first block of samples. Having the same statistics, i.e., the same calculating operation, implies that quantization parameters themselves do not need to be signaled as they can be derived both at the encoder and the decoder just by knowing which measure (statistics) to use. An exception to this is described above in the third embodiment of the encoding method where the QP_delta value may be signaled.
Similar to what is done at the encoder, the quantization parameter for at least one color component in the first block of samples is derived from an average and/or variation of sample values from at least the same color component in the second decoded block of samples. One example of this is when the quantization parameter for the luma color component in the first coded block of samples is calculated from the statistics from the sample values from the luma color component in the second decoded block of samples. Another option is that the quantization parameter for the chroma color component in the first coded block of samples may be calculated from the statistics from the sample values from any of the: luma color component, (same) chroma color component or both luma and (same) chroma color component in the second decoded block of samples. This approach is applied if there is a risk of chrominance artifacts, as described above. For this purpose, a flag may be provided in a video bitstream to enable/disable this approach for respective color component or for all color components.
In another variant of this embodiment, one may look at the RGB representation of a previously decoded block of samples, the R'G'B' representation, or the decoded chrominance coordinates (x, y) or (u', v'), where u' = 4x/(-2x+12y+3) and v' = 9y/(-2x+12y+3). Here it may be beneficial to decrease or increase the chroma QP or luma QP either when one of these representations indicates that the decoded chrominance is close to the white point, or when it indicates that the decoded chrominance is close to the gamut edge. This lower chroma or luma QP can be determined by decreasing a default chroma or luma QP, respectively, defined by a decoding process specification or as given in other embodiments.

In case the first coded block of samples is inter predicted, the second decoded block of samples is from a reference picture used for inter prediction of the picture (the reference block in a previously decoded picture). The position of the second decoded block of samples, which in this case is the reference block in the reference picture, can be determined from a reference index, which indicates the reference picture, and a decoded motion vector, which indicates a displacement between the first coded and the second decoded block of samples. If the motion vector has sub-sample accuracy, the reference block may be filtered (interpolated) before e.g. its average is computed. In case the first coded block of samples is intra predicted, the second decoded block of samples is from the same picture as the first coded block of samples. In case of intra picture prediction, the second decoded block of samples may be used before or after in-loop filtering compared to what is described in a decoding process specification, whereas in the case of inter picture prediction the second decoded block of samples is typically taken after in-loop filtering has been performed.
According to an embodiment of the present invention, the quantization parameters for the DC component and for the AC components of the transform coefficients for the at least one color component of the first coded block of samples are calculated differently. The DC component and the AC components are obtained after entropy decoding the video bitstream and are further used to obtain a decoded residual signal for the first coded block of samples.
According to this embodiment, the quantization parameter is first calculated from the second decoded block of samples, as e.g. previously described. This quantization parameter is then used to inverse quantize (dequantize) the DC component of the transform coefficients for the at least one color component of the first coded block of samples. The inversely quantized DC coefficient will, after inverse frequency transform, give the average luma value of the residual for the first coded block of samples for the at least one color component. The average luma value of the residual is then added to the predicted luma sample value for the first coded block of samples. A new average (and variance) of the luma sample values is then calculated and further used to derive a new quantization parameter. The new quantization parameter is then finally used to dequantize (inverse quantize) the AC components of the transform coefficients for the at least one color component of the first coded block of samples. Thus, the AC components can, according to this embodiment, be quantized more coarsely if e.g. the average luma level is low, and more finely quantized if the average luma level is high. The same process as above can be iteratively repeated for each AC coefficient. Namely, after each iteration, both the average (and variance) of the residual and reconstructed sample values will change, resulting in an updated quantization parameter that better reflects the original luma level.
According to another aspect of this embodiment, a first QP value that does not vary with the average luma value, is used to encode/decode the DC component of the transform coefficients for the at least one color component of the first coded block of samples. The decoder first decodes the DC component using the first QP value. After having decoded the DC component to a pixel value, this value is added to the average of the prediction, giving the actual value of the luma average value in the first coded block of samples. This value may then be used to select a second QP value that will be used for the remaining AC coefficients in the first coded block of samples.
In a variant of this embodiment, the first QP value may be predicted from surrounding blocks of samples, whereas the second QP value is selected using the actual average luma value in the second decoded block of samples.
In yet another variant of this embodiment, the actual QP for the at least one color component of the first coded block of samples is determined based on the average luminance level of already decoded blocks of samples. Alternatively, the actual QP is determined based on the average luminance level of already decoded second blocks of samples and the variance of the luminance level of already decoded second blocks of samples.
According to an embodiment of the present invention, the calculated quantization parameter for the first coded block of samples has a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream. As described above for the encoding method, the calculated quantization parameter for the first coded block of samples may not be an integer, i.e., it may be a floating point value. As such, the calculated quantization parameter may indeed give a higher accuracy than the quantization parameter that has to be explicitly signaled.

According to another embodiment of the present invention, the calculated quantization parameter is a scaling factor for a DC component of the transform coefficients. The quantization scaling matrices are therefore calculated from the sample values from at least one color component in the second decoded block of samples. The most prominent example is the luma level dependent quantization scaling matrices described above.

The method further comprises a step S6 of inverse quantizing at least one transform coefficient for the at least one color component in the first coded block of samples with the calculated quantization parameter. Applying the inverse frequency transform to the transform coefficients gives a decoded residual that is further used to reconstruct the first coded block of samples.

The method optionally comprises a step S4 of receiving instructions from the encoder on how to calculate the quantization parameter for the at least one color component in the first coded block of samples. As already mentioned, an instruction may be which measure is used to calculate the quantization parameter: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values, or color component variation of reconstructed sample values or predicted sample values.
The instruction may be which color components are used for calculating the quantization parameter for the first coded block of samples for the at least one color component. The instruction may further be to use the calculated quantization parameter as a scaling factor for a DC component of the transform coefficients.
The instructions may be received in a sequence parameter set, picture parameter set, slice header, slice data etc.
Fig. 4 is a schematic block diagram of an encoder 100 for encoding a picture of a video sequence in a video bitstream. The picture comprises a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component. The encoder comprises according to this aspect a calculating unit 160, configured to calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples. The second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. The encoder comprises according to this aspect a quantizing unit 170, configured to quantize at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.
The encoder may optionally comprise a sending unit 180, configured to send instructions to a decoder on how to calculate the quantization parameter for the at least one color component in the first block of samples. The calculating 160, quantizing 170 and sending 180 units may be hardware based, software based (in this case they are called calculating, quantizing and sending modules respectively) or may be a combination of hardware and software.
The encoder 100 may be an HEVC encoder or any other state of the art or future video encoder. The calculating unit 160 may calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in the second block of samples. The second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples. The statistics may be based on at least one of: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values and color component variation of reconstructed sample values or predicted sample values.
The calculating unit 160 may further calculate the quantization parameter for the at least one color component in the first block of samples as an average and/or a variation of sample values from at least the same color component in the second block of samples. The second block of samples may be from a reference picture used for inter prediction of the picture in case the first block of samples is inter predicted, or from the same picture as the first block of samples in case the first block of samples is intra predicted.
The calculating unit 160 may further calculate the quantization parameters for the DC component and for the AC components of the transform coefficients differently, for the at least one color component of the first block of samples, obtained after a frequency transform. For example, the quantization parameter for the AC coefficients is calculated from the sample values from the second block of samples and an inversely quantized DC component.
The calculating unit 160 may further calculate the quantization parameter for the first block of samples with a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream.
The calculating unit 160 may further calculate the quantization parameter as a scaling factor for a DC component of the transform coefficients.
The sending unit 180 may signal the instructions on how to calculate the quantization parameter for the at least one color component in the first block of samples in a sequence parameter set, picture parameter set, slice header, slice data etc.

The encoder 100 can be implemented in hardware, in software or in a combination of hardware and software. The encoder 100 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer. The encoder 100 may also be implemented in a network device in the form of, or connected to, a network node, such as a radio base station, in a communication network or system.
Although the respective units disclosed in conjunction with Fig. 4 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in Fig. 5.
Fig. 5 schematically illustrates an embodiment of a computer 150 having a processing unit 110 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit). The processing unit 110 can be a single unit or a plurality of units for performing different steps of the method described herein. The computer also comprises an input/output (I/O) unit 120 for receiving a video sequence. The I/O unit 120 has been illustrated as a single unit in Fig. 5 but can likewise be in the form of a separate input unit and a separate output unit.
Furthermore, the computer 150 comprises at least one computer program product 130 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive. The computer program product 130 comprises a computer program 140, which comprises code means which, when run on the computer 150, such as by the processing unit 110, causes the computer 150 to perform the steps of the method described in the foregoing in connection with Fig. 2.
Fig. 6 is a schematic block diagram of a decoder 200 for decoding a video bitstream comprising an encoded video sequence. The encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component. The decoder comprises according to this aspect a calculating unit 270, configured to calculate a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples, wherein the second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. According to this aspect, the decoder comprises an inverse quantizing unit 280, configured to inverse quantize at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
The decoder may optionally comprise a receiving unit 260, configured to receive instructions from an encoder on how to calculate the quantization parameter for the at least one color component in the first coded block of samples.
The calculating 270, inverse quantizing 280 and receiving 260 units may be hardware based, software based (in this case they are called calculating, inverse quantizing and receiving modules respectively) or may be a combination of hardware and software. The decoder 200 may be an HEVC decoder or any other state of the art or future video decoder.
The calculating unit 270 may calculate the quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples. The second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples. The statistics may be based on at least one of: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values and color component variation of reconstructed sample values or predicted sample values from at least the same color component in the second decoded block of samples. The calculating unit 270 may further calculate the quantization parameter for the at least one color component in the first coded block of samples as an average and/or a variation of sample values from at least the same color component in the second decoded block of samples. The second decoded block of samples may be from a reference picture used for inter prediction of the picture in case the first coded block of samples is inter predicted, or from the same picture as the first coded block of samples in case the first coded block of samples is intra predicted.
The calculating unit 270 may further calculate the quantization parameters for the DC component and for the AC components of the transform coefficients differently, for the at least one color component of the first coded block of samples, obtained after an inverse frequency transform. For example, the quantization parameter for the AC coefficients is calculated from the sample values from the second decoded block of samples and an inversely quantized DC component. The calculating unit 270 may further calculate the quantization parameter for the first coded block of samples with a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream.
The calculating unit 270 may further calculate the quantization parameter as a scaling factor for a DC component of the transform coefficients.
The receiving unit 260 may receive the instructions on how to calculate the quantization parameter for the at least one color component in the first coded block of samples in a sequence parameter set, picture parameter set, slice header, slice data etc.
The decoder 200 can be implemented in hardware, in software or in a combination of hardware and software. The decoder 200 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer. The decoder 200 may also be implemented in a network device in the form of, or connected to, a network node, such as a radio base station, in a communication network or system.
Although the respective units disclosed in conjunction with Fig. 6 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in Fig. 7.
Fig. 7 schematically illustrates an embodiment of a computer 250 having a processing unit 210 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit). The processing unit 210 can be a single unit or a plurality of units for performing different steps of the method described herein. The computer also comprises an input/output (I/O) unit 220 for receiving a video bitstream. The I/O unit 220 has been illustrated as a single unit in Fig. 7 but can likewise be in the form of a separate input unit and a separate output unit. Furthermore, the computer 250 comprises at least one computer program product 230 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive. The computer program product 230 comprises a computer program 240, which comprises code means which, when run on the computer 250, such as by the processing unit 210, causes the computer 250 to perform the steps of the method described in the foregoing in connection with Fig. 3.

The embodiments described above are to be understood as a few illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.

Claims

1. A method for encoding a picture of a video sequence in a video bitstream, the picture comprising a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component, the method comprising:
calculating (S1) a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples, wherein the second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples; and
quantizing (S2) at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.
2. The method according to claim 1, further comprising:
sending (S3) instructions to a decoder on how to calculate the quantization parameter for the at least one color component in the first block of samples.
3. The method according to any of claims 1-2, wherein the statistics are based on at least one of: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values and color component variation of reconstructed sample values or predicted sample values from at least the same color component in the second block of samples.
4. The method according to any of the preceding claims, wherein the second block of samples is from a reference picture used for inter prediction of the picture in case the first block of samples is inter predicted, or from the same picture as the first block of samples in case the first block of samples is intra predicted, and wherein the quantization parameter for the at least one color component in the first block of samples is derived from an average and/or a variation of sample values from at least the same color component in the second block of samples.
5. The method according to any of the preceding claims, wherein the quantization parameters for the DC component and for the AC components of the transform coefficients, for the at least one color component of the first block of samples obtained after a frequency transform, are calculated differently, and wherein the quantization parameter for the AC coefficients is calculated from the sample values from the second block of samples and an inversely quantized DC component.
6. The method according to any of the preceding claims, wherein the calculated quantization parameter for the first block of samples has a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream.
7. The method according to any of the preceding claims, wherein the calculated quantization parameter is a scaling factor for a DC component of the transform coefficients.
8. A method for decoding a video bitstream comprising an encoded video sequence, wherein the encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component, the method comprising:
calculating (S5) a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples, wherein the second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples; and
inverse quantizing (S6) at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
9. The method according to claim 8, further comprising:
receiving (S4) instructions from an encoder on how to calculate the quantization parameter for the at least one color component in the first coded block of samples.
10. The method according to any of claims 8-9, wherein the statistics are based on at least one of: average, median, minimum, maximum, quantile of previously reconstructed sample values or predicted sample values and color component variation of reconstructed sample values or predicted sample values from at least the same color component in the second decoded block of samples.
11. The method according to any of claims 8-10, wherein the second decoded block of samples is from a reference picture used for inter prediction of the picture in case the first coded block of samples is inter predicted, or from the same picture as the first coded block of samples in case the first coded block of samples is intra predicted, and wherein the quantization parameter for the at least one color component in the first coded block of samples is derived from an average and/or a variation of sample values from at least the same color component in the second decoded block of samples.
12. The method according to any of the preceding claims, wherein the quantization parameters for the DC component and for the AC components of the transform coefficients, for the at least one color component of the first coded block of samples obtained after an inverse frequency transform, are calculated differently, and wherein the quantization parameter for the AC coefficients is calculated from the sample values from the second decoded block of samples and an inversely quantized DC component.
13. The method according to any of claims 8-12, wherein the calculated quantization parameter for the first coded block has a higher accuracy than what can be explicitly signaled for a quantization parameter in the video bitstream.
14. The method according to any of the preceding claims, wherein the calculated quantization parameter is a scaling factor for a DC component of the transform coefficients.
15. An encoder (100), for encoding a picture of a video sequence in a video bitstream, the picture comprising a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component, the encoder
(100) comprising processing means (110) operative to:
calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples, wherein the second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples; and
quantize at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.
16. The encoder (100) according to claim 15, wherein the processing means (110) is further operative to:
send instructions to a decoder on how to calculate the quantization parameter for the at least one color component in the first block of samples.
17. The encoder (100) according to any of claims 15-16, wherein the processing means (110) comprise a processor (190) and a memory (130), wherein said memory (130) contains instructions executable by said processor (190).
18. A decoder (200), for decoding a video bitstream comprising an encoded video sequence, wherein the encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component, the decoder (200) comprising processing means (210) operative to:
calculate a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples, wherein the second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples; and
inverse quantize at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
19. The decoder (200) according to claim 18, wherein the processing means (210) is further operative to:
receive instructions from an encoder on how to calculate the quantization parameter for the at least one color component in the first coded block of samples.
20. The decoder (200) according to any of claims 18-19, wherein the processing means (210) comprise a processor (290) and a memory (230), wherein said memory (230) contains instructions executable by said processor (290).
21. A computer program (140), for encoding a picture of a video sequence in a video bitstream, the picture comprising a first block of samples, wherein each sample in the first block of samples has sample values associated with at least a luma color component and a chroma color component, the computer program (140) comprising code means which, when run on a computer (150), causes the computer (150) to:
calculate a quantization parameter for at least one color component in the first block of samples based on statistics calculated from the sample values from at least one color component in a second block of samples, wherein the second block of samples is one of: a previously reconstructed block of samples, reference sample values for the first block of samples and predicted sample values for the first block of samples; and
quantize at least one transform coefficient of a residual for the at least one color component in the first block of samples with the calculated quantization parameter.
22. A computer program (240), for decoding a video bitstream comprising an encoded video sequence, wherein the encoded video sequence comprises at least one encoded picture, wherein the encoded picture comprises a first coded block of samples wherein each sample in the first coded block of samples has sample values associated with at least a coded luma color component and a coded chroma color component, the computer program (240) comprising code means which, when run on a computer (250), causes the computer (250) to:
calculate a quantization parameter for at least one color component in the first coded block of samples based on statistics calculated from the sample values from at least one color component in a second decoded block of samples, wherein the second decoded block of samples is one of: a previously reconstructed block of samples, reference sample values for the first coded block of samples and predicted sample values for the first coded block of samples; and
inverse quantize at least one transform coefficient of a residual for the at least one color component in the first coded block of samples with the calculated quantization parameter.
23. A computer program product (300) comprising computer readable means (310) and a computer program (140) according to claim 21 stored on the computer readable means (310).
24. A computer program product (400) comprising computer readable means (410) and a computer program (240) according to claim 22 stored on the computer readable means (410).
PCT/EP2016/079007 2015-11-30 2016-11-28 Encoding and decoding of pictures in a video WO2017093188A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562260755P 2015-11-30 2015-11-30
US62/260755 2015-11-30

Publications (1)

Publication Number Publication Date
WO2017093188A1 (en) 2017-06-08

Family

ID=57517858



Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020094028A1 (en) * 2001-01-17 2002-07-18 Nec Corporation Device and method for motion video encoding reducing image degradation in data transmission without deteriorating coding efficiency
US20130259120A1 (en) * 2012-04-03 2013-10-03 Qualcomm Incorporated Quantization matrix and deblocking filter adjustments for video coding
US20150281693A1 (en) * 2014-03-28 2015-10-01 Canon Kabushiki Kaisha Coding apparatus, coding method, and storage medium
WO2016199409A1 (en) * 2015-06-07 2016-12-15 Sharp Kabushiki Kaisha Systems and methods for optimizing video coding based on a luminance transfer function or video color component values


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3425911A1 (en) * 2017-07-06 2019-01-09 Thomson Licensing A method and a device for picture encoding and decoding
EP3425910A1 (en) * 2017-07-06 2019-01-09 Thomson Licensing A method and a device for picture encoding
WO2019007917A1 (en) * 2017-07-06 2019-01-10 Interdigital Vc Holdings, Inc. A method and a device for picture encoding and decoding
WO2019007759A1 (en) * 2017-07-06 2019-01-10 Interdigital Vc Holdings, Inc. A method and a device for picture encoding
CN110892721A (en) * 2017-07-06 2020-03-17 交互数字Vc控股公司 Method and apparatus for picture coding and decoding
US11240504B2 (en) 2017-07-06 2022-02-01 Interdigital Madison Patent Holdings, Sas Encoding and decoding pictures in of high dynamic range and wide color gamut format
CN110892721B (en) * 2017-07-06 2022-03-08 交互数字麦迪逊专利控股公司 Method and apparatus for picture coding and decoding
WO2023184088A1 (en) * 2022-03-28 2023-10-05 Oppo广东移动通信有限公司 Image processing method and apparatus, device, system, and storage medium


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16808585; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16808585; Country of ref document: EP; Kind code of ref document: A1)