WO2022155922A1 - Video coding method and system, video decoding method and system, video coder and video decoder - Google Patents

Video coding method and system, video decoding method and system, video coder and video decoder Download PDF

Info

Publication number
WO2022155922A1
Authority
WO
WIPO (PCT)
Prior art keywords
alf
coefficient
quantization scale
quantization
target
Prior art date
Application number
PCT/CN2021/073409
Other languages
French (fr)
Chinese (zh)
Inventor
Dai Zhenyu (戴震宇)
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd. (Oppo广东移动通信有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd. (Oppo广东移动通信有限公司)
Priority to PCT/CN2021/073409 priority Critical patent/WO2022155922A1/en
Priority to CN202311245327.0A priority patent/CN117082239A/en
Priority to CN202180090973.7A priority patent/CN116746152A/en
Publication of WO2022155922A1 publication Critical patent/WO2022155922A1/en

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • the present application relates to the technical field of video encoding and decoding, and in particular, to a video encoding and decoding method and system, as well as a video encoder and a video decoder.
  • Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers or video players, and the like. With the development of video technology, the amount of data included in video data is relatively large. In order to facilitate the transmission of video data, video devices implement video compression technology to enable more efficient transmission or storage of video data.
  • Errors are introduced in the video compression process.
  • the reconstructed image is filtered, for example using an adaptive correction filter (ALF), to minimize the mean square error between the reconstructed image and the original image.
  • currently, a fixed quantization scale is used to quantize the floating-point ALF coefficients into integer coefficients.
  • different regions of a reconstructed image, or reconstructed images of different frames, may yield derived floating-point ALF coefficients in different ranges, and may also obtain different filter gains.
  • if a fixed quantization scale is used to quantize the floating-point coefficients, the coding overhead of the filter coefficients and the gain brought by the filter cannot be balanced.
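  • As a purely illustrative sketch of this trade-off (the scale values, coefficient values, and function names below are assumptions, not taken from the present application), a quantization scale adapted to the coefficient magnitude reduces quantization error where a single fixed scale cannot:

```python
# Purely illustrative: quantization error of floating-point ALF coefficients
# under a fixed scale versus a scale adapted to the coefficient magnitude.

def quantize(coeff, scale):
    """Quantize a floating-point coefficient to an integer at the given scale."""
    return round(coeff * scale)

def dequantize(q, scale):
    return q / scale

def max_abs_error(coeffs, scale):
    """Worst-case reconstruction error over a set of coefficients."""
    return max(abs(c - dequantize(quantize(c, scale), scale)) for c in coeffs)

small_coeffs = [0.012, -0.03, 0.05]   # e.g. derived for a flat region
fixed_scale = 64                      # one scale used for every region

# A larger scale chosen for these small-magnitude coefficients cuts the
# worst-case error well below what the fixed scale achieves.
print(max_abs_error(small_coeffs, fixed_scale))
print(max_abs_error(small_coeffs, 1024))
```

This imbalance between precision and overhead is what an adaptive target quantization scale is meant to address.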
  • Embodiments of the present application provide a video encoding and decoding method and system, as well as a video encoder and a video decoder, to achieve a balance between the encoding overhead of filter coefficients and the gain brought by the filter.
  • the present application provides a video encoding method, including:
  • the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
  • according to the maximum quantization scale, determine the target quantization scale of the floating-point ALF coefficients of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
  • an embodiment of the present application provides a video decoding method, including:
  • the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
  • decode the code stream to obtain the ALF coefficient information of the first adaptive correction filter (ALF) corresponding to the target area of the reconstructed image under the first component;
  • the target area in the reconstructed image under the first component is filtered using the first ALF.
  • the present application provides a video encoder for performing the method in the first aspect or each of its implementations.
  • the encoder includes a functional unit for executing the method in the above-mentioned first aspect or each of its implementations.
  • the present application provides a video decoder for executing the method in the second aspect or each of its implementations.
  • the decoder includes functional units for performing the methods in the second aspect or the respective implementations thereof.
  • a video encoder including a processor and a memory.
  • the memory is used for storing a computer program
  • the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the above-mentioned first aspect or each implementation manner thereof.
  • a video decoder including a processor and a memory.
  • the memory is used for storing a computer program
  • the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the above-mentioned second aspect or each implementation manner thereof.
  • a video encoding and decoding system including a video encoder and a video decoder.
  • the video encoder is used to perform the method in the first aspect or each of its implementations
  • the video decoder is used to perform the method in the above-mentioned second aspect or each of its implementations.
  • a chip for implementing any one of the above-mentioned first aspect to the second aspect or the method in each implementation manner thereof.
  • the chip includes: a processor for invoking and running a computer program from a memory, so that a device on which the chip is installed executes the method in any one of the above-mentioned first to second aspects or each of their implementations.
  • a computer-readable storage medium for storing a computer program, the computer program causing a computer to execute the method in any one of the above-mentioned first aspect to the second aspect or each of its implementations.
  • a computer program product comprising computer program instructions, the computer program instructions causing a computer to perform the method in any one of the above-mentioned first to second aspects or the implementations thereof.
  • a computer program which, when run on a computer, causes the computer to perform the method in any one of the above-mentioned first to second aspects or the respective implementations thereof.
  • obtain the reconstructed image of the current image; when the target area of the reconstructed image under the first component is to be filtered using the first ALF, determine the floating-point ALF coefficients of the first ALF; determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to those coefficients; determine the target quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum quantization scale; quantize the floating-point ALF coefficients of the first ALF into integer ALF coefficients using the target quantization scale; and encode the integer ALF coefficients of the first ALF to obtain a code stream.
  • the present application determines the target quantization scale according to the magnitude of the floating-point ALF coefficients of the first ALF, and uses the target quantization scale to quantize the floating-point coefficients: when the floating-point ALF coefficients are large, the determined target quantization scale is relatively large, so the filter gain corresponding to the quantized ALF coefficients is relatively large; when the floating-point ALF coefficients are small, the determined target quantization scale is relatively small, so the coding overhead is small. This achieves a balance between the coding overhead of the filter coefficients and the gain brought by the filter, thereby improving the effect of video coding and decoding.
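  • A minimal sketch of the encoder-side steps summarized above (the bit-width budget, the power-of-two scale search, and all names are illustrative assumptions; the application does not fix these details):

```python
# Hypothetical sketch of the summarized flow: derive a maximum quantization
# scale from the magnitude of the floating-point ALF coefficients, pick a
# target scale no larger than it, then quantize to integers.
# COEFF_BITS and the power-of-two search are illustrative assumptions.

COEFF_BITS = 10  # assumed bit budget for a signed integer ALF coefficient

def max_quantization_scale(float_coeffs, coeff_bits=COEFF_BITS):
    """Largest power-of-two scale keeping every quantized coefficient in range."""
    limit = (1 << (coeff_bits - 1)) - 1        # e.g. 511 for 10 bits
    peak = max(abs(c) for c in float_coeffs)
    scale = 1
    # Keep doubling while the next doubling would still fit in range.
    while round(peak * scale * 2) <= limit:
        scale *= 2
    return scale

def quantize_alf(float_coeffs, target_scale):
    """Quantize floating-point ALF coefficients to integers at target_scale."""
    return [round(c * target_scale) for c in float_coeffs]

coeffs = [0.031, -0.17, 0.46]                  # invented floating-point ALF coefficients
max_scale = max_quantization_scale(coeffs)     # 1024 for these values
target_scale = max_scale                       # any value <= max_scale is allowed
int_coeffs = quantize_alf(coeffs, target_scale)
```

An encoder could also select a target scale below the maximum, e.g. by rate-distortion cost; the sketch simply takes the maximum.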
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 involved in an embodiment of the present application
  • FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of the present application.
  • FIG. 5A is a schematic diagram of an ALF shape involved in an embodiment of the application.
  • FIG. 5B is a schematic diagram of another ALF shape involved in an embodiment of the application.
  • FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application.
  • FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of the present application.
  • FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
  • FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of the present application.
  • the present application can be applied to the field of image encoding and decoding, the field of video encoding and decoding, the field of hardware video encoding and decoding, the field of dedicated circuit video encoding and decoding, the field of real-time video encoding and decoding, and the like.
  • the schemes of the present application may operate in conjunction with video coding standards such as the Audio Video coding Standard (AVS), H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), and H.266/Versatile Video Coding (VVC).
  • the schemes of the present application may also operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multi-View Video Coding (MVC) extensions.
  • FIG. 1 For ease of understanding, the video coding and decoding system involved in the embodiments of the present application is first introduced with reference to FIG. 1 .
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 according to an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video encoding and decoding systems in the embodiments of the present application include, but are not limited to, those shown in FIG. 1 .
  • the video codec system 100 includes an encoding device 110 and a decoding device 120 .
  • the encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream encoded by the encoding device to obtain decoded video data.
  • the encoding device 110 in this embodiment of the present application may be understood as a device with a video encoding function
  • the decoding device 120 may be understood as a device with a video decoding function; that is, the encoding device 110 and the decoding device 120 in the embodiments of the present application cover a wide range of devices, including smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and the like.
  • the encoding device 110 may transmit the encoded video data (eg, a code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
  • channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time.
  • encoding apparatus 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding apparatus 120 .
  • the communication medium includes a wireless communication medium, such as a radio frequency spectrum, optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
  • channel 130 includes a storage medium that can store video data encoded by encoding device 110 .
  • Storage media include a variety of locally accessible data storage media such as optical discs, DVDs, flash memory, and the like.
  • the decoding apparatus 120 may obtain the encoded video data from the storage medium.
  • channel 130 may include a storage server that may store video data encoded by encoding device 110 .
  • the decoding device 120 may download the stored encoded video data from the storage server.
  • the storage server may store the encoded video data and transmit it to the decoding device 120; examples of such a server include a web server (e.g., for a website), a file transfer protocol (FTP) server, and the like.
  • encoding apparatus 110 includes video encoder 112 and output interface 113 .
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • encoding device 110 may include video source 111 in addition to video encoder 112 and output interface 113 .
  • the video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, or a computer graphics system for generating video data.
  • the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
  • Video data may include one or more pictures or a sequence of pictures.
  • the code stream contains the encoding information of the image or image sequence in the form of bit stream.
  • the encoded information may include encoded image data and associated data.
  • the associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short), and other syntax structures.
  • An SPS may contain parameters that apply to one or more sequences.
  • a PPS may contain parameters that apply to one or more images.
  • a syntax structure refers to a set of zero or more syntax elements in a codestream arranged in a specified order.
  • the video encoder 112 directly transmits the encoded video data to the decoding device 120 via the output interface 113 .
  • the encoded video data may also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
  • decoding device 120 includes input interface 121 and video decoder 122 .
  • the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
  • the input interface 121 includes a receiver and/or a modem.
  • the input interface 121 may receive the encoded video data through the channel 130 .
  • the video decoder 122 is configured to decode the encoded video data, obtain the decoded video data, and transmit the decoded video data to the display device 123 .
  • the display device 123 displays the decoded video data.
  • the display device 123 may be integrated with the decoding apparatus 120 or external to the decoding apparatus 120 .
  • the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
  • FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 .
  • the technology of the present application may also be applied to single-side video encoding or single-side video decoding.
  • FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images, and can also be used to perform lossless compression on images.
  • the lossless compression may be visually lossless compression or mathematically lossless compression.
  • the video encoder 200 can be applied to image data in luminance chrominance (YCbCr, YUV) format.
  • the YUV ratio may be 4:2:0, 4:2:2, or 4:4:4, where Y represents luminance (luma), Cb (U) represents blue chrominance, and Cr (V) represents red chrominance; U and V together are denoted chroma and describe hue and saturation.
  • 4:2:0 means that every 4 pixels have 4 luma components and 2 chroma components (YYYYCbCr); 4:2:2 means that every 4 pixels have 4 luma components and 4 chroma components (YYYYCbCrCbCr); 4:4:4 means full-resolution chroma (YYYYCbCrCbCrCbCrCbCr).
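  • The subsampling ratios above can be illustrated by counting the samples in one chroma plane (a purely illustrative sketch):

```python
# Illustrative chroma sample counts for a width x height image under the
# three subsampling formats described above.

def chroma_samples(width, height, fmt):
    """Samples in ONE chroma plane (Cb or Cr) for the given YUV format."""
    if fmt == "4:4:4":        # full-resolution chroma
        return width * height
    if fmt == "4:2:2":        # horizontal subsampling by 2
        return (width // 2) * height
    if fmt == "4:2:0":        # subsampling by 2 in both directions
        return (width // 2) * (height // 2)
    raise ValueError(fmt)

# For a 1920x1080 image, 4:2:0 stores a quarter of the luma samples per
# chroma plane, which is why it dominates consumer video.
print(chroma_samples(1920, 1080, "4:2:0"))  # 518400
```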
  • the video encoder 200 reads video data, and for each frame of image in the video data, divides one frame of image into several coding tree units (CTUs).
  • a CTU may also be referred to as a "tree block", a "largest coding unit" (LCU), or a "coding tree block" (CTB).
  • Each CTU may be associated with a block of pixels of equal size within the image.
  • Each pixel may correspond to one luminance (luma) sample and two chrominance (chrominance or chroma) samples.
  • each CTU may be associated with one block of luma samples and two blocks of chroma samples.
  • the size of one CTU is, for example, 128 ⁇ 128, 64 ⁇ 64, 32 ⁇ 32, and so on.
  • a CTU can be further divided into several coding units (Coding Unit, CU) for coding, and the CU can be a rectangular block or a square block.
  • the CU can be further divided into a prediction unit (PU for short) and a transform unit (TU for short), so that coding, prediction, and transformation are separated and processing is more flexible.
  • the CTU is divided into CUs in a quadtree manner, and the CUs are divided into TUs and PUs in a quadtree manner.
  • Video encoders and video decoders may support various PU sizes. Assuming the size of a particular CU is 2Nx2N, video encoders and video decoders may support PU sizes of 2Nx2N or NxN for intra prediction, and support 2Nx2N, 2NxN, Nx2N, NxN or similar sized symmetric PUs for inter prediction. Video encoders and video decoders may also support 2NxnU, 2NxnD, nLx2N, and nRx2N asymmetric PUs for inter prediction.
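  • The supported PU partitions described above can be enumerated for a concrete CU; in the sketch below, the asymmetric 1/4 : 3/4 split fractions follow the usual HEVC-style convention, which is an assumption rather than something stated here:

```python
# Enumerate the PU partitions named above for a 2Nx2N CU (e.g. N = 16, a
# 32x32 CU). Each entry lists the (width, height) of the PUs in that mode;
# asymmetric modes split one dimension 1/4 : 3/4 (an assumed convention).

def pu_partitions(n):
    two_n = 2 * n
    return {
        "intra": {
            "2Nx2N": [(two_n, two_n)],
            "NxN": [(n, n)] * 4,
        },
        "inter_symmetric": {
            "2Nx2N": [(two_n, two_n)],
            "2NxN": [(two_n, n)] * 2,
            "Nx2N": [(n, two_n)] * 2,
            "NxN": [(n, n)] * 4,
        },
        "inter_asymmetric": {
            "2NxnU": [(two_n, n // 2), (two_n, 3 * n // 2)],
            "2NxnD": [(two_n, 3 * n // 2), (two_n, n // 2)],
            "nLx2N": [(n // 2, two_n), (3 * n // 2, two_n)],
            "nRx2N": [(3 * n // 2, two_n), (n // 2, two_n)],
        },
    }

parts = pu_partitions(16)
cu_area = 32 * 32
# Every mode's PUs tile the whole CU: their areas sum to (2N)^2.
assert all(sum(w * h for w, h in pus) == cu_area
           for group in parts.values() for pus in group.values())
```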
  • the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260 , a decoded image buffer 270 and an entropy encoding unit 280 . It should be noted that the video encoder 200 may include more, less or different functional components.
  • a current block may be referred to as a current coding unit (CU) or a current prediction unit (PU), or the like.
  • a prediction block may also be referred to as a predicted image block or an image prediction block, and a reconstructed image block may also be referred to as a reconstructed block or an image reconstructed image block.
  • prediction unit 210 includes an inter prediction unit 211 and an intra prediction unit 212 . Since there is a strong correlation between adjacent pixels in a frame of a video, the method of intra-frame prediction is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Due to the strong similarity between adjacent frames in the video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving the coding efficiency.
  • the inter-frame prediction unit 211 can be used for inter-frame prediction; inter-frame prediction may refer to image information of different frames, using motion information to find a reference block in a reference frame and generating a prediction block from the reference block, so as to eliminate temporal redundancy;
  • Frames used for inter-frame prediction may be P frames and/or B frames, where P frames refer to forward predicted frames, and B frames refer to bidirectional predicted frames.
  • the motion information includes the reference frame list where the reference frame is located, the reference frame index, and the motion vector.
  • the motion vector can be of whole pixel or sub-pixel. If the motion vector is sub-pixel, then it is necessary to use interpolation filtering in the reference frame to make the required sub-pixel block.
  • the whole-pixel or sub-pixel block found in the reference frame according to the motion vector is called the reference block.
  • in some technologies the reference block is directly used as the prediction block, while other technologies further process the reference block to generate the prediction block.
  • Reprocessing to generate a prediction block on the basis of the reference block can also be understood as taking the reference block as a prediction block and then processing it on the basis of the prediction block to generate a new prediction block.
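  • When a motion vector points between integer positions, the required sub-pixel block must be interpolated from the reference frame. A minimal half-pixel sketch using a 2-tap average (real codecs such as HEVC/VVC use longer separable filters; the 2-tap choice here is only for illustration):

```python
# Minimal horizontal half-pel interpolation of one row of a reference frame.
# Real codecs use longer separable filters (e.g. 8-tap); 2-tap averaging
# with rounding keeps the idea visible.

def half_pel_row(row):
    """Return the half-pixel positions between consecutive integer pixels."""
    return [(row[i] + row[i + 1] + 1) // 2 for i in range(len(row) - 1)]

reference_row = [10, 20, 30, 40]
print(half_pel_row(reference_row))  # [15, 25, 35]
```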
  • inter-frame prediction methods include: the geometric partitioning mode (GPM) in the VVC video codec standard, and angular weighted prediction (AWP) in the AVS3 video codec standard. These two inter prediction modes have something in common in principle.
  • the intra-frame prediction unit 212 refers only to information of the same frame image to predict pixel information within the current image block, so as to eliminate spatial redundancy.
  • Frames used for intra prediction may be I-frames.
  • in a typical illustration, the white 4×4 block is the current block, and the gray pixels in the column to the left of and the row above the current block are its reference pixels, which intra prediction uses to predict the current block.
  • these reference pixels may all be available, i.e., already encoded and decoded, but some may not be. For example, if the current block is at the leftmost edge of the whole frame, the reference pixels to the left of the current block are not available.
  • the lower left part of the current block has not been encoded or decoded, so the reference pixels at the lower left are also unavailable.
  • in such cases, an available reference pixel, some default value, or some other method may be used for padding, or no padding is performed.
  • the intra prediction method further includes a multiple reference line intra prediction method (multiple reference line, MRL), which can use more reference pixels to improve coding efficiency.
  • mode 0 is to copy the pixels above the current block to the current block in the vertical direction as the predicted value
  • mode 1 is to copy the reference pixel on the left to the current block in the horizontal direction as the predicted value
  • mode 2 (DC) uses the average value of the 8 reference pixels A to D and I to L as the predicted value for all positions.
  • Modes 3 to 8 copy the reference pixels to the corresponding position of the current block according to a certain angle respectively. Because some positions of the current block cannot exactly correspond to the reference pixels, it may be necessary to use a weighted average of the reference pixels, or sub-pixels of the interpolated reference pixels.
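  • The three simplest modes described above can be sketched on a toy 4×4 block (the reference pixel values are invented for illustration; real codecs add rounding, reference filtering, and boundary handling):

```python
# Toy intra prediction for a 4x4 block:
#   mode 0 (vertical):   copy the row of pixels above, downward
#   mode 1 (horizontal): copy the column of pixels to the left, rightward
#   mode 2 (DC):         fill with the average of the reference pixels

def predict(above, left, mode):
    if mode == 0:                                   # vertical
        return [list(above) for _ in range(4)]
    if mode == 1:                                   # horizontal
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:                                   # DC
        refs = above + left
        dc = round(sum(refs) / len(refs))
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)

above = [100, 102, 104, 106]   # reference row above the current block (invented)
left = [98, 99, 101, 103]      # reference column to the left (invented)
print(predict(above, left, 2)[0][0])  # 102
```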
  • the intra-frame prediction modes used by HEVC include Planar mode, DC and 33 angle modes, for a total of 35 prediction modes.
  • the intra-frame modes used by VVC are Planar, DC, and 65 angular modes, for a total of 67 prediction modes.
  • the intra-frame modes used by AVS3 are DC, Plane, Bilinear and 63 angle modes, a total of 66 prediction modes.
  • with more angular modes, intra-frame prediction is more accurate and better meets the demands of the development of high-definition and ultra-high-definition digital video.
  • Residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction blocks of the CU's PUs. For example, residual unit 220 may generate the residual block such that each sample in it has a value equal to the difference between a sample in the CU's pixel block and the corresponding sample in a prediction block of one of the CU's PUs.
  • Transform/quantization unit 230 may quantize transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of the CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with the CU by adjusting the QP value associated with the CU.
  • Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from the quantized transform coefficients.
  • Reconstruction unit 250 may add the samples of the reconstructed residual block to corresponding samples of the one or more prediction blocks generated by prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the block of samples for each TU of the CU in this manner, video encoder 200 may reconstruct the block of pixels of the CU.
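  • The residual, quantization, and reconstruction steps performed by the three units above fit together as follows (a deliberately simplified scalar sketch; a real codec applies a 2-D transform before quantization, which is omitted here so the quantize/reconstruct relationship stays visible, and the step size QSTEP is an assumed stand-in for what a QP derives):

```python
# Simplified round trip: residual -> quantize -> inverse quantize -> reconstruct.
# The transform is omitted; QSTEP stands in for the step derived from QP.

QSTEP = 8  # assumed quantization step

def encode_block(original, prediction, qstep=QSTEP):
    """Residual of the block, quantized to integer levels."""
    residual = [o - p for o, p in zip(original, prediction)]
    return [round(r / qstep) for r in residual]

def reconstruct_block(levels, prediction, qstep=QSTEP):
    """Inverse-quantize the levels and add them back to the prediction."""
    dequant = [l * qstep for l in levels]
    return [p + d for p, d in zip(prediction, dequant)]

original = [140, 125, 100, 130]
prediction = [118, 126, 119, 127]
levels = encode_block(original, prediction)
recon = reconstruct_block(levels, prediction)
# The reconstruction matches the original only up to quantization error,
# which is exactly the loss the in-loop filters (deblocking, SAO, ALF) target.
```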
  • In-loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
  • loop filtering unit 260 includes a deblocking filtering unit, a sample adaptive offset (SAO) unit, and an adaptive loop filter (ALF) unit.
  • the decoded image buffer 270 may store the reconstructed pixel blocks.
  • Inter-prediction unit 211 may use the reference picture containing the reconstructed pixel block to perform inter-prediction on PUs of other pictures.
  • intra-prediction unit 212 may use the reconstructed pixel blocks in decoded picture buffer 270 to perform intra-prediction on other PUs in the same picture as the CU.
  • Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
  • the basic flow of video coding involved in the present application is as follows: at the coding end, the current image is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate a prediction block of the current block.
  • the residual unit 220 may calculate a residual block based on the predicted block and the original block of the current block, that is, the difference between the predicted block and the original block of the current block, and the residual block may also be referred to as residual information.
  • the residual block can be transformed and quantized by the transform/quantization unit 230 to remove information insensitive to human eyes, so as to eliminate visual redundancy.
  • the residual block before being transformed and quantized by the transform/quantization unit 230 may be referred to as a time-domain residual block, and the time-domain residual block after transform and quantization may be referred to as a frequency residual block or a frequency-domain residual block.
  • the entropy coding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230, and can perform entropy coding on the quantized transform coefficients to output a code stream. For example, the entropy encoding unit 280 may eliminate character redundancy according to the target context model and the probability information of the binary code stream.
  • the video encoder performs inverse quantization and inverse transformation on the quantized transform coefficients output by the transform/quantization unit 230 to obtain the residual block of the current block, and then adds the residual block of the current block to the prediction block of the current block to obtain the reconstructed block of the current block.
  • reconstructed blocks corresponding to other image blocks in the current image can be obtained, and these reconstructed blocks are spliced to obtain a reconstructed image of the current image.
  • the reconstructed image is filtered, for example, ALF is used to filter the reconstructed image to reduce the difference between the pixel values of the pixels in the reconstructed image and the original pixel values of the pixels in the current image.
  • the filtered reconstructed image is stored in the decoded image buffer 270, and can be used as a reference frame for inter-frame prediction for subsequent frames.
  • the block division information determined by the coding end, and mode information or parameter information such as prediction, transformation, quantization, entropy coding, and loop filtering, etc. are carried in the code stream when necessary.
  • the decoding end determines the same block division information, and the same mode information or parameter information for prediction, transformation, quantization, entropy coding, loop filtering, etc. as the encoding end by parsing the code stream and analyzing the existing information, thereby ensuring that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
  • FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
  • the video decoder 300 includes an entropy decoding unit 310 , a prediction unit 320 , an inverse quantization/transformation unit 330 , a reconstruction unit 340 , a loop filtering unit 350 , and a decoded image buffer 360 . It should be noted that the video decoder 300 may include more, less or different functional components.
  • the video decoder 300 may receive the code stream.
  • Entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream.
  • the prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, and the in-loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, ie, generate decoded video data.
  • prediction unit 320 includes intra prediction unit 321 and inter prediction unit 322 .
  • Intra-prediction unit 321 may perform intra-prediction to generate prediction blocks for the PU. Intra-prediction unit 321 may use an intra-prediction mode to generate prediction blocks for a PU based on pixel blocks of spatially neighboring PUs. Intra-prediction unit 321 may also determine an intra-prediction mode for the PU from one or more syntax elements parsed from the codestream.
  • Inter-prediction unit 322 may construct a first reference picture list (List 0) and a second reference picture list (List 1) from the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter-prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter-prediction unit 322 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter-prediction unit 322 may generate a prediction block for the PU from one or more reference blocks of the PU.
  • the inverse quantization/transform unit 330 inversely quantizes (ie, dequantizes) the transform coefficients associated with the TUs.
  • Inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.
  • inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients to generate a residual block associated with the TU.
  • Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU, resulting in a reconstructed image block.
  • In-loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
  • the loop filtering unit 350 includes a deblocking filtering unit, a sample adaptive offset (SAO) unit, and an adaptive loop filtering (ALF) unit.
  • Video decoder 300 may store the reconstructed images of the CU in decoded image buffer 360 .
  • the video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
  • the entropy decoding unit 310 can parse the code stream to obtain the prediction information, quantization coefficient matrix, etc. of the current block, and the prediction unit 320 uses intra prediction or inter prediction for the current block to generate the prediction block of the current block based on the prediction information.
  • the inverse quantization/transform unit 330 performs inverse quantization and inverse transformation on the quantization coefficient matrix obtained from the code stream to obtain a residual block.
  • the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
  • the reconstructed blocks form a reconstructed image.
  • the loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or based on the block to obtain a decoded image.
  • the decoded image may also be referred to as a reconstructed image.
  • the reconstructed image may be displayed by a display device, and on the other hand, the reconstructed image may be stored in the decoded image buffer 360 to serve as a reference frame for inter-frame prediction for subsequent frames.
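The reconstruction step described above (the reconstruction unit adds the prediction block and the residual block sample by sample) can be sketched as follows. This is an illustrative sketch only: the function name, the toy data, and the 8-bit clipping range are assumptions, not something stated in the text.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    # Add each residual sample to the co-located prediction sample and
    # clip to the assumed valid sample range [0, 2**bit_depth - 1].
    hi = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), hi) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]

pred = [[100, 101], [102, 103]]    # hypothetical 2x2 prediction block
resid = [[-2, 0], [5, 200]]        # hypothetical residual block
recon = reconstruct_block(pred, resid)   # [[98, 101], [107, 255]]
```

The same addition is performed at the encoding end (reconstruction unit 250) so that both ends hold identical reference pictures.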
  • the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. The present application is applicable to the basic process of the video codec under the block-based hybrid coding framework, but is not limited to this framework and process.
  • the encoding end will be introduced below with reference to FIG. 4 .
  • FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of the present application, and the embodiment of the present application is applied to the video encoder shown in FIG. 1 and FIG. 2 .
  • the method of the embodiment of the present application includes:
  • the video encoder receives a video stream, which consists of a series of image frames, and performs video encoding for each frame of image in the video stream.
  • this application records a frame of image currently to be encoded as the current image.
  • the video encoder divides the current image into one or more image blocks to be encoded, and for each image block to be encoded, the prediction unit 210 in the video encoder uses inter-frame prediction or intra-frame prediction to generate a prediction block of the image block to be encoded, and then sends the prediction block to the residual unit 220. The residual unit 220 can be understood as a summer, including one or more components that perform a subtraction operation.
  • the residual unit 220 subtracts the prediction block from the image block to be encoded to form a residual block, and sends the residual block to the transform and quantization unit 230 .
  • the transform and quantization unit 230 transforms the residual block using, for example, discrete cosine transform (DCT) or the like, to obtain transform coefficients.
  • the transform and quantization unit 230 further quantizes the transform coefficients to obtain quantized transform coefficients.
  • the transform and quantization unit 230 forwards the quantized transform coefficients to the entropy encoding unit 280 .
  • the entropy encoding unit 280 entropy encodes the quantized transform coefficients.
  • entropy encoding unit 280 may perform entropy coding on the quantized transform coefficients using coding methods such as context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), or probability interval partitioning entropy (PIPE) coding, to obtain a code stream.
  • the transform and quantization unit 230 forwards the quantized transform coefficients to the inverse transform and quantization unit 240 .
  • the inverse transform and quantization unit 240 inversely quantizes and inversely transforms the quantized transform coefficients to reconstruct the residual block in the pixel domain.
  • Reconstruction unit 250 can be understood as a summer, including one or more components that perform addition operations.
  • the reconstruction unit 250 adds the reconstructed residual block to the prediction block generated by the prediction unit 210 to generate a partial or complete reconstructed image of the current image, the partial or complete reconstructed image including one or more reconstructed image blocks.
  • the reconstructed image includes a first component, and the first component may be a luminance component or a chrominance component.
  • S402. Determine the ALF coefficients of the floating point type of the first ALF when the target area in the reconstructed image under the first component is filtered using the first ALF.
  • the video encoder in this embodiment of the present application can be used for images in different formats, such as YUV format or YCbCr format.
  • in AVS3, there is an independent adaptive correction filter (ALF for short) for each channel in the YUV or YCbCr format.
  • the first chrominance component of a frame of image (such as the U component or Cb component) corresponds to an ALF.
  • the second chrominance component (eg, V component or Cr component) of a frame of image corresponds to an ALF.
  • for the luminance component, a frame of image is divided into multiple regions, for example, into 16 regions or 64 regions, and each region corresponds to an ALF. Based on this, when the first component is a chrominance component, the present application uses the entire reconstructed image under the first component as the target area to be filtered.
  • when the first component is a luminance component, the present application divides the reconstructed image under the first component into regions, for example, into 16 regions or 64 regions, and takes one of the regions to be filtered as the target region.
  • the ALF that performs adaptive correction filtering on the target area is denoted as the first ALF.
  • the form of the first ALF may be as shown in FIG. 5A: the form of the first ALF is a 7×7 cross plus a 3×3 square, and a set of ALF coefficients of the first ALF includes 9 ALF coefficients, respectively C0, C1, C2, C3, C4, C5, C6, C7 and C8, where C8 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients in the first ALF are symmetrical with respect to the center point of the first ALF.
  • the form of the first ALF may be as shown in FIG. 5B: the form of the first ALF is a 7×7 cross plus a 5×5 square, and a set of ALF coefficients of the first ALF includes 15 ALF coefficients, respectively C0, C1, C2, …, C13 and C14, where C14 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients in the first ALF are symmetrical with respect to the center point of the first ALF.
  • the Wiener-Hopf method is used to derive the floating-point ALF coefficients of the first ALF.
  • the floating-point ALF coefficients of the first ALF are derived as follows:
  • represents the number of pixels in the target area R;
  • s[r] represents the original pixel value of the pixel r;
  • t[r] represents the reconstructed pixel value of the pixel r, that is, the pixel value of the pixel r in the reconstructed image;
  • C0, C1, …, CN-1 are the floating-point ALF coefficients of the first ALF;
  • {P0, P1, P2, …, PN-1} is a set of position offsets relative to r.
  • for the filter shape shown in FIG. 5A, the above N is 8, and for the filter shape shown in FIG. 5B, the above N is 14.
  • the floating point type ALF coefficients C0, C1...CN-1 of the first ALF can be derived.
  • the value range of ALF coefficients of floating-point type is almost unlimited.
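The least-squares derivation sketched above can be illustrated in code. The following is a toy 1-D sketch, not the patent's normative procedure: the filter form, the boundary handling, and the data are simplifying assumptions. It builds the autocorrelation matrix E and cross-correlation vector Y described later in Method 1, and solves the normal equations E·C = Y by plain Gaussian elimination to obtain the floating-point coefficients.

```python
def solve(a, b):
    # Gaussian elimination with partial pivoting for a small n x n system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

s = [10, 12, 13, 15, 14, 13, 12, 11, 10, 9]    # toy original pixel values s[r]
t = [11, 11, 14, 14, 15, 12, 13, 10, 11, 8]    # toy reconstructed pixel values t[r]
P = [-1, 0, 1]                                 # toy position offsets relative to r
N = len(P)
R = range(1, len(s) - 1)                       # interior pixels of the "target area"

# E[i][j] = sum over r of t[r+P[i]] * t[r+P[j]];  y[i] = sum of s[r] * t[r+P[i]]
E = [[sum(t[r + P[i]] * t[r + P[j]] for r in R) for j in range(N)] for i in range(N)]
y = [sum(s[r] * t[r + P[i]] for r in R) for i in range(N)]

coeffs = solve(E, y)   # floating-point ALF coefficients C0..CN-1
```

By construction, filtering with these least-squares coefficients cannot increase the squared error over leaving the reconstruction unfiltered.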
  • S403. Determine the maximum quantization scale of the ALF coefficients of the floating-point number type of the first ALF according to the ALF coefficients of the floating-point number type of the first ALF.
  • for the filter shape shown in FIG. 5A, the quantized integer-type coefficients of the first 8 ALF coefficients of the first ALF (ie, C0, C1, C2, C3, C4, C5, C6 and C7) are limited to the range of -64 to 63, and the quantized integer-type coefficient of the ninth ALF coefficient (ie, C8) is limited to the range of 0 to 127.
  • for the filter shape shown in FIG. 5B, the quantized integer-type coefficients of the first 14 ALF coefficients of the first ALF are limited to the range of -64 to 63, and the quantized integer-type coefficient of the 15th ALF coefficient (ie, C14) is limited to the range of 0 to 127.
  • the maximum quantization scale of the floating-point ALF coefficient of the first ALF can be determined according to the floating-point ALF coefficient of the first ALF and the integer coefficient threshold corresponding to the quantization of the floating-point ALF coefficient to the integer type.
  • the above S403 includes the following S403-A1 and S403-A2:
  • S403-A2 Determine the maximum quantization scale of the floating-point type ALF coefficient of the first ALF according to the maximum shift value.
  • in general, the ALF coefficient corresponding to the center position of the first ALF (for example, C8 or C14) is the largest, and the ALF coefficients corresponding to the non-center positions of the first ALF are smaller.
  • the above-mentioned first coefficient may be a floating-point type ALF coefficient corresponding to the center position of the first ALF, or may be a floating-point type ALF coefficient corresponding to a non-center position of the first ALF.
  • the maximum integer coefficient threshold corresponding to the ALF coefficient W_min(x,y) is b1. In this way, according to the ALF coefficient W_min(x,y) and b1, the maximum shift value allowed when the floating-point ALF coefficient W_min(x,y) is quantized into an integer type can be determined.
  • the maximum shift value bitshift allowed when quantizing the ALF coefficient of the floating-point type into an integer type is determined:
  • W_max(x,y) is the largest ALF coefficient among the floating-point ALF coefficients corresponding to the non-center positions of the first ALF, where neither x nor y is equal to 0;
  • d1 is the maximum integer coefficient threshold corresponding to W_max(x,y); for example, d1 is 63.
  • if the first coefficient is a floating-point coefficient corresponding to the center position of the first ALF, the maximum shift value bitshift allowed when the floating-point ALF coefficient is quantized into an integer type can be determined according to the following formula (3):
  • W_f(0,0) is the floating-point coefficient corresponding to the center position of the first ALF;
  • d is the maximum integer coefficient threshold corresponding to W_f(0,0); for example, d is 127.
  • the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF is determined:
  • Scale_max is the maximum quantization scale.
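Formulas (2) to (4) are not reproduced in the extracted text. A reading consistent with the surrounding description, namely the largest shift for which the scaled coefficient still fits under the integer threshold, can be sketched as follows; the exact formulas should be taken from the original filing, so this function is an assumed form rather than the patent's definition.

```python
import math

def max_shift(coeff, threshold):
    # Assumed form of formulas (2)/(3): the largest bitshift such that
    # abs(coeff) * 2**bitshift does not exceed the integer threshold.
    return math.floor(math.log2(threshold / abs(coeff)))

# Center-position coefficient with threshold d = 127 (cf. formula (3)):
w_center = 0.93
bitshift = max_shift(w_center, 127)   # -> 7
scale_max = 1 << bitshift             # Scale_max = 2**bitshift (cf. formula (4))
```

Here 0.93 × 2^7 = 119.04 stays within 127, while 0.93 × 2^8 would overflow the center-coefficient range, so the maximum quantization scale is 2^7.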
  • the following S404 is performed to determine the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
  • the target quantization scale is less than or equal to the maximum quantization scale.
  • for example, if the maximum quantization scale is 2^8, it can be determined that the target quantization scale is a power of 2 smaller than or equal to 2^8 and greater than 1, such as 2^7, 2^6 or 2^5.
  • the methods for determining the target quantization scale of the ALF coefficients of the floating point type of the first ALF include but are not limited to the following:
  • the second quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the second ALF, where the second ALF is a filter used when performing ALF filtering on the reconstructed image under the luminance component, and the second quantization scale is less than or equal to the maximum quantization scale.
  • the third quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the third ALF, where the third ALF is a filter used when performing ALF filtering on the reconstructed image under the chrominance component.
  • the human eye is more sensitive to luminance, so a higher quantization precision is used for the reconstructed image under the luminance component to improve the luminance information of the reconstructed image.
  • the human eye is less sensitive to chrominance, so a lower quantization precision is used for the reconstructed image under the chrominance component to improve coding efficiency.
  • the larger second quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the second ALF.
  • the second ALF quantized by using the second quantization scale has higher quantization accuracy, and when the reconstructed image under the luminance component is filtered by using the second ALF, the filtering accuracy can be improved.
  • for example, if the maximum quantization scale is 2^8, the second quantization scale is determined to be 2^8, 2^7 or 2^6, and so on.
  • the third quantization scale, which is smaller than the second quantization scale, is directly determined as the target quantization scale corresponding to the chrominance component, that is, as the target quantization scale of the third ALF used for the chrominance component.
  • the third quantization scale can be determined as the target quantization scale of the floating-point type ALF coefficient of the third ALF, and there is no need to additionally calculate the target quantization scale corresponding to the chrominance component, reducing the amount of calculation and improving the coding efficiency.
  • for example, if the maximum quantization scale is 2^8 and the second quantization scale is 2^8, 2^7 or 2^6, the third quantization scale is 2^5 or 2^4, and so on.
  • the fourth quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the second ALF, and the fourth quantization scale is smaller than the maximum quantization scale.
  • the fifth quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the fourth ALF, where the fourth ALF is a filter used when performing ALF filtering on the reconstructed image under the luminance component, and the fifth quantization scale is larger than the fourth quantization scale.
  • a smaller fourth quantization scale is determined as the target quantization scale of the ALF coefficients of the floating point type of the second ALF.
  • the code length of the ALF coefficients of the second ALF quantized using the fourth quantization scale is shorter, which facilitates coding. For example, if the maximum quantization scale is 2^8, the fourth quantization scale is determined to be 2^5 or 2^4, and so on.
  • the fifth quantization scale larger than the fourth quantization scale is directly determined as the target quantization scale corresponding to the luminance component, and there is no need to additionally calculate the target quantization scale corresponding to the luminance component , reduce the amount of calculation and improve the coding efficiency.
  • for example, if the maximum quantization scale is 2^8 and the fourth quantization scale is 2^5 or 2^4, the fifth quantization scale is 2^8, 2^7 or 2^6, and so on.
  • S404 includes the following S404-A1 and S404-A2:
  • S404-A2: Determine, according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point ALF coefficients of the first ALF.
  • the maximum quantization scale and the preset minimum quantization scale constitute a quantization interval.
  • each quantization scale in the quantization interval is the first quantization scale.
  • the quantization cost of each first quantization scale in the quantization interval is determined, and according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point ALF coefficients of the first ALF is determined.
  • for example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point ALF coefficients of the first ALF.
  • alternatively, the average value of the quantization costs of the first quantization scales is determined as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the average value may be an arithmetic average value or a weighted average value.
  • the above preset minimum quantization scale is 2^0, that is, 1.
  • or, the above preset minimum quantization scale is 2^1 or 2^2, or the like.
  • for each first quantization scale, the quantization cost is determined in the same manner; the present application takes one first quantization scale as an example for description.
  • in one example, the encoding cost of the first quantization scale is determined as the quantization cost of the first quantization scale: for example, the floating-point ALF coefficients of the first ALF are quantized using the first quantization scale into the integer-type first ALF coefficients, and the number of bits occupied by encoding the integer-type first ALF coefficients is determined as the quantization cost of the first quantization scale.
  • determining the quantization cost of the first quantization scale in the above S404-A1 includes the following steps S404-A11 to S404-A13:
  • S404-A13 Determine the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed by encoding the first ALF coefficient.
  • the first quantization scale is used to quantize the ALF coefficients of the floating point type of the first ALF into the first ALF coefficients of the integer type.
  • for example, according to the following formula (5), the floating-point ALF coefficients of the first ALF are quantized into the integer-type first ALF coefficients:
  • AlfCoeff integer round(AlfCoeff float ⁇ 2 p ) (5)
  • AlfCoeff_integer is the first ALF coefficient of the integer type;
  • AlfCoeff_float is the floating-point ALF coefficient of the first ALF;
  • round is the rounding operation;
  • 2^p is the first quantization scale.
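Formula (5), combined with the coefficient ranges given earlier (-64 to 63 for non-center coefficients, 0 to 127 for the center coefficient), can be sketched as follows. The clipping step is an assumption drawn from those stated ranges, and Python's built-in round stands in for whatever rounding the codec specifies.

```python
def quantize_coeff(c_float, p, lo, hi):
    # Formula (5): AlfCoeff_integer = round(AlfCoeff_float * 2**p),
    # then clip to the integer range [lo, hi] assumed for this position.
    q = round(c_float * (1 << p))
    return min(max(q, lo), hi)

# Non-center coefficient (assumed range [-64, 63]) at first quantization scale 2**6:
q_side = quantize_coeff(-0.137, 6, -64, 63)   # round(-8.768) -> -9
# Center coefficient (assumed range [0, 127]):
q_center = quantize_coeff(0.81, 6, 0, 127)    # round(51.84) -> 52
```

A larger p preserves more fractional precision but makes overflow of the integer range, and hence clipping, more likely, which is exactly why the maximum quantization scale of S403 is needed.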
  • the manner of determining the quantization distortion result corresponding to the first ALF coefficient includes but is not limited to the following:
  • Method 1: determine the autocorrelation coefficient of the pixels in the target area according to the reconstructed pixel values of the pixels in the target area; determine the cross-correlation coefficient of the pixels in the target area according to the original pixel values of the pixels in the target area; and, according to the product of the autocorrelation coefficient of the pixels in the target area and the first ALF coefficient, and the cross-correlation coefficient of the pixels in the target area, determine the quantization distortion result corresponding to the first ALF coefficient.
  • the autocorrelation coefficient E of the pixels in the target area can be determined, where:
  • represents the number of pixels in the target area R;
  • s[r] represents the original pixel value of the pixel r;
  • t[r] represents the reconstructed pixel value of the pixel r, that is, the pixel value of the pixel r in the reconstructed image;
  • {P0, P1, P2, …, PN-1} is a set of position offsets relative to r.
  • the cross-correlation coefficient Y of pixels in the target area can be determined:
  • s[r] represents the original pixel value of the pixel point r.
  • D is the quantization distortion result corresponding to the first ALF coefficient
  • C is the first ALF coefficient
  • the autocorrelation coefficient and cross-correlation coefficient of the target area are respectively denoted as E[15][15] and y[15], where 15 is the number of ALF coefficients of the first ALF, and coeff[i] is the first ALF coefficient.
  • the value of dist is D.
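The distortion expression behind Method 1 is not reproduced in the extracted text. In common ALF reference implementations the filtering error takes the quadratic form D = Cᵀ·E·C − 2·Cᵀ·Y up to a constant independent of C; assuming that form, the value of dist can be computed as:

```python
def quant_distortion(E, y, coeff):
    # Assumed closed form: dist = coeff^T * E * coeff - 2 * coeff^T * y
    # (the constant term, which does not depend on coeff, is omitted).
    n = len(coeff)
    quad = sum(coeff[i] * E[i][j] * coeff[j]
               for i in range(n) for j in range(n))
    lin = sum(coeff[i] * y[i] for i in range(n))
    return quad - 2.0 * lin

# Toy example: with E = 2*I and y = [2, 4], the unconstrained optimum is [1, 2];
# a perturbed (e.g. quantized) coefficient vector yields a larger distortion.
E = [[2.0, 0.0], [0.0, 2.0]]
y = [2.0, 4.0]
d_opt = quant_distortion(E, y, [1.0, 2.0])   # -10.0
d_qnt = quant_distortion(E, y, [1.0, 1.0])   # -8.0, i.e. worse
```

This is why Method 1 needs only E and Y rather than another filtering pass: the distortion of any candidate quantized coefficient set can be evaluated directly from the precomputed statistics.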
  • Method 2: use the first ALF coefficient to filter the target area of the reconstructed image, and determine the quantization distortion result corresponding to the first ALF coefficient according to the difference between the filtered pixel values and the original pixel values of the pixels in the target area.
  • the first ALF is used to filter the target area of the reconstructed image under the first component to obtain the filtered target area.
  • the filtered pixel values of the pixels in the filtered target area are compared with the original pixel values to obtain the difference of the pixel values, and the quantization distortion result corresponding to the first ALF coefficient is determined according to the difference of the pixel values. For example, the larger the difference value, the larger the quantization distortion corresponding to the first ALF coefficient; the smaller the difference value, the smaller the quantization distortion corresponding to the first ALF coefficient.
  • the quantization distortion result corresponding to the first ALF coefficient is determined.
  • the quantization cost of the first quantization scale is determined according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed for encoding the first ALF coefficient.
  • J is the quantization cost of the first quantization scale;
  • D is the quantization distortion result corresponding to the first ALF coefficient;
  • R is the number of bits consumed for encoding the first ALF coefficient;
  • λ is a variable value.
  • R is an estimate of the number of bits consumed to encode the first ALF coefficient.
  • Table 1:
    Coefficient value    Codeword    Bits consumed
    0                    1           1
    1                    010         3
    -1                   011         3
    2                    00100       5
    -2                   00101       5
    3                    00110       5
    -3                   00111       5
  • for example, if the filter coefficient is 0, the number of bits consumed is 1 bit; if the filter coefficient is 3, the number of bits consumed is 5 bits. Table 1 is looked up to obtain the number of bits consumed by encoding each ALF coefficient among the first ALF coefficients, and the sum of the bits consumed by all the coefficients is taken as the value of R and substituted into the above formula (9) to obtain the quantization cost of the first quantization scale.
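The codeword lengths in Table 1 match signed order-0 Exp-Golomb coding; this correspondence is an observation about the table, not a statement from the text. Under that assumption, the rate term R of formula (9) can be estimated without an explicit lookup table:

```python
def se_golomb_bits(v):
    # Codeword length of v under signed order-0 Exp-Golomb coding;
    # reproduces the "bits consumed" column of Table 1.
    code_num = 2 * abs(v) - (1 if v > 0 else 0)   # signed-to-unsigned mapping
    return 2 * (code_num + 1).bit_length() - 1

def rate_bits(coeffs):
    # R: total bits consumed by encoding a set of integer ALF coefficients.
    return sum(se_golomb_bits(c) for c in coeffs)

# rate_bits([0, 1, -2]) -> 1 + 3 + 5 = 9 bits
```

Because a larger quantization scale produces larger integer coefficients, it directly inflates R, which is the trade-off that formula (9) balances against distortion.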
  • the quantization cost corresponding to each first quantization scale in the quantization interval formed by the maximum quantization scale and the preset minimum quantization scale can be determined.
  • then, the target quantization scale of the floating-point ALF coefficients of the first ALF is determined.
  • for example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point ALF coefficients of the first ALF.
  • the ALF coefficients of the floating point type of the first ALF are quantized into the ALF coefficients of the integer type:
  • AlfCoeff integer round(AlfCoeff float ⁇ 2 Alfshift ) (10)
  • AlfCoeff_integer is an ALF coefficient of the integer type;
  • AlfCoeff_float is a floating-point ALF coefficient of the first ALF;
  • round is a rounding operation;
  • 2^Alfshift is the target quantization scale.
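Putting S404 and S405 together, the choice of the target shift Alfshift by minimizing J = D + λ·R over candidate scales can be sketched end to end. The distortion closed form, the Exp-Golomb-style rate model, and λ are all illustrative assumptions rather than the patent's normative definitions:

```python
def pick_alf_shift(coeffs_f, E, y, p_min, p_max, lam):
    # Choose the shift p (target quantization scale 2**p) with the smallest
    # cost J = D + lam * R; returns (Alfshift, integer ALF coefficients).
    def golomb_bits(v):                        # Table-1-style rate model
        code_num = 2 * abs(v) - (1 if v > 0 else 0)
        return 2 * (code_num + 1).bit_length() - 1

    def distortion(c):                         # assumed form: c^T E c - 2 c^T y
        n = len(c)
        quad = sum(c[i] * E[i][j] * c[j] for i in range(n) for j in range(n))
        return quad - 2.0 * sum(c[i] * y[i] for i in range(n))

    best = None
    for p in range(p_min, p_max + 1):
        q = [round(c * (1 << p)) for c in coeffs_f]   # formula (5)/(10)
        deq = [v / (1 << p) for v in q]               # de-quantize to measure D
        cost = distortion(deq) + lam * sum(golomb_bits(v) for v in q)
        if best is None or cost < best[0]:
            best = (cost, p, q)
    return best[1], best[2]
```

With a toy E, Y and a small λ, the search settles on the smallest shift that still represents the coefficients accurately, trading precision against code length exactly as S404 describes.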
  • S406 Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
  • the ALF coefficients of the integer type of the first ALF are encoded, carried in the code stream, and sent to the decoding end.
  • the decoding end decodes the residual value of the current image from the code stream, determines the reconstructed image of the current image according to the residual value, and decodes the integer type ALF coefficient of the first ALF from the code stream.
  • the integer-type ALF coefficients are used to filter the target area in the reconstructed image under the first component using the first ALF to improve the accuracy of the reconstructed image.
  • all integer-type ALF coefficients of the first ALF are included in the codestream.
  • the decoding end can directly decode all integer-type ALF coefficients of the first ALF from the code stream for subsequent filtering, without additional calculation of the ALF coefficients, thereby improving the filtering efficiency of the decoding end.
  • the code stream includes integer-type second ALF coefficients of the first ALF and first information
  • the second ALF coefficients are integer-type ALF coefficients corresponding to non-central points in the first ALF
  • the first information is used to indicate the target quantization scale.
  • the decoding end needs to determine the first ALF coefficient of the first ALF according to the second ALF coefficients and the first information, and then uses the first ALF to filter the target area in the reconstructed image under the first component according to the first ALF coefficient and the second ALF coefficients; the manner in which the first ALF coefficient of the first ALF is determined according to the second ALF coefficients and the first information may refer to the specific description of the decoding end, which will not be repeated here.
  • the first information includes a target shift value corresponding to the target quantization scale. For example, if the target quantization scale is 2^Alfshift, the target shift value is Alfshift.
  • alternatively, the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and a first numerical value. For example, if the target shift value corresponding to the target quantization scale is Alfshift and the first value is a, the first information includes |Alfshift - a|.
  • the first value is 6, 5, or 4, etc.
  • adaptive_filter_shape_enable_flag is the adaptive correction filter shape flag, which is a binary variable.
  • a value of '1' indicates that the adaptive correction filtering can use a 7×7 cross plus a 5×5 square filter shape; a value of '0' indicates that the adaptive correction filtering uses a 7×7 cross plus a 3×3 square filter shape.
  • adaptive_filter_shift_enable_flag is the adaptive modification filter shift enable flag, which is a binary variable.
  • a value of '1' indicates that the solution of the present application is adopted, and the shift value of the adaptive correction filter is variable; a value of '0' indicates that the solution of the present application is not adopted, and the shift value of the adaptive correction filter does not change, for example, it remains 6.
  • AlfShiftEnableFlag is equal to the value of adaptive_filter_shift_enable_flag.
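The flag's effect on parsing can be sketched as follows; read_flag and read_ue are hypothetical stand-ins for real bitstream readers:

```python
# Minimal sketch of flag-gated parsing: when AlfShiftEnableFlag is 0 the
# shift keeps its default value of 6; when 1, a shift delta is read from
# the stream. read_flag/read_ue are illustrative stand-ins (assumptions).

def parse_alf_shift(read_flag, read_ue, default_shift=6):
    alf_shift_enable_flag = read_flag()      # AlfShiftEnableFlag
    if alf_shift_enable_flag:
        # e.g. a value like alf_luma_shift_num_minus6, plus the default
        return read_ue() + default_shift
    return default_shift
```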
• the first information in the code stream includes the shift information corresponding to the luminance component and the shift information corresponding to the chrominance component, as shown in Table 3 below:
• alf_luma_shift_num_minus6[i] indicates the coefficient shift of the adaptive correction filter corresponding to the luminance component samples of the image.
  • the luminance component may correspond to multiple adaptive correction filters, for example, 16 or 64.
  • alf_coeff_luma[i][j] represents the j-th ALF coefficient of the i-th adaptive correction filter corresponding to the luminance component, and the j-th ALF coefficient can be understood as the above-mentioned second ALF coefficient.
  • alf_luma_shift_num_minus6[i] represents the quantization scale shift value of the i-th adaptive correction filter corresponding to the luminance component minus 6, where 6 can be understood as the above-mentioned first value.
• AlfLumaShift[i] = alf_luma_shift_num_minus6[i] + 6.
  • alf_coeff_chroma[0][j] represents the jth ALF parameter of the adaptive correction filter corresponding to the Cb component.
  • the jth ALF coefficient can be understood as the above-mentioned second ALF coefficient.
  • alf_chroma_shift_num_minus6[0] represents the quantization scale shift value of the adaptive correction filter corresponding to the Cb component minus 6.
• AlfChromaShift[0] = alf_chroma_shift_num_minus6[0] + 6.
  • alf_coeff_chroma[1][j] represents the jth ALF parameter of the adaptive correction filter corresponding to the Cr component.
  • the jth ALF coefficient can be understood as the above-mentioned second ALF coefficient.
  • alf_chroma_shift_num_minus6[1] represents the quantization scale shift value of the adaptive correction filter corresponding to the Cr component minus 6.
• AlfChromaShift[1] = alf_chroma_shift_num_minus6[1] + 6.
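The Table-3 semantics above can be sketched as follows, with a plain dict standing in for the parsed syntax structure (an assumption of this sketch):

```python
# Sketch of the Table-3 semantics: each signalled value is the shift
# minus 6, and the decoder derives AlfLumaShift[i] / AlfChromaShift[c].

def derive_alf_shifts(syntax: dict, num_luma_filters: int) -> dict:
    luma = [syntax["alf_luma_shift_num_minus6"][i] + 6
            for i in range(num_luma_filters)]       # AlfLumaShift[i]
    chroma = [syntax["alf_chroma_shift_num_minus6"][c] + 6
              for c in range(2)]                    # AlfChromaShift[0]=Cb, [1]=Cr
    return {"AlfLumaShift": luma, "AlfChromaShift": chroma}
```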
• the present application further includes: filtering the target region of the reconstructed image under the first component using the integer-type ALF coefficients of the first ALF.
• I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area
• (x, y) is the position relative to the current point
• W_j is the j-th integer-type ALF coefficient of the first ALF
• W_{n−1} is the integer-type ALF coefficient corresponding to the center point of the first ALF
• I_rec(0,0) is the reconstructed pixel value of the current point
• AlfShift is the target shift value corresponding to the target quantization scale
• n is the number of ALF coefficients included in the first ALF.
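The filtering operation these variables describe can be sketched as follows, assuming the usual point-symmetric form in which each signalled non-center coefficient weights a pair of mirrored neighbours, with a rounding offset before the right shift. This is an illustrative Python sketch under those assumptions, not the normative formula:

```python
def alf_filter_sample(rec, cx, cy, offsets, w, alf_shift, max_val=255):
    """Filter one sample at (cx, cy).

    rec      : 2-D list of reconstructed samples (assumed padded so all
               accessed neighbours exist)
    offsets  : list of (x, y) offsets for the n-1 non-center coefficients
    w        : integer ALF coefficients W_0..W_{n-1}; w[-1] is the center tap
    alf_shift: target shift value, i.e. the quantization scale is 2**alf_shift
    """
    acc = w[-1] * rec[cy][cx]                      # W_{n-1} * I_rec(0,0)
    for (x, y), wj in zip(offsets, w[:-1]):
        # point-symmetric taps share one signalled coefficient
        acc += wj * (rec[cy + y][cx + x] + rec[cy - y][cx - x])
    acc = (acc + (1 << (alf_shift - 1))) >> alf_shift   # round, then shift
    return min(max(acc, 0), max_val)               # clip to sample range
```

On a flat area, coefficients whose taps sum to 2**alf_shift leave the sample value unchanged, which is the expected unity DC gain.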
• when the shape of the first ALF is a 7×7 cross plus a 3×3 square, as shown in FIG. 5A, the first ALF includes 9 ALF coefficients; the correspondence between (x, y) and W_j is shown in Table 4 below:
• when the shape of the first ALF is a 7×7 cross plus a 5×5 square, as shown in FIG. 5B, the first ALF includes 15 ALF coefficients; the correspondence between (x, y) and W_j is shown in Table 5 below:
• the encoder may first encode the integer-type ALF coefficients of the first ALF and then use them to filter the target area of the reconstructed image under the first component; or first use the integer-type ALF coefficients of the first ALF to filter the target area and then encode them; or encode the integer-type ALF coefficients of the first ALF while filtering the target area of the reconstructed image under the first component.
• in summary, the floating-point ALF coefficients of the first ALF are determined first.
  • the maximum quantization scale of the floating-point type ALF coefficients of the first ALF is determined.
  • the target quantization scale of the floating-point type ALF coefficients of the first ALF is determined.
  • the ALF coefficients of the floating point type of the first ALF are quantized into ALF coefficients of the integer type.
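The quantization step can be sketched as follows. The value ranges follow the 7-bit coefficient widths mentioned later in this text (−64 to 63 for non-center coefficients, 0 to 127 for the center coefficient), while the exact rounding and clipping order is an assumption of this sketch:

```python
def quantize_alf_coeffs(float_coeffs, target_shift, non_center_min=-64,
                        non_center_max=63, center_max=127):
    """Quantize floating-point ALF coefficients with scale 2**target_shift.
    float_coeffs[-1] is taken to be the center coefficient."""
    scale = 1 << target_shift
    q = [int(round(c * scale)) for c in float_coeffs]
    out = [min(max(v, non_center_min), non_center_max) for v in q[:-1]]
    out.append(min(max(q[-1], 0), center_max))   # center tap is unsigned
    return out
```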
• compared with the current practice of using a fixed quantization scale to quantize the floating-point ALF coefficients of the first ALF, the present application takes into account that the ranges of the floating-point ALF coefficients derived from different regions or different frames may differ, so the filter gains may also differ. Therefore, a variable quantization scale is used to quantize the ALF coefficients corresponding to different target regions, so that the quantized ALF coefficients corresponding to different target regions achieve a better balance between the overhead of coding the filter coefficients and the filter gain.
• Tongjia 4K (common test Class A) denotes the 4K-resolution test videos, with a resolution of 3840×2160; Tongyi 1080p (common test Class B) denotes the 1080p test videos, with a resolution of 1920×1080; Tongbing 720p (common test Class C) denotes the 720p test videos, with a resolution of 1280×720.
• EncT is the ratio of the encoding time of the present application to the encoding time of the original algorithm, and DecT is the ratio of the decoding time of the present application to the decoding time of the original algorithm.
• BD-Rate is one of the main metrics for evaluating the performance of a video coding algorithm; it indicates the change in bit rate and PSNR (Peak Signal-to-Noise Ratio) of the video encoded by the new algorithm (that is, the technical solution of the present application) relative to the original algorithm, i.e., the change in code rate of the new algorithm relative to the original algorithm at the same signal-to-noise ratio. A "−" value indicates a performance improvement, such as an improvement in bit rate and PSNR performance.
• Table 6 shows that, under the All Intra configuration, the average BD-rate changes on the Y, Cb, and Cr components are -0.14%, 0.01%, and -0.00%, respectively.
• the test sequences required by AVS3 are also tested under the combined intra-frame and inter-frame (Random Access) configuration, and the test results are shown in Table 7:
• the computational complexity of the encoding end increases only slightly while the decoding-end complexity remains unchanged, which brings additional performance gains to the ALF design.
  • FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application. Taking the first component as a luminance component as an example, as shown in FIG. 6 , it includes:
• according to a preset area division rule, area division is performed on the reconstructed image under the luminance component to obtain a target area to be filtered.
• for example, the preset area division rule is to divide the reconstructed image under the luminance component into 16 areas, so the reconstructed image under the luminance component is divided into 16 areas according to a preset division method (e.g., uniform division).
• alternatively, the preset area division rule is to divide the reconstructed image under the luminance component into 64 areas, so the reconstructed image under the luminance component is divided into 64 areas according to a preset division method (e.g., uniform division).
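A uniform division into 16 or 64 regions, as described above, can be sketched as follows; the actual division rule of the codec may differ, so this grid-based version is an illustrative assumption:

```python
import math

def divide_into_regions(width, height, num_regions):
    """Uniformly divide a luma picture into num_regions, assumed to be a
    perfect square (16 -> 4x4 grid, 64 -> 8x8 grid). Returns a list of
    (x0, y0, x1, y1) rectangles covering the picture; boundary regions
    absorb any remainder."""
    n = math.isqrt(num_regions)
    assert n * n == num_regions, "sketch assumes a square grid"
    xs = [width * i // n for i in range(n + 1)]
    ys = [height * j // n for j in range(n + 1)]
    return [(xs[i], ys[j], xs[i + 1], ys[j + 1])
            for j in range(n) for i in range(n)]
```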
• S604. Determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the floating-point ALF coefficients of the first ALF.
  • S605. Determine, according to the maximum quantization scale, the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
• S607. Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
• S608 and S607 are not limited in execution order: S608 may be executed before S607, after S607, or simultaneously with S607.
• the present application may use the variable quantization scale of the present application to perform ALF coefficient quantization only on the luminance component of the reconstructed image, while the chrominance component of the reconstructed image still uses the current fixed quantization scale, such as 2^6.
  • the encoder carries the variable quantization scale information corresponding to the luminance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the chrominance component in the code stream.
  • the definition of adaptive correction filtering parameters is shown in Table 8.
• the variable quantization scale information corresponding to the luminance component is parsed from Table 8; the variable quantization scale information may be the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first value. In this way, the decoding end can determine the first ALF coefficient of the first ALF according to the second ALF coefficient corresponding to the luminance component (i.e., alf_coeff_luma[i][j]) and this absolute value, and then use the first ALF to filter the target area in the reconstructed image under the luminance component according to the first ALF coefficient and the second ALF coefficient of the first ALF, so as to improve the filtering accuracy.
• since no variable quantization scale information corresponding to the chrominance component is parsed from Table 8, the decoding end uses the fixed quantization scale information, such as a fixed shift value of 6, to filter the reconstructed image under the chrominance component.
• specifically, the decoding end can determine the first ALF coefficient of the first ALF according to the second ALF coefficient corresponding to the chrominance component in Table 8 (i.e., alf_coeff_chroma[0][j] or alf_coeff_chroma[1][j]) and the fixed shift value 6, and then use the first ALF to filter the target area in the reconstructed image under the chrominance component according to the first ALF coefficient and the second ALF coefficient, so as to improve the filtering accuracy.
  • FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application. Taking the first component as a luminance component as an example, as shown in FIG. 7 , the method includes:
• the reconstructed image includes chrominance components, wherein the chrominance components include a first chrominance component and/or a second chrominance component; the first chrominance component is, for example, U or Cb, and the second chrominance component is, for example, V or Cr.
• S703. Determine the floating-point ALF coefficients of the first ALF when the target area under the chrominance component is filtered using the first ALF.
  • S704. Determine the maximum quantization scale of the ALF coefficients of the floating-point number type of the first ALF according to the ALF coefficients of the floating-point number type of the first ALF.
  • S705. Determine, according to the maximum quantization scale, the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
• S707. Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
• S708 and S707 are not limited in execution order: S708 may be executed before S707, after S707, or simultaneously with S707.
• the present application may use the variable quantization scale of the present application to perform ALF coefficient quantization only on the first chrominance component of the reconstructed image, while the second chrominance component and the luminance component of the reconstructed image still use the current fixed quantization scale, e.g., 2^6.
• the encoder carries the variable quantization scale information corresponding to the first chrominance component in the code stream and sends it to the decoder, but does not carry, in the code stream, the quantization scales corresponding to the second chrominance component and the luminance component.
• the definition of adaptive correction filtering parameters is shown in Table 9:
  • variable quantization scale information corresponding to the first chrominance component is parsed from Table 9, and the ALF coefficient of the first ALF is determined according to the variable quantization scale information corresponding to the first chrominance component, And according to the determined ALF coefficient of the first ALF, the first ALF is used to filter the reconstructed image under the first chrominance component, so as to improve the filtering precision.
  • the decoding end uses an existing method to determine the ALF coefficient corresponding to the second chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the second chrominance component and/or the luminance component.
• alternatively, the present application may use the variable quantization scale of the present application to perform ALF coefficient quantization only on the second chrominance component of the reconstructed image, while the first chrominance component and the luminance component of the reconstructed image still use the current fixed quantization scale, e.g., 2^6.
• the encoder carries the variable quantization scale information corresponding to the second chrominance component in the code stream and sends it to the decoder, but does not carry, in the code stream, the quantization scales corresponding to the first chrominance component and the luminance component.
  • the definition of adaptive correction filtering parameters is shown in Table 10:
  • variable quantization scale information corresponding to the second chrominance component is parsed from Table 10, and the ALF coefficient of the first ALF is determined according to the variable quantization scale information corresponding to the second chrominance component, And according to the determined ALF coefficient of the first ALF, the first ALF is used to filter the reconstructed image under the second chrominance component, so as to improve the filtering precision.
  • the decoding end uses an existing method to determine the ALF coefficient corresponding to the first chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the first chrominance component and/or the luminance component.
• in addition, the present application may use the variable quantization scale of the present application to perform ALF coefficient quantization only on the chrominance components of the reconstructed image (including the first chrominance component and the second chrominance component), while the luminance component of the reconstructed image still uses the current fixed quantization scale, for example, 2^6.
  • the encoder carries the variable quantization scale information corresponding to the chrominance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the luminance component in the code stream.
  • the definition of adaptive correction filtering parameters is shown in Table 11.
  • the decoding end parses out the variable quantization scale information corresponding to the first chrominance component and the second chrominance component from Table 11.
  • the ALF coefficient corresponding to the first chrominance component is determined according to the variable quantization scale information corresponding to the first chrominance component, and the reconstructed image under the first chrominance component is filtered according to the determined ALF coefficient.
  • the ALF coefficient corresponding to the second chrominance component is determined according to the variable quantization scale information corresponding to the second chrominance component, and the reconstructed image under the second chrominance component is filtered according to the determined ALF coefficient.
  • the decoding end uses the existing method to determine the ALF coefficient corresponding to the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the luminance component.
  • the video encoding method involved in the embodiments of the present application is described above. Based on this, the following describes the video decoding method involved in the present application for the decoding end.
  • FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application. As shown in FIG. 8 , the method of the embodiment of the present application includes:
• the entropy decoding unit 310 in the decoder can parse the code stream to obtain the prediction information, the quantization coefficient matrix, and the like of the current block in the current image, and the prediction unit 320 produces a prediction block for the current block using intra prediction or inter prediction.
• the inverse quantization/transform unit 330 performs inverse quantization and inverse transform on the quantization coefficient matrix obtained from the code stream to obtain a residual block.
  • the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
  • the reconstructed blocks of other image blocks in the current image can be obtained, and each reconstructed block constitutes a reconstructed image.
  • the reconstructed image includes a first component, which is a luminance component or a chrominance component.
• if the adaptive correction filter switch corresponding to the target area of the reconstructed image under the first component is on (for example, the control signal of the switch is 1), the target area of the reconstructed image under the first component is filtered using the first ALF; if the switch is off (for example, the control signal of the switch is 0), adaptive correction filtering is not performed on the target area of the reconstructed image under the first component.
  • the code stream is decoded to obtain the ALF coefficient information of the first ALF carried in the code stream.
  • the target region in the reconstructed image under the first component is filtered using the first ALF according to the ALF coefficient information of the first ALF.
  • the ALF coefficient information of the first ALF includes all integer-type ALF coefficients of the first ALF.
  • the decoding end directly uses the first ALF to filter the target region in the reconstructed image under the first component according to all integer-type ALF coefficients of the first ALF decoded, and the filtering process is simple.
• the ALF coefficient information of the first ALF includes the integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF, and the first information is used to indicate the target quantization scale.
  • the first information includes a target shift value corresponding to the target quantization scale.
  • the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
  • the above S804 includes the following steps S804-A1 and S804-A2:
• the code stream includes the second ALF coefficient of the first ALF and the target quantization scale information, so that the first ALF coefficient corresponding to the center position of the first ALF can be determined according to the second ALF coefficient of the first ALF and the target quantization scale information.
  • the first ALF coefficient of the first ALF can be determined according to the following formula (12):
  • AlfCoeffLuma[i][k] is the first ALF coefficient of the first ALF, and k is the number of ALF coefficients included in the first ALF minus one.
  • AlfLumaShift[i] is the target quantization scale corresponding to the first ALF
• AlfCoeffLuma[i][j] is the j-th second ALF coefficient in the first ALF.
• the decoder decodes the second ALF coefficients of the first ALF from the code stream, determines the first ALF coefficient according to the above formula (12), and uses the first ALF to filter the target area in the reconstructed image under the first component, thereby improving the accuracy of the reconstructed image.
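Formula (12) is not reproduced in this extract, but the derivation it performs is consistent with the usual DC-gain constraint that all filter taps sum to the quantization scale, with each signalled non-center coefficient counted twice due to point symmetry. A sketch under that assumption:

```python
def derive_center_coeff(second_coeffs, alf_shift):
    """Derive the center (first) ALF coefficient from the signalled
    non-center (second) coefficients, assuming the taps sum to
    2**alf_shift and each symmetric non-center tap counts twice.
    This mirrors the usual AVS3-style construction; the exact normative
    formula (12) is not reproduced in this text."""
    return (1 << alf_shift) - 2 * sum(second_coeffs)
```

With the coefficients derived this way, a flat region passes through the filter unchanged, which is the purpose of the constraint.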
  • the decoder filters the target region in the reconstructed image under the first component according to the following formula (13):
• I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area
• (x, y) is the position relative to the current point
• W_j is the j-th integer-type ALF coefficient of the first ALF
• W_{n−1} is the integer-type ALF coefficient corresponding to the center point of the first ALF
• I_rec(0,0) is the reconstructed pixel value of the current point
• AlfShift is the target shift value corresponding to the target quantization scale
• n is the number of ALF coefficients included in the first ALF.
• on the one hand, the filtered reconstructed image can be sent to the display device for display; on the other hand, the filtered reconstructed image can be stored in the decoded picture buffer to serve as a reference frame for inter prediction of subsequent frames.
  • FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application.
  • the method of the embodiment of the present application includes:
• S903. Divide the reconstructed image under the luminance component into regions to obtain a target area of the reconstructed image under the luminance component, where the target area is an area using adaptive correction filtering.
  • the ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF,
  • the first information is used to indicate the target quantization scale.
• obtain the shape information of the first ALF by parsing the code stream, and determine the type of the first ALF according to the shape information of the first ALF. If the shape of the first ALF is a 7×7 cross plus a 3×3 square filter shape, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7×7 cross plus a 5×5 square filter shape, the first ALF includes 15 ALF coefficients.
• S906. Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer-type ALF coefficient corresponding to the center point of the first ALF.
  • AlfCoeffLuma[i][8] is the first ALF coefficient of the first ALF
• i is the index of the ALF corresponding to the luminance component (the luminance component may correspond to, for example, 16 or 64 ALFs)
• AlfCoeffLuma[i][j] is the j-th second ALF coefficient of the i-th first ALF
• alf_filter_num_minus1 is the number of ALFs corresponding to the luminance component minus one, and can be understood as the aforementioned p.
  • the bit width of AlfCoeffLuma[i][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[i][8] is 0 to 127.
  • the decoder filters the target area in the reconstructed image under the first component according to the following formula (16):
• I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area
• (x, y) is the position relative to the current point
• W_j is the j-th second ALF coefficient of the first ALF
• W_{n−1} is the first ALF coefficient
• I_rec(0,0) is the reconstructed pixel value of the current point
• AlfShift is the target shift value corresponding to the target quantization scale
• n is the number of ALF coefficients included in the first ALF.
  • FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application.
  • the method of the embodiment of the present application includes:
  • the ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF,
  • the first information is used to indicate the target quantization scale.
• S160. Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer-type ALF coefficient corresponding to the center point of the first ALF.
  • the first ALF coefficient of the first ALF is determined according to the following method.
  • AlfCoeffLuma[0][8] is the first ALF coefficient of the first ALF
  • AlfCoeffLuma[0][j] is the jth ALF coefficient of the first ALF
• j ranges from 0 to 7.
  • the bit width of AlfCoeffLuma[0][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[0][8] is 0 to 127.
  • the first ALF coefficient of the first ALF is determined according to the following method.
  • AlfCoeffLuma[1][8] is the first ALF coefficient of the first ALF
  • AlfCoeffLuma[1][j] is the jth ALF coefficient of the first ALF
• j ranges from 0 to 7.
  • the bit width of AlfCoeffLuma[1][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[1][8] is 0 to 127.
• on the one hand, the filtered reconstructed image can be sent to the display device for display; on the other hand, the filtered reconstructed image can be stored in the decoded picture buffer to serve as a reference frame for inter prediction of subsequent frames.
  • FIG. 4 to FIG. 10 are only examples of the present application, and should not be construed as limiting the present application.
• in another embodiment, the present application sets the quantization scale corresponding to the luminance component to be larger than that corresponding to the chrominance component; for example, the quantization scale corresponding to the luminance component is fixed to 2^7 and that corresponding to the chrominance component is fixed to 2^6. Adjusting the fixed quantization scales in this way can also improve filtering performance, and no bit overhead is required to encode a variable quantization scale.
• the size of the sequence numbers of the above processes does not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
• the term "and/or" describes only an association relationship between associated objects, indicating that three kinds of relationships may exist. Specifically, A and/or B can represent three situations: A exists alone, both A and B exist, and B exists alone.
  • the character "/" in this document generally indicates that the related objects are an "or" relationship.
  • FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
  • the video encoder 10 includes:
  • an obtaining unit 110 configured to obtain a reconstructed image of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
  • a first determining unit 120 configured to determine, when filtering the target region in the reconstructed image under the first component using the first adaptive correction filter ALF, an ALF coefficient of the floating point type of the first ALF ;
  • a second determining unit 130 configured to determine the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF according to the ALF coefficients of the floating point type of the first ALF;
  • a third determining unit 140 configured to determine, according to the maximum quantization scale, a target quantization scale of the floating-point type ALF coefficient of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
  • a quantization unit 150 configured to use the target quantization scale to quantize the floating-point ALF coefficients of the first ALF into integer-type ALF coefficients
  • the encoding unit 160 is configured to encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
• the second determining unit 130 is specifically configured to determine, according to a first coefficient in the floating-point ALF coefficients of the first ALF and a maximum integer coefficient threshold corresponding to the first coefficient, the maximum shift value allowed when the floating-point ALF coefficients are quantized into integer type, and to determine, according to the maximum shift value, the maximum quantization scale of the floating-point ALF coefficients of the first ALF.
  • the first coefficient is a floating-point type coefficient corresponding to the center position of the first ALF.
  • the second determining unit 130 is specifically configured to determine the maximum shift value according to the following formula:
• where d is 127.
  • the second determining unit 130 is specifically configured to determine the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF according to the following formula:
• Scale_max is the maximum quantization scale.
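The computation described above (the maximum shift value allowed before the quantized first coefficient exceeds the threshold d, and the maximum quantization scale Scale_max derived from it) can be sketched as follows. The patent's closed-form formula is not reproduced in this extract, so this illustrative version searches for the shift directly:

```python
def max_shift_value(first_coeff: float, d: int = 127, cap: int = 15) -> int:
    """Largest shift s (up to cap) such that the quantized first (center)
    coefficient round(first_coeff * 2**s) still fits within d."""
    s = 0
    while s < cap and round(abs(first_coeff) * (1 << (s + 1))) <= d:
        s += 1
    return s

def max_quant_scale(first_coeff: float, d: int = 127) -> int:
    # Scale_max = 2**shift_max
    return 1 << max_shift_value(first_coeff, d)
```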
• the third determining unit 140 is specifically configured to determine the first quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the first quantization scale is smaller than or equal to the maximum quantization scale.
• the third determining unit 140 is further configured to determine the second quantization scale as the target quantization scale of the floating-point ALF coefficients of the second ALF, where the second ALF is a filter used when performing ALF filtering on the reconstructed image under the chrominance component, and the second quantization scale is smaller than the first quantization scale.
• the third determining unit 140 is specifically configured to determine the third quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the third quantization scale is smaller than the maximum quantization scale.
• the third determining unit 140 is further configured to determine the fourth quantization scale as the target quantization scale of the floating-point ALF coefficients of the third ALF, where the third ALF is a filter used when performing ALF filtering on the reconstructed image, and the fourth quantization scale is larger than the third quantization scale.
  • the third determining unit 140 is specifically configured to determine, for each first quantization scale in the quantization interval formed by the maximum quantization scale and a preset minimum quantization scale, the quantization cost of the first quantization scale, where the first quantization scale is a positive integer power of 2; and determine, according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point ALF coefficients of the first ALF.
  • the third determining unit 140 is specifically configured to determine the first quantization scale with the smallest quantization cost as the target quantization scale of the floating-point ALF coefficients of the first ALF.
  • the third determining unit 140 is specifically configured to: quantize, using the first quantization scale, the floating-point ALF coefficients of the first ALF into integer-type first ALF coefficients; determine the quantization distortion result corresponding to the first ALF coefficients when the first ALF coefficients are used to encode the target area of the reconstructed image; and determine the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficients and the number of bits consumed by encoding the first ALF coefficients.
  • the third determining unit 140 is specifically configured to: determine the autocorrelation coefficient of the pixels in the target area according to the reconstructed pixel values of the pixels in the target area; determine the cross-correlation coefficient of the pixels in the target area according to the reconstructed pixel values and the original pixel values of the pixels in the target area; and determine the quantization distortion result corresponding to the first ALF coefficients according to the product of the autocorrelation coefficient of the pixels in the target area and the first ALF coefficients, and the cross-correlation coefficient of the pixels in the target area.
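The closed-form distortion estimate described above can be sketched as a quadratic form in the quantized coefficients, built from the autocorrelation matrix and cross-correlation vector of the target area (a Wiener-filter-style estimate; the constant signal-energy term is dropped here, since it does not affect the comparison between candidate scales). This is a sketch under that assumption, not the patented formula itself:

```python
def quant_distortion(w, autocorr, crosscorr):
    """Error-energy estimate w^T * A * w - 2 * w^T * c (constant term
    omitted), where A is the pixel autocorrelation matrix and c the
    cross-correlation vector of the target area."""
    n = len(w)
    quad = sum(w[i] * autocorr[i][j] * w[j] for i in range(n) for j in range(n))
    lin = sum(w[i] * crosscorr[i] for i in range(n))
    return quad - 2.0 * lin
```

This avoids actually filtering the target area for every candidate scale, which is why the claim distinguishes it from the alternative of filtering and measuring pixel differences directly.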
  • the third determining unit 140 is specifically configured to filter the target area of the reconstructed image using the first ALF coefficients, and determine the quantization distortion result corresponding to the first ALF coefficients according to the difference between the filtered pixel values and the original pixel values of the pixels in the target area.
  • the third determining unit 140 is specifically configured to determine the quantization cost of the first quantization scale according to the following formula: J = D + λ·R
  • the J is the quantization cost of the first quantization scale
  • the D is the quantization distortion result corresponding to the first ALF coefficient
  • the R is the number of bits consumed by encoding the first ALF coefficient
  • the λ is a variable value.
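Putting the pieces together, choosing the target quantization scale amounts to minimizing the Lagrangian cost J = D + λ·R over candidate power-of-two scales between the preset minimum and the maximum. The distortion and rate callbacks below are placeholders for the encoder's own estimates; this is a sketch, not the normative procedure:

```python
def choose_target_scale(coeffs, scale_min, scale_max, lam, distortion_fn, rate_fn):
    """Pick the power-of-two scale in [scale_min, scale_max] with the
    smallest Lagrangian cost J = D + lam * R."""
    best_scale, best_cost = scale_min, float("inf")
    scale = scale_min
    while scale <= scale_max:
        q = [round(c * scale) for c in coeffs]           # quantize to integers
        cost = distortion_fn(q, scale) + lam * rate_fn(q)  # J = D + lam * R
        if cost < best_cost:
            best_scale, best_cost = scale, cost
        scale *= 2                                       # next power-of-two scale
    return best_scale
```

With λ near zero the search favors large scales (low distortion); with a large λ it favors small scales (fewer coefficient bits), which is exactly the trade-off the unit is balancing.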
  • the code stream includes all integer-type ALF coefficients of the first ALF.
  • the code stream includes a second ALF coefficient of an integer type of the first ALF and first information, where the second ALF coefficient is the integer-type ALF coefficient corresponding to a non-center point in the first ALF, and the first information is used to indicate the target quantization scale.
  • the first information includes a target shift value corresponding to the target quantization scale.
  • the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
  • the first numerical value is 6, 5 or 4.
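A toy sketch of this signalling option. The claim transmits only the absolute difference |AlfShift − base| with base ∈ {6, 5, 4}; the sign flag in the decoder below is an added assumption to make the round trip well-defined, not something stated in the text:

```python
def encode_shift_delta(alf_shift: int, base: int = 6) -> int:
    """Send |AlfShift - base| instead of AlfShift itself; shifts are
    expected to cluster around the base value, so the delta is small."""
    return abs(alf_shift - base)

def decode_shift_delta(delta: int, positive: bool, base: int = 6) -> int:
    # 'positive' is a hypothetical sign flag (an assumption of this sketch)
    return base + delta if positive else base - delta
```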
  • the encoding unit 160 is further configured to use the integer-type ALF coefficients of the first ALF to filter the target area of the reconstructed image under the first component.
  • the encoding unit 160 is specifically configured to filter the target area of the reconstructed image under the first component according to the following formula:
  • the I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area
  • the (x, y) is the position relative to the current point
  • the W_j is the j-th integer-type ALF coefficient of the first ALF
  • the W_{n-1} is the integer-type ALF coefficient corresponding to the center point of the first ALF
  • the I_rec(0,0) is the reconstructed pixel value of the current point
  • the AlfShift is the target shift value corresponding to the target quantization scale
  • the n is the number of ALF coefficients included in the first ALF.
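The filtering formula itself appears as an image in the published application. The sketch below is a hedged reconstruction assuming the usual centrally-symmetric ALF form: each non-center coefficient W_j weights a mirrored pair of pixels around the current point, the center coefficient W_{n-1} weights the current pixel, and the accumulated sum is rounded and shifted down by AlfShift:

```python
def alf_filter_pixel(rec, x0, y0, weights, offsets, alf_shift):
    """Filter one pixel: weights[:-1] are the non-center coefficients,
    offsets their (dx, dy) positions relative to the current point,
    weights[-1] the center coefficient, alf_shift the target shift."""
    acc = 0
    for w, (dx, dy) in zip(weights[:-1], offsets):
        # symmetric taps: each coefficient weights a mirrored pixel pair
        acc += w * (rec[y0 + dy][x0 + dx] + rec[y0 - dy][x0 - dx])
    acc += weights[-1] * rec[y0][x0]   # center coefficient times current pixel
    acc += 1 << (alf_shift - 1)        # rounding offset
    return acc >> alf_shift
```

The right shift by AlfShift is what undoes the quantization scale 2^AlfShift applied to the coefficients at the encoder.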
  • the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here.
  • the video encoder 10 shown in FIG. 11 can perform the methods of the embodiments of the present application, and the foregoing and other operations and/or functions of the units in the video encoder 10 are respectively intended to implement the corresponding processes in the methods 400, 600 and 700; for the sake of brevity, they are not repeated here.
  • FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of the present application.
  • the video decoder 20 may include:
  • the first decoding unit 210 is used for decoding the code stream to obtain the residual value of the current image
  • a determining unit 220 configured to determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
  • the second decoding unit 230 is configured to decode the code stream to obtain the ALF coefficient information of the first adaptive correction filter ALF corresponding to the target area in the reconstructed image under the first component;
  • the filtering unit 240 is configured to use the first ALF to filter the target area in the reconstructed image under the first component according to the ALF coefficient information of the first ALF.
  • the ALF coefficient information of the first ALF includes all integer-type ALF coefficients of the first ALF.
  • the ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and first information, where the second ALF coefficient is the integer-type ALF coefficient corresponding to a non-center point in the first ALF, and the first information is used to indicate the target quantization scale.
  • the first information includes a target shift value corresponding to the target quantization scale.
  • the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
  • the first numerical value is 6, 5 or 4.
  • the filtering unit 240 is specifically configured to: determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is the integer-type ALF coefficient corresponding to the center point of the first ALF; and filter, using the first ALF, the target area in the reconstructed image under the first component according to the second ALF coefficient and the first ALF coefficient of the first ALF.
  • the filtering unit 240 is specifically configured to determine the first ALF coefficient of the first ALF according to the following formula:
  • the AlfCoeffLuma[i][k] is the first ALF coefficient of the i-th first ALF
  • the k is the number of ALF coefficients included in the first ALF minus one
  • the AlfLumaShift[i] is the target quantization scale corresponding to the i-th first ALF
  • the AlfCoeffLuma[i][j] is the jth second ALF coefficient in the first ALF.
  • the i value ranges from 0 to p
  • the p is the number of ALFs corresponding to the luminance component minus one.
  • the p is 15 or 63.
  • the i is equal to the first value if the first component is the Cb component, and the i is equal to the second value if the first component is the Cr component.
  • the first value is 0 and the second value is 1.
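The center-coefficient derivation formula is also an image in the published application. Under the symmetric-tap convention, unity DC gain requires the weighted taps to sum to 2^AlfShift, which gives the hedged reconstruction below (an assumption consistent with the surrounding definitions, not a quote of the patented formula):

```python
def derive_center_coeff(non_center, alf_shift):
    """Each non-center coefficient weights two mirrored pixels, so unity
    gain implies: W_center = 2**alf_shift - 2 * sum(non_center)."""
    return (1 << alf_shift) - 2 * sum(non_center)
```

This is why the code stream only needs to carry the non-center coefficients plus the target quantization scale: the decoder can recover the center coefficient.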
  • the second decoding unit 230 is specifically configured to parse and obtain the shape information of the first ALF from the code stream; and determine the type of the first ALF according to the shape information of the first ALF .
  • if the shape of the first ALF is a first filter shape, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7x7 cross plus a 5x5 square filter shape, the first ALF includes 15 ALF coefficients.
  • the filtering unit 240 is specifically configured to filter the target area in the reconstructed image under the first component according to the following formula:
  • the I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area, and the (x, y) is the position relative to the current point
  • the W_j is the j-th integer-type ALF coefficient of the first ALF
  • the W_{n-1} is the integer-type ALF coefficient corresponding to the center point of the first ALF
  • the I_rec(0,0) is the reconstructed pixel value of the current point
  • the AlfShift is the target shift value corresponding to the target quantization scale
  • the n is the number of ALF coefficients included in the first ALF.
  • the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here.
  • the video decoder 20 shown in FIG. 12 may correspond to the corresponding subject that performs the method 800, 900 or 1000 of the embodiments of the present application, and the foregoing and other operations and/or functions of the units in the video decoder 20 are respectively intended to implement the corresponding processes in the methods 800, 900 and 1000; for the sake of brevity, details are not repeated here.
  • the functional unit may be implemented in the form of hardware, may also be implemented by an instruction in the form of software, or may be implemented by a combination of hardware and software units.
  • the steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or by instructions in the form of software; the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or by a combination of hardware and software units in the decoding processor.
  • the software unit may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
  • the electronic device 30 may be the video encoder or the video decoder described in this embodiment of the application, and the electronic device 30 may include:
  • the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
  • the processor 32 may be configured to perform the steps of the method 200 described above according to instructions in the computer program 34 .
  • the processor 32 may include, but is not limited to:
  • Digital Signal Processor (DSP)
  • Application Specific Integrated Circuit (ASIC)
  • Field Programmable Gate Array (FPGA)
  • the memory 33 includes but is not limited to:
  • Non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory. Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
  • Random Access Memory (RAM)
  • Static RAM (SRAM)
  • Dynamic RAM (DRAM)
  • Synchronous DRAM (SDRAM)
  • Double Data Rate SDRAM (DDR SDRAM)
  • Enhanced SDRAM (ESDRAM)
  • Synchronous Link DRAM (SLDRAM)
  • Direct Rambus RAM (DR RAM)
  • the computer program 34 may be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the procedures provided by the present application.
  • the one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30 .
  • the electronic device 30 may further include:
  • a transceiver 33 which can be connected to the processor 32 or the memory 33 .
  • the processor 32 can control the transceiver 33 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • the transceiver 33 may include a transmitter and a receiver.
  • the transceiver 33 may further include antennas, and the number of the antennas may be one or more.
  • each component in the electronic device 30 is connected through a bus system, wherein the bus system includes a power bus, a control bus and a status signal bus in addition to a data bus.
  • FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of the present application.
  • the video encoding and decoding system 40 may include: a video encoder 41 and a video decoder 42 , wherein the video encoder 41 is used for executing the video encoding method involved in the embodiments of the present application, and the video decoder 42 is used for executing The video decoding method involved in the embodiments of the present application.
  • the present application also provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a computer, enables the computer to execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, when the instructions are executed by a computer, the instructions cause the computer to execute the methods of the above method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored on or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, digital video disc (DVD)), or semiconductor media (eg, solid state disk (SSD)), and the like.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.


Abstract

Provided are a video coding method and system, a video decoding method and system, a video coder and a video decoder. The video coding method comprises: determining, by means of the floating-point ALF coefficients of a first ALF, the maximum quantization scale of the floating-point ALF coefficients of the first ALF; determining, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficients of the first ALF; quantizing, using the target quantization scale, the floating-point ALF coefficients of the first ALF into integer-type ALF coefficients; and coding the integer-type ALF coefficients of the first ALF to obtain a code stream. A balance between the coding overhead of the filter coefficients and the gain brought by the filter is achieved, thereby improving the video coding and decoding effect.

Description

Video encoding and decoding method and system, and video encoder and video decoder

Technical Field
The present application relates to the technical field of video encoding and decoding, and in particular, to a video encoding and decoding method and system, as well as a video encoder and a video decoder.
Background Art
Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers or video players. With the development of video technology, the amount of data included in video data is relatively large. In order to facilitate the transmission of video data, video devices implement video compression technologies to enable more efficient transmission or storage of video data.
Errors are introduced in the video compression process. In order to reduce the error, the reconstructed image is filtered; for example, the reconstructed image is filtered using an adaptive correction filter to minimize the mean square error between the reconstructed image and the original image. When an adaptive correction filter (hereinafter referred to as ALF) is used for filtering, a fixed quantization scale is currently adopted to quantize the floating-point coefficients of the ALF into an integer type.
The floating-point ALF coefficients derived for different regions of the reconstructed image, or for reconstructed images of different frames, may have different ranges, and the filtering gains may also differ. However, when a fixed quantization scale is used to quantize the floating-point coefficients, the coding overhead of the filter coefficients and the gain brought by the filter cannot be balanced.
Summary of the Invention
Embodiments of the present application provide a video encoding and decoding method and system, as well as a video encoder and a video decoder, to achieve a balance between the encoding overhead of filter coefficients and the gain brought by the filter.
In a first aspect, the present application provides a video encoding method, including:

obtaining a reconstructed image of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;

determining the floating-point ALF coefficients of a first adaptive correction filter (ALF) used when the target area in the reconstructed image under the first component is filtered with the first ALF;

determining the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the floating-point ALF coefficients of the first ALF;

determining, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficients of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;

quantizing, using the target quantization scale, the floating-point ALF coefficients of the first ALF into integer-type ALF coefficients;

encoding the integer-type ALF coefficients of the first ALF to obtain a code stream.
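The first-aspect steps can be sketched end to end as follows. All helper logic here (the bound search, choosing the maximum scale as the target) is illustrative, not the patented procedure:

```python
def encode_alf_coeffs(float_coeffs, d=127):
    """Simplified pipeline: bound the shift by the integer range of the
    center (last) coefficient, pick a target scale, quantize, and hand
    the integer coefficients off for entropy coding."""
    center = float_coeffs[-1]
    shift = 0
    while shift < 15 and round(abs(center) * 2 ** (shift + 1)) <= d:
        shift += 1                     # maximum allowed shift value
    target_scale = 2 ** shift          # here: target scale = maximum scale
    int_coeffs = [round(c * target_scale) for c in float_coeffs]
    return int_coeffs, target_scale    # these would then be entropy-coded
```

In the full method the target scale is chosen at or below the maximum (e.g., via the rate-distortion search of the later embodiments), rather than always taking the maximum as this sketch does.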
In a second aspect, an embodiment of the present application provides a video decoding method, including:

decoding a code stream to obtain the residual value of the current image;

determining a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;

decoding the code stream to obtain the ALF coefficient information of the first adaptive correction filter (ALF) corresponding to the target area in the reconstructed image under the first component;

filtering, using the first ALF, the target area in the reconstructed image under the first component according to the ALF coefficient information of the first ALF.
In a third aspect, the present application provides a video encoder for performing the method in the first aspect or each of its implementations. Specifically, the encoder includes functional units for performing the method in the first aspect or each of its implementations.

In a fourth aspect, the present application provides a video decoder for performing the method in the second aspect or each of its implementations. Specifically, the decoder includes functional units for performing the method in the second aspect or each of its implementations.

In a fifth aspect, a video encoder is provided, including a processor and a memory. The memory is used for storing a computer program, and the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the first aspect or each of its implementations.

In a sixth aspect, a video decoder is provided, including a processor and a memory. The memory is used for storing a computer program, and the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the second aspect or each of its implementations.

In a seventh aspect, a video encoding and decoding system is provided, including a video encoder and a video decoder. The video encoder is used to perform the method in the first aspect or each of its implementations, and the video decoder is used to perform the method in the second aspect or each of its implementations.

In an eighth aspect, a chip is provided for implementing the method in any one of the first to second aspects or each of their implementations. Specifically, the chip includes a processor for invoking and running a computer program from a memory, so that a device on which the chip is installed executes the method in any one of the first to second aspects or each of their implementations.

In a ninth aspect, a computer-readable storage medium is provided for storing a computer program, and the computer program causes a computer to execute the method in any one of the first to second aspects or each of their implementations.

In a tenth aspect, a computer program product is provided, including computer program instructions, and the computer program instructions cause a computer to execute the method in any one of the first to second aspects or each of their implementations.

In an eleventh aspect, a computer program is provided, which, when run on a computer, causes the computer to execute the method in any one of the first to second aspects or each of their implementations.
Based on the above technical solutions, in the video encoding process, the reconstructed image of the current image is obtained; the floating-point ALF coefficients of the first ALF are determined for filtering the target area in the reconstructed image under the first component with the first ALF; the maximum quantization scale of the floating-point ALF coefficients of the first ALF is determined according to those coefficients; the target quantization scale is determined according to the maximum quantization scale; the floating-point ALF coefficients of the first ALF are quantized into integer-type ALF coefficients using the target quantization scale; and the integer-type ALF coefficients of the first ALF are encoded to obtain a code stream. That is, the present application determines the target quantization scale of the ALF coefficients according to the magnitude of the floating-point ALF coefficients of the first ALF, and uses that target quantization scale to quantize them. In this way, when the floating-point ALF coefficients are large, the determined target quantization scale is large, so the filtering gain corresponding to the ALF coefficients quantized with the target quantization scale is large; when the floating-point ALF coefficients are small, the determined target quantization scale is small, so the coding overhead is small. This achieves a balance between the coding overhead of the filter coefficients and the gain brought by the filter, thereby improving the video encoding and decoding effect.
Description of Drawings

FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 involved in an embodiment of the present application;

FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application;

FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of the present application;

FIG. 5A is a schematic diagram of an ALF shape involved in an embodiment of the present application;

FIG. 5B is a schematic diagram of another ALF shape involved in an embodiment of the present application;

FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application;

FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application;

FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application;

FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application;

FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application;

FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application;

FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of the present application;

FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application;

FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of the present application.
Detailed Description
The present application can be applied to the field of image encoding and decoding, the field of video encoding and decoding, the field of hardware video encoding and decoding, the field of dedicated-circuit video encoding and decoding, the field of real-time video encoding and decoding, and the like. For example, the solution of the present application can be combined with the audio video coding standard (AVS), for example, the H.264/audio video coding (AVC) standard, the H.265/high efficiency video coding (HEVC) standard and the H.266/versatile video coding (VVC) standard. Alternatively, the schemes of the present application may operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including the Scalable Video Coding (SVC) and Multi-View Video Coding (MVC) extensions. It should be understood that the techniques of this application are not limited to any particular codec standard or technique.
为了便于理解,首先结合图1对本申请实施例涉及的视频编解码系统进行介绍。For ease of understanding, the video coding and decoding system involved in the embodiments of the present application is first introduced with reference to FIG. 1 .
图1为本申请实施例涉及的一种视频编解码系统100的示意性框图。需要说明的是,图1只是一种示例,本申请实施例的视频编解码系统包括但不限于图1所示。如图1所示,该视频编解码系统100包含编码设备110和解码设备120。其中编码设备用于对视频数据进行编码(可以理解成压缩)产生码流,并将码流传输给解码设备。解码设备对编码设备编码产生的码流进行解码,得到解码后的视频数据。FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 according to an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video encoding and decoding systems in the embodiments of the present application include, but are not limited to, those shown in FIG. 1 . As shown in FIG. 1 , the video codec system 100 includes an encoding device 110 and a decoding device 120 . The encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device. The decoding device decodes the code stream encoded by the encoding device to obtain decoded video data.
本申请实施例的编码设备110可以理解为具有视频编码功能的设备,解码设备120可以理解为具有视频解码功能的设备,即本申请实施例对编码设备110和解码设备120包括更广泛的装置,例如包含智能手机、台式计算机、移动计算装置、笔记本(例如,膝上型)计算机、平板计算机、机顶盒、电视、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机等。The encoding device 110 in this embodiment of the present application may be understood as a device with a video encoding function, and the decoding device 120 may be understood as a device with a video decoding function; that is, the encoding device 110 and the decoding device 120 in the embodiments of the present application cover a wide range of devices, including, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and the like.
在一些实施例中,编码设备110可以经由信道130将编码后的视频数据(如码流)传输给解码设备120。信道130可以包括能够将编码后的视频数据从编码设备110传输到解码设备120的一个或多个媒体和/或装置。In some embodiments, the encoding device 110 may transmit the encoded video data (eg, a code stream) to the decoding device 120 via the channel 130 . Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
在一个实例中,信道130包括使编码设备110能够实时地将编码后的视频数据直接发射到解码设备120的一个或多个通信媒体。在此实例中,编码设备110可根据通信标准来调制编码后的视频数据,且将调制后的视频数据发射到解码设备120。其中通信媒体包含无线通信媒体,例如射频频谱,可选的,通信媒体还可以包含有线通信媒体,例如一根或多根物理传输线。In one example, channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time. In this example, encoding apparatus 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding apparatus 120 . Wherein the communication medium includes a wireless communication medium, such as a radio frequency spectrum, optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
在另一实例中,信道130包括存储介质,该存储介质可以存储编码设备110编码后的视频数据。存储介质包含多种本地存取式数据存储介质,例如光盘、DVD、快闪存储器等。在该实例中,解码设备120可从该存储介质中获取编码后的视频数据。In another example, channel 130 includes a storage medium that can store video data encoded by encoding device 110 . Storage media include a variety of locally accessible data storage media such as optical discs, DVDs, flash memory, and the like. In this example, the decoding apparatus 120 may obtain the encoded video data from the storage medium.
在另一实例中,信道130可包含存储服务器,该存储服务器可以存储编码设备110编码后的视频数据。在此实例中,解码设备120可以从该存储服务器中下载存储的编码后的视频数据。可选的,该存储服务器可以存储编码后的视频数据且可以将该编码后的视频数据发射到解码设备120,例如web服务器(例如,用于网站)、文件传送协议(FTP)服务器等。In another example, channel 130 may include a storage server that may store video data encoded by encoding device 110 . In this instance, the decoding device 120 may download the stored encoded video data from the storage server. Optionally, the storage server may store the encoded video data and may transmit the encoded video data to the decoding device 120, such as a web server (eg, for a website), a file transfer protocol (FTP) server, and the like.
一些实施例中,编码设备110包含视频编码器112及输出接口113。其中,输出接口113可以包含调制器/解调器(调制解调器)和/或发射器。In some embodiments, encoding apparatus 110 includes video encoder 112 and output interface 113 . Among them, the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
在一些实施例中,编码设备110除了包括视频编码器112和输出接口113外,还可以包括视频源111。In some embodiments, the encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113 .
视频源111可包含视频采集装置(例如,视频相机)、视频存档、视频输入接口、计算机图形系统中的至少一个,其中,视频输入接口用于从视频内容提供者处接收视频数据,计算机图形系统用于产生视频数据。The video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface, and a computer graphics system, where the video input interface is used to receive video data from a video content provider and the computer graphics system is used to generate video data.
视频编码器112对来自视频源111的视频数据进行编码,产生码流。视频数据可包括一个或多个图像(picture)或图像序列(sequence of pictures)。码流以比特流的形式包含了图像或图像序列的编码信息。编码信息可以包含编码图像数据及相关联数据。相关联数据可包含序列参数集(sequence parameter set,简称SPS)、图像参数集(picture parameter set,简称PPS)及其它语法结构。SPS可含有应用于一个或多个序列的参数。PPS可含有应用于一个或多个图像的参数。语法结构是指码流中以指定次序排列的零个或多个语法元素的集合。The video encoder 112 encodes the video data from the video source 111 to generate a code stream. Video data may include one or more pictures or a sequence of pictures. The code stream contains the encoding information of the image or image sequence in the form of bit stream. The encoded information may include encoded image data and associated data. The associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short), and other syntax structures. An SPS may contain parameters that apply to one or more sequences. A PPS may contain parameters that apply to one or more images. A syntax structure refers to a set of zero or more syntax elements in a codestream arranged in a specified order.
视频编码器112经由输出接口113将编码后的视频数据直接传输到解码设备120。编码后的视频数据还可存储于存储介质或存储服务器上,以供解码设备120后续读取。The video encoder 112 directly transmits the encoded video data to the decoding device 120 via the output interface 113 . The encoded video data may also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
在一些实施例中,解码设备120包含输入接口121和视频解码器122。In some embodiments, decoding device 120 includes input interface 121 and video decoder 122 .
在一些实施例中,解码设备120除包括输入接口121和视频解码器122外,还可以包括显示装置123。In some embodiments, the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
其中,输入接口121包含接收器及/或调制解调器。输入接口121可通过信道130接收编码后的视频数据。The input interface 121 includes a receiver and/or a modem. The input interface 121 may receive the encoded video data through the channel 130 .
视频解码器122用于对编码后的视频数据进行解码,得到解码后的视频数据,并将解码后的视频数据传输至显示装置123。The video decoder 122 is configured to decode the encoded video data, obtain the decoded video data, and transmit the decoded video data to the display device 123 .
显示装置123显示解码后的视频数据。显示装置123可与解码设备120整合或在解码设备120外部。显示装置123可包括多种显示装置,例如液晶显示器(LCD)、等离子体显示器、有机发光二极管(OLED)显示器或其它类型的显示装置。The display device 123 displays the decoded video data. The display device 123 may be integrated with the decoding apparatus 120 or external to the decoding apparatus 120 . The display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
此外,图1仅为实例,本申请实施例的技术方案不限于图1,例如本申请的技术还可以应用于单侧的视频编码或单侧的视频解码。In addition, FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 . For example, the technology of the present application may also be applied to single-side video encoding or single-side video decoding.
下面对本申请实施例涉及的视频编码框架进行介绍。The following describes the video coding framework involved in the embodiments of the present application.
图2是本申请实施例提供的视频编码器200的示意性框图。应理解,该视频编码器200可用于对图像进行有损压缩(lossy compression),也可用于对图像进行无损压缩(lossless compression)。该无损压缩可以是视觉无损压缩(visually lossless compression),也可以是数学无损压缩(mathematically lossless compression)。FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images, and can also be used to perform lossless compression on images. The lossless compression may be visually lossless compression (visually lossless compression) or mathematically lossless compression (mathematically lossless compression).
该视频编码器200可应用于亮度色度(YCbCr,YUV)格式的图像数据上。例如,YUV比例可以为4:2:0、4:2:2或者4:4:4,Y表示明亮度(Luma),Cb(U)表示蓝色色度,Cr(V)表示红色色度,U和V表示为色度(Chroma)用于描述色彩及饱和度。例如,在颜色格式上,4:2:0表示每4个像素有4个亮度分量,2个色度分量(YYYYCbCr),4:2:2表示每4个像素有4个亮度分量,4个色度分量(YYYYCbCrCbCr),4:4:4表示全像素显示(YYYYCbCrCbCrCbCrCbCr)。The video encoder 200 can be applied to image data in luma-chroma (YCbCr, YUV) format. For example, the YUV ratio may be 4:2:0, 4:2:2 or 4:4:4, where Y denotes luminance (Luma), Cb (U) denotes blue chroma, Cr (V) denotes red chroma, and U and V together denote chroma (Chroma), which describes color and saturation. In terms of color format, 4:2:0 means that every 4 pixels have 4 luma components and 2 chroma components (YYYYCbCr), 4:2:2 means that every 4 pixels have 4 luma components and 4 chroma components (YYYYCbCrCbCr), and 4:4:4 means full-resolution sampling (YYYYCbCrCbCrCbCrCbCr).
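The relationship between these sampling ratios and the number of samples in each plane can be illustrated with a short sketch (a hypothetical helper for illustration, not part of any codec standard):

```python
def plane_sizes(width, height, chroma_format):
    """Return (luma_samples, chroma_samples_per_plane) for one frame.

    "4:2:0" halves chroma resolution both horizontally and vertically,
    "4:2:2" halves it horizontally only, and "4:4:4" keeps full resolution.
    """
    luma = width * height
    if chroma_format == "4:2:0":
        chroma = (width // 2) * (height // 2)
    elif chroma_format == "4:2:2":
        chroma = (width // 2) * height
    elif chroma_format == "4:4:4":
        chroma = width * height
    else:
        raise ValueError("unknown chroma format")
    return luma, chroma

# A 1920x1080 frame in 4:2:0 carries one luma plane of 2073600 samples
# and two chroma planes of 518400 samples each.
sizes = plane_sizes(1920, 1080, "4:2:0")
```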
例如,该视频编码器200读取视频数据,针对视频数据中的每帧图像,将一帧图像划分成若干个编码树单元(coding tree unit,CTU),在一些例子中,CTU可被称作“树型块”、“最大编码单元”(Largest Coding unit,简称LCU)或“编码树型块”(coding tree block,简称CTB)。每一个CTU可以与图像内的具有相等大小的像素块相关联。每一像素可对应一个亮度(luminance或luma)采样及两个色度(chrominance或chroma)采样。因此,每一个CTU可与一个亮度采样块及两个色度采样块相关联。一个CTU大小例如为128×128、64×64、32×32等。一个CTU又可以继续被划分成若干个编码单元(Coding Unit,CU)进行编码,CU可以为矩形块也可以为方形块。CU可以进一步划分为预测单元(prediction Unit,简称PU)和变换单元(transform unit,简称TU),进而使得编码、预测、变换分离,处理的时候更灵活。在一种示例中,CTU以四叉树方式划分为CU,CU以四叉树方式划分为TU、PU。For example, the video encoder 200 reads video data and, for each frame of image in the video data, divides the frame into several coding tree units (CTUs). In some examples, a CTU may also be referred to as a "tree block", a "largest coding unit" (LCU for short) or a "coding tree block" (CTB for short). Each CTU may be associated with a block of pixels of equal size within the image. Each pixel may correspond to one luminance (luma) sample and two chrominance (chroma) samples; thus, each CTU may be associated with one block of luma samples and two blocks of chroma samples. The size of a CTU is, for example, 128×128, 64×64, 32×32, and so on. A CTU can be further divided into several coding units (CUs) for coding, and a CU may be a rectangular block or a square block. A CU can be further divided into prediction units (PUs) and transform units (TUs), so that coding, prediction and transform are separated and processing is more flexible. In one example, a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
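The quadtree division of a CTU into CUs described above can be sketched as a simple recursion (the split decision is shown here as a caller-supplied function; in a real encoder it would be driven by rate-distortion cost):

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively divide a square block at (x, y) into leaf CUs.

    should_split(x, y, size) decides whether a block is split into four
    equal quadrants; recursion always stops at min_size.
    Returns a list of (x, y, size) tuples, one per leaf CU.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += quadtree_split(x + dx, y + dy, half, min_size, should_split)
    return cus

# Splitting a 64x64 CTU once yields four 32x32 CUs.
cus = quadtree_split(0, 0, 64, 32, lambda x, y, s: s > 32)
```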
视频编码器及视频解码器可支持各种PU大小。假定特定CU的大小为2N×2N,视频编码器及视频解码器可支持2N×2N或N×N的PU大小以用于帧内预测,且支持2N×2N、2N×N、N×2N、N×N或类似大小的对称PU以用于帧间预测。视频编码器及视频解码器还可支持2N×nU、2N×nD、nL×2N及nR×2N的不对称PU以用于帧间预测。Video encoders and video decoders may support various PU sizes. Assuming the size of a particular CU is 2Nx2N, video encoders and video decoders may support PU sizes of 2Nx2N or NxN for intra prediction, and support 2Nx2N, 2NxN, Nx2N, NxN or similar sized symmetric PUs for inter prediction. Video encoders and video decoders may also support 2NxnU, 2NxnD, nLx2N, and nRx2N asymmetric PUs for inter prediction.
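The symmetric and asymmetric PU partitionings listed above can be enumerated for a given CU size (a hypothetical sketch; the mode names follow the HEVC-style labels used in the text, with the asymmetric modes splitting one dimension in a 1:3 or 3:1 ratio):

```python
def pu_partitions(cu_size):
    """Return {mode: list of (width, height) PUs} for a 2Nx2N CU."""
    s, h, q = cu_size, cu_size // 2, cu_size // 4
    return {
        # symmetric partitions
        "2Nx2N": [(s, s)],
        "2NxN": [(s, h), (s, h)],
        "Nx2N": [(h, s), (h, s)],
        "NxN": [(h, h)] * 4,
        # asymmetric partitions (inter prediction only)
        "2NxnU": [(s, q), (s, s - q)],
        "2NxnD": [(s, s - q), (s, q)],
        "nLx2N": [(q, s), (s - q, s)],
        "nRx2N": [(s - q, s), (q, s)],
    }

# Every mode covers the whole CU area exactly once.
parts = pu_partitions(32)
```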
在一些实施例中,如图2所示,该视频编码器200可包括:预测单元210、残差单元220、变换/量化单元230、反变换/量化单元240、重建单元250、环路滤波单元260、解码图像缓存270和熵编码单元280。需要说明的是,视频编码器200可包含更多、更少或不同的功能组件。In some embodiments, as shown in FIG. 2, the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260 , a decoded image buffer 270 and an entropy encoding unit 280 . It should be noted that the video encoder 200 may include more, less or different functional components.
可选的,在本申请中,当前块(current block)可以称为当前编码单元(CU)或当前预测单元(PU)等。预测块也可称为预测图像块或图像预测块,重建图像块也可称为重建块或图像重建块。Optionally, in this application, the current block may be referred to as the current coding unit (CU) or the current prediction unit (PU), or the like. A prediction block may also be referred to as a predicted image block or an image prediction block, and a reconstructed image block may also be referred to as a reconstructed block or an image reconstruction block.
在一些实施例中,预测单元210包括帧间预测单元211和帧内预测单元212。由于视频的一个帧中的相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测的方法消除相邻像素之间的空间冗余。由于视频中的相邻帧之间存在着很强的相似性,在视频编解码技术中使用帧间预测方法消除相邻帧之间的时间冗余,从而提高编码效率。In some embodiments, prediction unit 210 includes an inter prediction unit 211 and an intra prediction unit 212 . Since there is a strong correlation between adjacent pixels in a frame of a video, the method of intra-frame prediction is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Due to the strong similarity between adjacent frames in the video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving the coding efficiency.
帧间预测单元211可用于帧间预测,帧间预测可以参考不同帧的图像信息,帧间预测使用运动信息从参考帧中找到参考块,根据参考块生成预测块,用于消除时间冗余;帧间预测所使用的帧可以为P帧和/或B帧,P帧指的是向前预测帧,B帧指的是双向预测帧。运动信息包括参考帧所在的参考帧列表,参考帧索引,以及运动矢量。运动矢量可以是整像素的或者是分像素的,如果运动矢量是分像素的,那么需要在参考帧中使用插值滤波做出所需的分像素的块,这里把根据运动矢量找到的参考帧中的整像素或者分像素的块叫参考块。有的技术会直接把参考块作为预测块,有的技术会在参考块的基础上再处理生成预测块。在参考块的基础上再处理生成预测块也可以理解为把参考块作为预测块然后再在预测块的基础上处理生成新的预测块。The inter prediction unit 211 may be used for inter prediction. Inter prediction may refer to image information of different frames: it uses motion information to find a reference block in a reference frame and generates a prediction block from the reference block, so as to eliminate temporal redundancy. Frames used for inter prediction may be P frames and/or B frames, where a P frame is a forward-predicted frame and a B frame is a bidirectionally predicted frame. The motion information includes the reference frame list in which the reference frame is located, the reference frame index, and the motion vector. The motion vector may have integer-pixel or sub-pixel precision; if it has sub-pixel precision, interpolation filtering must be applied in the reference frame to produce the required sub-pixel block. Here, the integer-pixel or sub-pixel block found in the reference frame according to the motion vector is called the reference block. Some techniques use the reference block directly as the prediction block, while others further process the reference block to generate the prediction block; the latter can also be understood as taking the reference block as a prediction block and then processing it to generate a new prediction block.
目前最常用的帧间预测方法包括:VVC视频编解码标准中的几何划分模式(geometric partitioning mode,GPM),以及AVS3视频编解码标准中的角度加权预测(angular weighted prediction,AWP)。这两种帧间预测模式在原理上有共通之处。Currently, the most commonly used inter prediction methods include the geometric partitioning mode (GPM) in the VVC video codec standard and angular weighted prediction (AWP) in the AVS3 video codec standard. These two inter prediction modes have principles in common.
帧内预测单元212只参考同一帧图像的信息,预测当前图像块内的像素信息,用于消除空间冗余。帧内预测所使用的帧可以为I帧。例如图5所示,白色的4×4块是当前块,当前块左边一列和上面一行的灰色的像素为当前块的参考像素,帧内预测使用这些参考像素对当前块进行预测。这些参考像素可能已经全部可得,即全部已经编解码。也可能有部分不可得,比如当前块是整帧的最左侧,那么当前块的左边的参考像素不可得。或者编解码当前块时,当前块左下方的部分还没有编解码,那么左下方的参考像素也不可得。对于参考像素不可得的情况,可以使用可得的参考像素或某些值或某些方法进行填充,或者不进行填充。The intra prediction unit 212 refers only to information of the same frame to predict pixel information within the current image block, so as to eliminate spatial redundancy. Frames used for intra prediction may be I frames. For example, as shown in FIG. 5, the white 4×4 block is the current block, and the gray pixels in the column to the left of the current block and in the row above it are its reference pixels, which intra prediction uses to predict the current block. These reference pixels may all be available, i.e., already coded and decoded; or some may be unavailable, for example, if the current block is at the leftmost edge of the frame, the reference pixels to its left are unavailable, or, when the current block is being coded, the region to its lower left may not yet have been coded, so the lower-left reference pixels are also unavailable. Where reference pixels are unavailable, they may be filled using the available reference pixels, certain values, or certain methods, or left unfilled.
在一些实施例中,帧内预测方法还包括多参考行帧内预测方法(multiple reference line,MRL),MRL可以使用更多的参考像素从而提高编码效率。In some embodiments, the intra prediction method further includes a multiple reference line intra prediction method (multiple reference line, MRL), which can use more reference pixels to improve coding efficiency.
帧内预测有多种预测模式,H.264中对4×4的块进行帧内预测的9种模式。其中模式0是将当前块上面的像素按竖直方向复制到当前块作为预测值;模式1是将左边的参考像素按水平方向复制到当前块作为预测值;模式2(DC)是将A~D和I~L这8个点的平均值作为所有点的预测值,模式3至模式8是分别按某一个角度将参考像素复制到当前块的对应位置。因为当前块某些位置不能正好对应到参考像素,可能需要使用参考像素的加权平均值,或者说是插值的参考像素的分像素。Intra prediction has multiple prediction modes; H.264 defines 9 modes for intra prediction of 4×4 blocks. Mode 0 copies the pixels above the current block downward in the vertical direction as the predicted values; mode 1 copies the reference pixels on the left rightward in the horizontal direction as the predicted values; mode 2 (DC) uses the average of the 8 reference points A–D and I–L as the predicted value for all positions; and modes 3 to 8 copy the reference pixels to the corresponding positions of the current block along a particular angle. Because some positions of the current block may not correspond exactly to a reference pixel, a weighted average of reference pixels, i.e., an interpolated sub-pixel of the reference pixels, may be needed.
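For illustration, modes 0 (vertical), 1 (horizontal) and 2 (DC) for a 4×4 block can be sketched as follows (a simplified toy, ignoring the H.264 rules for unavailable reference samples):

```python
def intra_predict_4x4(top, left, mode):
    """Predict a 4x4 block from its reference samples.

    top: the 4 samples above the block (A..D); left: the 4 samples to
    its left (I..L). Mode 0 copies top downward, mode 1 copies left
    rightward, and mode 2 (DC) uses the rounded mean of all 8
    reference samples for every position.
    """
    if mode == 0:  # vertical
        return [list(top) for _ in range(4)]
    if mode == 1:  # horizontal
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:  # DC
        dc = (sum(top) + sum(left) + 4) // 8
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are sketched here")
```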
HEVC使用的帧内预测模式有平面模式(Planar)、DC和33种角度模式,共35种预测模式。VVC使用的帧内模式有Planar、DC和65种角度模式,共67种预测模式。AVS3使用的帧内模式有DC、Plane、Bilinear和63种角度模式,共66种预测模式。The intra-frame prediction modes used by HEVC include Planar mode, DC and 33 angle modes, for a total of 35 prediction modes. The intra-frame modes used by VVC are Planar, DC, and 65 angular modes, for a total of 67 prediction modes. The intra-frame modes used by AVS3 are DC, Plane, Bilinear and 63 angle modes, a total of 66 prediction modes.
需要说明的是,随着角度模式的增加,帧内预测将会更加精确,也更加符合对高清以及超高清数字视频发展的需求。It should be noted that with the increase of the angle mode, the intra-frame prediction will be more accurate and more in line with the demand for the development of high-definition and ultra-high-definition digital video.
残差单元220可基于CU的像素块及CU的PU的预测块来产生CU的残差块。举例来说,残差单元220可产生CU的残差块,使得残差块中的每一采样具有等于以下两者之间的差的值:CU的像素块中的采样,及CU的PU的预测块中的对应采样。 Residual unit 220 may generate a residual block of the CU based on the pixel blocks of the CU and the prediction blocks of the PUs of the CU. For example, residual unit 220 may generate a residual block of the CU such that each sample in the residual block has a value equal to the difference between the samples in the CU's pixel block, and the CU's PU's Corresponding samples in the prediction block.
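The residual computation described here is a per-sample subtraction, which can be sketched as follows (a minimal illustrative helper):

```python
def residual_block(original, prediction):
    """Each residual sample equals the original sample minus the
    co-located prediction sample."""
    return [
        [o - p for o, p in zip(orow, prow)]
        for orow, prow in zip(original, prediction)
    ]
```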
变换/量化单元230可量化变换系数。变换/量化单元230可基于与CU相关联的量化参数(QP)值来量化与CU的TU相关联的变换系数。视频编码器200可通过调整与CU相关联的QP值来调整应用于与CU相关联的变换系数的量化程度。Transform/quantization unit 230 may quantize transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of the CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with the CU by adjusting the QP value associated with the CU.
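The effect of the QP on the degree of quantization can be sketched with the HEVC-style relationship Qstep = 2^((QP − 4) / 6), used here purely for illustration; real codecs implement this with integer scaling tables rather than floating point:

```python
def quantize(coeffs, qp):
    """Uniform scalar quantization of transform coefficients.

    A larger QP gives a larger step size and therefore coarser
    quantization (HEVC-style step, illustrative only).
    """
    qstep = 2 ** ((qp - 4) / 6)
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qp):
    """Approximate inverse: scale the quantized levels back up."""
    qstep = 2 ** ((qp - 4) / 6)
    return [l * qstep for l in levels]
```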
反变换/量化单元240可分别将逆量化及逆变换应用于量化后的变换系数,以从量化后的变换系数重建残差块。Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from the quantized transform coefficients.
重建单元250可将重建后的残差块的采样加到预测单元210产生的一个或多个预测块的对应采样,以产生与TU相关联的重建图像块。通过此方式重建CU的每一个TU的采样块,视频编码器200可重建CU的像素块。 Reconstruction unit 250 may add the samples of the reconstructed residual block to corresponding samples of the one or more prediction blocks generated by prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the block of samples for each TU of the CU in this manner, video encoder 200 may reconstruct the block of pixels of the CU.
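The per-sample addition performed by the reconstruction unit can be sketched as follows (clipping to the 8-bit sample range is assumed here for illustration):

```python
def reconstruct(prediction, residual, bit_depth=8):
    """Add each reconstructed residual sample to the co-located
    prediction sample and clip to the valid sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, residual)
    ]
```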
环路滤波单元260可执行消块滤波操作以减少与CU相关联的像素块的块效应。In-loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
在一些实施例中,环路滤波单元260包括去块滤波单元、样点自适应补偿SAO单元、自适应环路滤波ALF单元。In some embodiments, loop filtering unit 260 includes a deblocking filtering unit, a sample adaptive compensation SAO unit, an adaptive loop filtering ALF unit.
解码图像缓存270可存储重建后的像素块。帧间预测单元211可使用含有重建后的像素块的参考图像来对其它图像的PU执行帧间预测。另外,帧内预测单元212可使用解码图像缓存270中的重建后的像素块来对在与CU相同的图像中的其它PU执行帧内预测。The decoded image buffer 270 may store the reconstructed pixel blocks. Inter-prediction unit 211 may use the reference picture containing the reconstructed pixel block to perform inter-prediction on PUs of other pictures. In addition, intra-prediction unit 212 may use the reconstructed pixel blocks in decoded picture buffer 270 to perform intra-prediction on other PUs in the same picture as the CU.
熵编码单元280可接收来自变换/量化单元230的量化后的变换系数。熵编码单元280可对量化后的变换系数执行一个或多个熵编码操作以产生熵编码后的数据。 Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
本申请涉及的视频编码的基本流程如下:在编码端,将当前图像划分成块,针对当前块,预测单元210使用帧内预测或帧间预测产生当前块的预测块。残差单元220可基于预测块与当前块的原始块计算残差块,即预测块和当前块的原始块的差值,该残差块也可称为残差信息。该残差块经由变换/量化单元230变换与量化等过程,可以去除人眼不敏感的信息,以消除视觉冗余。可选的,经过变换/量化单元230变换与量化之前的残差块可称为时域残差块,经过变换/量化单元230变换与量化之后的残差块可称为频率残差块或频域残差块。熵编码单元280接收到变换/量化单元230输出的量化后的变换系数,可对该量化后的变换系数进行熵编码,输出码流。例如,熵编码单元280可根据目标上下文模型以及二进制码流的概率信息消除字符冗余。The basic video encoding flow involved in the present application is as follows. At the encoding end, the current image is divided into blocks; for the current block, the prediction unit 210 generates a prediction block of the current block using intra prediction or inter prediction. The residual unit 220 may calculate a residual block based on the prediction block and the original block of the current block, i.e., the difference between the prediction block and the original block; this residual block may also be called residual information. Through the transform and quantization processes of the transform/quantization unit 230, information to which the human eye is insensitive can be removed from the residual block, thereby eliminating visual redundancy. Optionally, the residual block before transform and quantization by the transform/quantization unit 230 may be called a time-domain residual block, and the residual block after transform and quantization may be called a frequency residual block or a frequency-domain residual block. The entropy coding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230, performs entropy coding on them, and outputs a code stream. For example, the entropy coding unit 280 may eliminate character redundancy according to a target context model and the probability information of the binary code stream.
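The lossy nature of the quantization path described above can be seen in a tiny self-contained round trip (illustrative only; a real codec applies a DCT-like integer transform before quantization and context-based entropy coding after it):

```python
def lossy_round_trip(residual, qstep):
    """Quantize a residual with step qstep, then dequantize it.

    The reconstructed residual generally differs from the input: this
    quantization error is exactly why the encoder runs its own inverse
    quantization/transform loop, so that its reference frames match
    what the decoder will reconstruct.
    """
    levels = [round(r / qstep) for r in residual]
    return [l * qstep for l in levels]

recon = lossy_round_trip([7, -3, 1, 0], 4)  # -> [8, -4, 0, 0]
```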
另外,视频编码器对变换/量化单元230输出的量化后的变换系数进行反量化和反变换,得到当前块的残差块,再将当前块的残差块与当前块的预测块进行相加,得到当前块的重建块。随着编码的进行,可以得到当前图像中其他图像块对应的重建块,这些重建块进行拼接,得到当前图像的重建图像。由于编码过程中引入误差,为了降低误差,对重建图像进行滤波,例如,使用ALF对重建图像进行滤波,以减小重建图像中像素点的像素值与当前图像中像素点的原始像素值之间差异。将滤波后的重建图像存放在解码图像缓存270中,可以为后续的帧作为帧间预测的参考帧。In addition, the video encoder performs inverse quantization and inverse transform on the quantized transform coefficients output by the transform/quantization unit 230 to obtain the residual block of the current block, and then adds this residual block to the prediction block of the current block to obtain the reconstructed block of the current block. As encoding proceeds, reconstructed blocks corresponding to the other image blocks of the current image are obtained, and these reconstructed blocks are stitched together into a reconstructed image of the current image. Because errors are introduced during encoding, the reconstructed image is filtered to reduce them; for example, ALF is used to filter the reconstructed image so as to reduce the difference between the pixel values of the reconstructed image and the original pixel values of the current image. The filtered reconstructed image is stored in the decoded image buffer 270 and may serve as a reference frame for inter prediction of subsequent frames.
需要说明的是,编码端确定的块划分信息,以及预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息等在必要时携带在码流中。解码端通过解析码流及根据已有信息进行分析确定与编码端相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。It should be noted that the block division information determined by the encoding end, as well as mode information or parameter information for prediction, transform, quantization, entropy coding, loop filtering, etc., is carried in the code stream when necessary. By parsing the code stream and analyzing the available information, the decoding end determines the same block division information and the same mode/parameter information for prediction, transform, quantization, entropy coding, loop filtering, etc. as the encoding end, thereby ensuring that the decoded image obtained at the decoding end is identical to that obtained at the encoding end.
图3是本申请实施例提供的解码框架300的示意性框图。FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
如图3所示,视频解码器300包含:熵解码单元310、预测单元320、反量化/变换单元330、重建单元340、环路滤波单元350及解码图像缓存360。需要说明的是,视频解码器300可包含更多、更少或不同的功能组件。As shown in FIG. 3 , the video decoder 300 includes an entropy decoding unit 310 , a prediction unit 320 , an inverse quantization/transformation unit 330 , a reconstruction unit 340 , a loop filtering unit 350 , and a decoded image buffer 360 . It should be noted that the video decoder 300 may include more, less or different functional components.
视频解码器300可接收码流。熵解码单元310可解析码流以从码流提取语法元素。作为解析码流的一部分,熵解码单元310可解析码流中的经熵编码后的语法元素。预测单元320、反量化/变换单元330、重建单元340及环路滤波单元350可根据从码流中提取的语法元素来解码视频数据,即产生解码后的视频数据。The video decoder 300 may receive the code stream. Entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream. The prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, and the in-loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, ie, generate decoded video data.
在一些实施例中,预测单元320包括帧内预测单元321和帧间预测单元322。In some embodiments, prediction unit 320 includes intra prediction unit 321 and inter prediction unit 322 .
帧内预测单元321可执行帧内预测以产生PU的预测块。帧内预测单元321可使用帧内预测模式以基于空间相邻PU的像素块来产生PU的预测块。帧内预测单元321还可根据从码流解析的一个或多个语法元素来确定PU的帧内预测模式。Intra-prediction unit 321 may perform intra-prediction to generate prediction blocks for the PU. Intra-prediction unit 321 may use an intra-prediction mode to generate prediction blocks for a PU based on pixel blocks of spatially neighboring PUs. Intra-prediction unit 321 may also determine an intra-prediction mode for the PU from one or more syntax elements parsed from the codestream.
帧间预测单元322可根据从码流解析的语法元素来构造第一参考图像列表(列表0)及第二参考图像列表(列表1)。此外,如果PU使用帧间预测编码,则熵解码单元310可解析PU的运动信息。帧间预测单元322可根据PU的运动信息来确定PU的一个或多个参考块。帧间预测单元322可根据PU的一个或多个参考块来产生PU的预测块。Inter-prediction unit 322 may construct a first reference picture list (List 0) and a second reference picture list (List 1) from the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter-prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter-prediction unit 322 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter-prediction unit 322 may generate a prediction block for the PU from one or more reference blocks of the PU.
反量化/变换单元330可逆量化(即,解量化)与TU相关联的变换系数。反量化/变换单元330可使用与TU的CU相关联的QP值来确定量化程度。The inverse quantization/transform unit 330 inversely quantizes (ie, dequantizes) the transform coefficients associated with the TUs. Inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.
在逆量化变换系数之后,反量化/变换单元330可将一个或多个逆变换应用于逆量化变换系数,以便产生与TU相关联的残差块。After inverse quantizing the transform coefficients, inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients to generate a residual block associated with the TU.
重建单元340使用与CU的TU相关联的残差块及CU的PU的预测块以重建CU的像素块。例如,重建单元340可将残差块的采样加到预测块的对应采样以重建CU的像素块,得到重建图像块。 Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU, resulting in a reconstructed image block.
环路滤波单元350可执行消块滤波操作以减少与CU相关联的像素块的块效应。In-loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
在一些实施例中,环路滤波单元350包括去块滤波单元、样点自适应补偿SAO单元、自适应环路滤波ALF单元。In some embodiments, the loop filtering unit 350 includes a deblocking filtering unit, a sample adaptive compensation SAO unit, an adaptive loop filtering ALF unit.
视频解码器300可将CU的重建图像存储于解码图像缓存360中。视频解码器300可将解码图像缓存360中的重建图像作为参考图像用于后续预测,或者,将重建图像传输给显示装置呈现。Video decoder 300 may store the reconstructed images of the CU in decoded image buffer 360 . The video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
本申请涉及的视频解码的基本流程如下:熵解码单元310可解析码流得到当前块的预测信息、量化系数矩阵等,预测单元320基于预测信息对当前块使用帧内预测或帧间预测产生当前块的预测块。反量化/变换单元330使用从码流得到的量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块。重建单元340将预测块和残差块相加得到重建块。重建块组成重建图像,环路滤波单元350基于图像或基于块对重建图像进行环路滤波,得到解码图像。该解码图像也可以称为重建图像,该重建图像一方面可以被显示设备进行显示,另一方面可以存放在解码图像缓存360中,为后续的帧作为帧间预测的参考帧。The basic video decoding flow involved in the present application is as follows. The entropy decoding unit 310 parses the code stream to obtain the prediction information, the quantized coefficient matrix, and so on of the current block; based on the prediction information, the prediction unit 320 generates a prediction block of the current block using intra prediction or inter prediction. The inverse quantization/transform unit 330 performs inverse quantization and inverse transform on the quantized coefficient matrix obtained from the code stream to obtain a residual block. The reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block. The reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image, either image-based or block-based, to obtain a decoded image. The decoded image may also be called a reconstructed image; it may be displayed by a display device, and it may also be stored in the decoded image buffer 360 to serve as a reference frame for inter prediction of subsequent frames.
上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。The above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. This application is applicable to the block-based hybrid coding framework. The basic process of the video codec, but not limited to the framework and process.
The video encoding system, video encoder, video decoder, and intra prediction modes involved in the embodiments of this application are introduced above. On this basis, the technical solutions provided by the embodiments of this application are described in detail below with reference to specific embodiments.

The encoding end is introduced below with reference to FIG. 4.

FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of this application; the embodiment is applied to the video encoder shown in FIG. 1 and FIG. 2. As shown in FIG. 4, the method of this embodiment includes:

S401: Obtain a reconstructed image of the current image, where the reconstructed image includes a first component, and the first component may be a luminance component or a chrominance component.

During video encoding, the video encoder receives a video stream consisting of a series of image frames and performs video encoding on each frame in the stream. For ease of description, this application refers to the frame currently to be encoded as the current image.
Specifically, as shown in FIG. 2, the video encoder divides the current image into one or more image blocks to be encoded. For each image block to be encoded, the prediction unit 210 in the video encoder generates a prediction block of the image block via inter prediction or intra prediction and sends the prediction block to the residual unit 220. The residual unit 220 may be understood as a summer, comprising one or more components that perform a subtraction operation. The residual unit 220 subtracts the prediction block from the image block to be encoded to form a residual block, and sends the residual block to the transform and quantization unit 230. The transform and quantization unit 230 transforms the residual block using, for example, a discrete cosine transform (DCT) or a similar transform to obtain transform coefficients, and further quantizes the transform coefficients to obtain quantized transform coefficients.

As can be seen from FIG. 2, on the one hand, the transform and quantization unit 230 forwards the quantized transform coefficients to the entropy encoding unit 280, which entropy-encodes them. For example, the entropy encoding unit 280 may apply context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another coding method to entropy-encode the quantized transform coefficients and obtain a bitstream.

On the other hand, the transform and quantization unit 230 forwards the quantized transform coefficients to the inverse transform and quantization unit 240, which inversely quantizes and inversely transforms the quantized transform coefficients to reconstruct the residual block in the pixel domain. The reconstruction unit 250 may be understood as a summer, comprising one or more components that perform an addition operation. The reconstruction unit 250 adds the reconstructed residual block to the prediction block generated by the prediction unit 210 to produce a partial or complete reconstructed image of the current image, which includes one or more reconstructed image blocks.

The reconstructed image includes a first component, which may be a luminance component or a chrominance component.
S402: When a first ALF is to be used to filter the target area in the reconstructed image under the first component, determine the floating-point ALF coefficients of the first ALF.

The video encoder in the embodiments of this application can be used for images in different formats, such as the YUV format or the YCbCr format.

In AVS3, each channel of the YUV or YCbCr format has an independent adaptive correction filter (hereinafter referred to as ALF). For example, under the chrominance components, the first chrominance component of a frame (e.g., the U component or Cb component) corresponds to one ALF, and the second chrominance component (e.g., the V component or Cr component) corresponds to another ALF. Under the luminance component, a frame is divided into multiple regions, for example 16 regions or 64 regions, and each region corresponds to one ALF. Based on this, when the first component is a chrominance component, this application takes the entire reconstructed image under the first component as the target area to be filtered; when the first component is the luminance component, this application divides the reconstructed image under the first component into regions, for example 16 regions or 64 regions, and takes one of the regions to be filtered as the target area. The ALF that performs adaptive correction filtering on the target area is denoted as the first ALF.

When the first ALF is used to filter the target area in the reconstructed image under the first component, a set of ALF coefficients of the first ALF must first be determined.
In some embodiments, the first ALF may take the form shown in FIG. 5A: a 7×7 cross plus a 3×3 square. Its set of ALF coefficients includes 9 coefficients, C0, C1, C2, C3, C4, C5, C6, C7, and C8, where C8 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients of the first ALF are symmetric about that center point.

In some embodiments, the first ALF may take the form shown in FIG. 5B: a 7×7 cross plus a 5×5 square. Its set of ALF coefficients includes 15 coefficients, C0, C1, C2, …, C13, and C14, where C14 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients of the first ALF are symmetric about that center point.

In this application, the Wiener-Hopf method is used to derive the floating-point ALF coefficients of the first ALF.
For example, the floating-point ALF coefficients of the first ALF are derived according to the following formula (1):
{C_0, C_1, …, C_{N−1}} = argmin (1/‖R‖) Σ_{r∈R} ( s[r] − Σ_{i=0}^{N−1} C_i · t[r + P_i] )²    (1)
where r = (x, y) is a pixel in the target area R, ‖R‖ denotes the number of pixels in the target area R, s[r] denotes the original pixel value of pixel r, and t[r] denotes the reconstructed pixel value of pixel r, i.e., the pixel value of r in the reconstructed image. C0, C1, …, CN−1 are the floating-point ALF coefficients of the first ALF, and {P_0, P_1, P_2, …, P_{N−1}} are position offsets relative to r.

If the shape of the first ALF is as shown in FIG. 5A, the above N is 8; if the shape of the first ALF is as shown in FIG. 5B, the above N is 14.

According to the above formula (1), the floating-point ALF coefficients C0, C1, …, CN−1 of the first ALF can be derived.
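The minimization in formula (1) is an ordinary linear least-squares problem, so the floating-point coefficients can be obtained with any least-squares solver. The following is a minimal 1-D sketch; flattening the 2-D offsets P_i to scalar offsets and using NumPy's solver are illustrative assumptions, not the codec's actual implementation:

```python
import numpy as np

def derive_alf_coeffs(s, t, offsets):
    """Least-squares solution of formula (1), shown in 1-D:
    minimize the mean of (s[r] - sum_i C_i * t[r + P_i])**2 over the
    target area. s: original samples, t: reconstructed samples,
    offsets: tap positions P_i relative to the current sample."""
    pad = max(abs(o) for o in offsets)
    r = np.arange(pad, len(t) - pad)                   # pixels of the target area
    A = np.stack([t[r + o] for o in offsets], axis=1)  # one column per tap
    coeffs, *_ = np.linalg.lstsq(A, s[r], rcond=None)
    return coeffs

# Toy check: t is a blurred copy of s, so the derived filter should
# partially undo the blur.
rng = np.random.default_rng(0)
s = rng.standard_normal(2000)
t = np.convolve(s, [0.25, 0.5, 0.25], mode="same")
coeffs = derive_alf_coeffs(s, t, offsets=[-1, 0, 1])
```

By optimality of the least-squares solution, filtering with these coefficients cannot increase the squared error relative to the unfiltered reconstruction over the same area.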
The value range of floating-point ALF coefficients is almost unrestricted, and in standard syntax elements and entropy coding it is difficult to use a single variable to represent a floating-point number with an unrestricted range. Therefore, the floating-point ALF coefficients need to be quantized into integer ALF coefficients; specifically, steps S403 to S406 are performed.
S403: Determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the floating-point ALF coefficients of the first ALF.

In AVS3, when the first ALF is the 9-coefficient filter shown in FIG. 5A, the quantized integer coefficients of the first 8 ALF coefficients of the first ALF (i.e., C0, C1, C2, C3, C4, C5, C6, and C7) are restricted to the range −64 to 63, and the quantized integer coefficient of the 9th ALF coefficient (i.e., C8) is restricted to the range 0 to 127.

When the first ALF is the 15-coefficient filter shown in FIG. 5B, the quantized integer coefficients of the first 14 ALF coefficients of the first ALF (i.e., C0, C1, …, C12, and C13) are restricted to the range −64 to 63, and the quantized integer coefficient of the 15th ALF coefficient (i.e., C14) is restricted to the range 0 to 127.

In this way, the maximum quantization scale of the floating-point ALF coefficients of the first ALF can be determined according to the floating-point ALF coefficients of the first ALF and the integer coefficient thresholds that apply when the floating-point ALF coefficients are quantized into integers.

In some embodiments, the above S403 includes the following S403-A1 and S403-A2:
S403-A1: Determine the maximum shift value allowed when quantizing the floating-point ALF coefficients into integers, according to a first coefficient among the floating-point ALF coefficients of the first ALF and the maximum integer coefficient threshold corresponding to the first coefficient;

S403-A2: Determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum shift value.

Among the ALF coefficients of the first ALF, the coefficient corresponding to the center position of the first ALF (e.g., C8 or C14) is the largest, and the coefficients corresponding to non-center positions of the first ALF are smaller. The first coefficient may be the floating-point ALF coefficient corresponding to the center position of the first ALF, or a floating-point ALF coefficient corresponding to a non-center position of the first ALF.

In one example, the first coefficient is the largest floating-point ALF coefficient among those corresponding to non-center positions of the first ALF, i.e., the ALF coefficient W_max(x,y), and the maximum integer coefficient threshold corresponding to W_max(x,y) is d1. In this way, the maximum shift value allowed when quantizing the floating-point ALF coefficient W_max(x,y) into an integer can be determined from W_max(x,y) and d1.

For example, the maximum shift value bitshift allowed when quantizing the floating-point ALF coefficients into integers is determined according to the following formula (2):
bitshift = floor( log2( d1 / |W_max(x,y)| ) )    (2)
where floor denotes rounding down, W_max(x,y) is the largest ALF coefficient among the floating-point ALF coefficients corresponding to non-center positions of the first ALF (x and y are not both equal to 0), and d1 is the maximum integer coefficient threshold corresponding to W_max(x,y), for example d1 = 63.

In another example, if the first coefficient is the floating-point coefficient corresponding to the center position of the first ALF, the maximum shift value bitshift allowed when quantizing the floating-point ALF coefficients into integers can be determined according to the following formula (3):
bitshift = floor( log2( d / W_f(0,0) ) )    (3)
where W_f(0,0) is the floating-point coefficient corresponding to the center position of the first ALF, and d is the maximum integer coefficient threshold corresponding to W_f(0,0), for example d = 127.

According to the above method, the maximum shift value bitshift allowed when quantizing the floating-point ALF coefficients into integers can be determined. Then, the maximum quantization scale of the floating-point ALF coefficients of the first ALF is determined according to the maximum shift value bitshift.

For example, the maximum quantization scale of the floating-point ALF coefficients of the first ALF is determined according to the following formula (4):

Scale_max = 2^bitshift    (4)
where Scale_max is the maximum quantization scale.
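Formulas (2) to (4) can be sketched as follows. The helper derives one shift bound from the largest non-center coefficient (threshold d1 = 63, formula (2)) and one from the center coefficient (threshold d = 127, formula (3)); taking the smaller of the two so that every quantized coefficient stays in range is an assumption made for this illustration.

```python
import math

def max_quantization_scale(center_coeff, noncenter_coeffs, d=127, d1=63):
    """Maximum shift/scale so that round(w * 2**bitshift) fits the
    integer coefficient thresholds (formulas (2)-(4))."""
    w_max = max(abs(w) for w in noncenter_coeffs)
    shift_noncenter = math.floor(math.log2(d1 / w_max))          # formula (2)
    shift_center = math.floor(math.log2(d / abs(center_coeff)))  # formula (3)
    bitshift = min(shift_noncenter, shift_center)  # assumption: both must hold
    return bitshift, 2 ** bitshift                 # formula (4)

bitshift, scale_max = max_quantization_scale(
    center_coeff=0.9, noncenter_coeffs=[0.12, -0.05, 0.2])
```

For these example coefficients the center tap is the binding constraint: 127/0.9 ≈ 141, so bitshift is 7 and Scale_max is 128, and every scaled coefficient indeed stays within its range.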
After the maximum quantization scale of the floating-point ALF coefficients of the first ALF is determined, the following S404 is performed to determine the target quantization scale of the floating-point ALF coefficients of the first ALF.

S404: Determine the target quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum quantization scale.

Here, the target quantization scale is less than or equal to the maximum quantization scale. For example, if the maximum quantization scale is 2^8, the target quantization scale may be determined to be 2^7, 2^6, 2^5, or any other power of 2 that is less than or equal to 2^8 and greater than 1.

The ways of determining, in S404, the target quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum quantization scale include but are not limited to the following:
Manner 1: If the first component is the luminance component, a second quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the second ALF, where the second quantization scale is less than or equal to the maximum quantization scale.

In this case, in some embodiments, if the reconstructed image includes chrominance components, a third quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of a third ALF, where the third ALF is the filter used when ALF-filtering the reconstructed image under the chrominance components, and the third quantization scale is smaller than the second quantization scale.

This is because the eye is more sensitive to luminance, so a higher quantization precision is used for the reconstructed image under the luminance component to improve its luminance information; the eye is less sensitive to chrominance, so a lower quantization precision is used for the reconstructed image under the chrominance components to improve coding efficiency.

Based on this, when the first component is the luminance component, the larger second quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the second ALF. The second ALF quantized with the second quantization scale thus has higher quantization precision, and using this second ALF to filter the reconstructed image under the luminance component improves the filtering precision. For example, if the maximum quantization scale is 2^8, the second quantization scale is determined to be 2^8, 2^7, 2^6, or the like.

In this manner, after the target quantization scale corresponding to the luminance component is determined, a third quantization scale smaller than the second quantization scale is directly determined as the target quantization scale corresponding to the chrominance components. That is, when the third ALF is used to filter the reconstructed image under the chrominance components, the third quantization scale can be determined as the target quantization scale of the floating-point ALF coefficients of the third ALF without additionally computing it, which reduces the computation load and improves coding efficiency. For example, the maximum quantization scale is 2^8, the second quantization scale is 2^8, 2^7, or 2^6, and the third quantization scale is 2^5, 2^4, or the like.
Manner 2: If the first component is a chrominance component, a fourth quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the second ALF, where the fourth quantization scale is smaller than the maximum quantization scale.

In this case, in some embodiments, a fifth quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of a fourth ALF, where the fourth ALF is the filter used when ALF-filtering the reconstructed image under the luminance component, and the fifth quantization scale is larger than the fourth quantization scale.

In this manner, when the first component is a chrominance component, the smaller fourth quantization scale is determined as the target quantization scale of the floating-point ALF coefficients of the second ALF. The ALF coefficients of the second ALF quantized with the fourth quantization scale thus have a shorter code length, which facilitates coding. For example, if the maximum quantization scale is 2^8, the fourth quantization scale is determined to be 2^5, 2^4, or the like.

In this manner, after the target quantization scale corresponding to the chrominance component is determined, a fifth quantization scale larger than the fourth quantization scale is directly determined as the target quantization scale corresponding to the luminance component, without additionally computing the target quantization scale corresponding to the luminance component, which reduces the computation load and improves coding efficiency. For example, the maximum quantization scale is 2^8, the fourth quantization scale is 2^5, 2^4, or the like, and the fifth quantization scale is 2^8, 2^7, 2^6, or the like.
Manner 3: The above S404 includes the following S404-A1 and S404-A2:

S404-A1: For each first quantization scale in the quantization interval formed by the maximum quantization scale and a preset minimum quantization scale, determine the quantization cost of that first quantization scale, where each first quantization scale is a positive integer power of 2;

S404-A2: Determine the target quantization scale of the floating-point ALF coefficients of the first ALF according to the quantization cost of each first quantization scale.

In this implementation, the maximum quantization scale and the preset minimum quantization scale form a quantization interval; for ease of description, each quantization scale in this interval is referred to as a first quantization scale. The quantization cost of each first quantization scale in the interval is determined, and the target quantization scale of the floating-point ALF coefficients of the first ALF is determined according to the quantization cost of each first quantization scale. For example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point ALF coefficients of the first ALF. Alternatively, the average of the quantization costs of the first quantization scales, which may be an arithmetic mean or a weighted mean, is determined as the target quantization cost of the floating-point ALF coefficients of the first ALF.
Optionally, the preset minimum quantization scale is 2^0, i.e., 1.

Optionally, the preset minimum quantization scale is 2^1, 2^2, or the like.
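Manner 3 can be sketched as a loop over the candidate powers of 2 between the preset minimum scale and the maximum scale, keeping the scale with the smallest cost. The cost function below (coefficient quantization error plus a crude bit-count proxy, combined with an illustrative lambda weight) is a stand-in assumption; the costs actually used are defined in S404-A11 to S404-A13.

```python
def choose_target_scale(float_coeffs, max_shift, min_shift=0):
    """Pick the power-of-2 quantization scale with the smallest cost
    among 2**min_shift .. 2**max_shift (Manner 3 of S404)."""
    def cost(scale):
        q = [round(c * scale) for c in float_coeffs]
        # Distortion proxy: squared quantization error of the coefficients.
        dist = sum((qi / scale - ci) ** 2 for qi, ci in zip(q, float_coeffs))
        # Rate proxy: rough bit count of the integer coefficients.
        bits = sum(abs(qi).bit_length() + 1 for qi in q)
        return dist + 0.0001 * bits  # illustrative lambda weighting

    scales = [2 ** p for p in range(min_shift, max_shift + 1)]
    return min(scales, key=cost)

target = choose_target_scale([0.61, -0.13, 0.27], max_shift=6)
```

A larger scale lowers the coefficient quantization error but lengthens the coded coefficients, so the argmin balances the two, mirroring the distortion-plus-bits trade-off of S404-A13.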
The process of determining the quantization cost of a first quantization scale in S404-A1 is described in detail below.

It should be noted that the quantization cost is determined in the same way for each first quantization scale in the quantization interval; for ease of description, this application takes one first quantization scale as an example.

In some embodiments, the coding cost of the first quantization scale is determined as the quantization cost of the first quantization scale. For example, the floating-point ALF coefficients of the first ALF are quantized with the first quantization scale into integer first ALF coefficients, and the number of bits occupied by encoding the integer first ALF coefficients is determined as the quantization cost of that first quantization scale.
In some embodiments, determining the quantization cost of the first quantization scale in S404-A1 includes the following steps S404-A11 to S404-A13:

S404-A11: Quantize the floating-point ALF coefficients of the first ALF into integer first ALF coefficients using the first quantization scale;

S404-A12: Determine the quantization distortion result corresponding to the first ALF coefficients when the first ALF coefficients are used to encode the target area of the reconstructed image;

S404-A13: Determine the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficients and the number of bits consumed by encoding the first ALF coefficients.

In this implementation, the floating-point ALF coefficients of the first ALF are first quantized into integer first ALF coefficients using the first quantization scale, for example according to the following formula (5):

AlfCoeff_integer = round(AlfCoeff_float × 2^p)    (5)
where AlfCoeff_integer is an integer first ALF coefficient, AlfCoeff_float is a floating-point ALF coefficient of the first ALF, round denotes rounding to the nearest integer, and 2^p is the first quantization scale.
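Formula (5), combined with the integer ranges stated under S403 (−64..63 for non-center taps, 0..127 for the center tap), can be sketched as below; the clipping step is an assumption based on those stated ranges rather than an exact reproduction of the standard's procedure.

```python
def quantize_alf_coeffs(float_coeffs, p, center_index):
    """Quantize floating-point ALF coefficients with scale 2**p
    (formula (5)) and clip to the AVS3 integer coefficient ranges."""
    scale = 2 ** p
    out = []
    for i, c in enumerate(float_coeffs):
        q = round(c * scale)           # AlfCoeff_integer = round(c * 2**p)
        if i == center_index:
            q = min(max(q, 0), 127)    # center tap restricted to 0..127
        else:
            q = min(max(q, -64), 63)   # other taps restricted to -64..63
        out.append(q)
    return out

coeffs_int = quantize_alf_coeffs([0.05, -0.1, 0.9], p=6, center_index=2)
```

With p = 6 the scale is 64, so 0.05, −0.1, and 0.9 quantize to 3, −6, and 58 respectively, all within their allowed ranges.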
After the floating-point ALF coefficients of the first ALF are quantized into the integer first ALF coefficients, S404-A12 is performed to determine the quantization distortion result corresponding to the first ALF coefficients when these integer first ALF coefficients are used to encode the target area of the reconstructed image under the first component.

In some embodiments, the ways of determining, in S404-A12, the quantization distortion result corresponding to the first ALF coefficients when the first ALF coefficients are used to encode the target area of the reconstructed image include but are not limited to the following:

Manner 1: Determine the autocorrelation coefficients of the pixels in the target area according to the reconstructed pixel values of the pixels in the target area; determine the cross-correlation coefficients of the pixels in the target area according to the original pixel values of the pixels in the target area; and determine the quantization distortion result corresponding to the first ALF coefficients according to the product of the autocorrelation coefficients of the pixels in the target area and the first ALF coefficients, together with the cross-correlation coefficients of the pixels in the target area.

Assuming the autocorrelation coefficients of the pixels in the target area form a matrix E, E can be determined according to the following formula (6):
E[i][j] = (1/‖R‖) Σ_{r∈R} t[r + P_i] · t[r + P_j]    (6)
where r = (x, y) is a pixel in the target area R, ‖R‖ denotes the number of pixels in the target area R, s[r] denotes the original pixel value of pixel r, t[r] denotes the reconstructed pixel value of pixel r (i.e., the pixel value of r in the reconstructed image), and {P_0, P_1, P_2, …, P_{N−1}} are position offsets relative to r.

Assuming the cross-correlation coefficients of the pixels in the target area form a vector Y, Y can be determined according to the following formula (7):
Y[i] = (1/‖R‖) Σ_{r∈R} s[r] · t[r + P_i]    (7)
where s[r] denotes the original pixel value of pixel r.
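Formulas (6) and (7) can be sketched directly from the sample arrays (again a 1-D illustration, with the 2-D offsets P_i flattened to scalar offsets as an assumption):

```python
import numpy as np

def correlation_stats(s, t, offsets):
    """Autocorrelation matrix E (formula (6)) and cross-correlation
    vector Y (formula (7)) over the target area."""
    pad = max(abs(o) for o in offsets)
    r = np.arange(pad, len(t) - pad)
    taps = np.stack([t[r + o] for o in offsets], axis=1)
    E = taps.T @ taps / len(r)  # E[i][j] = mean of t[r+P_i] * t[r+P_j]
    Y = taps.T @ s[r] / len(r)  # Y[i]    = mean of s[r] * t[r+P_i]
    return E, Y

# Tiny worked example: with t identically 1, every entry of E is 1 and
# each Y[i] is the mean of the original samples over the target area.
s = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
E, Y = correlation_stats(s, t, offsets=[-1, 0, 1])
```

These statistics only need to be accumulated once per target area; every candidate coefficient set can then be evaluated against the same E and Y.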
After the autocorrelation coefficients E and the cross-correlation coefficients Y of the pixels in the target area are determined as above, the quantization distortion result corresponding to the first ALF coefficients is determined according to the product of the autocorrelation coefficients E of the pixels in the target area and the first ALF coefficients, together with the cross-correlation coefficients Y of the pixels in the target area.

For example, the quantization distortion result corresponding to the first ALF coefficients is determined according to the following formula (8):

D = E·C − Y    (8)

where D is the quantization distortion result corresponding to the first ALF coefficients, E is the autocorrelation coefficient matrix, C is the vector of first ALF coefficients, and Y is the cross-correlation coefficient vector.
In some embodiments, denote the autocorrelation and the cross-correlation of the target area as E[15][15] and y[15] respectively, where 15 is the number of ALF coefficients of the first ALF, and let coeff[i] be the first ALF coefficient. The following program is executed to obtain the quantization distortion result D corresponding to the first ALF coefficient:

[Program listing rendered as an image in the source; not reproduced]

The value of dist in the program is D.
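The program listing itself is an image in the source, but the quantities around it (E[15][15], y[15], coeff[i], result dist) match the standard expansion of the filtering error, dist = c^T E c − 2 c^T y up to a constant. A hedged sketch of that computation, not the patent's exact listing:

```python
def alf_quant_distortion(E, y, coeff):
    """Distortion of a candidate ALF coefficient set, computed from the
    autocorrelation matrix E and cross-correlation vector y as
    c^T E c - 2 c^T y (the constant term does not depend on the coefficients,
    so it can be dropped when comparing candidates).
    """
    n = len(coeff)
    dist = 0.0
    for i in range(n):
        acc = sum(E[i][j] * coeff[j] for j in range(n))  # (E c)[i]
        dist += coeff[i] * (acc - 2.0 * y[i])
    return dist
```

Because only differences between candidates matter, this value can be compared directly across quantization scales.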
Manner 2: filter the target area of the reconstructed image using the first ALF coefficient, and determine the quantization distortion result corresponding to the first ALF coefficient according to the differences between the filtered pixel values and the original pixel values of the pixels in the target area.

Specifically, according to the first ALF coefficient, the first ALF is used to filter the target area of the reconstructed image under the first component, obtaining a filtered target area. The filtered pixel values of the pixels in the filtered target area are compared with the original pixel values to obtain the differences between them, and the quantization distortion result corresponding to the first ALF coefficient is determined according to these differences. For example, the larger the difference, the larger the quantization distortion corresponding to the first ALF coefficient; the smaller the difference, the smaller the quantization distortion corresponding to the first ALF coefficient.
The quantization distortion result corresponding to the first ALF coefficient is determined according to manner 1 or manner 2 above.

Next, the quantization cost of the first quantization scale is determined according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed in coding the first ALF coefficient.

For example, the quantization cost of the first quantization scale is determined according to the following formula (9):

J = D + λR        (9)

where J is the quantization cost of the first quantization scale, D is the quantization distortion result corresponding to the first ALF coefficient, R is the number of bits consumed in coding the first ALF coefficient, and λ is a variable value.

In some embodiments, R is an estimate of the number of bits consumed in coding the first ALF coefficient.
Specifically, taking an ALF with 15 coefficients as an example, the total number of bits consumed by the 15 filter coefficients needs to be estimated. Each filter coefficient is represented by the codeword length of a signed variable-length code (signed Exponential-Golomb code), as shown in Table 1:

Table 1

Coefficient value | Codeword | Bits consumed
        0         | 1        | 1
        1         | 010      | 3
       -1         | 011      | 3
        2         | 00100    | 5
       -2         | 00101    | 5
        3         | 00110    | 5
       -3         | 00111    | 5
        4         | 0001000  | 7
       -4         | 0001001  | 7
        5         | 0001010  | 7
       -5         | 0001011  | 7
        6         | 0001100  | 7
       -6         | 0001101  | 7
        7         | 0001110  | 7
       -7         | 0001111  | 7
For example, a filter coefficient of 0 consumes 1 bit, and a filter coefficient of 3 consumes 5 bits. By looking up Table 1, the number of bits consumed by each coefficient of the first ALF coefficient can be obtained; the sum of the bits consumed by all the coefficients is taken as the value of R and substituted into formula (9) above to obtain the quantization cost of the first quantization scale.
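The lengths in Table 1 are exactly those of a signed order-0 Exponential-Golomb code (positive v maps to code number 2v−1, non-positive v to −2v), so R can be estimated without generating any codewords. A sketch under that assumption; the function names are illustrative:

```python
def signed_eg0_bits(v):
    """Bits consumed by the signed order-0 Exp-Golomb codeword for v (Table 1)."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    # Length of an unsigned Exp-Golomb codeword: 2*floor(log2(code_num+1)) + 1
    return 2 * ((code_num + 1).bit_length() - 1) + 1

def estimate_rate(coeffs):
    """R in formula (9): total bits to code one set of ALF coefficients."""
    return sum(signed_eg0_bits(c) for c in coeffs)
```

For instance, the set [0, 3, -4] costs 1 + 5 + 7 = 13 bits, matching the Table 1 entries.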
In the above manner, the quantization cost corresponding to each first quantization scale in the quantization interval formed by the maximum quantization scale and the preset minimum quantization scale can be determined. The target quantization scale of the floating-point ALF coefficients of the first ALF is then determined according to the quantization cost of each first quantization scale. For example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point ALF coefficients of the first ALF. Next, the following step S405 is performed.
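The search over the quantization interval described above can be sketched as follows. The distortion and rate callbacks stand in for the distortion computation (manner 1 or 2) and the Table 1 estimate; all names are illustrative:

```python
def select_target_shift(float_coeffs, min_shift, max_shift, distortion, rate_bits, lam):
    """Pick the shift whose quantization scale 2**shift minimizes the cost
    J = D + lambda * R of formula (9) over [min_shift, max_shift].

    distortion(q, shift) and rate_bits(q) are caller-supplied functions
    evaluated on the quantized integer coefficients q.
    """
    best_shift, best_cost = None, None
    for shift in range(min_shift, max_shift + 1):
        q = [round(c * (1 << shift)) for c in float_coeffs]  # quantize per formula (10)
        cost = distortion(q, shift) + lam * rate_bits(q)
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift
```

With a constant distortion, the search simply returns the shift with the smallest coded-coefficient rate, as expected.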
S405. Using the target quantization scale, quantize the floating-point ALF coefficients of the first ALF into integer ALF coefficients.

For example, the floating-point ALF coefficients of the first ALF are quantized into integer ALF coefficients according to the following formula (10):

AlfCoeff_integer = round(AlfCoeff_float × 2^AlfShift)        (10)

where AlfCoeff_integer is an integer ALF coefficient, AlfCoeff_float is a floating-point ALF coefficient of the first ALF, round is the rounding operation, and 2^AlfShift is the target quantization scale.
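A direct rendering of formula (10); the function name is illustrative:

```python
def quantize_alf_coeffs(float_coeffs, alf_shift):
    """Formula (10): AlfCoeff_integer = round(AlfCoeff_float * 2**AlfShift)."""
    scale = 1 << alf_shift  # target quantization scale 2**AlfShift
    return [int(round(c * scale)) for c in float_coeffs]
```

Note how the same floating-point coefficients map to different integers under different target shifts, which is precisely the freedom the cost search exploits.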
S406. Code the integer ALF coefficients of the first ALF to obtain a bitstream.

After the integer ALF coefficients of the first ALF are determined according to the above method, they are coded and carried in the bitstream sent to the decoding end. The decoding end then decodes the residual values of the current image from the bitstream, determines the reconstructed image of the current image according to the residual values, decodes the integer ALF coefficients of the first ALF from the bitstream, and, according to these coefficients, uses the first ALF to filter the target area in the reconstructed image under the first component, so as to improve the accuracy of the reconstructed image.

In some embodiments, the bitstream includes all integer ALF coefficients of the first ALF. In this way, the decoding end can directly decode all integer ALF coefficients of the first ALF from the bitstream for subsequent filtering, without performing additional ALF coefficient computation, thereby improving the filtering efficiency of the decoding end.
In some embodiments, the bitstream includes the integer second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer ALF coefficients corresponding to the non-center points of the first ALF, and the first information is used to indicate the target quantization scale. In this way, after decoding the integer second ALF coefficients of the first ALF and the first information from the bitstream, the decoding end needs to determine the first ALF coefficient of the first ALF according to the second ALF coefficients and the first information, and then filter the target area in the reconstructed image under the first component using the first ALF according to the first ALF coefficient and the second ALF coefficients. Determining the first ALF coefficient of the first ALF according to the second ALF coefficients and the first information is described in detail for the decoding end above and is not repeated here.

In one example, the first information includes the target shift value corresponding to the target quantization scale. For example, if the target quantization scale is 2^AlfShift, the target shift value is AlfShift.

In one example, the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and a first value. For example, if the target shift value corresponding to the target quantization scale is AlfShift and the first value is a, the first information includes |AlfShift − a|. This reduces the data amount of the first information and saves coding resources.

Optionally, the first value is 6, 5, 4, or the like.
The changes of the present application at the syntax and semantics level are shown in Table 2 below:

Table 2

[Syntax table rendered as an image in the source; not reproduced]
adaptive_filter_shape_enable_flag is the adaptive correction filter shape flag, a binary variable. The value '1' indicates that adaptive correction filtering may use the 7×7 cross plus 5×5 square filter shape; the value '0' indicates that adaptive correction filtering uses the 7×7 cross plus 3×3 square filter shape.

adaptive_filter_shift_enable_flag is the adaptive correction filter shift enable flag, a binary variable. The value '1' indicates that the solution of the present application is adopted and the adaptive correction filter shift value may change; the value '0' indicates that the solution of the present application is not adopted and the adaptive correction filter shift value does not change, e.g., remains 6.

The value of AlfShiftEnableFlag is equal to the value of adaptive_filter_shift_enable_flag.
When both the luma component and the chroma components of the reconstructed image of the current image adopt the technical solution of the present application, the first information in the bitstream includes the shift information corresponding to the luma component and the shift information corresponding to the chroma components, as shown in Table 3 below:

Table 3

[Syntax table rendered as an image in the source; not reproduced]
where alf_luma_shift_num_minus6[i] is the coefficient shift of the adaptive correction filter corresponding to the luma component samples of the image.

As described above, the luma component may correspond to multiple adaptive correction filters, e.g., 16 or 64.

alf_coeff_luma[i][j] denotes the j-th ALF coefficient of the i-th adaptive correction filter corresponding to the luma component; this j-th ALF coefficient can be understood as the second ALF coefficient described above.

alf_luma_shift_num_minus6[i] denotes the quantization scale shift value of the i-th adaptive correction filter corresponding to the luma component minus 6, where 6 can be understood as the first value described above. Thus, AlfLumaShift[i] = alf_luma_shift_num_minus6[i] + 6.

alf_chroma_shift_num_minus6[i] denotes the coefficient shift of the adaptive correction filter corresponding to the chroma component samples of the image, where i = 0, 1.

When i = 0, alf_coeff_chroma[0][j] denotes the j-th ALF coefficient of the adaptive correction filter corresponding to the Cb component. This j-th ALF coefficient can be understood as the second ALF coefficient described above.

alf_chroma_shift_num_minus6[0] denotes the quantization scale shift value of the adaptive correction filter corresponding to the Cb component minus 6. Thus, AlfChromaShift[0] = alf_chroma_shift_num_minus6[0] + 6.

When i = 1, alf_coeff_chroma[1][j] denotes the j-th ALF coefficient of the adaptive correction filter corresponding to the Cr component. This j-th ALF coefficient can be understood as the second ALF coefficient described above.

alf_chroma_shift_num_minus6[1] denotes the quantization scale shift value of the adaptive correction filter corresponding to the Cr component minus 6. Thus, AlfChromaShift[1] = alf_chroma_shift_num_minus6[1] + 6.
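The *_shift_num_minus6 semantics above reduce to adding the first value (6) back at the decoding end; a minimal sketch, with an illustrative function name:

```python
def alf_shifts_from_syntax(luma_minus6, chroma_minus6, first_value=6):
    """Recover the per-filter shift values from the parsed syntax elements:
    AlfLumaShift[i]   = alf_luma_shift_num_minus6[i]   + 6
    AlfChromaShift[i] = alf_chroma_shift_num_minus6[i] + 6  (Cb: i=0, Cr: i=1)
    """
    return ([m + first_value for m in luma_minus6],
            [m + first_value for m in chroma_minus6])
```

Coding the shift as an offset from 6 keeps the syntax element small when the chosen shift stays near the legacy fixed value.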
In some embodiments, the present application further includes: filtering the target area of the reconstructed image under the first component using the integer ALF coefficients of the first ALF.
For example, the target area of the reconstructed image under the first component is filtered according to the following formula (11):

I_rec(0,0)' = ( Σ_{j=0}^{n−2} W_j × ( I_rec(x_j, y_j) + I_rec(−x_j, −y_j) ) + W_{n−1} × I_rec(0,0) + 2^(AlfShift−1) ) >> AlfShift        (11)

where I_rec(0,0)' is the filtered reconstructed pixel value of the current point (0,0) in the target area, (x_j, y_j) is a position relative to the current point, W_j is the j-th integer ALF coefficient of the first ALF, W_{n−1} is the integer ALF coefficient corresponding to the center point of the first ALF, I_rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
In some embodiments, when the shape of the first ALF is the 7×7 cross plus 3×3 square shown in FIG. 5A, the first ALF includes 9 ALF coefficients, where the correspondence between the values of (x, y) and W_j is shown in Table 4 below:

Table 4

Value of j | Value of x | Value of y
    0      |     0      |     3
    1      |     0      |     2
    2      |     1      |     1
    3      |     0      |     1
    4      |     1      |    -1
    5      |     3      |     0
    6      |     2      |     0
    7      |     1      |     0
In some embodiments, when the shape of the first ALF is the 7×7 cross plus 5×5 square shown in FIG. 5B, the first ALF includes 15 ALF coefficients, where the correspondence between the values of (x, y) and W_j is shown in Table 5 below:

Table 5

Value of j | Value of x | Value of y
    0      |     0      |     3
    1      |     2      |     2
    2      |     1      |     2
    3      |     0      |     2
    4      |     1      |    -2
    5      |     2      |    -2
    6      |     2      |     1
    7      |     1      |     1
    8      |     0      |     1
    9      |     1      |    -1
   10      |     2      |    -1
   11      |     3      |     0
   12      |     2      |     0
   13      |     1      |     0
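Combining formula (11) with the offset tables: each non-center coefficient weights the symmetric pair of samples at ±(x_j, y_j). A sketch for the 15-coefficient shape of Table 5, assuming a rounding offset of 2^(AlfShift−1) and omitting border handling and clipping:

```python
# (x, y) offsets for the 15-coefficient 7x7 cross plus 5x5 square ALF (Table 5);
# W[14] is the coefficient of the center point.
OFFSETS_15 = [(0, 3), (2, 2), (1, 2), (0, 2), (1, -2), (2, -2), (2, 1),
              (1, 1), (0, 1), (1, -1), (2, -1), (3, 0), (2, 0), (1, 0)]

def alf_filter_sample(rec, x0, y0, W, alf_shift, offsets=OFFSETS_15):
    """Apply formula (11) at sample (x0, y0): each non-center integer
    coefficient W[j] weights the symmetric pair of reconstructed samples at
    +/-(x_j, y_j); the accumulated sum is rounded and shifted back by the
    target shift value. Border handling is omitted in this sketch.
    """
    acc = 0
    for j, (dx, dy) in enumerate(offsets):
        acc += W[j] * (rec[y0 + dy][x0 + dx] + rec[y0 - dy][x0 - dx])
    acc += W[len(offsets)] * rec[y0][x0]  # center coefficient W[n-1]
    return (acc + (1 << (alf_shift - 1))) >> alf_shift
```

With the identity filter (center coefficient equal to the quantization scale, all others zero), a flat area passes through unchanged, which is a useful sanity check of the shift-back.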
It should be noted that filtering the target area of the reconstructed image under the first component using the integer ALF coefficients of the first ALF, and coding the integer ALF coefficients of the first ALF to obtain the bitstream, are not performed in any fixed order. That is, the encoder may first code the integer ALF coefficients of the first ALF and then use them to filter the target area of the reconstructed image under the first component; it may first use the integer ALF coefficients of the first ALF to filter the target area of the reconstructed image under the first component and then code them; or it may code the integer ALF coefficients of the first ALF while filtering the target area of the reconstructed image under the first component with them.
In the embodiments of the present application, when the target area in the reconstructed image under the first component is to be filtered using the first ALF, the floating-point ALF coefficients of the first ALF are determined, and the maximum quantization scale of these floating-point ALF coefficients is determined according to them. Then, the target quantization scale of the floating-point ALF coefficients of the first ALF is determined according to the maximum quantization scale, and the floating-point ALF coefficients are quantized into integer ALF coefficients using the target quantization scale. Compared with the current practice of quantizing the floating-point ALF coefficients of the first ALF with a fixed quantization scale, the present application takes into account that the ranges of the floating-point ALF coefficients derived for different areas or different frames may differ, so that the filtering gains may also differ. A variable quantization scale is used to quantize the ALF coefficients corresponding to different target areas, so that the quantized ALF coefficients achieve a better balance between the overhead of coding the filter coefficients and the filtering gain.
To further illustrate the technical effect of the present application, the technical solution of the present application was implemented on the AVS3 reference software HPM-9.1 and tested on the test sequences required by AVS3 under the all-intra (All Intra) configuration. The test results are shown in Table 6:

Table 6

[Results table rendered as an image in the source; not reproduced]
where 通甲 4K denotes the 4K-resolution test videos, with a resolution of 3840×2160; 通乙 1080p denotes the 1080p-resolution test videos, with a resolution of 1920×1080; and 通丙 720p denotes the 720p-resolution test videos, with a resolution of 1280×720. EncT is the ratio of the encoding time of the present application to that of the original algorithm, and DecT is the ratio of the decoding time of the present application to that of the original algorithm.

BD-rate is one of the main metrics for evaluating the performance of a video coding algorithm. It indicates the change in bit rate and PSNR (Peak Signal-to-Noise Ratio) of the video coded by the new algorithm (i.e., the technical solution of the present application) relative to the original algorithm, i.e., the change in bit rate between the new algorithm and the original algorithm at the same signal-to-noise ratio. '-' indicates a performance improvement, e.g., an improvement in bit rate and PSNR performance. As shown in Table 6, under the All Intra configuration, the average BD-rate changes on the Y, Cb, and Cr components are -0.14%, 0.01%, and -0.00%, respectively.
The test sequences required by AVS3 were also tested under the random-access (Random Access) configuration, which contains both intra and inter frames. The test results are shown in Table 7:

Table 7

[Results table rendered as an image in the source; not reproduced]
As shown in Table 7, under the Random Access configuration, the average BD-rate changes on the Y, Cb, and Cr components are -0.14%, -0.01%, and -0.00%, respectively, indicating that the coding performance is improved by using the technology of the present application.

The present application brings an additional performance gain to the ALF design with very little added computational complexity at the encoding end and unchanged complexity at the decoding end.
FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application. Taking the first component being the luma component as an example, as shown in FIG. 6, the method includes:

S601. Obtain a reconstructed image of the current image, the reconstructed image including a luma component.

S602. Divide the reconstructed image under the luma component into areas according to a preset area division rule to obtain a target area to be filtered. For example, the preset area division rule is to divide the reconstructed image under the luma component into 16 areas; in this case, the reconstructed image under the luma component is divided into 16 areas according to a preset division manner (e.g., uniform division). As another example, the preset area division rule is to divide the reconstructed image under the luma component into 64 areas; in this case, the reconstructed image under the luma component is divided into 64 areas according to a preset division manner (e.g., uniform division).
S603. Determine the floating-point ALF coefficients of the first ALF for filtering the target area under the luma component using the first ALF.

S604. Determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the floating-point ALF coefficients of the first ALF.

S605. Determine the target quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum quantization scale.

S606. Using the target quantization scale, quantize the floating-point ALF coefficients of the first ALF into integer ALF coefficients.

S607. Code the integer ALF coefficients of the first ALF to obtain a bitstream.

For the specific implementation of S603 to S607 above, refer to the description of S402 to S406 above, which is not repeated here.
S608. Filter the target area under the luma component using the integer ALF coefficients of the first ALF.

It should be noted that S608 and S607 are not performed in any fixed order: S608 may be performed before S607, after S607, or simultaneously with S607.
In some embodiments, only the luma component of the reconstructed image may use the variable quantization scale of the present application for ALF coefficient quantization, while the chroma components of the reconstructed image still use the current fixed quantization scale, e.g., 2^6. In this case, the encoder carries the variable quantization scale information corresponding to the luma component in the bitstream sent to the decoding end, while the bitstream does not carry the quantization scale corresponding to the chroma components. The adaptive correction filter parameters are then defined as shown in Table 8:

Table 8

[Syntax table rendered as an image in the source; not reproduced]
After obtaining Table 8, the decoding end parses from it the variable quantization scale information corresponding to the luma component; this information may be the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first value. The decoding end can then determine the first ALF coefficient of the first ALF according to the second ALF coefficients corresponding to the luma component (i.e., alf_coeff_luma[i][j]) and the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first value, and then filter the target area in the reconstructed image under the luma component using the first ALF according to the first ALF coefficient and the second ALF coefficients of the first ALF, improving the filtering accuracy.

In addition, since the decoding end does not parse any variable quantization scale information corresponding to the chroma components from Table 8, it uses fixed quantization scale information, e.g., a fixed shift value of 6, to filter the reconstructed image under the chroma components. Specifically, the decoding end can determine the first ALF coefficient of the first ALF according to the second ALF coefficients corresponding to the chroma components in Table 8 (i.e., alf_coeff_chroma[0][j] or alf_coeff_chroma[1][j]) and the fixed shift value 6, and then filter the target area in the reconstructed image under the chroma component using the first ALF according to the first ALF coefficient and the second ALF coefficients of the first ALF, improving the filtering accuracy.
FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application. Taking the first component being a chroma component as an example, as shown in FIG. 7, the method includes:

S701. Obtain a reconstructed image of the current image, the reconstructed image including chroma components, where the chroma components include a first chroma component and/or a second chroma component; the first chroma component is, for example, U or Cb, and the second chroma component is, for example, V or Cr.

S702. Take the entire reconstructed image under the chroma component as the target area. When filtering the reconstructed image under the chroma components, one adaptive correction filter is used to filter the entire reconstructed image under the Cb component, and one adaptive correction filter is used to filter the entire reconstructed image under the Cr component; therefore, in this embodiment of the present application, the entire reconstructed image is taken as the target area for ALF filtering.

S703. Determine the floating-point ALF coefficients of the first ALF for filtering the target area under the chroma component using the first ALF.

S704. Determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the floating-point ALF coefficients of the first ALF.

S705. Determine the target quantization scale of the floating-point ALF coefficients of the first ALF according to the maximum quantization scale.

S706. Using the target quantization scale, quantize the floating-point ALF coefficients of the first ALF into integer ALF coefficients.

S707. Code the integer ALF coefficients of the first ALF to obtain a bitstream.

For the specific implementation of S703 to S707 above, refer to the description of S402 to S406 above, which is not repeated here.

S708. Filter the target area under the chroma component using the integer ALF coefficients of the first ALF.

It should be noted that S708 and S707 are not performed in any fixed order: S708 may be performed before S707, after S707, or simultaneously with S707.
在一些实施例中,本申请可以只对重建图像的第一色度分量采用本申请的可变动的量化尺度进行ALF系数量化,而重建图像的第二色度分量和亮度分量依然采用目前固定的量化尺度,例如为2 6。在这种情况下,编码器将第一色度分量对应的可变动的量化尺度信息携带在码流中发送给解码端,而在码流中不携带第二色度分量和亮度分量对应的量化尺度。此时,自适应修正滤波参数定义,如表9所示: In some embodiments, the present application may only use the variable quantization scale of the present application to perform ALF coefficient quantization on the first chrominance component of the reconstructed image, while the second chrominance component and luminance component of the reconstructed image still use the currently fixed quantization scale. Quantization scale, eg 2 6 . In this case, the encoder carries the variable quantization scale information corresponding to the first chrominance component in the code stream and sends it to the decoder, but does not carry the quantization corresponding to the second chrominance component and the luminance component in the code stream scale. At this time, the definition of adaptive correction filtering parameters is shown in Table 9:
表9Table 9
Figure PCTCN2021073409-appb-000014
Figure PCTCN2021073409-appb-000015
解码端获得表9后，从表9中解析出第一色度分量对应的可变动的量化尺度信息，根据第一色度分量对应的可变动的量化尺度信息，确定第一ALF的ALF系数，并根据确定的第一ALF的ALF系数，使用第一ALF对第一色度分量下的重建图像进行滤波，提高滤波精度。After obtaining Table 9, the decoding end parses the variable quantization scale information corresponding to the first chrominance component from Table 9, determines the ALF coefficients of the first ALF according to that information, and uses the first ALF to filter the reconstructed image under the first chrominance component according to the determined ALF coefficients, thereby improving filtering precision.
解码端使用已有的方法确定第二色度分量和/或亮度分量对应的ALF系数,并使用确定的ALF系数对第二色度分量和/或亮度分量下的重建图像进行滤波。The decoding end uses an existing method to determine the ALF coefficient corresponding to the second chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the second chrominance component and/or the luminance component.
在一些实施例中，本申请可以只对重建图像的第二色度分量采用本申请的可变动的量化尺度进行ALF系数量化，而重建图像的第一色度分量和亮度分量依然采用目前固定的量化尺度，例如为2^6。在这种情况下，编码器将第二色度分量对应的可变动的量化尺度信息携带在码流中发送给解码端，而在码流中不携带第一色度分量和亮度分量对应的量化尺度。此时，自适应修正滤波参数定义，如表10所示： In some embodiments, the present application may apply the variable quantization scale of the present application to ALF coefficient quantization only for the second chrominance component of the reconstructed image, while the first chrominance component and the luminance component of the reconstructed image still use the current fixed quantization scale, e.g., 2^6. In this case, the encoder carries the variable quantization scale information corresponding to the second chrominance component in the code stream and sends it to the decoder, but does not carry the quantization scales corresponding to the first chrominance component and the luminance component in the code stream. At this time, the adaptive correction filter parameters are defined as shown in Table 10:
表10Table 10
Figure PCTCN2021073409-appb-000016
Figure PCTCN2021073409-appb-000017
解码端获得表10后，从表10中解析出第二色度分量对应的可变动的量化尺度信息，根据第二色度分量对应的可变动的量化尺度信息，确定第一ALF的ALF系数，并根据确定的第一ALF的ALF系数，使用第一ALF对第二色度分量下的重建图像进行滤波，提高滤波精度。After obtaining Table 10, the decoding end parses the variable quantization scale information corresponding to the second chrominance component from Table 10, determines the ALF coefficients of the first ALF according to that information, and uses the first ALF to filter the reconstructed image under the second chrominance component according to the determined ALF coefficients, thereby improving filtering precision.
解码端使用已有的方法确定第一色度分量和/或亮度分量对应的ALF系数,并使用确定的ALF系数对第一色度分量和/或亮度分量下的重建图像进行滤波。The decoding end uses an existing method to determine the ALF coefficient corresponding to the first chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the first chrominance component and/or the luminance component.
在一些实施例中，本申请可以只对重建图像的色度分量（包括第一色度分量和第二色度分量）采用本申请的可变动的量化尺度进行ALF系数量化，而重建图像的亮度分量依然采用目前固定的量化尺度，例如为2^6。在这种情况下，编码器将色度分量对应的可变动的量化尺度信息携带在码流中发送给解码端，而在码流中不携带亮度分量对应的量化尺度。此时，自适应修正滤波参数定义，如表11所示， In some embodiments, the present application may apply the variable quantization scale of the present application to ALF coefficient quantization only for the chrominance components (including the first chrominance component and the second chrominance component) of the reconstructed image, while the luminance component of the reconstructed image still uses the current fixed quantization scale, e.g., 2^6. In this case, the encoder carries the variable quantization scale information corresponding to the chrominance components in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the luminance component in the code stream. At this time, the adaptive correction filter parameters are defined as shown in Table 11.
表11Table 11
Figure PCTCN2021073409-appb-000018
Figure PCTCN2021073409-appb-000019
解码端获得表11后,从表11中解析出第一色度分量和第二色度分量对应的可变动的量化尺度信息。根据第一色度分量对应的可变动的量化尺度信息,确定第一色度分量对应的ALF系数,并根据确定的ALF系数,对第一色度分量下的重建图像进行滤波。根据第二色度分量对应的可变动的量化尺度信息,确定第二色度分量对应的ALF系数,并根据确定的ALF系数,对第二色度分量下的重建图像进行滤波。After obtaining Table 11, the decoding end parses out the variable quantization scale information corresponding to the first chrominance component and the second chrominance component from Table 11. The ALF coefficient corresponding to the first chrominance component is determined according to the variable quantization scale information corresponding to the first chrominance component, and the reconstructed image under the first chrominance component is filtered according to the determined ALF coefficient. The ALF coefficient corresponding to the second chrominance component is determined according to the variable quantization scale information corresponding to the second chrominance component, and the reconstructed image under the second chrominance component is filtered according to the determined ALF coefficient.
解码端使用已有的方法确定亮度分量对应的ALF系数,并使用确定的ALF系数对亮度分量下的重建图像进行滤波。The decoding end uses the existing method to determine the ALF coefficient corresponding to the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the luminance component.
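The three signalling variants above (Tables 9 to 11) can be summarized as a per-component shift lookup. The following Python sketch is illustrative only: the component names, the dictionary layout, and the decision logic are assumptions; only the fixed fallback scale 2^6 comes from the text above.

```python
FIXED_SHIFT = 6  # exponent of the fixed quantization scale 2^6 (from the text)

def component_shift(component, signalled_shifts):
    # A component whose variable shift is carried in the code stream uses it;
    # any other component falls back to the fixed scale 2^6.
    return signalled_shifts.get(component, FIXED_SHIFT)

# Table-11-style variant: both chroma shifts signalled, luma stays fixed.
shifts = {"cb": 7, "cr": 5}
luma_shift = component_shift("y", shifts)  # falls back to the fixed 6
```

Under this reading, the decoder never needs a luma shift from the bitstream in the Table-11 variant; the lookup simply returns the fixed default.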
需要说明的是，解码端具体确定ALF系数的方法，参照解码端的具体描述。It should be noted that, for the specific method by which the decoding end determines the ALF coefficients, refer to the specific description of the decoding end.
上文对本申请实施例涉及的视频编码方法进行了描述,在此基础上,下面针对解码端,对本申请涉及的视频解码方法进行描述。The video encoding method involved in the embodiments of the present application is described above. Based on this, the following describes the video decoding method involved in the present application for the decoding end.
图8为本申请实施例提供的视频解码方法800的一种流程示意图,如图8所示,本申请实施例的方法包括:FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application. As shown in FIG. 8 , the method of the embodiment of the present application includes:
S801、解码码流,得到当前图像的残差值。S801. Decode the code stream to obtain the residual value of the current image.
S802、根据当前图像的残差值,确定当前图像的重建图像,重建图像包括第一分量,第一分量为亮度分量或色度分量。S802. Determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component.
具体的，参照图3所示，解码器中的熵解码单元310可解析码流得到当前图像中当前块的预测信息、量化系数矩阵等，预测单元320基于预测信息对当前块使用帧内预测或帧间预测产生当前块的预测块。反量化/变换单元330使用从码流得到的量化系数矩阵，对量化系数矩阵进行反量化、反变换得到残差块。重建单元340将预测块和残差块相加得到重建块。依次类推，可以得到当前图像中其他图像块的重建块，各重建块组成重建图像。Specifically, as shown in FIG. 3, the entropy decoding unit 310 in the decoder can parse the code stream to obtain the prediction information, the quantization coefficient matrix, and so on of the current block in the current image; based on the prediction information, the prediction unit 320 generates a prediction block of the current block using intra prediction or inter prediction. The inverse quantization/transform unit 330 performs inverse quantization and inverse transformation on the quantization coefficient matrix obtained from the code stream to obtain a residual block. The reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block. By analogy, the reconstructed blocks of the other image blocks in the current image can be obtained, and the reconstructed blocks together form the reconstructed image.
该重建图像包括第一分量,第一分量为亮度分量或色度分量。The reconstructed image includes a first component, which is a luminance component or a chrominance component.
S803、解码码流，得到第一分量下重建图像中的目标区域对应的第一自适应修正滤波器ALF的ALF系数信息。S803. Decode the code stream to obtain ALF coefficient information of a first adaptive correction filter (ALF) corresponding to the target area in the reconstructed image under the first component.
S804、根据第一ALF的ALF系数信息,使用第一ALF对第一分量下的重建图像中的目标区域进行滤波。S804. According to the ALF coefficient information of the first ALF, use the first ALF to filter the target area in the reconstructed image under the first component.
本申请中对第一分量下的重建图像是否进行自适应修正滤波，以及对重建图像中的哪一个区域进行自适应修正滤波进行了设定。例如，设定有自适应修正滤波开关，当第一分量下的重建图像的目标区域对应的自适应修正滤波开关为开，例如开关的控制信号为1，则表示对第一分量下的重建图像的目标区域使用第一ALF进行滤波。若第一分量下的重建图像的目标区域对应的自适应修正滤波开关为关，例如开关的控制信号为0，则表示对第一分量下的重建图像的目标区域不进行自适应修正滤波。In this application, whether to perform adaptive correction filtering on the reconstructed image under the first component, and on which region of the reconstructed image to perform it, is configured. For example, an adaptive correction filter switch is provided; when the switch corresponding to the target area of the reconstructed image under the first component is on (for example, the control signal of the switch is 1), the target area of the reconstructed image under the first component is filtered using the first ALF. If the switch corresponding to the target area of the reconstructed image under the first component is off (for example, the control signal of the switch is 0), no adaptive correction filtering is performed on the target area of the reconstructed image under the first component.
本申请，在确定使用第一ALF对第一分量下的重建图像的目标区域进行自适应修正滤波时，解码码流，得到码流中携带的第一ALF的ALF系数信息。根据第一ALF的ALF系数信息，使用第一ALF对第一分量下的重建图像中的目标区域进行滤波。In this application, when it is determined that the first ALF is to be used to perform adaptive correction filtering on the target area of the reconstructed image under the first component, the code stream is decoded to obtain the ALF coefficient information of the first ALF carried in the code stream. According to the ALF coefficient information of the first ALF, the first ALF is used to filter the target area in the reconstructed image under the first component.
在一些实施例中,第一ALF的ALF系数信息包括第一ALF的所有整数类型的ALF系数。这样,解码端直接根据解码出的第一ALF的所有整数类型的ALF系数,使用第一ALF对第一分量下的重建图像中的目标区域进行滤波,其滤波过程简单。In some embodiments, the ALF coefficient information of the first ALF includes all integer-type ALF coefficients of the first ALF. In this way, the decoding end directly uses the first ALF to filter the target region in the reconstructed image under the first component according to all integer-type ALF coefficients of the first ALF decoded, and the filtering process is simple.
在一些实施例中，第一ALF的ALF系数信息包括第一ALF的整数类型的第二ALF系数和第一信息，其中，第二ALF系数为第一ALF中非中心点对应的整数类型的ALF系数，第一信息用于指示目标量化尺度。In some embodiments, the ALF coefficient information of the first ALF includes the integer-type second ALF coefficients of the first ALF and first information, where a second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF, and the first information is used to indicate the target quantization scale.
在一种示例中,第一信息包括目标量化尺度对应的目标移位值。In one example, the first information includes a target shift value corresponding to the target quantization scale.
在另一种示例中,第一信息包括目标量化尺度对应的目标移位值与第一数值的差值的绝对值。In another example, the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
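The two signalling examples above can be sketched as follows. This Python snippet is illustrative, not from the specification: the base value BASE_SHIFT = 6 (the exponent of the fixed scale 2^6) standing in for the "first value", and the explicitly passed sign, are both assumptions.

```python
BASE_SHIFT = 6  # assumed "first value": exponent of the fixed scale 2^6

def shift_from_direct(signalled_shift):
    # Example 1: the first information carries the target shift value itself.
    return signalled_shift

def shift_from_delta(abs_delta, sign):
    # Example 2: the first information carries |target_shift - first_value|;
    # how the sign is resolved is assumed to be known to the decoder.
    return BASE_SHIFT + abs_delta if sign >= 0 else BASE_SHIFT - abs_delta

target_scale = 1 << shift_from_delta(1, +1)  # one above the base -> 2^7 = 128
```

Signalling only the absolute difference from a known base (Example 2) keeps the coded value small when the variable shift stays near the fixed default.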
在该实施例中,上述S804包括如下步骤S804-A1和S804-A2:In this embodiment, the above S804 includes the following steps S804-A1 and S804-A2:
S804-A1、根据第一ALF的第二ALF系数和目标量化尺度,确定第一ALF的第一ALF系数,第一ALF系数为第一ALF的中心点对应的整数类型的ALF系数;S804-A1, according to the second ALF coefficient of the first ALF and the target quantization scale, determine the first ALF coefficient of the first ALF, and the first ALF coefficient is an integer type ALF coefficient corresponding to the center point of the first ALF;
S804-A2、根据第一ALF的第二ALF系数和第一ALF系数,使用第一ALF对第一分量下的重建图像中的目标区域进行滤波。S804-A2. Use the first ALF to filter the target area in the reconstructed image under the first component according to the second ALF coefficient and the first ALF coefficient of the first ALF.
由上述表8可知，码流中包括第一ALF的第二ALF系数和目标量化尺度信息，这样可以根据第一ALF的第二ALF系数和目标量化尺度信息，确定出第一ALF的中心位置对应的第一ALF系数。As can be seen from Table 8 above, the code stream includes the second ALF coefficients of the first ALF and the target quantization scale information, so that the first ALF coefficient corresponding to the center position of the first ALF can be determined from the second ALF coefficients of the first ALF and the target quantization scale information.
在一些实施例中,可以根据如下公式(12)确定出第一ALF的第一ALF系数:In some embodiments, the first ALF coefficient of the first ALF can be determined according to the following formula (12):
Figure PCTCN2021073409-appb-000020
其中，AlfCoeffLuma[i][k]为第一ALF的第一ALF系数，k为第一ALF所包括的ALF系数的数量减一，例如，第一ALF如图5A所示，包括9个ALF系数时，则k=8，若第一ALF如图5B所示，包括15个ALF系数时，则k=14。2^AlfLumaShift[i]为第一ALF对应的目标量化尺度，AlfCoeffLuma[i][j]为第一ALF中的第j个第二ALF系数。where AlfCoeffLuma[i][k] is the first ALF coefficient of the first ALF, and k is the number of ALF coefficients included in the first ALF minus one; for example, when the first ALF includes 9 ALF coefficients as shown in FIG. 5A, k=8, and when the first ALF includes 15 ALF coefficients as shown in FIG. 5B, k=14. 2^AlfLumaShift[i] is the target quantization scale corresponding to the first ALF, and AlfCoeffLuma[i][j] is the j-th second ALF coefficient in the first ALF.
解码器从码流中解码出第一ALF的第二ALF系数，并根据上述公式(12)确定出第一ALF系数，这样，解码器可以根据第一ALF的第一ALF系数和各第二ALF系数，使用第一ALF对第一分量下的重建图像中的目标区域进行滤波，进而提高重建图像的精度。The decoder decodes the second ALF coefficients of the first ALF from the code stream and determines the first ALF coefficient according to the above formula (12). In this way, the decoder can use the first ALF to filter the target area in the reconstructed image under the first component according to the first ALF coefficient and the second ALF coefficients of the first ALF, thereby improving the accuracy of the reconstructed image.
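Formula (12) itself is rendered as an unextracted image in this text. The following Python sketch shows one common form such a derivation may take, assumed here: the centre tap is chosen so that the non-centre taps (each counted twice, by point symmetry) and the centre tap sum to the quantization scale 2^shift. The coefficient values and the shift of 6 are illustrative only.

```python
def center_coeff(non_center, shift):
    # Assumed form of formula (12): the centre tap makes the doubled
    # non-centre taps plus the centre tap sum to the scale 2^shift.
    return (1 << shift) - 2 * sum(non_center)

# 8 non-centre taps of a 9-coefficient filter (7x7 cross + 3x3 square),
# with an assumed target shift of 6 (scale 2^6):
w_center = center_coeff([2, -3, 4, 1, 0, -2, 3, 5], 6)  # 64 - 2*10 = 44
```

Because the centre coefficient is fully determined by the other taps and the scale, it never needs to be written to the bitstream, which is exactly why only the second ALF coefficients are coded.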
在一些实施例中,解码器根据如下公式(13),对所述第一分量下的所述重建图像中的目标区域进行滤波:In some embodiments, the decoder filters the target region in the reconstructed image under the first component according to the following formula (13):
Figure PCTCN2021073409-appb-000021
其中，I_rec(0,0)′为目标区域中当前点(0,0)滤波后的重建像素值，(x,y)为相对于当前点的位置，W_j为第一ALF的第j个整数类型的ALF系数，W_(n-1)为第一ALF的中心点对应的整数类型的ALF系数，I_rec(0,0)为当前点的重建像素值，AlfShift为目标量化尺度对应的目标移位值，n为第一ALF所包括的ALF系数的个数。where I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area, (x,y) is a position relative to the current point, W_j is the j-th integer-type ALF coefficient of the first ALF, W_(n-1) is the integer-type ALF coefficient corresponding to the center point of the first ALF, I_rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
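Since formula (13) is rendered as an image here, the following Python sketch shows the filtering step it describes, under stated assumptions: each non-centre weight multiplies the sum of a point-symmetric pair of reconstructed samples, the rounding offset is 1 << (shift - 1), and the result is clipped to the sample range. The offset and the clip are assumptions, not quoted from the formula.

```python
def alf_filter_sample(center_px, sym_pairs, weights, w_center, shift, bitdepth=8):
    # Weighted sum of point-symmetric sample pairs plus the centre sample,
    # rounded (assumed offset 1 << (shift - 1)) and shifted down by the
    # target shift value AlfShift, then clipped to the sample range.
    acc = sum(w * (a + b) for w, (a, b) in zip(weights, sym_pairs))
    acc += w_center * center_px
    out = (acc + (1 << (shift - 1))) >> shift
    return max(0, min((1 << bitdepth) - 1, out))

# A flat area is preserved when the doubled weights plus the centre weight
# sum to 2^shift (here 2*1 + 2*2 + 58 = 64 = 2^6):
filtered = alf_filter_sample(100, [(100, 100), (100, 100)], [1, 2], 58, 6)
```

The flat-area case is a useful sanity check: when the taps sum to the quantization scale, the filter leaves a constant region unchanged.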
解码器得到滤波后的重建图像后，一方面可以将滤波后的重建图像发送给显示设备进行显示，另一方面将滤波后的重建图像存放在解码图像缓存中，为后续的帧作为帧间预测的参考帧。After the decoder obtains the filtered reconstructed image, on the one hand, it can send the filtered reconstructed image to a display device for display; on the other hand, it stores the filtered reconstructed image in the decoded picture buffer to serve as a reference frame for inter prediction of subsequent frames.
图9为本申请实施例提供的视频解码方法900的一种流程示意图,当第一分量为亮度分量时,如图9所示,本申请实施例的方法包括:FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application. When the first component is a luminance component, as shown in FIG. 9 , the method of the embodiment of the present application includes:
S901、解码码流,得到当前图像的残差值。S901. Decode the code stream to obtain the residual value of the current image.
S902、根据当前图像的残差值,确定当前图像的重建图像,重建图像包括亮度分量。S902. Determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a luminance component.
S903、对亮度分量下的重建图像进行区域划分，得到亮度分量下重建图像的目标区域，该目标区域为使用自适应修正滤波的区域。S903. Divide the reconstructed image under the luminance component into regions to obtain a target area of the reconstructed image under the luminance component, where the target area is an area on which adaptive correction filtering is used.
S904、解码码流,得到亮度分量下重建图像中的目标区域对应的第一ALF的ALF系数信息。其中,第一ALF的ALF系数信息包括第一ALF的整数类型的第二ALF系数和第一信息,其中,第二ALF系数为所述第一ALF中非中心点对应的整数类型的ALF系数,第一信息用于指示目标量化尺度。S904. Decode the code stream to obtain ALF coefficient information of the first ALF corresponding to the target area in the reconstructed image under the luminance component. The ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF, The first information is used to indicate the target quantization scale.
S905、从码流中解析得到第一ALF的形状信息,并根据第一ALF的形状信息,确定第一ALF的类型。若所述第一ALF的形状为7×7十字形加3×3方形滤波器形状,则所述第一ALF包括9个ALF系数;若所述第一ALF的形状为7×7十字形加5×5方形滤波器形状,则所述第一ALF包括15个ALF系数。S905. Obtain the shape information of the first ALF by parsing the code stream, and determine the type of the first ALF according to the shape information of the first ALF. If the shape of the first ALF is the shape of a 7×7 cross plus a 3×3 square filter, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7×7 cross plus 5×5 square filter shape, the first ALF includes 15 ALF coefficients.
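The shape-to-tap-count mapping in S905 can be written directly from the text. A minimal Python sketch (the flag semantics 0/1 follow the AlfShapeEnableFlag usage described later in this document):

```python
def alf_num_coeffs(shape_flag):
    # Flag 0 -> 7x7 cross plus 3x3 square shape: 9 ALF coefficients.
    # Flag 1 -> 7x7 cross plus 5x5 square shape: 15 ALF coefficients.
    if shape_flag == 0:
        return 9
    return 15
```

The decoder can use this count to know how many second ALF coefficients to read before deriving the centre coefficient.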
S906、根据第一ALF的第二ALF系数和目标量化尺度,确定第一ALF的第一ALF系数,第一ALF系数为第一ALF的中心点对应的整数类型的ALF系数。S906: Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer type ALF coefficient corresponding to the center point of the first ALF.
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于0，表示第一ALF如图5A所示，包括9个系数，此时，根据如下公式(14)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 0, it indicates that the first ALF includes 9 coefficients as shown in FIG. 5A. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (14):
Figure PCTCN2021073409-appb-000022
其中，AlfCoeffLuma[i][8]为第i个第一ALF的第一ALF系数，i为亮度分量对应的ALF的索引（亮度分量对应的ALF的个数例如为16或64），AlfCoeffLuma[i][j]为第i个第一ALF的第j个ALF系数，i=0～alf_filter_num_minus1，j=0～7，其中alf_filter_num_minus1为亮度分量对应的ALF的个数减一，alf_filter_num_minus1可以理解为p。where AlfCoeffLuma[i][8] is the first ALF coefficient of the i-th first ALF, i is the index of the ALF corresponding to the luminance component (the number of ALFs corresponding to the luminance component is, for example, 16 or 64), AlfCoeffLuma[i][j] is the j-th ALF coefficient of the i-th first ALF, i=0~alf_filter_num_minus1, j=0~7, where alf_filter_num_minus1 is the number of ALFs corresponding to the luminance component minus one; alf_filter_num_minus1 can be understood as p.
其中,AlfCoeffLuma[i][j]的位宽是7位,取值范围是-64~63,AlfCoeffLuma[i][8]的取值范围是0~127。The bit width of AlfCoeffLuma[i][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[i][8] is 0 to 127.
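The bit-width and value-range constraints just stated can be checked mechanically. A small Python sketch of that validation (the function name is illustrative; the ranges come from the text):

```python
def valid_luma_coeffs(non_center, center):
    # Each non-centre coefficient is a 7-bit signed value in [-64, 63];
    # the derived centre coefficient must lie in [0, 127].
    return all(-64 <= c <= 63 for c in non_center) and 0 <= center <= 127
```

Such a check is useful at the encoder when choosing the target quantization scale: a scale that pushes any coefficient outside these ranges is not usable.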
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于1，表示第一ALF如图5B所示，包括15个系数，此时，根据如下公式(15)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 1, it indicates that the first ALF includes 15 coefficients as shown in FIG. 5B. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (15):
Figure PCTCN2021073409-appb-000023
其中,AlfCoeffLuma[i][14]为第一ALF的第一ALF系数,取值范围是0~127,j=0~14。Wherein, AlfCoeffLuma[i][14] is the first ALF coefficient of the first ALF, and the value ranges from 0 to 127, and j=0 to 14.
S907、根据第一ALF的第二ALF系数和第一ALF系数,使用第一ALF对亮度分量下的重建图像中的目标区域进行滤波。S907. According to the second ALF coefficient and the first ALF coefficient of the first ALF, use the first ALF to filter the target area in the reconstructed image under the luminance component.
例如,解码器根据如下公式(16),对所述第一分量下的所述重建图像中的目标区域进行滤波:For example, the decoder filters the target area in the reconstructed image under the first component according to the following formula (16):
Figure PCTCN2021073409-appb-000024
其中，I_rec(0,0)′为目标区域中当前点(0,0)滤波后的重建像素值，(x,y)为相对于当前点的位置，W_j为第一ALF的第j个第二ALF系数，W_(n-1)为第一ALF系数，I_rec(0,0)为当前点的重建像素值，AlfShift为目标量化尺度对应的目标移位值，n为第一ALF所包括的ALF系数的个数。where I_rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target area, (x,y) is a position relative to the current point, W_j is the j-th second ALF coefficient of the first ALF, W_(n-1) is the first ALF coefficient, I_rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
图10为本申请实施例提供的视频解码方法1000的一种流程示意图,当第一分量为亮度分量时,如图10所示,本申请实施例的方法包括:FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application. When the first component is a luminance component, as shown in FIG. 10 , the method of the embodiment of the present application includes:
S110、解码码流,得到当前图像的残差值。S110. Decode the code stream to obtain the residual value of the current image.
S120、根据当前图像的残差值,确定当前图像的重建图像,重建图像包括色度分量。S120. Determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes chrominance components.
S130、将色度分量下整张重建图像作为目标区域,该目标区域为使用自适应修正滤波的区域。S130. Use the entire reconstructed image under the chrominance component as a target area, where the target area is an area using adaptive correction filtering.
S140、解码码流,得到色度分量下重建图像中的目标区域对应的第一ALF的ALF系数信息。其中,第一ALF的ALF系数信息包括第一ALF的整数类型的第二ALF系数和第一信息,其中,第二ALF系数为所述第一ALF中非中心点对应的整数类型的ALF系数,第一信息用于指示目标量化尺度。S140. Decode the code stream to obtain ALF coefficient information of the first ALF corresponding to the target area in the reconstructed image under the chrominance component. The ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF, The first information is used to indicate the target quantization scale.
S150、从码流中解析得到第一ALF的形状信息,并根据第一ALF的形状信息,确定第一ALF的类型。若所述第一ALF的形状为7×7十字形加3×3方形滤波器形状,则所述第一ALF包括9个ALF系数;若所述第一ALF的形状为7×7十字形加5×5方形滤波器形状,则所述第一ALF包括15个ALF系数。S150. Obtain shape information of the first ALF by parsing from the code stream, and determine the type of the first ALF according to the shape information of the first ALF. If the shape of the first ALF is the shape of a 7×7 cross plus a 3×3 square filter, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7×7 cross plus 5×5 square filter shape, the first ALF includes 15 ALF coefficients.
S160、根据第一ALF的第二ALF系数和目标量化尺度,确定第一ALF的第一ALF系数,第一ALF系数为第一ALF的中心点对应的整数类型的ALF系数。S160: Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer type ALF coefficient corresponding to the center point of the first ALF.
当色度分量为第一色度分量,例如Cb,此时,根据如下方式确定第一ALF的第一ALF系数。When the chrominance component is the first chrominance component, such as Cb, at this time, the first ALF coefficient of the first ALF is determined according to the following method.
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于0，表示第一ALF如图5A所示，包括9个系数，此时，根据如下公式(17)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 0, it indicates that the first ALF includes 9 coefficients as shown in FIG. 5A. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (17):
Figure PCTCN2021073409-appb-000025
其中,AlfCoeffLuma[0][8]为第一ALF的第一ALF系数,AlfCoeffLuma[0][j]为第一ALF的第j个ALF系数,j=0~7。Wherein, AlfCoeffLuma[0][8] is the first ALF coefficient of the first ALF, AlfCoeffLuma[0][j] is the jth ALF coefficient of the first ALF, and j=0˜7.
其中,AlfCoeffLuma[0][j]的位宽是7位,取值范围是-64~63,AlfCoeffLuma[0][8]的取值范围是0~127。The bit width of AlfCoeffLuma[0][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[0][8] is 0 to 127.
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于1，表示第一ALF如图5B所示，包括15个系数，此时，根据如下公式(18)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 1, it indicates that the first ALF includes 15 coefficients as shown in FIG. 5B. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (18):
Figure PCTCN2021073409-appb-000026
其中,AlfCoeffLuma[0][14]为第一ALF的第一ALF系数,取值范围是0~127,j=0~14。Wherein, AlfCoeffLuma[0][14] is the first ALF coefficient of the first ALF, the value range is 0-127, and j=0-14.
当色度分量为第二色度分量,例如Cr,此时,根据如下方式确定第一ALF的第一ALF系数。When the chrominance component is the second chrominance component, such as Cr, at this time, the first ALF coefficient of the first ALF is determined according to the following method.
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于0，表示第一ALF如图5A所示，包括9个系数，此时，根据如下公式(19)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 0, it indicates that the first ALF includes 9 coefficients as shown in FIG. 5A. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (19):
Figure PCTCN2021073409-appb-000027
其中,AlfCoeffLuma[1][8]为第一ALF的第一ALF系数,AlfCoeffLuma[1][j]为第一ALF的第j个ALF系数,j=0~7。Wherein, AlfCoeffLuma[1][8] is the first ALF coefficient of the first ALF, AlfCoeffLuma[1][j] is the jth ALF coefficient of the first ALF, and j=0˜7.
其中,AlfCoeffLuma[1][j]的位宽是7位,取值范围是-64~63,AlfCoeffLuma[1][8]的取值范围是0~127。The bit width of AlfCoeffLuma[1][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[1][8] is 0 to 127.
在一种示例中，如上述表8所示，如果码流中的AlfShapeEnableFlag等于1，表示第一ALF如图5B所示，包括15个系数，此时，根据如下公式(20)，确定第一ALF的第一ALF系数：In an example, as shown in Table 8 above, if AlfShapeEnableFlag in the code stream is equal to 1, it indicates that the first ALF includes 15 coefficients as shown in FIG. 5B. In this case, the first ALF coefficient of the first ALF is determined according to the following formula (20):
Figure PCTCN2021073409-appb-000028
其中,AlfCoeffLuma[1][14]为第一ALF的第一ALF系数,取值范围是0~127,j=0~14。Wherein, AlfCoeffLuma[1][14] is the first ALF coefficient of the first ALF, and the value ranges from 0 to 127, and j=0 to 14.
S170、根据第一ALF的第二ALF系数和第一ALF系数，使用第一ALF对色度分量下的重建图像中的目标区域进行滤波。S170. Use the first ALF to filter the target area in the reconstructed image under the chrominance component according to the second ALF coefficients and the first ALF coefficient of the first ALF.
解码器得到滤波后的重建图像后，一方面可以将滤波后的重建图像发送给显示设备进行显示，另一方面将滤波后的重建图像存放在解码图像缓存中，为后续的帧作为帧间预测的参考帧。After the decoder obtains the filtered reconstructed image, on the one hand, it can send the filtered reconstructed image to a display device for display; on the other hand, it stores the filtered reconstructed image in the decoded picture buffer to serve as a reference frame for inter prediction of subsequent frames.
应理解,图4至图10仅为本申请的示例,不应理解为对本申请的限制。It should be understood that FIG. 4 to FIG. 10 are only examples of the present application, and should not be construed as limiting the present application.
在一些实施例中，本申请确定亮度分量对应的量化尺度大于色度分量对应的量化尺度，例如，亮度分量对应的量化尺度固定为2^7，色度分量对应的量化尺度固定为2^6，这样调整固定的量化尺度也能够提高滤波性能，并且不需要比特开销去编码变动的量化尺度。In some embodiments, the present application determines that the quantization scale corresponding to the luminance component is larger than that corresponding to the chrominance component; for example, the quantization scale corresponding to the luminance component is fixed at 2^7 and that corresponding to the chrominance component is fixed at 2^6. Adjusting the fixed quantization scales in this way can also improve filtering performance, and no bit overhead is needed to encode a variable quantization scale.
以上结合附图详细描述了本申请的优选实施方式，但是，本申请并不限于上述实施方式中的具体细节，在本申请的技术构思范围内，可以对本申请的技术方案进行多种简单变型，这些简单变型均属于本申请的保护范围。例如，在上述具体实施方式中所描述的各个具体技术特征，在不矛盾的情况下，可以通过任何合适的方式进行组合，为了避免不必要的重复，本申请对各种可能的组合方式不再另行说明。又例如，本申请的各种不同的实施方式之间也可以进行任意组合，只要其不违背本申请的思想，其同样应当视为本申请所公开的内容。The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings. However, the present application is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the present application, various simple modifications can be made to the technical solutions of the present application, and these simple modifications all fall within the protection scope of the present application. For example, the specific technical features described in the above embodiments can be combined in any suitable manner provided they are not inconsistent; to avoid unnecessary repetition, the present application will not separately describe the various possible combinations. For another example, the various embodiments of the present application can also be combined arbitrarily; as long as such combinations do not depart from the idea of the present application, they should likewise be regarded as content disclosed by the present application.
还应理解，在本申请的各种方法实施例中，上述各过程的序号的大小并不意味着执行顺序的先后，各过程的执行顺序应以其功能和内在逻辑确定，而不应对本申请实施例的实施过程构成任何限定。另外，本申请实施例中，术语“和/或”，仅仅是一种描述关联对象的关联关系，表示可以存在三种关系。具体地，A和/或B可以表示：单独存在A，同时存在A和B，单独存在B这三种情况。另外，本文中字符“/”，一般表示前后关联对象是一种“或”的关系。It should also be understood that, in the various method embodiments of the present application, the magnitude of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application. In addition, in the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist. Specifically, A and/or B may represent three cases: A exists alone, both A and B exist, and B exists alone. Moreover, the character "/" in this document generally indicates an "or" relationship between the associated objects.
上文结合图4至图10,详细描述了本申请的方法实施例,下文结合图11至图14,详细描述本申请的装置实施例。The method embodiments of the present application are described in detail above with reference to FIGS. 4 to 10 , and the apparatus embodiments of the present application are described in detail below with reference to FIGS. 11 to 14 .
图11是本申请实施例提供的视频编码器10的示意性框图。FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
如图11所示,视频编码器10包括:As shown in Figure 11, the video encoder 10 includes:
an obtaining unit 110, configured to obtain a reconstructed image of a current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
a first determining unit 120, configured to determine floating-point ALF coefficients of a first adaptive correction filter (ALF) used to filter a target region in the reconstructed image under the first component;
a second determining unit 130, configured to determine, according to the floating-point ALF coefficients of the first ALF, a maximum quantization scale of the floating-point ALF coefficients of the first ALF;
a third determining unit 140, configured to determine, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficients of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
a quantization unit 150, configured to quantize, using the target quantization scale, the floating-point ALF coefficients of the first ALF into integer ALF coefficients; and
an encoding unit 160, configured to encode the integer ALF coefficients of the first ALF to obtain a bitstream.
In some embodiments, the second determining unit 130 is specifically configured to: determine, according to a first coefficient among the floating-point ALF coefficients of the first ALF and a maximum integer coefficient threshold corresponding to the first coefficient, a maximum shift value allowed when the floating-point ALF coefficients are quantized into an integer type; and determine, according to the maximum shift value, the maximum quantization scale of the floating-point ALF coefficients of the first ALF.
In an example, the first coefficient is the floating-point coefficient corresponding to the center position of the first ALF.
In some embodiments, the second determining unit 130 is specifically configured to determine the maximum shift value according to the following formula:
bitshift = floor( log2( d / W_f(0,0) ) )
where bitshift is the maximum shift value, floor denotes rounding down, W_f(0,0) is the floating-point coefficient corresponding to the center position of the first ALF, and d is the maximum integer coefficient threshold corresponding to W_f(0,0).
Optionally, d is 127.
In some embodiments, the second determining unit 130 is specifically configured to determine the maximum quantization scale of the floating-point ALF coefficients of the first ALF according to the following formula:
Scale_max = 2^bitshift,
where Scale_max is the maximum quantization scale.
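For illustration only, the derivation of the maximum shift value and maximum quantization scale described above can be sketched as follows; the function name, the example coefficient value, and the use of Python are illustrative assumptions and not part of the application:

```python
import math

def max_shift_and_scale(center_coeff: float, d: int = 127):
    # bitshift = floor(log2(d / W_f(0,0))): the largest shift such that
    # the quantized center coefficient round(center_coeff * 2**bitshift)
    # still fits within the maximum integer coefficient threshold d.
    bitshift = math.floor(math.log2(d / abs(center_coeff)))
    scale_max = 1 << bitshift  # Scale_max = 2**bitshift
    return bitshift, scale_max

bitshift, scale_max = max_shift_and_scale(1.03)
# d / 1.03 = 123.3..., floor(log2(123.3...)) = 6, so Scale_max = 64
```

With a center coefficient of 1.03 and d = 127, quantizing at Scale_max = 64 gives round(1.03 * 64) = 66, which fits within the threshold, while the next power of two, 128, would overflow it (round(1.03 * 128) = 132).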
In some embodiments, if the first component is a luminance component, the third determining unit 140 is specifically configured to determine a first quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the first quantization scale is less than or equal to the maximum quantization scale.
In some embodiments, the third determining unit 140 is further configured to determine a second quantization scale as the target quantization scale of the floating-point ALF coefficients of a second ALF, where the second ALF is the filter used when ALF filtering is performed on the reconstructed image under the chrominance component, and the second quantization scale is smaller than the first quantization scale.
In some embodiments, if the first component is a chrominance component, the third determining unit 140 is specifically configured to determine a third quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the third quantization scale is smaller than the maximum quantization scale.
In some embodiments, the third determining unit 140 is further configured to determine a fourth quantization scale as the target quantization scale of the floating-point ALF coefficients of a third ALF, where the third ALF is the filter used when ALF filtering is performed on the reconstructed image under the luminance component, and the fourth quantization scale is larger than the third quantization scale.
In some embodiments, the third determining unit 140 is specifically configured to: for each first quantization scale within a quantization interval formed by the maximum quantization scale and a preset minimum quantization scale, determine a quantization cost of the first quantization scale, where the first quantization scale is a positive integer power of 2; and determine, according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point ALF coefficients of the first ALF.
In some embodiments, the third determining unit 140 is specifically configured to determine the first quantization scale with the smallest quantization cost as the target quantization scale of the floating-point ALF coefficients of the first ALF.
In some embodiments, the third determining unit 140 is specifically configured to: quantize, using the first quantization scale, the floating-point ALF coefficients of the first ALF into integer first ALF coefficients; determine a quantization-distortion result corresponding to the first ALF coefficients when the target region of the reconstructed image is encoded using the first ALF coefficients; and determine the quantization cost of the first quantization scale according to the quantization-distortion result corresponding to the first ALF coefficients and the number of bits consumed to encode the first ALF coefficients.
In some embodiments, the third determining unit 140 is specifically configured to: determine autocorrelation coefficients of the pixels in the target region according to the reconstructed pixel values of the pixels in the target region; determine cross-correlation coefficients of the pixels in the target region according to the original pixel values of the pixels in the target region; and determine the quantization-distortion result corresponding to the first ALF coefficients according to the product of the autocorrelation coefficients of the pixels in the target region and the first ALF coefficients, as well as the cross-correlation coefficients of the pixels in the target region.
In some embodiments, the third determining unit 140 is specifically configured to: filter the target region of the reconstructed image using the first ALF coefficients; and determine the quantization-distortion result corresponding to the first ALF coefficients according to the difference between the filtered pixel values and the original pixel values of the pixels in the target region.
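As a non-normative sketch of the correlation-based distortion estimate just described (the exact statistics the application accumulates are not spelled out here, and the function and variable names are illustrative): with an autocorrelation matrix R built from the reconstructed tap values and a cross-correlation vector p between the reconstructed taps and the original pixels, the squared filtering error equals the Wiener form w^T R w - 2 w^T p up to a constant, so the distortion of a candidate integer coefficient set can be scored without re-filtering the region.

```python
import numpy as np

def quantization_distortion(int_coeffs, shift, R, p):
    # De-quantize the integer coefficients back to real values, then
    # evaluate the Wiener error form w^T R w - 2 w^T p (the constant
    # term, the energy of the original signal, is omitted because it
    # does not depend on the coefficients).
    w = np.asarray(int_coeffs, dtype=np.float64) / (1 << shift)
    return float(w @ R @ w - 2.0 * w @ p)
```

Because R and p are accumulated once per region, many candidate quantization scales can be scored at the cost of a few small matrix products each.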
In some embodiments, the third determining unit 140 is specifically configured to determine the quantization cost of the first quantization scale according to the following formula:
J = D + λR
where J is the quantization cost of the first quantization scale, D is the quantization-distortion result corresponding to the first ALF coefficients, R is the number of bits consumed to encode the first ALF coefficients, and λ is a variable value.
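A minimal sketch of selecting the target quantization scale by this cost; the distortion and rate callbacks stand in for the encoder's actual measurements and are assumptions of this sketch:

```python
def pick_target_scale(candidate_scales, distortion_fn, rate_fn, lam):
    # Evaluate J = D + lambda * R for each power-of-two candidate scale
    # and keep the scale with the smallest cost.
    best_scale, best_cost = None, float("inf")
    for scale in candidate_scales:
        cost = distortion_fn(scale) + lam * rate_fn(scale)
        if cost < best_cost:
            best_scale, best_cost = scale, cost
    return best_scale
```

For example, with distortions {16: 100, 32: 40, 64: 30} and rates {16: 10, 32: 20, 64: 50} at lambda = 1, the costs are 110, 60, and 80, so the scale 32 is chosen.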
In some embodiments, the bitstream includes all integer ALF coefficients of the first ALF.
In an example, the bitstream includes integer second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer ALF coefficients corresponding to the non-center points of the first ALF, and the first information is used to indicate the target quantization scale.
In an example, the first information includes a target shift value corresponding to the target quantization scale.
In an example, the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and a first value.
For example, the first value is 6, 5, or 4.
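A sketch of coding the shift as an offset from the first value. The helper names are illustrative, and the sketch assumes the target shift is never below the first value, which the text does not state; a real codec would also need a sign when both directions are possible:

```python
BASE = 6  # the "first value"; 5 or 4 are equally possible per the text

def signal_shift(shift: int) -> int:
    # Code |shift - BASE| instead of shift itself: when shifts cluster
    # around BASE, the coded symbol is small and cheap to entropy-code.
    return abs(shift - BASE)

def recover_shift(diff: int) -> int:
    # Valid only under this sketch's assumption that shift >= BASE.
    return BASE + diff
```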
In some embodiments, the encoding unit 160 is further configured to filter the target region of the reconstructed image under the first component using the integer ALF coefficients of the first ALF.
In some embodiments, the encoding unit 160 is specifically configured to filter the target region of the reconstructed image under the first component according to the following formula:
I_rec(0,0)' = ( Σ_{j=0..n-2} W_j · I_rec(x,y) + W_{n-1} · I_rec(0,0) + (1 << (AlfShift - 1)) ) >> AlfShift
where I_rec(0,0)' is the filtered reconstructed pixel value of the current point (0,0) in the target region, (x,y) is a position relative to the current point, W_j is the j-th integer ALF coefficient of the first ALF, W_{n-1} is the integer ALF coefficient corresponding to the center point of the first ALF, I_rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
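A sketch of this filtering step for one sample. The neighbour-offset list, the rounding offset, and the absence of sample-range clipping are assumptions of this sketch rather than details fixed by the text:

```python
def alf_filter_sample(rec, y0, x0, offsets, w, alf_shift):
    # offsets[j] = (dy, dx) gives the position (x, y) of the tap matched
    # to coefficient W_j for j = 0..n-2; w[-1] is the centre coefficient
    # W_{n-1} applied to the current sample itself.
    acc = sum(wj * rec[y0 + dy][x0 + dx] for wj, (dy, dx) in zip(w[:-1], offsets))
    acc += w[-1] * rec[y0][x0]
    # Round, then shift back by the target shift value AlfShift.
    return (acc + (1 << (alf_shift - 1))) >> alf_shift
```

If the integer coefficients sum to 2**alf_shift, a flat region passes through unchanged, which is the usual sanity check for a unity-gain filter.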
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the video encoder 10 shown in FIG. 11 can perform the methods of the embodiments of this application, and the foregoing and other operations and/or functions of the units in the video encoder 10 respectively implement the corresponding procedures in the methods 400, 600, 700, and so on; for brevity, details are not repeated here.
FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of this application.
As shown in FIG. 12, the video decoder 20 may include:
a first decoding unit 210, configured to decode a bitstream to obtain residual values of a current image;
a determining unit 220, configured to determine a reconstructed image of the current image according to the residual values of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
a second decoding unit 230, configured to decode the bitstream to obtain ALF coefficient information of a first adaptive correction filter (ALF) corresponding to a target region in the reconstructed image under the first component; and
a filtering unit 240, configured to filter, using the first ALF, the target region in the reconstructed image under the first component according to the ALF coefficient information of the first ALF.
In some embodiments, the ALF coefficient information of the first ALF includes all integer ALF coefficients of the first ALF.
In some embodiments, the ALF coefficient information of the first ALF includes integer second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer ALF coefficients corresponding to the non-center points of the first ALF, and the first information is used to indicate the target quantization scale.
In an example, the first information includes a target shift value corresponding to the target quantization scale.
In an example, the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and a first value.
For example, the first value is 6, 5, or 4.
In some embodiments, the filtering unit 240 is specifically configured to: determine a first ALF coefficient of the first ALF according to the second ALF coefficients of the first ALF and the target quantization scale, where the first ALF coefficient is the integer ALF coefficient corresponding to the center point of the first ALF; and filter, using the first ALF, the target region in the reconstructed image under the first component according to the second ALF coefficients and the first ALF coefficient of the first ALF.
In some embodiments, the filtering unit 240 is specifically configured to determine the first ALF coefficient of the first ALF according to the following formula:
AlfCoeffLuma[i][k] = (1 << AlfLumaShift[i]) - Σ_{j=0..k-1} AlfCoeffLuma[i][j]
where AlfCoeffLuma[i][k] is the first ALF coefficient of the i-th first ALF, k is the number of ALF coefficients included in the first ALF minus one, AlfLumaShift[i] is the target quantization scale (shift) corresponding to the i-th first ALF, and AlfCoeffLuma[i][j] is the j-th second ALF coefficient in the i-th first ALF.
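A decoder-side sketch of this derivation: the centre coefficient is whatever makes the coefficients sum to 2**shift, i.e. a unity-gain filter. Whether each signalled non-centre coefficient should be counted once or twice (as in a point-symmetric filter) depends on the filter geometry; this sketch assumes each coefficient is used once, which the text does not confirm:

```python
def derive_center_coeff(second_coeffs, alf_shift):
    # AlfCoeffLuma[i][k] = (1 << AlfLumaShift[i]) minus the sum of the
    # signalled non-centre coefficients AlfCoeffLuma[i][0..k-1].
    return (1 << alf_shift) - sum(second_coeffs)
```

For example, with non-centre coefficients [8, 8, 8, 8] and a shift of 6, the centre coefficient is 64 - 32 = 32, and the full coefficient set sums to 64.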
In an example, if the first component is a luminance component, i ranges from 0 to p, where p is the number of ALFs corresponding to the luminance component minus one.
For example, p is 15 or 63.
In an example, if the first component is the Cb component, i is equal to a first value; if the first component is the Cr component, i is equal to a second value.
For example, the first value is 0 and the second value is 1.
In some embodiments, the second decoding unit 230 is specifically configured to: parse the bitstream to obtain shape information of the first ALF; and determine the type of the first ALF according to the shape information of the first ALF.
In some embodiments, if the shape of the first ALF is a 7×7 cross plus 3×3 square filter shape, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7×7 cross plus 5×5 square filter shape, the first ALF includes 15 ALF coefficients.
In some embodiments, the filtering unit 240 is specifically configured to filter the target region in the reconstructed image under the first component according to the following formula:
I_rec(0,0)' = ( Σ_{j=0..n-2} W_j · I_rec(x,y) + W_{n-1} · I_rec(0,0) + (1 << (AlfShift - 1)) ) >> AlfShift
where I_rec(0,0)' is the filtered reconstructed pixel value of the current point (0,0) in the target region, (x,y) is a position relative to the current point, W_j is the j-th integer ALF coefficient of the first ALF, W_{n-1} is the integer ALF coefficient corresponding to the center point of the first ALF, I_rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the video decoder 20 shown in FIG. 12 may correspond to the subject performing the method 800, 900, or 1000 of the embodiments of this application, and the foregoing and other operations and/or functions of the units in the video decoder 20 respectively implement the corresponding procedures in the methods 800, 900, 1000, and so on; for brevity, details are not repeated here.
The apparatus and system of the embodiments of this application are described above from the perspective of functional units with reference to the accompanying drawings. It should be understood that the functional units may be implemented in the form of hardware, by instructions in the form of software, or by a combination of hardware and software units. Specifically, the steps of the method embodiments of this application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software units in a decoding processor. Optionally, the software unit may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing method embodiments in combination with its hardware.
FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of this application.
As shown in FIG. 13, the electronic device 30 may be the video encoder or the video decoder described in the embodiments of this application, and the electronic device 30 may include:
a memory 33 and a processor 32, where the memory 33 is configured to store a computer program 34 and transmit the program code 34 to the processor 32. In other words, the processor 32 can invoke and run the computer program 34 from the memory 33 to implement the methods in the embodiments of this application.
For example, the processor 32 may be configured to perform the steps of the foregoing method 200 according to instructions in the computer program 34.
In some embodiments of this application, the processor 32 may include, but is not limited to:
a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
In some embodiments of this application, the memory 33 includes, but is not limited to:
a volatile memory and/or a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synch-link DRAM (SLDRAM), and a direct Rambus RAM (DR RAM).
In some embodiments of this application, the computer program 34 may be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the methods provided by this application. The one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30.
As shown in FIG. 13, the electronic device 30 may further include:
a transceiver 33, where the transceiver 33 may be connected to the processor 32 or the memory 33.
The processor 32 can control the transceiver 33 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices. The transceiver 33 may include a transmitter and a receiver, and may further include one or more antennas.
It should be understood that the components of the electronic device 30 are connected through a bus system, where the bus system includes a power bus, a control bus, and a status signal bus in addition to a data bus.
FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of this application.
As shown in FIG. 14, the video encoding and decoding system 40 may include a video encoder 41 and a video decoder 42, where the video encoder 41 is configured to perform the video encoding method involved in the embodiments of this application, and the video decoder 42 is configured to perform the video decoding method involved in the embodiments of this application.
This application further provides a computer storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer is enabled to perform the methods of the foregoing method embodiments. In other words, the embodiments of this application further provide a computer program product containing instructions; when the instructions are executed by a computer, the computer is caused to perform the methods of the foregoing method embodiments.
When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other division manners in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. For example, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
The foregoing is merely the specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (42)

  1. A video encoding method, comprising:
    obtaining a reconstructed image of a current image, wherein the reconstructed image comprises a first component, and the first component is a luminance component or a chrominance component;
    determining floating-point ALF coefficients of a first adaptive correction filter (ALF) used to filter a target region in the reconstructed image under the first component;
    determining, according to the floating-point ALF coefficients of the first ALF, a maximum quantization scale of the floating-point ALF coefficients of the first ALF;
    determining, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficients of the first ALF, wherein the target quantization scale is less than or equal to the maximum quantization scale;
    quantizing, using the target quantization scale, the floating-point ALF coefficients of the first ALF into integer ALF coefficients; and
    encoding the integer ALF coefficients of the first ALF to obtain a bitstream.
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一ALF的浮点数类型的ALF系数,确定所述第一ALF的浮点数类型的ALF系数的最大量化尺度,包括:The method according to claim 1, wherein the determining, according to the ALF coefficients of the floating point type of the first ALF, the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF comprises:
    根据所述第一ALF的浮点数类型的ALF系数中的第一系数和所述第一系数对应的最大整数系数阈值,确定将所述浮点数类型的ALF系数量化为整数类型时所允许的最大移位值;determining, according to a first coefficient among the floating-point ALF coefficients of the first ALF and a maximum integer coefficient threshold corresponding to the first coefficient, the maximum shift value allowed when quantizing the floating-point ALF coefficients into integer-type coefficients;
    根据所述最大移位值,确定所述第一ALF的浮点数类型的ALF系数的最大量化尺度。According to the maximum shift value, the maximum quantization scale of the floating-point type ALF coefficient of the first ALF is determined.
  3. 根据权利要求2所述的方法,其特征在于,所述第一系数为所述第一ALF的中心位置对应的浮点类型的系数。The method according to claim 2, wherein the first coefficient is a floating-point type coefficient corresponding to a center position of the first ALF.
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述浮点数类型的ALF系数中的第一系数和所述第一系数对应的最大整数系数阈值,确定将所述浮点数类型的ALF系数量化为整数类型时所允许的最大移位值,包括:The method according to claim 3, wherein the determining, according to the first coefficient among the floating-point ALF coefficients and the maximum integer coefficient threshold corresponding to the first coefficient, of the maximum shift value allowed when quantizing the floating-point ALF coefficients into integer-type coefficients comprises:
    根据如下公式,确定所述最大移位值:The maximum shift value is determined according to the following formula:
    bitshift=floor(log2(d/W f(0,0))),
    其中,所述bitshift为所述最大移位值,所述floor为向下取整,所述W f(0,0)为所述第一ALF的中心位置对应的浮点类型的系数,所述d为所述W f(0,0)对应的最大整数系数阈值。Wherein, bitshift is the maximum shift value, floor denotes rounding down, W f(0,0) is the floating-point coefficient corresponding to the center position of the first ALF, and d is the maximum integer coefficient threshold corresponding to W f(0,0).
  5. 根据权利要求4所述的方法,其特征在于,所述d为127。The method according to claim 4, wherein the d is 127.
  6. 根据权利要求4所述的方法,其特征在于,所述根据所述最大移位值,确定所述第一ALF的浮点数类型的ALF系数的最大量化尺度,包括:The method according to claim 4, wherein the determining, according to the maximum shift value, the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF comprises:
    根据如下公式,确定所述第一ALF的浮点数类型的ALF系数的最大量化尺度:According to the following formula, the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF is determined:
    Scale max=2 bitshiftScale max = 2 bitshift ,
    其中,所述Scale max为所述最大量化尺度。 Wherein, the Scale max is the maximum quantization scale.
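As an illustrative aid (not part of the claims), the derivation of claims 4-6 can be sketched as follows. The function name and the example center coefficient are hypothetical, and the formula bitshift = floor(log2(d / W f(0,0))) is assumed from the symbol definitions given in claim 4:

```python
import math

def max_quantization_scale(center_coeff: float, d: int = 127):
    """Sketch of claims 4-6: the largest shift such that the quantized
    center coefficient round(center_coeff * 2**bitshift) stays within d."""
    bitshift = math.floor(math.log2(d / abs(center_coeff)))  # claim 4
    scale_max = 2 ** bitshift                                # claim 6: Scale_max = 2^bitshift
    return bitshift, scale_max

bitshift, scale_max = max_quantization_scale(1.03)  # → (6, 64)
```

With d = 127 (claim 5) and a center coefficient of 1.03, the quantized center tap round(1.03 × 64) = 66 indeed stays below the threshold, while a shift of 7 would exceed it.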
  7. 根据权利要求2-6任一项所述的方法,其特征在于,若所述第一分量为亮度分量,则所述根据所述最大量化尺度,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度,包括:The method according to any one of claims 2-6, wherein, if the first component is a luminance component, the determining, according to the maximum quantization scale, of the target quantization scale of the floating-point ALF coefficients of the first ALF comprises:
    将第一量化尺度确定为所述第一ALF的浮点数类型的ALF系数的目标量化尺度,所述第一量化尺度小于或等于所述最大量化尺度。A first quantization scale is determined as the target quantization scale of the floating-point type ALF coefficients of the first ALF, the first quantization scale being less than or equal to the maximum quantization scale.
  8. 根据权利要求7所述的方法,其特征在于,所述方法还包括:The method according to claim 7, wherein the method further comprises:
    将第二量化尺度确定为第二ALF的浮点数类型的ALF系数的目标量化尺度,所述第二ALF为对色度分量下的所述重建图像进行ALF滤波时使用的滤波器,所述第二量化尺度小于所述第一量化尺度。determining a second quantization scale as the target quantization scale of the floating-point ALF coefficients of a second ALF, where the second ALF is the filter used when performing ALF filtering on the reconstructed image under the chrominance component, and the second quantization scale is smaller than the first quantization scale.
  9. 根据权利要求2-6任一项所述的方法,其特征在于,若所述第一分量为色度分量,则所述根据所述最大量化尺度,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度,包括:The method according to any one of claims 2-6, wherein, if the first component is a chrominance component, the determining, according to the maximum quantization scale, of the target quantization scale of the floating-point ALF coefficients of the first ALF comprises:
    将第三量化尺度确定为所述第一ALF的浮点数类型的ALF系数的目标量化尺度,所述第三量化尺度小于所述最大量化尺度。determining a third quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the third quantization scale is smaller than the maximum quantization scale.
  10. 根据权利要求9所述的方法,其特征在于,所述方法还包括:The method according to claim 9, wherein the method further comprises:
    将第四量化尺度确定为第三ALF的浮点数类型的ALF系数的目标量化尺度,所述第三ALF为对亮度分量下的所述重建图像进行ALF滤波时使用的滤波器,所述第四量化尺度大于所述第三量化尺度。determining a fourth quantization scale as the target quantization scale of the floating-point ALF coefficients of a third ALF, where the third ALF is the filter used when performing ALF filtering on the reconstructed image under the luminance component, and the fourth quantization scale is larger than the third quantization scale.
  11. 根据权利要求2-6任一项所述的方法,其特征在于,所述根据所述最大量化尺度,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度,包括:The method according to any one of claims 2-6, wherein the determining, according to the maximum quantization scale, of the target quantization scale of the floating-point ALF coefficients of the first ALF comprises:
    针对所述最大量化尺度与预设的最小量化尺度构成的量化区间内的每一个第一量化尺度,确定所述第一量化尺度的量化代价,所述第一量化尺度为2的幂次方的正整数;for each first quantization scale in the quantization interval formed by the maximum quantization scale and a preset minimum quantization scale, determining the quantization cost of the first quantization scale, where the first quantization scale is a positive integer that is a power of 2;
    根据每一个所述第一量化尺度的量化代价,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度。determining the target quantization scale of the floating-point ALF coefficients of the first ALF according to the quantization cost of each of the first quantization scales.
  12. 根据权利要求11所述的方法,其特征在于,所述根据每一个所述第一量化尺度的量化代价,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度,包括:The method according to claim 11, wherein the determining of the target quantization scale of the floating-point ALF coefficients of the first ALF according to the quantization cost of each of the first quantization scales comprises:
    将量化代价最小的第一量化尺度确定为所述第一ALF的浮点数类型的ALF系数的目标量化尺度。determining the first quantization scale with the smallest quantization cost as the target quantization scale of the floating-point ALF coefficients of the first ALF.
  13. 根据权利要求11所述的方法,其特征在于,所述确定所述第一量化尺度的量化代价,包括:The method according to claim 11, wherein the determining the quantization cost of the first quantization scale comprises:
    使用所述第一量化尺度,将所述第一ALF的浮点类型的所述ALF系数量化为整数类型的第一ALF系数;using the first quantization scale, quantizing the ALF coefficients of the floating point type of the first ALF to the first ALF coefficients of the integer type;
    确定使用所述第一ALF系数对所述重建图像的目标区域进行编码时,所述第一ALF系数对应的量化失真结果;determining the quantization distortion result corresponding to the first ALF coefficient when the target region of the reconstructed image is encoded by using the first ALF coefficient;
    根据所述第一ALF系数对应的量化失真结果,以及编码所述第一ALF系数所消耗的比特数,确定所述第一量化尺度的量化代价。The quantization cost of the first quantization scale is determined according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed for encoding the first ALF coefficient.
  14. 根据权利要求13所述的方法,其特征在于,所述确定使用所述第一ALF系数对所述重建图像的目标区域进行编码时,所述第一ALF系数对应的量化失真结果,包括:The method according to claim 13, wherein the determining of the quantization distortion result corresponding to the first ALF coefficient when the target region of the reconstructed image is encoded using the first ALF coefficient comprises:
    根据所述目标区域中像素点的重建像素值,确定所述目标区域中像素点的自相关系数;According to the reconstructed pixel value of the pixel point in the target area, determine the autocorrelation coefficient of the pixel point in the target area;
    根据所述目标区域中像素点的原始像素值,确定所述目标区域中像素点的互相关系数;According to the original pixel value of the pixel in the target area, determine the cross-correlation coefficient of the pixel in the target area;
    根据所述目标区域中像素点的自相关系数与所述第一ALF系数乘积,以及所述目标区域中像素点的互相关系数,确定所述第一ALF系数对应的量化失真结果。The quantization distortion result corresponding to the first ALF coefficient is determined according to the product of the autocorrelation coefficient of the pixels in the target area and the first ALF coefficient, and the cross-correlation coefficient of the pixels in the target area.
  15. 根据权利要求13所述的方法,其特征在于,所述确定使用所述第一ALF系数对所述重建图像的目标区域进行编码时,所述第一ALF系数对应的量化失真结果,包括:The method according to claim 13, wherein the determining of the quantization distortion result corresponding to the first ALF coefficient when the target region of the reconstructed image is encoded using the first ALF coefficient comprises:
    使用所述第一ALF系数对所述重建图像的目标区域进行滤波;filtering the target region of the reconstructed image using the first ALF coefficients;
    根据所述目标区域中像素点的滤波后像素值和原始像素值的差值,确定所述第一ALF系数对应的量化失真结果。The quantization distortion result corresponding to the first ALF coefficient is determined according to the difference between the filtered pixel value and the original pixel value of the pixel in the target area.
  16. 根据权利要求13-15任一项所述的方法,其特征在于,所述根据所述第一ALF系数对应的量化失真结果,以及编码所述第一ALF系数所消耗的比特数,确定所述第一量化尺度的量化代价,包括:The method according to any one of claims 13-15, wherein the determining of the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed for encoding the first ALF coefficient comprises:
    根据如下公式,确定所述第一量化尺度的量化代价:The quantization cost of the first quantization scale is determined according to the following formula:
    J=D+λRJ=D+λR
    其中,所述J为所述第一量化尺度的量化代价,所述D为所述第一ALF系数对应的量化失真结果,所述R为编码所述第一ALF系数所消耗的比特数,所述λ为可变值。Wherein, J is the quantization cost of the first quantization scale, D is the quantization distortion result corresponding to the first ALF coefficient, R is the number of bits consumed for encoding the first ALF coefficient, and λ is a variable value.
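A non-limiting sketch of the scale selection of claims 11-16: each candidate scale in the interval is scored with the cost J = D + λR of claim 16, and the cheapest scale wins (claim 12). The cost values, λ, and candidate scales below are hypothetical:

```python
def rd_cost(distortion: float, bits: int, lam: float) -> float:
    # Claim 16: J = D + lambda * R
    return distortion + lam * bits

def select_target_scale(candidates):
    """Claims 11-12: evaluate the cost J of every candidate quantization
    scale and pick the scale with the smallest cost as the target scale."""
    costs = {scale: rd_cost(d, r, lam) for scale, d, r, lam in candidates}
    return min(costs, key=costs.get)

# Hypothetical (scale, distortion D, rate R, lambda) tuples for scales 2^4..2^6:
candidates = [(16, 120.0, 40, 0.5), (32, 90.0, 55, 0.5), (64, 85.0, 80, 0.5)]
target_scale = select_target_scale(candidates)  # → 32 (cost 117.5)
```

Larger scales quantize the float coefficients more finely (lower D) but cost more bits to code (higher R); the λ-weighted sum trades the two off.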
  17. 根据权利要求1所述的方法,其特征在于,所述码流中包括所述第一ALF的所有整数类型的ALF系数。The method according to claim 1, wherein the code stream includes ALF coefficients of all integer types of the first ALF.
  18. 根据权利要求1所述的方法,其特征在于,所述码流中包括所述第一ALF的整数类型的第二ALF系数和第一信息,所述第二ALF系数为所述第一ALF中非中心点对应的整数类型的ALF系数,所述第一信息用于指示所述目标量化尺度。The method according to claim 1, wherein the code stream includes integer-type second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer-type ALF coefficients corresponding to non-center points in the first ALF, and the first information is used to indicate the target quantization scale.
  19. 根据权利要求18所述的方法,其特征在于,所述第一信息包括所述目标量化尺度对应的目标移位值。The method according to claim 18, wherein the first information comprises a target shift value corresponding to the target quantization scale.
  20. 根据权利要求18所述的方法,其特征在于,所述第一信息包括所述目标量化尺度对应的目标移位值与第一数值的差值的绝对值。The method according to claim 18, wherein the first information comprises the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first value.
  21. 根据权利要求20所述的方法,其特征在于,所述第一数值为6、5或4。The method of claim 20, wherein the first value is 6, 5 or 4.
  22. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method according to claim 1, wherein the method further comprises:
    使用所述第一ALF的整数类型的ALF系数,对所述第一分量下的所述重建图像的目标区域进行滤波。filtering the target region of the reconstructed image under the first component by using the integer-type ALF coefficients of the first ALF.
  23. 根据权利要求22所述的方法,其特征在于,所述使用所述第一ALF的整数类型的ALF系数,对所述第一分量下的所述重建图像的目标区域进行滤波,包括:The method according to claim 22, wherein the filtering of the target region of the reconstructed image under the first component by using the integer-type ALF coefficients of the first ALF comprises:
    根据如下公式,对所述第一分量下的所述重建图像的目标区域进行滤波:The target area of the reconstructed image under the first component is filtered according to the following formula:
    I rec(0,0)′=(Σ j=0..n-2 W j×(I rec(x,y)+I rec(-x,-y))+W n-1×I rec(0,0)+2^(AlfShift-1))>>AlfShift,
    其中,所述I rec(0,0)′为所述目标区域中当前点(0,0)滤波后的重建像素值,所述(x,y)为相对于所述当前点的位置,所述W j为所述第一ALF的第j个整数类型的ALF系数,所述W n-1为所述第一ALF的中心点对应的整数类型的ALF系数,所述I rec(0,0)为所述当前点的重建像素值,所述AlfShift为所述目标量化尺度对应的目标移位值,所述n为所述第一ALF所包括的ALF系数的个数。Wherein, I rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target region, (x,y) is a position relative to the current point, W j is the j-th integer-type ALF coefficient of the first ALF, W n-1 is the integer-type ALF coefficient corresponding to the center point of the first ALF, I rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
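The filtering of claim 23 can be sketched as below, assuming the symmetric-tap integer form implied by the symbol definitions of claim 23: each non-center coefficient W j weights the mirrored sample pair at (x,y) and (-x,-y), the center coefficient W n-1 weights the current sample, and the accumulated sum is rounded and right-shifted by AlfShift. The function name and sample data are hypothetical:

```python
def alf_filter_sample(rec, y0, x0, weights, offsets, alf_shift):
    """Filter one reconstructed sample. `weights` holds n integer ALF
    coefficients, the last being the center tap; each of the other n-1
    coefficients applies to a mirrored pair of `offsets` entries (dy, dx)."""
    acc = 0
    for w, (dy, dx) in zip(weights[:-1], offsets):
        acc += w * (rec[y0 + dy][x0 + dx] + rec[y0 - dy][x0 - dx])
    acc += weights[-1] * rec[y0][x0]      # center coefficient W_{n-1}
    acc += 1 << (alf_shift - 1)           # rounding offset 2^(AlfShift-1)
    return acc >> alf_shift

# With a pure center tap of 2^6 and AlfShift = 6, filtering returns the input:
rec = [[10, 10, 10], [10, 37, 10], [10, 10, 10]]
out = alf_filter_sample(rec, 1, 1, [0, 64], [(0, 1)], 6)  # → 37
```

Because the coefficients are integers scaled by 2^AlfShift, the final right shift undoes the target quantization scale without any floating-point arithmetic.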
  24. 一种视频解码方法,其特征在于,包括:A video decoding method, comprising:
    解码码流,得到当前图像的残差值;Decode the code stream to obtain the residual value of the current image;
    根据所述当前图像的残差值,确定所述当前图像的重建图像,所述重建图像包括第一分量,所述第一分量为亮度分量或色度分量;determining a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
    解码码流,得到所述第一分量下所述重建图像中的目标区域对应的第一自适应修正滤波器ALF的ALF系数信息;Decoding the code stream to obtain the ALF coefficient information of the first adaptive correction filter ALF corresponding to the target region in the reconstructed image under the first component;
    根据所述第一ALF的ALF系数信息,使用所述第一ALF对所述第一分量下的所述重建图像中的目标区域进行滤波。According to the ALF coefficient information of the first ALF, the target area in the reconstructed image under the first component is filtered using the first ALF.
  25. 根据权利要求24所述的方法,其特征在于,所述第一ALF的ALF系数信息包括所述第一ALF的所有整数类型的ALF系数。The method according to claim 24, wherein the ALF coefficient information of the first ALF includes ALF coefficients of all integer types of the first ALF.
  26. 根据权利要求24所述的方法,其特征在于,所述第一ALF的ALF系数信息包括所述第一ALF的整数类型的第二ALF系数和第一信息,所述第二ALF系数为所述第一ALF中非中心点对应的整数类型的ALF系数,所述第一信息用于指示所述目标量化尺度。The method according to claim 24, wherein the ALF coefficient information of the first ALF includes integer-type second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer-type ALF coefficients corresponding to non-center points in the first ALF, and the first information is used to indicate the target quantization scale.
  27. 根据权利要求26所述的方法,其特征在于,所述第一信息包括所述目标量化尺度对应的目标移位值。The method according to claim 26, wherein the first information comprises a target shift value corresponding to the target quantization scale.
  28. 根据权利要求26所述的方法,其特征在于,所述第一信息包括所述目标量化尺度对应的目标移位值与第一数值的差值的绝对值。The method according to claim 26, wherein the first information comprises the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first value.
  29. 根据权利要求28所述的方法,其特征在于,所述第一数值为6、5或4。The method of claim 28, wherein the first value is 6, 5 or 4.
  30. 根据权利要求26所述的方法,其特征在于,所述根据所述第一ALF的ALF系数信息,使用所述第一ALF对所述第一分量下的所述重建图像中的目标区域进行滤波,包括:The method according to claim 26, wherein the filtering, by using the first ALF according to the ALF coefficient information of the first ALF, of the target region in the reconstructed image under the first component comprises:
    根据所述第一ALF的第二ALF系数和所述目标量化尺度,确定所述第一ALF的第一ALF系数,所述第一ALF系数为所述第一ALF的中心点对应的整数类型的ALF系数;determining a first ALF coefficient of the first ALF according to the second ALF coefficients of the first ALF and the target quantization scale, where the first ALF coefficient is the integer-type ALF coefficient corresponding to the center point of the first ALF;
    根据所述第一ALF的所述第二ALF系数和所述第一ALF系数,使用所述第一ALF对所述第一分量下的所述重建图像中的目标区域进行滤波。The target area in the reconstructed image under the first component is filtered using the first ALF according to the second ALF coefficient and the first ALF coefficient of the first ALF.
  31. 根据权利要求30所述的方法,其特征在于,所述根据所述第一ALF的第二ALF系数和所述目标量化尺度,确定所述第一ALF的第一ALF系数,包括:The method according to claim 30, wherein the determining the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale comprises:
    根据如下公式,确定所述第一ALF的第一ALF系数:The first ALF coefficient of the first ALF is determined according to the following formula:
    AlfCoeffLuma[i][k]=2^AlfLumaShift[i]-Σ j=0..k-1 2×AlfCoeffLuma[i][j],
    其中,所述AlfCoeffLuma[i][k]为第i个第一ALF的所述第一ALF系数,所述k为所述第一ALF所包括的ALF系数的数量减一,所述AlfLumaShift[i]为所述第一ALF对应的目标量化尺度所对应的目标移位值,所述AlfCoeffLuma[i][j]为所述第一ALF中的第j个第二ALF系数。Wherein, AlfCoeffLuma[i][k] is the first ALF coefficient of the i-th first ALF, k is the number of ALF coefficients included in the first ALF minus one, AlfLumaShift[i] is the target shift value corresponding to the target quantization scale of the first ALF, and AlfCoeffLuma[i][j] is the j-th second ALF coefficient in the first ALF.
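The decoder-side derivation of the non-transmitted center coefficient in claim 31 can be sketched as follows. This sketch assumes the derivation normalizes the total filter gain to 2^AlfLumaShift[i] with each signaled coefficient counted twice (once per mirrored tap of the symmetric filter); that factor, the function name, and the example coefficients are assumptions, since the original formula image is not available here:

```python
def derive_center_coeff(second_coeffs, alf_shift):
    """Claim 31 sketch: the center coefficient AlfCoeffLuma[i][k] is derived
    so that the total gain 2*sum(signaled pairs) + center equals 2**alf_shift,
    making the integer filter DC-preserving after the right shift."""
    return (1 << alf_shift) - 2 * sum(second_coeffs)

# Hypothetical signaled non-center coefficients (k = 8, i.e. a 9-coefficient filter):
center = derive_center_coeff([2, -3, 5, 1, 0, 4, -2, 7], 6)  # → 64 - 2*14 = 36
```

Deriving the center tap this way is what lets the encoder omit it from the code stream (claims 18 and 26) and signal only the shift.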
  32. 根据权利要求31所述的方法,其特征在于,若所述第一分量为亮度分量,则所述i的取值为0至p,所述p为所述亮度分量对应的ALF的个数减一。The method according to claim 31, wherein, if the first component is a luminance component, the value of i ranges from 0 to p, where p is the number of ALFs corresponding to the luminance component minus one.
  33. 根据权利要求32所述的方法,其特征在于,所述p为15或63。The method of claim 32, wherein the p is 15 or 63.
  34. 根据权利要求31所述的方法,其特征在于,若所述第一分量为Cb分量,则所述i等于第一值,若所述第一分量为Cr分量,则所述i等于第二值。The method according to claim 31, wherein if the first component is a Cb component, the i is equal to a first value, and if the first component is a Cr component, the i is equal to a second value .
  35. 根据权利要求34所述的方法,其特征在于,所述第一值为0,所述第二值为1。The method of claim 34, wherein the first value is 0 and the second value is 1.
  36. 根据权利要求31所述的方法,其特征在于,所述方法还包括:The method of claim 31, wherein the method further comprises:
    从所述码流中解析得到所述第一ALF的形状信息;The shape information of the first ALF is obtained by parsing from the code stream;
    根据所述第一ALF的形状信息,确定所述第一ALF的类型。The type of the first ALF is determined according to the shape information of the first ALF.
  37. 根据权利要求36所述的方法,其特征在于,若所述第一ALF的形状为7×7十字形加3×3方形滤波器形状,则所述第一ALF包括9个ALF系数;若所述第一ALF的形状为7×7十字形加5×5方形滤波器形状,则所述第一ALF包括15个ALF系数。The method according to claim 36, wherein if the shape of the first ALF is a 7×7 cross plus a 3×3 square filter shape, the first ALF includes 9 ALF coefficients; If the shape of the first ALF is a 7×7 cross shape plus a 5×5 square filter shape, the first ALF includes 15 ALF coefficients.
  38. 根据权利要求30所述的方法,其特征在于,所述根据所述第一ALF的所述第二ALF系数和所述第一ALF系数,使用所述第一ALF对所述第一分量下的所述重建图像中的目标区域进行滤波,包括:The method according to claim 30, wherein the filtering, by using the first ALF according to the second ALF coefficients and the first ALF coefficient of the first ALF, of the target region in the reconstructed image under the first component comprises:
    根据如下公式,对所述第一分量下的所述重建图像中的目标区域进行滤波:The target area in the reconstructed image under the first component is filtered according to the following formula:
    I rec(0,0)′=(Σ j=0..n-2 W j×(I rec(x,y)+I rec(-x,-y))+W n-1×I rec(0,0)+2^(AlfShift-1))>>AlfShift,
    其中,所述I rec(0,0)′为所述目标区域中当前点(0,0)滤波后的重建像素值,所述(x,y)为相对于所述当前点的位置,所述W j为所述第一ALF的第j个整数类型的ALF系数,所述W n-1为所述第一ALF的中心点对应的整数类型的ALF系数,所述I rec(0,0)为所述当前点的重建像素值,所述AlfShift为所述目标量化尺度对应的目标移位值,所述n为所述第一ALF所包括的ALF系数的个数。Wherein, I rec(0,0)′ is the filtered reconstructed pixel value of the current point (0,0) in the target region, (x,y) is a position relative to the current point, W j is the j-th integer-type ALF coefficient of the first ALF, W n-1 is the integer-type ALF coefficient corresponding to the center point of the first ALF, I rec(0,0) is the reconstructed pixel value of the current point, AlfShift is the target shift value corresponding to the target quantization scale, and n is the number of ALF coefficients included in the first ALF.
  39. 一种视频编码器,其特征在于,包括:A video encoder, comprising:
    获取单元,用于获得当前图像的重建图像,所述重建图像包括第一分量,所述第一分量为亮度分量或色度分量;an obtaining unit, configured to obtain a reconstructed image of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
    第一确定单元,用于确定对所述第一分量下的所述重建图像中的目标区域使用第一自适应修正滤波器ALF进行滤波时,所述第一ALF的浮点数类型的ALF系数;a first determining unit, configured to determine an ALF coefficient of the floating point type of the first ALF when the target area in the reconstructed image under the first component is filtered using the first adaptive correction filter ALF;
    第二确定单元,用于根据所述第一ALF的浮点数类型的ALF系数,确定所述第一ALF的浮点数类型的ALF系数的最大量化尺度;a second determination unit, configured to determine the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF according to the ALF coefficients of the floating point type of the first ALF;
    第三确定单元,用于根据所述最大量化尺度,确定所述第一ALF的浮点数类型的ALF系数的目标量化尺度,所述目标量化尺度小于或等于所述最大量化尺度;a third determining unit, configured to determine, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficient of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
    量化单元,用于使用所述目标量化尺度,将所述第一ALF的浮点数类型的ALF系数量化为整数类型的ALF系数;a quantization unit, configured to use the target quantization scale to quantize the floating-point type ALF coefficients of the first ALF into integer-type ALF coefficients;
    编码单元,用于对所述第一ALF的整数类型的ALF系数进行编码,得到码流。an encoding unit, configured to encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
  40. 一种视频解码器,其特征在于,包括:A video decoder, comprising:
    第一解码单元,用于解码码流,得到当前图像的残差值;The first decoding unit is used for decoding the code stream to obtain the residual value of the current image;
    确定单元,用于根据所述当前图像的残差值,确定所述当前图像的重建图像,所述重建图像包括第一分量,所述第一分量为亮度分量或色度分量;a determining unit, configured to determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
    第二解码单元,用于解码码流,得到所述第一分量下所述重建图像中的目标区域对应的第一自适应修正滤波器ALF的ALF系数信息;a second decoding unit, configured to decode the code stream to obtain the ALF coefficient information of the first adaptive correction filter ALF corresponding to the target region in the reconstructed image under the first component;
    滤波单元,用于根据所述第一ALF的ALF系数信息,使用所述第一ALF对所述第一分量下的所述重建图像中的目标区域进行滤波。A filtering unit, configured to use the first ALF to filter the target area in the reconstructed image under the first component according to the ALF coefficient information of the first ALF.
  41. 一种视频编解码系统,其特征在于,包括:A video encoding and decoding system, comprising:
    根据权利要求39所述的视频编码器;The video encoder of claim 39;
    以及根据权利要求40所述的视频解码器。and the video decoder of claim 40.
  42. 一种计算机可读存储介质,其特征在于,用于存储计算机程序,所述计算机程序使得计算机执行如权利要求1至23或如权利要求24至38任一项所述的方法。A computer-readable storage medium, characterized by being used for storing a computer program, the computer program causing a computer to execute the method according to any one of claims 1 to 23 or claims 24 to 38.
PCT/CN2021/073409 2021-01-22 2021-01-22 Video coding method and system, video decoding method and system, video coder and video decoder WO2022155922A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2021/073409 WO2022155922A1 (en) 2021-01-22 2021-01-22 Video coding method and system, video decoding method and system, video coder and video decoder
CN202311245327.0A CN117082239A (en) 2021-01-22 2021-01-22 Video encoding and decoding method and system, video encoder and video decoder
CN202180090973.7A CN116746152A (en) 2021-01-22 2021-01-22 Video encoding and decoding method and system, video encoder and video decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/073409 WO2022155922A1 (en) 2021-01-22 2021-01-22 Video coding method and system, video decoding method and system, video coder and video decoder

Publications (1)

Publication Number Publication Date
WO2022155922A1 true WO2022155922A1 (en) 2022-07-28

Family

ID=82548401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073409 WO2022155922A1 (en) 2021-01-22 2021-01-22 Video coding method and system, video decoding method and system, video coder and video decoder

Country Status (2)

Country Link
CN (2) CN116746152A (en)
WO (1) WO2022155922A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107211154A (en) * 2015-02-11 2017-09-26 高通股份有限公司 Decoding tree cell level auto-adaptive loop filter
WO2020094154A1 (en) * 2018-11-09 2020-05-14 Beijing Bytedance Network Technology Co., Ltd. Improvements for region based adaptive loop filter
WO2020126411A1 (en) * 2018-12-21 2020-06-25 Canon Kabushiki Kaisha Adaptive loop filtering (alf) with non-linear clipping
CN111742552A (en) * 2019-06-25 2020-10-02 北京大学 Method and device for loop filtering
CN111801941A (en) * 2018-03-09 2020-10-20 华为技术有限公司 Method and apparatus for image filtering using adaptive multiplier coefficients


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
L.-H. XU (FUJITSU), J. YAO (FUJITSU), J.-Q. ZHU, K. KAZUI (FUJITSU): "Non-CE5: Adaptive precision for CCALF coefficients", 17. JVET MEETING; 20200107 - 20200117; BRUSSELS; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 31 December 2019 (2019-12-31), XP030223092 *

Also Published As

Publication number Publication date
CN117082239A (en) 2023-11-17
CN116746152A (en) 2023-09-12

Similar Documents

Publication Publication Date Title
CN110720218B (en) Intra-frame filtering applied with transform processing in video coding
CN110393010B (en) Intra-filtering flags in video coding
CN113411577B (en) Coding method and device
JP7401542B2 (en) How to intra-predict blocks of pictures
CN111327904B (en) Image reconstruction method and device
EP4060990A1 (en) Video encoding method, video decoding method, and corresponding apparatuses
JP7277586B2 (en) Method and apparatus for mode and size dependent block level limiting
WO2023039859A1 (en) Video encoding method, video decoding method, and device, system and storage medium
WO2019010305A1 (en) Color remapping for non-4:4:4 format video content
KR20210107131A (en) Image prediction method, apparatus and system, device and storage medium
CN114424567B (en) Method and apparatus for combined inter-intra prediction using matrix-based intra prediction
EP3890322A1 (en) Video coder-decoder and corresponding method
WO2023044868A1 (en) Video encoding method, video decoding method, device, system, and storage medium
EP3993420A1 (en) Video coder and qp setting method
WO2022155922A1 (en) Video coding method and system, video decoding method and system, video coder and video decoder
WO2022116105A1 (en) Video encoding method and system, video decoding method and apparatus, video encoder and video decoder
WO2022174475A1 (en) Video encoding method and system, video decoding method and system, video encoder, and video decoder
WO2023184250A1 (en) Video coding/decoding method, apparatus and system, device and storage medium
WO2023122968A1 (en) Intra-frame prediction method, device and system, and storage medium
WO2023236113A1 (en) Video encoding and decoding methods, apparatuses and devices, system, and storage medium
WO2024050723A1 (en) Image prediction method and apparatus, and computer readable storage medium
CN111277840A (en) Transform method, inverse transform method, video encoder and video decoder
WO2022217447A1 (en) Video encoding and decoding method and system, and video codec
WO2023122969A1 (en) Intra-frame prediction method, device, system, and storage medium
WO2022116054A1 (en) Image processing method and system, video encoder, and video decoder

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21920320

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180090973.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21920320

Country of ref document: EP

Kind code of ref document: A1