WO2022227062A1 - Encoding and decoding methods, code stream, encoder, decoder, and storage medium - Google Patents

Info

Publication number
WO2022227062A1
Authority
WO
WIPO (PCT)
Prior art keywords
weight matrix
value
identification information
syntax element
network model
Prior art date
Application number
PCT/CN2021/091670
Other languages
French (fr)
Chinese (zh)
Inventor
戴震宇
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2021/091670 priority Critical patent/WO2022227062A1/en
Priority to CN202180090942.1A priority patent/CN116803078A/en
Publication of WO2022227062A1 publication Critical patent/WO2022227062A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N 19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation, involving filtering within a prediction loop

Definitions

  • the embodiments of the present application relate to the technical field of image processing, and in particular, to an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
  • the traditional loop filter mainly includes a deblocking filter, a sample adaptive compensation filter, and an adaptive correction filter.
  • the high-performance modular intelligent coding test model (High Performance-Modular Artificial Intelligence Model, HPM-ModAI)
  • HPM-ModAI High Performance-Modular Artificial Intelligence Model
  • AVS3 Third-generation audio and video coding standard
  • the loop filter based on a residual neural network (hereinafter referred to as CNNLF) is used as the baseline scheme of the intelligent loop filter module, and is located between the sample adaptive compensation filter and the adaptive correction filter.
  • the reconstructed pixel value of the current block will be updated to the pixel value processed by CNNLF at this time.
  • however, a smaller distortion value for the entire block does not mean that the distortion of each pixel value in the block will be smaller, so the improvement in coding performance is limited.
  • Embodiments of the present application provide an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium, which can improve encoding performance and further improve encoding and decoding efficiency.
  • an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
  • parsing the code stream, and determining the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determining the weight matrix network model of the current block, and determining the weight matrix of the current block according to the weight matrix network model;
  • using the weight matrix to determine the target reconstructed image block of the current block.
  • an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
  • determining the weight matrix network model of the current block, and determining the weight matrix of the current block according to the weight matrix network model; determining the value of at least one syntax element identification information; and when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, using the weight matrix to determine the target reconstructed image block of the current block.
  • an embodiment of the present application provides a code stream, where the code stream is generated by bit encoding according to the value of at least one syntax element identification information;
  • the at least one syntax element identification information includes at least: first syntax element identification information, first luma syntax element identification information, second luma syntax element identification information, and chroma syntax element identification information;
  • the first syntax element identification information is used to indicate whether the video sequence is filtered using the weight matrix
  • the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the weight matrix
  • the second luminance syntax element identification information is used to indicate whether the luminance component of the current block is filtered using the weight matrix
  • the chroma syntax element identification information is used to indicate whether the chroma component of the current frame is filtered using the weight matrix
  • the video sequence includes the current frame
  • the current frame includes the current block.
  • an embodiment of the present application provides an encoder, where the encoder includes a first determining unit and a first filtering unit; wherein,
  • a first determining unit configured to determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model; and also configured to determine the value of at least one syntax element identification information;
  • the first filtering unit is configured to use the weight matrix to determine the target reconstructed image block of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing.
  • an embodiment of the present application provides an encoder, where the encoder includes a first memory and a first processor; wherein, the first memory is used for storing a computer program executable on the first processor
  • the first processor is configured to execute the method of the second aspect when running the computer program.
  • an embodiment of the present application provides a decoder, the decoder includes a parsing unit, a second determining unit, and a second filtering unit; wherein,
  • a parsing unit configured to parse the code stream, and determine the value of at least one syntax element identification information
  • a second determining unit configured to determine the weight matrix network model of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, and determine the weight matrix of the current block according to the weight matrix network model;
  • the second filtering unit is configured to use the weight matrix to determine the target reconstructed image block of the current block.
  • an embodiment of the present application provides a decoder, where the decoder includes a second memory and a second processor; wherein,
  • a second memory for storing a computer program executable on the second processor
  • the second processor is configured to execute the method according to the first aspect when running the computer program.
  • an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and when the computer program is executed, the method described in the first aspect or the second aspect is implemented.
  • the embodiments of the present application provide an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
  • the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the value of at least one syntax element identification information is determined; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block.
  • FIG. 1 is a schematic diagram of the application of a coding framework provided by the related art
  • FIG. 2 is an application schematic diagram of another encoding framework provided by the related art
  • FIG. 3A is a schematic diagram of a detailed framework of a video coding system provided by an embodiment of the present application.
  • FIG. 3B is a schematic diagram of a detailed framework of a video decoding system provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a decoding method provided by an embodiment of the present application.
  • FIG. 5A is a schematic diagram of a network structure composition of a luminance component provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the application of a coding framework provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the composition and structure of a weight matrix network model provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an overall framework of a weight matrix network model provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of an encoding method provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the application.
  • FIG. 13 is a schematic diagram of the composition and structure of a decoder provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a specific hardware structure of a decoder provided by an embodiment of the present application.
  • a first image component, a second image component, and a third image component are generally used to represent a coding block (Coding Block, CB); wherein, the three image components are a luminance component, a blue chrominance component, and a red chrominance component, respectively. Specifically, the luminance component is usually represented by the symbol Y, the blue chrominance component is usually represented by the symbol Cb or U, and the red chrominance component is usually represented by the symbol Cr or V; in this way, the video image can be represented in the YCbCr format or in the YUV format.
  • CB coding block
  • MPEG Moving Picture Experts Group
  • VVC Versatile Video Coding
  • VVC's reference software test platform (VVC Test Model, VTM)
  • Audio Video coding Standard (AVS)
  • HPM High-Performance Model
  • the High Performance-Modular Artificial Intelligence Model of AVS (HPM-ModAI)
  • ALF Adaptive loop filter
  • QP Quantization Parameter
  • CU Coding Unit
  • the digital video compression technology mainly compresses the huge digital image and video data to facilitate transmission and storage.
  • although the existing digital video compression standards can save a lot of video data, it is still necessary to pursue better digital video compression technology to reduce the bandwidth and traffic pressure of digital video transmission.
  • in the process of digital video encoding, the encoder reads unequal pixels for original video sequences of different color formats, including luminance components and chrominance components; that is, the encoder reads a black-and-white or color image, which is then divided into blocks, and the block data is handed over to the encoder for encoding.
  • the encoder usually adopts a hybrid frame coding mode, which generally includes operations such as intra-frame prediction, inter-frame prediction, transform/quantization, inverse quantization/inverse transform, loop filtering, and entropy coding; the specific processing flow can be referred to as shown in FIG. 1.
  • intra-frame prediction only refers to the information of the same frame image, and predicts the pixel information in the current divided block to eliminate spatial redundancy
  • inter-frame prediction can include motion estimation and motion compensation, which can refer to image information of different frames and use motion estimation to search for the motion vector information that best matches the current divided block, so as to eliminate temporal redundancy; transformation converts the predicted image block to the frequency domain, where the energy is redistributed.
  • the traditional loop filtering module mainly includes a deblocking filter (hereinafter referred to as DBF), a sample adaptive compensation filter (hereinafter referred to as SAO), and an adaptive correction filter (hereinafter referred to as ALF).
  • DBF deblocking filter
  • SAO sample adaptive compensation filter
  • ALF adaptive correction filter
  • the loop filter based on residual neural network hereinafter referred to as CNNLF
  • CNNLF residual neural network-based loop filter
  • the reconstructed pixel value of the current CTU will be updated to the pixel value processed by CNNLF.
  • when CNNLF is turned on, it indicates that the mean square error of the current CTU after CNNLF processing will become smaller than that of the original CTU before CNNLF; but the smaller distortion of the entire CTU does not mean that the distortion of each pixel value in the CTU will be reduced.
  • the existing solutions lack the optimal selection of each pixel value in the CTU processed by CNNLF, which limits the improvement of coding performance.
  • the embodiment of the present application provides a decoding method.
  • the code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or current block is filtered by using a weight matrix
  • determine the weight matrix network model of the current block and determine the weight matrix of the current block according to the weight matrix network model; use the weight matrix to determine the target reconstructed image block of the current block.
  • in this way, since the weight matrix is determined through the weight matrix network model, it can not only realize the weight matrix technology based on deep learning and provide pixel-level weighting processing for the output reconstructed image block, but also improve the coding performance, which in turn can improve the encoding and decoding efficiency;
  • in addition, the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
  • the video coding system 10 includes a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffering unit 110, and the like, wherein the filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the encoding unit 109 can implement header information encoding and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
  • CABAC Context-based Adaptive Binary Arithmetic Coding
  • a video coding block can be obtained by dividing a coding tree unit (Coding Tree Unit, CTU), and then the residual pixel information obtained after intra-frame or inter-frame prediction is transformed and quantized by the transform and quantization unit 101.
  • the video coding block is transformed, including transforming residual information from the pixel domain to the transform domain, and quantizing the resulting transform coefficients to further reduce the bit rate;
  • the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra prediction on the video coding block; specifically, the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to determine the intra prediction mode to be used to encode the video coding block;
  • the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-predictive encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; the motion estimation performed by the motion estimation unit 105 is to generate a motion vector.
  • the motion vector can estimate the motion of the video coding block, and then the motion compensation unit 104 performs motion compensation based on the motion vector determined by the motion estimation unit 105; after determining the intra prediction mode, the intra prediction unit 103 is also used to provide the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109; in addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block, reconstructing the residual block in the pixel domain; the blocking artifacts of the reconstructed residual block are removed by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image buffer unit 110 to generate a reconstructed video coding block; the encoding unit 109 is used for encoding various coding parameters and quantized transform coefficients.
  • the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra-frame prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image buffer unit 206, etc., wherein the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering.
  • after the video encoding is completed, the code stream of the video signal is output; the code stream is input into the video decoding system 20, and first passes through the decoding unit 201 to obtain decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate residual blocks in the pixel domain; the intra prediction unit 203 may be used to generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines prediction information for the video decoding block by parsing the motion vector and other associated syntax elements, and uses the prediction information to generate the predictive block for the video decoding block being decoded; a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block produced by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal may pass through the filtering unit 205 in order to remove blocking artifacts and improve the video quality; the decoded video blocks are then stored in the decoded image buffer unit 206, which stores reference images for subsequent prediction.
  • the methods provided by the embodiments of the present application may be applied to the filtering unit 108 shown in FIG. 3A (represented by a bold black box), and may also be applied to the filtering unit 205 shown in FIG. 3B (indicated by a bold black box). That is to say, the methods in the embodiments of the present application can be applied to a video encoding system (referred to as an "encoder" for short), to a video decoding system (referred to as a "decoder" for short), or even to both the video encoding system and the video decoding system simultaneously, but no limitation is made here.
  • FIG. 4 shows a schematic flowchart of a decoding method provided by an embodiment of the present application. As shown in Figure 4, the method may include:
  • this embodiment of the present application may divide them into two types of color components, such as luminance component and chrominance component.
  • if the current block performs operations such as prediction, inverse transformation and inverse quantization, and loop filtering on the luminance component, the current block can also be called a luminance block; or, if the current block performs operations such as prediction, inverse transformation and inverse quantization, and loop filtering on the chrominance component, the current block can also be called a chrominance block.
  • an embodiment of the present application specifically provides a loop filtering method, and the loop filtering method is applied to the part of the filtering unit 205 as shown in FIG. 3B .
  • the filtering unit 205 may include a deblocking filter (DBF), a sample adaptive compensation filter (SAO), a residual neural network-based loop filter (CNNLF), and an adaptive correction filter (ALF), and the loop filtering method described in the embodiment of the present application is specifically applied between CNNLF and ALF.
  • DBF deblocking filter
  • SAO sample adaptive compensation filter
  • CNNLF residual neural network-based loop filter
  • ALF adaptive correction filter
  • CNNLF designs different network structures for the luminance component and the chrominance component, respectively, see FIG. 5A and FIG. 5B for details.
  • the entire network structure can be composed of convolutional layers, activation layers, residual blocks, skip connection layers and other parts.
  • the convolution kernel of the convolution layer can be 3 ⁇ 3, that is, it can be represented by 3 ⁇ 3Conv;
  • the activation layer can use a linear rectification function (Rectified Linear Unit, ReLU), also known as a modified linear unit, which is an activation function commonly used in artificial neural networks, and usually refers to nonlinear functions represented by the ramp function and its variants.
  • ReLU Rectified Linear Unit
  • the network structure of the residual block is shown in the dotted box in Figure 6, which can be composed of a convolution layer (Conv), an activation layer (ReLU), and a jump connection layer.
  • the jump connection layer refers to a global jump connection from input to output included in the network structure, which enables the network to focus on learning residuals and accelerates the convergence process of the network.
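  • as an illustration of the residual CNN structure described above (3×3 convolutions, activation layers, residual blocks, and a global skip connection from input to output), the following PyTorch-style sketch may be considered; the layer counts and channel widths are assumptions for illustration rather than the exact CNNLF configuration of HPM-ModAI.

```python
# Illustrative residual CNN filter: 3x3 convolutions, ReLU activations,
# residual blocks with local skip connections, and a global skip connection
# from input to output. Layer counts and channel widths are assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # local skip connection: the block learns a residual on top of its input
        return x + self.conv2(self.relu(self.conv1(x)))

class ResidualCNNFilter(nn.Module):
    def __init__(self, in_channels: int = 1, channels: int = 64, num_blocks: int = 4):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(in_channels, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv2d(channels, in_channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # global skip connection: the network focuses on learning the residual,
        # which accelerates convergence as described above
        return x + self.tail(self.body(self.head(x)))
```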
  • CNNLF can include two stages of offline training and inference testing.
  • 4 I-frame luminance component models, 4 non-I-frame luminance component models, 4 chrominance U component models, and 4 chrominance V component models can be trained offline, a total of 16 models.
  • a preset image dataset such as DIV2K, which has 1000 high-definition images (2K resolution), of which 800 are for training, 100 for validation, and 100 for testing
  • the images are converted from RGB to YUV 4:2:0 format and used as label data.
  • the sequences are then encoded using HPM in the All Intra configuration, with traditional filters such as DBF, SAO, and ALF turned off, and the quantization parameter is set from 27 to 50.
  • the reconstructed sequences obtained by encoding are divided into 4 intervals according to the QP ranges 27~31, 32~37, 38~44, and 45~50, and are cut into 128×128 image blocks as training data, so as to train 4 I-frame luminance component models, 4 chrominance U component models, and 4 chrominance V component models.
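  • as an illustration of this training-data preparation, the following sketch groups reconstructed frames by the four QP intervals above and crops them into 128×128 patches paired with the corresponding original (label) patches; the helper names and data layout are assumptions for illustration.

```python
# Illustrative training-data preparation: group reconstructed frames into the
# four QP intervals and crop aligned 128x128 patches from the reconstructed
# and original (label) frames. Helper names and layout are assumptions.
import numpy as np

QP_INTERVALS = [(27, 31), (32, 37), (38, 44), (45, 50)]

def qp_interval_index(qp: int) -> int:
    for i, (lo, hi) in enumerate(QP_INTERVALS):
        if lo <= qp <= hi:
            return i
    raise ValueError(f"QP {qp} is outside the supported range 27-50")

def crop_patches(frame: np.ndarray, size: int = 128):
    height, width = frame.shape[:2]
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            yield frame[y:y + size, x:x + size]

def build_training_pairs(recon_frame: np.ndarray, orig_frame: np.ndarray, qp: int):
    # one model is trained per (color component, QP interval)
    bucket = qp_interval_index(qp)
    pairs = list(zip(crop_patches(recon_frame), crop_patches(orig_frame)))
    return bucket, pairs
```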
  • a preset video dataset e.g.
  • HPM-ModAI sets frame-level flags and CTU-level flags in the form of switches for the luminance component to control whether to turn on CNNLF, and sets the frame-level flags in the form of switches for the chroma components to control whether to turn on CNNLF.
  • the flag bit can usually be represented by flag.
  • the CTU level flag is set to control whether CNNLF is turned on. Specifically, the CTU-level flag bit is determined by equation (2).
  • the reconstructed pixel value of the current CTU will be updated to the pixel value processed by CNNLF.
  • if the CNNLF flag is turned on, it means that the mean square error of the current CTU after CNNLF processing will be smaller than that of the original CTU before CNNLF; but the smaller distortion of the entire CTU does not mean that the distortion of each pixel value in the CTU will become smaller, so the improvement in encoding performance is limited.
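  • the exact form of the frame-level and CTU-level decision equations is not reproduced in this excerpt; the following sketch only illustrates the mean-square-error comparison implied above, assuming the CTU-level flag is set when CNNLF reduces the mean square error of the CTU with respect to the original picture.

```python
# Illustrative CTU-level on/off decision: set the flag when CNNLF reduces the
# mean square error of the CTU against the original picture. This only
# approximates equations (1)/(2), whose exact form is not given here.
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def ctu_cnnlf_flag(recon_ctu: np.ndarray, cnnlf_ctu: np.ndarray, orig_ctu: np.ndarray) -> int:
    # 1 -> the reconstructed pixel values of the CTU are updated to the CNNLF output
    # 0 -> the CTU keeps its pre-CNNLF reconstruction
    return 1 if mse(cnnlf_ctu, orig_ctu) < mse(recon_ctu, orig_ctu) else 0
```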
  • a first syntax element identification information may be set to indicate whether the current video sequence uses the weight-matrix filtering described in the present application.
  • the code stream is parsed, and the value of the first syntax element identification information is determined; wherein, the first syntax element identification information is used to indicate whether the video sequence is filtered by using the weight matrix.
  • the value of the first syntax element identification information is the first value, it is determined that the first syntax element identification information indicates that the video sequence is filtered by using the weight matrix; or,
  • the value of the first syntax element identification information is the second value, it is determined that the first syntax element identification information indicates that the video sequence is not filtered using the weight matrix.
  • the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers.
  • the first syntax element identification information may be a parameter written in a profile (profile), or may be a value of a flag (flag), which is not limited in this embodiment of the present application.
  • the first value can be set to 1, and the second value can be set to 0; alternatively, the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, and the second value can also be set to true.
  • the first value may be 1, and the second value may be 0, but no limitation is imposed.
  • the video sequence includes at least one frame, and the at least one frame may include the current frame.
  • since the embodiment of the present application needs to further determine whether the current frame in the video sequence uses the weight matrix for filtering processing, it is also necessary to set a second syntax element identification information.
  • the parsing the code stream and determining the identification information of at least one syntax element may further include:
  • the code stream is parsed, and the value of the second syntax element identification information is determined; wherein, the second syntax element identification information is used to indicate whether the current frame uses the weight matrix for filtering processing.
  • the parsing the code stream and determining the identification information of at least one syntax element may include:
  • the code stream is parsed to determine the value of the second syntax element identification information.
  • the second syntax element identification information is the first luminance syntax element identification information
  • the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix processing
  • the color component type of the current frame is a chroma component
  • the second syntax element identification information is the chroma syntax element identification information
  • the chroma syntax element identification information is used to indicate whether the chroma components of the current frame are processed using a chroma weight matrix. filter processing.
  • the color component types may include luminance components and chrominance components. If the color component type is a luminance component, then the second syntax element identification information may be referred to as the first luminance syntax element identification information at this time, to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix. If the color component type is a chrominance component, then the second syntax element identification information may be referred to as the chroma syntax element identification information at this time, to indicate whether the chroma components of the current frame are filtered using the chrominance weight matrix.
  • even if the current frame uses the weight matrix for filtering processing, it does not mean that every block in the current frame uses the weight matrix for filtering processing, and CTU-level syntax element identification information may also be involved to determine whether the current block uses the weight matrix for filtering processing.
  • the following will take the two types of color components, the luminance component and the chrominance component, as examples to describe respectively.
  • the weight matrix network model of the current block is determined when the identification information of at least one syntax element indicates that a weight matrix is used for filtering processing, which may include:
  • the code stream is parsed, and the value of the second luminance syntax element identification information is determined;
  • a luma weight matrix network model of the current block is determined.
  • the frame-level syntax element may be referred to as the first luma syntax element identification information, represented by luma_frame_weighting_matrix_flag; the CTU-level syntax element may be referred to as the second luma syntax element identification information, represented by luma_ctu_weighting_matrix_flag.
  • the current block is within the current frame.
  • the luminance frame level flag can be used to control whether the luminance component of the current frame is filtered by the luminance weight matrix
  • the luminance CTU level flag can be used to control whether the luminance component of the current block is filtered by the luminance weight matrix.
  • the value of the first luma syntax element identification information is the first value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame is filtered by using the luma weight matrix; or,
  • the value of the first luma syntax element identification information is the second value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame is not filtered using the luma weight matrix.
  • the method may also include:
  • if the value of the first luminance syntax element identification information is the first value, the luminance frame-level flag is turned on; or,
  • if the value of the first luminance syntax element identification information is the second value, the luminance frame-level flag is turned off.
  • the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers.
  • the first luminance syntax element identification information and the second luminance syntax element identification information may be parameters written in the profile (profile), or may be the value of a flag (flag); this embodiment of the present application does not limit this.
  • the first value may be set to 1, and the second value may be set to 0 ;
  • the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, the second value can also be set to true.
  • This embodiment of the present application does not make any limitation on this.
  • in a specific example, if the value of the first luminance syntax element identification information is 1, the luminance frame-level flag bit can be turned on, that is, the frame-level weight matrix module can be turned on; at this time, it can be determined that the luminance component of the current frame is filtered using the luminance weight matrix. Otherwise, if the value of the first luminance syntax element identification information is 0, the luminance frame-level flag can be turned off, that is, the frame-level weight matrix module can be turned off; at this time, it can be determined that the luminance component of the current frame does not use the luminance weight matrix for filtering processing, and the next frame can be obtained from the video sequence, the next frame is determined as the current frame, and then the steps of parsing the code stream and determining the value of the first luminance syntax element identification information are continued.
  • the method may further include:
  • when the value of the second luma syntax element identification information is the first value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block is filtered by using the luma weight matrix; or, when the value of the second luma syntax element identification information is the second value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block is not filtered using the luma weight matrix.
  • the method may also include:
  • if the value of the second luminance syntax element identification information is the first value, the luminance CTU-level flag is turned on; or,
  • if the value of the second luminance syntax element identification information is the second value, the luminance CTU-level flag is turned off.
  • the first value and the second value are different.
  • the first value may be set to 1, and the second value may be set to 0; alternatively, the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, and the second value can also be set to true.
  • This embodiment of the present application does not make any limitation on this.
  • in a specific example, if the value of the second luminance syntax element identification information is 1, the luminance CTU-level flag bit can be turned on, that is, the CTU-level weight matrix module is turned on; at this time, it can be determined that the luminance component of the current block is filtered by the luminance weight matrix. Otherwise, if the value of the second luminance syntax element identification information is 0, the luminance CTU-level flag bit can be turned off, that is, the CTU-level weight matrix module can be turned off.
  • at this time, the next block can be obtained from the current frame, the next block is determined as the current block, and then the steps of parsing the code stream and determining the value of the second luminance syntax element identification information continue to be performed, until all the blocks included in the current frame are processed, and then the next frame is loaded to continue processing.
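  • the decoder-side control flow implied by the flag hierarchy above can be sketched as follows; parse_flag and apply_weight_matrix are hypothetical helpers and the sequence-level flag name is an assumption, while luma_frame_weighting_matrix_flag, luma_ctu_weighting_matrix_flag, and chroma_frame_weighting_matrix_flag are the names used in this description.

```python
# Illustrative decoder-side control flow for the weight-matrix flags:
# sequence level -> luminance frame level -> luminance CTU level, plus a
# frame-level flag for the chroma components (no CTU-level flag for chroma).
# parse_flag() and apply_weight_matrix() are hypothetical placeholders.
def decode_frame_with_weight_matrix(bitstream, frame):
    if not parse_flag(bitstream, "sequence_weighting_matrix_flag"):  # name assumed
        return  # the video sequence does not use weight-matrix filtering

    luma_frame_flag = parse_flag(bitstream, "luma_frame_weighting_matrix_flag")
    chroma_frame_flag = parse_flag(bitstream, "chroma_frame_weighting_matrix_flag")

    for ctu in frame.ctus:
        if luma_frame_flag and parse_flag(bitstream, "luma_ctu_weighting_matrix_flag"):
            apply_weight_matrix(ctu, component="luma")
        if chroma_frame_flag:
            # every block of the frame uses the chroma weight matrix by default
            apply_weight_matrix(ctu, component="chroma")
```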
  • the weight matrix network model of the current block is determined when the identification information of the at least one syntax element indicates that a weight matrix is used for filtering processing, which can include:
  • the chroma weight matrix network model of the current block is determined.
  • the frame-level syntax element may be referred to as chroma syntax element identification information, which is represented by chroma_frame_weighting_matrix_flag.
  • if the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix, then the blocks included in the current frame all use the chroma weight matrix for filtering by default. If the chroma syntax element identification information indicates that the chroma components of the current frame do not use the chroma weight matrix for filtering, then the blocks included in the current frame do not use the chroma weight matrix for filtering by default. Therefore, it is no longer necessary to set a CTU-level syntax element for the chroma components, and similarly, the CTU-level flag bit does not need to be set.
  • the method may further include: setting a chroma frame level flag.
  • the chrominance frame-level flag can be used to control whether the chrominance components of the current frame are filtered using the chrominance weight matrix.
  • the method may further include:
  • the value of the chroma syntax element identification information is the first value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix; or,
  • the value of the chroma syntax element identification information is the second value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame are not filtered using the chroma weight matrix.
  • the method may also include:
  • if the value of the chroma syntax element identification information is the first value, the chroma frame-level flag is turned on; or,
  • if the value of the chroma syntax element identification information is the second value, the chroma frame-level flag is turned off.
  • the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers.
  • the chroma syntax element identification information may be a parameter written in a profile (profile), or may be a value of a flag (flag), which is not limited in this embodiment of the present application.
  • the first value can be set to 1, and the second value can be set to 0 ;
  • the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, the second value can also be set to true.
  • This embodiment of the present application does not make any limitation on this.
  • in a specific example, if the value of the chroma syntax element identification information is 1, the chroma frame-level flag bit can be turned on, that is, the frame-level weight matrix module can be turned on; at this time, it can be determined that the chrominance components of the current frame are filtered using the chrominance weight matrix, and by default, the chrominance components of each block in the current frame are filtered using the chrominance weight matrix. Otherwise, if the value of the chroma syntax element identification information is 0, then the chroma frame-level flag bit can be turned off, that is, the frame-level weight matrix module can be turned off.
  • the next frame can be obtained from the video sequence, the next frame is determined as the current frame, and then the steps of parsing the code stream and determining the value of the chroma syntax element identification information are continued.
  • the determining the luminance weight matrix network model of the current block may include:
  • the color component type of the current block is a luminance component, determining at least one candidate luminance weight matrix network model;
  • the selected candidate luminance weight matrix network model is determined as the luminance weight matrix network model of the current block.
  • the determining the chrominance weight matrix network model of the current block may include:
  • the color component type of the current block is a chrominance component, determining at least one candidate chrominance weight matrix network model;
  • the selected candidate chroma weight matrix network model is determined as the chroma weight matrix network model of the current block.
  • the weight matrix network model of the current block is not only related to the quantization parameter, but also related to the color component type.
  • different color component types correspond to different weight matrix network models.
  • for the luminance component, the weight matrix network model may be a luminance weight matrix network model related to the luminance component; for the chrominance component, the weight matrix network model may be a chrominance weight matrix network model related to the chrominance component.
  • at least one candidate luminance weight matrix network model and at least one candidate chrominance weight matrix network model can be trained in advance. After the quantization parameter of the current block is determined, the candidate luminance weight matrix network model corresponding to the quantization parameter can be selected from the at least one candidate luminance weight matrix network model, that is, the luminance weight matrix network model of the current block; or the candidate chrominance weight matrix network model corresponding to the quantization parameter is selected from the at least one candidate chrominance weight matrix network model, that is, the chrominance weight matrix network model of the current block.
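  • the selection of a pre-trained candidate model by color component and quantization parameter can be sketched as follows; the QP intervals reuse the four training intervals mentioned earlier, and the model registry layout is an assumption for illustration.

```python
# Illustrative selection of a candidate weight matrix network model by color
# component and quantization parameter; the registry layout is an assumption.
QP_INTERVALS = [(27, 31), (32, 37), (38, 44), (45, 50)]

def select_weight_matrix_model(models: dict, component: str, qp: int):
    """models maps (component, interval_index) to a trained candidate model."""
    for i, (lo, hi) in enumerate(QP_INTERVALS):
        if lo <= qp <= hi:
            return models[(component, i)]
    raise ValueError(f"no candidate weight matrix network model for QP {qp}")

# usage: luma_model = select_weight_matrix_model(models, "luma", qp=32)
```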
  • the method may further include:
  • the training sample is obtained according to at least one quantization parameter
  • At least one candidate luminance weight matrix network model has a correspondence relationship with luminance components and quantization parameters
  • at least one candidate chrominance weight matrix network model has a correspondence relationship with chrominance components and quantization parameters.
  • the preset neural network model may include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
  • the preset neural network model can select a multi-layer convolutional neural network, and then use the training samples to perform deep learning to obtain a weight matrix network model, such as a luminance weight matrix network model or a chrominance weight matrix network model.
  • deep learning is a type of machine learning, and machine learning is a necessary path to realizing artificial intelligence.
  • the concept of deep learning originates from the study of artificial neural networks, and a multilayer perceptron with multiple hidden layers is a deep learning structure.
  • Deep learning can form more abstract high-level representation attribute categories or features by combining low-level features to discover distributed feature representations of data.
  • CNN Convolutional Neural Networks
  • a Convolutional Neural Network (CNN) is a kind of Feedforward Neural Network that involves convolution computation and has a deep structure, and is one of the representative algorithms of deep learning.
  • the preset neural network model here can be a convolutional neural network structure.
  • the weight matrix network model can also be regarded as being composed of a multi-layer convolutional neural network.
  • the weight matrix network model may also include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
  • the weight matrix can be determined accordingly.
  • the determining the weight matrix of the current block according to the weight matrix network model may include:
  • the neural network loop filter specifically refers to the aforementioned CNNLF. After determining the input reconstructed image block and output reconstructed image block of CNNLF, the output reconstructed image block is used as the input of the weight matrix network model, and the output of the weight matrix network model is the weight matrix of the current block.
  • using the weight matrix to determine the target reconstructed image block of the current block may include: using the weight matrix to perform weighting processing on the input reconstructed image block and the output reconstructed image block to obtain the target reconstructed image block.
  • the output of the neural network loop filter can be made closer to the original image.
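  • the exact combination rule of the weighting processing is not spelled out in this excerpt; the following sketch assumes an element-wise convex combination, target = W * output + (1 - W) * input, purely for illustration.

```python
# Illustrative pixel-level weighting of the CNNLF input and output
# reconstructed image blocks. The convex combination below is an assumption:
#   target = W * cnnlf_output + (1 - W) * cnnlf_input
import numpy as np

def weighted_reconstruction(cnnlf_input: np.ndarray,
                            cnnlf_output: np.ndarray,
                            weight_matrix: np.ndarray) -> np.ndarray:
    # the weight matrix holds one weight per pixel and is produced by the
    # weight matrix network model from the CNNLF output reconstructed block
    w = np.clip(weight_matrix, 0.0, 1.0)
    return w * cnnlf_output + (1.0 - w) * cnnlf_input
```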
  • the weight matrix includes a luminance weight matrix and a chrominance weight matrix; in this way, the target reconstructed image block may also include a target reconstructed image block of the luminance component and a target reconstructed image block of the chrominance component.
  • the target reconstructed image block may include:
  • the luminance weight matrix is determined by using the luminance weight matrix network model
  • the chrominance weight matrix is determined by using the chrominance weight matrix network model.
  • the determining the luminance weight matrix by using the luminance weight matrix network model may include:
  • using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the input luminance reconstructed image block and the output luminance reconstructed image block are weighted by the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  • the determining the chrominance weight matrix by using the chrominance weight matrix network model may include:
  • using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the input chrominance reconstructed image block and the output chrominance reconstructed image block are weighted by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
  • for the input reconstructed image block, depending on the color component type, it may refer to the input luminance reconstructed image block or the input chrominance reconstructed image block; for the output reconstructed image block, depending on the color component type, it may refer to the output luminance reconstructed image block or the output chrominance reconstructed image block.
  • that is, the target reconstructed image block of the luminance component can be obtained by weighted calculation according to the input luminance reconstructed image block and the output luminance reconstructed image block; and the target reconstructed image block of the chrominance component can be obtained by weighted calculation according to the input chrominance reconstructed image block and the output chrominance reconstructed image block.
  • the input reconstructed image block may be obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  • the method may further include: after the target reconstructed image block of the current block is determined, using an adaptive correction filter to continue filtering the target reconstructed image block.
  • if the current block uses the weight matrix for filtering, the input of the adaptive correction filter is the target reconstructed image block; if the current block does not use the weight matrix for filtering, then there is no need to weight the output of the neural network loop filter at this time, that is, the input of the adaptive correction filter is the output reconstructed image block.
  • FIG. 8 shows a schematic diagram of the composition and structure of a weight matrix network model provided by an embodiment of the present application.
  • the weight matrix network model is set in the weight matrix module.
  • the weight matrix network model can be composed of multi-layer convolutional neural networks, and its network structure can be composed of K convolution layers, J activation layers, and L skip connection layers; wherein, K, J, and L are all integers greater than or equal to 1.
  • the network structure shown in Figure 8 includes five convolutional layers. Except for the third convolutional layer, each convolutional layer is followed by an activation layer.
  • the network structure also includes a skip connection layer that connects the second convolutional layer to the output of the third convolutional layer.
  • the activation layer can be a linear activation function or a nonlinear activation function, etc.
  • the input of the network structure is the output reconstructed image block of CNNLF (which can be the output luminance reconstructed image block or the output chrominance reconstructed image block), and the output of the network structure is a weight matrix, which is used to weight the output reconstructed image block of CNNLF and the input reconstructed image block of CNNLF, as shown in Figure 9, which shows an example of the overall framework of a weight matrix network model.
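  • the weight matrix network model described above can be sketched as follows in a PyTorch style: five convolutional layers, each followed by an activation except the third, and a skip connection that adds the second convolution's output to the third convolution's output; the kernel sizes, channel widths, and the final sigmoid (used here to keep the weights in [0, 1]) are assumptions for illustration.

```python
# Illustrative weight matrix network model: five convolutional layers, each
# followed by an activation except the third, with a skip connection adding
# the second convolution's output to the third convolution's output.
# Kernel sizes, channel widths, and the final sigmoid are assumptions.
import torch
import torch.nn as nn

class WeightMatrixNet(nn.Module):
    def __init__(self, in_channels: int = 1, channels: int = 32):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)  # no activation after this layer
        self.conv4 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, in_channels, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, cnnlf_output: torch.Tensor) -> torch.Tensor:
        x1 = self.act(self.conv1(cnnlf_output))
        x2 = self.act(self.conv2(x1))
        x3 = self.conv3(x2) + x2              # skip connection into the third layer's output
        x4 = self.act(self.conv4(x3))
        return torch.sigmoid(self.conv5(x4))  # per-pixel weight matrix
```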
  • the input reconstructed image block can be obtained after filtering through a deblocking filter (DBF) and a sample adaptive compensation filter (SAO), and the weighted target reconstructed image block is then sent to the adaptive correction filter (ALF) for further filtering.
  • DBF deblocking filter
  • SAO sample adaptive compensation filter
  • ALF adaptive correction filter
  • besides the input reconstructed image block, multiple hidden-layer feature image blocks of CNNLF can also be extracted and used in the weighting process; in addition, the output reconstructed image block of CNNLF can also be replaced by the output image block of another efficient neural network filter, which is not limited in this embodiment of the present application.
  • the decoder obtains and parses the code stream, and when it is parsed into the loop filter module, it is processed according to the preset filter order.
  • the preset filter sequence is DBF filtering -> SAO filtering -> CNNLF filtering -> weight matrix module -> ALF filtering.
  • for the chroma component, whether the current frame is processed by the weight matrix module is determined according to the decoded chroma_frame_weighting_matrix_flag. If chroma_frame_weighting_matrix_flag is "0", skip to (c); otherwise, when chroma_frame_weighting_matrix_flag is "1", the weight matrix module is used for the current frame;
  • the output reconstructed image block of CNNLF is used as the input of the weight matrix network model, and the output of the network model is the weight matrix.
  • the output reconstructed image block of CNNLF and the input reconstructed image block of CNNLF are weighted according to the weight matrix to obtain the final output reconstructed image block (ie, the target reconstructed image block described in the foregoing embodiment).
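  • the in-loop processing order described above can be sketched as follows for one block at the decoder; the filter objects and flag fields are hypothetical placeholders, and the weighting step reuses the convex combination assumed earlier.

```python
# Illustrative in-loop filter order at the decoder for one reconstructed
# block: DBF -> SAO -> CNNLF -> weight matrix module -> ALF. Filter objects
# and flag fields are hypothetical placeholders.
def loop_filter_block(block, filters, flags, weight_model):
    x = filters.dbf(block)
    x = filters.sao(x)            # x is the "input reconstructed image block"
    if flags.cnnlf_enabled:
        y = filters.cnnlf(x)      # y is the "output reconstructed image block"
        if flags.weight_matrix_enabled:
            w = weight_model(y)   # weight matrix predicted from the CNNLF output
            x = w * y + (1 - w) * x   # assumed combination rule
        else:
            x = y
    return filters.alf(x)         # ALF continues filtering the result
```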
  • the frame-level flag of the luminance component of the neural network-based weighting matrix is luma_frame_weighting_matrix_flag;
  • the CTU-level flag of the luminance component of the neural network-based weighting matrix is luma_ctu_weighting_matrix_flag;
  • the frame-level flag of the chrominance component of the neural network-based weighting matrix is chroma_frame_weighting_matrix_flag.
  • by inputting the reconstructed image block output by the CNNLF of HPM-ModAI into the weight matrix network model (a multi-layer convolutional neural network), the feature information is extracted and the weight matrix is output, so that the output reconstructed image block of CNNLF and the input reconstructed image block of CNNLF can be weighted at the pixel level.
  • the test sequences required by AVS3 are tested under the All Intra configuration of the intelligent-coding common test conditions, and the average BD-rate changes on the Y, U, and V components are -0.36%, -1.26%, and -0.38%, respectively, as shown in Table 3; the test sequences required by AVS3 are also tested under the Random Access configuration of the intelligent-coding common test conditions, and the average BD-rate changes on the Y, U, and V components are -0.16%, -1.04%, and -0.79%, respectively, as shown in Table 4.
  • Tables 3 and 4 can illustrate that the method of the embodiment of the present application improves the coding performance.
  • the embodiment of the present application can bring a considerable performance gain to the existing AVS3 intelligent coding reference software HPM-ModAI.
  • This embodiment provides a decoding method, and specifically provides a loop filtering method, which is applied to a decoder.
  • by parsing the code stream, the value of at least one syntax element identification information is determined; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; using the weight matrix, the target reconstructed image block of the current block is determined.
  • the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
  • FIG. 10 shows a schematic flowchart of an encoding method provided by an embodiment of the present application. As shown in Figure 10, the method may include:
  • S1001 Determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model.
  • each image block to be encoded currently may be called an encoding block.
  • each coding block may include a first image component, a second image component, and a third image component; and the current block is the coding block in the video image on which loop filtering of the first image component, the second image component, or the third image component is currently to be performed.
  • this embodiment of the present application may divide them into two types of color components, such as luminance component and chrominance component.
  • if the current block performs operations such as prediction, transformation and quantization, and loop filtering on the luma component, the current block can also be called a luma block; or, if the current block performs operations such as prediction, transformation and quantization, and loop filtering on the chroma component, the current block can also be called a chroma block.
  • an embodiment of the present application specifically provides a loop filtering method, and the loop filtering method is applied to the filtering unit 108 as shown in FIG. 3A .
  • the filtering unit 108 may also include a deblocking filter (DBF), a sample adaptive compensation filter (SAO), a residual neural network-based loop filter (CNNLF), and an adaptive correction filter (ALF), and the loop filtering method described in the embodiment of the present application is specifically applied between CNNLF and ALF, so that each pixel value in the current block after being filtered by CNNLF can be optimally selected.
  • DBF deblocking filter
  • SAO sample adaptive compensation filter
  • CNNLF residual neural network-based loop filter
  • ALF adaptive correction filter
  • the loop filtering method mainly introduces a weight matrix network model
  • the weight matrix network model may be a multi-layer convolutional neural network.
  • for different color component types, the weight matrix network model here is also different.
  • the determining the weight matrix network model of the current block may include:
  • the color component type of the current block is a luminance component, determine the luminance weight matrix network model of the current block;
  • the chrominance weight matrix network model of the current block is determined.
  • the weight matrix network model is related to the color component type.
  • the weight matrix network model corresponding to the luminance component may be referred to as a luminance weight matrix network model
  • the weight matrix network model corresponding to the chrominance component may be referred to as a chrominance weight matrix network model.
  • the following will take the two types of color components, the luminance component and the chrominance component, as examples to describe respectively.
  • the determining the luminance weight matrix network model of the current block may include:
  • the color component type of the current block is a luminance component, determining at least one candidate luminance weight matrix network model;
  • the selected candidate luminance weight matrix network model is determined as the luminance weight matrix network model of the current block.
  • the determining the chrominance weight matrix network model of the current block may include:
  • the color component type of the current block is a chrominance component, determining at least one candidate chrominance weight matrix network model;
  • the selected candidate chroma weight matrix network model is determined as the chroma weight matrix network model of the current block.
  • the weight matrix network model of the current block has an associated relationship with the quantization parameter and the color component type.
  • different color component types correspond to different weight matrix network models.
  • the weight matrix network model may be a luminance weight matrix network model related to the luminance component;
• the weight matrix network model may be a chroma weight matrix network model associated with the chroma component.
• at least one candidate luminance weight matrix network model and at least one candidate chrominance weight matrix network model can be trained in advance.
  • the candidate luminance weight matrix network model corresponding to the quantization parameter can be selected from at least one candidate luminance weight matrix network model, that is, the luminance weight matrix network model of the current block;
  • the candidate chrominance weight matrix network model corresponding to the quantization parameter is selected from the at least one candidate chrominance weight matrix network model, that is, the chrominance weight matrix network model of the current block.
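• As an illustration of this QP-dependent model selection, the following Python sketch picks a candidate weight matrix network model according to the color component type and the quantization parameter; the QP intervals, model file names and the lookup helper are assumptions for illustration only and are not defined by this application.
```python
# Sketch: selecting a candidate weight matrix network model by color component
# type and quantization parameter (QP). The QP intervals and model names below
# are illustrative assumptions, not values defined by this application.

CANDIDATE_MODELS = {
    "luma":   {(0, 31): "luma_wm_low_qp.pth",   (32, 44): "luma_wm_mid_qp.pth",   (45, 63): "luma_wm_high_qp.pth"},
    "chroma": {(0, 31): "chroma_wm_low_qp.pth", (32, 44): "chroma_wm_mid_qp.pth", (45, 63): "chroma_wm_high_qp.pth"},
}

def select_weight_matrix_model(color_component: str, qp: int) -> str:
    """Return the candidate model associated with the QP of the current block."""
    for (qp_low, qp_high), model_name in CANDIDATE_MODELS[color_component].items():
        if qp_low <= qp <= qp_high:
            return model_name
    raise ValueError(f"no candidate weight matrix model covers QP {qp}")

# Example: a luma block coded with QP 38 selects the mid-range luma candidate.
print(select_weight_matrix_model("luma", 38))   # -> luma_wm_mid_qp.pth
```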
  • the method may further include:
  • the training sample is obtained according to at least one quantization parameter
  • the at least one candidate luminance weight matrix network model has a corresponding relationship with the luminance component and the quantization parameter
  • the at least one candidate chrominance weight matrix network model has a corresponding relationship with the chrominance component and the quantization parameter
  • the preset neural network model may include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
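• A minimal sketch of such a preset neural network model is given below (PyTorch); the number of layers, the channel width and the sigmoid used to keep the weights in [0, 1] are assumptions for illustration, since the embodiment only requires at least one convolution layer, at least one activation layer and a jump (skip) connection layer.
```python
import torch
import torch.nn as nn

class WeightMatrixNet(nn.Module):
    """Illustrative multi-layer CNN producing a per-pixel weight matrix.

    Assumed structure: 3x3 convolutions with PReLU activations, a jump (skip)
    connection, and a sigmoid so that every weight lies in [0, 1].
    """

    def __init__(self, channels: int = 1, features: int = 32, num_layers: int = 4):
        super().__init__()
        self.head = nn.Conv2d(channels, features, kernel_size=3, padding=1)
        body = []
        for _ in range(num_layers):
            body += [nn.Conv2d(features, features, kernel_size=3, padding=1), nn.PReLU()]
        self.body = nn.Sequential(*body)
        self.tail = nn.Conv2d(features, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.head(x)
        feat = feat + self.body(feat)          # jump (skip) connection
        return torch.sigmoid(self.tail(feat))  # weight matrix W, same size as x
```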
  • the preset neural network model can select a multi-layer convolutional neural network, and then use the training samples to perform deep learning to obtain a weight matrix network model, such as a luminance weight matrix network model or a chrominance weight matrix network model.
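• The deep learning step can be sketched as follows, assuming an MSE loss against the original image block and the weighting of formula (3) described later; the data loader layout and the hyper-parameters are illustrative assumptions, not part of this application.
```python
import torch
import torch.nn.functional as F

def train_weight_matrix_model(model, loader, epochs: int = 10, lr: float = 1e-4):
    """Illustrative training: drive W*I_net + (1-W)*I_rec towards the original block.

    `model` is any weight matrix network (e.g. the WeightMatrixNet sketch above).
    `loader` is assumed to yield (i_rec, i_net, original) tensors, i.e. the CNNLF
    input block, the CNNLF output block and the original image block, collected
    for one quantization parameter and one color component.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for i_rec, i_net, original in loader:
            w = model(i_net)                         # predicted weight matrix
            i_out = w * i_net + (1.0 - w) * i_rec    # weighted reconstruction
            loss = F.mse_loss(i_out, original)       # distortion towards the original
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```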
  • the weight matrix network model can also be regarded as composed of multi-layer convolutional neural networks.
  • the weight matrix network model may also include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
  • the weight matrix can be determined accordingly.
  • the determining the weight matrix of the current block according to the weight matrix network model may include:
  • the neural network loop filter specifically refers to the aforementioned CNNLF. After determining the input reconstructed image block and output reconstructed image block of CNNLF, the output reconstructed image block is used as the input of the weight matrix network model, and the output of the weight matrix network model is the weight matrix of the current block.
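• In other words, at inference time only the CNNLF output block passes through the weight matrix network model; a minimal, self-contained sketch of this data flow (using a trivial stand-in model) is:
```python
import torch
import torch.nn as nn

# Stand-in weight matrix network model (single conv + sigmoid) used only to show
# the data flow; the actual model is the multi-layer CNN described above.
weight_model = nn.Sequential(nn.Conv2d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())

i_net = torch.rand(1, 1, 64, 64)      # output reconstructed image block of CNNLF
with torch.no_grad():
    w = weight_model(i_net)           # weight matrix of the current block, values in [0, 1]
```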
  • the weight matrix includes a luminance weight matrix and a chrominance weight matrix.
  • the determining the weight matrix of the current block according to the weight matrix network model may include:
  • the luminance weight matrix is determined by using the luminance weight matrix network model
  • the chrominance weight matrix is determined by using the chrominance weight matrix network model.
  • the determining the luminance weight matrix by using the luminance weight matrix network model may include:
  • the determining the chrominance weight matrix by using the chrominance weight matrix network model may include:
• for the input reconstructed image block, depending on the color component type, it may refer to the input luminance reconstructed image block or the input chrominance reconstructed image block; for the output reconstructed image block, depending on the color component type, it may refer to the output luminance reconstructed image block or the output chrominance reconstructed image block.
  • the luminance weight matrix can be obtained according to the luminance weight matrix network model and the output luminance reconstructed image block
  • the chrominance weight matrix can be obtained according to the chrominance weight matrix network model and the output chrominance reconstructed image block.
  • S1002 Determine the value of at least one syntax element identification information.
• there is no fixed order between step S1001 and step S1002: step S1001 may be executed first and then step S1002; or step S1002 may be executed first and then step S1001; or the two may even be executed in parallel.
  • whether to use the weight matrix for filtering processing may be indicated by at least one syntax element identification information.
• after determining the value of the at least one syntax element identification information, the encoder writes the value of the at least one syntax element identification information into the code stream for transmission to the decoder, so that the decoder can know whether to use the weight matrix for filtering processing by parsing the code stream.
  • the determining the value of at least one syntax element identification information may include:
  • the value of the first syntax element identification information is the first value
  • the value of the first syntax element identification information is the second value.
  • the method further includes: encoding the value of the identification information of the first syntax element, and writing the encoded bits into the code stream.
  • first syntax element identification information may be set to indicate whether the current video sequence uses the loop filtering method in this embodiment of the present application (ie, whether the weight matrix module is enabled).
• if the weight matrix module of the video sequence is enabled, that is, it is determined that the video sequence is filtered using the weight matrix, then the value of the first syntax element identification information may be the first value; if the weight matrix module of the video sequence is disabled, that is, it is determined that the video sequence does not use the weight matrix for filtering processing, then the value of the first syntax element identification information may be the second value.
  • first value and the second value are different.
• the first value can be set to 1 and the second value to 0; alternatively, the first value can be set to true and the second value to false; alternatively, the first value can be set to 0 and the second value to 1; alternatively, the first value can be set to false and the second value to true.
  • the first value may be 1, and the second value may be 0, but no limitation is imposed.
  • the video sequence includes at least one frame, and the at least one frame may include the current frame.
• the embodiment of the present application needs to further determine whether the current frame in the video sequence uses the weight matrix for filtering processing; for this purpose, it is also necessary to set a second syntax element identification information.
  • the meanings represented by the second syntax element identification information are different according to different types of color components.
• the color component types may include a luminance component and a chrominance component. If the color component type is the luminance component, the second syntax element identification information is here assumed to be the first luminance syntax element identification information, used to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix; if the color component type is the chrominance component, the second syntax element identification information is here assumed to be the chroma syntax element identification information, used to indicate whether the chroma component of the current frame is filtered using the chroma weight matrix.
  • the weight matrix is determined to be the luminance weight matrix.
  • the determining the value of the identification information of at least one syntax element may further include:
• if the first rate-distortion cost value is less than the second rate-distortion cost value, determine that the value of the first luma syntax element identification information is the first value; or,
• if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, determine that the value of the first luma syntax element identification information is the second value.
  • the method further includes: encoding the value of the identification information of the first luma syntax element, and writing the encoded bits into the code stream.
  • the determining the value of the at least one syntax element identification information may further include:
  • the third rate-distortion cost value is less than the fourth rate-distortion cost value, determine that the value of the identification information of the second luma syntax element is the first value; or,
• if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, the value of the second luma syntax element identification information is determined to be the second value.
  • the method further includes: encoding the value of the identification information of the second luma syntax element, and writing the encoded bits into the code stream.
  • the frame-level syntax element may be referred to as the first luma syntax element identification information
  • the CTU-level syntax element may be referred to as the second luma syntax element identification information.
  • the first luma syntax element identification information and the second luma syntax element identification information are flag information
  • the first luma syntax element identification information can be represented by luma_frame_weighting_matrix_flag
  • the second luma syntax element identification information can be represented by luma_ctu_weighting_matrix_flag.
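• To illustrate the hierarchy of these flags, the following Python sketch checks them in order; the sequence-level name sequence_weighting_matrix_enable_flag is a placeholder invented here for illustration (this excerpt does not give a name for the first syntax element identification information), while the frame-level and CTU-level names follow the embodiment.
```python
def luma_weighting_enabled(seq_flags: dict, frame_flags: dict, ctu_flags: dict, ctu_idx: int) -> bool:
    """Hierarchical check: sequence flag -> frame-level flag -> CTU-level flag."""
    if not seq_flags.get("sequence_weighting_matrix_enable_flag", 0):
        return False                                   # weight matrix module disabled for the sequence
    if not frame_flags.get("luma_frame_weighting_matrix_flag", 0):
        return False                                   # luma of the current frame not weighted
    return bool(ctu_flags.get("luma_ctu_weighting_matrix_flag", {}).get(ctu_idx, 0))

# Example: sequence and frame enabled, CTU 3 enabled -> True
print(luma_weighting_enabled(
    {"sequence_weighting_matrix_enable_flag": 1},
    {"luma_frame_weighting_matrix_flag": 1},
    {"luma_ctu_weighting_matrix_flag": {3: 1}},
    ctu_idx=3))
```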
• both the value of the first luma syntax element identification information and the value of the second luma syntax element identification information can be determined by utilizing the rate-distortion cost.
  • the method may further include: performing block division on the current frame, and determining at least one divided block; wherein, the at least one divided block includes the current block;
  • determining the first rate-distortion cost value for filtering the luminance component of the current frame in the video sequence using the luminance weight matrix may include:
  • the calculating the second rate-distortion cost value of the luminance component of the current frame in the video sequence without using the luminance weight matrix for filtering processing may include:
  • the third rate-distortion cost value for filtering the luminance component of each block using the luminance weight matrix can be calculated, and then the first rate-distortion cost value of the current frame can be obtained by cumulative calculation.
  • the distortion value may be determined according to the mean square error.
• the target reconstructed image block of the luminance component of each block can be obtained; then the mean square error between the target reconstructed image block and the original image block is calculated as the distortion value.
  • the fourth rate-distortion cost value of the luminance component of each block that is not filtered using the luminance weight matrix can also be calculated, and then the second rate-distortion cost value of the current frame can be obtained by cumulative calculation.
  • the distortion value can also be determined according to the mean square error; the mean square error at this time refers to the mean square error value between the output reconstructed image block and the original image block without going through the weight matrix module , and other calculation operations are the same as calculating the first rate-distortion cost value, and will not be described in detail here.
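• A hedged sketch of this distortion bookkeeping is shown below; it only covers the mean-square-error accumulation described above, and the tuple layout of the per-block data is an assumption for illustration (the rate term of the rate-distortion cost is not modelled here).
```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean square error used as the distortion value."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def frame_costs(blocks):
    """Accumulate per-block distortions into the two frame-level cost values.

    `blocks` is assumed to yield (original, i_net, i_out) per divided block, where
    i_net is the CNNLF output block and i_out the weighted target block.
    """
    cost_with, cost_without = 0.0, 0.0
    for original, i_net, i_out in blocks:
        cost_with += mse(i_out, original)      # distortion part of the "with weight matrix" cost
        cost_without += mse(i_net, original)   # distortion part of the "without weight matrix" cost
    return cost_with, cost_without             # -> first / second cost values of the current frame
```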
• after obtaining the first rate-distortion cost value and the second rate-distortion cost value, the two can be compared; if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, it can be determined that the value of the first luminance syntax element identification information is 0, which means that the luminance component of the current frame does not need to use the luminance weight matrix for filtering processing;
  • the next frame can be acquired from the video sequence, the next frame is determined as the current frame, and the calculation of the first rate-distortion cost value and the second rate-distortion cost value is continued.
• if the first rate-distortion cost value is less than the second rate-distortion cost value, it can be determined that the value of the first luminance syntax element identification information is 1, which means that the luminance component of the current frame needs to be filtered using the luminance weight matrix; at this time, it is necessary to continue to judge whether the luminance component of the current block in the current frame is filtered using the luminance weight matrix, that is, to compare the third rate-distortion cost value with the fourth rate-distortion cost value. If the third rate-distortion cost value is less than the fourth rate-distortion cost value, it can be determined that the value of the second luma syntax element identification information is 1, which means that the luma component of the current block needs to be filtered using the luma weight matrix; otherwise, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, it can be determined that the value of the second luma syntax element identification information is 0, which means that the luma component of the current block does not need to be filtered using the luma weight matrix.
  • the embodiments of the present application can also set the luminance frame level flag bit and the luminance CTU level flag bit, and determine whether to use the luminance weight matrix for filtering by controlling whether to open the weight matrix module.
  • the method may further include: setting a luminance frame level flag, where the luminance frame level flag is used to control whether the luminance component of the current frame is filtered using the luminance weight matrix;
  • the method may also include:
  • first rate-distortion cost value is less than the second rate-distortion cost value, turn on the luminance frame level flag; or,
• or, if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, the luminance frame level flag is turned off.
  • the method may further include: setting a luminance CTU level flag bit, where the luminance CTU level flag bit is used to control whether the luminance component of the current block is filtered using the luminance weight matrix;
  • the method may also include:
• if the third rate-distortion cost value is less than the fourth rate-distortion cost value, turn on the luminance CTU level flag; or, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, turn off the luminance CTU level flag.
  • whether it is the luminance frame level flag or the luminance CTU level flag, whether to enable or not can also be determined according to the rate-distortion cost method.
  • it can be determined according to the magnitude of the calculated rate-distortion cost value.
• where D represents the distortion value reduced after the current frame is processed by the weight matrix module, D = D_out − D_rec (D_out is the distortion after the weight matrix module processing, D_rec is the distortion before the weight matrix module processing); R is the number of blocks (CTUs) included in the current frame; and λ is consistent with the λ of the adaptive correction filter.
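• Equation (4) is not reproduced in this excerpt; under the assumption that the flag is turned on when the distortion change D = D_out − D_rec plus the signalling cost λ·R is negative, a sketch of the frame-level decision could look as follows (the decision rule itself is an assumption, only the variables D_out, D_rec, R and λ come from the text above).
```python
def decide_frame_level_flag(d_out: float, d_rec: float, num_ctu: int, lam: float) -> int:
    """Illustrative frame-level decision (the exact equation (4) is not shown here).

    d_out: distortion after the weight matrix module processing (D_out)
    d_rec: distortion before the weight matrix module processing (D_rec)
    num_ctu: number of CTUs in the current frame (R)
    lam: Lagrange multiplier, consistent with the lambda of the adaptive correction filter

    Assumed rule: turn the flag on when D = D_out - D_rec plus the signalling
    cost lambda * R is negative, i.e. the module pays for its own flags.
    """
    d = d_out - d_rec
    return 1 if d + lam * num_ctu < 0 else 0

# Example: the module reduces distortion by 120 while signalling 64 CTU flags costs 64 * 1.2.
print(decide_frame_level_flag(d_out=880.0, d_rec=1000.0, num_ctu=64, lam=1.2))  # -> 1
```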
  • the weight matrix is determined to be a chrominance weight matrix.
  • the determining the value of the identification information of at least one syntax element may further include:
  • the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determine that the value of the chroma syntax element identification information is the first value; or,
• if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, the value of the chroma syntax element identification information is determined to be the second value.
  • the method may further include: encoding the value of the identification information of the chroma syntax element, and writing the encoded bits into the code stream.
  • the frame-level syntax element may be referred to as chroma syntax element identification information. Assuming that the chroma syntax element identification information is flag information, it may be represented by chroma_frame_weighting_matrix_flag.
• the first value can be set to 1 and the second value to 0; alternatively, the first value can be set to true and the second value to false; alternatively, the first value can be set to 0 and the second value to 1; alternatively, the first value can be set to false and the second value to true.
  • the first value may be 1, and the second value may be 0, but it is not limited in any way.
• if the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix, then the blocks included in the current frame all use the chroma weight matrix for filtering by default; if the chroma syntax element identification information indicates that the chroma components of the current frame do not use the chroma weight matrix for filtering, then the blocks included in the current frame do not use the chroma weight matrix for filtering by default. Therefore, it is no longer necessary to set a CTU-level syntax element for the chroma components, and similarly, no CTU-level flag bit needs to be set.
  • the method may further include: setting a chroma frame-level flag bit, where the chroma frame-level flag bit is used to control whether the chroma components of the current frame are filtered using a chroma weight matrix;
  • the method may also include:
• if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, turn on the chroma frame-level flag; or, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, turn off the chroma frame-level flag.
• the distortion value can also be determined according to the mean square error, and other calculation operations are the same as those for calculating the first rate-distortion cost value and the second rate-distortion cost value, and will not be described in detail here.
  • whether the chroma frame-level flag is turned on is the same as the implementation of determining whether the luminance frame-level flag is turned on, and will not be described in detail here.
• after obtaining the fifth rate-distortion cost value and the sixth rate-distortion cost value, the two can be compared; if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, the chroma frame-level flag bit can be turned on, and it can also be determined that the value of the chroma syntax element identification information is 1, which means that the chrominance component of the current frame needs to be filtered using the chrominance weight matrix; after the current frame is processed, the next frame continues to be loaded for processing.
• otherwise, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, the chroma frame-level flag bit can be turned off, and it can also be determined that the value of the chroma syntax element identification information is 0, which means that the chrominance component of the current frame does not need to use the chrominance weight matrix for filtering processing.
• at this time, the next frame can be obtained from the video sequence, the next frame can be determined as the current frame, and the next frame can be loaded for processing to determine the value of the syntax element identification information for the next frame.
• the color component types may include luminance components and chrominance components. If the current frame uses the weight matrix for filtering, it does not mean that every block in the current frame uses the weight matrix for filtering; CTU-level syntax element identification information may also be involved to further determine whether the current block uses the weight matrix for filtering processing. The following will take the two types of color components, the luminance component and the chrominance component, as examples to describe respectively.
  • using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the third rate-distortion cost value is less than the fourth rate-distortion cost value, use the luminance weight matrix to determine the target reconstructed image block of the current block.
• using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the chrominance weight matrix is used to determine the target reconstructed image block of the current block.
• the luminance weight matrix can be used to determine the target reconstructed image block of the current block (specifically, the target reconstructed image block of the luminance component of the current block).
• for the chroma components, only one syntax element may be required: a frame-level syntax element.
• when the frame-level syntax element (i.e., the chroma syntax element identification information) indicates that the chroma component of the current frame uses the weight matrix for filtering processing, the weight matrix may also be used to determine the target reconstructed image block of the current block (specifically, the target reconstructed image block of the chrominance component of the current block).
  • the determining the target reconstructed image block of the current block by using the weight matrix may include:
  • the input reconstructed image block and the output reconstructed image block are weighted by the weight matrix to obtain the target reconstructed image block.
  • the input and output of CNNLF are weighted by the weight matrix, which can make the output of the neural network loop filter closer to the original image.
  • the weight matrix includes a luminance weight matrix and a chrominance weight matrix; in this way, for the target reconstructed image block, the target reconstructed image block of the luminance component and the chrominance component can also be included.
• the determining of the target reconstructed image block may include:
  • using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the input luminance reconstructed image block and the output luminance reconstructed image block are weighted by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  • the determining the chrominance weight matrix using the chrominance weight matrix network model may include:
  • using the weight matrix to determine the target reconstructed image block of the current block may include:
  • the input chrominance reconstructed image block and the output chrominance reconstructed image block are weighted by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
• for the input reconstructed image block, depending on the color component type, it may refer to the input luminance reconstructed image block or the input chrominance reconstructed image block; for the output reconstructed image block, depending on the color component type, it may refer to the output luminance reconstructed image block or the output chrominance reconstructed image block.
• that is, the target reconstructed image block of the luminance component can be obtained by weighted calculation according to the input luminance reconstructed image block and the output luminance reconstructed image block; similarly, the target reconstructed image block of the chrominance component can be obtained by weighted calculation according to the input chrominance reconstructed image block and the output chrominance reconstructed image block.
• the input reconstructed image block may be obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  • the method may further include: after the target reconstructed image block of the current block is determined, using an adaptive correction filter to continue filtering the target reconstructed image block.
• that is, if the current block uses the weight matrix for filtering, the input of the adaptive correction filter is the target reconstructed image block; if the current block does not use the weight matrix for filtering, the output of CNNLF does not need to be weighted, that is, the input of the adaptive correction filter is the output reconstructed image block.
  • the embodiment of the present application proposes a weight matrix based on deep learning, which is used to weight the output of the neural network loop filter, so that the output of the neural network loop filter is closer to the original image.
• the network structure corresponding to the weight matrix network model can be selected as a multi-layer convolutional neural network; the input is the output reconstructed image block of the neural network loop filter, and the output is a new reconstructed image block after weighting processing (that is, the "target reconstructed image block" in the foregoing embodiments).
  • the position of the weight matrix module in the encoder/decoder is shown in Figure 7.
• the Weighting Matrix in the figure is the weight matrix module. The use of the weight matrix module does not depend on the flag bits of DBF, SAO, ALF and CNNLF; in terms of position, it is only placed after CNNLF and before ALF.
  • the encoder determines whether the weight matrix based on deep learning is selected.
• the input reconstructed image block of CNNLF can be marked as I_rec, and the output reconstructed image block of CNNLF can be marked as I_net; I_net is used as the input of the weight matrix network model, whose output is the weight matrix W, and the final output I_out can be obtained by formula (3), as shown below:
• I_out = W * I_net + (1 − W) * I_rec    (3)
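• Formula (3) corresponds to the following element-wise operation (shown here with NumPy arrays purely for illustration):
```python
import numpy as np

def weighted_reconstruction(i_rec: np.ndarray, i_net: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Formula (3): I_out = W * I_net + (1 - W) * I_rec, applied pixel by pixel.

    i_rec: input reconstructed image block of CNNLF
    i_net: output reconstructed image block of CNNLF
    w:     weight matrix produced by the weight matrix network model, values in [0, 1]
    """
    return w * i_net + (1.0 - w) * i_rec

# Example: W = 1 keeps the CNNLF output, W = 0 falls back to the CNNLF input.
i_rec = np.full((4, 4), 100.0)
i_net = np.full((4, 4), 110.0)
print(weighted_reconstruction(i_rec, i_net, np.full((4, 4), 0.25))[0, 0])  # -> 102.5
```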
  • a frame-level flag bit and a CTU-level flag bit are set for the luminance component to control whether to turn on, and a frame-level flag bit is set for the chrominance component to control whether to turn on.
• where D_out is the distortion after the weight matrix module processing, D_rec is the distortion before the weight matrix module processing, R is the number of CTUs in the current frame, and λ is consistent with the λ of the adaptive correction filter.
• when the frame-level flag is turned on, it can be further decided through the rate-distortion cost method whether to turn on the weight matrix module for each CTU; specifically, the CTU-level flag bit can be determined by equation (5).
  • the flag bit status of the weight matrix module will be encoded into the code stream for the decoder to read.
  • the input and output of CNNLF will be weighted according to formula (3).
• when the encoder enters the loop filter module, processing is performed according to the preset filter order.
  • the preset filter sequence is DBF filtering---->SAO filtering---->CNNLF filtering---->weight matrix module---->ALF filtering.
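• The preset order can be illustrated as a simple chain of calls; the callables below (dbf, sao, cnnlf, weight_model, alf) are placeholders for the corresponding filter stages and are not part of this application.
```python
def loop_filter(block, filters, weighting_enabled: bool):
    """Illustrative in-loop filtering chain for one reconstructed block.

    `filters` is assumed to provide callables dbf, sao, cnnlf, weight_model and alf;
    only the ordering reflects the embodiment, the callables are placeholders.
    """
    block = filters["dbf"](block)            # deblocking filtering
    block = filters["sao"](block)            # sample adaptive compensation filtering
    i_rec = block                            # input reconstructed image block of CNNLF
    i_net = filters["cnnlf"](i_rec)          # output reconstructed image block of CNNLF
    if weighting_enabled:                    # weight matrix module between CNNLF and ALF
        w = filters["weight_model"](i_net)
        block = w * i_net + (1 - w) * i_rec  # formula (3)
    else:
        block = i_net
    return filters["alf"](block)             # adaptive correction filtering
```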
• the mean square error value between the output reconstructed image block and the original image block is first calculated, and the mean square error value of the current frame is obtained by cumulative calculation; then, the decision is made through the rate-distortion cost method: if it is determined that the rate-distortion cost value when using the weight matrix module is smaller than that when not using the weight matrix module, chroma_frame_weighting_matrix_flag is set to "1", otherwise it is set to "0";
• by inputting the reconstructed image block output by the CNNLF of HPM-ModAI into the weight matrix network model (a multi-layer convolutional neural network), the feature information is extracted and the weight matrix is output, so that the input and output of the CNNLF can be weighted, making the output of the neural network loop filter closer to the original image.
  • the embodiment of the present application provides a code stream, where the code stream is generated by bit encoding according to the value of at least one syntax element identification information.
  • the at least one syntax element identification information may at least include: first syntax element identification information, first luma syntax element identification information, second luma syntax element identification information, and chroma syntax element identification information.
  • the first syntax element identification information is used to indicate whether the video sequence is filtered using the weight matrix
  • the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the weight matrix
• the second luminance syntax element identification information is used to indicate whether the luminance component of the current block is filtered using the weight matrix
  • the chroma syntax element identification information is used to indicate whether the chroma component of the current frame is filtered using the weight matrix
  • the first luma syntax element identification information and the chroma syntax element identification information may be frame-level marks
  • the second luma syntax element identification information may be CTU-level marks.
  • the frame-level flag of the luma component based on the weight matrix of the neural network (that is, the first luma syntax element identification information) is luma_frame_weighting_matrix_flag
• the CTU-level flag of the luma component based on the neural network weight matrix (that is, the second luma syntax element identification information) is luma_ctu_weighting_matrix_flag
  • the chroma component frame-level flag (ie, chroma syntax element identification information) of the neural network-based weighting matrix is chroma_frame_weighting_matrix_flag.
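• For reference, the syntax elements of the weight matrix module described above and their levels can be summarised as follows (the sequence-level flag is listed without a name because this excerpt does not give one):
```python
# Summary of the weight matrix syntax elements described in this embodiment.
WEIGHT_MATRIX_SYNTAX = [
    # (syntax element identification information, level, flag name)
    ("first syntax element",       "sequence level", None),  # name not given in this excerpt
    ("first luma syntax element",  "frame level",    "luma_frame_weighting_matrix_flag"),
    ("second luma syntax element", "CTU level",      "luma_ctu_weighting_matrix_flag"),
    ("chroma syntax element",      "frame level",    "chroma_frame_weighting_matrix_flag"),
]

for element, level, name in WEIGHT_MATRIX_SYNTAX:
    print(f"{element:26s} {level:15s} {name}")
```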
  • This embodiment provides an encoding method, and specifically provides a loop filtering method, which is applied to an encoder.
• determining the weight matrix network model of the current block, and determining the weight matrix of the current block according to the weight matrix network model; determining the value of at least one syntax element identification information; and when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, using the weight matrix to determine the target reconstructed image block of the current block.
• in this way, by using the weight matrix network model to determine the weight matrix, not only can the deep learning-based weight matrix technique be realized and pixel-level weighting processing be provided for the output reconstructed image block, but the coding performance can also be improved, which in turn can improve the encoding and decoding efficiency; in addition, the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
  • FIG. 11 shows a schematic structural diagram of an encoder 110 provided by an embodiment of the present application.
  • the encoder 110 may include: a first determining unit 1101 and a first filtering unit 1102; wherein,
  • the first determining unit 1101 is configured to determine a weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model; and is also configured to determine the value of at least one syntax element identification information;
  • the first filtering unit 1102 is configured to use the weight matrix to determine the target reconstructed image block of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing.
  • the first determining unit 1101 is further configured to determine the luminance weight matrix network model of the current block when the color component type of the current block is a luminance component; when the color component type of the current block is a chrominance component, Determines the chroma weight matrix network model for the current block.
• the first determining unit 1101 is further configured to, when the color component type of the current block is a luminance component, determine at least one candidate luminance weight matrix network model; determine the quantization parameter of the current block, and select the candidate luminance weight matrix network model corresponding to the quantization parameter from the at least one candidate luminance weight matrix network model; and determine the selected candidate luminance weight matrix network model as the luminance weight matrix network model of the current block.
• the first determining unit 1101 is further configured to, when the color component type of the current block is a chrominance component, determine at least one candidate chrominance weight matrix network model; determine the quantization parameter of the current block, and select the candidate chrominance weight matrix network model corresponding to the quantization parameter from the at least one candidate chrominance weight matrix network model; and determine the selected candidate chrominance weight matrix network model as the chrominance weight matrix network model of the current block.
  • the encoder 110 may further include a first training unit 1103;
  • the first determining unit 1101 is further configured to determine at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
• the first training unit 1103 is configured to train the preset neural network model by using the luminance component of the at least one training sample to obtain at least one candidate luminance weight matrix network model, and to train the preset neural network model by using the chrominance component of the at least one training sample to obtain at least one candidate chrominance weight matrix network model; wherein the at least one candidate luminance weight matrix network model has a corresponding relationship with the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model has a corresponding relationship with the chrominance component and the quantization parameter.
  • the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer, and a skip connection layer.
  • the first determining unit 1101 is further configured to determine the value of the first syntax element identification information as the first value if it is determined that the video sequence is filtered using the weight matrix; or, if it is determined that the video sequence does not use The weight matrix is filtered, and the value of the identification information of the first syntax element is determined to be the second value.
  • the encoder 110 may further include an encoding unit 1104, configured to encode the value of the identification information of the first syntax element, and write the encoded bits into the code stream.
• the first determining unit 1101 is further configured to determine that the weight matrix is a luminance weight matrix when the color component type of the current frame is a luminance component; determine a first rate-distortion cost value for filtering the luminance component of the current frame in the video sequence using the luminance weight matrix; determine a second rate-distortion cost value for the luminance component of the current frame in the video sequence without using the luminance weight matrix for filtering; and if the first rate-distortion cost value is less than the second rate-distortion cost value, determine that the value of the first luminance syntax element identification information is the first value; or, if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, determine that the value of the first luminance syntax element identification information is the second value.
  • the encoding unit 1104 is further configured to encode the value of the identification information of the first luma syntax element, and write the encoded bits into the code stream.
• the encoder 110 may further include a first setting unit 1105, configured to set a luminance frame-level flag, where the luminance frame-level flag is used to control whether the luminance component of the current frame is filtered using the weight matrix;
  • the first determining unit 1101 is further configured to turn on the luminance frame level flag if the first rate-distortion cost value is less than the second rate-distortion cost value; or, if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value. Distortion cost value, turn off the luminance frame level flag.
• the first determining unit 1101 is further configured to perform block division on the current frame and determine at least one divided block, wherein the at least one divided block includes the current block; separately calculate the third rate-distortion cost value for filtering the luminance component of the at least one divided block using the luminance weight matrix, and accumulate the calculated third rate-distortion cost values to obtain the first rate-distortion cost value; and separately calculate the fourth rate-distortion cost value for the luminance component of the at least one divided block without using the luminance weight matrix for filtering, and accumulate the calculated fourth rate-distortion cost values to obtain the second rate-distortion cost value.
• the first determining unit 1101 is further configured to, when the first rate-distortion cost value is less than the second rate-distortion cost value, determine a third rate-distortion cost value for filtering the luminance component of the current block using the luminance weight matrix, and determine a fourth rate-distortion cost value for the luminance component of the current block without using the luminance weight matrix for filtering; and if the third rate-distortion cost value is less than the fourth rate-distortion cost value, determine that the value of the second luma syntax element identification information is the first value; or, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, determine that the value of the second luma syntax element identification information is the second value.
  • the encoding unit 1104 is further configured to encode the value of the identification information of the second luma syntax element, and write the encoded bits into the code stream.
  • the first setting unit 1105 is further configured to set the luminance CTU level flag bit, and the luminance CTU level flag bit is used to control whether the luminance component of the current block is filtered using the luminance weight matrix;
  • the first determining unit 1101 is further configured to turn on the luminance CTU level flag if the third rate-distortion cost value is less than the fourth rate-distortion cost value; or, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value. Distortion cost value, turn off the brightness CTU level flag.
  • the first determining unit 1101 is further configured to use the luminance weight matrix to determine the target reconstructed image block of the current block if the third rate-distortion cost value is less than the fourth rate-distortion cost value.
• the first determining unit 1101 is further configured to determine that the weight matrix is a chroma weight matrix when the color component type of the current frame is a chroma component; determine a fifth rate-distortion cost value for filtering the chrominance component of the current frame in the video sequence using the chrominance weight matrix; determine a sixth rate-distortion cost value for the chrominance component of the current frame in the video sequence without using the chrominance weight matrix for filtering; and if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determine that the value of the chroma syntax element identification information is the first value; or, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, determine that the value of the chroma syntax element identification information is the second value.
  • the encoding unit 1104 is further configured to encode the value of the identification information of the chroma syntax element, and write the encoded bits into the code stream.
  • the first setting unit 1105 is further configured to set a chroma frame-level flag bit, and the chroma frame-level flag bit is used to control whether the chroma components of the current frame are filtered using a weight matrix;
  • the first determining unit 1101 is further configured to turn on the chroma frame-level flag if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value; or, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value. If the rate-distortion cost value is set, the chroma frame-level flag is turned off.
  • the first value is one and the second value is zero.
  • the first determining unit 1101 is further configured to use the chrominance weight matrix to determine the target reconstructed image block of the current block if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value.
  • the first determining unit 1101 is further configured to determine the input reconstructed image block and the output reconstructed image block of the neural network loop filter; and input the output reconstructed image block into the weight matrix network model to obtain the weight of the current block matrix.
  • the first filtering unit 1102 is specifically configured to use a weight matrix to perform weighting processing on the input reconstructed image block and the output reconstructed image block to obtain the target reconstructed image block.
  • the first determining unit 1101 is further configured to use a luminance weight matrix network model to determine a luminance weight matrix when the color component type of the current block is a luminance component; when the color component type of the current block is a chrominance component , using the chrominance weight matrix network model to determine the chrominance weight matrix.
• the first determining unit 1101 is further configured to, when the color component type of the current block is a luminance component, determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter, and input the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix;
  • the first filtering unit 1102 is further configured to perform weighting processing on the input luminance reconstructed image block and the output luminance reconstructed image block by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  • the first determining unit 1101 is further configured to determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter when the color component type of the current block is a chrominance component; and inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
  • the first filtering unit 1102 is further configured to perform weighting processing on the input chrominance reconstructed image block and the output chrominance reconstructed image block by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
  • the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  • the first filtering unit 1102 is further configured to, after determining the target reconstructed image block of the current block, use an adaptive correction filter to continue filtering the target reconstructed image block.
  • a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, it may also be a module, and it may also be non-modular.
  • each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
• if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
• based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.
• an embodiment of the present application provides a computer storage medium, which is applied to the encoder 110, where the computer storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the foregoing embodiments is implemented.
  • FIG. 12 shows a schematic diagram of a specific hardware structure of the encoder 110 provided by the embodiment of the present application.
  • it may include: a first communication interface 1201 , a first memory 1202 and a first processor 1203 ; each component is coupled together through a first bus system 1204 .
  • the first bus system 1204 is used to realize the connection communication between these components.
  • the first bus system 1204 also includes a power bus, a control bus and a status signal bus.
• the various buses are labeled as the first bus system 1204 in FIG. 12, wherein:
  • the first communication interface 1201 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
  • a first memory 1202 for storing computer programs that can run on the first processor 1203;
  • the first processor 1203 is configured to, when running the computer program, execute:
  • the weight matrix is used to determine the target reconstructed image block of the current block.
  • the first memory 1202 in this embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
• the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
• many forms of RAM are available, such as static RAM (Static RAM, SRAM), dynamic RAM (Dynamic RAM, DRAM), synchronous DRAM (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced SDRAM (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus RAM (Direct Rambus RAM, DR RAM).
  • the first processor 1203 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the first processor 1203 or an instruction in the form of software.
• the above-mentioned first processor 1203 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
  • the methods, steps, and logic block diagrams disclosed in the embodiments of this application can be implemented or executed.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the first memory 1202, and the first processor 1203 reads the information in the first memory 1202, and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
• the processing unit can be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field-Programmable Gate Arrays (FPGA), general purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
  • the techniques described herein may be implemented through modules (eg, procedures, functions, etc.) that perform the functions described herein.
• the software codes may be stored in a memory and executed by a processor.
  • the first processor 1203 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
  • This embodiment provides an encoder, and the encoder may include a first determining unit and a first filtering unit.
  • FIG. 13 shows a schematic structural diagram of a decoder 130 provided by an embodiment of the present application.
  • the decoder 130 may include: a parsing unit 1301, a second determining unit 1302 and a second filtering unit 1303; wherein,
  • a parsing unit 1301, configured to parse the code stream, and determine the value of at least one syntax element identification information
  • the second determining unit 1302 is configured to determine a weight matrix network model of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, and determine the weight matrix of the current block according to the weight matrix network model ;
  • the second filtering unit 1303 is configured to use the weight matrix to determine the target reconstructed image block of the current block.
  • the parsing unit 1301 is specifically configured to parse the code stream, and determine the value of the first syntax element identification information; and when the first syntax element identification information indicates that the video sequence is filtered using the weight matrix, parse the code stream , determine the value of the second syntax element identification information; wherein, the second syntax element identification information is used to indicate whether the current frame in the video sequence is filtered by using the weight matrix, and the current frame includes the current block.
  • the second determining unit 1302 is further configured to, if the value of the first syntax element identification information is a first value, determine that the first syntax element identification information indicates that the video sequence is filtered by using a weight matrix; or, If the value of the first syntax element identification information is the second value, it is determined that the first syntax element identification information indicates that the video sequence is not filtered using the weight matrix.
  • the second determining unit 1302 is further configured to, when the color component type of the current frame is a luma component, determine that the second syntax element identification information is the first luma syntax element identification information, and the first luma syntax element identification information Used to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix; or, when the color component type of the current frame is a chroma component, determine that the second syntax element identification information is the chroma syntax element identification information, and the chroma syntax The element identification information is used to indicate whether the chroma components of the current frame are filtered using the chroma weight matrix.
  • the parsing unit 1301 is further configured to parse the code stream when the first luminance syntax element identification information indicates that the luminance component of the current frame is filtered using a luminance weight matrix, and determine The value of the second luminance syntax element identification information;
  • the second determining unit 1302 is further configured to determine a luma weight matrix network model of the current block when the second luma syntax element identification information indicates that the luma component of the current block is filtered using the luma weight matrix.
• the decoder 130 may further include a second setting unit 1304, configured to set the luminance frame-level flag bit and the luminance CTU-level flag bit; wherein the luminance frame-level flag bit is used to control whether the luminance component of the current frame is filtered using the luminance weight matrix, and the luminance CTU-level flag bit is used to control whether the luminance component of the current block is filtered using the luminance weight matrix.
  • the second determining unit 1302 is further configured to turn on the luma frame-level flag bit if the value of the first luma syntax element identification information is the first value; or turn off the luma frame-level flag bit if the value of the first luma syntax element identification information is the second value.
  • the second determining unit 1302 is further configured to turn on the luma CTU-level flag bit if the value of the second luma syntax element identification information is the first value; or turn off the luma CTU-level flag bit if the value of the second luma syntax element identification information is the second value.
  • the second determining unit 1302 is further configured to, if the value of the first luma syntax element identification information is the first value, determine that the first luma syntax element identification information indicates that the luma component of the current frame is filtered using the luma weight matrix; or, if the value of the first luma syntax element identification information is the second value, determine that the first luma syntax element identification information indicates that the luma component of the current frame is not filtered using the luma weight matrix.
  • the second determining unit 1302 is further configured to, if the value of the second luma syntax element identification information is the first value, determine that the second luma syntax element identification information indicates that the luma component of the current block is filtered using the luma weight matrix; or, if the value of the second luma syntax element identification information is the second value, determine that the second luma syntax element identification information indicates that the luma component of the current block is not filtered using the luma weight matrix.
  • the second determining unit 1302 is further configured to, when the chroma syntax element identification information indicates that the chroma component of the current frame is filtered using the chroma weight matrix, determine the chroma weight matrix network model of the current block.
  • the second setting unit 1304 is further configured to set a chroma frame-level flag bit, wherein the chroma frame-level flag bit is used to control whether the chroma components of the current frame are filtered using a weight matrix.
  • the second determining unit 1302 is further configured to turn on the chroma frame-level flag bit if the value of the chroma syntax element identification information is the first value; or turn off the chroma frame-level flag bit if the value of the chroma syntax element identification information is the second value.
  • the second determining unit 1302 is further configured to, if the value of the chroma syntax element identification information is the first value, determine that the chroma syntax element identification information indicates that the chroma component of the current frame is filtered using the chroma weight matrix; or, if the value of the chroma syntax element identification information is the second value, determine that the chroma syntax element identification information indicates that the chroma component of the current frame is not filtered using the chroma weight matrix.
  • the first value is one and the second value is zero.
  • the second determining unit 1302 is further configured to, when the color component type of the current block is the luma component, determine at least one candidate luma weight matrix network model; determine the quantization parameter of the current block and select, from the at least one candidate luma weight matrix network model, the candidate luma weight matrix network model corresponding to the quantization parameter; and determine the selected candidate luma weight matrix network model as the luma weight matrix network model of the current block.
  • the second determining unit 1302 is further configured to, when the color component type of the current block is the chroma component, determine at least one candidate chroma weight matrix network model; determine the quantization parameter of the current block and select, from the at least one candidate chroma weight matrix network model, the candidate chroma weight matrix network model corresponding to the quantization parameter; and determine the selected candidate chroma weight matrix network model as the chroma weight matrix network model of the current block.
  • the decoder 130 may further include a second training unit 1305;
  • the second determining unit 1302 is further configured to determine at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
  • the second training unit 1305 is configured to train a preset neural network model using the luminance component of the at least one training sample to obtain at least one candidate luminance weight matrix network model, and to train the preset neural network model using the chrominance component of the at least one training sample to obtain at least one candidate chrominance weight matrix network model; wherein the at least one candidate luminance weight matrix network model has a correspondence with the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model has a correspondence with the chrominance component and the quantization parameter.
  • the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer, and a skip connection layer.
  • the second determining unit 1302 is further configured to determine the input reconstructed image block and the output reconstructed image block of the neural network loop filter, and input the output reconstructed image block into the weight matrix network model to obtain the weight matrix of the current block.
  • the second filtering unit 1303 is further configured to perform weighting processing on the input reconstructed image block and the output reconstructed image block by using the weight matrix to obtain the target reconstructed image block (a sketch of this weighting is given after this list).
  • the second determining unit 1302 is further configured to use a luminance weight matrix network model to determine a luminance weight matrix when the color component type of the current block is a luminance component; and when the color component type of the current block is a chrominance component When , use the chrominance weight matrix network model to determine the chrominance weight matrix.
  • the second determining unit 1302 is further configured to, when the color component type of the current block is the luminance component, determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter, and input the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix;
  • the second filtering unit 1303 is further configured to perform weighting processing on the input luminance reconstructed image block and the output luminance reconstructed image block by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  • the second determining unit 1302 is further configured to determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter when the color component type of the current block is a chrominance component; and inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
  • the second filtering unit 1303 is further configured to perform weighting processing on the input chrominance reconstructed image block and the output chrominance reconstructed image block by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
  • the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  • the second filtering unit 1303 is further configured to, after determining the target reconstructed image block of the current block, continue to perform filtering processing on the target reconstructed image block by using an adaptive correction filter.
  • a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, and so on; of course, it may also be a module, or it may be non-modular.
  • each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
  • the integrated unit may be stored in a computer-readable storage medium.
  • this embodiment provides a computer storage medium, which is applied to the decoder 130, where the computer storage medium stores a computer program, and when the computer program is executed by the second processor, the method described in any one of the foregoing embodiments is implemented.
  • FIG. 14 shows a schematic diagram of a specific hardware structure of the decoder 130 provided by the embodiment of the present application.
  • it may include: a second communication interface 1401, a second memory 1402, and a second processor 1403; the various components are coupled together through a second bus system 1404.
  • the second bus system 1404 is used to implement connection and communication between these components.
  • the second bus system 1404 also includes a power bus, a control bus, and a status signal bus.
  • the various buses are labeled as the second bus system 1404 in FIG. 14.
  • the second communication interface 1401 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
  • a second memory 1402 for storing computer programs that can run on the second processor 1403;
  • the second processor 1403 is configured to, when running the computer program, execute: parsing the code stream, and determining the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determining the weight matrix network model of the current block, and determining the weight matrix of the current block according to the weight matrix network model; and using the weight matrix, determining the target reconstructed image block of the current block.
  • the second processor 1403 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
  • This embodiment provides a decoder, and the decoder may include a parsing unit, a second determining unit, and a second filtering unit.
  • with this decoder, the code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; and the weight matrix is used to determine the target reconstructed image block of the current block.
  • in this way, by using the weight matrix network model to determine the weight matrix, not only can a weight matrix technique based on deep learning be implemented and pixel-level weighting processing be provided for the use of the output reconstructed image block, but the coding performance can also be improved, which in turn can improve the encoding and decoding efficiency; in addition, the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
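As a concrete illustration of the pixel-level weighting performed on the input and output reconstructed image blocks in the items above, a minimal numpy sketch is given below. It assumes the weight matrix produced by the weight matrix network model holds per-pixel values in [0, 1]; the exact blending formula is not spelled out in this text, so this particular form is an assumption used only for illustration.

```python
import numpy as np

def weighted_reconstruction(cnnlf_in: np.ndarray, cnnlf_out: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Blend the CNNLF input and output reconstructed blocks with a per-pixel weight matrix.

    cnnlf_in:  reconstructed block before the neural network loop filter
    cnnlf_out: reconstructed block after the neural network loop filter
    weight:    per-pixel weights in [0, 1], same shape as the blocks (assumed form)
    """
    weight = np.clip(weight, 0.0, 1.0)
    # Target reconstructed block: pixel-level mix of filtered and unfiltered samples.
    return weight * cnnlf_out + (1.0 - weight) * cnnlf_in
```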

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed in embodiments of the present application are encoding and decoding methods, a code stream, an encoder, a decoder, and a storage medium. The method comprises: parsing the code stream, and determining the value of at least one piece of syntax element identification information; when the at least one piece of syntax element identification information indicates that a current frame or current block uses a weight matrix for filtering processing, determining a weight matrix network model of the current block, and determining a weight matrix of the current block according to the weight matrix network model; and determining a target reconstructed image block of the current block by using the weight matrix. In this way, by using the weight matrix network model, a weight matrix technique based on deep learning can be implemented, and pixel-level weighting processing can be provided for the use of an output reconstructed image block, thereby improving encoding performance and further improving encoding and decoding efficiency.

Description

Codec method, code stream, encoder, decoder and storage medium
Technical Field
The embodiments of the present application relate to the technical field of image processing, and in particular, to an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
Background Art
In video coding and decoding systems, loop filters are used to improve the subjective and objective quality of reconstructed images. The traditional loop filters mainly include the deblocking filter, the sample adaptive compensation filter, and the adaptive correction filter. In the High Performance-Modular Artificial Intelligence Model (HPM-ModAI) of the third-generation Audio Video coding Standard (AVS3), a loop filter based on a residual neural network (hereinafter referred to as CNNLF) is also adopted as the baseline scheme of the intelligent loop filtering module, and it is located between the sample adaptive compensation filter and the adaptive correction filter.
In the current HPM-ModAI of AVS3, when it is determined by the rate-distortion cost method that the current block uses CNNLF, the reconstructed pixel values of the current block are updated to the pixel values processed by CNNLF. However, a smaller distortion value for the entire block does not mean that the distortion of every pixel value in the block becomes smaller, so the improvement in coding performance is limited.
SUMMARY OF THE INVENTION
Embodiments of the present application provide an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium, which can improve encoding performance and further improve encoding and decoding efficiency.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
Parse the code stream, and determine the value of at least one syntax element identification information;
When at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
Using the weight matrix, the target reconstructed image block of the current block is determined.
In a second aspect, an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
Determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
Determine the value of at least one syntax element identification information;
When the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block.
In a third aspect, an embodiment of the present application provides a code stream, where the code stream is generated by bit encoding according to the value of at least one syntax element identification information;
wherein the at least one syntax element identification information includes at least: first syntax element identification information, first luma syntax element identification information, second luma syntax element identification information, and chroma syntax element identification information;
wherein the first syntax element identification information is used to indicate whether the video sequence is filtered using the weight matrix, the first luma syntax element identification information is used to indicate whether the luma component of the current frame is filtered using the weight matrix, the second luma syntax element identification information is used to indicate whether the luma component of the current block is filtered using the weight matrix, and the chroma syntax element identification information is used to indicate whether the chroma component of the current frame is filtered using the weight matrix; the video sequence includes the current frame, and the current frame includes the current block.
In a fourth aspect, an embodiment of the present application provides an encoder, where the encoder includes a first determining unit and a first filtering unit; wherein,
a first determining unit, configured to determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model; and also configured to determine the value of at least one syntax element identification information;
the first filtering unit is configured to use the weight matrix to determine the target reconstructed image block of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing.
In a fifth aspect, an embodiment of the present application provides an encoder, where the encoder includes a first memory and a first processor; wherein,
a first memory for storing a computer program executable on the first processor;
the first processor is configured to execute the method of the second aspect when running the computer program.
In a sixth aspect, an embodiment of the present application provides a decoder, where the decoder includes a parsing unit, a second determining unit, and a second filtering unit; wherein,
a parsing unit, configured to parse the code stream, and determine the value of at least one syntax element identification information;
a second determining unit, configured to determine the weight matrix network model of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, and determine the weight matrix of the current block according to the weight matrix network model;
the second filtering unit is configured to use the weight matrix to determine the target reconstructed image block of the current block.
In a seventh aspect, an embodiment of the present application provides a decoder, where the decoder includes a second memory and a second processor; wherein,
a second memory for storing a computer program executable on the second processor;
the second processor is configured to execute the method according to the first aspect when running the computer program.
In an eighth aspect, an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and when the computer program is executed, the method described in the first aspect or the method described in the second aspect is implemented.
Embodiments of the present application provide an encoding and decoding method, a code stream, an encoder, a decoder, and a storage medium. On the encoder side, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the value of at least one syntax element identification information is determined; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block. On the decoder side, the code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the weight matrix is used to determine the target reconstructed image block of the current block. In this way, by using the weight matrix network model to determine the weight matrix, not only can a weight matrix technique based on deep learning be implemented and pixel-level weighting processing be provided for the use of the output reconstructed image block, but the coding performance can also be improved, which in turn can improve the encoding and decoding efficiency; in addition, the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
Description of Drawings
FIG. 1 is a schematic diagram of the application of an encoding framework provided by the related art;
FIG. 2 is a schematic diagram of the application of another encoding framework provided by the related art;
FIG. 3A is a schematic diagram of the detailed framework of a video encoding system provided by an embodiment of the present application;
FIG. 3B is a schematic diagram of the detailed framework of a video decoding system provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a decoding method provided by an embodiment of the present application;
FIG. 5A is a schematic diagram of the network structure of a luminance component provided by an embodiment of the present application;
FIG. 5B is a schematic diagram of the network structure of a chrominance component provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the network structure of a residual block provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the application of an encoding framework provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the composition of a weight matrix network model provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the overall framework of a weight matrix network model provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of an encoding method provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of the composition of an encoder provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of the composition of a decoder provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of a specific hardware structure of a decoder provided by an embodiment of the present application.
Detailed Description
In order to understand the features and technical contents of the embodiments of the present application in more detail, the implementation of the embodiments of the present application is described in detail below with reference to the accompanying drawings, which are for reference and illustration only and are not intended to limit the embodiments of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and they may be combined with each other without conflict. It should also be pointed out that the terms "first\second\third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects; it is understood that, where permitted, "first\second\third" may be interchanged in a specific order or sequence, so that the embodiments of the present application described herein can be implemented in an order other than that illustrated or described herein.
In a video image, a first image component, a second image component, and a third image component are generally used to represent a coding block (Coding Block, CB); the three image components are respectively a luminance component, a blue chrominance component, and a red chrominance component. Specifically, the luminance component is usually represented by the symbol Y, the blue chrominance component is usually represented by the symbol Cb or U, and the red chrominance component is usually represented by the symbol Cr or V; in this way, a video image can be represented in the YCbCr format or in the YUV format.
Before the embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments of the present application are explained first. The nouns and terms involved in the embodiments of the present application are subject to the following explanations:
Moving Picture Experts Group (MPEG)
International Standardization Organization (ISO)
International Electrotechnical Commission (IEC)
Joint Video Experts Team (JVET)
Alliance for Open Media (AOM)
New-generation video coding standard H.266/Versatile Video Coding (VVC)
VVC reference software test platform (VVC Test Model, VTM)
Audio Video coding Standard (AVS)
High-Performance Model (HPM) of AVS
High Performance-Modular Artificial Intelligence Model (HPM-ModAI) of AVS
Convolutional Neural Network based in-Loop Filter (CNNLF)
Deblocking Filter (DBF)
Sample Adaptive Offset (SAO)
Adaptive Loop Filter (ALF)
Quantization Parameter (QP)
Coding Unit (CU)
Coding Tree Unit (CTU)
It can be understood that digital video compression technology mainly compresses huge amounts of digital image and video data to facilitate transmission and storage. With the proliferation of Internet video and people's increasing requirements for video definition, although the existing digital video compression standards can save a lot of video data, it is still necessary to pursue better digital video compression technology to reduce the bandwidth and traffic pressure of digital video transmission.
In the process of digital video encoding, the encoder reads unequal pixels for original video sequences of different color formats, including luminance components and chrominance components; that is, the encoder reads a black-and-white or color image. The image is then divided into blocks, and the block data is handed over to the encoder for encoding. Nowadays, the encoder usually adopts a hybrid coding framework, which generally includes operations such as intra prediction and inter prediction, transform/quantization, inverse quantization/inverse transform, loop filtering, and entropy coding; the specific processing flow can be referred to in FIG. 1. Here, intra prediction only refers to the information of the same frame and predicts the pixel information within the current divided block, so as to eliminate spatial redundancy; inter prediction may include motion estimation and motion compensation, which can refer to the image information of different frames and use motion estimation to search for the motion vector information that best matches the current divided block, so as to eliminate temporal redundancy; the transform converts the predicted image block into the frequency domain and redistributes the energy, and combined with quantization it can remove information that the human eye is not sensitive to, so as to eliminate visual redundancy; entropy coding can eliminate character redundancy according to the current context model and the probability information of the binary code stream; loop filtering mainly processes the pixels after inverse transform and inverse quantization to compensate for distortion information and provide a better reference for subsequently coded pixels.
For AVS3, in the loop filtering part, the traditional loop filtering module mainly includes the deblocking filter (hereinafter referred to as DBF), the sample adaptive compensation filter (hereinafter referred to as SAO), and the adaptive correction filter (hereinafter referred to as ALF). In the application of HPM-ModAI, the loop filter based on a residual neural network (hereinafter referred to as CNNLF) is also adopted as the baseline scheme of the intelligent loop filtering module, and it is set between SAO filtering and ALF filtering, as shown in FIG. 2. During coding tests, according to the general test conditions for intelligent coding, for the All Intra configuration, ALF is turned on and DBF and SAO are turned off; for the Random Access and Low Delay configurations, DBF is turned on for I frames, ALF is turned on, and SAO is turned off.
In the current HPM-ModAI of AVS3, when it is determined by the rate-distortion cost method that CNNLF is used, the reconstructed pixel values of the current CTU are updated to the pixel values processed by CNNLF. Turning on CNNLF at this time indicates that the mean square error between the CNNLF-processed CTU and the original CTU is smaller than it was before CNNLF processing; however, a smaller distortion for the entire CTU does not mean that the distortion of every pixel value in the CTU becomes smaller. The existing solutions lack an optimal selection for each pixel value in the CTU processed by CNNLF, which limits the improvement of coding performance.
An embodiment of the present application provides an encoding method. On the encoder side, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the value of at least one syntax element identification information is determined; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block.
An embodiment of the present application provides a decoding method. On the decoder side, the code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the weight matrix is used to determine the target reconstructed image block of the current block.
In this way, by using the weight matrix network model to determine the weight matrix, not only can a weight matrix technique based on deep learning be implemented and pixel-level weighting processing be provided for the use of the output reconstructed image block, but the coding performance can also be improved, which in turn can improve the encoding and decoding efficiency; in addition, the present application can also make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to FIG. 3A, it shows a schematic diagram of the detailed framework of a video encoding system provided by an embodiment of the present application. As shown in FIG. 3A, the video encoding system 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffer unit 110, and the like, where the filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the encoding unit 109 can implement header information encoding and Context-based Adaptive Binary Arithmetic Coding (CABAC). For the input original video signal, a video coding block can be obtained through the division of Coding Tree Units (CTUs); the residual pixel information obtained after intra or inter prediction is then transformed by the transform and quantization unit 101, including transforming the residual information from the pixel domain to the transform domain, and the resulting transform coefficients are quantized to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 are used to perform intra prediction on the video coding block; specifically, the intra estimation unit 102 and the intra prediction unit 103 are used to determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-prediction encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; the motion estimation performed by the motion estimation unit 105 is a process of generating a motion vector, which can estimate the motion of the video coding block, and then the motion compensation unit 104 performs motion compensation based on the motion vector determined by the motion estimation unit 105. After determining the intra prediction mode, the intra prediction unit 103 is also used to provide the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109. In addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block, reconstructing the residual block in the pixel domain; the reconstructed residual block has blocking artifacts removed by the filter control analysis unit 107 and the filtering unit 108 and is then added to a predictive block in a frame of the decoded image buffer unit 110 to generate a reconstructed video coding block. The encoding unit 109 is used to encode various encoding parameters and quantized transform coefficients; in the CABAC-based encoding algorithm, the context content can be based on adjacent coding blocks and can be used to encode information indicating the determined intra prediction mode, so as to output the code stream of the video signal. The decoded image buffer unit 110 is used to store the reconstructed video coding blocks for prediction reference. As video image encoding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are all stored in the decoded image buffer unit 110.
Referring to FIG. 3B, it shows a schematic diagram of the detailed framework of a video decoding system provided by an embodiment of the present application. As shown in FIG. 3B, the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image buffer unit 206, and the like, where the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering. After the input video signal is subjected to the encoding process of FIG. 3A, the code stream of the video signal is output; the code stream is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 so as to generate a residual block in the pixel domain; the intra prediction unit 203 can be used to generate prediction data of the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate a predictive block of the video decoding block being decoded; a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block generated by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal passes through the filtering unit 205 in order to remove blocking artifacts, which can improve the video quality; the decoded video blocks are then stored in the decoded image buffer unit 206, which stores reference images for subsequent intra prediction or motion compensation and is also used for the output of the video signal, that is, the restored original video signal is obtained.
It should be noted that the methods provided by the embodiments of the present application may be applied to the filtering unit 108 shown in FIG. 3A (indicated by the bold black box), and may also be applied to the filtering unit 205 shown in FIG. 3B (indicated by the bold black box). That is to say, the methods in the embodiments of the present application can be applied to a video encoding system (referred to as an "encoder" for short), to a video decoding system (referred to as a "decoder" for short), or even to both the video encoding system and the video decoding system at the same time, but no limitation is made here.
It should also be noted that, when the embodiments of the present application are applied to an encoder, the "current block" specifically refers to the block currently to be encoded in the video image (which may also be referred to as an "encoding block" for short); when the embodiments of the present application are applied to a decoder, the "current block" specifically refers to the block currently to be decoded in the video image (which may also be referred to as a "decoding block" for short).
In an embodiment of the present application, referring to FIG. 4, it shows a schematic flowchart of a decoding method provided by an embodiment of the present application. As shown in FIG. 4, the method may include:
S401: Parse the code stream, and determine the value of at least one syntax element identification information.
It should be noted that a video image may be divided into multiple image blocks, and each image block currently to be decoded may be called a decoding block. Here, each decoding block may include a first image component, a second image component, and a third image component; and the current block is the decoding block in the video image on which loop filtering of the first image component, the second image component, or the third image component is currently to be performed.
Here, for the first image component, the second image component, and the third image component, from the perspective of color division, the embodiments of the present application may divide them into two color component types: a luminance component and a chrominance component. In this case, if the current block performs operations such as prediction, inverse transform and inverse quantization, and loop filtering on the luminance component, the current block may also be called a luminance block; or, if the current block performs operations such as prediction, inverse transform and inverse quantization, and loop filtering on the chrominance component, the current block may also be called a chrominance block.
It should also be noted that, on the decoder side, the embodiments of the present application specifically provide a loop filtering method, which is applied to the filtering unit 205 shown in FIG. 3B. Here, the filtering unit 205 may include a deblocking filter (DBF), a sample adaptive compensation filter (SAO), a loop filter based on a residual neural network (CNNLF), and an adaptive correction filter (ALF), and the loop filtering method described in the embodiments of the present application is specifically applied between CNNLF and ALF, so that each pixel value in the current block after CNNLF filtering can be optimally selected.
It can be understood that, in a specific example, CNNLF designs different network structures for the luminance component and the chrominance component; see FIG. 5A and FIG. 5B for details.
For the luminance component, as shown in FIG. 5A, the entire network structure can be composed of convolutional layers, activation layers, residual blocks, a skip connection layer, and other parts. Here, the convolution kernel of the convolutional layer can be 3×3, denoted as 3×3 Conv; the activation layer can be a linear activation function, i.e., it can be represented by the Rectified Linear Unit (ReLU), also called the rectified linear unit, which is an activation function commonly used in artificial neural networks and usually refers to the nonlinear functions represented by the ramp function and its variants. The network structure of the residual block (ResBlock) is shown in the dashed box in FIG. 6 and can be composed of a convolutional layer (Conv), an activation layer (ReLU), a skip connection layer, and so on. In the network structure, the skip connection layer (Concat) refers to a global skip connection from input to output included in the network structure, which enables the network to focus on learning the residual and accelerates the convergence process of the network.
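As a rough illustration of the structure just described (3×3 convolutions, ReLU activations, residual blocks, and a global skip connection), a minimal PyTorch sketch is given below. The channel count and the exact layer arrangement are illustrative assumptions, not values mandated by this text.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block: Conv3x3 -> ReLU -> Conv3x3 with an identity skip connection."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The block learns a residual that is added back to its input.
        return x + self.conv2(self.relu(self.conv1(x)))

class LumaFilterSketch(nn.Module):
    """Simplified luma network: head conv, N residual blocks, tail conv, global skip."""
    def __init__(self, channels: int = 64, num_blocks: int = 20):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[ResBlock(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, rec: torch.Tensor) -> torch.Tensor:
        # Global skip connection: the network predicts a correction to the reconstruction.
        return rec + self.tail(self.body(self.head(rec)))
```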
For the chrominance component, as shown in FIG. 5B, the luminance component is introduced as one of the inputs to guide the filtering of the chrominance component, and the entire network structure can be composed of convolutional layers, activation layers, residual blocks, a pooling layer, a skip connection layer, and other parts. Due to the inconsistency of resolution, the chrominance component first needs to be upsampled. In order to avoid introducing other noise during the upsampling process, the resolution enlargement can be completed by directly copying neighboring pixels, so as to obtain an enlarged chroma frame (Enlarged chroma frame). In addition, at the end of the network structure, a pooling layer (such as 2×2 AvgPool) is also used to complete the downsampling of the chrominance component. Specifically, in the application of HPM-ModAI, the number of residual blocks of the luminance component network can be set to N=20, and the number of residual blocks of the chrominance component network can be set to N=10.
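A minimal numpy sketch of the resolution handling described for the chroma path is shown below: the chroma plane is enlarged by directly copying neighboring pixels, and a 2×2 average pooling restores the original resolution at the end. This illustrates only the up/down-sampling idea under the assumption of a 2× resolution ratio; the network layers in between are not reproduced here.

```python
import numpy as np

def enlarge_chroma(chroma: np.ndarray) -> np.ndarray:
    """Upsample a chroma plane by 2x in each direction by copying adjacent pixels."""
    return chroma.repeat(2, axis=0).repeat(2, axis=1)

def avg_pool_2x2(plane: np.ndarray) -> np.ndarray:
    """Downsample a plane by averaging each non-overlapping 2x2 window."""
    h, w = plane.shape
    return plane[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```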
Here, the use of CNNLF can include two stages: offline training and inference testing. In the offline training stage, a total of 16 models can be trained offline: 4 I-frame luminance component models, 4 non-I-frame luminance component models, 4 chrominance U component models, and 4 chrominance V component models. Specifically, using a preset image dataset (for example DIV2K, which contains 1000 high-definition images at 2K resolution, of which 800 are used for training, 100 for validation, and 100 for testing), the images are converted from RGB into single-frame video sequences in YUV 4:2:0 format, which are used as label data. The sequences are then encoded using HPM in the All Intra configuration, with traditional filters such as DBF, SAO, and ALF turned off, and with the quantization step size set from 27 to 50. The reconstructed sequences obtained by encoding are divided into 4 intervals according to the QP ranges 27-31, 32-37, 38-44, and 45-50, and are cut into 128×128 image blocks as training data, from which 4 I-frame luminance component models, 4 chrominance U component models, and 4 chrominance V component models are trained respectively. Further, using a preset video dataset (for example BVI-DVC) and encoding with HPM-ModAI in the Random Access configuration, with traditional filters such as DBF, SAO, and ALF turned off and CNNLF turned on for I frames, the reconstructed non-I-frame data are collected, and 4 non-I-frame luminance component models are trained respectively.
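Since one model is trained per QP range (27-31, 32-37, 38-44, 45-50 in the example above), the candidate model corresponding to the quantization parameter of the current block can be picked by a simple range lookup. The sketch below assumes the four trained models are stored in a list indexed in the same order as the QP ranges; the helper name and the fallback behavior are illustrative assumptions.

```python
# Hypothetical helper: map a quantization parameter to one of the four candidate models.
QP_RANGES = [(27, 31), (32, 37), (38, 44), (45, 50)]

def select_model_index(qp: int) -> int:
    for idx, (low, high) in enumerate(QP_RANGES):
        if low <= qp <= high:
            return idx
    # Fall back to the nearest trained range if qp lies outside the trained intervals.
    return 0 if qp < QP_RANGES[0][0] else len(QP_RANGES) - 1

# Usage (assumed container): luma_model = candidate_luma_models[select_model_index(current_qp)]
```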
In the inference testing stage, HPM-ModAI sets a frame-level flag and a CTU-level flag, in the form of switches, for the luma component to control whether the CNNLF is enabled, and sets a frame-level flag, in the form of a switch, for the chroma components to control whether the CNNLF is enabled. Here, a flag bit can usually be denoted by flag. The frame-level flag is determined by equation (1), where D = D_net - D_rec denotes the distortion change after CNNLF processing (D_net is the distortion after filtering, D_rec is the distortion before filtering), R denotes the number of CTUs in the current frame, and λ is kept consistent with the λ of the adaptive correction filter. When RDcost is negative, the frame-level flag is turned on; otherwise the frame-level flag is turned off.

RDcost = D + λ*R        (1)

When the frame-level flag is turned on, it is further decided, in a rate-distortion cost manner, whether to enable the CNNLF for each CTU. Here, a CTU-level flag is set to control whether the CNNLF is enabled. Specifically, the CTU-level flag is determined by equation (2).

RDcost = D        (2)
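As a minimal sketch of the flag decisions in equations (1) and (2) (function and variable names are assumptions for illustration, not identifiers from HPM-ModAI):

```python
def frame_level_cnnlf_flag(d_net: float, d_rec: float, num_ctus: int, lam: float) -> bool:
    """Equation (1): enable the frame-level CNNLF flag when RDcost = D + lambda*R < 0,
    where D = D_net - D_rec is the distortion change introduced by CNNLF."""
    d = d_net - d_rec
    rd_cost = d + lam * num_ctus
    return rd_cost < 0

def ctu_level_cnnlf_flag(d_net: float, d_rec: float) -> bool:
    """Equation (2): per CTU the RDcost reduces to the distortion change alone;
    the negative-cost criterion mirrors the frame-level decision (an assumption)."""
    return (d_net - d_rec) < 0
```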
That is to say, in HPM-ModAI of AVS3, when the rate-distortion cost decision determines that the CNNLF is used, the reconstructed pixel values of the current CTU are updated to the pixel values processed by the CNNLF. When the CNNLF flag is turned on, the mean square error of the current CTU with respect to the original CTU is smaller after CNNLF processing than before it; however, a smaller distortion for the CTU as a whole does not mean that the distortion of every pixel value in the CTU becomes smaller, so the improvement in coding performance is limited.
In order to further improve the coding performance, an embodiment of the present application provides a deep-learning-based weight matrix loop filtering method for weighting the output of the CNNLF so that it is closer to the original image. As shown in Figure 7, the scheme is an optimization of Figure 2: compared with Figure 2, the loop filter includes, in addition to DBF, SAO, CNNLF and ALF, a weighting matrix module (Weighting Matrix), and the weighting matrix module is located between the CNNLF filtering and the ALF filtering. In addition, the use of the weighting matrix module does not depend on the flag bits of DBF, SAO, CNNLF and ALF; it is merely placed after the CNNLF and before the ALF.

Here, so that the decoder can determine whether the loop filtering method of the embodiments of the present application is used (that is, whether the weighting matrix module is enabled), a first syntax element identification information may be set to indicate whether the current video sequence uses the loop filtering method of the embodiments of the present application.
换句话说,对于解码获得的至少一个语法元素的标识信息至少包括有第一语法元素标识信息。在一些实施例中,所述解析码流,确定至少一个语法元素的标识信息的取值,可以包括:In other words, the identification information of the at least one syntax element obtained by decoding at least includes the identification information of the first syntax element. In some embodiments, the parsing the code stream and determining the value of the identification information of at least one syntax element may include:
解析码流,确定第一语法元素标识信息的取值;其中,所述第一语法元素标识信息用于指示视频序列是否使用权重矩阵进行滤波处理。The code stream is parsed, and the value of the first syntax element identification information is determined; wherein, the first syntax element identification information is used to indicate whether the video sequence is filtered by using the weight matrix.
进一步地,在一些实施例中,该方法还可以包括:Further, in some embodiments, the method may also include:
若第一语法元素标识信息的取值为第一值,则确定第一语法元素标识信息指示视频序列使用权重矩阵进行滤波处理;或者,If the value of the first syntax element identification information is the first value, it is determined that the first syntax element identification information indicates that the video sequence is filtered by using the weight matrix; or,
若第一语法元素标识信息的取值为第二值,则确定第一语法元素标识信息指示视频序列不使用权重矩阵进行滤波处理。If the value of the first syntax element identification information is the second value, it is determined that the first syntax element identification information indicates that the video sequence is not filtered using the weight matrix.
Here, the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form. Specifically, the first syntax element identification information may be a parameter written in a profile, or may be the value of a flag, which is not limited in the embodiments of the present application.

Taking the first syntax element identification information being a flag as an example, the first value may be set to 1 and the second value to 0; alternatively, the first value may be set to true and the second value to false; alternatively, the first value may be set to 0 and the second value to 1; alternatively, the first value may be set to false and the second value to true. Exemplarily, for a flag, the first value is generally 1 and the second value is generally 0, but this is not limited.
It should also be noted that the video sequence includes at least one frame, and the at least one frame may include the current frame. Here, when the video sequence enables the loop filtering method of the embodiments of the present application, that is, when it is determined that the weighting matrix module is enabled for the video sequence, the embodiments of the present application further need to determine whether the current frame in the video sequence uses the weight matrix for filtering processing; that is, a second syntax element identification information also needs to be set.
也就是说,在一些实施例中,所述解析码流,确定至少一个语法元素的标识信息,还可以包括:That is, in some embodiments, the parsing the code stream and determining the identification information of at least one syntax element may further include:
解析码流,确定第二语法元素标识信息的取值;其中,第二语法元素标识信息用于指示当前帧是否使用权重矩阵进行滤波处理。The code stream is parsed, and the value of the second syntax element identification information is determined; wherein, the second syntax element identification information is used to indicate whether the current frame uses the weight matrix for filtering processing.
在一种具体的示例中,所述解析码流,确定至少一个语法元素的标识信息,可以包括:In a specific example, the parsing the code stream and determining the identification information of at least one syntax element may include:
解析码流,确定第一语法元素标识信息的取值;Parse the code stream, and determine the value of the identification information of the first syntax element;
当第一语法元素标识信息指示视频序列使用权重矩阵进行滤波处理时,解析码流,确定第二语法元素标识信息的取值。When the first syntax element identification information indicates that the video sequence is filtered by using the weight matrix, the code stream is parsed to determine the value of the second syntax element identification information.
也就是说,这至少一个语法元素标识信息至少可以包括第一语法元素标识信息和第二语法元素标识信息。这里,对于第二语法元素标识信息而言,根据当前帧的颜色分量类型的不同,第二语法元素标识信息代表的含义不同。另外,不同的颜色分量类型,权重矩阵也是不同的,比如亮度分量对应的权重矩阵可称为亮度权重矩阵,色度分量对应的权重矩阵可称为色度权重矩阵。具体地,在一些实施例中,所述解析码流,确定第二语法元素标识信息,可以包括:That is, the at least one syntax element identification information may include at least the first syntax element identification information and the second syntax element identification information. Here, for the second syntax element identification information, the meanings represented by the second syntax element identification information are different according to different types of color components of the current frame. In addition, different color component types have different weight matrices. For example, a weight matrix corresponding to a luminance component may be called a luminance weight matrix, and a weight matrix corresponding to a chrominance component may be called a chrominance weight matrix. Specifically, in some embodiments, the parsing the code stream and determining the identification information of the second syntax element may include:
当当前帧的颜色分量类型为亮度分量时,确定第二语法元素标识信息为第一亮度语法元素标识信息,第一亮度语法元素标识信息用于指示当前帧的亮度分量是否使用亮度权重矩阵进行滤波处理;或者,When the color component type of the current frame is a luminance component, it is determined that the second syntax element identification information is the first luminance syntax element identification information, and the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix processing; or,
当当前帧的颜色分量类型为色度分量时,确定第二语法元素标识信息为色度语法元素标识信息,色度语法元素标识信息用于指示当前帧的色度分量是否使用色度权重矩阵进行滤波处理。When the color component type of the current frame is a chroma component, it is determined that the second syntax element identification information is the chroma syntax element identification information, and the chroma syntax element identification information is used to indicate whether the chroma components of the current frame are processed using a chroma weight matrix. filter processing.
In the embodiments of the present application, the color component types may include a luma component and a chroma component. If the color component type is the luma component, the second syntax element identification information may be referred to as the first luma syntax element identification information, to indicate whether the luma component of the current frame is filtered using the luma weight matrix. If the color component type is the chroma component, the second syntax element identification information may be referred to as the chroma syntax element identification information, to indicate whether the chroma component of the current frame is filtered using the chroma weight matrix.
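The following sketch illustrates the conditional parsing order described above (a minimal illustration only; the `read_flag` reader interface is an assumption, and the flag names in the comments correspond to the frame-level syntax elements introduced later in the text):

```python
def parse_weighting_matrix_flags(bitstream):
    """Parse the sequence-level flag first; only when it indicates that the weight
    matrix is used are the frame-level flags for the current frame parsed."""
    flags = {"sequence": bitstream.read_flag()}        # first syntax element identification information
    if flags["sequence"] == 1:
        # second syntax element identification information, one per color component type
        flags["luma_frame"] = bitstream.read_flag()    # luma_frame_weighting_matrix_flag
        flags["chroma_frame"] = bitstream.read_flag()  # chroma_frame_weighting_matrix_flag
    return flags
```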
S402:当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵。S402: When at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model.
It should be noted that the at least one syntax element identification information described in the embodiments of the present application is not limited to the first syntax element identification information and the second syntax element identification information, and may even include other syntax element identification information, which is not limited here.
It should also be noted that after the block partitioning operation is performed on the current frame, at least one block can be obtained, and the at least one block includes the current block. Here, even if the current frame uses the weight matrix for filtering processing, this does not mean that every block in the current frame uses the weight matrix for filtering processing; CTU-level syntax element identification information may also be involved in order to determine whether the current block uses the weight matrix for filtering processing. The two color component types, the luma component and the chroma component, are described separately below.
另外,针对不同的颜色分量类型,这里的权重矩阵网络模型也不相同。在本申请实施例中,假定亮度分量对应的权重矩阵网络模型可以称为亮度权重矩阵网络模型,色度分量对应的权重矩阵网络模型可以称为色度权重矩阵网络模型。In addition, for different color component types, the weight matrix network model here is also different. In this embodiment of the present application, it is assumed that the weight matrix network model corresponding to the luminance component may be referred to as a luminance weight matrix network model, and the weight matrix network model corresponding to the chrominance component may be referred to as a chrominance weight matrix network model.
在一种可能的实施方式中,当当前帧的颜色分量类型为亮度分量时,所述当至少一个语法元素的标识信息指示使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,可以包括:In a possible implementation manner, when the color component type of the current frame is a luminance component, the weight matrix network model of the current block is determined when the identification information of at least one syntax element indicates that a weight matrix is used for filtering processing, which may be include:
当第一亮度语法元素标识信息指示当前帧的亮度分量使用亮度权重矩阵进行滤波处理时,解析码流,确定第二亮度语法元素标识信息的取值;When the first luminance syntax element identification information indicates that the luminance component of the current frame is filtered using the luminance weight matrix, the code stream is parsed, and the value of the second luminance syntax element identification information is determined;
当第二亮度语法元素标识信息指示当前块的亮度分量使用亮度权重矩阵进行滤波处理时,确定当前块的亮度权重矩阵网络模型。When the second luma syntax element identification information indicates that the luma component of the current block is filtered using the luma weight matrix, a luma weight matrix network model of the current block is determined.
需要说明的是,对于亮度分量,这里涉及到两种语法元素:帧级语法元素和CTU级语法元素。其中,帧级语法元素可以称为第一亮度语法元素标识信息,用luma_frame_weighting_matrix_flag表示;CTU级语法元素可以称为第二亮度语法元素标识信息,用luma_ctu_weighting_matrix_flag表示。It should be noted that, for the luma component, two syntax elements are involved here: frame-level syntax elements and CTU-level syntax elements. The frame-level syntax element may be referred to as the first luma syntax element identification information, represented by luma_frame_weighting_matrix_flag; the CTU-level syntax element may be referred to as the second luma syntax element identification information, represented by luma_ctu_weighting_matrix_flag.
还需要说明的是,对于亮度分量,本申请实施例还可以设置亮度帧级标志位和亮度CTU级标志位,通过控制是否打开权重矩阵模块,进而确定是否使用亮度权重矩阵进行滤波处理。因此,在一些实施例中,该方法还可以包括:设置亮度帧级标志位和亮度CTU级标志位。It should also be noted that, for the luminance component, the embodiments of the present application may also set the luminance frame-level flag and the luminance CTU-level flag, and determine whether to use the luminance weight matrix for filtering by controlling whether to open the weight matrix module. Therefore, in some embodiments, the method may further include: setting the luminance frame level flag and the luminance CTU level flag.
在这里,当前块位于当前帧内。其中,亮度帧级标志位可以用于控制当前帧的亮度分量是否使用亮度权重矩阵进行滤波处理,亮度CTU级标志位可以用于控制当前块的亮度分量是否使用亮度权重矩阵进行滤波处理。Here, the current block is within the current frame. Wherein, the luminance frame level flag can be used to control whether the luminance component of the current frame is filtered by the luminance weight matrix, and the luminance CTU level flag can be used to control whether the luminance component of the current block is filtered by the luminance weight matrix.
进一步地,对于第一亮度语法元素标识信息而言,在一些实施例中,该方法还可以包括:Further, for the first luma syntax element identification information, in some embodiments, the method may further include:
若第一亮度语法元素标识信息的取值为第一值,则确定第一亮度语法元素标识信息指示当前帧的亮度分量使用亮度权重矩阵进行滤波处理;或者,If the value of the first luma syntax element identification information is the first value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame is filtered by using the luma weight matrix; or,
若第一亮度语法元素标识信息的取值为第二值,则确定第一亮度语法元素标识信息指示当前帧的亮度分量不使用亮度权重矩阵进行滤波处理。If the value of the first luma syntax element identification information is the second value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame is not filtered using the luma weight matrix.
在一些实施例中,该方法还可以包括:In some embodiments, the method may also include:
若第一亮度语法元素标识信息的取值为第一值,则打开亮度帧级标志位;或者,If the value of the first luminance syntax element identification information is the first value, turn on the luminance frame level flag; or,
若第一亮度语法元素标识信息的取值为第二值,则关闭亮度帧级标志位。If the value of the first luminance syntax element identification information is the second value, the luminance frame level flag is turned off.
It should be noted that the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form. Specifically, both the first luma syntax element identification information and the second luma syntax element identification information may be a parameter written in a profile, or may be the value of a flag, which is not limited in the embodiments of the present application.
在本申请实施例中,以第一亮度语法元素标识信息为一flag信息为例,这时候对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;或者,第一值还可以设置为0,第二值还可以设置为1;或者,第一值还可以设置为false,第二值还可以设置为true。本申请实施例对此不作任何限定。In the embodiment of the present application, taking the first luminance syntax element identification information as a flag information as an example, at this time, for the first value and the second value, the first value may be set to 1, and the second value may be set to 0 ; Alternatively, the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, the second value can also be set to true. This embodiment of the present application does not make any limitation on this.
Taking the first value being 1 and the second value being 0 as an example, if the value of the first luma syntax element identification information obtained by decoding is 1, the luma frame-level flag may be turned on, that is, the frame-level weight matrix module is enabled, and it can then be determined that the luma component of the current frame is filtered using the luma weight matrix. Otherwise, if the value of the first luma syntax element identification information is 0, the luma frame-level flag may be turned off, that is, the frame-level weight matrix module is disabled, and it can then be determined that the luma component of the current frame is not filtered using the luma weight matrix; in this case the next frame may be obtained from the video sequence and determined as the current frame, and the step of parsing the code stream to determine the value of the first luma syntax element identification information is continued.
进一步地,对于第二亮度语法元素标识信息而言,在一些实施例中,该方法还可以包括:Further, for the second luma syntax element identification information, in some embodiments, the method may further include:
若第二亮度语法元素标识信息的取值为第一值,则确定第二亮度语法元素标识信息指示当前块的亮度分量使用亮度权重矩阵进行滤波处理;或者,If the value of the second luma syntax element identification information is the first value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block is filtered by using the luma weight matrix; or,
若第二亮度语法元素标识信息的取值为第二值,则确定第二亮度语法元素标识信息指示当前块的亮度分量不使用亮度权重矩阵进行滤波处理。If the value of the second luma syntax element identification information is the second value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block is not filtered using the luma weight matrix.
在一些实施例中,该方法还可以包括:In some embodiments, the method may also include:
若第二亮度语法元素标识信息的取值为第一值,则打开亮度CTU级标志位;或者,If the value of the second luminance syntax element identification information is the first value, turn on the luminance CTU level flag; or,
若第二亮度语法元素标识信息的取值为第二值,则关闭亮度CTU级标志位。If the value of the second luma syntax element identification information is the second value, the luma CTU level flag is turned off.
需要说明的是,第一值和第二值不同。It should be noted that the first value and the second value are different.
在本申请实施例中,以第二亮度语法元素标识信息为另一flag信息为例,这时候对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;或者,第一值还可以设置为0,第二值还可以设置为1;或者,第一值还可以设置为false,第二值还可以设置为true。本申请实施例对此不作任何限定。In the embodiment of the present application, taking the second luminance syntax element identification information as another flag information as an example, at this time, for the first value and the second value, the first value may be set to 1, and the second value may be set to 0; alternatively, the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to is false, the second value can also be set to true. This embodiment of the present application does not make any limitation on this.
Taking the first value being 1 and the second value being 0 as an example, when the value of the first luma syntax element identification information obtained by decoding is 1: if the value of the second luma syntax element identification information is 1, the luma CTU-level flag may be turned on, that is, the CTU-level weight matrix module is enabled, and it can then be determined that the luma component of the current block is filtered using the luma weight matrix. Otherwise, if the value of the second luma syntax element identification information is 0, the luma CTU-level flag may be turned off, that is, the CTU-level weight matrix module is disabled, and it can then be determined that the luma component of the current block is not filtered using the luma weight matrix; in this case the next block may be obtained from the current frame and determined as the current block, and the step of parsing the code stream to determine the value of the second luma syntax element identification information is continued, until all blocks included in the current frame have been processed, after which the next frame is loaded and processing continues.
在另一种可能的实施方式中,当当前帧的颜色分量类型为色度分量时,所述当至少一个语法元素的标识信息指示使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,可以包括:In another possible implementation, when the color component type of the current frame is a chrominance component, the weight matrix network model of the current block is determined when the identification information of the at least one syntax element indicates that a weight matrix is used for filtering processing , which can include:
当色度语法元素标识信息指示当前帧的色度分量使用色度权重矩阵进行滤波处理时,确定当前块的色度权重矩阵网络模型。When the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix, the chroma weight matrix network model of the current block is determined.
需要说明的是,对于色度分量,这里涉及到帧级语法元素。其中,帧级语法元素可以称为色度语法元素标识信息,用chroma_frame_weighting_matrix_flag表示。It should be noted that, for chroma components, frame-level syntax elements are involved here. The frame-level syntax element may be referred to as chroma syntax element identification information, which is represented by chroma_frame_weighting_matrix_flag.
It should also be noted that, in consideration of coding performance and computational complexity, if the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix, then all blocks included in the current frame are by default filtered using the chroma weight matrix; if the chroma syntax element identification information indicates that the chroma components of the current frame are not filtered using the chroma weight matrix, then all blocks included in the current frame are by default not filtered using the chroma weight matrix. Therefore, it is no longer necessary to set a CTU-level syntax element for the chroma components and, likewise, no CTU-level flag needs to be set. In other words, for the chroma components, the embodiments of the present application may set only a frame-level flag. Therefore, in some embodiments, the method may further include: setting a chroma frame-level flag, where the chroma frame-level flag may be used to control whether the chroma components of the current frame are filtered using the chroma weight matrix.
进一步地,对于色度语法元素标识信息而言,在一些实施例中,该方法还可以包括:Further, for the chroma syntax element identification information, in some embodiments, the method may further include:
若色度语法元素标识信息的取值为第一值,则确定色度语法元素标识信息指示当前帧的色度分量使用色度权重矩阵进行滤波处理;或者,If the value of the chroma syntax element identification information is the first value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix; or,
若色度语法元素标识信息的取值为第二值,则确定色度语法元素标识信息指示当前帧的色度分量不使用色度权重矩阵进行滤波处理。If the value of the chroma syntax element identification information is the second value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame are not filtered using the chroma weight matrix.
在一些实施例中,该方法还可以包括:In some embodiments, the method may also include:
若色度语法元素标识信息的取值为第一值,则打开色度帧级标志位;或者,If the value of the chroma syntax element identification information is the first value, the chroma frame-level flag is turned on; or,
若色度语法元素标识信息的取值为第二值,则关闭色度帧级标志位。If the value of the chroma syntax element identification information is the second value, the chroma frame level flag is turned off.
需要说明的是,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,色度语法元素标识信息均可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,本申请实施例对此不作任何限定。It should be noted that the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers. Specifically, the chroma syntax element identification information may be a parameter written in a profile (profile), or may be a value of a flag (flag), which is not limited in this embodiment of the present application.
在本申请实施例中,以色度语法元素标识信息为又一flag信息为例,这时候对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;或者,第一值还可以设置为0,第二值还可以设置为1;或者,第一值还可以设置为false,第二值还可以设置为true。本申请实施例对此不作任何限定。In the embodiment of the present application, taking the chroma syntax element identification information as another flag information as an example, at this time, for the first value and the second value, the first value can be set to 1, and the second value can be set to 0 ; Alternatively, the first value can also be set to true, and the second value can also be set to false; alternatively, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, the second value can also be set to true. This embodiment of the present application does not make any limitation on this.
Taking the first value being 1 and the second value being 0 as an example, if the value of the chroma syntax element identification information obtained by decoding is 1, the chroma frame-level flag may be turned on, that is, the frame-level weight matrix module is enabled; it can then be determined that the chroma components of the current frame are filtered using the chroma weight matrix, and by default the chroma components of every block in the current frame are filtered using the chroma weight matrix. Otherwise, if the value of the chroma syntax element identification information is 0, the chroma frame-level flag may be turned off, that is, the frame-level weight matrix module is disabled; it can then be determined that the chroma components of the current frame are not filtered using the chroma weight matrix, and the next frame may be obtained from the video sequence and determined as the current frame, and the step of parsing the code stream to determine the value of the chroma syntax element identification information is continued.
It should also be noted that if it is determined that the luma component of the current block is filtered using the luma weight matrix, the luma weight matrix network model of the current block also needs to be determined. In some embodiments, the determining the luma weight matrix network model of the current block may include:
在当前块的颜色分量类型为亮度分量的情况下,确定至少一个候选亮度权重矩阵网络模型;In the case that the color component type of the current block is a luminance component, determining at least one candidate luminance weight matrix network model;
确定当前块的量化参数,从至少一个候选亮度权重矩阵网络模型中选取量化参数对应的候选亮度权重矩阵网络模型;Determine the quantization parameter of the current block, and select a candidate brightness weight matrix network model corresponding to the quantization parameter from at least one candidate brightness weight matrix network model;
将所选取的候选亮度权重矩阵网络模型确定为当前块的亮度权重矩阵网络模型。The selected candidate luminance weight matrix network model is determined as the luminance weight matrix network model of the current block.
进一步地,在确定出当前块的色度分量使用权重矩阵进行滤波处理时,这时候需要确定当前块的色度权重矩阵网络模型。在一些实施例中,所述确定当前块的色度权重矩阵网络模型,可以包括:Further, when it is determined that the chrominance components of the current block are filtered by using the weight matrix, the chrominance weight matrix network model of the current block needs to be determined at this time. In some embodiments, the determining the chrominance weight matrix network model of the current block may include:
在当前块的颜色分量类型为色度分量的情况下,确定至少一个候选色度权重矩阵网络模型;In the case that the color component type of the current block is a chrominance component, determining at least one candidate chrominance weight matrix network model;
确定当前块的量化参数,从至少一个候选色度权重矩阵网络模型中选取量化参数对应的候选色度权重矩阵网络模型;Determine the quantization parameter of the current block, and select a candidate chroma weight matrix network model corresponding to the quantization parameter from at least one candidate chroma weight matrix network model;
将所选取的候选色度权重矩阵网络模型确定为当前块的色度权重矩阵网络模型。The selected candidate chroma weight matrix network model is determined as the chroma weight matrix network model of the current block.
需要说明的是,当前块的权重矩阵网络模型不仅和量化参数有关,而且还和颜色分量类型有关。其中,不同的颜色分量类型,对应有不同的权重矩阵网络模型,比如对于亮度分量来说,权重矩阵网络模型可以是与亮度分量相关的亮度权重矩阵网络模型;对于色度分量来说,权重矩阵网络模型可以是与色度分量相关的色度权重矩阵网络模型。It should be noted that the weight matrix network model of the current block is not only related to the quantization parameter, but also related to the color component type. Among them, different color component types correspond to different weight matrix network models. For example, for the luminance component, the weight matrix network model may be a luminance weight matrix network model related to the luminance component; for the chrominance component, the weight matrix The network model may be a chroma weight matrix network model associated with the chroma components.
It should also be noted that, according to different quantization parameters, for example QP values of 27~31, 32~37, 38~44 and 45~50, at least one candidate luma weight matrix network model and at least one candidate chroma weight matrix network model can be trained in advance. In this way, after the quantization parameter of the current block is determined, the candidate luma weight matrix network model corresponding to that quantization parameter can be selected from the at least one candidate luma weight matrix network model, that is, the luma weight matrix network model of the current block; likewise, the candidate chroma weight matrix network model corresponding to that quantization parameter can be selected from the at least one candidate chroma weight matrix network model, that is, the chroma weight matrix network model of the current block.
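A minimal sketch of this QP-based model selection follows (the range boundaries are the four QP intervals quoted above; the list of pre-trained models and the function name are assumptions for illustration):

```python
QP_RANGES = [(27, 31), (32, 37), (38, 44), (45, 50)]

def select_weight_matrix_model(qp: int, candidate_models: list):
    """Pick the candidate weight matrix network model whose QP interval
    contains the quantization parameter of the current block."""
    for model, (low, high) in zip(candidate_models, QP_RANGES):
        if low <= qp <= high:
            return model
    # Fall back to the nearest interval when QP lies outside all ranges (assumption).
    return candidate_models[0] if qp < QP_RANGES[0][0] else candidate_models[-1]
```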
进一步地,对于至少一个候选亮度权重矩阵网络模型和至少一个候选色度权重矩阵网络模型的模型训练,在一些实施例中,该方法还可以包括:Further, for the model training of at least one candidate luminance weight matrix network model and at least one candidate chrominance weight matrix network model, in some embodiments, the method may further include:
确定至少一个训练样本,其中,训练样本是根据至少一种量化参数得到的;determining at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
利用至少一个训练样本的亮度分量对预设神经网络模型进行训练,得到至少一个候选亮度权重矩阵网络模型;Use the brightness component of at least one training sample to train the preset neural network model to obtain at least one candidate brightness weight matrix network model;
利用至少一个训练样本的色度分量对预设神经网络模型进行训练,得到至少一个候选色度权重矩阵网络模型;Use the chrominance component of at least one training sample to train the preset neural network model to obtain at least one candidate chrominance weight matrix network model;
其中,至少一个候选亮度权重矩阵网络模型与亮度分量和量化参数之间具有对应关系,至少一个候选色度权重矩阵网络模型与色度分量和量化参数之间具有对应关系。Wherein, at least one candidate luminance weight matrix network model has a correspondence relationship with luminance components and quantization parameters, and at least one candidate chrominance weight matrix network model has a correspondence relationship with chrominance components and quantization parameters.
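A heavily hedged sketch of this per-QP training setup follows (the dataset interface, the MSE objective against the original image, the optimizer and the `make_model` factory are all assumptions; only the idea of training one luma and one chroma candidate model per QP interval comes from the text):

```python
import torch

def train_candidate_models(samples_by_qp, make_model, epochs: int = 1):
    """samples_by_qp maps a QP interval to luma/chroma training triples
    (cnnlf_input, cnnlf_output, original), as (N, 1, H, W) tensors."""
    luma_models, chroma_models = [], []
    for qp_range, data in samples_by_qp.items():
        for plane, bucket in (("luma", luma_models), ("chroma", chroma_models)):
            model = make_model()  # a small CNN producing a per-pixel weight map
            opt = torch.optim.Adam(model.parameters(), lr=1e-4)
            for _ in range(epochs):
                for cnnlf_in, cnnlf_out, original in data[plane]:
                    w = model(cnnlf_out)
                    blended = cnnlf_out * w + cnnlf_in * (1 - w)
                    loss = torch.mean((blended - original) ** 2)  # MSE vs. original (assumption)
                    opt.zero_grad()
                    loss.backward()
                    opt.step()
            bucket.append(model)
    return luma_models, chroma_models
```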
在本申请实施例中,预设神经网络模型可以包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。In this embodiment of the present application, the preset neural network model may include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
也就是说,预设神经网络模型可以选择多层卷积神经网络,然后利用训练样本进行深度学习以得到权重矩阵网络模型,比如亮度权重矩阵网络模型或者色度权重矩阵网络模型。That is to say, the preset neural network model can select a multi-layer convolutional neural network, and then use the training samples to perform deep learning to obtain a weight matrix network model, such as a luminance weight matrix network model or a chrominance weight matrix network model.
It should be noted that deep learning is a branch of machine learning, and machine learning is a necessary path towards artificial intelligence. The concept of deep learning originates from research on artificial neural networks; a multilayer perceptron with multiple hidden layers is one kind of deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, so as to discover distributed feature representations of data. In the embodiments of the present application, a convolutional neural network (CNN) is taken as an example; it is a class of feedforward neural networks that contain convolution operations and have a deep structure, and it is one of the representative algorithms of deep learning. The preset neural network model here may be a convolutional neural network structure.
还需要说明的是,在本申请实施例中,无论是亮度权重矩阵网络模型还是色度权重矩阵网络模型,权重矩阵网络模型也可以看作是由多层卷积神经网络组成。具体地,权重矩阵网络模型也可以包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。It should also be noted that, in the embodiments of the present application, whether it is a luminance weight matrix network model or a chrominance weight matrix network model, the weight matrix network model can also be regarded as being composed of a multi-layer convolutional neural network. Specifically, the weight matrix network model may also include at least one of the following: at least one convolution layer, at least one activation layer, and a jump connection layer.
进一步地,在确定出权重矩阵网络模型之后,可以此确定出权重矩阵。在一些实施例中,所述根据权重矩阵网络模型确定当前块的权重矩阵,可以包括:Further, after the weight matrix network model is determined, the weight matrix can be determined accordingly. In some embodiments, the determining the weight matrix of the current block according to the weight matrix network model may include:
确定神经网络环路滤波器的输入重建图像块和输出重建图像块;Determine the input reconstructed image patch and the output reconstructed image patch of the neural network loop filter;
将输出重建图像块输入所述权重矩阵网络模型,得到所述当前块的权重矩阵。Input the output reconstructed image block into the weight matrix network model to obtain the weight matrix of the current block.
这里,神经网络环路滤波器具体是指前述的CNNLF。在确定出CNNLF的输入重建图像块和输出重建图像块之后,将输出重建图像块作为权重矩阵网络模型的输入,而权重矩阵网络模型的输出即为当前块的权重矩阵。Here, the neural network loop filter specifically refers to the aforementioned CNNLF. After determining the input reconstructed image block and output reconstructed image block of CNNLF, the output reconstructed image block is used as the input of the weight matrix network model, and the output of the weight matrix network model is the weight matrix of the current block.
S403:利用权重矩阵,确定当前块的目标重建图像块。S403: Using the weight matrix, determine the target reconstructed image block of the current block.
需要说明的是,在得到权重矩阵后,如果当前块使用权重矩阵进行滤波处理,那么这时候可以对CNNLF的输入和输出进行加权处理。It should be noted that, after obtaining the weight matrix, if the current block uses the weight matrix for filtering, then the input and output of CNNLF can be weighted at this time.
在一些实施例中,所述利用权重矩阵,确定当前块的目标重建图像块,可以包括:利用权重矩阵对输入重建图像块和输出重建图像块进行加权处理,得到所述目标重建图像块。In some embodiments, using the weight matrix to determine the target reconstructed image block of the current block may include: using the weight matrix to perform weighting processing on the input reconstructed image block and the output reconstructed image block to obtain the target reconstructed image block.
在一种具体的示例中,权重矩阵用Weighting Matrix表示,那么对于目标重建图像块而言,目标重建图像块=输出重建图像块×Weighting Matrix+输入重建图像块×(1-Weighting Matrix)。In a specific example, the weight matrix is represented by Weighting Matrix, then for the target reconstructed image block, the target reconstructed image block=output reconstructed image block×Weighting Matrix+input reconstructed image block×(1-Weighting Matrix).
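As an illustration of this weighted combination (a sketch only; the tensor names are assumptions, and the weight values are applied element-wise):

```python
import torch

def blend_with_weight_matrix(cnnlf_input: torch.Tensor,
                             cnnlf_output: torch.Tensor,
                             weighting_matrix: torch.Tensor) -> torch.Tensor:
    """Target reconstructed block = output * W + input * (1 - W), applied per pixel."""
    return cnnlf_output * weighting_matrix + cnnlf_input * (1.0 - weighting_matrix)
```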
这样,通过权重矩阵对神经网络环路滤波器的输入和输出进行加权,可以使得神经网络环路滤波器 的输出更加接近于原始图像。In this way, by weighting the input and output of the neural network loop filter through the weight matrix, the output of the neural network loop filter can be made closer to the original image.
It should also be noted that, depending on the color component type, the weight matrix includes a luma weight matrix and a chroma weight matrix; correspondingly, the target reconstructed image block may include a target reconstructed image block of the luma component and a target reconstructed image block of the chroma component. In some embodiments, the determining the weight matrix of the current block according to the weight matrix network model may include:
当当前块的颜色分量类型为亮度分量时,利用亮度权重矩阵网络模型确定亮度权重矩阵;When the color component type of the current block is a luminance component, the luminance weight matrix is determined by using the luminance weight matrix network model;
当当前块的颜色分量类型为色度分量时,利用色度权重矩阵网络模型确定色度权重矩阵。When the color component type of the current block is a chrominance component, the chrominance weight matrix is determined by using the chrominance weight matrix network model.
在一种具体的示例中,当当前块的颜色分量类型为亮度分量时,所述利用亮度权重矩阵网络模型确定亮度权重矩阵,可以包括:In a specific example, when the color component type of the current block is a luminance component, the determining the luminance weight matrix by using the luminance weight matrix network model may include:
确定神经网络环路滤波器的输入亮度重建图像块和输出亮度重建图像块;Determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter;
将输出亮度重建图像块输入亮度权重矩阵网络模型,得到亮度权重矩阵;Input the output brightness reconstruction image block into the brightness weight matrix network model to obtain the brightness weight matrix;
进一步地,所述利用权重矩阵,确定当前块的目标重建图像块,可以包括:Further, using the weight matrix to determine the target reconstructed image block of the current block may include:
利用亮度权重矩阵对输入亮度重建图像块和输出亮度重建图像块进行加权处理,得到当前块的亮度分量的目标重建图像块。The input luminance reconstructed image block and the output luminance reconstructed image block are weighted by the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
在另一种具体的示例中,当当前块的颜色分量类型为色度分量时,所述利用色度权重矩阵网络模型确定色度权重矩阵,可以包括:In another specific example, when the color component type of the current block is a chrominance component, the determining the chrominance weight matrix by using the chrominance weight matrix network model may include:
确定神经网络环路滤波器的输入色度重建图像块和输出色度重建图像块;Determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter;
将输出色度重建图像块输入色度权重矩阵网络模型,得到色度权重矩阵;Input the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
进一步地,所述利用所述权重矩阵,确定当前块的目标重建图像块,可以包括:Further, using the weight matrix to determine the target reconstructed image block of the current block may include:
利用色度权重矩阵对输入色度重建图像块和输出色度重建图像块进行加权处理,得到当前块的色度分量的目标重建图像块。The input chrominance reconstructed image block and the output chrominance reconstructed image block are weighted by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
It should be noted that, depending on the color component type, the input reconstructed image block may refer to an input luma reconstructed image block or to an input chroma reconstructed image block; likewise, the output reconstructed image block may refer to an output luma reconstructed image block or to an output chroma reconstructed image block. In this way, the target reconstructed image block of the luma component can be obtained by weighted calculation from the input luma reconstructed image block and the output luma reconstructed image block, and the target reconstructed image block of the chroma component can be obtained by weighted calculation from the input chroma reconstructed image block and the output chroma reconstructed image block.
还需要说明的是,对于输入重建图像块(包括输入亮度重建图像块或者输入色度重建图像块)来说,这里,输入重建图像块可以是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。It should also be noted that, for input reconstructed image blocks (including input luminance reconstructed image blocks or input chrominance reconstructed image blocks), here, the input reconstructed image blocks may be obtained through a deblocking filter and a sample adaptive compensation filter. obtained after filtering.
除此之外,在一些实施例中,该方法还可以包括:在确定出当前块的目标重建图像块之后,利用自适应修正滤波器继续对目标重建图像块进行滤波处理。Besides, in some embodiments, the method may further include: after the target reconstructed image block of the current block is determined, using an adaptive correction filter to continue filtering the target reconstructed image block.
That is to say, if the current block uses the weight matrix for filtering processing, the input of the adaptive correction filter is the target reconstructed image block; if the current block does not use the weight matrix for filtering processing, the output of the neural network loop filter does not need to be weighted, and the input of the adaptive correction filter is the output reconstructed image block.
Exemplarily, refer to Figure 8, which shows a schematic diagram of the composition of a weight matrix network model provided by an embodiment of the present application. As shown in Figure 8, the weight matrix network model is arranged in the weighting matrix module; the weight matrix network model may be composed of a multi-layer convolutional neural network, and its network structure may consist of K convolutional layers, J activation layers and L skip connection layers, where K, J and L are all integers greater than or equal to 1.
在一种具体的示例中,K=5,J=4,L=1,但是并不作任何限定。In a specific example, K=5, J=4, L=1, but it does not make any limitation.
Specifically, the network structure shown in Figure 8 includes five convolutional layers; except for the third convolutional layer, each convolutional layer is followed by an activation layer. The network structure also contains a skip connection from the input of the second convolutional layer to the output of the third convolutional layer. It should be noted that the activation layers may be linear activation functions or nonlinear activation functions.
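The following PyTorch-style sketch mirrors that description under stated assumptions (the channel count, kernel sizes and the sigmoid on the final layer are assumptions, not values from this application; only the layer count, the activation placement and the skip connection follow the text):

```python
import torch
import torch.nn as nn

class WeightingMatrixNet(nn.Module):
    """K=5 conv layers, J=4 activation layers (none after the 3rd conv),
    L=1 skip connection from the input of conv2 to the output of conv3."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.conv1 = nn.Conv2d(1, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv4 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, 1, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, cnnlf_output: torch.Tensor) -> torch.Tensor:
        x = self.act(self.conv1(cnnlf_output))
        skip = x                            # input of the 2nd conv layer
        x = self.act(self.conv2(x))
        x = self.conv3(x) + skip            # no activation after conv3; add the skip connection
        x = self.act(self.conv4(x))
        # Sigmoid chosen here as the 4th activation so the weights lie in [0, 1] (assumption).
        return torch.sigmoid(self.conv5(x))
```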
In addition, the input of this network structure is the output reconstructed image block of the CNNLF (which may be an output luma reconstructed image block or an output chroma reconstructed image block), and the output of this network structure is the weight matrix, which is used to weight the output reconstructed image block of the CNNLF and the input reconstructed image block of the CNNLF, as shown in Figure 9, which gives an example of the overall framework of a weight matrix network model. As can be seen from Figure 9, the input reconstructed image block may be obtained after filtering by the deblocking filter (DBF) and the sample adaptive compensation filter (SAO), and the weighted target reconstructed image block may further be input to the adaptive correction filter (ALF) to continue the filtering processing.
It should also be noted that the input reconstructed image block used in the weighting processing may instead be one or more hidden-layer feature image blocks extracted from the CNNLF; in addition, the output reconstructed image block of the CNNLF may instead be the output image block of another efficient neural network filter, which is not limited in the embodiments of the present application.
可以理解地,本申请实施例的技术方案作用在解码器的环路滤波模块中,其具体流程如下:It is understandable that the technical solutions of the embodiments of the present application act on the loop filtering module of the decoder, and the specific process is as follows:
The decoder obtains and parses the code stream; when parsing reaches the loop filtering module, processing is performed according to the preset filter order. Here, the preset filter order is DBF filtering -> SAO filtering -> CNNLF filtering -> weighting matrix module -> ALF filtering. On entering the weighting matrix module:

(a) For the luma component, whether the current frame is processed by the weighting matrix module is determined according to the decoded luma_frame_weighting_matrix_flag, and whether each CTU in the current frame is processed by the weighting matrix module is determined according to the decoded luma_ctu_weighting_matrix_flag. If luma_frame_weighting_matrix_flag is "0", skip to (c). Otherwise, when luma_frame_weighting_matrix_flag is "1": if luma_ctu_weighting_matrix_flag is "0", skip to (c); otherwise, when luma_ctu_weighting_matrix_flag is "1", the current CTU of the current frame is processed by the weighting matrix module;

(b) For the chroma components, whether the current frame is processed by the weighting matrix module is determined according to the decoded chroma_frame_weighting_matrix_flag. If chroma_frame_weighting_matrix_flag is "0", skip to (c); otherwise, when chroma_frame_weighting_matrix_flag is "1", the current frame is processed by the weighting matrix module;

(c) If the current frame has completed the processing of the weighting matrix module, the next frame is loaded for processing and the flow jumps back to (a). A sketch of this flag-driven flow is given below.
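A hedged pseudocode sketch of steps (a) to (c) (the frame/CTU data structures and helper names are assumptions; only the flag names come from the text, and `select_weight_matrix_model` refers to the earlier sketch):

```python
def apply_weighting_matrix_module(frames):
    """Walk the decoded frames and apply the weighting matrix module according to
    the frame-level and CTU-level flags described in steps (a)-(c)."""
    for frame in frames:                                   # (c): proceed frame by frame
        # (a) luma component: frame-level gate, then a CTU-level gate per CTU
        if frame.luma_frame_weighting_matrix_flag == 1:
            for ctu in frame.ctus:
                if ctu.luma_ctu_weighting_matrix_flag == 1:
                    weight_and_blend(ctu, component="luma")
        # (b) chroma components: frame-level gate only; all CTUs follow it
        if frame.chroma_frame_weighting_matrix_flag == 1:
            for ctu in frame.ctus:
                weight_and_blend(ctu, component="chroma")

def weight_and_blend(ctu, component):
    """Select the weight matrix network model by component type and QP, run it on the
    CNNLF output, and blend the CNNLF input/output with the resulting weight matrix."""
    model = select_weight_matrix_model(ctu.qp, ctu.candidate_models(component))
    w = model(ctu.cnnlf_output(component))
    ctu.set_reconstruction(component,
                           ctu.cnnlf_output(component) * w + ctu.cnnlf_input(component) * (1 - w))
```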
需要注意的,对于权重矩阵模块而言,还需要根据当前块的颜色分量类型和量化参数,选择对应的权重矩阵网络模型。将CNNLF的输出重建图像块作为权重矩阵网络模型的输入,该网络模型的输出得到权重矩阵。将CNNLF的输出重建图像块和CNNLF的输入重建图像块根据权重矩阵进行加权处理,可以得到最终的输出重建图像块(即前述实施例所述的目标重建图像块)。It should be noted that for the weight matrix module, it is also necessary to select the corresponding weight matrix network model according to the color component type and quantization parameters of the current block. The output reconstructed image patch of CNNLF is used as the input of the weight matrix network model, and the output of the network model obtains the weight matrix. The output reconstructed image block of CNNLF and the input reconstructed image block of CNNLF are weighted according to the weight matrix to obtain the final output reconstructed image block (ie, the target reconstructed image block described in the foregoing embodiment).
在实现中,对于亮度分量而言,其语法元素的修改如表1所示。In the implementation, for the luma component, the modification of its syntax elements is shown in Table 1.
Table 1
[Table 1 (syntax element modifications for the luma component) is provided as image PCTCN2021091670-appb-000001 in the original publication.]
在实现中,对于色度分量而言,其语法元素的修改如表2所示。In the implementation, for the chroma components, the modification of its syntax elements is shown in Table 2.
Table 2
[Table 2 (syntax element modifications for the chroma components) is provided as image PCTCN2021091670-appb-000002 in the original publication.]
在这里,基于神经网络的权重矩阵的亮度分量帧级标志为luma_frame_weighting_matrix_flag;基于神经网络的权重矩阵的亮度分量CTU级标志为luma_ctu_weighting_matrix_flag;基于神经网络的权重矩阵的色度分量帧级标志为chroma_frame_weighting_matrix_flag。Here, the frame-level flag of the luminance component of the neural network-based weighting matrix is luma_frame_weighting_matrix_flag; the CTU-level flag of the luminance component of the neural network-based weighting matrix is luma_ctu_weighting_matrix_flag; the frame-level flag of the chrominance component of the neural network-based weighting matrix is chroma_frame_weighting_matrix_flag.
In short, in the embodiments of the present application, by inputting the output reconstructed image block of the CNNLF of HPM-ModAI into the weight matrix network model (a multi-layer convolutional neural network), feature information is extracted and a weight matrix is output, which provides pixel-level weighting processing for the use of the CNNLF and improves coding performance.
Exemplarily, after the loop filtering method of the embodiments of the present application was implemented on the AVS3 intelligent coding reference software HPM10.0-ModAI5.0, the test sequences required by AVS3 were tested under the All Intra configuration of the intelligent coding common test conditions, and the average BD-rate changes on the Y, U and V components were -0.36%, -1.26% and -0.38% respectively, as shown in Table 3; under the Random Access configuration of the intelligent coding common test conditions, the test sequences required by AVS3 were tested, and the average BD-rate changes on the Y, U and V components were -0.16%, -1.04% and -0.79% respectively, as shown in Table 4. The data in Table 3 and Table 4 show that the method of the embodiments of the present application improves coding performance; specifically, by introducing the deep-learning-based weight matrix technique, the embodiments of the present application bring a considerable performance gain to the existing AVS3 intelligent coding reference software HPM-ModAI.
Table 3
[Table 3 (BD-rate results under the All Intra configuration) is provided as image PCTCN2021091670-appb-000003 in the original publication.]
Table 4
[Table 4 (BD-rate results under the Random Access configuration) is provided as image PCTCN2021091670-appb-000004 in the original publication.]
This embodiment provides a decoding method, and specifically a loop filtering method, applied to a decoder. The code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the target reconstructed image block of the current block is then determined using the weight matrix. In this way, by using the weight matrix network model to determine the weight matrix, not only can a deep-learning-based weight matrix technique be realized, providing pixel-level weighting processing for the use of the output reconstructed image block, but the coding performance can also be improved, thereby improving coding and decoding efficiency; at the same time, the present application can make the output of the neural network loop filter closer to the original image, which can improve video image quality.
在本申请的另一实施例中,参见图10,其示出了本申请实施例提供的一种编码方法的流程示意图。如图10所示,该方法可以包括:In another embodiment of the present application, referring to FIG. 10 , it shows a schematic flowchart of an encoding method provided by an embodiment of the present application. As shown in Figure 10, the method may include:
S1001:确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵。S1001: Determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model.
It should be noted that a video image may be divided into multiple image blocks, and each image block currently to be encoded may be referred to as a coding block. Here, each coding block may include a first image component, a second image component and a third image component, and the current block is the coding block in the video image on which loop filtering of the first image component, the second image component or the third image component is currently to be performed.

Here, for the first image component, the second image component and the third image component, from the perspective of color, the embodiments of the present application may divide them into two color component types: a luma component and a chroma component. In this case, if the current block performs operations such as prediction, transform and quantization, and loop filtering on the luma component, the current block may also be referred to as a luma block; or, if the current block performs operations such as prediction, transform and quantization, and loop filtering on the chroma components, the current block may also be referred to as a chroma block.
It should also be noted that, on the encoder side, the embodiments of the present application specifically provide a loop filtering method, which is applied in the filtering unit 108 shown in FIG. 3A. Here, the filtering unit 108 may include a deblocking filter (DBF), a sample adaptive offset filter (SAO), a residual-neural-network-based loop filter (CNNLF) and an adaptive correction filter (ALF), and the loop filtering method described in the embodiments of the present application is applied between the CNNLF and the ALF, so that an optimal selection can be made for each pixel value in the current block after CNNLF filtering.
In the embodiments of the present application, this loop filtering method mainly introduces a weight matrix network model, which may be a multi-layer convolutional neural network. The weight matrix network model differs for different color component types. In some embodiments, determining the weight matrix network model of the current block may include:
when the color component type of the current block is the luminance component, determining the luminance weight matrix network model of the current block;
when the color component type of the current block is the chrominance component, determining the chrominance weight matrix network model of the current block.
It should be noted that the weight matrix network model is related to the color component type. Here, the weight matrix network model corresponding to the luminance component may be called a luminance weight matrix network model, and the weight matrix network model corresponding to the chrominance component may be called a chrominance weight matrix network model. The two color component types, luminance and chrominance, are described separately below.
In a possible implementation, determining the luminance weight matrix network model of the current block may include:
when the color component type of the current block is the luminance component, determining at least one candidate luminance weight matrix network model;
determining the quantization parameter of the current block, and selecting, from the at least one candidate luminance weight matrix network model, the candidate luminance weight matrix network model corresponding to the quantization parameter;
determining the selected candidate luminance weight matrix network model as the luminance weight matrix network model of the current block.
In another possible implementation, determining the chrominance weight matrix network model of the current block may include:
when the color component type of the current block is the chrominance component, determining at least one candidate chrominance weight matrix network model;
determining the quantization parameter of the current block, and selecting, from the at least one candidate chrominance weight matrix network model, the candidate chrominance weight matrix network model corresponding to the quantization parameter;
determining the selected candidate chrominance weight matrix network model as the chrominance weight matrix network model of the current block.
It should be noted that the weight matrix network model of the current block is associated with both the quantization parameter and the color component type. Different color component types correspond to different weight matrix network models. For example, for the luminance component, the weight matrix network model may be a luminance weight matrix network model related to the luminance component; for the chrominance component, the weight matrix network model may be a chrominance weight matrix network model related to the chrominance component.
Further, at least one candidate luminance weight matrix network model and at least one candidate chrominance weight matrix network model can be trained in advance for different quantization parameters, for example QP ranges of 27 to 31, 32 to 37, 38 to 44, 45 to 50, and so on. In this way, after the quantization parameter of the current block is determined, the candidate luminance weight matrix network model corresponding to that quantization parameter can be selected from the at least one candidate luminance weight matrix network model, i.e. the luminance weight matrix network model of the current block; similarly, the candidate chrominance weight matrix network model corresponding to that quantization parameter can be selected from the at least one candidate chrominance weight matrix network model, i.e. the chrominance weight matrix network model of the current block.
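As a rough illustration of this QP-based model selection, the following sketch keys the candidate models by the QP ranges listed above; the range boundaries, the dictionary layout, the fallback rule and the function names are illustrative assumptions rather than part of the described method.

```python
# Minimal sketch of selecting a candidate weight matrix network model by QP range.
# The QP ranges and the model table layout are illustrative assumptions.

QP_RANGES = [(27, 31), (32, 37), (38, 44), (45, 50)]

def select_candidate_model(qp, candidate_models):
    """candidate_models: dict mapping a (qp_low, qp_high) tuple to a trained model."""
    for low, high in QP_RANGES:
        if low <= qp <= high:
            return candidate_models[(low, high)]
    # Fall back to the nearest trained range if the QP lies outside all ranges (assumption).
    nearest = min(QP_RANGES, key=lambda r: min(abs(qp - r[0]), abs(qp - r[1])))
    return candidate_models[nearest]

# Usage (hypothetical): luma_model = select_candidate_model(qp, candidate_luma_models)
#                       chroma_model = select_candidate_model(qp, candidate_chroma_models)
```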
It should also be noted that, for the model training of the at least one candidate luminance weight matrix network model and the at least one candidate chrominance weight matrix network model, in some embodiments the method may further include the following (a training sketch is given after this list):
determining at least one training sample, where the training sample is obtained according to at least one quantization parameter;
training a preset neural network model using the luminance component of the at least one training sample to obtain at least one candidate luminance weight matrix network model;
training the preset neural network model using the chrominance component of the at least one training sample to obtain at least one candidate chrominance weight matrix network model;
where the at least one candidate luminance weight matrix network model corresponds to the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model corresponds to the chrominance component and the quantization parameter.
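The passage above does not spell out the training objective, so the following sketch is only one plausible setup: it assumes PyTorch, assumes the loss drives the weighted combination of the CNNLF input and output toward the original block (consistent with formula (3) given later, but still an assumption here), and assumes a dataset of per-component (CNNLF input, CNNLF output, original) triples for one QP range.

```python
import torch
import torch.nn as nn

def train_candidate_model(model, loader, epochs=10, lr=1e-4):
    """Train one candidate weight matrix network model for a single QP range and a single
    color component. `loader` is assumed to yield (i_rec, i_net, original) tensors: the CNNLF
    input block, the CNNLF output block and the original block; this pairing, the MSE loss
    and the optimizer settings are assumptions, not part of the described method."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for i_rec, i_net, original in loader:
            w = model(i_net)                       # per-pixel weight matrix
            i_out = w * i_net + (1.0 - w) * i_rec  # weighted combination of CNNLF output and input
            loss = mse(i_out, original)            # assumed objective: approach the original block
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```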
In the embodiments of the present application, the preset neural network model may include at least one of the following: at least one convolutional layer, at least one activation layer and a skip connection layer.
That is to say, a multi-layer convolutional neural network may be chosen as the preset neural network model, and deep learning is then performed using the training samples to obtain the weight matrix network model, for example a luminance weight matrix network model or a chrominance weight matrix network model. In this way, the weight matrix network model can also be regarded as being composed of a multi-layer convolutional neural network. Specifically, the weight matrix network model may also include at least one of the following: at least one convolutional layer, at least one activation layer and a skip connection layer.
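As an illustration only, a network meeting this description (convolutional layers, activation layers and a skip connection) could look like the following PyTorch sketch; the number of layers, the channel widths, the placement of the skip connection and the sigmoid used to keep the weights in [0, 1] are assumptions, since the passage does not fix a concrete architecture.

```python
import torch
import torch.nn as nn

class WeightMatrixNet(nn.Module):
    """Illustrative multi-layer CNN: the input is the CNNLF output reconstructed block,
    the output is a per-pixel weight matrix of the same spatial size. Layer count,
    channel widths and the sigmoid range restriction are assumptions."""
    def __init__(self, channels=1, features=32, num_layers=4):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(num_layers - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)
        self.tail = nn.Conv2d(features, channels, 3, padding=1)

    def forward(self, i_net):
        feat = self.body(i_net)
        # Skip connection from the input block to the tail output (its placement is an assumption).
        return torch.sigmoid(self.tail(feat) + i_net)
```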
Further, after the weight matrix network model is determined, the weight matrix can be determined from it. In some embodiments, determining the weight matrix of the current block according to the weight matrix network model may include:
determining the input reconstructed image block and the output reconstructed image block of the neural network loop filter;
inputting the output reconstructed image block into the weight matrix network model to obtain the weight matrix of the current block.
Here, the neural network loop filter specifically refers to the aforementioned CNNLF. After the input reconstructed image block and the output reconstructed image block of the CNNLF are determined, the output reconstructed image block is used as the input of the weight matrix network model, and the output of the weight matrix network model is the weight matrix of the current block.
It should also be noted that, depending on the color component type, the weight matrix includes a luminance weight matrix and a chrominance weight matrix. In some embodiments, determining the weight matrix of the current block according to the weight matrix network model may include:
when the color component type of the current block is the luminance component, determining the luminance weight matrix using the luminance weight matrix network model;
when the color component type of the current block is the chrominance component, determining the chrominance weight matrix using the chrominance weight matrix network model.
In a specific example, when the color component type of the current block is the luminance component, determining the luminance weight matrix using the luminance weight matrix network model may include:
determining the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter;
inputting the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix.
In another specific example, when the color component type of the current block is the chrominance component, determining the chrominance weight matrix using the chrominance weight matrix network model may include:
determining the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter;
inputting the output chrominance reconstructed image block into the chrominance weight matrix network model to obtain the chrominance weight matrix.
It should be noted that, depending on the color component type, the input reconstructed image block may refer to the input luminance reconstructed image block or to the input chrominance reconstructed image block; likewise, the output reconstructed image block may refer to the output luminance reconstructed image block or to the output chrominance reconstructed image block. In this way, the luminance weight matrix can be obtained from the luminance weight matrix network model and the output luminance reconstructed image block, and the chrominance weight matrix can be obtained from the chrominance weight matrix network model and the output chrominance reconstructed image block.
S1002: Determine the value of at least one syntax element identification information.
It should be noted that there is no fixed order between step S1001 and step S1002: step S1001 may be performed first and step S1002 second, or step S1002 first and step S1001 second, or the two may even be performed in parallel; the embodiments of the present application place no limitation on this.
It should also be noted that whether the weight matrix is used for filtering may be indicated by at least one syntax element identification information. In this way, in the encoder, after the value of the at least one syntax element identification information is determined, that value is written into the code stream and transmitted to the decoder, so that the decoder can learn, by parsing the code stream, whether the weight matrix is used for filtering.
In some embodiments, determining the value of the at least one syntax element identification information may include:
if it is determined that the video sequence is filtered using the weight matrix, determining that the value of the first syntax element identification information is a first value; or,
if it is determined that the video sequence is not filtered using the weight matrix, determining that the value of the first syntax element identification information is a second value.
Further, the method also includes: encoding the value of the first syntax element identification information and writing the encoded bits into the code stream.
That is to say, a first syntax element identification information can first be set to indicate whether the current video sequence uses the loop filtering method of the embodiments of the present application (i.e. whether the weight matrix module is enabled). Here, if the weight matrix module of the video sequence is enabled, i.e. it is determined that the video sequence is filtered using the weight matrix, the value of the first syntax element identification information may be the first value; if the weight matrix module of the video sequence is not enabled, i.e. it is determined that the video sequence is not filtered using the weight matrix, the value of the first syntax element identification information may be the second value.
It should also be noted that the first value and the second value are different.
Taking the case where the first syntax element identification information is a flag as an example, the first value may be set to 1 and the second value to 0; or the first value may be set to true and the second value to false; or the first value may be set to 0 and the second value to 1; or the first value may be set to false and the second value to true. Exemplarily, for a flag, the first value is generally 1 and the second value is generally 0, but this is not limited in any way.
It should also be noted that the video sequence includes at least one frame, which may include the current frame. Here, when the video sequence is enabled to use the loop filtering method of the embodiments of the present application, i.e. the weight matrix module is enabled for the video sequence, the embodiments of the present application further need to judge whether the current frame in the video sequence is filtered using the weight matrix, i.e. a second syntax element identification information also needs to be set. The meaning of the second syntax element identification information differs depending on the color component type.
In the embodiments of the present application, the color component types may include the luminance component and the chrominance component. If the color component type is the luminance component, the second syntax element identification information is here assumed to be the first luminance syntax element identification information, indicating whether the luminance component of the current frame is filtered using the luminance weight matrix; if the color component type is the chrominance component, the second syntax element identification information is assumed to be the chrominance syntax element identification information, indicating whether the chrominance component of the current frame is filtered using the chrominance weight matrix.
In this way, after it is determined that the video sequence is filtered using the weight matrix, when the color component type of the current frame is the luminance component, the weight matrix is determined to be the luminance weight matrix. At this point, in a possible implementation, determining the value of the at least one syntax element identification information may further include:
determining a first rate-distortion cost value of filtering the luminance component of the current frame in the video sequence using the luminance weight matrix;
determining a second rate-distortion cost value of not filtering the luminance component of the current frame in the video sequence with the luminance weight matrix;
if the first rate-distortion cost value is less than the second rate-distortion cost value, determining that the value of the first luminance syntax element identification information is the first value; or,
if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, determining that the value of the first luminance syntax element identification information is the second value.
Further, the method also includes: encoding the value of the first luminance syntax element identification information and writing the encoded bits into the code stream.
It should be noted that, for the luminance component, if the value of the first luminance syntax element identification information is the first value, i.e. it is determined that the luminance component of the current frame is filtered using the luminance weight matrix, it is still necessary to judge whether the luminance component of the current block within the current frame is filtered using the luminance weight matrix. Therefore, in some embodiments, when the first rate-distortion cost value is less than the second rate-distortion cost value, determining the value of the at least one syntax element identification information may further include:
determining a third rate-distortion cost value of filtering the luminance component of the current block using the luminance weight matrix, and determining a fourth rate-distortion cost value of not filtering the luminance component of the current block with the luminance weight matrix;
if the third rate-distortion cost value is less than the fourth rate-distortion cost value, determining that the value of the second luminance syntax element identification information is the first value; or,
if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, determining that the value of the second luminance syntax element identification information is the second value.
Further, the method also includes: encoding the value of the second luminance syntax element identification information and writing the encoded bits into the code stream.
That is to say, for the luminance component two kinds of syntax elements are involved here: a frame-level syntax element and a CTU-level syntax element. The frame-level syntax element may be called the first luminance syntax element identification information, and the CTU-level syntax element may be called the second luminance syntax element identification information. Assuming that the first luminance syntax element identification information and the second luminance syntax element identification information are flag information, the first luminance syntax element identification information may be denoted luma_frame_weighting_matrix_flag and the second luminance syntax element identification information may be denoted luma_ctu_weighting_matrix_flag. Here, both values may be determined in a rate-distortion cost manner.
Taking the first luminance syntax element identification information as an example, for the current frame, in some embodiments the method may further include: performing block division on the current frame and determining at least one divided block, where the at least one divided block includes the current block.
Correspondingly, determining the first rate-distortion cost value of filtering the luminance component of the current frame in the video sequence using the luminance weight matrix may include:
separately calculating, for each of the at least one divided block, the third rate-distortion cost value of filtering its luminance component using the luminance weight matrix;
accumulating the calculated third rate-distortion cost values to obtain the first rate-distortion cost value.
Calculating the second rate-distortion cost value of not filtering the luminance component of the current frame in the video sequence with the luminance weight matrix may include:
separately calculating, for each of the at least one divided block, the fourth rate-distortion cost value of not filtering its luminance component with the luminance weight matrix;
accumulating the calculated fourth rate-distortion cost values to obtain the second rate-distortion cost value.
That is to say, the third rate-distortion cost value of filtering the luminance component of each block using the luminance weight matrix can be calculated, and the first rate-distortion cost value of the current frame can then be obtained by accumulation. In a specific example, in the calculation of the rate-distortion cost value, the distortion value may be determined according to the mean square error.
In one possible implementation, after the luminance component of each block is filtered using the luminance weight matrix, the target reconstructed image block of the luminance component of each block is obtained; the mean square error between the target reconstructed image block and the original image block is then calculated to obtain the mean square error value of each block; the third rate-distortion cost value of each block is calculated using the rate-distortion cost formula RDcost = D + λ*R, where D is the mean square error value of the block, R is 1, and λ is the same as the λ of the adaptive correction filter. Finally, the third rate-distortion cost values of the blocks are accumulated to obtain the first rate-distortion cost value of the current frame.
In another possible implementation, after the luminance component of each block is filtered using the luminance weight matrix, the target reconstructed image block of the luminance component of each block is obtained; the mean square error between the target reconstructed image block and the original image block is calculated to obtain the mean square error value of each block, and the mean square error value of the current frame is obtained by accumulation; the first rate-distortion cost value is then calculated using the rate-distortion cost formula RDcost = D + λ*R, where D here is the mean square error value of the current frame, R is the number of blocks included in the current frame, and λ is the same as the λ of the adaptive correction filter.
It should also be noted that the fourth rate-distortion cost value of not filtering the luminance component of each block with the luminance weight matrix can also be calculated, and the second rate-distortion cost value of the current frame can then be obtained by accumulation. Here, in the calculation of the rate-distortion cost value, the distortion value may also be determined according to the mean square error; in this case the mean square error refers to the mean square error between the output reconstructed image block that has not passed through the weight matrix module and the original image block. The other calculation operations are the same as those for the first rate-distortion cost value and are not detailed again here.
In this way, taking a first value of 1 and a second value of 0 as an example, after the first and second rate-distortion cost values are obtained, the two can be compared. If the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, the value of the first luminance syntax element identification information is determined to be 0, meaning that the luminance component of the current frame does not need to be filtered using the luminance weight matrix; the next frame can then be obtained from the video sequence, determined as the current frame, and the calculation of the first and second rate-distortion cost values continues. Otherwise, if the first rate-distortion cost value is less than the second rate-distortion cost value, the value of the first luminance syntax element identification information is determined to be 1, meaning that the luminance component of the current frame needs to be filtered using the luminance weight matrix; it is then necessary to continue judging whether the luminance component of the current block within the current frame is filtered using the luminance weight matrix, i.e. to compare the third rate-distortion cost value with the fourth rate-distortion cost value. If the third rate-distortion cost value is less than the fourth rate-distortion cost value, the value of the second luminance syntax element identification information is determined to be 1, meaning that the luminance component of the current block is filtered using the luminance weight matrix; otherwise, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, the value of the second luminance syntax element identification information is determined to be 0, meaning that the luminance component of the current block does not need to be filtered with the luminance weight matrix; the next block is then obtained from the current frame, determined as the current block, and the calculation of the third and fourth rate-distortion cost values continues.
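To make the frame-level and CTU-level decision above concrete, the following sketch compares MSE-based rate-distortion costs per block and per frame; the helper names, the NumPy dependency, the tuple layout of the per-CTU data and the per-block R = 1 convention follow the first implementation described above and are otherwise assumptions.

```python
import numpy as np

def rd_cost(filtered_block, original_block, lam, rate=1.0):
    """RDcost = D + lam * R, with D the mean square error of the block (first implementation above)."""
    d = np.mean((filtered_block.astype(np.float64) - original_block.astype(np.float64)) ** 2)
    return d + lam * rate

def decide_luma_flags(blocks, lam):
    """blocks: list of (weighted_block, unweighted_block, original_block) per CTU (assumed layout).
    Returns the frame-level flag and the per-CTU flags according to the comparison rules above."""
    cost_with = [rd_cost(w, o, lam) for w, _, o in blocks]       # third rate-distortion cost values
    cost_without = [rd_cost(u, o, lam) for _, u, o in blocks]    # fourth rate-distortion cost values
    frame_flag = 1 if sum(cost_with) < sum(cost_without) else 0  # first vs second cost value
    ctu_flags = [1 if cw < cwo else 0
                 for cw, cwo in zip(cost_with, cost_without)] if frame_flag else []
    return frame_flag, ctu_flags
```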
In addition, for the luminance component, the embodiments of the present application may also set a luminance frame-level flag bit and a luminance CTU-level flag bit, which control whether the weight matrix module is turned on and thereby determine whether the luminance weight matrix is used for filtering.
For the luminance frame-level flag bit, in some embodiments the method may further include: setting a luminance frame-level flag bit, which is used to control whether the luminance component of the current frame is filtered using the luminance weight matrix;
correspondingly, the method may further include:
if the first rate-distortion cost value is less than the second rate-distortion cost value, turning on the luminance frame-level flag bit; or,
if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, turning off the luminance frame-level flag bit.
For the luminance CTU-level flag bit, in some embodiments the method may further include: setting a luminance CTU-level flag bit, which is used to control whether the luminance component of the current block is filtered using the luminance weight matrix;
correspondingly, the method may further include:
if the third rate-distortion cost value is less than the fourth rate-distortion cost value, turning on the luminance CTU-level flag bit; or,
if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, turning off the luminance CTU-level flag bit.
It should be noted that, for both the luminance frame-level flag bit and the luminance CTU-level flag bit, whether to turn them on can also be determined in a rate-distortion cost manner. Here, in one possible implementation, the decision can be made according to the magnitude of the calculated rate-distortion cost values.
In another possible implementation, the luminance frame-level flag bit is still determined according to RDcost = D + λ*R. Here, D represents the change in distortion of the current frame after processing by the weight matrix module, D = D_out - D_rec (D_out is the distortion after processing by the weight matrix module, D_rec is the distortion before processing by the weight matrix module), R is the number of blocks included in the current frame, and λ is the same as the λ of the adaptive correction filter. In this case, when RDcost is negative, the luminance frame-level flag bit is turned on, i.e. the frame-level weight matrix module is turned on; otherwise the luminance frame-level flag bit is turned off, i.e. the frame-level weight matrix module is turned off.
When the luminance frame-level flag bit is turned on, the luminance CTU-level flag bit can be determined according to RDcost = D. Here, D represents the change in distortion of the current block after processing by the weight matrix module, D = D_out - D_rec (D_out is the distortion after processing by the weight matrix module, D_rec is the distortion before processing by the weight matrix module).
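A compact sketch of this alternative flag decision, assuming MSE as the distortion measure and a hypothetical per-block helper `mse`; the sign convention follows the text (a negative RDcost, i.e. a net distortion reduction, turns the flag on), and the per-CTU tuple layout is an assumption.

```python
import numpy as np

def mse(a, b):
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def frame_and_ctu_flags(blocks, lam):
    """blocks: list of (weighted_block, unweighted_block, original_block) per CTU (assumed layout)."""
    # D = D_out - D_rec: distortion after the weight matrix module minus distortion before it.
    deltas = [mse(w, o) - mse(u, o) for w, u, o in blocks]
    frame_rdcost = sum(deltas) + lam * len(blocks)   # RDcost = D + lam * R, R = number of CTUs
    frame_flag = frame_rdcost < 0                    # negative cost: turn the frame-level flag on
    ctu_flags = [d < 0 for d in deltas] if frame_flag else []  # CTU level: RDcost = D
    return frame_flag, ctu_flags
```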
Further, after it is determined that the video sequence is filtered using the weight matrix, when the color component type of the current frame is the chrominance component, the weight matrix is determined to be the chrominance weight matrix. At this point, in another possible implementation, determining the value of the at least one syntax element identification information may further include:
determining a fifth rate-distortion cost value of filtering the chrominance component of the current frame in the video sequence using the chrominance weight matrix;
determining a sixth rate-distortion cost value of not filtering the chrominance component of the current frame in the video sequence with the chrominance weight matrix;
if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determining that the value of the chrominance syntax element identification information is the first value; or,
if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, determining that the value of the chrominance syntax element identification information is the second value.
Further, the method may also include: encoding the value of the chrominance syntax element identification information and writing the encoded bits into the code stream.
It should be noted that, for the chrominance component, a frame-level syntax element is involved here. The frame-level syntax element may be called the chrominance syntax element identification information; assuming that it is flag information, it may be denoted chroma_frame_weighting_matrix_flag.
It should also be noted that, for the first value and the second value, the first value may be set to 1 and the second value to 0; or the first value may be set to true and the second value to false; or the first value may be set to 0 and the second value to 1; or the first value may be set to false and the second value to true. Exemplarily, the first value is generally 1 and the second value is generally 0, but this is not limited in any way.
Further, in consideration of coding performance and computational complexity, if the chrominance syntax element identification information indicates that the chrominance component of the current frame is filtered using the chrominance weight matrix, all blocks included in the current frame are by default filtered using the chrominance weight matrix; if the chrominance syntax element identification information indicates that the chrominance component of the current frame is not filtered with the chrominance weight matrix, by default none of the blocks included in the current frame are filtered with the chrominance weight matrix. Therefore, for the chrominance component there is no need to set a CTU-level syntax element, and likewise no need to set a CTU-level flag bit.
In other words, for the chrominance component, the embodiments of the present application may set only a frame-level flag bit. Therefore, in some embodiments, the method may further include: setting a chrominance frame-level flag bit, which is used to control whether the chrominance component of the current frame is filtered using the chrominance weight matrix;
correspondingly, the method may further include:
if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, turning on the chrominance frame-level flag bit; or,
if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, turning off the chrominance frame-level flag bit.
It should be noted that, for the fifth and sixth rate-distortion cost values, in a specific example the distortion value may also be determined according to the mean square error, and the other calculation operations are the same as those for the first and second rate-distortion cost values, so they are not detailed again here. In addition, whether the chrominance frame-level flag bit is turned on is determined in the same way as whether the luminance frame-level flag bit is turned on, which is also not detailed again here.
In this way, taking a first value of 1 and a second value of 0 as an example, after the fifth and sixth rate-distortion cost values are obtained, the two can be compared. If the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, the chrominance frame-level flag bit is turned on and the value of the chrominance syntax element identification information is determined to be 1, meaning that the chrominance component of the current frame is filtered using the chrominance weight matrix; after the current frame is processed, the next frame is loaded for processing. Otherwise, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, the chrominance frame-level flag bit is turned off and the value of the chrominance syntax element identification information is determined to be 0, meaning that the chrominance component of the current frame does not need to be filtered with the chrominance weight matrix; the next frame is then obtained from the video sequence, determined as the current frame, and loaded for processing in order to determine the values of its syntax element identification information.
S1003: When the at least one syntax element identification information indicates that the current frame or the current block is filtered using the weight matrix, determine the target reconstructed image block of the current block using the weight matrix.
It should be noted that the color component types may include the luminance component and the chrominance component. The fact that the current frame is filtered using the weight matrix does not mean that every block in the current frame is filtered using the weight matrix; CTU-level syntax element identification information may also be involved in order to further determine whether the current block is filtered using the weight matrix. The two color component types, luminance and chrominance, are described separately below.
In some embodiments, for the luminance component, when the at least one syntax element identification information indicates that the current frame or the current block is filtered using the weight matrix, determining the target reconstructed image block of the current block using the weight matrix may include:
if the third rate-distortion cost value is less than the fourth rate-distortion cost value, determining the target reconstructed image block of the current block using the luminance weight matrix.
In some embodiments, for the chrominance component, when the at least one syntax element identification information indicates that the current frame or the current block is filtered using the weight matrix, determining the target reconstructed image block of the current block using the weight matrix may include:
if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determining the target reconstructed image block of the current block using the chrominance weight matrix.
It should be noted that, for the luminance component, two kinds of syntax elements are needed: a frame-level syntax element and a CTU-level syntax element. Only when the CTU-level syntax element (i.e. the second luminance syntax element identification information) indicates that the current block is filtered using the weight matrix, i.e. the third rate-distortion cost value is less than the fourth rate-distortion cost value, can the luminance weight matrix be used to determine the target reconstructed image block of the current block (specifically, the target reconstructed image block of the luminance component of the current block).
For the chrominance component, only one kind of syntax element may be needed: the frame-level syntax element. Only when the frame-level syntax element (i.e. the chrominance syntax element identification information) indicates that the current frame is filtered using the weight matrix, i.e. the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, are all blocks in the current frame by default filtered using the chrominance weight matrix; in this case the weight matrix can also be used to determine the target reconstructed image block of the current block (specifically, the target reconstructed image block of the chrominance component of the current block).
Further, in some embodiments, determining the target reconstructed image block of the current block using the weight matrix may include:
performing weighting processing on the input reconstructed image block and the output reconstructed image block using the weight matrix to obtain the target reconstructed image block.
In a specific example, with the weight matrix denoted Weighting Matrix, the target reconstructed image block is: target reconstructed image block = output reconstructed image block × Weighting Matrix + input reconstructed image block × (1 - Weighting Matrix).
In this way, by weighting the input and output of the CNNLF with the weight matrix, the output of the neural network loop filter can be made closer to the original image.
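As a minimal sketch of this weighting step, assuming NumPy arrays of the same shape for the CNNLF input block, the CNNLF output block and the per-pixel weight matrix (the names are illustrative):

```python
import numpy as np

def weighted_reconstruction(i_rec, i_net, w):
    """Target reconstructed block = output block * W + input block * (1 - W), applied per pixel.
    i_rec: CNNLF input reconstructed block; i_net: CNNLF output reconstructed block; w: weight matrix."""
    return w * i_net + (1.0 - w) * i_rec

# Usage (hypothetical): i_out = weighted_reconstruction(i_rec, i_net, model(i_net))
```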
It should also be noted that, depending on the color component type, the weight matrix includes a luminance weight matrix and a chrominance weight matrix; accordingly, the target reconstructed image block may also include a target reconstructed image block of the luminance component and a target reconstructed image block of the chrominance component. In a specific example, for the luminance component, determining the luminance weight matrix using the luminance weight matrix network model may include:
determining the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter;
inputting the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix;
further, determining the target reconstructed image block of the current block using the weight matrix may include:
performing weighting processing on the input luminance reconstructed image block and the output luminance reconstructed image block using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
In another specific example, for the chrominance component, determining the chrominance weight matrix using the chrominance weight matrix network model may include:
determining the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter;
inputting the output chrominance reconstructed image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
further, determining the target reconstructed image block of the current block using the weight matrix may include:
performing weighting processing on the input chrominance reconstructed image block and the output chrominance reconstructed image block using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
It should be noted that, depending on the color component type, the input reconstructed image block may refer to the input luminance reconstructed image block or to the input chrominance reconstructed image block, and the output reconstructed image block may refer to the output luminance reconstructed image block or to the output chrominance reconstructed image block. In this way, the target reconstructed image block of the luminance component can be obtained by weighting the input luminance reconstructed image block and the output luminance reconstructed image block, and the target reconstructed image block of the chrominance component can be obtained by weighting the input chrominance reconstructed image block and the output chrominance reconstructed image block.
It should also be noted that the input reconstructed image block (including the input luminance reconstructed image block or the input chrominance reconstructed image block) here may be the block obtained after filtering by the deblocking filter and the sample adaptive offset filter.
In addition, in some embodiments, the method may further include: after the target reconstructed image block of the current block is determined, continuing to filter the target reconstructed image block using the adaptive correction filter.
That is to say, if the current block is filtered using the weight matrix, the input of the adaptive correction filter is the target reconstructed image block; if the current block is not filtered with the weight matrix, the output of the CNNLF does not need to be weighted, i.e. the input of the adaptive correction filter is the output reconstructed image block.
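The choice of the ALF input described above reduces to a simple selection; a minimal sketch, assuming the per-block decision and the two candidate blocks are already available (names are illustrative):

```python
def alf_input_block(use_weight_matrix, target_block, cnnlf_output_block):
    """If the block is filtered with the weight matrix, the ALF sees the target reconstructed block;
    otherwise it sees the CNNLF output reconstructed block unchanged."""
    return target_block if use_weight_matrix else cnnlf_output_block
```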
In short, the embodiments of the present application propose a deep-learning-based weight matrix for weighting the output of the neural network loop filter, so that the output of the neural network loop filter is closer to the original image. Here, the network structure corresponding to the weight matrix network model may be a multi-layer convolutional neural network; the input is the output reconstructed image block of the neural network loop filter, and the output is the new reconstructed image block obtained after weighting (i.e. the "target reconstructed image block" of the foregoing embodiments). The position of the weight matrix module in the encoder/decoder is shown in FIG. 7, where Weighting Matrix is the weight matrix module. The use of this weight matrix module does not depend on the flag bits of the DBF, SAO, ALF or CNNLF; it is simply placed after the CNNLF and before the ALF.
In addition, whether the deep-learning-based weight matrix is selected may be decided by the encoder. On the encoder side, when entering the loop filtering module, the input reconstructed image block of the CNNLF may be denoted I_rec and the output reconstructed image block of the CNNLF denoted I_net. I_net is used as the input of the weight matrix network model, whose output gives the weight matrix W; the final output I_out can then be obtained by formula (3), as follows:
I_out = W * I_net + (1 - W) * I_rec       (3)
Further, the embodiments of the present application set a frame-level flag bit and a CTU-level flag bit for the luminance component to control whether the module is turned on, and a frame-level flag bit for the chrominance component to control whether it is turned on. Specifically, the frame-level flag bit may be determined by formula (4), where D = D_out - D_rec represents the change in distortion after processing by the weight matrix module (D_out is the distortion after processing by the weight matrix module, D_rec is the distortion before processing by the weight matrix module), R is the number of CTUs of the current frame, and λ is the same as the λ of the adaptive correction filter. In this case, when RDcost is negative, the frame-level weight matrix module is turned on; otherwise the frame-level weight matrix module is turned off.
RDcost = D + λ*R          (4)
When the frame-level flag bit is turned on, whether each CTU turns on the weight matrix module can further be decided in a rate-distortion cost manner. Specifically, the CTU-level flag bit may be determined by formula (5).
RDcost = D           (5)
Here, the flag bit status of the weight matrix module is encoded into the code stream for the decoder to read. On the decoder side, when it is parsed that the weight matrix module is turned on, the input and output of the CNNLF are weighted according to formula (3); when it is parsed that the weight matrix module is turned off, no weighting is performed.
在一种具体的示例中,本申请实施例的技术方案作用在编码器的环路滤波模块中,其具体流程如下:In a specific example, the technical solutions of the embodiments of the present application act on the loop filtering module of the encoder, and the specific process is as follows:
编码器进入环路滤波模块时,按照预设的滤波器顺序进行处理。这里,预设的滤波器顺序为DBF滤波---->SAO滤波---->CNNLF滤波---->权重矩阵模块---->ALF滤波。当进入权重矩阵模块时,When the encoder enters the loop filter module, it is processed according to the preset filter order. Here, the preset filter sequence is DBF filtering---->SAO filtering---->CNNLF filtering---->weight matrix module---->ALF filtering. When entering the weight matrix module,
(a)首先根据当前块的颜色分量类型和量化参数,选择对应的权重矩阵网络模型。将CNNLF的输出重建图像块作为该权重矩阵网络模型的输入,该权重矩阵网络模型的输出得到权重矩阵。将CNNLF的输出重建图像块和CNNLF的输入重建图像块根据所得到的权重矩阵进行加权处理,可以得到最终的目标重建图像块;(a) First, select the corresponding weight matrix network model according to the color component type and quantization parameter of the current block. The output reconstructed image block of CNNLF is used as the input of the weight matrix network model, and the output of the weight matrix network model obtains the weight matrix. The output reconstructed image block of CNNLF and the input reconstructed image block of CNNLF are weighted according to the obtained weight matrix, and the final target reconstructed image block can be obtained;
(b) For the luminance component, first calculate the mean square error between the output reconstructed image block and the original image block, and accumulate these values to obtain the mean square error of the current frame. Then make a decision by means of the rate-distortion cost: if it is determined that the rate-distortion cost with the weight matrix module is smaller than that without the weight matrix module, set luma_frame_weighting_matrix_flag to "1"; otherwise, set it to "0". If luma_frame_weighting_matrix_flag is "0", skip to (d). Otherwise, further make a decision for each CTU of the current frame by means of the rate-distortion cost based on its mean square error: if it is determined that the rate-distortion cost with the weight matrix module is smaller than that without the weight matrix module, set luma_ctu_weighting_matrix_flag to "1"; otherwise, set it to "0";
(c) For the chrominance component, first calculate the mean square error between the output reconstructed image block and the original image block, and accumulate these values to obtain the mean square error of the current frame. Then make a decision by means of the rate-distortion cost: if it is determined that the rate-distortion cost with the weight matrix module is smaller than that without the weight matrix module, set chroma_frame_weighting_matrix_flag to "1"; otherwise, set it to "0";
(d)若当前帧已完成权重矩阵模块的决策,则加载下一帧进行处理,然后跳转至(a)。(d) If the current frame has completed the decision of the weight matrix module, load the next frame for processing, and then jump to (a).
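The per-frame decision logic in steps (b) and (c) can be summarized by the following sketch, which accumulates per-CTU mean square errors and reuses the decision rule of equations (4) and (5). The function names and the list-based data layout are assumptions for illustration; for the chrominance component only the frame-level flag would be kept.

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def decide_flags_for_frame(orig_ctus, cnnlf_ctus, weighted_ctus, lam):
    """Steps (b)/(c) for one colour component of one frame.

    orig_ctus, cnnlf_ctus, weighted_ctus: lists of per-CTU arrays for the original
    picture, the CNNLF output, and the weight-matrix-processed output.
    Returns (frame_flag, ctu_flags)."""
    # Per-CTU distortion change D = D_out - D_rec (negative when the weighting helps).
    d_per_ctu = [mse(o, w) - mse(o, c)
                 for o, c, w in zip(orig_ctus, cnnlf_ctus, weighted_ctus)]
    # Frame-level decision, equation (4): RDcost = sum(D) + lambda * number_of_CTUs.
    frame_flag = 1 if (sum(d_per_ctu) + lam * len(d_per_ctu)) < 0 else 0
    # CTU-level decisions, equation (5): RDcost = D per CTU (only when the frame flag is on).
    ctu_flags = [1 if d < 0 else 0 for d in d_per_ctu] if frame_flag else []
    return frame_flag, ctu_flags
```

For the luminance component the returned values would map to luma_frame_weighting_matrix_flag and luma_ctu_weighting_matrix_flag, and for the chrominance component only the frame-level value would map to chroma_frame_weighting_matrix_flag.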
在实现中,对于亮度分量而言,其语法元素的修改如表1所示;对于色度分量而言,其语法元素的修改如表2所示。In the implementation, for the luma component, the modification of its syntax elements is shown in Table 1; for the chroma component, the modification of its syntax elements is shown in Table 2.
That is to say, in the embodiments of the present application, the reconstructed image block output by the CNNLF of HPM-ModAI is input into the weight matrix network model (a multi-layer convolutional neural network), which extracts feature information and outputs a weight matrix, thereby providing pixel-level weighting for the use of the CNNLF output and improving the coding performance.
在本申请的又一实施例中,本申请实施例提供了一种码流,该码流是根据至少一个语法元素标识信息的取值进行比特编码生成的。In another embodiment of the present application, the embodiment of the present application provides a code stream, where the code stream is generated by bit encoding according to the value of at least one syntax element identification information.
In the embodiments of the present application, the at least one syntax element identification information may at least include: first syntax element identification information, first luminance syntax element identification information, second luminance syntax element identification information, and chrominance syntax element identification information.
The first syntax element identification information is used to indicate whether the video sequence is filtered using the weight matrix, the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the weight matrix, the second luminance syntax element identification information is used to indicate whether the luminance component of the current block is filtered using the weight matrix, and the chrominance syntax element identification information is used to indicate whether the chrominance component of the current frame is filtered using the weight matrix; wherein the video sequence may include the current frame, and the current frame may include the current block.
It should also be noted that the first luminance syntax element identification information and the chrominance syntax element identification information may be frame-level flags, and the second luminance syntax element identification information may be a CTU-level flag. Here, the luminance frame-level flag of the neural-network-based weight matrix (that is, the first luminance syntax element identification information) is luma_frame_weighting_matrix_flag; the luminance CTU-level flag of the neural-network-based weight matrix (that is, the second luminance syntax element identification information) is luma_ctu_weighting_matrix_flag; and the chrominance frame-level flag of the neural-network-based weight matrix (that is, the chrominance syntax element identification information) is chroma_frame_weighting_matrix_flag.
This embodiment provides an encoding method, and specifically a loop filtering method, applied to an encoder. The weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the value of at least one syntax element identification information is determined; and when the at least one syntax element identification information indicates that the current frame or the current block is filtered using the weight matrix, the weight matrix is used to determine the target reconstructed image block of the current block. In this way, by using the weight matrix network model to determine the weight matrix, not only can a deep-learning-based weight matrix technique be realized, providing pixel-level weighting for the use of the output reconstructed image block, but the coding performance can also be improved, thereby improving the coding and decoding efficiency; in addition, the present application can make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图11,其示出了本申请实施例提供的一种编码器110的组成结构示意图。如图11所示,该编码器110可以包括:第一确定单元1101和第一滤波单元1102;其中,In yet another embodiment of the present application, based on the same inventive concept as the foregoing embodiments, see FIG. 11 , which shows a schematic structural diagram of an encoder 110 provided by an embodiment of the present application. As shown in FIG. 11 , the encoder 110 may include: a first determining unit 1101 and a first filtering unit 1102; wherein,
第一确定单元1101,配置为确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵;以及还配置为确定至少一个语法元素标识信息的取值;The first determining unit 1101 is configured to determine a weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model; and is also configured to determine the value of at least one syntax element identification information;
第一滤波单元1102,配置为当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,利用权重矩阵确定当前块的目标重建图像块。The first filtering unit 1102 is configured to use the weight matrix to determine the target reconstructed image block of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing.
In some embodiments, the first determining unit 1101 is further configured to determine the luminance weight matrix network model of the current block when the color component type of the current block is a luminance component, and to determine the chrominance weight matrix network model of the current block when the color component type of the current block is a chrominance component.
In some embodiments, the first determining unit 1101 is further configured to, in a case where the color component type of the current block is a luminance component, determine at least one candidate luminance weight matrix network model; determine the quantization parameter of the current block, and select, from the at least one candidate luminance weight matrix network model, the candidate luminance weight matrix network model corresponding to the quantization parameter; and determine the selected candidate luminance weight matrix network model as the luminance weight matrix network model of the current block.
The first determining unit 1101 is further configured to, in a case where the color component type of the current block is a chrominance component, determine at least one candidate chrominance weight matrix network model; determine the quantization parameter of the current block, and select, from the at least one candidate chrominance weight matrix network model, the candidate chrominance weight matrix network model corresponding to the quantization parameter; and determine the selected candidate chrominance weight matrix network model as the chrominance weight matrix network model of the current block.
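The selection of a candidate model by colour component and quantization parameter could look roughly as follows. The dictionary layout and the nearest-QP fallback are assumptions; the text only requires that the candidate model corresponding to the quantization parameter be selected.

```python
def select_weight_matrix_model(candidate_models: dict, component: str, qp: int):
    """candidate_models maps (component, qp) -> trained model, e.g. ("luma", 38) -> model."""
    key = (component, qp)
    if key in candidate_models:
        return candidate_models[key]
    # Fallback (assumed): pick the candidate trained for the closest QP of the same component.
    same_component = [k for k in candidate_models if k[0] == component]
    nearest = min(same_component, key=lambda k: abs(k[1] - qp))
    return candidate_models[nearest]
```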
在一些实施例中,参见图11,编码器110还可以包括第一训练单元1103;In some embodiments, referring to FIG. 11 , the encoder 110 may further include a first training unit 1103;
第一确定单元1101,还配置为确定至少一个训练样本,其中,训练样本是根据至少一种量化参数得到的;The first determining unit 1101 is further configured to determine at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
The first training unit 1103 is configured to train a preset neural network model using the luminance component of the at least one training sample to obtain at least one candidate luminance weight matrix network model, and to train the preset neural network model using the chrominance component of the at least one training sample to obtain at least one candidate chrominance weight matrix network model; wherein the at least one candidate luminance weight matrix network model corresponds to the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model corresponds to the chrominance component and the quantization parameter.
在一些实施例中,预设神经网络模型包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。In some embodiments, the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer, and a skip connection layer.
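One possible shape of such a preset neural network model is sketched below in PyTorch. The number of layers, the channel width, the single-channel input, and the sigmoid used to keep the weights in (0, 1) are all assumptions for illustration; the text only states that the model may include convolutional layers, activation layers, and a skip connection layer.

```python
import torch
import torch.nn as nn

class WeightMatrixNet(nn.Module):
    """Hypothetical preset neural network model: a few convolutional layers with
    activations and a skip connection, producing a one-channel weight matrix with
    the same spatial size as the input reconstructed image block."""
    def __init__(self, channels: int = 32, num_layers: int = 3):
        super().__init__()
        layers = [nn.Conv2d(1, channels, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(num_layers - 1):
            layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
        self.features = nn.Sequential(*layers)
        self.head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Skip connection (assumed placement): add the input block to the head output,
        # so the network learns a residual, then squash the result to (0, 1).
        w = self.head(self.features(x)) + x
        return torch.sigmoid(w)
```

A luminance model and a chrominance model of this form would then be trained per quantization parameter, with the CNNLF output block as input and a loss measured, for example, between the weighted reconstruction and the original block; the loss choice here is likewise an assumption.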
In some embodiments, the first determining unit 1101 is further configured to determine that the value of the first syntax element identification information is the first value if it is determined that the video sequence is filtered using the weight matrix; or to determine that the value of the first syntax element identification information is the second value if it is determined that the video sequence is not filtered using the weight matrix.
在一些实施例中,参见图11,编码器110还可以包括编码单元1104,配置为对第一语法元素标识信息的取值进行编码,将编码比特写入码流。In some embodiments, referring to FIG. 11 , the encoder 110 may further include an encoding unit 1104, configured to encode the value of the identification information of the first syntax element, and write the encoded bits into the code stream.
In some embodiments, after it is determined that the video sequence is filtered using the weight matrix, the first determining unit 1101 is further configured to: when the color component type of the current frame is a luminance component, determine that the weight matrix is a luminance weight matrix; determine a first rate-distortion cost value for filtering the luminance component of the current frame in the video sequence using the luminance weight matrix; determine a second rate-distortion cost value for not filtering the luminance component of the current frame in the video sequence using the luminance weight matrix; and if the first rate-distortion cost value is less than the second rate-distortion cost value, determine that the value of the first luminance syntax element identification information is the first value; or, if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, determine that the value of the first luminance syntax element identification information is the second value.
在一些实施例中,编码单元1104,还配置为对第一亮度语法元素标识信息的取值进行编码,将编码比特写入码流。In some embodiments, the encoding unit 1104 is further configured to encode the value of the identification information of the first luma syntax element, and write the encoded bits into the code stream.
In some embodiments, referring to FIG. 11 , the encoder 110 may further include a first setting unit 1105, configured to set a luminance frame-level flag bit, where the luminance frame-level flag bit is used to control whether the luminance component of the current frame is filtered using the weight matrix;
Correspondingly, the first determining unit 1101 is further configured to turn on the luminance frame-level flag bit if the first rate-distortion cost value is less than the second rate-distortion cost value; or to turn off the luminance frame-level flag bit if the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value.
In some embodiments, the first determining unit 1101 is further configured to perform block division on the current frame to determine at least one divided block, where the at least one divided block includes the current block; calculate, for each of the at least one divided block, a third rate-distortion cost value for filtering its luminance component using the luminance weight matrix, and accumulate the calculated third rate-distortion cost values to obtain the first rate-distortion cost value; and calculate, for each of the at least one divided block, a fourth rate-distortion cost value for not filtering its luminance component using the luminance weight matrix, and accumulate the calculated fourth rate-distortion cost values to obtain the second rate-distortion cost value.
In some embodiments, the first determining unit 1101 is further configured to, when the first rate-distortion cost value is less than the second rate-distortion cost value, determine the third rate-distortion cost value for filtering the luminance component of the current block using the luminance weight matrix; determine the fourth rate-distortion cost value for not filtering the luminance component of the current block using the luminance weight matrix; and if the third rate-distortion cost value is less than the fourth rate-distortion cost value, determine that the value of the second luminance syntax element identification information is the first value; or, if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, determine that the value of the second luminance syntax element identification information is the second value.
在一些实施例中,编码单元1104,还配置为对第二亮度语法元素标识信息的取值进行编码,将编码比特写入码流。In some embodiments, the encoding unit 1104 is further configured to encode the value of the identification information of the second luma syntax element, and write the encoded bits into the code stream.
在一些实施例中,第一设置单元1105,还配置为设置亮度CTU级标志位,亮度CTU级标志位用于控制当前块的亮度分量是否使用亮度权重矩阵进行滤波处理;In some embodiments, the first setting unit 1105 is further configured to set the luminance CTU level flag bit, and the luminance CTU level flag bit is used to control whether the luminance component of the current block is filtered using the luminance weight matrix;
Correspondingly, the first determining unit 1101 is further configured to turn on the luminance CTU-level flag bit if the third rate-distortion cost value is less than the fourth rate-distortion cost value; or to turn off the luminance CTU-level flag bit if the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value.
在一些实施例中,第一确定单元1101,还配置为若第三率失真代价值小于第四率失真代价值,则利用亮度权重矩阵确定当前块的目标重建图像块。In some embodiments, the first determining unit 1101 is further configured to use the luminance weight matrix to determine the target reconstructed image block of the current block if the third rate-distortion cost value is less than the fourth rate-distortion cost value.
In some embodiments, after it is determined that the video sequence is filtered using the weight matrix, the first determining unit 1101 is further configured to: when the color component type of the current frame is a chrominance component, determine that the weight matrix is a chrominance weight matrix; determine a fifth rate-distortion cost value for filtering the chrominance component of the current frame in the video sequence using the chrominance weight matrix; determine a sixth rate-distortion cost value for not filtering the chrominance component of the current frame in the video sequence using the chrominance weight matrix; and if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determine that the value of the chrominance syntax element identification information is the first value; or, if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, determine that the value of the chrominance syntax element identification information is the second value.
在一些实施例中,编码单元1104,还配置为对色度语法元素标识信息的取值进行编码,将编码比特写入码流。In some embodiments, the encoding unit 1104 is further configured to encode the value of the identification information of the chroma syntax element, and write the encoded bits into the code stream.
在一些实施例中,第一设置单元1105,还配置为设置色度帧级标志位,色度帧级标志位用于控制当前帧的色度分量是否使用权重矩阵进行滤波处理;In some embodiments, the first setting unit 1105 is further configured to set a chroma frame-level flag bit, and the chroma frame-level flag bit is used to control whether the chroma components of the current frame are filtered using a weight matrix;
Correspondingly, the first determining unit 1101 is further configured to turn on the chrominance frame-level flag bit if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value; or to turn off the chrominance frame-level flag bit if the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value.
在一些实施例中,第一值为1,第二值为0。In some embodiments, the first value is one and the second value is zero.
在一些实施例中,第一确定单元1101,还配置为若第五率失真代价值小于第六率失真代价值,则利用色度权重矩阵确定当前块的目标重建图像块。In some embodiments, the first determining unit 1101 is further configured to use the chrominance weight matrix to determine the target reconstructed image block of the current block if the fifth rate-distortion cost value is less than the sixth rate-distortion cost value.
In some embodiments, the first determining unit 1101 is further configured to determine the input reconstructed image block and the output reconstructed image block of the neural network loop filter, and to input the output reconstructed image block into the weight matrix network model to obtain the weight matrix of the current block.
在一些实施例中,第一滤波单元1102,具体配置为利用权重矩阵对输入重建图像块和输出重建图像块进行加权处理,得到目标重建图像块。In some embodiments, the first filtering unit 1102 is specifically configured to use a weight matrix to perform weighting processing on the input reconstructed image block and the output reconstructed image block to obtain the target reconstructed image block.
在一些实施例中,第一确定单元1101,还配置为当当前块的颜色分量类型为亮度分量时,利用亮度权重矩阵网络模型确定亮度权重矩阵;当当前块的颜色分量类型为色度分量时,利用色度权重矩阵网络模型确定色度权重矩阵。In some embodiments, the first determining unit 1101 is further configured to use a luminance weight matrix network model to determine a luminance weight matrix when the color component type of the current block is a luminance component; when the color component type of the current block is a chrominance component , using the chrominance weight matrix network model to determine the chrominance weight matrix.
In some embodiments, the first determining unit 1101 is further configured to, when the color component type of the current block is a luminance component, determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter, and input the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix;
第一滤波单元1102,还配置为利用亮度权重矩阵对输入亮度重建图像块和输出亮度重建图像块进行加权处理,得到当前块的亮度分量的目标重建图像块。The first filtering unit 1102 is further configured to perform weighting processing on the input luminance reconstructed image block and the output luminance reconstructed image block by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
在一些实施例中,第一确定单元1101,还配置为当当前块的颜色分量类型为色度分量时,确定神经网络环路滤波器的输入色度重建图像块和输出色度重建图像块;以及将输出色度重建图像块输入色度权重矩阵网络模型,得到色度权重矩阵;In some embodiments, the first determining unit 1101 is further configured to determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter when the color component type of the current block is a chrominance component; and inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
第一滤波单元1102,还配置为利用色度权重矩阵对输入色度重建图像块和输出色度重建图像块进行加权处理,得到当前块的色度分量的目标重建图像块。The first filtering unit 1102 is further configured to perform weighting processing on the input chrominance reconstructed image block and the output chrominance reconstructed image block by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
在一些实施例中,输入重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。In some embodiments, the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
在一些实施例中,第一滤波单元1102,还配置为在确定出当前块的目标重建图像块之后,利用自适应修正滤波器继续对目标重建图像块进行滤波处理。In some embodiments, the first filtering unit 1102 is further configured to, after determining the target reconstructed image block of the current block, use an adaptive correction filter to continue filtering the target reconstructed image block.
可以理解地,在本申请实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。It can be understood that, in the embodiments of the present application, a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, it may also be a module, and it may also be non-modular. Moreover, each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of this embodiment, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Therefore, an embodiment of the present application provides a computer storage medium, applied to the encoder 110, where the computer storage medium stores a computer program, and the computer program, when executed by a first processor, implements the method described in any one of the foregoing embodiments.
Based on the composition of the encoder 110 and the computer storage medium described above, see FIG. 12 , which shows a schematic diagram of a specific hardware structure of the encoder 110 provided by an embodiment of the present application. As shown in FIG. 12 , the encoder 110 may include: a first communication interface 1201, a first memory 1202, and a first processor 1203; the components are coupled together through a first bus system 1204. It can be understood that the first bus system 1204 is used to implement connection and communication between these components. In addition to a data bus, the first bus system 1204 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, the various buses are all labeled as the first bus system 1204 in FIG. 12 . Wherein,
第一通信接口1201,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;The first communication interface 1201 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
第一存储器1202,用于存储能够在第一处理器1203上运行的计算机程序;a first memory 1202 for storing computer programs that can run on the first processor 1203;
第一处理器1203,用于在运行所述计算机程序时,执行:The first processor 1203 is configured to, when running the computer program, execute:
确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵;Determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
确定至少一个语法元素标识信息的取值;Determine the value of at least one syntax element identification information;
当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,利用权重矩阵确定当前块的目标重建图像块。When the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block.
It can be understood that the first memory 1202 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct memory bus random access memory (Direct Rambus RAM, DRRAM). The first memory 1202 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
而第一处理器1203可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过第一处理器1203中的硬件的集成逻辑电路或者软件形式的指令完成。上述的第一处理器1203可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于第一存储器1202,第一处理器1203读取第一存储器1202中的信息,结合其硬件完成上述方法的步骤。The first processor 1203 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the first processor 1203 or an instruction in the form of software. The above-mentioned first processor 1203 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a ready-made programmable gate array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. The methods, steps, and logic block diagrams disclosed in the embodiments of this application can be implemented or executed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art. The storage medium is located in the first memory 1202, and the first processor 1203 reads the information in the first memory 1202, and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in the present application may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof. For a software implementation, the techniques described in the present application may be implemented through modules (e.g., procedures, functions, etc.) that perform the functions described in the present application. The software code may be stored in a memory and executed by a processor. The memory may be implemented in the processor or external to the processor.
可选地,作为另一个实施例,第一处理器1203还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。Optionally, as another embodiment, the first processor 1203 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
This embodiment provides an encoder, which may include a first determining unit and a first filtering unit. In this way, by using the weight matrix network model to determine the weight matrix, not only can a deep-learning-based weight matrix technique be realized, providing pixel-level weighting for the use of the output reconstructed image block, but the coding performance can also be improved, thereby improving the coding and decoding efficiency; in addition, the present application can make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图13,其示出了本申请实施例提供的一种解码器130的组成结构示意图。如图13所示,该解码器130可以包括:解析单元1301、第二确定单元1302和第二滤波单元1303;其中,In yet another embodiment of the present application, based on the same inventive concept as the foregoing embodiments, see FIG. 13 , which shows a schematic structural diagram of a decoder 130 provided by an embodiment of the present application. As shown in FIG. 13 , the decoder 130 may include: a parsing unit 1301, a second determining unit 1302 and a second filtering unit 1303; wherein,
解析单元1301,配置为解析码流,确定至少一个语法元素标识信息的取值;A parsing unit 1301, configured to parse the code stream, and determine the value of at least one syntax element identification information;
第二确定单元1302,配置为当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵;The second determining unit 1302 is configured to determine a weight matrix network model of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, and determine the weight matrix of the current block according to the weight matrix network model ;
第二滤波单元1303,配置为利用权重矩阵,确定当前块的目标重建图像块。The second filtering unit 1303 is configured to use the weight matrix to determine the target reconstructed image block of the current block.
在一些实施例中,解析单元1301,具体配置为解析码流,确定第一语法元素标识信息的取值;以及当第一语法元素标识信息指示视频序列使用权重矩阵进行滤波处理时,解析码流,确定第二语法元素标识信息的取值;其中,第二语法元素标识信息用于指示视频序列内的当前帧是否使用权重矩阵进行滤波处理,当前帧包括当前块。In some embodiments, the parsing unit 1301 is specifically configured to parse the code stream, and determine the value of the first syntax element identification information; and when the first syntax element identification information indicates that the video sequence is filtered using the weight matrix, parse the code stream , determine the value of the second syntax element identification information; wherein, the second syntax element identification information is used to indicate whether the current frame in the video sequence is filtered by using the weight matrix, and the current frame includes the current block.
在一些实施例中,第二确定单元1302,还配置为若第一语法元素标识信息的取值为第一值,则确定第一语法元素标识信息指示视频序列使用权重矩阵进行滤波处理;或者,若第一语法元素标识信息的取值为第二值,则确定第一语法元素标识信息指示视频序列不使用权重矩阵进行滤波处理。In some embodiments, the second determining unit 1302 is further configured to, if the value of the first syntax element identification information is a first value, determine that the first syntax element identification information indicates that the video sequence is filtered by using a weight matrix; or, If the value of the first syntax element identification information is the second value, it is determined that the first syntax element identification information indicates that the video sequence is not filtered using the weight matrix.
In some embodiments, the second determining unit 1302 is further configured to, when the color component type of the current frame is a luminance component, determine that the second syntax element identification information is the first luminance syntax element identification information, where the first luminance syntax element identification information is used to indicate whether the luminance component of the current frame is filtered using the luminance weight matrix; or, when the color component type of the current frame is a chrominance component, determine that the second syntax element identification information is the chrominance syntax element identification information, where the chrominance syntax element identification information is used to indicate whether the chrominance component of the current frame is filtered using the chrominance weight matrix.
In some embodiments, when the color component type is a luminance component, the parsing unit 1301 is further configured to, when the first luminance syntax element identification information indicates that the luminance component of the current frame is filtered using the luminance weight matrix, parse the code stream to determine the value of the second luminance syntax element identification information;
第二确定单元1302,还配置为当第二亮度语法元素标识信息指示当前块的亮度分量使用亮度权重矩阵进行滤波处理时,确定当前块的亮度权重矩阵网络模型。The second determining unit 1302 is further configured to determine a luma weight matrix network model of the current block when the second luma syntax element identification information indicates that the luma component of the current block is filtered using the luma weight matrix.
In some embodiments, referring to FIG. 13 , the decoder 130 may further include a second setting unit 1304, configured to set a luminance frame-level flag bit and a luminance CTU-level flag bit; where the luminance frame-level flag bit is used to control whether the luminance component of the current frame is filtered using the luminance weight matrix, and the luminance CTU-level flag bit is used to control whether the luminance component of the current block is filtered using the luminance weight matrix.
In some embodiments, the second determining unit 1302 is further configured to turn on the luminance frame-level flag bit if the value of the first luminance syntax element identification information is the first value; or to turn off the luminance frame-level flag bit if the value of the first luminance syntax element identification information is the second value.
In some embodiments, the second determining unit 1302 is further configured to turn on the luminance CTU-level flag bit if the value of the second luminance syntax element identification information is the first value; or to turn off the luminance CTU-level flag bit if the value of the second luminance syntax element identification information is the second value.
In some embodiments, the second determining unit 1302 is further configured to determine that the first luminance syntax element identification information indicates that the luminance component of the current frame is filtered using the luminance weight matrix if the value of the first luminance syntax element identification information is the first value; or to determine that the first luminance syntax element identification information indicates that the luminance component of the current frame is not filtered using the luminance weight matrix if the value of the first luminance syntax element identification information is the second value.
In some embodiments, the second determining unit 1302 is further configured to determine that the second luminance syntax element identification information indicates that the luminance component of the current block is filtered using the luminance weight matrix if the value of the second luminance syntax element identification information is the first value; or to determine that the second luminance syntax element identification information indicates that the luminance component of the current block is not filtered using the luminance weight matrix if the value of the second luminance syntax element identification information is the second value.
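The decoder-side flag hierarchy described above could be parsed roughly as in the following sketch. The read_flag() callable stands in for actual entropy decoding, the name of the first (sequence-level) syntax element and the ordering of the chrominance flag are assumptions; only the flag semantics follow the text.

```python
def parse_weighting_flags(read_flag, num_luma_ctus: int) -> dict:
    flags = {"sequence": read_flag("weight_matrix_enable_flag")}   # first syntax element (name assumed)
    if not flags["sequence"]:
        return flags
    flags["luma_frame"] = read_flag("luma_frame_weighting_matrix_flag")
    if flags["luma_frame"]:
        # CTU-level flags are only present when the luminance frame-level flag is on.
        flags["luma_ctu"] = [read_flag("luma_ctu_weighting_matrix_flag")
                             for _ in range(num_luma_ctus)]
    flags["chroma_frame"] = read_flag("chroma_frame_weighting_matrix_flag")
    return flags
```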
In some embodiments, when the color component type of the current frame is a chrominance component, the second determining unit 1302 is further configured to determine the chrominance weight matrix network model of the current block when the chrominance syntax element identification information indicates that the chrominance component of the current frame is filtered using the chrominance weight matrix.
在一些实施例中,第二设置单元1304,还配置为设置色度帧级标志位;其中,色度帧级标志位用于控制当前帧的色度分量是否使用权重矩阵进行滤波处理。In some embodiments, the second setting unit 1304 is further configured to set a chroma frame-level flag bit, wherein the chroma frame-level flag bit is used to control whether the chroma components of the current frame are filtered using a weight matrix.
In some embodiments, the second determining unit 1302 is further configured to turn on the chrominance frame-level flag bit if the value of the chrominance syntax element identification information is the first value; or to turn off the chrominance frame-level flag bit if the value of the chrominance syntax element identification information is the second value.
In some embodiments, the second determining unit 1302 is further configured to determine that the chrominance syntax element identification information indicates that the chrominance component of the current frame is filtered using the chrominance weight matrix if the value of the chrominance syntax element identification information is the first value; or to determine that the chrominance syntax element identification information indicates that the chrominance component of the current frame is not filtered using the chrominance weight matrix if the value of the chrominance syntax element identification information is the second value.
在一些实施例中,第一值为1,第二值为0。In some embodiments, the first value is one and the second value is zero.
In some embodiments, the second determining unit 1302 is further configured to, in a case where the color component type of the current block is a luminance component, determine at least one candidate luminance weight matrix network model; determine the quantization parameter of the current block, and select, from the at least one candidate luminance weight matrix network model, the candidate luminance weight matrix network model corresponding to the quantization parameter; and determine the selected candidate luminance weight matrix network model as the luminance weight matrix network model of the current block.
In some embodiments, the second determining unit 1302 is further configured to, in a case where the color component type of the current block is a chrominance component, determine at least one candidate chrominance weight matrix network model; determine the quantization parameter of the current block, and select, from the at least one candidate chrominance weight matrix network model, the candidate chrominance weight matrix network model corresponding to the quantization parameter; and determine the selected candidate chrominance weight matrix network model as the chrominance weight matrix network model of the current block.
在一些实施例中,参见图13,解码器130还可以包括第二训练单元1305;In some embodiments, referring to FIG. 13 , the decoder 130 may further include a second training unit 1305;
第二确定单元1302,还配置为确定至少一个训练样本,其中,训练样本是根据至少一种量化参数得到的;The second determining unit 1302 is further configured to determine at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
The second training unit 1305 is configured to train a preset neural network model using the luminance component of the at least one training sample to obtain at least one candidate luminance weight matrix network model, and to train the preset neural network model using the chrominance component of the at least one training sample to obtain at least one candidate chrominance weight matrix network model; wherein the at least one candidate luminance weight matrix network model corresponds to the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model corresponds to the chrominance component and the quantization parameter.
在一些实施例中,预设神经网络模型包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。In some embodiments, the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer, and a skip connection layer.
In some embodiments, the second determining unit 1302 is further configured to determine the input reconstructed image block and the output reconstructed image block of the neural network loop filter, and to input the output reconstructed image block into the weight matrix network model to obtain the weight matrix of the current block.
在一些实施例中,第二滤波单元1303,还配置为利用权重矩阵对输入重建图像块和输出重建图像块进行加权处理,得到目标重建图像块。In some embodiments, the second filtering unit 1303 is further configured to perform weighting processing on the input reconstructed image block and the output reconstructed image block by using a weight matrix to obtain the target reconstructed image block.
In some embodiments, the second determining unit 1302 is further configured to determine the luminance weight matrix using the luminance weight matrix network model when the color component type of the current block is a luminance component, and to determine the chrominance weight matrix using the chrominance weight matrix network model when the color component type of the current block is a chrominance component.
In some embodiments, the second determining unit 1302 is further configured to, when the color component type of the current block is a luminance component, determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter, and input the output luminance reconstructed image block into the luminance weight matrix network model to obtain the luminance weight matrix;
第二滤波单元1303,还配置为利用亮度权重矩阵对输入亮度重建图像块和输出亮度重建图像块进行加权处理,得到当前块的亮度分量的目标重建图像块。The second filtering unit 1303 is further configured to perform weighting processing on the input luminance reconstructed image block and the output luminance reconstructed image block by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
在一些实施例中,第二确定单元1302,还配置为当当前块的颜色分量类型为色度分量时,确定神经网络环路滤波器的输入色度重建图像块和输出色度重建图像块;以及将输出色度重建图像块输入色度权重矩阵网络模型,得到色度权重矩阵;In some embodiments, the second determining unit 1302 is further configured to determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter when the color component type of the current block is a chrominance component; and inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
第二滤波单元1303,还配置为利用色度权重矩阵对输入色度重建图像块和输出色度重建图像块进行加权处理,得到当前块的色度分量的目标重建图像块。The second filtering unit 1303 is further configured to perform weighting processing on the input chrominance reconstructed image block and the output chrominance reconstructed image block by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
在一些实施例中,输入重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。In some embodiments, the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
在一些实施例中,第二滤波单元1303,还配置为在确定出当前块的目标重建图像块之后,利用自适应修正滤波器继续对目标重建图像块进行滤波处理。In some embodiments, the second filtering unit 1303 is further configured to, after determining the target reconstructed image block of the current block, continue to perform filtering processing on the target reconstructed image block by using an adaptive correction filter.
可以理解地,在本实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。It can be understood that, in this embodiment, a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, it may also be a module, and it may also be non-modular. Moreover, each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, this embodiment provides a computer storage medium, applied to the decoder 130, where the computer storage medium stores a computer program, and the computer program, when executed by a second processor, implements the method described in any one of the foregoing embodiments.
基于上述解码器130的组成以及计算机存储介质,参见图14,其示出了本申请实施例提供的解码器130的具体硬件结构示意图。如图14所示,可以包括:第二通信接口1401、第二存储器1402和第二处理器1403;各个组件通过第二总线系统1404耦合在一起。可理解,第二总线系统1404用于实现这些组件之间的连接通信。第二总线系统1404除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图14中将各种总线都标为第二总线系统1404。其中,Based on the composition of the above-mentioned decoder 130 and the computer storage medium, see FIG. 14 , which shows a schematic diagram of a specific hardware structure of the decoder 130 provided by the embodiment of the present application. As shown in FIG. 14 , it may include: a second communication interface 1401 , a second memory 1402 and a second processor 1403 ; various components are coupled together through a second bus system 1404 . It can be understood that the second bus system 1404 is used to realize the connection communication between these components. In addition to the data bus, the second bus system 1404 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, the various buses are labeled as the second bus system 1404 in FIG. 14 . in,
第二通信接口1401,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;The second communication interface 1401 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
第二存储器1402,用于存储能够在第二处理器1403上运行的计算机程序;a second memory 1402 for storing computer programs that can run on the second processor 1403;
第二处理器1403,用于在运行所述计算机程序时,执行:The second processor 1403 is configured to, when running the computer program, execute:
解析码流,确定至少一个语法元素标识信息的取值;Parse the code stream, and determine the value of at least one syntax element identification information;
当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,并根据权重矩阵网络模型确定当前块的权重矩阵;When at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
利用权重矩阵,确定当前块的目标重建图像块。Using the weight matrix, the target reconstructed image block of the current block is determined.
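Putting the parsed flags and the weighting together, the three steps above could be organized roughly as follows for a single block. The flags dictionary comes from the parsing sketch given earlier, model is a weight matrix network model selected by colour component and quantization parameter, and the pixel-wise weighting form is the same assumption as before.

```python
def block_uses_weight_matrix(flags: dict, component: str, ctu_index: int) -> bool:
    """Combine the parsed flag hierarchy into a per-block on/off decision."""
    if not flags.get("sequence"):
        return False
    if component == "luma":
        if not flags.get("luma_frame"):
            return False
        return bool(flags["luma_ctu"][ctu_index])
    return bool(flags.get("chroma_frame"))      # chrominance uses the frame-level flag only

def decode_block(flags, component, ctu_index, model, cnnlf_input, cnnlf_output):
    if not block_uses_weight_matrix(flags, component, ctu_index):
        return cnnlf_output                     # no weighting for this block
    weight = model(cnnlf_output)                # weight matrix of the current block
    return weight * cnnlf_output + (1.0 - weight) * cnnlf_input
```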
可选地,作为另一个实施例,第二处理器1403还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。Optionally, as another embodiment, the second processor 1403 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
可以理解,第二存储器1402与第一存储器1202的硬件功能类似,第二处理器1403与第一处理器1203的硬件功能类似;这里不再详述。It can be understood that the hardware functions of the second memory 1402 and the first memory 1202 are similar, and the hardware functions of the second processor 1403 and the first processor 1203 are similar; details are not described here.
This embodiment provides a decoder, which may include a parsing unit, a second determining unit, and a second filtering unit. In this way, by using the weight matrix network model to determine the weight matrix, not only can a deep-learning-based weight matrix technique be realized, providing pixel-level weighting for the use of the output reconstructed image block, but the coding performance can also be improved, thereby improving the coding and decoding efficiency; in addition, the present application can make the output of the neural network loop filter closer to the original image, which can improve the video image quality.
需要说明的是,在本申请中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。It should be noted that, in this application, the terms "comprising", "comprising" or any other variation thereof are intended to encompass non-exclusive inclusion, such that a process, method, article or device comprising a series of elements includes not only those elements , but also other elements not expressly listed or inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in a process, method, article or apparatus that includes the element.
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。The above-mentioned serial numbers of the embodiments of the present application are only for description, and do not represent the advantages or disadvantages of the embodiments.
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。The methods disclosed in the several method embodiments provided in this application can be arbitrarily combined under the condition of no conflict to obtain new method embodiments.
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。The features disclosed in the several product embodiments provided in this application can be combined arbitrarily without conflict to obtain a new product embodiment.
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。The features disclosed in several method or device embodiments provided in this application can be combined arbitrarily without conflict to obtain new method embodiments or device embodiments.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到变化或替换，都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以所述权利要求的保护范围为准。The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed in the present application, and all such changes or substitutions shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
工业实用性Industrial Applicability
本申请实施例中，在编码器侧，通过确定当前块的权重矩阵网络模型，并根据权重矩阵网络模型确定当前块的权重矩阵；确定至少一个语法元素标识信息的取值；当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时，利用权重矩阵确定当前块的目标重建图像块。在解码器侧，通过解析码流，确定至少一个语法元素标识信息的取值；当至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时，确定当前块的权重矩阵网络模型，并根据权重矩阵网络模型确定当前块的权重矩阵；利用权重矩阵，确定当前块的目标重建图像块。这样，通过利用权重矩阵网络模型来确定权重矩阵，不仅可以实现基于深度学习的权重矩阵技术，能够为输出重建图像块的使用提供像素级别的加权处理，而且还可以提升编码性能，进而能够提高编解码效率；另外本申请还可以使得神经网络环路滤波器的输出更加接近于原始图像，能够提升视频图像质量。In the embodiments of the present application, on the encoder side, a weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the value of at least one syntax element identification information is determined; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix is used to determine the target reconstructed image block of the current block. On the decoder side, the code stream is parsed to determine the value of at least one syntax element identification information; when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, the weight matrix network model of the current block is determined, and the weight matrix of the current block is determined according to the weight matrix network model; the weight matrix is used to determine the target reconstructed image block of the current block. In this way, by using the weight matrix network model to determine the weight matrix, not only can a deep-learning-based weight matrix technique be realized, providing pixel-level weighting for the use of the output reconstructed image block, but the coding performance can also be improved, thereby improving the encoding and decoding efficiency; in addition, the present application can also make the output of the neural network loop filter closer to the original image, thereby improving the video image quality.
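For illustration only, the encoder-side determination of the syntax element values can be read as a rate-distortion comparison; the function names, the cost values and the flag semantics in the sketch below are assumptions and do not reproduce the normative encoder behaviour.

```python
def decide_weight_matrix_flag(cost_with: float, cost_without: float) -> int:
    """Return 1 (use the weight matrix) if the rate-distortion cost with the
    weight matrix is lower than the cost without it, otherwise return 0."""
    return 1 if cost_with < cost_without else 0

def frame_level_decision(block_costs_with, block_costs_without) -> int:
    """Frame-level flag: accumulate the per-block rate-distortion costs with
    and without the weight matrix, then compare the two sums."""
    return decide_weight_matrix_flag(sum(block_costs_with), sum(block_costs_without))

# Hypothetical per-block RD costs for a frame split into four blocks.
with_wm = [10.2, 8.7, 11.0, 9.5]      # costs when the weight matrix is used
without_wm = [10.0, 9.9, 11.4, 9.6]   # costs when it is not used
frame_flag = frame_level_decision(with_wm, without_wm)                               # -> 1
block_flags = [decide_weight_matrix_flag(a, b) for a, b in zip(with_wm, without_wm)]  # -> [0, 1, 1, 1]
```

In this sketch the frame-level flag is obtained by accumulating per-block costs, mirroring the way the first and second rate-distortion cost values are described as sums of the third and fourth ones.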

Claims (52)

  1. 一种解码方法,应用于解码器,所述方法包括:A decoding method, applied to a decoder, the method comprising:
    解析码流,确定至少一个语法元素标识信息的取值;Parse the code stream, and determine the value of at least one syntax element identification information;
    当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,确定当前块的权重矩阵网络模型,并根据所述权重矩阵网络模型确定所述当前块的权重矩阵;When the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing, determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
    利用所述权重矩阵,确定所述当前块的目标重建图像块。Using the weight matrix, a target reconstructed image block of the current block is determined.
  2. 根据权利要求1所述的方法,其中,所述解析码流,确定至少一个语法元素的标识信息的取值,包括:The method according to claim 1, wherein the parsing the code stream and determining the value of the identification information of at least one syntax element comprises:
    解析码流,确定第一语法元素标识信息的取值;Parse the code stream, and determine the value of the identification information of the first syntax element;
    当所述第一语法元素标识信息指示视频序列使用所述权重矩阵进行滤波处理时，解析码流，确定第二语法元素标识信息的取值；其中，所述第二语法元素标识信息用于指示所述视频序列内的当前帧是否使用所述权重矩阵进行滤波处理，所述当前帧包括当前块。When the first syntax element identification information indicates that the video sequence uses the weight matrix for filtering processing, parsing the code stream to determine the value of second syntax element identification information; wherein the second syntax element identification information is used to indicate whether a current frame in the video sequence uses the weight matrix for filtering processing, and the current frame includes the current block.
  3. 根据权利要求2所述的方法,其中,所述方法还包括:The method of claim 2, wherein the method further comprises:
    若所述第一语法元素标识信息的取值为第一值,则确定所述第一语法元素标识信息指示所述视频序列使用所述权重矩阵进行滤波处理;或者,If the value of the first syntax element identification information is a first value, it is determined that the first syntax element identification information indicates that the video sequence is filtered by using the weight matrix; or,
    若所述第一语法元素标识信息的取值为第二值,则确定所述第一语法元素标识信息指示所述视频序列不使用所述权重矩阵进行滤波处理。If the value of the first syntax element identification information is a second value, it is determined that the first syntax element identification information indicates that the video sequence does not use the weight matrix for filtering processing.
  4. 根据权利要求2所述的方法，其中，所述解析码流，确定第二语法元素标识信息，包括：The method according to claim 2, wherein the parsing the code stream and determining the second syntax element identification information comprises:
    当所述当前帧的颜色分量类型为亮度分量时，确定所述第二语法元素标识信息为第一亮度语法元素标识信息，所述第一亮度语法元素标识信息用于指示所述当前帧的亮度分量是否使用亮度权重矩阵进行滤波处理；或者，When the color component type of the current frame is a luma component, determining that the second syntax element identification information is first luma syntax element identification information, where the first luma syntax element identification information is used to indicate whether the luma component of the current frame uses a luma weight matrix for filtering processing; or,
    当所述当前帧的颜色分量类型为色度分量时，确定所述第二语法元素标识信息为色度语法元素标识信息，所述色度语法元素标识信息用于指示所述当前帧的色度分量是否使用色度权重矩阵进行滤波处理。When the color component type of the current frame is a chroma component, determining that the second syntax element identification information is chroma syntax element identification information, where the chroma syntax element identification information is used to indicate whether the chroma component of the current frame uses a chroma weight matrix for filtering processing.
  5. 根据权利要求4所述的方法，其中，当所述当前帧的颜色分量类型为亮度分量时，所述当所述至少一个语法元素的标识信息指示使用权重矩阵进行滤波处理时，确定当前块的权重矩阵网络模型，包括：The method according to claim 4, wherein, when the color component type of the current frame is a luminance component, the determining the weight matrix network model of the current block when the identification information of the at least one syntax element indicates that the weight matrix is used for filtering processing comprises:
    当所述第一亮度语法元素标识信息指示所述当前帧的亮度分量使用所述亮度权重矩阵进行滤波处理时,解析码流,确定第二亮度语法元素标识信息的取值;When the first luma syntax element identification information indicates that the luma component of the current frame is filtered by using the luma weight matrix, the code stream is parsed, and the value of the second luma syntax element identification information is determined;
    当所述第二亮度语法元素标识信息指示所述当前块的亮度分量使用所述亮度权重矩阵进行滤波处理时,确定所述当前块的亮度权重矩阵网络模型。When the second luma syntax element identification information indicates that the luma component of the current block is filtered using the luma weight matrix, a luma weight matrix network model of the current block is determined.
  6. 根据权利要求5所述的方法,其中,所述方法还包括:The method of claim 5, wherein the method further comprises:
    若所述第一亮度语法元素标识信息的取值为第一值,则确定所述第一亮度语法元素标识信息指示所述当前帧的亮度分量使用所述亮度权重矩阵进行滤波处理;或者,If the value of the first luma syntax element identification information is a first value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame is filtered by using the luma weight matrix; or,
    若所述第一亮度语法元素标识信息的取值为第二值,则确定所述第一亮度语法元素标识信息指示所述当前帧的亮度分量不使用所述亮度权重矩阵进行滤波处理。If the value of the first luma syntax element identification information is the second value, it is determined that the first luma syntax element identification information indicates that the luma component of the current frame does not use the luma weight matrix for filtering processing.
  7. 根据权利要求5所述的方法,其中,所述方法还包括:The method of claim 5, wherein the method further comprises:
    若所述第二亮度语法元素标识信息的取值为第一值,则确定所述第二亮度语法元素标识信息指示所述当前块的亮度分量使用所述亮度权重矩阵进行滤波处理;或者,If the value of the second luma syntax element identification information is a first value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block is filtered by using the luma weight matrix; or,
    若所述第二亮度语法元素标识信息的取值为第二值,则确定所述第二亮度语法元素标识信息指示所述当前块的亮度分量不使用所述亮度权重矩阵进行滤波处理。If the value of the second luma syntax element identification information is a second value, it is determined that the second luma syntax element identification information indicates that the luma component of the current block does not use the luma weight matrix for filtering processing.
  8. 根据权利要求4所述的方法，其中，当所述当前帧的颜色分量类型为色度分量时，所述当所述至少一个语法元素的标识信息指示使用权重矩阵进行滤波处理时，确定当前块的权重矩阵网络模型，包括：The method according to claim 4, wherein, when the color component type of the current frame is a chrominance component, the determining the weight matrix network model of the current block when the identification information of the at least one syntax element indicates that the weight matrix is used for filtering processing comprises:
    当所述色度语法元素标识信息指示所述当前帧的色度分量使用所述色度权重矩阵进行滤波处理时,确定所述当前块的色度权重矩阵网络模型。When the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix, a chroma weight matrix network model for the current block is determined.
  9. 根据权利要求8所述的方法,其中,所述方法还包括:The method of claim 8, wherein the method further comprises:
    若所述色度语法元素标识信息的取值为第一值,则确定所述色度语法元素标识信息指示所述当前帧的色度分量使用所述色度权重矩阵进行滤波处理;或者,If the value of the chroma syntax element identification information is the first value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame are filtered using the chroma weight matrix; or,
    若所述色度语法元素标识信息的取值为第二值,则确定所述色度语法元素标识信息指示所述当前帧的色度分量不使用所述色度权重矩阵进行滤波处理。If the value of the chroma syntax element identification information is the second value, it is determined that the chroma syntax element identification information indicates that the chroma components of the current frame do not use the chroma weight matrix for filtering processing.
  10. 根据权利要求3、6、7和9中任一项所述的方法,其中,所述第一值为1,所述第二值为0。The method of any one of claims 3, 6, 7 and 9, wherein the first value is one and the second value is zero.
  11. 根据权利要求5所述的方法,其中,所述确定所述当前块的亮度权重矩阵网络模型,包括:The method according to claim 5, wherein the determining the luminance weight matrix network model of the current block comprises:
    在所述当前块的颜色分量类型为亮度分量的情况下,确定至少一个候选亮度权重矩阵网络模型;In the case that the color component type of the current block is a luminance component, determining at least one candidate luminance weight matrix network model;
    确定所述当前块的量化参数,从所述至少一个候选亮度权重矩阵网络模型中选取所述量化参数对应的候选亮度权重矩阵网络模型;Determine the quantization parameter of the current block, and select a candidate brightness weight matrix network model corresponding to the quantization parameter from the at least one candidate brightness weight matrix network model;
    将所选取的候选亮度权重矩阵网络模型确定为所述当前块的亮度权重矩阵网络模型。The selected candidate luminance weight matrix network model is determined as the luminance weight matrix network model of the current block.
  12. 根据权利要求8所述的方法,其中,所述确定所述当前块的色度权重矩阵网络模型,包括:The method according to claim 8, wherein the determining the chrominance weight matrix network model of the current block comprises:
    在所述当前块的颜色分量类型为色度分量的情况下,确定至少一个候选色度权重矩阵网络模型;In the case that the color component type of the current block is a chrominance component, determining at least one candidate chrominance weight matrix network model;
    确定所述当前块的量化参数,从所述至少一个候选色度权重矩阵网络模型中选取所述量化参数对应的候选色度权重矩阵网络模型;determining the quantization parameter of the current block, and selecting a candidate chroma weight matrix network model corresponding to the quantization parameter from the at least one candidate chroma weight matrix network model;
    将所选取的候选色度权重矩阵网络模型确定为所述当前块的色度权重矩阵网络模型。The selected candidate chrominance weight matrix network model is determined as the chrominance weight matrix network model of the current block.
  13. 根据权利要求11或12所述的方法,其中,所述方法还包括:The method of claim 11 or 12, wherein the method further comprises:
    确定至少一个训练样本,其中,所述训练样本是根据至少一种量化参数得到的;determining at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
    利用所述至少一个训练样本的亮度分量对预设神经网络模型进行训练,得到至少一个候选亮度权重矩阵网络模型;Use the brightness component of the at least one training sample to train a preset neural network model to obtain at least one candidate brightness weight matrix network model;
    利用所述至少一个训练样本的色度分量对预设神经网络模型进行训练,得到至少一个候选色度权重矩阵网络模型;Use the chrominance component of the at least one training sample to train the preset neural network model to obtain at least one candidate chrominance weight matrix network model;
    其中,所述至少一个候选亮度权重矩阵网络模型与亮度分量和量化参数之间具有对应关系,所述至少一个候选色度权重矩阵网络模型与色度分量和量化参数之间具有对应关系。Wherein, the at least one candidate luminance weight matrix network model has a corresponding relationship with the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model has a corresponding relationship with the chrominance component and the quantization parameter.
  14. 根据权利要求13所述的方法,其中,所述预设神经网络模型包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。The method of claim 13, wherein the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer and a skip connection layer.
  15. 根据权利要求1至14任一项所述的方法,其中,所述根据所述权重矩阵网络模型确定所述当前块的权重矩阵,包括:The method according to any one of claims 1 to 14, wherein the determining the weight matrix of the current block according to the weight matrix network model comprises:
    确定神经网络环路滤波器的输入重建图像块和输出重建图像块;Determine the input reconstructed image patch and the output reconstructed image patch of the neural network loop filter;
    将所述输出重建图像块输入所述权重矩阵网络模型,得到所述当前块的权重矩阵。The output reconstructed image block is input into the weight matrix network model to obtain the weight matrix of the current block.
  16. 根据权利要求15所述的方法,其中,所述利用所述权重矩阵,确定所述当前块的目标重建图像块,包括:The method according to claim 15, wherein the determining the target reconstructed image block of the current block by using the weight matrix comprises:
    利用所述权重矩阵对所述输入重建图像块和所述输出重建图像块进行加权处理,得到所述目标重建图像块。The input reconstructed image block and the output reconstructed image block are weighted by using the weight matrix to obtain the target reconstructed image block.
  17. 根据权利要求1所述的方法,其中,所述根据所述权重矩阵网络模型确定所述当前块的权重矩阵,包括:The method according to claim 1, wherein the determining the weight matrix of the current block according to the weight matrix network model comprises:
    当所述当前块的颜色分量类型为亮度分量时,利用亮度权重矩阵网络模型确定亮度权重矩阵;When the color component type of the current block is a luminance component, use a luminance weight matrix network model to determine a luminance weight matrix;
    当所述当前块的颜色分量类型为色度分量时,利用色度权重矩阵网络模型确定色度权重矩阵。When the color component type of the current block is a chrominance component, a chrominance weight matrix is determined by using a chrominance weight matrix network model.
  18. 根据权利要求17所述的方法,其中,当所述当前块的颜色分量类型为亮度分量时,所述利用亮度权重矩阵网络模型确定亮度权重矩阵,包括:The method according to claim 17, wherein, when the color component type of the current block is a luminance component, the determining a luminance weight matrix by using a luminance weight matrix network model comprises:
    确定神经网络环路滤波器的输入亮度重建图像块和输出亮度重建图像块;Determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter;
    将所述输出亮度重建图像块输入所述亮度权重矩阵网络模型,得到所述亮度权重矩阵;Inputting the output brightness reconstructed image block into the brightness weight matrix network model to obtain the brightness weight matrix;
    所述利用所述权重矩阵,确定所述当前块的目标重建图像块,包括:The determining the target reconstructed image block of the current block by using the weight matrix includes:
    利用所述亮度权重矩阵对所述输入亮度重建图像块和所述输出亮度重建图像块进行加权处理,得到所述当前块的亮度分量的目标重建图像块。The input luminance reconstructed image block and the output luminance reconstructed image block are weighted by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  19. 根据权利要求17所述的方法,其中,当所述当前块的颜色分量类型为色度分量时,所述利用色度权重矩阵网络模型确定色度权重矩阵,包括:The method according to claim 17, wherein, when the color component type of the current block is a chrominance component, the determining a chrominance weight matrix by using a chrominance weight matrix network model comprises:
    确定神经网络环路滤波器的输入色度重建图像块和输出色度重建图像块;Determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter;
    将所述输出色度重建图像块输入所述色度权重矩阵网络模型,得到所述色度权重矩阵;Inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
    所述利用所述权重矩阵,确定所述当前块的目标重建图像块,包括:The determining the target reconstructed image block of the current block by using the weight matrix includes:
    利用所述色度权重矩阵对所述输入色度重建图像块和所述输出色度重建图像块进行加权处理,得到所述当前块的色度分量的目标重建图像块。The input chrominance reconstructed image block and the output chrominance reconstructed image block are weighted by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
  20. 根据权利要求15所述的方法,其中,所述输入重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。The method according to claim 15, wherein the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  21. 根据权利要求1至20任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 20, wherein the method further comprises:
    在确定出所述目标重建图像块之后,利用自适应修正滤波器继续对所述目标重建图像块进行滤波处理。After the target reconstructed image block is determined, the adaptive correction filter is used to continue filtering the target reconstructed image block.
  22. 一种编码方法,应用于编码器,所述方法包括:An encoding method, applied to an encoder, the method comprising:
    确定当前块的权重矩阵网络模型,并根据所述权重矩阵网络模型确定所述当前块的权重矩阵;Determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
    确定至少一个语法元素标识信息的取值;Determine the value of at least one syntax element identification information;
    当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,利用所述权重矩阵确定所述当前块的目标重建图像块。When the at least one syntax element identification information indicates that the current frame or the current block uses a weight matrix for filtering processing, the weight matrix is used to determine a target reconstructed image block of the current block.
  23. 根据权利要求22所述的方法,其中,所述确定当前块的权重矩阵网络模型,包括:The method according to claim 22, wherein said determining the weight matrix network model of the current block comprises:
    当所述当前块的颜色分量类型为亮度分量时,确定所述当前块的亮度权重矩阵网络模型;When the color component type of the current block is a luminance component, determining a luminance weight matrix network model of the current block;
    当所述当前块的颜色分量类型为色度分量时,确定所述当前块的色度权重矩阵网络模型。When the color component type of the current block is a chrominance component, a chrominance weight matrix network model of the current block is determined.
  24. 根据权利要求23所述的方法,其中,所述确定所述当前块的亮度权重矩阵网络模型,包括:The method according to claim 23, wherein the determining the luminance weight matrix network model of the current block comprises:
    在所述当前块的颜色分量类型为亮度分量的情况下,确定至少一个候选亮度权重矩阵网络模型;In the case that the color component type of the current block is a luminance component, determining at least one candidate luminance weight matrix network model;
    确定所述当前块的量化参数,从所述至少一个候选亮度权重矩阵网络模型中选取所述量化参数对应的候选亮度权重矩阵网络模型;Determine the quantization parameter of the current block, and select a candidate brightness weight matrix network model corresponding to the quantization parameter from the at least one candidate brightness weight matrix network model;
    将所选取的候选亮度权重矩阵网络模型确定为所述当前块的亮度权重矩阵网络模型。The selected candidate luminance weight matrix network model is determined as the luminance weight matrix network model of the current block.
  25. 根据权利要求23所述的方法,其中,所述确定所述当前块的色度权重矩阵网络模型,包括:The method according to claim 23, wherein the determining the chroma weight matrix network model of the current block comprises:
    在所述当前块的颜色分量类型为色度分量的情况下,确定至少一个候选色度权重矩阵网络模型;In the case that the color component type of the current block is a chrominance component, determining at least one candidate chrominance weight matrix network model;
    确定所述当前块的量化参数,从所述至少一个候选色度权重矩阵网络模型中选取所述量化参数对应的候选色度权重矩阵网络模型;determining the quantization parameter of the current block, and selecting a candidate chroma weight matrix network model corresponding to the quantization parameter from the at least one candidate chroma weight matrix network model;
    将所选取的候选色度权重矩阵网络模型确定为所述当前块的色度权重矩阵网络模型。The selected candidate chrominance weight matrix network model is determined as the chrominance weight matrix network model of the current block.
  26. 根据权利要求24或25所述的方法,其中,所述方法还包括:The method of claim 24 or 25, wherein the method further comprises:
    确定至少一个训练样本,其中,所述训练样本是根据至少一种量化参数得到的;determining at least one training sample, wherein the training sample is obtained according to at least one quantization parameter;
    利用所述至少一个训练样本的亮度分量对预设神经网络模型进行训练,得到至少一个候选亮度权重矩阵网络模型;Use the brightness component of the at least one training sample to train a preset neural network model to obtain at least one candidate brightness weight matrix network model;
    利用所述至少一个训练样本的色度分量对预设神经网络模型进行训练,得到至少一个候选色度权重矩阵网络模型;Use the chrominance component of the at least one training sample to train the preset neural network model to obtain at least one candidate chrominance weight matrix network model;
    其中,所述至少一个候选亮度权重矩阵网络模型与亮度分量和量化参数之间具有对应关系,所述至少一个候选色度权重矩阵网络模型与色度分量和量化参数之间具有对应关系。Wherein, the at least one candidate luminance weight matrix network model has a corresponding relationship with the luminance component and the quantization parameter, and the at least one candidate chrominance weight matrix network model has a corresponding relationship with the chrominance component and the quantization parameter.
  27. 根据权利要求26所述的方法,其中,所述预设神经网络模型包括下述至少之一:至少一个卷积层、至少一个激活层和跳转连接层。The method of claim 26, wherein the preset neural network model includes at least one of the following: at least one convolutional layer, at least one activation layer, and a skip connection layer.
  28. 根据权利要求22所述的方法,其中,所述确定至少一个语法元素标识信息的取值,包括:The method according to claim 22, wherein said determining the value of at least one syntax element identification information comprises:
    若确定视频序列使用所述权重矩阵进行滤波处理,则确定第一语法元素标识信息的取值为第一值;或者,If it is determined that the video sequence is filtered using the weight matrix, then the value of the first syntax element identification information is determined to be the first value; or,
    若确定视频序列不使用所述权重矩阵进行滤波处理,则确定第一语法元素标识信息的取值为第二值。If it is determined that the video sequence does not use the weight matrix for filtering processing, it is determined that the value of the first syntax element identification information is the second value.
  29. 根据权利要求28所述的方法,其中,所述方法还包括:The method of claim 28, wherein the method further comprises:
    对所述第一语法元素标识信息的取值进行编码,将编码比特写入码流。The value of the first syntax element identification information is encoded, and the encoded bits are written into the code stream.
  30. 根据权利要求28所述的方法，其中，在确定所述视频序列使用所述权重矩阵进行滤波处理之后，当所述当前帧的颜色分量类型为亮度分量时，确定所述权重矩阵为亮度权重矩阵，所述确定至少一个语法元素标识信息的取值，还包括：The method according to claim 28, wherein, after it is determined that the video sequence uses the weight matrix for filtering processing, when the color component type of the current frame is a luminance component, the weight matrix is determined to be a luminance weight matrix, and the determining the value of at least one syntax element identification information further comprises:
    确定所述视频序列内当前帧的亮度分量使用所述亮度权重矩阵进行滤波处理的第一率失真代价值;determining a first rate-distortion cost of filtering the luminance component of the current frame in the video sequence using the luminance weight matrix;
    确定所述视频序列内当前帧的亮度分量未使用所述亮度权重矩阵进行滤波处理的第二率失真代价值;determining a second rate-distortion cost value for which the luminance component of the current frame in the video sequence is not filtered using the luminance weight matrix;
    若所述第一率失真代价值小于所述第二率失真代价值,则确定第一亮度语法元素标识信息的取值为第一值;或者,If the first rate-distortion cost value is less than the second rate-distortion cost value, determine that the value of the first luma syntax element identification information is a first value; or,
    若所述第一率失真代价值大于或等于所述第二率失真代价值,则确定第一亮度语法元素标识信息的取值为第二值。If the first rate-distortion cost value is greater than or equal to the second rate-distortion cost value, the value of the first luma syntax element identification information is determined to be a second value.
  31. 根据权利要求30所述的方法,其中,所述方法还包括:The method of claim 30, wherein the method further comprises:
    对所述第一亮度语法元素标识信息的取值进行编码,将编码比特写入码流。The value of the first luminance syntax element identification information is encoded, and the encoded bits are written into the code stream.
  32. 根据权利要求30所述的方法,其中,所述方法还包括:The method of claim 30, wherein the method further comprises:
    对所述当前帧进行块划分,确定至少一个划分块;其中,所述至少一个划分块包括当前块;Perform block division on the current frame to determine at least one divided block; wherein, the at least one divided block includes the current block;
    所述确定所述视频序列内当前帧的亮度分量使用所述亮度权重矩阵进行滤波处理的第一率失真代价值,包括:The determining of the first rate-distortion cost of filtering the luminance component of the current frame in the video sequence using the luminance weight matrix includes:
    分别计算所述至少一个划分块的亮度分量使用所述亮度权重矩阵进行滤波处理的第三率失真代价值;respectively calculating a third rate-distortion cost value for filtering the luminance component of the at least one divided block using the luminance weight matrix;
    对计算得到的所述第三率失真代价值进行累加计算,得到所述第一率失真代价值;Accumulating the calculated third rate-distortion cost value to obtain the first rate-distortion cost value;
    所述计算所述视频序列内当前帧的亮度分量未使用所述亮度权重矩阵进行滤波处理的第二率失真代价值,包括:The calculating the second rate-distortion cost value of the luminance component of the current frame in the video sequence without using the luminance weight matrix for filtering processing includes:
    分别计算所述至少一个划分块的亮度分量未使用所述亮度权重矩阵进行滤波处理的第四率失真代价值;respectively calculating the fourth rate-distortion cost value of the luminance component of the at least one divided block that is not filtered using the luminance weight matrix;
    对计算得到的所述第四率失真代价值进行累加计算,得到所述第二率失真代价值。Accumulate the calculated fourth rate-distortion cost value to obtain the second rate-distortion cost value.
  33. 根据权利要求32所述的方法,其中,当所述第一率失真代价值小于所述第二率失真代价值时,所述确定至少一个语法元素标识信息的取值,还包括:The method according to claim 32, wherein, when the first rate-distortion cost value is less than the second rate-distortion cost value, the determining the value of at least one syntax element identification information further comprises:
    确定所述当前块的亮度分量使用所述亮度权重矩阵进行滤波处理的第三率失真代价值；以及确定所述当前块的亮度分量未使用所述亮度权重矩阵进行滤波处理的第四率失真代价值；determining a third rate-distortion cost value of filtering the luminance component of the current block using the luminance weight matrix; and determining a fourth rate-distortion cost value of the luminance component of the current block not being filtered using the luminance weight matrix;
    若所述第三率失真代价值小于所述第四率失真代价值,则确定第二亮度语法元素标识信息的取值为第一值;或者,If the third rate-distortion cost value is less than the fourth rate-distortion cost value, determine that the value of the second luma syntax element identification information is the first value; or,
    若所述第三率失真代价值大于或等于所述第四率失真代价值,则确定第二亮度语法元素标识信息的取值为第二值。If the third rate-distortion cost value is greater than or equal to the fourth rate-distortion cost value, determining that the value of the second luma syntax element identification information is a second value.
  34. 根据权利要求33所述的方法,其中,所述方法还包括:The method of claim 33, wherein the method further comprises:
    对所述第二亮度语法元素标识信息的取值进行编码,将编码比特写入码流。The value of the second luminance syntax element identification information is encoded, and the encoded bits are written into the code stream.
  35. 根据权利要求33所述的方法，其中，所述当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时，利用所述权重矩阵确定所述当前块的目标重建图像块，包括：The method according to claim 33, wherein the determining the target reconstructed image block of the current block by using the weight matrix when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing comprises:
    若所述第三率失真代价值小于所述第四率失真代价值,则利用所述亮度权重矩阵确定所述当前块的目标重建图像块。If the third rate-distortion cost value is less than the fourth rate-distortion cost value, determining the target reconstructed image block of the current block by using the luminance weight matrix.
  36. 根据权利要求28所述的方法，其中，在确定所述视频序列使用所述权重矩阵进行滤波处理之后，当所述当前帧的颜色分量类型为色度分量时，确定所述权重矩阵为色度权重矩阵，所述确定至少一个语法元素标识信息的取值，还包括：The method according to claim 28, wherein, after it is determined that the video sequence uses the weight matrix for filtering processing, when the color component type of the current frame is a chrominance component, the weight matrix is determined to be a chrominance weight matrix, and the determining the value of at least one syntax element identification information further comprises:
    确定所述视频序列内当前帧的色度分量使用所述色度权重矩阵进行滤波处理的第五率失真代价值;determining the fifth rate-distortion cost of filtering the chrominance component of the current frame in the video sequence using the chrominance weight matrix;
    确定所述视频序列内的当前帧的色度分量未使用所述色度权重矩阵进行滤波处理的第六率失真代价值;determining the sixth rate-distortion cost value of the chrominance component of the current frame in the video sequence that is not filtered using the chrominance weight matrix;
    若所述第五率失真代价值小于所述第六率失真代价值,则确定色度语法元素标识信息的取值为第一值;或者,If the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determine that the value of the chroma syntax element identification information is the first value; or,
    若所述第五率失真代价值大于或等于所述第六率失真代价值,则确定色度语法元素标识信息的取值为第二值。If the fifth rate-distortion cost value is greater than or equal to the sixth rate-distortion cost value, the value of the chroma syntax element identification information is determined to be a second value.
  37. 根据权利要求36所述的方法,其中,所述方法还包括:The method of claim 36, wherein the method further comprises:
    对所述色度语法元素标识信息的取值进行编码,将编码比特写入码流。The value of the chroma syntax element identification information is encoded, and the encoded bits are written into the code stream.
  38. 根据权利要求28、30、33和36中任一项所述的方法,其中,所述第一值为1,所述第二值为0。The method of any of claims 28, 30, 33 and 36, wherein the first value is one and the second value is zero.
  39. 根据权利要求36所述的方法，其中，所述当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时，利用所述权重矩阵确定所述当前块的目标重建图像块，包括：The method according to claim 36, wherein the determining the target reconstructed image block of the current block by using the weight matrix when the at least one syntax element identification information indicates that the current frame or the current block uses the weight matrix for filtering processing comprises:
    若所述第五率失真代价值小于所述第六率失真代价值,则利用所述色度权重矩阵确定所述当前块的目标重建图像块。If the fifth rate-distortion cost value is less than the sixth rate-distortion cost value, determining the target reconstructed image block of the current block by using the chrominance weight matrix.
  40. 根据权利要求22至39任一项所述的方法,其中,所述根据所述权重矩阵网络模型确定所述当前块的权重矩阵,包括:The method according to any one of claims 22 to 39, wherein the determining the weight matrix of the current block according to the weight matrix network model comprises:
    确定神经网络环路滤波器的输入重建图像块和输出重建图像块;Determine the input reconstructed image patch and the output reconstructed image patch of the neural network loop filter;
    将所述输出重建图像块输入所述权重矩阵网络模型,得到所述当前块的权重矩阵。The output reconstructed image block is input into the weight matrix network model to obtain the weight matrix of the current block.
  41. 根据权利要求40所述的方法,其中,所述利用所述权重矩阵确定所述当前块的目标重建图像块,包括:The method according to claim 40, wherein the determining the target reconstructed image block of the current block by using the weight matrix comprises:
    利用所述权重矩阵对所述输入重建图像块和所述输出重建图像块进行加权处理,得到所述目标重建图像块。The input reconstructed image block and the output reconstructed image block are weighted by using the weight matrix to obtain the target reconstructed image block.
  42. 根据权利要求22所述的方法,其中,所述根据所述权重矩阵网络模型确定所述当前块的权重矩阵,包括:The method according to claim 22, wherein the determining the weight matrix of the current block according to the weight matrix network model comprises:
    当所述当前块的颜色分量类型为亮度分量时,利用亮度权重矩阵网络模型确定亮度权重矩阵;When the color component type of the current block is a luminance component, use a luminance weight matrix network model to determine a luminance weight matrix;
    当所述当前块的颜色分量类型为色度分量时,利用色度权重矩阵网络模型确定色度权重矩阵。When the color component type of the current block is a chrominance component, a chrominance weight matrix is determined by using a chrominance weight matrix network model.
  43. 根据权利要求42所述的方法,其中,当所述当前块的颜色分量类型为亮度分量时,所述利用亮度权重矩阵网络模型确定亮度权重矩阵,包括:The method according to claim 42, wherein, when the color component type of the current block is a luminance component, the determining a luminance weight matrix by using a luminance weight matrix network model comprises:
    确定神经网络环路滤波器的输入亮度重建图像块和输出亮度重建图像块;Determine the input luminance reconstructed image block and the output luminance reconstructed image block of the neural network loop filter;
    将所述输出亮度重建图像块输入所述亮度权重矩阵网络模型,得到所述亮度权重矩阵;Inputting the output brightness reconstructed image block into the brightness weight matrix network model to obtain the brightness weight matrix;
    所述利用所述权重矩阵,确定所述当前块的目标重建图像块,包括:The determining the target reconstructed image block of the current block by using the weight matrix includes:
    利用所述亮度权重矩阵对所述输入亮度重建图像块和所述输出亮度重建图像块进行加权处理,得到所述当前块的亮度分量的目标重建图像块。The input luminance reconstructed image block and the output luminance reconstructed image block are weighted by using the luminance weight matrix to obtain the target reconstructed image block of the luminance component of the current block.
  44. 根据权利要求42所述的方法,其中,当所述当前块的颜色分量类型为色度分量时,所述利用色度权重矩阵网络模型确定色度权重矩阵,包括:The method according to claim 42, wherein, when the color component type of the current block is a chrominance component, the determining a chrominance weight matrix by using a chrominance weight matrix network model comprises:
    确定神经网络环路滤波器的输入色度重建图像块和输出色度重建图像块;Determine the input chrominance reconstructed image block and the output chrominance reconstructed image block of the neural network loop filter;
    将所述输出色度重建图像块输入所述色度权重矩阵网络模型,得到所述色度权重矩阵;Inputting the output chrominance reconstruction image block into the chrominance weight matrix network model to obtain the chrominance weight matrix;
    所述利用所述权重矩阵,确定所述当前块的目标重建图像块,包括:The determining the target reconstructed image block of the current block by using the weight matrix includes:
    利用所述色度权重矩阵对所述输入色度重建图像块和所述输出色度重建图像块进行加权处理,得到所述当前块的色度分量的目标重建图像块。The input chrominance reconstructed image block and the output chrominance reconstructed image block are weighted by using the chrominance weight matrix to obtain the target reconstructed image block of the chrominance component of the current block.
  45. 根据权利要求40所述的方法,其中,所述输入重建图像块是经由去块滤波器和样值自适应补偿滤波器进行滤波处理后得到。The method of claim 40, wherein the input reconstructed image block is obtained after filtering through a deblocking filter and a sample adaptive compensation filter.
  46. 根据权利要求22至45任一项所述的方法,其中,所述方法还包括:The method of any one of claims 22 to 45, wherein the method further comprises:
    在确定出所述当前块的目标重建图像块之后,利用自适应修正滤波器继续对所述目标重建图像块进行滤波处理。After the target reconstructed image block of the current block is determined, the adaptive correction filter is used to continue filtering the target reconstructed image block.
  47. 一种码流,其中,所述码流是根据至少一个语法元素标识信息的取值进行比特编码生成的;A code stream, wherein the code stream is generated by bit encoding according to the value of at least one syntax element identification information;
    其中,所述至少一个语法元素标识信息至少包括:第一语法元素标识信息、第一亮度语法元素标识信息、第二亮度语法元素标识信息和色度语法元素标识信息;Wherein, the at least one syntax element identification information includes at least: first syntax element identification information, first luma syntax element identification information, second luma syntax element identification information, and chroma syntax element identification information;
    其中，所述第一语法元素标识信息用于指示视频序列是否使用权重矩阵进行滤波处理，所述第一亮度语法元素标识信息用于指示当前帧的亮度分量是否使用权重矩阵进行滤波处理，所述第二亮度语法元素标识信息用于指示当前块的亮度分量是否使用权重矩阵进行滤波处理，所述色度语法元素标识信息用于指示当前帧的色度分量是否使用权重矩阵进行滤波处理；所述视频序列包括所述当前帧，所述当前帧包括所述当前块。wherein the first syntax element identification information is used to indicate whether a video sequence uses a weight matrix for filtering processing, the first luma syntax element identification information is used to indicate whether a luma component of a current frame uses the weight matrix for filtering processing, the second luma syntax element identification information is used to indicate whether a luma component of a current block uses the weight matrix for filtering processing, and the chroma syntax element identification information is used to indicate whether a chroma component of the current frame uses the weight matrix for filtering processing; the video sequence includes the current frame, and the current frame includes the current block.
  48. 一种编码器,所述编码器包括第一确定单元和第一滤波单元;其中,An encoder comprising a first determining unit and a first filtering unit; wherein,
    所述第一确定单元,配置为确定当前块的权重矩阵网络模型,并根据所述权重矩阵网络模型确定所述当前块的权重矩阵;以及确定至少一个语法元素标识信息的取值;The first determining unit is configured to determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model; and determine the value of at least one syntax element identification information;
    所述第一滤波单元,配置为当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时,利用所述权重矩阵确定所述当前块的目标重建图像块。The first filtering unit is configured to use the weight matrix to determine a target reconstructed image block of the current block when the at least one syntax element identification information indicates that the current frame or the current block uses a weight matrix for filtering processing.
  49. 一种编码器,所述编码器包括第一存储器和第一处理器;其中,An encoder comprising a first memory and a first processor; wherein,
    所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;the first memory for storing a computer program executable on the first processor;
    所述第一处理器,用于在运行所述计算机程序时,执行如权利要求22至46任一项所述的方法。The first processor is configured to execute the method according to any one of claims 22 to 46 when running the computer program.
  50. 一种解码器,所述解码器包括解析单元、第二确定单元和第二滤波单元;其中,A decoder comprising a parsing unit, a second determining unit and a second filtering unit; wherein,
    所述解析单元,配置为解析码流,确定至少一个语法元素标识信息的取值;The parsing unit is configured to parse the code stream and determine the value of at least one syntax element identification information;
    所述第二确定单元，配置为当所述至少一个语法元素标识信息指示当前帧或当前块使用权重矩阵进行滤波处理时，确定当前块的权重矩阵网络模型，并根据所述权重矩阵网络模型确定所述当前块的权重矩阵；the second determining unit is configured to, when the at least one syntax element identification information indicates that the current frame or the current block uses a weight matrix for filtering processing, determine the weight matrix network model of the current block, and determine the weight matrix of the current block according to the weight matrix network model;
    所述第二滤波单元,配置为利用所述权重矩阵,确定所述当前块的目标重建图像块。The second filtering unit is configured to use the weight matrix to determine a target reconstructed image block of the current block.
  51. 一种解码器,所述解码器包括第二存储器和第二处理器;其中,A decoder comprising a second memory and a second processor; wherein,
    所述第二存储器,用于存储能够在所述第二处理器上运行的计算机程序;the second memory for storing a computer program executable on the second processor;
    所述第二处理器,用于在运行所述计算机程序时,执行如权利要求1至21任一项所述的方法。The second processor is configured to execute the method according to any one of claims 1 to 21 when running the computer program.
  52. 一种计算机存储介质，其中，所述计算机存储介质存储有计算机程序，所述计算机程序被执行时实现如权利要求1至21任一项所述的方法、或者实现如权利要求22至46任一项所述的方法。A computer storage medium, wherein the computer storage medium stores a computer program, and the computer program, when executed, implements the method according to any one of claims 1 to 21, or implements the method according to any one of claims 22 to 46.
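Purely as a reading aid for the layered signalling in claims 1 to 12 above, the decoding-side cascade for the luma path can be sketched as follows; every flag name, the model lookup by quantization parameter and the blending step are assumptions for illustration, not syntax or semantics defined in this application.

```python
import numpy as np
from typing import Callable, Dict

def decode_block(parse_flag: Callable[[str], int],
                 models_by_qp: Dict[int, Callable],
                 qp: int,
                 nn_input, nn_output):
    """Walk the sequence -> frame -> block flags, then apply the weight matrix
    predicted by the model selected for the block's quantization parameter."""
    if not parse_flag("sequence_weight_matrix_enable"):   # first syntax element identification information
        return nn_output                                   # tool disabled for the whole sequence
    if not parse_flag("frame_luma_weight_matrix_flag"):    # first luma syntax element identification information
        return nn_output
    if not parse_flag("block_luma_weight_matrix_flag"):    # second luma syntax element identification information
        return nn_output
    model = models_by_qp[qp]              # candidate weight matrix network model selected by QP
    weight = model(nn_output)             # weight matrix predicted from the NN loop filter output
    # assumed pixel-level weighting of the filter input and output
    return weight * nn_output + (1.0 - weight) * nn_input

# Hypothetical usage: all flags enabled, a constant-weight "model", toy 8x8 blocks.
flags = {"sequence_weight_matrix_enable": 1,
         "frame_luma_weight_matrix_flag": 1,
         "block_luma_weight_matrix_flag": 1}
blk_in = np.zeros((8, 8))
blk_out = np.ones((8, 8))
target = decode_block(flags.get, {27: lambda x: np.full_like(x, 0.5)}, 27, blk_in, blk_out)
assert target.shape == (8, 8)
```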
PCT/CN2021/091670 2021-04-30 2021-04-30 Encoding and decoding methods, code stream, encoder, decoder, and storage medium WO2022227062A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/091670 WO2022227062A1 (en) 2021-04-30 2021-04-30 Encoding and decoding methods, code stream, encoder, decoder, and storage medium
CN202180090942.1A CN116803078A (en) 2021-04-30 2021-04-30 Encoding/decoding method, code stream, encoder, decoder, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/091670 WO2022227062A1 (en) 2021-04-30 2021-04-30 Encoding and decoding methods, code stream, encoder, decoder, and storage medium

Publications (1)

Publication Number Publication Date
WO2022227062A1 true WO2022227062A1 (en) 2022-11-03

Family

ID=83847558

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/091670 WO2022227062A1 (en) 2021-04-30 2021-04-30 Encoding and decoding methods, code stream, encoder, decoder, and storage medium

Country Status (2)

Country Link
CN (1) CN116803078A (en)
WO (1) WO2022227062A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019009491A1 (en) * 2017-07-06 2019-01-10 삼성전자 주식회사 Method and device for encoding or decoding image
US20210099710A1 (en) * 2018-04-01 2021-04-01 Lg Electronics Inc. Method for image coding using convolution neural network and apparatus thereof
US20200288158A1 (en) * 2019-03-05 2020-09-10 Qualcomm Incorporated Prediction-domain filtering for video coding
US20200296364A1 (en) * 2019-03-12 2020-09-17 Qualcomm Incorporated Combined in-loop filters for video coding
CN111372084A (en) * 2020-02-18 2020-07-03 北京大学 Parallel reasoning method and system for neural network coding and decoding tool
CN111711824A (en) * 2020-06-29 2020-09-25 腾讯科技(深圳)有限公司 Loop filtering method, device and equipment in video coding and decoding and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
T. TOMA (PANASONIC), T. NISHI (PANASONIC), K. ABE (PANASONIC), R. KANOH (PANASONIC), C. LIM (PANASONIC), J. LI (PANASONIC), R. LIA: "Description of SDR video coding technology proposal by Panasonic", 10. JVET MEETING; 20180410 - 20180420; SAN DIEGO; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-J0020, 2 April 2018 (2018-04-02), pages 1 - 75, XP030151181 *

Also Published As

Publication number Publication date
CN116803078A (en) 2023-09-22

Similar Documents

Publication Publication Date Title
US11589041B2 (en) Method and apparatus of neural network based processing in video coding
CN111711824B (en) Loop filtering method, device and equipment in video coding and decoding and storage medium
WO2020177133A1 (en) Loop filter implementation method and apparatus, and computer storage medium
WO2022052533A1 (en) Encoding method, decoding method, encoder, decoder, and encoding system
WO2021185008A1 (en) Encoding method, decoding method, encoder, decoder, and electronic device
CN113747179B (en) Loop filter implementation method and device and computer storage medium
CN118020297A (en) End-to-end image and video coding method based on hybrid neural network
CN116916036A (en) Video compression method, device and system
WO2020192085A1 (en) Image prediction method, coder, decoder, and storage medium
WO2022116085A1 (en) Encoding method, decoding method, encoder, decoder, and electronic device
WO2022227062A1 (en) Encoding and decoding methods, code stream, encoder, decoder, and storage medium
Santamaria et al. Overfitting multiplier parameters for content-adaptive post-filtering in video coding
WO2022178686A1 (en) Encoding/decoding method, encoding/decoding device, encoding/decoding system, and computer readable storage medium
WO2022257049A1 (en) Encoding method, decoding method, code stream, encoder, decoder and storage medium
WO2022257130A1 (en) Encoding method, decoding method, code stream, encoder, decoder, system and storage medium
WO2023245544A1 (en) Encoding and decoding method, bitstream, encoder, decoder, and storage medium
CN117063467A (en) Block dividing method, encoder, decoder, and computer storage medium
CN112954350A (en) Video post-processing optimization method and device based on frame classification
WO2023130226A1 (en) Filtering method, decoder, encoder and computer-readable storage medium
WO2024016156A1 (en) Filtering method, encoder, decoder, code stream and storage medium
WO2023123398A1 (en) Filtering method, filtering apparatus, and electronic device
Ghassab et al. Video Compression Using Convolutional Neural Networks of Video With Chroma Subsampling
WO2023240618A1 (en) Filter method, decoder, encoder, and computer-readable storage medium
WO2024077573A1 (en) Encoding and decoding methods, encoder, decoder, code stream, and storage medium
WO2023197230A1 (en) Filtering method, encoder, decoder and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21938536

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180090942.1

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21938536

Country of ref document: EP

Kind code of ref document: A1