WO2023245544A1 - Encoding and decoding method, code stream, encoder, decoder, and storage medium - Google Patents
Encoding and decoding method, code stream, encoder, decoder, and storage medium
- Publication number
- WO2023245544A1 (PCT/CN2022/100728)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- filtered
- value
- component
- block
- identification information
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 226
- 238000013139 quantization Methods 0.000 claims abstract description 632
- 238000001914 filtration Methods 0.000 claims description 227
- 238000003062 neural network model Methods 0.000 claims description 112
- 230000015654 memory Effects 0.000 claims description 38
- 238000004590 computer program Methods 0.000 claims description 22
- 238000012545 processing Methods 0.000 claims description 22
- 238000004364 calculation method Methods 0.000 claims description 21
- 230000009466 transformation Effects 0.000 claims description 21
- 230000004913 activation Effects 0.000 claims description 11
- 238000005192 partition Methods 0.000 abstract description 4
- 238000005516 engineering process Methods 0.000 description 54
- 238000013528 artificial neural network Methods 0.000 description 42
- 230000008569 process Effects 0.000 description 38
- 238000010586 diagram Methods 0.000 description 24
- 230000009286 beneficial effect Effects 0.000 description 20
- 230000006870 function Effects 0.000 description 14
- 230000006978 adaptation Effects 0.000 description 11
- 241000023320 Luma Species 0.000 description 7
- 238000007906 compression Methods 0.000 description 7
- 230000006835 compression Effects 0.000 description 7
- 238000004891 communication Methods 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- 230000005540 biological transmission Effects 0.000 description 5
- 239000000203 mixture Substances 0.000 description 5
- 230000001360 synchronised effect Effects 0.000 description 5
- 239000013598 vector Substances 0.000 description 5
- 230000003044 adaptive effect Effects 0.000 description 4
- 238000013135 deep learning Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 238000011002 quantification Methods 0.000 description 4
- 230000009467 reduction Effects 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000004422 calculation algorithm Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 230000002123 temporal effect Effects 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 238000002679 ablation Methods 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 238000012812 general test Methods 0.000 description 1
- 238000007373 indentation Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 230000008521 reorganization Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
Images
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
Definitions
- the embodiments of the present application relate to the field of video coding and decoding technology, and in particular, to a coding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
- loop filters are used to improve the subjective and objective quality of reconstructed images.
- loop filtering includes some neural-network-based solutions, such as multi-model intra-frame switchable solutions and intra-frame non-switchable solutions.
- the former solution has more neural network models and can adjust the model according to local details; the latter, although it has only two neural network models, does not switch models within a frame;
- if the current frame is an I frame, only the neural network model corresponding to the I frame is used; if the current frame is a B frame, only the neural network model corresponding to the B frame is used.
- Embodiments of the present application provide a coding and decoding method, a code stream, an encoder, a decoder, and a storage medium, which can reduce the computational complexity during model inference, thereby improving coding and decoding efficiency.
- embodiments of the present application provide a decoding method, which is applied to a decoder.
- the method includes:
- the code stream is parsed to determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
- if the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the block quantization parameter information of the current block is determined; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- embodiments of the present application provide an encoding method, which is applied to an encoder.
- the method includes:
- the first syntax element identification information indicates that the components to be filtered of the divided blocks in the current frame are allowed to be filtered using a preset network model
- embodiments of the present application provide a code stream, which is generated by bit encoding based on information to be encoded; wherein the information to be encoded includes at least one of the following: the first syntax element identification information of the component to be filtered of the current frame, the second syntax element identification information of the component to be filtered of the current block, the third syntax element identification information of the component to be filtered of the current frame, the residual scaling factor, and the initial residual value of the component to be filtered of at least one divided block included in the current frame; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block.
- embodiments of the present application provide an encoder, which includes a first determination unit and a first filtering unit; wherein,
- a first determination unit configured to determine the first syntax element identification information of the component to be filtered of the current frame; and, when the first syntax element identification information indicates that the components to be filtered of the divided blocks in the current frame are allowed to be filtered using the preset network model, to determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block; and, when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model,
- to determine the block quantization parameter information of the current block; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- the first determination unit is also configured to determine the reconstruction value of the component to be filtered of the current block,
- a first memory for storing a computer program capable of running on the first processor
- embodiments of the present application provide a decoder, which includes a decoding unit, a second determination unit, and a second filtering unit; wherein,
- the decoding unit is configured to parse the code stream and determine the first syntax element identification information of the component to be filtered of the current frame; and, when the first syntax element identification information indicates that the components to be filtered of the divided blocks in the current frame are allowed to be filtered using the preset network model,
- parse the code stream to determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one division block, and the current block is any one of the at least one division block;
- the second determination unit is further configured to determine the reconstruction value of the component to be filtered of the current block
- a second memory for storing a computer program capable of running on the second processor
- the second processor is configured to perform the method described in the first aspect when running the computer program.
- Embodiments of the present application provide a coding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
- the first syntax element identification information of the component to be filtered of the current frame is first determined;
- the second syntax element identification information of the component to be filtered of the current block is determined; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
- the second syntax element identification information indicates that the component to be filtered of the current block is filtered using a preset network model, determine the block quantization parameter information of the current block; wherein,
- the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component; then determine the reconstruction value of the component to be filtered of the current block,
- this reduces the amount of calculation, which is beneficial to the implementation of the decoder and shortens the decoding time; in addition, since the input block quantization parameter information includes at least the block quantization parameter values of two color components, even with multiple quantization parameters as input, the brightness color component and the chroma color component have more choices and adaptations; and by introducing new syntax elements, the decoder can achieve a more flexible configuration without having to store multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
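The decision flow summarized in the bullets above can be sketched as follows; the function and argument names here are illustrative stand-ins, not the actual syntax element names used by the application:

```python
def nn_filter_decision(frame_level_flag, block_level_flag, block_qp, rec_value, model):
    """Decide whether to run the preset NN loop filter on one block's component.

    frame_level_flag: first syntax element (frame-level: NN filtering allowed)
    block_level_flag: second syntax element (block-level: this block uses it)
    block_qp: (first_component_qp, second_component_qp) block QP values
    model: callable taking (rec_value, block_qp) -> filtered value
    """
    if not frame_level_flag:
        return rec_value  # NN filtering disabled for the whole frame
    if not block_level_flag:
        return rec_value  # this block skips the NN filter
    # both flags set: feed the reconstruction and both QP values to the model
    return model(rec_value, block_qp)
```

The point of the sketch is simply that the block QP information is only determined, and the model only run, once both syntax elements allow it.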
- Figure 3 is a schematic diagram of the composition of a residual block
- Figure 4 is a schematic diagram of the network architecture of another neural network model
- Figure 5A is a schematic block diagram of an encoder provided by an embodiment of the present application.
- Figure 6 is a schematic flow chart of a decoding method provided by an embodiment of the present application.
- Figure 7 is a schematic diagram of the network architecture of a neural network model provided by an embodiment of the present application.
- Figure 8 is a schematic flow chart of another decoding method provided by an embodiment of the present application.
- Figure 9 is a schematic flow chart of another decoding method provided by an embodiment of the present application.
- Figure 10 is a schematic flow chart of an encoding method provided by an embodiment of the present application.
- Figure 11 is a schematic structural diagram of an encoder provided by an embodiment of the present application.
- Figure 12 is a schematic diagram of the specific hardware structure of an encoder provided by an embodiment of the present application.
- Figure 13 is a schematic structural diagram of a decoder provided by an embodiment of the present application.
- Figure 14 is a schematic diagram of the specific hardware structure of a decoder provided by an embodiment of the present application.
- Figure 15 is a schematic structural diagram of a coding and decoding system provided by an embodiment of the present application.
- the first color component, the second color component and the third color component are generally used to represent the coding block (Coding Block, CB).
- these three color components are one brightness color component and two chroma color components (blue chroma color component and red chroma color component).
- the brightness color component is usually represented by the symbol Y
- the blue chroma color component is usually represented by the symbol Cb or U
- the red chroma color component is usually represented by the symbol Cr or V; in this way, the video image can be represented in the YCbCr format, the YUV format, or even the RGB format, but this does not constitute any limitation.
- video compression technology mainly compresses huge digital image video data to facilitate transmission and storage.
- although existing digital video compression standards can save a lot of video data, it is still necessary to pursue better digital video compression technology to reduce the bandwidth and traffic pressure of video transmission.
- the encoder reads unequal numbers of pixels for the different color components, including the brightness color component and the chrominance color components, from the original video sequence in different color formats; that is, the encoder reads a black-and-white or color image. Afterwards, the image is divided into blocks and the block data is handed to the encoder for encoding.
- VVC Versatile Video Coding
- LCU Largest Coding Unit
- CU Coding Unit
- Coding units may also be divided into prediction units (Prediction Unit, PU), transformation units (Transform Unit, TU), etc.
- the hybrid coding framework can include modules such as Prediction, Transform, Quantization, Entropy coding, and Inloop Filter.
- the prediction module can include intra prediction (Intra Prediction) and inter prediction (Inter Prediction), and inter prediction can include motion estimation (Motion Estimation) and motion compensation (Motion Compensation). Since there is a strong correlation between adjacent pixels within a frame of a video image, the use of intra-frame prediction in video encoding and decoding technology can eliminate the spatial redundancy between adjacent pixels; however, due to the There is also a strong similarity between frames. In video encoding and decoding technology, inter-frame prediction is used to eliminate temporal redundancy between adjacent frames, thereby improving encoding and decoding efficiency.
- the basic process of the video codec is as follows: at the encoding end, a frame of image is divided into blocks, and intra prediction or inter prediction is used for the current block to generate the prediction block of the current block.
- the residual block is obtained by subtracting the prediction block from the original image block of the current block.
- the residual block is transformed and quantized to obtain a quantized coefficient matrix, and the quantized coefficient matrix is entropy-encoded and output to the code stream.
- intra prediction or inter prediction is used for the current block to generate the prediction block of the current block.
- the code stream is parsed to obtain the quantization coefficient matrix.
- the quantization coefficient matrix is inversely quantized and inversely transformed to obtain the residual block.
- the prediction block and the residual block are added to obtain the reconstructed block.
- Reconstruction blocks form a reconstructed image, and loop filtering is performed on the reconstructed image based on images or blocks to obtain a decoded image.
- the encoding end also needs similar operations as the decoding end to obtain the decoded image.
- the decoded image can be used as a reference frame for inter-frame prediction for subsequent frames.
- the block division information, prediction, transformation, quantization, entropy coding, loop filtering and other mode information or parameter information determined by the encoding end need to be output to the code stream if necessary.
- the decoding end determines, through parsing and analysis based on existing information, the same block division information and the same prediction, transformation, quantization, entropy coding, loop filtering and other mode information or parameter information as the encoding end, thereby ensuring that the decoded image obtained at the decoding end is the same as the decoded image obtained at the encoding end.
- the decoded image obtained at the encoding end is usually also called a reconstructed image.
- the current block can be divided into prediction units during prediction, and the current block can be divided into transformation units during transformation.
- the divisions of prediction units and transformation units can be different.
- the embodiments of the present application are applicable to the basic process of the video codec under the block-based hybrid coding framework, but are not limited to this framework and process.
- the current block can be the current coding unit (CU), the current prediction unit (PU), or the current transformation unit (TU), etc.
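The per-block decode flow described above can be illustrated with a toy sketch; the dictionary keys and the two callables are assumptions for illustration, not part of any codec specification:

```python
def decode_block(parsed, predict, inv_quant_transform):
    """Toy per-block decode step from the description: inverse quantize and
    inverse transform the parsed coefficient matrix into a residual block,
    then add the prediction block to obtain the reconstructed block."""
    residual = inv_quant_transform(parsed['coeffs'])
    prediction = predict(parsed['mode'])
    # reconstructed block = prediction block + residual block
    return [p + r for p, r in zip(prediction, residual)]
```

In a real decoder, loop filtering would then be applied to the reconstructed image formed by these blocks.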
- the international video coding standards development organization, the Joint Video Experts Team (JVET), has established two exploratory experiment groups, namely the exploratory experiment on neural-network-based coding and the exploratory experiment beyond VVC, and has set up several corresponding expert discussion groups.
- the Exploration Experimental Group Beyond VVC aims to explore higher coding efficiency based on the latest encoding and decoding standard H.266/VVC with strict performance and complexity requirements.
- the encoding methods studied by this group are closer to VVC and can be called traditional coding methods.
- the performance of the algorithm reference model of this exploratory experiment has surpassed the coding performance of the latest VVC reference model (VVC Test Model, VTM) by about 15%.
- the learning method studied by the first exploratory experimental group is an intelligent coding method based on neural networks.
- deep learning and neural networks are hot topics in all walks of life; especially in the field of computer vision, methods based on deep learning often have an overwhelming advantage.
- Experts from the JVET standards organization have brought neural networks into the field of video encoding and decoding.
- coding tools based on neural networks often have very efficient coding efficiency.
- many manufacturers focused on coding tools based on deep learning and proposed intra-frame prediction methods based on neural networks, inter-frame prediction methods based on neural networks, and loop filtering methods based on neural networks.
- the coding performance of the neural network-based loop filtering method is the most outstanding.
- the coding performance of the neural network-based loop filtering scheme currently studied by the first exploratory experimental group of the JVET conference is as high as 12%, reaching a level that can contribute almost half a generation of coding performance.
- the embodiment of this application is improved on the basis of the exploratory experiments of the JVET conference, and a loop filtering enhancement scheme based on neural network (Neural network, NN) is proposed.
- the following will first introduce the neural network loop filtering scheme in related technologies.
- the exploration of loop filtering solutions based on neural networks mainly focuses on two forms.
- the first is a solution that can switch multiple models within a frame; the second is a solution that cannot switch models within a frame.
- the architectural form of the neural network has not changed much, and the tool sits in the in-loop filtering stage of the traditional hybrid coding framework; therefore, the basic processing unit of both schemes is the coding tree unit, that is, the largest coding unit size.
- the biggest difference between the first, multi-model intra-frame switchable solution and the second, intra-frame non-switchable solution is that, when encoding and decoding the current frame, the first solution can switch the neural network model at will, while the second solution cannot switch neural network models.
- each coding tree unit has multiple optional candidate neural network models, and the encoding end selects which neural network model gives the current coding tree unit the best filtering effect, and then writes the index number of that neural network model into the code stream.
- if filtering is required, a coding-tree-unit-level usage flag is transmitted first, followed by the index number of the neural network model; if filtering is not required, only the coding-tree-unit-level usage flag is transmitted. After parsing the index number, the decoder loads the neural network model corresponding to the index number and uses it to filter the current coding tree unit.
- in the second solution, when encoding a frame of image, the neural network model available to each coding tree unit in the current frame is fixed, and every coding tree unit uses the same neural network model; that is, on the encoding side the second solution has no model selection process. The decoding end parses the usage flag indicating whether the current coding tree unit uses neural-network-based loop filtering; if the usage flag is true, the preset model (the same as at the encoding end) is used to filter the coding tree unit; if the usage flag is false, no additional operation is performed.
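The difference in coding-tree-unit-level signalling between the two schemes can be sketched like this, modelling the bitstream as a plain list of (name, value) pairs; all names are illustrative, not actual syntax elements:

```python
def signal_ctu_scheme1(use_flag, model_index, bitstream):
    """Scheme 1: transmit the usage flag first, then a model index
    only when filtering is actually used for this coding tree unit."""
    bitstream.append(('use_flag', use_flag))
    if use_flag:
        bitstream.append(('model_index', model_index))

def signal_ctu_scheme2(use_flag, bitstream):
    """Scheme 2: the model is fixed for the whole frame, so only
    the per-CTU usage flag is signalled, never an index."""
    bitstream.append(('use_flag', use_flag))
```

This makes scheme 2 cheaper to signal, at the cost of the per-CTU model flexibility that scheme 1 offers.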
- the model can be adjusted according to local details, that is, local optimization to achieve better global results.
- this solution has more neural network models: different neural network models are trained under the different quantization parameters of the JVET common test conditions, and different coding frame types may also require different neural network models to achieve better results.
- the filter uses up to 22 neural network models to cover different coding frame types and different quantization parameters, and model switching is performed at the coding tree unit level. This filter can provide up to 10% more coding performance based on VVC.
- this solution has two neural network models as a whole, the models are not switched within the frame.
- This solution is judged at the encoding end. If the current encoding frame type is an I frame, the neural network model corresponding to the I frame is imported, and only the neural network model corresponding to the I frame is used in the current frame; if the current encoding frame type is a B frame , then the neural network model corresponding to the B frame is imported, and only the neural network model corresponding to the B frame is used in this frame.
- This solution can provide 8.65% coding performance on top of VVC. Although this is slightly lower than the first solution, it is overall a coding efficiency that traditional coding tools can hardly achieve.
- the first solution has higher flexibility and higher coding performance, but it has a fatal shortcoming for hardware implementation, namely that hardware experts are concerned about intra-frame model switching.
- switching models within a frame means that, in the worst case, the decoder needs to reload the neural network model every time it processes a coding tree unit.
- this is an additional burden even for hardware implementations on existing high-performance graphics processing units (GPUs).
- the existence of multiple models also means that a large number of parameters need to be stored, which is also a huge overhead burden in current hardware implementations.
- this neural network loop filtering further exploits the powerful generalization ability of deep learning, taking various kinds of information as input instead of simply taking the reconstructed samples as the model input; more information used as input provides more help for the neural network's learning, better reflects the model's generalization ability, and removes many unnecessary redundant parameters.
- continuously updated solutions have emerged so that, across different test conditions and quantization parameters, only one simplified low-complexity neural network model needs to be used. Compared with the first solution, this saves the cost of constantly reloading the model and of opening up larger storage space for a large number of parameters.
- FIG. 2 shows a schematic diagram of the network architecture of a neural network model.
- the main structure of the network architecture can be composed of multiple residual blocks (ResBlocks).
- the detailed structure of the residual block is shown in Figure 3.
- a single residual block consists of multiple convolutional layers (Conv) connected to a Convolutional Block Attention Module (CBAM) layer.
- CBAM Convolutional Block Attention Module
- the residual block also has a direct skip connection structure between the input and the output.
- the multiple convolutional layers in Figure 3 include a first convolutional layer, a second convolutional layer and a third convolutional layer, and an activation layer is connected after the first convolutional layer.
- the size of the first convolutional layer is 1 ⁇ 1 ⁇ k ⁇ n
- the size of the second convolutional layer is 1 ⁇ 1 ⁇ n ⁇ k
- the size of the third convolutional layer is 3 ⁇ 3 ⁇ k ⁇ k
- k and n are positive integers
- the activation layer can include a Rectified Linear Unit (ReLU) function, also called a linear rectification function, which is an activation function often used in current neural network models.
- ReLU is actually a ramp function; it is simple and converges quickly.
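The ReLU activation referred to above is just the element-wise ramp function; a minimal NumPy version for reference:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: element-wise max(0, x), i.e. the ramp function."""
    return np.maximum(0.0, x)
```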
- as shown in FIG. 2, there is also a skip connection structure in the network architecture, which connects the input reconstructed YUV information with the output of the pixel shuffle module.
- the main function of pixel shuffle is to obtain a high-resolution feature map from a low-resolution feature map through convolution and multi-channel rearrangement; as an upsampling method, it can effectively enlarge a reduced feature map.
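The pixel shuffle rearrangement can be reproduced in NumPy as below; this follows the common (C·r², H, W) → (C, H·r, W·r) convention and is a sketch of the operation itself, not the filter's actual implementation:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r): each group of
    r*r input channels is interleaved into an r-times-larger spatial grid."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)   # (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)
```

Because it only moves data from the channel dimension into the spatial dimensions, the enlargement adds no learned parameters of its own.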
- the inputs of this network architecture mainly include reconstructed YUV information (rec_yuv), predicted YUV information (pred_yuv) and YUV information with partition information (par_yuv).
- the YUV information with partition information may be processed differently for I frames and B frames: I frames need the YUV information with partition information as input, while B frames do not.
- the first solution has a corresponding neural network model.
- although the three YUV color components are mainly composed of two channel types, brightness and chrominance, they differ as color components.
- FIG 4 shows a schematic diagram of the network architecture of another neural network model.
- the main structure of the network architecture is basically the same as that of the first scheme; the difference is that, compared with the first scheme, the input of the second scheme adds quantization parameter information as an additional input.
- the above-mentioned first solution loads different neural network models according to different quantization parameter information to achieve more flexible processing and more efficient coding, while the second solution uses the quantization parameter information as a network input to improve the generalization ability of the neural network, enabling the model to adapt to different quantization parameter conditions and still provide good filtering performance.
- BaseQP indicates the sequence-level quantization parameter set by the encoder when encoding the video sequence, that is, the quantization parameter points required by the JVET tests; it is also the parameter used to select the neural network model in the first solution.
- SliceQP is the quantization parameter of the current frame.
- the quantization parameter of the current frame can differ from the sequence-level quantization parameter, because in the video encoding process the quantization conditions of B frames differ from those of I frames, and the quantization parameters also differ across temporal layers; therefore, SliceQP is generally different from BaseQP in B frames.
- the input of the neural network model of the I frame only requires SliceQP, while the neural network model of the B frame requires both BaseQP and SliceQP as input.
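The frame-type-dependent QP inputs just described can be sketched as follows (the function name and string frame types are illustrative):

```python
def qp_inputs(frame_type, base_qp, slice_qp):
    """Assemble the QP inputs for the NN loop filter model:
    the I-frame model takes only SliceQP, while the B-frame model
    takes both BaseQP and SliceQP."""
    if frame_type == 'I':
        return (slice_qp,)
    if frame_type == 'B':
        return (base_qp, slice_qp)
    raise ValueError('unsupported frame type: %r' % (frame_type,))
```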
- the inputs of the network architecture mainly include reconstructed YUV information (rec_yuv), predicted YUV information (pred_yuv), YUV information with partition information (par_yuv), BaseQP, SliceQP, and finally output filtered component information (output_yuv) .
- in the first solution, the model output generally requires no additional processing: if the model outputs residual information, it is superimposed on the reconstructed samples of the current coding tree unit and the result is output as the output of the neural-network-based loop filtering tool; if the model outputs complete reconstructed samples, the model output is directly the output of the neural-network-based loop filtering tool.
- the output of the second solution generally requires a scaling process. Taking residual information output by the model as an example, the model infers and outputs the residual information of the current coding tree unit; the residual information is scaled and then superimposed on the reconstructed sample information of the current coding tree unit. The scaling factor is derived at the encoding end and needs to be written into the code stream and sent to the decoding end.
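The scaling step of the second solution amounts to the following; the floating-point `scale` here stands in for the factor written into the code stream (a real codec would typically carry it in a fixed-point form):

```python
import numpy as np

def apply_scaled_residual(rec, residual, scale):
    """Filtered output = reconstructed samples + scale * model residual,
    where `scale` is derived at the encoder and signalled to the decoder."""
    return rec + scale * residual
```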
- the general neural-network-based loop filtering scheme may not be exactly the same as the above two schemes; specific schemes may differ in details, but the main ideas are basically the same.
- differences from the second solution can be reflected in the design of the neural network architecture, such as the convolution kernel size of the residual blocks, the number of convolution layers, and whether an attention mechanism module is included, or they can be reflected in the input of the neural network model; the input can even carry more additional information, such as the boundary strength values used for deblocking filtering.
- since the chroma color components have an independent neural network model in this technical solution, its average compression performance on the chroma color components is 2% to 5% higher than that of the previous two solutions. If no luma color component performance is traded away, this technical solution can achieve up to an additional 10% compression performance on the chroma color components. It can be seen that the performance of the above two solutions on the chroma color components still has room for improvement.
- various ablation experiments were conducted on the neural-network-based loop filtering technology, and it was found that the additional input information becomes useless when the training time is extended.
- the neural-network-based loop filtering technology can therefore consider removing input information such as the predicted YUV information, the YUV information with partition information, and the boundary strength (Boundary strength, Bs), trimming the inputs to only the reconstructed YUV information and BaseQP.
- embodiments of the present application provide a coding and decoding method. Whether at the encoding end or the decoding end, the first syntax element identification information of the component to be filtered of the current frame is determined first; when the first syntax element identification information indicates that there is a component to be filtered of a divided block in the current frame that is allowed to be filtered using the preset network model, the second syntax element identification information of the component to be filtered of the current block is determined, where the current frame includes at least one divided block and the current block is any one of the at least one divided block; when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the block quantization parameter information of the current block is determined, where the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component; then the reconstruction value of the component to be filtered of the current block is determined, and the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block are input into the preset network model to determine the filtered reconstruction value of the component to be filtered of the current block.
- the reduced amount of calculation is beneficial to the implementation of the decoder and shortens the decoding time. In addition, since the input block quantization parameter information includes at least the block quantization parameter values of two color components, even when multi-channel quantization parameters are used as input, the luma color component and the chroma color component have more choices and better adaptation; and by introducing new syntax elements, the decoder can achieve a more flexible configuration without having to store multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- the encoder 100 may include a transformation and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transformation and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image cache unit 110, and the like
- the filtering unit 108 can implement deblocking filtering and sample adaptive offset (Sample Adaptive Offset, SAO) filtering
- the encoding unit 109 can implement header information encoding and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
- a video coding block can be obtained by dividing a coding tree unit (Coding Tree Unit, CTU), and the residual pixel information obtained after intra-frame or inter-frame prediction is then processed by the transformation and quantization unit 101, which transforms the video coding block, including converting the residual information from the pixel domain to the transform domain, and quantizes the resulting transform coefficients to further reduce the bit rate;
- the intra estimation unit 102 and the intra prediction unit 103 are used to perform intra prediction on the video coding block; specifically, the intra estimation unit 102 and the intra prediction unit 103 are used to determine the intra prediction mode to be used to encode the video coding block;
- the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-frame prediction encoding of the received video coding block with respect to one or more blocks in one or more reference frames to provide temporal prediction information. The motion estimation performed by the motion estimation unit 105 generates a motion vector that estimates the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vector determined by the motion estimation unit 105. After determining the intra prediction mode, the intra prediction unit 103 also provides the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109. In addition, the inverse transformation and inverse quantization unit 106 is used for reconstruction of the video coding block: the residual block is reconstructed in the pixel domain, blocking artifacts of the reconstructed residual block are removed through the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image cache unit 110 to generate a reconstructed video coding block. The encoding unit 109 is used to encode various encoding parameters and quantized transform coefficients.
- the contextual content can be based on adjacent coding blocks and can be used to encode information indicating the determined intra prediction mode, thereby outputting the code stream of the video signal; the decoded image cache unit 110 is used to store the reconstructed video coding blocks for prediction reference. As video image encoding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are all stored in the decoded image cache unit 110.
- the decoder 200 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image cache unit 206, and the like, wherein the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement deblocking filtering and SAO filtering.
- after the code stream of the video signal is output, the code stream is input into the decoder 200 and first passes through the decoding unit 201 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate a residual block in the pixel domain; the intra prediction unit 203 may be operable to generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate the predictive block for the video decoding block being decoded.
- a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block produced by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal then passes through the filtering unit 205, which removes blocking artifacts and can improve video quality; the decoded video blocks are then stored in the decoded image cache unit 206, which stores reference images for subsequent intra prediction or motion compensation and is also used for the output of the video signal, that is, the restored original video signal is obtained.
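The decoder data path just described can be sketched at a high level: coefficients pass through inverse quantization and inverse transform, the residual is summed with the predictive block, and the result is loop filtered. All callables here are illustrative stand-ins for the numbered units, not an actual decoder API.

```python
def decode_video_block(coeffs, prediction, inv_quant_transform, loop_filter):
    """Sketch of the decoder flow: unit 201 output (coeffs) passes
    through the inverse transform/quantization unit (202) to give a
    residual block, which is summed with the predictive block from
    unit 203 or 204, and the sum is filtered by unit 205."""
    residual = inv_quant_transform(coeffs)              # inverse quant + transform
    reconstructed = [p + r for p, r in zip(prediction, residual)]
    return loop_filter(reconstructed)                   # deblocking / SAO / NN filtering
```

A pass-through `inv_quant_transform` and identity `loop_filter` suffice to exercise the summing step.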
- both the filtering unit 108 and the filtering unit 205 here refer to the neural-network-based loop filtering part. That is to say, the embodiments of this application mainly affect the loop filtering part of the hybrid video coding framework, and can be applied to the encoder, to the decoder, or even to the encoder and the decoder at the same time; this is not specifically limited here.
- FIG. 6 shows a schematic flowchart of a decoding method provided by an embodiment of the present application. As shown in Figure 6, the method may include:
- S601 Parse the code stream and determine the first syntax element identification information of the component to be filtered of the current frame.
- this method is applied to the decoder; specifically, it can be applied to a loop filtering method based on a neural network model, and more specifically, to a loop filtering method based on a neural network model that takes multiple quantization parameters as input.
- the decoder can determine the first syntax element identification information by parsing the code stream.
- the first syntax element identification information is a frame-level syntax element, which can be used to indicate whether there is a component to be filtered in the divided block in the current frame that allows filtering using a preset network model.
- the current frame may include at least one divided block, and the current block is any one of the at least one divided block. That is to say, the first syntax element identification information may determine whether all components to be filtered of at least one divided block included in the current frame are not allowed to be filtered using the preset network model.
- the component to be filtered may refer to the color component.
- the color components may include at least one of: a first color component, a second color component, and a third color component.
- the first color component may be a brightness color component
- the second color component and the third color component may be a chroma color component (for example, the second color component is a blue chroma color component, and the third color component is a red chroma color component. component; alternatively, the second color component is a red chroma color component and the third color component is a blue chroma color component).
- if the component to be filtered is a luma color component, the first syntax element identification information may be ph_nnlf_luma_enable_flag; if the component to be filtered is a chroma color component, the first syntax element identification information may be ph_nnlf_chroma_enable_flag. That is to say, different first syntax element identification information is correspondingly set for different color components in the current frame.
- the decoder can determine the first syntax element identification information of the component to be filtered, so as to determine whether there is a divided block in the current frame whose component to be filtered is allowed to be filtered using the preset network model; the specific indication may be determined by decoding the value of the identification information.
- parsing the code stream and determining the first syntax element identification information of the component to be filtered of the current frame may include: parsing the code stream to obtain the value of the first syntax element identification information;
- the method may also include:
- when the value of the first syntax element identification information is the first value, it is determined that the first syntax element identification information indicates that there is a component to be filtered of a divided block in the current frame that is allowed to be filtered using the preset network model;
- when the value of the first syntax element identification information is the second value, it is determined that the first syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are not allowed to be filtered using the preset network model.
- the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form.
- the first syntax element identification information may be a parameter written in the profile, or may be the value of a flag, which is not specifically limited here.
- in a specific example, the flag may be an enable flag bit (enable_flag) or a disable flag bit (disable_flag), where the value of the enable flag bit is the first value and the value of the disable flag bit is the second value. For example, the first value can be set to 1 and the second value can be set to 0; alternatively, the first value can also be set to true and the second value can also be set to false; however, the embodiments of the present application do not specifically limit this.
- parsing the code stream and determining the first syntax element identification information of the current frame may include:
- when the third syntax element identification information indicates that the components to be filtered of the at least one divided block included in the current frame do not all use the preset network model for filtering, parse the code stream and determine the first syntax element identification information of the component to be filtered of the current frame.
- the third syntax element identification information is also a frame-level syntax element, which can be used to indicate whether all components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model. That is to say, the third syntax element identification information can determine whether all the components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model, or do not all use the preset network model for filtering.
- if the component to be filtered is a luma color component, the third syntax element identification information may be ph_nnlf_luma_ctrl_flag; if the component to be filtered is a chroma color component, the third syntax element identification information may be ph_nnlf_chroma_ctrl_flag. That is to say, different third syntax element identification information is correspondingly set for different color components in the current frame.
- the decoder can first determine the third syntax element identification information of the component to be filtered; only when the third syntax element identification information indicates that the components to be filtered of the at least one divided block included in the current frame do not all use the preset network model for filtering does the decoder still need to decode to obtain the value of the first syntax element identification information.
- parsing the code stream and determining the third syntax element identification information of the component to be filtered in the current frame may include: parsing the code stream to obtain the value of the third syntax element identification information;
- the method may also include:
- when the value of the third syntax element identification information is the first value, it is determined that the third syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model;
- in a specific example, the third syntax element identification information is a flag; the first value can be set to 1 and the second value to 0; alternatively, the first value can also be set to true and the second value to false; however, the embodiment of this application does not specifically limit this.
- the first syntax element identification information and the third syntax element identification information are both frame-level syntax elements.
- the third syntax element identification information may also be called a frame-level switch identification bit
- the first syntax element identification information may also be called a frame-level usage identification bit.
- the current block here specifically refers to the divided block currently to be subjected to loop filtering, which may be any one of at least one divided block included in the current frame.
- the current block can be the current coding unit, the current prediction unit, the current transformation unit, or even the current coding tree unit (Coding Tree Unit, CTU), etc. A detailed description will be given below taking the current block as the current coding tree unit as an example.
- if the component to be filtered is a luma color component, the second syntax element identification information may be ctb_nnlf_luma_flag; if the component to be filtered is a chroma color component, the second syntax element identification information may be ctb_nnlf_chroma_flag. That is to say, different second syntax element identification information is correspondingly set for different color components in the current coding tree unit.
- the decoder may first determine the third syntax element identification information of the component to be filtered. When the third syntax element identification information indicates that the components to be filtered of the at least one divided block included in the current frame do not all use the preset network model for filtering, the decoder also needs to decode to obtain the value of the first syntax element identification information; only when the first syntax element identification information indicates that there is a component to be filtered of a divided block in the current frame that is allowed to be filtered using the preset network model will the decoder continue decoding to obtain the value of the second syntax element identification information.
- parsing the code stream and determining the second syntax element identification information of the component to be filtered of the current block may include: parsing the code stream to obtain the value of the second syntax element identification information.
- the third syntax element identification information may be called a frame-level switch identification bit, the first syntax element identification information may be called a frame-level usage identification bit, and the second syntax element identification information may be called a coding tree unit usage identification bit. When the frame-level switch identification bit is true, the frame-level usage identification bit and all coding tree unit usage identification bits in the current frame are set to true; only when the frame-level switch identification bit is false, that is, when not all components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model, is it necessary to continue parsing the code stream to determine the frame-level usage identification bit; then, when the frame-level usage identification bit is true, the code stream is further parsed to determine the coding tree unit usage identification bit of each divided block in the current frame.
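The three-level flag hierarchy (frame-level switch, frame-level usage, per-CTU usage) can be sketched as a short decision procedure. `read_flag` is an illustrative stand-in for reading one flag from the code stream; the actual entropy-coded syntax is not modelled.

```python
def parse_ctu_use_flags(read_flag, num_ctus):
    """Derive the per-CTU usage decisions from the flag hierarchy:
    switch bit true  -> every CTU uses the filter (nothing else parsed);
    switch bit false -> parse the frame-level usage bit, and only if
    that is true are the individual CTU usage bits parsed."""
    frame_switch = read_flag()                      # frame-level switch identification bit
    if frame_switch:
        return [True] * num_ctus                    # all CTUs filtered, flags inferred
    frame_use = read_flag()                         # frame-level usage identification bit
    if not frame_use:
        return [False] * num_ctus                   # no CTU in this frame is filtered
    return [read_flag() for _ in range(num_ctus)]   # coding tree unit usage bits
```

Note that when the switch bit is true, no CTU-level bits are consumed from the stream at all, which is where the bit savings come from.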
- S604: Determine the reconstruction value of the component to be filtered of the current block, input the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the post-filter reconstruction value of the component to be filtered of the current block.
- the block quantization parameter information may at least include the block quantization parameter value of the luma color component and the block quantization parameter value of the chroma color component.
- the block quantization parameter value of the first color component may be the block quantization parameter value of the luma color component (represented by ctb_nnlf_luma_baseqp), and the block quantization parameter value of the second color component may be the block quantization parameter value of the chroma color component (represented by ctb_nnlf_chroma_baseqp).
- the block quantization parameter value of the second color component corresponding to the current block is determined from the second quantization parameter candidate set.
- the first quantization parameter candidate set may be composed of at least two candidate quantization parameter values of the first color component
- the second quantization parameter candidate set may be composed of at least two candidate quantization parameter values of the second color component
- the decoder can obtain the first quantization parameter index by parsing the code stream, and then determine the block quantization parameter value of the first color component from the first quantization parameter candidate set according to the first quantization parameter index; by parsing the code stream, the decoder can also obtain the second quantization parameter index, and then determine the block quantization parameter value of the second color component from the second quantization parameter candidate set according to the second quantization parameter index.
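The index-into-candidate-set lookup just described can be sketched as follows. `read_index` is an illustrative stand-in for the bitstream index parser, and the candidate values shown in the usage note are invented examples, not values from the standard.

```python
def select_block_qps(read_index, luma_candidates, chroma_candidates):
    """Determine the per-block QP pair from the two candidate sets.

    read_index(n) stands in for parsing an index in [0, n) from the
    code stream; the first call yields the first quantization parameter
    index (luma), the second call the second index (chroma)."""
    luma_idx = read_index(len(luma_candidates))
    chroma_idx = read_index(len(chroma_candidates))
    return luma_candidates[luma_idx], chroma_candidates[chroma_idx]
```

Signalling an index rather than the QP value itself keeps the per-block overhead to a few bits when the candidate sets are small.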
- determining the block quantization parameter information of the current block may include:
- the code stream is parsed to determine the block quantization parameter value of the first color component and the block quantization parameter value of the second color component corresponding to the current block.
- the decoder can directly determine the block quantization parameter value of the first color component and the block quantization parameter value of the second color component by parsing the code stream.
- for the luma color component, if the value of the second syntax element identification information of the luma color component of the current block is true, the luma color component of the current block is filtered using the preset network model, and the block quantization parameter value of the luma color component also needs to be obtained by decoding; for the chroma color component, if the value of the second syntax element identification information of the chroma color component of the current block is true, the chroma color component of the current block is filtered using the preset network model, and the block quantization parameter value of the chroma color component also needs to be obtained by decoding.
- it should be noted that if only the chroma color component of the current block needs to be filtered using the preset network model, the input block quantization parameter information still includes both the block quantization parameter value of the luma color component and the block quantization parameter value of the chroma color component; in this case, the block quantization parameter value of the luma color component obtained from the code stream is a default value.
- conversely, if only the luma color component of the current block is filtered using the preset network model, the input block quantization parameter information still includes both the block quantization parameter value of the luma color component and the block quantization parameter value of the chroma color component; in this case, the block quantization parameter value of the chroma color component obtained from the code stream is a default value.
- determining the reconstruction value of the component to be filtered of the current block may include: parsing the code stream to determine the reconstructed residual value of the component to be filtered of the current block; and determining the reconstruction value of the component to be filtered of the current block based on the reconstructed residual value of the component to be filtered of the current block and the predicted value of the component to be filtered of the current block.
- parsing the code stream to determine the reconstructed residual value of the component to be filtered of the current block may include: parsing the code stream to obtain the target residual value of the component to be filtered of the current block; and performing inverse quantization and inverse transformation on the target residual value of the component to be filtered of the current block to obtain the reconstructed residual value of the component to be filtered of the current block.
- determining the reconstruction value of the component to be filtered of the current block based on the reconstructed residual value of the component to be filtered of the current block and the predicted value of the component to be filtered of the current block may include: adding the reconstructed residual value of the component to be filtered of the current block and the predicted value of the component to be filtered of the current block to obtain the reconstruction value of the component to be filtered of the current block.
- in this way, the reconstructed residual value of the component to be filtered of the current block can be obtained through decoding; then intra-frame or inter-frame prediction is performed on the component to be filtered of the current block to determine the predicted value of the component to be filtered of the current block; the reconstructed residual value of the component to be filtered and the predicted value of the component to be filtered are then added to obtain the reconstruction value of the component to be filtered of the current block, which is the reconstructed YUV information described above; this then serves as the input of the preset network model to determine the filtered reconstruction value of the component to be filtered of the current block.
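The residual-plus-prediction step that produces the network's reconstructed-YUV input can be sketched per sample. The clipping to the sample range is a standard decoder detail assumed here; the 10-bit depth is illustrative.

```python
def reconstruct_samples(residuals, predictions, bit_depth=10):
    """Form the reconstruction values fed to the preset network model:
    rec = clip(pred + res) for each sample, clipped to the valid
    range implied by the assumed bit depth."""
    hi = (1 << bit_depth) - 1
    return [min(max(p + r, 0), hi) for r, p in zip(residuals, predictions)]
```

The clip matters at the extremes: a negative residual on a dark sample or a positive residual near the ceiling must not leave the legal sample range before the values reach the filter network.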
- the method may further include: when the second syntax element identification information indicates that the component to be filtered of the current block is not filtered using the preset network model, directly determining the reconstruction value of the component to be filtered of the current block as the filtered reconstruction value of the component to be filtered of the current block.
- in other words, if the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the reconstruction value of the component to be filtered of the current block and the block quantization parameter information can be input to the preset network model to obtain the filtered reconstruction value of the component to be filtered of the current block; if the second syntax element identification information indicates that the component to be filtered of the current block is not filtered using the preset network model, the reconstruction value of the component to be filtered of the current block can be directly determined as the post-filtering reconstruction value of the component to be filtered of the current block.
- the method may also include:
- when the first syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are not allowed to be filtered using the preset network model, the values of the second syntax element identification information of the components to be filtered of the divided blocks are all set to the second value;
- and the reconstruction value of the component to be filtered of each divided block is directly determined as the post-filtering reconstruction value of the component to be filtered of that divided block.
- in other words, if the first syntax element identification information indicates that there is a component to be filtered of a divided block in the current frame that is allowed to be filtered using the preset network model, it is necessary to continue decoding to determine the second syntax element identification information, and then determine the filtered reconstruction value of the component to be filtered of the current block according to the second syntax element identification information; conversely, if the first syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are not allowed to be filtered using the preset network model, the values of the second syntax element identification information of the components to be filtered of the at least one divided block can be set to the second value; then, after the reconstruction value of the component to be filtered of each divided block is determined, the reconstruction value of the component to be filtered of each divided block is directly determined as the filtered reconstruction value of the component to be filtered of that divided block; other loop filtering still needs to be performed
- the method may also include:
- when the third syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model, parse the code stream and determine the frame quantization parameter information of the current frame, wherein the frame quantization parameter information at least includes a frame quantization parameter value of the first color component and a frame quantization parameter value of the second color component;
- the reconstruction value of the component to be filtered of the divided block and the block quantization parameter information of the divided block are input to the preset network model, and the post-filter reconstruction value of the component to be filtered of the divided block is determined.
- in other words, if the third syntax element identification information indicates that the components to be filtered of the at least one divided block included in the current frame do not all use the preset network model for filtering, it is necessary to continue decoding to determine the first syntax element identification information and the second syntax element identification information, and then determine the filtered reconstruction value of the component to be filtered of the current block based on these two pieces of syntax element identification information; conversely, if the third syntax element identification information indicates that all components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model, only the frame quantization parameter information of the current frame needs to be determined by decoding; then the values of the first syntax element identification information and the second syntax element identification information are all set to the first value, and the block quantization parameter information of all divided blocks included in the current frame is determined according to the frame quantization parameter information of the current frame.
- the frame quantization parameter information at least includes the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component.
- the first color component is a luminance color component
- the second color component is a chrominance color component.
- parsing the code stream and determining the frame quantization parameter information of the current frame may include: parsing the code stream to obtain a third quantization parameter index and a fourth quantization parameter index; determining, according to the third quantization parameter index, the frame quantization parameter value of the first color component corresponding to the current frame from the first quantization parameter candidate set; and determining, according to the fourth quantization parameter index, the frame quantization parameter value of the second color component corresponding to the current frame from the second quantization parameter candidate set.
- the first quantization parameter candidate set may be composed of at least two candidate quantization parameter values of the first color component
- the second quantization parameter candidate set may be composed of at least two candidate quantization parameter values of the second color component. It should be noted that, within the same frame, the first quantization parameter candidate set is the same and the second quantization parameter candidate set is the same, whereas the first quantization parameter candidate sets corresponding to different frames can be different, and the second quantization parameter candidate sets corresponding to different frames may also be different.
- what is written in the code stream may be the third quantization parameter index and the fourth quantization parameter index.
- the decoder can obtain the third quantization parameter index by parsing the code stream, and then determine the frame quantization parameter value of the first color component from the first quantization parameter candidate set according to the third quantization parameter index; the decoder can also obtain a fourth quantization parameter index by parsing the code stream, and then determine the frame quantization parameter value of the second color component from the second quantization parameter candidate set according to the fourth quantization parameter index.
- the frame quantization parameter value of the luminance color component may be represented by ph_nnlf_luma_baseqp
- the frame quantization parameter value of the chroma color component may be represented by ph_nnlf_chroma_baseqp.
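The index-based signalling above can be sketched as follows; the candidate QP values and set sizes are illustrative assumptions, not values defined by the embodiment:

```python
# Frame-level QP signalling via candidate-set indices (decoder side).
# The candidate sets below are hypothetical examples.

def select_frame_qp(luma_candidates, chroma_candidates, luma_index, chroma_index):
    """Map the decoded third/fourth quantization parameter indices to the
    frame QP values of the first (luma) and second (chroma) color components."""
    return luma_candidates[luma_index], chroma_candidates[chroma_index]

luma_set = [22, 27, 32, 37]    # first quantization parameter candidate set
chroma_set = [25, 30, 35, 40]  # second quantization parameter candidate set

# Indices as if parsed from the code stream.
base_qp_luma, base_qp_chroma = select_frame_qp(luma_set, chroma_set, 1, 2)
```

Only the indices need to be transmitted, which costs fewer bits than coding the QP values directly when the candidate sets are small.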
- parsing the code stream and determining the frame quantization parameter information of the current frame may include:
- the code stream is parsed to determine the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component corresponding to the current frame.
- the decoder can directly determine the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component by parsing the code stream.
- the current frame is determined by parsing the code stream.
- the components to be filtered include at least luminance color components and chrominance color components; the method may also include:
- the third syntax element identification information is the frame-level brightness switch identification information of the current frame
- the first syntax element identification information is the frame-level brightness enable identification information of the current frame
- the second syntax element identification information is the block-level brightness usage identification information of the current block; wherein the frame-level brightness switch identification information is used to indicate whether the brightness color components of at least one divided block included in the current frame are all filtered using the preset network model,
- the frame-level brightness enable identification information is used to indicate whether there is a divided block in the current frame whose brightness color component is allowed to be filtered using the preset network model,
- and the block-level brightness usage identification information is used to indicate whether the brightness color component of the current block is filtered using the preset network model;
- the third syntax element identification information is the frame-level chroma switch identification information of the current frame
- the first syntax element identification information is the frame-level chroma enable identification information of the current frame,
- the second syntax element identification information is the block-level chroma usage identification information of the current block; wherein the frame-level chroma switch identification information is used to indicate whether the chroma color components of at least one divided block included in the current frame are all filtered using the preset network model,
- the frame-level chroma enable identification information is used to indicate whether there is a divided block in the current frame whose chroma color component is allowed to be filtered using the preset network model,
- and the block-level chroma usage identification information is used to indicate whether the chroma color component of the current block is filtered using the preset network model.
- the frame-level brightness switch identification information can be represented by ph_nnlf_luma_ctrl_flag
- the frame-level brightness enable identification information can be represented by ph_nnlf_luma_enable_flag
- the block-level brightness usage identification information can be represented by ctb_nnlf_luma_flag
- the frame-level chroma switch identification information can be represented by ph_nnlf_chroma_ctrl_flag
- the frame-level chroma enable identification information can be represented by ph_nnlf_chroma_enable_flag
- the block-level chroma usage identification information can be represented by ctb_nnlf_chroma_flag.
- sequence-level syntax elements may also be provided to determine whether the current sequence allows the use of neural network-based loop filtering technology.
- the method may also include:
- when the fourth syntax element identification information indicates that the component to be filtered of the current sequence is allowed to be filtered using the preset network model, the step of parsing the code stream and determining the third syntax element identification information of the component to be filtered of the current frame is performed.
- the fourth syntax element identification information is a sequence-level syntax element, which can be used to indicate whether the component to be filtered of the current sequence is allowed to be filtered using the preset network model. That is to say, depending on the value of the fourth syntax element identification information, it can be determined whether the component to be filtered of the current sequence is allowed to be filtered using the preset network model, or whether the component to be filtered of the current sequence is not allowed to be filtered using the preset network model.
- the fourth syntax element identification information may be represented by sps_nnlf_enable_flag.
- for sps_nnlf_enable_flag, if at least one of the brightness color component and the chrominance color component of the current sequence is allowed to be filtered using the preset network model, the value of sps_nnlf_enable_flag is true, that is, the component to be filtered of the current sequence is allowed to be filtered using the preset network model.
- parsing the code stream and determining the fourth syntax element identification information may include: parsing the code stream to obtain the value of the fourth syntax element identification information.
- the method may also include:
- if the value of the fourth syntax element identification information is the first value, the fourth syntax element identification information indicates that the component to be filtered of the current sequence is allowed to be filtered using the preset network model;
- if the value of the fourth syntax element identification information is the second value, the fourth syntax element identification information indicates that the component to be filtered of the current sequence is not allowed to be filtered using the preset network model.
- the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form.
- the fourth syntax element identification information may be a parameter written in the profile, or may be the value of a flag, which is not specifically limited here.
- when the fourth syntax element identification information is a flag, the first value may be set to 1 and the second value may be set to 0; or, the first value may also be set to true and the second value to false; however, the embodiment of this application does not make a specific limitation.
- the fourth syntax element identification information may be called a sequence-level identification bit.
- the decoder first decodes to obtain the sequence-level identification bit. If the value of sps_nnlf_enable_flag is true, it means that the current code stream allows the use of the loop filtering technology based on the preset network model, and the subsequent decoding process needs to parse the relevant syntax elements; otherwise, it means that the current code stream does not allow the use of the loop filtering technology based on the preset network model, the subsequent decoding process does not need to parse the relevant syntax elements, and by default the relevant syntax elements take their initial values or a false state.
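A minimal sketch of this three-level flag hierarchy for one colour component (the flag order mirrors sps_nnlf_enable_flag, the frame-level ctrl and enable flags, and the block-level usage flag; the bitstream reader is simulated with a plain list of decoded bits, which is an illustrative assumption):

```python
# Three-level NNLF flag parsing: sequence -> frame -> block.
def parse_nnlf_flags(bits, num_blocks):
    """Return a per-block usage list; bits is a sequence of decoded flag values."""
    it = iter(bits)
    if not next(it):                      # sps_nnlf_enable_flag == 0
        return [False] * num_blocks       # tool disabled for the whole sequence
    if next(it):                          # frame-level ctrl flag == 1
        return [True] * num_blocks        # all blocks filtered; no block flags coded
    if not next(it):                      # frame-level enable flag == 0
        return [False] * num_blocks       # no block in this frame uses the model
    # Otherwise one block-level usage flag per divided block.
    return [bool(next(it)) for _ in range(num_blocks)]

# Sequence enabled, frame ctrl off, frame enable on, block flags 1,0,1.
usage = parse_nnlf_flags([1, 0, 1, 1, 0, 1], num_blocks=3)
```

Note how the hierarchy saves bits: whenever a higher-level flag decides the outcome, no lower-level flags appear in the stream.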
- the preset network model may be a neural network model, and the neural network model at least includes: a convolution layer, an activation layer, a splicing layer and a skip connection layer.
- for the preset network model, its input may include: the reconstruction value of the component to be filtered (expressed by rec_yuv), the quantization parameter value of the brightness color component (expressed by BaseQPluma) and the quantization parameter value of the chrominance color component (expressed by BaseQPchroma);
- its output may be: the filtered reconstruction value of the component to be filtered (represented by output_yuv). Since the embodiment of the present application removes non-important input elements such as predicted YUV information and YUV information with division information, the calculation amount of network model inference can be reduced, which is beneficial to the implementation of the decoding end and reduces the decoding time.
- the input of the preset network model may also include the quantization parameter (SliceQP) of the current frame, but SliceQP does not need to distinguish between brightness color components and chroma color components.
- the main structure of the network is similar to the aforementioned Figure 2 or Figure 4.
- the main structure is also composed of multiple residual blocks, and the composition of the residual blocks is detailed in Figure 3.
- FIG. 7 shows a schematic network architecture diagram of a neural network model provided by an embodiment of the present application.
- the reconstructed value of the component to be filtered is processed by the convolution layer and the activation layer, and then is spliced with the quantization parameter value of the brightness color component and the quantization parameter value of the chroma color component.
- the spliced result is fed into the main structure; there is also a skip connection structure here, which connects the input reconstruction value of the component to be filtered with the output after the Pixel Shuffle module, and finally the filtered reconstruction value of the component to be filtered is output.
- the inputs of the network architecture mainly include reconstructed YUV information (rec_yuv), BaseQPluma and BaseQPchroma, and the output of the network architecture is the output filtered component information (output_yuv).
- the embodiment of this application proposes a multi-BaseQP input loop filtering technology based on a neural network model.
- the main idea is that one channel carrying BaseQPluma is input for the brightness color component and one channel carrying BaseQPchroma is input for the chroma color component, while the number of models remains unchanged.
- in this way, without increasing the number of models, the embodiments of the present application can provide more information for the brightness and chroma color components, while allowing more choices and adaptations for the brightness and chroma color components.
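The multi-BaseQP input idea can be sketched as follows: one constant plane per quantization parameter is concatenated ("spliced") with the reconstructed samples, so a single model serves any QP pair. The 4x4 block size and the normalisation by 64 are illustrative assumptions:

```python
# Assemble the model input channels: rec samples plus one plane per BaseQP.
def build_model_input(rec_plane, base_qp_luma, base_qp_chroma, qp_max=64.0):
    h, w = len(rec_plane), len(rec_plane[0])
    qp_luma_plane = [[base_qp_luma / qp_max] * w for _ in range(h)]
    qp_chroma_plane = [[base_qp_chroma / qp_max] * w for _ in range(h)]
    # Channel-wise concatenation before the network's main structure.
    return [rec_plane, qp_luma_plane, qp_chroma_plane]

rec = [[0.5] * 4 for _ in range(4)]   # toy reconstructed block (normalised)
channels = build_model_input(rec, base_qp_luma=32, base_qp_chroma=36)
```

Because the QP values enter as extra input channels rather than being baked into the weights, the encoder can switch QP pairs per frame without storing one trained model per QP.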
- the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block.
- the method may also include: determining the output of the preset network model as the filtered reconstruction value of the component to be filtered of the current block.
- the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block, and the output of the preset network model can also be residual information.
- the method may also include:
- S801 Determine the reconstruction value of the component to be filtered of the current block, input the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and output the first residual value of the component to be filtered of the current block through the preset network model.
- S802 Determine the post-filtering reconstruction value of the component to be filtered of the current block based on the reconstruction value of the component to be filtered of the current block and the first residual value of the component to be filtered of the current block.
- the output of the preset network model can be directly the filtered reconstruction value of the component to be filtered of the current block, or it can also be the first residual value of the component to be filtered of the current block.
- the decoder also needs to add the reconstruction value of the component to be filtered of the current block and the first residual value of the component to be filtered of the current block to determine the filtered reconstruction value of the component to be filtered of the current block.
- the method may further include:
- S901 Parse the code stream and determine the residual scaling factor.
- S902 Scale the first residual value of the component to be filtered of the current block according to the residual scaling factor to obtain the second residual value of the component to be filtered of the current block.
- S903 Determine the post-filtering reconstruction value of the component to be filtered of the current block based on the reconstruction value of the component to be filtered of the current block and the second residual value of the component to be filtered of the current block.
- the model output is the output of the loop filter tool based on the preset network model.
- the model output generally needs to undergo a scaling process.
- the preset network model infers and outputs the residual information of the current block; after the scaling process, the residual information is superimposed on the reconstructed samples of the current block. The residual scaling factor is derived by the encoder and needs to be written into the code stream and sent to the decoder, so that the decoder can obtain the residual scaling factor through decoding.
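Steps S901 to S903 can be sketched as follows; the 10-bit clipping range and the sample values are illustrative assumptions:

```python
# Scale the network's residual output and superimpose it on the reconstruction.
def apply_scaled_residual(rec, residual, scale, bit_depth=10):
    lo, hi = 0, (1 << bit_depth) - 1
    return [
        [min(hi, max(lo, r + round(scale * d))) for r, d in zip(rrow, drow)]
        for rrow, drow in zip(rec, residual)
    ]

rec = [[512, 600], [700, 1020]]        # reconstructed samples
residual = [[8, -4], [0, 16]]          # first residual values from the model
filtered = apply_scaled_residual(rec, residual, scale=0.5)
```

The clip keeps the filtered reconstruction inside the legal sample range; note the 1020 + 8 case saturating at 1023.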
- the method may also include:
- the current frame may include at least one divided block. These divided blocks are traversed, each divided block is regarded as the current block in turn, and the decoding method process of the embodiment of the present application is repeatedly executed to obtain the filtered reconstruction value corresponding to each divided block; the reconstructed image of the current frame is then determined according to the obtained filtered reconstruction values.
- the decoder can also continue to traverse other loop filtering tools and output a complete reconstructed image after completion. The specific process is not closely related to the embodiments of the present application, so it will not be described in detail here.
- the decoding method of the embodiment of the present application only allows the B frame to use different quantization parameters (BaseQPluma and BaseQPchroma) for the luma and chroma component inputs, while the I frame uses the same quantization parameter input for the luma and chroma components. This not only reduces the encoding and decoding time, but also saves the bit overhead of quantization parameter transmission on the I frame, further improving compression efficiency.
- the embodiments of the present application only add one layer of chroma quantization parameters as an additional input.
- quantization parameters can also be added to the Cb color component and Cr color component respectively as additional inputs.
- the loop filtering enhancement method based on the neural network model proposed by the embodiments of this application can also be extended to other input parts, such as boundary strength, etc., which is not specifically limited by the embodiments of this application.
- This embodiment provides a decoding method that determines the first syntax element identification information of the component to be filtered of the current frame by parsing the code stream; when the first syntax element identification information indicates that there is a divided block in the current frame whose component to be filtered is allowed to be filtered using the preset network model, the code stream is parsed to determine the second syntax element identification information of the component to be filtered of the current block; and when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the code stream is parsed to determine the block quantization parameter information of the current block.
- the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component; the reconstruction value of the component to be filtered of the current block is then determined, the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block are input into the preset network model, and the filtered reconstruction value of the component to be filtered of the current block is determined.
- since the input of the preset network model only includes the reconstruction value of the component to be filtered and the block quantization parameter information, non-important input elements such as the prediction information and division information of the color component are removed, which can reduce the calculation amount of network model inference, is beneficial to the implementation of the decoder and reduces the decoding time; in addition, since the input block quantization parameter information includes at least the block quantization parameter values of two color components, the use of multi-channel quantization parameters as input gives the brightness color component and the chroma color component more choices and adaptations; and by introducing new syntax elements, the decoder can achieve a more flexible configuration without having to store multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- Figure 10 shows a schematic flow chart of an encoding method provided by the embodiment of the present application.
- the method may include:
- S1001 Determine the first syntax element identification information of the component to be filtered of the current frame.
- this method is applied to the encoder. Specifically, it can be applied to a loop filtering method based on a neural network model; more specifically, to a loop filtering method based on a neural network model with multiple quantization parameter inputs.
- the first syntax element identification information is a frame-level syntax element, which can be used to indicate whether there is a component to be filtered in the divided block in the current frame that allows filtering using the preset network model.
- the current frame may include at least one divided block, and the current block is any one of the at least one divided block. That is to say, the first syntax element identification information can determine whether there is a divided block among the at least one divided block included in the current frame whose component to be filtered is allowed to be filtered using the preset network model.
- the component to be filtered may refer to the color component.
- the color components may include at least one of: a first color component, a second color component, and a third color component.
- the first color component may be a brightness color component
- the second color component and the third color component may be chroma color components (for example, the second color component is a blue chroma color component and the third color component is a red chroma color component; alternatively, the second color component is a red chroma color component and the third color component is a blue chroma color component).
- the frame-level syntax element identification information of the component to be filtered of the current frame may include first syntax element identification information and third syntax element identification information.
- before determining the first syntax element identification information, the encoder first needs to determine the third syntax element identification information of the component to be filtered of the current frame.
- determining the first syntax element identification information of the component to be filtered of the current frame may include: determining the third syntax element identification information of the component to be filtered of the current frame; when the third syntax element identification information indicates that the components to be filtered of at least one divided block included in the current frame are not all filtered using the preset network model, determining the first syntax element identification information of the component to be filtered of the current frame.
- the third syntax element identification information is used to indicate whether all to-be-filtered components of at least one divided block included in the current frame are filtered using the preset network model
- the first syntax element identification information is used to indicate whether there is a divided block in the current frame whose component to be filtered is allowed to be filtered using the preset network model.
- for the brightness color component, the first syntax element identification information may be ph_nnlf_luma_enable_flag,
- and the third syntax element identification information may be ph_nnlf_luma_ctrl_flag;
- for the chroma color component, the first syntax element identification information may be ph_nnlf_chroma_enable_flag,
- and the third syntax element identification information may be ph_nnlf_chroma_ctrl_flag. That is to say, for different color components in the current frame, different first syntax element identification information and third syntax element identification information are correspondingly set.
- the distortion method may be used to determine whether the components to be filtered of at least one divided block included in the current frame are all filtered using the preset network model, and/or the distortion method may be used to determine whether there is a divided block in the current frame whose component to be filtered is allowed to be filtered using the preset network model.
- the distortion method here may be a rate distortion cost method.
- after calculating the rate distortion cost values in different situations, it is determined according to the magnitudes of the rate distortion cost values whether the components to be filtered of at least one divided block included in the current frame are all filtered using the preset network model, that is, the value of the third syntax element identification information is determined; and/or, it is determined according to the rate distortion cost values whether there is a component to be filtered in the current frame that is allowed to be filtered using the preset network model, that is, the value of the first syntax element identification information is determined.
- the method may further include:
- for the components to be filtered of at least one divided block included in the current frame, there may be three situations: all of the at least one divided block are filtered using the preset network model, none of the at least one divided block is filtered using the preset network model, and some of the at least one divided block are filtered using the preset network model.
- the rate distortion cost method can be used to calculate the first rate distortion cost value for the case where the components to be filtered of the at least one divided block are not filtered using the preset network model, the second rate distortion cost value for the case where the components to be filtered of the at least one divided block are all filtered using the preset network model, and the third rate distortion cost value for the case where some of the at least one divided block are allowed to filter their components to be filtered using the preset network model; the value of the first syntax element identification information and the value of the third syntax element identification information are then determined according to the magnitudes of these three rate distortion cost values.
- determining the frame-level syntax element identification information of the component to be filtered of the current frame based on the first rate distortion cost value, the second rate distortion cost value and the third rate distortion cost value may include:
- if the second rate distortion cost value is the smallest, the value of the third syntax element identification information is set to the first value;
- otherwise, the value of the third syntax element identification information is set to the second value.
- the method may further include: encoding the value of the third syntax element identification information, and writing the resulting encoded bits into the code stream.
- if the second rate distortion cost value is the smallest, which means that the components to be filtered of at least one divided block included in the current frame are all filtered using the preset network model, the value of the third syntax element identification information can be set to the first value; otherwise, if the first rate distortion cost value is the smallest or the third rate distortion cost value is the smallest, which means that the components to be filtered of at least one divided block included in the current frame are not all filtered using the preset network model, the value of the third syntax element identification information can be set to the second value.
- the encoder can also write the value of the third syntax element identification information into the code stream, so that the subsequent decoder can determine the third syntax element identification information by parsing the code stream.
- according to the third syntax element identification information, it can further be determined whether the components to be filtered of at least one divided block included in the current frame are all filtered using the preset network model.
- determining the frame-level syntax element identification information of the component to be filtered of the current frame based on the first rate distortion cost value, the second rate distortion cost value and the third rate distortion cost value may include:
- if the second rate distortion cost value is the smallest or the third rate distortion cost value is the smallest, the value of the first syntax element identification information is set to the first value;
- otherwise, the value of the first syntax element identification information is set to the second value.
- the method may further include: encoding the value of the first syntax element identification information, and writing the resulting encoded bits into the code stream.
- if the second rate distortion cost value is the smallest or the third rate distortion cost value is the smallest, the value of the first syntax element identification information can be set to the first value; otherwise, if the first rate distortion cost value is the smallest, which means that none of the components to be filtered of at least one divided block included in the current frame is filtered using the preset network model, the value of the first syntax element identification information can be set to the second value.
- the encoder can also write the value of the first syntax element identification information into the code stream, so that the subsequent decoder can determine the first syntax element identification information by parsing the code stream.
- according to the first syntax element identification information, it can then be determined whether there is a divided block in the current frame whose component to be filtered is allowed to be filtered using the preset network model.
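The flag derivation described above can be sketched as follows; the cost values are hypothetical, and how ties between costs are broken is an implementation choice not specified by the embodiment:

```python
# Derive frame-level flag values from the three rate-distortion costs.
def decide_frame_flags(cost_none, cost_all, cost_partial):
    """cost_none/cost_all/cost_partial correspond to the first, second and
    third rate distortion cost values (no / full / partial filtering)."""
    best = min(cost_none, cost_all, cost_partial)
    ctrl_flag = 1 if best == cost_all else 0     # third syntax element value
    enable_flag = 0 if best == cost_none else 1  # first syntax element value
    return ctrl_flag, enable_flag

# Full filtering is cheapest, so both flags come out as 1.
ctrl, enable = decide_frame_flags(cost_none=105.0, cost_all=98.0, cost_partial=101.0)
```

When the ctrl flag is 1 the enable flag and block-level flags need not be coded, which is where the bit savings come from.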
- the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form. Specifically, both the first syntax element identification information and the third syntax element identification information can be parameters written in the profile, or can be the value of a flag, which is not specifically limited here.
- when the syntax element identification information is a flag, the first value can be set to 1 and the second value can be set to 0; or, the first value can also be set to true and the second value to false; however, the embodiment of this application does not make a specific limitation.
- determining the first rate distortion cost value for the case where the components to be filtered of at least one divided block included in the current frame are not filtered using the preset network model may include:
- the rate distortion cost is calculated according to the original value of the component to be filtered of at least one divided block included in the current frame and the reconstructed value of the component to be filtered of at least one divided block included in the current frame, to obtain a first rate distortion cost value.
- the encoder can first calculate the cost information of the current frame without using the preset network model, that is, use the reconstructed samples of the current block prepared as the input of the preset network model and the original image samples of the current block to calculate the first rate distortion cost value, which can be expressed by costOrg.
- determining the reconstruction value of the component to be filtered of at least one divided block included in the current frame may include:
- the reconstruction value of the component to be filtered of the at least one divided block is determined according to the predicted value of the component to be filtered of the at least one divided block and the reconstruction residual value of the component to be filtered of the at least one divided block.
- the reconstruction value of the component to be filtered of at least one divided block is determined based on the predicted value of the component to be filtered of the at least one divided block and the reconstruction residual value of the component to be filtered of the at least one divided block. Specifically, it may be:
- the reconstructed value of the component to be filtered of at least one divided block can be determined by performing an addition operation on the predicted value of the component to be filtered of the at least one divided block and the reconstruction residual value of the component to be filtered of the at least one divided block.
- the target residual value is also written into the code stream, so that the subsequent decoder can obtain the target residual value through decoding, obtain the reconstructed residual value through inverse quantization and inverse transformation processing, and then determine the reconstruction value of the component to be filtered of the divided block according to the reconstructed residual value.
- the method may further include: encoding the target residual value of the to-be-filtered component of at least one divided block, and writing the resulting encoded bits into the code stream.
- the current block for at least one divided block, taking the current block as an example, first determine the predicted value of the component to be filtered of the current block; and then determine the predicted value of the component to be filtered of the current block based on the original value of the component to be filtered. and the predicted value of the component to be filtered of the current block to obtain the initial residual value of the component to be filtered of the current block; then transform and quantize the initial residual value of the component to be filtered of the current block to obtain the component to be filtered of the current block The target residual value of the current block is then inversely quantized and inversely transformed to obtain the reconstructed residual value of the current block's to-be-filtered component.
- the predicted value and the reconstructed residual value of the component to be filtered of the current block are added to determine the reconstructed value of the component to be filtered of the current block.
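The residual pipeline described above can be sketched as follows. This is a minimal illustration only: a scalar quantizer stands in for the actual transform-plus-quantization stages (the codec operates on transform coefficients, not sample values directly), and all sample values are invented.

```python
def quantize(residual, step):
    # Stand-in for "transform + quantization": round each residual to the nearest step.
    return [round(r / step) for r in residual]

def dequantize(levels, step):
    # Stand-in for "inverse quantization + inverse transformation".
    return [l * step for l in levels]

original = [104, 98, 101, 110]    # original values of the component to be filtered
predicted = [100, 100, 100, 100]  # predicted values of the component to be filtered

# initial residual = original - predicted
initial_residual = [o - p for o, p in zip(original, predicted)]
# target residual value: written into the code stream
target_residual = quantize(initial_residual, step=4)
# reconstructed residual value: what the decoder recovers
recon_residual = dequantize(target_residual, step=4)
# reconstructed value = predicted value + reconstructed residual value
reconstruction = [p + r for p, r in zip(predicted, recon_residual)]
print(reconstruction)  # [104, 100, 100, 108]
```

Note the reconstruction differs from the original wherever quantization discarded information; this quantization distortion is what the loop filter described below tries to reduce.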
- for the calculation of the second rate-distortion cost value, determining that all components to be filtered of at least one divided block included in the current frame are filtered using the preset network model to obtain the second rate-distortion cost value may include:
- each quantization parameter combination at least includes a candidate quantization parameter value of the first color component and a candidate quantization parameter value of the second color component;
- the reconstructed value of the component to be filtered of at least one divided block included in the current frame is filtered based on the preset network model to obtain the filtered reconstruction value of the component to be filtered of at least one divided block included in the current frame;
- the rate-distortion cost is calculated based on the original value of the component to be filtered of at least one divided block included in the current frame and the filtered reconstruction value of the component to be filtered of at least one divided block included in the current frame, and the fourth rate-distortion cost value under each quantization parameter combination is obtained.
- the encoder can try the loop filtering technology based on the preset network model and traverse these four quantization parameter combinations respectively; the reconstructed sample YUV of the current block and the quantization parameters are input into the loaded preset network model for inference, and the preset network model outputs the reconstructed sample block of the current block.
- the fourth rate-distortion cost values are calculated, under each of these four quantization parameter combinations, from the reconstructed sample of the current block after loop filtering based on the preset network model and the original image sample of the current block, represented by costFrame1, costFrame2, costFrame3 and costFrame4 respectively; the minimum rate-distortion cost value is selected from costFrame1, costFrame2, costFrame3 and costFrame4, and the selected fourth rate-distortion cost value is used as the final second rate-distortion cost value, represented by costFrameBest.
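The frame-level selection just described can be sketched as follows. The (BaseQPluma, BaseQPchroma) candidate combinations and the cost values here are invented purely for illustration; in the codec each cost would come from an actual rate-distortion computation.

```python
# Four hypothetical (BaseQPluma, BaseQPchroma) combinations and their
# fourth rate-distortion cost values costFrame1..costFrame4 (made up).
qp_combinations = [(32, 34), (32, 39), (37, 34), (37, 39)]
cost_values = [105.2, 98.7, 101.3, 110.0]

costs = dict(zip(qp_combinations, cost_values))

# Select the minimum cost; its QP combination becomes the frame
# quantization parameter information, and the cost itself becomes
# the final second rate-distortion cost value costFrameBest.
best_qp = min(costs, key=costs.get)
costFrameBest = costs[best_qp]
print(best_qp, costFrameBest)  # (32, 39) 98.7
```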
- the method may also include: using the quantization parameter combination corresponding to the minimum rate-distortion cost value as the frame quantization parameter information of the current frame;
- the method may also include: after encoding the value of the third syntax element identification information, continuing to encode the frame quantization parameter information of the current frame, and writing the resulting encoded bits into the code stream.
- the minimum rate-distortion cost value is selected from costFrame1, costFrame2, costFrame3 and costFrame4, and the quantization parameter combination corresponding to the selected minimum rate-distortion cost value is used as the frame quantization parameter information of the current frame. In this way, after encoding the value of the third syntax element identification information, the frame quantization parameter information of the current frame can also be encoded and then written into the code stream.
- determining at least two quantization parameter combinations may include:
- the first quantization parameter candidate set is composed of at least two candidate quantization parameter values of the first color component
- the second quantization parameter candidate set is composed of at least two candidate quantization parameter values of the second color component
- the first quantization parameter candidate set may include candidate quantization parameters of two luma color components
- the second quantization parameter candidate set may include candidate quantization parameters of two chroma color components; four quantization parameter combinations can be obtained based on the candidate quantization parameters of the two luma color components and the candidate quantization parameters of the two chroma color components.
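The four combinations are simply the Cartesian product of the two candidate sets. A minimal sketch, with invented candidate QP values:

```python
from itertools import product

luma_candidates = [27, 32]    # hypothetical first (luma) quantization parameter candidate set
chroma_candidates = [29, 34]  # hypothetical second (chroma) quantization parameter candidate set

# Each combination pairs one luma candidate with one chroma candidate.
qp_combinations = list(product(luma_candidates, chroma_candidates))
print(qp_combinations)  # [(27, 29), (27, 34), (32, 29), (32, 34)]
```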
- encoding the frame quantization parameter information of the current frame and writing the resulting encoded bits into the code stream may also include:
- the third quantization parameter index and the fourth quantization parameter index are encoded, and the obtained encoded bits are written into the code stream.
- the third quantization parameter index is used to indicate the index number of the frame quantization parameter value of the first color component in the first quantization parameter candidate set
- the fourth quantization parameter index is used to indicate the index number of the frame quantization parameter value of the second color component in the second quantization parameter candidate set.
- the third quantization parameter index and the fourth quantization parameter index need to be written into the code stream; therefore, there is no need to calculate the rate-distortion cost in the decoder later.
- the third quantization parameter index and the fourth quantization parameter index can be obtained by parsing the code stream, and then the frame quantization parameter information of the current frame, that is, the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component, can be determined.
- the frame quantization parameter value of the luminance color component may be represented by ph_nnlf_luma_baseqp
- the frame quantization parameter value of the chroma color component may be represented by ph_nnlf_chroma_baseqp.
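The index-based signalling can be sketched as follows: only the index numbers travel in the code stream, and the decoder recovers the frame QP values by table lookup in the shared candidate sets. The candidate values below are assumptions for illustration.

```python
# Candidate sets known to both encoder and decoder (values invented).
luma_candidate_set = [27, 32]
chroma_candidate_set = [29, 34]

# Encoder side: the chosen frame QP values (e.g. ph_nnlf_luma_baseqp /
# ph_nnlf_chroma_baseqp) are signalled only by their index numbers.
ph_nnlf_luma_baseqp = 32
ph_nnlf_chroma_baseqp = 29
third_qp_index = luma_candidate_set.index(ph_nnlf_luma_baseqp)      # written to the code stream
fourth_qp_index = chroma_candidate_set.index(ph_nnlf_chroma_baseqp)  # written to the code stream

# Decoder side: parse the indices, then look the values back up.
decoded_luma_qp = luma_candidate_set[third_qp_index]
decoded_chroma_qp = chroma_candidate_set[fourth_qp_index]
print(decoded_luma_qp, decoded_chroma_qp)  # 32 29
```

Because the decoder only performs a lookup, no rate-distortion cost calculation is needed on the decoding side.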
- for different blocks, the first quantization parameter candidate set may be the same, and the second quantization parameter candidate set may be the same; the first quantization parameter candidate sets corresponding to different frames may be different, and the second quantization parameter candidate sets corresponding to different frames may also be different.
- the first syntax element identification information indicates that there are divided blocks in the current frame whose components to be filtered are allowed to be filtered using the preset network model. Then the encoder also needs to determine the second syntax element identification information of the component to be filtered of the current block.
- the current frame includes at least one division block, where the current block specifically refers to the division block currently to be subjected to loop filtering, which may be any one of the at least one division block included in the current frame.
- the current block may be the current coding unit, the current prediction unit, the current transformation unit, or even the current coding tree unit, etc. The following will take the current coding tree unit as an example for detailed description.
- the second syntax element identification information is a coding tree unit level syntax element, which can be used to indicate whether the component to be filtered of the current block is filtered using a preset network model.
- the second syntax element identification information may also be called a coding tree unit usage identification bit. That is to say, the second syntax element identification information can determine whether the to-be-filtered component of the current coding tree unit is filtered using a preset network model, or whether the to-be-filtered component of the current coding tree unit is filtered without using the preset network model.
- the second syntax element identification information may be ctb_nnlf_luma_flag; if the component to be filtered is a chroma color component, the second syntax element identification information may be ctb_nnlf_chroma_flag. That is to say, for different color components in the current coding tree unit, different second syntax element identification information is correspondingly set.
- the encoder may first determine the third syntax element identification information of the component to be filtered.
- when the third syntax element identification information indicates that the components to be filtered of at least one divided block included in the current frame are not all filtered using the preset network model, the encoder also needs to continue to determine the value of the first syntax element identification information; only when the first syntax element identification information indicates that there are divided blocks in the current frame whose components to be filtered are allowed to be filtered using the preset network model will the encoder continue to determine the value of the second syntax element identification information.
- the method may further include:
- the second syntax element identification information of the component to be filtered of the current block is determined according to the fifth rate distortion cost value and at least two sixth rate distortion cost values.
- the encoder will try to optimize the selection at the coding tree unit level.
- for the component to be filtered, it is not only necessary to calculate the fifth rate-distortion cost value between the reconstructed sample without loop filtering based on the preset network model and the original sample of the current block, which can be represented by costCTUorg; it is also necessary to calculate the sixth rate-distortion cost values between the reconstructed samples after loop filtering based on the preset network model under a variety of BaseQPluma and BaseQPchroma combinations and the original sample of the current block, which can be represented by costCTUnn1, costCTUnn2, costCTUnn3 and costCTUnn4 respectively; the value of the second syntax element identification information is then determined from the rate-distortion cost values costCTUorg, costCTUnn1, costCTUnn2, costCTUnn3 and costCTUnn4.
- determining the second syntax element identification information of the component to be filtered of the current block based on the fifth rate-distortion cost value and at least two sixth rate-distortion cost values may include:
- the minimum rate distortion cost value is one of the sixth rate distortion cost values, then the value of the second syntax element identification information is set to the first value;
- the minimum rate distortion cost value is the fifth rate distortion cost value, then the value of the second syntax element identification information is set to the second value.
- the method may further include: encoding the value of the second syntax element identification information, and writing the resulting encoded bits into the code stream.
- if the smallest one is the fifth rate-distortion cost value, which means that the component to be filtered of the current block is not filtered using the preset network model, the value of the second syntax element identification information can be set to the second value; otherwise, if the smallest one is a certain sixth rate-distortion cost value, which means that the component to be filtered of the current block is filtered using the preset network model, the value of the second syntax element identification information can be set to the first value.
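The coding-tree-unit-level decision can be sketched as follows. All cost values are invented; in the codec they come from the rate-distortion calculations described above.

```python
# Fifth rate-distortion cost value: no NN-based loop filtering (invented value).
costCTUorg = 95.0
# Sixth rate-distortion cost values under four (BaseQPluma, BaseQPchroma)
# combinations: costCTUnn1..costCTUnn4 (invented values).
costCTUnn = [92.1, 90.4, 96.3, 93.8]

FIRST_VALUE, SECOND_VALUE = 1, 0  # e.g. flag values 1 / 0

# If the minimum over all candidates is one of the sixth cost values,
# the block uses the preset network model; otherwise it does not.
if min(costCTUnn) < costCTUorg:
    ctb_nnlf_flag = FIRST_VALUE   # use the NN filter for this CTU
else:
    ctb_nnlf_flag = SECOND_VALUE  # skip the NN filter for this CTU
print(ctb_nnlf_flag)  # 1
```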
- the encoder can also write the value of the second syntax element identification information into the code stream, so that the subsequent decoder can determine the second syntax element identification information by parsing the code stream.
- the second syntax element identification information can determine whether the current block in the current frame is filtered using the preset network model.
- the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form.
- the second syntax element identification information may also be a parameter written in the profile, or it may be the value of a flag, which is not specifically limited here.
- when the second syntax element identification information is a flag, for the first value and the second value, the first value can be set to 1 and the second value can be set to 0; or, the first value can also be set to true and the second value to false; however, the embodiment of this application does not make a specific limitation.
- S1004: Determine the reconstruction value of the component to be filtered of the current block, input the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the filtered reconstruction value of the component to be filtered of the current block.
- for the block quantization parameter information of the current block, the method may also include:
- the quantization parameter combination corresponding to the minimum rate distortion cost value is used as the block quantization parameter information of the current block;
- the method may also include: after encoding the value of the second syntax element identification information, continue to encode the block quantization parameter information of the current block, and write the resulting encoded bits into the code stream.
- the minimum rate-distortion cost value is selected from costCTUorg, costCTUnn1, costCTUnn2, costCTUnn3 and costCTUnn4, and the BaseQPluma and BaseQPchroma combination corresponding to the selected minimum rate-distortion cost value is used as the block quantization parameter information of the current block.
- in this way, after encoding the value of the second syntax element identification information, the block quantization parameter information of the current block can also be encoded and then written into the code stream.
- when encoding the block quantization parameter information of the current block and writing the resulting encoded bits into the code stream, the method may also include:
- the first quantization parameter index and the second quantization parameter index are encoded, and the obtained encoded bits are written into the code stream.
- the first quantization parameter candidate set may include candidate quantization parameters for at least two luminance color components
- the second quantization parameter candidate set may include candidate quantization parameters for at least two chrominance color components.
- if candidate quantization parameters of two luminance color components and candidate quantization parameters of two chroma color components are used, four quantization parameter combinations can be obtained.
- for different blocks, the first quantization parameter candidate set may be the same, and the second quantization parameter candidate set may be the same; the first quantization parameter candidate sets corresponding to different frames may be different, and the second quantization parameter candidate sets corresponding to different frames may also be different.
- the first quantization parameter index and the second quantization parameter index need to be written into the code stream; therefore, there is no need to calculate the rate-distortion cost in the decoder later.
- the first quantization parameter index and the second quantization parameter index can be obtained by parsing the code stream, and then the block quantization parameter information of the current block, that is, the block quantization parameter value of the first color component and the block quantization parameter value of the second color component, can be determined.
- the block quantization parameter value of the luma color component may be represented by ctb_nnlf_luma_baseqp
- the block quantization parameter value of the chroma color component may be represented by ctb_nnlf_chroma_baseqp.
- the components to be filtered include at least luminance color components and chrominance color components; the method may also include:
- the third syntax element identification information is the frame-level brightness switch identification information of the current frame
- the first syntax element identification information is the frame-level brightness enable identification information of the current frame
- the second syntax element identification information is the block-level brightness usage identification information of the current block; wherein the frame-level brightness switch identification information is used to indicate whether the brightness color components of at least one divided block included in the current frame are all filtered using the preset network model;
- the frame-level brightness enable identification information is used to indicate whether there are divided blocks in the current frame whose brightness color component is allowed to be filtered using the preset network model;
- the block-level brightness usage identification information is used to indicate whether the brightness color component of the current block is filtered using the preset network model;
- the third syntax element identification information is the frame-level chroma switch identification information of the current frame
- the first syntax element identification information is the frame-level chroma enable identification information of the current frame.
- the second syntax element identification information is the block-level chroma usage identification information of the current block; wherein the frame-level chroma switch identification information is used to indicate whether the chroma color components of at least one divided block included in the current frame are all filtered using the preset network model;
- the frame-level chroma enable identification information is used to indicate whether there are divided blocks in the current frame whose chroma color component is allowed to be filtered using the preset network model;
- the block-level chroma usage identification information is used to indicate whether the chroma color component of the current block is filtered using the preset network model.
- the frame-level brightness switch identification information can be represented by ph_nnlf_luma_ctrl_flag
- the frame-level brightness enable identification information can be represented by ph_nnlf_luma_enable_flag
- the block-level brightness usage identification information can be represented by ctb_nnlf_luma_flag
- the frame-level chroma switch identification information can be represented by ph_nnlf_chroma_ctrl_flag
- the frame-level chroma enable identification information can be represented by ph_nnlf_chroma_enable_flag
- the block-level chroma usage identification information can be represented by ctb_nnlf_chroma_flag.
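The gating relationship implied by these three flag levels (switch, enable, use) can be sketched as follows. This is a hypothetical reading of the hierarchy described above, not a normative parsing procedure: the frame-level switch forces filtering for all blocks, the frame-level enable flag permits per-block decisions, and the block-level flag makes the final per-CTU choice.

```python
def nn_filter_used(ctrl_flag, enable_flag, ctb_flag):
    """Decide whether a block's component is filtered with the preset network model.

    ctrl_flag   -- frame-level switch (e.g. ph_nnlf_luma_ctrl_flag)
    enable_flag -- frame-level enable (e.g. ph_nnlf_luma_enable_flag)
    ctb_flag    -- block-level usage (e.g. ctb_nnlf_luma_flag)
    """
    if ctrl_flag:
        return True          # all divided blocks in the frame are filtered
    if not enable_flag:
        return False         # no divided block in the frame may use the filter
    return bool(ctb_flag)    # otherwise decided per coding tree unit

print(nn_filter_used(0, 1, 1))  # True
```

The same function would apply to the chroma flag set (ph_nnlf_chroma_ctrl_flag, ph_nnlf_chroma_enable_flag, ctb_nnlf_chroma_flag).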
- sequence-level syntax elements may also be provided to determine whether the current sequence allows the use of neural network-based loop filtering technology.
- the method may also include:
- the fourth syntax element identification information indicates that the component to be filtered of the current sequence allows filtering using the preset network model, perform the step of determining the third syntax element identification information of the component to be filtered of the current frame; wherein the current sequence includes the current frame.
- the fourth syntax element identification information is a sequence-level syntax element, which can be used to indicate whether the component to be filtered of the current sequence is allowed to be filtered using the preset network model.
- the fourth syntax element identification information can be represented by sps_nnlf_enable_flag.
- if at least one of the brightness color component and the chrominance color component of the current sequence is allowed to be filtered using the preset network model, the value of sps_nnlf_enable_flag is true, that is, the component to be filtered of the current sequence is allowed to be filtered using the preset network model.
- determining the fourth syntax element identification information may include:
- if the component to be filtered of the current sequence is allowed to be filtered using the preset network model, the value of the fourth syntax element identification information is set to the first value
- if the component to be filtered of the current sequence is not allowed to be filtered using the preset network model, the value of the fourth syntax element identification information is set to the second value
- the method also includes: encoding the value of the fourth syntax element identification information, and writing the obtained encoded bits into the code stream.
- the first value and the second value are different, and the first value and the second value may be in parameter form or in numerical form.
- the fourth syntax element identification information is a flag
- the first value can be set to 1 and the second value can be set to 0; or, the first value can also be set to true and the second value to false; however, this application embodiment does not specifically limit it.
- the fourth syntax element identification information may be called a sequence-level identification bit.
- if the sequence-level identification bit is true, the neural network-based loop filtering technology is allowed to be used; if the sequence-level identification bit is false, the neural network-based loop filtering technology is not allowed to be used. The sequence-level identification bit needs to be written into the code stream when encoding the video sequence.
- the preset network model is a neural network model
- the neural network model at least includes: a convolution layer, an activation layer, a splicing layer and a skip connection layer.
- its input can include: the reconstruction value of the component to be filtered (represented by rec_yuv), the quantization parameter value of the brightness color component (represented by BaseQPluma) and the quantization parameter value of the chroma color component (represented by BaseQPchroma); its output can be: the filtered reconstruction value of the component to be filtered (represented by output_yuv). Since the embodiment of the present application removes non-essential input elements such as the predicted YUV information and the partition YUV information, the calculation amount of network model inference can be reduced, which is beneficial to the implementation of the decoding end and reduces the decoding time.
- the input of the preset network model may also include the quantization parameter (SliceQP) of the current frame, but SliceQP does not need to distinguish between the brightness color component and the chroma color component (not shown in Figure 7).
- the embodiment of this application proposes a multi-BaseQP input loop filtering technology based on a neural network model.
- the main idea is that one channel of BaseQPluma is input for the brightness color component and one channel of BaseQPchroma is input for the chroma color component, while the number of models remains unchanged.
- the embodiments of the present application can, at the cost of a slight increase in inference calculation and without increasing the number of models, provide more information for the brightness and chroma color components, while allowing more choices and adaptations for the brightness and chroma color components.
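One common way to feed scalar QP values into a convolutional model (an assumed layout, not necessarily the exact arrangement of Figure 7) is to expand each BaseQP into a constant plane and stack it with the reconstructed YUV planes as extra input channels, so a single model serves every QP combination:

```python
def qp_plane(qp, height, width):
    # Expand a scalar BaseQP into a constant plane matching the block size.
    return [[qp] * width for _ in range(height)]

height, width = 2, 2  # toy block size

# Toy reconstructed YUV planes (rec_yuv); values are invented.
rec_y = [[60, 62], [61, 63]]
rec_u = [[90, 91], [92, 93]]
rec_v = [[110, 111], [112, 113]]

# Stack: Y, U, V, plus one BaseQPluma channel and one BaseQPchroma channel.
model_input = [rec_y, rec_u, rec_v,
               qp_plane(32, height, width),   # BaseQPluma channel
               qp_plane(34, height, width)]   # BaseQPchroma channel
print(len(model_input))  # 5 input channels
```

Changing either BaseQP only changes the constant planes, which is what lets the encoder try several QP combinations with one set of model weights.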
- the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block.
- the method may also include: determining that the output of the preset network model is the filtered reconstruction value of the component to be filtered of the current block.
- the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block
- the output of the preset network model can also be residual information.
- the method may further include: determining that the output of the preset network model is the first residual value of the component to be filtered of the current block;
- determining the filtered reconstruction value of the component to be filtered of the current block may include: after obtaining the first residual value of the component to be filtered of the current block through the preset network model, according to the current block The reconstructed value of the component to be filtered and the first residual value of the component to be filtered of the current block are used to determine the post-filtered reconstruction value of the component to be filtered of the current block.
- the output of the preset network model can be directly the filtered reconstruction value of the component to be filtered of the current block, or it can also be the first residual value of the component to be filtered of the current block.
- the encoder also needs to add the reconstruction value of the component to be filtered of the current block and the first residual value of the component to be filtered of the current block to determine the filtered reconstruction value of the component to be filtered of the current block.
- the method may further include:
- determining the filtered reconstruction value of the component to be filtered of the current block may include:
- the post-filtering reconstruction value of the component to be filtered of the current block is determined according to the reconstruction value of the component to be filtered of the current block and the second residual value of the component to be filtered of the current block.
- the method may further include: encoding the residual scaling factor, and writing the resulting encoded bits into the code stream.
- the model output is the output of the loop filtering tool based on the preset network model.
- the model output generally needs to undergo a scaling process.
- the preset network model infers and outputs the residual information of the current block; after the scaling process, the residual information is superimposed on the reconstructed samples of the current block; the residual scaling factor is obtained by the encoder and needs to be written into the code stream and sent to the decoder, so that the subsequent decoder can obtain the residual scaling factor through decoding.
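The scale-and-superimpose step can be sketched as follows; the sample values, residuals, and scaling factor are invented for illustration:

```python
# Reconstructed samples of the current block before filtering (invented).
rec = [100, 102, 98]
# Residual information inferred by the preset network model (invented).
model_residual = [4, -2, 6]
# Residual scaling factor chosen by the encoder and written into the code stream.
residual_scale = 0.5

# Filtered reconstruction = reconstruction + scaled model residual.
filtered_rec = [r + round(residual_scale * d) for r, d in zip(rec, model_residual)]
print(filtered_rec)  # [102, 101, 101]
```

Since the decoder parses the same residual scaling factor from the code stream, it reproduces exactly the same superposition.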
- the method may also include:
- the current frame may include at least one divided block. These divided blocks are then traversed, each divided block is regarded as the current block in turn, and the encoding method process of the embodiment of the present application is repeatedly executed to obtain the filtered reconstruction value corresponding to each divided block; the reconstructed image of the current frame is determined according to the obtained filtered reconstruction values.
- the encoder can also continue to traverse other loop filtering tools and output a complete reconstructed image after completion. The specific process is not closely related to the embodiments of the present application, so it will not be described in detail here.
- the decoding method of the embodiment of the present application only allows the B frame to use different quantization parameters (BaseQPluma and BaseQPchroma) for the luma and chroma component inputs, while the I frame uses the same quantization parameter input for the luma and chroma components. This not only reduces the encoding and decoding time, but also saves the bit overhead of quantization parameter transmission on the I frame, further improving compression efficiency.
- the embodiments of the present application only add one layer of chroma quantization parameters as an additional input.
- quantization parameters can also be added to the Cb color component and Cr color component respectively as additional inputs.
- the loop filtering enhancement method based on the neural network model proposed by the embodiments of this application can also be extended to other input parts, such as boundary strength, etc., which is not specifically limited by the embodiments of this application.
- This embodiment provides an encoding method: determining the first syntax element identification information of the component to be filtered of the current frame;
- when the first syntax element identification information indicates that there are divided blocks in the current frame whose components to be filtered are allowed to be filtered using the preset network model, determining the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block; when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model,
- the block quantization parameter information of the current block is determined; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component.
- reducing the amount of calculation is beneficial to the implementation of the decoder and reduces the decoding time; in addition, since the input block quantization parameter information includes at least the block quantization parameter values of two color components, even if multi-channel quantization parameters are used as input, the brightness color component and the chroma color component have more choices and adaptations; and by introducing new syntax elements, the decoder can achieve a more flexible configuration without having to store multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- the embodiment of the present application proposes a multi-BaseQP input neural network-based loop filtering technology.
- the main idea is to input one channel of BaseQPluma for the brightness color component and one channel of BaseQPchroma for the chroma color component.
- the embodiments of this application can, at the cost of a slight increase in inference calculation and without increasing the number of models, provide more information for the brightness color components and chroma color components, while also giving the encoding end more choices.
- FIG. 7 shows a schematic network architecture diagram of a multi-BaseQP input neural network model provided by an embodiment of the present application.
- the preset network model takes the neural network model shown in Figure 7 as an example.
- the encoding end can provide multiple BaseQPluma and BaseQPchroma candidate combinations; for each coding tree unit or coding unit, each candidate is input into the current neural network model for inference to calculate the filtered reconstructed sample block and obtain the corresponding rate-distortion cost.
- the code stream is transmitted to the decoding end.
- the decoder parses the code stream to obtain the neural network loop filtering identification bit of the current coding tree unit or coding unit, and parses or calculates the above-mentioned BaseQPluma and BaseQPchroma candidate combination.
- the final candidate combination of BaseQPluma and BaseQPchroma is determined and input into the neural network model as the quantization parameter of the current coding tree unit or coding unit, and the reconstructed sample output from the neural network model is obtained as the output sample of the current filtering technology.
- the encoder traverses intra-frame or inter-frame prediction to obtain the prediction block of each coding unit.
- the residual of the coding unit can be obtained by taking the difference between the original image block and the prediction block.
- the residual is transformed through various transformation modes to obtain the frequency-domain residual coefficients.
- the distortion residual information (that is, the reconstruction residual value described in the previous embodiment) is obtained after inverse transformation.
- the distortion residual information and the prediction block are superimposed to obtain the reconstruction block.
- the loop filtering module filters the image using the coding tree unit as the basic unit; the technical solution of the embodiment of the present application is applied here.
- a sequence-level flag bit, sps_nnlf_enable_flag, indicates whether the loop filtering technology based on the neural network model is allowed to be used. If the flag bit is true, the loop filtering technology based on the neural network model is allowed to be used; if the flag bit is false, it is not allowed to be used. This sequence-level allowed-use identification bit needs to be written into the code stream when the video sequence is encoded.
- Step 1: If the allowed-use flag of the loop filtering based on the neural network model is true, the encoding end tries the loop filtering technology based on the neural network model, that is, performs Step 2; if the allowed-use flag is false, the encoding end does not try the loop filtering technology based on the neural network model, that is, skips Step 2 and directly performs Step 3;
- Step 2 Initialize the neural network-based loop filtering technology and load the neural network model suitable for the current frame.
- the encoding end first calculates the cost information without using the loop filtering technology based on the neural network model, that is, it uses the reconstructed samples of the coding tree unit prepared as the input of the neural network model and the original image samples of the coding tree unit to calculate the rate-distortion cost value, recorded as costOrg;
- the encoding end then tries the loop filtering technology based on the neural network model: it traverses the two brightness quantization parameter candidates and the two chroma quantization parameter candidates, inputs the reconstructed YUV samples and the quantization parameters of the current coding tree unit into the loaded neural network model for inference, and the neural network model outputs the filtered reconstructed sample block of the current coding tree unit.
- the rate-distortion cost values between the reconstructed samples of the coding tree unit after loop filtering based on the neural network model under the various quantization parameter combinations and the original image samples of the coding tree unit are calculated, recorded as costFrame1, costFrame2, costFrame3 and costFrame4 respectively; the minimum-cost combination is selected as the optimal output of the second round, its cost value is marked as costFrameBest, and the corresponding brightness quantization parameter and chroma quantization parameter are recorded;
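As an illustrative, non-normative sketch, the frame-level candidate traversal described above can be expressed as follows; the candidate values and the rate-distortion cost function are hypothetical placeholders, not the codec's actual implementation.

```python
# Hypothetical sketch: pick the best (BaseQPluma, BaseQPchroma) combination at
# frame level, falling back to the unfiltered cost when no candidate improves it.
from itertools import product

def select_frame_best(cost_org, luma_candidates, chroma_candidates, rd_cost):
    """Return (costFrameBest, best BaseQPluma, best BaseQPchroma)."""
    best = (cost_org, None, None)  # start from the cost without filtering
    for qp_l, qp_c in product(luma_candidates, chroma_candidates):
        cost = rd_cost(qp_l, qp_c)  # stands in for costFrame1..costFrame4
        if cost < best[0]:
            best = (cost, qp_l, qp_c)
    return best

# toy cost model: pretend the combination (27, 39) is the best one
costs = {(22, 34): 110.0, (22, 39): 105.0, (27, 34): 102.0, (27, 39): 98.5}
best = select_frame_best(120.0, [22, 27], [34, 39], lambda l, c: costs[(l, c)])
print(best)  # (98.5, 27, 39)
```

The same traversal shape applies per coding tree unit; only the cost inputs differ.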
- the encoding end attempts to optimize the selection at the coding tree unit level.
- the second round of encoding-end attempts at loop filtering based on the neural network model directly defaults to all coding tree units in the current frame using this technology, controlled by one frame-level switch flag bit each for the brightness color component and the chroma color component, so the coding tree unit level does not need to transmit a use flag bit.
- this round attempts the use-identification-bit combination at the coding tree unit level, and each color component can be controlled independently.
- the encoder traverses the coding tree units and calculates the rate-distortion cost between the reconstructed samples without loop filtering based on the neural network model and the original samples of the current coding tree unit, recorded as costCTUorg; it also calculates the rate-distortion costs between the reconstructed samples filtered by the neural-network-model-based loop filter under multiple combinations of BaseQPluma and BaseQPchroma and the original samples of the current coding tree unit, recorded as costCTUnn1, costCTUnn2, costCTUnn3 and costCTUnn4 respectively.
- if costCTUorg is the smallest cost for the brightness color component, set the coding-tree-unit-level use flag bit (ctb_nnlf_luma_flag) of the loop filtering based on the neural network model to false; otherwise, set ctb_nnlf_luma_flag to true and record the current BaseQPluma quantization parameter index.
- if costCTUorg is the smallest cost for the chroma color component, set the coding-tree-unit-level use flag bit (ctb_nnlf_chroma_flag) of the loop filtering based on the neural network model to false; otherwise, set ctb_nnlf_chroma_flag to true and record the current BaseQPchroma quantization parameter index.
- finally, the rate-distortion cost between the reconstructed samples of the current frame and the original image samples in this case is calculated, recorded as costCTUBest;
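The per-CTU, per-component flag decision described above can be sketched as follows; the cost values are illustrative placeholders.

```python
# Hypothetical sketch of the per-CTU decision for one color component:
# if no filtered candidate beats the unfiltered cost, the use flag is false.
def decide_ctu_flag(cost_ctu_org, filtered_costs):
    """filtered_costs: list of (qp_index, cost) pairs, e.g. costCTUnn1..costCTUnn4.
    Returns (use_flag, qp_index or None, best_cost)."""
    best_idx, best_cost = min(filtered_costs, key=lambda t: t[1])
    if cost_ctu_org <= best_cost:
        return False, None, cost_ctu_org
    return True, best_idx, best_cost

# brightness example: candidate index 2 beats the unfiltered cost
flag, idx, cost = decide_ctu_flag(50.0, [(0, 52.0), (1, 49.0), (2, 47.5), (3, 51.0)])
print(flag, idx, cost)  # True 2 47.5
```

Summing the per-CTU winners over the frame yields the costCTUBest of the text.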
- Step 3 The encoder continues to try other loop filtering tools, and after completion, outputs a complete reconstructed image.
- the specific process is not related to the technical solution of the embodiment of the present application, so it will not be elaborated here.
- the decoding end parses the sequence-level flag bit. If sps_nnlf_enable_flag is true, the current code stream is allowed to use the loop filtering technology based on the neural network model, and the subsequent decoding process needs to parse the relevant syntax elements; otherwise, the current code stream is not allowed to use the loop filtering technology based on the neural network model, the subsequent decoding process does not need to parse the relevant syntax elements, and the relevant syntax elements default to the initial value or to a false state.
- Step 1: The decoder parses the syntax elements of the current frame and obtains the frame-level switch identification bit and the frame-level use identification bit for the loop filtering based on the neural network model. If these frame-level identification bits are not all false, perform Step 2; otherwise, skip Step 2 and perform Step 3.
- Step 2: If the frame-level switch flag is true, all coding tree units under the current color component are filtered using the loop filtering technology based on the neural network model, that is, the coding-tree-unit-level use flags of all coding tree units under the current color component of the current frame are automatically set to true; otherwise, some coding tree units under the current color component use the loop filtering technology based on the neural network model while other coding tree units do not.
- in this case, the BaseQPluma value (ph_nnlf_luma_baseqp) and the BaseQPchroma value (ph_nnlf_chroma_baseqp) of the current frame are parsed and applied as input quantization parameter information to all coding tree units of the corresponding color components of the current frame.
- if ph_nnlf_luma_enable_flag/ph_nnlf_chroma_enable_flag are both false, the use flags of all coding tree units in the current frame are set to false; otherwise, the use identification bits of all coding tree units of the corresponding color components are parsed, along with the BaseQPluma (ctb_nnlf_luma_baseqp) or BaseQPchroma (ctb_nnlf_chroma_baseqp) of the current coding tree unit.
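The decoder-side flag resolution for one color component can be sketched as follows; the dictionary stands in for a real entropy-decoded bitstream, and the exact switch-flag name (here `ph_nnlf_luma_switch_flag`) is an assumption based on the text's description, not a syntax element quoted from the patent.

```python
# Hypothetical sketch of the decoder-side flag logic for the brightness component.
def resolve_ctu_flags(stream, num_ctus):
    if not stream["sps_nnlf_enable_flag"]:
        return [False] * num_ctus, [None] * num_ctus
    if stream["ph_nnlf_luma_switch_flag"]:
        # frame-level switch: every CTU uses the filter with the frame BaseQP
        return [True] * num_ctus, [stream["ph_nnlf_luma_baseqp"]] * num_ctus
    if not stream["ph_nnlf_luma_enable_flag"]:
        return [False] * num_ctus, [None] * num_ctus
    # otherwise, per-CTU use flags and per-CTU BaseQP values were signalled
    return stream["ctb_nnlf_luma_flag"], stream["ctb_nnlf_luma_baseqp"]

stream = {
    "sps_nnlf_enable_flag": True,
    "ph_nnlf_luma_switch_flag": False,
    "ph_nnlf_luma_enable_flag": True,
    "ctb_nnlf_luma_flag": [True, False, True],
    "ctb_nnlf_luma_baseqp": [27, None, 22],
}
flags, qps = resolve_ctu_flags(stream, 3)
print(flags, qps)  # [True, False, True] [27, None, 22]
```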
- the current coding tree unit is filtered using the loop filtering technology based on the neural network model, taking the reconstructed YUV samples of the current coding tree unit and the quantization parameter information (BaseQPluma and BaseQPchroma) as input; the neural network model performs inference and obtains the neural-network-loop-filtered reconstructed YUV samples of the current coding tree unit.
- the reconstructed sample is then selected as the output of the loop filtering technology based on the neural network model: if the coding-tree-unit use flag of the corresponding color component is true, the reconstructed sample of that color component after loop filtering based on the neural network model is used as the output; otherwise, the reconstructed sample that has not been filtered by the neural-network-model-based loop filter is used as the output of that color component.
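The per-component output selection just described reduces to a simple conditional; the sample labels below are placeholders.

```python
# Hypothetical sketch: select each color component's output depending on its
# CTU-level use flag (filtered output if true, unfiltered reconstruction otherwise).
def select_output(rec, filtered, use_flags):
    """rec / filtered: dicts keyed by component name; use_flags likewise."""
    return {c: (filtered[c] if use_flags[c] else rec[c]) for c in rec}

out = select_output(
    rec={"Y": "rec_Y", "U": "rec_U", "V": "rec_V"},
    filtered={"Y": "nn_Y", "U": "nn_U", "V": "nn_V"},
    use_flags={"Y": True, "U": False, "V": False},
)
print(out)  # {'Y': 'nn_Y', 'U': 'rec_U', 'V': 'rec_V'}
```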
- after traversing all coding tree units of the current frame, the loop filtering module based on the neural network model ends.
- Step 3 The decoder continues to traverse other loop filtering tools, and after completion, outputs a complete reconstructed image.
- the specific process is not related to the technical solution of the embodiment of the present application, so it will not be elaborated here.
- the residual scaling part is not described in detail in the above embodiments, but this does not mean that the residual scaling technology cannot be used in the embodiments of the present application.
- the residual scaling technology is applied to the output of the neural network model; specifically, it can scale the residual obtained as the difference between the reconstructed sample output by the neural network and the original reconstructed sample, which will not be elaborated here.
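A minimal sketch of the residual scaling just mentioned, assuming a simple per-sample linear scale (the sample values are illustrative):

```python
# Hypothetical sketch: scale the residual between the network output and the
# unfiltered reconstruction, then add it back to the reconstruction.
def apply_residual_scaling(rec, nn_out, scale):
    """rec, nn_out: lists of sample values; scale: residual scaling factor."""
    return [r + scale * (o - r) for r, o in zip(rec, nn_out)]

rec = [100, 120, 130]
nn_out = [104, 116, 134]
print(apply_residual_scaling(rec, nn_out, 0.5))  # [102.0, 118.0, 132.0]
```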
- the embodiment of the present application provides a code stream, which is generated by bit encoding according to the information to be encoded; the information to be encoded may include at least one of the following: the first syntax element identification information of the component to be filtered of the current frame, the second syntax element identification information of the component to be filtered of the current block, the third syntax element identification information of the component to be filtered of the current frame, the residual scaling factor, and the initial residual value of the component to be filtered of at least one divided block included in the current frame; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block.
- in short, the embodiments of the present application propose a new neural network loop filtering model that uses multi-channel quantization parameters as input to improve coding performance, and introduce new syntax elements. In this way, while maintaining only one model or a small number of models, the channels of important input elements are increased, so that the brightness color component and the chrominance color component have more choices and adaptations.
- the decoding side can thus achieve a more flexible configuration without storing multiple neural network models, which is beneficial to improving encoding performance; at the same time, this technical solution also removes less important input elements such as the prediction information YUV and the division information YUV, reducing the calculation amount of network model inference, which is beneficial to the implementation of the decoding end and reduces decoding time.
- FIG. 11 shows a schematic structural diagram of an encoder provided by an embodiment of the present application.
- the encoder 100 may include: a first determination unit 1101 and a first filtering unit 1102; wherein,
- the first determining unit 1101 is configured to determine the first syntax element identification information of the component to be filtered of the current frame, where the first syntax element identification information indicates that there is a component to be filtered of a divided block in the current frame that allows filtering using the preset network model;
- when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the block quantization parameter information of the current block is determined; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- the first determination unit 1101 is also configured to determine the reconstruction value of the component to be filtered of the current block;
- the first filtering unit 1102 is configured to input the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the filtered reconstruction value of the component to be filtered of the current block.
- the first determining unit 1101 is further configured to determine the third syntax element identification information of the component to be filtered of the current frame; and when the third syntax element identification information indicates that the components to be filtered of the at least one divided block included in the current frame are not all filtered using the preset network model, determine the first syntax element identification information of the component to be filtered of the current frame.
- the first determination unit 1101 is further configured to: determine a first rate-distortion cost value for the case where none of the components to be filtered of the at least one divided block included in the current frame is filtered using the preset network model; determine a second rate-distortion cost value for the case where all of the components to be filtered of the at least one divided block included in the current frame are filtered using the preset network model; determine a third rate-distortion cost value for the case where some components to be filtered of divided blocks in the current frame are filtered using the preset network model; and determine the frame-level syntax element identification information according to the first, second and third rate-distortion cost values; wherein the frame-level syntax element identification information includes the first syntax element identification information and the third syntax element identification information.
- the encoder 100 may further include a first setting unit 1103 and an encoding unit 1104; wherein,
- the first setting unit 1103 is configured to set the value of the third syntax element identification information to the first value if the second rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value; and to set the value of the third syntax element identification information to the second value if the first rate-distortion cost value or the third rate-distortion cost value is the smallest;
- the encoding unit 1104 is configured to encode the value of the third syntax element identification information, and write the resulting encoded bits into the code stream.
- the first setting unit 1103 is further configured to set the value of the first syntax element identification information to the first value if the third rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, and to set the value of the first syntax element identification information to the second value if the first rate-distortion cost value is the smallest; the encoding unit 1104 is also configured to encode the value of the first syntax element identification information and write the resulting encoded bits into the code stream.
- the first determining unit 1101 is further configured to determine the original value of the component to be filtered of at least one divided block included in the current frame, and determine the reconstructed value of the component to be filtered of at least one divided block included in the current frame; and performing rate distortion cost calculation based on the original value of the component to be filtered of at least one divided block included in the current frame and the reconstructed value of the component to be filtered of at least one divided block included in the current frame to obtain a first rate distortion cost value.
- the first determination unit 1101 is further configured to: determine the original image of the component to be filtered of the current frame; divide the original image to obtain the original value of the component to be filtered of at least one divided block; perform intra-frame or inter-frame prediction to determine the predicted value of the component to be filtered of the at least one divided block; obtain the initial residual value of the component to be filtered of the at least one divided block based on the original value and the predicted value of the component to be filtered of the at least one divided block; transform and quantize the initial residual value of the component to be filtered of the at least one divided block to obtain the target residual value of the component to be filtered of the at least one divided block; perform inverse quantization and inverse transformation on the target residual value of the component to be filtered of the at least one divided block to obtain the reconstructed residual value of the component to be filtered of the at least one divided block; and determine the reconstruction value of the component to be filtered of the at least one divided block based on the predicted value and the reconstructed residual value of the component to be filtered of the at least one divided block.
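The predict/transform-quantize/inverse/reconstruct pipeline above can be illustrated with a deliberately simplified scalar quantizer; real codecs use block transforms and far more elaborate quantization, so this is only a sketch of the data flow.

```python
# Simplified scalar stand-in for the transform/quantize/inverse pipeline.
def encode_reconstruct(orig, pred, step):
    init_res = [o - p for o, p in zip(orig, pred)]   # initial residual value
    target = [round(r / step) for r in init_res]     # "transform + quantize"
    rec_res = [t * step for t in target]             # inverse quantize / transform
    rec = [p + r for p, r in zip(pred, rec_res)]     # reconstruction value
    return target, rec

target, rec = encode_reconstruct(orig=[100, 103, 98], pred=[96, 100, 100], step=4)
print(target, rec)  # [1, 1, 0] [100, 104, 100]
```

The `target` list corresponds to the target residual values that are entropy-coded into the stream; `rec` is what both encoder and decoder see.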
- the encoding unit 1104 is further configured to encode the target residual value of the component to be filtered in at least one divided block, and write the resulting encoded bits into the code stream.
- the first determination unit 1101 is further configured to: determine at least two quantization parameter combinations, wherein each quantization parameter combination at least includes a candidate quantization parameter value of the first color component and a candidate quantization parameter value of the second color component; under each quantization parameter combination, filter, based on the preset network model, the reconstruction value of the component to be filtered of the at least one divided block included in the current frame to obtain the filtered reconstruction value of the component to be filtered; calculate the rate-distortion cost based on the original value of the component to be filtered of the at least one divided block included in the current frame and the filtered reconstruction value of the component to be filtered of the at least one divided block included in the current frame to obtain the fourth rate-distortion cost value under each quantization parameter combination; and select the minimum rate-distortion cost value from the obtained fourth rate-distortion cost values, determining the second rate-distortion cost value based on the minimum rate-distortion cost value.
- the first determination unit 1101 is further configured to combine the quantization parameters corresponding to the minimum rate distortion cost value as the frame quantization parameter information of the current frame;
- the encoding unit 1104 is also configured to, when the second rate-distortion cost value is the smallest among the second rate-distortion cost value and the third rate-distortion cost value, continue, after encoding the value of the third syntax element identification information, to encode the frame quantization parameter information of the current frame, and write the resulting encoded bits into the code stream.
- the first determination unit 1101 is further configured to determine a first quantization parameter candidate set and a second quantization parameter candidate set, and traverse the first quantization parameter candidate set and the second quantization parameter candidate set to determine the at least two quantization parameter combinations; wherein the first quantization parameter candidate set is composed of at least two candidate quantization parameter values of the first color component, and the second quantization parameter candidate set is composed of at least two candidate quantization parameter values of the second color component.
- the first determination unit 1101 is further configured to determine the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component according to the frame quantization parameter information of the current frame; determine the third quantization parameter index according to the first quantization parameter candidate set and the frame quantization parameter value of the first color component, wherein the third quantization parameter index is used to indicate the index number of the frame quantization parameter value of the first color component in the first quantization parameter candidate set; and determine the fourth quantization parameter index according to the second quantization parameter candidate set and the frame quantization parameter value of the second color component, wherein the fourth quantization parameter index is used to indicate the index number of the frame quantization parameter value of the second color component in the second quantization parameter candidate set.
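A quantization parameter index is simply the position of the chosen value inside its candidate set; the candidate values below are illustrative, not the patent's actual sets.

```python
# Hypothetical sketch: map a chosen quantization parameter value to its index
# number within its candidate set, as signalled in the code stream.
def qp_index(candidate_set, qp_value):
    return candidate_set.index(qp_value)

luma_candidates = [22, 27, 32, 37]  # illustrative first candidate set
print(qp_index(luma_candidates, 32))  # 2
```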
- the encoding unit 1104 is also configured to encode the third quantization parameter index and the fourth quantization parameter index, and write the resulting encoded bits into the code stream.
- the first determining unit 1101 is further configured to: determine, for the current block in the current frame, the original value of the component to be filtered of the current block and the reconstruction value of the component to be filtered of the current block; under each of the at least two quantization parameter combinations, filter the reconstruction value of the component to be filtered of the current block based on the preset network model to obtain at least two filtered reconstruction values of the component to be filtered of the current block; calculate the rate-distortion cost based on the original value and the reconstruction value of the component to be filtered of the current block to obtain a fifth rate-distortion cost value; calculate the rate-distortion cost based on the original value of the component to be filtered of the current block and each of the at least two filtered reconstruction values of the component to be filtered of the current block to obtain at least two sixth rate-distortion cost values; and determine the second syntax element identification information of the component to be filtered of the current block based on the fifth rate-distortion cost value and the at least two sixth rate-distortion cost values.
- the first determining unit 1101 is further configured to select the minimum rate-distortion cost value from the fifth rate-distortion cost value and the at least two sixth rate-distortion cost values; if the minimum rate-distortion cost value is one of the sixth rate-distortion cost values, the value of the second syntax element identification information is set to the first value; if the minimum rate-distortion cost value is the fifth rate-distortion cost value, the value of the second syntax element identification information is set to the second value; the encoding unit 1104 is also configured to encode the value of the second syntax element identification information and write the resulting encoded bits into the code stream.
- the first determining unit 1101 is further configured to, when the minimum rate-distortion cost value is one of the sixth rate-distortion cost values, use the quantization parameter combination corresponding to the minimum rate-distortion cost value as the block quantization parameter information of the current block.
- the encoding unit 1104 is further configured to, after encoding the value of the second syntax element identification information, continue to encode the block quantization parameter information of the current block, and write the resulting encoded bits into the code stream.
- the first determination unit 1101 is further configured to determine the block quantization parameter value of the first color component and the block quantization parameter value of the second color component according to the block quantization parameter information of the current block; determine the first quantization parameter index according to the first quantization parameter candidate set and the block quantization parameter value of the first color component, wherein the first quantization parameter index is used to indicate the index number of the block quantization parameter value of the first color component in the first quantization parameter candidate set; and determine the second quantization parameter index according to the second quantization parameter candidate set and the block quantization parameter value of the second color component, wherein the second quantization parameter index is used to indicate the index number of the block quantization parameter value of the second color component in the second quantization parameter candidate set.
- the encoding unit 1104 is also configured to encode the first quantization parameter index and the second quantization parameter index, and write the resulting encoded bits into the code stream.
- the components to be filtered include at least a brightness color component and a chroma color component; accordingly, the first determination unit 1101 is also configured to, when the color component type of the current frame is a brightness color component, determine that the third syntax element identification information is the frame-level brightness switch identification information of the current frame, the first syntax element identification information is the frame-level brightness enable identification information of the current frame, and the second syntax element identification information is the block-level brightness usage identification information of the current block;
- the frame-level brightness switch identification information is used to indicate whether the brightness color components of the at least one divided block included in the current frame are all filtered using the preset network model; the frame-level brightness enable identification information is used to indicate whether there is a brightness color component of a divided block in the current frame that allows filtering using the preset network model; and the block-level brightness usage identification information is used to indicate whether the brightness color component of the current block is filtered using the preset network model;
- when the color component type of the current frame is a chroma color component, determine that the third syntax element identification information is the frame-level chroma switch identification information of the current frame, the first syntax element identification information is the frame-level chroma enable identification information of the current frame, and the second syntax element identification information is the block-level chroma usage identification information of the current block;
- the frame-level chroma switch identification information is used to indicate whether the chroma color components of the at least one divided block included in the current frame are all filtered using the preset network model; the frame-level chroma enable identification information is used to indicate whether there is a chroma color component of a divided block in the current frame that allows filtering using the preset network model; and the block-level chroma usage identification information is used to indicate whether the chroma color component of the current block is filtered using the preset network model.
- the first determining unit 1101 is further configured to determine the fourth syntax element identification information; and when the fourth syntax element identification information indicates that the component to be filtered of the current sequence allows filtering using the preset network model, perform the determination The step of identifying information of a third syntax element of the component to be filtered of the current frame; wherein the current sequence includes the current frame.
- the first determining unit 1101 is also configured to determine whether the component to be filtered in the current sequence allows filtering using the preset network model; if the component to be filtered in the current sequence allows filtering using the preset network model, set the value of the fourth syntax element identification information to the first value; if the component to be filtered in the current sequence does not allow filtering using the preset network model, set the value of the fourth syntax element identification information to the second value; the encoding unit 1104 is further configured to encode the value of the fourth syntax element identification information and write the resulting encoded bits into the code stream.
- the preset network model is a neural network model
- the neural network model at least includes: a convolution layer, an activation layer, a splicing layer and a skip connection layer.
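The layer types named above (convolution, activation, splicing/concatenation, skip connection) can be illustrated with a minimal numpy sketch; the layer sizes, 1x1 convolutions, weights, and QP normalization below are illustrative assumptions, not the patent's actual network architecture.

```python
# Minimal numpy sketch of the named layer types; weights are random placeholders.
import numpy as np

def prelu(x, alpha=0.1):
    return np.where(x > 0, x, alpha * x)  # activation layer

def nn_filter(rec_block, qp_luma, qp_chroma, w1, w2):
    h, w = rec_block.shape
    qp_planes = [np.full((h, w), qp_luma / 63.0), np.full((h, w), qp_chroma / 63.0)]
    x = np.stack([rec_block] + qp_planes, axis=0)       # splicing (concatenation) layer
    feat = prelu(np.tensordot(w1, x, axes=([1], [0])))  # 1x1 convolution + activation
    res = np.tensordot(w2, feat, axes=([1], [0]))[0]    # 1x1 convolution to a residual
    return rec_block + res                              # skip connection

rng = np.random.default_rng(0)
rec = rng.random((4, 4))
out = nn_filter(rec, 27, 34, rng.normal(size=(8, 3)) * 0.1, rng.normal(size=(1, 8)) * 0.1)
print(out.shape)  # (4, 4)
```

Note how the quantization parameters enter as extra input channels, which is the multi-BaseQP idea of this document.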
- the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block; the first filtering unit 1102 is also configured to determine that the output of the preset network model is the filtered reconstruction value of the component to be filtered of the current block.
- alternatively, the input of the preset network model is the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block; the first filtering unit 1102 is also configured to determine that the output of the preset network model is the first residual value of the component to be filtered of the current block, and to determine the filtered reconstruction value of the component to be filtered of the current block based on the reconstruction value of the component to be filtered of the current block and the first residual value of the component to be filtered of the current block.
- the first determination unit 1101 is further configured to determine a residual scaling factor; perform scaling processing on the first residual value of the component to be filtered of the current block according to the residual scaling factor to obtain the second residual value of the component to be filtered of the current block; and determine the filtered reconstruction value of the component to be filtered of the current block based on the reconstruction value of the component to be filtered of the current block and the second residual value of the component to be filtered of the current block.
- the encoding unit 1104 is also configured to encode the residual scaling factor and write the resulting encoded bits into the code stream.
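One common way an encoder could determine the residual scaling factor is a least-squares fit between the network residual and the true residual; this derivation is an assumption for illustration, since the patent does not mandate any particular method.

```python
# Hypothetical sketch: least-squares fit of a residual scaling factor.
def fit_scale(orig, rec, nn_out):
    true_res = [o - r for o, r in zip(orig, rec)]   # residual we want to approximate
    nn_res = [n - r for n, r in zip(nn_out, rec)]   # residual the network produced
    denom = sum(x * x for x in nn_res)
    return sum(t * n for t, n in zip(true_res, nn_res)) / denom if denom else 0.0

scale = fit_scale(orig=[100, 118, 131], rec=[100, 120, 130], nn_out=[104, 116, 134])
print(round(scale, 3))  # 0.25
```

The fitted factor would then be encoded into the code stream as described above.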
- the "unit" may be part of a circuit, part of a processor, part of a program or software, etc., and of course may also be a module, or may be non-modular.
- each component in this embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
- the above integrated units can be implemented in the form of hardware or software function modules.
- if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage media include: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media that can store program code.
- embodiments of the present application provide a computer-readable storage medium for use in the encoder 100.
- the computer-readable storage medium stores a computer program; when the computer program is executed by the first processor, the method described in any one of the foregoing embodiments is implemented.
- the encoder 100 may include: a first communication interface 1201 , a first memory 1202 and a first processor 1203 ; the various components are coupled together through a first bus system 1204 .
- the first bus system 1204 is used to implement connection communication between these components.
- the first bus system 1204 also includes a power bus, a control bus and a status signal bus.
- however, for the sake of clarity, the various buses are labeled as the first bus system 1204 in FIG. 12. Among these components:
- the first communication interface 1201 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- the first memory 1202 is used to store a computer program capable of running on the first processor 1203;
- the first processor 1203 is configured to perform the following when running the computer program:
- determine the first syntax element identification information of the component to be filtered of the current frame; when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
- when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, determine the block quantization parameter information of the current block; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- determine the reconstructed value of the component to be filtered of the current block, input the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the filtered reconstruction value of the component to be filtered of the current block.
- the first memory 1202 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
- the non-volatile memory can be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory.
- Volatile memory may be Random Access Memory (RAM), which is used as an external cache.
- by way of illustration and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM), and direct rambus random access memory (Direct Rambus RAM, DR RAM).
- the first memory 1202 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
- the first processor 1203 may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method can be completed by an integrated logic circuit of hardware in the first processor 1203 or by instructions in the form of software.
- the above-mentioned first processor 1203 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
- the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
- the storage medium is located in the first memory 1202.
- the first processor 1203 reads the information in the first memory 1202 and completes the steps of the above method in combination with its hardware.
- the embodiments described in this application can be implemented using hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units used to perform the functions described in this application, or a combination thereof.
- the technology described in this application can be implemented through modules (such as procedures, functions, etc.) that perform the functions described in this application.
- Software code may be stored in memory and executed by a processor.
- the memory can be implemented in the processor or external to the processor.
- the first processor 1203 is further configured to perform the method described in any one of the preceding embodiments when running the computer program.
- This embodiment provides an encoder that can apply a loop filtering technology in which multiple quantization parameters are input to a preset network model. For the input of the preset network model, since it only includes the reconstructed value of the component to be filtered and the block quantization parameter information, non-essential input elements such as the prediction information and partition information of the color components are removed, which can reduce the amount of calculation during network model inference, facilitate the implementation of the decoder, and reduce the decoding time. In addition, since the input block quantization parameter information includes the block quantization parameter values of at least two color components, that is, multi-channel quantization parameters are used as input, the luma color component and the chroma color components have more choices and adaptations. Moreover, by introducing new syntax elements, the decoder can achieve a more flexible configuration without storing multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- the decoder 200 may include: a decoding unit 1301, a second determination unit 1302 and a second filtering unit 1303; wherein,
- the decoding unit 1301 is configured to parse the code stream and determine the first syntax element identification information of the component to be filtered of the current frame; and, when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, parse the code stream to determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
- the second determination unit 1302 is configured to determine the block quantization parameter information of the current block when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- the second determination unit 1302 is also configured to determine the reconstruction value of the component to be filtered of the current block
- the second filtering unit 1303 is configured to input the reconstruction value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the filtered reconstruction value of the component to be filtered of the current block.
- the decoding unit 1301 is also configured to parse the code stream and determine the first quantization parameter index and the second quantization parameter index of the current block;
- the second determination unit 1302 is further configured to determine the block quantization parameter value of the first color component corresponding to the current block from the first quantization parameter candidate set according to the first quantization parameter index; and determine the block quantization parameter value of the second color component corresponding to the current block from the second quantization parameter candidate set according to the second quantization parameter index; wherein the first quantization parameter candidate set is composed of at least two candidate quantization parameter values of the first color component, and the second quantization parameter candidate set is composed of at least two candidate quantization parameter values of the second color component.
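The index-to-value mapping above can be sketched as a table lookup; the candidate values below are hypothetical, chosen only to illustrate that each candidate set holds at least two quantization parameter values per color component:

```python
# Hypothetical candidate sets; real values come from the encoder configuration.
FIRST_QP_CANDIDATES = [27, 32, 37, 42]   # first color component (e.g., luma)
SECOND_QP_CANDIDATES = [30, 35, 40, 45]  # second color component (e.g., chroma)

def select_block_qps(first_qp_index, second_qp_index):
    """Map the two parsed quantization parameter indices to block QP values."""
    return FIRST_QP_CANDIDATES[first_qp_index], SECOND_QP_CANDIDATES[second_qp_index]

print(select_block_qps(1, 2))  # → (32, 40)
```

Signalling a small index instead of a full QP value keeps the per-block overhead low while still letting each color component pick among several quantization parameters.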
- the decoding unit 1301 is further configured to parse the code stream and determine the block quantization parameter value of the first color component and the block quantization parameter value of the second color component corresponding to the current block.
- the decoding unit 1301 is also configured to parse the code stream and determine the reconstruction residual value of the component to be filtered of the current block;
- the second determination unit 1302 is also configured to perform intra-frame or inter-frame prediction on the component to be filtered of the current block to determine the predicted value of the component to be filtered of the current block; and determine the reconstructed value of the component to be filtered of the current block based on the reconstruction residual value of the component to be filtered of the current block and the predicted value of the component to be filtered of the current block.
- the second determination unit 1302 is further configured to perform an addition calculation on the reconstruction residual value of the component to be filtered of the current block and the predicted value of the component to be filtered of the current block to obtain the reconstructed value of the component to be filtered of the current block.
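The addition calculation in the unit above amounts to an element-wise sum of residual and prediction; the clip to the sample range is a common codec convention assumed here, not quoted from this application:

```python
def reconstruct_samples(residual, prediction, bit_depth=8):
    """Add the reconstruction residual to the prediction, clipped to the sample range."""
    hi = (1 << bit_depth) - 1
    return [min(hi, max(0, r + p)) for r, p in zip(residual, prediction)]

print(reconstruct_samples([-5, 10, 300], [4, 250, 10]))  # → [0, 255, 255]
```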
- the decoding unit 1301 is also configured to parse the code stream and obtain the value of the second syntax element identification information
- the second determining unit 1302 is also configured to: if the value of the second syntax element identification information is the first value, determine that the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model; if the value of the second syntax element identification information is the second value, determine that the second syntax element identification information indicates that the component to be filtered of the current block is not filtered using the preset network model.
- the second determination unit 1302 is also configured to, when the second syntax element identification information indicates that the component to be filtered of the current block is not filtered using the preset network model, directly determine the reconstructed value of the component to be filtered of the current block as the filtered reconstruction value of the component to be filtered of the current block.
- the decoding unit 1301 is also configured to parse the code stream and obtain the value of the first syntax element identification information
- the second determination unit 1302 is further configured to: if the value of the first syntax element identification information is the first value, determine that the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using the preset network model; if the value of the first syntax element identification information is the second value, determine that the first syntax element identification information indicates that all components to be filtered of at least one divided block included in the current frame are not allowed to be filtered using the preset network model.
- the decoder 200 may further include a second setting unit 1304, configured to, when the first syntax element identification information indicates that all components to be filtered of at least one divided block included in the current frame are not allowed to be filtered using the preset network model, set the value of the second syntax element identification information of the component to be filtered of the divided block to the second value; and, after determining the reconstructed value of the component to be filtered of the divided block, directly determine the reconstructed value of the component to be filtered of the divided block as the filtered reconstruction value of the component to be filtered of the divided block.
- the decoding unit 1301 is further configured to parse the code stream and determine the third syntax element identification information of the component to be filtered of the current frame; and, when the third syntax element identification information indicates that not all components to be filtered of at least one divided block included in the current frame are filtered using the preset network model, parse the code stream to determine the first syntax element identification information of the component to be filtered of the current frame.
- the decoding unit 1301 is also configured to parse the code stream and obtain the value of the third syntax element identification information
- the second determination unit 1302 is further configured to: if the value of the third syntax element identification information is the first value, determine that the third syntax element identification information indicates that all components to be filtered of at least one divided block included in the current frame are filtered using the preset network model; if the value of the third syntax element identification information is the second value, determine that the third syntax element identification information indicates that not all components to be filtered of at least one divided block included in the current frame are filtered using the preset network model.
- the decoding unit 1301 is also configured to, when the third syntax element identification information indicates that all components to be filtered of at least one divided block included in the current frame are filtered using the preset network model, parse the code stream and determine the frame quantization parameter information of the current frame; wherein the frame quantization parameter information at least includes the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component;
- the second setting unit 1304 is also configured to set the value of the first syntax element identification information of the component to be filtered of the current frame to the first value, set the values of the second syntax element identification information of the components to be filtered of the divided blocks in the current frame all to the first value, and determine the block quantization parameter information of the divided blocks according to the frame quantization parameter information of the current frame;
- the second filtering unit 1303 is also configured to, after determining the reconstruction value of the component to be filtered of the divided block, input the reconstructed value of the component to be filtered of the divided block and the block quantization parameter information of the divided block into the preset network model, and determine the divided block The filtered reconstruction value of the component to be filtered.
- the decoding unit 1301 is also configured to parse the code stream and determine the third quantization parameter index and the fourth quantization parameter index of the current frame;
- the second determination unit 1302 is further configured to determine the frame quantization parameter value of the first color component corresponding to the current frame from the first quantization parameter candidate set according to the third quantization parameter index; and determine the frame quantization parameter value of the second color component corresponding to the current frame from the second quantization parameter candidate set according to the fourth quantization parameter index; wherein the first quantization parameter candidate set is composed of at least two candidate quantization parameter values of the first color component, and the second quantization parameter candidate set is composed of at least two candidate quantization parameter values of the second color component.
- the components to be filtered include at least a luma color component and a chroma color component; accordingly, the second determination unit 1302 is also configured to determine, when the color component type of the current frame is the luma color component, that the third syntax element identification information is the frame-level luma switch identification information of the current frame, the first syntax element identification information is the frame-level luma enable identification information of the current frame, and the second syntax element identification information is the block-level luma usage identification information of the current block; wherein the frame-level luma switch identification information is used to indicate whether the luma color components of at least one divided block included in the current frame are all filtered using the preset network model, the frame-level luma enable identification information is used to indicate whether there is a luma color component of a divided block in the current frame that is allowed to be filtered using the preset network model, and the block-level luma usage identification information is used to indicate whether the luma color component of the current block is filtered using the preset network model;
- when the color component type of the current frame is the chroma color component, the second determination unit 1302 is configured to determine that the third syntax element identification information is the frame-level chroma switch identification information of the current frame, the first syntax element identification information is the frame-level chroma enable identification information of the current frame, and the second syntax element identification information is the block-level chroma usage identification information of the current block; wherein the frame-level chroma switch identification information is used to indicate whether the chroma color components of at least one divided block included in the current frame are all filtered using the preset network model, the frame-level chroma enable identification information is used to indicate whether there is a chroma color component of a divided block in the current frame that is allowed to be filtered using the preset network model, and the block-level chroma usage identification information is used to indicate whether the chroma color component of the current block is filtered using the preset network model.
- the decoding unit 1301 is also configured to parse the code stream and determine the fourth syntax element identification information; and, when the fourth syntax element identification information indicates that the component to be filtered of the current sequence is allowed to be filtered using the preset network model, execute the step of parsing the code stream and determining the third syntax element identification information of the component to be filtered of the current frame; wherein the current sequence includes the current frame.
- the decoding unit 1301 is also configured to parse the code stream and obtain the value of the fourth syntax element identification information
- the second determining unit 1302 is further configured to: if the value of the fourth syntax element identification information is the first value, determine that the fourth syntax element identification information indicates that the component to be filtered of the current sequence is allowed to be filtered using the preset network model; if the value of the fourth syntax element identification information is the second value, determine that the fourth syntax element identification information indicates that the component to be filtered of the current sequence is not allowed to be filtered using the preset network model.
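Taken together, the fourth (sequence-level), third (frame-level switch), first (frame-level enable), and second (block-level usage) syntax elements form a hierarchy; a minimal sketch with hypothetical flag values (first value = 1, second value = 0):

```python
def model_filter_enabled(seq_flag, frame_switch_flag, frame_enable_flag, block_use_flag):
    """Hierarchical gating of the preset network model filter for one block."""
    if seq_flag == 0:           # fourth syntax element: disabled for the sequence
        return False
    if frame_switch_flag == 1:  # third syntax element: all blocks in the frame filtered
        return True
    if frame_enable_flag == 0:  # first syntax element: no block in the frame may use it
        return False
    return block_use_flag == 1  # second syntax element: per-block decision

print(model_filter_enabled(1, 0, 1, 1))  # → True
print(model_filter_enabled(1, 0, 0, 1))  # → False
```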
- the preset network model is a neural network model;
- the neural network model at least includes: a convolution layer, an activation layer, a concatenation layer and a skip connection layer.
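One common way to combine the two inputs, assumed here for illustration rather than taken from this application, is to expand each block quantization parameter value into a constant plane and stack it with the reconstructed samples as extra input channels:

```python
def build_model_input(recon_block, qp_first, qp_second, qp_max=63.0):
    """Stack the reconstructed block with two constant QP planes (3 channels).

    Normalizing by a hypothetical maximum QP (qp_max) and by 255 for samples
    is a common practice for neural-network loop filters, assumed here.
    """
    h, w = len(recon_block), len(recon_block[0])
    recon_plane = [[s / 255.0 for s in row] for row in recon_block]
    qp_first_plane = [[qp_first / qp_max] * w for _ in range(h)]
    qp_second_plane = [[qp_second / qp_max] * w for _ in range(h)]
    return [recon_plane, qp_first_plane, qp_second_plane]  # channels-first

x = build_model_input([[0, 255], [128, 64]], qp_first=32, qp_second=35)
print(len(x), len(x[0]), len(x[0][0]))  # → 3 2 2
```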
- the input of the preset network model is the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block; the second filtering unit 1303 is also configured to determine that the output of the preset network model is the filtered reconstruction value of the component to be filtered of the current block.
- alternatively, the input of the preset network model is the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block, and the second filtering unit 1303 is also configured to determine that the output of the preset network model is the first residual value of the component to be filtered of the current block, and to determine the filtered reconstruction value of the component to be filtered of the current block according to the reconstructed value of the component to be filtered of the current block and the first residual value of the component to be filtered of the current block.
- the decoding unit 1301 is also configured to parse the code stream and determine the residual scaling factor
- the second determination unit 1302 is further configured to scale the first residual value of the component to be filtered of the current block according to the residual scaling factor to obtain the second residual value of the component to be filtered of the current block; and to determine the filtered reconstruction value of the component to be filtered of the current block based on the reconstructed value of the component to be filtered of the current block and the second residual value of the component to be filtered of the current block.
- the second determination unit 1302 is also configured to traverse at least one divided block in the current frame, regard each divided block in turn as the current block, and repeatedly execute the steps of parsing the code stream and obtaining the value of the syntax element identification information of the component to be filtered of the current block, so as to obtain the filtered reconstruction value corresponding to the at least one divided block; and determine the reconstructed image of the current frame based on the filtered reconstruction value corresponding to the at least one divided block.
- the "unit" may be part of a circuit, part of a processor, part of a program or software, etc., and of course may also be a module, or may be non-modular.
- each component in this embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
- the above integrated units can be implemented in the form of hardware or software function modules.
- if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- this embodiment provides a computer-readable storage medium for use in the decoder 200.
- the computer-readable storage medium stores a computer program; when the computer program is executed by the second processor, the method described in any one of the foregoing embodiments is implemented.
- the decoder 200 may include: a second communication interface 1401, a second memory 1402, and a second processor 1403; the various components are coupled together through a second bus system 1404. It can be understood that the second bus system 1404 is used to implement connection communication between these components.
- the second bus system 1404 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, the various buses are labeled as the second bus system 1404 in FIG. 14. Among these components:
- the second communication interface 1401 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- the second memory 1402 is used to store a computer program capable of running on the second processor 1403;
- the second processor 1403 is configured to perform the following when running the computer program:
- parse the code stream to determine the first syntax element identification information of the component to be filtered of the current frame; when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, parse the code stream to determine the second syntax element identification information of the component to be filtered of the current block; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
- when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, determine the block quantization parameter information of the current block; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component;
- determine the reconstructed value of the component to be filtered of the current block, input the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determine the filtered reconstruction value of the component to be filtered of the current block.
- the second processor 1403 is further configured to perform the method described in any one of the preceding embodiments when running the computer program.
- This embodiment provides a decoder that can apply a loop filtering technology in which multiple quantization parameters are input to a preset network model. For the input of the preset network model, since it only includes the reconstructed value of the component to be filtered and the block quantization parameter information, non-essential input elements such as the prediction information and partition information of the color components are removed, which can reduce the amount of calculation during network model inference, facilitate the implementation of the decoder, and reduce the decoding time. In addition, since the input block quantization parameter information includes the block quantization parameter values of at least two color components, that is, multi-channel quantization parameters are used as input, the luma color component and the chroma color components have more choices and adaptations. Moreover, by introducing new syntax elements, the decoder can achieve a more flexible configuration without storing multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- FIG. 15 shows a schematic structural diagram of a coding and decoding system provided by an embodiment of the present application.
- the encoding and decoding system 150 may include an encoder 1501 and a decoder 1502.
- the encoder 1501 may be the encoder described in any of the preceding embodiments
- the decoder 1502 may be the decoder described in any of the preceding embodiments.
- both the encoder 1501 and the decoder 1502 can apply the loop filtering technology in which multiple quantization parameters are input to the preset network model; for the input of the preset network model, since it only includes the reconstructed value of the component to be filtered and the block quantization parameter information, the amount of calculation during network model inference can be reduced; and the decoder can achieve a more flexible configuration without storing multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
- whether at the encoding end or the decoding end, the first syntax element identification information of the component to be filtered of the current frame is first determined; when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, the second syntax element identification information of the component to be filtered of the current block is then determined; wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block; when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, the block quantization parameter information of the current block is determined; wherein the block quantization parameter information at least includes the block quantization parameter value of the first color component and the block quantization parameter value of the second color component; the reconstructed value of the component to be filtered of the current block is then determined, the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block are input into the preset network model, and finally the filtered reconstruction value of the component to be filtered of the current block is determined.
- in this way, since the input of the preset network model only includes the reconstructed value of the component to be filtered and the block quantization parameter information, non-essential input elements such as the prediction information and partition information of the color components are removed, which reduces the amount of calculation during network model inference, facilitates the implementation of the decoder, and reduces the decoding time; in addition, since the input block quantization parameter information includes at least the block quantization parameter values of two color components, that is, multi-channel quantization parameters are used as input, the luma color component and the chroma color components have more choices and adaptations; and by introducing new syntax elements, the decoder can achieve a more flexible configuration without storing multiple neural network models, which is beneficial to improving encoding performance and thereby improving encoding and decoding efficiency.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present application discloses an encoding/decoding method, a code stream, an encoder, a decoder, and a storage medium. The method includes: parsing a code stream to determine first syntax element identification information of a component to be filtered of a current frame; when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, parsing the code stream to determine second syntax element identification information of a component to be filtered of a current block; when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, determining block quantization parameter information of the current block, wherein the block quantization parameter information at least includes a block quantization parameter value of a first color component and a block quantization parameter value of a second color component; and determining a reconstructed value of the component to be filtered of the current block, inputting the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determining a filtered reconstructed value of the component to be filtered of the current block.
Description
Embodiments of the present application relate to the technical field of video encoding and decoding, and in particular to an encoding/decoding method, a code stream, an encoder, a decoder, and a storage medium.
In a video encoding and decoding system, a loop filter is used to improve the subjective and objective quality of reconstructed images. In the loop filtering part, there are currently some neural network solutions, for example, a multi-model intra-frame switchable solution and an intra-frame non-switchable solution. The former has a larger number of neural network models and can adjust the model according to local details; the latter, although it has only two neural network models, does not switch models within a frame: if the current frame is an I frame, only the neural network model corresponding to the I frame is used; if the current frame is a B frame, only the neural network model corresponding to the B frame is used.
However, when the multi-model intra-frame switchable solution is used for loop filtering, different quantization parameters and color components may correspond to different neural network models, resulting in high hardware implementation complexity and large overhead. Although the intra-frame non-switchable solution can reduce complexity and improve model generalization, it is affected by the quantization parameters, and its choices during loop filtering are not flexible enough, especially for the processing of color components; during loop filtering, there may also be a problem that the performance of the luma color component is good but the performance of the chroma color components is poor, so that a good encoding and decoding effect cannot be achieved.
Summary
Embodiments of the present application provide an encoding/decoding method, a code stream, an encoder, a decoder, and a storage medium, which can reduce the computational complexity during model inference and thereby improve encoding and decoding efficiency.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a decoding method, applied to a decoder, the method including:
parsing a code stream to determine first syntax element identification information of a component to be filtered of a current frame;
when the first syntax element identification information indicates that a component to be filtered of a divided block in the current frame is allowed to be filtered using a preset network model, parsing the code stream to determine second syntax element identification information of a component to be filtered of a current block, wherein the current frame includes at least one divided block, and the current block is any one of the at least one divided block;
when the second syntax element identification information indicates that the component to be filtered of the current block is filtered using the preset network model, determining block quantization parameter information of the current block, wherein the block quantization parameter information at least includes a block quantization parameter value of a first color component and a block quantization parameter value of a second color component;
determining a reconstructed value of the component to be filtered of the current block, inputting the reconstructed value of the component to be filtered of the current block and the block quantization parameter information of the current block into the preset network model, and determining a filtered reconstructed value of the component to be filtered of the current block.
第二方面,本申请实施例提供了一种编码方法,应用于编码器,该方法包括:
确定当前帧的待滤波分量的第一语法元素标识信息;
在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;
在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
第三方面,本申请实施例提供了一种码流,该码流是根据待编码信息进行比特编码生成的;其中,待编码信息包括下述至少一项:当前帧的待滤波分量的第一语法元素标识信息、当前块的待滤波分量的第二语法元素标识信息、当前帧的待滤波分量的第三语法元素标识信息、残差缩放因子和当前帧包括的至少一个划分块的待滤波分量的初始残差值;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个。
第四方面,本申请实施例提供了一种编码器,该编码器包括第一确定单元和第一滤波单元;其中,
第一确定单元,配置为确定当前帧的待滤波分量的第一语法元素标识信息;以及在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;以及在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
第一确定单元,还配置为确定当前块的待滤波分量的重建值,
第一滤波单元,配置为将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
第五方面,本申请实施例提供了一种编码器,该编码器包括第一存储器和第一处理器;其中,
第一存储器,用于存储能够在第一处理器上运行的计算机程序;
第一处理器,用于在运行所述计算机程序时,执行如第二方面所述的方法。
第六方面，本申请实施例提供了一种解码器，该解码器包括解码单元、第二确定单元和第二滤波单元；其中，
解码单元，配置为解析码流，确定当前帧的待滤波分量的第一语法元素标识信息；以及在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时，解析码流，确定当前块的待滤波分量的第二语法元素标识信息；其中，当前帧包括至少一个划分块，且当前块为至少一个划分块中的任意一个；
第二确定单元，配置为在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时，确定当前块的块量化参数信息；其中，块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值；
第二确定单元，还配置为确定当前块的待滤波分量的重建值；
第二滤波单元，配置为将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型，确定当前块的待滤波分量的滤波后重建值。
第七方面，本申请实施例提供了一种解码器，该解码器包括第二存储器和第二处理器；其中，
第二存储器，用于存储能够在第二处理器上运行的计算机程序；
第二处理器，用于在运行所述计算机程序时，执行如第一方面所述的方法。
第八方面，本申请实施例提供了一种计算机可读存储介质，该计算机可读存储介质存储有计算机程序，所述计算机程序被执行时实现如第一方面所述的方法、或者如第二方面所述的方法。
本申请实施例提供了一种编解码方法、码流、编码器、解码器以及存储介质,无论是在编码端还是解码端,首先确定当前帧的待滤波分量的第一语法元素标识信息;在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,然后确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;再确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,最后可以确定出当前块的待滤波分量的滤波后重建值。这样,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
图1为一种混合编码框架的应用示意图;
图2为一种神经网络模型的网络架构示意图;
图3为一种残差块的组成结构示意图;
图4为另一种神经网络模型的网络架构示意图;
图5A为本申请实施例提供的一种编码器的组成框图示意图;
图5B为本申请实施例提供的一种解码器的组成框图示意图;
图6为本申请实施例提供的一种解码方法的流程示意图;
图7为本申请实施例提供的一种神经网络模型的网络架构示意图;
图8为本申请实施例提供的另一种解码方法的流程示意图;
图9为本申请实施例提供的又一种解码方法的流程示意图;
图10为本申请实施例提供的一种编码方法的流程示意图;
图11为本申请实施例提供的一种编码器的组成结构示意图;
图12为本申请实施例提供的一种编码器的具体硬件结构示意图;
图13为本申请实施例提供的一种解码器的组成结构示意图;
图14为本申请实施例提供的一种解码器的具体硬件结构示意图;
图15为本申请实施例提供的一种编解码系统的组成结构示意图。
为了能够更加详尽地了解本申请实施例的特点与技术内容,下面结合附图对本申请实施例的实现进行详细阐述,所附附图仅供参考说明之用,并非用来限定本申请实施例。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。还需要指出,本申请实施例所涉及的术语“第一\第二\第三”仅是用于区别类似的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
可以理解,在视频图像中,一般采用第一颜色分量、第二颜色分量和第三颜色分量来表征编码块(Coding Block,CB)。其中,这三个颜色分量分别为一个亮度颜色分量和两个色度颜色分量(蓝色色度颜色分量和红色色度颜色分量),具体地,亮度颜色分量通常使用符号Y表示,蓝色色度颜色分量通常使用符号Cb或者U表示,红色色度颜色分量通常使用符号Cr或者V表示;这样,视频图像可以用YCbCr格式表示,也可以用YUV格式表示,甚至还可以用RGB格式表示,但是并不作任何限定。
还可以理解,视频压缩技术主要是将庞大的数字影像视频数据进行压缩,以便于传输以及存储等。随着互联网视频的激增以及人们对视频清晰度的要求越来越高,尽管已有的数字视频压缩标准能够节省不少视频数据,但目前仍然需要追求更好的数字视频压缩技术,以减少数字视频传输的带宽和流量压力。在视频编码过程中,编码器对不同颜色格式的原始视频序列读取不相等的像素,包含亮度颜色分量和色度颜色分量,即编码器读取一幅黑白或者彩色图像。之后进行块划分,将块数据交由编码器进行编码。
目前,通用的视频编解码标准都采用基于块的混合编码框架,例如H.266/多功能视频编码(Versatile Video Coding,VVC)。视频中的每一帧被分割成相同大小(如128×128,64×64等)的正方形的最大编码单元(Largest Coding Unit,LCU)。每个最大编码单元可根据规则划分成矩形的编码单元(Coding Unit,CU)。编码单元可能还会划分预测单元(Prediction Unit,PU),变换单元(Transform Unit,TU)等。如图1所示,混合编码框架可以包括有预测(Prediction)、变换(Transform)、量化(Quantization)、熵编码(Entropy coding)、环路滤波(Inloop Filter)等模块。其中,预测模块可以包括帧内预测(Intra Prediction)和帧间预测(Inter Prediction),帧间预测可以包括运动估计(Motion Estimation)和运动补偿(Motion Compensation)。由于视频图像的一个帧内相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测方式能够消除相邻像素之间的空间冗余;但是由于视频图像中的相邻帧之间也存在着很强的相似性,在视频编解码技术中使用帧间预测方式消除相邻帧之间的时间冗余,从而能够提高编解码效率。
视频编解码器的基本流程如下:在编码端,将一帧图像划分成块,对当前块使用帧内预测或帧间预测产生当前块的预测块,当前块的原始图像块减去预测块得到残差块,对残差块进行变换、量化得到量化系数矩阵,对量化系数矩阵进行熵编码输出到码流中。在解码端,对当前块使用帧内预测或帧间预测产生当前块的预测块,另一方面解析码流得到量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块,将预测块和残差块相加得到重建块。重建块组成重建图像,基于图像或基于块对重建图像进行环路滤波得到解码图像。编码端同样需要和解码端类似的操作获得解码图像。解码图像可以为后续的帧作为帧间预测的参考帧。编码端确定的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息如果有必要需要在输出到码流中。解码端通过解析及根据已有信息进行分析确定与编码端 相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。编码端获得的解码图像通常也叫做重建图像。在预测时可以将当前块划分成预测单元,在变换时可以将当前块划分成变换单元,预测单元和变换单元的划分可以不同。上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化。也就是说,本申请实施例适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。其中,当前块(current block)可以是当前编码单元(CU)、当前预测单元(PU)或当前变换单元(TU)等。
在相关技术中，国际视频编码标准制定组织，即联合视频专家组(Joint Video Experts Team,JVET)已经成立了两个探索实验小组，分别是基于神经网络编码的探索实验以及超越VVC的探索实验，并成立了相应的若干专家讨论组。其中，超越VVC的探索实验小组旨在在最新编解码标准H.266/VVC的基础上以严格的性能和复杂度要求进行更高的编码效率探索，该小组所研究的编码方法与VVC更接近，可以称之为传统的编码方法，目前该探索实验的算法参考模型性能已经超越最新的VVC参考模型(VVC Test Model,VTM)约15%的编码性能。
而第一个探索实验小组所研究学习的方法是基于神经网络的一种智能化编码方法,时下深度学习和神经网络是各行各业的热点,尤其在计算机视觉领域,基于深度学习的方法往往有着压倒性的优势。JVET标准组织的专家将神经网络带入到视频编解码领域,借由神经网络强大的学习能力,基于神经网络的编码工具往往都有着很高效的编码效率。在VVC标准制定初期,不少厂商放眼于基于深度学习的编码工具,提出了包括基于神经网络的帧内预测方法,基于神经网络的帧间预测方法以及基于神经网络的环路滤波方法。其中,基于神经网络的环路滤波方法编码性能最为突出,经过多次会议研究探索,编码性能能够达到8%以上。而目前JVET会议的第一个探索实验小组所研究的基于神经网络的环路滤波方案编码性能高达12%,达到了几乎能够贡献了接近半代编码性能的程度。
本申请实施例是在JVET会议的探索实验基础上进行改进,提出一种基于神经网络(Neural network,NN)的环路滤波增强方案。下文将首先对相关技术中基于神经网络环路滤波方案进行相关介绍。
在相关技术中,针对基于神经网络的环路滤波方案探索主要集中为两种形式,第一种为多模型帧内可切换的方案;第二种为帧内不可切换模型的方案。但无论是哪种方案,神经网络的架构形式变化不大,且该工具在传统混合编码框架的环内滤波当中。故这两种方案的基本处理单元都是编码树单元,即最大编码单元大小。
第一种多模型帧内可切换的方案与第二种帧内不可切换模型的方案最大区别在于,编解码当前帧的时候,第一种方案可以随意切换神经网络模型,而第二种方案则不能切换神经网络模型。换而言之,以第一种方案为例,在编码一帧图像的时候,每一个编码树单元都有多种可选候选神经网络模型,由编码端进行选择当前编码树单元使用哪一个神经网络模型进行滤波效果最优,然后把该神经网络模型的索引序号写入码流,即该方案中若编码树单元需要进行滤波,则需先传输一个编码树单元级的使用标志位,后再传输神经网络模型的索引序号。若不需要滤波,则仅需传输一个编码树单元级的使用标志位即可;解码端在解析该索引序号后,在当前编码树单元载入该索引序号所对应的神经网络模型对当前编码树单元进行滤波。
以第二种方案为例,在编码一帧图像时候,当前帧内的每一个编码树单元可用的神经网络模型固定,每一个编码树单元使用相同的神经网络模型,即在编码端第二种方案并没有模型选择的过程;解码端在解析得到当前编码树单元是否使用基于神经网络的环路滤波的使用标志位,若该使用标志位为真,则使用预先设定的模型(与编码端相同)对该编码树单元进行滤波,若该使用标志位为假,则不做额外操作。
对于第一种多模型帧内可切换的方案,在编码树单元级拥有较强的灵活性,可以根据局部细节进行模型调整,即局部最优以达到全局更优的效果。通常该方案拥有较多的神经网络模型,针对JVET通用测试条件在不同量化参数下训练不同的神经网络模型,同时编码帧类型不同也可能需要不同的神经网络模型以达到更好的效果。以相关技术中的一滤波器为例,该滤波器使用多达22个神经网络模型覆盖不同的编码帧类型以及不同的量化参数,模型切换在编码树单元级进行。该滤波器在VVC的基础上能够提供多达10%以上的编码性能。
对于第二种帧内不可切换模型的方案,虽然该方案总体拥有两个神经网络模型,但在帧内并不进行模型的切换。该方案在编码端进行判断,若当前编码帧类型为I帧,则导入I帧对应的神经网络模型,而该当前帧内仅使用I帧对应的神经网络模型;若当前编码帧类型为B帧,则导入B帧对应的神经网络模型,同样该帧内仅使用B帧对应的神经网络模型。该方案在VVC的基础上能够提供8.65%的编码性能,虽然比第一种方案略低,但总体性能相比传统编码工具而言是近乎不可能达到的编码效率。
也就是说，第一种方案拥有较高的灵活性，编码性能更高，但该方案有个硬件实现上的致命缺点，即硬件专家对于帧内模型切换的代价比较担忧，在编码树单元级对模型进行切换意味着，最坏情况是解码端每处理一个编码树单元就需要重新加载一次神经网络模型，且不说硬件实现复杂度，在现有高性能图形处理器(Graphics Processing Unit,GPU)上都是一种额外负担。同时，多模型的存在也意味着，大量的参数需要存储，这也是目前硬件实现上极大的开销负担。然而，对于第二种方案，这种神经网络环路滤波进一步探索了深度学习强大的泛化能力，将各种信息作为输入而不是单一地将重建样本作为模型的输入，更多的信息为神经网络的学习提供了更多的帮助，使得模型泛化能力得到更好的体现，去除了许多不需要的冗余参数。不断更新后的方案已经出现了针对不同测试条件和量化参数，仅用一个简化的低复杂度的神经网络模型就可以胜任。这相比第一种方案而言，既省去了不断重载模型带来的消耗，也无需为大量参数开辟更大的存储空间。
下面将针对这两种方案的神经网络架构进行相关技术介绍。
参见图2,其示出了一种神经网络模型的网络架构示意图。如图2所示,该网络架构的主体结构可以由多个残差块(ResBlocks)组成。而残差块的组成结构详见图3所示。在图3中,单个残差块由多个卷积层(Conv)连接卷积注意力机制模块(Convolutional Blocks Attention Module,CBAM)层组成,而CBAM作为一种注意力机制模块,其主要负责细节特征的进一步提取,此外,残差块在输入和输出之间还存在有一个直接的跳跃连接(Skip Connection)结构。在这里,图3中的多个卷积层包括第一卷积层、第二卷积层和第三卷积层,而且第一卷积层之后还连接有激活层。示例性地,第一卷积层的大小为1×1×k×n,第二卷积层的大小为1×1×n×k,第三卷积层的大小为3×3×k×k,且k、n为正整数;激活层可以包括修正线性单元(Rectified Linear Unit,ReLU)函数,也称线性整流函数,是目前神经网络模型中经常使用的激活函数。ReLU实际上是一个斜坡函数,该函数简单,而且收敛速度快。
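上述残差块的前向计算可以用如下示意性的Python(NumPy)草图表示。这里仅以1×1卷积(等价于通道维上的线性变换)、ReLU激活和跳跃连接为例，省略了3×3卷积与CBAM注意力模块；其中的函数与变量命名均为说明用途的假设，并非实际模型实现：

```python
import numpy as np

def relu(x):
    # ReLU线性整流函数: max(0, x)
    return np.maximum(x, 0.0)

def conv1x1(x, w):
    # 1x1卷积等价于在通道维上做线性变换
    # x: (C_in, H, W), w: (C_out, C_in)
    return np.tensordot(w, x, axes=([1], [0]))

def res_block(x, w1, w2):
    # 简化的残差块: 1x1卷积 -> ReLU -> 1x1卷积,
    # 再与输入做跳跃连接(Skip Connection)相加
    # (省略了3x3卷积与CBAM注意力模块)
    y = relu(conv1x1(x, w1))
    y = conv1x1(y, w2)
    return x + y
```

跳跃连接使得残差块只需学习输入与输出之间的残差，这也是该类网络收敛较快的原因之一。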
对于图2来说,在该网络架构中也存在有一个跳跃连接结构,其将输入的重建YUV信息与像素重组(Pixel Shuffle)模块后的输出连接。其中,Pixel Shuffle的主要功能是将低分辨的特征图,通过卷积和多通道间的重组得到高分辨率的特征图;其作为一种上采样方法,可以对缩小后的特征图进行有效的放大。另外,该网络架构的输入主要有重建YUV信息(rec_yuv)、预测YUV信息(pred_yuv)以及带有划分信息的YUV信息(par_yuv),所有的输入进行简单的卷积和激活操作后进行拼接(Cat),之后送入主体结构当中,最终输出滤波后分量信息(output_y)。值得注意的是,带有划分信息的YUV信息在I帧和B帧的处理上可能会有所不同,其中,I帧需要输入带有划分信息的YUV信息,而B帧则不需要输入带有划分信息的YUV信息。
综上可知,对于每一个I帧和B帧的任意一个JVET要求通测量化参数点,第一种方案都有一个与之对应的神经网络模型。同时,因为YUV三个颜色分量主要由亮度和色度两个通道组成,因此在颜色分量上又有所不同。
参见图4,其示出了另一种神经网络模型的网络架构示意图。如图4所示,该网络架构在主体结构上第一种方案与第二种方案基本相同,不同之处在于第二种方案的输入相比第一种方案而言,增加了量化参数信息作为额外输入。上述的第一种方案根据量化参数信息的不同载入不同的神经网络模型来实现更灵活的处理和更高效的编码效果,而第二种方案则是把量化参数信息作为网络的输入来提高神经网络的泛化能力,使其在不同的量化参数条件下模型都能适应并提供良好的滤波性能。
从图4可以看到,有两种量化参数作为输入进入到神经网络模型中,一种为BaseQP,另一种为SliceQP。BaseQP这里指示编码器在编码视频序列时设定的序列级量化参数,即JVET通测要求的量化参数点,也是第一种方案当中用来抉择神经网络模型的参数。SliceQP为当前帧的量化参数,当前帧的量化参数可以与序列级量化参数不同,这是因为在视频编码过程中,B帧的量化条件与I帧不同,时域层级不同量化参数也不同,因此SliceQP在B帧中一般与BaseQP不同。所以在相关技术中,I帧的神经网络模型的输入仅需要SliceQP即可,而B帧的神经网络模型需要BaseQP和SliceQP同时作为输入。在图4中,该网络架构的输入主要有重建YUV信息(rec_yuv)、预测YUV信息(pred_yuv)、带有划分信息的YUV信息(par_yuv)以及BaseQP、SliceQP,最终输出滤波后分量信息(output_yuv)。
另外，第二种方案还有一点与第一种方案有所不同，其中，第一种方案模型的输出一般不需要再做额外处理，即模型的输出若是残差信息则叠加当前编码树单元的重建样本后作为基于神经网络的环路滤波工具输出；若模型的输出是完整的重建样本，则模型输出即为基于神经网络的环路滤波工具输出。而第二种方案的输出一般需要做一个缩放处理，以模型输出的残差信息为例，模型进行推断输出当前编码树单元的残差信息，该残差信息进行缩放后再叠加当前编码树单元的重建样本信息，而这个缩放因子是由编码端求得，并需要写入码流传到解码端的。
在相关技术中,正是因为量化参数作为额外信息的输入,使得模型数量的减少得以实现并成为了当下JVET会议上受欢迎的解决方案。此外,通用基于神经网络的环路滤波方案可以不与上述两种方案完全相同,具体方案在细节上可以存在不同,但主要的思想基本一致。例如第二种方案的不同细节处可以体现在神经网络架构的设计上,诸如残差块的卷积大小、卷积层数以及是否包含注意力机制模块等,也 可以体现在神经网络模型的输入上,输入甚至可以有更多额外信息,诸如去块效应滤波的边界强度值等。
虽然上述的两种方案均大幅降低了神经网络环路滤波技术的实现复杂度,同时保持了较为可观的性能表现。但无论是单模型还是双模型的神经网络环路滤波,对于亮度和色度等颜色分量的处理均由一个模型处理完成,通过调参训练我们可以保持不错的亮度颜色分量性能,但色度颜色分量却仍有较大的提升空间。在一种实现方式中,总共可以使用4个神经网络模型对环路滤波进行提升,这比某相关技术多出2个神经网络模型,区别在于色度颜色分量的处理上。由于色度颜色分量拥有了独立的神经网络模型,因此该技术方案比前述两种方案在色度颜色分量的性能表现上平均高出2~5%的压缩性能。若不进行亮度颜色分量性能转移,该技术方案在色度颜色分量上能额外再高出10%的压缩性能,可见前述两种方案在色度颜色分量的性能表现还有提升的空间。此外,在另一种实现方式中,通过对神经网络的环路滤波技术做了多种消融实验,发现额外的输入信息在延长训练时长的前提下变得没有用处。因此,在本申请实施例中,基于神经网络的环路滤波技术可以考虑去除预测YUV信息、带有划分信息的YUV信息以及边界强度(Boundary strength,Bs)等输入信息,剪裁成仅具有重建YUV信息以及BaseQP的输入。
综上可知,由于BaseQP的重要性,在单一的模型方案中,受限于时间复杂度,亮度颜色分量和色度颜色分量并不能同时自由的切换;若单一的BaseQP输入到单一的网络模型方案当中,则解码端需要多次推理出重建样本,解码端时间复杂度的急剧增加在当前的软硬件设计中仍然不可接受,不利于提升编码性能。
基于此,本申请实施例提供了一种编解码方法,无论是在编码端还是解码端,首先确定当前帧的待滤波分量的第一语法元素标识信息;在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,然后确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;再确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,最后可以确定出当前块的待滤波分量的滤波后重建值。
这样,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
下面将结合附图对本申请各实施例进行详细说明。
参见图5A，其示出了本申请实施例提供的一种编码器的组成框图示意图。如图5A所示，编码器(具体为“视频编码器”)100可以包括变换与量化单元101、帧内估计单元102、帧内预测单元103、运动补偿单元104、运动估计单元105、反变换与反量化单元106、滤波器控制分析单元107、滤波单元108、编码单元109和解码图像缓存单元110等，其中，滤波单元108可以实现去方块滤波及样本自适应补偿(Sample Adaptive Offset,SAO)滤波，编码单元109可以实现头信息编码及基于上下文的自适应二进制算术编码(Context-based Adaptive Binary Arithmetic Coding,CABAC)。针对输入的原始视频信号，通过编码树单元(Coding Tree Unit,CTU)的划分可以得到一个视频编码块，然后对经过帧内或帧间预测后得到的残差像素信息通过变换与量化单元101对该视频编码块进行变换，包括将残差信息从像素域变换到变换域，并对所得的变换系数进行量化，用以进一步减少比特率；帧内估计单元102和帧内预测单元103是用于对该视频编码块进行帧内预测；明确地说，帧内估计单元102和帧内预测单元103用于确定待用以编码该视频编码块的帧内预测模式；运动补偿单元104和运动估计单元105用于执行所接收的视频编码块相对于一或多个参考帧中的一或多个块的帧间预测编码以提供时间预测信息；由运动估计单元105执行的运动估计为产生运动向量的过程，所述运动向量可以估计该视频编码块的运动，然后由运动补偿单元104基于由运动估计单元105所确定的运动向量执行运动补偿；在确定帧内预测模式之后，帧内预测单元103还用于将所选择的帧内预测数据提供到编码单元109，而且运动估计单元105将所计算确定的运动向量数据也发送到编码单元109；此外，反变换与反量化单元106是用于该视频编码块的重构建，在像素域中重构建残差块，该重构建残差块通过滤波器控制分析单元107和滤波单元108去除方块效应伪影，然后将该重构残差块添加到解码图像缓存单元110的帧中的一个预测性块，用以产生经重构建的视频编码块；编码单元109是用于编码各种编码参数及量化后的变换系数，在基于CABAC的编码算法中，上下文内容可基于相邻编码块，可用于编码指示所确定的帧内预测模式的信息，输出该视频信号的码流；而解码图像缓存单元110是用于存放重构建的视频编码块，用于预测参考。随着视频图像编码的进行，会不断生成新的重构建的视频编码块，这些重构建的视频编码块都会被存放在解码图像缓存单元110中。
参见图5B,其示出了本申请实施例提供的一种解码器的组成框图示意图。如图5B所示,解码器(具体为“视频解码器”)200包括解码单元201、反变换与反量化单元202、帧内预测单元203、运动补偿单元204、滤波单元205和解码图像缓存单元206等,其中,解码单元201可以实现头信息解码以及CABAC解码,滤波单元205可以实现去方块滤波以及SAO滤波。输入的视频信号经过图5A的编码处理之后,输出该视频信号的码流;该码流输入解码器200中,首先经过解码单元201,用于得到解码后的变换系数;针对该变换系数通过反变换与反量化单元202进行处理,以便在像素域中产生残差块;帧内预测单元203可用于基于所确定的帧内预测模式和来自当前帧或图片的先前经解码块的数据而产生当前视频解码块的预测数据;运动补偿单元204是通过剖析运动向量和其他关联语法元素来确定用于视频解码块的预测信息,并使用该预测信息以产生正被解码的视频解码块的预测性块;通过对来自反变换与反量化单元202的残差块与由帧内预测单元203或运动补偿单元204产生的对应预测性块进行求和,而形成解码的视频块;该解码的视频信号通过滤波单元205以便去除方块效应伪影,可以改善视频质量;然后将经解码的视频块存储于解码图像缓存单元206中,解码图像缓存单元206存储用于后续帧内预测或运动补偿的参考图像,同时也用于视频信号的输出,即得到了所恢复的原始视频信号。
需要说明的是,本申请实施例的方法主要应用在如图5A所示的滤波单元108部分和如图5B所示的滤波单元205部分。其中,无论是滤波单元108还是滤波单元205,这里均是指基于神经网络的环路滤波部分。也就是说,本申请实施例主要影响视频编码混合框架中的环路滤波部分,既可以应用于编码器,也可以应用于解码器,甚至还可以同时应用于编码器和解码器,但是这里不作具体限定。
在本申请的一实施例中,参见图6,其示出了本申请实施例提供的一种解码方法的流程示意图。如图6所示,该方法可以包括:
S601:解析码流,确定当前帧的待滤波分量的第一语法元素标识信息。
需要说明的是，在本申请实施例中，该方法应用于解码器，具体可以应用于基于神经网络模型的环路滤波方法，更具体地，可以是基于多量化参数输入的神经网络模型的环路滤波方法。
还需要说明的是,在本申请实施例中,解码器可以通过解析码流,确定第一语法元素标识信息。这里,第一语法元素标识信息为帧级语法元素,可以用于指示当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波。另外,当前帧可以包括至少一个划分块,且当前块为至少一个划分块中的任意一个。也就是说,第一语法元素标识信息可以确定当前帧包括的至少一个划分块的待滤波分量是否全部不允许使用预设网络模型进行滤波。
示例性地,在本申请实施例中,待滤波分量可以是指颜色分量。颜色分量可以包括下述至少一项:第一颜色分量、第二颜色分量和第三颜色分量。其中,第一颜色分量可以为亮度颜色分量,第二颜色分量和第三颜色分量可以为色度颜色分量(例如,第二颜色分量为蓝色色度颜色分量,第三颜色分量为红色色度颜色分量;或者,第二颜色分量为红色色度颜色分量,第三颜色分量为蓝色色度颜色分量)。
示例性地,如果待滤波分量为亮度颜色分量,那么第一语法元素标识信息可以为ph_nnlf_luma_enable_flag;如果待滤波分量为色度颜色分量,那么第一语法元素标识信息可以为ph_nnlf_chroma_enable_flag。也就是说,针对当前帧中不同的颜色分量,对应设置有不同的第一语法元素标识信息。具体地,解码器在解析码流之后,可以确定待滤波分量的第一语法元素标识信息,从而可以在该待滤波分量下当前帧中是否存在划分块允许使用预设网络模型进行滤波。
需要说明的是,对于第一语法元素标识信息而言,具体可以由解码该标识信息的取值来确定。在一些实施例中,所述解析码流,确定当前帧的待滤波分量的第一语法元素标识信息,可以包括:
解析码流,获取第一语法元素标识信息的取值;
相应地,该方法还可以包括:
若第一语法元素标识信息的取值为第一值,则确定第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波;
若第一语法元素标识信息的取值为第二值,则确定第一语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部不允许使用预设网络模型进行滤波。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,第一语法元素标识信息可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,这里对此不作具体限定。
示例性地,以flag为例,flag的设置有两种方式:使能标志位(enable_flag)和非使能标志位 (disable_flag)。假定使能标志位的取值为第一值,非使能标志位的取值为第二值;那么对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为真(true),第二值还可以设置为假(false);但是本申请实施例并不作具体限定。
还需要说明的是,对于第一语法元素标识信息而言,解码器首先需要解码确定当前帧的待滤波分量的第三语法元素标识信息,然后再确定是否需要解码第一语法元素标识信息。因此,在一些实施例中,所述解析码流,确定当前帧的第一语法元素标识信息,可以包括:
解析码流,确定当前帧的待滤波分量的第三语法元素标识信息;
在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,解析码流,确定当前帧的待滤波分量的第一语法元素标识信息。
在本申请实施例中,第三语法元素标识信息也为帧级语法元素,可以用于指示当前帧包括的至少一个划分块的待滤波分量是否全部使用预设网络模型进行滤波。也就是说,第三语法元素标识信息可以确定当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波,还是当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波。
示例性地,如果待滤波分量为亮度颜色分量,那么第三语法元素标识信息可以为ph_nnlf_luma_ctrl_flag;如果待滤波分量为色度颜色分量,那么第三语法元素标识信息可以为ph_nnlf_chroma_ctrl_flag。也就是说,针对当前帧中不同的颜色分量,对应设置有不同的第三语法元素标识信息。具体地,解码器在解析码流之后,可以先确定待滤波分量的第三语法元素标识信息,只有在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,这时候解码器才还需要解码获得第一语法元素标识信息的取值。
在一种具体的实施例中,所述解析码流,确定当前帧的待滤波分量的第三语法元素标识信息,可以包括:解析码流,获取第三语法元素标识信息的取值;
相应地,该方法还可以包括:
若第三语法元素标识信息的取值为第一值,则确定第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波;
若第三语法元素标识信息的取值为第二值,则确定第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波。
在本申请实施例中,第三语法元素标识信息为一flag时,对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;但是本申请实施例并不作具体限定。
在本申请实施例中,第一语法元素标识信息和第三语法元素标识信息均为帧级语法元素。示例性地,第三语法元素标识信息也可称为帧级开关标识位,第一语法元素标识信息也可称为帧级使用标识位。其中,帧级开关标识位为真的情况下,这时候当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波,那么无需再解析码流,可以直接将帧级使用标识位设置为真;只有在帧级开关标识位为假的情况下,这时候当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波,那么还需要继续解析码流,以确定出帧级使用标识位,即第一语法元素标识信息。
S602:在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息。
需要说明的是,在本申请实施例中,如果第一语法元素标识信息为真,这时候第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波,那么还需要继续解析码流,以确定当前块的待滤波分量的第二语法元素标识信息。另外,这里的当前块具体是指当前待进行环路滤波的划分块,其可以为当前帧包括的至少一个划分块中的任意一个。在这里,当前块可以是当前编码单元、当前预测单元或者当前变换单元,甚至也可以是当前编码树单元(Coding Tree Unit,CTU)等。下面将以当前块为当前编码树单元为例进行具体描述。
还需要说明的是,在本申请实施例中,第二语法元素标识信息为编码树单元级语法元素,可以用于指示当前块的待滤波分量是否使用预设网络模型进行滤波。第二语法元素标识信息也可称为编码树单元使用标识位。也就是说,第二语法元素标识信息可以确定当前编码树单元的待滤波分量使用预设网络模型进行滤波,还是当前编码树单元的待滤波分量不使用预设网络模型进行滤波。
示例性地，如果待滤波分量为亮度颜色分量，那么第二语法元素标识信息可以为ctb_nnlf_luma_flag；如果待滤波分量为色度颜色分量，那么第二语法元素标识信息可以为ctb_nnlf_chroma_flag。也就是说，针对当前编码树单元中不同的颜色分量，对应设置有不同的第二语法元素标识信息。具体地，解码器在解析码流之后，可以先确定待滤波分量的第三语法元素标识信息，在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时，解码器还需要解码获得第一语法元素标识信息的取值；只有在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时，这时候解码器才会继续解码获得第二语法元素标识信息的取值。
在一种具体的实施例中,所述解析码流,确定当前块的待滤波分量的第二语法元素标识信息,可以包括:解析码流,获取第二语法元素标识信息的取值。
相应地,该方法还可以包括:
若第二语法元素标识信息的取值为第一值,则确定第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波;
若第二语法元素标识信息的取值为第二值,则确定第二语法元素标识信息指示当前块的待滤波分量不使用预设网络模型进行滤波。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,无论是第一语法元素标识信息,还是第二语法元素标识信息、第三语法元素标识信息,均可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,这里对此不作具体限定。
示例性地,在本申请实施例中,第二语法元素标识信息为一flag时,对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;但是本申请实施例并不作具体限定。
还需要说明的是，在本申请实施例中，第三语法元素标识信息可称为帧级开关标识位，第一语法元素标识信息可称为帧级使用标识位，第二语法元素标识信息可称为编码树单元标识位。这样，帧级开关标识位为真的情况下，这时候当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波，那么无需再解析码流，可以直接将帧级使用标识位和当前帧内的所有编码树单元使用标识位全部设置为真；只有在帧级开关标识位为假的情况下，这时候当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波，那么还需要继续解析码流，以确定出帧级使用标识位；然后在帧级使用标识位为真的情况下，继续解析码流，以确定出当前帧内每一划分块的编码树单元使用标识位。
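上述帧级开关标识位、帧级使用标识位与编码树单元使用标识位之间的推导关系，可以用如下示意性Python草图表示。其中read_bit为从码流读取一个标志位的假设接口，函数命名均为说明用途的假设，并非标准定义：

```python
def derive_ctb_flags(ph_ctrl_flag, read_bit):
    """根据帧级开关标识位推导帧级使用标识位与CTU级使用标识位(示意)."""
    if ph_ctrl_flag:
        # 帧级开关为真: 当前帧所有划分块全部使用滤波, 无需再解析码流
        ph_enable_flag = True
        ctb_flag = True
    else:
        # 帧级开关为假: 继续解析帧级使用标识位
        ph_enable_flag = bool(read_bit())
        if ph_enable_flag:
            # 帧级使用为真时, 才逐个解析CTU级使用标识位
            ctb_flag = bool(read_bit())
        else:
            # 帧内全部划分块不允许滤波
            ctb_flag = False
    return ph_enable_flag, ctb_flag
```

这种分级标志位设计使得"全开"与"全关"两种常见情况只需一个帧级比特即可表达，只有部分使用时才产生逐块的比特开销。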
S603:在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。
S604:确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
在本申请实施例中,对于当前块而言,如果解码确定第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波,那么还需要确定当前块的块量化参数信息和当前块的待滤波分量的重建值,然后将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,就可以确定出当前块的待滤波分量的滤波后重建值。
为了为亮度颜色分量和色度颜色分量提供更多信息,以提升编解码性能,那么本申请实施例不仅亮度颜色分量需要输入一个量化参数通道,同时色度颜色分量也需要输入一个量化参数通道;因此,这里的块量化参数信息至少可以包括亮度颜色分量的块量化参数值和色度颜色分量的块量化参数值。示例性地,第一颜色分量的块量化参数值可以为亮度颜色分量的块量化参数值(用ctb_nnlf_luma_baseqp表示),第二颜色分量的块量化参数值可以为色度颜色分量的块量化参数值(用ctb_nnlf_chroma_baseqp表示)。
在一种可能的实施例中,所述确定当前块的块量化参数信息,可以包括:
解析码流,确定当前块的第一量化参数索引和第二量化参数索引;
根据第一量化参数索引,从第一量化参数候选集合中确定当前块对应的第一颜色分量的块量化参数值;以及
根据第二量化参数索引,从第二量化参数候选集合中确定当前块对应的第二颜色分量的块量化参数值。
在这里,第一量化参数候选集合可以是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合可以是由至少两个第二颜色分量的候选量化参数值组成。
需要说明的是,在本申请实施例中,码流中写入的是第一量化参数索引和第二量化参数索引。此时解码器通过解析码流,可以获得第一量化参数索引;然后根据第一量化参数索引可以从第一量化参数候选集合中确定出第一颜色分量的块量化参数值;解码器通过解析码流,还可以获得第二量化参数索引;然后根据第二量化参数索引可以从第二量化参数候选集合中确定出第二颜色分量的块量化参数值。
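上述根据量化参数索引从候选集合中查表得到块量化参数值的过程可以示意如下。其中候选集合的具体取值仅为说明用途的假设，并非本申请给出的固定数值：

```python
def block_qp_from_index(qp_index, qp_candidates):
    # 根据码流中解析得到的量化参数索引, 从候选集合中查表得到块量化参数值
    return qp_candidates[qp_index]

# 候选集合取值仅为示例性假设
luma_candidates = [27, 32]     # 第一量化参数候选集合(亮度)
chroma_candidates = [32, 37]   # 第二量化参数候选集合(色度)

ctb_nnlf_luma_baseqp = block_qp_from_index(1, luma_candidates)
ctb_nnlf_chroma_baseqp = block_qp_from_index(0, chroma_candidates)
```

由于码流中只需传输候选集合内的索引序号而非量化参数值本身，比特开销更小。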
在另一种可能的实施例中,所述确定当前块的块量化参数信息,可以包括:
解析码流,确定当前块对应的第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。
需要说明的是,在本申请实施例中,码流中写入的是第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。这样,解码器通过解析码流,可以直接确定出第一颜色分量的块量化参数值和第二颜 色分量的块量化参数值。
还需要说明的是,在本申请实施例中,对于亮度颜色分量而言,如果当前块的亮度颜色分量的第二语法元素标识信息的取值为真时,这时候当前块的亮度颜色分量使用预设网络模型进行滤波,那么还需要解码获得亮度颜色分量的块量化参数值;对于色度颜色分量而言,如果当前块的色度颜色分量的第二语法元素标识信息的取值为真时,这时候当前块的色度颜色分量使用预设网络模型进行滤波,那么还需要解码获得色度颜色分量的块量化参数值。需要注意的是,如果当前块的亮度颜色分量的第二语法元素标识信息的取值为假,色度颜色分量的第二语法元素标识信息的取值为真,那么需要对当前块的色度颜色分量使用预设网络模型进行滤波,这时候输入的块量化参数值仍然包括亮度颜色分量的块量化参数值和色度颜色分量的块量化参数值,而此时从码流中获取的亮度颜色分量的块量化参数值是默认值。反之,如果当前块的色度颜色分量的第二语法元素标识信息的取值为假,亮度颜色分量的第二语法元素标识信息的取值为真,那么需要对当前块的亮度颜色分量使用预设网络模型进行滤波,这时候输入的块量化参数值仍然包括亮度颜色分量的块量化参数值和色度颜色分量的块量化参数值,而此时从码流中获取的色度颜色分量的块量化参数值是默认值。
可以理解的是,在解码重建过程中,解码器还需要确定当前块的待滤波分量的重建值。在一些实施例中,所述确定当前块的待滤波分量的重建值,可以包括:
解析码流,确定当前块的待滤波分量的重建残差值;
对当前块的待滤波分量进行帧内或帧间预测,确定当前块的待滤波分量的预测值;
根据当前块的待滤波分量的重建残差值和当前块的待滤波分量的预测值,确定当前块的待滤波分量的重建值。
在一种具体的实施例中,所述解析码流,确定当前块的待滤波分量的重建残差值,可以包括:解析码流,获取当前块的待滤波分量的目标残差值;对当前块的待滤波分量的目标残差值进行反量化与反变换处理,得到当前块的待滤波分量的重建残差值。
在一种具体的实施例中,所述根据当前块的待滤波分量的重建残差值和当前块的待滤波分量的预测值,确定当前块的待滤波分量的重建值,可以包括:对当前块的待滤波分量的重建残差值和当前块的待滤波分量的预测值进行加法计算,得到当前块的待滤波分量的重建值。
在这里,对于当前块而言,可以通过解码获得当前块的待滤波分量的重建残差值;然后通过对当前块的待滤波分量进行帧内或帧间预测,以确定出当前块的待滤波分量的预测值;再对待滤波分量的重建残差值和待滤波分量的预测值进行加法计算,即可得到当前块的待滤波分量的重建值,也就是前述所描述的重建YUV信息;然后将其作为预设网络模型的输入,以确定当前块的待滤波分量的滤波后重建值。
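上述"重建值=重建残差值+预测值"的加法计算可以示意如下。其中对样本位深取值范围的裁剪(此处以10比特为例)是常见做法，作为假设补充，并非本申请明确给出的步骤：

```python
import numpy as np

def reconstruct(pred, resi, bit_depth=10):
    # 重建值 = 预测值 + 重建残差值 (逐样本加法),
    # 并裁剪到样本位深允许的取值范围 (bit_depth=10仅为示例性假设)
    max_val = (1 << bit_depth) - 1
    return np.clip(pred + resi, 0, max_val)
```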
还需要说明的是,解析码流,确定当前块的待滤波分量的第二语法元素标识信息之后,在一些实施例中,该方法还可以包括:在第二语法元素标识信息指示当前块的待滤波分量不使用预设网络模型进行滤波时,将当前块的待滤波分量的重建值直接确定为当前块的待滤波分量的滤波后重建值。
也就是说,在解码确定当前块的第二语法元素标识信息之后,如果第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波,那么可以将当前块的待滤波分量的重建值和块量化参数信息输入到预设网络模型,从而能够得到当前块的待滤波分量的滤波后重建值;如果第二语法元素标识信息指示当前块的待滤波分量不使用预设网络模型进行滤波,那么可以将当前块的待滤波分量的重建值直接确定为当前块的待滤波分量的滤波后重建值。
还需要说明的是,解析码流,确定当前帧的待滤波分量的第一语法元素标识信息之后,在一些实施例中,该方法还可以包括:
在第一语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部不允许使用预设网络模型进行滤波时,将划分块的待滤波分量的第二语法元素标识信息的取值均设置为第二值;
在确定划分块的待滤波分量的重建值之后,将划分块的待滤波分量的重建值直接确定为划分块的待滤波分量的滤波后重建值。
也就是说,在解码确定当前帧的待滤波分量的第一语法元素标识信息之后,如果第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波,那么需要继续解码确定第二语法元素标识信息,然后根据第二语法元素标识信息来确定出当前块的待滤波分量的滤波后重建值;反之,如果第一语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部不允许使用预设网络模型进行滤波,那么可以将这至少一个划分块的待滤波分量的第二语法元素标识信息的取值均设置为第二值;然后在确定出每一划分块的待滤波分量的重建值之后,将每一划分块的待滤波分量的重建值直接确定为该划分块的待滤波分量的滤波后重建值;后续还需要继续执行其他环路滤波方法,在所有环路滤波方法全部执行完毕后输出完整的重建图像。
还需要说明的是，解析码流，确定当前帧的待滤波分量的第三语法元素标识信息之后，在一些实施例中，该方法还可以包括：
在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波时,解析码流,确定当前帧的帧量化参数信息;其中,帧量化参数信息至少包括第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值;
将当前帧的待滤波分量的第一语法元素标识信息的取值设置为第一值,将当前帧中的划分块的待滤波分量的第二语法元素标识信息的取值均设置为第一值,以及根据当前帧的帧量化参数信息确定划分块的块量化参数信息;
在确定划分块的待滤波分量的重建值之后,将划分块的待滤波分量的重建值和划分块的块量化参数信息输入到预设网络模型,确定划分块的待滤波分量的滤波后重建值。
也就是说,在解码确定当前帧的待滤波分量的第三语法元素标识信息之后,如果第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用所述预设网络模型进行滤波,那么需要继续解码确定第一语法元素标识信息和第二语法元素标识信息,然后根据这两个语法元素标识信息来确定出当前块的待滤波分量的滤波后重建值;反之,如果第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波,那么仅需要解码确定当前帧的帧量化参数信息;然后将第一语法元素标识信息和第二语法元素标识信息的取值都设置为第一值,并且根据当前帧的帧量化参数信息来确定出当前帧包括的所有划分块的块量化参数信息。
在本申请实施例中,帧量化参数信息至少包括第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值。示例性地,第一颜色分量为亮度颜色分量,第二颜色分量为色度颜色分量。在一种可能的实施例中,所述解析码流,确定当前帧的帧量化参数信息,可以包括:
解析码流,确定当前帧的第三量化参数索引和第四量化参数索引;
根据第三量化参数索引,从第一量化参数候选集合中确定当前帧对应的第一颜色分量的帧量化参数值;以及
根据第四量化参数索引,从第二量化参数候选集合中确定当前帧对应的第二颜色分量的帧量化参数值。
在这里,第一量化参数候选集合可以是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合可以是由至少两个第二颜色分量的候选量化参数值组成。需要注意的是,对于同一帧来说,第一量化参数候选集合是可以相同的,第二量化参数候选集合可以是相同的;而不同的帧各自对应的第一量化参数候选集合可以不同,不同的帧各自对应的第二量化参数候选集合也可以不同。
还需要说明的是,在本申请实施例中,码流中写入的可以是第三量化参数索引和第四量化参数索引。此时解码器通过解析码流,可以获得第三量化参数索引;然后根据第三量化参数索引可以从第一量化参数候选集合中确定出第一颜色分量的帧量化参数值;解码器通过解析码流,还可以获得第四量化参数索引;然后根据第四量化参数索引可以从第二量化参数候选集合中确定出第二颜色分量的帧量化参数值。示例性地,亮度颜色分量的帧量化参数值可以用ph_nnlf_luma_baseqp表示,色度颜色分量的帧量化参数值可以用ph_nnlf_chroma_baseqp表示。
在另一种可能的实施例中,所述解析码流,确定当前帧的帧量化参数信息,可以包括:
解析码流,确定当前帧对应的第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值。
需要说明的是,在本申请实施例中,码流中写入的是第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值。这样,解码器通过解析码流,可以直接确定出第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值。
还需要说明的是,在本申请实施例中,在解码确定当前帧包括的至少一个划分块的待滤波分量全部使用所述预设网络模型进行滤波时,这时候通过解析码流以确定出当前帧的帧量化参数信息;然后将当前块的亮度颜色分量的块量化参数值赋值为亮度颜色分量的帧量化参数值,即ctb_nnlf_luma_baseqp=ph_nnlf_luma_baseqp,将当前块的色度颜色分量的块量化参数值赋值为色度颜色分量的帧量化参数值,即ctb_nnlf_chroma_baseqp=ph_nnlf_chroma_baseqp。
还可以理解的是,在本申请实施例中,这里引入了新的语法元素,例如待滤波分量的第一语法元素标识信息、第二语法元素标识信息、第三语法元素标识信息等。在一些实施例中,待滤波分量至少包括亮度颜色分量和色度颜色分量;该方法还可以包括:
在当前帧的颜色分量类型为亮度颜色分量时,确定第三语法元素标识信息为当前帧的帧级亮度开关标识信息,第一语法元素标识信息为当前帧的帧级亮度使能标识信息,第二语法元素标识信息为当前块的块级亮度使用标识信息;其中,帧级亮度开关标识信息用于指示当前帧包括的至少一个划分块的亮度颜色分量是否全部使用预设网络模型进行滤波,帧级亮度使能标识信息用于指示当前帧中是否存在划分块的亮度颜色分量允许使用预设网络模型进行滤波,块级亮度使用标识信息用于指示当前块的亮度颜色 分量是否使用预设网络模型进行滤波;
在当前帧的颜色分量类型为色度颜色分量时,确定第三语法元素标识信息为当前帧的帧级色度开关标识信息,第一语法元素标识信息为当前帧的帧级色度使能标识信息,第二语法元素标识信息为当前块的块级色度使用标识信息;其中,帧级色度开关标识信息用于指示当前帧包括的至少一个划分块的色度颜色分量是否全部使用预设网络模型进行滤波,帧级色度使能标识信息用于指示当前帧中是否存在划分块的色度颜色分量允许使用预设网络模型进行滤波,块级色度使用标识信息用于指示当前块的色度颜色分量是否使用预设网络模型进行滤波。
在这里,对于亮度颜色分量来说,帧级亮度开关标识信息可以用ph_nnlf_luma_ctrl_flag表示,帧级亮度使能标识信息可以用ph_nnlf_luma_enable_flag表示,块级亮度使用标识信息可以用ctb_nnlf_luma_flag表示;对于色度颜色分量来说,帧级色度开关标识信息可以用ph_nnlf_chroma_ctrl_flag表示,帧级色度使能标识信息可以用ph_nnlf_chroma_enable_flag表示,块级色度使用标识信息可以用ctb_nnlf_chroma_flag表示。
进一步地,在本申请实施例中,还可以设置有序列级语法元素,以便确定当前序列是否允许使用基于神经网络的环路滤波技术,该方法还可以包括:
解析码流,确定第四语法元素标识信息;
在第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波时,执行解析码流,确定当前帧的待滤波分量的第三语法元素标识信息的步骤。
需要说明的是,在本申请实施例中,第四语法元素标识信息为序列级语法元素,可以用于指示当前序列的待滤波分量是否允许使用预设网络模型进行滤波。也就是说,根据第四语法元素标识信息的取值不同,可以确定当前序列的待滤波分量允许使用预设网络模型进行滤波,还是当前序列的待滤波分量不允许使用预设网络模型进行滤波。
还需要说明的是,在本申请实施例中,第四语法元素标识信息可以用sps_nnlf_enable_flag表示。其中,如果当前序列的亮度颜色分量和色度颜色分量中至少一项允许使用预设网络模型进行滤波,那么意味着sps_nnlf_enable_flag的取值为真,即当前序列的待滤波分量允许使用预设网络模型进行滤波;如果当前序列的亮度颜色分量和色度颜色分量中均不允许使用预设网络模型进行滤波,那么意味着sps_nnlf_enable_flag的取值为假,即当前序列的待滤波分量不允许使用预设网络模型进行滤波。
在一种具体的实施例中,所述解析码流,确定第四语法元素标识信息,可以包括:解析码流,获取第四语法元素标识信息的取值。
相应地,该方法还可以包括:
若第四语法元素标识信息的取值为第一值,则确定第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波;
若第四语法元素标识信息的取值为第二值,则确定第四语法元素标识信息指示当前序列的待滤波分量不允许使用预设网络模型进行滤波。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,对于第四语法元素标识信息而言,可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,这里对此不作具体限定。
示例性地,在本申请实施例中,第四语法元素标识信息为一flag时,对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;但是本申请实施例并不作具体限定。
还需要说明的是，第四语法元素标识信息可称为序列级标识位。解码器首先解码获得序列级标识位，若sps_nnlf_enable_flag的取值为真，则表示当前码流允许使用基于预设网络模型的环路滤波技术，且后续解码过程需要解析相关的语法元素；否则，表示当前码流不允许使用基于预设网络模型的环路滤波技术，后续解码过程不需要解析相关语法元素，默认相关语法元素为初始值或置为假状态。
进一步地,在本申请实施例中,预设网络模型可以为神经网络模型,且该神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
需要说明的是，对于预设网络模型来说，其输入可以包括：待滤波分量的重建值(用rec_yuv表示)、亮度颜色分量的量化参数值(用BaseQPluma表示)和色度颜色分量的量化参数值(用BaseQPchroma表示)；其输出可以为：待滤波分量的滤波后重建值(用output_yuv表示)。由于本申请实施例去除了如预测YUV信息、带有划分信息的YUV信息等非重要输入元素，可以减少网络模型推理的计算量，有利于解码端的实现和降低解码时间。另外，在本申请实施例中，预设网络模型的输入还可以包括当前帧的量化参数(SliceQP)，但是SliceQP不用区分亮度颜色分量和色度颜色分量。
还需要说明的是，对于预设网络模型来说，其网络的主体结构与前述的图2或图4类似，其主体结构也是由多个残差块组成，而且残差块的组成结构可以详见图3所示。
示例性地，参见图7，其示出了本申请实施例提供的一种神经网络模型的网络架构示意图。如图7所示，在输入端，待滤波分量的重建值经过卷积层和激活层的操作之后再和亮度颜色分量的量化参数值、色度颜色分量的量化参数值进行拼接处理，之后再将拼接结果送入主体结构中；而且这里也存在有一个跳跃连接结构，其将输入的待滤波分量的重建值与Pixel Shuffle模块后的输出连接，最终输出待滤波分量的滤波后重建值。具体地，在图7中，该网络架构的输入主要有重建YUV信息(rec_yuv)、BaseQPluma及BaseQPchroma，该网络架构的输出则为输出滤波后分量信息(output_yuv)。
这样,本申请实施例提出了一种多BaseQP输入的基于神经网络模型的环路滤波技术,主要思想是亮度颜色分量输入一个通道的BaseQPluma,同时色度颜色分量也输入一个通道的BaseQPchroma,同时保持模型数量一个不变。如此,本申请实施例在不增加模型数量的前提下,通过增加推理计算量,可以为亮色度颜色分量提供更多信息,同时使得亮度颜色分量和色度颜色分量拥有更多的选择和适配。
进一步地,在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,该方法还可以包括:确定预设网络模型的输出为当前块的待滤波分量的滤波后重建值。
进一步地,在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,预设网络模型的输出还可以为残差信息。如图8所示,该方法还可以包括:
S801:确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,通过预设网络模型输出当前块的待滤波分量的第一残差值。
S802:根据当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值,确定当前块的待滤波分量的滤波后重建值。
需要说明的是,在本申请实施例中,预设网络模型的输出可以直接为当前块的待滤波分量的滤波后重建值,或者也可以是当前块的待滤波分量的第一残差值。对于后者而言,解码器还需要对当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值进行加法运算,可以确定出当前块的待滤波分量的滤波后重建值。
还需要说明的是,在本申请实施例中,在预设网络模型的输出端可以增加一个缩放处理,即利用残差缩放因子对待滤波分量的第一残差值进行缩放。因此,在一些实施例中,如图9所示,该方法还可以包括:
S901:解析码流,确定残差缩放因子。
S902:根据残差缩放因子对当前块的待滤波分量的第一残差值进行缩放处理,得到当前块的待滤波分量的第二残差值。
S903:根据当前块的待滤波分量的重建值和当前块的待滤波分量的第二残差值,确定当前块的待滤波分量的滤波后重建值。
需要说明的是,预设网络模型的输出若是残差信息,那么需要叠加当前块的重建样本后作为基于预设网络模型的环路滤波工具输出;若预设网络模型的输出是完整的重建样本,那么模型输出即为基于预设网络模型的环路滤波工具输出。然而,在一种可能的实施例中,模型输出一般还需要进行一个缩放处理,以模型输出为残差信息为例,预设网络模型进行推断输出当前块的残差信息,该残差信息进行缩放处理后再叠加当前块的重建样本;而这个残差缩放因子是由编码器求得,其需要写入码流传到解码器中,使得解码器通过解码即可获得残差缩放因子。
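上述缩放处理可以示意如下，即：第二残差值 = 残差缩放因子 × 第一残差值，滤波后重建值 = 重建值 + 第二残差值。函数命名仅为说明用途的假设：

```python
def apply_scaled_residual(rec, model_residual, scale):
    # rec: 待滤波分量的重建值(逐样本)
    # model_residual: 预设网络模型输出的第一残差值
    # scale: 码流中解析得到的残差缩放因子
    # 返回: 滤波后重建值 = 重建值 + scale * 第一残差值
    return [r + scale * d for r, d in zip(rec, model_residual)]
```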
进一步地,在一些实施例中,该方法还可以包括:
遍历当前帧中的至少一个划分块,将每一划分块依次作为当前块,重复执行解析码流,确定当前块的待滤波分量的第二语法元素标识信息的取值的步骤,以得到至少一个划分块各自对应的滤波后重建值;
根据至少一个划分块各自对应的滤波后重建值,确定当前帧的重建图像。
需要说明的是,对于当前帧而言,当前帧可以包括至少一个划分块。然后遍历这些划分块,将每一划分块依次作为当前块,重复执行本申请实施例的解码方法流程,以得到每一划分块对应的滤波后重建值;根据所得到的这些滤波后重建值可以确定出当前帧的重建图像。另外,需要注意的是,解码器还可以继续遍历其他环路滤波工具,完毕后输出完整的重建图像,具体过程与本申请实施例并不密切相关,故这里不作详细赘述。
进一步地,在一些实施例中,由于视频编码在I帧和B帧的质量要求上存在不同,往往I帧要求较高的编码质量以利于B帧作为参考。故在I帧和B帧上,本申请实施例的解码方法仅允许B帧使用亮色度分量不同的量化参数(BaseQPluma和BaseQPchroma)输入,而I帧亮色度分量的量化参数输入一致。这样不仅降低编解码时间,同时在I帧上可以节省量化参数传输的比特开销,进一步提升压缩效率。
进一步地,在一些实施例中,本申请实施例仅增加了一层色度量化参数作为额外输入,此外还可以对Cb颜色分量和Cr颜色分量分别增加量化参数作为额外输入。
进一步地,在一些实施例中,本申请实施例提出的基于神经网络模型的环路滤波增强方法还可以拓展到其他输入部分,例如边界强度等,本申请实施例不作具体限定。
本实施例提供了一种解码方法,通过解析码流,确定当前帧的待滤波分量的第一语法元素标识信息;在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息;以及在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;然后再确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。这样,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
在本申请的另一实施例中,基于前述实施例所述的解码方法,参见图10,其示出了本申请实施例提供的一种编码方法的流程示意图。如图10所示,该方法可以包括:
S1001:确定当前帧的待滤波分量的第一语法元素标识信息。
需要说明的是，在本申请实施例中，该方法应用于编码器，具体可以应用于基于神经网络模型的环路滤波方法，更具体地，可以是基于多量化参数输入的神经网络模型的环路滤波方法。
还需要说明的是,在本申请实施例中,第一语法元素标识信息为帧级语法元素,可以用于指示当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波。另外,当前帧可以包括至少一个划分块,且当前块为至少一个划分块中的任意一个。也就是说,第一语法元素标识信息可以确定当前帧包括的至少一个划分块的待滤波分量是否全部不允许使用预设网络模型进行滤波。
示例性地,在本申请实施例中,待滤波分量可以是指颜色分量。颜色分量可以包括下述至少一项:第一颜色分量、第二颜色分量和第三颜色分量。其中,第一颜色分量可以为亮度颜色分量,第二颜色分量和第三颜色分量可以为色度颜色分量(例如,第二颜色分量为蓝色色度颜色分量,第三颜色分量为红色色度颜色分量;或者,第二颜色分量为红色色度颜色分量,第三颜色分量为蓝色色度颜色分量)。
进一步地,对于当前帧来说,当前帧的待滤波分量的帧级语法元素标识信息可以包括第一语法元素标识信息和第三语法元素标识信息。其中,在确定第一语法元素标识信息之前,编码器首先需要确定当前帧的待滤波分量的第三语法元素标识信息。在一些实施例中,所述确定当前帧的待滤波分量的第一语法元素标识信息,可以包括:确定当前帧的待滤波分量的第三语法元素标识信息;在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,确定当前帧的待滤波分量的第一语法元素标识信息。
在本申请实施例中,第三语法元素标识信息用于指示当前帧包括的至少一个划分块的待滤波分量是否全部使用预设网络模型进行滤波,第一语法元素标识信息用于指示当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波。示例性地,如果待滤波分量为亮度颜色分量,那么第一语法元素标识信息可以为ph_nnlf_luma_enable_flag,第三语法元素标识信息可以为ph_nnlf_luma_ctrl_flag;如果待滤波分量为色度颜色分量,那么第一语法元素标识信息可以为ph_nnlf_chroma_enable_flag,第三语法元素标识信息可以为ph_nnlf_chroma_ctrl_flag。也就是说,针对当前帧中不同的颜色分量,对应设置有不同的第一语法元素标识信息和第三语法元素标识信息。
在本申请实施例中,这里可以通过失真方式来确定当前帧包括的至少一个划分块的待滤波分量是否全部使用预设网络模型进行滤波,和/或,通过失真方式来确定当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波。示例性地,这里的失真方式可以为率失真代价方式。在计算不同情况下的率失真代价值之后,根据率失真代价值的大小来确定当前帧包括的至少一个划分块的待滤波分量是否全部使用预设网络模型进行滤波,即确定出第三语法元素标识信息的取值;和/或,根据率失真代价值的大小来确定当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波,即确定出第一语法元素标识信息的取值。
在一种具体的实施例中,该方法还可以包括:
确定当前帧包括的至少一个划分块的待滤波分量全部不使用预设网络模型进行滤波的第一率失真代价值;
确定当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波的第二率失真代价值；
确定当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波的第三率失真代价值;
根据第一率失真代价值、第二率失真代价值和第三率失真代价值,确定当前帧的待滤波分量的帧级语法元素标识信息;其中,帧级语法元素标识信息包括第一语法元素标识信息和第三语法元素标识信息。
需要说明的是,在本申请实施例中,对于当前帧包括的至少一个划分块来说,在待滤波分量下,可能存在三种情况:这至少一个划分块全部使用预设网络模型进行滤波,这至少一个划分块全部不使用预设网络模型进行滤波,这至少一个划分块中存在部分划分块使用预设网络模型进行滤波。
这样,针对上述这三种情况,可以使用率失真代价方式来计算这至少一个划分块的待滤波分量全部不使用预设网络模型进行滤波的第一率失真代价值、这至少一个划分块的待滤波分量全部使用预设网络模型进行滤波的第二率失真代价值、这至少一个划分块中存在部分划分块的待滤波分量允许使用预设网络模型进行滤波的第三率失真代价值;然后根据这三个率失真代价值的大小来确定第一语法元素标识信息的取值和第三语法元素标识信息的取值。
在一些实施例中,对于第三语法元素标识信息来说,所述根据第一率失真代价值、第二率失真代价值和第三率失真代价值,确定当前帧的待滤波分量的帧级语法元素标识信息,可以包括:
若第一率失真代价值、第二率失真代价值和第三率失真代价值中第二率失真代价值最小,则设置第三语法元素标识信息的取值为第一值;
若第一率失真代价值、第二率失真代价值和第三率失真代价值中第一率失真代价值最小或第三率失真代价值最小,则设置第三语法元素标识信息的取值为第二值。
相应地,在一些实施例中,该方法还可以包括:对第三语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
需要说明的是,在本申请实施例中,如果第二率失真代价值最小,这时候意味着当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波,那么可以设置第三语法元素标识信息的取值为第一值;否则,如果第一率失真代价值最小或第三率失真代价值最小,这时候意味着当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波,那么可以设置第三语法元素标识信息的取值为第二值。
还需要说明的是,在本申请实施例中,编码器还可以将第三语法元素标识信息的取值写入码流中,以使得后续解码器可以通过解析码流,就能够确定出第三语法元素标识信息,进而可以确定当前帧包括的至少一个划分块的待滤波分量是否全部使用预设网络模型进行滤波。
在一些实施例中,对于第一语法元素标识信息来说,所述根据第一率失真代价值、第二率失真代价值和第三率失真代价值,确定当前帧的待滤波分量的帧级语法元素标识信息,可以包括:
若第一率失真代价值、第二率失真代价值和第三率失真代价值中第三率失真代价值最小,则设置第一语法元素标识信息的取值为第一值;
若第一率失真代价值、第二率失真代价值和第三率失真代价值中第一率失真代价值最小,则设置第一语法元素标识信息的取值为第二值;
相应地,在一些实施例中,该方法还可以包括:对第一语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
需要说明的是,在本申请实施例中,如果第三率失真代价值最小,这时候意味着当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波,那么可以设置第一语法元素标识信息的取值为第一值;否则,如果第一率失真代价值最小,这时候意味着当前帧包括的至少一个划分块的待滤波分量全部不使用预设网络模型进行滤波,那么可以设置第一语法元素标识信息的取值为第二值。
还需要说明的是,在本申请实施例中,编码器还可以将第一语法元素标识信息的取值写入码流中,以使得后续解码器可以通过解析码流,就能够确定出第一语法元素标识信息,进而可以确定当前帧中是否存在划分块的待滤波分量允许使用预设网络模型进行滤波。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,无论是第一语法元素标识信息,还是第三语法元素标识信息,均可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,这里对此不作具体限定。
示例性地，在本申请实施例中，第一语法元素标识信息或第三语法元素标识信息为一flag时，对于第一值和第二值而言，第一值可以设置为1，第二值可以设置为0；或者，第一值还可以设置为true，第二值还可以设置为false；但是本申请实施例并不作具体限定。
可以理解地,在本申请实施例中,对于第一率失真代价值的计算,所述确定当前帧包括的至少一个划分块的待滤波分量全部不使用预设网络模型进行滤波的第一率失真代价值,可以包括:
确定当前帧包括的至少一个划分块的待滤波分量的原始值，以及确定当前帧包括的至少一个划分块的待滤波分量的重建值；
根据当前帧包括的至少一个划分块的待滤波分量的原始值和当前帧包括的至少一个划分块的待滤波分量的重建值进行率失真代价计算,得到第一率失真代价值。
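上述各率失真代价值的计算可以采用编码优化中通用的率失真代价形式(此为业内常见定义，并非本申请专门给出的公式)：

```latex
J = D + \lambda \cdot R
```

其中，J为率失真代价值，D为失真(例如待滤波分量的原始值与重建值或滤波后重建值之间的误差平方和)，R为对应编码模式所需的比特数，λ为拉格朗日乘子。比较不同滤波配置下的J值即可选出最优配置。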
需要说明的是,编码器首先可以计算当前帧未使用预设网络模型的代价信息,即使用准备作为预设网络模型输入的当前块的重建样本与当前块的原始图像样本来计算出第一率失真代价值,可以用costOrg表示。
还需要说明的是,对于每一个划分块的待滤波分量的重建值而言,在一些实施例中,所述确定当前帧包括的至少一个划分块的待滤波分量的重建值,可以包括:
确定当前帧的待滤波分量的原始图像;
对原始图像进行划分,得到至少一个划分块的待滤波分量的原始值;
对至少一个划分块进行帧内或帧间预测,确定至少一个划分块的待滤波分量的预测值;
根据至少一个划分块的待滤波分量的原始值和至少一个划分块的待滤波分量的预测值,得到至少一个划分块的待滤波分量的初始残差值;
对至少一个划分块的待滤波分量的初始残差值分别进行变换与量化处理,得到至少一个划分块的待滤波分量的目标残差值;
对至少一个划分块的待滤波分量的目标残差值分别进行反量化与反变换处理,得到至少一个划分块的待滤波分量的重建残差值;
根据至少一个划分块的待滤波分量的预测值和至少一个划分块的待滤波分量的重建残差值,确定至少一个划分块的待滤波分量的重建值。
在本申请实施例中,根据至少一个划分块的待滤波分量的预测值和至少一个划分块的待滤波分量的重建残差值,确定至少一个划分块的待滤波分量的重建值,具体可以是对至少一个划分块的待滤波分量的预测值和至少一个划分块的待滤波分量的重建残差值进行加法运算,可以确定至少一个划分块的待滤波分量的重建值。
在本申请实施例中,目标残差值还会写入码流中,以便后续解码器通过解码获得目标残差值,然后通过反量化与反变换处理,可以得到重建残差值,进而能够确定出划分块的待滤波分量的重建值。在一些实施例中,该方法还可以包括:对至少一个划分块的待滤波分量的目标残差值进行编码,将所得到的编码比特写入码流。
还需要说明的是,在本申请实施例中,对于至少一个划分块而言,以当前块为例,首先确定当前块的待滤波分量的预测值;然后根据当前块的待滤波分量的原始值和当前块的待滤波分量的预测值,得到当前块的待滤波分量的初始残差值;再对当前块的待滤波分量的初始残差值进行变换与量化处理,得到当前块的待滤波分量的目标残差值;然后再对当前块的待滤波分量的目标残差值进行反量化与反变换处理,得到当前块的待滤波分量的重建残差值;最后根据当前块的待滤波分量的预测值和当前块的待滤波分量的重建残差值,具体是对当前块的待滤波分量的预测值和当前块的待滤波分量的重建残差值进行加法运算,能够确定出当前块的待滤波分量的重建值。
还可以理解地,在本申请实施例中,对于第二率失真代价值的计算,所述确定当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波的第二率失真代价值,可以包括:
确定至少两种量化参数组合;其中,每一种量化参数组合至少包括一个第一颜色分量的候选量化参数值和一个第二颜色分量的候选量化参数值;
在每一种量化参数组合下,基于预设网络模型对当前帧包括的至少一个划分块的待滤波分量的重建值进行滤波,得到当前帧包括的至少一个划分块的待滤波分量的滤波后重建值;
根据当前帧包括的至少一个划分块的待滤波分量的原始值与当前帧包括的至少一个划分块的待滤波分量的滤波后重建值进行率失真代价计算,得到每一种量化参数组合下的第四率失真代价值;
从所得到的第四率失真代价值中选取最小率失真代价值,根据最小率失真代价值确定第二率失真代价值。
需要说明的是,在本申请实施例中,以四种量化参数组合为例,编码器可以尝试基于预设网络模型的环路滤波技术,分别遍历这四种量化参数组合;使用当前块的重建样本YUV以及量化参数输入到已加载好的预设网络模型当中进行推理,预设网络模型输出当前块的重建样本块。据此将这四种量化参数组合下基于预设网络模型环路滤波后的当前块的重建样本与当前块的原始图像样本计算出第四率失真代价值,分别用costFrame1、costFrame2、costFrame3以及costFrame4表示;从costFrame1、costFrame2、costFrame3以及costFrame4中选择最小率失真代价值,将所选择的第四率失真代价值作为最终的第二率失真代价值,用costFrameBest表示。
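上述从多种量化参数组合中选取最小率失真代价值的过程可以示意如下。其中的函数与数据结构均为说明用途的假设：

```python
def select_best_qp_combo(costs):
    # costs: {(亮度BaseQP, 色度BaseQP): 率失真代价值}
    # 返回最小率失真代价值对应的量化参数组合及其代价值
    best_combo = min(costs, key=costs.get)
    return best_combo, costs[best_combo]
```

编码器据此确定costFrameBest及其对应的量化参数组合，并将该组合(或其在候选集合中的索引)写入码流。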
进一步地，在一些实施例中，该方法还可以包括：将最小率失真代价值对应的量化参数组合作为当前帧的帧量化参数信息；
相应地,在第一率失真代价值、第二率失真代价值和第三率失真代价值中第二率失真代价值最小的情况下,该方法还可以包括:在对第三语法元素标识信息的取值进行编码之后,继续对当前帧的帧量化参数信息进行编码,将所得到的编码比特写入码流。
还需要说明的是,在本申请实施例中,从costFrame1、costFrame2、costFrame3以及costFrame4中选择最小率失真代价值,并将所选择的最小率失真代价值对应的量化参数组合作为当前帧的帧量化参数信息。这样,在对第三语法元素标识信息的取值进行编码之后,还可以继续对当前帧的帧量化参数信息进行编码,然后将其写入码流。
进一步地,在一些实施例中,对于量化参数组合,所述确定至少两种量化参数组合,可以包括:
确定第一量化参数候选集合和第二量化参数候选集合;
遍历第一量化参数候选集合和第二量化参数候选集合,确定至少两种量化参数组合;
其中,第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
需要说明的是,在本申请实施例中,如果存在四种量化参数组合,那么第一量化参数候选集合可以包括两种亮度颜色分量的候选量化参数,第二量化参数候选集合可以包括两种色度颜色分量的候选量化参数;根据两种亮度颜色分量的候选量化参数和两种色度颜色分量的候选量化参数可以组合得到四种量化参数组合。
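以两种亮度候选量化参数与两种色度候选量化参数为例,遍历两个候选集合得到四种量化参数组合的过程可以示意如下(Python草图,候选数值为假设示例):

```python
from itertools import product

def build_qp_combos(luma_candidates, chroma_candidates):
    # 遍历第一量化参数候选集合与第二量化参数候选集合,
    # 每一种组合包含一个亮度候选量化参数值与一个色度候选量化参数值
    return list(product(luma_candidates, chroma_candidates))

combos = build_qp_combos([32, 37], [34, 39])
# 2种亮度候选 × 2种色度候选 = 4种量化参数组合
```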
进一步地,在一些实施例中,所述对当前帧的帧量化参数信息进行编码,将所得到的编码比特写入码流,还可以包括:
根据当前帧的帧量化参数信息,确定第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值;
根据第一量化参数候选集合和第一颜色分量的帧量化参数值,确定第三量化参数索引;
根据第二量化参数候选集合和第二颜色分量的帧量化参数值,确定第四量化参数索引;
对第三量化参数索引和第四量化参数索引进行编码,将所得到的编码比特写入码流。
还需要说明的是,在本申请实施例中,第三量化参数索引用于指示第一颜色分量的帧量化参数值在第一量化参数候选集合中的索引序号,第四量化参数索引用于指示第二颜色分量的帧量化参数值在第二量化参数候选集合中的索引序号。
这样,在根据第一量化参数候选集合和第二量化参数候选集合确定出第三量化参数索引和第四量化参数索引之后,需要将第三量化参数索引和第四量化参数索引写入码流中;从而后续在解码器中无需进行率失真代价计算,可以通过解析码流获取第三量化参数索引和第四量化参数索引,进而确定出当前帧的帧量化参数信息,即第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值。示例性地,亮度颜色分量的帧量化参数值可以用ph_nnlf_luma_baseqp表示,色度颜色分量的帧量化参数值可以用ph_nnlf_chroma_baseqp表示。
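编码端写入量化参数索引、解码端按索引从候选集合还原量化参数值的对应关系,可以用如下Python草图示意(候选集合与数值均为假设示例):

```python
def qp_value_to_index(candidates, qp_value):
    # 编码端:确定量化参数值在候选集合中的索引序号(索引写入码流)
    return candidates.index(qp_value)

def qp_index_to_value(candidates, index):
    # 解码端:解析索引后从候选集合中还原量化参数值,无需率失真代价计算
    return candidates[index]

luma_set = [32, 37]                        # 第一量化参数候选集合(示例)
idx = qp_value_to_index(luma_set, 37)      # 写入码流的量化参数索引
assert qp_index_to_value(luma_set, idx) == 37
```

只要编解码两端约定相同的候选集合,传输一个很小的索引即可还原量化参数值。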
还需要说明的是,在本申请实施例中,对于同一帧来说,第一量化参数候选集合是可以相同的,第二量化参数候选集合可以是相同的;而不同的帧各自对应的第一量化参数候选集合可以不同,不同的帧各自对应的第二量化参数候选集合也可以不同。
S1002:在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息。
需要说明的是,在本申请实施例中,如果第一语法元素标识信息为真,这时候第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波,那么编码器还需要确定当前块的待滤波分量的第二语法元素标识信息。另外,当前帧包括至少一个划分块,这里的当前块具体是指当前待进行环路滤波的划分块,其可以为当前帧包括的至少一个划分块中的任意一个。在这里,当前块可以是当前编码单元、当前预测单元或者当前变换单元,甚至也可以是当前编码树单元等。下面将以当前编码树单元为例进行具体描述。
还需要说明的是,在本申请实施例中,第二语法元素标识信息为编码树单元级语法元素,可以用于指示当前块的待滤波分量是否使用预设网络模型进行滤波。第二语法元素标识信息也可称为编码树单元使用标识位。也就是说,第二语法元素标识信息可以确定当前编码树单元的待滤波分量使用预设网络模型进行滤波,还是当前编码树单元的待滤波分量不使用预设网络模型进行滤波。
示例性地,如果待滤波分量为亮度颜色分量,那么第二语法元素标识信息可以为ctb_nnlf_luma_flag;如果待滤波分量为色度颜色分量,那么第二语法元素标识信息可以为ctb_nnlf_chroma_flag。也就是说,针对当前编码树单元中不同的颜色分量,对应设置有不同的第二语法元素标识信息。具体地,编码器可以先确定待滤波分量的第三语法元素标识信息,在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,编码器还需要继续确定第一语法元素标识信息的取值;只有在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,这时候编码器才会继续确定第二语法元素标识信息的取值。
在一种具体的实施例中,该方法还可以包括:
基于当前帧中的当前块,确定当前块的待滤波分量的原始值和当前块的待滤波分量的重建值;
在至少两种量化参数组合下,基于预设网络模型对当前块的待滤波分量的重建值进行滤波,得到至少两种当前块的待滤波分量的滤波后重建值;
根据当前块的待滤波分量的原始值和当前块的待滤波分量的重建值进行率失真代价计算,得到第五率失真代价值;
根据当前块的待滤波分量的原始值和至少两种当前块的待滤波分量的滤波后重建值分别进行率失真代价计算,得到至少两个第六率失真代价值;
根据第五率失真代价值和至少两个第六率失真代价值,确定当前块的待滤波分量的第二语法元素标识信息。
需要说明的是,在本申请实施例中,对于当前块来说,编码器会尝试编码树单元级的优化选择。其中,针对待滤波分量,不仅需要计算不使用基于预设网络模型的环路滤波情况下的重建样本与当前块的原始样本的第五率失真代价值,可以用costCTUorg表示;还需要计算多种BaseQPluma和BaseQPchroma组合基于预设网络模型的环路滤波情况下的重建样本与当前块的原始样本的第六率失真代价值,可以分别用costCTUnn1、costCTUnn2、costCTUnn3及costCTUnn4表示;然后通过costCTUorg、costCTUnn1、costCTUnn2、costCTUnn3及costCTUnn4的率失真代价值的大小来确定第二语法元素标识信息的取值。
进一步地,在一些实施例中,对于第二语法元素标识信息来说,所述根据第五率失真代价值和至少两个第六率失真代价值,确定当前块的待滤波分量的第二语法元素标识信息,可以包括:
从第五率失真代价值和至少两个第六率失真代价值中选取最小率失真代价值;
若最小率失真代价值为其中一个第六率失真代价值,则设置第二语法元素标识信息的取值为第一值;
若最小率失真代价值为第五率失真代价值,则设置第二语法元素标识信息的取值为第二值。
相应地,在一些实施例中,该方法还可以包括:对第二语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
需要说明的是,在本申请实施例中,如果最小的为第五率失真代价值,这时候意味着当前块的待滤波分量不使用预设网络模型进行滤波,那么可以设置第二语法元素标识信息的取值为第二值;否则,如果最小的为某一第六率失真代价值,这时候意味着当前块的待滤波分量使用预设网络模型进行滤波,那么可以设置第二语法元素标识信息的取值为第一值。
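上述根据第五率失真代价值与各第六率失真代价值决定第二语法元素标识信息取值的逻辑可以示意如下(Python草图,以1/0分别作为第一值/第二值,代价数值为假设示例):

```python
def decide_ctb_flag(cost_ctu_org, costs_ctu_nn):
    """根据第五率失真代价值(不滤波)与至少两个第六率失真代价值(滤波)
    决策块级使用标识位。返回 (标识位取值, 最优量化参数组合索引或None)。"""
    best_nn = min(costs_ctu_nn)
    if best_nn < cost_ctu_org:
        # 最小代价为某一第六率失真代价值:使用预设网络模型滤波,取第一值
        return 1, costs_ctu_nn.index(best_nn)
    # 最小代价为第五率失真代价值:不使用预设网络模型滤波,取第二值
    return 0, None

flag, qp_idx = decide_ctb_flag(50.0, [48.3, 49.1, 51.2, 50.4])
```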
还需要说明的是,在本申请实施例中,编码器还可以将第二语法元素标识信息的取值写入码流中,以使得后续解码器可以通过解析码流,就能够确定出第二语法元素标识信息,进而可以确定当前帧中的当前块是否使用预设网络模型进行滤波。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。具体地,对于第二语法元素标识信息来说,可以是写入在概述(profile)中的参数,也可以是一个标志(flag)的取值,这里对此不作具体限定。
示例性地,在本申请实施例中,第二语法元素标识信息为一flag时,对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;但是本申请实施例并不作具体限定。
S1003:在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。
S1004:确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
需要说明的是,在本申请实施例中,在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,对于当前块的块量化参数信息而言,该方法还可以包括:
在最小率失真代价值为其中一个第六率失真代价值的情况下,将最小率失真代价值对应的量化参数组合作为当前块的块量化参数信息;
相应地,该方法还可以包括:在对第二语法元素标识信息的取值进行编码之后,继续对当前块的块量化参数信息进行编码,将所得到的编码比特写入码流。
在这里,从costCTUorg、costCTUnn1、costCTUnn2、costCTUnn3及costCTUnn4中选择最小率失真代价值,并将所选择的最小率失真代价值对应的BaseQPluma和BaseQPchroma组合作为当前块的块量化参数信息。这样,在对第二语法元素标识信息的取值进行编码之后,还可以继续对当前块的块量化参数信息进行编码,然后将其写入码流。
在一种具体的实施例中,所述对当前块的块量化参数信息进行编码,将所得到的编码比特写入码流,该方法还可以包括:
根据当前块的块量化参数信息,确定第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
根据第一量化参数候选集合和第一颜色分量的块量化参数值,确定第一量化参数索引;其中,第一量化参数索引用于指示第一颜色分量的块量化参数值在第一量化参数候选集合中的索引序号;
根据第二量化参数候选集合和第二颜色分量的块量化参数值,确定第二量化参数索引;其中,第二量化参数索引用于指示第二颜色分量的块量化参数值在第二量化参数候选集合中的索引序号;
对第一量化参数索引和第二量化参数索引进行编码,将所得到的编码比特写入码流。
需要说明的是,在本申请实施例中,第一量化参数候选集合可以包括至少两种亮度颜色分量的候选量化参数,第二量化参数候选集合可以包括至少两种色度颜色分量的候选量化参数。示例性地,如果根据两种亮度颜色分量的候选量化参数和两种色度颜色分量的候选量化参数,那么可以组合得到四种量化参数组合。另外,对于同一帧来说,第一量化参数候选集合是可以相同的,第二量化参数候选集合可以是相同的;而不同的帧各自对应的第一量化参数候选集合可以不同,不同的帧各自对应的第二量化参数候选集合也可以不同。
这样,对于当前块的块量化参数来说,在根据第一量化参数候选集合和第二量化参数候选集合确定出第一量化参数索引和第二量化参数索引之后,需要将第一量化参数索引和第二量化参数索引写入码流中;从而后续在解码器中无需进行率失真代价计算,可以通过解析码流获取第一量化参数索引和第二量化参数索引,进而确定出当前块的块量化参数信息,即第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。示例性地,亮度颜色分量的块量化参数值可以用ctb_nnlf_luma_baseqp表示,色度颜色分量的块量化参数值可以用ctb_nnlf_chroma_baseqp表示。
还可以理解的是,在本申请实施例中,这里引入了新的语法元素,例如待滤波分量的第一语法元素标识信息、第二语法元素标识信息、第三语法元素标识信息等。在一些实施例中,待滤波分量至少包括亮度颜色分量和色度颜色分量;该方法还可以包括:
在当前帧的颜色分量类型为亮度颜色分量时,确定第三语法元素标识信息为当前帧的帧级亮度开关标识信息,第一语法元素标识信息为当前帧的帧级亮度使能标识信息,第二语法元素标识信息为当前块的块级亮度使用标识信息;其中,帧级亮度开关标识信息用于指示当前帧包括的至少一个划分块的亮度颜色分量是否全部使用预设网络模型进行滤波,帧级亮度使能标识信息用于指示当前帧中是否存在划分块的亮度颜色分量允许使用预设网络模型进行滤波,块级亮度使用标识信息用于指示当前块的亮度颜色分量是否使用预设网络模型进行滤波;
在当前帧的颜色分量类型为色度颜色分量时,确定第三语法元素标识信息为当前帧的帧级色度开关标识信息,第一语法元素标识信息为当前帧的帧级色度使能标识信息,第二语法元素标识信息为当前块的块级色度使用标识信息;其中,帧级色度开关标识信息用于指示当前帧包括的至少一个划分块的色度颜色分量是否全部使用预设网络模型进行滤波,帧级色度使能标识信息用于指示当前帧中是否存在划分块的色度颜色分量允许使用预设网络模型进行滤波,块级色度使用标识信息用于指示当前块的色度颜色分量是否使用预设网络模型进行滤波。
在这里,对于亮度颜色分量来说,帧级亮度开关标识信息可以用ph_nnlf_luma_ctrl_flag表示,帧级亮度使能标识信息可以用ph_nnlf_luma_enable_flag表示,块级亮度使用标识信息可以用ctb_nnlf_luma_flag表示;对于色度颜色分量来说,帧级色度开关标识信息可以用ph_nnlf_chroma_ctrl_flag表示,帧级色度使能标识信息可以用ph_nnlf_chroma_enable_flag表示,块级色度使用标识信息可以用ctb_nnlf_chroma_flag表示。
进一步地,在本申请实施例中,还可以设置有序列级语法元素,以便确定当前序列是否允许使用基于神经网络的环路滤波技术,该方法还可以包括:
确定第四语法元素标识信息;
在第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波时,执行确定当前帧的待滤波分量的第三语法元素标识信息的步骤;其中,当前序列包括当前帧。
需要说明的是,在本申请实施例中,第四语法元素标识信息为序列级语法元素,可以用于指示当前序列的待滤波分量是否允许使用预设网络模型进行滤波。其中,第四语法元素标识信息可以用sps_nnlf_enable_flag表示。其中,如果当前序列的亮度颜色分量和色度颜色分量中至少一项允许使用预设网络模型进行滤波,那么意味着sps_nnlf_enable_flag的取值为真,即当前序列的待滤波分量允许使用预设网络模型进行滤波;如果当前序列的亮度颜色分量和色度颜色分量中均不允许使用预设网络模型进行滤波,那么意味着sps_nnlf_enable_flag的取值为假,即当前序列的待滤波分量不允许使用预设网络模型进行滤波。
在一种具体的实施例中,所述确定第四语法元素标识信息,可以包括:
确定当前序列的待滤波分量是否允许使用预设网络模型进行滤波;
若当前序列的待滤波分量允许使用预设网络模型进行滤波,则设置第四语法元素标识信息的取值为第一值;
若当前序列的待滤波分量不允许使用预设网络模型进行滤波,则设置第四语法元素标识信息的取值为第二值;
相应地,该方法还包括:对第四语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
在本申请实施例中,第一值和第二值不同,而且第一值和第二值可以是参数形式,也可以是数字形式。示例性地,在第四语法元素标识信息为一flag时,对于第一值和第二值而言,第一值可以设置为1,第二值可以设置为0;或者,第一值还可以设置为true,第二值还可以设置为false;但是本申请实施例并不作具体限定。
还需要说明的是,第四语法元素标识信息可称为序列级标识位。在编码器中,若该序列级标识位为真,则允许使用基于神经网络的环路滤波技术;若该序列级标识位为假,则不允许使用基于神经网络的环路滤波技术。其中,序列级标识位在编码视频序列时需要写入码流当中。
进一步地,在本申请实施例中,预设网络模型为神经网络模型,且神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
需要说明的是,对于预设网络模型来说,如图7所示,其输入可以包括:待滤波分量的重建值(用rec_yuv表示)、亮度颜色分量的量化参数值(用BaseQPluma表示)和色度颜色分量的量化参数值(用BaseQPchroma表示);其输出可以为:待滤波分量的滤波后重建值(用output_yuv表示)。由于本申请实施例去除了如预测YUV信息、带有划分信息的YUV信息等非重要输入元素,可以减少网络模型推理的计算量,有利于解码端的实现和降低解码时间。另外,还需要说明的是,在本申请实施例中,预设网络模型的输入还可以包括当前帧的量化参数(SliceQP),但是SliceQP不用区分亮度颜色分量和色度颜色分量(图7中未示出)。
这样,本申请实施例提出了一种多BaseQP输入的基于神经网络模型的环路滤波技术,主要思想是亮度颜色分量输入一个通道的BaseQPluma,同时色度颜色分量也输入一个通道的BaseQPchroma,同时保持模型数量为一个不变。如此,本申请实施例在不增加模型数量的前提下,通过增加推理计算量,可以为亮度颜色分量和色度颜色分量提供更多信息,同时使得亮度颜色分量和色度颜色分量拥有更多的选择和适配。
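将重建样本YUV与两个量化参数通道(BaseQPluma与BaseQPchroma)拼接为模型输入的做法可以示意如下(Python草图,函数名与平面尺寸均为假设,实际实现通常基于张量库):

```python
def build_model_input(rec_y, rec_u, rec_v, base_qp_luma, base_qp_chroma, h, w):
    """构造预设网络模型的输入通道:重建样本YUV加上两个量化参数平面。
    亮度量化参数与色度量化参数各占一个通道(数值在整个平面上恒定)。"""
    qp_luma_plane = [[base_qp_luma] * w for _ in range(h)]
    qp_chroma_plane = [[base_qp_chroma] * w for _ in range(h)]
    return [rec_y, rec_u, rec_v, qp_luma_plane, qp_chroma_plane]

h, w = 2, 2
plane = [[0] * w for _ in range(h)]
channels = build_model_input(plane, plane, plane, 32, 39, h, w)
# 共5个输入通道:Y、U、V重建样本与两个量化参数平面
```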
进一步地,在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,该方法还可以包括:确定预设网络模型的输出为当前块的待滤波分量的滤波后重建值。
进一步地,在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,预设网络模型的输出还可以为残差信息。该方法还可以包括:确定预设网络模型的输出为当前块的待滤波分量的第一残差值;
相应地,对于S1004来说,所述确定当前块的待滤波分量的滤波后重建值,可以包括:在通过预设网络模型得到当前块的待滤波分量的第一残差值后,根据当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值,确定当前块的待滤波分量的滤波后重建值。
需要说明的是,在本申请实施例中,预设网络模型的输出可以直接为当前块的待滤波分量的滤波后重建值,或者也可以是当前块的待滤波分量的第一残差值。对于后者而言,编码器还需要对当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值进行加法运算,可以确定出当前块的待滤波分量的滤波后重建值。
还需要说明的是,在本申请实施例中,在预设网络模型的输出端可以增加一个缩放处理,即利用残差缩放因子对待滤波分量的第一残差值进行缩放。因此,在一些实施例中,该方法还可以包括:
确定残差缩放因子;
相应地,对于S1004来说,所述确定当前块的待滤波分量的滤波后重建值,可以包括:
根据残差缩放因子对当前块的待滤波分量的第一残差值进行缩放处理,得到当前块的待滤波分量的第二残差值;
根据当前块的待滤波分量的重建值和当前块的待滤波分量的第二残差值,确定当前块的待滤波分量的滤波后重建值。
进一步地,在一些实施例中,该方法还可以包括:对残差缩放因子进行编码,将所得到的编码比特写入码流。
需要说明的是,预设网络模型的输出若是残差信息,那么需要叠加当前块的重建样本后作为基于预设网络模型的环路滤波工具输出;若预设网络模型的输出是完整的重建样本,那么模型输出即为基于预设网络模型的环路滤波工具输出。然而,在一种可能的实施例中,模型输出一般还需要进行一个缩放处理,以模型输出为残差信息为例,预设网络模型进行推断输出当前块的残差信息,该残差信息进行缩放处理后再叠加当前块的重建样本;而这个残差缩放因子可以是由编码器求得,其需要写入码流传到解码器中,使得后续解码器通过解码即可获得残差缩放因子。
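以模型输出为残差信息为例,残差缩放并叠加重建样本的过程可以示意如下(Python草图,缩放因子与样本数值均为假设示例):

```python
def apply_nn_residual(rec_samples, first_residual, scale=1.0):
    # 第二残差值 = 残差缩放因子 × 第一残差值(缩放处理)
    second_residual = [scale * r for r in first_residual]
    # 滤波后重建值 = 重建值 + 第二残差值
    return [rec + r for rec, r in zip(rec_samples, second_residual)]

filtered = apply_nn_residual([100, 102, 98], [2, -1, 4], scale=0.5)
```

缩放因子由编码端求得并写入码流,解码端解析后按同样方式叠加即可。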
进一步地,在一些实施例中,该方法还可以包括:
遍历当前帧中的至少一个划分块,将每一划分块依次作为当前块,重复执行确定当前块的待滤波分量的第二语法元素标识信息的取值的步骤,以得到至少一个划分块各自对应的滤波后重建值;
根据至少一个划分块各自对应的滤波后重建值,确定当前帧的重建图像。
需要说明的是,对于当前帧而言,当前帧可以包括至少一个划分块。然后遍历这些划分块,将每一划分块依次作为当前块,重复执行本申请实施例的编码方法流程,以得到每一划分块对应的滤波后重建值;根据所得到的这些滤波后重建值可以确定出当前帧的重建图像。另外,需要注意的是,编码器还可以继续遍历其他环路滤波工具,完毕后输出完整的重建图像,具体过程与本申请实施例并不密切相关,故这里不作详细赘述。
进一步地,在一些实施例中,由于视频编码在I帧和B帧的质量要求上存在不同,往往I帧要求较高的编码质量以利于B帧作为参考。故在I帧和B帧上,本申请实施例的编解码方法仅允许B帧使用亮色度分量不同的量化参数(BaseQPluma和BaseQPchroma)输入,而I帧亮色度分量的量化参数输入一致。这样不仅可以降低编解码时间,同时在I帧上可以节省量化参数传输的比特开销,进一步提升压缩效率。
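“仅B帧允许亮色度分量使用不同量化参数输入、I帧保持一致”的规则可以示意如下(Python草图,帧类型与数值均为假设示例):

```python
def frame_qp_inputs(frame_type, base_qp_luma, base_qp_chroma):
    """I帧亮色度量化参数输入保持一致(可节省传输比特),
    仅B帧允许亮度与色度使用不同的BaseQP输入。"""
    if frame_type == 'I':
        return base_qp_luma, base_qp_luma
    return base_qp_luma, base_qp_chroma

assert frame_qp_inputs('I', 32, 39) == (32, 32)
```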
进一步地,在一些实施例中,本申请实施例仅增加了一层色度量化参数作为额外输入,此外还可以对Cb颜色分量和Cr颜色分量分别增加量化参数作为额外输入。
进一步地,在一些实施例中,本申请实施例提出的基于神经网络模型的环路滤波增强方法还可以拓展到其他输入部分,例如边界强度等,本申请实施例不作具体限定。
本实施例提供了一种编码方法,通过确定当前帧的待滤波分量的第一语法元素标识信息;在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。这样,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
在本申请的又一实施例中,基于前述实施例所述的解码方法和编码方法,本申请实施例提出一种多BaseQP输入的基于神经网络的环路滤波技术,主要思想是亮度颜色分量输入一个通道的BaseQPluma,同时色度颜色分量也输入一个通道的BaseQPchroma,而且还保持模型数量为一个不变。本申请实施例在不增加模型数量的前提下,通过增加推理计算量,可以为亮度颜色分量和色度颜色分量提供更多信息,同时编码端也拥有更多选择。
示例性地,图7示出了本申请实施例提供的一种多BaseQP输入的神经网络模型的网络架构示意图。具体来说,预设网络模型以图7所示的神经网络模型为例,编码端可以提供多个BaseQPluma和BaseQPchroma候选组合,编码树单元或者编码单元对每一个候选输入到当前的神经网络模型,推理计算滤波后的重建样本块,并求得对应的率失真代价值。选择率失真代价值最小的候选组合对应的重建样本块作为当前滤波技术的输出样本,并记录该最小率失真代价值对应的BaseQPluma和BaseQPchroma候选组合,通过量化参数索引或者直接二值化的方式写入码流传输到解码端。解码端解析码流,以获得当前编码树单元或编码单元的基于神经网络环路滤波使用标识位以及解析并计算得到上述提到的BaseQPluma和BaseQPchroma候选组合,若使用基于神经网络的环路滤波技术,则确定最终的BaseQPluma和BaseQPchroma候选组合作为当前编码树单元或者编码单元的量化参数输入到神经网络模型当中,获取从神经网络模型输出的重建样本作为当前滤波技术的输出样本。
在一种具体的实施例中,对于编码端,具体过程如下:
编码器遍历帧内或者帧间预测,得到各编码单元的预测块,通过原始图像块与预测块作差即可得到编码单元的残差;残差经由各种变换模式得到频域残差系数,再经过量化、反量化与反变换后得到失真残差信息(即前述实施例所述的重建残差值),将失真残差信息与预测块进行叠加即可得到重建块。待编码完图像后,环路滤波模块以编码树单元级为基本单位对图像进行滤波,本申请实施例的技术方案应用在此处。获取基于神经网络模型的环路滤波允许使用标识位,即sps_nnlf_enable_flag,若该标识位为真,则允许使用基于神经网络模型的环路滤波技术;若该标识位为假,则不允许使用基于神经网络模型的环路滤波技术。序列级允许使用标识位需要在编码视频序列时写入码流当中。
步骤1、若基于神经网络模型的环路滤波的允许使用标志位为真,则编码端尝试基于神经网络模型的环路滤波的技术,即执行步骤2;若基于神经网络模型的环路滤波的允许使用标志位为假,则编码端不尝试基于神经网络模型的环路滤波的技术,即跳过步骤2直接执行步骤3;
步骤2、初始化基于神经网络的环路滤波技术,载入适用于当前帧的神经网络模型。
第一轮:
编码端计算未使用基于神经网络模型的环路滤波技术的代价信息,即使用准备作为神经网络模型输入的编码树单元重建样本与该编码树单元原始图像样本计算出率失真代价值,记为costOrg;
第二轮:
编码端尝试基于神经网络模型的环路滤波技术,分别遍历两种亮度量化参数候选以及两种色度量化参数候选,使用当前编码树单元的重建样本YUV以及量化参数输入到已加载好的神经网络模型当中进行推理,神经网络模型输出当前编码树单元的重建样本块。将各种量化参数组合下的基于神经网络模型环路滤波后的编码树单元重建样本与该编码树单元原始图像样本计算出率失真代价值,分别记为costFrame1,costFrame2,costFrame3以及costFrame4。选择最小代价组合作为第二轮的最优输出,标记代价值为costFrameBest以及记录对应的亮度量化参数和色度量化参数;
第三轮:
编码端尝试编码树单元级的优化选择,第二轮编码端的基于神经网络模型环路滤波的尝试直接默认当前帧所有编码树单元都使用该技术,亮度颜色分量和色度颜色分量各使用一个帧级开关标识位进行控制,而编码树单元级则不需要传输使用标识位。本轮尝试编码树单元级的标识位组合,且每个颜色分量都可以单独控制。编码器遍历编码树单元,计算不使用基于神经网络模型环路滤波情况下的重建样本与当前编码树单元原始样本的率失真代价值,记为costCTUorg;计算多种BaseQPluma和BaseQPchroma的组合基于神经网络模型环路滤波的重建样本与当前编码树单元原始样本的率失真代价值,分别记为costCTUnn1,costCTUnn2,costCTUnn3以及costCTUnn4。
对于亮度颜色分量,若当前亮度颜色分量的costCTUorg比任意亮度颜色分量costCTUnn小,则将该亮度颜色分量的编码树单元级基于神经网络模型环路滤波的使用标识位(ctb_nnlf_luma_flag)置假;否则,将该ctb_nnlf_luma_flag置真,同时记录当前BaseQPluma的量化参数索引。
对于色度颜色分量,若当前色度颜色分量的costCTUorg比任意色度颜色分量costCTUnn小,则将该色度颜色分量的编码树单元级基于神经网络模型环路滤波的使用标识位(ctb_nnlf_chroma_flag)置假;否则,将该ctb_nnlf_chroma_flag置真,同时记录当前BaseQPchroma的量化参数索引。
若当前帧内所有编码树单元都已遍历完毕,则计算该情况下的当前帧重建样本与原始图像样本的率失真代价值,记为costCTUBest;
遍历各颜色分量,若costOrg的值最小,则将该颜色分量对应的帧级基于神经网络模型环路滤波的开关标识位置假,写入码流;若costFrameBest的值最小,则将该颜色分量对应的基于神经网络模型环路滤波的帧级开关标识位(ph_nnlf_luma_ctrl_flag/ph_nnlf_chroma_ctrl_flag)置真,同时将记录的最优BaseQPluma和BaseQPchroma的量化参数组合一并写入码流;若costCTUBest的值最小,则将该颜色分量对应的基于神经网络模型环路滤波的帧级使用标识位置真,帧级开关标识位置假,同时也将第三轮中决策的编码树单元级使用标识位和最优BaseQPluma和BaseQPchroma的量化参数组合一并写入码流。
步骤3、编码器继续尝试其他环路滤波工具,完毕后输出完整的重建图像,具体过程与本申请实施例的技术方案并不相关,故此处不详细阐述。
在另一种具体的实施例中,对于解码端,具体过程如下:
解码端解析序列级标识位,若sps_nnlf_enable_flag为真,则表示当前码流允许使用基于神经网络模型的环路滤波技术,且后续解码过程需要解析相关的语法元素;否则表示当前码流不允许使用基于神经网络模型的环路滤波技术,后续解码过程不需要解析相关语法元素,默认相关语法元素为初始值或置假状态。
步骤1、解码器解析当前帧的语法元素,获取得到基于神经网络模型的帧级开关标识位以及帧级使用标识位,若该帧级标识位不全为否,则执行步骤2;否则,跳过步骤2,执行步骤3。
步骤2、若该帧级开关标识位为真,则表示当前颜色分量下的所有编码树单元都使用基于神经网络模型的环路滤波技术进行滤波,即自动将该颜色分量下当前帧所有编码树单元的编码树单元级使用标识位置真;否则表示当前颜色分量下存在有些编码树单元使用基于神经网络模型的环路滤波技术,也存在有些编码树单元不使用基于神经网络模型的环路滤波技术。故若帧级开关标识位为假,则需要进一步解析该颜色分量下当前帧所有编码树单元的编码树单元级使用标识位(ctb_nnlf_luma_flag/ctb_nnlf_chroma_flag)。其中:
若ph_nnlf_luma_ctrl_flag或ph_nnlf_chroma_ctrl_flag为真,则解析当前帧的BaseQPluma值(ph_nnlf_luma_baseqp)和BaseQPchroma值(ph_nnlf_chroma_baseqp),并作为输入的量化参数信息应用于当前帧对应颜色分量的所有编码树单元。此外,还需将ph_nnlf_luma_enable_flag或ph_nnlf_chroma_enable_flag,以及当前帧内所有的编码树单元使用标识位ctb_nnlf_luma_flag或ctb_nnlf_chroma_flag置真;否则,解析ph_nnlf_luma_enable_flag/ph_nnlf_chroma_enable_flag。
若ph_nnlf_luma_enable_flag/ph_nnlf_chroma_enable_flag均为假,则将当前帧内所有的编码树单元使用标识位置假;否则,解析当前编码树单元的BaseQPluma(ctb_nnlf_luma_baseqp)或BaseQPchroma(ctb_nnlf_chroma_baseqp),以及对应颜色分量的所有编码树单元使用标识位。
若当前编码树单元的所有颜色分量的编码树单元级使用标识位不全为假,则对当前编码树单元使用基于神经网络模型的环路滤波技术进行滤波。以当前编码树单元的重建样本YUV以及量化参数信息(BaseQPluma和BaseQPchroma)作为输入。神经网络模型进行推理,得到当前编码树单元的基于神经网络模型环路滤波后的重建样本YUV。
根据当前编码树单元各颜色分量的编码树单元使用标识位情况,选择重建样本作为基于神经网络模型的环路滤波技术的输出:若对应颜色分量的编码树单元使用标识位为真,则使用上述对应颜色分量的基于神经网络模型环路滤波后的重建样本作为输出;否则,使用未经过基于神经网络模型环路滤波的重建样本作为该颜色分量的输出。
遍历完当前帧所有的编码树单元后,基于神经网络模型的环路滤波模块结束。
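上述解码端自序列级到编码树单元级的标识位判定层级可以用如下Python草图概括(仅为说明性示例,各布尔取值均为假设):

```python
def ctb_use_nnlf(sps_enable, ph_ctrl, ph_enable, ctb_flag):
    """自上而下判断某编码树单元是否使用基于神经网络模型的环路滤波:
    序列级不允许则一律不用;帧级开关为真则全部使用;
    否则由帧级使能与编码树单元级使用标识位共同决定。"""
    if not sps_enable:
        return False
    if ph_ctrl:          # 帧级开关标识位为真:所有编码树单元均滤波
        return True
    if not ph_enable:    # 帧级使能为假:当前帧所有编码树单元均不滤波
        return False
    return ctb_flag      # 否则按编码树单元级使用标识位决定

assert ctb_use_nnlf(True, False, True, True)
```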
步骤3、解码端继续遍历其他环路滤波工具,完毕后输出完整的重建图像,具体过程与本申请实施例的技术方案并不相关,故此处不详细阐述。
在又一种具体的实施例中,解码端的解析流程简述如表1所示,其中,字体加粗表示需要解析的语法元素。
表1
上述所有实施例中未详细介绍残差缩放部分,但不代表本申请实施例不可以使用残差缩放技术。残差缩放技术对神经网络模型的输出使用,具体可以是对神经网络输出的重建样本与原始重建样本作差得到的残差进行缩放,这里不作详细阐述。
在本申请的再一实施例中,本申请实施例提供了一种码流,该码流是根据待编码信息进行比特编码生成的;其中,待编码信息可以包括下述至少一项:当前帧的待滤波分量的第一语法元素标识信息、当前块的待滤波分量的第二语法元素标识信息、当前帧的待滤波分量的第三语法元素标识信息、残差缩放因子和当前帧包括的至少一个划分块的待滤波分量的初始残差值;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个。
在本申请实施例中,通过上述实施例对前述实施例的具体实现进行了详细阐述,根据前述实施例的技术方案,从中可以看出,本申请实施例提出了一种新的神经网络环路滤波模型,使用多通道量化参数作为输入,用于提升编码性能,且引入新的语法元素。如此,在保持仅有一个模型或较少模型的情况下,增加重要输入元素的通道,使得亮度颜色分量和色度颜色分量拥有更多的选择和适配。通过编码端的率失真优化计算,解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能;同时,本技术方案也去除了如预测信息YUV、划分信息YUV等非重要输入元素,减少了网络模型推理的计算量,有利于解码端的实现和降低解码时间。
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图11,其示出了本申请实施例提供的一种编码器的组成结构示意图。如图11所示,该编码器100可以包括:第一确定单元1101和第一滤波单元1102;其中,
第一确定单元1101,配置为确定当前帧的待滤波分量的第一语法元素标识信息;以及在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;以及在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
第一确定单元1101,还配置为确定当前块的待滤波分量的重建值,
第一滤波单元1102,配置为将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,第一确定单元1101,还配置为确定当前帧的待滤波分量的第三语法元素标识信息;以及在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,确定当前帧的待滤波分量的第一语法元素标识信息。
在一些实施例中,第一确定单元1101,还配置为确定当前帧包括的至少一个划分块的待滤波分量全部不使用预设网络模型进行滤波的第一率失真代价值;确定当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波的第二率失真代价值;确定当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波的第三率失真代价值;以及根据第一率失真代价值、第二率失真代价值和第三率失真代价值,确定当前帧的待滤波分量的帧级语法元素标识信息;其中,帧级语法元素标识信息包括第一语法元素标识信息和第三语法元素标识信息。
在一些实施例中,参见图11,编码器100还可以包括第一设置单元1103和编码单元1104;其中,
第一设置单元1103,配置为若第一率失真代价值、第二率失真代价值和第三率失真代价值中第二率失真代价值最小,则设置第三语法元素标识信息的取值为第一值;若第一率失真代价值、第二率失真代价值和第三率失真代价值中第一率失真代价值最小或第三率失真代价值最小,则设置第三语法元素标识信息的取值为第二值;
编码单元1104,配置为对第三语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一设置单元1103,还配置为若第一率失真代价值、第二率失真代价值和第三率失真代价值中第三率失真代价值最小,则设置第一语法元素标识信息的取值为第一值;若第一率失真代价值、第二率失真代价值和第三率失真代价值中第一率失真代价值最小,则设置第一语法元素标识信息的取值为第二值;编码单元1104,还配置为对第一语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为确定当前帧包括的至少一个划分块的待滤波分量的原始值,以及确定当前帧包括的至少一个划分块的待滤波分量的重建值;以及根据当前帧包括的至少一个划分块的待滤波分量的原始值和当前帧包括的至少一个划分块的待滤波分量的重建值进行率失真代价计算,得到第一率失真代价值。
在一些实施例中,第一确定单元1101,还配置为确定当前帧的待滤波分量的原始图像;对原始图像进行划分,得到至少一个划分块的待滤波分量的原始值;对至少一个划分块进行帧内或帧间预测,确定至少一个划分块的待滤波分量的预测值;根据至少一个划分块的待滤波分量的原始值和至少一个划分块的待滤波分量的预测值,得到至少一个划分块的待滤波分量的初始残差值;对至少一个划分块的待滤波分量的初始残差值分别进行变换与量化处理,得到至少一个划分块的待滤波分量的目标残差值;对至少一个划分块的待滤波分量的目标残差值分别进行反量化与反变换处理,得到至少一个划分块的待滤波分量的重建残差值;以及根据至少一个划分块的待滤波分量的预测值和至少一个划分块的待滤波分量的重建残差值,确定至少一个划分块的待滤波分量的重建值。
在一些实施例中,编码单元1104,还配置为对至少一个划分块的待滤波分量的目标残差值进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为确定至少两种量化参数组合;其中,每一种量化参数组合至少包括一个第一颜色分量的候选量化参数值和一个第二颜色分量的候选量化参数值;在每一种量化参数组合下,基于预设网络模型对当前帧包括的至少一个划分块的待滤波分量的重建值进行滤波,得到当前帧包括的至少一个划分块的待滤波分量的滤波后重建值;根据当前帧包括的至少一个划分块的待滤波分量的原始值与当前帧包括的至少一个划分块的待滤波分量的滤波后重建值进行率失真代价计算,得到每一种量化参数组合下的第四率失真代价值;以及从所得到的第四率失真代价值中选取最小率失真代价值,根据最小率失真代价值确定第二率失真代价值。
在一些实施例中,第一确定单元1101,还配置为将最小率失真代价值对应的量化参数组合作为当前帧的帧量化参数信息;
编码单元1104,还配置为在第一率失真代价值、第二率失真代价值和第三率失真代价值中第二率失真代价值最小的情况下,在对第三语法元素标识信息的取值进行编码之后,继续对当前帧的帧量化参数信息进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为确定第一量化参数候选集合和第二量化参数候选集合;遍历第一量化参数候选集合和第二量化参数候选集合,确定至少两种量化参数组合;其中,第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
在一些实施例中,第一确定单元1101,还配置为根据当前帧的帧量化参数信息,确定第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值;根据第一量化参数候选集合和第一颜色分量的帧量化参数值,确定第三量化参数索引;其中,第三量化参数索引用于指示第一颜色分量的帧量化参数值在第一量化参数候选集合中的索引序号;根据第二量化参数候选集合和第二颜色分量的帧量化参数值,确定第四量化参数索引;其中,第四量化参数索引用于指示第二颜色分量的帧量化参数值在第二量化参数候选集合中的索引序号;编码单元1104,还配置为对第三量化参数索引和第四量化参数索引进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为基于当前帧中的当前块,确定当前块的待滤波分量的原始值和当前块的待滤波分量的重建值;在至少两种量化参数组合下,基于预设网络模型对当前块的待滤波分量的重建值进行滤波,得到至少两种当前块的待滤波分量的滤波后重建值;根据当前块的待滤波分量的原始值和当前块的待滤波分量的重建值进行率失真代价计算,得到第五率失真代价值;根据当前块的待滤波分量的原始值和至少两种当前块的待滤波分量的滤波后重建值分别进行率失真代价计算,得到至少两个第六率失真代价值;以及根据第五率失真代价值和至少两个第六率失真代价值,确定当前块的待滤波分量的第二语法元素标识信息。
在一些实施例中,第一确定单元1101,还配置为从第五率失真代价值和至少两个第六率失真代价值中选取最小率失真代价值;若最小率失真代价值为其中一个第六率失真代价值,则设置第二语法元素标识信息的取值为第一值;若最小率失真代价值为第五率失真代价值,则设置第二语法元素标识信息的取值为第二值;编码单元1104,还配置为对第二语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为在最小率失真代价值为其中一个第六率失真代价值的情况下,将最小率失真代价值对应的量化参数组合作为当前块的块量化参数信息;
编码单元1104,还配置为在对第二语法元素标识信息的取值进行编码之后,继续对当前块的块量化参数信息进行编码,将所得到的编码比特写入码流。
在一些实施例中,第一确定单元1101,还配置为根据当前块的块量化参数信息,确定第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;根据第一量化参数候选集合和第一颜色分量的块量化参数值,确定第一量化参数索引;其中,第一量化参数索引用于指示第一颜色分量的块量化参数值在第一量化参数候选集合中的索引序号;根据第二量化参数候选集合和第二颜色分量的块量化参数值,确定第二量化参数索引;其中,第二量化参数索引用于指示第二颜色分量的块量化参数值在第二量化参数候选集合中的索引序号;编码单元1104,还配置为对第一量化参数索引和第二量化参数索引进行编码,将所得到的编码比特写入码流。
在一些实施例中,待滤波分量至少包括亮度颜色分量和色度颜色分量;相应地,第一确定单元1101,还配置为在当前帧的颜色分量类型为亮度颜色分量时,确定第三语法元素标识信息为当前帧的帧级亮度开关标识信息,第一语法元素标识信息为当前帧的帧级亮度使能标识信息,第二语法元素标识信息为当前块的块级亮度使用标识信息;其中,帧级亮度开关标识信息用于指示当前帧包括的至少一个划分块的亮度颜色分量是否全部使用预设网络模型进行滤波,帧级亮度使能标识信息用于指示当前帧中是否存在划分块的亮度颜色分量允许使用预设网络模型进行滤波,块级亮度使用标识信息用于指示当前块的亮度颜色分量是否使用预设网络模型进行滤波;在当前帧的颜色分量类型为色度颜色分量时,确定第三语法元素标识信息为当前帧的帧级色度开关标识信息,第一语法元素标识信息为当前帧的帧级色度使能标识信息,第二语法元素标识信息为当前块的块级色度使用标识信息;其中,帧级色度开关标识信息用于指示当前帧包括的至少一个划分块的色度颜色分量是否全部使用预设网络模型进行滤波,帧级色度使能标识信息用于指示当前帧中是否存在划分块的色度颜色分量允许使用预设网络模型进行滤波,块级色度使用标识信息用于指示当前块的色度颜色分量是否使用预设网络模型进行滤波。
在一些实施例中,第一确定单元1101,还配置为确定第四语法元素标识信息;以及在第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波时,执行确定当前帧的待滤波分量的第三语法元素标识信息的步骤;其中,当前序列包括当前帧。
在一些实施例中,第一确定单元1101,还配置为确定当前序列的待滤波分量是否允许使用预设网络模型进行滤波;若当前序列的待滤波分量允许使用预设网络模型进行滤波,则设置第四语法元素标识信息的取值为第一值;若当前序列的待滤波分量不允许使用预设网络模型进行滤波,则设置第四语法元素标识信息的取值为第二值;编码单元1104,还配置为对第四语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
在一些实施例中,预设网络模型为神经网络模型,且神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和块量化参数信息,相应地,第一滤波单元1102,还配置为确定预设网络模型的输出为当前块的待滤波分量的滤波后重建值。
在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,相应地,第一滤波单元1102,还配置为确定预设网络模型的输出为当前块的待滤波分量的第一残差值;以及根据当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,第一确定单元1101,还配置为确定残差缩放因子;根据残差缩放因子对当前块的待滤波分量的第一残差值进行缩放处理,得到当前块的待滤波分量的第二残差值;以及根据当前块的待滤波分量的重建值和当前块的待滤波分量的第二残差值,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,编码单元1104,还配置为对残差缩放因子进行编码,将所得到的编码比特写入码流。
可以理解地,在本申请实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中,基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或processor(处理器)执行本实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
因此,本申请实施例提供了一种计算机可读存储介质,应用于编码器100,该计算机可读存储介质存储有计算机程序,所述计算机程序被第一处理器执行时实现前述实施例中任一项所述的方法。
基于上述编码器100的组成以及计算机可读存储介质,参见图12,其示出了本申请实施例提供的编码器100的具体硬件结构示意图。如图12所示,编码器100可以包括:第一通信接口1201、第一存储器1202和第一处理器1203;各个组件通过第一总线系统1204耦合在一起。可理解,第一总线系统1204用于实现这些组件之间的连接通信。第一总线系统1204除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图12中将各种总线都标为第一总线系统1204。其中,
第一通信接口1201,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
第一存储器1202,用于存储能够在第一处理器1203上运行的计算机程序;
第一处理器1203,用于在运行所述计算机程序时,执行:
确定当前帧的待滤波分量的第一语法元素标识信息;
在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;
在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
可以理解,本申请实施例中的第一存储器1202可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请描述的系统和方法的第一存储器1202旨在包括但不限于这些和任意其它适合类型的存储器。
而第一处理器1203可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过第一处理器1203中的硬件的集成逻辑电路或者软件形式的指令完成。上述的第一处理器1203可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于第一存储器1202,第一处理器1203读取第一存储器1202中的信息,结合其硬件完成上述方法的步骤。
可以理解的是,本申请描述的这些实施例可以用硬件、软件、固件、中间件、微码或其组合来实现。对于硬件实现,处理单元可以实现在一个或多个专用集成电路(Application Specific Integrated Circuits,ASIC)、数字信号处理器(Digital Signal Processing,DSP)、数字信号处理设备(DSP Device,DSPD)、可编程逻辑设备(Programmable Logic Device,PLD)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)、通用处理器、控制器、微控制器、微处理器、用于执行本申请所述功能的其它电子单元或其组合中。对于软件实现,可通过执行本申请所述功能的模块(例如过程、函数等)来实现本申请所述的技术。软件代码可存储在存储器中并通过处理器执行。存储器可以在处理器中或在处理器外部实现。
可选地,作为另一个实施例,第一处理器1203还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。
本实施例提供了一种编码器,该编码器可以使用多量化参数输入预设网络模型环路滤波技术,其中,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
基于前述实施例相同的发明构思,参见图13,其示出了本申请实施例提供的一种解码器的组成结构示意图。如图13所示,该解码器200可以包括:解码单元1301、第二确定单元1302和第二滤波单元1303;其中,
解码单元1301,配置为解析码流,确定当前帧的待滤波分量的第一语法元素标识信息;以及在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;
第二确定单元1302,配置为在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
第二确定单元1302,还配置为确定当前块的待滤波分量的重建值;
第二滤波单元1303,配置为将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,解码单元1301,还配置为解析码流,确定当前块的第一量化参数索引和第二量化参数索引;
第二确定单元1302,还配置为根据第一量化参数索引,从第一量化参数候选集合中确定当前块对应的第一颜色分量的块量化参数值;以及根据第二量化参数索引,从第二量化参数候选集合中确定当前块对应的第二颜色分量的块量化参数值;其中,第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
在一些实施例中,解码单元1301,还配置为解析码流,确定当前块对应的第一颜色分量的块量化参数值和第二颜色分量的块量化参数值。
在一些实施例中,解码单元1301,还配置为解析码流,确定当前块的待滤波分量的重建残差值;
第二确定单元1302,还配置为对当前块的待滤波分量进行帧内或帧间预测,确定当前块的待滤波分量的预测值;以及根据当前块的待滤波分量的重建残差值和当前块的待滤波分量的预测值,确定当前块的待滤波分量的重建值。
在一些实施例中,第二确定单元1302,还配置为对当前块的待滤波分量的重建残差值和当前块的待滤波分量的预测值进行加法计算,得到当前块的待滤波分量的重建值。
在一些实施例中,解码单元1301,还配置为解析码流,获取第二语法元素标识信息的取值;
第二确定单元1302,还配置为若第二语法元素标识信息的取值为第一值,则确定第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波;若第二语法元素标识信息的取值为第二值,则确定第二语法元素标识信息指示当前块的待滤波分量不使用预设网络模型进行滤波。
在一些实施例中,第二确定单元1302,还配置为在第二语法元素标识信息指示当前块的待滤波分量不使用预设网络模型进行滤波时,将当前块的待滤波分量的重建值直接确定为当前块的待滤波分量的滤波后重建值。
在一些实施例中,解码单元1301,还配置为解析码流,获取第一语法元素标识信息的取值;
第二确定单元1302,还配置为若第一语法元素标识信息的取值为第一值,则确定第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波;若第一语法元素标识信息的取值为第二值,则确定第一语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部不允许使用预设网络模型进行滤波。
在一些实施例中,参见图13,解码器200还可以包括第二设置单元1304,配置为在第一语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部不允许使用预设网络模型进行滤波时,将划分块的待滤波分量的第二语法元素标识信息的取值均设置为第二值;以及在确定划分块的待滤波分量的重建值之后,将划分块的待滤波分量的重建值直接确定为划分块的待滤波分量的滤波后重建值。
在一些实施例中,解码单元1301,还配置为解析码流,确定当前帧的待滤波分量的第三语法元素标识信息;以及在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波时,解析码流,确定当前帧的待滤波分量的第一语法元素标识信息。
在一些实施例中,解码单元1301,还配置为解析码流,获取第三语法元素标识信息的取值;
第二确定单元1302,还配置为若第三语法元素标识信息的取值为第一值,则确定第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波;若第三语法元素标识信息的取值为第二值,则确定第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量未全部使用预设网络模型进行滤波。
在一些实施例中,解码单元1301,还配置为在第三语法元素标识信息指示当前帧包括的至少一个划分块的待滤波分量全部使用预设网络模型进行滤波时,解析码流,确定当前帧的帧量化参数信息;其中,帧量化参数信息至少包括第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值;
第二设置单元1304,还配置为将当前帧的待滤波分量的第一语法元素标识信息的取值设置为第一值,将当前帧中的划分块的待滤波分量的第二语法元素标识信息的取值均设置为第一值,以及根据当前帧的帧量化参数信息确定划分块的块量化参数信息;
第二滤波单元1303,还配置为在确定划分块的待滤波分量的重建值之后,将划分块的待滤波分量的重建值和划分块的块量化参数信息输入到预设网络模型,确定划分块的待滤波分量的滤波后重建值。
在一些实施例中,解码单元1301,还配置为解析码流,确定当前帧的第三量化参数索引和第四量化参数索引;
第二确定单元1302,还配置为根据第三量化参数索引,从第一量化参数候选集合中确定当前帧对应的第一颜色分量的帧量化参数值;以及根据第四量化参数索引,从第二量化参数候选集合中确定当前帧对应的第二颜色分量的帧量化参数值;其中,第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
在一些实施例中,待滤波分量至少包括亮度颜色分量和色度颜色分量;相应地,第二确定单元1302,还配置为在当前帧的颜色分量类型为亮度颜色分量时,确定第三语法元素标识信息为当前帧的帧级亮度开关标识信息,第一语法元素标识信息为当前帧的帧级亮度使能标识信息,第二语法元素标识信息为当前块的块级亮度使用标识信息;其中,帧级亮度开关标识信息用于指示当前帧包括的至少一个划分块的亮度颜色分量是否全部使用预设网络模型进行滤波,帧级亮度使能标识信息用于指示当前帧中是否存在划分块的亮度颜色分量允许使用预设网络模型进行滤波,块级亮度使用标识信息用于指示当前块的亮度颜色分量是否使用预设网络模型进行滤波;在当前帧的颜色分量类型为色度颜色分量时,确定第三语法元素标识信息为当前帧的帧级色度开关标识信息,第一语法元素标识信息为当前帧的帧级色度使能标识信息,第二语法元素标识信息为当前块的块级色度使用标识信息;其中,帧级色度开关标识信息用于指示当前帧包括的至少一个划分块的色度颜色分量是否全部使用预设网络模型进行滤波,帧级色度使能标识信息用于指示当前帧中是否存在划分块的色度颜色分量允许使用预设网络模型进行滤波,块级色度使用标识信息用于指示当前块的色度颜色分量是否使用预设网络模型进行滤波。
在一些实施例中,解码单元1301,还配置为解析码流,确定第四语法元素标识信息;以及在第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波时,执行解析码流,确定当前帧的待滤波分量的第三语法元素标识信息的步骤;其中,当前序列包括当前帧。
在一些实施例中,解码单元1301,还配置为解析码流,获取第四语法元素标识信息的取值;
第二确定单元1302,还配置为若第四语法元素标识信息的取值为第一值,则确定第四语法元素标识信息指示当前序列的待滤波分量允许使用预设网络模型进行滤波;若第四语法元素标识信息的取值为第二值,则确定第四语法元素标识信息指示当前序列的待滤波分量不允许使用预设网络模型进行滤波。
在一些实施例中,预设网络模型为神经网络模型,且神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和块量化参数信息,相应地,第二滤波单元1303,还配置为确定预设网络模型的输出为当前块的待滤波分量的滤波后重建值。
在一些实施例中,预设网络模型的输入为当前块的待滤波分量的重建值和当前块的块量化参数信息,相应地,第二滤波单元1303,还配置为确定预设网络模型的输出为当前块的待滤波分量的第一残差值;根据当前块的待滤波分量的重建值和当前块的待滤波分量的第一残差值,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,解码单元1301,还配置为解析码流,确定残差缩放因子;
第二确定单元1302,还配置为根据残差缩放因子对当前块的待滤波分量的第一残差值进行缩放处理,得到当前块的待滤波分量的第二残差值;根据当前块的待滤波分量的重建值和当前块的待滤波分量的第二残差值,确定当前块的待滤波分量的滤波后重建值。
在一些实施例中,第二确定单元1302,还配置为遍历当前帧中的至少一个划分块,将每一划分块依次作为当前块,重复执行解析码流,确定当前块的待滤波分量的第二语法元素标识信息的取值的步骤,以得到至少一个划分块各自对应的滤波后重建值;以及根据至少一个划分块各自对应的滤波后重建值,确定当前帧的重建图像。
可以理解地,在本实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本实施例提供了一种计算机可读存储介质,应用于解码器200,该计算机可读存储介质存储有计算机程序,所述计算机程序被第二处理器执行时实现前述实施例中任一项所述的方法。
基于上述解码器200的组成以及计算机可读存储介质,参见图14,其示出了本申请实施例提供的解码器200的具体硬件结构示意图。如图14所示,解码器200可以包括:第二通信接口1401、第二存储器1402和第二处理器1403;各个组件通过第二总线系统1404耦合在一起。可理解,第二总线系统1404用于实现这些组件之间的连接通信。第二总线系统1404除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图14中将各种总线都标为第二总线系统1404。其中,
第二通信接口1401,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
第二存储器1402,用于存储能够在第二处理器1403上运行的计算机程序;
第二处理器1403,用于在运行所述计算机程序时,执行:
解析码流,确定当前帧的待滤波分量的第一语法元素标识信息;
在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;
在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;
确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,确定当前块的待滤波分量的滤波后重建值。
可选地,作为另一个实施例,第二处理器1403还配置为在运行所述计算机程序时,执行前述实施例中任一项所述的方法。
可以理解,第二存储器1402与第一存储器1202的硬件功能类似,第二处理器1403与第一处理器1203的硬件功能类似;这里不再详述。
本实施例提供了一种解码器,该解码器可以使用多量化参数输入预设网络模型环路滤波技术,其中,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
在本申请的再一实施例中,参见图15,其示出了本申请实施例提供的一种编解码系统的组成结构示意图。如图15所示,编解码系统150可以包括编码器1501和解码器1502。其中,编码器1501可以为前述实施例中任一项所述的编码器,解码器1502可以为前述实施例中任一项所述的解码器。
在本申请实施例中,该编解码系统150中,无论是编码器1501还是解码器1502,均可以使用多量化参数输入预设网络模型环路滤波技术,其中,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
需要说明的是,在本申请中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
本申请实施例中,无论是在编码端还是解码端,首先确定当前帧的待滤波分量的第一语法元素标识信息;在第一语法元素标识信息指示当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,然后确定当前块的待滤波分量的第二语法元素标识信息;其中,当前帧包括至少一个划分块,且当前块为至少一个划分块中的任意一个;在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,确定当前块的块量化参数信息;其中,块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;再确定当前块的待滤波分量的重建值,将当前块的待滤波分量的重建值和当前块的块量化参数信息输入到预设网络模型,最后可以确定出当前块的待滤波分量的滤波后重建值。这样,对于预设网络模型的输入而言,由于仅包括待滤波分量的重建值和块量化参数信息,去除了颜色分量的预测信息、划分信息等非重要输入元素,可以减少网络模型推理时的计算量,有利于解码端的实现和降低解码时间;另外,由于输入的块量化参数信息至少包括两种颜色分量的块量化参数值,即使用了多通道量化参数作为输入,可以使得亮度颜色分量和色度颜色分量拥有更多的选择和适配;而且通过引入新的语法元素,使得解码端不需要存储多个神经网络模型就可以达到更灵活的配置,有利于提升编码性能,进而还能够提升编解码效率。
Claims (51)
- 一种解码方法,应用于解码器,所述方法包括:解析码流,确定当前帧的待滤波分量的第一语法元素标识信息;在所述第一语法元素标识信息指示所述当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息;其中,所述当前帧包括至少一个划分块,且所述当前块为所述至少一个划分块中的任意一个;在所述第二语法元素标识信息指示所述当前块的待滤波分量使用预设网络模型进行滤波时,确定所述当前块的块量化参数信息;其中,所述块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;确定所述当前块的待滤波分量的重建值,将所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息输入到所述预设网络模型,确定所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求1所述的方法,其中,所述确定所述当前块的块量化参数信息,包括:解析码流,确定所述当前块的第一量化参数索引和第二量化参数索引;根据所述第一量化参数索引,从第一量化参数候选集合中确定所述当前块对应的所述第一颜色分量的块量化参数值;以及根据所述第二量化参数索引,从第二量化参数候选集合中确定所述当前块对应的所述第二颜色分量的块量化参数值;其中,所述第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,所述第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
- 根据权利要求1所述的方法,其中,所述确定所述当前块的块量化参数信息,包括:解析码流,确定所述当前块对应的所述第一颜色分量的块量化参数值和所述第二颜色分量的块量化参数值。
- 根据权利要求1所述的方法,其中,所述确定所述当前块的待滤波分量的重建值,包括:解析码流,确定所述当前块的待滤波分量的重建残差值;对所述当前块的待滤波分量进行帧内或帧间预测,确定所述当前块的待滤波分量的预测值;根据所述当前块的待滤波分量的重建残差值和所述当前块的待滤波分量的预测值,确定所述当前块的待滤波分量的重建值。
- 根据权利要求4所述的方法,其中,所述根据所述当前块的待滤波分量的重建残差值和所述当前块的待滤波分量的预测值,确定所述当前块的待滤波分量的重建值,包括:对所述当前块的待滤波分量的重建残差值和所述当前块的待滤波分量的预测值进行加法计算,得到所述当前块的待滤波分量的重建值。
- 根据权利要求1所述的方法,其中,所述解析码流,确定当前块的待滤波分量的第二语法元素标识信息,包括:解析码流,获取所述第二语法元素标识信息的取值;相应地,所述方法还包括:若所述第二语法元素标识信息的取值为第一值,则确定所述第二语法元素标识信息指示所述当前块的待滤波分量使用所述预设网络模型进行滤波;若所述第二语法元素标识信息的取值为第二值,则确定所述第二语法元素标识信息指示所述当前块的待滤波分量不使用所述预设网络模型进行滤波。
- 根据权利要求6所述的方法,其中,所述解析码流,确定当前块的待滤波分量的第二语法元素标识信息之后,所述方法还包括:在所述第二语法元素标识信息指示所述当前块的待滤波分量不使用预设网络模型进行滤波时,将所述当前块的待滤波分量的重建值直接确定为所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求1所述的方法,其中,所述解析码流,确定当前帧的待滤波分量的第一语法元素标识信息,包括:解析码流,获取所述第一语法元素标识信息的取值;相应地,所述方法还包括:若所述第一语法元素标识信息的取值为第一值,则确定所述第一语法元素标识信息指示所述当前帧中存在划分块的待滤波分量允许使用所述预设网络模型进行滤波;若所述第一语法元素标识信息的取值为第二值,则确定所述第一语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量全部不允许使用所述预设网络模型进行滤波。
- 根据权利要求8所述的方法,其中,所述解析码流,确定当前帧的待滤波分量的第一语法元素标识信息之后,所述方法还包括:在所述第一语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量全部不允许使用所述预设网络模型进行滤波时,将所述划分块的待滤波分量的第二语法元素标识信息的取值均设置为第二值;在确定所述划分块的待滤波分量的重建值之后,将所述划分块的待滤波分量的重建值直接确定为所述划分块的待滤波分量的滤波后重建值。
- 根据权利要求1所述的方法,其中,所述解析码流,确定当前帧的第一语法元素标识信息,包括:解析码流,确定所述当前帧的待滤波分量的第三语法元素标识信息;在所述第三语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量未全部使用所述预设网络模型进行滤波时,解析码流,确定所述当前帧的待滤波分量的第一语法元素标识信息。
- 根据权利要求10所述的方法,其中,所述解析码流,确定所述当前帧的待滤波分量的第三语法元素标识信息,包括:解析码流,获取所述第三语法元素标识信息的取值;相应地,所述方法还包括:若所述第三语法元素标识信息的取值为第一值,则确定所述第三语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量全部使用所述预设网络模型进行滤波;若所述第三语法元素标识信息的取值为第二值,则确定所述第三语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量未全部使用所述预设网络模型进行滤波。
- 根据权利要求10所述的方法,其中,所述解析码流,确定所述当前帧的待滤波分量的第三语法元素标识信息之后,所述方法还包括:在所述第三语法元素标识信息指示所述当前帧包括的至少一个划分块的待滤波分量全部使用所述预设网络模型进行滤波时,解析码流,确定所述当前帧的帧量化参数信息;其中,所述帧量化参数信息至少包括第一颜色分量的帧量化参数值和第二颜色分量的帧量化参数值;将所述当前帧的待滤波分量的第一语法元素标识信息的取值设置为第一值,将所述当前帧中的所述划分块的待滤波分量的第二语法元素标识信息的取值均设置为第一值,以及根据所述当前帧的帧量化参数信息确定所述划分块的块量化参数信息;在确定所述划分块的待滤波分量的重建值之后,将所述划分块的待滤波分量的重建值和所述划分块的块量化参数信息输入到所述预设网络模型,确定所述划分块的待滤波分量的滤波后重建值。
- 根据权利要求12所述的方法,其中,所述解析码流,确定所述当前帧的帧量化参数信息,包括:解析码流,确定所述当前帧的第三量化参数索引和第四量化参数索引;根据所述第三量化参数索引,从第一量化参数候选集合中确定所述当前帧对应的所述第一颜色分量的帧量化参数值;以及根据所述第四量化参数索引,从第二量化参数候选集合中确定所述当前帧对应的所述第二颜色分量的帧量化参数值;其中,所述第一量化参数候选集合是由至少两个第一颜色分量的候选量化参数值组成,所述第二量化参数候选集合是由至少两个第二颜色分量的候选量化参数值组成。
- 根据权利要求10所述的方法,其中,所述待滤波分量至少包括亮度颜色分量和色度颜色分量;所述方法还包括:在所述当前帧的颜色分量类型为亮度颜色分量时,确定所述第三语法元素标识信息为所述当前帧的帧级亮度开关标识信息,所述第一语法元素标识信息为所述当前帧的帧级亮度使能标识信息,所述第二语法元素标识信息为所述当前块的块级亮度使用标识信息;其中,所述帧级亮度开关标识信息用于指示所述当前帧包括的至少一个划分块的亮度颜色分量是否全部使用所述预设网络模型进行滤波,所述帧级亮度使能标识信息用于指示所述当前帧中是否存在划分块的亮度颜色分量允许使用所述预设网络模型进行滤波,所述块级亮度使用标识信息用于指示所述当前块的亮度颜色分量是否使用所述预设网络模型进行滤波;在所述当前帧的颜色分量类型为色度颜色分量时,确定所述第三语法元素标识信息为所述当前帧的帧级色度开关标识信息,所述第一语法元素标识信息为所述当前帧的帧级色度使能标识信息,所述第二语法元素标识信息为所述当前块的块级色度使用标识信息;其中,所述帧级色度开关标识信息用于指示所述当前帧包括的至少一个划分块的色度颜色分量是否全部使用所述预设网络模型进行滤波,所述帧级色度使能标识信息用于指示所述当前帧中是否存在划分块的色度颜色分量允许使用所述预设网络模型进行滤波,所述块级色度使用标识信息用于指示所述当前块的色度颜色分量是否使用所述预设网络模型进行滤波。
- 根据权利要求10所述的方法,其中,所述方法还包括:解析码流,确定第四语法元素标识信息;在所述第四语法元素标识信息指示当前序列的待滤波分量允许使用所述预设网络模型进行滤波时,执行所述解析码流,确定所述当前帧的待滤波分量的第三语法元素标识信息的步骤;其中,所述当前序列包括所述当前帧。
- 根据权利要求15所述的方法,其中,所述解析码流,确定第四语法元素标识信息,包括:解析码流,获取所述第四语法元素标识信息的取值;相应地,所述方法还包括:若所述第四语法元素标识信息的取值为第一值,则确定所述第四语法元素标识信息指示当前序列的待滤波分量允许使用所述预设网络模型进行滤波;若所述第四语法元素标识信息的取值为第二值,则确定所述第四语法元素标识信息指示当前序列的待滤波分量不允许使用所述预设网络模型进行滤波。
- 根据权利要求1所述的方法,其中,所述预设网络模型为神经网络模型,且所述神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
- 根据权利要求1至17任一项所述的方法,其中,所述预设网络模型的输入为所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息,所述方法还包括:确定所述预设网络模型的输出为所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求1至17任一项所述的方法,其中,所述预设网络模型的输入为所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息,所述方法还包括:确定所述预设网络模型的输出为所述当前块的待滤波分量的第一残差值;相应地,所述确定所述当前块的待滤波分量的滤波后重建值,包括:在通过所述预设网络模型得到所述当前块的待滤波分量的第一残差值后,根据所述当前块的待滤波分量的重建值和所述当前块的待滤波分量的第一残差值,确定所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求19所述的方法,其中,所述方法还包括:解析码流,确定残差缩放因子;相应地,所述确定所述当前块的待滤波分量的滤波后重建值,包括:根据所述残差缩放因子对所述当前块的待滤波分量的第一残差值进行缩放处理,得到所述当前块的待滤波分量的第二残差值;根据所述当前块的待滤波分量的重建值和所述当前块的待滤波分量的第二残差值,确定所述当前块的待滤波分量的滤波后重建值。
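Purely as an illustrative sketch outside the claim language, the residual scaling described above amounts to multiplying the model-output first residual by the parsed scaling factor and adding the result to the reconstructed value. The function name, the final clipping step, and the bit depth below are assumptions for illustration, not taken from this application.

```python
import numpy as np

def apply_scaled_residual(recon, first_residual, scale_factor, bit_depth=10):
    """Scale the model-output residual and add it to the reconstruction.

    recon          -- reconstructed values of the to-be-filtered component
    first_residual -- first residual values output by the preset network model
    scale_factor   -- residual scaling factor parsed from the bitstream
    """
    second_residual = scale_factor * first_residual  # second residual value
    filtered = recon + second_residual               # filtered reconstructed value
    # Clip back to the valid sample range (an assumed, common post-step).
    return np.clip(filtered, 0, (1 << bit_depth) - 1)

block = np.array([[512.0, 600.0], [700.0, 800.0]])
residual = np.array([[4.0, -8.0], [2.0, 0.0]])
print(apply_scaled_residual(block, residual, 0.5))
```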
- The method according to claim 1, wherein the method further comprises: traversing the at least one partition block in the current frame, taking each partition block in turn as the current block, and repeatedly performing the step of parsing the bitstream to determine the value of the second syntax element identification information of the to-be-filtered component of the current block, so as to obtain the filtered reconstructed value corresponding to each of the at least one partition block; and determining a reconstructed picture of the current frame according to the filtered reconstructed values corresponding to the at least one partition block.
- An encoding method, applied to an encoder, the method comprising: determining first syntax element identification information of a to-be-filtered component of a current frame; when the first syntax element identification information indicates that the current frame contains a partition block whose to-be-filtered component is allowed to be filtered using a preset network model, determining second syntax element identification information of a to-be-filtered component of a current block, wherein the current frame includes at least one partition block and the current block is any one of the at least one partition block; when the second syntax element identification information indicates that the to-be-filtered component of the current block is filtered using the preset network model, determining block quantization parameter information of the current block, wherein the block quantization parameter information includes at least a block quantization parameter value of a first color component and a block quantization parameter value of a second color component; and determining a reconstructed value of the to-be-filtered component of the current block, and inputting the reconstructed value of the to-be-filtered component of the current block and the block quantization parameter information of the current block into the preset network model to determine a filtered reconstructed value of the to-be-filtered component of the current block.
- The method according to claim 22, wherein said determining the first syntax element identification information of the to-be-filtered component of the current frame comprises: determining third syntax element identification information of the to-be-filtered component of the current frame; and when the third syntax element identification information indicates that the to-be-filtered components of the at least one partition block included in the current frame are not all filtered using the preset network model, determining the first syntax element identification information of the to-be-filtered component of the current frame.
- The method according to claim 23, wherein the method further comprises: determining a first rate-distortion cost value for filtering none of the to-be-filtered components of the at least one partition block included in the current frame using the preset network model; determining a second rate-distortion cost value for filtering all of the to-be-filtered components of the at least one partition block included in the current frame using the preset network model; determining a third rate-distortion cost value for the case where the current frame contains a partition block whose to-be-filtered component is allowed to be filtered using the preset network model; and determining frame-level syntax element identification information of the to-be-filtered component of the current frame according to the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, wherein the frame-level syntax element identification information includes the first syntax element identification information and the third syntax element identification information.
- The method according to claim 24, wherein said determining the frame-level syntax element identification information of the to-be-filtered component of the current frame according to the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value comprises: if the second rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, setting the value of the third syntax element identification information to a first value; if the first rate-distortion cost value or the third rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, setting the value of the third syntax element identification information to a second value; correspondingly, the method further comprises: encoding the value of the third syntax element identification information and writing the resulting encoded bits into a bitstream.
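As an illustrative sketch outside the claim language, the frame-level decision above can be summarized as: the switch flag wins when filtering every block is cheapest, otherwise the enable flag chooses between per-block adaptation and disabling filtering. The function name, and the assumption that "first value" is 1 and "second value" is 0, are illustrative only.

```python
def decide_frame_level_flags(cost_all_off, cost_all_on, cost_block_adaptive):
    """Derive the third (frame-level switch) and first (frame-level enable)
    syntax element flags from three rate-distortion costs.

    cost_all_off        -- first RD cost: no partition block uses the model
    cost_all_on         -- second RD cost: every partition block uses the model
    cost_block_adaptive -- third RD cost: per-block on/off decisions
    """
    # Switch flag is 1 only when filtering every block is cheapest.
    switch_flag = 1 if cost_all_on < min(cost_all_off, cost_block_adaptive) else 0
    # Enable flag is only meaningful when the switch flag is 0:
    # 1 when per-block adaptation wins, 0 when filtering is fully disabled.
    enable_flag = None
    if switch_flag == 0:
        enable_flag = 1 if cost_block_adaptive < cost_all_off else 0
    return switch_flag, enable_flag

print(decide_frame_level_flags(100.0, 90.0, 95.0))  # every-block filtering cheapest
print(decide_frame_level_flags(100.0, 98.0, 95.0))  # per-block adaptation cheapest
```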
- The method according to claim 24, wherein said determining the frame-level syntax element identification information of the to-be-filtered component of the current frame according to the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value comprises: if the third rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, setting the value of the first syntax element identification information to a first value; if the first rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, setting the value of the first syntax element identification information to a second value; correspondingly, the method further comprises: encoding the value of the first syntax element identification information and writing the resulting encoded bits into a bitstream.
- The method according to claim 24, wherein said determining the first rate-distortion cost value for filtering none of the to-be-filtered components of the at least one partition block included in the current frame using the preset network model comprises: determining original values of the to-be-filtered components of the at least one partition block included in the current frame, and determining reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame; and performing a rate-distortion cost calculation according to the original values of the to-be-filtered components of the at least one partition block included in the current frame and the reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame, to obtain the first rate-distortion cost value.
- The method according to claim 27, wherein said determining the reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame comprises: determining an original picture of the to-be-filtered component of the current frame; partitioning the original picture to obtain original values of the to-be-filtered components of at least one partition block; performing intra or inter prediction on the at least one partition block to determine predicted values of the to-be-filtered components of the at least one partition block; obtaining initial residual values of the to-be-filtered components of the at least one partition block according to the original values of the to-be-filtered components of the at least one partition block and the predicted values of the to-be-filtered components of the at least one partition block; performing transformation and quantization on the initial residual values of the to-be-filtered components of the at least one partition block respectively, to obtain target residual values of the to-be-filtered components of the at least one partition block; performing inverse quantization and inverse transformation on the target residual values of the to-be-filtered components of the at least one partition block respectively, to obtain reconstructed residual values of the to-be-filtered components of the at least one partition block; and determining the reconstructed values of the to-be-filtered components of the at least one partition block according to the predicted values of the to-be-filtered components of the at least one partition block and the reconstructed residual values of the to-be-filtered components of the at least one partition block.
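As an illustrative sketch outside the claim language, the reconstruction chain above (predict, form the initial residual, transform/quantize, inverse-quantize/inverse-transform, add back) can be condensed to the arithmetic below; a plain scalar quantizer stands in for the full transform and quantization, and all names are assumptions.

```python
import numpy as np

def reconstruct_block(orig, pred, qstep):
    """Sketch of the reconstruction chain: the initial residual is
    quantized (a scalar quantizer stands in for transform + quantization),
    de-quantized, and added back to the prediction."""
    initial_residual = orig - pred                        # initial residual value
    target_residual = np.round(initial_residual / qstep)  # transform + quantization (stand-in)
    recon_residual = target_residual * qstep              # inverse quantization + inverse transform
    return pred + recon_residual                          # reconstructed value

orig = np.array([100.0, 103.0, 98.0])
pred = np.array([98.0, 100.0, 99.0])
print(reconstruct_block(orig, pred, qstep=2.0))
```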
- The method according to claim 28, wherein the method further comprises: encoding the target residual values of the to-be-filtered components of the at least one partition block, and writing the resulting encoded bits into the bitstream.
- The method according to claim 25, wherein said determining the second rate-distortion cost value for filtering all of the to-be-filtered components of the at least one partition block included in the current frame using the preset network model comprises: determining at least two quantization parameter combinations, wherein each quantization parameter combination includes at least one candidate quantization parameter value of the first color component and one candidate quantization parameter value of the second color component; under each quantization parameter combination, filtering the reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame based on the preset network model, to obtain filtered reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame; performing a rate-distortion cost calculation according to the original values of the to-be-filtered components of the at least one partition block included in the current frame and the filtered reconstructed values of the to-be-filtered components of the at least one partition block included in the current frame, to obtain a fourth rate-distortion cost value under each quantization parameter combination; and selecting the minimum rate-distortion cost value from the obtained fourth rate-distortion cost values, and determining the second rate-distortion cost value according to the minimum rate-distortion cost value.
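As an illustrative sketch outside the claim language, the search above enumerates quantization parameter combinations (the cross product of the two candidate sets, per the traversal claimed later) and keeps the combination with the minimum rate-distortion cost. The function names, candidate values, and the toy stand-in cost below are assumptions.

```python
from itertools import product

def pick_frame_qp_combination(first_candidates, second_candidates, rd_cost):
    """Enumerate every (first-component QP, second-component QP) combination
    from the two candidate sets and return the combination with the minimum
    rate-distortion cost, together with that cost."""
    combos = list(product(first_candidates, second_candidates))
    costs = {combo: rd_cost(*combo) for combo in combos}
    best = min(costs, key=costs.get)
    return best, costs[best]

# Hypothetical stand-in cost, lowest near QP pair (32, 34).
toy_cost = lambda qp1, qp2: abs(qp1 - 32) + abs(qp2 - 34)
best, cost = pick_frame_qp_combination([27, 32, 38], [29, 34, 41], toy_cost)
print(best, cost)
```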
- The method according to claim 30, wherein the method further comprises: taking the quantization parameter combination corresponding to the minimum rate-distortion cost value as the frame quantization parameter information of the current frame; correspondingly, in the case where the second rate-distortion cost value is the smallest among the first rate-distortion cost value, the second rate-distortion cost value and the third rate-distortion cost value, the method further comprises: after encoding the value of the third syntax element identification information, further encoding the frame quantization parameter information of the current frame, and writing the resulting encoded bits into the bitstream.
- The method according to claim 31, wherein said determining the at least two quantization parameter combinations comprises: determining a first quantization parameter candidate set and a second quantization parameter candidate set; and traversing the first quantization parameter candidate set and the second quantization parameter candidate set to determine the at least two quantization parameter combinations; wherein the first quantization parameter candidate set consists of at least two candidate quantization parameter values of the first color component, and the second quantization parameter candidate set consists of at least two candidate quantization parameter values of the second color component.
- The method according to claim 32, wherein said encoding the frame quantization parameter information of the current frame and writing the resulting encoded bits into the bitstream further comprises: determining the frame quantization parameter value of the first color component and the frame quantization parameter value of the second color component according to the frame quantization parameter information of the current frame; determining a third quantization parameter index according to the first quantization parameter candidate set and the frame quantization parameter value of the first color component, wherein the third quantization parameter index indicates the index number of the frame quantization parameter value of the first color component within the first quantization parameter candidate set; determining a fourth quantization parameter index according to the second quantization parameter candidate set and the frame quantization parameter value of the second color component, wherein the fourth quantization parameter index indicates the index number of the frame quantization parameter value of the second color component within the second quantization parameter candidate set; and encoding the third quantization parameter index and the fourth quantization parameter index, and writing the resulting encoded bits into the bitstream.
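As an illustrative sketch outside the claim language: because both sides share the candidate sets, only the index of the chosen quantization parameter value within its set needs to be signalled, and the decoder recovers the value by lookup. Function names and candidate values below are assumptions.

```python
def qp_value_to_index(candidate_set, qp_value):
    """Encoder side: the index number of the chosen QP value within its
    candidate set (this index, not the value itself, is written to the bitstream)."""
    return candidate_set.index(qp_value)

def qp_index_to_value(candidate_set, qp_index):
    """Decoder side: recover the QP value from the parsed index."""
    return candidate_set[qp_index]

first_set = [27, 32, 38]   # first-color-component candidate set (assumed values)
second_set = [29, 34, 41]  # second-color-component candidate set (assumed values)
idx = qp_value_to_index(first_set, 32)
print(idx, qp_index_to_value(first_set, idx))
```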
- The method according to claim 30, wherein the method further comprises: determining, based on the current block in the current frame, the original value of the to-be-filtered component of the current block and the reconstructed value of the to-be-filtered component of the current block; under the at least two quantization parameter combinations, filtering the reconstructed value of the to-be-filtered component of the current block based on the preset network model, to obtain at least two filtered reconstructed values of the to-be-filtered component of the current block; performing a rate-distortion cost calculation according to the original value of the to-be-filtered component of the current block and the reconstructed value of the to-be-filtered component of the current block, to obtain a fifth rate-distortion cost value; performing rate-distortion cost calculations according to the original value of the to-be-filtered component of the current block and the at least two filtered reconstructed values of the to-be-filtered component of the current block respectively, to obtain at least two sixth rate-distortion cost values; and determining the second syntax element identification information of the to-be-filtered component of the current block according to the fifth rate-distortion cost value and the at least two sixth rate-distortion cost values.
- The method according to claim 34, wherein said determining the second syntax element identification information of the to-be-filtered component of the current block according to the fifth rate-distortion cost value and the at least two sixth rate-distortion cost values comprises: selecting the minimum rate-distortion cost value from the fifth rate-distortion cost value and the at least two sixth rate-distortion cost values; if the minimum rate-distortion cost value is one of the sixth rate-distortion cost values, setting the value of the second syntax element identification information to a first value; if the minimum rate-distortion cost value is the fifth rate-distortion cost value, setting the value of the second syntax element identification information to a second value; correspondingly, the method further comprises: encoding the value of the second syntax element identification information and writing the resulting encoded bits into the bitstream.
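As an illustrative sketch outside the claim language, the block-level decision above compares the unfiltered cost against the best filtered cost; when a filtered candidate wins, its quantization parameter combination becomes the block quantization parameter information to signal. The function name, and the assumption that "first value" is 1 and "second value" is 0, are illustrative only.

```python
def decide_block_usage(cost_unfiltered, filtered_costs):
    """cost_unfiltered is the fifth RD cost; filtered_costs maps each
    QP combination to its sixth RD cost. Returns (usage_flag, block_qp)."""
    best_combo = min(filtered_costs, key=filtered_costs.get)
    if filtered_costs[best_combo] < cost_unfiltered:
        return 1, best_combo  # use the model; signal this QP combination
    return 0, None            # keep the plain reconstructed value

print(decide_block_usage(50.0, {(27, 29): 48.0, (32, 34): 45.0}))
print(decide_block_usage(40.0, {(27, 29): 48.0, (32, 34): 45.0}))
```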
- 根据权利要求35所述的方法,其中,在第二语法元素标识信息指示当前块的待滤波分量使用预设网络模型进行滤波时,所述确定当前块的块量化参数信息,包括:在所述最小率失真代价值为其中一个第六率失真代价值的情况下,将所述最小率失真代价值对应的 量化参数组合作为所述当前块的块量化参数信息;相应地,所述方法还包括:在对所述第二语法元素标识信息的取值进行编码之后,继续对所述当前块的块量化参数信息进行编码,将所得到的编码比特写入码流。
- 根据权利要求36所述的方法,其中,所述对所述当前块的块量化参数信息进行编码,将所得到的编码比特写入码流,还包括:根据所述当前块的块量化参数信息,确定第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;根据第一量化参数候选集合和所述第一颜色分量的块量化参数值,确定第一量化参数索引;其中,所述第一量化参数索引用于指示所述第一颜色分量的块量化参数值在所述第一量化参数候选集合中的索引序号;根据第二量化参数候选集合和所述第二颜色分量的块量化参数值,确定第二量化参数索引;其中,所述第二量化参数索引用于指示所述第二颜色分量的块量化参数值在所述第二量化参数候选集合中的索引序号;对所述第一量化参数索引和所述第二量化参数索引进行编码,将所得到的编码比特写入码流。
- 根据权利要求23所述的方法,其中,所述待滤波分量至少包括亮度颜色分量和色度颜色分量;所述方法还包括:在所述当前帧的颜色分量类型为亮度颜色分量时,确定所述第三语法元素标识信息为所述当前帧的帧级亮度开关标识信息,所述第一语法元素标识信息为所述当前帧的帧级亮度使能标识信息,所述第二语法元素标识信息为所述当前块的块级亮度使用标识信息;其中,所述帧级亮度开关标识信息用于指示所述当前帧包括的至少一个划分块的亮度颜色分量是否全部使用所述预设网络模型进行滤波,所述帧级亮度使能标识信息用于指示所述当前帧中是否存在划分块的亮度颜色分量允许使用所述预设网络模型进行滤波,所述块级亮度使用标识信息用于指示所述当前块的亮度颜色分量是否使用所述预设网络模型进行滤波;在所述当前帧的颜色分量类型为色度颜色分量时,确定所述第三语法元素标识信息为所述当前帧的帧级色度开关标识信息,所述第一语法元素标识信息为所述当前帧的帧级色度使能标识信息,所述第二语法元素标识信息为所述当前块的块级色度使用标识信息;其中,所述帧级色度开关标识信息用于指示所述当前帧包括的至少一个划分块的色度颜色分量是否全部使用所述预设网络模型进行滤波,所述帧级色度使能标识信息用于指示所述当前帧中是否存在划分块的色度颜色分量允许使用所述预设网络模型进行滤波,所述块级色度使用标识信息用于指示所述当前块的色度颜色分量是否使用所述预设网络模型进行滤波。
- 根据权利要求23所述的方法,其中,所述方法还包括:确定第四语法元素标识信息;在所述第四语法元素标识信息指示当前序列的待滤波分量允许使用所述预设网络模型进行滤波时,执行所述确定所述当前帧的待滤波分量的第三语法元素标识信息的步骤;其中,所述当前序列包括所述当前帧。
- 根据权利要求39所述的方法,其中,所述确定第四语法元素标识信息,包括:确定当前序列的待滤波分量是否允许使用所述预设网络模型进行滤波;若所述当前序列的待滤波分量允许使用所述预设网络模型进行滤波,则设置所述第四语法元素标识信息的取值为第一值;若所述当前序列的待滤波分量不允许使用所述预设网络模型进行滤波,则设置所述第四语法元素标识信息的取值为第二值;相应地,所述方法还包括:对所述第四语法元素标识信息的取值进行编码,将所得到的编码比特写入码流。
- 根据权利要求22所述的方法,其中,所述预设网络模型为神经网络模型,且所述神经网络模型至少包括:卷积层、激活层、拼接层和跳跃连接层。
- 根据权利要求22至41任一项所述的方法,其中,所述预设网络模型的输入为所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息,所述方法还包括:确定所述预设网络模型的输出为所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求22至41任一项所述的方法,其中,所述预设网络模型的输入为所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息,所述方法还包括:确定所述预设网络模型的输出为所述当前块的待滤波分量的第一残差值;相应地,所述确定所述当前块的待滤波分量的滤波后重建值,包括:在通过所述预设网络模型得到所述当前块的待滤波分量的第一残差值后,根据所述当前块的待滤波分量的重建值和所述当前块的待滤波分量的第一残差值,确定所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求43所述的方法,其中,所述方法还包括:确定残差缩放因子;相应地,所述确定所述当前块的待滤波分量的滤波后重建值,包括:根据所述残差缩放因子对所述当前块的待滤波分量的第一残差值进行缩放处理,得到所述当前块的待滤波分量的第二残差值;根据所述当前块的待滤波分量的重建值和所述当前块的待滤波分量的第二残差值,确定所述当前块的待滤波分量的滤波后重建值。
- 根据权利要求44所述的方法,其中,所述方法还包括:对所述残差缩放因子进行编码,将所得到的编码比特写入码流。
- 一种码流,所述码流是根据待编码信息进行比特编码生成的;其中,所述待编码信息包括下述至少一项:当前帧的待滤波分量的第一语法元素标识信息、当前块的待滤波分量的第二语法元素标识信息、所述当前帧的待滤波分量的第三语法元素标识信息、残差缩放因子和所述当前帧包括的至少一个划分块的待滤波分量的初始残差值;其中,所述当前帧包括至少一个划分块,且所述当前块为所述至少一个划分块中的任意一个。
- 一种编码器,所述编码器包括第一确定单元和第一滤波单元;其中,所述第一确定单元,配置为确定当前帧的待滤波分量的第一语法元素标识信息;以及在所述第一语法元素标识信息指示所述当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,确定当前块的待滤波分量的第二语法元素标识信息;其中,所述当前帧包括至少一个划分块,且所述当前块为所述至少一个划分块中的任意一个;以及在所述第二语法元素标识信息指示所述当前块的待滤波分量使用预设网络模型进行滤波时,确定所述当前块的块量化参数信息;其中,所述块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;所述第一确定单元,还配置为确定所述当前块的待滤波分量的重建值,所述第一滤波单元,配置为将所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息输入到所述预设网络模型,确定所述当前块的待滤波分量的滤波后重建值。
- 一种编码器,所述编码器包括第一存储器和第一处理器;其中,所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;所述第一处理器,用于在运行所述计算机程序时,执行如权利要求22至45任一项所述的方法。
- 一种解码器,所述解码器包括解码单元、第二确定单元和第二滤波单元;其中,所述解码单元,配置为解析码流,确定当前帧的待滤波分量的第一语法元素标识信息;以及在所述第一语法元素标识信息指示所述当前帧中存在划分块的待滤波分量允许使用预设网络模型进行滤波时,解析码流,确定当前块的待滤波分量的第二语法元素标识信息;其中,所述当前帧包括至少一个划分块,且所述当前块为所述至少一个划分块中的任意一个;所述第二确定单元,配置为在所述第二语法元素标识信息指示所述当前块的待滤波分量使用预设网络模型进行滤波时,确定所述当前块的块量化参数信息;其中,所述块量化参数信息至少包括第一颜色分量的块量化参数值和第二颜色分量的块量化参数值;所述第二确定单元,还配置为确定所述当前块的待滤波分量的重建值;所述第二滤波单元,配置为将所述当前块的待滤波分量的重建值和所述当前块的块量化参数信息输入到所述预设网络模型,确定所述当前块的待滤波分量的滤波后重建值。
- 一种解码器,所述解码器包括第二存储器和第二处理器;其中,所述第二存储器,用于存储能够在所述第二处理器上运行的计算机程序;所述第二处理器,用于在运行所述计算机程序时,执行如权利要求1至21任一项所述的方法。
- 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如权利要求1至21任一项所述的方法、或者如权利要求22至45任一项所述的方法。
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/100728 WO2023245544A1 (zh) | 2022-06-23 | 2022-06-23 | Coding and decoding method, bitstream, encoder, decoder and storage medium |
TW112123268A TW202404350A (zh) | 2022-06-23 | 2023-06-20 | Coding and decoding method, encoder, decoder and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/100728 WO2023245544A1 (zh) | 2022-06-23 | 2022-06-23 | Coding and decoding method, bitstream, encoder, decoder and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023245544A1 (zh) | 2023-12-28 |
Family
ID=89378879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/100728 WO2023245544A1 (zh) | 2022-06-23 | 2022-06-23 | 编解码方法、码流、编码器、解码器以及存储介质 |
Country Status (2)
Country | Link |
---|---|
TW (1) | TW202404350A (zh) |
WO (1) | WO2023245544A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112425163A (zh) * | 2018-07-17 | 2021-02-26 | Qualcomm Incorporated | Block-based adaptive loop filter design and signaling |
CN112544081A (zh) * | 2019-12-31 | 2021-03-23 | Peking University | Loop filtering method and apparatus |
US20220103864A1 (en) * | 2020-09-29 | 2022-03-31 | Qualcomm Incorporated | Multiple neural network models for filtering during video coding |
WO2022067805A1 (zh) * | 2020-09-30 | 2022-04-07 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image prediction method, encoder, decoder and computer storage medium |
- 2022-06-23: WO application PCT/CN2022/100728 filed (WO2023245544A1)
- 2023-06-20: TW application TW112123268A filed (TW202404350A)
Also Published As
Publication number | Publication date |
---|---|
TW202404350A (zh) | 2024-01-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102350436B1 | Method and apparatus for collaborative partition coding of region-based filters | |
WO2020177134A1 (zh) | Loop filtering implementation method, apparatus and computer storage medium | |
WO2021203394A1 (zh) | Loop filtering method and apparatus | |
WO2022052533A1 (zh) | Encoding method, decoding method, encoder, decoder and coding system | |
WO2021185008A1 (zh) | Encoding method, decoding method, encoder, decoder and electronic device | |
JP7439841B2 (ja) | In-loop filtering method and in-loop filtering apparatus | |
CN113727106B (zh) | Video encoding and decoding method, apparatus, electronic device and storage medium | |
US20240107015A1 (en) | Encoding method, decoding method, code stream, encoder, decoder and storage medium | |
WO2022227062A1 (zh) | Coding and decoding method, bitstream, encoder, decoder and storage medium | |
WO2022266971A1 (zh) | Coding and decoding method, encoder, decoder and computer storage medium | |
TW202408228A | Filtering method, encoder, decoder, bitstream and storage medium | |
CN114467306A (zh) | Image prediction method, encoder, decoder and storage medium | |
WO2023245544A1 (zh) | Coding and decoding method, bitstream, encoder, decoder and storage medium | |
WO2022227082A1 (zh) | Block partitioning method, encoder, decoder and computer storage medium | |
WO2021143177A1 (zh) | Encoding and decoding method, apparatus and device thereof | |
WO2023197230A1 (zh) | Filtering method, encoder, decoder and storage medium | |
Ghassab et al. | Video Compression Using Convolutional Neural Networks of Video with Chroma Subsampling | |
WO2024077573A1 (zh) | Coding and decoding method, encoder, decoder, bitstream and storage medium | |
WO2023231008A1 (zh) | Coding and decoding method, encoder, decoder and storage medium | |
WO2023070505A1 (zh) | Intra prediction method, decoder, encoder and coding/decoding system | |
WO2022257130A1 (zh) | Coding and decoding method, bitstream, encoder, decoder, system and storage medium | |
WO2023130226A1 (zh) | Filtering method, decoder, encoder and computer-readable storage medium | |
WO2023092404A1 (zh) | Video coding and decoding method, device, system and storage medium | |
WO2024011370A1 (zh) | Video picture processing method and apparatus, codec, bitstream, storage medium | |
WO2024212190A1 (zh) | Coding and decoding method and apparatus, codec, bitstream, device, storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22947316; Country of ref document: EP; Kind code of ref document: A1 |