WO2024007120A1 - Encoding and decoding method, encoder, decoder, and storage medium - Google Patents

Encoding and decoding method, encoder, decoder, and storage medium

Info

Publication number
WO2024007120A1
Authority
WO
WIPO (PCT)
Prior art keywords
lfnst
mip
current block
mode
prediction mode
Prior art date
Application number
PCT/CN2022/103686
Other languages
English (en)
French (fr)
Inventor
谢志煌
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2022/103686 priority Critical patent/WO2024007120A1/zh
Priority to TW112124177A priority patent/TW202404361A/zh
Publication of WO2024007120A1 publication Critical patent/WO2024007120A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular, to a coding and decoding method, an encoder, a decoder, and a storage medium.
  • H.265/High Efficiency Video Coding can no longer meet the needs of the rapid development of video applications.
  • JVET Joint Video Exploration Team
  • VVC Test Model VTM
  • ECM Enhanced Compression Model
  • Decoder side Intra Mode Derivation is the intra prediction technology of ECM.
  • the main core point of this technology is that the intra prediction mode is derived on the decoding end using the same method as the encoding end, so as to achieve the purpose of saving bit overhead.
  • DIMD technology introduces greater complexity in both software and hardware, increasing the compression cost.
  • Embodiments of the present application provide a coding and decoding method, an encoder, a decoder, and a storage medium, which can reduce computational complexity and thereby improve coding efficiency.
  • embodiments of the present application provide a decoding method, which is applied to a decoder.
  • the method includes:
  • when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, decoding the code stream and determining the MIP parameters of the current block;
  • according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determining the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set;
  • using the LFNST transform kernel to transform the transform coefficients.
  • according to the MIP parameters, determining the intra prediction block of the current block, and calculating the residual block between the current block and the intra prediction block;
  • according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determining the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set, and setting the LFNST index number and writing it into the video code stream;
  • using the LFNST transform kernel to transform the residual block.
  • embodiments of the present application provide an encoder, which includes a first determination unit, a coding unit, and a first transformation unit; wherein,
  • the encoding unit is configured to write a video code stream
  • the first transformation unit is further configured to use the LFNST transformation kernel to perform transformation processing on the residual block.
  • embodiments of the present application provide an encoder, which includes a first memory and a first processor; wherein,
  • the first memory is used to store a computer program capable of running on the first processor
  • the first processor is configured to execute the method described in the second aspect when running the computer program.
  • embodiments of the present application provide a decoder, which includes a second determination unit and a second transformation unit; wherein,
  • the second determination unit is configured to decode the code stream and determine the prediction mode parameters; when the prediction mode parameter indicates using MIP to determine the intra-frame prediction value, decode the code stream and determine the MIP parameters of the current block; decode the code stream and determine The transform coefficient and LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameter; according to the mapping mode of the LFNST transform set , select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set;
  • the second transform unit is configured to use the LFNST transform kernel to perform transform processing on the transform coefficients.
  • embodiments of the present application provide a decoder, the decoder including a second memory and a second processor; wherein,
  • the second memory is used to store a computer program capable of running on the second processor
  • the second processor is configured to execute the method described in the first aspect when running the computer program.
  • embodiments of the present application provide a computer-readable storage medium that stores a computer program.
  • When the computer program is executed, the method described in the first aspect or the method described in the second aspect is implemented.
  • Embodiments of the present application provide a coding and decoding method, an encoder, a decoder, and a storage medium.
  • At the decoding end, the code stream is decoded to determine the prediction mode parameters; when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, the code stream is decoded to determine the MIP parameters of the current block; the code stream is decoded to determine the transform coefficients and the LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, and the LFNST transform kernel used by the current block is determined from the selected candidate set; the LFNST transform kernel is then used to transform the transform coefficients.
  • At the encoding end, the prediction mode parameters are determined; when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; based on the MIP parameters, the intra prediction block of the current block is determined, and the residual block between the current block and the intra prediction block is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, the LFNST transform kernel used by the current block is determined from the selected candidate set, and the LFNST index number is set and written into the video code stream; the LFNST transform kernel is then used to transform the residual block.
  • That is, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, one may choose not to use DIMD to derive the mapping mode, which can reduce computational complexity and thereby improve coding efficiency.
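  • The size-dependent choice described above can be sketched as follows. This is a minimal illustration, not the method as claimed: the threshold `max_dimd_size_id`, the fallback to PLANAR for larger blocks, the callable `derive_dimd_mode`, and the convention that an LFNST index of 0 means "LFNST not used" are all assumptions introduced here.

```python
from typing import Callable, List, Optional, Sequence

PLANAR = 0  # assumed numeric value of the PLANAR intra mode


def lfnst_mapping_mode(mip_size_id: int,
                       derive_dimd_mode: Callable[[], int],
                       max_dimd_size_id: int = 1) -> int:
    """Pick the intra mode used to map a MIP-coded block to an LFNST set.

    Small blocks (hypothetical rule: mip_size_id <= max_dimd_size_id) run the
    DIMD-style derivation; larger blocks skip it and fall back to PLANAR,
    which avoids the gradient analysis entirely.
    """
    if mip_size_id <= max_dimd_size_id:
        return derive_dimd_mode()
    return PLANAR


def select_lfnst_kernel(candidate_sets: Sequence[Sequence[str]],
                        set_idx: int,
                        lfnst_idx: int) -> Optional[str]:
    """Return the kernel chosen by (set_idx, lfnst_idx), or None if LFNST is off."""
    if lfnst_idx == 0:  # assumption: index 0 signals that LFNST is not applied
        return None
    return candidate_sets[set_idx][lfnst_idx - 1]
```

  • Passing the derivation as a callable keeps the costly step lazy, so it only runs for the block sizes that are allowed to use it.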
  • Figure 1 is a schematic diagram of matrix-based intra prediction technology
  • Figure 2 is a correspondence table between intra prediction modes and transformation sets
  • FIG. 4 is a block diagram of a video coding system provided by an embodiment of the present application.
  • Figure 6 is a schematic flow chart of the implementation of the decoding method proposed in the embodiment of the present application.
  • Figure 7 is a schematic flow chart of the implementation of the encoding method proposed in the embodiment of the present application.
  • Figure 8 is a schematic diagram of the structure of the encoder
  • Figure 10 is a schematic diagram of the structure of the decoder
  • Figure 11 is a schematic diagram 2 of the structure of the decoder.
  • the first image component, the second image component and the third image component are generally used to represent the coding block (CB); these three image components are a luminance component, a blue chrominance component and a red chrominance component, respectively.
  • the luminance component is usually represented by the symbol Y
  • the blue chroma component is usually represented by the symbol Cb or U
  • the red chroma component is usually represented by the symbol Cr or V; in this way, the video image can be represented in the YCbCr format or in the YUV format.
  • the first image component may be a brightness component
  • the second image component may be a blue chrominance component
  • the third image component may be a red chrominance component, but this is not specifically limited in the embodiment of the present application.
  • Each frame in the video image is divided into square largest coding units (Largest Coding Unit, LCU) or coding tree units (Coding Tree Unit, CTU) of the same size (such as 128×128, 64×64, etc.); each largest coding unit or coding tree unit can be further divided into rectangular coding units (Coding Unit, CU) according to rules; and a coding unit may be further divided into smaller prediction units (Prediction Unit, PU), transform units (Transform Unit, TU), etc.
  • the hybrid coding framework can include modules such as Prediction, Transform, Quantization, Entropy coding, and Inloop Filter.
  • the prediction module can include intra prediction (Intra Prediction) and inter prediction (Inter Prediction), and inter prediction can include motion estimation (Motion Estimation) and motion compensation (Motion Compensation). Since there is a strong correlation between adjacent pixels within a frame of a video image, intra prediction is used in video encoding and decoding to eliminate the spatial redundancy between adjacent pixels. Since there is also a strong similarity between adjacent frames, inter prediction is used to eliminate the temporal redundancy between adjacent frames, thereby improving encoding and decoding efficiency.
  • the basic process of the video codec is as follows: at the encoding end, a frame of image is divided into blocks, intra prediction or inter prediction is used for the current block to generate the prediction block of the current block, and the prediction block is subtracted from the original block of the current block to obtain the residual block; the residual block is transformed and quantized to obtain a quantized coefficient matrix, and the quantized coefficient matrix is entropy-encoded and output to the code stream.
  • At the decoding end, intra prediction or inter prediction is used for the current block to generate the prediction block of the current block.
  • the code stream is decoded to obtain the quantized coefficient matrix.
  • the quantized coefficient matrix is inversely quantized and inversely transformed to obtain the residual block.
  • the prediction block and the residual block are added to obtain the reconstructed block.
  • Reconstructed blocks form a reconstructed image, and loop filtering is performed on the reconstructed image on a picture or block basis to obtain a decoded image.
  • the encoding end also needs similar operations as the decoding end to obtain the decoded image.
  • the decoded image can be used as a reference frame for inter-frame prediction for subsequent frames.
  • the block division information and the mode or parameter information for prediction, transformation, quantization, entropy coding, loop filtering, etc. determined by the encoding end are output to the code stream when necessary.
  • by parsing the code stream and analyzing existing information, the decoding end determines the same block division information and the same mode or parameter information for prediction, transformation, quantization, entropy coding, loop filtering, etc. as the encoding end, thereby ensuring that the decoded image obtained at the encoding end is the same as the decoded image obtained at the decoding end.
  • the decoded image obtained at the encoding end is usually also called a reconstructed image.
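  • The predict, subtract, quantize and reconstruct loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the transform is omitted (treated as identity), quantization is a plain uniform scalar quantizer with a hypothetical step `qstep`, and entropy coding and loop filtering are left out.

```python
from typing import List

Block = List[List[int]]


def encode_block(orig: Block, pred: Block, qstep: int) -> Block:
    """Residual = original - prediction, followed by uniform quantization."""
    return [[round((o - p) / qstep) for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(orig, pred)]


def reconstruct_block(levels: Block, pred: Block, qstep: int) -> Block:
    """Inverse quantization, then prediction + residual = reconstructed block."""
    return [[p + lvl * qstep for lvl, p in zip(row_l, row_p)]
            for row_l, row_p in zip(levels, pred)]
```

  • Both the encoder and the decoder would call `reconstruct_block` on the same quantized levels, which is why the two ends obtain the same reconstructed image.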
  • the current block can be divided into prediction units during prediction, and the current block can be divided into transformation units during transformation.
  • the divisions of prediction units and transformation units can be different.
  • the current block can be the current coding unit (CU) or the current prediction unit (PU), etc.
  • ECM Enhanced Compression Model
  • MIP matrix-based Intra Prediction
  • Matrix-based intra prediction technology also known as MIP technology, can be divided into three main steps, namely downsampling, matrix multiplication and upsampling.
  • the first step is to downsample the spatially adjacent reconstructed samples, and obtain the downsampled sample sequence as the input vector of the second step;
  • the second step is to use the output vector of the first step as the input, multiply it by the preset matrix, add the bias vector, and output the calculated sample vector;
  • the third step uses the output vector of the second step as input to upsample into the final prediction block.
  • Figure 1 is a schematic diagram of the matrix-based intra prediction technology. The above process is shown in Figure 1.
  • the MIP technology obtains the upper adjacent downsampled reconstructed sample vector by averaging the adjacent reconstructed samples above the current coding unit.
  • the left adjacent downsampled reconstructed sample vector is obtained by averaging the left adjacent reconstructed samples.
  • the upper vector and the left vector are used as the input of the matrix-vector multiplication in the second step.
  • A_k is a preset matrix
  • b_k is a preset bias vector
  • k is the MIP mode index.
  • the third step performs linear interpolation upsampling on the results obtained in the second step to obtain a prediction sample block that is consistent with the actual number of coding unit samples.
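  • The three MIP steps above can be sketched as follows. This is a simplified illustration, assuming integer samples, an arbitrary caller-supplied matrix and bias in place of the trained A_k and b_k, and a one-dimensional linear interpolation that starts from one boundary reference sample; the fixed-point shifts and offsets of a real codec are omitted, and all function names are hypothetical.

```python
from typing import List, Sequence


def downsample_boundary(samples: Sequence[int], out_len: int) -> List[int]:
    """Step 1: average equal-sized groups of reconstructed boundary samples."""
    group = len(samples) // out_len
    return [(sum(samples[i * group:(i + 1) * group]) + group // 2) // group
            for i in range(out_len)]


def mip_matrix_step(top: Sequence[int], left: Sequence[int],
                    matrix: Sequence[Sequence[int]],
                    bias: Sequence[int]) -> List[int]:
    """Step 2: multiply the (top, left) input vector by the matrix, add the bias."""
    vec = list(top) + list(left)
    return [sum(a * v for a, v in zip(row, vec)) + b
            for row, b in zip(matrix, bias)]


def upsample_row(ref: int, values: Sequence[int], factor: int) -> List[int]:
    """Step 3 (one row): linear interpolation between predicted samples,
    starting from the boundary reference sample `ref`."""
    out = []
    for x in range(len(values) * factor):
        i, k = x // factor, x % factor + 1
        prev = ref if i == 0 else values[i - 1]
        out.append(((factor - k) * prev + k * values[i] + factor // 2) // factor)
    return out
```

  • For example, `upsample_row(0, [4, 8], 2)` fills in the samples between the boundary and the two predicted values, keeping the predicted values at every second position.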
  • For coding units of different block sizes, the number of MIP modes differs. Taking H.266/VVC as an example: for a 4×4 coding unit, MIP has 16 prediction modes; for an 8×8 coding unit, or a coding unit whose width or height equals 4, MIP has 8 prediction modes; for coding units of other sizes, MIP has 6 prediction modes.
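  • The size classes above can be expressed as a small lookup. This sketch assumes the H.266/VVC reading in which the second class covers 8×8 blocks and blocks whose width or height equals 4; the numeric class values 0, 1 and 2 and the function names are assumptions for illustration.

```python
MIP_MODE_COUNT = {0: 16, 1: 8, 2: 6}  # mode counts per size class, as stated in the text


def mip_size_id(width: int, height: int) -> int:
    """Classify a block into one of three MIP size classes."""
    if width == 4 and height == 4:
        return 0
    if width == 4 or height == 4 or (width == 8 and height == 8):
        return 1
    return 2


def mip_num_modes(width: int, height: int) -> int:
    """Number of MIP prediction modes available for a block of this size."""
    return MIP_MODE_COUNT[mip_size_id(width, height)]
```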
  • MIP technology has a transposition function. For prediction modes that meet the current size, MIP will try to transpose calculations on the encoding side. If transposition is required, the order of the input vectors on the upper side and the left side of the input is reversed, and the output is reversed after matrix calculation.
  • MIP not only requires a flag bit to indicate whether the current coding unit uses MIP technology, but also, if the current coding unit uses MIP technology, an additional transposition flag bit and MIP mode index need to be transmitted to the decoder.
  • the transposition flag of MIP is binarized by fixed-length encoding (Fixed Length, FL), and the length is 1.
  • the mode index of MIP is binarized by truncated binary encoding (Truncated Binary, TB).
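  • The two binarizations named above can be sketched generically. This is a textbook illustration of fixed-length and truncated binary codes for an alphabet of `n` symbols, not an excerpt from any codec; the function names are hypothetical.

```python
def fixed_length(value: int, length: int) -> str:
    """Fixed-length (FL) binarization: `length` bits, most significant bit first."""
    return format(value, f"0{length}b")


def truncated_binary(value: int, n: int) -> str:
    """Truncated binary (TB) binarization of value in [0, n).

    With k = floor(log2(n)) and u = 2^(k+1) - n, the first u symbols use
    k bits and the remaining symbols use k + 1 bits.
    """
    k = n.bit_length() - 1
    u = (1 << (k + 1)) - n
    if value < u:
        return format(value, f"0{k}b") if k > 0 else ""
    return format(value + u, f"0{k + 1}b")
```

  • With 6 modes, the first two indices take 2 bits and the remaining four take 3 bits; with 8 or 16 modes, every index takes 3 or 4 bits respectively, so TB degenerates to a fixed-length code.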
  • LFNST Low-Frequency Non-Separable Transform
  • LFNST is applied between the forward primary transform and quantization on the encoding side, and between inverse quantization and the inverse primary transform on the decoding side. After the residual of the current coding block undergoes the primary transform, coefficients in the frequency domain are obtained. LFNST then applies a secondary transform to a subset of these frequency-domain coefficients, producing coefficients in another domain, after which quantization and entropy coding are performed. LFNST further removes statistical redundancy and performs well on VTM, the reference software of VVC.
  • LFNST mainly performs a secondary transform on the 4×4 or 8×8 area in the upper left corner of the transform block.
  • the transform kernels of LFNST are classified into 4 transform sets in VVC, and each transform set has 2 candidate transform kernels.
  • In ECM, the transform kernels of LFNST have been expanded from the original 4 transform sets to 35 transform sets, and from the original 2 candidate transform kernels per transform set to 3 candidate transform kernels per transform set.
  • LFNST allows for intra prediction and inter prediction.
  • In intra prediction, LFNST selects the transform set according to the intra prediction mode, which saves bit overhead. Since intra prediction usually has a corresponding intra prediction mode, that is, DC mode, PLANAR mode or an angular prediction mode, these intra prediction modes are bound to the LFNST transform sets. For example, in VVC, DC mode and PLANAR mode correspond to the first transform set, as shown in Table 1 below.
  • predModeIntra can be the intra prediction mode indicator
  • SetIdx can be the LFNST index sequence number.
  • the value of the LFNST index number is set to an index number indicating that the current block uses LFNST and the LFNST transform core is in the LFNST transform core candidate set. For example, if the LFNST transformation set includes four transformation core candidate sets (set0, set1, set2, set3), the values corresponding to SetIdx are 0, 1, 2, and 3 respectively.
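  • The binding between intra prediction modes and the four VVC transform sets can be sketched as a range lookup. Table 1 is not reproduced in this text, so the ranges below are an assumption based on the mapping commonly documented for H.266/VVC and should be checked against the specification; only the statement that DC and PLANAR map to the first set comes from the text above.

```python
def lfnst_set_idx(pred_mode_intra: int) -> int:
    """Map an intra prediction mode to one of four LFNST transform sets.

    Assumed ranges (to be verified against the H.266/VVC specification):
    negative wide-angle modes -> 1, PLANAR(0)/DC(1) -> 0, 2..12 -> 1,
    13..23 -> 2, 24..44 -> 3, 45..55 -> 2, 56..80 -> 1, 81..83 -> 0.
    """
    if pred_mode_intra < 0:
        return 1
    if pred_mode_intra <= 1:
        return 0
    if pred_mode_intra <= 12:
        return 1
    if pred_mode_intra <= 23:
        return 2
    if pred_mode_intra <= 44:
        return 3
    if pred_mode_intra <= 55:
        return 2
    if pred_mode_intra <= 80:
        return 1
    return 0
```

  • The mapping is symmetric around the diagonal mode, so modes mirrored across it share a transform set.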
  • Figure 2 shows the correspondence table between intra prediction modes and transform sets. As shown in Figure 2, the expanded correspondence has 35 transform sets.
  • DIMD decoder side Intra Mode Derivation
  • DIMD is the intra-frame prediction technology of ECM, which is not available in VVC.
  • the main core point of this technology is that the intra prediction mode is derived on the decoding end using the same method as the encoding end, thereby avoiding transmitting the intra prediction mode index of the current coding unit in the code stream, thereby saving bit overhead.
  • the specific approach is divided into two main steps. The first step is to derive the prediction mode, and use the same prediction mode strength calculation method on the encoding and decoding side.
  • the encoding end uses the Sobel operator to count the gradient histogram (histogram of gradients) in each prediction mode.
  • the area of effect is the three rows of adjacent reconstructed samples above the current block, the three adjacent columns of reconstructed samples on the left, and the corresponding adjacent ones in the upper left.
  • the decoding end uses the same steps to derive the first prediction mode and the second prediction mode; the second step is to derive the prediction block, and the encoding and decoding end uses the same prediction block derivation method to obtain the current prediction block.
  • the encoding end determines the following two conditions: 1.
  • the gradient of the second prediction mode is not 0; 2.
  • Specifically, the PLANAR mode takes 1/3 of the total weight. The remaining 2/3 is split between the first and second prediction modes in proportion to their gradient strengths: the first prediction mode receives 2/3 multiplied by the ratio of its gradient strength to the sum of the gradient strengths of the first and second prediction modes, and the second prediction mode receives 2/3 multiplied by the ratio of its gradient strength to that sum.
  • the above three prediction modes namely PLANAR, the first prediction mode and the second prediction mode, are weighted and averaged to obtain the prediction block of the current coding unit.
  • the decoder uses the same steps to obtain the prediction block.
  • Figure 3 is a schematic diagram of the intra-mode derivation technology at the decoding end. The above specific operation process is shown in Figure 3.
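  • The gradient-histogram derivation can be sketched as follows. This is a simplified illustration: it applies 3×3 Sobel operators to a small sample window and accumulates gradient amplitude into a hypothetical set of `num_bins` orientation bins, whereas ECM maps each gradient to an actual angular intra mode over the template region described above. The bin layout and function names are assumptions.

```python
import math
from typing import List, Tuple

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient_histogram(samples: List[List[int]], num_bins: int = 8) -> List[int]:
    """Accumulate |gx| + |gy| per orientation bin over all interior samples."""
    hist = [0] * num_bins
    for y in range(1, len(samples) - 1):
        for x in range(1, len(samples[0]) - 1):
            gx = sum(SOBEL_X[j][i] * samples[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * samples[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            amp = abs(gx) + abs(gy)
            if amp == 0:
                continue  # flat area: no orientation information
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            idx = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[idx] += amp
    return hist


def two_strongest_bins(hist: List[int]) -> Tuple[int, int]:
    """Return the bins with the largest and second largest amplitude."""
    order = sorted(range(len(hist)), key=lambda k: hist[k], reverse=True)
    return order[0], order[1]
```

  • Because the same reconstructed samples are available at both ends, the encoder and the decoder would obtain the same histogram and therefore the same two modes without any mode index being transmitted.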
  • the specific weight calculation method is as follows:
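  • $\mathrm{Weight}(\mathrm{PLANAR}) = \frac{1}{3}$ (1)
  • $\mathrm{Weight}(\mathrm{mode1}) = \frac{2}{3} \times \frac{amp1}{amp1 + amp2}$ (2)
  • $\mathrm{Weight}(\mathrm{mode2}) = 1 - \mathrm{Weight}(\mathrm{PLANAR}) - \mathrm{Weight}(\mathrm{mode1})$ (3)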
  • mode1 and mode2 respectively represent the first prediction mode and the second prediction mode
  • amp1 and amp2 respectively represent the gradient amplitude value of the first prediction mode and the gradient amplitude value of the second prediction mode.
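  • The weight calculation can be checked with exact fractions. This is a minimal sketch of the weighting rule as described in the text (PLANAR takes 1/3, the first mode takes 2/3 of its share of the gradient amplitude, the second mode takes the remainder); it assumes amp1 + amp2 is greater than 0, and a real codec would use integer arithmetic instead. The function names are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple


def dimd_weights(amp1: int, amp2: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Return (Weight(PLANAR), Weight(mode1), Weight(mode2))."""
    w_planar = Fraction(1, 3)
    w_mode1 = Fraction(2, 3) * Fraction(amp1, amp1 + amp2)
    w_mode2 = 1 - w_planar - w_mode1
    return w_planar, w_mode1, w_mode2


def blend(planar: List[List[int]], pred1: List[List[int]],
          pred2: List[List[int]], amp1: int, amp2: int) -> List[List[int]]:
    """Weighted average of the three prediction blocks, rounded to integers."""
    wp, w1, w2 = dimd_weights(amp1, amp2)
    return [[round(wp * a + w1 * b + w2 * c) for a, b, c in zip(ra, rb, rc)]
            for ra, rb, rc in zip(planar, pred1, pred2)]
```

  • With amp1 = 3 and amp2 = 1, the weights are 1/3, 1/2 and 1/6, which sum to 1 as equation (3) requires.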
  • With DIMD technology, a flag bit needs to be transmitted to the decoder to indicate whether the current coding unit uses DIMD technology.
  • MIP prediction modes are defaulted to PLANAR mode before mapping to the transformation set process of LFNST. This is because LFNST uses the intra prediction mode as the training input in the early stage of design, and obtains the transformation kernel coefficients of LFNST through deep learning training.
  • the MIP prediction mode is expressed differently from the traditional intra prediction mode.
  • the MIP prediction mode represents a certain set of prediction matrix coefficients, whereas traditional prediction modes represent directionality.
  • the prediction results of MIP are similar to the traditional PLANAR mode, so all MIP prediction modes use PLANAR to map to the LFNST transformation set.
  • the code stream is decoded to determine the prediction mode parameters; when the prediction mode parameter indicates that MIP is used to determine the intra-frame prediction value, the code stream is decoded to determine the MIP parameters of the current block.
  • At the encoding end, the prediction mode parameters are determined; when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; based on the MIP parameters, the intra prediction block of the current block is determined, and the residual block between the current block and the intra prediction block is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, the LFNST transform kernel used by the current block is determined from the selected candidate set, and the LFNST index number is set and written into the video code stream; the LFNST transform kernel is then used to transform the residual block.
  • That is, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, one may choose not to use DIMD to derive the mapping mode, which can reduce computational complexity and thereby improve coding efficiency.
  • the video coding system 10 includes a transformation and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transformation and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image cache unit 110, etc. The filtering unit 108 can implement deblocking filtering and Sample Adaptive Offset (SAO) filtering, and the encoding unit 109 can implement header information encoding and Context-based Adaptive Binary Arithmetic Coding (CABAC).
  • CABAC Context-based Adaptive Binary Arithmetic Coding
  • a video coding block can be obtained by dividing the coding tree unit (Coding Tree Unit, CTU); the residual pixel information obtained after intra-frame or inter-frame prediction is then processed by the transformation and quantization unit 101, which transforms the video coding block, including transforming the residual information from the pixel domain to the transform domain, and quantizes the resulting transform coefficients to further reduce the bit rate;
  • the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they are used to determine the intra prediction mode to be used to encode the video coding block;
  • the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-frame prediction encoding of the received video coding block with respect to one or more blocks in one or more reference frames to provide temporal prediction information; motion estimation performed by the motion estimation unit 105 is the process of generating a motion vector, which estimates the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vector determined by the motion estimation unit 105;
  • after determining the intra prediction mode, the intra-frame prediction unit 103 also provides the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109;
  • in addition, the inverse transformation and inverse quantization unit 106 is used for the reconstruction of the video coding block: the residual block is reconstructed in the pixel domain, blocking artifacts are removed from the reconstructed residual block through the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image cache unit 110 to generate a reconstructed video coding block;
  • the encoding unit 109 is used to encode various encoding parameters and quantized transform coefficients. The context content can be based on adjacent coding blocks and can be used to encode information indicating the determined intra prediction mode and to output the code stream of the video signal;
  • the decoded image cache unit 110 is used to store the reconstructed video coding blocks for prediction reference. As the video image encoding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are stored in the decoded image cache unit 110.
  • the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra-frame prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image cache unit 206, etc. The decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement deblocking filtering and SAO filtering.
  • the code stream of the video signal output by the encoder is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate a residual block in the pixel domain; the intra-frame prediction unit 203 may be used to generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate the predictive block of the video decoding block being decoded.
  • a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block generated by the intra-frame prediction unit 203 or the motion compensation unit 204; the decoded video signal passes through the filtering unit 205 to remove blocking artifacts, which can improve video quality; the decoded video blocks are then stored in the decoded image cache unit 206, which stores reference images for subsequent intra prediction or motion compensation and is also used for the output of the video signal, that is, the restored original video signal is obtained.
  • the encoding method in the embodiment of the present application can be applied to the intra estimation unit 102 and the intra prediction unit 103 as shown in FIG. 4 .
  • the decoding method in the embodiment of the present application can also be applied to the intra prediction unit 203 as shown in FIG. 5. That is to say, the encoding and decoding method in the embodiment of the present application can be applied to the video encoding system, to the video decoding system, or to both at the same time.
  • the embodiments of the present application impose no specific limitation on this.
  • in the video encoding system, the "current block" specifically refers to the current coding block in intra prediction;
  • in the video decoding system, the "current block" specifically refers to the current decoded block in intra prediction.
  • FIG. 6 is a schematic flow chart of the implementation of the decoding method proposed by the embodiment of the present application.
  • the method for the decoder to perform decoding processing may include the following steps:
  • Step 101 Decode the code stream and determine the prediction mode parameters.
  • the decoder decodes the code stream and may first determine the prediction mode parameters.
  • the prediction mode parameter indicates the coding mode of the current block and parameters related to the coding mode.
  • prediction modes usually include traditional intra prediction modes and non-traditional intra prediction modes. Traditional intra prediction modes can include direct current (DC) mode, planar (PLANAR) mode, angular modes, etc.
  • Non-traditional intra prediction modes can include MIP mode, Cross-component Linear Model Prediction (CCLM) mode, Intra Block Copy (IBC) mode, PLT (Palette) mode, etc.
  • predictive coding can be performed on the current block.
  • the prediction mode of the current block can be determined, and the corresponding prediction mode parameters can be written into the code stream, which carries the prediction mode parameters from the encoder to the decoder.
  • the intra prediction mode of the brightness or chroma component of the current block or the coding block where the current block is located can be obtained by decoding the code stream.
  • the predModeIntra intra prediction mode indicator
  • the calculation formula is as follows,
  • the image component indicator (can be represented by cIdx) is used to indicate the luminance component or chrominance component of the current block; here, if the current block predicts the luminance component, then cIdx is equal to 0; if the current block predicts the chrominance component , then cIdx is equal to 1.
  • (xTbY, yTbY) is the coordinate of the upper left corner sampling point of the current block
  • IntraPredModeY[xTbY][yTbY] is the intra prediction mode of the luminance component
  • IntraPredModeC[xTbY][yTbY] is the intra prediction mode of the chroma component.
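  • the formula itself is not reproduced in this text. Judging only from the definitions above, it presumably selects the luma or chroma mode according to cIdx; the following is a reconstruction under that assumption, not the original equation:

```latex
\mathrm{predModeIntra} =
\begin{cases}
\mathrm{IntraPredModeY}[xTbY][yTbY], & cIdx = 0 \\
\mathrm{IntraPredModeC}[xTbY][yTbY], & cIdx = 1
\end{cases}
```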
  • by obtaining the prediction mode parameters, it can be determined, based on them, whether MIP is used to determine the intra prediction value when performing intra prediction.
  • Step 102 When the prediction mode parameter indicates using MIP to determine the intra-frame prediction value, decode the code stream and determine the MIP parameter of the current block.
  • the code stream can be continued to be decoded, thereby determining the MIP parameters of the current block.
  • the MIP parameters may include the MIP transposition indication parameter (which can be represented by isTransposed), the MIP mode index number (which can be represented by modeId), the size of the current block, the category of the current block (which can be represented by mipSizeId), and other parameters; the values of these parameters can be obtained by decoding the code stream.
  • the MIP parameters determined by decoding the code stream can indicate at least one of the MIP transposition indication parameter, the MIP mode index number, the size of the current block, the category of the current block, and other information.
  • the value of isTransposed can be determined; when the value of isTransposed is equal to 1, the sampling point input vector used in the MIP mode needs to be transposed; when the value of isTransposed is equal to 0, the sampling point input vector used in the MIP mode does not need to be transposed; that is to say, the MIP transposition indication parameter isTransposed indicates whether to transpose the sampling point input vector used in the MIP mode.
  • the MIP mode index number modeId can also be determined; the MIP mode index number indicates the MIP mode used by the current block, and the MIP mode indicates how the intra prediction block of the current block is calculated and derived using MIP. That is to say, different MIP modes have different MIP mode index number values; here, the value of the MIP mode index number can be 0, 1, 2, 3, 4 or 5.
  • parameter information such as the size of the current block, the aspect ratio, the type of the current block mipSizeId, etc. can also be determined.
  • the LFNST transformation core which can be represented by kernel
  • the MIP parameter can determine the size parameter of the current block, where the size parameter represents the size of the current block and can be the height and width of the current block, or the aspect ratio of the current block.
  • Step 103 Decode the code stream and determine the transform coefficient and LFNST index number of the current block.
  • when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, the code stream can continue to be decoded to determine the transform coefficient and the LFNST index number of the current block.
  • the value of the LFNST index number can be used to indicate whether the current block uses LFNST, and can also be used to indicate the index number of the LFNST transformation core in the LFNST transformation core candidate set.
  • the index number of the transformation core can be equal to the value of the LFNST index number, or the index number of the transformation core can also be equal to the value of the LFNST index number minus 1.
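  • the two readings of the LFNST index number described above can be sketched as follows. This is a minimal illustration, not the normative parsing process: the function name and the `minus_one` switch are hypothetical, and the convention that a value of 0 means "LFNST not used" is an assumption borrowed from common VVC practice.

```python
def interpret_lfnst_index(lfnst_idx, minus_one=True):
    """Return (uses_lfnst, kernel_index) for a decoded LFNST index number.

    Assumption: lfnst_idx == 0 signals that the current block does not use LFNST.
    minus_one=True  -> kernel index is lfnst_idx - 1
    minus_one=False -> kernel index equals lfnst_idx
    """
    if lfnst_idx < 0:
        raise ValueError("LFNST index number must be non-negative")
    if lfnst_idx == 0:
        return False, None
    kernel_index = lfnst_idx - 1 if minus_one else lfnst_idx
    return True, kernel_index
```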
  • Step 104 When the LFNST index number indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters.
  • the mapping mode of the LFNST transform set can be further determined based on the MIP parameters.
  • the MIP parameter may be the size parameter of the current block, where the size parameter represents the size of the current block and may be the height and width of the current block, or the aspect ratio of the current block.
  • the size parameter of the current block may be referred to.
  • the mapping mode of the LFNST transform set is determined based on the height and width of the current block, or the mapping mode of the LFNST transform set is determined based on the aspect ratio of the current block.
  • when determining the mapping mode of the LFNST transform set according to the size parameter of the current block, it can first be determined whether the size parameter satisfies the first preset condition. If the size parameter satisfies the first preset condition, the first preset prediction mode can be determined as the mapping mode of the LFNST transform set; if the size parameter does not satisfy the first preset condition, DIMD is used to determine the mapping mode of the LFNST transform set.
  • the first preset condition may be used to limit the size of the current block.
  • the first preset condition corresponds to the size parameter of the current block. If the size parameter of the current block is the height and width of the current block, then the first preset condition can limit the height and width respectively; if the size parameter of the current block is the aspect ratio of the current block, then the first preset condition can Limit the aspect ratio.
  • the first preset condition can be set to be that the width is greater than or equal to the preset width threshold, and/or that the height is greater than or equal to the preset height threshold. For example, if the height of the current block is greater than or equal to the preset height threshold, or the width of the current block is greater than or equal to the preset width threshold, then it can be determined that the size parameter meets the first preset condition; if the height of the current block is less than the preset height threshold, and the width of the current block is less than the preset width threshold, then it can be determined that the size parameter does not meet the first preset condition.
  • the preset width threshold and the preset height threshold can be any value greater than or equal to 0; for example, the preset width threshold is 32 and the preset height threshold is also 32, that is, if the height or width of the current block is greater than or equal to 32, it can be determined that the current block satisfies the first preset condition.
  • as another example, the preset width threshold is 32 and the preset height threshold is 16; that is, if the width of the current block is greater than or equal to 32, or the height of the current block is greater than or equal to 16, it can be determined that the current block meets the first preset condition.
  • DIMD can be restricted through the first preset condition, that is, DIMD is allowed to be used to determine the mapping mode of the LFNST transform set only when the size parameter of the current block does not meet the first preset condition.
  • the first preset prediction mode can be directly determined as the mapping mode of the LFNST transform set.
  • the first preset prediction mode may be PLANAR mode or DC mode.
  • when determining the mapping mode of the LFNST transform set, the first preset condition makes it possible to set the mapping mode directly for some image blocks. For example, for image blocks with larger sizes, the PLANAR mode or DC mode is directly determined as the mapping mode of the LFNST transform set.
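  • the size check and the resulting choice between a preset mode and DIMD can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds default to the 32/32 example given above, the "or" form of the condition is used, and `PLANAR = 0` assumes VVC-style mode numbering. The DIMD derivation is passed in as a callable so that it is only invoked for blocks that fail the condition.

```python
PLANAR = 0  # assumed VVC-style mode number for PLANAR


def meets_first_preset_condition(width, height, width_thr=32, height_thr=32):
    """True when the block is 'large' in the sense of the first preset condition."""
    return width >= width_thr or height >= height_thr


def select_mapping_mode(width, height, derive_with_dimd, preset_mode=PLANAR):
    """Pick the mapping mode of the LFNST transform set for a MIP-coded block.

    derive_with_dimd: zero-argument callable returning a traditional intra
    prediction mode; it is skipped entirely for large blocks.
    """
    if meets_first_preset_condition(width, height):
        return preset_mode
    return derive_with_dimd()
```

  • for instance, `select_mapping_mode(32, 8, dimd)` returns the preset mode without ever calling `dimd`, which is where the complexity saving comes from.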
  • when using DIMD to determine the mapping mode of the LFNST transform set, at least one intra prediction mode can be traversed first to determine at least one gradient information corresponding to the current block; then, the mapping mode of the LFNST transform set can be determined according to the at least one gradient information.
  • one intra prediction mode corresponds to one gradient information
  • the gradient information can be a gradient histogram.
  • the gradient amplitude value corresponding to each intra prediction mode can be determined based on the at least one gradient information; then the intra prediction mode with the largest gradient amplitude value among the at least one intra prediction mode may be determined as the mapping mode of the LFNST transform set.
  • one intra prediction mode corresponds to one gradient amplitude value.
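  • a simplified sketch of building a gradient histogram with the Sobel operator and picking the strongest entry is given below. It is not the ECM implementation: real DIMD maps gradient orientations to intra prediction modes through its own tables, whereas this sketch uses a hypothetical uniform binning of the orientation into `num_bins` bins, and it takes |gx| + |gy| as the amplitude, which is an assumption. The input is any 2-D sample array (for example a MIP prediction block).

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient_histogram(block, num_bins=8):
    """Accumulate gradient amplitude per orientation bin over interior samples."""
    height, width = len(block), len(block[0])
    hist = [0] * num_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    sample = block[y - 1 + j][x - 1 + i]
                    gx += SOBEL_X[j][i] * sample
                    gy += SOBEL_Y[j][i] * sample
            if gx == 0 and gy == 0:
                continue  # flat area carries no direction information
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            bin_idx = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[bin_idx] += abs(gx) + abs(gy)
    return hist


def strongest_bin(hist):
    """Index of the histogram entry with the largest accumulated amplitude."""
    return max(range(len(hist)), key=hist.__getitem__)
```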
  • the first step is to derive the prediction mode, using the same prediction mode strength calculation method at the encoder and the decoder.
  • the Sobel operator is used to count the histogram of gradients in each prediction mode.
  • the area of effect is the three rows of adjacent reconstructed samples above the current block, the three adjacent columns of reconstructed samples on the left, and the corresponding adjacent reconstructed samples in the upper left.
  • the first prediction mode corresponding to the largest amplitude in the histogram and the second prediction mode corresponding to the second largest amplitude in the histogram can be obtained;
  • the second step is to derive the prediction block; the same prediction block derivation method is used at the encoder and the decoder to obtain the current prediction block. Two conditions are checked:
  • 1. the gradient of the second prediction mode is not 0;
  • 2. neither the first prediction mode nor the second prediction mode is the PLANAR or DC prediction mode.
  • if the two conditions are not both satisfied, the current prediction block uses only the first prediction mode to calculate the prediction sample values of the current block, that is, the ordinary prediction process is applied with the first prediction mode; otherwise, that is, when both conditions hold, the current prediction block is derived by weighted averaging.
  • specifically, the PLANAR mode takes 1/3 of the total weight, and the remaining 2/3 is shared by the two derived modes: the weight of the first prediction mode is 2/3 multiplied by the ratio of its gradient strength to the sum of the gradient strengths of the first and second prediction modes, and the weight of the second prediction mode is 2/3 multiplied by the ratio of its gradient strength to the same sum.
  • the three prediction modes, namely PLANAR, the first prediction mode and the second prediction mode, are weighted and averaged to obtain the prediction block of the current coding unit.
  • the decoder uses the same steps to obtain the prediction block.
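  • the weighting rule of the second step can be written out as a short sketch. The function names are hypothetical, exact fractions are used instead of the integer weights a real codec would use, `PLANAR = 0` and `DC = 1` assume VVC-style mode numbering, and the gradient strengths are assumed to be non-negative integers.

```python
from fractions import Fraction

PLANAR, DC = 0, 1  # assumed VVC-style mode numbers


def use_blending(mode1, mode2, grad2):
    """Both conditions of the second step must hold for weighted averaging."""
    return grad2 != 0 and mode1 not in (PLANAR, DC) and mode2 not in (PLANAR, DC)


def dimd_blend_weights(grad1, grad2):
    """Return (w_planar, w_first, w_second) for integer gradient strengths."""
    total = grad1 + grad2
    if total <= 0:
        raise ValueError("gradient strengths must sum to a positive value")
    w_planar = Fraction(1, 3)
    rest = 1 - w_planar  # the remaining 2/3 shared by the two derived modes
    return w_planar, rest * Fraction(grad1, total), rest * Fraction(grad2, total)


def blend_sample(p_planar, p_first, p_second, grad1, grad2):
    """Weighted average of three co-located prediction samples."""
    w_planar, w_first, w_second = dimd_blend_weights(grad1, grad2)
    return w_planar * p_planar + w_first * p_first + w_second * p_second
```

  • with gradient strengths 30 and 10, the weights come out as 1/3, 1/2 and 1/6, which sum to 1.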
  • the down-sampling vector of the current block can be determined first according to the MIP parameters; then matrix multiplication is performed with the down-sampling vector to obtain the MIP output vector; then the MIP prediction block of the current block is determined according to the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP prediction block to obtain at least one gradient information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size parameters of the current block, and the sampling step size is determined by the size parameters of the current block.
  • the splicing order of the downsampled upper reference reconstruction samples and the downsampled left reference reconstruction samples is adjusted according to the decoded MIP transposition indication parameter. If transposition is not required, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is required, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
  • the MIP matrix coefficients can be obtained according to the decoded MIP mode index number, and the output vector (MIP output vector) is calculated from them and the input (downsampling vector). The output vector is then upsampled according to the size of the output vector and the size parameter of the current block. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the MIP prediction block of the current block. If upsampling is required, the horizontal direction is upsampled first and then the vertical direction, until the size matches that of the template, and the result is output as the MIP prediction block of the current block.
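  • the input construction and matrix multiplication described above can be sketched as follows. This is a simplified illustration rather than the normative MIP process: the helper names are hypothetical, downsampling is modelled as rounded averaging over groups whose size is a power of two, and the offset, shift and clipping steps of a real MIP implementation are omitted.

```python
def haar_downsample(samples, out_len):
    """Average consecutive groups so that len(samples) shrinks to out_len.

    Assumes len(samples) is out_len times a power of two.
    """
    step = len(samples) // out_len
    shift = step.bit_length() - 1  # log2(step)
    return [
        (sum(samples[i * step:(i + 1) * step]) + (step >> 1)) >> shift
        for i in range(out_len)
    ]


def build_mip_input(top, left, is_transposed):
    """Splice the downsampled references; the order depends on isTransposed."""
    if is_transposed:
        return list(left) + list(top)  # upper samples spliced after left ones
    return list(top) + list(left)      # left samples spliced after upper ones


def mip_output_vector(matrix, input_vec):
    """Plain matrix-vector product: one output sample per matrix row."""
    return [sum(m * v for m, v in zip(row, input_vec)) for row in matrix]
```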
  • the DIMD method can be directly used for the MIP prediction block of the current block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed for the MIP prediction block of the current block, and the gradient information of the at least one intra prediction mode on the MIP prediction block of the current block is calculated.
  • the DIMD calculation process may be performed after the MIP output vector is upsampled.
  • the down-sampling vector of the current block can be determined first according to the MIP parameters; then matrix multiplication is performed with the down-sampling vector to obtain the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP output vector to obtain at least one gradient information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size parameters of the current block, and the sampling step size is determined by the size parameters of the current block.
  • the splicing order of the downsampled upper reference reconstruction samples and the downsampled left reference reconstruction samples is adjusted according to the decoded MIP transposition indication parameter. If transposition is not required, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is required, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
  • the MIP matrix coefficient can be obtained according to the decoded MIP mode index number, and the output vector (MIP output vector) can be calculated with the input (downsampling vector).
  • the DIMD calculation process can also be performed before upsampling the MIP output vector.
  • some of the 67 intra prediction modes can be selectively skipped to reduce the number of traversed intra prediction modes; for example, selective filtering with a step size of 1 can be performed.
  • Step 105 Select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets according to the mapping mode of the LFNST transform set, and determine the LFNST transform kernel used in the current block from the selected LFNST transform kernel candidate set.
  • according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set can be selected from multiple LFNST transform kernel candidate sets, and the LFNST transform kernel used by the current block is then determined from the selected LFNST transform kernel candidate set.
  • the index number of the mapping mode of the LFNST transform set can be determined first; then the value of the LFNST intra prediction mode index number can be determined according to the value of that index number; then, one LFNST transform kernel candidate set can be selected from multiple LFNST transform kernel candidate sets according to the value of the LFNST intra prediction mode index number; finally, the transform kernel indicated by the LFNST index number is selected from the selected LFNST transform kernel candidate set and set as the LFNST transform kernel used by the current block.
  • in other words, the value of the index number of the mapping mode of the LFNST transform set is converted to the value of the LFNST intra prediction mode index number (which can be represented by predModeIntra); one LFNST transform kernel candidate set is then selected from multiple LFNST transform kernel candidate sets based on the value of predModeIntra; and from the selected LFNST transform kernel candidate set, the transform kernel indicated by the LFNST index number is selected and set as the LFNST transform kernel used by the current block.
  • multiple LFNST transform kernel candidate sets may include 4 LFNST transform kernel candidate sets, where each LFNST transform kernel candidate set includes two LFNST transform kernels; accordingly, the first lookup table can be used to determine the value of the LFNST intra prediction mode index number corresponding to the value of the index number.
  • the DC mode, PLANAR mode or angle prediction mode and the transformation set of LFNST can be bound based on the first lookup table, such as the first lookup table shown in Table 1.
  • the multiple LFNST transform kernel candidate sets may also include 35 LFNST transform kernel candidate sets, where each LFNST transform kernel candidate set includes three LFNST transform kernels; accordingly, the second lookup table can be used to determine the value of the LFNST intra prediction mode index number corresponding to the value of the index number.
  • the LFNST transform sets corresponding to different intra prediction modes will be more fine-grained, such as the second lookup table shown in Figure 2.
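  • for the four-set case, the lookup from predModeIntra to a candidate set and then to a kernel can be sketched as below. The lookup tables referred to above are not reproduced in this text, so the mode ranges used here are an assumption: they are the ranges commonly given for VVC's four-set design. The function names are hypothetical, and the kernel index is taken as the LFNST index number minus 1.

```python
def lfnst_set_index(pred_mode_intra):
    """Map predModeIntra to one of 4 candidate sets (assumed VVC-style ranges)."""
    if pred_mode_intra < 0:
        return 1
    if pred_mode_intra <= 1:   # PLANAR and DC
        return 0
    if pred_mode_intra <= 12:
        return 1
    if pred_mode_intra <= 23:
        return 2
    if pred_mode_intra <= 44:
        return 3
    if pred_mode_intra <= 55:
        return 2
    if pred_mode_intra <= 80:
        return 1
    raise ValueError("predModeIntra outside the range covered by this sketch")


def select_lfnst_kernel(candidate_sets, pred_mode_intra, lfnst_idx):
    """Pick the kernel for the block; lfnst_idx must be 1 or 2 here."""
    if lfnst_idx not in (1, 2):
        raise ValueError("LFNST is not used or the index is out of range")
    return candidate_sets[lfnst_set_index(pred_mode_intra)][lfnst_idx - 1]
```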
  • the MIP parameters may also include a MIP transposition indication parameter, where the value of the MIP transposition indication parameter is used to indicate whether to transpose the sampling point input vector used in the MIP mode.
  • when the value of the MIP transposition indication parameter indicates transposition of the sampling point input vector used in the MIP mode, matrix transposition can be performed on the transform kernel indicated by the LFNST index number to obtain the LFNST transform kernel used by the current block.
  • that is, when the value of the MIP transposition indication parameter is equal to 1, it indicates that the sampling point input vector used in the MIP mode is transposed; in this case, the corresponding matrix transposition needs to be performed on the selected transform kernel to obtain the LFNST transform kernel used by the current block.
  • Step 106 Use the LFNST transformation kernel to transform the transformation coefficients.
  • after one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets and the LFNST transform kernel used by the current block is determined from the selected LFNST transform kernel candidate set, the LFNST transform kernel can be used to transform the transform coefficients.
  • the LFNST transform kernel determined from the selected LFNST transform kernel candidate set is the LFNST transform kernel used by the current block.
  • the LFNST transform kernel can be a transformation matrix that transforms the transform coefficients.
  • the secondary transformation coefficient vector can be used as input, and the transformation matrix (transformation kernel) is used to multiply it to obtain the primary transformation coefficient vector. In this way, after matrix calculation, the transformation processing of the transformation coefficient can be implemented.
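  • the matrix calculation described above amounts to a matrix-vector product, sketched below. The function name is hypothetical, the kernel is assumed to be stored with one row per output (primary) coefficient, and the scaling, rounding and clipping that a real codec applies after the multiplication are omitted.

```python
def apply_lfnst_kernel(kernel, secondary_coeffs):
    """Multiply the kernel by the secondary coefficient vector.

    kernel: list of rows, each as long as secondary_coeffs.
    Returns the primary transform coefficient vector (one entry per row).
    """
    if any(len(row) != len(secondary_coeffs) for row in kernel):
        raise ValueError("kernel row length must match the input vector length")
    return [sum(k * c for k, c in zip(row, secondary_coeffs)) for row in kernel]
```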
  • the decoder can decode the coding unit level type flag bit and, if intra mode is indicated, decode to obtain the MIP allowed-use flag bit (prediction mode parameter); this flag bit may be a sequence-level flag bit, used to indicate whether the current decoder allows the use of MIP technology.
  • the sequence-level flag bit can be expressed in the form of sps_mip_enable_flag.
  • if the MIP allowed-use flag is true, the MIP use flag of the current coding unit (current block) is decoded; otherwise, the current decoding process does not need to decode the coding-unit-level MIP use flag, which defaults to false.
  • if the MIP use flag of the current coding unit is true, the MIP parameters of the current coding unit are decoded, where the MIP parameters may include at least one of the MIP transposition indication parameter, the MIP mode index number, the current block size, the current block category, and other information. Otherwise, decoding continues with the use flags or indexes of other intra prediction technologies, and the final prediction block of the current coding unit is obtained based on the decoded information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size of the current coding unit (the size parameter of the current block).
  • the sampling step size is determined according to the coding unit size, and the decoded MIP transposition indication parameter is used to adjust the splicing order of the downsampled upper reference reconstruction samples and the downsampled left reference reconstruction samples.
  • if transposition is not required, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is required, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
  • the MIP matrix coefficients are obtained according to the decoded MIP prediction mode, and the output vector (MIP output vector) is calculated from them and the input (downsampling vector). The output vector is then upsampled according to the size of the output vector and the size of the current coding unit. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the prediction block of the current coding unit (MIP prediction block of the current block). If upsampling is required, the horizontal direction is upsampled first and then the vertical direction, until the size matches that of the template, and the result is output as the prediction block of the current coding unit (MIP prediction block of the current block).
  • if the size parameter of the current block meets the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not meet the first preset condition, the DIMD method can be applied to the current MIP prediction block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
  • the 67 intra prediction modes in the current VVC and ECM (or only some of them) can be traversed for the current MIP prediction block: the gradient information of each intra prediction mode on the current MIP prediction block is calculated, the corresponding gradient amplitude value is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value; the intra prediction mode with the largest amplitude is the optimal mode, that is, the mapping mode of the LFNST transform set used in the inverse transform process in subsequent steps.
  • the decoder can decode the coding unit level type flag bit and, if intra mode is indicated, decode to obtain the MIP allowed-use flag bit (prediction mode parameter); this flag bit may be a sequence-level flag bit, used to indicate whether the current decoder allows the use of MIP technology.
  • the sequence-level flag bit can be expressed in the form of sps_mip_enable_flag.
  • if the MIP allowed-use flag is true, the MIP use flag of the current coding unit (current block) is decoded; otherwise, the current decoding process does not need to decode the coding-unit-level MIP use flag, which defaults to false.
  • if the MIP use flag of the current coding unit is true, the MIP parameters of the current coding unit are decoded, where the MIP parameters may include at least one of the MIP transposition indication parameter, the MIP mode index number, the current block size, the current block category, and other information. Otherwise, decoding continues with the use flags or indexes of other intra prediction technologies, and the final prediction block of the current coding unit is obtained based on the decoded information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size of the current coding unit (the size parameter of the current block).
  • the sampling step size is determined according to the coding unit size, and the decoded MIP transposition indication parameter is used to adjust the splicing order of the downsampled upper reference reconstruction samples and the downsampled left reference reconstruction samples.
  • if transposition is not required, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is required, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
  • the MIP matrix coefficients are obtained according to the decoded MIP prediction mode, and the output vector (MIP output vector) is calculated from them and the input (downsampling vector).
  • if the size parameter of the current block meets the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not meet the first preset condition, the DIMD method can be applied to the current MIP output vector to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
  • the 67 intra prediction modes in the current VVC and ECM (or only some of them) can be traversed for the current MIP output vector: the gradient information of each intra prediction mode on the current MIP output vector is calculated, the corresponding gradient amplitude value is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value; the intra prediction mode with the largest amplitude is the optimal mode, that is, the mapping mode of the LFNST transform set used in the inverse transform process in subsequent steps.
  • the output vector can then be upsampled according to the size of the output vector and the size of the current coding unit. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the prediction block of the current coding unit (MIP prediction block of the current block). If upsampling is required, the horizontal direction is upsampled first and then the vertical direction, until the size matches that of the template, and the result is output as the prediction block of the current coding unit (MIP prediction block of the current block).
  • the encoding and decoding method proposed in the embodiment of this application is suitable for the intra prediction part of the encoding and decoding end.
  • the solution proposed in the embodiment of this application is adopted and integrated into JVET-Z008.
  • the results were obtained under the All Intra (AI) test condition, as shown in Table 2 and Table 3 below:
  • the decoding method proposed in the embodiments of the present application can be used only in B frames, or can be used in both I frames and B frames.
  • the decoding method proposed in the embodiment of this application can also be used only on I frames.
  • the conditions under which the decoding method proposed in the embodiment of the present application is allowed to be used are different on the B frame or the I frame.
  • for example, the I frame allows coding units of all sizes to use the decoding method proposed in the embodiment of the present application, while the B frame only allows small-sized coding units to use it.
  • if the luma component of the current coding unit uses the MIP prediction mode, and the chroma component uses neither the MIP prediction mode nor a traditional intra prediction mode, then the LFNST transform set of the chroma component can inherit the LFNST transform set of the luma component.
  • the LFNST transform sets of the luma and chroma components can be derived according to the decoding method proposed in the embodiment of the present application.
  • the decoding method proposed in the embodiment of the present application involves using DIMD to derive a MIP prediction block mapping LFNST transform set.
  • on the one hand, it is proposed to limit the size of coding units that use DIMD. For larger image blocks, the MIP output vector undergoes more upsampling and its directional information is not obvious, so the process of deriving the traditional prediction mode with DIMD is skipped to reduce computational complexity.
  • on the other hand, on the basis of limiting the size of coding units that use DIMD, the computational complexity is further reduced by using the MIP output vector before upsampling as the input of DIMD to derive the optimal traditional intra prediction mode.
  • the embodiment of the present application provides a decoding method.
  • the code stream is decoded to determine the prediction mode parameters; when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, the code stream is decoded to determine the MIP parameters of the current block; the code stream is decoded to determine the transform coefficient and LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, and the LFNST transform kernel used by the current block is determined from the selected LFNST transform kernel candidate set; the LFNST transform kernel is used to transform the transform coefficients.
  • the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, DIMD can be skipped when deriving the mapping mode, which reduces computational complexity and thereby improves coding efficiency.
  • Figure 7 is a schematic flow chart of the implementation of the encoding method proposed by the embodiment of the present application.
  • the method for the encoder to perform encoding processing may include the following steps:
  • Step 201 Determine prediction mode parameters.
  • the encoder may first determine the prediction mode parameters.
  • each image block currently to be encoded can be called a coding block (CB).
  • each coding block may include a first image component, a second image component, and a third image component; the current block is the coding block in the video image for which prediction of the first image component, the second image component, or the third image component is currently to be performed.
  • assuming that the current block performs first image component prediction, and the first image component is a luma component, that is, the image component to be predicted is a luma component, then the current block can also be called a luma block; or, assuming that the current block performs second image component prediction, and the second image component is a chroma component, that is, the image component to be predicted is a chroma component, then the current block can also be called a chroma block.
  • the prediction mode parameter indicates the coding mode of the current block and parameters related to the mode.
  • Rate Distortion Optimization (RDO) can usually be used to determine the prediction mode parameters of the current block.
  • the image component to be predicted of the current block can be determined first; then, based on the parameters of the current block, multiple prediction modes are used to perform predictive coding on the image component to be predicted, and the rate distortion cost result corresponding to each of the multiple prediction modes is calculated; finally, the minimum rate distortion cost result can be selected from the calculated rate distortion cost results, and the prediction mode corresponding to the minimum rate distortion cost result is determined as the prediction mode parameter of the current block.
  • multiple prediction modes can be used to separately encode the image components to be predicted for the current block.
  • multiple prediction modes usually include traditional intra prediction modes and non-traditional intra prediction modes
  • traditional intra prediction modes can include direct current (DC) mode, planar (PLANAR) mode, angle mode, etc.
  • non-traditional intra prediction modes can include MIP mode, Cross-component Linear Model Prediction (CCLM) mode, Intra Block Copy (IBC) mode, Palette (PLT) mode, etc.
  • the rate distortion cost result corresponding to each prediction mode can be obtained; the minimum rate distortion cost result is then selected from the obtained rate distortion cost results, and the prediction mode corresponding to it is determined as the prediction mode parameter of the current block; in this way, the current block can be encoded using the determined prediction mode, and in this prediction mode the prediction residual can be kept small, which can improve coding efficiency.
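  • the selection by minimum rate distortion cost described above can be sketched as follows. This is a minimal illustration, assuming the common Lagrangian cost J = D + λ·R; the cost formula, the function name `select_mode_by_rdo`, and the numeric distortion and rate values are assumptions for illustration and are not taken from the embodiment.

```python
from typing import Dict, Tuple


def select_mode_by_rdo(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the prediction mode with the smallest rate distortion cost.

    candidates maps a mode name to (distortion D, rate R in bits).
    The cost J = D + lam * R is an assumed, commonly used form.
    """
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    # The mode with the minimum cost becomes the prediction mode parameter.
    return min(costs, key=costs.get)


# Hypothetical measurements for three candidate modes.
candidates = {"PLANAR": (120.0, 10.0), "DC": (150.0, 8.0), "MIP": (90.0, 20.0)}
best = select_mode_by_rdo(candidates, lam=2.0)
```

  • a larger λ penalizes rate more strongly, so the selected mode can change with λ even when the distortion and rate measurements stay the same.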
  • predictive coding can be performed on the current block.
  • the prediction mode of the current block can be determined, and the corresponding prediction mode parameters can be written into the code stream, which transmits the prediction mode parameters from the encoder to the decoder.
  • the intra prediction mode of the brightness or chroma component of the current block or the coding block where the current block is located can be obtained by decoding the code stream.
  • the predModeIntra intra prediction mode indicator
  • the prediction mode parameters by obtaining the prediction mode parameters, it can be determined based on the prediction mode parameters whether to use MIP to determine the intra prediction value when performing intra prediction.
  • Step 202 When the prediction mode parameter indicates using MIP to determine the intra prediction value, determine the MIP parameter of the current block.
  • the MIP parameter of the current block may continue to be determined.
  • the MIP parameters may include the MIP transposition indication parameter (which can be represented by isTransposed), the MIP mode index number (which can be represented by modeId), the size of the current block, and the category of the current block. (can be represented by mipSizeId) and other parameters.
  • At least one of the MIP transposition indication parameter, MIP mode index number, current block size, current block type and other information can be indicated through the determined MIP parameter.
  • the MIP parameters may include a MIP transposition indication parameter (which may be represented by isTransposed); here, the value of the MIP transposition indication parameter is used to indicate whether the sampling point input vector used in the MIP mode is transposed.
  • the adjacent reference sample set can be obtained based on the reference sample values corresponding to the adjacent reference pixels on the left side of the current block and the reference sample values corresponding to the adjacent reference pixels on the upper side; after the adjacent reference sample set is obtained, an input reference sample set can be constructed, that is, the sampling point input vector used in the MIP mode.
  • the construction methods on the encoding side and the decoding side are different, mainly related to the value of the MIP transposition indication parameter.
  • rate-distortion optimization can still be used to determine the value of the MIP transposition indication parameter. Specifically:
  • if the first cost value is less than the second cost value, it can be determined that the value of the MIP transposition indication parameter is 1;
  • if the first cost value is not less than the second cost value, it can be determined that the value of the MIP transposition indication parameter is 0.
  • when the value of the MIP transposition indication parameter is 0, the reference sample values corresponding to the upper side of the adjacent reference sample set can be stored in the buffer before the reference sample values corresponding to the left side; in this case no transposition is needed, and the buffer can be directly determined as the input reference sample set. When the value of the MIP transposition indication parameter is 1, the reference sample values corresponding to the upper side can be stored in the buffer after the reference sample values corresponding to the left side; in this case the buffer is transposed, that is, the sampling point input vector used in the MIP mode is transposed, and the transposed buffer is determined as the input reference sample set. In this way, after the input reference sample set is obtained, it can be used in the process of determining the intra prediction value corresponding to the current block in the MIP mode.
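  • the buffer ordering described above can be sketched as follows. This is a minimal illustration under the assumption that "transposing" the buffer amounts to swapping the order of the upper and left reference samples; the function name `build_mip_input` and the list representation are hypothetical.

```python
from typing import List


def build_mip_input(ref_top: List[int], ref_left: List[int], is_transposed: int) -> List[int]:
    """Build the input reference sample set (sampling point input vector).

    is_transposed == 0: upper samples are stored before left samples.
    is_transposed == 1: upper samples are stored after left samples.
    """
    if is_transposed == 0:
        return ref_top + ref_left
    return ref_left + ref_top


# Hypothetical reference samples for a small block.
untransposed = build_mip_input([1, 2], [3, 4], is_transposed=0)
transposed = build_mip_input([1, 2], [3, 4], is_transposed=1)
```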
  • the MIP parameters may also include a MIP mode index number (which can be represented by modeId), where the MIP mode index number is used to indicate the MIP mode used by the current block, and the MIP mode is used to indicate how the intra prediction block of the current block is calculated and derived using MIP.
  • since there can be many MIP modes, they can be distinguished by MIP mode index numbers, that is, different MIP modes have different MIP mode index numbers; in this way, the specific MIP mode can be determined according to how the intra prediction block of the current block is calculated and derived using MIP, so that the corresponding MIP mode index number can be obtained; in the embodiment of the present application, the value of the MIP mode index number can be 0, 1, 2, 3, 4 or 5.
  • the MIP parameters may also include parameters such as the size and the aspect ratio of the current block; according to the size of the current block (i.e., its width and height), the category of the current block (which can be represented by mipSizeId) may also be determined.
  • for example, if the width and height of the current block are both equal to 4, the value of mipSizeId can be set to 0; otherwise, if one of the width and height of the current block is equal to 4, or the width and height of the current block are both equal to 8, the value of mipSizeId can be set to 1; otherwise, if the current block is a block of another size, the value of mipSizeId can be set to 2.
  • in another example, if the width and height of the current block are both equal to 4, the value of mipSizeId can be set to 0; otherwise, if one of the width and height of the current block is equal to 4, the value of mipSizeId can be set to 1; otherwise, if the current block is a block of another size, the value of mipSizeId can be set to 2.
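  • the block classification can be sketched as follows. This is a minimal illustration of the variant in which a block with both dimensions equal to 8 also falls into category 1; it assumes the first category applies when the width and height are both equal to 4. The function name `mip_size_id` is hypothetical.

```python
def mip_size_id(width: int, height: int) -> int:
    """Classify the current block into one of three MIP size categories."""
    if width == 4 and height == 4:
        return 0
    if width == 4 or height == 4 or (width == 8 and height == 8):
        return 1
    return 2


# A few block sizes and their categories.
categories = {(w, h): mip_size_id(w, h) for (w, h) in [(4, 4), (4, 16), (8, 8), (16, 16)]}
```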
  • the MIP parameters can also be determined, so that the LFNST transform kernel (which can be represented by kernel) used by the current block is determined based on the determined MIP parameters.
  • the MIP parameter can determine the size parameter of the current block, where the size parameter can represent the size of the current block, which can be the height and width of the current block, or the aspect ratio of the current block.
  • Step 203 Determine the intra prediction block of the current block according to the MIP parameters, and calculate the residual block between the current block and the intra prediction value.
  • the intra prediction block of the current block can be further determined according to the MIP parameters, and the residual block between the current block and the intra prediction value can be calculated.
  • the input data of MIP prediction includes: the position of the current block (xTbCmp, yTbCmp), the MIP prediction mode applied to the current block (which can be represented by modeId), the height of the current block (represented by nTbH), the width of the current block (represented by nTbW), the flag indicating whether transposition is required (which can be represented by isTransposed), etc.
  • the MIP prediction process can be divided into four steps: configuring core parameters, obtaining reference pixels, constructing input samples, and generating prediction values.
  • the current block can be classified into one of three categories, and mipSizeId is used to record the category of the current block; for different categories of current blocks, the number of reference sampling points and the number of matrix multiplication output sampling points are different.
  • the upper block and the left block of the current block are both encoded blocks.
  • the reference pixels of the MIP technology are the reconstructed values of the pixels in the previous row and the left column of the current block.
  • obtaining the reference pixels adjacent to the upper side of the current block (represented by refT) and the reference pixels adjacent to the left side (represented by refL) constitutes the reference pixel acquisition process.
  • this step prepares the input of the matrix multiplication, and mainly includes: obtaining reference samples, constructing the reference sample buffer, and deriving the matrix multiplication input samples; the process of obtaining reference samples is a downsampling process, and constructing the reference sample buffer covers both the filling method of the buffer when transposition is not required and the filling method when transposition is required.
  • this step is used to obtain the MIP prediction value of the current block, and mainly includes: constructing the matrix multiplication output sample block, clamping the matrix multiplication output samples, transposing the matrix multiplication output samples, and generating the final MIP prediction value; constructing the matrix multiplication output sample block can include obtaining the weight matrix, obtaining the shift factor and offset factor, and performing the matrix multiplication. Generating the final MIP prediction value can include generating prediction values that do not require upsampling and generating prediction values that require upsampling. In this way, after these four steps, the intra prediction block of the current block can be obtained.
  • a difference calculation can be performed between the actual pixel values of the current block and the intra prediction values, and the calculated differences can be used as the residual block, which facilitates subsequent transform processing of the residual block.
  • Step 204 When the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters.
  • the mapping mode of the LFNST transform set can be further determined according to the MIP parameters.
  • LFNST can be performed on the current block only when the current block satisfies the following conditions at the same time.
  • these conditions include: (a) the width and height of the current block are both greater than or equal to 4; (b) the width and height of the current block are both less than or equal to the maximum size of the transform block; (c) the prediction mode of the current block or of the coding block where the current block is located is an intra prediction mode; (d) the primary transform of the current block is a two-dimensional forward primary transform (DCT2) in both the horizontal and vertical directions; (e) the intra prediction mode of the current block or of the coding block where the current block is located is a non-MIP mode, or the prediction mode of the transform block is the MIP mode and the width and height of the transform block are both greater than or equal to 16.
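  • the five conditions (a) to (e) can be combined into a single check, sketched below. The function name `lfnst_allowed` and its parameters (including `max_tb_size` for the maximum transform block size) are hypothetical, and the sketch treats the current block and the transform block as having the same dimensions.

```python
def lfnst_allowed(width: int, height: int, max_tb_size: int,
                  is_intra: bool, primary_is_dct2_both: bool, is_mip: bool) -> bool:
    """Return True only when conditions (a) to (e) hold at the same time."""
    cond_a = width >= 4 and height >= 4
    cond_b = width <= max_tb_size and height <= max_tb_size
    cond_c = is_intra
    cond_d = primary_is_dct2_both
    # (e): non-MIP blocks pass; MIP blocks need both dimensions >= 16.
    cond_e = (not is_mip) or (width >= 16 and height >= 16)
    return cond_a and cond_b and cond_c and cond_d and cond_e


# Hypothetical example: a 16x16 intra MIP block with DCT2 in both directions.
ok = lfnst_allowed(16, 16, max_tb_size=64, is_intra=True,
                   primary_is_dct2_both=True, is_mip=True)
```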
  • when the current block can execute LFNST, it is also necessary to determine the LFNST transform kernel used by the current block (which can be represented by kernel).
  • the MIP parameter may be the size parameter of the current block, where the size parameter may represent the size of the current block, and may be the height and width of the current block, or the aspect ratio of the current block.
  • the size parameter of the current block may be referred to.
  • the mapping mode of the LFNST transform set is determined based on the height and width of the current block, or the mapping mode of the LFNST transform set is determined based on the aspect ratio of the current block.
  • when determining the mapping mode of the LFNST transform set according to the size parameter of the current block, it can first be determined whether the size parameter satisfies the first preset condition. If the size parameter satisfies the first preset condition, the first preset prediction mode can be determined as the mapping mode of the LFNST transform set; if the size parameter does not satisfy the first preset condition, DIMD is used to determine the mapping mode of the LFNST transform set.
  • the first preset condition may be used to limit the size of the current block.
  • the first preset condition corresponds to the size parameter of the current block. If the size parameter of the current block is the height and width of the current block, then the first preset condition can limit the height and width respectively; if the size parameter of the current block is the aspect ratio of the current block, then the first preset condition can limit the aspect ratio.
  • the first preset condition can be set to be that the width is greater than or equal to the preset width threshold, and/or that the height is greater than or equal to the preset height threshold. For example, if the height of the current block is greater than or equal to the preset height threshold, or the width of the current block is greater than or equal to the preset width threshold, then it can be determined that the size parameter satisfies the first preset condition; if the height of the current block is less than the preset height threshold, and the width of the current block is less than the preset width threshold, then it can be determined that the size parameter does not satisfy the first preset condition.
  • the preset width threshold and the preset height threshold can be any value greater than or equal to 0. For example, the preset width threshold is 32 and the preset height threshold is also 32, that is, if the height or width of the current block is greater than or equal to 32, it can be determined that the current block satisfies the first preset condition.
  • for another example, the preset width threshold is 32 and the preset height threshold is 16. That is, if the width of the current block is greater than or equal to 32, or the height of the current block is greater than or equal to 16, it can be determined that the current block satisfies the first preset condition.
  • the use of DIMD can be restricted through the first preset condition, that is, only when the size parameter of the current block does not satisfy the first preset condition is DIMD allowed to be used to determine the mapping mode of the LFNST transform set.
  • the encoding method proposed in the embodiment of the present application can use the first preset condition to limit the size of the image blocks that use DIMD when determining the mapping mode of the LFNST transform set, allowing DIMD only for some image blocks and thus effectively reducing computational complexity. For example, DIMD is only allowed to determine the mapping mode of the LFNST transform set for image blocks of small size.
  • the first preset prediction mode can be directly determined as the mapping mode of the LFNST transform set.
  • the first preset prediction mode may be PLANAR mode or DC mode.
  • when determining the mapping mode of the LFNST transform set, combined with the first preset condition, the mapping mode can be set directly for some image blocks. For example, for image blocks with larger sizes, the PLANAR mode or DC mode is directly determined as the mapping mode of the LFNST transform set.
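  • the size-gated choice of the mapping mode can be sketched as follows. This is a minimal illustration assuming thresholds of 32 for both width and height and the mode numbering PLANAR = 0, DC = 1; the function name `lfnst_mapping_mode` and the callback `derive_with_dimd` (standing in for the DIMD derivation) are hypothetical.

```python
from typing import Callable

PLANAR, DC = 0, 1  # assumed mode numbering


def lfnst_mapping_mode(width: int, height: int,
                       derive_with_dimd: Callable[[], int],
                       width_thr: int = 32, height_thr: int = 32,
                       preset_mode: int = PLANAR) -> int:
    """Pick the mapping mode of the LFNST transform set for a MIP block."""
    # First preset condition: the block is large in at least one dimension.
    if width >= width_thr or height >= height_thr:
        return preset_mode  # skip DIMD entirely for large blocks
    # Otherwise DIMD is allowed and derives the mapping mode.
    return derive_with_dimd()


# Hypothetical DIMD result: angular mode 18.
large_block_mode = lfnst_mapping_mode(32, 8, derive_with_dimd=lambda: 18)
small_block_mode = lfnst_mapping_mode(16, 16, derive_with_dimd=lambda: 18)
```

  • passing the DIMD derivation as a callback makes the complexity saving explicit: for blocks that satisfy the first preset condition, the derivation is never invoked.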
  • when using DIMD to determine the mapping mode of the LFNST transform set, at least one intra prediction mode can be traversed first to determine at least one piece of gradient information corresponding to the current block; then, the mapping mode of the LFNST transform set can be determined according to the at least one piece of gradient information.
  • one intra prediction mode corresponds to one gradient information
  • the gradient information can be a gradient histogram.
  • the gradient amplitude value corresponding to each intra prediction mode can be determined based on the at least one piece of gradient information; the intra prediction mode with the largest gradient amplitude value among the at least one intra prediction mode may then be determined as the mapping mode of the LFNST transform set.
  • one intra prediction mode corresponds to one gradient amplitude value.
  • the first step is to derive the prediction mode, and use the same prediction mode strength calculation method on the encoding and decoding side.
  • the Sobel operator is used to count the histogram of gradients in each prediction mode.
  • the area of effect is the three rows of adjacent reconstructed samples above the current block, the three adjacent columns of reconstructed samples on the left, and the corresponding adjacent reconstructed samples in the upper left.
  • the first prediction mode, corresponding to the largest amplitude in the histogram, and the second prediction mode, corresponding to the second largest amplitude in the histogram, can then be obtained.
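  • the first step can be sketched as follows. This is a simplified illustration only: it applies 3x3 Sobel filters to the interior samples of a small 2-D array, uses |Gx| + |Gy| as the gradient amplitude, and votes into uniformly spaced orientation bins. The uniform binning, the bin count of 65, and the function names are assumptions; the actual mapping from gradient orientation to intra prediction mode is not given in this passage.

```python
import math
from typing import List, Tuple

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_histogram(samples: List[List[int]], num_bins: int = 65) -> List[int]:
    """Accumulate gradient amplitude per orientation bin over interior samples."""
    hist = [0] * num_bins
    for y in range(1, len(samples) - 1):
        for x in range(1, len(samples[0]) - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    s = samples[y - 1 + j][x - 1 + i]
                    gx += SOBEL_X[j][i] * s
                    gy += SOBEL_Y[j][i] * s
            if gx == 0 and gy == 0:
                continue  # flat area: no direction to vote for
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            b = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[b] += abs(gx) + abs(gy)  # assumed amplitude measure
    return hist


def top_two_bins(hist: List[int]) -> Tuple[int, int]:
    """Return the bins with the largest and second largest amplitude."""
    order = sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)
    return order[0], order[1]


# Hypothetical 4x4 sample array whose values change only along the horizontal axis.
block = [[0, 0, 10, 10] for _ in range(4)]
hist = gradient_histogram(block)
first_bin, second_bin = top_two_bins(hist)
```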
  • the second step is to derive the prediction block; the same prediction block derivation method is used at the encoding end and the decoding end to obtain the current prediction block.
  • two conditions are checked: 1. the gradient of the second prediction mode is not 0; 2. neither the first prediction mode nor the second prediction mode is the PLANAR or DC prediction mode.
  • if the above two conditions are not both satisfied, the current prediction block uses only the first prediction mode to calculate the prediction sample values of the current block, that is, the ordinary prediction process is applied with the first prediction mode; otherwise, that is, when both conditions hold, the current prediction block is derived by weighted averaging.
  • specifically, the PLANAR mode takes 1/3 of the total weight, and the remaining 2/3 is split between the two derived modes: the weight of the first prediction mode is proportional to the ratio of its gradient strength to the sum of the gradient strengths of the first and second prediction modes, and the weight of the second prediction mode is proportional to the ratio of its gradient strength to the same sum.
  • the above three prediction modes namely PLANAR, the first prediction mode and the second prediction mode, are weighted and averaged to obtain the prediction block of the current coding unit.
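  • the weight derivation of the second step can be sketched as follows, using exact fractions to make the 1/3 and 2/3 split visible. The function name `dimd_blend_weights`, the dictionary keys, and the mode numbering PLANAR = 0, DC = 1 are assumptions for illustration; a real codec would use integer arithmetic.

```python
from fractions import Fraction
from typing import Dict

PLANAR, DC = 0, 1  # assumed mode numbering


def dimd_blend_weights(g1: int, g2: int, mode1: int, mode2: int) -> Dict[str, Fraction]:
    """Return blending weights for PLANAR, the first mode and the second mode.

    g1 and g2 are the gradient strengths of the first and second modes.
    """
    blend = g2 != 0 and mode1 not in (PLANAR, DC) and mode2 not in (PLANAR, DC)
    if not blend:
        # Only the first prediction mode is used.
        return {"planar": Fraction(0), "first": Fraction(1), "second": Fraction(0)}
    rest = Fraction(2, 3)  # what remains after PLANAR takes 1/3
    return {
        "planar": Fraction(1, 3),
        "first": rest * Fraction(g1, g1 + g2),
        "second": rest * Fraction(g2, g1 + g2),
    }


# Hypothetical gradient strengths 30 and 10 for angular modes 18 and 50.
weights = dimd_blend_weights(30, 10, mode1=18, mode2=50)
```

  • with gradient strengths 30 and 10, the first mode receives 2/3 × 3/4 = 1/2 and the second mode 2/3 × 1/4 = 1/6, so the three weights sum to 1.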
  • the decoder uses the same steps to obtain the prediction block.
  • the down-sampling vector of the current block can be determined first according to the MIP parameters; then a matrix multiplication is performed on the down-sampling vector to obtain the MIP output vector; the MIP prediction block of the current block is then determined according to the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP prediction block to obtain at least one piece of gradient information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size parameters of the current block, and the sampling step size is determined by the size parameters of the current block.
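  • the down-sampling of the reference samples can be sketched as follows. This is a minimal illustration that treats Haar down-sampling as averaging consecutive groups of samples with rounding; the function name `downsample_boundary` and the parameter `out_size` (the number of samples kept per side, which would follow from the size parameters of the current block) are hypothetical, and the input length is assumed to be a multiple of `out_size`.

```python
from typing import List


def downsample_boundary(ref: List[int], out_size: int) -> List[int]:
    """Average consecutive groups of reference samples down to out_size values."""
    factor = len(ref) // out_size  # sampling step size
    if factor <= 1:
        return list(ref)  # nothing to average
    return [
        (sum(ref[i * factor:(i + 1) * factor]) + factor // 2) // factor
        for i in range(out_size)
    ]


# Hypothetical upper reference row of an 8-wide block.
ref_top = [1, 3, 5, 7, 9, 11, 13, 15]
reduced4 = downsample_boundary(ref_top, 4)
reduced2 = downsample_boundary(ref_top, 2)
```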
  • the splicing order of the downsampled upper reference reconstructed samples and the downsampled left reference reconstructed samples is adjusted according to the MIP transposition indication parameter. If transposition is not required, the downsampled left reference reconstructed samples are spliced after the downsampled upper reference reconstructed samples, and the resulting vector is used as the input (the downsampling vector); if transposition is required, the downsampled upper reference reconstructed samples are spliced after the downsampled left reference reconstructed samples, and the resulting vector is used as the input (the downsampling vector).
  • the MIP matrix coefficients can be obtained using the traversed prediction mode as an index, and the output vector (the MIP output vector) can be calculated from them and the input (the downsampling vector). The output vector is then upsampled according to the number of samples in the output vector and the size parameters of the current block. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the MIP prediction block of the current block. If upsampling is required, upsampling is performed first in the horizontal direction and then in the vertical direction until the size matches that of the template, and the result is output as the MIP prediction block of the current block.
  • the DIMD method can be directly used on the MIP prediction block of the current block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed for the MIP prediction block of the current block, and the gradient information of the at least one intra prediction mode on the MIP prediction block of the current block is calculated.
  • the DIMD calculation process may be performed after the MIP output vector is upsampled.
  • the down-sampling vector of the current block can be determined first according to the MIP parameters; then a matrix multiplication is performed on the down-sampling vector to obtain the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP output vector to obtain at least one piece of gradient information.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the size parameters of the current block, and the sampling step size is determined by the size parameters of the current block.
  • the splicing order of the downsampled upper reference reconstructed samples and the downsampled left reference reconstructed samples is adjusted according to the MIP transposition indication parameter. If transposition is not required, the downsampled left reference reconstructed samples are spliced after the downsampled upper reference reconstructed samples, and the resulting vector is used as the input (the downsampling vector); if transposition is required, the downsampled upper reference reconstructed samples are spliced after the downsampled left reference reconstructed samples, and the resulting vector is used as the input (the downsampling vector).
  • the MIP matrix coefficients can be obtained according to the traversed prediction mode as an index, and the output vector (MIP output vector) can be calculated with the input (downsampling vector).
  • the DIMD method can be directly used on the MIP output vector of the current block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed for the MIP output vector, and the gradient information of the at least one intra prediction mode on the MIP output vector is calculated. The output vector (the MIP output vector) is then upsampled according to the number of samples in the output vector and the size parameters of the current block. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the MIP prediction block of the current block. If upsampling is required, upsampling is performed first in the horizontal direction and then in the vertical direction until the size matches that of the template, and the result is output as the MIP prediction block of the current block.
  • the DIMD calculation process can also be performed before upsampling the MIP output vector.
  • some of the 67 intra prediction modes can be selectively skipped to reduce the number of traversed intra prediction modes, for example by performing selective filtering with a step size of 1.
  • Step 205 According to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set, set the LFNST index number, and write it into the video code stream.
  • one LFNST transform kernel candidate set can be selected from multiple LFNST transform kernel candidate sets according to the mapping mode of the LFNST transform set; the LFNST transform kernel used by the current block is then determined from the selected LFNST transform kernel candidate set, and the LFNST index number is set and written into the video code stream.
  • the index number of the mapping mode of the LFNST transform set can be determined first; the value of the LFNST intra prediction mode index number can then be determined according to the value of that index number; next, one LFNST transform kernel candidate set can be selected from multiple LFNST transform kernel candidate sets according to the value of the LFNST intra prediction mode index number; finally, the LFNST transform kernel used by the current block is selected from the selected LFNST transform kernel candidate set, and the LFNST index number is set and written into the video code stream.
  • the index number of the mapping mode of the LFNST transform set can be further determined, and the value of this index number can then be converted to the value of the LFNST intra prediction mode index number (which can be represented by predModeIntra); one LFNST transform kernel candidate set is then selected from multiple LFNST transform kernel candidate sets based on the value of predModeIntra, and the LFNST transform kernel used by the current block is selected from the selected LFNST transform kernel candidate set.
  • multiple LFNST transform kernel candidate sets may include 4 LFNST transform kernel candidate sets, where each LFNST transform kernel candidate set includes two LFNST transform kernels; accordingly, the first lookup table can be used to determine the value of the LFNST intra prediction mode index number corresponding to the value of the index number.
  • the DC mode, PLANAR mode or angle prediction mode and the transformation set of LFNST can be bound based on the first lookup table, such as the first lookup table shown in Table 1.
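  • a lookup of this kind can be sketched as follows for the case of 4 candidate sets. Table 1 is not reproduced in this passage, so the range boundaries below are an assumption: they follow the VVC-style assignment of predModeIntra ranges to transform set indices and should be checked against Table 1 before being relied on. The function name `lfnst_set_index` is hypothetical.

```python
def lfnst_set_index(pred_mode_intra: int) -> int:
    """Map predModeIntra to one of 4 LFNST transform kernel candidate sets.

    The ranges are an assumed VVC-style mapping, not copied from Table 1.
    """
    if pred_mode_intra < 0:
        return 1
    if pred_mode_intra <= 1:   # PLANAR (0) and DC (1)
        return 0
    if pred_mode_intra <= 12:
        return 1
    if pred_mode_intra <= 23:
        return 2
    if pred_mode_intra <= 44:
        return 3
    if pred_mode_intra <= 55:
        return 2
    if pred_mode_intra <= 80:
        return 1
    raise ValueError("predModeIntra outside the range covered by this sketch")


# Transform set indices for a few mapping modes.
set_for_planar = lfnst_set_index(0)
set_for_mode_34 = lfnst_set_index(34)
```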
  • the multiple LFNST transform kernel candidate sets may also include 35 LFNST transform kernel candidate sets, where each LFNST transform kernel candidate set includes three LFNST transform kernels; accordingly, the second lookup table can be used to determine the value of the LFNST intra prediction mode index number corresponding to the value of the index number.
  • the LFNST transform sets corresponding to different intra prediction modes will be more fine-grained, such as the second lookup table shown in Figure 2.
  • the LFNST transformation kernel can be understood as the transformation matrix of LFNST, which is a plurality of fixed coefficient matrices obtained through training.
  • rate-distortion optimization can be used to select the transform kernel used in the current block.
  • specifically, the rate distortion cost (Rate Distortion Cost, RDCost) can be calculated for each transform kernel using rate distortion optimization, and the transform kernel with the smallest rate distortion cost is then selected as the transform kernel used by the current block.
  • a group of LFNST transformation cores can be selected through RDCost, and the index number corresponding to the LFNST transformation core (which can be represented by lfnst_idx) is written into the video code stream and transmitted to the decoding side.
  • the first group of LFNST transformation kernels i.e., the first group of transformation matrices
  • the second group of LFNST transformation kernels i.e., the second group of transformation matrices
  • the value of the LFNST index number can be used to indicate whether the current block uses LFNST, and can also be used to indicate the index number of the LFNST transformation core in the LFNST transformation core candidate set.
  • for the LFNST index number (that is, lfnst_idx): when the value of the LFNST index number is equal to 0, LFNST will not be used; when the value of the LFNST index number is greater than 0, LFNST will be used, and the index number of the transform kernel is equal to the value of the LFNST index number, or to the value of the LFNST index number minus 1.
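  • the meaning of the LFNST index number can be sketched as follows. This is a minimal illustration of the "minus 1" variant with a 0-based list of kernels; the function name `select_lfnst_kernel` and the string placeholders standing in for kernel matrices are hypothetical.

```python
from typing import List, Optional, TypeVar

K = TypeVar("K")


def select_lfnst_kernel(candidate_set: List[K], lfnst_idx: int) -> Optional[K]:
    """Return the kernel signalled by lfnst_idx, or None when LFNST is off."""
    if lfnst_idx == 0:
        return None  # LFNST is not used for the current block
    # lfnst_idx > 0: kernel position is lfnst_idx - 1 in a 0-based list.
    return candidate_set[lfnst_idx - 1]


# Placeholder names stand in for the two kernels of one candidate set.
candidate_set = ["kernel_A", "kernel_B"]
chosen = select_lfnst_kernel(candidate_set, lfnst_idx=2)
```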
  • the LFNST transform core used by the current block can be determined.
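  • The two readings of the LFNST index number described above can be sketched as follows. The function name and the `zero_based` switch are hypothetical; the text allows either convention (kernel index equal to lfnst_idx, or to lfnst_idx minus 1).

```python
from typing import Optional


def kernel_index_from_lfnst_idx(lfnst_idx: int, zero_based: bool = True) -> Optional[int]:
    """Map lfnst_idx to a kernel index inside the selected candidate set.

    Returns None when lfnst_idx is 0, meaning LFNST is not used.
    With zero_based=True the kernel index is lfnst_idx - 1,
    otherwise it is lfnst_idx itself.
    """
    if lfnst_idx < 0:
        raise ValueError("lfnst_idx must be non-negative")
    if lfnst_idx == 0:
        return None  # LFNST disabled for the current block
    return lfnst_idx - 1 if zero_based else lfnst_idx
```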
  • the MIP parameters may also include a MIP transposition indication parameter, where the value of the MIP transposition indication parameter is used to indicate whether to transpose the sampling point input vector used in the MIP mode.
  • when the value of the MIP transposition indication parameter indicates that the sampling point input vector used in the MIP mode is to be transposed, matrix transposition can be performed on the selected transform kernel to obtain the LFNST transform kernel used by the current block.
  • for example, when the value of the MIP transposition indication parameter is equal to 1, the sampling point input vector used in the MIP mode is considered to be transposed; in this case, the corresponding matrix transposition needs to be performed on the selected transform kernel, so that the LFNST transform kernel used by the current block can be obtained.
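  • A minimal sketch of this step, assuming the kernel is held as a list of rows and that a flag value of 1 means "transpose"; the function name is hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def lfnst_kernel_for_block(selected_kernel: Matrix, mip_transpose_flag: int) -> Matrix:
    """Return the kernel to apply, transposed when the MIP transposition flag is 1."""
    if mip_transpose_flag == 1:
        # zip(*rows) yields the columns, i.e. the rows of the transposed matrix.
        return [list(column) for column in zip(*selected_kernel)]
    return [list(row) for row in selected_kernel]
```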
  • Step 206: Use the LFNST transform kernel to transform the residual block.
  • after one LFNST transform kernel candidate set is selected from the multiple LFNST transform kernel candidate sets and the LFNST transform kernel used by the current block is determined from the selected LFNST transform kernel candidate set, the LFNST transform kernel can be applied, that is, the residual block is transformed using the transformation matrix selected for the current block.
  • the encoder traverses the prediction modes and, if the current coding unit (current block) is in intra mode, obtains the allowed-use flag bit of the encoding and decoding method proposed in the embodiment of this application, that is, the MIP allowed-use flag bit (prediction mode parameter).
  • the flag bit may be a sequence-level flag bit used to indicate whether the current decoder is allowed to use the MIP technology. The sequence-level flag bit can be expressed in the form of sps_mip_enable_flag.
  • if the MIP allowed-use flag is true, the encoding end tries the MIP prediction method, calculates the corresponding rate-distortion cost, and records it as cost1; if the MIP allowed-use flag is false, the encoding end does not try the MIP prediction method, but continues to traverse the other intra prediction technologies and calculates the corresponding rate-distortion costs, recorded as cost2...costN.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the current coding unit size (size parameter of the current block), and the sampling step size is based on the coding unit size.
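  • The down-sampling of a reference boundary can be sketched as a plain averaging of neighbouring samples. This is an illustration under assumptions: the step is taken as the boundary length divided by the target length, and the rounding offset is a choice made here, not something stated in the text.

```python
from typing import List


def downsample_boundary(samples: List[int], target_len: int) -> List[int]:
    """Average consecutive groups of samples so that target_len values remain."""
    if target_len <= 0 or len(samples) % target_len != 0:
        raise ValueError("boundary length must be a positive multiple of target_len")
    step = len(samples) // target_len  # sampling step derived from the block size
    return [
        (sum(samples[i:i + step]) + step // 2) // step  # rounded integer mean
        for i in range(0, len(samples), step)
    ]
```

  • For example, an 8-sample boundary reduced to 4 values uses a step of 2, so each output is the rounded mean of two neighbouring reference samples.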
  • the splicing order of the reference reconstruction sample after downsampling on the upper side and the reference reconstruction sample after downsampling on the left side can be adjusted in combination with the MIP transposition indicator parameter.
  • if transposition is not needed, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is needed, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
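  • The splicing order can be sketched as follows, assuming the two downsampled boundaries are plain lists and the transposition indicator is 0 or 1; the function name is hypothetical.

```python
from typing import List


def build_mip_input(top_ds: List[int], left_ds: List[int], mip_transpose_flag: int) -> List[int]:
    """Concatenate the downsampled boundaries into the MIP input vector.

    Without transposition the upper samples come first, followed by the left
    samples; with transposition the order is reversed.
    """
    if mip_transpose_flag:
        return list(left_ds) + list(top_ds)
    return list(top_ds) + list(left_ds)
```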
  • the MIP matrix coefficients are obtained using the traversed prediction mode as an index, and the output vector (MIP output vector) is calculated from them and the input (downsampling vector). The output vector is then upsampled according to the number of elements in the output vector and the size of the current coding unit. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the prediction block of the current coding unit (MIP prediction block of the current block). If upsampling is required, the horizontal direction is upsampled first and then the vertical direction, and once the result matches the template size it is output as the prediction block of the current coding unit (MIP prediction block of the current block).
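  • The core of this step, a matrix-vector product followed by a row-by-row fill, can be sketched as follows. This is a simplified illustration: the fixed-point offsets, shifts and clipping that a real codec applies are omitted, and the small matrix used in the test is made up.

```python
from typing import List


def mip_matrix_multiply(matrix: List[List[int]], input_vec: List[int]) -> List[int]:
    """Compute the MIP output vector as matrix times input (downsampling) vector."""
    return [sum(c * x for c, x in zip(row, input_vec)) for row in matrix]


def fill_block_row_major(values: List[int], width: int, height: int) -> List[List[int]]:
    """Fill the output vector into a width x height block along the horizontal direction."""
    if len(values) != width * height:
        raise ValueError("vector length does not match the block size")
    return [values[r * width:(r + 1) * width] for r in range(height)]
```

  • When the block is larger than the filled output, the upsampling stage described above would follow; it is not shown here.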
  • if the size parameter of the current block does not meet the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block meets the first preset condition, the DIMD method may be applied to the current MIP prediction block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
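  • The size-based decision can be sketched as follows. The threshold of 32 follows the example in the text; the function name, the callable used for the DIMD derivation, and the use of 0 as the PLANAR mode number (as in VVC) are assumptions of this sketch.

```python
from typing import Callable

PLANAR_MODE = 0  # PLANAR is intra mode 0 in VVC; assumed here


def lfnst_mapping_mode(width: int, height: int,
                       derive_with_dimd: Callable[[], int],
                       size_limit: int = 32) -> int:
    """Return the intra mode used to map the block to an LFNST transform set."""
    if width >= size_limit and height >= size_limit:
        # Large block: skip DIMD and use the preset mode directly.
        return PLANAR_MODE
    # Smaller block: derive the mode from the MIP prediction with DIMD.
    return derive_with_dimd()
```

  • Passing the DIMD derivation as a callable means it is only evaluated for blocks that need it, which is where the complexity saving comes from.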
  • when DIMD is used, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed for the current MIP prediction block: the gradient information of each intra prediction mode on the current MIP prediction block is calculated, the corresponding gradient amplitude value is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value. The intra prediction mode with the largest amplitude is the optimal mode, that is, the optimal mode is used to map the LFNST transform set of the current coding unit.
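  • The idea of ranking directions by accumulated gradient amplitude can be sketched as follows. This is a simplified illustration rather than the DIMD procedure of ECM: it assumes a 3x3 Sobel operator, uses |gx| + |gy| as the amplitude, and groups orientations into a small number of uniform bins instead of the actual angular intra modes.

```python
import math
from typing import List, Tuple


def dominant_gradient_bin(block: List[List[int]], num_bins: int = 8) -> Tuple[int, List[int]]:
    """Return (best_bin, histogram) of gradient amplitudes per orientation bin."""
    height, width = len(block), len(block[0])
    hist = [0] * num_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel responses in the horizontal and vertical directions.
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            amplitude = abs(gx) + abs(gy)
            if amplitude == 0:
                continue  # flat area carries no direction information
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            bin_idx = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[bin_idx] += amplitude
    best_bin = max(range(num_bins), key=hist.__getitem__)
    return best_bin, hist
```

  • A real implementation would map each orientation to one of the traversed intra prediction modes and pick the mode with the largest accumulated amplitude; the uniform bins here only stand in for that mapping.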
  • the difference between the original image block of the current coding unit and the prediction block is taken to obtain the residual block of the current coding unit (current block).
  • after the primary transform is applied to the residual block, the frequency domain coefficient block is obtained, and LFNST is used to perform a secondary transform on the region of interest of the frequency domain coefficient block.
  • the mapping prediction mode of the LFNST transform set has been determined by the above method. Then, through processes such as quantization, inverse quantization, and inverse transformation, the rate-distortion cost of the current coding unit is calculated and recorded as cost1.
  • the encoder traverses the prediction modes and, if the current coding unit (current block) is in intra mode, obtains the allowed-use flag bit of the encoding and decoding method proposed in the embodiment of this application, that is, the MIP allowed-use flag bit (prediction mode parameter).
  • the flag bit may be a sequence-level flag bit used to indicate whether the current decoder is allowed to use the MIP technology. The sequence-level flag bit can be expressed in the form of sps_mip_enable_flag.
  • if the MIP allowed-use flag is true, the encoding end tries the MIP prediction method, calculates the corresponding rate-distortion cost, and records it as cost1; if the MIP allowed-use flag is false, the encoding end does not try the MIP prediction method, but continues to traverse the other intra prediction technologies and calculates the corresponding rate-distortion costs, recorded as cost2...costN.
  • Haar down-sampling can be performed on the obtained peripheral reference reconstruction samples according to the current coding unit size (size parameter of the current block), and the sampling step size is based on the coding unit size.
  • the splicing order of the reference reconstruction sample after downsampling on the upper side and the reference reconstruction sample after downsampling on the left side can be adjusted in combination with the MIP transposition indicator parameter.
  • if transposition is not needed, the downsampled left reference reconstruction samples are spliced after the downsampled upper reference reconstruction samples, and the resulting vector is used as the input (downsampling vector); if transposition is needed, the downsampled upper reference reconstruction samples are spliced after the downsampled left reference reconstruction samples, and the resulting vector is used as the input (downsampling vector).
  • the MIP matrix coefficients are obtained according to the traversed prediction mode as an index, and the output vector (MIP output vector) is calculated with the input (downsampling vector).
  • if the size parameter of the current block does not meet the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block meets the first preset condition, the DIMD method may be applied to the current MIP output vector to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
  • when DIMD is used, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed for the current MIP output vector: the gradient information of each intra prediction mode on the current MIP output vector is calculated, the corresponding gradient amplitude value is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value. The intra prediction mode with the largest amplitude is the optimal mode, that is, the optimal mode is used to map the LFNST transform set of the current coding unit.
  • the output vector can then be upsampled according to the number of elements in the output vector and the size of the current coding unit. If upsampling is not required, the vector is filled in sequence along the horizontal direction and output as the prediction block of the current coding unit (MIP prediction block of the current block). If upsampling is required, the horizontal direction is upsampled first and then the vertical direction, and once the result matches the template size it is output as the prediction block of the current coding unit (MIP prediction block of the current block).
  • the difference between the original image block of the current coding unit and the prediction block is taken to obtain the residual block of the current coding unit (current block).
  • after the primary transform is applied to the residual block, the frequency domain coefficient block is obtained, and LFNST is used to perform a secondary transform on the region of interest of the frequency domain coefficient block.
  • the mapping prediction mode of the LFNST transform set has been determined by the above method. Then, through processes such as quantization, inverse quantization, and inverse transformation, the rate-distortion cost of the current coding unit is calculated and recorded as cost1.
  • the encoding method proposed in the embodiment of this application reduces the software and hardware complexity of the JVET-Z0048 solution while maintaining similar performance, with no performance change in the luma component. Compared with ECM4.0, it maintains the same performance as JVET-Z0048.
  • the encoding method proposed in the embodiment of the present application can be used only on B frames, or on both I frames and B frames at the same time.
  • the encoding method proposed in the embodiment of this application can also be used only on I frames.
  • the conditions under which the coding method proposed in the embodiment of the present application is allowed to be used are different on B frames or I frames.
  • for example, the I frame allows coding units of all sizes to use the coding method proposed in the embodiment of the present application, while the B frame allows only small-sized coding units to use it.
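  • One such frame-dependent condition can be sketched as follows. The text does not say what counts as a "small-sized" coding unit, so the limit of 16 below is purely hypothetical, as are the function name and the string frame types.

```python
def method_allowed(frame_type: str, width: int, height: int,
                   b_frame_max_size: int = 16) -> bool:
    """Decide whether the proposed coding method may be used for a coding unit."""
    if frame_type == "I":
        return True  # I frames: coding units of all sizes are allowed
    if frame_type == "B":
        # B frames: only small coding units (hypothetical size limit)
        return width <= b_frame_max_size and height <= b_frame_max_size
    return False
```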
  • if the luma component of the current coding unit uses the MIP prediction mode, and the chroma component uses neither the MIP prediction mode nor a traditional intra prediction mode, then the LFNST transform set of the chroma component can inherit the LFNST transform set of the luma component.
  • the LFNST transform sets of the luma and chroma components can be derived according to the encoding method proposed in the embodiment of the present application.
  • the coding method proposed in the embodiment of this application involves using DIMD to derive the LFNST transform set to which a MIP prediction block is mapped.
  • On the one hand, it is proposed to limit the size of the coding units that use DIMD. For larger image blocks, the MIP output vector is upsampled more heavily and the direction information is not obvious; therefore, the process of deriving the traditional prediction mode with DIMD is skipped to reduce the computational complexity.
  • On the other hand, on the basis of limiting the size of the coding units that use DIMD, the computational complexity is further reduced by using the MIP output vector before upsampling as the input of DIMD to derive the optimal traditional intra prediction mode.
  • the embodiment of the present application provides a coding method.
  • In the encoder, the prediction mode parameters are determined; when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; based on the MIP parameters, the intra prediction block of the current block is determined, and the residual block between the current block and the intra prediction value is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, the LFNST transform kernel used by the current block is determined from the selected LFNST transform kernel candidate set, and the LFNST index number is set and written into the video code stream; the LFNST transform kernel is used to transform the residual block.
  • In this way, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, DIMD need not be used to derive the mapping mode, which reduces computational complexity and thereby improves coding efficiency.
  • Figure 8 is a schematic structural diagram of an encoder.
  • the encoder 110 may include: a first determination unit 111, an encoding unit 112, and a first transformation unit 113; where,
  • the first determination unit 111 is configured to determine a prediction mode parameter; when the prediction mode parameter indicates that the current block uses MIP to determine an intra prediction value, determine the MIP parameters of the current block; according to the MIP parameters, determine the intra prediction block of the current block, and calculate the residual block between the current block and the intra prediction value; when the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set, and set the LFNST index number;
  • the encoding unit 112 is configured to write the LFNST index number into the video code stream;
  • the first transformation unit 113 is configured to use the LFNST transformation kernel to perform transformation processing on the residual block.
  • the "unit" may be part of a circuit, part of a processor, part of a program or software, etc., and of course may also be a module, or may be non-modular.
  • each component in this embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software function modules.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage media include: U disk, mobile hard disk, Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk or optical disk and other media that can store program code.
  • embodiments of the present application provide a computer-readable storage medium for use in the encoder 110.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is executed by the first processor, the method described in any one of the foregoing embodiments is implemented.
  • Figure 9 is a second schematic structural diagram of the encoder.
  • the encoder 110 may include: a first memory 114, a first processor 115, a first communication interface 116, and a first bus system 117.
  • the first memory 114 , the first processor 115 , and the first communication interface 116 are coupled together through a first bus system 117 .
  • the first bus system 117 is used to implement connection communication between these components.
  • the first bus system 117 also includes a power bus, a control bus and a status signal bus.
  • the various buses are all labeled as the first bus system 117 in FIG. 9, where:
  • the first communication interface 116 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • the first memory 114 is used to store a computer program that can run on the first processor
  • the first processor 115 is configured to, when running the computer program, determine prediction mode parameters; when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, determine the MIP parameters of the current block; according to the MIP parameters, determine the intra prediction block of the current block, and calculate the residual block between the current block and the intra prediction value; when the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set, set the LFNST index number and write it into the video code stream; and use the LFNST transform kernel to transform the residual block.
  • the first memory 114 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
  • non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
  • Volatile memory may be Random Access Memory (RAM), which is used as an external cache.
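  • many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM).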
  • the first memory 114 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
  • the first processor 115 may be an integrated circuit chip with signal processing capabilities. During the implementation process, each step of the above method can be completed by instructions in the form of hardware integrated logic circuits or software in the first processor 115 .
  • the above-mentioned first processor 115 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
  • the storage medium is located in the first memory 114.
  • the first processor 115 reads the information in the first memory 114 and completes the steps of the above method in combination with its hardware.
  • the embodiments described in this application can be implemented using hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units used to perform the functions described in this application, or a combination thereof.
  • the technology described in this application can be implemented through modules (such as procedures, functions, etc.) that perform the functions described in this application.
  • Software code may be stored in memory and executed by a processor.
  • the memory can be implemented in the processor or external to the processor.
  • the first processor 115 is further configured to perform the method described in any of the preceding embodiments when running the computer program.
  • Figure 10 is a schematic structural diagram of a decoder.
  • the decoder 120 may include: a second determination unit 121 and a second transformation unit 122; where,
  • the second determination unit 121 is configured to decode the code stream and determine the prediction mode parameters; when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, decode the code stream and determine the MIP parameters of the current block; decode the code stream and determine the transform coefficients and LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set;
  • the second transform unit 122 is configured to use the LFNST transform kernel to perform transform processing on the transform coefficients.
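  • The decoder-side selection chain (LFNST index number, mapping mode, candidate set, kernel) can be sketched as follows. The mode ranges in `transform_set_index` are illustrative and loosely modeled on the four-set mapping used in VVC; they are not the lookup tables of this application, and all function names and the string kernels are hypothetical.

```python
from typing import List, Optional


def transform_set_index(mapping_mode: int) -> int:
    """Map an intra mode (0..66) to a candidate-set index (illustrative ranges)."""
    if not 0 <= mapping_mode <= 66:
        raise ValueError("mapping_mode must be in 0..66")
    if mapping_mode <= 1:
        return 0
    if mapping_mode <= 12:
        return 1
    if mapping_mode <= 23:
        return 2
    if mapping_mode <= 44:
        return 3
    if mapping_mode <= 55:
        return 2
    return 1


def select_decoder_kernel(lfnst_idx: int, mapping_mode: int,
                          candidate_sets: List[List[str]]) -> Optional[str]:
    """Return the kernel to use, or None when lfnst_idx signals that LFNST is off."""
    if lfnst_idx == 0:
        return None
    kernels = candidate_sets[transform_set_index(mapping_mode)]
    return kernels[lfnst_idx - 1]  # assumes the "index minus 1" convention
```

  • The selected kernel would then be applied to the decoded transform coefficients by the second transform unit.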
  • the "unit" may be part of a circuit, part of a processor, part of a program or software, etc., and of course may also be a module, or may be non-modular.
  • each component in this embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software function modules.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage media include: U disk, mobile hard disk, Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk or optical disk and other media that can store program code.
  • embodiments of the present application provide a computer-readable storage medium for use in the decoder 120.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is executed by the second processor, the method described in any one of the foregoing embodiments is implemented.
  • Figure 11 is a second schematic structural diagram of the decoder.
  • the decoder 120 may include: a second memory 123, a second processor 124, a second communication interface 125, and a second bus system 126.
  • the second memory 123, the second processor 124, and the second communication interface 125 are coupled together through a second bus system 126.
  • the second bus system 126 is used to implement connection communication between these components.
  • the second bus system 126 also includes a power bus, a control bus and a status signal bus.
  • the various buses are all labeled as the second bus system 126 in FIG. 11, where:
  • the second communication interface 125 is used for receiving and sending signals during the process of sending and receiving information with other external network elements
  • the second memory 123 is used to store computer programs that can run on the second processor
  • the second processor 124 is configured to, when running the computer program, decode the code stream and determine prediction mode parameters; when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, decode the code stream to determine the MIP parameters of the current block; decode the code stream to determine the transform coefficients and LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets according to the mapping mode of the LFNST transform set, and determine the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set; and transform the transform coefficients using the LFNST transform kernel.
  • the second memory 123 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
  • non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
  • Volatile memory may be Random Access Memory (RAM), which is used as an external cache.
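  • many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM).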
  • the second memory 123 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
  • the second processor 124 may be an integrated circuit chip with signal processing capabilities. During the implementation process, each step of the above method can be completed by instructions in the form of hardware integrated logic circuits or software in the second processor 124 .
  • the above-mentioned second processor 124 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
  • the storage medium is located in the second memory 123.
  • the second processor 124 reads the information in the second memory 123 and completes the steps of the above method in combination with its hardware.
  • the embodiments described in this application can be implemented using hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units used to perform the functions described in this application, or a combination thereof.
  • the technology described in this application can be implemented through modules (such as procedures, functions, etc.) that perform the functions described in this application.
  • Software code may be stored in memory and executed by a processor.
  • the memory can be implemented in the processor or external to the processor.
  • Embodiments of the present application provide an encoder and a decoder that, when deriving a prediction block, determine the mapping mode of the LFNST transform set according to the size parameter in the MIP parameters of the current block. For larger image blocks, the mapping mode need not be derived using DIMD, which reduces computational complexity and thereby improves coding efficiency.
  • Embodiments of the present application provide a coding and decoding method, an encoder, a decoder, and a storage medium.
  • at the decoding end, the code stream is decoded to determine the prediction mode parameters; when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, the code stream is decoded to determine the MIP parameters of the current block; the code stream is decoded to determine the transform coefficients and the LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, and the LFNST transform kernel used by the current block is determined from the selected candidate set; the transform coefficients are transformed using the LFNST transform kernel.
  • at the encoding end, the prediction mode parameters are determined; when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; based on the MIP parameters, the intra prediction block of the current block is determined and the residual block between the current block and the intra prediction value is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, the LFNST transform kernel used by the current block is determined from the selected candidate set, and the LFNST index number is set and written into the video code stream; the residual block is transformed using the LFNST transform kernel.
  • it can be seen that, in the embodiments of the present application, when the prediction block is derived, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, the mapping mode need not be derived using DIMD, which reduces computational complexity and thereby improves coding efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

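Embodiments of the present application provide a decoding method. At the decoding end, the code stream is decoded to determine the prediction mode parameters; when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, the code stream is decoded to determine the MIP parameters of the current block; the code stream is decoded to determine the transform coefficients and the LFNST index number of the current block; when the LFNST index number indicates that the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple LFNST transform kernel candidate sets, and the LFNST transform kernel used by the current block is determined from the selected LFNST transform kernel candidate set; and the transform coefficients are transformed using the LFNST transform kernel.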

Description

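Coding and decoding method, encoder, decoder, and storage medium
Technical Field
The embodiments of the present application relate to the field of image processing technology, and in particular to a coding and decoding method, an encoder, a decoder, and a storage medium.
Background
As people's requirements for video display quality increase, new forms of video application such as high-definition and ultra-high-definition video have emerged. H.265/High Efficiency Video Coding (HEVC) can no longer meet the needs of rapidly developing video applications. The Joint Video Exploration Team (JVET) proposed the next-generation video coding standard H.266/Versatile Video Coding (VVC), whose corresponding test model is the VVC reference software test platform (VVC Test Model, VTM). The Enhanced Compression Model (ECM) started from VTM10.0 and accepts newer and more efficient compression algorithms.
Decoder side Intra Mode Derivation (DIMD) is an intra prediction technique of ECM. Its core idea is that the decoding end derives the intra prediction mode using the same method as the encoding end, thereby saving bit overhead.
However, the use of DIMD introduces considerable complexity in both software and hardware, which increases the cost of compression.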
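Summary
Embodiments of the present application provide a coding and decoding method, an encoder, a decoder, and a storage medium, which can reduce computational complexity and thereby improve coding efficiency.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, embodiments of the present application provide a decoding method, applied to a decoder, the method comprising:
decoding the code stream to determine prediction mode parameters;
when the prediction mode parameters indicate that MIP is used to determine the intra prediction value, decoding the code stream to determine the MIP parameters of the current block;
decoding the code stream to determine the transform coefficients and the LFNST index number of the current block;
when the LFNST index number indicates that the current block uses LFNST, determining the mapping mode of the LFNST transform set according to the MIP parameters;
according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determining the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set;
transforming the transform coefficients using the LFNST transform kernel.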
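In a second aspect, embodiments of the present application provide an encoding method, applied to an encoder, the method comprising:
determining prediction mode parameters;
when the prediction mode parameters indicate that the current block uses MIP to determine the intra prediction value, determining the MIP parameters of the current block;
according to the MIP parameters, determining the intra prediction block of the current block, and calculating the residual block between the current block and the intra prediction value;
when the current block uses LFNST, determining the mapping mode of the LFNST transform set according to the MIP parameters;
according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determining the LFNST transform kernel used by the current block from the selected LFNST transform kernel candidate set, and setting the LFNST index number and writing it into the video code stream;
transforming the residual block using the LFNST transform kernel.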
第三方面,本申请实施例提供了一种编码器,所述编码器包括第一确定单元,编码单元,第一变换单元;其中,
所述第一确定单元,配置为确定预测模式参数;当所述预测模式参数指示所述当前块使用MIP确定帧内预测值时,确定当前块的MIP参数;根据所述MIP参数,确定所述当前块的帧内预测块,计算所述当前块与所述帧内预测值之间的残差块;当所述当前块使用LFNST时,根据所述MIP参数,确定LFNST变换集的映射模式;根据所述LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定所述当前块使用的LFNST变换 核,设置LFNST索引序号;
所述编码单元,配置为写入视频码流;
所述第一变换单元,还配置为使用所述LFNST变换核,对所述残差块进行变换处理。
第四方面,本申请实施例提供了一种编码器,所述编码器包括第一存储器和第一处理器;其中,
所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;
所述第一处理器,用于在运行所述计算机程序时,执行如第二方面所述的方法。
第五方面,本申请实施例提供了一种解码器,所述解码器包括第二确定单元,第二变换单元;其中,
所述第二确定单元,配置为解码码流,确定预测模式参数;当所述预测模式参数指示使用MIP确定帧内预测值时,解码码流,确定当前块的MIP参数;解码码流,确定所述当前块的变换系数和LFNST索引序号;当所述LFNST索引序号指示所述当前块使用LFNST时,根据所述MIP参数,确定LFNST变换集的映射模式;根据所述LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定所述当前块使用的LFNST变换核;
所述第二变换单元,配置为使用所述LFNST变换核,对所述变换系数进行变换处理。
第六方面,本申请实施例提供了一种解码器,所述解码器包括第二存储器和第二处理器;其中,
所述第二存储器,用于存储能够在所述第二处理器上运行的计算机程序;
所述第二处理器,用于在运行所述计算机程序时,执行如第一方面所述的方法。
第七方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如第一方面所述的方法、或者实现如第二方面所述的方法。
本申请实施例提供了一种编解码方法、编码器、解码器以及存储介质,在解码端,解码码流,确定预测模式参数;当预测模式参数指示使用MIP确定帧内预测值时,解码码流,确定当前块的MIP参数;解码码流,确定当前块的变换系数和LFNST索引序号;当LFNST索引序号指示当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式;根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核;使用LFNST变换核,对变换系数进行变换处理。在编码端,确定预测模式参数;当预测模式参数指示当前块使用MIP确定帧内预测值时,确定当前块的MIP参数;根据MIP参数,确定当前块的帧内预测块,计算当前块与帧内预测值之间的残差块;当当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式;根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核,设置LFNST索引序号并写入视频码流;使用LFNST变换核,对残差块进行变换处理。由此可见,在本申请的实施例中,在进行预测块的导出时,根据当前块的MIP参数中的尺寸参数确定LFNST变换集的映射模式,其中,对于较大尺寸的图像块,可以选择不使用DIMD导出映射模式,能够降低计算复杂度,从而可以提高编码效率。
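Brief Description of the Drawings
FIG. 1 is a schematic diagram of matrix-based intra prediction;
FIG. 2 is a table of correspondences between intra prediction modes and transform sets;
FIG. 3 is a schematic diagram of decoder side intra mode derivation;
FIG. 4 is a block diagram of a video encoding system provided by an embodiment of the present application;
FIG. 5 is a block diagram of a video decoding system provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of the decoding method proposed in an embodiment of the present application;
FIG. 7 is a schematic flowchart of the encoding method proposed in an embodiment of the present application;
FIG. 8 is a first schematic structural diagram of an encoder;
FIG. 9 is a second schematic structural diagram of an encoder;
FIG. 10 is a first schematic structural diagram of a decoder;
FIG. 11 is a second schematic structural diagram of a decoder.
Detailed Description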
为了能够更加详尽地了解本申请实施例的特点与技术内容,下面结合附图对本申请实施例的实现进行详细阐述,所附附图仅供参考说明之用,并非用来限定本申请实施例。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些 实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。还需要指出,本申请实施例所涉及的术语“第一\第二\第三”仅是用于区别类似的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
在视频图像中,一般采用第一图像分量、第二图像分量和第三图像分量来表征编码块(Coding Block,CB);其中,这三个图像分量分别为一个亮度分量、一个蓝色色度分量和一个红色色度分量,具体地,亮度分量通常使用符号Y表示,蓝色色度分量通常使用符号Cb或者U表示,红色色度分量通常使用符号Cr或者V表示;这样,视频图像可以用YCbCr格式表示,也可以用YUV格式表示。
在本申请实施例中,第一图像分量可以为亮度分量,第二图像分量可以为蓝色色度分量,第三图像分量可以为红色色度分量,但是本申请实施例不作具体限定。
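Common video coding standards all adopt a block-based hybrid coding framework. Each frame of a video image is partitioned into square largest coding units (Largest Coding Unit, LCU) or coding tree units (Coding Tree Unit, CTU) of the same size (for example 128×128 or 64×64). Each largest coding unit or coding tree unit can be further divided into rectangular coding units (Coding Unit, CU) according to rules, and a coding unit may be further divided into smaller prediction units (Prediction Unit, PU), transform units (Transform Unit, TU), and so on.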
混合编码框架可以包括有预测(Prediction)、变换(Transform)、量化(Quantization)、熵编码(Entropy coding)、环路滤波(Inloop Filter)等模块。其中,预测模块可以包括帧内预测(Intra Prediction)和帧间预测(Inter Prediction),帧间预测可以包括运动估计(Motion Estimation)和运动补偿(Motion Compensation)。由于视频图像的一个帧内相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测方式能够消除相邻像素之间的空间冗余;但是由于视频图像中的相邻帧之间也存在着很强的相似性,在视频编解码技术中使用帧间预测方式消除相邻帧之间的时间冗余,从而能够提高编解码效率。
视频编解码器的基本流程如下:在编码端,将一帧图像划分成块,对当前块使用帧内预测或帧间预测产生当前块的预测块,当前块的原始块减去预测块得到残差块,对残差块进行变换、量化得到量化系数矩阵,对量化系数矩阵进行熵编码输出到码流中。在解码端,对当前块使用帧内预测或帧间预测产生当前块的预测块,另一方面解码码流得到量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块,将预测块和残差块相加得到重建块。重建块组成重建图像,基于图像或基于块对重建图像进行环路滤波得到解码图像。编码端同样需要和解码端类似的操作获得解码图像。解码图像可以为后续的帧作为帧间预测的参考帧。编码端确定的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息如果有必要需要在输出到码流中。解码端通过解码及根据已有信息进行分析确定与编码端相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。编码端获得的解码图像通常也叫做重建图像。在预测时可以将当前块划分成预测单元,在变换时可以将当前块划分成变换单元,预测单元和变换单元的划分可以不同。上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请实施例适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。
当前块(current block)可以是当前编码单元(CU)或当前预测单元(PU)等。
国际视频编码标准制定组织JVET已成立超越H.266/VVC编码模型研究的小组,并将该模型,即平台测试软件,命名为增强的压缩模型(Enhanced Compression Model,ECM)。ECM在VTM10.0的基础上开始接收更新和更高效的压缩算法,目前已超越VVC约13%的编码性能。ECM不仅扩大了特定分辨率的编码单元尺寸,同时也集成了许多帧内预测和帧间预测技术。
下面将针对基于矩阵的帧内预测(Matrix-based Intra Prediction,MIP)技术的相关技术方案进行描述。
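Matrix-based intra prediction, i.e. MIP, can be divided into three main steps: downsampling, matrix multiplication, and upsampling. The first step downsamples the spatially neighbouring reconstructed samples, and the resulting downsampled sample sequence serves as the input vector of the second step. The second step takes the output vector of the first step as input, multiplies it by a preset matrix, adds a bias vector, and outputs the computed sample vector. The third step takes the output vector of the second step as input and upsamples it into the final prediction block. FIG. 1 is a schematic diagram of matrix-based intra prediction and illustrates this process. In the first step, MIP averages the reconstructed samples adjacent to the top of the current coding unit to obtain the top downsampled reconstructed sample vector, and averages the left neighbouring reconstructed samples to obtain the left downsampled reconstructed sample vector. The top vector and the left vector are the input to the matrix-vector multiplication of the second step, where A_k is a preset matrix, b_k is a preset bias vector, and k is the MIP mode index. The third step upsamples the result of the second step by linear interpolation to obtain a block of prediction samples whose count matches the actual number of samples in the coding unit.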
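The three MIP steps described above (downsampling, matrix multiplication, upsampling) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy matrix and bias, and the simplified interpolation that holds the last sample are hypothetical and do not reproduce the exact VVC or ECM procedure.

```python
def average_downsample(samples, out_len):
    """Average consecutive groups of samples so the result has out_len entries."""
    group = len(samples) // out_len
    return [sum(samples[i * group:(i + 1) * group]) / group for i in range(out_len)]


def upsample_1d(values, factor):
    """Linear interpolation: each sample is followed by factor - 1 interpolated ones."""
    out = []
    for i, v in enumerate(values):
        nxt = values[i + 1] if i + 1 < len(values) else v  # hold the last sample
        out.extend(v + (nxt - v) * k / factor for k in range(factor))
    return out


def mip_predict(top, left, matrix, bias, reduced, pred_size, width, height):
    # Step 1: downsample the top and left neighbours and concatenate them.
    vec = average_downsample(top, reduced) + average_downsample(left, reduced)
    # Step 2: matrix-vector product plus bias (stand-ins for A_k and b_k).
    flat = [sum(a * v for a, v in zip(row, vec)) + b for row, b in zip(matrix, bias)]
    small = [flat[r * pred_size:(r + 1) * pred_size] for r in range(pred_size)]
    # Step 3: upsample horizontally, then vertically, to the block size.
    rows = [upsample_1d(row, width // pred_size) for row in small]
    cols = [upsample_1d([row[c] for row in rows], height // pred_size)
            for c in range(width)]
    return [[cols[c][r] for c in range(width)] for r in range(height)]
```

For example, `mip_predict([10] * 4, [20] * 4, [[0.25] * 4] * 4, [0] * 4, 2, 2, 4, 4)` builds a 4×4 prediction block from a toy averaging matrix.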
对于不同块尺寸的编码单元,MIP的模式个数有所不同。以H.266/VVC为例,对于4×4大小的编 码单元,MIP有16种预测模式;对于8×8大小的编码单元,或者宽、高等于4的编码单元,MIP有8种预测模式;对于其他尺寸的编码单元,MIP有6种预测模式。同时,MIP技术有一个转置功能,对于符合当前尺寸的预测模式,MIP在编码端都会尝试转置计算。若需要转置,则将输入上侧和左侧输入向量的顺序进行调换,而矩阵计算后再将输出调换。
因此,MIP不仅需要一个标志位来表示当前编码单元是否使用MIP技术,同时,若当前编码单元使用MIP技术,则需要额外传输一个转置标志位和MIP模式索引到解码端。
在VVC标准文本当中,MIP的转置标志位由定长编码方式(Fixed Length,FL)二值化,长度为1。而MIP的模式索引由截断二进制编码方式(Truncated Binary,TB)二值化。
低频不可分二次变换(Low-Frequency Non-Separable Transform,LFNST)技术同样是VVC文本中的已采纳的技术,下面将针对LFNST的相关技术方案进行描述。
LFNST在编码端应用在正向主变换以及量化之间,而在解码端应用在反量化和反主变换之间。在当前编码块的残差经过主变换后得到频域的系数,LFNST在这基础之上对部分系数再做频域转换,对部分频域系数进行变换,得到另一个域的系数,之后再做量化以及熵编码等操作。LFNST进一步去除了统计冗余,在VVC的参考软件VTM上有着不错的性能表现。
LFNST主要对变换块左上角的4×4或者8×8区域进行二次变换,此外,LFNST的变换核在VVC中主要归类为4个变换集,每个变换集有2个候选变换核。而在ECM中,LFNST的变换核进行了拓展,从原始的4个变换集拓展成35个变换集,从原来的每个变换集2个候选变换核拓展成每个变换集3个候选变换核。
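LFNST may be applied to both intra prediction and inter prediction. For intra prediction, LFNST selects the transform set according to the intra prediction mode, which saves bit overhead: since intra prediction usually has a corresponding intra prediction mode, namely the DC mode, the PLANAR mode, or an angular prediction mode, these intra prediction modes are bound to the LFNST transform sets. For example, in VVC the DC mode and the PLANAR mode correspond to the first transform set, as shown in Table 1 below.
Table 1
predModeIntra                  SetIdx
predModeIntra < 0              1
0 <= predModeIntra <= 1        0
2 <= predModeIntra <= 12       1
13 <= predModeIntra <= 23      2
24 <= predModeIntra <= 44      3
45 <= predModeIntra <= 55      2
56 <= predModeIntra <= 80      1
81 <= predModeIntra <= 83      0
Here, predModeIntra may be the intra prediction mode indicator and SetIdx may be the LFNST index number. The value of the LFNST index number is set to indicate that the current block uses LFNST and to indicate the index of the LFNST transform kernel within the LFNST transform kernel candidate set. For example, if the LFNST transform set includes four transform kernel candidate sets (set0, set1, set2, set3), they correspond to SetIdx values of 0, 1, 2, and 3 respectively.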
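The mapping in Table 1 amounts to a range lookup. The following sketch simply transcribes the table rows; the function name is hypothetical, and the behaviour for values above 83 (raising an error) is an assumption, since the table does not cover them.

```python
def lfnst_set_idx(pred_mode_intra):
    """Return SetIdx for a given predModeIntra, following the rows of Table 1."""
    if pred_mode_intra < 0:
        return 1
    # Each entry is (inclusive upper bound of the range, SetIdx).
    for upper, set_idx in ((1, 0), (12, 1), (23, 2), (44, 3), (55, 2), (80, 1), (83, 0)):
        if pred_mode_intra <= upper:
            return set_idx
    raise ValueError("predModeIntra outside the range covered by Table 1")
```

Under this lookup, PLANAR (0) and DC (1) both select set 0, which matches the statement that they correspond to the first transform set.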
相应的,在ECM中对LFNST的变换核进行拓展之后,不同帧内预测模式对应的LFNST变换集对应就会更细粒度一些,例如,图2为帧内预测模式与变换集的对应表,如图2所示,拓展之后的对应有35个变换集。
下面将针对解码端帧内模式导出(Decoder side Intra Mode Derivation,DIMD)技术的相关技术方案进行描述。
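DIMD is an intra prediction technique of ECM that does not exist in VVC. Its core idea is that the decoding end derives the intra prediction mode using the same method as the encoding end, which avoids transmitting the intra prediction mode index of the current coding unit in the code stream and so saves bit overhead. The procedure has two main steps. The first step derives the prediction modes, using the same prediction-mode strength calculation at both the encoding and decoding ends. The encoding end uses the Sobel operator to build a histogram of gradients over the prediction modes. The region used consists of the three rows of neighbouring reconstructed samples above the current block, the three columns of neighbouring reconstructed samples to its left, and the corresponding top-left neighbouring reconstructed samples. Computing the histogram of gradients over this L-shaped region yields the first prediction mode, which has the largest amplitude in the histogram, and the second prediction mode, which has the second-largest amplitude. The decoding end derives the first and second prediction modes by the same steps. The second step derives the prediction block, using the same derivation at both ends. The encoding end checks two conditions: (1) the gradient of the second prediction mode is not 0; (2) neither the first prediction mode nor the second prediction mode is the PLANAR or DC prediction mode. If the two conditions do not both hold, the current prediction block is computed using only the first prediction mode, i.e. the ordinary prediction process is applied with the first prediction mode. Otherwise, i.e. when both conditions hold, the current prediction block is derived by weighted averaging. Specifically, the PLANAR mode takes 1/3 of the weight, and the remaining 2/3 is shared between the two derived modes: the first prediction mode is weighted by the ratio of its gradient strength to the sum of the gradient strengths of the first and second prediction modes, and the second prediction mode is weighted by the ratio of its gradient strength to that same sum. The three prediction modes, namely PLANAR, the first prediction mode, and the second prediction mode, are weighted and averaged to obtain the prediction block of the current coding unit. The decoding end obtains the prediction block by the same steps. FIG. 3 is a schematic diagram of decoder side intra mode derivation and illustrates this procedure.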
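In the second step above, the weights are calculated as follows:
Weight(PLANAR)=1/3          (1)
Weight(mode1)=2/3×(amp1/(amp1+amp2))       (2)
Weight(mode2)=1–Weight(PLANAR)–Weight(mode1)        (3)
Here, mode1 and mode2 denote the first prediction mode and the second prediction mode, and amp1 and amp2 denote the gradient amplitude of the first prediction mode and that of the second prediction mode respectively. For DIMD, a flag needs to be transmitted to the decoding end to indicate whether the current coding unit uses DIMD.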
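Formulas (1) to (3), together with the two blending conditions, can be checked with a few lines of exact arithmetic. This is a sketch only: the function names are hypothetical, the mode numbers PLANAR = 0 and DC = 1 follow the usual VVC numbering, and a real codec would use integer arithmetic rather than fractions.

```python
from fractions import Fraction

PLANAR, DC = 0, 1  # conventional mode numbers, assumed here


def use_blending(mode1, mode2, amp2):
    """Blend only if the second gradient is non-zero and neither mode is PLANAR/DC."""
    return amp2 != 0 and mode1 not in (PLANAR, DC) and mode2 not in (PLANAR, DC)


def dimd_weights(amp1, amp2):
    """Return (Weight(PLANAR), Weight(mode1), Weight(mode2)) per formulas (1)-(3)."""
    w_planar = Fraction(1, 3)
    w_mode1 = Fraction(2, 3) * Fraction(amp1, amp1 + amp2)
    w_mode2 = 1 - w_planar - w_mode1
    return w_planar, w_mode1, w_mode2
```

With amp1 = 30 and amp2 = 10, the first mode receives 2/3 × 3/4 = 1/2 and the second mode the remaining 1/6, so the three weights sum to 1.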
为了对MIP与LFNST进行改进,在VVC以及ECM中,MIP与LFNST的关系都被简单化,所有的MIP预测模式在映射到LFNST的变换集过程前都被默认当成PLANAR模式。这是因为LFNST在设计初期就是利用帧内预测模式作为训练输入,通过深度学习训练得到LFNST的变换核系数,而MIP的预测模式与传统的帧内预测模式表达不同,MIP预测模式代表某一个预测矩阵系数,传统预测模式代表方向性。同时,MIP的预测结果与传统的PLANAR模式类似,因此MIP所有的预测模式都采用PLANAR映射到LFNST的变换集上。
可以选择对MIP的预测块使用DIMD对每个传统帧内预测模式的梯度幅度值进行排序,将最优可能的传统预测模式去映射到LFNST的变换集上。同时,也可以选择扩展原始MIP允许使用LFNST的编码单元尺寸范围,VVC和ECM中仅允许当前编码单元宽和高均大于等于16的时候,允许MIP使用LFNST。而在扩展之后,在当前编码单元宽和高均大于等于4的时候,允许MIP使用LFNST。
虽然上述方法很好的解决了MIP预测模式映射到LFNST的问题,也提升了编码效率,但同时也引入了相应的复杂度,无论是软件上的编解码时间还是硬件实现上的缓存以及时序问题等。
且提出的对MIP使用LFNST的范围作出拓展,允许条件变成宽高均大于等于4即可使用LFNST,这样使用DIMD对MIP预测块导出传统预测模式的范围扩大,编解码复杂度也会继续增加。
相对于编码端,在解码端增加复杂度代价更大。LFNST的4×4编码单元使用范围在原本的VVC以及ECM中均已支持,相信并不会带来太多的额外顾虑,但对预测块的DIMD导出过程是目前VVC和ECM中均没有的技术和操作。若能减少DIMD的使用条件或者简化步骤,同时保持相应的编码性能,便可以大大提升解码端的解码效率。
也就是说,常见的基于DIMD技术的编解码方案,会在软件和硬件方面的均引入了较大的复杂度,增大了压缩代价,降低了编码效率。
为了解决上述问题,在本申请的实施例中,在解码端,解码码流,确定预测模式参数;当预测模式参数指示使用MIP确定帧内预测值时,解码码流,确定当前块的MIP参数;解码码流,确定当前块的变换系数和LFNST索引序号;当LFNST索引序号指示当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式;根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核;使用LFNST变换核,对变换系数进行变换处理。在编码端,确定预测模式参数;当预测模式参数指示当前块使用MIP确定帧内预测值时,确定当前块的MIP参数;根据MIP参数,确定当前块的帧内预测块,计算当前块与帧内预测值之间的残差块;当当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式;根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核,设置LFNST索引序号并写入视频码流;使用LFNST变换核,对残差块进行变换处理。由此可见,在本申请的实施例中,在进行预测块的导出时,根据当前块的MIP参数中的尺寸参数确定LFNST变换集的映射模式,其中,对于较大尺寸的图像块,可以选择不使用DIMD导出映射模式,能够降低计算复杂度,从而可以提高编码效率。
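Referring to FIG. 4, which shows an example block diagram of a video encoding system provided by an embodiment of the present application. As shown in FIG. 4, the video encoding system 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, a coding unit 109, a decoded picture buffer unit 110, and so on. The filtering unit 108 can implement deblocking filtering and Sample Adaptive Offset (SAO) filtering, and the coding unit 109 can implement header information coding and Context-based Adaptive Binary Arithmetic Coding (CABAC). For an input original video signal, a video coding block is obtained through coding tree unit (Coding Tree Unit, CTU) partitioning. The residual pixel information obtained after intra or inter prediction is then transformed by the transform and quantization unit 101, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-predictive coding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information. Motion estimation performed by the motion estimation unit 105 is the process of generating motion vectors, which estimate the motion of the video coding block; the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105. After the intra prediction mode is determined, the intra prediction unit 103 also provides the selected intra prediction data to the coding unit 109, and the motion estimation unit 105 likewise sends the calculated motion vector data to the coding unit 109. In addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block: it reconstructs the residual block in the pixel domain, blocking artifacts are removed from the reconstructed residual block by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded picture buffer unit 110 to produce the reconstructed video coding block. The coding unit 109 is used to encode the various coding parameters and the quantized transform coefficients. In the CABAC-based coding algorithm, the context can be based on neighbouring coding blocks and can be used to encode information indicating the determined intra prediction mode, and the code stream of the video signal is output. The decoded picture buffer unit 110 stores the reconstructed video coding blocks for prediction reference. As video picture encoding proceeds, new reconstructed video coding blocks are continually generated, and they are all stored in the decoded picture buffer unit 110.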
参见图5,其示出了本申请实施例提供的一种视频解码系统的组成框图示例;如图5所示,该视频解码系统20包括解码单元201、反变换与反量化单元202、帧内预测单元203、运动补偿单元204、滤波单元205和解码图像缓存单元206等,其中,解码单元201可以实现头信息解码以及CABAC解码,滤波单元205可以实现去方块滤波以及SAO滤波。输入的视频信号经过图4的编码处理之后,输出该视频信号的码流;该码流输入视频解码系统20中,首先经过解码单元201,用于得到解码后的变换系数;针对该变换系数通过反变换与反量化单元202进行处理,以便在像素域中产生残差块;帧内预测单元203可用于基于所确定的帧内预测模式和来自当前帧或图片的先前经解码块的数据而产生当前视频解码块的预测数据;运动补偿单元204是通过剖析运动向量和其他关联语法元素来确定用于视频解码块的预测信息,并使用该预测信息以产生正被解码的视频解码块的预测性块;通过对来自反变换与反量化单元202的残差块与由帧内预测单元203或运动补偿单元204产生的对应预测性块进行求和,而形成解码的视频块;该解码的视频信号通过滤波单元205以便去除方块效应伪影,可以改善视频质量;然后将经解码的视频块存储于解码图像缓存单元206中,解码图像缓存单元206存储用于后续帧内预测或运动补偿的参考图像,同时也用于视频信号的输出,即得到了所恢复的原始视频信号。
本申请实施例中的编码方法,可以应用在如图4所示的帧内估计单元102和帧内预测单元103部分。另外,本申请实施例中的解码方法,还可以应用在如图5所示的帧内预测单元203。也就是说,本申请实施例中的编解码方法,既可以应用于视频编码系统,也可以应用于视频解码系统,甚至还可以同时应用于视频编码系统和视频解码系统,但是本申请实施例不作具体限定。还需要说明的是,当该编解码方法应用于视频编码系统时,“当前块”具体是指帧内预测中的当前编码块;当该编解码方法应用于视频解码系统时,“当前块”具体是指帧内预测中的当前解码块。
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。
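An embodiment of the present application proposes a decoding method. FIG. 6 is a schematic flowchart of the decoding method proposed in the embodiment of the present application. As shown in FIG. 6, the method by which the decoder performs decoding may include the following steps:
Step 101: decode the code stream to determine the prediction mode parameters.
In the embodiments of the present application, the decoder decodes the code stream and can first determine the prediction mode parameters.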
需要说明的是,在本申请的实施例中,预测模式参数指示了当前块的编码模式及与该编码模式相关的参数。其中,预测模式通常包括有传统帧内预测模式和非传统帧内预测模式,而传统帧内预测模式又可以包括有直流(Direct Current,DC)模式、平面(PLANAR)模式和角度模式等,非传统帧内预测模式又可以包括有MIP模式、跨分量线性模型预测(Cross-component Linear Model Prediction,CCLM)模式、帧内块复制(Intra Block Copy,IBC)模式和PLT(Palette)模式等。
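It can be understood that, in the embodiments of the present application, predictive coding can be performed on the current block on the encoding side. During predictive coding the prediction mode of the current block is determined and the corresponding prediction mode parameters are written into the code stream, so that the prediction mode parameters are transmitted from the encoder to the decoder.
Correspondingly, on the decoding side, the intra prediction mode of the luma or chroma component of the current block, or of the coding block containing the current block, can be obtained by decoding the code stream. The value of predModeIntra (the intra prediction mode indicator) can then be determined, calculated as follows: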
Figure PCTCN2022103686-appb-000001
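Here, the image component indicator (which may be denoted cIdx) indicates the luma component or a chroma component of the current block: if the current block predicts the luma component, cIdx equals 0; if the current block predicts a chroma component, cIdx equals 1. In addition, (xTbY, yTbY) are the coordinates of the top-left sample of the current block, IntraPredModeY[xTbY][yTbY] is the intra prediction mode of the luma component, and IntraPredModeC[xTbY][yTbY] is the intra prediction mode of the chroma component.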
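The formula itself survives here only as a figure reference, so the following is one plausible reading based purely on the symbol definitions above: predModeIntra is taken from the luma mode array when cIdx equals 0 and from the chroma mode array otherwise. The function name and argument layout are hypothetical.

```python
def select_pred_mode_intra(c_idx, intra_pred_mode_y, intra_pred_mode_c, x_tb_y, y_tb_y):
    """Pick predModeIntra from the luma or chroma mode array (assumed reading)."""
    if c_idx == 0:
        # Luma component: use IntraPredModeY[xTbY][yTbY].
        return intra_pred_mode_y[x_tb_y][y_tb_y]
    # Chroma component: use IntraPredModeC[xTbY][yTbY].
    return intra_pred_mode_c[x_tb_y][y_tb_y]
```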
进一步地,在本申请的实施例中,通过预测模式参数的获取,可以基于预测模式参数确定在进行帧内预测时是否使用MIP确定帧内预测值。
步骤102、当预测模式参数指示使用MIP确定帧内预测值时,解码码流,确定当前块的MIP参数。
在本申请的实施例中,在确定预测模式参数之后,如果预测模式参数指示使用MIP确定帧内预测值,那么可以继续解码码流,从而确定当前块的MIP参数。
需要说明的是,在本申请的实施例中,MIP参数可以包括有MIP转置指示参数(可以用isTransposed表示)、MIP模式索引序号(可以用modeId表示)、当前块的大小、当前块的类别(可以用mipSizeId表示)等参数;这些参数的取值可以通过解码码流得到。
也就是说,在本申请的实施例中,通过解码码流确定的MIP参数,可以对MIP转置指示参数、MIP模式索引序号、当前块的大小、当前块的类别等信息中的至少一个信息进行指示。
进一步地,在本申请的实施例中,通过解码码流,可以确定isTransposed的取值;当isTransposed的取值等于1时,便可以确定需要对MIP模式使用的采样点输入向量进行转置处理;当isTransposed的取值等于0时,便可以确定不需要对MIP模式使用的采样点输入向量进行转置处理;也就是说,MIP转置指示参数isTransposed可以用于指示是否对MIP模式使用的采样点输入向量进行转置处理。
进一步地,在本申请的实施例中,通过解码码流,还可以确定MIP模式索引序号modeId;其中,MIP模式索引序号可以用于指示当前块使用的MIP模式,MIP模式可以用于指示使用MIP确定当前块的帧内预测块的计算推导方式。也就是说,不同的MIP模式,其对应的MIP模式索引序号的取值是不同的;这里,MIP模式索引序号的取值可以为0、1、2、3、4或5。
进一步地,在本申请的实施例中,通过解码码流,还可以确定当前块的大小、宽高比、当前块的类别mipSizeId等参数信息。这样,在确定出MIP参数之后,以方便后续根据所确定的MIP参数来选择当前块使用的LFNST变换核(可以用kernel表示)。
也就是说,在本申请的实施例中,MIP参数可以对当前块的尺寸参数进行确定,其中,尺寸参数可以表征当前块的大小,既可以为当前块的高度和宽度,也可以为当前块的宽高比。
步骤103、解码码流,确定当前块的变换系数和LFNST索引序号。
在本申请的实施例中,如果预测模式参数指示使用MIP确定帧内预测值,那么在确定当前块的MIP参数之后,可以继续解码码流,进而确定当前块的变换系数和LFNST索引序号。
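It should be noted that, in the embodiments of the present application, the value of the LFNST index number can be used to indicate whether the current block uses LFNST, and can also indicate the index of the LFNST transform kernel within the LFNST transform kernel candidate set.
That is, in the embodiments of the present application, after the LFNST index number is decoded, a value equal to 0 indicates that the current block does not use LFNST, while a value greater than 0 indicates that the current block uses LFNST. In the latter case the transform kernel index may equal the value of the LFNST index number, or it may equal the value of the LFNST index number minus 1.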
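The interpretation of the LFNST index number described above can be sketched as a small helper. The function name and the `minus_one` switch are hypothetical; the switch merely selects between the two kernel-index conventions that the text allows.

```python
def parse_lfnst_index(lfnst_idx, minus_one=True):
    """Return (uses_lfnst, kernel_index) for a decoded LFNST index number."""
    if lfnst_idx == 0:
        return False, None  # LFNST is not used for the current block
    # Kernel index is either the index value itself or the value minus 1.
    return True, lfnst_idx - 1 if minus_one else lfnst_idx
```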
步骤104、当LFNST索引序号指示当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式。
在本申请的实施例中,在确定当前块的变换系数和LFNST索引序号之后,如果LFNST索引序号指示当前块使用LFNST,那么可以进一步根据MIP参数,确定LFNST变换集的映射模式。
需要说明的是,在本申请的实施例中,MIP参数可以为当前块的尺寸参数,其中,尺寸参数可以表征当前块的大小,既可以为当前块的高度和宽度,也可以为当前块的宽高比。
也就是说,在本申请的实施例中,在进行LFNST变换集的映射模式的确定时,可以参考当前块的尺寸参数。例如,基于当前块的高度和宽度确定LFNST变换集的映射模式,或者,基于当前块的宽高比确定LFNST变换集的映射模式。
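Further, in the embodiments of the present application, when the mapping mode of the LFNST transform set is determined according to the size parameter of the current block, it can first be judged whether the size parameter satisfies a first preset condition. If the size parameter satisfies the first preset condition, the first preset prediction mode can be determined as the mapping mode of the LFNST transform set; if the size parameter does not satisfy the first preset condition, DIMD is used to determine the mapping mode of the LFNST transform set.
It should be noted that, in the embodiments of the present application, the first preset condition can be used to constrain the size of the current block, and it corresponds to the size parameter of the current block. If the size parameter is the height and width of the current block, the first preset condition can constrain the height and the width separately; if the size parameter is the aspect ratio of the current block, the first preset condition can constrain the aspect ratio.
For example, in the embodiments of the present application, assuming the size parameter of the current block is its height and width, the first preset condition can be set as: the width is greater than or equal to a preset width threshold, and/or the height is greater than or equal to a preset height threshold. For instance, if the height of the current block is greater than or equal to the preset height threshold, or the width of the current block is greater than or equal to the preset width threshold, the size parameter is determined to satisfy the first preset condition; if the height of the current block is less than the preset height threshold and the width of the current block is less than the preset width threshold, the size parameter is determined not to satisfy the first preset condition.
It can be understood that, in the embodiments of the present application, the preset width threshold and the preset height threshold can be any value greater than or equal to 0. For example, with a preset width threshold of 32 and a preset height threshold of 32, the current block is determined to satisfy the first preset condition if its height and width are both greater than or equal to 32. As another example, with a preset width threshold of 32 and a preset height threshold of 16, the current block is determined to satisfy the first preset condition if its width is greater than or equal to 32 or its height is greater than or equal to 16.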
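The size check described above can be sketched as follows, under stated assumptions: the function names, the default thresholds of 32, the `require_both` switch (covering the "and/or" wording), and the use of PLANAR = 0 as the first preset prediction mode are illustrative choices, and `derive_with_dimd` is a hypothetical callback standing in for the DIMD derivation.

```python
PLANAR = 0  # conventional mode number, assumed as the first preset prediction mode


def meets_first_preset_condition(width, height, width_thr=32, height_thr=32,
                                 require_both=False):
    """Check the block size against the preset width and height thresholds."""
    wide = width >= width_thr
    tall = height >= height_thr
    return (wide and tall) if require_both else (wide or tall)


def lfnst_mapping_mode(width, height, derive_with_dimd, **condition):
    """Use the preset mode for large blocks; otherwise fall back to DIMD."""
    if meets_first_preset_condition(width, height, **condition):
        return PLANAR
    return derive_with_dimd()
```

The point of the design is that `derive_with_dimd` is never called for blocks that satisfy the condition, which is where the complexity saving comes from.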
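It can be understood that, in the embodiments of the present application, the first preset condition restricts the use of DIMD: DIMD is allowed to determine the mapping mode of the LFNST transform set only when the size parameter of the current block does not satisfy the first preset condition.
It can be seen that, when determining the mapping mode of the LFNST transform set, the decoding method proposed in the embodiments of the present application can use the first preset condition to restrict the size of the image blocks that use DIMD, allowing DIMD only for some image blocks and thereby effectively reducing computational complexity. For example, only smaller image blocks are allowed to use DIMD to determine the mapping mode of the LFNST transform set.
Further, in the embodiments of the present application, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode can be directly determined as the mapping mode of the LFNST transform set. The first preset prediction mode can be the PLANAR mode or the DC mode.
That is, in the embodiments of the present application, when determining the mapping mode of the LFNST transform set, the first preset condition makes it possible to set the mapping mode directly for some image blocks. For example, for larger image blocks, the PLANAR mode or the DC mode is directly determined as the mapping mode of the LFNST transform set.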
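Further, in the embodiments of the present application, when DIMD is used to determine the mapping mode of the LFNST transform set, at least one intra prediction mode can first be traversed to determine at least one piece of gradient information corresponding to the current block; the mapping mode of the LFNST transform set can then be determined according to the at least one piece of gradient information. Each intra prediction mode corresponds to one piece of gradient information, and the gradient information can be a histogram of gradients.
It can be understood that, in the embodiments of the present application, when the mapping mode of the LFNST transform set is determined according to the at least one piece of gradient information, the gradient amplitude corresponding to each intra prediction mode can be determined from the gradient information; the intra prediction mode with the largest gradient amplitude among the at least one intra prediction mode can then be determined as the mapping mode of the LFNST transform set. Each intra prediction mode corresponds to one gradient amplitude.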
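Selecting the mode with the largest gradient amplitude can be sketched as below. The sketch assumes the per-sample gradient analysis has already produced (mode, amplitude) contributions; the function names, the fallback to mode 0 when the histogram is empty or flat, and the tie-breaking by first occurrence are all assumptions rather than part of the described method.

```python
from collections import defaultdict


def accumulate_histogram(contributions):
    """Sum amplitude contributions per intra prediction mode."""
    hist = defaultdict(int)
    for mode, amplitude in contributions:
        hist[mode] += amplitude
    return dict(hist)


def mapping_mode_from_histogram(hist, fallback_mode=0):
    """Return the mode with the largest accumulated gradient amplitude."""
    if not hist or max(hist.values()) == 0:
        return fallback_mode  # no usable gradient: assumed fallback
    return max(hist, key=hist.get)
```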
也就是说,在本申请的实施例中,基于DIMD技术,可以选择在解码端使用与编码端相同的方法导出帧内预测模式,以节省比特开销。其主要包括两个步骤:第一个步骤,即导出预测模式,在编解码端使用同样的预测模式强度计算方法。例如,利用索贝尔算子统计每种预测模式下的梯度直方图histogram of gradients,作用区域为当前块的上方三行相邻重建样本、左侧三列相邻重建样本以及左上对应相邻重建样本所构成的L形区域,通过计算该L形区域的梯度直方图可以得到直方图中幅度最大对应的第一预测模式和幅度第二大所对应的第二预测模式;第二个步骤,导出预测块,在编解码端使用同样的预测块导出方式得到当前预测块。例如,判断以下两个条件,1、第二预测模式的梯度不为0;2、第一预测模式和第二预测模式均不为PLANAR或者DC预测模式。若上述两个条件不同时成立,则当前预测块仅使用第一预测模式计算当前块的预测样本值,即对第一预测模式应用普通预测预测过程;否则,即上述两个条件均成立,则当前预测块将使用加权求平均方式导出当前预测块。具体方法为,PLANAR模式占据1/3的加权权重,剩下2/3由第一预测模式根据第一预测模式的梯度强度比上第一和第二预测模式的梯度强度和作为加权权重,及第二预测模式根据第二预测模式的梯度强度比上第一和第二预测模式的梯度强度和作为加权权重。将上述三种预测模式,即PLANAR、第一预测模式和第二预测模式,加权求平均得到当前编码单元的预测块。解码端以同样步骤得到预测块。
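Further, in the embodiments of the present application, the downsampled vector of the current block can first be determined according to the MIP parameters; matrix multiplication is then performed on the downsampled vector to obtain the MIP output vector; the MIP prediction block of the current block is then determined from the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP prediction block to obtain at least one piece of gradient information.
It can be understood that, in the embodiments of the present application, Haar downsampling can be applied to the obtained neighbouring reference reconstructed samples according to the size parameter of the current block, with the sampling step determined by that size parameter. The order in which the downsampled top reference reconstructed samples and the downsampled left reference reconstructed samples are concatenated is adjusted according to the decoded MIP transposition indication parameter. If no transposition is needed, the downsampled left reference reconstructed samples are concatenated after the downsampled top reference reconstructed samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference reconstructed samples are concatenated after the downsampled left reference reconstructed samples, and the resulting vector is used as the input (the downsampled vector).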
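The downsampling and concatenation order described above can be sketched as follows. The names are hypothetical, the pairwise averaging is a simplified stand-in for Haar downsampling, and the sketch assumes sample counts that are powers of two.

```python
def haar_downsample(samples, target_len):
    """Repeatedly average adjacent pairs until target_len samples remain."""
    out = list(samples)
    while len(out) > target_len:
        out = [(out[i] + out[i + 1]) / 2 for i in range(0, len(out) - 1, 2)]
    return out


def build_mip_input(top_reduced, left_reduced, is_transposed):
    """Concatenate top then left, or left then top when transposition is signalled."""
    if is_transposed:
        return left_reduced + top_reduced
    return top_reduced + left_reduced
```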
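Next, the MIP matrix coefficients can be obtained according to the decoded MIP mode index number and combined with the input (the downsampled vector) to compute the output vector (the MIP output vector). The output vector is then upsampled according to the number of output samples and the size parameter of the current block. If no upsampling is needed, the vector is filled in row by row in the horizontal direction and output as the MIP prediction block of the current block; if upsampling is needed, the horizontal direction is upsampled first and then the vertical direction, until the size matches the template size, and the result is output as the MIP prediction block of the current block.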
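Next, if it is determined that DIMD is used for the current block, the DIMD method can be applied directly to the MIP prediction block of the current block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed for the MIP prediction block of the current block, and the gradient information of the at least one intra prediction mode on the MIP prediction block of the current block is calculated.
That is, in the embodiments of the present application, the DIMD calculation can be performed after the MIP output vector is upsampled.
Further, in the embodiments of the present application, the downsampled vector of the current block can first be determined according to the MIP parameters; matrix multiplication is then performed on the downsampled vector to obtain the MIP output vector; finally, at least one intra prediction mode is traversed for the MIP output vector to obtain at least one piece of gradient information.
It can be understood that, in the embodiments of the present application, Haar downsampling can be applied to the obtained neighbouring reference reconstructed samples according to the size parameter of the current block, with the sampling step determined by that size parameter. The order in which the downsampled top reference reconstructed samples and the downsampled left reference reconstructed samples are concatenated is adjusted according to the decoded MIP transposition indication parameter. If no transposition is needed, the downsampled left reference reconstructed samples are concatenated after the downsampled top reference reconstructed samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference reconstructed samples are concatenated after the downsampled left reference reconstructed samples, and the resulting vector is used as the input (the downsampled vector).
Next, the MIP matrix coefficients can be obtained according to the decoded MIP mode index number and combined with the input (the downsampled vector) to compute the output vector (the MIP output vector).
Next, if it is determined that DIMD is used for the current block, the DIMD method can be applied directly to the MIP output vector of the current block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed for the MIP output vector, and the gradient information of the at least one intra prediction mode on the MIP output vector of the current block is calculated. The output vector (the MIP output vector) is then upsampled according to the number of output samples and the size parameter of the current block. If no upsampling is needed, the vector is filled in row by row in the horizontal direction and output as the MIP prediction block of the current block; if upsampling is needed, the horizontal direction is upsampled first and then the vertical direction, until the size matches the template size, and the result is output as the MIP prediction block of the current block.
That is, in the embodiments of the present application, the DIMD calculation can also be performed before the MIP output vector is upsampled.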
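It should be noted that, in the embodiments of the present application, when DIMD is used to determine the mapping mode of the LFNST transform set, all 67 intra prediction modes can be traversed to obtain the corresponding 67 pieces of gradient information; alternatively, only a subset of the 67 intra prediction modes can be traversed to obtain the corresponding gradient information.
That is, in the embodiments of the present application, to further reduce complexity at the encoding and decoding ends, DIMD can selectively skip some of the 67 intra prediction modes and so reduce the number of intra prediction modes traversed, for example by selective screening with a step size of 1.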
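One way to picture the selective traversal is to keep PLANAR and DC and sample the angular modes with a stride. This sketch is an assumption in several respects: the function name, the numbering (PLANAR = 0, DC = 1, angular modes 2 to 66), and the idea of applying the step only to angular modes are not specified by the text. Note that with this reading a step of 1 visits every mode, and only a larger step actually skips any.

```python
def candidate_modes(step=1, always=(0, 1), first_angular=2, last_angular=66):
    """Return the intra modes to traverse: PLANAR/DC plus strided angular modes."""
    return list(always) + list(range(first_angular, last_angular + 1, step))
```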
步骤105、根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核。
在本申请的实施例中,如果LFNST索引序号指示当前块使用LFNST,那么在根据MIP参数,确定LFNST变换集的映射模式之后,可以根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,然后从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核。
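Further, in the embodiments of the present application, when the LFNST transform kernel used by the current block is determined, the index number of the mapping mode of the LFNST transform set can first be determined; the value of the LFNST intra prediction mode index number can then be determined from the value of that index number; next, one LFNST transform kernel candidate set can be selected from the multiple LFNST transform kernel candidate sets according to the value of the LFNST intra prediction mode index number; finally, the transform kernel indicated by the LFNST index number can be selected from the chosen LFNST transform kernel candidate set and set as the LFNST transform kernel used by the current block.
That is, in the embodiments of the present application, after the mapping mode of the LFNST transform set is determined, the index number of that mapping mode can be further determined and its value converted into the value of the LFNST intra prediction mode index number (which may be denoted predModeIntra). Then, according to the value of predModeIntra, one LFNST transform kernel candidate set is selected from the multiple LFNST transform kernel candidate sets, and within the selected candidate set the transform kernel indicated by the LFNST index number is chosen and set as the LFNST transform kernel used by the current block.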
需要说明的是,在本申请的实施例中,在进行LFNST变换核候选集和对应的LFNST变换核的确定时,多个LFNST变换核候选集可以包括4个LFNST变换核候选集,其中,每个LFNST变换核候选集包括2个LFNST变换核;相应的,可以使用第一查找表确定索引序号的取值对应的LFNST帧内预测模式索引序号的取值。
可以理解的是,在本申请的实施例中,基于第一查找表,可以将DC模式、PLANAR模式或者角度预测模式和LFNST的变换集进行绑定,例如表1所示的第一查找表。
需要说明的是,在本申请的实施例中,在进行LFNST变换核候选集和对应的LFNST变换核的确定时,多个LFNST变换核候选集也可以包括35个LFNST变换核候选集,其中,每个LFNST变换核候选集包括3个LFNST变换核;相应的,可以使用第二查找表确定索引序号的取值对应的LFNST帧内预测模式索引序号的取值。
可以理解的是,在本申请的实施例中,基于第二查找表,不同帧内预测模式对应的LFNST变换集对应就会更细粒度一些,例如图2所示的第二查找表。
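Further, in the embodiments of the present application, the MIP parameters can also include the MIP transposition indication parameter, whose value indicates whether the sample input vector used by the MIP mode is transposed. Correspondingly, when the value of the MIP transposition indication parameter indicates that the sample input vector used by the MIP mode is transposed, matrix transposition can be applied to the transform kernel indicated by the LFNST index number to obtain the LFNST transform kernel used by the current block.
It can be understood that, in the embodiments of the present application, when the value of the MIP transposition indication parameter equals 1, it can be taken to indicate that the sample input vector used by the MIP mode is transposed. In that case the selected transform kernel needs to undergo the corresponding matrix transposition to obtain the LFNST transform kernel used by the current block.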
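Kernel selection with the optional transposition can be sketched as below. The function name and the nested-list layout of the candidate sets are hypothetical, the toy kernels in the usage example are not real LFNST coefficients, and the sketch assumes the "index minus 1" convention for the kernel position.

```python
def select_lfnst_kernel(candidate_sets, set_idx, lfnst_idx, is_transposed):
    """Pick the kernel signalled by lfnst_idx (> 0) and transpose it if required."""
    kernel = candidate_sets[set_idx][lfnst_idx - 1]
    if is_transposed:
        # Matrix transposition: rows become columns.
        kernel = [list(column) for column in zip(*kernel)]
    return kernel
```

For example, with `sets = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]`, the call `select_lfnst_kernel(sets, 0, 2, True)` picks the second kernel of set 0 and returns its transpose.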
步骤106、使用LFNST变换核,对变换系数进行变换处理。
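In the embodiments of the present application, after one LFNST transform kernel candidate set is selected from the multiple LFNST transform kernel candidate sets according to the mapping mode of the LFNST transform set, and the LFNST transform kernel used by the current block is determined from the selected candidate set, the transform coefficients can be transformed using the LFNST transform kernel.
Further, in the embodiments of the present application, the LFNST transform kernel determined from the selected LFNST transform kernel candidate set is the one used by the current block, and it can serve as the transform matrix for transforming the transform coefficients. The secondary transform coefficient vector is taken as input and multiplied by the transform matrix (the transform kernel) to obtain the primary transform coefficient vector. After this matrix calculation, the transform of the transform coefficients is complete.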
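The matrix calculation described above is a plain matrix-vector product. The sketch below uses a hypothetical function name and a toy 2×2 kernel; real LFNST kernels are much larger, and a codec would also apply scaling and clipping, which are omitted here.

```python
def apply_lfnst_kernel(kernel, secondary_coeffs):
    """Multiply the kernel by the secondary coefficient vector."""
    return [sum(k * c for k, c in zip(row, secondary_coeffs)) for row in kernel]
```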
示例性的,在一种可能的实施例中,在解码端,解码器可以解码编码单元级类型标志位,若指示为帧内模式,则解码获取MIP允许使用标志位(预测模式参数),该标志位可以为序列级标志位,用于指示当前解码器是否允许使用MIP技术。其中,该序列级标志位可以表示为sps_mip_enable_flag的形式。
接着,如果MIP的允许使用标志位为真,则解码当前编码单元(当前块)的MIP使用标志位,否则,当前解码过程不需要解码编码单元级的MIP使用标志位,编码单元级的MIP使用标志位默认为否。
如果当前编码单元的MIP使用标志位为真,则解码获得当前编码单元的MIP参数,其中,MIP参数可以包括MIP转置指示参数、MIP模式索引序号、当前块的大小、当前块的类别等信息中的至少一个信息。否则继续解码其他帧内预测技术的使用标识位或索引等信息,并根据解码到的信息求得当前编码单元的最终预测块。
在解码获得MIP参数之后,可以根据当前编码单元尺寸大小的情况(当前块的尺寸参数),对获取得到的周边参考重建样本进行哈尔下采样,采样步长根据编码单元尺寸确定,同时可以结合解码得到的MIP转置指示参数调整上侧下采样后的参考重建样本与左侧下采样后的参考重建样本拼接顺序。其中,若不需要转置则将左侧下采样后的参考重建样本拼接在上侧下采样后的参考重建样本之后,将得到的向量最为输入(下采样向量);若需要转置则将上侧下采样后的参考重建样本拼接在左侧下采样后的参考重建样本之后,将得到的向量作为输入(下采样向量)。
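Next, the MIP matrix coefficients are obtained according to the decoded MIP prediction mode and combined with the input (the downsampled vector) to compute the output vector (the MIP output vector). The output vector is then upsampled according to the number of output samples and the size of the current coding unit. If no upsampling is needed, the vector is filled in row by row in the horizontal direction and output as the prediction block of the current coding unit (the MIP prediction block of the current block); if upsampling is needed, the horizontal direction is upsampled first and then the vertical direction, until the size matches the template size, and the result is output as the prediction block of the current coding unit (the MIP prediction block of the current block).
It should be noted that, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set. For example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (the first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not satisfy the first preset condition, the DIMD method can be applied to the current MIP prediction block to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
Further, when the DIMD method is used to derive a traditional intra prediction mode as the mapping mode of the LFNST transform set, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed for the current MIP prediction block. The gradient information of each intra prediction mode on the current MIP prediction block is calculated, the corresponding gradient amplitude is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude. The intra prediction mode with the largest amplitude is the optimal mode and serves as the mapping mode of the LFNST transform set in the inverse transform of the subsequent steps.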
在完成LFNST变换集的映射模式的确定之后,可以继续解码其他帧内预测技术的使用标识位或索引等信息,并根据解码到的信息求得当前编码单元的最终预测块;进一步地,可以解码码流并获取残差信息,根据反量化及反变换得到时域残差信息,将最终预测块与时域残差信息叠加得到重建样本块;所有重建样本块经由环路滤波等技术后,得到最终的重建图像,可以同时作为视频输出也可以作为后面解码参考。
示例性的,在另一种可能的实施例中,在解码端,解码器可以解码编码单元级类型标志位,若指示为帧内模式,则解码获取MIP允许使用标志位(预测模式参数),该标志位可以为序列级标志位,用于指示当前解码器是否允许使用MIP技术。其中,该序列级标志位可以表示为sps_mip_enable_flag的形式。
接着,如果MIP的允许使用标志位为真,则解码当前编码单元(当前块)的MIP使用标志位,否则,当前解码过程不需要解码编码单元级的MIP使用标志位,编码单元级的MIP使用标志位默认为否。
如果当前编码单元的MIP使用标志位为真,则解码获得当前编码单元的MIP参数,其中,MIP参数可以包括MIP转置指示参数、MIP模式索引序号、当前块的大小、当前块的类别等信息中的至少一个信息。否则继续解码其他帧内预测技术的使用标识位或索引等信息,并根据解码到的信息求得当前编码单元的最终预测块。
在解码获得MIP参数之后,可以根据当前编码单元尺寸大小的情况(当前块的尺寸参数),对获取得到的周边参考重建样本进行哈尔下采样,采样步长根据编码单元尺寸确定,同时可以结合解码得到的MIP转置指示参数调整上侧下采样后的参考重建样本与左侧下采样后的参考重建样本拼接顺序。其中,若不需要转置则将左侧下采样后的参考重建样本拼接在上侧下采样后的参考重建样本之后,将得到的向量最为输入(下采样向量);若需要转置则将上侧下采样后的参考重建样本拼接在左侧下采样后的参考重建样本之后,将得到的向量作为输入(下采样向量)。
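Next, the MIP matrix coefficients are obtained according to the decoded MIP prediction mode and combined with the input (the downsampled vector) to compute the output vector (the MIP output vector).
It should be noted that, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set. For example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (the first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not satisfy the first preset condition, the DIMD method can be applied to the current MIP output vector to derive the optimal traditional intra prediction mode as the mapping mode of the LFNST transform set.
Further, when the DIMD method is used to derive a traditional intra prediction mode as the mapping mode of the LFNST transform set, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed for the current MIP output vector. The gradient information of each intra prediction mode on the current MIP output vector is calculated, the corresponding gradient amplitude is determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude. The intra prediction mode with the largest amplitude is the optimal mode and serves as the mapping mode of the LFNST transform set in the inverse transform of the subsequent steps.
Further, the output vector can be upsampled according to the number of output samples and the size of the current coding unit. If no upsampling is needed, the vector is filled in row by row in the horizontal direction and output as the prediction block of the current coding unit (the MIP prediction block of the current block); if upsampling is needed, the horizontal direction is upsampled first and then the vertical direction, until the size matches the template size, and the result is output as the prediction block of the current coding unit (the MIP prediction block of the current block).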
在完成LFNST变换集的映射模式的确定之后,可以继续解码其他帧内预测技术的使用标识位或索引等信息,并根据解码到的信息求得当前编码单元的最终预测块;进一步地,可以解码码流并获取残差信息,根据反量化及反变换得到时域残差信息,将最终预测块与时域残差信息叠加得到重建样本块;所有重建样本块经由环路滤波等技术后,得到最终的重建图像,可以同时作为视频输出也可以作为后面解码参考。
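It should be noted that the coding and decoding method proposed in the embodiments of the present application applies to the intra prediction part at the encoding and decoding ends. With the proposed scheme integrated on top of JVET-Z0048, the test results under the common test conditions in the All Intra (AI) configuration are shown in Tables 2 and 3 below: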
表2
Figure PCTCN2022103686-appb-000002
Figure PCTCN2022103686-appb-000003
表3
Figure PCTCN2022103686-appb-000004
采用本申请实施例提出的方案,在class D上同时跑了anchor和test,提供了较为准确的结果,如下表4和表5:
表4
Figure PCTCN2022103686-appb-000005
表5
Figure PCTCN2022103686-appb-000006
采用本申请实施例提出的方案,以ECM4.0作为anchor的(JVET-Z0048+proposed method)测试结果如下表6:
表6
Figure PCTCN2022103686-appb-000007
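The above test results show that the decoding method proposed in the embodiments of the present application reduces the software and hardware complexity of the JVET-Z0048 scheme while maintaining similar performance, with no change in luma performance. Compared with ECM4.0, it maintains the same performance as JVET-Z0048.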
进一步地,在本申请的实施例中,考虑到硬件解码器对于I帧和B帧的要求不同,本申请实施例提出的解码方法可以仅在B帧使用,也可以在I帧和B帧都同时使用。或者,本申请实施例提出的解码方法也可以仅在I帧上使用。亦或者,本申请实施例提出的解码方法允许使用的条件在B帧或者I帧上不同,例如,I帧允许所有尺寸的编码单元都允许使用本申请实施例提出的解码方法,而B帧仅允许小尺寸的编码单元使用本申请实施例提出的解码方法。
进一步地,在本申请的实施例中,若当前编码单元(当前块)的亮度分量使用MIP预测模式,而色度分量不使用MIP预测模式且不使用传统帧内预测模式,则色度分量的LFNST变换集可以继承亮度分量的LFNST变换集。
进一步地,在本申请的实施例中,若当前编码单元(当前块)不使用传统帧内预测模式,那么亮色度分量的LFNST变换集均可以根据本申请实施例提出的解码方法求解。
综上所述,本申请实施例提出的解码方法,涉及到使用DIMD导出MIP预测块映射LFNST变换集的方法,一方面,提出限制使用DIMD的编码单元使用尺寸。对于较大尺寸的图像块,MIP输出向量上采样较多,方向信息不明显,因此选择跳过DIMD导出传统预测模式的过程以降低计算复杂度;另一方面,提出在限制DIMD的编码单元使用尺寸的基础上,进一步降低计算复杂度,使用未上采样前的MIP输出向量作为DIMD的输入并导出最优传统帧内预测模式。
本申请实施例提供了一种解码方法,在解码端,解码码流,确定预测模式参数;当预测模式参数指示使用MIP确定帧内预测值时,解码码流,确定当前块的MIP参数;解码码流,确定当前块的变换系数和LFNST索引序号;当LFNST索引序号指示当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式;根据LFNST变换集的映射模式,从多个LFNST变换核候选集中选择一个LFNST变换核候选集,并从所选择的LFNST变换核候选集中确定当前块使用的LFNST变换核;使用LFNST变换核,对变换系数进行变换处理。由此可见,在本申请的实施例中,在进行预测块的导出时,根据当前块的MIP参数中的尺寸参数确定LFNST变换集的映射模式,其中,对于较大尺寸的图像块,可以选择不使用DIMD导出映射模式,能够降低计算复杂度,从而可以提高编码效率。
基于上述实施例,本申请实施例提出的了一种编码方法,图7为本申请实施例提出的编码方法的实现流程示意图,如图7所示,编码器进行编码处理的方法可以包括以下步骤:
步骤201、确定预测模式参数。
在本申请的实施例中,编码器可以先确定预测模式参数。
需要说明的是,在本申请的实施例中,视频图像可以划分为多个图像块,每个当前待编码的图像块可以称为编码块(Coding Block,CB)。这里,每个编码块可以包括第一图像分量、第二图像分量和第三图像分量;而当前块为视频图像中当前待进行第一图像分量、第二图像分量或者第三图像分量预测的编码块。
其中,假定当前块进行第一图像分量预测,而且第一图像分量为亮度分量,即待预测图像分量为亮度分量,那么当前块也可以称为亮度块;或者,假定当前块进行第二图像分量预测,而且第二图像分量为色度分量,即待预测图像分量为色度分量,那么当前块也可以称为色度块。
需要说明的是,预测模式参数指示了当前块的编码模式及该模式相关的参数。通常可以采用率失真优化(Rate Distortion Optimization,RDO)的方式确定当前块的预测模式参数。
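For example, in the embodiments of the present application, when the prediction mode parameters of the current block are determined, the image component to be predicted of the current block can first be determined. Then, based on the parameters of the current block, predictive coding is performed on that image component using each of multiple prediction modes, and the rate-distortion cost of each prediction mode is calculated. Finally, the minimum rate-distortion cost is selected from the calculated costs, and the prediction mode corresponding to it is determined as the prediction mode parameter of the current block.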
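The minimum-cost selection described above can be sketched with the commonly used Lagrangian cost J = D + λ·R. The text does not specify the cost formula, so this formulation, the function names, and the toy distortion and bit figures are assumptions for illustration.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed formulation)."""
    return distortion + lam * bits


def choose_prediction_mode(candidates, lam):
    """candidates maps mode name -> (distortion, bits); return the cheapest mode."""
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))
```

With candidates `{"PLANAR": (100, 10), "MIP": (80, 20)}`, MIP wins at λ = 1 (100 versus 110) while PLANAR wins at λ = 3 (130 versus 140), showing how the rate term shifts the decision.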
也就是说,在本申请的实施例中,在编码侧,针对当前块可以采用多种预测模式分别对待预测图像分量进行编码。这里,多种预测模式通常包括有传统帧内预测模式和非传统帧内预测模式,而传统帧内预测模式又可以包括有直流(Direct Current,DC)模式、平面(PLANAR)模式和角度模式等,非传统帧内预测模式又可以包括有MIP模式、跨分量线性模型预测(Cross-component Linear Model Prediction,CCLM)模式、帧内块复制(Intra Block Copy,IBC)模式和PLT(Palette)模式等。
这样,在利用多种预测模式分别对当前块进行编码之后,可以得到每一种预测模式对应的率失真代价结果;然后从所得到的多个率失真代价结果中选取最小率失真代价结果,并将该最小率失真代价结果对应的预测模式确定为当前块的预测模式参数;如此,最终可以使用所确定的预测模式对当前块进行编码,而且在这种预测模式下,可以使得预测残差小,能够提高编码效率。
可以理解的是,在本申请的实施例中,在编码侧,可以针对当前块进行预测编码,在预测编码的过 程中就可以确定出当前块的预测模式,并将相应的预测模式参数写入码流,从可以将预测模式参数由编码器传输到解码器。
相应的,在解码侧,通过解码码流可以获取到当前块或者当前块所在编码块的亮度或色度分量的帧内预测模式,这时候可以确定出predModeIntra(帧内预测模式指示符)的取值。
进一步地,在本申请的实施例中,通过预测模式参数的获取,可以基于预测模式参数确定在进行帧内预测时是否使用MIP确定帧内预测值。
步骤202、当预测模式参数指示使用MIP确定帧内预测值时,确定当前块的MIP参数。
在本申请的实施例中,在确定预测模式参数之后,如果预测模式参数指示使用MIP确定帧内预测值,那么可以继续确定当前块的MIP参数。
需要说明的是,在本申请的实施例中,MIP参数可以包括有MIP转置指示参数(可以用isTransposed表示)、MIP模式索引序号(可以用modeId表示)、当前块的大小、当前块的类别(可以用mipSizeId表示)等参数。
也就是说,在本申请的实施例中,通过确定的MIP参数,可以对MIP转置指示参数、MIP模式索引序号、当前块的大小、当前块的类别等信息中的至少一个信息进行指示。
进一步地,在本申请的实施例中,MIP参数可以包括有MIP转置指示参数(可以用isTransposed表示);这里,MIP转置指示参数的取值用于指示是否对MIP模式使用的采样点输入向量进行转置处理。
具体地,在MIP模式中,根据当前块左侧边相邻参考像素对应的参考采样值和上侧边相邻参考像素对应的参考采样值,可以得到相邻参考采样集;如此,在得到相邻参考采样集之后,这时候可以构造一个输入参考样值集,即MIP模式使用的采样点输入向量。但是针对输入参考样值集的构造,在编码侧和解码侧的构造方式是有区别的,主要是和MIP转置指示参数的取值有关。
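When applied on the encoding side, the value of the MIP transposition indication parameter can still be determined by rate-distortion optimization. Specifically, this may include:
calculating a first cost value with transposition and a second cost value without transposition;
if the first cost value is less than the second cost value, determining that the value of the MIP transposition indication parameter is 1;
if the first cost value is not less than the second cost value, determining that the value of the MIP transposition indication parameter is 0.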
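The comparison of the two cost values reduces to a single conditional. The function name below is hypothetical; note that a tie resolves to 0, following the "not less than" wording.

```python
def decide_mip_transpose_flag(cost_with_transpose, cost_without_transpose):
    """Return 1 only when transposing is strictly cheaper, otherwise 0."""
    return 1 if cost_with_transpose < cost_without_transpose else 0
```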
进一步地,当MIP转置指示参数的取值为0时,在缓冲区内,可以将相邻参考采样集中上侧边对应的参考采样值存储在左侧边对应的参考采样值之前,这时候不需要进行转置处理,即不需要对MIP模式使用的采样点输入向量进行转置处理,可以直接将缓冲区确定为输入参考样值集;当MIP转置指示参数的取值为1时,在缓冲区内,可以将相邻参考采样集中上侧边对应的参考采样值存储在左侧边对应的参考采样值之后,这时候对缓冲区进行转置处理,即需要对MIP模式使用的采样点输入向量进行转置处理,然后将转置后的缓冲区确定为输入参考样值集。这样,在得到输入参考样值集之后,可以用于MIP模式下确定当前块对应的帧内预测值的过程。
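It should also be noted that, on the encoding side, after the value of the MIP transposition indication parameter is determined, it also needs to be written into the code stream to facilitate subsequent decoding on the decoder side.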
进一步地,在本申请的实施例中,MIP参数还可以包括MIP模式索引序号(可以用modeId表示),其中,MIP模式索引序号用于指示当前块使用的MIP模式,而MIP模式用于指示使用MIP确定当前块的帧内预测块的计算推导方式。也就是说,在MIP模式中,由于MIP模式又可以包括有很多种,这多种MIP模式可以通过MIP模式索引序号进行区分,即不同的MIP模式具有不同的MIP模式索引序号;这样,根据使用MIP确定当前块的帧内预测块的计算推导方式,可以确定出具体的MIP模式,从而就可以得到对应的MIP模式索引序号;本申请实施例中,MIP模式索引序号的取值可以为0、1、2、3、4或5。
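Further, in the embodiments of the present application, the MIP parameters can also include parameters such as the size and aspect ratio of the current block; the category of the current block (which may be denoted mipSizeId) can also be determined from its size (i.e. its width and height).
For example, if the width and height of the current block both equal 4, the value of mipSizeId can be set to 0; otherwise, if one of the width and height of the current block equals 4, or the width and height both equal 8, the value of mipSizeId can be set to 1; otherwise, if the current block is of any other size, the value of mipSizeId can be set to 2.
As another example, if the width and height of the current block both equal 4, the value of mipSizeId can be set to 0; otherwise, if one of the width and height of the current block equals 4, the value of mipSizeId can be set to 1; otherwise, if the current block is of any other size, the value of mipSizeId can be set to 2.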
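Both classification rules above can be written as one function with a switch. The function name and the `treat_8x8_as_1` parameter are hypothetical; the parameter selects between the first example (8×8 blocks fall in category 1) and the second (they fall in category 2).

```python
def mip_size_id(width, height, treat_8x8_as_1=True):
    """Classify the current block into mipSizeId 0, 1, or 2 by its size."""
    if width == 4 and height == 4:
        return 0
    if width == 4 or height == 4:
        return 1
    if treat_8x8_as_1 and width == 8 and height == 8:
        return 1
    return 2
```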
这样,在使用MIP确定帧内预测值的过程中,还可以确定出MIP参数,便于根据所确定的MIP参数来确定当前块使用的LFNST变换核(可以用kernel表示)。
也就是说,在本申请的实施例中,MIP参数可以对当前块的尺寸参数进行确定,其中,尺寸参数可 以表征当前块的大小,既可以为当前块的高度和宽度,也可以为当前块的宽高比。
步骤203、根据MIP参数,确定当前块的帧内预测块,计算当前块与帧内预测值之间的残差块。
在本申请的实施例中,如果预测模式参数指示使用MIP确定帧内预测值,那么在确定当前块的MIP参数之后,可以根据MIP参数,进一步确定当前块的帧内预测块,并计算当前块与帧内预测值之间的残差块。
需要说明的是,在本申请的实施例中,针对MIP模式,MIP预测的输入数据,包括有:当前块的位置(xTbCmp,yTbCmp)、当前块所应用的MIP预测模式(可以用modeId表示)、当前块的高度(用nTbH表示)、当前块的宽度(用nTbW表示)以及是否需要转置的转置处理指示标志(可以用isTransposed表示)等;MIP预测的输出数据,包括有:当前块的预测块,该预测块中像素坐标[x][y]所对应的帧内预测值为predSamples[x][y];其中,x=0,1,…,nTbW-1;y=0,1,…,nTbH-1。
可以理解的是,在本申请的实施例中,MIP预测过程可以分为四个步骤:配置核心参数、获取参考像素、构造输入采样以及生成预测值。其中,对于配置核心参数来说,根据帧内当前块的大小,可以将当前块划分为三类,用mipSizeId记录当前块的种类;而且不同种类的当前块,参考采样点数量和矩阵乘法输出采样点数量是不同的。对于获取参考像素来说,预测当前块时,这时候当前块的上块和左块都是已编码的块,MIP技术的参考像素为当前块的上一行像素和左一列像素的重建值,获取当前块的上侧边相邻的参考像素(用refT表示)和左侧边相邻的参考像素(用refL表示)的过程即为参考像素的获取过程。对于构造输入采样来说,该步骤用于矩阵乘法的输入,主要可以包括:获取参考采样、构造参考采样缓冲区和推导矩阵乘法输入采样;其中,获取参考采样的过程为下采样过程,而构造参考采样缓冲区又可以包括不需要转置时缓冲区的填充方式和需要转置时缓冲区的填充方式。对于生成预测值来说,该步骤用于获取当前块的MIP预测值,主要可以包括:构造矩阵乘法输出采样块、矩阵乘法输出采样嵌位、矩阵乘法输出采样转置和生成MIP最终预测值;其中,构造矩阵乘法输出采样块又可以包括获取权重矩阵、获取移位因子和偏移因子和矩阵乘法运算,生成MIP最终预测值又可以包括生成不需要上采样的预测值和生成需要上采样的预测值。这样,在经过该四个步骤之后,可以得到当前块的帧内预测块。
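It can be understood that, in the embodiments of the present application, after the intra prediction block of the current block is determined, the difference between the true pixel values of the current block and the intra prediction values can be calculated, and the resulting differences are used as the residual block for the subsequent transform of the residual block.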
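The residual calculation is an element-wise subtraction of the prediction from the original samples. A minimal sketch, with a hypothetical function name and blocks represented as lists of rows:

```python
def residual_block(original, prediction):
    """Subtract the intra prediction from the original samples, element by element."""
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, prediction)]
```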
步骤204、当当前块使用LFNST时,根据MIP参数,确定LFNST变换集的映射模式。
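In the embodiments of the present application, once it is determined that the current block uses LFNST, the mapping mode of the LFNST transform set can be further determined according to the MIP parameters.
It should be noted that, in the embodiments of the present application, LFNST cannot be performed on an arbitrary current block. In one possible embodiment, LFNST can be performed on the current block only when the current block satisfies all of the following conditions: (a) the width and height of the current block are both greater than or equal to 4; (b) the width and height of the current block are both less than or equal to the maximum transform block size; (c) the prediction mode of the current block, or of the coding block containing it, is an intra prediction mode; (d) the primary transform of the current block is the two-dimensional forward primary transform (DCT2) in both the horizontal and vertical directions; (e) the intra prediction mode of the current block, or of the coding block containing it, is a non-MIP mode, or the prediction mode of the transform block is the MIP mode and the width and height of the transform block are both greater than or equal to 16.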
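Conditions (a) to (e) above combine into a single predicate. The function and parameter names are hypothetical, and the maximum transform block size is passed in rather than assumed.

```python
def lfnst_allowed(width, height, max_tb_size, is_intra, primary_is_dct2, uses_mip):
    """Check conditions (a)-(e) for applying LFNST to the current block."""
    return (
        min(width, height) >= 4                                # (a)
        and max(width, height) <= max_tb_size                  # (b)
        and is_intra                                           # (c)
        and primary_is_dct2                                    # (d)
        and (not uses_mip or (width >= 16 and height >= 16))   # (e)
    )
```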
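Further, once it is determined that the current block can perform LFNST, the LFNST transform kernel used by the current block (which may be denoted kernel) also needs to be determined.
It should be noted that, in the embodiments of the present application, the MIP parameters may be the size parameter of the current block, where the size parameter characterizes the size of the current block and may be either its height and width or its aspect ratio.
That is, in the embodiments of the present application, the size parameter of the current block can be taken into account when determining the mapping mode of the LFNST transform set. For example, the mapping mode is determined based on the height and width of the current block, or based on its aspect ratio.
Further, in the embodiments of the present application, when the mapping mode of the LFNST transform set is determined according to the size parameter of the current block, it can first be judged whether the size parameter satisfies a first preset condition. If it does, a first preset prediction mode is determined as the mapping mode of the LFNST transform set; if it does not, DIMD is used to determine the mapping mode of the LFNST transform set.
It should be noted that, in the embodiments of the present application, the first preset condition can be used to limit the size of the current block, and it corresponds to the size parameter of the current block. If the size parameter is the height and width of the current block, the first preset condition can constrain the height and the width separately; if the size parameter is the aspect ratio of the current block, the first preset condition can constrain the aspect ratio.
Exemplarily, in the embodiments of the present application, assuming that the size parameter of the current block is its height and width, the first preset condition may be set as: the width is greater than or equal to a preset width threshold, and/or the height is greater than or equal to a preset height threshold. For example, if the height of the current block is greater than or equal to the preset height threshold, or the width of the current block is greater than or equal to the preset width threshold, it can be determined that the size parameter satisfies the first preset condition; if the height of the current block is less than the preset height threshold and the width of the current block is less than the preset width threshold, it can be determined that the size parameter does not satisfy the first preset condition.
It can be understood that, in the embodiments of the present application, the preset width threshold and the preset height threshold may be any values greater than or equal to 0. For example, with a preset width threshold of 32 and a preset height threshold of 32, the current block satisfies the first preset condition if its height and width are both greater than or equal to 32. As another example, with a preset width threshold of 32 and a preset height threshold of 16, the current block satisfies the first preset condition if its height is greater than or equal to 16 or its width is greater than or equal to 32.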
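The size gate described above can be sketched as follows. The text allows the width and height tests to be combined with "and" or with "or", so the sketch exposes that choice as a flag; the function names, the require_both flag and the default thresholds of 32 are illustrative assumptions.

```python
PLANAR = 0  # index of the PLANAR mode in VVC numbering


def meets_first_condition(width, height, width_thr=32, height_thr=32,
                          require_both=True):
    """Check the first preset condition on the block size."""
    w_ok = width >= width_thr
    h_ok = height >= height_thr
    return (w_ok and h_ok) if require_both else (w_ok or h_ok)


def lfnst_mapping_mode(width, height, derive_with_dimd,
                       preset_mode=PLANAR, **cond):
    """Use the preset mode for large blocks; otherwise fall back to DIMD.

    derive_with_dimd is a caller-supplied callable returning an intra mode
    index; it is only invoked when the size condition is not met.
    """
    if meets_first_condition(width, height, **cond):
        return preset_mode
    return derive_with_dimd()
```

Passing the DIMD derivation as a callable makes the complexity saving explicit: for blocks that satisfy the condition, the gradient analysis is never run.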
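It can be understood that, in the embodiments of the present application, the first preset condition restricts the use of DIMD: DIMD is allowed to determine the mapping mode of the LFNST transform set only when the size parameter of the current block does not satisfy the first preset condition.
It can be seen that, when determining the mapping mode of the LFNST transform set, the encoding method proposed in the embodiments of the present application can use the first preset condition to limit the size of the image blocks that use DIMD, so that only some image blocks are allowed to use DIMD, which effectively reduces computational complexity. For example, only smaller image blocks are allowed to use DIMD to determine the mapping mode of the LFNST transform set.
Further, in the embodiments of the present application, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode can be directly determined as the mapping mode of the LFNST transform set. The first preset prediction mode may be the PLANAR mode or the DC mode.
That is, in the embodiments of the present application, when the mapping mode of the LFNST transform set is determined, the first preset condition makes it possible to set the mapping mode directly for some image blocks. For example, for larger image blocks, the PLANAR mode or the DC mode is directly determined as the mapping mode of the LFNST transform set.
Further, in the embodiments of the present application, when DIMD is used to determine the mapping mode of the LFNST transform set, at least one intra prediction mode can first be traversed to determine at least one piece of gradient information corresponding to the current block; the mapping mode of the LFNST transform set can then be determined from the at least one piece of gradient information. Each intra prediction mode corresponds to one piece of gradient information, and the gradient information may be a gradient histogram.
It can be understood that, in the embodiments of the present application, when the mapping mode of the LFNST transform set is determined from the at least one piece of gradient information, a gradient amplitude value can be determined for each intra prediction mode based on the gradient information; the intra prediction mode with the largest gradient amplitude value among the traversed modes is then determined as the mapping mode of the LFNST transform set. Each intra prediction mode corresponds to one gradient amplitude value.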
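Selecting the mode with the largest gradient amplitude is a plain arg-max over the per-mode amplitudes. A minimal sketch, assuming the amplitudes are held in a dictionary keyed by intra mode index; the tie-breaking rule (lowest mode index wins) is an assumption, since the text does not specify one.

```python
def best_mode_by_gradient(amplitudes: dict) -> int:
    """Return the intra mode whose gradient amplitude is largest.

    amplitudes maps intra mode index -> gradient amplitude value.
    Ties are broken in favour of the lower mode index (assumed rule).
    """
    return min(amplitudes, key=lambda mode: (-amplitudes[mode], mode))
```

For example, with amplitudes {0: 5, 18: 40, 50: 40} the function returns mode 18.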
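That is, in the embodiments of the present application, based on the DIMD technique, the intra prediction mode can be derived at the decoder using the same method as at the encoder, which saves bit overhead. DIMD mainly comprises two steps. In the first step, the prediction modes are derived, with the encoder and decoder using the same method of computing prediction mode strength. For example, the Sobel operator is used to accumulate a histogram of gradients over the prediction modes; the operator is applied to the L-shaped region formed by the three rows of neighboring reconstructed samples above the current block, the three columns of neighboring reconstructed samples to its left, and the corresponding neighboring reconstructed samples at its top-left. From the gradient histogram of this L-shaped region, the first prediction mode, which has the largest amplitude in the histogram, and the second prediction mode, which has the second-largest amplitude, are obtained. In the second step, the prediction block is derived, with the encoder and decoder using the same derivation to obtain the current prediction block. For example, the following two conditions are checked: (1) the gradient of the second prediction mode is not 0; (2) neither the first prediction mode nor the second prediction mode is the PLANAR or DC prediction mode. If the two conditions do not both hold, only the first prediction mode is used to calculate the prediction sample values of the current block, that is, the ordinary prediction process is applied with the first prediction mode. Otherwise, when both conditions hold, the current prediction block is derived by weighted averaging. Specifically, the PLANAR mode takes 1/3 of the weight, and the remaining 2/3 is shared between the first and second prediction modes in proportion to their gradient strengths: the first prediction mode is weighted by the ratio of its gradient strength to the sum of the gradient strengths of the first and second prediction modes, and the second prediction mode by the ratio of its gradient strength to the same sum. The three prediction modes, namely PLANAR, the first prediction mode and the second prediction mode, are weighted and averaged to obtain the prediction block of the current coding unit. The decoder obtains the prediction block by the same steps.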
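The weighting rule in the second DIMD step can be made concrete with a small sketch. It works with exact fractions for readability, whereas a real codec would use integer arithmetic; the function name and the dictionary return type are assumptions introduced for illustration.

```python
from fractions import Fraction

PLANAR, DC = 0, 1  # VVC mode indices


def dimd_blend_weights(mode1: int, amp1: int, mode2: int, amp2: int) -> dict:
    """Return {mode: weight} for the DIMD prediction block.

    mode1/amp1: mode with the largest histogram amplitude.
    mode2/amp2: mode with the second-largest amplitude.
    """
    blend = (
        amp2 != 0                          # condition 1
        and mode1 not in (PLANAR, DC)      # condition 2
        and mode2 not in (PLANAR, DC)
    )
    if not blend:
        # Only the first mode is used, with the ordinary prediction process.
        return {mode1: Fraction(1)}
    total = amp1 + amp2
    rest = Fraction(2, 3)                  # share left after PLANAR takes 1/3
    return {
        PLANAR: Fraction(1, 3),
        mode1: rest * Fraction(amp1, total),
        mode2: rest * Fraction(amp2, total),
    }
```

With amplitudes 30 and 10 for modes 18 and 50, the weights come out as 1/3 for PLANAR, 1/2 for mode 18 and 1/6 for mode 50, which sum to 1.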
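Further, in the embodiments of the present application, the downsampled vector of the current block can first be determined according to the MIP parameters; matrix multiplication is then performed on the downsampled vector to obtain the MIP output vector; the MIP prediction block of the current block is then determined from the MIP output vector; finally, at least one intra prediction mode is traversed over the MIP prediction block to obtain at least one piece of gradient information.
It can be understood that, in the embodiments of the present application, Haar downsampling can be applied to the obtained neighboring reference reconstructed samples according to the size parameter of the current block, with the sampling step determined by the size parameter. The order in which the downsampled top reference samples and the downsampled left reference samples are concatenated is adjusted according to the MIP transposition indication parameter. If no transposition is needed, the downsampled left reference samples are appended after the downsampled top reference samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference samples are appended after the downsampled left reference samples, and the resulting vector is used as the input (the downsampled vector).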
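The downsampling and concatenation order can be sketched as below. Haar downsampling is modelled here as a rounded average over groups of equal length, and the sketch assumes that each reference line length is a multiple of the target size; further normalisation of the input vector that a real MIP implementation performs is omitted. All function and parameter names are illustrative.

```python
def haar_downsample(samples, out_size):
    """Average consecutive groups of samples down to out_size values."""
    step = len(samples) // out_size  # sampling step derived from block size
    return [
        (sum(samples[i * step:(i + 1) * step]) + step // 2) // step
        for i in range(out_size)
    ]


def build_mip_input(ref_top, ref_left, boundary_size, is_transposed):
    """Build the downsampled vector fed to the MIP matrix multiplication."""
    top = haar_downsample(ref_top, boundary_size)
    left = haar_downsample(ref_left, boundary_size)
    # Not transposed: left samples follow top samples.
    # Transposed: top samples follow left samples.
    return left + top if is_transposed else top + left
```

For an 8-sample top line and a 4-sample left line with a boundary size of 4, the top line is averaged in pairs and the left line is passed through unchanged.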
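Next, the MIP matrix coefficients can be fetched using the traversed prediction mode as the index and multiplied with the input (the downsampled vector) to obtain the output vector (the MIP output vector). The output vector is then upsampled according to the number of output samples and the size parameter of the current block. If no upsampling is needed, the vector is filled row by row in the horizontal direction and output as the MIP prediction block of the current block; if upsampling is needed, the horizontal direction is upsampled first and the vertical direction is upsampled next, until the result matches the template size, and it is then output as the MIP prediction block of the current block.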
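The matrix multiplication and the no-upsampling case can be sketched as follows. This is a simplified illustration: the trained MIP matrices, the shift factor and the offset factor mentioned earlier are replaced by a small hypothetical integer matrix, and only clipping to the sample range is kept.

```python
def mip_matrix_multiply(matrix, in_vec, bit_depth=10):
    """Multiply the weight matrix by the downsampled vector and clip.

    matrix is a list of rows, one row per output sample (hypothetical
    weights; the real MIP matrices and their shift/offset are omitted).
    """
    max_val = (1 << bit_depth) - 1
    out = []
    for row in matrix:
        acc = sum(w * x for w, x in zip(row, in_vec))
        out.append(min(max(acc, 0), max_val))  # clip to the valid sample range
    return out


def fill_raster(out_vec, pred_size):
    """Fill the output vector row by row into a pred_size x pred_size block."""
    return [out_vec[r * pred_size:(r + 1) * pred_size] for r in range(pred_size)]
```

When the output block is already as large as the current block, fill_raster gives the MIP prediction block directly; otherwise the block would still need to be interpolated up to the block size.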
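Next, if it is determined that DIMD is used for the current block, the DIMD method can be applied directly to the MIP prediction block of the current block to derive the optimal conventional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed over the MIP prediction block of the current block, and the gradient information of the at least one intra prediction mode on the MIP prediction block of the current block is calculated.
That is, in the embodiments of the present application, the DIMD calculation can be performed after the MIP output vector has been upsampled.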
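Computing gradient information on a prediction block can be sketched with the Sobel operator as follows. The mapping from gradient orientation to intra prediction mode used by ECM is not reproduced in this section, so the sketch simply bins orientations into a hypothetical number of uniform bins and accumulates |Gx| + |Gy| as the amplitude.

```python
import math


def sobel_gradient_histogram(block, num_bins=8):
    """Accumulate a histogram of gradient amplitudes over interior samples.

    block is a list of rows of sample values. The uniform orientation
    bins stand in for intra modes and are an assumption of this sketch.
    """
    h, w = len(block), len(block[0])
    hist = [0] * num_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            if gx == 0 and gy == 0:
                continue  # flat area: no directional information
            angle = math.atan2(gy, gx) % math.pi          # orientation in [0, pi)
            idx = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[idx] += abs(gx) + abs(gy)
    return hist
```

The bin with the largest accumulated amplitude then plays the role of the optimal mode. The same routine can be run on the smaller block formed from the MIP output vector before upsampling, which is what reduces the amount of computation.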
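Further, in the embodiments of the present application, the downsampled vector of the current block can first be determined according to the MIP parameters; matrix multiplication is then performed on the downsampled vector to obtain the MIP output vector; finally, at least one intra prediction mode is traversed over the MIP output vector to obtain at least one piece of gradient information.
It can be understood that, in the embodiments of the present application, Haar downsampling can be applied to the obtained neighboring reference reconstructed samples according to the size parameter of the current block, with the sampling step determined by the size parameter. The order in which the downsampled top reference samples and the downsampled left reference samples are concatenated is adjusted according to the MIP transposition indication parameter. If no transposition is needed, the downsampled left reference samples are appended after the downsampled top reference samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference samples are appended after the downsampled left reference samples, and the resulting vector is used as the input (the downsampled vector).
Next, the MIP matrix coefficients can be fetched using the traversed prediction mode as the index and multiplied with the input (the downsampled vector) to obtain the output vector (the MIP output vector).
Next, if it is determined that DIMD is used for the current block, the DIMD method can be applied directly to the MIP output vector of the current block to derive the optimal conventional intra prediction mode as the mapping mode of the LFNST transform set. That is, at least one intra prediction mode is traversed over the MIP output vector, and the gradient information of the at least one intra prediction mode on the MIP output vector of the current block is calculated. The output vector (the MIP output vector) is then upsampled according to the number of output samples and the size parameter of the current block. If no upsampling is needed, the vector is filled row by row in the horizontal direction and output as the MIP prediction block of the current block; if upsampling is needed, the horizontal direction is upsampled first and the vertical direction is upsampled next, until the result matches the template size, and it is then output as the MIP prediction block of the current block.
That is, in the embodiments of the present application, the DIMD calculation can also be performed before the MIP output vector is upsampled.
It should be noted that, in the embodiments of the present application, when DIMD is used to determine the mapping mode of the LFNST transform set, all 67 intra prediction modes can be traversed to obtain the corresponding 67 pieces of gradient information; alternatively, only a subset of the 67 intra prediction modes can be traversed to obtain the corresponding gradient information.
That is, in the embodiments of the present application, to further reduce complexity at the encoder and decoder, some of the 67 intra prediction modes can be selectively skipped when DIMD is used, which reduces the number of traversed intra prediction modes, for example by selecting modes with a step size of 1.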
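Selecting a subset of modes at a fixed step can be sketched as below. The sketch assumes VVC numbering, in which the 65 angular modes have indices 2 to 66; the function name and the step used in the usage note are illustrative assumptions.

```python
def candidate_modes(step: int, first_angular: int = 2, last_angular: int = 66):
    """Return the angular intra modes visited when stepping by `step`."""
    return list(range(first_angular, last_angular + 1, step))
```

Under these assumptions a step of 1 visits all 65 angular modes, while a larger step such as 2 visits 33 of them and roughly halves the gradient computations.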
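Step 205: According to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected candidate set, set the LFNST index, and write it into the video bitstream.
In the embodiments of the present application, if the current block uses LFNST, then after the mapping mode of the LFNST transform set is determined according to the MIP parameters, one LFNST transform kernel candidate set can be selected from the multiple candidate sets according to the mapping mode, and the LFNST transform kernel used by the current block can be determined from the selected candidate set; the LFNST index can then be set and written into the video bitstream.
Further, in the embodiments of the present application, when the LFNST transform kernel used by the current block is determined, the index of the mapping mode of the LFNST transform set can be determined first; the value of the LFNST intra prediction mode index can then be determined from the value of that index; next, one LFNST transform kernel candidate set can be selected from the multiple candidate sets according to the value of the LFNST intra prediction mode index; finally, the LFNST transform kernel used by the current block can be selected from the selected candidate set, and the LFNST index can be set and written into the video bitstream.
That is, in the embodiments of the present application, after the mapping mode of the LFNST transform set is determined, its index can be determined and converted into the value of the LFNST intra prediction mode index (which may be denoted predModeIntra); one LFNST transform kernel candidate set is then selected from the multiple candidate sets according to the value of predModeIntra, and the LFNST transform kernel used by the current block is selected from that candidate set.
It should be noted that, in the embodiments of the present application, when the LFNST transform kernel candidate set and the corresponding LFNST transform kernel are determined, the multiple LFNST transform kernel candidate sets may include 4 candidate sets, each containing 2 LFNST transform kernels; correspondingly, a first lookup table can be used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
It can be understood that, in the embodiments of the present application, the first lookup table binds the DC mode, the PLANAR mode or the angular prediction modes to the LFNST transform sets, for example the first lookup table shown in Table 1.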
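Table 1 is not reproduced in this section. As an illustration only, the sketch below assumes that the first lookup table matches the transform set mapping used in VVC, where predModeIntra may also take negative or extended values for wide-angle modes; the function name is hypothetical.

```python
def lfnst_tr_set_idx(pred_mode_intra: int) -> int:
    """Map predModeIntra to one of 4 LFNST transform sets (assumed VVC table)."""
    if pred_mode_intra < 0:
        return 1
    if pred_mode_intra <= 1:      # PLANAR (0) and DC (1)
        return 0
    if pred_mode_intra <= 12:
        return 1
    if pred_mode_intra <= 23:
        return 2
    if pred_mode_intra <= 44:
        return 3
    if pred_mode_intra <= 55:
        return 2
    if pred_mode_intra <= 80:
        return 1
    raise ValueError("predModeIntra out of the range covered by this sketch")
```

Under this assumed table, a large block whose mapping mode is fixed to PLANAR always lands in transform set 0, without any gradient analysis.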
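It should be noted that, in the embodiments of the present application, when the LFNST transform kernel candidate set and the corresponding LFNST transform kernel are determined, the multiple LFNST transform kernel candidate sets may also include 35 candidate sets, each containing 3 LFNST transform kernels; correspondingly, a second lookup table can be used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
It can be understood that, in the embodiments of the present application, with the second lookup table the correspondence between intra prediction modes and LFNST transform sets is finer-grained, for example the second lookup table shown in Figure 2.
It should be noted that, in the embodiments of the present application, an LFNST transform kernel can be understood as an LFNST transform matrix; the kernels are multiple fixed coefficient matrices obtained through training.
It should be noted that, in the embodiments of the present application, since the LFNST transform kernel candidate set contains two or more preset transform kernels for MIP, rate-distortion optimization can be used to select the transform kernel used by the current block. Specifically, the rate-distortion cost (Rate Distortion Cost, RDCost) can be calculated for each transform kernel, and the transform kernel with the smallest rate-distortion cost is selected as the transform kernel used by the current block.
That is, at the encoder, one LFNST transform kernel can be selected by RDCost, and the index corresponding to that kernel (which may be denoted lfnst_idx) is written into the video bitstream and transmitted to the decoder. When the first LFNST transform kernel in the candidate set (the first transform matrix) is selected, lfnst_idx is set to 1; when the second LFNST transform kernel in the candidate set (the second transform matrix) is selected, lfnst_idx is set to 2.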
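The encoder-side choice of lfnst_idx can be sketched as a minimum search over rate-distortion costs. The function name and the optional cost of coding the block without LFNST are assumptions added for illustration; how the costs themselves are measured is outside the sketch.

```python
def choose_lfnst_idx(rd_costs, cost_without_lfnst=None):
    """Pick lfnst_idx from per-kernel rate-distortion costs.

    rd_costs[i] is the cost of using kernel i of the selected candidate set.
    Returns 0 when coding without LFNST is at least as cheap (if that cost
    is supplied), otherwise the 1-based index of the cheapest kernel.
    """
    best = min(range(len(rd_costs)), key=lambda i: rd_costs[i])
    if cost_without_lfnst is not None and cost_without_lfnst <= rd_costs[best]:
        return 0
    return best + 1
```

With costs [120.5, 98.0] the second kernel wins and lfnst_idx is 2, matching the convention described above.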
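It should be noted that, in the embodiments of the present application, the value of the LFNST index can be used to indicate whether the current block uses LFNST, and can also be used to indicate the index of the LFNST transform kernel within the LFNST transform kernel candidate set.
That is, in the embodiments of the present application, for the value of the LFNST index (lfnst_idx): when the value is equal to 0, LFNST is not used; when the value is greater than 0, LFNST is used, and the index of the transform kernel equals the value of the LFNST index, or that value minus 1. In this way, the LFNST transform kernel used by the current block can be determined from the LFNST index.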
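The interpretation of lfnst_idx can be sketched as follows, using the "value minus 1" variant so that the kernels in a candidate set are addressed from 0; the function name and the list representation of a candidate set are assumptions.

```python
def select_kernel(candidate_set, lfnst_idx: int):
    """Return the kernel signalled by lfnst_idx, or None if LFNST is off."""
    if lfnst_idx == 0:
        return None                      # LFNST is not used for this block
    return candidate_set[lfnst_idx - 1]  # 1 -> first kernel, 2 -> second
```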
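Further, in the embodiments of the present application, the MIP parameters may also include a MIP transposition indication parameter, whose value indicates whether the sample input vector used by the MIP mode is transposed. Correspondingly, when the value of the MIP transposition indication parameter indicates that the sample input vector used by the MIP mode is transposed, the selected transform kernel can be matrix-transposed to obtain the LFNST transform kernel used by the current block.
It can be understood that, in the embodiments of the present application, when the value of the MIP transposition indication parameter is equal to 1, it can be taken to indicate that the sample input vector used by the MIP mode is transposed; the selected transform kernel then needs to be matrix-transposed accordingly to obtain the LFNST transform kernel used by the current block.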
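The kernel transposition can be sketched as below, with a kernel represented as a list of rows; the function names are illustrative and the tiny matrix in the usage note is not a real LFNST kernel.

```python
def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def kernel_for_block(selected_kernel, mip_transposed_flag: int):
    """Transpose the selected kernel when the MIP input vector was transposed."""
    if mip_transposed_flag == 1:
        return transpose(selected_kernel)
    return selected_kernel
```

For example, kernel_for_block([[1, 2, 3], [4, 5, 6]], 1) yields [[1, 4], [2, 5], [3, 6]], while a flag of 0 returns the kernel unchanged.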
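Step 206: Transform the residual block using the LFNST transform kernel.
In the embodiments of the present application, after one LFNST transform kernel candidate set has been selected from the multiple candidate sets according to the mapping mode of the LFNST transform set, and the LFNST transform kernel used by the current block has been determined from the selected candidate set, the residual block can be transformed using the LFNST transform kernel, that is, using the transform matrix selected for the current block.
Exemplarily, in one possible embodiment, at the encoder, the encoder traverses the prediction modes. If the current coding unit (the current block) is in intra mode, the enable flag of the coding method proposed in the embodiments of the present application is obtained, that is, the MIP enable flag (the prediction mode parameter). This flag may be a sequence-level flag indicating whether the current decoder is allowed to use the MIP technique, and may be expressed in the form sps_mip_enable_flag.
Next, if the MIP enable flag is true, the encoder tries the MIP prediction method and calculates the corresponding rate-distortion cost, denoted cost1; if the MIP enable flag is false, the encoder does not try the MIP prediction method, but continues to traverse other intra prediction techniques and calculates the corresponding rate-distortion costs, denoted cost2 … costN.
If the MIP enable flag is true, Haar downsampling can be applied to the obtained neighboring reference reconstructed samples according to the size of the current coding unit (the size parameter of the current block), with the sampling step determined by the coding unit size; the order in which the downsampled top reference samples and the downsampled left reference samples are concatenated is adjusted according to the MIP transposition indication parameter. If no transposition is needed, the downsampled left reference samples are appended after the downsampled top reference samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference samples are appended after the downsampled left reference samples, and the resulting vector is used as the input (the downsampled vector).
Next, the MIP matrix coefficients are fetched using the traversed prediction mode as the index and multiplied with the input (the downsampled vector) to obtain the output vector (the MIP output vector). The output vector is then upsampled according to the number of output samples and the size of the current coding unit. If no upsampling is needed, the vector is filled row by row in the horizontal direction and output as the prediction block of the current coding unit (the MIP prediction block of the current block); if upsampling is needed, the horizontal direction is upsampled first and the vertical direction is upsampled next, until the result matches the template size, and it is then output as the prediction block of the current coding unit (the MIP prediction block of the current block).
It should be noted that, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (the first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not satisfy the first preset condition, the DIMD method can be applied to the current MIP prediction block to derive the optimal conventional intra prediction mode as the mapping mode of the LFNST transform set.
Further, when the DIMD method is used to derive a conventional intra prediction mode as the mapping mode of the LFNST transform set, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed over the current MIP prediction block. The gradient information of each intra prediction mode on the current MIP prediction block is calculated, the corresponding gradient amplitude values are determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value. The intra prediction mode with the largest amplitude is the optimal mode, and this optimal mode is used to map the LFNST transform set of the current coding unit.
After the mapping mode of the LFNST transform set has been determined, the prediction block (the MIP prediction block of the current block) is subtracted from the original image block of the current coding unit to obtain the residual block of the current coding unit (the current block). The residual block undergoes the primary transform to yield a frequency-domain coefficient block, and LFNST is applied as a secondary transform to the region of interest of the frequency-domain coefficient block, with the mapping prediction mode of the LFNST transform set determined by the above method. After quantization, dequantization, inverse transform and related processes, the rate-distortion cost of the current coding unit is calculated and denoted cost1.
Further, the other intra prediction techniques can be traversed and their rate-distortion costs calculated, denoted cost2 … costN. If cost1 is the smallest of all the rate-distortion costs, the current coding unit adopts the MIP technique, and the MIP usage flag of the current coding unit and the corresponding MIP transposition flag (the MIP transposition indication parameter) are both set to true and written into the bitstream. If cost1 is not the smallest rate-distortion cost, the current coding unit adopts another intra prediction technique, and the MIP usage flag of the current coding unit is set to false and written into the bitstream; the flags, indices and other information of the other intra prediction techniques are transmitted as defined.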
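Exemplarily, in another possible embodiment, the encoder traverses the prediction modes. If the current coding unit (the current block) is in intra mode, the enable flag of the coding method proposed in the embodiments of the present application is obtained, that is, the MIP enable flag (the prediction mode parameter). This flag may be a sequence-level flag indicating whether the current decoder is allowed to use the MIP technique, and may be expressed in the form sps_mip_enable_flag.
Next, if the MIP enable flag is true, the encoder tries the MIP prediction method and calculates the corresponding rate-distortion cost, denoted cost1; if the MIP enable flag is false, the encoder does not try the MIP prediction method, but continues to traverse other intra prediction techniques and calculates the corresponding rate-distortion costs, denoted cost2 … costN.
If the MIP enable flag is true, Haar downsampling can be applied to the obtained neighboring reference reconstructed samples according to the size of the current coding unit (the size parameter of the current block), with the sampling step determined by the coding unit size; the order in which the downsampled top reference samples and the downsampled left reference samples are concatenated is adjusted according to the MIP transposition indication parameter. If no transposition is needed, the downsampled left reference samples are appended after the downsampled top reference samples, and the resulting vector is used as the input (the downsampled vector); if transposition is needed, the downsampled top reference samples are appended after the downsampled left reference samples, and the resulting vector is used as the input (the downsampled vector).
Next, the MIP matrix coefficients are fetched using the traversed prediction mode as the index and multiplied with the input (the downsampled vector) to obtain the output vector (the MIP output vector).
It should be noted that, if the size parameter of the current block satisfies the first preset condition, the first preset prediction mode is directly determined as the mapping mode of the LFNST transform set; for example, if the width and height of the current coding unit are both greater than or equal to 32, the PLANAR mode (the first preset prediction mode) can be used as the mapping mode of the LFNST transform set. If the size parameter of the current block does not satisfy the first preset condition, the DIMD method can be applied to the current MIP output vector to derive the optimal conventional intra prediction mode as the mapping mode of the LFNST transform set.
Further, when the DIMD method is used to derive a conventional intra prediction mode as the mapping mode of the LFNST transform set, the 67 intra prediction modes of the current VVC and ECM (or a subset of them) can be traversed over the current MIP output vector. The gradient information of each intra prediction mode on the current MIP output vector is calculated, the corresponding gradient amplitude values are determined from the gradient information, and the traversed intra prediction modes are sorted by gradient amplitude value. The intra prediction mode with the largest amplitude is the optimal mode, and this optimal mode is used to map the LFNST transform set of the current coding unit.
Further, the output vector can be upsampled according to the number of output samples and the size of the current coding unit. If no upsampling is needed, the vector is filled row by row in the horizontal direction and output as the prediction block of the current coding unit (the MIP prediction block of the current block); if upsampling is needed, the horizontal direction is upsampled first and the vertical direction is upsampled next, until the result matches the template size, and it is then output as the prediction block of the current coding unit (the MIP prediction block of the current block).
After the mapping mode of the LFNST transform set has been determined, the prediction block (the MIP prediction block of the current block) is subtracted from the original image block of the current coding unit to obtain the residual block of the current coding unit (the current block). The residual block undergoes the primary transform to yield a frequency-domain coefficient block, and LFNST is applied as a secondary transform to the region of interest of the frequency-domain coefficient block, with the mapping prediction mode of the LFNST transform set determined by the above method. After quantization, dequantization, inverse transform and related processes, the rate-distortion cost of the current coding unit is calculated and denoted cost1.
Further, the other intra prediction techniques can be traversed and their rate-distortion costs calculated, denoted cost2 … costN. If cost1 is the smallest of all the rate-distortion costs, the current coding unit adopts the MIP technique, and the MIP usage flag of the current coding unit and the corresponding MIP transposition flag (the MIP transposition indication parameter) are both set to true and written into the bitstream. If cost1 is not the smallest rate-distortion cost, the current coding unit adopts another intra prediction technique, and the MIP usage flag of the current coding unit is set to false and written into the bitstream; the flags, indices and other information of the other intra prediction techniques are transmitted as defined.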
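The encoding method proposed in the embodiments of the present application reduces the software and hardware complexity of the JVET-Z0048 scheme while maintaining similar performance, with no performance change on the luma component. Compared with ECM4.0, it maintains the same performance as JVET-Z0048.
Further, in the embodiments of the present application, considering that hardware decoders have different requirements for I frames and B frames, the proposed encoding method may be used only in B frames, or in both I frames and B frames. Alternatively, it may be used only in I frames. Alternatively, the conditions under which the method is allowed may differ between B frames and I frames; for example, coding units of all sizes are allowed to use the method in I frames, whereas only small coding units are allowed to use it in B frames.
Further, in the embodiments of the present application, if the luma component of the current coding unit (the current block) uses the MIP prediction mode, while the chroma component uses neither the MIP prediction mode nor a conventional intra prediction mode, the LFNST transform set of the chroma component may inherit the LFNST transform set of the luma component.
Further, in the embodiments of the present application, if the current coding unit (the current block) does not use a conventional intra prediction mode, the LFNST transform sets of both the luma and chroma components can be derived by the encoding method proposed in the embodiments of the present application.
In summary, the encoding method proposed in the embodiments of the present application relates to using DIMD on the MIP prediction block to derive the mapping of the LFNST transform set. On the one hand, it proposes limiting the coding unit sizes for which DIMD is used: for larger image blocks, the MIP output vector is upsampled more heavily and its directional information is less distinct, so the DIMD derivation of a conventional prediction mode is skipped to reduce computational complexity. On the other hand, on top of limiting the coding unit sizes for DIMD, it proposes further reducing computational complexity by using the MIP output vector before upsampling as the input of DIMD to derive the optimal conventional intra prediction mode.
The embodiments of the present application provide an encoding method. At the encoder, a prediction mode parameter is determined; when the prediction mode parameter indicates that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; according to the MIP parameters, the intra prediction block of the current block is determined and the residual block between the current block and the intra prediction value is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple candidate sets, the LFNST transform kernel used by the current block is determined from the selected candidate set, and the LFNST index is set and written into the video bitstream; the residual block is transformed using the LFNST transform kernel. It can be seen that, in the embodiments of the present application, when the prediction block is derived, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, the mapping mode can be derived without DIMD, which reduces computational complexity and thereby improves coding efficiency.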
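Based on the above embodiments, in yet another embodiment of the present application and under the same inventive concept as the foregoing embodiments, FIG. 8 is a first schematic structural diagram of an encoder. As shown in FIG. 8, the encoder 110 may include a first determining unit 111, an encoding unit 112 and a first transform unit 113, wherein
the first determining unit 111 is configured to: determine a prediction mode parameter; when the prediction mode parameter indicates that the current block uses MIP to determine the intra prediction value, determine the MIP parameters of the current block; according to the MIP parameters, determine the intra prediction block of the current block and calculate the residual block between the current block and the intra prediction value; when the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected candidate set, and set the LFNST index;
the encoding unit 112 is configured to write the video bitstream;
the first transform unit 113 is configured to transform the residual block using the LFNST transform kernel.
It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Therefore, the embodiments of the present application provide a computer-readable storage medium applied to the encoder 110. The computer-readable storage medium stores a computer program which, when executed by the first processor, implements the method of any one of the foregoing embodiments.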
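Based on the above composition of the encoder 110 and the computer-readable storage medium, FIG. 9 is a second schematic structural diagram of the encoder. As shown in FIG. 9, the encoder 110 may include a first memory 114, a first processor 115, a first communication interface 116 and a first bus system 117. The first memory 114, the first processor 115 and the first communication interface 116 are coupled together through the first bus system 117. It can be understood that the first bus system 117 is used to implement connection and communication between these components. In addition to a data bus, the first bus system 117 includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all labeled as the first bus system 117 in FIG. 9. Wherein,
the first communication interface 116 is configured to receive and send signals in the process of exchanging information with other external network elements;
the first memory 114 is configured to store a computer program that can run on the first processor;
the first processor 115 is configured to, when running the computer program: determine a prediction mode parameter; when the prediction mode parameter indicates that the current block uses MIP to determine the intra prediction value, determine the MIP parameters of the current block; according to the MIP parameters, determine the intra prediction block of the current block and calculate the residual block between the current block and the intra prediction value; when the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine the LFNST transform kernel used by the current block from the selected candidate set, and set the LFNST index and write it into the video bitstream; and transform the residual block using the LFNST transform kernel.
It can be understood that the first memory 114 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DR RAM). The first memory 114 of the systems and methods described in the present application is intended to include, without being limited to, these and any other suitable types of memory.
The first processor 115 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the first processor 115 or by instructions in the form of software. The first processor 115 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the first memory 114; the first processor 115 reads the information in the first memory 114 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in the present application may be implemented in hardware, software, firmware, middleware, microcode or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof. For a software implementation, the techniques described in the present application may be implemented by modules (for example, procedures and functions) that perform the functions described in the present application. The software code may be stored in a memory and executed by a processor. The memory may be implemented inside or outside the processor.
Optionally, as another embodiment, the first processor 115 is further configured to perform, when running the computer program, the method of any one of the foregoing embodiments.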
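FIG. 10 is a first schematic structural diagram of a decoder. As shown in FIG. 10, the decoder 120 may include a second determining unit 121 and a second transform unit 122, wherein
the second determining unit 121 is configured to: decode the bitstream and determine a prediction mode parameter; when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, decode the bitstream and determine the MIP parameters of the current block; decode the bitstream and determine the transform coefficients and the LFNST index of the current block; when the LFNST index indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; and, according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets and determine the LFNST transform kernel used by the current block from the selected candidate set;
the second transform unit 122 is configured to transform the transform coefficients using the LFNST transform kernel.
It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Therefore, the embodiments of the present application provide a computer-readable storage medium applied to the decoder 120. The computer-readable storage medium stores a computer program which, when executed by the second processor, implements the method of any one of the foregoing embodiments.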
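Based on the above composition of the decoder 120 and the computer-readable storage medium, FIG. 11 is a second schematic structural diagram of the decoder. As shown in FIG. 11, the decoder 120 may include a second memory 123, a second processor 124, a second communication interface 125 and a second bus system 126. The second memory 123, the second processor 124 and the second communication interface 125 are coupled together through the second bus system 126. It can be understood that the second bus system 126 is used to implement connection and communication between these components. In addition to a data bus, the second bus system 126 includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all labeled as the second bus system 126 in FIG. 11. Wherein,
the second communication interface 125 is configured to receive and send signals in the process of exchanging information with other external network elements;
the second memory 123 is configured to store a computer program that can run on the second processor;
the second processor 124 is configured to, when running the computer program: decode the bitstream and determine a prediction mode parameter; when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, decode the bitstream and determine the MIP parameters of the current block; decode the bitstream and determine the transform coefficients and the LFNST index of the current block; when the LFNST index indicates that the current block uses LFNST, determine the mapping mode of the LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets and determine the LFNST transform kernel used by the current block from the selected candidate set; and transform the transform coefficients using the LFNST transform kernel.
It can be understood that the second memory 123 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DR RAM). The second memory 123 of the systems and methods described in the present application is intended to include, without being limited to, these and any other suitable types of memory.
The second processor 124 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the second processor 124 or by instructions in the form of software. The second processor 124 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the second memory 123; the second processor 124 reads the information in the second memory 123 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in the present application may be implemented in hardware, software, firmware, middleware, microcode or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof. For a software implementation, the techniques described in the present application may be implemented by modules (for example, procedures and functions) that perform the functions described in the present application. The software code may be stored in a memory and executed by a processor. The memory may be implemented inside or outside the processor.
The embodiments of the present application provide an encoder and a decoder. When the prediction block is derived, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, the mapping mode can be derived without DIMD, which reduces computational complexity and thereby improves coding efficiency.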
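It should be noted that, in the embodiments of the present application, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the merits of the embodiments.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Industrial Applicability
The embodiments of the present application provide an encoding and decoding method, an encoder, a decoder and a storage medium. At the decoder, the bitstream is decoded and a prediction mode parameter is determined; when the prediction mode parameter indicates that MIP is used to determine the intra prediction value, the bitstream is decoded and the MIP parameters of the current block are determined; the bitstream is decoded and the transform coefficients and the LFNST index of the current block are determined; when the LFNST index indicates that the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple candidate sets, and the LFNST transform kernel used by the current block is determined from the selected candidate set; the transform coefficients are transformed using the LFNST transform kernel. At the encoder, a prediction mode parameter is determined; when the prediction mode parameter indicates that the current block uses MIP to determine the intra prediction value, the MIP parameters of the current block are determined; according to the MIP parameters, the intra prediction block of the current block is determined and the residual block between the current block and the intra prediction value is calculated; when the current block uses LFNST, the mapping mode of the LFNST transform set is determined according to the MIP parameters; according to the mapping mode of the LFNST transform set, one LFNST transform kernel candidate set is selected from multiple candidate sets, the LFNST transform kernel used by the current block is determined from the selected candidate set, and the LFNST index is set and written into the video bitstream; the residual block is transformed using the LFNST transform kernel. It can be seen that, in the embodiments of the present application, when the prediction block is derived, the mapping mode of the LFNST transform set is determined according to the size parameter in the MIP parameters of the current block; for larger image blocks, the mapping mode can be derived without DIMD, which reduces computational complexity and thereby improves coding efficiency.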

Claims (31)

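  1. A decoding method, applied to a decoder, the method comprising:
    decoding a bitstream and determining a prediction mode parameter;
    when the prediction mode parameter indicates that matrix-based intra prediction (MIP) is used to determine an intra prediction value, decoding the bitstream and determining MIP parameters of a current block;
    decoding the bitstream and determining transform coefficients and a low-frequency non-separable transform (LFNST) index of the current block;
    when the LFNST index indicates that the current block uses LFNST, determining a mapping mode of an LFNST transform set according to the MIP parameters;
    according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determining, from the selected LFNST transform kernel candidate set, the LFNST transform kernel used by the current block;
    transforming the transform coefficients using the LFNST transform kernel.
  2. The method according to claim 1, wherein the MIP parameters comprise a size parameter of the current block, and the method further comprises:
    if the size parameter satisfies a first preset condition, determining a first preset prediction mode as the mapping mode of the LFNST transform set;
    if the size parameter does not satisfy the first preset condition, using decoder-side intra mode derivation (DIMD) to determine the mapping mode of the LFNST transform set.
  3. The method according to claim 2, wherein the method further comprises:
    traversing at least one intra prediction mode and determining at least one piece of gradient information corresponding to the current block;
    determining the mapping mode of the LFNST transform set according to the at least one piece of gradient information.
  4. The method according to claim 3, wherein the method further comprises:
    determining, based on the at least one piece of gradient information, a gradient amplitude value corresponding to each intra prediction mode;
    determining the intra prediction mode with the largest gradient amplitude value among the at least one intra prediction mode as the mapping mode of the LFNST transform set.
  5. The method according to claim 3 or 4, wherein the method further comprises:
    determining a downsampled vector of the current block according to the MIP parameters;
    performing matrix multiplication on the downsampled vector to obtain a MIP output vector;
    determining a MIP prediction block of the current block according to the MIP output vector;
    traversing the at least one intra prediction mode over the MIP prediction block to obtain the at least one piece of gradient information.
  6. The method according to claim 3 or 4, wherein the method further comprises:
    determining a downsampled vector of the current block according to the MIP parameters;
    performing matrix multiplication on the downsampled vector to obtain a MIP output vector;
    traversing the at least one intra prediction mode over the MIP output vector to obtain the at least one piece of gradient information.
  7. The method according to claim 2, wherein the size parameter comprises a height and a width of the current block, and the method further comprises:
    if the height of the current block is greater than or equal to a preset height threshold, and/or the width of the current block is greater than or equal to a preset width threshold, determining that the size parameter satisfies the first preset condition;
    if the height of the current block is less than the preset height threshold, and/or the width of the current block is less than the preset width threshold, determining that the size parameter does not satisfy the first preset condition.
  8. The method according to claim 2 or 7, wherein the first preset prediction mode is a PLANAR mode or a DC mode.
  9. The method according to claim 2, wherein the method further comprises:
    determining an index of the mapping mode of the LFNST transform set;
    determining a value of an LFNST intra prediction mode index according to a value of the index;
    selecting one LFNST transform kernel candidate set from the multiple LFNST transform kernel candidate sets according to the value of the LFNST intra prediction mode index;
    selecting, from the selected LFNST transform kernel candidate set, the transform kernel indicated by the LFNST index, and setting it as the LFNST transform kernel used by the current block.
  10. The method according to claim 9, wherein
    the multiple LFNST transform kernel candidate sets comprise 4 LFNST transform kernel candidate sets, each LFNST transform kernel candidate set comprising 2 LFNST transform kernels; correspondingly,
    a first lookup table is used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
  11. The method according to claim 9, wherein the method further comprises:
    the multiple LFNST transform kernel candidate sets comprise 35 LFNST transform kernel candidate sets, each LFNST transform kernel candidate set comprising 3 LFNST transform kernels; correspondingly,
    a second lookup table is used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
  12. The method according to claim 10 or 11, wherein the MIP parameters comprise a MIP transposition indication parameter, and a value of the MIP transposition indication parameter indicates whether a sample input vector used by the MIP mode is transposed.
  13. The method according to claim 12, wherein the method further comprises:
    when the value of the MIP transposition indication parameter indicates that the sample input vector used by the MIP mode is transposed, performing matrix transposition on the transform kernel indicated by the LFNST index to obtain the LFNST transform kernel used by the current block.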
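  14. An encoding method, applied to an encoder, the method comprising:
    determining a prediction mode parameter;
    when the prediction mode parameter indicates that the current block uses MIP to determine an intra prediction value, determining MIP parameters of the current block;
    according to the MIP parameters, determining an intra prediction block of the current block, and calculating a residual block between the current block and the intra prediction value;
    when the current block uses LFNST, determining a mapping mode of an LFNST transform set according to the MIP parameters;
    according to the mapping mode of the LFNST transform set, selecting one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determining, from the selected LFNST transform kernel candidate set, the LFNST transform kernel used by the current block, and setting an LFNST index and writing it into a video bitstream;
    transforming the residual block using the LFNST transform kernel.
  15. The method according to claim 14, wherein the MIP parameters comprise a size parameter of the current block, and the method further comprises:
    if the size parameter satisfies a first preset condition, determining a first preset prediction mode as the mapping mode of the LFNST transform set;
    if the size parameter does not satisfy the first preset condition, using DIMD to determine the mapping mode of the LFNST transform set.
  16. The method according to claim 15, wherein the method further comprises:
    traversing at least one intra prediction mode and determining at least one piece of gradient information corresponding to the current block;
    determining the mapping mode of the LFNST transform set according to the at least one piece of gradient information.
  17. The method according to claim 16, wherein the method further comprises:
    determining, based on the at least one piece of gradient information, a gradient amplitude value corresponding to each intra prediction mode;
    determining the intra prediction mode with the largest gradient amplitude value among the at least one intra prediction mode as the mapping mode of the LFNST transform set.
  18. The method according to claim 16 or 17, wherein the method further comprises:
    determining a downsampled vector of the current block according to the MIP parameters;
    performing matrix multiplication on the downsampled vector to obtain a MIP output vector;
    determining a MIP prediction block of the current block according to the MIP output vector;
    traversing the at least one intra prediction mode over the MIP prediction block to obtain the at least one piece of gradient information.
  19. The method according to claim 16 or 17, wherein the method further comprises:
    determining a downsampled vector of the current block according to the MIP parameters;
    performing matrix multiplication on the downsampled vector to obtain a MIP output vector;
    traversing the at least one intra prediction mode over the MIP output vector to obtain the at least one piece of gradient information.
  20. The method according to claim 15, wherein the size parameter comprises a height and a width of the current block, and the method further comprises:
    if the height of the current block is greater than or equal to a preset height threshold, and/or the width of the current block is greater than or equal to a preset width threshold, determining that the size parameter satisfies the first preset condition;
    if the height of the current block is less than the preset height threshold, and/or the width of the current block is less than the preset width threshold, determining that the size parameter does not satisfy the first preset condition.
  21. The method according to claim 15 or 19, wherein the first preset prediction mode is a PLANAR mode or a DC mode.
  22. The method according to claim 15, wherein the method further comprises:
    determining an index of the mapping mode of the LFNST transform set;
    determining a value of an LFNST intra prediction mode index according to a value of the index;
    selecting one LFNST transform kernel candidate set from the multiple LFNST transform kernel candidate sets according to the value of the LFNST intra prediction mode index;
    selecting, from the selected LFNST transform kernel candidate set, the transform kernel used by the current block.
  23. The method according to claim 22, wherein
    the multiple LFNST transform kernel candidate sets comprise 4 LFNST transform kernel candidate sets, each LFNST transform kernel candidate set comprising 2 LFNST transform kernels; correspondingly,
    a first lookup table is used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
  24. The method according to claim 22, wherein the method further comprises:
    the multiple LFNST transform kernel candidate sets comprise 35 LFNST transform kernel candidate sets, each LFNST transform kernel candidate set comprising 3 LFNST transform kernels; correspondingly,
    a second lookup table is used to determine the value of the LFNST intra prediction mode index corresponding to the value of the index.
  25. The method according to claim 23 or 24, wherein the MIP parameters comprise a MIP transposition indication parameter, and a value of the MIP transposition indication parameter indicates whether a sample input vector used by the MIP mode is transposed.
  26. The method according to claim 23, wherein the method further comprises:
    when the value of the MIP transposition indication parameter indicates that the sample input vector used by the MIP mode is transposed, performing matrix transposition on the selected transform kernel to obtain the LFNST transform kernel used by the current block.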
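  27. An encoder, comprising a first determining unit, an encoding unit and a first transform unit, wherein
    the first determining unit is configured to: determine a prediction mode parameter; when the prediction mode parameter indicates that the current block uses MIP to determine an intra prediction value, determine MIP parameters of the current block; according to the MIP parameters, determine an intra prediction block of the current block and calculate a residual block between the current block and the intra prediction value; when the current block uses LFNST, determine a mapping mode of an LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, determine, from the selected LFNST transform kernel candidate set, the LFNST transform kernel used by the current block, and set an LFNST index;
    the encoding unit is configured to write a video bitstream;
    the first transform unit is configured to transform the residual block using the LFNST transform kernel.
  28. An encoder, comprising a first memory and a first processor, wherein
    the first memory is configured to store a computer program that can run on the first processor;
    the first processor is configured to perform, when running the computer program, the method according to any one of claims 14 to 26.
  29. A decoder, comprising a second determining unit and a second transform unit, wherein
    the second determining unit is configured to: decode a bitstream and determine a prediction mode parameter; when the prediction mode parameter indicates that MIP is used to determine an intra prediction value, decode the bitstream and determine MIP parameters of a current block; decode the bitstream and determine transform coefficients and an LFNST index of the current block; when the LFNST index indicates that the current block uses LFNST, determine a mapping mode of an LFNST transform set according to the MIP parameters; according to the mapping mode of the LFNST transform set, select one LFNST transform kernel candidate set from multiple LFNST transform kernel candidate sets, and determine, from the selected LFNST transform kernel candidate set, the LFNST transform kernel used by the current block;
    the second transform unit is configured to transform the transform coefficients using the LFNST transform kernel.
  30. A decoder, comprising a second memory and a second processor, wherein
    the second memory is configured to store a computer program that can run on the second processor;
    the second processor is configured to perform, when running the computer program, the method according to any one of claims 1 to 13.
  31. A computer-readable storage medium storing a computer program which, when executed, implements the method according to any one of claims 1 to 13 or the method according to any one of claims 14 to 26.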
PCT/CN2022/103686 2022-07-04 2022-07-04 编解码方法、编码器、解码器以及存储介质 WO2024007120A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/103686 WO2024007120A1 (zh) 2022-07-04 2022-07-04 编解码方法、编码器、解码器以及存储介质
TW112124177A TW202404361A (zh) 2022-07-04 2023-06-28 編解碼方法、編碼器、解碼器以及儲存媒介

Publications (1)

Publication Number Publication Date
WO2024007120A1 true WO2024007120A1 (zh) 2024-01-11

Family

ID=89454679

Country Status (2)

Country Link
TW (1) TW202404361A (zh)
WO (1) WO2024007120A1 (zh)

Also Published As

Publication number Publication date
TW202404361A (zh) 2024-01-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22949711

Country of ref document: EP

Kind code of ref document: A1