CN110944177A - Video decoding method, video decoder, video encoding method, and video encoder - Google Patents


Info

Publication number
CN110944177A
CN110944177A (application CN201811150819.0A)
Authority
CN
China
Prior art keywords
matrix
transformation
transform
dst4
transformation matrix
Prior art date
Legal status
Granted
Application number
CN201811150819.0A
Other languages
Chinese (zh)
Other versions
CN110944177B (en)
Inventor
林永兵
郑建铧
朱策
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to PCT/CN2019/106383 (published as WO2020057537A1)
Publication of CN110944177A
Application granted
Publication of CN110944177B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H04N19/124 Quantisation
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Abstract

The invention discloses a video decoding method and a video decoder. The method includes: parsing the received bitstream to obtain indication information of a transform matrix pair used for inverse transforming the current block, together with the quantized coefficients of the current block; performing inverse quantization on the quantized coefficients of the current block to obtain inverse-quantized coefficients of the current block; determining, from four candidate transform matrix pairs according to the indication information, the transform matrix pair used for inverse transforming the current block, where the horizontal-direction transform matrix and the vertical-direction transform matrix included in each of the four candidate pairs are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix; and obtaining a reconstructed block of the current block according to the transform matrix pair used for inverse transforming the current block. With the present application, the implementation of the transform/inverse transform can be simplified.

Description

Video decoding method, video decoder, video encoding method, and video encoder
Technical Field
Embodiments of the present application relate generally to the field of video encoding, and more particularly, to video decoding methods and video decoders, video encoding methods and video encoders.
Background
Video coding (video encoding and decoding) is widely used in digital video applications such as broadcast digital television, video transmission over the internet and mobile networks, real-time conversational applications such as video chat and video conferencing, DVDs and Blu-ray discs, video content acquisition and editing systems, and camcorder security applications.
Since the development of the block-based hybrid video coding approach in the H.261 standard in 1990, new video coding techniques and tools have been developed, and they form the basis of new video coding standards. Further video coding standards include MPEG-1 Video, MPEG-2 Video, ITU-T H.262/MPEG-2, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC) …, and extensions of such standards, such as scalability and/or three-dimensional (3D) extensions. As video creation and use become ever more widespread, video traffic has become the largest load on communication networks and data storage. One goal of most video coding standards is therefore to reduce the bit rate without sacrificing picture quality compared with previous standards. Even though the latest High Efficiency Video Coding (HEVC) can compress video about twice as much as AVC without sacrificing picture quality, new techniques that compress video further relative to HEVC are still needed.
Disclosure of Invention
Embodiments of the present application provide a video decoding method and a video decoder, and a video encoding method and a video encoder, which can simplify the implementation of transform/inverse transform.
The foregoing and other objects are achieved by the subject matter of the independent claims. Other implementations are apparent from the dependent claims, the description and the drawings.
In a first aspect, the present invention provides a video decoding method, including:
parsing the received bitstream to obtain indication information of a transform matrix pair used for inverse transforming the current block and the quantized coefficients of the current block, where the transform matrix pair includes a horizontal-direction transform matrix and a vertical-direction transform matrix;
performing inverse quantization processing on the quantized coefficient of the current block to obtain an inverse quantized coefficient of the current block;
determining, from the candidate transform matrix pairs according to the indication information, a transform matrix pair for performing inverse transform processing on the current block; the horizontal-direction transform matrix and the vertical-direction transform matrix included in each candidate transform matrix pair are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix, where the DCT2' matrix is the transpose of the DCT2 matrix;
performing inverse transform processing on the inverse-quantized coefficients of the current block according to the transform matrix pair used for inverse transforming the current block, so as to obtain a reconstructed residual block of the current block;
and obtaining a reconstructed block of the current block according to the reconstructed residual block of the current block.
The horizontal-direction transform matrix and the vertical-direction transform matrix included in any one candidate transform matrix pair may be the same or different.
The number of candidate transform matrix pairs may be 2, 3, or 4.
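The inverse transform in the steps above is separable: the vertical-direction matrix acts on the columns of the coefficient block and the horizontal-direction matrix on its rows. A minimal floating-point sketch (the function names and the orthonormal basis are illustrative assumptions; a real decoder uses integer matrices with shift-and-clip scaling):

```python
import math

def dct2(n):
    """Orthonormal n-point DCT-2 basis; rows are orthonormal, so the
    inverse transform is simply the transpose."""
    rows = []
    for k in range(n):
        c = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([c * math.cos(k * (2 * j + 1) * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse_transform_2d(coeff, m_vert, m_horiz):
    """Invert the separable 2-D transform C = M_v * R * M_h^T:
    apply the transposed vertical matrix, then the horizontal matrix."""
    return mat_mul(mat_mul(transpose(m_vert), coeff), m_horiz)
```

Because the basis rows are orthonormal, inverting the 2-D transform reduces to multiplications by transposes, which is what lets a transposed matrix such as DCT2' reuse DCT2 circuitry.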
It can be seen that, because a fast butterfly algorithm exists for the transform/inverse transform of the DCT2' matrix or the variant of the DCT2' matrix, the implementation of the transform/inverse transform can be simplified. Meanwhile, the DCT2' matrix or its variant and the DST4 matrix or its variant can directly reuse the transform/inverse-transform implementation circuit of the DCT2 matrix, so the circuit design of the transform/inverse-transform module can be simplified when the module is implemented in hardware.
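The butterfly fast algorithm mentioned here exploits the even/odd symmetry of the DCT2 rows: mirrored input samples are first combined by additions, and only the reduced halves are multiplied. A hedged 4-point illustration (real codecs use integer matrices and larger sizes; the function names are ours):

```python
import math

def dct2_4_direct(x):
    """Direct unnormalized 4-point DCT-2: X[k] = sum_n x[n]*cos(k*(2n+1)*pi/8)."""
    return [sum(x[n] * math.cos(k * (2 * n + 1) * math.pi / 8) for n in range(4))
            for k in range(4)]

def dct2_4_butterfly(x):
    """Even/odd (butterfly) decomposition of the same transform."""
    a, b = x[0] + x[3], x[1] + x[2]   # even part: sums of mirrored samples
    c, d = x[0] - x[3], x[1] - x[2]   # odd part: differences
    c1, c3 = math.cos(math.pi / 8), math.cos(3 * math.pi / 8)
    return [a + b,                            # X0: additions only
            c1 * c + c3 * d,                  # X1
            math.cos(math.pi / 4) * (a - b),  # X2
            c3 * c - c1 * d]                  # X3
```

The even-indexed outputs depend only on the mirrored sums and the odd-indexed outputs only on the differences, which halves the multiplication count relative to a full matrix-vector product.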
With reference to the first aspect, in a possible implementation, the variant of the DST4 matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DST4 matrix; or
the variant of the DCT2' matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DCT2' matrix.
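A sign transformation of selected rows (or columns) preserves orthogonality: each row keeps its norm and stays orthogonal to the others, which is why such a variant transforms a residual just as well as the original matrix. A small sketch under that interpretation (function names are illustrative):

```python
import math

def dst4(n):
    """Orthonormal n-point DST-4: M[i][j] = sqrt(2/n)*sin((2i+1)(2j+1)pi/(4n))."""
    s = math.sqrt(2.0 / n)
    return [[s * math.sin((2 * i + 1) * (2 * j + 1) * math.pi / (4 * n))
             for j in range(n)] for i in range(n)]

def sign_flip_rows(m, rows):
    """Variant obtained by sign-transforming the coefficients of selected rows."""
    return [[-v for v in row] if i in rows else list(row)
            for i, row in enumerate(m)]

def is_orthonormal(m, tol=1e-9):
    n = len(m)
    for i in range(n):
        for j in range(n):
            dot = sum(m[i][k] * m[j][k] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True
```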
With reference to the first aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is the DST4 matrix and the other is the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DCT2' matrix.
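The four candidate pairs listed above are simply the 2x2 cross product of the two preset matrices, so a pair can be addressed by one index or by two one-bit identifiers. A hypothetical illustration (the labels and flag encoding are assumptions for illustration, not the patent's syntax):

```python
# Hypothetical illustration of the four candidate pairs described above.
# "DST4" / "DCT2T" are labels; a codec would store the actual matrices.
PRESET = ("DST4", "DCT2T")  # DCT2T stands in for DCT2' (transpose of DCT2)

# (vertical, horizontal) for candidate pair indices 0..3
CANDIDATE_PAIRS = [
    ("DST4",  "DST4"),   # first pair
    ("DST4",  "DCT2T"),  # second pair
    ("DCT2T", "DST4"),   # third pair
    ("DCT2T", "DCT2T"),  # fourth pair
]

def pair_from_flags(vert_flag, horiz_flag):
    """Select the pair from two one-bit identifiers (0 -> DST4, 1 -> DCT2')."""
    return (PRESET[vert_flag], PRESET[horiz_flag])
```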
With reference to the first aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is a variant of the DST4 matrix and the other is a variant of the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix.
With reference to the first aspect, in a possible implementation, the indication information includes an identifier indicating the vertical-direction transform matrix in the transform matrix pair used for inverse transforming the current block, and an identifier indicating the horizontal-direction transform matrix in that pair.
With reference to the first aspect, in a possible implementation, before performing inverse transform processing on the inverse-quantized coefficients of the current block according to the transform matrix pair of the current block, the method further includes: deriving, from the DCT2 matrix according to a preset algorithm, the transform matrices included in the transform matrix pair used for inverse transforming the current block.
With reference to the first aspect, in a possible implementation, the transform matrix pair used for inverse transforming the current block includes a DST4 matrix; the size of the DCT2 matrix is 64; and deriving, from the DCT2 matrix according to the preset algorithm, the transform matrices included in the transform matrix pair includes: deriving the DST4 matrix from the DCT2 matrix according to the following formula:
[Formula image BDA0001817910130000031, not reproduced in this extraction: the derivation of the DST4 matrix from the DCT2 matrix.]
Here transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DST4 matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1; the offset 64−nTbS is the column offset; a second offset (formula image BDA0001817910130000032, not reproduced) is the row offset; and the factor (−1)^j performs the sign transformation.
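The exact derivation formula is in the image above and is not reproduced here. As a flavor of how a sine basis can be obtained from a cosine basis through index offsets and (−1)-style sign factors, the classical identity DST4[i][j] = (−1)^i · DCT4[i][n−1−j] can be checked numerically (unnormalized float bases; this identity is standard trigonometry, not the patent's formula):

```python
import math

def dct4_basis(n):
    """Unnormalized n-point DCT-4: cos((2i+1)(2j+1)pi/(4n))."""
    return [[math.cos((2 * i + 1) * (2 * j + 1) * math.pi / (4 * n))
             for j in range(n)] for i in range(n)]

def dst4_basis(n):
    """Unnormalized n-point DST-4: sin((2i+1)(2j+1)pi/(4n))."""
    return [[math.sin((2 * i + 1) * (2 * j + 1) * math.pi / (4 * n))
             for j in range(n)] for i in range(n)]

def dst4_from_dct4(n):
    """DST4[i][j] = (-1)**i * DCT4[i][n-1-j]:
    column reversal plus alternating row sign flips."""
    c = dct4_basis(n)
    return [[(-1) ** i * c[i][n - 1 - j] for j in range(n)] for i in range(n)]
```

Derivations of this kind are what allow a decoder to keep a single stored table and synthesize the other bases from it with index arithmetic and sign changes.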
With reference to the first aspect, in a possible implementation, the transform matrix pair used for inverse transforming the current block includes a DCT2' matrix; the size of the DCT2 matrix is 64; and deriving, from the DCT2 matrix according to the preset algorithm, the transform matrices included in the transform matrix pair includes: deriving the DCT2' matrix from the DCT2 matrix according to the following formula:
transMatrix[j][i × 2^(6−Log2(nTbS))];
where transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DCT2' matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1.
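The formula samples the stored 64-point DCT2 table with stride 2^(6−Log2(nTbS)) and swapped indices, i.e., a transpose. The sketch below shows the underlying embedding with a floating-point basis, assuming transMatrix[k][j] holds frequency row k and sample column j; the integer tables of a real codec may use a different layout or scaling:

```python
import math

def dct2_basis(n):
    """Unnormalized n-point DCT-2: M[k][j] = cos(k*(2j+1)*pi/(2n))."""
    return [[math.cos(k * (2 * j + 1) * math.pi / (2 * n)) for j in range(n)]
            for k in range(n)]

def dct2t_from_64(n_tbs):
    """Derive the n_tbs-point DCT2' (transposed DCT2) from the 64-point DCT2.

    Frequency rows of the 64-point basis at stride 2**(6 - log2(n_tbs)),
    restricted to the first n_tbs columns, reproduce the n_tbs-point DCT2
    exactly; transposing then yields DCT2'.
    """
    trans_matrix = dct2_basis(64)
    stride = 2 ** (6 - int(math.log2(n_tbs)))
    small = [[trans_matrix[k * stride][j] for j in range(n_tbs)]
             for k in range(n_tbs)]
    return [list(col) for col in zip(*small)]  # transpose
```

This embedding is why the text says the smaller matrices can reuse the DCT2 implementation circuit: they are literally sub-sampled views of the one 64-point table.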
In a second aspect, the present invention provides an encoding method, including:
determining indication information of a transform matrix pair for performing transform processing on a current residual block, where the transform matrix pair includes a horizontal-direction transform matrix and a vertical-direction transform matrix; the transform matrix pair is one of candidate transform matrix pairs, and the horizontal-direction and vertical-direction transform matrices included in each candidate pair are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix, where the DCT2' matrix is the transpose of the DCT2 matrix;
performing quantization processing on the transform coefficients obtained by transforming the current residual block with the transform matrix pair, to obtain quantized coefficients of the current residual block;
writing the indication information of the transform matrix pair into a bitstream; and
writing the quantized coefficients into the bitstream after entropy encoding.
The horizontal-direction transform matrix and the vertical-direction transform matrix included in any one candidate transform matrix pair may be the same or different.
The number of candidate transform matrix pairs may be 2, 3, or 4.
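The encoder-side counterpart of the decoding method is a separable forward transform followed by quantization. A hedged sketch (rate-distortion-based pair selection and entropy coding are omitted; the names and the orthonormal float basis are illustrative assumptions):

```python
import math

def dct2_ortho(n):
    """Orthonormal n-point DCT-2 (the decoder inverts it by transposition)."""
    rows = []
    for k in range(n):
        c = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([c * math.cos(k * (2 * j + 1) * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform_2d(residual, m_vert, m_horiz):
    """C = M_v * R * M_h^T: vertical matrix applied along columns,
    horizontal matrix along rows."""
    return mat_mul(mat_mul(m_vert, residual), transpose(m_horiz))

def quantize(coeff, step):
    """Uniform scalar quantization with rounding (a stand-in for the
    rate-distortion-tuned quantizers of real codecs)."""
    return [[int(round(v / step)) for v in row] for row in coeff]
```

Because the transform is orthonormal, the quantization error introduced here bounds the reconstruction error the decoder will see after the inverse transform.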
It can be seen that, because a fast butterfly algorithm exists for the transform/inverse transform of the DCT2' matrix or the variant of the DCT2' matrix, the implementation of the transform/inverse transform can be simplified. Meanwhile, the DCT2' matrix or its variant and the DST4 matrix or its variant can directly reuse the transform/inverse-transform implementation circuit of the DCT2 matrix, so the circuit design of the transform/inverse-transform module can be simplified when the module is implemented in hardware.
With reference to the second aspect, in a possible implementation, the variant of the DST4 matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DST4 matrix; or
the variant of the DCT2' matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DCT2' matrix.
With reference to the second aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is the DST4 matrix and the other is the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DCT2' matrix.
With reference to the second aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is a variant of the DST4 matrix and the other is a variant of the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix.
With reference to the second aspect, in a possible implementation, the method further includes: deriving the transform matrices included in the transform matrix pair from the DCT2 matrix according to a preset algorithm.
With reference to the second aspect, in a possible implementation, the transform matrix pair includes a DST4 matrix; the size of the DCT2 matrix is 64; and deriving the transform matrices included in the transform matrix pair from the DCT2 matrix according to the preset algorithm includes: deriving the DST4 matrix from the DCT2 matrix according to the following formula:
[Formula image BDA0001817910130000041, not reproduced in this extraction: the derivation of the DST4 matrix from the DCT2 matrix.]
Here transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DST4 matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1; the offset 64−nTbS is the column offset; a second offset (formula image BDA0001817910130000042, not reproduced) is the row offset; and the factor (−1)^j performs the sign transformation.
With reference to the second aspect, in a possible implementation, the transform matrix pair includes a DCT2' matrix; the size of the DCT2 matrix is 64; and deriving the transform matrices included in the transform matrix pair from the DCT2 matrix according to the preset algorithm includes: deriving the DCT2' matrix from the DCT2 matrix according to the following formula:
transMatrix[j][i × 2^(6−Log2(nTbS))];
where transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DCT2' matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1.
In a third aspect, the present invention provides a video decoder, comprising:
the entropy decoding unit is used for analyzing the received code stream to obtain indication information of a transformation matrix pair which is used for carrying out inverse transformation processing on a current block and a quantization coefficient of the current block, wherein the transformation matrix pair comprises a horizontal direction transformation matrix and a vertical direction transformation matrix;
an inverse quantization unit, configured to perform inverse quantization processing on the quantized coefficient of the current block to obtain an inverse quantized coefficient of the current block;
an inverse transform processing unit, configured to determine, from four candidate transform matrix pairs according to the indication information, a transform matrix pair for performing inverse transform processing on the current block, where the horizontal-direction and vertical-direction transform matrices included in each candidate pair are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix, where the DCT2' matrix is the transpose of the DCT2 matrix; and configured to perform inverse transform processing on the inverse-quantized coefficients of the current block according to the transform matrix pair, so as to obtain a reconstructed residual block of the current block;
a reconstruction unit, configured to obtain a reconstructed block of the current block based on the reconstructed residual block of the current block.
The horizontal-direction transform matrix and the vertical-direction transform matrix included in any one candidate transform matrix pair may be the same or different.
The number of candidate transform matrix pairs may be 2, 3, or 4.
With reference to the third aspect, in a possible implementation, the variant of the DST4 matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DST4 matrix; or
the variant of the DCT2' matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DCT2' matrix.
With reference to the third aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is the DST4 matrix and the other is the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DST4 matrix and a horizontal-direction transform matrix that is the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the DCT2' matrix and a horizontal-direction transform matrix that is the DCT2' matrix.
With reference to the third aspect, in a possible implementation, the number of candidate transform matrix pairs is four. When one of the two transform matrices is a variant of the DST4 matrix and the other is a variant of the DCT2' matrix, a first transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a second transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DST4 matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix;
a third transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DST4 matrix;
a fourth transform matrix pair of the four candidate pairs includes a vertical-direction transform matrix that is the variant of the DCT2' matrix and a horizontal-direction transform matrix that is the variant of the DCT2' matrix.
With reference to the third aspect, in a possible implementation, the indication information includes an identifier indicating the vertical-direction transform matrix in the transform matrix pair used for inverse transforming the current block, and an identifier indicating the horizontal-direction transform matrix in that pair.
With reference to the third aspect, in a possible implementation, the inverse transform processing unit is further configured to: derive, from the DCT2 matrix according to a preset algorithm, the transform matrices included in the transform matrix pair used for inverse transforming the current block.
With reference to the third aspect, in a possible implementation, the transform matrix pair used for inverse transforming the current block includes a DST4 matrix; the size of the DCT2 matrix is 64; and the inverse transform processing unit is specifically configured to derive the DST4 matrix from the DCT2 matrix according to the following formula:
[Formula image BDA0001817910130000061, not reproduced in this extraction: the derivation of the DST4 matrix from the DCT2 matrix.]
Here transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DST4 matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1; the offset 64−nTbS is the column offset; a second offset (formula image BDA0001817910130000062, not reproduced) is the row offset; and the factor (−1)^j performs the sign transformation.
With reference to the third aspect, in a possible implementation, the transform matrix pair used for inverse transforming the current block includes a DCT2' matrix; the size of the DCT2 matrix is 64; and the inverse transform processing unit is specifically configured to derive the DCT2' matrix from the DCT2 matrix according to the following formula:
transMatrix[j][i × 2^(6−Log2(nTbS))];
where transMatrix denotes the DCT2 matrix, nTbS denotes the size of the DCT2' matrix, 0 ≤ i ≤ nTbS−1, and 0 ≤ j ≤ nTbS−1.
In a fourth aspect, the present invention provides a video encoder comprising:
a transform processing unit, configured to determine indication information of a transform matrix pair for performing transform processing on a current residual block, where the transform matrix pair includes a horizontal-direction transform matrix and a vertical-direction transform matrix; the transform matrix pair is one of candidate transform matrix pairs, and the horizontal-direction and vertical-direction transform matrices included in each candidate pair are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix, where the DCT2' matrix is the transpose of the DCT2 matrix;
a quantization unit, configured to perform quantization processing on the transform coefficients obtained by transforming the current residual block with the transform matrix pair, to obtain quantized coefficients of the current residual block;
an entropy encoding unit, configured to perform entropy encoding processing on the quantized coefficient of the current residual block and the indication information;
and an output, configured to write the entropy-encoded indication information of the transform matrix pair and the entropy-encoded quantized coefficients of the current residual block into a bitstream.
The horizontal-direction transform matrix and the vertical-direction transform matrix included in any one candidate transform matrix pair may be the same or different.
The number of candidate transform matrix pairs may be 2, 3, or 4.
With reference to the fourth aspect, in a possible implementation, the variant of the DST4 matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DST4 matrix; or
the variant of the DCT2' matrix is obtained by performing a sign transformation on the coefficients of at least some rows or at least some columns of the DCT2' matrix.
With reference to the fourth aspect, in a possible implementation, the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a DST4 matrix and the other is a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a DST4 matrix and a horizontal direction transformation matrix that is a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a DST4 matrix and a horizontal direction transformation matrix that is a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a DCT2' matrix and a horizontal direction transformation matrix that is a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a DCT2' matrix and a horizontal direction transformation matrix that is a DCT2' matrix.
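The four combinations above can be indexed with two bits of indication information. A minimal illustrative sketch (the bit layout and the Python names are assumptions for illustration; the text only fixes the four combinations and their order):

```python
# Illustrative enumeration of the four candidate transform matrix pairs.
# The 2-bit index layout (bit 1 = vertical, bit 0 = horizontal) is an
# assumption; the text only fixes the four (vertical, horizontal) combinations.
CANDIDATES = ["DST4", "DCT2'"]

def candidate_pair(indication: int):
    """Map a 2-bit indication value to a (vertical, horizontal) matrix pair."""
    vertical = CANDIDATES[(indication >> 1) & 1]
    horizontal = CANDIDATES[indication & 1]
    return vertical, horizontal

pairs = [candidate_pair(k) for k in range(4)]
# pairs == [('DST4', 'DST4'), ('DST4', "DCT2'"),
#           ("DCT2'", 'DST4'), ("DCT2'", "DCT2'")]
```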
With reference to the fourth aspect, in a possible implementation, the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a modification of a DST4 matrix and the other is a modification of a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix.
With reference to the fourth aspect, in a possible implementation, the transformation processing unit is further configured to derive the transformation matrices included in the transformation matrix pair from the DCT2 matrix according to a preset algorithm.
With reference to the fourth aspect, in a possible implementation, the transformation matrix pair includes a DST4 matrix; the size of the DCT2 matrix is 64; and the transform processing unit is specifically configured to derive the DST4 matrix from the DCT2 matrix according to the following formula:

DST4[i][j] = transMatrix[i × 2^(6 - Log2(nTbS)) + 2^(5 - Log2(nTbS))][64 - nTbS + j] × (-1)^j

wherein transMatrix represents the DCT2 matrix, nTbS represents the size of the DST4 matrix, 0 ≤ i ≤ nTbS - 1, and 0 ≤ j ≤ nTbS - 1; the offset 64 - nTbS represents the offset of the columns; the offset 2^(5 - Log2(nTbS)) represents the offset of the rows; and (-1)^j indicates that a sign transformation is performed.
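As a floating-point sanity check of the index arithmetic described above (a row stride of 2^(6 - Log2(nTbS)) with a row offset of 2^(5 - Log2(nTbS)), a column offset of 64 - nTbS, and a (-1)^j sign transformation), the sketch below applies that mapping to the unnormalized 64-point DCT2 cosine basis. Under these assumptions it reproduces the textbook DST4 basis up to sign changes on alternating rows and columns, i.e. a "modification" of the DST4 matrix in the sense used above; the integer scaling of a real codec is not modeled.

```python
import math

def dct2_64(m, n):
    # Unnormalized 64-point DCT2 basis: transMatrix[m][n] = cos(m*(2n+1)*pi/128).
    # Real codecs store a scaled integer approximation of this matrix.
    return math.cos(m * (2 * n + 1) * math.pi / 128)

def dst4_like_from_dct2(nTbS):
    # transMatrix[i*2^(6-Log2(nTbS)) + 2^(5-Log2(nTbS))][64 - nTbS + j] * (-1)^j
    log2n = int(math.log2(nTbS))
    stride, offset = 1 << (6 - log2n), 1 << (5 - log2n)
    return [[dct2_64(i * stride + offset, 64 - nTbS + j) * (-1) ** j
             for j in range(nTbS)] for i in range(nTbS)]

def dst4_direct(nTbS):
    # Unnormalized DST4 basis: sin((2i+1)(2j+1)*pi/(4*nTbS)).
    return [[math.sin((2 * i + 1) * (2 * j + 1) * math.pi / (4 * nTbS))
             for j in range(nTbS)] for i in range(nTbS)]

for n in (4, 8, 16):
    a, b = dst4_like_from_dct2(n), dst4_direct(n)
    # Equal up to a (-1)^(i+j) checkerboard of signs, i.e. a sign
    # transformation on alternating rows and columns of the DST4 matrix.
    assert all(abs(a[i][j] - (-1) ** (i + j) * b[i][j]) < 1e-12
               for i in range(n) for j in range(n))
```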
With reference to the fourth aspect, in a possible implementation, the transformation matrix pair includes a DCT2' matrix; the size of the DCT2 matrix is 64; and the transform processing unit is specifically configured to derive the DCT2' matrix from the DCT2 matrix according to the following formula:

transMatrix[j][i × 2^(6 - Log2(nTbS))] × x[j];

wherein transMatrix represents the DCT2 matrix, nTbS represents the size of the DCT2' matrix, x[j] represents an element of the input to the transformation, 0 ≤ i ≤ nTbS - 1, and 0 ≤ j ≤ nTbS - 1.
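The stride i × 2^(6 - Log2(nTbS)) in the formula above reflects the embedding property of the DCT2 family: the rows of an nTbS-point DCT2 are subsampled rows of the 64-point matrix, so a transposed (DCT2') access only has to apply that stride to the second index. A floating-point sketch of the embedding property (the storage convention of the actual transMatrix array is an assumption here):

```python
import math

def dct2(size, m, n):
    # Unnormalized size-point DCT2 basis: cos(m*(2n+1)*pi/(2*size)).
    return math.cos(m * (2 * n + 1) * math.pi / (2 * size))

# Embedding property behind the stride 2^(6 - Log2(nTbS)): row m of the
# nTbS-point DCT2 equals row m*stride of the 64-point DCT2 (restricted to
# the first nTbS columns).
for nTbS in (4, 8, 16, 32):
    stride = 1 << (6 - int(math.log2(nTbS)))
    assert all(abs(dct2(nTbS, m, n) - dct2(64, m * stride, n)) < 1e-12
               for m in range(nTbS) for n in range(nTbS))
```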
In a fifth aspect, the invention is directed to an apparatus for decoding a video stream, comprising a processor and a memory. The memory stores instructions that cause the processor to perform a method according to the first aspect or any possible embodiment of the first aspect.
In a sixth aspect, the disclosure is directed to an apparatus for video encoding comprising a processor and a memory. The memory stores instructions that cause the processor to perform a method according to the second aspect or any possible embodiment of the second aspect.
In a seventh aspect, a computer-readable storage medium is presented having instructions stored thereon that, when executed, cause one or more processors to decode video data. The instructions cause the one or more processors to perform a method according to the first aspect or any possible embodiment of the first aspect.
In an eighth aspect, a computer-readable storage medium is presented having instructions stored thereon that, when executed, cause one or more processors to encode video data. The instructions cause the one or more processors to perform a method according to the second aspect or any possible embodiment of the second aspect.
In a ninth aspect, a video decoder is presented, comprising execution circuitry for executing the method as in the first aspect or any possible embodiment of the first aspect.
In a tenth aspect, a video encoder is proposed, comprising execution circuitry for executing the method as in the second aspect or any possible embodiment of the second aspect.
In an eleventh aspect, the invention relates to a computer program comprising program code for performing a method according to the first aspect or any of the possible embodiments of the first aspect when the program code is run on a computer.
In a twelfth aspect, the invention relates to a computer program comprising program code for performing a method according to the second aspect or any of the possible embodiments of the second aspect when run on a computer.
The details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present application, the drawings required to be used in the embodiments or the background art of the present application will be described below.
FIG. 1 is a block diagram of an example video encoding system for implementing an embodiment of the invention;
FIG. 2 is a block diagram showing an example structure of a video encoder for implementing an embodiment of the present invention;
FIG. 3 is a block diagram showing an example structure of a video decoder for implementing an embodiment of the present invention;
FIG. 4 is a block diagram of an example video coding system including the encoder 20 of FIG. 2 and the decoder 30 of FIG. 3;
FIG. 5 is a block diagram depicting another example encoding device or decoding device;
FIG. 6 is a schematic diagram showing a butterfly fast algorithm circuit implementation of the 16×16 DCT2 matrix in HEVC;
FIG. 7 is a schematic diagram showing a 32×32 inverse transform implementation circuit according to an embodiment;
FIG. 8 is a schematic diagram showing an implementation circuit according to an embodiment;
FIG. 9 is a schematic diagram illustrating an inverse transform architecture of an 8×8 DCT2 matrix according to an embodiment;
FIG. 10 is a flow chart showing a video decoding method according to an embodiment;
fig. 11 is a flowchart showing a video encoding method according to an embodiment.
In the following, identical reference signs refer to identical or at least functionally equivalent features unless specifically noted otherwise.
Detailed Description
In the following description, reference is made to the accompanying drawings which form a part hereof and in which is shown by way of illustration specific aspects of embodiments of the invention or in which embodiments of the invention may be practiced. It should be understood that embodiments of the invention may be used in other respects, and may include structural or logical changes not depicted in the drawings. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
For example, it should be understood that the disclosure in connection with the described methods may equally apply to the corresponding apparatus or system for performing the methods, and vice versa. For example, if one or more particular method steps are described, the corresponding apparatus may comprise one or more units, such as functional units, to perform the described one or more method steps (e.g., a unit performs one or more steps, or multiple units, each of which performs one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a particular apparatus is described based on one or more units, such as functional units, the corresponding method may comprise one step to perform the functionality of the one or more units (e.g., one step performs the functionality of the one or more units, or multiple steps, each of which performs the functionality of one or more of the plurality of units), even if such one or more steps are not explicitly described or illustrated in the figures. Further, it is to be understood that features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless explicitly stated otherwise.
Video coding generally refers to processing a sequence of pictures that form a video or video sequence. In the field of video coding, the terms "picture", "frame" and "image" may be used as synonyms. Video coding as used in this application (or this disclosure) refers to video encoding or video decoding. Video encoding is performed on the source side, and typically includes processing (e.g., compressing) the original video pictures to reduce the amount of data required to represent them (and thus store and/or transmit them more efficiently). Video decoding is performed on the destination side, and typically involves inverse processing relative to the encoder to reconstruct the video pictures. In the embodiments, "coding" of video pictures (or of pictures in general, as will be explained below) should be understood to refer to "encoding" or "decoding" of a video sequence. The combination of the encoding part and the decoding part is also referred to as a codec (encoding and decoding).
In the case of lossless video coding, the original video picture can be reconstructed, i.e., the reconstructed video picture has the same quality as the original video picture (assuming no transmission loss or other data loss during storage or transmission). In the case of lossy video coding, the amount of data needed to represent the video picture is reduced by performing further compression, e.g., by quantization, while the decoder side cannot fully reconstruct the video picture, i.e., the quality of the reconstructed video picture is lower or worse than the quality of the original video picture.
Video coding standards since H.261 belong to the group of "lossy hybrid video codecs" (i.e., they combine spatial and temporal prediction in the sample domain with 2D transform coding for applying quantization in the transform domain). Each picture of a video sequence is typically partitioned into a set of non-overlapping blocks and is typically coded at the block level. In other words, the encoder side typically processes, i.e., encodes, video at the block (video block) level, e.g., generates a prediction block by spatial (intra-picture) prediction and temporal (inter-picture) prediction, subtracts the prediction block from the current block (the currently processed or to-be-processed block) to obtain a residual block, and transforms the residual block and quantizes it in the transform domain to reduce the amount of data to be transmitted (compressed), while the decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation. In addition, the encoder replicates the decoder processing loop such that the encoder and decoder generate the same prediction (e.g., intra-prediction and inter-prediction) and/or reconstruction for processing, i.e., encoding, subsequent blocks.
As used herein, the term "block" may be a portion of a picture or frame. For ease of description, embodiments of the present invention are described with reference to Versatile Video Coding (VVC) or High-Efficiency Video Coding (HEVC), developed by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Those of ordinary skill in the art will understand that embodiments of the present invention are not limited to HEVC or VVC. A block may refer to a CU, a PU, or a TU. In HEVC, the CTU is split into CUs by using a quadtree structure represented as a coding tree. A decision is made at the CU level whether to encode a picture region using inter-picture (temporal) or intra-picture (spatial) prediction. Each CU may be further split into one, two, or four PUs according to the PU split type. The same prediction process is applied within one PU, and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU split type, the CU may be partitioned into transform units (TUs) according to other quadtree structures similar to the coding tree used for the CU. In recent developments of video compression techniques, coding blocks are partitioned using quadtree plus binary tree (QTBT) partitioning. In the QTBT block structure, a CU may be square or rectangular in shape. In VVC, a coding tree unit (CTU) is first divided by a quadtree structure, and the quadtree leaf nodes are further partitioned by a binary tree structure. The binary tree leaf nodes are called coding units (CUs), and these segments are used for prediction and transform processing without any further partitioning. This means that the block sizes of CU, PU, and TU are the same in the QTBT coding block structure.
Also, it has been proposed to use multiple partitions, such as ternary tree partitions, with QTBT block structures.
Embodiments of the encoder 20, decoder 30 and codec systems 10, 40 are described below based on fig. 1-4 (before embodiments of the invention are described in more detail based on fig. 10).
Fig. 1 is a conceptual or schematic block diagram depicting an exemplary encoding system 10, such as a video encoding system 10 that may utilize the techniques of the present application (this disclosure). Encoder 20 (e.g., video encoder 20) and decoder 30 (e.g., video decoder 30) of video encoding system 10 represent examples of devices that may be used to perform techniques for video encoding or video decoding methods according to various examples described in this application. As shown in fig. 1, encoding system 10 includes a source device 12 for providing encoded data 13, e.g., encoded pictures 13, to a destination device 14 that decodes encoded data 13, for example.
The source device 12 comprises an encoder 20 and may additionally, i.e. optionally, comprise a picture source 16, a pre-processing unit 18, e.g. a picture pre-processing unit 18, and a communication interface or unit 22.
The picture source 16 may include or may be any type of picture capture device for capturing real-world pictures, for example, and/or any type of picture or comment generation device (for screen content encoding, some text on the screen is also considered part of the picture or image to be encoded), for example, a computer graphics processor for generating computer animated pictures, or any type of device for obtaining and/or providing real-world pictures, computer animated pictures (e.g., screen content, Virtual Reality (VR) pictures), and/or any combination thereof (e.g., Augmented Reality (AR) pictures).
A (digital) picture is or can be seen as a two-dimensional array or matrix of sample points having intensity values. The sample points in the array may also be referred to as pixels (short form of picture elements) or pels. The number of sampling points of the array or picture in the horizontal and vertical directions (or axes) defines the size and/or resolution of the picture. To represent color, three color components are typically employed, i.e., a picture may be represented as or contain three sample arrays. In the RGB format or color space, a picture includes corresponding red, green, and blue sampling arrays. However, in video coding, each pixel is typically represented in a luminance/chrominance format or color space, e.g., YCbCr, comprising a luminance component (sometimes also indicated by L) indicated by Y and two chrominance components indicated by Cb and Cr. The luminance (luma) component Y represents the luminance or gray level intensity (e.g. both are the same in a gray scale picture), while the two chrominance (chroma) components Cb and Cr represent the chrominance or color information components. Accordingly, a picture in YCbCr format includes a luminance sample array of luminance sample values (Y), and two chrominance sample arrays of chrominance values (Cb and Cr). Pictures in RGB format may be converted or transformed into YCbCr format and vice versa, a process also known as color transformation or conversion. If the picture is black and white, the picture may include only an array of luminance samples.
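The color transformation mentioned above can be illustrated with one common full-range BT.601-style conversion (the exact coefficients depend on the color standard in use and are not fixed by this description):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601-style RGB -> YCbCr conversion for 8-bit samples.
    One common variant; actual coefficients depend on the color standard."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    cb = 128 + 0.564 * (b - y)              # Cb: blue-difference chroma
    cr = 128 + 0.713 * (r - y)              # Cr: red-difference chroma
    return y, cb, cr

# A neutral gray carries no color information: Cb = Cr = 128 at 8 bits.
print(rgb_to_ycbcr(128, 128, 128))
```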
Picture source 16 (e.g., video source 16) may be, for example, a camera for capturing pictures, a memory, such as a picture store, any type of (internal or external) interface that includes or stores previously captured or generated pictures, and/or obtains or receives pictures. The camera may be, for example, an integrated camera local or integrated in the source device, and the memory may be an integrated memory local or integrated in the source device, for example. The interface may be, for example, an external interface that receives pictures from an external video source, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device, for example, an external computer graphics processor, computer, or server. The interface may be any kind of interface according to any proprietary or standardized interface protocol, e.g. a wired or wireless interface, an optical interface. The interface for obtaining picture data 17 may be the same interface as communication interface 22 or part of communication interface 22.
In distinction to the pre-processing unit 18 and the processing performed by the pre-processing unit 18, the picture or picture data 17 (e.g., video data 16) may also be referred to as the raw picture or raw picture data 17.
Pre-processing unit 18 is configured to receive (raw) picture data 17 and perform pre-processing on picture data 17 to obtain a pre-processed picture 19 or pre-processed picture data 19. For example, the pre-processing performed by pre-processing unit 18 may include trimming, color format conversion (e.g., from RGB to YCbCr), toning, or denoising. It is to be understood that the pre-processing unit 18 may be an optional component.
Encoder 20, e.g., video encoder 20, is used to receive pre-processed picture data 19 and provide encoded picture data 21 (details will be described further below, e.g., based on fig. 2 or fig. 4).
Communication interface 22 of source device 12 may be used to receive encoded picture data 21 and transmit to other devices, e.g., destination device 14 or any other device for storage or direct reconstruction, or to process encoded picture data 21 prior to correspondingly storing encoded data 13 and/or transmitting encoded data 13 to other devices, e.g., destination device 14 or any other device for decoding or storage.
Destination device 14 includes a decoder 30 (e.g., a video decoder 30), and may additionally, that is, optionally, include a communication interface or unit 28, a post-processing unit 32, and a display device 34.
Communication interface 28 of destination device 14 is used, for example, to receive encoded picture data 21 or encoded data 13 directly from source device 12 or any other source, such as a storage device, such as an encoded picture data storage device.
Communication interface 22 and communication interface 28 may be used to transmit or receive encoded picture data 21 or encoded data 13 by way of a direct communication link between source device 12 and destination device 14, such as a direct wired or wireless connection, or by way of any type of network, such as a wired or wireless network or any combination thereof, or any type of private and public networks, or any combination thereof.
Communication interface 22 may, for example, be used to encapsulate encoded picture data 21 into a suitable format, such as a packet, for transmission over a communication link or communication network.
Communication interface 28, which forms a corresponding part of communication interface 22, may for example be used for decapsulating encoded data 13 to obtain encoded picture data 21.
Both communication interface 22 and communication interface 28 may be configured as a unidirectional communication interface, as indicated by the arrow from source device 12 to destination device 14 for encoded picture data 13 in fig. 1, or as a bidirectional communication interface, and may be used, for example, to send and receive messages to establish a connection, acknowledge and exchange any other information related to a communication link and/or a data transmission, for example, an encoded picture data transmission.
Decoder 30 is used to receive encoded picture data 21 and provide decoded picture data 31 or decoded picture 31 (details will be described further below, e.g., based on fig. 3 or fig. 5).
Post-processor 32 of destination device 14 is used to post-process decoded picture data 31 (also referred to as reconstructed picture data), e.g., decoded picture 31, to obtain post-processed picture data 33, e.g., post-processed picture 33. Post-processing performed by post-processing unit 32 may include, for example, color format conversion (e.g., from YCbCr to RGB), toning, cropping, or resampling, or any other processing for, for example, preparing decoded picture data 31 for display by display device 34.
Display device 34 of destination device 14 is used to receive post-processed picture data 33 to display a picture to, for example, a user or viewer. Display device 34 may be or may include any type of display for presenting the reconstructed picture, such as an integrated or external display or monitor. For example, the display may include a Liquid Crystal Display (LCD), an Organic Light Emitting Diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS), a Digital Light Processor (DLP), or any other display of any kind.
Although fig. 1 depicts source apparatus 12 and destination apparatus 14 as separate apparatuses, an apparatus embodiment may also include the functionality of both source apparatus 12 and destination apparatus 14 or both, i.e., source apparatus 12 or corresponding functionality and destination apparatus 14 or corresponding functionality. In such embodiments, source device 12 or corresponding functionality and destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, or using separate hardware and/or software, or any combination thereof.
It will be apparent to those skilled in the art from this description that the existence and (exact) division of the functionality of the different elements or source device 12 and/or destination device 14 shown in fig. 1 may vary depending on the actual device and application.
Encoder 20 (e.g., video encoder 20) and decoder 30 (e.g., video decoder 30) may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, Digital Signal Processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the techniques are implemented in part in software, an apparatus may store instructions of the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a corresponding device.
Source device 12 may be referred to as a video encoding device or a video encoding apparatus. Destination device 14 may be referred to as a video decoding device or a video decoding apparatus. Source device 12 and destination device 14 may be examples of video encoding devices or video encoding apparatus.
Source device 12 and destination device 14 may comprise any of a variety of devices, including any type of handheld or stationary device, such as a notebook or laptop computer, a mobile phone, a smart phone, a tablet or tablet computer, a camcorder, a desktop computer, a set-top box, a television, a display device, a digital media player, a video game console, a video streaming device (e.g., a content service server or a content distribution server), a broadcast receiver device, a broadcast transmitter device, etc., and may use no operating system or any type of operating system.
In some cases, source device 12 and destination device 14 may be equipped for wireless communication. Thus, source device 12 and destination device 14 may be wireless communication devices.
The video encoding system 10 shown in fig. 1 is merely an example, and the techniques of this application may be applicable to video encoding settings (e.g., video encoding or video decoding) that do not necessarily involve any data communication between the encoding and decoding devices. In other examples, the data may be retrieved from local storage, streamed over a network, and so on. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In some examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to and/or retrieve data from memory and decode data.
It should be understood that for each of the examples described above with reference to video encoder 20, video decoder 30 may be used to perform the reverse process. With respect to signaling syntax elements, video decoder 30 may be configured to receive and parse such syntax elements and decode the associated video data accordingly. In some examples, video encoder 20 may entropy encode one or more syntax elements defined … into an encoded video bitstream. In such instances, video decoder 30 may parse such syntax elements and decode the relevant video data accordingly.
Encoder and encoding method
Fig. 2 shows a schematic/conceptual block diagram of an example of a video encoder 20 for implementing the techniques of this application. In the example of fig. 2, video encoder 20 includes a residual calculation unit 204, a transform processing unit 206, a quantization unit 208, an inverse quantization unit 210, an inverse transform processing unit 212, a reconstruction unit 214, a buffer 216, a loop filter unit 220, a Decoded Picture Buffer (DPB) 230, a prediction processing unit 260, and an entropy encoding unit 270. Prediction processing unit 260 may include inter prediction unit 244, intra prediction unit 254, and mode selection unit 262. Inter prediction unit 244 may include a motion estimation unit and a motion compensation unit (not shown). The video encoder 20 shown in fig. 2 may also be referred to as a hybrid video encoder or a video encoder according to a hybrid video codec.
For example, the residual calculation unit 204, the transform processing unit 206, the quantization unit 208, the prediction processing unit 260, and the entropy encoding unit 270 form a forward signal path of the encoder 20, and, for example, the inverse quantization unit 210, the inverse transform processing unit 212, the reconstruction unit 214, the buffer 216, the loop filter 220, the Decoded Picture Buffer (DPB) 230, the prediction processing unit 260 form a backward signal path of the encoder, wherein the backward signal path of the encoder corresponds to a signal path of a decoder (see the decoder 30 in fig. 3).
Encoder 20 receives picture 201 or block 203 of picture 201, e.g., a picture in a sequence of pictures forming a video or video sequence, e.g., via input 202. Picture block 203 may also be referred to as a current picture block or a picture block to be encoded, and picture 201 may be referred to as a current picture or a picture to be encoded (especially when the current picture is distinguished from other pictures in video encoding, such as previously encoded and/or decoded pictures in the same video sequence, i.e., a video sequence that also includes the current picture).
Segmentation
An embodiment of encoder 20 may include a partitioning unit (not shown in fig. 2) for partitioning picture 201 into a plurality of blocks, such as block 203, typically into a plurality of non-overlapping blocks. The partitioning unit may be used to use the same block size for all pictures in a video sequence and a corresponding grid defining the block size, or to alter the block size between pictures or subsets or groups of pictures and partition each picture into corresponding blocks.
In one example, prediction processing unit 260 of video encoder 20 may be used to perform any combination of the above-described segmentation techniques.
Like picture 201, block 203 is also or can be viewed as a two-dimensional array or matrix of sample points having intensity values (sample values), although smaller in size than picture 201. In other words, the block 203 may comprise, for example, one sample array (e.g., a luma array in the case of a black and white picture 201) or three sample arrays (e.g., a luma array and two chroma arrays in the case of a color picture) or any other number and/or class of arrays depending on the color format applied. The number of sampling points in the horizontal and vertical directions (or axes) of the block 203 defines the size of the block 203.
The encoder 20 as shown in fig. 2 is used to encode a picture 201 block by block, e.g., performing encoding and prediction for each block 203.
Residual calculation
The residual calculation unit 204 is configured to calculate a residual block 205 based on the picture block 203 and the prediction block 265 (further details of the prediction block 265 are provided below), e.g. by subtracting sample values of the picture block 203 from sample values of the prediction block 265 on a sample-by-sample (pixel-by-pixel) basis to obtain the residual block 205 in the sample domain.
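The sample-by-sample subtraction described above can be sketched as follows (plain Python lists stand in for the picture-block data structures; the name `residual_block` is illustrative):

```python
def residual_block(current, prediction):
    """Residual block: sample-by-sample difference between the current
    picture block and its prediction block (2-D lists of equal size)."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, prediction)]

cur  = [[120, 121], [119, 118]]   # current picture block (sample values)
pred = [[118, 120], [119, 121]]   # prediction block
print(residual_block(cur, pred))  # prints: [[2, 1], [0, -3]]
```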
Transformation of
The transform processing unit 206 is configured to apply a transform, such as a Discrete Cosine Transform (DCT) or a Discrete Sine Transform (DST), on the sample values of the residual block 205 to obtain transform coefficients 207 in a transform domain. The transform coefficients 207 may also be referred to as transform residual coefficients and represent the residual block 205 in the transform domain.
The transform processing unit 206 may be used to apply integer approximations of DCT/DST, such as the transforms specified for HEVC/h.265. Such integer approximations are typically scaled by a certain factor compared to the orthogonal DCT transform. To preserve the norm of the residual block processed by the forward and inverse transforms, additional scaling factors are applied as part of the transform process. The scaling factors are typically selected based on certain constraints, e.g., the scaling factor being a power of 2 for shift operations, the bit depth of the transform coefficients, and the trade-off between accuracy and implementation cost. For example, a specific scaling factor may be specified for the inverse transform on the decoder 30 side by, for example, inverse transform processing unit 312 (and for the corresponding inverse transform on the encoder 20 side by, for example, inverse transform processing unit 212), and correspondingly, a corresponding scaling factor may be specified for the forward transform on the encoder 20 side by transform processing unit 206.
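For reference, the exact (floating-point) orthonormal DCT-II preserves the norm of the residual block through the forward and inverse transforms without any extra factor; it is the integer approximations used in HEVC/h.265 that require the additional scaling factors described above. The sketch below uses the textbook orthonormal DCT-II, not the HEVC integer matrices:

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II matrix: row i holds the i-th basis vector."""
    t = np.array([[np.cos(np.pi * i * (2 * j + 1) / (2 * n))
                   for j in range(n)] for i in range(n)])
    t *= np.sqrt(2.0 / n)
    t[0] *= np.sqrt(0.5)  # omega_0 normalization for the DC row
    return t

n = 4
T = dct2_matrix(n)
residual = np.array([[2., -1., 1., 1.],
                     [2., 1., 2., -2.],
                     [1., 1., -1., 2.],
                     [1., -1., -2., 4.]])

coeffs = T @ residual @ T.T   # forward 2-D transform (coefficients 207)
recovered = T.T @ coeffs @ T  # inverse 2-D transform recovers the residual
```

Because T is orthonormal (T @ T.T is the identity), the Frobenius norm of `coeffs` equals that of `residual`, which is exactly the property the integer approximations must restore via scaling.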
Quantization
Quantization unit 208 is used to quantize transform coefficients 207, e.g., by applying scalar quantization or vector quantization, to obtain quantized transform coefficients 209. Quantized transform coefficients 209 may also be referred to as quantized residual coefficients 209. The quantization process may reduce the bit depth associated with some or all of transform coefficients 207. For example, an n-bit transform coefficient may be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. The quantization level may be modified by adjusting a Quantization Parameter (QP). For example, for scalar quantization, different scales may be applied to achieve finer or coarser quantization. Smaller quantization steps correspond to finer quantization and larger quantization steps correspond to coarser quantization. An appropriate quantization step size may be indicated by a Quantization Parameter (QP). For example, the quantization parameter may be an index of a predefined set of suitable quantization step sizes. For example, a smaller quantization parameter may correspond to a fine quantization (smaller quantization step size) and a larger quantization parameter may correspond to a coarse quantization (larger quantization step size), or vice versa. The quantization may comprise a division by a quantization step size and a corresponding quantization or inverse quantization, e.g. performed by inverse quantization 210, or may comprise a multiplication by a quantization step size. Embodiments according to some standards, such as HEVC, may use a quantization parameter to determine the quantization step size. In general, the quantization step size may be calculated based on the quantization parameter using a fixed point approximation of an equation that includes division. 
Additional scaling factors may be introduced for quantization and dequantization to recover the norm of the residual block that may be modified due to the scale used in the fixed point approximation of the equation for the quantization step size and quantization parameter. In one example implementation, the inverse transform and inverse quantization scales may be combined. Alternatively, a custom quantization table may be used and signaled from the encoder to the decoder, e.g., in a bitstream. Quantization is a lossy operation, where the larger the quantization step size, the greater the loss.
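The QP-to-step-size relationship described above can be sketched with the HEVC-style convention that the quantization step size doubles every 6 QP values, i.e. Qstep ≈ 2^((QP−4)/6). The rounding offset `f` below is a free parameter chosen here only for illustration:

```python
import math

def quantize(coeff, qp, f=0.5):
    """Scalar quantization: divide by the step size and round."""
    qstep = 2.0 ** ((qp - 4) / 6.0)
    sign = 1 if coeff >= 0 else -1
    return sign * math.floor(abs(coeff) / qstep + f)

def dequantize(level, qp):
    """Inverse quantization: multiply the level by the step size."""
    qstep = 2.0 ** ((qp - 4) / 6.0)
    return level * qstep

# Larger QP -> larger step -> coarser quantization -> larger loss.
coeff = 100.0
err_fine = abs(dequantize(quantize(coeff, 22), 22) - coeff)
err_coarse = abs(dequantize(quantize(coeff, 37), 37) - coeff)
```

At QP 22 the step size is exactly 8, so the coefficient 100 is quantized to level 13 and reconstructed as 104; at QP 37 the step is about 45.25 and the reconstruction error grows, illustrating that quantization is a lossy operation whose loss increases with the step size.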
The inverse quantization unit 210 is configured to apply inverse quantization of the quantization unit 208 on the quantized coefficients to obtain inverse quantized coefficients 211, e.g., apply an inverse quantization scheme of the quantization scheme applied by the quantization unit 208 based on or using the same quantization step as the quantization unit 208. The dequantized coefficients 211 may also be referred to as dequantized residual coefficients 211, corresponding to transform coefficients 207, although the loss due to quantization is typically not the same as the transform coefficients.
The inverse transform processing unit 212 is configured to apply an inverse transform of the transform applied by the transform processing unit 206, for example, an inverse Discrete Cosine Transform (DCT) or an inverse Discrete Sine Transform (DST), to obtain an inverse transform block 213 in the sample domain. The inverse transform block 213 may also be referred to as an inverse transform dequantized block 213 or an inverse transform residual block 213.
The reconstruction unit 214 (e.g., summer 214) is used to add the inverse transform block 213 (i.e., the reconstructed residual block 213) to the prediction block 265 to obtain the reconstructed block 215 in the sample domain, e.g., to add sample values of the reconstructed residual block 213 to sample values of the prediction block 265.
Optionally, a buffer unit 216 (or simply "buffer" 216), such as a line buffer 216, is used to buffer or store the reconstructed block 215 and corresponding sample values, for example, for intra prediction. In other embodiments, the encoder may be used to use the unfiltered reconstructed block and/or corresponding sample values stored in buffer unit 216 for any class of estimation and/or prediction, such as intra prediction.
For example, an embodiment of encoder 20 may be configured such that buffer unit 216 is used not only to store reconstructed blocks 215 for intra prediction 254, but also for loop filter unit 220 (not shown in fig. 2), and/or such that buffer unit 216 and decoded picture buffer unit 230 form one buffer, for example. Other embodiments may be used to use filtered block 221 and/or blocks or samples from decoded picture buffer 230 (neither shown in fig. 2) as input or basis for intra prediction 254.
The loop filter unit 220 (or simply "loop filter" 220) is used to filter the reconstructed block 215 to obtain a filtered block 221, in order to facilitate pixel transitions or to improve video quality. Loop filter unit 220 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an Adaptive Loop Filter (ALF), or a sharpening or smoothing filter, or a collaborative filter. Although loop filter unit 220 is shown in fig. 2 as an in-loop filter, in other configurations, loop filter unit 220 may be implemented as a post-loop filter. The filtered block 221 may also be referred to as a filtered reconstructed block 221. The decoded picture buffer 230 may store the reconstructed encoded block after the loop filter unit 220 performs a filtering operation on the reconstructed encoded block.
Embodiments of encoder 20 (correspondingly, loop filter unit 220) may be configured to output loop filter parameters (e.g., sample adaptive offset information), e.g., directly or after entropy encoding by entropy encoding unit 270 or any other entropy encoding unit, e.g., such that decoder 30 may receive and apply the same loop filter parameters for decoding.
Decoded Picture Buffer (DPB) 230 may be a reference picture memory that stores reference picture data for use by video encoder 20 in encoding video data. DPB 230 may be formed from any of a variety of memory devices, such as Dynamic Random Access Memory (DRAM) including Synchronous DRAM (SDRAM), Magnetoresistive RAM (MRAM), Resistive RAM (RRAM), or other types of memory devices. The DPB 230 and the buffer 216 may be provided by the same memory device or separate memory devices. In a certain example, a Decoded Picture Buffer (DPB) 230 is used to store filtered blocks 221. Decoded picture buffer 230 may further be used to store other previous filtered blocks, such as previous reconstructed and filtered blocks 221, of the same current picture or of a different picture, such as a previous reconstructed picture, and may provide the complete previous reconstructed, i.e., decoded picture (and corresponding reference blocks and samples) and/or the partially reconstructed current picture (and corresponding reference blocks and samples), e.g., for inter prediction. In a certain example, if reconstructed block 215 is reconstructed without in-loop filtering, Decoded Picture Buffer (DPB) 230 is used to store reconstructed block 215.
Prediction processing unit 260, also referred to as block prediction processing unit 260, is used to receive or obtain block 203 (current block 203 of current picture 201) and reconstructed picture data, e.g., reference samples of the same (current) picture from buffer 216 and/or reference picture data 231 of one or more previously decoded pictures from decoded picture buffer 230, and to process such data for prediction, i.e., to provide prediction block 265, which may be inter-predicted block 245 or intra-predicted block 255.
The mode selection unit 262 may be used to select a prediction mode (e.g., intra or inter prediction mode) and/or a corresponding prediction block 245 or 255 used as the prediction block 265 to calculate the residual block 205 and reconstruct the reconstructed block 215.
Embodiments of mode selection unit 262 may be used to select a prediction mode (e.g., from those supported by prediction processing unit 260) that provides the best match or the smallest residual (smallest residual means better compression in transmission or storage), or that provides the smallest signaling overhead (smallest signaling overhead means better compression in transmission or storage), or both. The mode selection unit 262 may be configured to determine a prediction mode based on Rate Distortion Optimization (RDO), i.e., select the prediction mode that provides the minimum rate distortion, or select a prediction mode whose associated rate distortion at least meets the prediction mode selection criteria.
The prediction processing performed by the example of the encoder 20 (e.g., by the prediction processing unit 260) and the mode selection performed (e.g., by the mode selection unit 262) will be explained in detail below.
As described above, the encoder 20 is configured to determine or select the best or optimal prediction mode from a set of (predetermined) prediction modes. The prediction mode set may include, for example, intra prediction modes and/or inter prediction modes.
The intra prediction mode set may include 35 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in h.265, or may include 67 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in h.266 under development.
The set of (possible) inter prediction modes depends on the available reference pictures (i.e., at least partially decoded pictures stored in DPB 230, e.g., as described above) and other inter prediction parameters, e.g., on whether the entire reference picture or only a part of it, such as a search window around the current block, is used to search for the best matching reference block, and/or, e.g., on whether pixel interpolation, such as half-pixel and/or quarter-pixel interpolation, is applied.
In addition to the above prediction mode, a skip mode and/or a direct mode may also be applied.
The prediction processing unit 260 may further be configured to partition the block 203 into smaller block partitions or sub-blocks, for example, by iteratively using quad-tree (QT) partitioning, binary-tree (BT) partitioning, or ternary-tree (TT) partitioning, or any combination thereof, and to perform prediction for each of the block partitions or sub-blocks, for example, wherein mode selection includes selecting a tree structure of the partitioned block 203 and selecting a prediction mode to apply to each of the block partitions or sub-blocks.
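The iterative quad-tree (QT) partitioning mentioned above can be sketched as a recursion that splits a block into four quadrants until a minimum size is reached. The split criterion used here ("always split while larger than min_size") is a placeholder for the encoder's real rate-distortion-based decision:

```python
def quadtree_partition(x, y, size, min_size):
    """Return the leaf blocks (x, y, size) of a full quad-tree split."""
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # top and bottom halves
        for dx in (0, half):      # left and right halves
            leaves.extend(quadtree_partition(x + dx, y + dy, half, min_size))
    return leaves

# A 64x64 block split down to 16x16 leaves yields 16 sub-blocks.
leaves = quadtree_partition(0, 0, 64, 16)
```

Binary-tree (BT) and ternary-tree (TT) splits follow the same recursive pattern but produce two or three rectangular children per split instead of four square ones.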
The inter prediction unit 244 may include a Motion Estimation (ME) unit (not shown in fig. 2) and a Motion Compensation (MC) unit (not shown in fig. 2). The motion estimation unit is used to receive or obtain picture block 203 (current picture block 203 of current picture 201) and decoded picture 231, or at least one or more previously reconstructed blocks, e.g., reconstructed blocks of one or more other/different previously decoded pictures 231, for motion estimation. For example, the video sequence may comprise a current picture and a previously decoded picture 231, or in other words, the current picture and the previously decoded picture 231 may be part of, or form, a sequence of pictures forming the video sequence.
For example, the encoder 20 may be configured to select a reference block from a plurality of reference blocks of the same or different one of a plurality of other pictures and provide the reference picture (or reference picture index) to a motion estimation unit (not shown in fig. 2) and/or provide an offset (spatial offset) between the position (X, Y coordinates) of the reference block and the position of the current block as an inter prediction parameter. This offset is also called a Motion Vector (MV).
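A minimal full-search motion estimation sketch, assuming grayscale NumPy arrays and a small integer-pixel search window; real encoders use fast search strategies and the sub-pixel interpolation described below, and the function and variable names here are illustrative only:

```python
import numpy as np

def full_search(cur_block, ref_pic, x, y, search_range):
    """Find the motion vector (dx, dy) minimizing SAD within the window."""
    h, w = cur_block.shape
    best_mv, best_sad = None, float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or ry + h > ref_pic.shape[0] or rx + w > ref_pic.shape[1]:
                continue  # candidate block falls outside the reference picture
            cand = ref_pic[ry:ry + h, rx:rx + w]
            sad = np.abs(cur_block.astype(np.int64) - cand).sum()
            if sad < best_sad:
                best_mv, best_sad = (dx, dy), sad
    return best_mv, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64))
# Current 8x8 block is reference content displaced by (+3, -2);
# the search should recover exactly that motion vector with SAD 0.
cur = ref[10 - 2:10 - 2 + 8, 20 + 3:20 + 3 + 8]
mv, sad = full_search(cur, ref, 20, 10, 4)
```

The recovered offset `mv` is the spatial offset between the reference block position and the current block position, i.e., the motion vector described above.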
The motion compensation unit is used to obtain, e.g., receive, inter-prediction parameters and perform inter-prediction based on or using the inter-prediction parameters to obtain the inter-prediction block 245. The motion compensation performed by the motion compensation unit (not shown in fig. 2) may involve taking or generating a prediction block based on a motion/block vector determined by motion estimation (possibly performing interpolation to sub-pixel precision). Interpolation filtering may generate additional pixel samples from known pixel samples, potentially increasing the number of candidate prediction blocks that may be used to encode a picture block. Upon receiving the motion vector for the PU of the current picture block, motion compensation unit 246 may locate the prediction block in one reference picture list to which the motion vector points. Motion compensation unit 246 may also generate syntax elements associated with the blocks and video slices for use by video decoder 30 in decoding picture blocks of the video slices.
The intra prediction unit 254 is used to obtain, e.g., receive, the picture block 203 (current picture block) of the same picture and one or more previously reconstructed blocks, e.g., reconstructed neighboring blocks, for intra estimation. For example, the encoder 20 may be configured to select an intra-prediction mode from a plurality of (predetermined) intra-prediction modes.
Embodiments of encoder 20 may be used to select an intra prediction mode based on optimization criteria, such as based on a minimum residual (e.g., an intra prediction mode that provides a prediction block 255 that is most similar to current picture block 203) or a minimum code rate distortion.
The intra-prediction unit 254 is further configured to determine the intra-prediction block 255 based on the intra-prediction parameters as the selected intra-prediction mode. In any case, after selecting the intra-prediction mode for the block, intra-prediction unit 254 is also used to provide intra-prediction parameters, i.e., information indicating the selected intra-prediction mode for the block, to entropy encoding unit 270. In one example, intra-prediction unit 254 may be used to perform any combination of the intra-prediction techniques described below.
Entropy encoding unit 270 is configured to apply an entropy encoding algorithm or scheme (e.g., a Variable Length Coding (VLC) scheme, a Context Adaptive VLC (CAVLC) scheme, an arithmetic coding scheme, a Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or other entropy encoding methods or techniques) to individual or all of quantized residual coefficients 209, inter-prediction parameters, intra-prediction parameters, and/or loop filter parameters to obtain encoded picture data 21 that may be output by output 272, e.g., in the form of encoded bitstream 21. The encoded bitstream may be transmitted to video decoder 30, or archived for later transmission or retrieval by video decoder 30. Entropy encoding unit 270 may also be used to entropy encode other syntax elements of the current video slice being encoded.
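Alongside arithmetic coding such as CABAC, many syntax elements in HEVC/h.265-style bitstreams are coded as unsigned Exp-Golomb codewords (ue(v)), a simple variable-length code. A minimal encoder/decoder sketch, using strings of '0'/'1' characters in place of a real bit writer:

```python
def ue_encode(value):
    """Unsigned Exp-Golomb ue(v): the codeword for v is bin(v+1),
    prefixed by (len-1) zeros, e.g. 0 -> '1', 3 -> '00100'."""
    code = bin(value + 1)[2:]  # binary of v+1 without the '0b' prefix
    return "0" * (len(code) - 1) + code

def ue_decode(bits):
    """Decode one ue(v) codeword from the front; return (value, rest)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    value = int(bits[zeros:2 * zeros + 1], 2) - 1
    return value, bits[2 * zeros + 1:]

# Concatenated codewords for the values 0, 3, 7.
stream = "".join(ue_encode(v) for v in (0, 3, 7))
```

Small values get short codewords, which suits syntax elements whose distribution is concentrated near zero.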
Other structural variations of video encoder 20 may be used to encode the video stream. For example, the non-transform based encoder 20 may quantize the residual signal directly without the transform processing unit 206 for certain blocks or frames. In another embodiment, encoder 20 may have quantization unit 208 and inverse quantization unit 210 combined into a single unit.
Fig. 3 illustrates an exemplary video decoder 30 for implementing the techniques of the present application. Video decoder 30 is to receive encoded picture data (e.g., an encoded bitstream) 21, e.g., encoded by encoder 20, to obtain a decoded picture 231. During the decoding process, video decoder 30 receives video data, such as an encoded video bitstream representing picture blocks of an encoded video slice and associated syntax elements, from video encoder 20.
In the example of fig. 3, decoder 30 includes entropy decoding unit 304, inverse quantization unit 310, inverse transform processing unit 312, reconstruction unit 314 (e.g., summer 314), buffer 316, loop filter 320, decoded picture buffer 330, and prediction processing unit 360. The prediction processing unit 360 may include an inter prediction unit 344, an intra prediction unit 354, and a mode selection unit 362. In some examples, video decoder 30 may perform a decoding pass that is substantially reciprocal to the encoding pass described with reference to video encoder 20 of fig. 2.
Entropy decoding unit 304 is to perform entropy decoding on encoded picture data 21 to obtain, for example, quantized coefficients 309 and/or decoded encoding parameters (not shown in fig. 3), e.g., any or all of inter-prediction, intra-prediction parameters, loop filter parameters, and/or other syntax elements (decoded). The entropy decoding unit 304 is further for forwarding the inter-prediction parameters, the intra-prediction parameters, and/or other syntax elements to the prediction processing unit 360. Video decoder 30 may receive syntax elements at the video slice level and/or the video block level.
Inverse quantization unit 310 may be functionally identical to inverse quantization unit 210, inverse transform processing unit 312 may be functionally identical to inverse transform processing unit 212, reconstruction unit 314 may be functionally identical to reconstruction unit 214, buffer 316 may be functionally identical to buffer 216, loop filter 320 may be functionally identical to loop filter 220, and decoded picture buffer 330 may be functionally identical to decoded picture buffer 230.
Prediction processing unit 360 may include inter prediction unit 344 and intra prediction unit 354, where inter prediction unit 344 may be functionally similar to inter prediction unit 244 and intra prediction unit 354 may be functionally similar to intra prediction unit 254. The prediction processing unit 360 is typically used to perform block prediction and/or to obtain a prediction block 365 from the encoded data 21, as well as to receive or obtain (explicitly or implicitly) prediction related parameters and/or information about the selected prediction mode from, for example, the entropy decoding unit 304.
When the video slice is encoded as an intra-coded (I) slice, intra-prediction unit 354 of prediction processing unit 360 is used to generate a prediction block 365 for the picture block of the current video slice based on the signaled intra-prediction mode and data from previously decoded blocks of the current frame or picture. When a video frame is encoded as an inter-coded (i.e., B or P) slice, inter prediction unit 344 (e.g., a motion compensation unit) of prediction processing unit 360 is used to generate a prediction block 365 for the video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding unit 304. For inter prediction, a prediction block may be generated from one reference picture within one reference picture list. Video decoder 30 may construct the reference frame lists, list 0 and list 1, using default construction techniques based on the reference pictures stored in DPB 330.
Prediction processing unit 360 is used to determine prediction information for the video blocks of the current video slice by parsing the motion vectors and other syntax elements, and to generate a prediction block for the current video block being decoded using the prediction information. For example, prediction processing unit 360 uses some of the syntax elements received to determine a prediction mode (e.g., intra or inter prediction) for encoding video blocks of a video slice, an inter prediction slice type (e.g., B-slice, P-slice, or GPB-slice), construction information for one or more of a reference picture list of the slice, a motion vector for each inter-coded video block of the slice, an inter prediction state for each inter-coded video block of the slice, and other information to decode video blocks of the current video slice.
Inverse quantization unit 310 may be used to inverse quantize (i.e., inverse quantize) the quantized transform coefficients provided in the bitstream and decoded by entropy decoding unit 304. The inverse quantization process may include using quantization parameters calculated by video encoder 20 for each video block in the video slice to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied.
Inverse transform processing unit 312 is used to apply an inverse transform (e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients in order to produce a block of residuals in the pixel domain.
The reconstruction unit 314 (e.g., summer 314) is used to add the inverse transform block 313 (i.e., reconstructed residual block 313) to the prediction block 365 to obtain the reconstructed block 315 in the sample domain, e.g., by adding sample values of the reconstructed residual block 313 to sample values of the prediction block 365.
Loop filter unit 320 (operating either in the decoding loop or after it) is used to filter reconstructed block 315 to obtain filtered block 321, in order to facilitate pixel transitions or improve video quality. In one example, loop filter unit 320 may be used to perform any combination of the filtering techniques described below. Loop filter unit 320 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an Adaptive Loop Filter (ALF), or a sharpening or smoothing filter, or a collaborative filter. Although loop filter unit 320 is shown in fig. 3 as an in-loop filter, in other configurations, loop filter unit 320 may be implemented as a post-loop filter.
Decoded video block 321 in a given frame or picture is then stored in decoded picture buffer 330, which stores reference pictures for subsequent motion compensation.
Decoder 30 is used to output decoded picture 31, e.g., via output 332, for presentation to or viewing by a user.
Other variations of video decoder 30 may be used to decode the compressed bitstream. For example, decoder 30 may generate an output video stream without loop filter unit 320. For example, the non-transform based decoder 30 may directly inverse quantize the residual signal without the inverse transform processing unit 312 for certain blocks or frames. In another embodiment, video decoder 30 may have inverse quantization unit 310 and inverse transform processing unit 312 combined into a single unit.
Fig. 4 is an illustration of an example of a video encoding system 40 including encoder 20 of fig. 2 and/or decoder 30 of fig. 3, according to an example embodiment. System 40 may implement a combination of the various techniques of the present application. In the illustrated embodiment, video encoding system 40 may include an imaging device 41, video encoder 20, video decoder 30 (and/or a video encoder implemented by logic 47 of processing unit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.
As shown, the imaging device 41, the antenna 42, the processing unit 46, the logic circuit 47, the video encoder 20, the video decoder 30, the processor 43, the memory 44, and/or the display device 45 are capable of communicating with each other. As discussed, although video encoding system 40 is depicted with video encoder 20 and video decoder 30, in different examples, video encoding system 40 may include only video encoder 20 or only video decoder 30.
In some examples, as shown, video encoding system 40 may include an antenna 42. For example, the antenna 42 may be used to transmit or receive an encoded bitstream of video data. Additionally, in some examples, video encoding system 40 may include a display device 45. Display device 45 may be used to present video data. In some examples, logic 47 may be implemented by processing unit 46, as shown. The processing unit 46 may comprise application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. Video coding system 40 may also include an optional processor 43, which optional processor 43 similarly may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. In some examples, the logic 47 may be implemented in hardware, such as video encoding specific hardware, and the processor 43 may be implemented in general purpose software, an operating system, and so on. In addition, the memory 44 may be any type of memory, such as a volatile memory (e.g., Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), etc.), a nonvolatile memory (e.g., flash memory, etc.), and the like. In a non-limiting example, memory 44 may be implemented by a cache memory. In some instances, logic circuitry 47 may access memory 44 (e.g., to implement an image buffer). In other examples, logic 47 and/or processing unit 46 may include memory (e.g., cache, etc.) for implementing image buffers, etc.
In some examples, video encoder 20 implemented by logic circuitry may include an image buffer (e.g., implemented by processing unit 46 or memory 44) and a graphics processing unit (e.g., implemented by processing unit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include video encoder 20 implemented by logic circuitry 47 to implement the various modules discussed with reference to fig. 2 and/or any other encoder system or subsystem described herein. Logic circuitry may be used to perform various operations discussed herein.
Video decoder 30 may be implemented in a similar manner by logic circuitry 47 to implement the various modules discussed with reference to decoder 30 of fig. 3 and/or any other decoder system or subsystem described herein. In some examples, logic circuit implemented video decoder 30 may include an image buffer (implemented by processing unit 46 or memory 44) and a graphics processing unit (e.g., implemented by processing unit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include video decoder 30 implemented by logic circuitry 47 to implement the various modules discussed with reference to fig. 3 and/or any other decoder system or subsystem described herein.
In some examples, antenna 42 of video encoding system 40 may be used to receive an encoded bitstream of video data. As discussed, the encoded bitstream may include data related to the encoded video frame, indicators, index values, mode selection data, etc., discussed herein, such as data related to the encoding partition (e.g., transform coefficients or quantized transform coefficients, (as discussed) optional indicators, and/or data defining the encoding partition). Video encoding system 40 may also include a video decoder 30 coupled to antenna 42 and configured to decode the encoded bitstream. The display device 45 is used to present video frames.
Fig. 5 is a simplified block diagram of an apparatus 500 that may be used as either or both of source device 12 and destination device 14 in fig. 1, according to an example embodiment. Apparatus 500 may implement the techniques of this application, and apparatus 500 may take the form of a computing system including multiple computing devices, or a single computing device such as a mobile phone, tablet computer, laptop computer, notebook computer, desktop computer, or the like.
The processor 502 in the apparatus 500 may be a central processor. Alternatively, processor 502 may be any other type of device or devices now or later developed that is capable of manipulating or processing information. As shown, although the disclosed embodiments may be practiced using a single processor, such as processor 502, speed and efficiency advantages may be realized using more than one processor.
In one embodiment, the Memory 504 of the apparatus 500 may be a Read Only Memory (ROM) device or a Random Access Memory (RAM) device. Any other suitable type of storage device may be used for memory 504. The memory 504 may include code and data 506 that is accessed by the processor 502 using a bus 512. The memory 504 may further include an operating system 508 and application programs 510, the application programs 510 including at least one program that permits the processor 502 to perform the methods described herein. For example, applications 510 may include applications 1 through N, applications 1 through N further including video coding applications that perform the methods described herein. The apparatus 500 may also include additional memory in the form of a slave memory 514, the slave memory 514 may be, for example, a memory card for use with a mobile computing device. Because a video communication session may contain a large amount of information, this information may be stored in whole or in part in the slave memory 514 and loaded into the memory 504 for processing as needed.
Device 500 may also include one or more output apparatuses, such as a display 518. In one example, display 518 may be a touch-sensitive display that combines a display and a touch-sensitive element operable to sense touch inputs. A display 518 may be coupled to the processor 502 via the bus 512. Other output devices that permit a user to program apparatus 500 or otherwise use apparatus 500 may be provided in addition to display 518, or other output devices may be provided as an alternative to display 518. When the output device is or includes a display, the display may be implemented in different ways, including by a Liquid Crystal Display (LCD), a Cathode Ray Tube (CRT) display, a plasma display, or a Light Emitting Diode (LED) display, such as an Organic LED (OLED) display.
The apparatus 500 may also include or be in communication with an image sensing device 520, the image sensing device 520 being, for example, a camera or any other image sensing device 520 now or later developed that can sense an image, such as an image of a user running the apparatus 500. The image sensing device 520 may be placed directly facing the user running the apparatus 500. In an example, the position and optical axis of image sensing device 520 may be configured such that its field of view includes an area proximate display 518 and display 518 is visible from that area.
The apparatus 500 may also include or be in communication with a sound sensing device 522, such as a microphone or any other sound sensing device now known or later developed that can sense sound in the vicinity of the apparatus 500. The sound sensing device 522 may be positioned to face directly the user operating the apparatus 500 and may be used to receive sounds, such as speech or other utterances, emitted by the user while operating the apparatus 500.
Although the processor 502 and the memory 504 of the apparatus 500 are depicted in fig. 5 as being integrated in a single unit, other configurations may also be used. The operations of the processor 502 may be distributed among multiple machines that can be directly coupled (each machine having one or more processors), or distributed in a local area network or another network. The memory 504 may be distributed among multiple machines, such as a network-based memory or memory in multiple machines running the apparatus 500. Although only a single bus is depicted here, the bus 512 of the apparatus 500 may be formed from multiple buses. Further, the secondary memory 514 may be directly coupled to the other components of the apparatus 500 or may be accessible over a network, and may comprise a single integrated unit, such as one memory card, or multiple units, such as multiple memory cards. Accordingly, the apparatus 500 may be implemented in a wide variety of configurations.
In draft 2.0 of the versatile video coding (VVC) standard, two new transform kernels, DCT8 (discrete cosine transform 8) and DST7 (discrete sine transform 7), are introduced in addition to the conventional DCT2 (discrete cosine transform 2) transform kernel. As shown in Table 1, the basis functions corresponding to these transform kernels exhibit different distribution characteristics.
TABLE 1 DCT and DST transform basis functions

Transform type: DCT2; basis function T_i(j), i, j = 0, 1, ..., N-1:
T_i(j) = w0 * sqrt(2/N) * cos( pi * i * (2j+1) / (2N) ), where w0 = sqrt(1/2) for i = 0 and w0 = 1 otherwise

Transform type: DCT8; basis function:
T_i(j) = sqrt(4/(2N+1)) * cos( pi * (2i+1) * (2j+1) / (4N+2) )

Transform type: DST7; basis function:
T_i(j) = sqrt(4/(2N+1)) * sin( pi * (2i+1) * (j+1) / (2N+1) )
For the prediction residual, because different prediction modes produce residuals with different characteristics, a multiple transform selection (MTS) technique can make full use of the characteristics of different transform matrices to better adapt to the residual characteristics, thereby improving coding compression performance.
After the transformation kernels are determined, corresponding transformation matrices, such as DCT2 matrices, DST7 matrices, DCT8 matrices, and the like, may be obtained according to the basis functions corresponding to the transformation kernels.
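As an illustration of how transform matrices follow from basis functions, the sketch below (with hypothetical helper names) builds the DCT2, DST7, and DCT8 bases of Table 1 and checks that each is orthonormal, so the inverse transform is simply the transpose. The fixed-point scaling shown is plain rounding; the integer matrices actually specified in the standards are further hand-tuned, so they can differ from this rounding by one.

```python
import math

def dct2_basis(N):
    # Table 1 DCT2 basis: T_i(j) = w0 * sqrt(2/N) * cos(pi*i*(2j+1)/(2N)),
    # with w0 = sqrt(1/2) for i = 0 and w0 = 1 otherwise.
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * i * (2 * j + 1) / (2 * N))
             for j in range(N)] for i in range(N)]

def dst7_basis(N):
    # Table 1 DST7 basis: T_i(j) = sqrt(4/(2N+1)) * sin(pi*(2i+1)*(j+1)/(2N+1))
    return [[math.sqrt(4.0 / (2 * N + 1)) *
             math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * N + 1))
             for j in range(N)] for i in range(N)]

def dct8_basis(N):
    # Table 1 DCT8 basis: T_i(j) = sqrt(4/(2N+1)) * cos(pi*(2i+1)*(2j+1)/(4N+2))
    return [[math.sqrt(4.0 / (2 * N + 1)) *
             math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * N + 2))
             for j in range(N)] for i in range(N)]

def orthonormality_error(T):
    # Largest entry of |T * T' - I|; near zero means the rows are orthonormal,
    # so the inverse transform is simply the transpose.
    N = len(T)
    return max(abs(sum(T[a][k] * T[b][k] for k in range(N)) - (1.0 if a == b else 0.0))
               for a in range(N) for b in range(N))

def integer_matrix(T, scale=64):
    # Naive fixed-point approximation with the DC coefficient scaled to 64.
    N = len(T)
    return [[round(scale * math.sqrt(N) * T[i][j]) for j in range(N)] for i in range(N)]
```
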
The MTS scheme in the VVC draft standard test model VTM2.0 (algorithm description JVET-K1002) is shown in Table 2:
TABLE 2 MTS scheme

(Four candidate transform matrix pairs, indexed 0 to 3, formed by assigning the A matrix or the B matrix to the horizontal and vertical directions.)
A and B in the table denote transform matrices: A = DST7 matrix, B = DCT8 matrix. The transform sizes (transform matrix sizes) include 4x4, 8x8, 16x16, and 32x32. The horizontal-direction and vertical-direction transform matrices can be combined into 4 transform matrix pairs, each corresponding to a different index. The index is written into the code stream to tell the decoder which transform matrix pair to use.
Taking the MTS scheme shown in Table 2 as an example, when the encoding side performs transform processing, it traverses the 4 transform matrix pairs in Table 2, transforms the prediction residual block horizontally and vertically with each pair, selects the pair with the minimum rate-distortion cost, and writes the index of that pair into the code stream. After the corresponding transform matrix pair is determined (assume it corresponds to index 1), the prediction residual block R is transformed (i.e., matrix-multiplied) by the transform matrices A and B to obtain a transform coefficient block F:
F = B * R * A'
The coefficient block F is then entropy coded and written into the code stream.
When performing inverse transform processing, the decoding side determines the transform matrix pair to use from the index obtained by decoding, and applies that pair to the decoded coefficient block in the vertical and horizontal directions to obtain the prediction residual block (reconstructed residual block). Specifically, the transform coefficient block F obtained by decoding is inverse transformed (i.e., matrix-multiplied) by the A matrix and the B matrix to obtain the residual block R:
R = B' * F * A
where A' denotes the transpose of the A matrix and B' denotes the transpose of the B matrix; because the A and B matrices are orthogonal, the transpose is equal to the inverse.
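The forward/inverse relationship above can be checked numerically. In the sketch below (a float illustration with hypothetical helper names; real codecs use fixed-point integer matrices), B is a DCT2 basis, A is a DST4 basis, and an arbitrary residual block survives the round trip F = B·R·A', R = B'·F·A because both matrices are orthogonal:

```python
import math
import random

def dct2_basis(N):
    # Table 1 DCT2 basis (orthonormal)
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * i * (2 * j + 1) / (2 * N))
             for j in range(N)] for i in range(N)]

def dst4_basis(N):
    # Table 4 DST4 basis (orthonormal)
    return [[math.sqrt(2.0 / N) *
             math.sin(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * N))
             for j in range(N)] for i in range(N)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def forward_transform(R, A, B):
    # F = B * R * A': B acts on the columns (vertical direction),
    # A' acts on the rows (horizontal direction)
    return matmul(matmul(B, R), transpose(A))

def inverse_transform(F, A, B):
    # R = B' * F * A: valid because A and B are orthogonal,
    # so the transpose equals the inverse
    return matmul(matmul(transpose(B), F), A)
```
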
When matrix multiplication is implemented in a circuit, a partial-butterfly fast algorithm is generally adopted, which exploits the symmetry of the matrix coefficients to reduce the number of required multiplications. However, the DST7 and DCT8 transforms have no partial-butterfly fast algorithm of the kind available for the DCT2 transform; therefore, only direct matrix multiplication can be used, and the computational complexity (e.g., the number of multiplications) is high. In addition, the coefficients of the DST7 and DCT8 transform matrices need to be stored; considering the different sizes, the number of coefficients to be stored is as high as 2 × (4x4 + 8x8 + 16x16 + 32x32) = 2720.
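The coefficient count quoted above is simple arithmetic: one DST7 matrix and one DCT8 matrix at each of the four sizes.

```python
# One DST7 matrix and one DCT8 matrix must be stored at each size
# 4x4, 8x8, 16x16 and 32x32, hence the factor of 2.
sizes = [4, 8, 16, 32]
coeff_count = 2 * sum(n * n for n in sizes)
print(coeff_count)  # 2720
```
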
It should be noted that, in addition to the transform matrices A and B, the VVC draft also uses the DCT2 matrix as a transform matrix, with sizes ranging from 4x4 to 128x128.
In another implementation, a simplified scheme of MTS is proposed, as shown in table 3:
TABLE 3 alternative MTS scheme
(Four candidate transform matrix pairs, indexed 0 to 3, formed by assigning the C matrix or the D matrix to the horizontal and vertical directions.)
where C = DST4 matrix and D = DCT4 matrix; that is, the DST7 and DCT8 matrices of the scheme in Table 2 are replaced with the DST4 and DCT4 matrices. DST4 and DCT4 have similar transform kernel basis function characteristics; the specific transform basis functions are shown in Table 4.
TABLE 4 DCT4 and DST4 transform basis functions

Transform type: DCT4; basis function T_i(j), i, j = 0, 1, ..., N-1:
T_i(j) = sqrt(2/N) * cos( pi * (2i+1) * (2j+1) / (4N) )

Transform type: DST4; basis function:
T_i(j) = sqrt(2/N) * sin( pi * (2i+1) * (2j+1) / (4N) )
After determining the transformation kernels to be DST4 and DCT4, the corresponding transformation matrices, that is, the DST4 matrix and the DCT4 matrix, may be obtained according to the basis functions corresponding to the transformation kernels.
Based on the basis functions of tables 1 and 4, the following transformation matrix examples can be obtained:
TABLE 5 8x8 DCT2 transform matrix example

64  64  64  64  64  64  64  64
89  75  50  18 -18 -50 -75 -89
83  36 -36 -83 -83 -36  36  83
75 -18 -89 -50  50  89  18 -75
64 -64 -64  64  64 -64 -64  64
50 -89  18  75 -75 -18  89 -50
36 -83  83 -36 -36  83 -83  36
18 -50  75 -89  89 -75  50 -18
TABLE 6 4x4 DCT2 transform matrix example

64  64  64  64
83  36 -36 -83
64 -64 -64  64
36 -83  83 -36
TABLE 7 4x4 DCT4 transform matrix example

89  75  50  18
75 -18 -89 -50
50 -89  18  75
18 -50  75 -89
TABLE 8 4x4 DST4 transform matrix example

18  50  75  89
50  89  18 -75
75  18 -89  50
89 -75  50 -18
As can be seen from the above transform matrices, the coefficients of the 8x8 DCT2 matrix include all the coefficients of the 4x4 DCT2 matrix (the rows of Table 6 appear in the odd-numbered rows, left half, of Table 5) and of the 4x4 DCT4 matrix (the rows of Table 7 appear in the even-numbered rows, left half, of Table 5). As can further be seen from Tables 7 and 8, the DST4 matrix can be obtained from the DCT4 matrix by mirroring (FLIP) and sign transformation.
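The FLIP-and-sign relationship can be stated precisely with the Table 4 basis functions: since sin(x) = cos((2i+1)·pi/2 - x) on each row, DST4_i(j) = (-1)^i · DCT4_i(N-1-j). A float sketch (illustrative helper names) verifying this identity:

```python
import math

def dct4_basis(N):
    # Table 4 DCT4 basis: T_i(j) = sqrt(2/N) * cos(pi*(2i+1)*(2j+1)/(4N))
    return [[math.sqrt(2.0 / N) *
             math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * N))
             for j in range(N)] for i in range(N)]

def dst4_basis(N):
    # Table 4 DST4 basis: T_i(j) = sqrt(2/N) * sin(pi*(2i+1)*(2j+1)/(4N))
    return [[math.sqrt(2.0 / N) *
             math.sin(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * N))
             for j in range(N)] for i in range(N)]

def dst4_from_dct4(N):
    # Mirror (FLIP) every DCT4 row left-to-right, then invert the sign of the
    # odd-indexed rows: DST4_i(j) = (-1)^i * DCT4_i(N-1-j)
    C = dct4_basis(N)
    return [[(-1 if i % 2 else 1) * C[i][N - 1 - j] for j in range(N)]
            for i in range(N)]
```
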
To compare the MTS algorithm described in Table 3 with the MTS algorithm described in Table 2, the Table 3 algorithm was performance-tested on the VVC reference software VTM-2.0 platform with the Table 2 algorithm as the anchor; the test data are shown in Tables 9 and 10.
TABLE 9 Data of the Table 3 MTS algorithm under AI test conditions
TABLE 10 Data of the Table 3 MTS algorithm under RA test conditions
The numerical values in the tables represent the percentage increase in coded bits at the same video image quality. Class X (A1, A2, B, C, or E) denotes a class of test video sequences; Y and U/V denote the luminance and chrominance components of the video image, respectively; and EncT and DecT denote the encoding and decoding times, respectively. The test condition AI denotes All Intra, and the test condition RA denotes Random Access.
Fig. 6 illustrates a butterfly fast algorithm circuit implementation of the 16x16 DCT2 matrix in HEVC. As can be seen from fig. 6, the butterfly fast algorithm circuit of the 16x16 DCT2 matrix includes the implementation circuits of the 4x4 DCT2 matrix, the 8x8 DCT2 matrix, the 4x4 DCT4 matrix, and the 8x8 DCT4 matrix; that is, the circuit implementation of the 16x16 DCT2 matrix can be directly reused when implementing the transforms of these four matrices. However, only the 4x4 DCT2 matrix and the 8x8 DCT2 matrix can reuse the butterfly fast algorithm circuit; the 4x4 DCT4 matrix and the 8x8 DCT4 matrix can reuse the implementation circuit of the 16x16 DCT2 matrix, but without using the butterfly fast algorithm.
In order to further reduce the implementation complexity of the MTS and reduce the performance loss, an implementation scheme of the MTS provided by an embodiment of the present invention is shown in table 11.
TABLE 11 an MTS implementation
index 0: vertical-direction matrix C, horizontal-direction matrix C
index 1: vertical-direction matrix C, horizontal-direction matrix E
index 2: vertical-direction matrix E, horizontal-direction matrix C
index 3: vertical-direction matrix E, horizontal-direction matrix E
Here, C = DST4 matrix and E = DCT2' matrix, where the DCT2' matrix is the transpose of the DCT2 matrix and the symbol ' denotes transposition. In fact, the transpose of the DCT2 matrix is identical to the DCT3 matrix.
One implementation of the 4x4 DST4 matrix is shown in Table 8. An implementation example of the 4x4 DCT2' matrix is shown in Table 12.
TABLE 12 4x4 DCT2' transform matrix example

64  83  64  36
64  36 -64 -83
64 -36 -64  83
64 -83  64 -36
Comparing Table 11 with Table 3, it can be seen that Table 11 replaces the DCT4 matrix in Table 3 with the DCT2' matrix. Because a butterfly fast algorithm exists for the transform/inverse transform of the DCT2' matrix, the transform/inverse transform implementation can be further simplified. Meanwhile, the circuit implementation can still reuse the transform/inverse transform implementation circuit corresponding to the DCT2 matrix. As for the DST4 matrix, as shown above, its implementation can reuse the 2Nx2N DCT2 matrix transform/inverse transform implementation circuit through operations such as FLIP and sign transformation.
In order to verify the effect of the MTS scheme of table 11, the inventors performed performance tests on the VVC reference software VTM-2.0.1 platform for the technical scheme of table 11, and the codec compression performance relative to the MTS scheme of table 2 is shown in tables 13 and 14.
TABLE 13
TABLE 14
As can be seen from Tables 13 and 14, compared to the MTS scheme described in Table 2, the MTS scheme described in Table 11 increases the average coding bit rate only very slightly (for Y, the coded bits increase by 0.01% under the AI test condition and by 0.02% under the RA condition); that is, the effect on coding compression performance is almost negligible. However, because the scheme uses the butterfly fast algorithm, the transform/inverse transform implementation can be simplified: as the data in the tables show, the encoding time is reduced by 2% to 4%, and the decoding time is also reduced to some extent. Meanwhile, the coefficients of the transform matrices used can be easily derived from the 2Nx2N DCT2 matrix without additional storage space; moreover, the transform matrix implementation circuits used can reuse the transform/inverse transform implementation circuit corresponding to the 2Nx2N DCT2 matrix, which can simplify the design of the codec implementation circuit.
The transform matrices in the embodiment of the present invention can reuse the transform/inverse transform implementation circuit corresponding to the 2Nx2N DCT2 matrix; the circuit reuse is described in detail below. Consider, for example, the partial-butterfly implementation of the inverse transform circuit disclosed in "Core Transform Design in the High Efficiency Video Coding (HEVC) Standard". The inverse DCT2 matrix transform can be implemented by decomposition into three modules, EVEN, ODD, and ADDSUB, where EVEN denotes the column transform using the matrix composed of the odd-numbered row coefficients of the DCT2 matrix, ODD denotes the column transform using the matrix composed of the even-numbered row coefficients of the DCT2 matrix, and ADDSUB denotes the addition and subtraction module.
For example, fig. 7 depicts a 32x32 inverse transform implementation circuit, in which an Even4 module, an Odd4 module, and an Addsub4 module constitute the inverse transform implementation circuit 701 of the 4x4 matrix; the inverse transform implementation circuit 701 of the 4x4 matrix, an Odd8 module, and an Addsub8 module constitute the inverse transform implementation circuit 702 of the 8x8 matrix; the inverse transform implementation circuit 702 of the 8x8 matrix, an Odd16 module, and an Addsub16 module constitute the inverse transform implementation circuit 703 of the 16x16 matrix; and the inverse transform implementation circuit 703 of the 16x16 matrix, an Odd32 module, and an Addsub32 module constitute the inverse transform implementation circuit 704 of the 32x32 matrix.
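The EVEN/ODD/ADDSUB decomposition described above can be sketched in a few lines (a float illustration, not the fixed-point circuit): the even-frequency inputs feed one half-size transform, the odd-frequency inputs feed another, and the add/subtract stage reconstructs the two symmetric halves of the output using the DCT2 row symmetry T[n][2N-1-m] = (-1)^n · T[n][m]:

```python
import math

def dct2_basis(N):
    # Orthonormal DCT2 basis (rows indexed by frequency n, columns by sample m)
    return [[(math.sqrt(0.5) if n == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * n * (2 * m + 1) / (2 * N))
             for m in range(N)] for n in range(N)]

def inverse_direct(y, T):
    # Plain inverse transform: x[m] = sum_n T[n][m] * y[n] (transpose of T)
    N = len(T)
    return [sum(T[n][m] * y[n] for n in range(N)) for m in range(N)]

def inverse_partial_butterfly(y, T):
    # EVEN: half-size transform fed by the even-frequency inputs;
    # ODD: half-size transform fed by the odd-frequency inputs;
    # ADDSUB: combines them using the symmetry T[n][2N-1-m] = (-1)^n * T[n][m].
    N2 = len(T)          # full transform size 2N
    N = N2 // 2
    e = [sum(T[2 * r][m] * y[2 * r] for r in range(N)) for m in range(N)]
    o = [sum(T[2 * r + 1][m] * y[2 * r + 1] for r in range(N)) for m in range(N)]
    x = [0.0] * N2
    for m in range(N):
        x[m] = e[m] + o[m]            # ADDSUB: sum half
        x[N2 - 1 - m] = e[m] - o[m]   # ADDSUB: difference half
    return x
```
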
Fig. 8 illustrates an implementation circuit, and as shown in fig. 8, the matrix multiplication circuits of the Even4 module and the Odd4 module may be shared by the transform circuit and the inverse transform circuit.
FIG. 9 depicts the inverse transform structure of the 8x8 DCT2 matrix, where d_n denotes the coefficient in row n, column 0 of the 32x32 DCT2 matrix. FIG. 9 specifically shows the internal structure of the EVEN module and the ODD module that implement the matrix multiplication. The EVEN8 and ODD8 matrices may be obtained from the 2Nx2N DCT2 matrix.
Specifically, the 8x8 DCT2 matrix is shown in Table 15:
TABLE 15 8x8 DCT2 matrix

64  64  64  64  64  64  64  64
89  75  50  18 -18 -50 -75 -89
83  36 -36 -83 -83 -36  36  83
75 -18 -89 -50  50  89  18 -75
64 -64 -64  64  64 -64 -64  64
50 -89  18  75 -75 -18  89 -50
36 -83  83 -36 -36  83 -83  36
18 -50  75 -89  89 -75  50 -18
The 4x4EVEN8 transformation matrix can be obtained by the following steps:
The coefficients of the odd-numbered rows (the boxed coefficients in Table 15) are extracted from the left half of the 8x8 DCT2 matrix shown in Table 15, resulting in the 4x4 matrix shown in Table 16:
TABLE 16 4x4 matrix

64  64  64  64
83  36 -36 -83
64 -64 -64  64
36 -83  83 -36
Then, the matrix of Table 16 is transposed to obtain the 4x4 transform matrix EVEN8 shown in Table 17. The EVEN8 matrix is in fact a DCT2' matrix.
TABLE 17 4x4 EVEN8 transform matrix

64  83  64  36
64  36 -64 -83
64 -36 -64  83
64 -83  64 -36
The 4x4ODD8 transformation matrix can be obtained by the following steps:
The coefficients of the even-numbered rows (the underlined coefficients in Table 15) are extracted from the right half of the 8x8 DCT2 matrix shown in Table 15, forming the 4x4 matrix shown in Table 18:
TABLE 18 4x4 matrix

-18 -50 -75 -89
 50  89  18 -75
-75 -18  89 -50
 89 -75  50 -18
Then, transposition and sign transformation are applied to the matrix of Table 18 to obtain the 4x4 transform matrix ODD8 shown in Table 19. In fact, the ODD8 matrix is a modified DST4 matrix: it can be obtained from the DST4 matrix by inverting the signs of the coefficients in the odd-numbered columns.
TABLE 19 4x4 ODD8 transform matrix

18 -50  75 -89
50 -89  18  75
75 -18 -89 -50
89  75  50  18
As can be seen from the above description, an NxN transform matrix can be derived from the 2Nx2N DCT2 matrix. Therefore, only the coefficients of one 64x64 DCT2 matrix need to be stored, and the matrix coefficients for 32x32, 16x16, 8x8, 4x4, and 2x2 can be obtained by derivation; no additional storage space is required for them.
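The size relationship can be illustrated with the float basis functions (hypothetical helper names; the integer matrices in the standard are related by plain subsampling, without the sqrt(2) rescale needed for the orthonormal floats): the even-indexed rows of the 2Nx2N DCT2 matrix, restricted to their first N columns, reproduce the NxN DCT2 matrix.

```python
import math

def dct2_basis(N):
    # Orthonormal DCT2 basis (rows indexed by frequency, columns by sample)
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * i * (2 * j + 1) / (2 * N))
             for j in range(N)] for i in range(N)]

def dct2_from_double_size(N):
    # Keep rows 0, 2, 4, ... of the 2Nx2N DCT2 matrix and their first N
    # columns; sqrt(2) compensates for the different orthonormal scaling.
    T2 = dct2_basis(2 * N)
    return [[math.sqrt(2.0) * T2[2 * i][j] for j in range(N)] for i in range(N)]
```
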
As can be seen from comparing Table 12 and Table 17, the two are the same; that is, the 4x4 DCT2' transform matrix can be derived directly from the 8x8 DCT2 matrix. The implementation circuit of the 4x4 DCT2' transform matrix is therefore already contained in the 2Nx2N DCT2 transform/inverse transform implementation circuit, so the implementation of the 4x4 DCT2' transform matrix can directly reuse the 2Nx2N DCT2 implementation circuit.
As can be seen from comparing Table 8 and Table 19, the transform matrix shown in Table 19 can be obtained simply by sign transformation (inverting the signs of the odd-numbered columns) of the matrix described in Table 8; that is, the 4x4 DST4 matrix can be derived directly from the 8x8 DCT2 matrix. The implementation circuit of the 4x4 DST4 matrix is therefore already contained in the 2Nx2N DCT2 matrix transform/inverse transform implementation circuit, so the implementation of the 4x4 DST4 matrix can directly reuse the 2Nx2N DCT2 implementation circuit.
In order to further reduce the implementation complexity of the MTS and reduce the performance loss, an implementation scheme of the MTS provided by an embodiment of the present invention is shown in table 20.
TABLE 20 an MTS implementation
(Four candidate transform matrix pairs, indexed 0 to 3, formed by assigning a variant of the C matrix or a variant of the E matrix to the horizontal and vertical directions.)
where C = DST4 matrix and E = DCT2' matrix, and a variant may be a sign transformation (e.g., inversion) applied to rows or columns of the matrix.
For different circuit implementations of the 2Nx2N DCT2 matrix, the variants required of the C matrix and the E matrix for circuit reuse differ. For example, in the MTS scheme described in Table 3, the DST4 matrix reuses an implementation circuit of the 2Nx2N DCT2 matrix through FLIP and sign transformation (becoming a DCT4 matrix); in the MTS scheme described in Table 11, the implementation circuit of the 2Nx2N DCT2 matrix actually contains the ODD8 matrix, and the DST4 matrix can be changed into the ODD8 matrix by sign transformation alone to achieve reuse. Allowing various variants of the C matrix and the E matrix better adapts to different 2Nx2N DCT2 matrix implementation circuits, thereby simplifying circuit reuse.
In addition, certain variants of the C matrix and the E matrix can be directly derived from the 2Nx2N DCT2 matrix, thereby simplifying the derivation process of the coefficients of the C matrix and the E matrix.
As an example, a variant of the C matrix may be obtained by inverting the signs of the odd-numbered columns of the C matrix, resulting in Table 21 below:
TABLE 21 Example of variant matrix coefficients of the C matrix

18 -50  75 -89
50 -89  18  75
75 -18 -89 -50
89  75  50  18
As can be seen from comparing Table 21 with Table 19, the two are the same, and the variant can be derived directly from the coefficients of the 8x8 DCT2 matrix without any additional operation, which further simplifies the derivation of the variant of C. Meanwhile, because the variant of C can be derived directly from the 2Nx2N DCT2 matrix, it can adapt to different 2Nx2N DCT2 matrix circuits to achieve simplified circuit reuse while keeping the impact on coding compression performance small.
To verify the effect of the MTS scheme of Table 20, the inventors performance-tested the scheme of Table 20 on the VVC reference software VTM-2.0.1 platform; the coding compression performance relative to the MTS scheme of Table 2 is shown in Tables 22 and 23.
TABLE 22
TABLE 23
As can be seen from Tables 22 and 23, compared to the MTS scheme described in Table 2, the MTS scheme described in Table 20 increases the average coding bit rate only very slightly (for Y, the coded bits increase by 0.06% under the AI test condition and by 0.08% under the RA condition); that is, the effect on coding compression performance is almost negligible, while the scheme uses the butterfly fast algorithm and can simplify the transform/inverse transform implementation. Meanwhile, the matrix coefficients of the transform matrices used can be derived directly from the 2Nx2N DCT2 matrix without additional storage space; moreover, the transform matrix implementation circuits used can directly reuse the transform/inverse transform implementation circuit corresponding to the 2Nx2N DCT2 matrix, which can simplify the design of the codec implementation circuit.
Another embodiment of the present invention provides an implementation of MTS as shown in table 24.
TABLE 24 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix, the E matrix, or their variants in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix, and a variant may be a sign transformation (e.g., inversion) applied to rows or columns of the matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 25.
TABLE 25 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix, the E matrix, or their variants in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix, and a variant may be a sign transformation (e.g., inversion) applied to rows or columns of the matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 26.
TABLE 26 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Table 27 shows an implementation of an MTS according to another embodiment of the present invention.
TABLE 27 MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 28.
TABLE 28 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 29.
TABLE 29 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 30.
TABLE 30 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Table 31 shows an implementation of an MTS according to another embodiment of the present invention.
TABLE 31 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 32.
TABLE 32 MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
Another embodiment of the present invention provides an implementation of MTS as shown in table 33.
TABLE 33 an MTS implementation
(A set of candidate transform matrix pairs formed from the C matrix and the E matrix in the horizontal and vertical directions.)

where C = DST4 matrix and E = DCT2' matrix.
From the above, in one embodiment, at least one of the DST4 matrix, the DCT2' matrix, a variant of the DST4 matrix, or a variant of the DCT2' matrix described above may be obtained from the 8x8 DCT2 matrix. Since the encoder or decoder may already store the 8x8 DCT2 matrix, obtaining these matrices from the 8x8 DCT2 matrix reduces the number of transform matrices that the encoder or decoder needs to store, and thus reduces the storage space that the transform matrices occupy.
In another embodiment, at least one of the DST4 matrix, the DCT2' matrix, a variant of the DST4 matrix, or a variant of the DCT2' matrix described above may also be obtained directly from the 64x64 DCT2 matrix. Since the encoder or decoder may already store the 64x64 DCT2 matrix, obtaining these matrices from the 64x64 DCT2 matrix likewise reduces the number of transform matrices that need to be stored, and thus reduces the storage space that the transform matrices occupy.
In one embodiment, the 64x64 DCT2 matrix can be represented by Tables 34 and 35 (the 64x64 DCT2 matrix is relatively large and is therefore represented by two tables, where Table 34 represents columns 0-15 of the matrix (as transMatrixCol0to15) and Table 35 represents columns 16-31 (as transMatrixCol16to31)).
TABLE 34 Columns 0-15 of the 64x64 DCT2 matrix
TABLE 35 Columns 16-31 of the 64x64 DCT2 matrix
The full 64x64 DCT2 matrix transMatrix can be obtained from Tables 34 and 35 by the following operations:
transMatrix[ m ][ n ] = transMatrixCol0to15[ m ][ n ], with m = 0..15, n = 0..63
transMatrix[ m ][ n ] = transMatrixCol16to31[ m - 16 ][ n ], with m = 16..31, n = 0..63
transMatrix[ m ][ n ] = ( n & 1 ? -1 : 1 ) * transMatrixCol16to31[ 47 - m ][ n ], with m = 32..47, n = 0..63
transMatrix[ m ][ n ] = ( n & 1 ? -1 : 1 ) * transMatrixCol0to15[ 63 - m ][ n ], with m = 48..63, n = 0..63
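The two mirror rules above encode the column symmetry of the DCT2 matrix: in the transMatrix[m][n] convention (m = column index, n = row index), transMatrix[63 - m][n] = (-1)^n * transMatrix[m][n], so only the left half of the columns needs to be stored. A float sketch of this symmetry (illustrative helper names):

```python
import math

def dct2_col_row(N):
    # M[m][n] = DCT2 coefficient at column m (sample index) and row n
    # (frequency index), matching the transMatrix[m][n] indexing above.
    return [[(math.sqrt(0.5) if n == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * n * (2 * m + 1) / (2 * N))
             for n in range(N)] for m in range(N)]

def max_mirror_error(M):
    # Checks M[N-1-m][n] == (-1)^n * M[m][n] for all m, n
    N = len(M)
    return max(abs(M[N - 1 - m][n] - ((-1) ** n) * M[m][n])
               for m in range(N) for n in range(N))
```
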
In one embodiment, the transform kernel is indicated by trType, for example indicating whether the transform kernel is the DST4 matrix (or a variant of the DST4 matrix) or the DCT2' matrix (or a variant of the DCT2' matrix). For example, when trType is 1, the transform matrix is the DST4 matrix, and when trType is 2, the transform matrix is the DCT2' matrix; of course, the mapping may also be reversed, i.e., when trType is 2 the transform matrix is the DST4 matrix and when trType is 1 it is the DCT2' matrix. It is understood that trType may take other values to indicate the DST4 matrix and the DCT2' matrix. The embodiment of the present invention does not limit the correspondence between trType and the transform matrix; as long as the value of trType corresponds one-to-one with the transform matrix, the implementation of the embodiment is not affected.
For example, when the transform matrix is a DST4 matrix, the DST4 matrix can be derived from the 64x64 DCT2 matrix by the following formula (1):

transMatrix[ 64 - nTbS + i ][ (2j + 1) * 2^(5 - Log2(nTbS)) ] * (-1)^j (1)

where transMatrix denotes the DCT2 matrix (the 64x64 DCT2 matrix), nTbS denotes the size of the transform matrix, 0 <= i <= nTbS - 1, and 0 <= j <= nTbS - 1; the offset 64 - nTbS is the offset of the column, i.e., the last nTbS columns of the 64x64 matrix are used; the index (2j + 1) * 2^(5 - Log2(nTbS)) is the offset of the row; and (-1)^j indicates that a sign transformation is performed.
Wherein, in the embodiment of the present invention, i represents a column coordinate of a coefficient in the transform matrix, and j represents a row coordinate of the coefficient in the transform matrix.
For example, when nTbS is 4, i.e., the size of the DST4 matrix is 4x4, the 4x4 DST4 matrix derived from formula (1) is:

18  50  75  89
50  89  18 -75
75  18 -89  50
89 -75  50 -18
For example, when nTbS is 8, i.e., the size of the DST4 matrix is 8x8, the 8x8 DST4 matrix derived from formula (1) is:

 9  25  43  57  70  80  87  90
25  70  90  80  43  -9 -57 -87
43  90  57 -25 -87 -70   9  80
57  80 -25 -90  -9  87  43 -70
70  43 -87  -9  90 -25 -80  57
80  -9 -70  87 -25 -57  90 -43
87 -57   9  43 -80  90 -70  25
90 -87  80 -70  57 -43  25  -9
when the size of the DST4 matrix is 16 or 32, the derivation can also be performed by using equation (1), and details are not described.
For example, when the transform matrix is a DCT2 'matrix, a DCT 2' matrix can be derived from a 64 × 64DCT2 matrix by the following equation (2):
transMatrix[j][i×26-Log2(nTbs)](2)
wherein, the transmrix represents the DCT2 matrix (64 multiplied by 64DCT2 matrix), nTbs represents the size of the transformation matrix, i is more than or equal to 0 and less than or equal to nTbS-1, and j is more than or equal to 0 and less than or equal to nTbS-1.
For example, when nTbS is 4, i.e., the size of the DCT2' matrix is 4x4, the 4x4 DCT2' matrix derived from formula (2) is:

64  83  64  36
64  36 -64 -83
64 -36 -64  83
64 -83  64 -36
For example, when nTbS is 8, i.e., the size of the DCT2' matrix is 8x8, the 8x8 DCT2' matrix derived from formula (2) is:

64  89  83  75  64  50  36  18
64  75  36 -18 -64 -89 -83 -50
64  50 -36 -89 -64  18  83  75
64  18 -83 -50  64  75 -36 -89
64 -18 -83  50  64 -75 -36  89
64 -50 -36  89 -64 -18  83 -75
64 -75  36  18 -64  89 -83  50
64 -89  83 -75  64 -50  36 -18
when the size of the DCT 2' matrix is 16 or 32, the derivation can also be obtained by using equation (2), and details are not described.
In one embodiment, the encoder or decoder may also derive a small-size DCT2 matrix from a stored large-size DCT2 matrix. For example, when the size of the large DCT2 matrix is 64, i.e., 64x64, a DCT2 matrix of size smaller than 64 may be derived according to the following formula (3):

transMatrix[ i ][ j * 2^(6 - Log2(nTbS)) ] (3)

where transMatrix denotes the DCT2 matrix (the 64x64 DCT2 matrix), nTbS denotes the size of the transform matrix, 0 <= i <= nTbS - 1, and 0 <= j <= nTbS - 1.
Comparing formulas (2) and (3), the difference is that the positions of i and j are exchanged, which means that the matrices obtained by formulas (2) and (3) are transposes of each other.
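A float sketch of formulas (2) and (3) (illustrative helper names): formula (3) subsamples every 64/nTbS-th row of the 64x64 DCT2 matrix to produce the nTbS-size DCT2 matrix, and formula (2), with i and j exchanged, produces its transpose, the DCT2' matrix:

```python
import math

def dct2_col_row(N):
    # transMatrix[m][n]: m = column (sample) index, n = row (frequency) index
    return [[(math.sqrt(0.5) if n == 0 else 1.0) * math.sqrt(2.0 / N) *
             math.cos(math.pi * n * (2 * m + 1) / (2 * N))
             for n in range(N)] for m in range(N)]

def derive_dct2(nTbS, T64):
    # Formula (3): entry at row j, column i is transMatrix[i][j * 2^(6 - Log2(nTbS))]
    shift = 6 - int(math.log2(nTbS))
    return [[T64[i][j << shift] for i in range(nTbS)] for j in range(nTbS)]

def derive_dct2_transposed(nTbS, T64):
    # Formula (2): i and j exchanged, so the result is the transpose of formula (3)
    shift = 6 - int(math.log2(nTbS))
    return [[T64[j][i << shift] for i in range(nTbS)] for j in range(nTbS)]
```
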
Fig. 10 depicts a flow of a video decoding method provided by an embodiment of the present invention, which may be performed by the video decoder shown in fig. 3, for example, and includes:
1001. Parsing the received code stream to obtain indication information of the transform matrix pair used for inverse transform processing of the current block and the quantized coefficients of the current block, where the transform matrix pair includes a horizontal-direction transform matrix and a vertical-direction transform matrix.
1002. Performing inverse quantization processing on the quantized coefficients of the current block to obtain inverse quantized coefficients of the current block.
1003. Determining, from candidate transform matrix pairs according to the indication information, the transform matrix pair used for inverse transform processing of the current block, where the horizontal-direction transform matrix and the vertical-direction transform matrix included in a candidate transform matrix pair are each one of two preset transform matrices; one of the two transform matrices is a DST4 matrix or a variant of the DST4 matrix, and the other is a DCT2' matrix or a variant of the DCT2' matrix.
The number of candidate transformation matrix pairs may be 2, 3 or 4.
The horizontal-direction transform matrix and the vertical-direction transform matrix included in any one candidate transform matrix pair may be the same or different.
In one embodiment, a variant of the DST4 matrix is obtained by sign-transforming the coefficients of at least some rows or at least some columns of the DST4 matrix; for example, the sign transformation may be sign inversion.
In one embodiment, a variant of the DCT2' matrix is obtained by sign-transforming the coefficients of at least some rows or at least some columns of the DCT2' matrix; for example, the sign transformation may be sign inversion.
For example, the candidate transformation matrix pair may be a candidate transformation matrix pair as described in any of table 11, table 20, table 24, table 25, or tables 26-33.
In one embodiment, the number of candidate transform matrix pairs is four, and the indication information of the transform matrix pair used for inverse transform processing of the current block is the index in Table 11, Table 20, Table 24, or Table 25. Taking Table 11 as an example: if the index is 0, the vertical-direction transform matrix in the pair used for inverse transform processing of the current block is the DST4 matrix and the horizontal-direction transform matrix is the DST4 matrix; if the index is 1, the vertical-direction transform matrix is the DST4 matrix and the horizontal-direction transform matrix is the DCT2' matrix; if the index is 2, the vertical-direction transform matrix is the DCT2' matrix and the horizontal-direction transform matrix is the DST4 matrix; if the index is 3, the vertical-direction transform matrix is the DCT2' matrix and the horizontal-direction transform matrix is the DCT2' matrix. The handling of Table 20 and the indices of Tables 24-33 are similar to Table 11 and are not described in detail.
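The index-to-pair mapping just described for Table 11 can be sketched as a lookup (hypothetical names; the matrices are represented by their labels only):

```python
# Candidate transform matrix pairs of Table 11, keyed by the index parsed
# from the code stream; each entry is (vertical matrix, horizontal matrix).
MTS_TABLE_11 = {
    0: ("DST4", "DST4"),
    1: ("DST4", "DCT2'"),
    2: ("DCT2'", "DST4"),
    3: ("DCT2'", "DCT2'"),
}

def select_transform_pair(mts_index):
    # Returns the (vertical, horizontal) transform matrix pair to be used
    # for the inverse transform of the current block.
    return MTS_TABLE_11[mts_index]
```
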
In another embodiment, the indication information of the transformation matrix pair used for inverse transforming the current block includes an identifier of the vertical direction transformation matrix and an identifier of the horizontal direction transformation matrix in that pair. For example, one bit serves as the identifier of the vertical direction transformation matrix, and another bit serves as the identifier of the horizontal direction transformation matrix.
Taking table 11 as an example: if the bit value for the vertical direction transformation matrix is 0, the vertical direction transformation matrix is the DST4 matrix; otherwise, it is the DCT2' matrix. If the bit value for the horizontal direction transformation matrix is 0, the horizontal direction transformation matrix is the DST4 matrix; otherwise, it is the DCT2' matrix.
Taking table 20 as an example: if the bit value for the vertical direction transformation matrix is 0, the vertical direction transformation matrix is a variant of the DST4 matrix; otherwise, it is a variant of the DCT2' matrix. The bit for the horizontal direction transformation matrix is interpreted in the same way.
Taking table 24 as an example: if the bit value for the vertical direction transformation matrix is 0, the vertical direction transformation matrix is a variant of the DST4 matrix; otherwise, it is the DCT2' matrix. The bit for the horizontal direction transformation matrix is interpreted in the same way.
Taking table 25 as an example: if the bit value for the vertical direction transformation matrix is 0, the vertical direction transformation matrix is the DST4 matrix; otherwise, it is a variant of the DCT2' matrix. The bit for the horizontal direction transformation matrix is interpreted in the same way.
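Under the two-flag scheme above (table 11 case), decoding the two identifier bits can be sketched as follows; the bit semantics are taken from the preceding paragraphs, and the matrix names are labels only.

```python
def pair_from_flags(vertical_bit, horizontal_bit):
    # Table 11 case: bit value 0 selects the DST4 matrix, any other value the
    # DCT2' matrix, independently for the vertical and horizontal directions.
    vertical = "DST4" if vertical_bit == 0 else "DCT2'"
    horizontal = "DST4" if horizontal_bit == 0 else "DCT2'"
    return vertical, horizontal
```

For the table 20, 24, and 25 cases, the same two flags would select the variant matrices as described above.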
1004. Perform inverse transformation processing on the inverse quantized coefficients of the current block according to the transformation matrix pair used for inverse transforming the current block, to obtain the reconstructed residual block of the current block.
1005. Obtain the reconstructed block of the current block according to the reconstructed residual block of the current block.
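To make the data flow of step 1004 concrete, the following is a minimal floating-point sketch of a separable inverse transform: the vertical transform matrix acts on the columns of the coefficient block and the horizontal one on its rows. A real codec operates on scaled integer matrices with intermediate shifts and clipping, so this is an illustration only; the orthonormal DST4 basis used here is an assumption made so that the round-trip holds exactly.

```python
import math

def dst4_orthonormal(n):
    # Orthonormal DST4 basis, so that the inverse transform is the transpose.
    s = math.sqrt(2.0 / n)
    return [[s * math.sin(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n))
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_transform(residual, vert, horiz):
    # C = vert * R * horiz^T: vertical transform on columns, horizontal on rows.
    return matmul(matmul(vert, residual), transpose(horiz))

def inverse_transform(coeffs, vert, horiz):
    # R = vert^T * C * horiz, undoing the forward step above.
    return matmul(matmul(transpose(vert), coeffs), horiz)
```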
It can be seen that, because a fast butterfly algorithm exists for the transform/inverse transform of the DCT2' matrix or its variant, the implementation of the transform/inverse transform can be simplified. Moreover, the DCT2' matrix or its variant and the DST4 matrix or its variant can directly reuse the transform/inverse transform circuit corresponding to the DCT2 matrix, so that the circuit design of the transform/inverse transform module can be simplified when that module is implemented in hardware.
In one embodiment, before step 1004, the method may further comprise: deriving, from the DCT2 matrix according to a preset algorithm, the transformation matrices included in the transformation matrix pair used for inverse transforming the current block.
For example, when the transformation matrix pair used for inverse transforming the current block includes the DST4 matrix and the size of the DCT2 matrix is 64, the derivation according to the preset algorithm may comprise: deriving the DST4 matrix according to the aforementioned equation (1).
For example, when the transformation matrix pair used for inverse transforming the current block includes the DCT2' matrix and the size of the DCT2 matrix is 64, the derivation according to the preset algorithm may comprise: deriving the DCT2' matrix according to the aforementioned equation (2).
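As a concrete illustration of these derivations, the sketch below builds an unnormalized floating-point 64-point DCT2 matrix and reads the DST4 and DCT2' matrices out of it by subsampling, offsetting, and sign inversion, in the spirit of equations (1) and (2). The exact index arithmetic is an assumption reconstructed from the offsets described in the claims (column offset 64-nTbS, row offset 2^(5-Log2(nTbS)), sign factor (-1)^j), and the integer scaling of a real codec is not reproduced; the relation is shown only for nTbS in {4, 8, 16}.

```python
import math

def dct2_64():
    # Unnormalized 64-point DCT2 basis: row index k is the frequency,
    # column index n is the sample position.
    return [[math.cos(math.pi * k * (2 * n + 1) / 128.0) for n in range(64)]
            for k in range(64)]

def derive_dst4(trans, n_tbs):
    # In the spirit of equation (1): read DST4(nTbS) out of the 64-point DCT2
    # with a row offset of 2^(5 - Log2(nTbS)), a column offset of 64 - nTbS,
    # and a (-1)^j sign transform. Holds as written for nTbS in {4, 8, 16}
    # on unnormalized bases.
    stride = 32 // n_tbs  # 2^(5 - Log2(nTbS))
    return [[((-1) ** j) * trans[(2 * j + 1) * stride][64 - n_tbs + i]
             for j in range(n_tbs)] for i in range(n_tbs)]

def derive_dct2_transpose(trans, n_tbs):
    # In the spirit of equation (2): DCT2' (the transpose of the nTbS-point
    # DCT2) is a subsampling of the 64-point DCT2 with stride 2^(6 - Log2(nTbS)).
    stride = 64 // n_tbs
    return [[trans[j * stride][i] for j in range(n_tbs)] for i in range(n_tbs)]
```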
It can be seen that the decoder only needs to store the DCT2 matrix to derive the matrix included in the transform matrix pair, so that the number of transform matrices that the decoder needs to store can be reduced, and the occupation of the memory space of the decoder by the transform matrices can be reduced.
Fig. 11 depicts a flow of a video encoding method provided by an embodiment of the present invention, which may be performed by the video encoder shown in fig. 2, for example, and includes:
1101. Determine indication information of a transformation matrix pair for transforming the current residual block, wherein the transformation matrix pair comprises a horizontal direction transformation matrix and a vertical direction transformation matrix; the transformation matrix pair is one of the candidate transformation matrix pairs, and both the horizontal direction transformation matrix and the vertical direction transformation matrix included in a candidate transformation matrix pair are one of two preset transformation matrices; one of the two transformation matrices is the DST4 matrix or a variant of the DST4 matrix, and the other is the DCT2' matrix or a variant of the DCT2' matrix.
The horizontal direction transformation matrix and the vertical direction transformation matrix included in any one candidate transformation matrix pair may be the same or different.
The number of candidate transformation matrix pairs may be 2, 3 or 4.
In one embodiment, the variant of the DST4 matrix is obtained by sign-transforming the coefficients of at least some rows or at least some columns of the DST4 matrix; for example, the sign transformation may be sign inversion.
In one embodiment, the variant of the DCT2' matrix is obtained by sign-transforming the coefficients of at least some rows or at least some columns of the DCT2' matrix; for example, the sign transformation may be sign inversion.
For example, the candidate transformation matrix pair may be a candidate transformation matrix pair as described in any of table 11, table 20, table 24, table 25, or tables 26-33.
Specifically, the encoder may perform horizontal direction transformation and vertical direction transformation on the residual block with each of the candidate transformation matrix pairs, select the pair with the minimum rate-distortion cost as the transformation matrix pair for transforming the current residual block, and then determine the indication information of that pair from any one of table 11, table 20, or tables 24 to 33.
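The encoder-side selection just described can be sketched as an exhaustive rate-distortion search. In this sketch, `rd_cost` is a caller-supplied stand-in for the encoder's actual cost computation (distortion plus lambda times rate) and is an assumption, not part of the original description.

```python
def select_transform_pair(residual_block, candidate_pairs, rd_cost):
    # Try each candidate (vertical, horizontal) pair and keep the index of
    # the pair with the minimum rate-distortion cost; that index is then
    # entropy coded as the indication information.
    best_index, best_cost = 0, float("inf")
    for index, pair in enumerate(candidate_pairs):
        cost = rd_cost(residual_block, pair)  # hypothetical RD evaluation
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index
```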
1102. Quantize the transform coefficients obtained by transforming the current residual block with the transformation matrix pair, to obtain the quantized coefficients of the current residual block.
1103. Entropy-encode the quantized coefficients of the current residual block and the indication information.
1104. Write the entropy-encoded indication information of the transformation matrix pair and the entropy-encoded quantized coefficients of the current residual block into the code stream.
It can be seen that, because a fast butterfly algorithm exists for the transform/inverse transform of the DCT2' matrix or its variant, the implementation of the transform/inverse transform can be simplified. Moreover, the DCT2' matrix or its variant and the DST4 matrix or its variant can directly reuse the transform/inverse transform circuit corresponding to the DCT2 matrix, so that the circuit design of the transform/inverse transform module can be simplified when that module is implemented in hardware.
In one embodiment, the encoding method further comprises: deriving the transformation matrices included in the transformation matrix pair from the DCT2 matrix according to a preset algorithm.
For example, when the transformation matrix pair includes the DST4 matrix and the size of the DCT2 matrix is 64, deriving the transformation matrices included in the pair from the DCT2 matrix according to the preset algorithm comprises: deriving the DST4 matrix according to equation (1) above.
For example, when the transformation matrix pair includes the DCT2' matrix and the size of the DCT2 matrix is 64, deriving the transformation matrices included in the pair from the DCT2 matrix according to the preset algorithm comprises: deriving the DCT2' matrix according to equation (2) above.
It can be seen that the encoder only needs to store the DCT2 matrix to derive the matrix included in the transform matrix pair, so that the number of transform matrices that the encoder needs to store can be reduced, and the storage space occupied by the transform matrices for the encoder can be reduced.
An embodiment of the present invention provides a video decoder 30 having a structure as shown in fig. 3, including:
an entropy decoding unit 304, configured to parse the received code stream to obtain the indication information of the transformation matrix pair used for inverse transforming the current block and the quantized coefficients 309 of the current block, where the transformation matrix pair includes a horizontal direction transformation matrix and a vertical direction transformation matrix.
An inverse quantization unit 310, configured to perform an inverse quantization process on the quantized coefficients 309 of the current block to obtain inverse quantized coefficients 311 of the current block.
An inverse transform processing unit 312, configured to determine, from the candidate transformation matrix pairs according to the indication information, the transformation matrix pair for inverse transforming the current block, where both the horizontal direction transformation matrix and the vertical direction transformation matrix included in a candidate transformation matrix pair are one of two preset transformation matrices; one of the two transformation matrices is the DST4 matrix or a variant of the DST4 matrix, and the other is the DCT2' matrix or a variant of the DCT2' matrix; and to perform inverse transformation processing on the inverse quantized coefficients of the current block according to that transformation matrix pair, to obtain the reconstructed residual block 313 of the current block.
The specific processing may refer to the processing of step 1003.
A reconstructing unit 314, configured to obtain a reconstructed block 315 of the current block based on the reconstructed residual block of the current block.
In an embodiment, the inverse transform processing unit 312 may be further configured to derive, from the DCT2 matrix according to a preset algorithm, the transformation matrices included in the transformation matrix pair used for inverse transforming the current block.
For example, when the pair of transform matrices inverse-transformed for the current block includes a DST4 matrix, and the DCT2 matrix has a size of 64, the inverse-transform processing unit 312 may be specifically configured to derive the DST4 matrix according to the above formula (1).
For example, when the transformation matrix pair used for inverse transforming the current block includes the DCT2' matrix, and the size of the DCT2 matrix is 64, the inverse transform processing unit 312 may be specifically configured to derive the DCT2' matrix according to the above formula (2).
An embodiment of the present invention provides a video encoder 20 with a structure as shown in fig. 2, including:
a transform processing unit 206, configured to determine indication information of a transformation matrix pair for transforming the current residual block 205, the transformation matrix pair including a horizontal direction transformation matrix and a vertical direction transformation matrix; the transformation matrix pair is one of the candidate transformation matrix pairs, and both the horizontal direction transformation matrix and the vertical direction transformation matrix included in a candidate transformation matrix pair are one of two preset transformation matrices; one of the two transformation matrices is the DST4 matrix or a variant of the DST4 matrix, and the other is the DCT2' matrix or a variant of the DCT2' matrix.
The specific implementation can refer to the process of 1101.
A quantization unit 208, configured to quantize the transform coefficients 207 obtained by transforming the current residual block with the transformation matrix pair, to obtain the quantized coefficients of the current residual block.
The transform coefficients 207 may be obtained specifically by the transform processing unit 206.
An entropy encoding unit 270, configured to perform entropy encoding processing on the quantized coefficient of the current residual block and the indication information;
and an output 272, configured to write, into a code stream, the indication information of the transform matrix pair after entropy encoding processing and the quantization coefficient of the current residual block after entropy encoding processing.
In an embodiment, the transform processing unit 206 may be further configured to derive a transform matrix included in the transform matrix pair from the DCT2 matrix according to a predetermined algorithm.
For example, when the pair of transform matrices includes a DST4 matrix and the DCT2 matrix has a size of 64, the transform processing unit 206 may be specifically configured to derive the DST4 matrix according to the above equation (1).
For example, when the transformation matrix pair includes the DCT2' matrix and the size of the DCT2 matrix is 64, the transform processing unit 206 may be specifically configured to derive the DCT2' matrix according to the above equation (2).
An embodiment of the present invention further provides a video decoder, which includes an execution circuit for executing any of the video decoding methods described above.
An embodiment of the present invention further provides a video decoder, including: at least one processor; and a non-transitory computer readable storage medium coupled to the at least one processor, the non-transitory computer readable storage medium storing a computer program executable by the at least one processor, the computer program when executed by the at least one processor causing the video decoder to perform any of the video decoding methods described above.
An embodiment of the present invention further provides a video encoder, which includes an execution circuit for executing any of the video encoding methods described above.
An embodiment of the present invention further provides a video encoder, including: at least one processor; and a non-transitory computer-readable storage medium coupled to the at least one processor, the non-transitory computer-readable storage medium storing a computer program executable by the at least one processor, the computer program, when executed by the at least one processor, causing the video encoder to perform any of the video encoding methods described above.
Embodiments of the present invention further provide a computer-readable storage medium storing a computer program executable by a processor, the computer program, when executed by at least one processor, performing any one of the methods described above.
An embodiment of the present invention further provides a computer program, which, when executed, performs any one of the methods described above.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer readable media may comprise computer readable storage media corresponding to tangible media, such as data storage media or communication media, including any medium that facilitates transfer of a computer program from one place to another, such as according to a communication protocol. In this manner, the computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium, e.g., a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this disclosure. The computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a variety of devices or apparatuses including a wireless handset, an Integrated Circuit (IC), or a collection of ICs (e.g., a chipset). This disclosure describes various components, modules, or units to emphasize functional aspects of the apparatus for performing the disclosed techniques, but does not necessarily require realization by different hardware units. Specifically, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Claims (27)

1. A video decoding method, comprising:
parsing the received code stream to obtain indication information of a transformation matrix pair for performing inverse transformation processing on a current block and a quantized coefficient of the current block, wherein the transformation matrix pair comprises a horizontal direction transformation matrix and a vertical direction transformation matrix;
performing inverse quantization processing on the quantized coefficient of the current block to obtain an inverse quantized coefficient of the current block;
determining, from candidate transformation matrix pairs according to the indication information, a transformation matrix pair for performing inverse transformation processing on the current block; the horizontal direction transformation matrix and the vertical direction transformation matrix included in the candidate transformation matrix pair are both one of two preset transformation matrices; one of the two transformation matrices is a DST4 matrix or a modification of a DST4 matrix, and the other of the two transformation matrices is a DCT2' matrix or a modification of a DCT2' matrix, wherein the DCT2' matrix is a transposed matrix of the DCT2 matrix;
performing inverse transformation processing on the inverse quantized coefficient of the current block according to the transformation matrix pair for performing inverse transformation processing on the current block, to obtain a reconstructed residual block of the current block; and
obtaining a reconstructed block of the current block according to the reconstructed residual block of the current block.
2. The method of claim 1, wherein the modification of the DST4 matrix is obtained by sign-transforming coefficients of at least a part of rows or at least a part of columns of the DST4 matrix; or
the modification of the DCT2' matrix is obtained by sign-transforming coefficients of at least a part of rows or at least a part of columns of the DCT2' matrix.
3. The method according to claim 1 or 2, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a DST4 matrix and the other is a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DST4 matrix, and a horizontal direction transformation matrix which is a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DST4 matrix and a horizontal direction transformation matrix which is a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DCT2' matrix and a horizontal direction transformation matrix which is a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DCT2' matrix and a horizontal direction transformation matrix which is a DCT2' matrix.
4. The method according to claim 1 or 2, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a modification of a DST4 matrix and the other is a modification of a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DST4 matrix, and includes a horizontal direction transformation matrix that is a modification of the DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DST4 matrix, and includes a horizontal direction transformation matrix that is a modification of the DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DCT2' matrix, and includes a horizontal direction transformation matrix that is a modification of the DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DCT2' matrix, and includes a horizontal direction transformation matrix that is a modification of the DCT2' matrix.
5. The method according to any one of claims 1 to 4, wherein the indication information includes an identifier of the vertical direction transformation matrix and an identifier of the horizontal direction transformation matrix in the transformation matrix pair for performing inverse transformation processing on the current block.
6. The method according to any one of claims 1 to 5, wherein before performing inverse transform processing on the inverse quantized coefficients of the current block according to the transform matrix for performing inverse transform processing on the current block, the method further comprises:
and deriving a transformation matrix included in the transformation matrix pair subjected to the inverse transformation processing by the current block from the DCT2 matrix according to a preset algorithm.
7. The method of claim 6, wherein the pair of transform matrices for the current block being inverse transformed comprises a DST4 matrix; the size of the DCT2 matrix is 64;
the deriving, from the DCT2 matrix according to a preset algorithm, of the transformation matrix included in the transformation matrix pair for performing inverse transformation processing on the current block comprises:
the DST4 matrix is derived from the DCT2 matrix according to the following formula:
(-1)^j × transMatrix[j × 2^(6-Log2(nTbS)) + 2^(5-Log2(nTbS))][64 - nTbS + i]
wherein transMatrix represents the DCT2 matrix, nTbS represents the size of the DST4 matrix, 0 ≤ i ≤ nTbS-1, and 0 ≤ j ≤ nTbS-1; the offset 64-nTbS represents the offset of the columns; the offset 2^(5-Log2(nTbS)) represents the offset of the rows; and (-1)^j indicates that a sign transformation is performed.
8. The method according to claim 6 or 7, wherein the transformation matrix pair for performing inverse transformation processing on the current block comprises a DCT2' matrix; the size of the DCT2 matrix is 64;
the deriving, from the DCT2 matrix according to a preset algorithm, of the transformation matrix included in the transformation matrix pair for performing inverse transformation processing on the current block comprises:
the DCT 2' matrix is derived from the DCT2 matrix according to the following formula:
transMatrix[j][i × 2^(6-Log2(nTbS))];
wherein, the transMatrix represents the DCT2 matrix, nTbs represents the size of the DCT 2', i is more than or equal to 0 and less than or equal to nTbS-1, and j is more than or equal to 0 and less than or equal to nTbS-1.
9. A method of encoding, comprising:
determining indication information of a transformation matrix pair for performing transformation processing on a current residual block, wherein the transformation matrix pair comprises a horizontal direction transformation matrix and a vertical direction transformation matrix; the transformation matrix pair is one of candidate transformation matrix pairs, and both a horizontal direction transformation matrix and a vertical direction transformation matrix included in the candidate transformation matrix pair are one of two preset transformation matrices; one of the two transformation matrices is a DST4 matrix or a modification of a DST4 matrix, and the other of the two transformation matrices is a DCT2' matrix or a modification of a DCT2' matrix, wherein the DCT2' matrix is a transposed matrix of the DCT2 matrix;
quantizing transform coefficients obtained by transforming the current residual block with the transformation matrix pair, to obtain quantized coefficients of the current residual block;
performing entropy coding processing on the quantization coefficient of the current residual block and the indication information;
and writing the indication information of the transformation matrix pair after entropy coding and the quantization coefficient of the current residual block after entropy coding into a code stream.
10. The method of claim 9, wherein the modification of the DST4 matrix is obtained by sign-transforming coefficients of at least a part of rows or at least a part of columns of the DST4 matrix; or
the modification of the DCT2' matrix is obtained by sign-transforming coefficients of at least a part of rows or at least a part of columns of the DCT2' matrix.
11. The method according to claim 9 or 10, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a DST4 matrix and the other is a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DST4 matrix, and a horizontal direction transformation matrix which is a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DST4 matrix and a horizontal direction transformation matrix which is a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DCT2' matrix and a horizontal direction transformation matrix which is a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix which is a DCT2' matrix and a horizontal direction transformation matrix which is a DCT2' matrix.
12. The method according to claim 9 or 10, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a modification of a DST4 matrix and the other is a modification of a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DST4 matrix, and includes a horizontal direction transformation matrix that is a modification of the DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DST4 matrix, and includes a horizontal direction transformation matrix that is a modification of the DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DCT2' matrix, and includes a horizontal direction transformation matrix that is a modification of the DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs includes a vertical direction transformation matrix that is a modification of the DCT2' matrix, and includes a horizontal direction transformation matrix that is a modification of the DCT2' matrix.
13. A video decoder, comprising:
an entropy decoding unit, configured to parse a received code stream to obtain indication information of a transformation matrix pair used for performing inverse transformation processing on a current block and a quantized coefficient of the current block, wherein the transformation matrix pair comprises a horizontal direction transformation matrix and a vertical direction transformation matrix;
an inverse quantization unit, configured to perform inverse quantization processing on the quantized coefficient of the current block to obtain an inverse quantized coefficient of the current block;
an inverse transformation processing unit, configured to determine, from the candidate transformation matrix pairs according to the indication information, the transformation matrix pair used for performing inverse transformation processing on the current block, wherein the horizontal direction transformation matrix and the vertical direction transformation matrix included in each candidate transformation matrix pair are each one of two preset transformation matrices; one of the two transformation matrices is a DST4 matrix or a modification of a DST4 matrix, and the other is a DCT2' matrix or a modification of a DCT2' matrix, wherein the DCT2' matrix is a transposed matrix of the DCT2 matrix; and to perform inverse transformation processing on the inverse-quantized coefficient of the current block according to the determined transformation matrix pair, to obtain a reconstructed residual block of the current block;
a reconstruction unit, configured to obtain a reconstructed block of the current block based on the reconstructed residual block of the current block.
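The four units of claim 13 form a straightforward pipeline: entropy decode, inverse quantize, inverse transform with the signalled pair, then reconstruct. The sketch below is schematic only, under simplifying assumptions (a single flat inverse-quantization scale, orthonormal transform matrices); a real decoder is far more involved:

```python
def matmul(A, B):
    # Plain-list matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(r) for r in zip(*M)]

def inverse_transform(coeffs, V, H):
    # Separable inverse transform: columns by the vertical matrix V,
    # rows by the horizontal matrix H (orthonormal V and H assumed).
    return matmul(matmul(transpose(V), coeffs), H)

def decode_block(pair_idx, quant_coeffs, scale, candidates, prediction):
    # Inverse quantization (one flat scale stands in for the real scheme).
    dequant = [[c * scale for c in row] for row in quant_coeffs]
    # The indication information selects one (vertical, horizontal) pair.
    V, H = candidates[pair_idx]
    residual = inverse_transform(dequant, V, H)
    # Reconstruction: prediction plus reconstructed residual.
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]
```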
14. The video decoder according to claim 13, wherein the modification of the DST4 matrix is obtained by performing sign transformation on coefficients of at least a part of rows or at least a part of columns of the DST4 matrix; or
the modification of the DCT2' matrix is obtained by performing sign transformation on coefficients of at least a part of rows or at least a part of columns of the DCT2' matrix.
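The "modification" of claims 14 and 22 is a sign change applied to whole rows or columns of the base matrix. A minimal sketch follows; which rows or columns are flipped is left open by the claims, so the selection is a parameter here:

```python
def sign_flip(matrix, rows=(), cols=()):
    # Negate the coefficients of the selected rows and/or columns. An
    # element hit by both a flipped row and a flipped column is negated
    # twice, i.e. unchanged, hence the XOR. Flipping whole rows or
    # columns of an orthogonal transform matrix keeps it orthogonal.
    return [[-v if (i in rows) != (j in cols) else v
             for j, v in enumerate(row)]
            for i, row in enumerate(matrix)]
```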
15. The video decoder according to claim 13 or 14, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a DST4 matrix and the other is a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DST4 matrix and a horizontal direction transformation matrix being a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DST4 matrix and a horizontal direction transformation matrix being a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DCT2' matrix and a horizontal direction transformation matrix being a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DCT2' matrix and a horizontal direction transformation matrix being a DCT2' matrix.
16. The video decoder according to claim 13 or 14, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a modification of a DST4 matrix and the other is a modification of a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix.
17. The video decoder according to any one of claims 13 to 16, wherein the indication information comprises an identifier of the vertical direction transformation matrix in the transformation matrix pair used for performing inverse transformation processing on the current block, and an identifier of the horizontal direction transformation matrix in that transformation matrix pair.
18. The video decoder according to any one of claims 13 to 17, wherein the inverse transformation processing unit is further configured to derive, according to a preset algorithm, a transformation matrix included in the transformation matrix pair used for performing inverse transformation processing on the current block from the DCT2 matrix.
19. The video decoder according to claim 18, wherein the transformation matrix pair used for performing inverse transformation processing on the current block comprises a DST4 matrix, and the size of the DCT2 matrix is 64;
the inverse transform processing unit is specifically configured to:
the DST4 matrix is derived from the DCT2 matrix according to the following formula:
transMatrix[i×2^(6-Log2(nTbS))][64-nTbS+j]×(-1)^j;
wherein transMatrix represents the DCT2 matrix, nTbS represents the size of the DST4 matrix, 0 ≤ i ≤ nTbS-1, and 0 ≤ j ≤ nTbS-1; the offset 64-nTbS represents the offset of the columns; the offset 2^(6-Log2(nTbS)) represents the offset of the rows; and (-1)^j indicates that a sign transformation is performed.
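Claim 19's formula appears only as an image in this extraction, but its terms are described in the text: a column offset of 64-nTbS, a row offset (stride) of 2^(6-Log2(nTbS)), and a sign term (-1)^j. The sketch below implements that indexing literally; the exact index layout is a reconstruction from those descriptions, not a verified copy of the figure:

```python
def derive_dst4(transMatrix, nTbS):
    # transMatrix: the 64x64 DCT2 matrix; nTbS: the DST4 size (a power
    # of two up to 64). Row stride 2^(6 - Log2(nTbS)) subsamples the
    # DCT2 rows, column offset 64 - nTbS selects the last nTbS columns,
    # and (-1)^j applies the sign transformation described in the claim.
    stride = 64 // nTbS   # equals 2^(6 - Log2(nTbS))
    offset = 64 - nTbS
    return [[(-1) ** j * transMatrix[i * stride][offset + j]
             for j in range(nTbS)]
            for i in range(nTbS)]
```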
20. The video decoder according to claim 18 or 19, wherein the transformation matrix pair used for performing inverse transformation processing on the current block comprises a DCT2' matrix, and the size of the DCT2 matrix is 64;
the inverse transform processing unit is specifically configured to:
the DCT 2' matrix is derived from the DCT2 matrix according to the following formula:
transMatrix[j][i×2^(6-Log2(nTbS))];
wherein transMatrix represents the DCT2 matrix, nTbS represents the size of the DCT2' matrix, 0 ≤ i ≤ nTbS-1, and 0 ≤ j ≤ nTbS-1.
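Claim 20's expression swaps the two indices, which transposes the DCT2 matrix, while the stride 2^(6-Log2(nTbS)) subsamples it down to size nTbS. A direct sketch of that derivation:

```python
def derive_dct2_prime(transMatrix, nTbS):
    # transMatrix: the 64x64 DCT2 matrix. Indexing transMatrix[j][...]
    # with the output's row index i in the column position transposes
    # the matrix; the stride subsamples it to an nTbS x nTbS matrix.
    stride = 64 // nTbS   # equals 2^(6 - Log2(nTbS))
    return [[transMatrix[j][i * stride] for j in range(nTbS)]
            for i in range(nTbS)]
```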
21. A video encoder, comprising:
a transformation processing unit, configured to determine indication information of a transformation matrix pair used for performing transformation processing on a current residual block, the transformation matrix pair comprising a horizontal direction transformation matrix and a vertical direction transformation matrix; the transformation matrix pair is one of candidate transformation matrix pairs, and the horizontal direction transformation matrix and the vertical direction transformation matrix included in each candidate transformation matrix pair are each one of two preset transformation matrices; one of the two transformation matrices is a DST4 matrix or a modification of a DST4 matrix, and the other is a DCT2' matrix or a modification of a DCT2' matrix, wherein the DCT2' matrix is a transposed matrix of the DCT2 matrix;
a quantization unit, configured to perform quantization processing on transform coefficients obtained by performing transformation processing on the current residual block using the transformation matrix pair, to obtain quantized coefficients of the current residual block;
an entropy encoding unit, configured to perform entropy encoding processing on the quantized coefficient of the current residual block and the indication information;
an output, configured to write the entropy-encoded indication information of the transformation matrix pair and the entropy-encoded quantized coefficients of the current residual block into a code stream.
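Claim 21 only requires that the encoder signal which candidate pair it used, not how the pair is chosen; in practice the choice is typically a rate-distortion search over the candidates. A hedged sketch of such a selection loop follows; the `transform` and `cost` callables are placeholders, not claimed structure:

```python
def choose_pair(residual, candidates, transform, cost):
    # Transform the residual with every candidate (vertical, horizontal)
    # pair and keep the index of the cheapest result; that index is the
    # "indication information" later written to the code stream.
    best = min(range(len(candidates)),
               key=lambda k: cost(transform(residual, *candidates[k])))
    return best, candidates[best]
```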
22. The video encoder according to claim 21, wherein the modification of the DST4 matrix is obtained by performing sign transformation on coefficients of at least a part of rows or at least a part of columns of the DST4 matrix; or
the modification of the DCT2' matrix is obtained by performing sign transformation on coefficients of at least a part of rows or at least a part of columns of the DCT2' matrix.
23. The video encoder according to claim 21 or 22, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a DST4 matrix and the other is a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DST4 matrix and a horizontal direction transformation matrix being a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DST4 matrix and a horizontal direction transformation matrix being a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DCT2' matrix and a horizontal direction transformation matrix being a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix being a DCT2' matrix and a horizontal direction transformation matrix being a DCT2' matrix.
24. The video encoder according to claim 21 or 22, wherein the number of candidate transformation matrix pairs is four; when one of the two transformation matrices is a modification of a DST4 matrix and the other is a modification of a DCT2' matrix, a first transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a second transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DST4 matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix;
a third transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DST4 matrix;
a fourth transformation matrix pair of the four candidate transformation matrix pairs comprises a vertical direction transformation matrix that is a modification of a DCT2' matrix and a horizontal direction transformation matrix that is a modification of a DCT2' matrix.
25. A video decoder comprising execution circuitry for executing the method of any of claims 1 to 8.
26. A video decoder, comprising:
at least one processor; and
a non-transitory computer-readable storage medium coupled with the at least one processor, the non-transitory computer-readable storage medium storing a computer program executable by the at least one processor, the computer program when executed by the at least one processor causing the video decoder to perform the method of any of claims 1 to 8.
27. A computer-readable storage medium storing a computer program executable by a processor, the computer program, when executed by the processor, performing the method of any one of claims 1 to 12.
CN201811150819.0A 2018-09-21 2018-09-29 Video decoding method, video decoder, video encoding method and video encoder Active CN110944177B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/106383 WO2020057537A1 (en) 2018-09-21 2019-09-18 Video decoding method and video decoder, and video encoding method and video encoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811107865 2018-09-21
CN2018111078652 2018-09-21

Publications (2)

Publication Number Publication Date
CN110944177A true CN110944177A (en) 2020-03-31
CN110944177B CN110944177B (en) 2024-03-01

Family

ID=69904878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811150819.0A Active CN110944177B (en) 2018-09-21 2018-09-29 Video decoding method, video decoder, video encoding method and video encoder

Country Status (1)

Country Link
CN (1) CN110944177B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5168375A (en) * 1991-09-18 1992-12-01 Polaroid Corporation Image reconstruction by use of discrete cosine and related transforms
JPH08185389A (en) * 1994-05-10 1996-07-16 Matsushita Electric Ind Co Ltd Orthogonal transformation processor
EP1294198A2 (en) * 2001-09-18 2003-03-19 Microsoft Corporation Improved block transform and quantization for image and video coding
CN102045560A (en) * 2009-10-23 2011-05-04 华为技术有限公司 Video encoding and decoding method and video encoding and decoding equipment
WO2012077347A1 (en) * 2010-12-09 2012-06-14 パナソニック株式会社 Decoding method
JP2013047538A (en) * 2011-08-29 2013-03-07 Aisin Ai Co Ltd Vehicle power transmission control device
CN103098473A (en) * 2010-09-08 2013-05-08 三星电子株式会社 Low complexity transform coding using adaptive DCT/DST for intra-prediction
CN103796015A (en) * 2012-10-31 2014-05-14 朱洪波 Quantization coefficient differential coding adapted to the number of coefficients
CN104221378A (en) * 2012-04-16 2014-12-17 高通股份有限公司 Uniform granularity for quantization matrix in video coding
CN107211144A (en) * 2015-01-26 2017-09-26 高通股份有限公司 Enhanced multiple transform for prediction residual
WO2017171370A1 (en) * 2016-03-28 2017-10-05 주식회사 케이티 Method and apparatus for processing video signal
US20180020218A1 (en) * 2016-07-15 2018-01-18 Qualcomm Incorporated Look-up table for enhanced multiple transform
WO2018049549A1 (en) * 2016-09-13 2018-03-22 Mediatek Inc. Method of multiple quantization matrix sets for video coding
WO2018166429A1 (en) * 2017-03-16 2018-09-20 Mediatek Inc. Method and apparatus of enhanced multiple transforms and non-separable secondary transform for video coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AMIR SAID等: "Complexity Reduction for Adaptive Multiple Transforms (AMTs) using Adjustment Stages", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 JVET-J0066-V3, pages 0066 *

Also Published As

Publication number Publication date
CN110944177B (en) 2024-03-01

Similar Documents

Publication Publication Date Title
CN112005551B (en) Video image prediction method and device
CN111225206B (en) Video decoding method and video decoder
CN111919444B (en) Prediction method and device of chrominance block
CN115695785A (en) Image prediction method and device
CN112703735B (en) Video encoding/decoding method, related apparatus and computer-readable storage medium
CN111277828B (en) Video encoding and decoding method, video encoder and video decoder
CN110858903B (en) Chroma block prediction method and device
EP4346212A2 (en) Video picture decoding and encoding method and apparatus
CN113366850B (en) Video encoder, video decoder and corresponding methods
CN111294603B (en) Video encoding and decoding method and device
CN112135149B (en) Entropy encoding/decoding method and device of syntax element and codec
CN111277840B (en) Transform method, inverse transform method, video encoder and video decoder
CN110876061B (en) Chroma block prediction method and device
CN112637590A (en) Video encoder, video decoder and corresponding methods
WO2020069632A1 (en) A video encoder, a video decoder and corresponding methods
CN113316939A (en) Context modeling method and device for zone bit
CN110944177B (en) Video decoding method, video decoder, video encoding method and video encoder
US11722668B2 (en) Video encoder, video decoder, and corresponding method
CN112135148B (en) Non-separable transformation method and device
WO2020057537A1 (en) Video decoding method and video decoder, and video encoding method and video encoder
CN111355961A (en) Method and device for inter-frame prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant