CN114026865A - Coding and decoding tool for chrominance component - Google Patents

Coding and decoding tool for chrominance component

Info

Publication number: CN114026865A
Application number: CN202080045375.3A
Authority: CN (China)
Legal status: Pending
Prior art keywords: transform, video, block, codec, current block
Other languages: Chinese (zh)
Inventors: Kai Zhang (张凯), Li Zhang (张莉), Hongbin Liu (刘鸿彬), Zhipin Deng (邓智玭), Yue Wang (王悦)
Assignees: Beijing ByteDance Network Technology Co., Ltd.; ByteDance Inc.
Application filed by Beijing ByteDance Network Technology Co., Ltd. and ByteDance Inc.

Classifications

    All under H ELECTRICITY → H04 ELECTRIC COMMUNICATION TECHNIQUE → H04N PICTORIAL COMMUNICATION, e.g. TELEVISION → H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals → H04N19/10 using adaptive coding:
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/186 Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component


Abstract

Apparatus, systems, and methods for digital video coding are described, including coding tools for chroma components. In a representative aspect, a method for video processing includes performing a conversion between a current block of a video and a bitstream representation of the video, wherein whether a Multiple Transform Set (MTS) index and/or a transform skip flag is signaled in the bitstream representation is based on an enablement of a Block Differential Pulse Code Modulation (BDPCM)-based coding tool for the current block.

Description

Coding and decoding tool for chrominance component
Cross Reference to Related Applications
The present application timely claims the priority to and benefit of International Patent Application No. PCT/CN2019/092388, filed on June 21, 2019, as required by the applicable patent laws and/or the Paris Convention. The entire disclosure of the above application is incorporated by reference as part of the disclosure of the present application for all purposes under the law.
Technical Field
This patent document relates to video encoding and decoding techniques, devices and systems.
Background
Despite advances in video compression, digital video has resulted in the largest bandwidth usage on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that bandwidth requirements for digital video use will continue to grow.
Disclosure of Invention
Devices, systems, and methods related to digital video coding, and in particular to coding tools for chroma components, are described. The described methods may be applied to existing video coding standards (e.g., High Efficiency Video Coding (HEVC)), future video coding standards (e.g., Versatile Video Coding (VVC)), or video codecs.
In a representative aspect, the disclosed techniques may be used to provide an example method for video processing. The method comprises the following steps: applying a coding tool to one or more chroma components of the video based on selective application of the coding tool to a corresponding luma component of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and performing the conversion.
In another representative aspect, the disclosed techniques may be used to provide an example method for video processing. The method comprises the following steps: applying a coding tool to a current block of a first chroma component of video based on selective application of the coding tool to one or more corresponding blocks of other chroma components of the video as part of a conversion between the current block and a bitstream representation of the video; and performing the conversion.
In yet another representative aspect, the disclosed techniques may be used to provide an exemplary method for video processing. The method comprises the following steps: applying a coding tool to a luma component of the video based on a selective application of the coding tool to one or more corresponding chroma components of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and performing the conversion.
In yet another representative aspect, the disclosed techniques may be used to provide an exemplary method for video processing. The method includes: performing a conversion between a current block of a video and a bitstream representation of the video, wherein whether a Multiple Transform Set (MTS) index and/or a transform skip flag is signaled in the bitstream representation is based on an enablement of a Block Differential Pulse Code Modulation (BDPCM)-based coding tool for the current block.
In yet another representative aspect, the disclosed techniques may be used to provide an exemplary method for video processing. The method includes: selecting a coding type having a plurality of bins based on a Multiple Transform Set (MTS) type for a current block of a video; and applying the coding type to an indication of the MTS type as part of a conversion between the current block and a bitstream representation of the video.
In yet another representative aspect, the above-described methods are implemented in the form of processor executable code and stored in a computer readable program medium.
In yet another representative aspect, an apparatus configured or operable to perform the above-described method is disclosed. The apparatus may include a processor programmed to implement the method.
In yet another representative aspect, a video decoder device may implement a method as described herein.
The above and other aspects and features of the disclosed technology are described in more detail in the accompanying drawings, the description and the claims.
Drawings
Fig. 1 shows an example of a block diagram of an encoder.
Fig. 2 shows an example of 67 intra prediction modes.
Fig. 3A and 3B illustrate examples of reference samples for a wide-angle intra prediction mode for a non-square block.
Fig. 4 illustrates an example of discontinuity when wide-angle intra prediction is used.
Fig. 5A-5D illustrate examples of samples used by the location-dependent intra prediction combining (PDPC) method.
Fig. 6 shows an example of division of a 4 × 8 block and an 8 × 4 block.
Fig. 7 shows an example of division of all blocks except for 4 × 8, 8 × 4, and 4 × 4.
Fig. 8 shows an example of dividing a 4 x 8 sample block into two independently decodable regions.
Fig. 9 shows an example of an order in which pixel rows are processed using a vertical predictor to maximize throughput of a 4 × N block.
Fig. 10 shows an example of quadratic transformation in JEM.
Fig. 11 shows an example of the proposed simplified quadratic transformation (RST).
Fig. 12 shows examples of a forward simplified transform and an inverse simplified transform.
Fig. 13 shows an example of a forward RST8×8 process utilizing a 16×48 matrix.
Fig. 14 shows an example of scanning positions 17 to 64 in an 8 x 8 block for non-zero elements.
FIG. 15 shows examples of sub-block transform modes SBT-V and SBT-H.
16A-16E illustrate flow diagrams of example methods for multiple transformations in accordance with the disclosed technology.
Fig. 17 is a block diagram of an example of a hardware platform for implementing the visual media decoding or encoding techniques described in this document.
FIG. 18 is a block diagram of an exemplary video processing system in which the disclosed techniques may be implemented.
Detailed Description
1 Introduction
Due to the increasing demand for higher-resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include electronic circuitry or software that compresses or decompresses digital video, and they are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video to a compressed format, and vice versa. There are complex relationships between video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end delay (latency). The compressed format usually conforms to a standard video compression specification, e.g., the High Efficiency Video Coding (HEVC) standard (also known as H.265 or MPEG-H Part 2), the Versatile Video Coding (VVC) standard to be finalized, or other current and/or future video coding standards.
Embodiments of the disclosed techniques may be applied to existing video codec standards (e.g., HEVC, h.265) and future standards to improve runtime performance. The section headings are used in this specification to improve the readability of the specification, and do not limit the discussion or the embodiments (and/or implementations) in any way to the corresponding sections only.
2 Embodiments and examples of methods for multiple transforms
2.1 color space and chroma subsampling
A color space, also referred to as a color model (or color system), is an abstract mathematical model that simply describes a range of colors as a tuple of numbers, typically 3 or 4 values or color components (e.g., RGB). Fundamentally, color space is a detailed description of the coordinate system and subspace.
For video compression, the most common color spaces are YCbCr and RGB.
YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y′ is the luma component, and CB and CR are the blue-difference and red-difference chroma components. Y′ (with prime) is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries.
Chroma subsampling is the practice of encoding images by achieving a lower resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luma.
4:4:4 format. Each of the three Y′CbCr components has the same sampling rate; thus there is no chroma subsampling. This scheme is sometimes used in high-end film scanners and cinematic post-production.
4:2:2 format. The two chroma components are sampled at half the sampling rate of luma: the horizontal chroma resolution is halved. This reduces the bandwidth of an uncompressed video signal by one-third with little to no visual difference.
4:2:0 format. In 4:2:0, the horizontal sampling is doubled compared to 4:1:1, but as the Cb and Cr channels are only sampled on each alternate line in this scheme, the vertical resolution is halved. The data rate is thus the same. Cb and Cr are each subsampled by a factor of 2 both horizontally and vertically. There are three variants of 4:2:0 schemes, having different horizontal and vertical siting.
In MPEG-2, Cb and Cr are co-sited horizontally. Cb and Cr are sited between pixels in the vertical direction (sited interstitially).
In JPEG/JFIF, H.261, and MPEG-1, Cb and Cr are sited interstitially, halfway between alternate luma samples.
In 4:2:0 DV, Cb and Cr are co-sited in the horizontal direction. In the vertical direction, they are co-sited on alternating lines.
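As a concrete illustration of the three formats above, the following sketch (not part of the patent; the function name is hypothetical) computes the dimensions of each chroma plane from the luma dimensions:

```python
# Illustrative sketch: chroma (Cb or Cr) plane dimensions implied by the
# Y'CbCr subsampling formats described above.

def chroma_plane_size(luma_w, luma_h, fmt):
    """Return (width, height) of each chroma plane for a given format."""
    if fmt == "4:4:4":      # same sampling rate, no chroma subsampling
        return luma_w, luma_h
    if fmt == "4:2:2":      # horizontal chroma resolution halved
        return luma_w // 2, luma_h
    if fmt == "4:2:0":      # halved both horizontally and vertically
        return luma_w // 2, luma_h // 2
    raise ValueError(f"unsupported format: {fmt}")
```

For example, a 1920×1080 picture in 4:2:0 carries two 960×540 chroma planes.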
2.2 codec streams for typical video encoders
Fig. 1 shows an example encoder block diagram of VVC, which contains three in-loop filtering blocks: a deblocking filter (DF), sample adaptive offset (SAO), and ALF. Unlike DF, which uses predefined filters, SAO and ALF utilize the original samples of the current picture to reduce the mean squared error between the original and reconstructed samples, by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. ALF is located at the last processing stage of each picture and can be regarded as a tool that tries to catch and fix artifacts created by the previous stages.
2.3 Intra mode coding and decoding with 67 Intra prediction modes
To capture the arbitrary edge directions presented in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted as red dotted arrows in fig. 2, and the planar and DC modes remain the same. These denser directional intra prediction modes apply to all block sizes and to both luma and chroma intra predictions.
The conventional angular intra prediction directions are defined from 45 degrees to −135 degrees in the clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes is unchanged (e.g., 67), and the intra-mode coding is unchanged.
In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra predictor using the DC mode. In VTM2, blocks can have a rectangular shape, which in the general case necessitates the use of a division operation per block. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
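A minimal sketch of the longer-side DC rule described above (the function name and the rounding convention are assumptions, not taken from the patent):

```python
# Illustrative sketch: DC prediction for rectangular blocks, averaging only
# the reference samples along the longer side so the divisor stays a power
# of two and the division becomes a shift.

def dc_predictor(top_refs, left_refs):
    w, h = len(top_refs), len(left_refs)
    refs = top_refs if w >= h else left_refs   # longer side only (top wins ties)
    n = len(refs)                              # power of two, so log2(n) is a shift
    return (sum(refs) + n // 2) >> (n.bit_length() - 1)
```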
2.4 Wide-Angle Intra prediction for non-Square blocks
In some embodiments, the conventional angular intra prediction direction is defined as 45 degrees to-135 degrees in the clockwise direction. In VTM2, a plurality of conventional intra prediction modes are adaptively replaced by a wide-angle intra prediction mode for non-square blocks. The replaced mode is signaled using the original method and is remapped to the index of the wide angle mode after parsing. The total number of intra-prediction modes is unchanged (e.g., 67), and the intra-mode codec is unchanged.
To support these prediction directions, a top reference of length 2W +1 and a left reference of length 2H +1 are defined, as defined by the examples in fig. 3A and 3B.
In some embodiments, the number of modes replaced in the wide-angle direction mode depends on the aspect ratio of the block. Table 1 shows the replaced intra prediction modes.
[Table 1 image not reproduced: it lists, for each block aspect ratio, the conventional angular intra prediction modes replaced by wide-angle modes.]
Table 1: intra prediction mode replaced by wide-angle mode
As shown in fig. 4, two vertically adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, low-pass reference sample filtering and side smoothing are applied to wide-angle prediction to reduce the negative effects of the increased gap Δpα.
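The remapping of replaced modes can be sketched as follows. This is an assumption based on the VTM-style rule the section describes; the exact thresholds come from the aspect-ratio-dependent table (Table 1) and may differ in detail:

```python
import math

# Illustrative sketch (assumed VTM-style rule): remap conventional angular
# modes to wide-angle modes for non-square blocks after parsing. Modes are
# 0 (planar), 1 (DC), 2..66 (angular); wide angles fall outside 2..66.

def remap_wide_angle(mode, w, h):
    if w == h or mode < 2 or mode > 66:
        return mode                      # square block or non-angular mode
    ratio = abs(int(math.log2(w)) - int(math.log2(h)))
    if w > h and mode < (8 + 2 * ratio if ratio > 1 else 8):
        return mode + 65                 # replaced by a wide angle beyond 66
    if h > w and mode > (60 - 2 * ratio if ratio > 1 else 60):
        return mode - 67                 # replaced by a wide angle below 2
    return mode
```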
2.5 example of location dependent intra prediction combining (PDPC).
In VTM2, the results of intra prediction of the planar mode are further modified by a position-dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method which invokes a combination of the un-filtered boundary reference samples and HEVC-style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signaling: planar, DC, horizontal, vertical, the bottom-left angular mode and its eight adjacent angular modes, and the top-right angular mode and its eight adjacent angular modes.
The prediction sample pred(x, y) is predicted using a linear combination of the intra prediction mode (DC, planar, angular) result and reference samples, according to the following equation:

pred(x, y) = ( wL × R(−1, y) + wT × R(x, −1) − wTL × R(−1, −1) + (64 − wL − wT + wTL) × pred(x, y) + 32 ) >> shift

Here, R(x, −1) and R(−1, y) represent the reference samples located at the top and left of the current sample (x, y), respectively, and R(−1, −1) represents the reference sample located at the top-left corner of the current block.
In some embodiments, and if PDPC is applied to DC, planar, horizontal and vertical intra modes, no additional boundary filters are needed as is required in the case of HEVC DC mode boundary filters or horizontal/vertical mode edge filters.
Figs. 5A-5D show the definition of the reference samples (R(x, −1), R(−1, y), and R(−1, −1)) for PDPC applied to various prediction modes. The prediction sample pred(x′, y′) is located at (x′, y′) within the prediction block. The coordinate x of the reference sample R(x, −1) is given by x = x′ + y′ + 1, and the coordinate y of the reference sample R(−1, y) is similarly given by y = x′ + y′ + 1.
In some embodiments, the PDPC weights are dependent on the prediction modes and are shown in Table 2, where S = shift.
[Table 2 image not reproduced: it lists the PDPC weights wT, wL, and wTL as functions of the sample position and S for each prediction mode.]
Table 2: examples of PDPC weights according to prediction mode
2.6 Intra sub-block partitioning
In JVET-M0102, intra sub-partitioning (ISP) is proposed, which divides luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size dimensions, as shown in Table 3. Fig. 6 and fig. 7 show examples of the two possibilities. All sub-partitions fulfill the condition of having at least 16 samples.
[Table 3 images not reproduced: 4×4 blocks are not divided; 4×8 and 8×4 blocks are split into 2 sub-partitions; all other block sizes are split into 4 sub-partitions.]
Table 3: number of sub-partitions depending on block size
For each of these sub-partitions, a residual signal is generated by entropy decoding the coefficients sent by the encoder and then inverse quantizing and inverse transforming them. Then, the sub-partition is intra predicted, and finally the corresponding reconstructed samples are obtained by adding the residual signal to the prediction signal. Therefore, the reconstructed values of each sub-partition will be available to generate the prediction of the next one, which repeats the process, and so on. All sub-partitions share the same intra mode.
Based on the intra mode and the split utilized, two different classes of processing orders are used, which are referred to as normal and reversed order. In the normal order, the first sub-partition to be processed is the one containing the top-left sample of the CU, and processing then continues downwards (horizontal split) or rightwards (vertical split). As a result, the reference samples used to generate the sub-partition prediction signals are located only at the left and above sides of the lines. On the other hand, the reverse processing order either starts with the sub-partition containing the bottom-left sample of the CU and continues upwards, or starts with the sub-partition containing the top-right sample of the CU and continues leftwards.
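The split rule summarized in Table 3 can be sketched directly (the helper name is hypothetical):

```python
# Illustrative sketch of the ISP split rule: 4x4 blocks are not divided,
# 4x8 and 8x4 blocks are split into 2 sub-partitions, and all other block
# sizes into 4, so every sub-partition has at least 16 samples.

def isp_subpartitions(w, h):
    if (w, h) == (4, 4):
        return 1                     # not divided
    if (w, h) in ((4, 8), (8, 4)):
        return 2
    return 4
```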
2.7 block differential pulse code modulation coding and decoding.
BDPCM is proposed in JVET-M0057. Due to the shape of the horizontal (resp. vertical) predictors, which use the left (A) (resp. top (B)) pixel for prediction of the current pixel, the most throughput-efficient way of processing the block is to process all the pixels of one column (resp. line) in parallel, and to process these columns (resp. lines) sequentially. In order to increase throughput, the following process is introduced: a block of width 4 is divided into two halves with a horizontal frontier when the predictor chosen on this block is vertical, and a block of height 4 is divided into two halves with a vertical frontier when the predictor chosen on this block is horizontal.
When dividing a block, samples from one region are not allowed to use pixels from another region to compute a prediction: if this happens, the prediction pixel is replaced by the reference pixel in the prediction direction. This is shown in fig. 8 for different positions of the current pixel X in a vertically predicted 4X 8 block.
Due to this property, it is now possible to process 4 × 4 blocks in 2 cycles, and 4 × 8 or 8 × 4 blocks in 4 cycles, and so on, as shown in fig. 9.
Table 4 summarizes the number of cycles required to process a block, which depends on the size of the block. It is clear that any block with both dimensions greater than or equal to 8 can be processed with 8 pixels or more per cycle.
[Table 4 image not reproduced: it gives, for 4×N and N×4 block sizes, the number of cycles and the number of pixels processed per cycle.]
Table 4: throughput of block size 4 × N, N × 4
2.8 quantized residual Domain BDPCM
In JVET-N0413, quantized residual domain BDPCM (denoted RBDPCM hereinafter) is proposed. The intra prediction is done on the entire block by sample copying in a prediction direction (horizontal or vertical prediction) similar to intra prediction. The residual is quantized, and the delta between the quantized residual and its predictor (horizontal or vertical) quantized value is coded.
For a block of size M (rows) × N (cols), let r(i, j), 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1, be the prediction residual after performing intra prediction horizontally (copying the left neighbor pixel value across the predicted block line by line) or vertically (copying the top neighbor line to each line in the predicted block) using unfiltered samples from the above or left block boundary samples. Let Q(r(i, j)), 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1, denote the quantized version of the residual r(i, j), where the residual is the difference between the original block and the predicted block values. Block DPCM is then applied to the quantized residual samples, resulting in a modified M × N array with elements r̃(i, j). When vertical BDPCM is signaled:

r̃(i, j) = Q(r(i, j)),                i = 0,       0 ≤ j ≤ N−1
r̃(i, j) = Q(r(i, j)) − Q(r(i−1, j)), 1 ≤ i ≤ M−1, 0 ≤ j ≤ N−1

For horizontal prediction, similar rules apply, and the residual quantized samples are obtained by

r̃(i, j) = Q(r(i, j)),                0 ≤ i ≤ M−1, j = 0
r̃(i, j) = Q(r(i, j)) − Q(r(i, j−1)), 0 ≤ i ≤ M−1, 1 ≤ j ≤ N−1

The residual quantized samples r̃(i, j) are sent to the decoder.
On the decoder side, the above calculations are reversed to produce Q(r(i, j)), 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1. For the vertical prediction case,

Q(r(i, j)) = Σ(k=0..i) r̃(k, j), 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1.

For the horizontal case,

Q(r(i, j)) = Σ(k=0..j) r̃(i, k), 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1.
One advantage of this scheme is that the inverse DPCM can be done on the fly during coefficient parsing, simply by adding the predictor as the coefficients are parsed, or it can be performed after parsing.
Transform skipping is always used in the quantized residual domain BDPCM.
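The quantized-residual DPCM of this section and its decoder-side inverse can be sketched as plain prefix differences and prefix sums along the prediction direction (illustrative, not the normative process):

```python
# Illustrative sketch of quantized-residual BDPCM: the DPCM step applied to
# a block q of quantized residuals Q(r), and its inverse (the running sum
# the decoder performs along the prediction direction).

def bdpcm_forward(q, vertical=True):
    m, n = len(q), len(q[0])
    if vertical:   # difference with the quantized residual one row above
        return [[q[i][j] - (q[i - 1][j] if i else 0) for j in range(n)]
                for i in range(m)]
    return [[q[i][j] - (q[i][j - 1] if j else 0) for j in range(n)]
            for i in range(m)]

def bdpcm_inverse(d, vertical=True):
    m, n = len(d), len(d[0])
    out = [row[:] for row in d]
    for i in range(m):
        for j in range(n):
            if vertical and i:
                out[i][j] += out[i - 1][j]       # accumulate down each column
            elif not vertical and j:
                out[i][j] += out[i][j - 1]       # accumulate across each row
    return out
```

The inverse exactly undoes the forward step, matching the decoder-side summation described above.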
2.9 Multiple Transform Set (MTS) in VVC
In VTM4, large block-size transforms (up to 64 × 64 in size) are allowed, which is mainly beneficial for higher resolution video (e.g., 1080p and 4K sequences). For a transform block with a size (width or height, or both) equal to 64, the high frequency transform coefficients are zeroed out, leaving only the low frequency coefficients. For example, for an M × N transform block, with M as the block width and N as the block height, when M equals 64, only the left 32 columns of transform coefficients are retained. Similarly, when N equals 64, only the upper 32 rows of transform coefficients are retained. When the transform skip mode is used for large blocks, the whole block is used without zeroing any value.
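The zero-out rule for 64-point transforms described above can be sketched as (the function name is hypothetical):

```python
# Illustrative sketch of the zero-out rule: for a transform block with a
# dimension equal to 64, only the left 32 columns (top 32 rows) of transform
# coefficients are retained; the high-frequency coefficients are zeroed.

def zero_out_high_freq(coeffs):
    m, n = len(coeffs), len(coeffs[0])       # m rows (height), n cols (width)
    keep_rows = 32 if m == 64 else m
    keep_cols = 32 if n == 64 else n
    return [[coeffs[i][j] if i < keep_rows and j < keep_cols else 0
             for j in range(n)] for i in range(m)]
```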
In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple transforms selected from DCT8/DST7. The newly introduced transform matrices are DST-VII and DCT-VIII. Table 4 below shows the basis functions of the selected DST/DCT.
Transform type | Basis function T_i(j), i, j = 0, 1, …, N−1
DCT-II | T_i(j) = ω0 · √(2/N) · cos( π · i · (2j+1) / (2N) ), where ω0 = √2/2 for i = 0 and ω0 = 1 otherwise
DCT-VIII | T_i(j) = √(4/(2N+1)) · cos( π · (2i+1) · (2j+1) / (4N+2) )
DST-VII | T_i(j) = √(4/(2N+1)) · sin( π · (2i+1) · (j+1) / (2N+1) )
Table 4: basis functions of transformation matrices for use in VVC
To keep the orthogonality of the transform matrices, they are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transformed coefficients within the 16-bit range, after the horizontal transform and after the vertical transform, all the coefficients are kept to 10 bits.
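As an illustrative check of Table 4, the floating-point DST-7 and DCT-8 matrices built from the listed basis functions have orthonormal rows, which is the orthogonality property the quantized integer matrices approximate:

```python
import math

# Illustrative numeric check: build floating-point DST-VII and DCT-VIII
# matrices from the Table 4 basis functions and verify that their rows are
# orthonormal (T . T^T ~ identity).

def dst7(n):
    return [[math.sqrt(4 / (2 * n + 1)) *
             math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]

def dct8(n):
    return [[math.sqrt(4 / (2 * n + 1)) *
             math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]

def max_orthonormality_error(t):
    n = len(t)
    return max(abs(sum(t[a][k] * t[b][k] for k in range(n)) - (a == b))
               for a in range(n) for b in range(n))
```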
In order to control the MTS scheme, separate enabling flags are specified at the SPS level for intra and inter, respectively. When MTS is enabled at the SPS, a CU-level flag is signaled to indicate whether MTS is applied. Here, MTS is applied only to luma. The MTS CU-level flag is signaled when the following conditions are satisfied:
    • Both width and height are less than or equal to 32
    • The CBF flag is equal to 1
If the MTS CU flag is equal to zero, then DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signaled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signaling mapping table is shown in Table 5. For transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point, and 32-point DCT-2. Also, other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point, and 32-point DST-7 and DCT-8, use 8-bit primary transform cores.
[Table 5 image not reproduced: MTS_CU_flag = 0 selects DCT2 in both directions; MTS_CU_flag = 1 selects DST7 or DCT8 independently for the horizontal and vertical directions via the two additional flags.]
To reduce the complexity of large sizes of DST-7 and DCT-8, the high frequency transform coefficients are zeroed out for DST-7 and DCT-8 blocks with a size (width or height, or both) equal to 32. Only the coefficients in the 16 x 16 low frequency region are retained.
As in HEVC, the residual of a block can be coded with the transform skip mode. To avoid redundancy of syntax coding, the transform skip flag is not signaled when the CU-level MTS_CU_flag is not equal to zero. The block size limitation for transform skip is the same as that for MTS in JEM4, which indicates that transform skip is applicable to a CU when both block width and height are equal to or less than 32.
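The signaling conditions of this and the preceding paragraphs can be condensed into a sketch (the parameter and function names are hypothetical, not VVC syntax elements):

```python
# Illustrative sketch of the signaling conditions above: whether the MTS CU
# flag and the transform-skip flag are coded for a luma block.

def signals_mts_cu_flag(width, height, cbf, sps_mts_enabled):
    # CU-level MTS flag: SPS-enabled, CBF == 1, both dimensions <= 32
    return sps_mts_enabled and cbf == 1 and width <= 32 and height <= 32

def signals_transform_skip_flag(width, height, mts_cu_flag):
    # transform skip is not signaled when the CU-level MTS flag is nonzero,
    # and it applies only when both dimensions are <= 32
    return mts_cu_flag == 0 and width <= 32 and height <= 32
```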
2.10 Simplified quadratic transform (RST) proposed in JVET-N0193
2.10.1 Indivisible quadratic transform (NSST) in JEM
In JEM, a quadratic (secondary) transform is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder side). As shown in fig. 10, a 4×4 (or 8×8) quadratic transform is performed depending on the block size. For example, the 4×4 quadratic transform is applied for small blocks (i.e., min(width, height) < 8), and the 8×8 quadratic transform is applied per 8×8 block for larger blocks (i.e., min(width, height) > 4).
The application of the indivisible transform is described below using the input as an example. To apply the indivisible transform, the 4×4 input block X

X = [ X00 X01 X02 X03 ; X10 X11 X12 X13 ; X20 X21 X22 X23 ; X30 X31 X32 X33 ]

is first represented as a vector x:

x = [ X00 X01 X02 X03 X10 X11 X12 X13 X20 X21 X22 X23 X30 X31 X32 X33 ]^T

The indivisible transform is calculated as F = T · x, where F indicates the transform coefficient vector, and T is a 16×16 transform matrix. The 16×1 coefficient vector F is subsequently reorganized as a 4×4 block using the scanning order for that block (horizontal, vertical, or diagonal). The coefficients with smaller indices are placed with the smaller scanning indices in the 4×4 coefficient block. There are a total of 35 transform sets, and 3 indivisible transform matrices (kernels) are used per transform set. The mapping from the intra prediction mode to the transform set is predefined. For each transform set, the selected indivisible quadratic transform (NSST) candidate is further specified by an explicitly signaled quadratic transform index, which is signaled in the bitstream once per intra CU after the transform coefficients.
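The vectorize-and-multiply step described above can be sketched as follows (illustrative; a real kernel T would come from the predefined transform sets):

```python
# Illustrative sketch of the non-separable (quadratic) transform: flatten a
# 4x4 coefficient block into a 16x1 vector and multiply it by a 16x16
# transform matrix T; the 16 outputs are then reorganized by a scan order.

def apply_nsst_4x4(block, t):
    x = [v for row in block for v in row]                 # 4x4 -> 16-vector
    return [sum(t[i][k] * x[k] for k in range(16)) for i in range(16)]
```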
2.10.2 Reduced Secondary Transform (RST) in JVET-N0193
The RST (also known as Low-Frequency Non-Separable Transform (LFNST)) was introduced in JVET-K0099, and a mapping to 4 transform sets (instead of 35 transform sets) was introduced in JVET-L0133. In JVET-N0193, 16 × 64 (further reduced to 16 × 48) and 16 × 16 matrices are employed. For notational convenience, the 16 × 64 (reduced to 16 × 48) transform is denoted as RST8 × 8 and the 16 × 16 one as RST4 × 4. Fig. 11 shows an example of RST.
RST calculation
The main idea of a Reduced Transform (RT) is to map an N-dimensional vector to an R-dimensional vector in a different space, where R/N (R < N) is the reduction factor.
The RT matrix is an R × N matrix as follows:

    T_{R×N} = [ t11 t12 t13 ... t1N
                t21 t22 t23 ... t2N
                ...
                tR1 tR2 tR3 ... tRN ]

where the R rows of the transform are R bases of the N-dimensional space. The inverse transform matrix of RT is the transpose of its forward transform. The forward and inverse RT are depicted in fig. 12.
In this contribution, RST8 × 8 with a reduction factor of 4 (1/4 size) is applied. Hence, a 16 × 64 direct matrix is used instead of 64 × 64, which is the conventional 8 × 8 non-separable transform matrix size. In other words, a 64 × 16 inverse RST matrix is used at the decoder side to generate core (primary) transform coefficients in the top-left 8 × 8 region. The forward RST8 × 8 uses 16 × 64 (or 8 × 64 for an 8 × 8 block) matrices so that it produces non-zero coefficients only in the top-left 4 × 4 region within the given 8 × 8 region. In other words, if RST is applied, the 8 × 8 region except the top-left 4 × 4 region will have only zero coefficients. For RST4 × 4, 16 × 16 (or 8 × 16 for a 4 × 4 block) direct matrix multiplication is applied.
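The reduced-transform mapping described above can be sketched with a toy R × N matrix (the function names and the selector matrix are ours; real RST matrices are trained kernels):

```python
# Toy sketch of a Reduced Transform (all names ours): a forward RT is an R x N
# matrix mapping an N-dim vector to R dims; the inverse RT applies the
# transpose. With R = 16, N = 64 this mirrors RST8x8, which keeps non-zero
# values only for the first 16 coefficients of an 8x8 region.
def forward_rt(T, x):      # T: R x N matrix, x: length-N vector -> length-R
    return [sum(T[r][n] * x[n] for n in range(len(x))) for r in range(len(T))]

def inverse_rt(T, y):      # applies the transpose of T: length-R -> length-N
    R, N = len(T), len(T[0])
    return [sum(T[r][n] * y[r] for r in range(R)) for n in range(N)]

R, N = 16, 64
# A trivially valid R x N matrix with orthonormal rows (selects the first R inputs).
T = [[1 if n == r else 0 for n in range(N)] for r in range(R)]
y = forward_rt(T, list(range(N)))
print(len(y))              # 16 coefficients survive the reduction
```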
The inverse RST is conditionally applied when the following two conditions are satisfied:
○ the block size is greater than or equal to the given threshold (W >= 4 && H >= 4);
○ the transform skip mode flag is equal to zero.
If both the width (W) and the height (H) of the transform coefficient block are greater than 4, RST8 × 8 is applied to the upper left 8 × 8 region of the transform coefficient block. Otherwise, RST4 × 4 is applied to the top left min (8, W) × min (8, H) region of the transform coefficient block.
If the RST index is equal to 0, RST is not applied. Otherwise, RST is applied and the kernel is selected with the RST index. The RST selection method and the coding of the RST index are explained later.
Furthermore, RST is used for intra CUs in both intra slices (slices) and inter slices, and for both luma and chroma. If dual tree is enabled, RST indices for luma and chroma are signaled separately. For inter-slice (dual tree disabled), a single RST index is signaled and used for luma and chroma.
In the 13th JVET meeting, Intra Sub-Partitions (ISP) was adopted as a new intra prediction mode. When ISP mode is selected, RST is disabled and the RST index is not signaled, because the performance improvement is marginal even if RST is applied to every feasible partition block. Furthermore, disabling RST for the residual of ISP prediction may reduce coding complexity.
RST selection
The RST matrix is selected from four sets of transforms, each set of transforms consisting of two transforms. Which transform set is applied is determined according to an intra prediction mode as follows:
(1) if one of the three CCLM modes is indicated, transform set 0 is selected.
(2) Otherwise, transform set selection is performed according to the following table:
Transform set selection table

    IntraPredMode               Tr. set index
    IntraPredMode < 0                1
    0 <= IntraPredMode <= 1          0
    2 <= IntraPredMode <= 12         1
    13 <= IntraPredMode <= 23        2
    24 <= IntraPredMode <= 44        3
    45 <= IntraPredMode <= 55        2
    56 <= IntraPredMode              1
The index used to access the table above (denoted as IntraPredMode) has a range of [−14, 83], which is a transformed mode index used for wide-angle intra prediction.
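Assuming the selection table above, the rule could be sketched as follows (the function name and the `is_cclm` parameter are ours):

```python
# Sketch of the RST transform set selection above, assuming the JVET-N0193
# table (function name and is_cclm parameter are ours).
def rst_transform_set(intra_pred_mode: int, is_cclm: bool = False) -> int:
    if is_cclm:                  # any of the three CCLM modes -> transform set 0
        return 0
    m = intra_pred_mode          # transformed mode index, range [-14, 83]
    if m < 0:
        return 1
    if m <= 1:
        return 0
    if m <= 12:
        return 1
    if m <= 23:
        return 2
    if m <= 44:
        return 3
    if m <= 55:
        return 2
    return 1
```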
Dimension reduced RST matrix
As a further simplification, 16 × 48 matrices are applied instead of 16 × 64 with the same transform set configuration, each of which takes 48 input data from three 4 × 4 blocks in the top-left 8 × 8 block excluding the bottom-right 4 × 4 block (as shown in fig. 13).
RST signaling
The forward RST8 × 8 with R = 16 uses 16 × 64 matrices so that it produces non-zero coefficients only in the top-left 4 × 4 region within the given 8 × 8 region. In other words, if RST is applied, the 8 × 8 region except the top-left 4 × 4 region generates only zero coefficients. As a result, the RST index is not coded when any non-zero element is detected within the 8 × 8 block region other than the top-left 4 × 4 (depicted in fig. 14), since this implies that RST was not applied. In such a case, the RST index is inferred to be zero.
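The inference rule above can be sketched as (the helper name is ours):

```python
# Sketch (helper name ours) of the inference rule above: if any significant
# coefficient lies in the 8x8 region outside the top-left 4x4 sub-block, RST
# cannot have been applied, so the RST index is not coded and inferred as zero.
def rst_index_inferred_zero(coeffs_8x8) -> bool:
    return any(coeffs_8x8[y][x] != 0
               for y in range(8) for x in range(8)
               if x >= 4 or y >= 4)
```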
Zero-out range
In general, any coefficients in the 4 × 4 sub-block may be non-zero before applying the inverse RST to the 4 × 4 sub-block. However, in some cases, some of the coefficients in the 4 x 4 sub-block must be zero before applying the inverse RST to the sub-block.
Let nonZeroSize be a variable. Any coefficient with an index not smaller than nonZeroSize, when the coefficients are rearranged into a 1-D array before the inverse RST, must be zero.
When nonZeroSize equals 16, the coefficients in the upper left 4 × 4 sub-block have no zeroing constraint.
In JVET-N0193, when the current block size is 4 × 4 or 8 × 8, nonZeroSize is set equal to 8 (i.e., coefficients with scan indices in the range [8, 15] will be 0, as shown in fig. 14). For other block sizes, nonZeroSize is set equal to 16.
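A minimal sketch of this rule (the function name is ours):

```python
# Minimal sketch of the JVET-N0193 nonZeroSize rule quoted above
# (function name is ours).
def non_zero_size(w: int, h: int) -> int:
    # 8 for 4x4 and 8x8 blocks (scan indices [8, 15] are zeroed), 16 otherwise.
    return 8 if (w, h) in ((4, 4), (8, 8)) else 16
```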
RST description in working draft
Sequence parameter set RBSP syntax
Residual coding syntax
Coding unit syntax
Sequence parameter set RBSP semantics
......
sps_st_enabled_flag equal to 1 specifies that st_idx may be present in the residual coding syntax for intra coding units. sps_st_enabled_flag equal to 0 specifies that st_idx is not present in the residual coding syntax for intra coding units.
......
Coding unit semantics
......
st _ idx [ x0] [ y0] specifies which quadratic transform core to apply between two candidate cores in the selected transform set. st _ idx [ x0] [ y0] equal to 0 specifies that no quadratic transformation is applied. The matrix indices x0, y0 specify the position of the top left sample of the transform block under consideration relative to the top left sample of the picture (x0, y 0).
When st _ idx [ x0] [ y0] is not present, it is inferred that st _ idx [ x0] [ y0] is equal to 0.
Transformation process for scaling transform coefficients
General conditions
The inputs to this process are:
- a luma location ( xTbY, yTbY ) specifying the top-left sample of the current luma transform block relative to the top-left luma sample of the current picture,
- a variable nTbW specifying the width of the current transform block,
- a variable nTbH specifying the height of the current transform block,
- a variable cIdx specifying the colour component of the current block,
- an (nTbW)x(nTbH) array d[ x ][ y ] of scaled transform coefficients with x = 0..nTbW−1, y = 0..nTbH−1.
The output of this process is the (nTbW)x(nTbH) array r[ x ][ y ] of residual samples with x = 0..nTbW−1, y = 0..nTbH−1.
If st_idx[ xTbY ][ yTbY ] is not equal to 0, the following applies:
1. The variables nStSize, log2StSize, numStX, numStY, and nonZeroSize are derived as follows:
- If both nTbW and nTbH are greater than or equal to 8, log2StSize is set to 3 and nStOutSize is set to 48.
- Otherwise, log2StSize is set to 2 and nStOutSize is set to 16.
- nStSize is set to ( 1 << log2StSize ).
- If nTbH is equal to 4 and nTbW is greater than 8, numStX is set equal to 2.
- Otherwise, numStX is set equal to 1.
- If nTbW is equal to 4 and nTbH is greater than 8, numStY is set equal to 2.
- Otherwise, numStY is set equal to 1.
- If both nTbW and nTbH are equal to 4, or both nTbW and nTbH are equal to 8, nonZeroSize is set equal to 8.
- Otherwise, nonZeroSize is set equal to 16.
2. For xSbIdx = 0..numStX−1 and ySbIdx = 0..numStY−1, the following applies:
- The variable array u[ x ] with x = 0..nonZeroSize−1 is derived as follows:
xC = ( xSbIdx << log2StSize ) + DiagScanOrder[ log2StSize ][ log2StSize ][ x ][ 0 ]
yC = ( ySbIdx << log2StSize ) + DiagScanOrder[ log2StSize ][ log2StSize ][ x ][ 1 ]
u[ x ] = d[ xC ][ yC ]
- u[ x ] with x = 0..nonZeroSize−1 is transformed to the variable array v[ x ] with x = 0..nStOutSize−1 by invoking the one-dimensional transformation process specified in clause 8.7.4.4 with the transform input length nonZeroSize of the scaled transform coefficients, the transform output length nStOutSize, the list u[ x ] with x = 0..nonZeroSize−1, the index for transform set selection stPredModeIntra, and the index for transform selection in a transform set st_idx[ xTbY ][ yTbY ] as inputs, and the list v[ x ] with x = 0..nStOutSize−1 as output. The variable stPredModeIntra is set to the predModeIntra specified in clause 8.4.4.2.1.
- The array d[ ( xSbIdx << log2StSize ) + x ][ ( ySbIdx << log2StSize ) + y ] with x = 0..nStSize−1, y = 0..nStSize−1 is derived as follows:
- If stPredModeIntra is less than or equal to 34, or equal to INTRA_LT_CCLM, INTRA_T_CCLM, or INTRA_L_CCLM, the following applies:
d[ ( xSbIdx << log2StSize ) + x ][ ( ySbIdx << log2StSize ) + y ] = ( y < 4 ) ? v[ x + ( y << log2StSize ) ] : ( ( x < 4 ) ? v[ 32 + x + ( ( y − 4 ) << 2 ) ] : d[ ( xSbIdx << log2StSize ) + x ][ ( ySbIdx << log2StSize ) + y ] )
- Otherwise, the following applies:
d[ ( xSbIdx << log2StSize ) + x ][ ( ySbIdx << log2StSize ) + y ] = ( y < 4 ) ? v[ y + ( x << log2StSize ) ] : ( ( x < 4 ) ? v[ 32 + ( y − 4 ) + ( ( x − 4 ) << 2 ) ] : d[ ( xSbIdx << log2StSize ) + x ][ ( ySbIdx << log2StSize ) + y ] )
Secondary transformation process
The inputs to this process are:
- a variable nTrS specifying the transform output length,
- a variable nonZeroSize specifying the transform input length,
- a list of transform input x[ j ] with j = 0..nonZeroSize−1,
- a variable stPredModeIntra specifying the index for transform set selection,
- a variable stIdx specifying the index for transform selection in a set.
The output of this process is the list of transformed samples y[ i ] with i = 0..nTrS−1.
The transformation matrix derivation process specified in clause 8.7.4.5 is invoked with the transform output length nTrS, the index for transform set selection stPredModeIntra, and the index for transform selection in a transform set stIdx as inputs, and the transformation matrix secTransMatrix as output.
The list of transformed samples y[ i ] with i = 0..nTrS−1 is derived as follows:

    y[ i ] = Clip3( CoeffMin, CoeffMax, ( ( Σ_{j = 0}^{nonZeroSize − 1} secTransMatrix[ j ][ i ] * x[ j ] ) + 64 ) >> 7 ), with i = 0..nTrS−1,

where CoeffMin = −( 1 << 15 ) and CoeffMax = ( 1 << 15 ) − 1.
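Under the formula above, the process can be sketched in Python as follows (variable and function names are ours; real secTransMatrix entries come from the matrix derivation process of clause 8.7.4.5, and the matrix used in the test is a toy):

```python
# Sketch of the secondary transformation derivation above (names ours). Each
# output sample is a dot product over the nonZeroSize inputs, rounded, right-
# shifted by 7, and clipped to the 16-bit coefficient range.
COEFF_MIN, COEFF_MAX = -(1 << 15), (1 << 15) - 1

def clip3(lo, hi, v):
    return lo if v < lo else hi if v > hi else v

def secondary_transform(x, sec_trans_matrix, n_tr_s):
    non_zero = len(x)  # transform input length nonZeroSize
    return [clip3(COEFF_MIN, COEFF_MAX,
                  (sum(sec_trans_matrix[j][i] * x[j]
                       for j in range(non_zero)) + 64) >> 7)
            for i in range(n_tr_s)]
```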
Secondary transformation matrix derivation process
The inputs to this process are:
a variable nTrS specifying the transform output length,
a variable stPredModeIntra specifying the index for the transform set selection,
a variable stIdx specifying the index for transform selection in a transform set.
The output of this process is the transformation matrix secTransMatrix.
The variable stTrSetIdx is derived from stPredModeIntra as follows:

    stPredModeIntra < 0              : stTrSetIdx = 1
    0 <= stPredModeIntra <= 1        : stTrSetIdx = 0
    2 <= stPredModeIntra <= 12       : stTrSetIdx = 1
    13 <= stPredModeIntra <= 23      : stTrSetIdx = 2
    24 <= stPredModeIntra <= 44      : stTrSetIdx = 3
    45 <= stPredModeIntra <= 55      : stTrSetIdx = 2
    56 <= stPredModeIntra            : stTrSetIdx = 1
the transformation matrix secTransMatrix is derived based on nTrS, stTrSetIdx and stIdx as follows:
if nTrS equals 16, stTrSetIdx equals 0, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS is equal to 16, stTrSetIdx is equal to 0, and stIdx is equal to 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 16, stTrSetIdx equals 1, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 16, stTrSetIdx equals 1, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS is equal to 16, stTrSetIdx is equal to 2, and stIdx is equal to 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 16, stTrSetIdx equals 2, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 16, stTrSetIdx equals 3, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS is equal to 16, stTrSetIdx is equal to 3, and stIdx is equal to 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 0, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 0, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 1, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 1, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 2, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 2, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 3, and stIdx equals 1, then the following applies:
SecTransMatrix[m][n]=...
if nTrS equals 48, stTrSetIdx equals 3, and stIdx equals 2, then the following applies:
SecTransMatrix[m][n]=...
2.11 Clipping of dequantization in HEVC
In HEVC, the scaled transform coefficient d' is calculated as
d'=Clip3(coeffMin,coeffMax,d),
Where d is the scaled transform coefficient before clipping.
For the luma component,
CoeffMin = CoeffMinY, CoeffMax = CoeffMaxY;
for the chroma components,
CoeffMin = CoeffMinC, CoeffMax = CoeffMaxC.
Herein,
CoeffMinY = −( 1 << ( extended_precision_processing_flag ? Max( 15, BitDepthY + 6 ) : 15 ) )
CoeffMinC = −( 1 << ( extended_precision_processing_flag ? Max( 15, BitDepthC + 6 ) : 15 ) )
CoeffMaxY = ( 1 << ( extended_precision_processing_flag ? Max( 15, BitDepthY + 6 ) : 15 ) ) − 1
CoeffMaxC = ( 1 << ( extended_precision_processing_flag ? Max( 15, BitDepthC + 6 ) : 15 ) ) − 1
extended_precision_processing_flag is a syntax element signaled in the SPS.
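The four formulas above differ only in the bit depth used, so they collapse into one helper; a sketch (the function name is ours):

```python
# Sketch of the HEVC CoeffMin/CoeffMax formulas above (function name is ours).
def coeff_range(bit_depth: int, extended_precision_processing_flag: bool):
    n = max(15, bit_depth + 6) if extended_precision_processing_flag else 15
    return -(1 << n), (1 << n) - 1   # (CoeffMin, CoeffMax)

print(coeff_range(8, False))   # (-32768, 32767): the default 16-bit range
```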
2.12 affine Linear weighted Intra prediction (ALWIP, or matrix-based Intra prediction)
Affine linear weighted intra prediction (ALWIP, or matrix-based intra prediction (MIP)) is proposed in JVET-N0217.
In JVET-N0217, two tests were performed. In test 1, ALWIP is designed with a memory restriction of 8 Kbytes and at most 4 multiplications per sample. Test 2 is similar to test 1, but further simplifies the design in terms of memory requirement and model architecture:
○ A single set of matrices and offset vectors is used for all block shapes.
○ The number of modes is reduced to 19 for all block shapes.
○ The memory requirement is reduced to 5760 10-bit values, i.e., 7.20 kilobytes.
○ Linear interpolation of the predicted samples is performed in a single step per direction, replacing the iterative interpolation of the first test.
2.13 sub-block transformations
For an inter-predicted CU with cu_cbf equal to 1, cu_sbt_flag may be signaled to indicate whether the whole residual block or only a sub-part of the residual block is coded. In the former case, inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with an inferred adaptive transform and the other part of the residual block is zeroed out. SBT is not applied to the combined inter-intra mode.
In the sub-block transform, position-dependent transforms are applied to the luma transform blocks in SBT-V and SBT-H (the chroma TB always uses DCT-2). The two positions of SBT-H and SBT-V are associated with different core transforms. More specifically, the horizontal and vertical transforms for each SBT position are specified in fig. 15. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the corresponding transform is set to DCT-2. Therefore, the sub-block transform jointly specifies the TU tiling, cbf, and the horizontal and vertical transforms of a residual block, which may be considered a syntax shortcut for the case in which the major residual of a block is on one side of the block.
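The position-dependent selection can be sketched as follows. The encoding of fig. 15 here is ours; only the position-0 pairs and the greater-than-32 DCT-2 fallback are stated in the text above, and the position-1 pairs (DST-7/DST-7) are an assumption:

```python
# Sketch of the position-dependent SBT transform selection above (table
# encoding is ours; position-1 entries are assumed DST-7/DST-7).
def sbt_luma_transforms(direction: str, position: int, w: int, h: int):
    table = {("SBT-V", 0): ("DCT-8", "DST-7"), ("SBT-V", 1): ("DST-7", "DST-7"),
             ("SBT-H", 0): ("DST-7", "DCT-8"), ("SBT-H", 1): ("DST-7", "DST-7")}
    hor, ver = table[(direction, position)]
    if w > 32:                    # a side longer than 32 falls back to DCT-2
        hor = "DCT-2"
    if h > 32:
        ver = "DCT-2"
    return hor, ver
```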
3 example of defects in existing implementations
The current design of MTS presents the following problems:
the signaled index may result in an overhead bit.
Some combinations of transforms may not be efficient in MTS and SBT.
In the current VVC, the Transform Skip (TS) flag is coded regardless of whether the current block is coded in QR-BDPCM mode. However, when QR-BDPCM is enabled, no transform needs to be applied. Thus, the signaling of the TS flag is redundant when a block is coded with QR-BDPCM.
In the current VVC, the transform skip flag is context coded with a context that may also be used to code a bin indicating whether the transform matrix is DCT-2. Sharing the context may be inefficient.
There are 5 context-coded bins with 9 contexts used to code the transform matrix index, which affects parsing throughput.
The QR-BDPCM/TS may also be applied to chroma blocks. How to better determine the use of QR-BDPCM/TS requires further investigation.
4 exemplary method for multiple transformations
Embodiments of the presently disclosed technology overcome the shortcomings of existing implementations, providing video codecs with higher codec efficiency but lower computational complexity. The method for multiple transforms as described in this document, which may enhance existing and future video codec standards, is set forth in the examples described below for various implementations. The examples of the disclosed technology provided below illustrate the general concepts and are not meant to be construed as limiting. In examples, features described in these examples may be combined unless explicitly stated to the contrary.
In the following examples, assume that:
○ Max(x, y) returns the larger of x and y,
○ Min(x, y) returns the smaller of x and y.
Implicit MTS
It is proposed to decide the transform (horizontal transform and/or vertical transform) to apply to a block based on decoded coefficients without receiving a transform index.
1. The decoded coefficients may be associated with one or more representative blocks of the same color component or different color components as the current block.
a. In one example, the determination of the transform for the first block may depend on the decoded coefficients of the first block.
b. In one example, the determination of the transform for the first block may depend on decoded coefficients of the second block, which may be different from the first block.
i. In one example, the second block may be in the same color component (e.g., a luminance component) as the color component of the first block.
1) In one example, the second block may be adjacent to the first block.
2) In one example, the second block may have the same intra prediction mode as the first block.
3) In one example, the second block may have the same block dimensions as the first block.
4) In one example, the second block may be the last decoded block before the first block in decoding order that satisfies certain conditions (e.g., the same intra prediction mode or the same dimension).
in one example, the second block may be in a different color component than the color component of the first block.
1) In one example, the first block may be in a luma component and the second block may be in a chroma component (e.g., Cb/Cr, B/R component).
a) In one example, the three blocks are in the same codec unit.
b) Further alternatively, the implicit MTS is applied only to luma blocks and not to chroma blocks.
2) In one example, the first block in the first color component and the second block in the second color component may be located at corresponding positions of the picture with respect to each other.
c. In one example, the determination of the transform for the first block may depend on decoded coefficients of a plurality of blocks including at least one block different from the first block.
i. In one example, the plurality of blocks may include a first block.
in one example, the plurality of blocks may include one block or a plurality of blocks adjacent to the first block.
in one example, the plurality of blocks may include one block or a plurality of blocks having the same block dimension as the first block.
in one example, the plurality of blocks may include the last N decoded blocks prior to the first block in decoding order that satisfy a particular condition (e.g., the same intra prediction mode or the same dimension). N is an integer greater than 1.
v. in one example, the plurality of blocks may include one or more blocks that are not in the same color component as the first block.
1) In one example, the first block may be in a luminance component. The plurality of blocks may include blocks in the chroma component (e.g., a second block in the Cb/B component, and a third block in the Cr/R component).
a) In one example, the three blocks are in the same codec unit.
b) Further alternatively, the implicit MTS is applied only to luma blocks and not to chroma blocks.
2) In one example, the first block in the first color component and the blocks of the plurality that are not in the first color component may be located at corresponding positions of the picture with respect to each other.
2. The decoded coefficients used for the transform determination are coefficients not equal to zero (denoted as significant coefficients). The coefficients used for the transform determination are referred to as representative coefficients.
a. In one example, the representative coefficients are all significant coefficients in the representative block.
b. Alternatively, the representative coefficients are partial significant coefficients in the representative block.
i. In one example, the representative coefficients are decoded significant coefficients greater than or not greater than a threshold.
ii. In one example, the representative coefficients are decoded significant coefficients less than or not less than a threshold.
iii. In one example, the representative coefficients are the first K (K >= 1) decoded significant coefficients in decoding order.
iv. In one example, the representative coefficients are the last K (K >= 1) decoded significant coefficients in decoding order.
v. in one example, the representative coefficients may be those at predetermined locations in the block.
1) In one example, the representative coefficients may include only one coefficient located at the (xPos, yPos) coordinate relative to the representative block. For example, xPos = yPos = 0.
2) For example, the location may depend on the dimensions of the block.
In one example, the representative coefficients may be those at predetermined positions in the coefficient scan order.
c. Alternatively, the representative coefficients may also include zero coefficients.
d. Alternatively, the representative coefficients may be coefficients derived from decoded coefficients, for example by clipping to a range via quantization.
3. The transform determination may depend on a function of the representative coefficients, such as a function with a value V as output and the representative coefficients as input.
a. In one example, V is derived as the number of representative coefficients.
i. Alternatively, V is derived as the sum of the representative coefficients.
1) Further, alternatively, the sum may be clipped to get V.
Alternatively, V is derived as the sum of the absolute values of the representative coefficients.
1) Further, alternatively, the sum may be clipped to get V.
b. In one example, the selection may be implicitly determined at the decoder based on the parity of V.
i. For example, if V is an even number, then the first type of transform is selected as a horizontal transform and the second type of transform is selected as a vertical transform; if V is an odd number, the third type of transform is selected as a horizontal transform and the fourth type of transform is selected as a vertical transform.
1) In one example, the first type of transformation is the same as the second type of transformation.
a) Alternatively, the first type of transformation is different from the second type of transformation.
2) In one example, the third type of transform is the same as the fourth type of transform.
a) Alternatively, the third type of transform is different from the fourth type of transform.
3) The first/second/third/fourth type of transform is a specific transform such as DCT-X or DST-Y. X may be an integer, such as 2 or 8. Y may be an integer, such as 7 or 8.
4) Further alternatively, at least one of the third and fourth types of transforms is different from the first and second types of transforms.
a) In one example, the first and second types of transforms are DCT-2 when V is even, and the third and fourth types of transforms are DST-7 when V is odd.
b) Alternatively, the first and second types of transforms are DCT-2 when V is odd, and the third and fourth types of transforms are DST-7 when V is even.
c. In one example, if V is less than the threshold T1, the fifth type of transform is selected as the horizontal transform and the sixth type of transform is selected as the vertical transform. For example, T1 is 1 or 2.
i. Alternatively, if V is greater than the threshold T2, the fifth type of transform is selected as the horizontal transform and the sixth type of transform is selected as the vertical transform.
For example, the threshold may depend on the dimensions of the block.
For example, the threshold may depend on QP.
iv. In one example, the fifth type of transform is the same as the sixth type of transform.
1) Alternatively, the fifth type of transform is different from the sixth type of transform.
v. In one example, the fifth/sixth type of transform is a specific transform, such as DCT-X or DST-Y. X may be an integer, such as 2 or 8. Y may be an integer, such as 7 or 8.
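Items 2-3 above can be sketched as one hedged example (all names are ours; V is defined here as the count of significant coefficients, which is only one of the proposed options, and the DCT-2/DST-7 parity pairing is the example pairing given above):

```python
# Hedged sketch of implicit MTS via items 2-3 above (names ours): V is derived
# from the significant (non-zero) decoded coefficients, here as their count,
# and the parity of V implicitly selects the transform pair.
def implicit_mts_select(coeffs):
    representative = [c for c in coeffs if c != 0]   # significant coefficients
    v = len(representative)                          # one proposed definition of V
    t = "DCT-2" if v % 2 == 0 else "DST-7"
    return t, t                                      # (horizontal, vertical)

print(implicit_mts_select([3, 0, -1, 0, 2]))  # 3 significant -> ('DST-7', 'DST-7')
```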
4. The transform determination may also depend on the coded information of the current block.
a. In one example, if the current intra coded block is in an I-slice/picture, DST-7 may be applied to the current block when V is even, and DCT-2 may be applied to the current block when V is odd.
b. In one example, if the current intra coded block is in a P/B-slice/picture, DCT-2 may be applied to the current block when V is even, and DST-7 may be applied to the current block when V is odd.
c. In one example, the determination may further depend on mode information (e.g., intra-frame or inter-frame).
5. A set of transforms from which an implicit MTS transform can be selected can be predefined.
a. In one example, the horizontal and vertical transform sets may not be identical.
i. Alternatively, the horizontal and vertical transform sets may be identical.
b. In one example, the set of transforms may include DCT-2 and DST-7.
c. In one example, the set of transforms may include DCT-2, DST-7, and the identity transform.
d. In one example, the transform set may depend on the information that is coded, the color components, the partitioning structure (e.g., dual tree/single tree; quad tree/binary tree/ternary tree/extended quad tree), the stripe/picture type, etc.
i. In one example, the set of transforms may depend on the block dimensions.
in one example, for an intra coded block, DCT-2 and DST-7 may be included.
in one example, DST-7 and identity transforms (i.e., no transforms applied) may be included for blocks that are coded with reference samples in the same picture (e.g., intra block copy).
6. In one example, one or more of the methods disclosed in items 1-5 can only be applied to a particular block.
a. For example, one or more of the methods disclosed in items 1-5 can only be applied to intra-coded blocks.
Simplified MTS/SBT
7. In one example, the vertical and horizontal transforms must be the same in the MTS.
a. In one example, a block can only select one of two types of transitions:
i. DCT-2 in the horizontal transform and DCT-2 in the vertical transform;
DST-7 in horizontal transform and DST-7 in vertical transform;
b. in one example, the signaling for the MTS may include at most one flag for the block.
i. In one example, if the flag is equal to 0, DCT-2 is used as both the horizontal and the vertical transform; if the flag is equal to 1, DST-7 is used as both.
ii. In one example, if the flag is equal to 1, DCT-2 is used as both the horizontal and the vertical transform; if the flag is equal to 0, DST-7 is used as both.
8. Assume the width and height of the block are W and H, respectively. Item 7 is applied only when:
a. W >= T1 and H >= T2, e.g., T1 = T2 = 8;
b. W = T1 and H = T2, e.g., T1 = T2 = 16;
c. Min(W, H) >= T1, e.g., T1 = 8;
d. Max(W, H) <= T1, e.g., T1 = 32;
e. W = T1, e.g., T1 = 64;
f. W × H <= T1, e.g., T1 = 256.
9. In one example, transforms other than DCT-8 may be applied in blocks coded with SBT.
a. In one example, only DCT-2 and DST-7 may be applied to blocks coded with SBT.
b. In one example, in the case of (SBT-V, position 0) for SBT as shown in FIG. 15, DCT-2 is applied horizontally and DST-7 is applied vertically.
c. In one example, in the case of (SBT-H, position 0) for SBT as shown in FIG. 15, DST-7 is applied horizontally and DCT-2 is applied vertically.
10. Assume that the width and height of the transform block are W and H, respectively. In one example, the selection of a transform for a block coded with SBT may depend on the transform block dimension, where the transform block may be smaller than the coded block when SBT is applied.
a. In one example, in the case of (SBT-V, position 0) for SBT as shown in fig. 15, if W >= T1, DCT-2 is applied horizontally and DST-7 is applied vertically; otherwise, DCT-8 is applied horizontally and DST-7 is applied vertically. For example, T1 = 8.
b. In one example, in the case of (SBT-H, position 0) for SBT as shown in fig. 15, if H >= T1, DST-7 is applied horizontally and DCT-2 is applied vertically; otherwise, DST-7 is applied horizontally and DCT-8 is applied vertically. For example, T1 = 8.
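A sketch of bullet a. above (the function name and the default T1 = 8 are ours):

```python
# Sketch of the (SBT-V, position 0) rule above (name and default T1 ours):
# DCT-2 replaces DCT-8 horizontally when the transform block is wide enough.
def sbt_v_pos0_transforms(w: int, t1: int = 8):
    return ("DCT-2", "DST-7") if w >= t1 else ("DCT-8", "DST-7")

print(sbt_v_pos0_transforms(16))  # ('DCT-2', 'DST-7')
print(sbt_v_pos0_transforms(4))   # ('DCT-8', 'DST-7')
```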
General features
11. The decision of the transform matrix may be done in the CU/CB level or in the TU level.
a. In one example, the decision is made in the CU level, where all TUs share the same transform matrix.
i. Further alternatively, when one CU is divided into a plurality of TUs, the transform matrix may be determined using coefficients in one TU (e.g., the first or last TU) or some or all of the TUs.
b. Whether a CU-level solution or a TU-level solution is used may depend on the block size and/or the VPDU size and/or the maximum CTU size of a block and/or the information that is coded.
i. In one example, the CU level determination method may be applied when the block size is larger than the VPDU size.
12. In the method disclosed in this document, the coefficients or representative coefficients may be quantized or dequantized.
13. Transform skipping may also be determined implicitly by the coefficients or representative coefficients using any of the methods disclosed in this document.
14. In the methods disclosed in this document, the coefficients or representative coefficients may be modified before being used to derive the transform.
a. For example, the coefficients or representative coefficients may be clipped prior to use in deriving the transform.
b. For example, the coefficients or representative coefficients may be scaled before being used to derive the transform.
c. For example, the coefficients or representative coefficients may be added with an offset before being used to derive the transform.
d. For example, the coefficients or representative coefficients may be filtered before being used to derive the transform.
e. For example, the coefficients or representative coefficients may be mapped to other values (e.g., via a look-up table) prior to being used to derive the transform.
15. The methods disclosed in this document may also be used to implicitly derive other codec modes/information from the coefficients or representative coefficients.
a. In one example, the disclosed method may be used to derive a quadratic transform that may be applied to a sub-region of a block.
b. Further alternatively, the representative coefficients are from coefficients corresponding to sub-regions rather than the entire block.
16. In one example, whether and/or how to apply the methods disclosed above may be signaled at the sequence level/picture level/slice level/tile group level, such as in a sequence header/picture header/SPS/VPS/DPS/PPS/APS/slice header/tile group header.
17. In one example, whether and/or how the above disclosed methods are applied may depend on the codec information, which may include:
a. the block dimension.
i. In one example, the implicit MTS method described above may be applied for blocks whose width and height are not greater than a threshold (e.g., 32).
b. QP
c. Picture or slice type (e.g. I-frame or P/B-frame, I-slice or P/B-slice)
i. In one example, the proposed method may be enabled on I-frames, but disabled on P/B-frames.
d. Structure division method (Single tree or double tree)
i. In one example, the implicit MTS method described above may be applied for slices/pictures/bricks/tiles to which single-tree partitioning is applied.
e. Codec modes (e.g., inter mode/intra mode/IBC mode, etc.)
i. In one example, for an intra coded block, the implicit MTS method described above may be applied.
f. Codec methods (e.g., intra sub-partitioning (ISP), Derived Tree (DT) methods, etc.)
i. In one example, the implicit MTS method described above may be disabled for intra-coded blocks to which DT is applied.
ii. In one example, the implicit MTS method described above may be disabled for intra-coded blocks to which ISP is applied.
g. Color component
i. In one example, the implicit MTS method described above may be applied for luma blocks, and not applied for chroma blocks.
h. Intra prediction modes (e.g., DC, vertical, horizontal, etc.)
i. Motion information (e.g., MV and reference index).
j. Standard profile/level/hierarchy
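The example conditions of item 17 can be combined into a single check as sketched below; the threshold of 32 comes from 17.a.i, and this particular combination of conditions is illustrative only, since each sub-bullet is an independent example.

```python
def implicit_mts_allowed(width, height, is_intra, is_luma,
                         single_tree, uses_isp, size_thresh=32):
    """Decide whether the implicit MTS method may be applied to a block.

    Each condition mirrors one sub-bullet of item 17: block-dimension
    limit (17.a.i), intra-coded blocks only (17.e.i), luma only
    (17.g.i), single-tree partitioning (17.d.i), and disabled when
    ISP applies (17.f.ii).
    """
    if width > size_thresh or height > size_thresh:
        return False  # 17.a.i block-dimension limit
    if not is_intra:
        return False  # 17.e.i intra-coded blocks only
    if not is_luma:
        return False  # 17.g.i luma blocks only
    if not single_tree:
        return False  # 17.d.i single-tree partitioning
    if uses_isp:
        return False  # 17.f.ii disabled when ISP applies
    return True
```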
Coding and decoding tool for chrominance component
18. It is proposed that a codec tool X may be applied to one or more chroma components of a block depending on whether the codec tool X is applied to one or more corresponding luma blocks. In the following discussion, "chroma component" may refer to "one or more chroma components".
a. In one example, the use of codec tool X for chroma blocks is derived from information whether the codec tool is applied to the corresponding luma block. Therefore, no additional signaling of the use of codec tool X for chroma blocks is required.
i. In one example, if codec tool X is applied to a corresponding luma block, codec tool X may be applied to the chroma components of the block; and, if codec tool X is not applied to the corresponding luma block, codec tool X is not applied to the chroma components of the block.
ii. In one example, when codec tool X is applied to a corresponding luma block, codec tool X may be applied to the luma component and the chroma components in the same manner.
b. In one example, a notification message (e.g., a flag or index) may be conditionally signaled to indicate whether codec tool X is applied to the chroma component of the block. The condition may be defined as whether the codec tool X is applied to the corresponding luminance block. Additionally, alternatively, if codec tool X is not applied to the corresponding luma block, codec tool X is not applied to the chroma components of the block without signaling.
i. In one example, when codec tool X is applied to a corresponding luma block and the message indicates that codec tool X is also applied to a chroma component, codec tool X may be applied to the luma component and the chroma component in the same manner.
ii. In one example, codec tool X may be applied to the luma component and the chroma component in different ways.
1) It can be signaled how to apply codec tool X to the luma component and the chroma component, respectively.
c. In the above discussion, a "corresponding luma block" may refer to a luma block that covers at least one "corresponding sample" of a chroma block. The sample positions may be scaled according to the color format, e.g., 4:4:4 or 4:2:0. Assume that the top-left position of the chroma block is (x0, y0) and the width and height of the chroma block are W and H, all given in luma sample units.
i. In one example, the corresponding sample may be at (x0, y0);
ii. In one example, the corresponding sample may be at (x0+W-1, y0+H-1);
iii. In one example, the corresponding sample may be at (x0+W/2-1, y0+H/2-1);
iv. In one example, the corresponding sample may be at (x0+W/2, y0+H/2);
v. In one example, the corresponding sample may be at (x0+W/2, y0+H/2-1);
vi. In one example, the corresponding sample may be at (x0+W/2-1, y0+H/2).
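The listed alternatives for the "corresponding sample" can be sketched as below; the function name and the `variant` index are illustrative, and the inputs are assumed already scaled to luma sample units, as the text states.

```python
def corresponding_luma_sample(x0, y0, w, h, variant=0):
    """Return one 'corresponding sample' position of a chroma block.

    (x0, y0) is the top-left corner of the chroma block and (w, h) its
    width and height, all in luma sample units. `variant` selects one
    of the six alternatives listed above, in order.
    """
    positions = [
        (x0, y0),                          # top-left corner
        (x0 + w - 1, y0 + h - 1),          # bottom-right corner
        (x0 + w // 2 - 1, y0 + h // 2 - 1),
        (x0 + w // 2, y0 + h // 2),
        (x0 + w // 2, y0 + h // 2 - 1),
        (x0 + w // 2 - 1, y0 + h // 2),
    ]
    return positions[variant]
```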
19. It is proposed that a coding tool X may be applied to one chroma component of a block depending on whether the coding tool X is applied to one or more corresponding blocks of another chroma component.
a. In one example, the use of codec tool X for chroma blocks is derived from information whether the codec tool is applied to corresponding blocks of other chroma components. Therefore, no additional signaling of the use of codec tool X for chroma blocks is required.
b. In one example, a notification message (e.g., a flag) may be conditionally signaled to indicate whether codec tool X is applied to the chroma component of the block. The condition may be defined as whether codec tool X is applied to the corresponding blocks of the other chroma components. Additionally, alternatively, if codec tool X is not applied to corresponding blocks of other chroma components, codec tool X is not applied to chroma components of the blocks without signaling.
20. It is proposed that a coding tool X may be applied to the luminance component of a block depending on whether the coding tool X is applied to one or more corresponding blocks of the chrominance component.
a. In one example, the use of codec tool X for a luma block is derived from information on whether the codec tool is applied to the corresponding block of the chroma component. Therefore, no additional signaling of the use of codec tool X for luma blocks is required.
b. In one example, a notification message (e.g., a flag) may be conditionally signaled to indicate whether codec tool X is applied to the luma component of a block. The condition may be defined as whether codec tool X is applied to the corresponding block of the chroma component. Alternatively, if codec tool X is not applied to the corresponding block of the chroma component, codec tool X is not applied to the luma component of the block without signaling.
21. The codec tool X may be defined as follows.
a. In one example, codec tool X may be an MTS.
b. In one example, codec tool X may be transform skip.
c. In one example, codec tool X may be RST.
d. In one example, codec tool X may be QR-BDPCM.
e. In one example, codec tool X may be BDPCM.
Signaling of MTS index and transform skip flag
22. It is proposed that the MTS index and/or the transform skip flag may be conditionally signaled depending on the use of BDPCM or QR-BDPCM or any variant of BDPCM.
a. In one example, when BDPCM or QR-BDPCM or any variant of BDPCM is enabled for a block in the bitstream (e.g., intra_bdpcm_flag is equal to true), the MTS index and/or transform skip flag may not be signaled for the block.
i. Further alternatively, the transform skip flag of a block may be inferred to be true when BDPCM or QR-BDPCM or any variant of BDPCM is enabled in the block.
b. In one example, when BDPCM or QR-BDPCM or any variant of BDPCM is enabled in a block, the MTS index of the block may be inferred to be 0.
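The inference rules of item 22 can be sketched as a decoder-side helper; the function name and the tuple return shape are assumptions for illustration.

```python
def infer_transform_signals(bdpcm_enabled, signaled_mts_idx=None,
                            signaled_ts_flag=None):
    """Return (mts_index, transform_skip_flag) for a block.

    Per item 22: when BDPCM (or QR-BDPCM, or a variant) is enabled,
    neither syntax element is parsed; the MTS index is inferred to be 0
    and the transform skip flag is inferred to be true. Otherwise the
    signaled values are used.
    """
    if bdpcm_enabled:
        return 0, True
    return signaled_mts_idx, signaled_ts_flag
```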
23. Fixed-length coding may be applied to code the MTS types other than TS and DCT-2, such as DST7-DST7, DCT8-DST7, DST7-DCT8 and DCT8-DCT8 in the VVC specification.
a. In one example, a fixed-length code with 2 bins may be applied.
b. In one example, each bin may be context-coded.
c. In one example, the first bin or the last bin may be context-coded, while the remaining bins are bypass-coded.
d. In one example, all bins are bypass-coded.
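A 2-bin fixed-length code over the four remaining MTS types (item 23.a) can be sketched as below; the particular bin-to-type assignment is an assumption, not fixed by the text.

```python
# Hypothetical 2-bin fixed-length code for the four MTS types of item 23.
FL_CODE = {
    "DST7-DST7": (0, 0),
    "DCT8-DST7": (0, 1),
    "DST7-DCT8": (1, 0),
    "DCT8-DCT8": (1, 1),
}

def decode_fixed_length(two_bins):
    """Map exactly two bins back to an MTS type (item 23.a)."""
    inverse = {bins: mts for mts, bins in FL_CODE.items()}
    return inverse[tuple(two_bins)]
```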
24. Context modeling of the transform matrix index (e.g., TS, DCT-2, other transform matrices) may depend on the coded mode, the transform block size, and/or the QT depth and/or MTT depth and/or BT depth and/or TT depth.
a. In one example, context modeling for transform matrix index may depend on the coding mode of the block, e.g., whether the block is coded in intra/inter/IBC mode.
b. In one example, context modeling for transformation matrix indexing may depend on a function of multiple partition depths, which may include QT depth, MTT depth, BT depth, TT depth.
c. In one example, the context modeling of the transform matrix index may depend on the transform depth of the TU/TB relative to the CU/PU.
d. In one example, the context index increment may be set as a function of the TU/TB dimensions.
i. In one example, the context index increment may be set to ((Log2(TbW) + Log2(TbH)) >> 1) - X, where TbW and TbH indicate the width and height of the transform block, and X is an integer (such as X = 2).
ii. In one example, the context index increment may be set to Log2(max(TbW, TbH)) - X, where TbW and TbH indicate the width and height of the transform block, X is an integer (such as X = 2), and max(a, b) returns the larger value.
iii. In one example, the context index increment may be set to Log2(min(TbW, TbH)) - X, where TbW and TbH indicate the width and height of the transform block, X is an integer (such as X = 2), and min(a, b) returns the smaller value.
iv. The above context index increment may be further clipped to a range, e.g., [k0, k1], where k0 and k1 are integers.
e. In one example, the context index increment may be set as a function of TU/TB width or height.
f. In one example, the context index delta may be set as a function of the MTT depth.
g. In one example, the context index increment may be set to min (K, quadtree depth), where the function min (a, b) returns a smaller value between a and b, K being an integer such as 4 or 5.
h. In one example, the above method may be applied to code specific bins used in matrix index coding.
i. In one example, the bin used to indicate whether it is DCT2 (e.g., the first bin of tu_mts_idx) is context coded, and the context modeling is based on the method described above.
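The context-index-increment formulas of item 24.d can be sketched as below; the function names are illustrative, X = 2 follows the example in the text, and the clip range is optional as in the last sub-bullet.

```python
import math

def ctx_inc_avg(tb_w, tb_h, x=2, clip=None):
    """((Log2(TbW) + Log2(TbH)) >> 1) - X, optionally clipped to [k0, k1]."""
    inc = (int(math.log2(tb_w)) + int(math.log2(tb_h))) >> 1
    inc -= x
    if clip is not None:
        k0, k1 = clip
        inc = max(k0, min(k1, inc))
    return inc

def ctx_inc_max(tb_w, tb_h, x=2):
    """Log2(max(TbW, TbH)) - X, the alternative using the larger side."""
    return int(math.log2(max(tb_w, tb_h))) - x
```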
25. It is proposed that the context modeling for coding the indications of TS and DCT2 may be shared.
a. In one example, the context modeling (i.e., how the context index is selected) for coding transform_skip_flag and the first bin of tu_mts_idx may be the same.
b. Further alternatively, the contexts for coding the two bins may be different or partially shared.
i. In one example, a first set of contexts may be used to code transform_skip_flag and a second set of contexts may be used to code the first bin of tu_mts_idx, and the two sets are not shared.
c. Further alternatively, the contexts for coding the two bins may be completely shared.
26. It is proposed that a single context may be used for some or all of the context-coded bins of the syntax element tu_mts_idx.
a. For example, a single context may be used for the first bin of tu_mts_idx.
b. For example, a single context may be used for all bins except the first bin of tu_mts_idx.
i. Further alternatively, the first bin of tu_mts_idx may be context coded using the context modeling method mentioned in item 24.
1. In one example, depending on the transform size, the first bin may be context coded with N contexts (e.g., the context index increment may be set to ((Log2(TbW) + Log2(TbH)) >> 1) - 2, such that N = 4), and all remaining bins may be context coded with a single context.
c. Further alternatively, the first and second bins of tu_mts_idx may be context coded and all remaining bins may be bypass coded.
d. In one example, depending on the transform size (e.g., N = 4), the first bin may be context coded with N contexts, the second bin may be context coded with a single context, and the remaining bins may be bypass coded.
e. For example, all bins of tu_mts_idx may be bypass coded.
27. The binarization of the transformation matrix may be defined in the following way:
a. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       0
DCT2-DCT2                1 0
DST7-DST7                1 1 0
DCT8-DST7                1 1 1 0
DST7-DCT8                1 1 1 1 0
DCT8-DCT8                1 1 1 1 1
b. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       0
DCT2-DCT2                1 0
DST7-DST7                1 1 0 0
DCT8-DST7                1 1 0 1
DST7-DCT8                1 1 1 0
DCT8-DCT8                1 1 1 1
c. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       1
DCT2-DCT2                0 1
DST7-DST7                0 0 0 0
DCT8-DST7                0 0 0 1
DST7-DCT8                0 0 1 0
DCT8-DCT8                0 0 1 1
d. In one example, the following table shows the mapped bin strings and the corresponding matrices.

[Table available only as an image in the original document.]
e. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       0
DCT2-DCT2                11
DST7-DST7                10
f. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       1
DCT2-DCT2                00
DST7-DST7                01
g. In one example, the following table shows the mapped bin strings and the corresponding matrices.

Transformation matrix    Bin string
TS                       1
DCT2-DCT2                01
DST7-DST7                00
28. It is proposed that MTS has only two candidate transform combinations: DCT2-DCT2 and DST7-DST7.
The above examples may be incorporated in the context of methods described below, e.g., methods 1610, 1620, 1630, 1640, and 1650, which may be implemented at a video encoder and/or decoder.
Fig. 16A shows a flow diagram of an exemplary method for video processing. The method 1610 includes: in operation 1612, a coding tool is applied to one or more chroma components of the video based on selective application of the coding tool to a corresponding luma component of the video as part of a conversion between a current block of the video and a bitstream representation of the video.
The method 1610 includes: in operation 1614, a conversion is performed.
Fig. 16B shows a flow diagram of an exemplary method for video processing. The method 1620 comprises: in operation 1622, a coding tool is applied to the current block based on selective application of the coding tool to one or more corresponding blocks of other chroma components of the video as part of a conversion between the current block of the first chroma component of the video and a bitstream representation of the video.
The method 1620 comprises: in operation 1624, a conversion is performed.
Fig. 16C shows a flow diagram of an exemplary method for video processing. The method 1630 includes: in operation 1632, a coding tool is applied to a luma component of the video based on a selective application of the coding tool to one or more corresponding chroma components of the video as part of a conversion between a current block of the video and a bitstream representation of the video.
The method 1630 includes: in operation 1634, a conversion is performed.
Fig. 16D shows a flow diagram of an exemplary method for video processing. The method 1640 includes: in operation 1642, a conversion between a current block of video and a bitstream representation of the video is performed. In some embodiments, whether a Multiple Transform Set (MTS) index and/or a transform skip flag is signaled in the bitstream representation is based on an enablement of a Block Differential Pulse Code Modulation (BDPCM) based codec tool for the current block.
Fig. 16E shows a flow diagram of an exemplary method for video processing. The method 1650 includes: in operation 1652, a codec type having a plurality of binary numbers is selected based on a Multiple Transform Set (MTS) type for a current block of video.
The method 1650 includes: in operation 1654, a codec type is applied to the indication of the MTS type as part of a conversion between the current block and a bitstream representation of the video.
5 Exemplary implementations of the disclosed technology
In the following examples, bold double curly braces are used to indicate addition, e.g., {{a}} indicates that "a" has been added, while bold double square brackets are used to indicate deletion, e.g., [[a]] indicates that "a" has been deleted.
5.1 example #1
The working draft specified in JVET-N1001-v7 may be changed as follows.
8.7.4 transformation procedure for scaled transform coefficients
8.7.4.1 general case
...
Tables 8-17-specification of trTypeHor and trTypeVer depending on cu _ sbt _ horizontal _ flag and cu _ sbt _ pos _ flag
cu_sbt_horizontal_flag  cu_sbt_pos_flag  trTypeHor        trTypeVer
0                       0                {{nTbW>=8?0:}}2  1
0                       1                1                1
1                       0                1                {{nTbH>=8?0:}}2
1                       1                1                1
5.2 example 2
The working draft specified in JVET-N1001-v8 may be changed as follows.
7.3.7.10 transform unit syntax
[Syntax table available only as an image in the original document.]
Alternatively, the following may be applied:
[Syntax table available only as an image in the original document.]
5.3 example 3
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.3.7 fixed length binarization process
...
{{
9.5.3.8 Binarization process for tu_mts_idx
The input to this process is a request for a binarization of the syntax element tu_mts_idx. The output of this process is the binarization of the syntax element.
The binarization for the syntax element tu_mts_idx is specified in Tables 9-14.
Tables 9-14 - Binarization of tu_mts_idx
[Table available only as an image in the original document.]
}}
5.4 example 4
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.4.2 derivation procedure for ctxTable, ctxIdx, and bypass flag
9.5.4.2.1 general case
...
Tables 9-17 - Assignment of ctxInc to syntax elements with context-coded bins
[Table available only as an image in the original document.]
5.5 example 5
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.4.3 derivation procedure for ctxTable, ctxIdx, and bypass flag
9.5.4.3.1 general case
...
Tables 9-17 - Assignment of ctxInc to syntax elements with context-coded bins
[Table available only as an image in the original document.]
5.6 example 6
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.4.4 derivation procedure for ctxTable, ctxIdx, and bypass flag
9.5.4.4.1 general case
...
Tables 9-17 - Assignment of ctxInc to syntax elements with context-coded bins
[Table available only as an image in the original document.]
5.7 example 7
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.4.5 derivation procedure for ctxTable, ctxIdx, and bypass flag
9.5.4.5.1 general case
...
Tables 9-17 - Assignment of ctxInc to syntax elements with context-coded bins
[Table available only as an image in the original document.]
5.8 example 8
The working draft specified in JVET-N1001-v8 may be changed as follows.
9.5.4.6 derivation procedure for ctxTable, ctxIdx, and bypass flag
9.5.4.6.1 general case
...
Tables 9-17 - Assignment of ctxInc to syntax elements with context-coded bins
[Table available only as an image in the original document.]
Alternatively, the following may be applied:
[Table available only as an image in the original document.]
fig. 17 is a block diagram of the video processing apparatus 1700. Device 1700 may be used to implement one or more of the methods described herein. The apparatus 1700 may be implemented in a smartphone, tablet, computer, internet of things (IoT) receiver, and/or the like. The apparatus 1700 may include one or more processors 1702, one or more memories 1704, and video processing hardware 1706. The processor 1702 may be configured to implement one or more of the methods described in this document (including, but not limited to, methods 1610, 1620, 1630, 1640, and 1650). The one or more memories 1704 may be used to store data and code for implementing the methods and techniques described herein. The video processing hardware 1706 may be used to implement some of the techniques described in this document in hardware circuits.
In some embodiments, the video codec method may be implemented using a device implemented on a hardware platform, as described with reference to fig. 17.
Some embodiments of the disclosed technology include deciding or determining to enable a video processing tool or mode. In one example, when a video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of the video blocks, but may not necessarily modify the generated bitstream based on the use of the tool or mode. That is, the conversion from a block of video to a bitstream representation of the video will use the video processing tool or mode when enabled based on the decision or determination. In another example, when a video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has been modified based on the video processing tool or mode. That is, the conversion from a bitstream representation of the video to blocks of the video will be performed using a video processing tool or mode that is enabled based on the decision or determination.
Some embodiments of the disclosed technology include deciding or determining to disable a video processing tool or mode. In one example, when a video processing tool or mode is disabled, the encoder will not use the tool or mode in the conversion from a block of video to a bitstream representation of the video. In another example, when a video processing tool or mode is disabled, the decoder will process the bitstream with knowledge that the bitstream has not been modified using the video processing tool or mode that was enabled based on the decision or determination.
Fig. 18 is a block diagram illustrating an exemplary video processing system 1800 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of system 1800. The system 1800 can include an input 1802 for receiving video content. The video content may be received in a raw or uncompressed format (e.g., 8 or 10 bit multi-component pixel values), or may be received in a compressed or encoded format. Input 1802 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as ethernet, Passive Optical Networks (PONs), etc., and wireless interfaces such as Wi-Fi or cellular interfaces.
System 1800 can include a codec component 1804 that can implement the various codecs and encoding methods described in this document. The codec component 1804 may reduce the average bit rate of the video from the input 1802 to the output of the codec component 1804 to produce a codec representation of the video. Thus, codec techniques are sometimes referred to as video compression or video transcoding techniques. The output of the codec component 1804 may be stored or transmitted via a connected communications component (as shown by component 1806). A stored or transmitted bitstream (or codec) representation of video received at input 1802 can be used by component 1808 to generate pixel values or displayable video that is sent to display interface 1810. The process of generating user-viewable video from a bitstream representation is sometimes referred to as video decompression. Further, while certain video processing operations are referred to as "codec" operations or tools, it is understood that codec tools or operations are used at the encoder and that the corresponding decoding tools or operations that reverse the codec results will be performed by the decoder.
Examples of a peripheral bus interface or display interface may include a Universal Serial Bus (USB) or a High Definition Multimedia Interface (HDMI) or display port (Displayport), among others. Examples of storage interfaces include SATA (serial advanced technology attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be implemented in various electronic devices such as mobile phones, laptops, smart phones, or other devices capable of digital data processing and/or video display.
In some embodiments, the following technical solutions may be implemented:
1. A method for video processing, comprising: performing a conversion between a current block of video and a bitstream representation of the video, wherein whether a Multiple Transform Set (MTS) index and/or a transform skip flag is signaled in the bitstream representation is based on an enablement of a Block Differential Pulse Code Modulation (BDPCM) based codec tool for the current block.
2. The method of claim 1, wherein the BDPCM-based codec tool is quantized residual domain BDPCM (QR-BDPCM).
3. The method of claim 1 or 2, wherein the MTS index and/or the transform skip flag are excluded from the bitstream representation when the BDPCM-based codec tool is enabled for the current block.
4. The method of claim 1 or 2, wherein the MTS index and/or transform skip flag is inferred to be false when the BDPCM-based codec tool is enabled for the current block.
5. The method of claim 1 or 2, wherein the transform skip flag is inferred to be true when a BDPCM-based codec tool is enabled for the current block.
6. The method according to any one of claims 1 to 5, wherein the MTS process comprises: at least one of a plurality of predetermined transformations is used during the conversion based on the MTS index.
7. The method of any of claims 1-5, wherein, in a transform skip mode based on a transform skip flag, a residual of a prediction error between a current video block and a reference video block is quantized without applying a transform.
8. The method of any of claims 1-5, wherein, in the BDPCM-based codec tool, a difference between a residual of an intra prediction of a current video block and a prediction of the residual is represented in a bitstream representation using Differential Pulse Code Modulation (DPCM).
9. A method for video processing, comprising: applying a coding tool to one or more chroma components of the video based on selective application of the coding tool to a corresponding luma component of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and performing the conversion.
10. The method of claim 9, wherein the indication of the application of the codec tool to the one or more chroma components is excluded from the bitstream representation.
11. The method of claim 9, wherein the indication to apply the codec tool to the one or more chroma components is signaled in a bitstream representation based on a condition.
12. The method of claim 11, wherein the condition comprises applying a codec tool to a corresponding luma component of the video.
13. The method of any of claims 10 to 12, wherein the indication comprises a flag or an index.
14. The method of any of claims 9-13, wherein the luma component covers corresponding samples of the chroma component of the one or more chroma components.
15. The method of claim 14, wherein the corresponding samples are scaled based on a color format of the current block.
16. A method for video processing, comprising: applying a coding tool to a current block of a first chroma component of video based on selective application of the coding tool to one or more corresponding blocks of other chroma components of the video as part of a conversion between the current block and a bitstream representation of the video; and performing the conversion.
17. The method of claim 16, wherein the indication to apply the codec tool to the current block is excluded from the bitstream representation.
18. The method of claim 16, wherein the indication to apply the codec tool to the current block is signaled in a bitstream representation based on a condition.
19. The method of claim 18, wherein the condition comprises applying a codec tool to one or more corresponding blocks of other chroma components.
20. A method for video processing, comprising: applying a coding tool to a luma component of the video based on a selective application of the coding tool to one or more corresponding chroma components of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and performing the conversion.
21. The method of claim 20, wherein the indication of the application of the codec tool to the luma component is excluded from the bitstream representation.
22. The method of claim 20, wherein the indication to apply the codec tool to the luma component is signaled in a bitstream representation based on a condition.
23. The method of claim 22, wherein the condition comprises applying a codec tool to one or more corresponding chroma components.
24. The method of any of claims 9 to 23, wherein the coding tool is selected from the group consisting of: multiple transform set (MTS), transform skip, reduced secondary transform (RST), Block Differential Pulse Code Modulation (BDPCM), and quantized residual domain BDPCM (QR-BDPCM).
25. A method for video processing, comprising: selecting a codec type having a plurality of binary numbers based on a Multiple Transform Set (MTS) type for a current block of video; and applying the codec type to the indication of the MTS type as part of a conversion between the current block and a bitstream representation of the video.
26. The method of claim 25, wherein the codec type is a fixed length codec, wherein the MTS type is DST7-DST7, DCT8-DST7, DST7-DCT8, or DCT8-DCT8, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #.
27. The method of claim 26, wherein each of the plurality of bins is context coded.
28. The method of claim 26, wherein each of the plurality of bins is bypass coded.
29. The method of claim 25, wherein the MTS type is Transform Skip (TS) or type II discrete cosine transform (DCT-II), and wherein the indicated context model is based on a codec mode of the current block, a transform block size for the current block, and/or a depth of a codec tree applied to the current block.
30. The method of claim 29, wherein the codec mode is an intra mode, an inter mode, or an Intra Block Copy (IBC) mode.
31. The method of claim 29, wherein the coding and decoding tree is a Quadtree (QT), a multi-type tree (MTT), a ternary tree (TT), or a Binary Tree (BT).
32. The method of claim 29, wherein the index of the context model is increased based on a dimension of a Transform Unit (TU) or a dimension of a Transform Block (TB) associated with the current block.
33. The method of claim 29, wherein the increasing of the index of the context model is based on a depth of a coding tree, and wherein the coding tree is a multi-type tree (MTT).
34. The method of claim 25, wherein the MTS type is Transform Skip (TS) or type II discrete cosine transform (DCT-II), and wherein the context model for the indication is shared.
35. The method of claim 25, wherein the indication is a syntax element, and wherein a single context is used for one or more of the plurality of bins.
36. The method of claim 35, wherein the syntax element is tu _ mts _ idx.
37. The method of claim 35 or 36, wherein a single context is used for a first bin of the plurality of bins.
38. The method of claim 35 or 36, wherein a single context is used for a first bin and a second bin of the plurality of bins, and the remaining bins of the plurality of bins are bypass coded.
39. The method of claim 25, wherein the MTS type is Transform Skip (TS), DCT2-DCT2, DST7-DST7, DCT8-DST7, DST7-DCT8, or DCT8-DCT8, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #, and wherein the codec type is a binarization based on a table defined as:
transformation matrix    binary string
TS                       0
DCT2-DCT2                1 0
DST7-DST7                1 1 0
DCT8-DST7                1 1 1 0
DST7-DCT8                1 1 1 1 0
DCT8-DCT8                1 1 1 1 1
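The table above is a truncated-unary binarization: entry k is coded as k ones followed by a terminating zero, except the last entry, which drops the terminating zero. A sketch of a matching binarizer/debinarizer pair, with the symbol order taken directly from the table:

```python
MTS_ORDER = ["TS", "DCT2-DCT2", "DST7-DST7", "DCT8-DST7", "DST7-DCT8", "DCT8-DCT8"]

def tu_binarize(mts_type):
    k = MTS_ORDER.index(mts_type)
    bins = [1] * k
    if k < len(MTS_ORDER) - 1:  # last codeword omits the terminating zero
        bins.append(0)
    return bins

def tu_debinarize(read_bin):
    # read_bin is a zero-argument callable returning the next bin (0 or 1).
    k = 0
    while k < len(MTS_ORDER) - 1 and read_bin() == 1:
        k += 1
    return MTS_ORDER[k]
```

Note the asymmetric codeword lengths: the likelier-first ordering gives TS a single bin while DCT8-DCT8 costs five.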
40. The method of claim 25, wherein the MTS type is DCT2-DCT2 or DST7-DST7, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #.
41. The method of any of claims 1 to 40, wherein the converting generates the current block from the bitstream representation.
42. The method of any of claims 1 to 40, wherein the converting generates the bitstream representation from the current block.
43. An apparatus in a video system, the apparatus comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of claims 1-42.
44. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing the method of any of claims 1-42.
From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited thereto, but rather is defined by the following claims.
Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory computer-readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of the foregoing. The term "data processing unit" or "data processing apparatus" includes all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of the above.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such a device. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples have been described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (44)

1. A method for video processing, comprising:
performing a conversion between a current block of video and a bitstream representation of the video,
wherein whether a Multiple Transform Set (MTS) index and/or a transform skip flag is signaled in the bitstream representation is based on an activation of a Block Differential Pulse Code Modulation (BDPCM) based codec tool for the current block.
2. The method of claim 1, wherein the BDPCM-based coding tool is quantized residual domain BDPCM (QR-BDPCM).
3. The method of claim 1 or 2, wherein the MTS index and/or the transform skip flag are excluded from the bitstream representation when the BDPCM-based coding tool is enabled for the current block.
4. The method of claim 1 or 2, wherein the MTS index and/or the transform skip flag is inferred to be false when the BDPCM-based coding tool is enabled for the current block.
5. The method of claim 1 or 2, wherein the transform skip flag is inferred to be true when the BDPCM-based coding tool is enabled for the current block.
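An illustrative decoder-side reading of claims 1-5: when the BDPCM-based tool is enabled, neither the MTS index nor the transform-skip flag is parsed, and their values are inferred (claims 4-5). The function and field names below are hypothetical, not syntax from the patent.

```python
def infer_transform_signalling(bdpcm_enabled, parse_bin):
    # parse_bin(name) stands in for reading a syntax element from the
    # bitstream; it is only invoked when signalling is present.
    if bdpcm_enabled:
        # Claims 3-5: the elements are excluded from the bitstream and
        # inferred; transform skip is inferred true.
        return {"mts_idx": 0, "transform_skip": True}
    return {"mts_idx": parse_bin("tu_mts_idx"),
            "transform_skip": bool(parse_bin("transform_skip_flag"))}
```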
6. The method of any one of claims 1 to 5, wherein an MTS process comprises using, based on the MTS index, at least one of a plurality of predetermined transforms during the conversion.
7. The method of any of claims 1-5, wherein, in a transform skip mode based on the transform skip flag, a residual of a prediction error between the current video block and a reference video block is quantized without applying a transform.
8. The method of any of claims 1 to 5, wherein, in the BDPCM-based codec tool, differences between a residual of an intra prediction of the current video block and a prediction of the residual are represented in the bitstream representation using Differential Pulse Code Modulation (DPCM).
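A hedged sketch of the residual handling in claims 7-8: in transform-skip the quantized residual is coded with no transform applied, and a BDPCM-style tool then codes only the difference between each quantized residual row and its predictor (the row above, for the vertical direction shown here). Plain step-size rounding stands in for the real quantizer; everything below is illustrative, not the normative process.

```python
def quantize(residual, step):
    # Toy scalar quantizer (transform-skip path: no transform applied).
    return [[round(v / step) for v in row] for row in residual]

def bdpcm_vertical(q):
    # First row unchanged, then per-column deltas between adjacent rows.
    return [q[0][:]] + [[q[r][c] - q[r - 1][c] for c in range(len(q[r]))]
                        for r in range(1, len(q))]

def bdpcm_vertical_inverse(d):
    # Cumulative sum down each column reconstructs the quantized residual.
    rec = [d[0][:]]
    for r in range(1, len(d)):
        rec.append([d[r][c] + rec[r - 1][c] for c in range(len(d[r]))])
    return rec
```

The horizontal direction is the same operation applied along rows instead of columns.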
9. A method for video processing, comprising:
applying a coding tool to one or more chroma components of a video based on selective application of the coding tool to a corresponding luma component of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and
the conversion is performed.
10. The method of claim 9, wherein an indication of application of the coding tool to the one or more chroma components is excluded from the bitstream representation.
11. The method of claim 9, wherein an indication to apply the codec tool to the one or more chroma components is signaled in the bitstream representation based on a condition.
12. The method of claim 11, wherein the condition comprises applying the coding tool to a corresponding luma component of the video.
13. The method of any of claims 10 to 12, wherein the indication comprises a flag or an index.
14. The method of any of claims 9 to 13, wherein the luma component covers corresponding samples of a chroma component of the one or more chroma components.
15. The method of claim 14, wherein the corresponding sample is scaled based on a color format of the current block.
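Claims 14-15 tie a chroma sample to the luma sample that covers it, scaled by the color format. A minimal sketch of that correspondence; the subsampling table is the conventional 4:2:0 / 4:2:2 / 4:4:4 mapping and is an assumption, not quoted from the patent.

```python
# (horizontal, vertical) luma-per-chroma subsampling factors.
SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}

def corresponding_luma_sample(chroma_x, chroma_y, color_format):
    # Scale chroma coordinates up to the co-located luma position (claim 15).
    sx, sy = SUBSAMPLING[color_format]
    return chroma_x * sx, chroma_y * sy
```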
16. A method for video processing, comprising:
applying a coding tool to a current block of a first chroma component of video based on selective application of the coding tool to one or more corresponding blocks of other chroma components of the video as part of a conversion between the current block and a bitstream representation of the video; and
the conversion is performed.
17. The method of claim 16, wherein an indication to apply the coding tool to the current block is excluded from the bitstream representation.
18. The method of claim 16, wherein an indication to apply the coding tool to the current block is signaled in the bitstream representation based on a condition.
19. The method of claim 18, wherein the condition comprises applying the coding tool to one or more corresponding blocks of the other chroma components.
20. A method for video processing, comprising:
applying a coding tool to a luma component of a video based on a selective application of the coding tool to one or more corresponding chroma components of the video as part of a conversion between a current block of the video and a bitstream representation of the video; and
the conversion is performed.
21. The method of claim 20, wherein an indication to apply the coding tool to the luma component is excluded from the bitstream representation.
22. The method of claim 20, wherein an indication to apply the codec tool to the luma component is signaled in the bitstream representation based on a condition.
23. The method of claim 22, wherein the condition comprises applying the coding tool to the one or more corresponding chroma components.
24. The method of any of claims 9 to 23, wherein the coding tool is selected from the group consisting of: Multiple Transform Set (MTS), transform skip, Reduced Secondary Transform (RST), Block Differential Pulse Code Modulation (BDPCM), and quantized residual domain BDPCM (QR-BDPCM).
25. A method for video processing, comprising:
selecting a codec type having a plurality of bins based on a Multiple Transform Set (MTS) type for a current block of video; and
applying the codec type to the indication of the MTS type as part of a conversion between the current block and a bitstream representation of the video.
26. The method of claim 25, wherein the codec type is a fixed length codec, wherein the MTS type is DST7-DST7, DCT8-DST7, DST7-DCT8, or DCT8-DCT8, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #.
27. The method of claim 26, wherein each of the plurality of bins is context coded.
28. The method of claim 26, wherein each of the plurality of bins is bypass coded.
29. The method of claim 25, wherein the MTS type is Transform Skip (TS) or type II discrete cosine transform (DCT-II), and wherein the indicated context model is based on a codec mode of the current block, a transform block size for the current block, and/or a depth of a coding tree applied to the current block.
30. The method of claim 29, wherein the codec mode is an intra mode, an inter mode, or an Intra Block Copy (IBC) mode.
31. The method of claim 29, wherein the coding tree is a Quadtree (QT), a multi-type tree (MTT), a Ternary Tree (TT), or a Binary Tree (BT).
32. The method of claim 29, wherein an increase of the index of the context model is based on a dimension of a Transform Unit (TU) or a dimension of a Transform Block (TB) associated with the current block.
33. The method of claim 29, wherein an increase of the index of the context model is based on a depth of the coding tree, and wherein the coding tree is a multi-type tree (MTT).
34. The method of claim 25, wherein the MTS type is Transform Skip (TS) or type II discrete cosine transform (DCT-II), and wherein the indicated context model is shared.
35. The method of claim 25, wherein the indication is a syntax element, and wherein a single context is used for one or more of the plurality of bins.
36. The method of claim 35, wherein the syntax element is tu_mts_idx.
37. The method of claim 35 or 36, wherein the single context is used for a first bin of the plurality of bins.
38. The method of claim 35 or 36, wherein the single context is used for a first bin and a second bin of the plurality of bins, and wherein the remaining bins of the plurality of bins are bypass coded.
39. The method of claim 25, wherein the MTS type is Transform Skip (TS), DCT2-DCT2, DST7-DST7, DCT8-DST7, DST7-DCT8, or DCT8-DCT8, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #, and wherein the codec type is binarization based on a table defined as:
transformation matrix    binary string
TS                       0
DCT2-DCT2                1 0
DST7-DST7                1 1 0
DCT8-DST7                1 1 1 0
DST7-DCT8                1 1 1 1 0
DCT8-DCT8                1 1 1 1 1
40. The method of claim 25, wherein the MTS type is DCT2-DCT2 or DST7-DST7, wherein DST # is a discrete sine transform of type # and DCT # is a discrete cosine transform of type #.
41. The method of any of claims 1 to 40, wherein the converting generates the current block from the bitstream representation.
42. The method of any of claims 1 to 40, wherein the converting generates the bitstream representation from the current block.
43. An apparatus in a video system, comprising: a processor; and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of claims 1-42.
44. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing the method of any of claims 1-42.
CN202080045375.3A 2019-06-21 2020-06-19 Coding and decoding tool for chrominance component Pending CN114026865A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN2019092388 2019-06-21
CNPCT/CN2019/092388 2019-06-21
PCT/CN2020/097021 WO2020253810A1 (en) 2019-06-21 2020-06-19 Coding tools for chroma components

Publications (1)

Publication Number Publication Date
CN114026865A true CN114026865A (en) 2022-02-08

Family

ID=74036853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080045375.3A Pending CN114026865A (en) 2019-06-21 2020-06-19 Coding and decoding tool for chrominance component

Country Status (2)

Country Link
CN (1) CN114026865A (en)
WO (1) WO2020253810A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130308708A1 (en) * 2012-05-11 2013-11-21 Panasonic Corporation Video coding method, video decoding method, video coding apparatus and video decoding apparatus
US20130343464A1 (en) * 2012-06-22 2013-12-26 Qualcomm Incorporated Transform skip mode
WO2014071439A1 (en) * 2012-11-08 2014-05-15 Canon Kabushiki Kaisha Method, apparatus and system for encoding and decoding the transform units of a coding unit
US20140355616A1 (en) * 2013-05-31 2014-12-04 Qualcomm Incorporated Single network abstraction layer unit packets with decoding order number for video coding
CN105556963A (en) * 2013-10-14 2016-05-04 联发科技(新加坡)私人有限公司 Method of residue differential pulse-code modulation for HEVC range extension
US20180205949A1 (en) * 2017-01-13 2018-07-19 Mediatek Inc. Method and Apparatus of Transform Coding
CN109089117A (en) * 2017-05-11 2018-12-25 联发科技股份有限公司 The method and device of coding or decoding video data
KR20190067732A (en) * 2017-12-07 2019-06-17 한국전자통신연구원 Method and apparatus for encoding and decoding using selective information sharing over channels

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US9549182B2 (en) * 2012-07-11 2017-01-17 Qualcomm Incorporated Repositioning of prediction residual blocks in video coding
US20140286412A1 (en) * 2013-03-25 2014-09-25 Qualcomm Incorporated Intra dc prediction for lossless coding in video coding
WO2018226067A1 (en) * 2017-06-08 2018-12-13 엘지전자 주식회사 Method and apparatus for performing low-complexity computation of transform kernel for video compression
CN109922348B (en) * 2017-12-13 2020-09-18 华为技术有限公司 Image coding and decoding method and device


Non-Patent Citations (1)

Title
B. BROSS: "Non-CE8: Unified Transform Type Signalling and Residual Coding for Transform Skip", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING: MARRAKECH, MA, 9–18 JAN. 2019, JVET-M0464, 18 January 2019 (2019-01-18), pages 1 - 5 *

Also Published As

Publication number Publication date
WO2020253810A1 (en) 2020-12-24

Similar Documents

Publication Publication Date Title
WO2020207492A1 (en) Interaction between matrix-based intra prediction and other coding tools
CN113812162B (en) Context modeling for simplified quadratic transforms in video
WO2020244662A1 (en) Simplified transform coding tools
CN113875233B (en) Matrix-based intra prediction using upsampling
CN113728636B (en) Selective use of quadratic transforms in codec video
WO2020228717A1 (en) Block dimension settings of transform skip mode
WO2021238828A1 (en) Indication of multiple transform matrices in coded video
WO2020244661A1 (en) Implicit selection of transform candidates
WO2020228716A1 (en) Usage of transquant bypass mode for multiple color components
US20220094929A1 (en) Applicability of implicit transform selection
WO2020253642A1 (en) Block size dependent use of secondary transforms in coded video
WO2020253810A1 (en) Coding tools for chroma components

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination