WO2020221213A1

WO2020221213A1 - Intra sub-block partitioning and multiple transform selection

Info

Publication number: WO2020221213A1
Application number: PCT/CN2020/087285
Authority: WO
Inventors: Li Zhang; Kai Zhang; Hongbin Liu; Yue Wang
Original assignee: Beijing Bytedance Network Technology Co., Ltd.; Bytedance Inc.
Priority date: 2019-04-27
Filing date: 2020-04-27
Publication date: 2020-11-05
Also published as: CN113728631B; CN113728631A

Abstract

Intra Sub-block Partitioning and multiple transform selection are described. In one example aspect, a video processing method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.

Description

INTRA SUB-BLOCK PARTITIONING AND MULTIPLE TRANSFORM SELECTION

CROSS-REFERENCE TO RELATED APPLICATION

Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/084699, filed on April 27, 2019. The entire disclosure of International Patent Application No. PCT/CN2019/084699 is incorporated by reference as part of the disclosure of this application.

TECHNICAL FIELD

This patent document relates to video coding techniques, devices and systems.

BACKGROUND

In spite of the advances in video compression, digital video still accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.

SUMMARY

The present document describes various embodiments and techniques in which a secondary transform is used during decoding or encoding of video or images.

In one example aspect, a method of video processing is disclosed. The method includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and generating a residual signal for the current sub-block based on the predictions.

In yet another example aspect, another method of video processing is disclosed. The method includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and generating a residual signal for the sub-blocks based on the predictions.

In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and reconstructing the current sub-block using the predictions.

In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and reconstructing the sub-blocks using the predictions.

In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and transforming a residual signal for the sub-blocks based on the predictions. A maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.

In yet another example aspect, another method of video processing is disclosed. The method includes receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing inverse transform on a residual signal of the sub-blocks, and reconstructing the sub-blocks using an output from the inverse transform. A maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream.

In yet another example aspect, another method of video processing is disclosed. The method includes receiving or transmitting a bitstream representing a block of video data for performing video processing. The block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream and the sub-blocks share same quantization information.

In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern. Reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block. The method also includes encoding or reconstructing the block of video data based on the predictions.

In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and encoding or reconstructing the block of video data based on the predictions. The sub-blocks are partitioned in multiple partitioning directions.

In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data to generate a residual signal, performing an implicit transformation of the residual signal using one of two transformations, and encoding an output from the implicit transformation.

In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data partitioned into one or more sub-blocks, performing an implicit transformation of the block of video data using one of two inverse transformations, and reconstructing the block of video data based on the implicit transformation.

In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.

In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; and performing the conversion based on the ISP mode.

In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions in both horizontal and vertical directions; and performing the conversion based on the ISP mode.

In yet another example aspect, another method of video processing is disclosed. The method includes determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and performing the conversion based on the determined MTS scheme.

In yet another example aspect, a video encoder is disclosed. The video encoder comprises a processor configured to implement one or more of the above-described methods.

In yet another example aspect, a video decoder is disclosed. The video decoder comprises a processor configured to implement one or more of the above-described methods.

In yet another example aspect, a computer readable medium is disclosed. The medium includes code for implementing one or more of the above-described methods stored on the medium.

These, and other, aspects are described in the present document.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows an example of an encoder block diagram.

FIG. 2 shows an example of 67 intra prediction modes.

FIGS. 3A-3B show examples of reference samples for wide-angular intra prediction.

FIG. 4 is an example illustration of a problem of discontinuity in case of directions beyond 45 degrees.

FIGS. 5A-5D show an example illustration of samples used by PDPC applied to diagonal and adjacent angular intra modes.

FIG. 6 depicts an example of four reference lines.

FIG. 7 is an example of division of 4×8 and 8×4 blocks.

FIG. 8 is an example of division of all blocks except 4×8, 8×4 and 4×4.

FIG. 9 is an example of Affine Linear Weighted Intra-Prediction (ALWIP) for 4x4 blocks.

FIG. 10 is an example of ALWIP for 8x8 blocks.

FIG. 11 is an example of ALWIP for 8x4 blocks.

FIG. 12 is an example of ALWIP for 16x16 blocks.

FIG. 13 shows an example of secondary transform in JEM.

FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST).

FIG. 15 is an illustration of sub-block transform modes SBT-V and SBT-H.

FIG. 16 is a block diagram of an example hardware platform for implementing a technique described in the present document.

FIG. 17 shows an example of mixed splitting.

FIG. 18 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 19 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 20 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 21 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 22 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 23 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

FIG. 24 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.

DETAILED DESCRIPTION

Section headings are used in the present document to facilitate ease of understanding and do not limit the embodiments disclosed in a section to only that section. Furthermore, while certain embodiments are described with reference to Versatile Video Coding or other specific video codecs, the disclosed techniques are also applicable to other video coding technologies. Furthermore, while some embodiments describe video coding steps in detail, it will be understood that corresponding decoding steps that undo the coding will be implemented by a decoder. Furthermore, the term video processing encompasses video coding or compression, video decoding or decompression, and video transcoding in which video pixels are converted from one compressed format into another compressed format or to a different compressed bitrate.

1. Summary

This patent document is related to video coding technologies. Specifically, it is related to transforms in video coding. It may be applied to an existing video coding standard like HEVC, or to the standard (Versatile Video Coding) to be finalized. It may also be applicable to future video coding standards or video codecs.

2. Initial Discussion

Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards. Since H.262, the video coding standards have been based on the hybrid video coding structure, in which temporal prediction plus transform coding are utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM). In April 2018, the Joint Video Experts Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.

2.1 Coding flow of a typical video codec

FIG. 1 shows an example encoder block diagram of VVC, which contains three in-loop filtering blocks: deblocking filter (DF), sample adaptive offset (SAO) and adaptive loop filter (ALF). Unlike DF, which uses predefined filters, SAO and ALF utilize the original samples of the current picture to reduce the mean square errors between the original samples and the reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. ALF is located at the last processing stage of each picture and can be regarded as a tool trying to catch and fix artifacts created by the previous stages.

2.2 Intra mode coding with 67 intra prediction modes

To capture the arbitrary edge directions presented in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted as red dotted arrows in FIG. 2, and the planar and DC modes remain the same. These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.

Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction as shown in FIG. 2. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for the non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding is unchanged.

In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra-predictor using DC mode. In VTM2, blocks can have a rectangular shape, which necessitates the use of a division operation per block in the general case. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
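The division-free DC rule described above can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `dc_predict`, the list-based inputs and the rounding offset are assumptions, and both block dimensions are assumed to be powers of 2.

```python
def dc_predict(top, left):
    """DC value for a W x H block; top has W samples, left has H samples.

    Assumes W and H are powers of 2, so the sample count is too.
    """
    w, h = len(top), len(left)
    if w == h:
        total, count = sum(top) + sum(left), w + h
    elif w > h:
        total, count = sum(top), w      # only the longer (top) side
    else:
        total, count = sum(left), h     # only the longer (left) side
    # count is a power of 2, so the average is a rounding right shift.
    shift = count.bit_length() - 1
    return (total + (count >> 1)) >> shift


square = dc_predict([10] * 4, [20] * 4)
wide = dc_predict([10] * 8, [20] * 4)
tall = dc_predict([1, 2, 3, 4], [100] * 8)
```

Because the number of averaged samples is always a power of 2, the average reduces to an addition and a shift for both square and non-square blocks.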

2.3 Wide-angle intra prediction for non-square blocks

Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes for a certain block is unchanged, i.e., 67, and the intra mode coding is unchanged.

To support these prediction directions, the top reference with length 2W+1 and the left reference with length 2H+1 are defined as shown in FIGS. 3A-3B.

The mode number of replaced mode in wide-angular direction mode is dependent on the aspect ratio of a block. The replaced intra prediction modes are illustrated in Table 1.

Table 1 -Intra prediction modes replaced by wide-angular modes

Condition	Replaced intra prediction modes
W/H == 2	Modes 2, 3, 4, 5, 6, 7
W/H > 2	Modes 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
W/H == 1	None
H/W == 1/2	Modes 61, 62, 63, 64, 65, 66
H/W < 1/2	Modes 57, 58, 59, 60, 61, 62, 63, 64, 65, 66
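As a rough sketch, Table 1 can be expressed as a lookup on the block dimensions. The function name `replaced_modes` is hypothetical. The code also assumes that the last two rows of the table refer to blocks that are taller than they are wide (height equal to twice the width, and height more than twice the width), which is an interpretation of the table rather than something the text states.

```python
def replaced_modes(width, height):
    """Conventional modes replaced by wide-angle modes, following Table 1."""
    if width == height:
        return []
    if width > height:
        # Wide blocks: the lowest-numbered angular modes are replaced.
        count = 6 if width == 2 * height else 10
        return list(range(2, 2 + count))
    # Tall blocks (assumed reading of the last two rows):
    # the highest-numbered angular modes are replaced.
    count = 6 if height == 2 * width else 10
    return list(range(67 - count, 67))


wide_2to1 = replaced_modes(16, 8)
wide_4to1 = replaced_modes(32, 8)
tall_1to4 = replaced_modes(4, 16)
```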

As shown in FIG. 4, two vertically adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, a low-pass reference sample filter and side smoothing are applied to the wide-angle prediction to reduce the negative effect of the increased gap Δp_α.

2.4 Position dependent intra prediction combination

In the VTM2, the results of intra prediction of planar mode are further modified by a position dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method which invokes a combination of the un-filtered boundary reference samples and HEVC style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signalling: planar, DC, horizontal, vertical, bottom-left angular mode and its eight adjacent angular modes, and top-right angular mode and its eight adjacent angular modes.

The prediction sample pred(x, y) is predicted using an intra prediction mode (DC, planar, angular) and a linear combination of reference samples according to the following equation:

$$\mathrm{pred}(x, y) = \left( wL \times R_{-1,y} + wT \times R_{x,-1} - wTL \times R_{-1,-1} + (64 - wL - wT + wTL) \times \mathrm{pred}(x, y) + 32 \right) \gg 6$$

where $R_{x,-1}$ and $R_{-1,y}$ represent the reference samples located at the top and left of the current sample (x, y), respectively, and $R_{-1,-1}$ represents the reference sample located at the top-left corner of the current block.
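The PDPC combination can be sketched directly from the equation. The function name `pdpc_sample` and its argument names are hypothetical, and the weights are plain inputs here because the mode-dependent weight values of Table 2 are not reproduced in this text.

```python
def pdpc_sample(pred, r_left, r_top, r_topleft, w_l, w_t, w_tl):
    """Apply the PDPC equation to one sample.

    pred      : intra prediction value at (x, y) before PDPC
    r_left    : reference sample R(-1, y)
    r_top     : reference sample R(x, -1)
    r_topleft : reference sample R(-1, -1)
    w_l, w_t, w_tl : weights wL, wT, wTL (values are caller-supplied)
    """
    return (w_l * r_left + w_t * r_top - w_tl * r_topleft
            + (64 - w_l - w_t + w_tl) * pred + 32) >> 6


unchanged = pdpc_sample(100, 0, 0, 0, 0, 0, 0)       # all weights zero
blended = pdpc_sample(100, 200, 0, 0, 32, 0, 0)      # half weight on the left sample
```

With all weights set to zero the expression reduces to (64 × pred + 32) >> 6, which returns the input prediction unchanged.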

If PDPC is applied to DC, planar, horizontal, and vertical intra modes, additional boundary filters are not needed, as required in the case of HEVC DC mode boundary filter or horizontal/vertical mode edge filters.

FIGS. 5A-5D illustrate the definition of reference samples ($R_{x,-1}$, $R_{-1,y}$ and $R_{-1,-1}$) for PDPC applied over various prediction modes. The prediction sample pred(x', y') is located at (x', y') within the prediction block. The coordinate x of the reference sample $R_{x,-1}$ is given by x = x' + y' + 1, and the coordinate y of the reference sample $R_{-1,y}$ is similarly given by y = x' + y' + 1.

FIGS. 5A to 5D provide definition of samples used by PDPC applied to diagonal and adjacent angular intra modes.

The PDPC weights are dependent on prediction modes and are shown in Table 2.

Table 2 -Example of PDPC weights according to prediction modes

2.5 Multiple reference line

Multiple reference line (MRL) intra prediction uses more reference lines for intra prediction. In FIG. 6, an example of 4 reference lines is depicted, where the samples of segments A and F are not fetched from reconstructed neighbouring samples but padded with the closest samples from segments B and E, respectively. HEVC intra-picture prediction uses the nearest reference line (i.e., reference line 0). In MRL, 2 additional lines (reference line 1 and reference line 3) are used.

The index of the selected reference line (mrl_idx) is signaled and used to generate the intra predictor. For a reference line index greater than 0, only the additional reference line modes are included in the MPM list, and only the MPM index is signaled, without the remaining mode. The reference line index is signaled before the intra prediction modes, and the Planar and DC modes are excluded from the intra prediction modes when a nonzero reference line index is signaled.

MRL is disabled for the first line of blocks inside a CTU to prevent using extended reference samples outside the current CTU line. Also, PDPC is disabled when an additional line is used.

2.6 Intra subblock partitioning (ISP)

ISP is proposed, which divides luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size dimensions, as shown in Table 3. FIG. 7 and FIG. 8 show examples of the two possibilities. FIG. 7 shows an example of division of 4×8 and 8×4 blocks. FIG. 8 shows an example of division of all blocks except 4×8, 8×4 and 4×4. All sub-partitions fulfill the condition of having at least 16 samples. For block sizes 4×N or N×4 (with N>8), if allowed, a 1×N or N×1 sub-partition may exist.

Table 3: Number of sub-partitions depending on the block size.

For each of these sub-partitions, a residual signal is generated by entropy decoding the coefficients sent by the encoder and then inverse quantizing and inverse transforming them. Then, the sub-partition is intra predicted, and finally the corresponding reconstructed samples are obtained by adding the residual signal to the prediction signal. Therefore, the reconstructed values of each sub-partition are available to generate the prediction of the next one, for which the process is repeated, and so on. All sub-partitions share the same intra mode.

Table 4 shows example transform types based on intra-prediction mode(s).

Table 4: Specification of trTypeHor and trTypeVer depending on predModeIntra

2.6.1 Example Syntax and Semantics

Table 5 shows an example coding unit syntax.

Table 5: Coding unit syntax

Table 6 shows an example transform unit syntax. Some of the example variables include:

intra_subpartitions_mode_flag[x0][y0] equal to 1 specifies that the current intra coding unit is partitioned into NumIntraSubPartitions[x0][y0] rectangular transform block subpartitions. intra_subpartitions_mode_flag[x0][y0] equal to 0 specifies that the current intra coding unit is not partitioned into rectangular transform block subpartitions.

When intra_subpartitions_mode_flag[x0][y0] is not present, it is inferred to be equal to 0.

intra_subpartitions_split_flag[x0][y0] specifies whether the intra subpartitions split type is horizontal or vertical. When intra_subpartitions_split_flag[x0][y0] is not present, it is inferred as follows:

If cbHeight is greater than MaxTbSizeY, intra_subpartitions_split_flag[x0][y0] is inferred to be equal to 0.

Otherwise (cbWidth is greater than MaxTbSizeY), intra_subpartitions_split_flag[x0][y0] is inferred to be equal to 1.

The variable IntraSubPartitionsSplitType specifies the type of split used for the current luma coding block as illustrated in Table 7. IntraSubPartitionsSplitType is derived as follows:

If intra_subpartitions_mode_flag[x0][y0] is equal to 0, IntraSubPartitionsSplitType is set equal to 0.

Otherwise, IntraSubPartitionsSplitType is set equal to 1 + intra_subpartitions_split_flag[x0][y0].

Table 6 Transform unit syntax

Table 7 shows an example name association to IntraSubPartitionsSplitType.

Table 7 Name association to IntraSubPartitionsSplitType

IntraSubPartitionsSplitType	Name of IntraSubPartitionsSplitType
0	ISP_NO_SPLIT
1	ISP_HOR_SPLIT
2	ISP_VER_SPLIT

The variable NumIntraSubPartitions specifies the number of transform block subpartitions an intra luma coding block is divided into. NumIntraSubPartitions is derived as follows:

If IntraSubPartitionsSplitType is equal to ISP_NO_SPLIT, NumIntraSubPartitions is set equal to 1.

Otherwise, if one of the following conditions is true, NumIntraSubPartitions is set equal to 2: cbWidth is equal to 4 and cbHeight is equal to 8; or cbWidth is equal to 8 and cbHeight is equal to 4.

Otherwise, NumIntraSubPartitions is set equal to 4.
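The derivations of IntraSubPartitionsSplitType and NumIntraSubPartitions can be sketched as follows. The Python function names are hypothetical. The helper `sub_partition_size` additionally assumes that a horizontal split divides the block height and a vertical split divides the block width, which the text above does not state explicitly.

```python
ISP_NO_SPLIT, ISP_HOR_SPLIT, ISP_VER_SPLIT = 0, 1, 2  # names from Table 7


def isp_split_type(mode_flag, split_flag):
    """IntraSubPartitionsSplitType from the two signaled flags."""
    return ISP_NO_SPLIT if mode_flag == 0 else 1 + split_flag


def num_intra_sub_partitions(split_type, cb_width, cb_height):
    """NumIntraSubPartitions for a luma coding block."""
    if split_type == ISP_NO_SPLIT:
        return 1
    if (cb_width, cb_height) in ((4, 8), (8, 4)):
        return 2
    return 4


def sub_partition_size(split_type, cb_width, cb_height):
    """(width, height) of each sub-partition; split geometry is an assumption."""
    n = num_intra_sub_partitions(split_type, cb_width, cb_height)
    if split_type == ISP_HOR_SPLIT:
        return cb_width, cb_height // n
    if split_type == ISP_VER_SPLIT:
        return cb_width // n, cb_height
    return cb_width, cb_height


split = isp_split_type(mode_flag=1, split_flag=0)   # horizontal split
size_8x4 = sub_partition_size(split, 8, 4)
size_16x16 = sub_partition_size(ISP_VER_SPLIT, 16, 16)
```

For the 8×4 block with a horizontal split, each of the two sub-partitions has 8×2 = 16 samples, consistent with the minimum of 16 samples per sub-partition mentioned for ISP.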

2.7 Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction)

Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction (MIP)) is proposed.

2.7.1 Generation of the reduced prediction signal by matrix vector multiplication

The neighboring reference samples are first down-sampled via averaging to generate the reduced reference signal $\mathrm{bdry}_{red}$. Then, the reduced prediction signal $\mathrm{pred}_{red}$ is computed by calculating a matrix vector product and adding an offset:

$$\mathrm{pred}_{red} = A \cdot \mathrm{bdry}_{red} + b$$

Here, A is a matrix that has $W_{red} \cdot H_{red}$ rows and 4 columns if W=H=4 and 8 columns in all other cases. b is a vector of size $W_{red} \cdot H_{red}$.
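A minimal sketch of the two steps (boundary averaging, then matrix vector product plus offset) for a 4×4 block is given here. The matrix `A` and offset `b` are toy values chosen only for illustration, since the actual ALWIP matrices are not part of this text. The rounding rule in the averaging and the top-then-left ordering of the reduced boundary are also assumptions.

```python
def downsample_boundary(samples, n_out):
    """Average consecutive groups so that len(samples) values become n_out values."""
    group = len(samples) // n_out
    # Rounded integer average per group (rounding rule is an assumption).
    return [(sum(samples[i * group:(i + 1) * group]) + group // 2) // group
            for i in range(n_out)]


def reduced_prediction(A, bdry_red, b):
    """Compute pred_red = A * bdry_red + b using plain Python lists."""
    return [sum(a * x for a, x in zip(row, bdry_red)) + off
            for row, off in zip(A, b)]


# 4x4 block: two averages per axis give four inputs; A has 16 rows, 4 columns.
top, left = [10, 12, 20, 22], [30, 30, 40, 40]
bdry_red = downsample_boundary(top, 2) + downsample_boundary(left, 2)
A = [[1, 0, 0, 0] for _ in range(16)]   # toy matrix: copies the first input
b = [1] * 16                            # toy offset vector
pred_red = reduced_prediction(A, bdry_red, b)
```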

2.7.2 Illustration of the entire ALWIP process

The entire process of averaging, matrix vector multiplication and linear interpolation is illustrated for different shapes in FIG. 9 to FIG. 12. Note, that the remaining shapes are treated as in one of the depicted cases.

Given a 4×4 block, as shown in FIG. 9, ALWIP takes two averages along each axis of the boundary. The resulting four input samples enter the matrix vector multiplication. The matrices are taken from the set S_0. After adding an offset, this yields the 16 final prediction samples. Linear interpolation is not necessary for generating the prediction signal. Thus, a total of (4·16)/(4·4) = 4 multiplications per sample are performed.

Given an 8×8 block, as shown in FIG. 10, ALWIP takes four averages along each axis of the boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_1. This yields 16 samples on the odd positions of the prediction block. Thus, a total of (8·16)/(8·8) = 2 multiplications per sample are performed. After adding an offset, these samples are interpolated vertically by using the reduced top boundary. Horizontal interpolation follows by using the original left boundary.

Given an 8×4 block, as shown in FIG. 11, ALWIP takes four averages along the horizontal axis of the boundary and the four original boundary values on the left boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_1. This yields 16 samples on the odd horizontal and every vertical position of the prediction block. Thus, a total of (8·16)/(8·4) = 4 multiplications per sample are performed. After adding an offset, these samples are interpolated horizontally by using the original left boundary. The transposed case is treated accordingly.

Given a 16×16 block, as shown in FIG. 12, ALWIP takes four averages along each axis of the boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_2. This yields 64 samples on the odd positions of the prediction block. Thus, a total of (8·64)/(16·16) = 2 multiplications per sample are performed. After adding an offset, these samples are interpolated vertically by using eight averages of the top boundary. Horizontal interpolation follows by using the original left boundary. The interpolation process, in this case, does not add any multiplications. Therefore, in total, two multiplications per sample are required to calculate the ALWIP prediction.

For larger shapes, the procedure is essentially the same and it is easy to check that the number of multiplications per sample is less than four.
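The per-sample multiplication counts quoted for the four illustrated shapes follow from one ratio: matrix inputs times matrix outputs, divided by the number of samples in the block. A small sketch, with the hypothetical helper name `multiplications_per_sample`, reproduces them:

```python
def multiplications_per_sample(num_inputs, num_outputs, width, height):
    """Matrix-vector multiplications divided by the number of block samples."""
    return num_inputs * num_outputs / (width * height)


counts = {
    "4x4": multiplications_per_sample(4, 16, 4, 4),
    "8x8": multiplications_per_sample(8, 16, 8, 8),
    "8x4": multiplications_per_sample(8, 16, 8, 4),
    "16x16": multiplications_per_sample(8, 64, 16, 16),
}
```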

For W×8 blocks with W>8, only horizontal interpolation is necessary as the samples are given at the odd horizontal and each vertical position.

Finally, for W×4 blocks with W>8, let A_k be the matrix that arises by leaving out every row that corresponds to an odd entry along the horizontal axis of the down-sampled block. Thus, the output size is 32 and again, only horizontal interpolation remains to be performed.

The transposed cases are treated accordingly.

2.7.3 Example Syntax and Semantics

Table 8 shows an example coding unit syntax.

Table 8 Coding unit syntax

2.8 Multiple Transform Set (MTS) in VVC

2.8.1 Explicit Multiple Transform Set (MTS)

In VTM4, large block-size transforms, up to 64×64 in size, are enabled, which is primarily useful for higher resolution video, e.g., 1080p and 4K sequences. High frequency transform coefficients are zeroed out for the transform blocks with size (width or height, or both width and height) equal to 64, so that only the lower-frequency coefficients are retained. For example, for an M×N transform block, with M as the block width and N as the block height, when M is equal to 64, only the left 32 columns of transform coefficients are kept. Similarly, when N is equal to 64, only the top 32 rows of transform coefficients are kept. When transform skip mode is used for a large block, the entire block is used without zeroing out any values.
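The zero-out rule for 64-point dimensions can be sketched as follows. The function name `zero_out_high_freq` and the row-major list-of-lists layout (`coeffs[y][x]`) are assumptions made for illustration.

```python
def zero_out_high_freq(coeffs):
    """Keep only the left 32 columns and/or top 32 rows when a dimension is 64.

    coeffs is a list of rows (height N), each row holding M coefficients.
    """
    height, width = len(coeffs), len(coeffs[0])
    keep_w = 32 if width == 64 else width
    keep_h = 32 if height == 64 else height
    return [[coeffs[y][x] if (x < keep_w and y < keep_h) else 0
             for x in range(width)]
            for y in range(height)]


# 64 wide, 16 high block of ones: only the left 32 columns survive.
block = [[1] * 64 for _ in range(16)]
kept = zero_out_high_freq(block)
kept_count = sum(sum(row) for row in kept)
```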

In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple transforms selected from DCT8/DST7. The newly introduced transform matrices are DST-VII and DCT-VIII. The table below shows the basis functions of the selected DST/DCT.
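The basis-function table itself is not reproduced in this text. As a sketch, the commonly cited floating-point definitions of the N-point DST-VII and DCT-VIII basis functions are $T_i(j) = \sqrt{4/(2N+1)} \cdot \sin\left(\pi (2i+1)(j+1)/(2N+1)\right)$ and $T_i(j) = \sqrt{4/(2N+1)} \cdot \cos\left(\pi (2i+1)(2j+1)/(4N+2)\right)$, respectively. These formulas are stated here as an assumption about the omitted table, and the code below uses real-valued matrices rather than the integer transform cores of an actual codec.

```python
import math


def dst7(n):
    """Floating-point N-point DST-VII basis; row i is basis function T_i."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def dct8(n):
    """Floating-point N-point DCT-VIII basis; row i is basis function T_i."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


def is_orthonormal(m, tol=1e-9):
    """Check that the rows of m form an orthonormal set."""
    n = len(m)
    for r in range(n):
        for s in range(n):
            dot = sum(m[r][k] * m[s][k] for k in range(n))
            if abs(dot - (1.0 if r == s else 0.0)) > tol:
                return False
    return True


dst7_ok = is_orthonormal(dst7(4))
dct8_ok = is_orthonormal(dct8(8))
```

Under these definitions, row i of DCT-VIII equals row i of DST-VII read in reverse order and multiplied by (-1)^i, so the two kernels carry the same information in mirrored form.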

In order to keep the orthogonality of the transform matrix, the transform matrices are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transformed coefficients within the 16-bit range, all the coefficients are to have 10 bits after the horizontal transform and after the vertical transform.

In order to control MTS scheme, separate enabling flags are specified at SPS level for intra and inter, respectively. When MTS is enabled at SPS, a CU level flag is signalled to indicate whether MTS is applied or not. Here, MTS is applied only for luma. The MTS CU level flag is signalled when the following conditions are satisfied.

- Both width and height smaller than or equal to 32

- CBF flag is equal to one
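The signalling condition for the CU level MTS flag can be sketched as a single predicate. The function and parameter names are hypothetical; `sps_mts_enabled` stands for whichever SPS level enabling flag (intra or inter) applies to the CU.

```python
def mts_cu_flag_signalled(sps_mts_enabled, width, height, cbf):
    """True when the CU level MTS flag is present in the bitstream."""
    return bool(sps_mts_enabled) and width <= 32 and height <= 32 and cbf == 1


present = mts_cu_flag_signalled(1, 32, 16, 1)
too_large = mts_cu_flag_signalled(1, 64, 16, 1)
no_residual = mts_cu_flag_signalled(1, 16, 16, 0)
```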

If the MTS CU flag is equal to zero, then DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signalled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signalling mapping is shown in Table 3-10. As for transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point and 32-point DCT-2. Also, the other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point and 32-point DST-7 and DCT-8, use 8-bit primary transform cores.

To reduce the complexity of large-size DST-7 and DCT-8, high-frequency transform coefficients are zeroed out for the DST-7 and DCT-8 blocks with size (width or height, or both width and height) equal to 32. Only the coefficients within the 16×16 lower-frequency region are retained.

As in HEVC, the residual of a block can be coded with transform skip mode. To avoid the redundancy of syntax coding, the transform skip flag is not signalled when the CU level MTS_CU_flag is not equal to zero. The block size limitation for transform skip is the same as that for MTS in JEM4, which indicates that transform skip is applicable for a CU when both block width and height are equal to or less than 32.

2.8.1.1 Example Syntax and Semantics

The MTS index may be signaled in the bitstream, and such a design is called explicit MTS. In addition, an alternative approach, which directly derives the matrix according to transform block sizes, is also supported as implicit MTS.

Explicit MTS supports all coded modes, while implicit MTS supports only intra mode. Table 9 shows example picture parameter set syntax.

Table 9 picture parameter set RBSP syntax.

Table 10 shows example transform unit syntax.

Table 10 Transform unit syntax

Some of the example variables include:

transform_skip_flag [x0] [y0] specifies whether a transform is applied to the luma transform block or not. The array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.

transform_skip_flag [x0] [y0] equal to 1 specifies that no transform is applied to the luma transform block. transform_skip_flag [x0] [y0] equal to 0 specifies that the decision whether transform is applied to the luma transform block or not depends on other syntax elements. When transform_skip_flag [x0] [y0] is not present, it is inferred to be equal to 0.

tu_mts_idx [x0] [y0] specifies which transform kernels are applied to the residual samples along the horizontal and vertical direction of the associated luma transform block. The array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.

When tu_mts_idx [x0] [y0] is not present, it is inferred to be equal to 0.

In the CABAC decoding process, one context is used to decode transform_skip_flag, and truncated unary binarization is used for tu_mts_idx. Each bin of tu_mts_idx is context coded: for the first bin, the quad-tree depth (i.e., cqtDepth) is used to select one context, and for the remaining bins, one context is used.

Table 11 shows example assignment of ctxInc to syntax elements.

Table 11 Assignment of ctxInc to syntax elements with context coded bins
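As a non-limiting illustration, the truncated unary binarization mentioned above for tu_mts_idx can be sketched as follows. The function names are hypothetical, and the maximum value cMax = 4 is an assumption based on the largest tu_mts_idx value listed in Table 12; context selection is not modelled.

```python
C_MAX = 4  # assumption: largest tu_mts_idx value listed in Table 12


def truncated_unary_bins(value, c_max=C_MAX):
    """Binarize value as `value` ones followed by a terminating zero.

    The terminating zero is omitted when value equals c_max.
    """
    if not 0 <= value <= c_max:
        raise ValueError("value out of range")
    bins = [1] * value
    if value < c_max:
        bins.append(0)
    return bins


def parse_truncated_unary(bin_iter, c_max=C_MAX):
    """Read bins from an iterator until a zero is seen or c_max ones were read."""
    value = 0
    while value < c_max and next(bin_iter) == 1:
        value += 1
    return value


bin_strings = {idx: truncated_unary_bins(idx) for idx in range(C_MAX + 1)}
```

Because the largest value needs no terminating zero, the longest bin string has cMax bins rather than cMax + 1.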

2.8.2 Implicit Multiple Transform Set (MTS)

It is noted that ISP, SBT, and MTS enabled with implicit signaling are all treated as implicit MTS.

The implicitMtsEnabled is used to define whether implicit MTS is enabled. The variable implicitMtsEnabled is derived as follows:

If sps_mts_enabled_flag is equal to 1 and one of the following conditions is true, implicitMtsEnabled is set equal to 1:

- IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT

- cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32

- sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are both equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA

Otherwise, implicitMtsEnabled is set equal to 0.

The variable trTypeHor specifying the horizontal transform kernel and the variable trTypeVer specifying the vertical transform kernel are derived as follows:

If cIdx is greater than 0, trTypeHor and trTypeVer are set equal to 0.

Otherwise, if implicitMtsEnabled is equal to 1, the following applies:

- If IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified in Table 8-15 depending on intraPredMode.

- Otherwise, if cu_sbt_flag is equal to 1, trTypeHor and trTypeVer are specified in Table 8-14 depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.

- Otherwise (sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are equal to 0) , trTypeHor and trTypeVer are derived as follows:

trTypeHor = (nTbW >= 4 && nTbW <= 16 && nTbW <= nTbH ) ? 1 : 0 (8-1030)

trTypeVer = (nTbH >= 4 && nTbH <= 16 && nTbH <= nTbW ) ? 1 : 0 (8-1031)

Otherwise, trTypeHor and trTypeVer are specified in Table 12 depending on tu_mts_idx [xTbY] [yTbY] .

Table 12 Specification of trTypeHor and trTypeVer depending on tu_mts_idx [x] [y]

tu_mts_idx [x0] [y0]	1	2	3	4
trTypeHor	1	2	1	2
trTypeVer	1	1	2	2

Table 13 shows example specification of trTypeHor and trTypeVer depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.

Table 13 Specification of trTypeHor and trTypeVer depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag

cu_sbt_horizontal_flag	cu_sbt_pos_flag	trTypeHor	trTypeVer
0	0	2	1
0	1	1	1
1	0	1	2
1	1	1	1
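As a non-limiting illustration, the derivation of implicitMtsEnabled, trTypeHor and trTypeVer described in this section can be sketched as follows. The function and parameter names are hypothetical. Two assumptions are made: the entry for tu_mts_idx equal to 0 (DCT-2 in both directions) is filled in because that column does not appear in Table 12 as reproduced here, and the ISP table indexed by intraPredMode is not reproduced in this text, so the caller supplies that pair through the hypothetical isp_types argument.

```python
DCT2, DST7, DCT8 = 0, 1, 2

# Table 12 as reproduced above: tu_mts_idx -> (trTypeHor, trTypeVer).
# The entry for index 0 is an assumption (DCT-2 in both directions).
EXPLICIT_MTS = {0: (0, 0), 1: (1, 1), 2: (2, 1), 3: (1, 2), 4: (2, 2)}

# Table 13: (cu_sbt_horizontal_flag, cu_sbt_pos_flag) -> (trTypeHor, trTypeVer).
SBT_TYPES = {(0, 0): (2, 1), (0, 1): (1, 1), (1, 0): (1, 2), (1, 1): (1, 1)}


def implicit_mts_enabled(sps_mts, isp_split, cu_sbt, n_tbw, n_tbh,
                         explicit_intra, explicit_inter, is_intra):
    """Derive implicitMtsEnabled from the three listed conditions."""
    if not sps_mts:
        return False
    return bool(
        isp_split
        or (cu_sbt and max(n_tbw, n_tbh) <= 32)
        or (not explicit_intra and not explicit_inter and is_intra)
    )


def derive_tr_types(c_idx, implicit, isp_split, cu_sbt, n_tbw, n_tbh,
                    sbt_horizontal=0, sbt_pos=0, tu_mts_idx=0, isp_types=None):
    """Return (trTypeHor, trTypeVer) following the order of checks in the text."""
    if c_idx > 0:
        return (DCT2, DCT2)
    if implicit:
        if isp_split:
            # The ISP table is not reproduced here; the caller supplies the pair.
            if isp_types is None:
                raise ValueError("ISP transform types must be supplied")
            return isp_types
        if cu_sbt:
            return SBT_TYPES[(sbt_horizontal, sbt_pos)]
        hor = 1 if (4 <= n_tbw <= 16 and n_tbw <= n_tbh) else 0
        ver = 1 if (4 <= n_tbh <= 16 and n_tbh <= n_tbw) else 0
        return (hor, ver)
    return EXPLICIT_MTS[tu_mts_idx]


# An 8x16 intra luma block with both explicit MTS flags off.
enabled = implicit_mts_enabled(True, False, False, 8, 16, False, False, True)
types_8x16 = derive_tr_types(0, enabled, False, False, 8, 16)
```

In this sketch the 8x16 block obtains DST-7 horizontally (the shorter side, between 4 and 16 samples) and DCT-2 vertically.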

2.9 Reduced Secondary Transform (RST)

2.9.1 Non-Separable Secondary Transform (NSST) in JEM

In JEM, a secondary transform is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder side) . As shown in FIG. 13, a 4x4 (or 8x8) secondary transform is performed depending on the block size. For example, the 4x4 secondary transform is applied for small blocks (i.e., min (width, height) < 8) and the 8x8 secondary transform is applied for larger blocks (i.e., min (width, height) > 4) per 8x8 block.

FIG. 13 shows an example of secondary transform in JEM.

Application of a non-separable transform is described as follows using a 4x4 input as an example. To apply the non-separable transform, the 4x4 input block X

\[ X = \begin{bmatrix} X_{00} & X_{01} & X_{02} & X_{03} \\ X_{10} & X_{11} & X_{12} & X_{13} \\ X_{20} & X_{21} & X_{22} & X_{23} \\ X_{30} & X_{31} & X_{32} & X_{33} \end{bmatrix} \]

is first represented as a vector

\[ \vec{X} = \begin{bmatrix} X_{00} & X_{01} & X_{02} & X_{03} & X_{10} & X_{11} & \cdots & X_{32} & X_{33} \end{bmatrix}^{T} \]

The non-separable transform is calculated as

\[ \vec{F} = T \cdot \vec{X} \]

where \vec{F} indicates the transform coefficient vector, and T is a 16x16 transform matrix. The 16x1 coefficient vector \vec{F} is subsequently re-organized as a 4x4 block using the scanning order for that block (horizontal, vertical or diagonal) . The coefficients with smaller index are placed with the smaller scanning index in the 4x4 coefficient block. There are 35 transform sets in total, and 3 non-separable transform matrices (kernels) are used per transform set. The mapping from the intra prediction mode to the transform set is pre-defined. For each transform set, the selected non-separable secondary transform candidate is further specified by the explicitly signalled secondary transform index. The index is signalled in the bitstream once per intra CU after the transform coefficients.
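As a non-limiting illustration, the three steps described above (vectorize the 4x4 block, multiply by the 16x16 matrix T, re-organize along a scanning order) can be sketched as follows. The function name, the row-major flattening and the two scan lists are assumptions of this sketch, and the identity matrix merely stands in for a real transform kernel.

```python
def nsst_4x4_forward(block, t_matrix, scan):
    """Apply a 16x16 non-separable transform to a 4x4 block.

    block: 4 rows of 4 values; t_matrix: 16 rows of 16 values;
    scan: 16 (row, col) positions, where scan[i] receives coefficient i.
    """
    # Step 1: flatten the block row by row into a 16-element vector.
    x_vec = [block[r][c] for r in range(4) for c in range(4)]
    # Step 2: matrix-vector product F = T * X.
    f_vec = [sum(t_matrix[k][i] * x_vec[i] for i in range(16)) for k in range(16)]
    # Step 3: place coefficient i at the i-th position of the scan.
    out = [[0] * 4 for _ in range(4)]
    for idx, (r, c) in enumerate(scan):
        out[r][c] = f_vec[idx]
    return out


HORIZONTAL_SCAN = [(r, c) for r in range(4) for c in range(4)]
VERTICAL_SCAN = [(r, c) for c in range(4) for r in range(4)]

identity = [[1 if i == k else 0 for i in range(16)] for k in range(16)]
x_block = [[4 * r + c for c in range(4)] for r in range(4)]
same = nsst_4x4_forward(x_block, identity, HORIZONTAL_SCAN)
transposed = nsst_4x4_forward(x_block, identity, VERTICAL_SCAN)
```

With the identity matrix, the horizontal scan reproduces the input block and the vertical scan yields its transpose, which makes the effect of the re-organization step visible in isolation.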

2.9.2 Reduced Secondary Transform (RST)

The RST was introduced with a mapping to 4 transform sets (instead of 35 transform sets) . 16x64 (may further be reduced to 16x48) and 16x16 matrices are employed for 8x8 and 4x4 blocks, respectively. For notational convenience, the 16x64 (may further be reduced to 16x48) transform is denoted as RST8x8 and the 16x16 one as RST4x4.

FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST) .
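As a non-limiting illustration, the dimension reduction performed by a 16x64 matrix can be sketched as follows. The function names are hypothetical, and the use of the transposed matrix for the inverse direction is an assumption of this sketch (it presumes orthonormal rows); the placeholder matrix simply selects the first 16 inputs and is not a real RST kernel.

```python
def rst_forward(x_vec, r_matrix):
    """Map an N-element input vector to len(r_matrix) coefficients (e.g., 64 -> 16)."""
    return [sum(row[i] * x_vec[i] for i in range(len(x_vec))) for row in r_matrix]


def rst_inverse(f_vec, r_matrix):
    """Map coefficients back to N samples using the transposed matrix (assumption)."""
    n = len(r_matrix[0])
    return [sum(r_matrix[k][i] * f_vec[k] for k in range(len(r_matrix))) for i in range(n)]


# Placeholder 16x64 matrix: row k selects input k.
r_16x64 = [[1 if i == k else 0 for i in range(64)] for k in range(16)]
x_in = list(range(64))
coeffs = rst_forward(x_in, r_16x64)      # 16 coefficients
recon = rst_inverse(coeffs, r_16x64)     # 64 samples, 48 of them zero here
```

The same two functions apply unchanged to a 16x48 matrix with a 48-element input vector, or to a 16x16 matrix for the 4x4 case.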

2.10 Sub-block transform

For an inter-predicted CU with cu_cbf equal to 1, cu_sbt_flag may be signaled to indicate whether the whole residual block or a sub-part of the residual block is decoded. In the former case, inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with inferred adaptive transform and the other part of the residual block is zeroed out. The SBT is not applied to the combined inter-intra mode.

In sub-block transform, position-dependent transforms are applied on luma transform blocks in SBT-V and SBT-H (chroma TBs always use DCT-2) . The two positions of SBT-H and SBT-V are associated with different core transforms. More specifically, the horizontal and vertical transforms for each SBT position are specified in FIG. 15. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the corresponding transform is set to DCT-2. Therefore, the sub-block transform jointly specifies the TU tiling, cbf, and horizontal and vertical transforms of a residual block, which may be considered a syntax shortcut for the cases in which the major residual of a block is at one side of the block.

2.10.1 Example Syntax and Semantics

Table 14 shows an example coding unit syntax.

Table 14 Coding unit syntax

Table 15 shows an example residual coding syntax.

Table 15 Residual coding syntax

3. Examples of problems solved by embodiments

The current design has the following problems:

1. ISP cannot be enabled when multiple reference line (MRL) is enabled.

2. Transform skip (TS) cannot be enabled when ISP is used. However, enabling both ISP and TS may achieve functionality similar to BDPCM without the need for an additional module to handle BDPCM.

3. Delta QP is signaled per sub-partition which results in signaling it multiple times for ISP coded blocks.

4. The enabling of ISP mode (e.g., intra_subpartitions_mode_flag) is signaled when either the width or the height of a block is no larger than MaxTbSizeY, while the partition direction (i.e., splitting type, horizontal/vertical direction) is signaled when both width and height are no larger than MaxTbSizeY. If one of them is larger than MaxTbSizeY, the following applies:

a. If height is greater than MaxTbSizeY, horizontal splitting (i.e., intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 0) is used.

b. Otherwise (width is greater than MaxTbSizeY) , vertical splitting (i.e., intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 1) is used.

Such a design is based on the assumption that only the width or the height could be twice MaxTbSizeY, which limits flexibility. When the maximum transform size is set to, for example, 32x32, and the CU size is, for example, 128x128, the block will, according to the rules, be split into 4 128x32 sub-partitions. However, when the maximum transform size is 32x32, coding a 128x32 sub-partition is disallowed in VVC. How to handle this case is unknown.
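As a non-limiting illustration, the direction inference rule and the 128x128 example above can be checked numerically with the following sketch. The function names are hypothetical, and the split into 4 equal sub-partitions is taken from the example in the text.

```python
def inferred_split_flag(width, height, max_tb_size):
    """Return the inferred intra_subpartitions_split_flag, or None if it is signalled.

    0 denotes horizontal splitting and 1 denotes vertical splitting.
    """
    if width <= max_tb_size and height <= max_tb_size:
        return None  # both sides fit, so the direction is signalled explicitly
    return 0 if height > max_tb_size else 1


def sub_partition_size(width, height, split_flag, num_parts=4):
    """Size (width, height) of each sub-partition for the given split direction."""
    if split_flag == 0:
        return (width, height // num_parts)
    return (width // num_parts, height)


def fits_max_transform(width, height, max_tb_size):
    """True when a partition does not exceed the maximum transform size."""
    return width <= max_tb_size and height <= max_tb_size


flag = inferred_split_flag(128, 128, 32)      # height exceeds 32: horizontal split
part = sub_partition_size(128, 128, flag)     # each sub-partition is 128 wide, 32 high
ok = fits_max_transform(part[0], part[1], 32)
```

In this sketch the inferred horizontal split leaves sub-partitions whose width still exceeds the maximum transform size, which is the unhandled case identified above.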

5. ISP and sub-block transform are both treated as implicit MTS since there is no need to signal the transform matrix. Sub-block transform could support block sizes up to 64x64 when the MaxSbtSize. However, the setting of implicitMTS only checks Max (width, height ) is less than or equal to 32. In addition, when cu_sbt_flag is equal to 1, implicitMTS shall be set to 1 automatically, there is no need to check the transform size.

6. TS is part of MTS. However, the enabling/disabling of TS and the maximum TS size are signaled in the PPS, while the MTS enabling/disabling flag is signaled in the SPS.

7. Redundant check of block sizes is identified in the current VVC design for signaling transform_skip_flag and tu_mts_idx.

4. Example embodiments and techniques

The listing of embodiments below should be considered as examples to explain general concepts. These embodiments should not be interpreted in a narrow way. Furthermore, these embodiments can be combined in any manner.

In the following description, a block size is denoted by W*H, wherein W is the block width and H is the block height. The maximum transform block size is denoted by MaxTbW*MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and height, respectively. The minimum transform block size is denoted by MinTbW*MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and height, respectively. It is noted that MRL may represent those technologies that use non-adjacent reference lines in the current picture to predict the current block, and ALWIP may represent those technologies that use matrix-based intra prediction methods. They are not limited to those mentioned in the prior art.

Regarding ISP:

1. It is proposed that Intra Sub-block Partition (ISP) and multiple reference line (MRL) modes may be both enabled (e.g., the reference line may not be the closest one) for coding one block.

a. In one example, all sub-partitions use the same reference line index for intra prediction.

b. Alternatively, only the first K (e.g., K = 1) sub-partitions follow the reference line index (e.g., signaled in the bitstream) . The remaining sub-partitions still use the closest reference line for intra prediction.

c. In one example, whether MRL is applied for the remaining sub-partitions (e.g., sub-partitions except the first sub-partition) may depend on the splitting direction in ISP or/and intra prediction mode or/and dimension of the block.

i. For example, if the block is split in the horizontal direction in ISP, MRL may be applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition, e.g., prediction modes that are less than or equal to 50 in FIG. 2.

ii. For example, if the block is split in the horizontal direction in ISP, MRL may not be applied to the remaining sub-partitions when above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition, e.g., prediction modes that are greater than 50 in FIG. 2.

iii. For example, if the block is split in the vertical direction in ISP, MRL may be applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition, e.g., prediction modes that are greater than or equal to 18 in FIG. 2.

iv. For example, if the block is split in the vertical direction in ISP, MRL may not be applied to the remaining sub-partitions when bottom-left neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition, e.g., prediction modes that are less than 18 in FIG. 2.

2. Indications of ISP mode information (e.g., on/off, splitting direction) may be signaled before the signaling of MRL related information.

a. In one example, when ISP mode is enabled for one block, the signaling of MRL related information may be skipped, e.g., the reference line index.

i. Alternatively, furthermore, the reference line index is inferred to be 0.

3. ALWIP and ISP may be both enabled for one block.

a. Alternatively, furthermore, the matrix selection of one sub-partition may depend on the intra mode and/or dimension of the sub-partition.

b. Alternatively, furthermore, indications of ALWIP modes (e.g., intra_lwip_flag and related intra modes) and indications of ISP modes (e.g., intra_subpartitions_mode_flag and intra_subpartitions_split_flag)

4. Transform skip (TS) and ISP may be both enabled for one block.

a. Alternatively, furthermore, indication of enabling/disabling transform skip mode may be further signaled even when ISP mode is enabled (e.g., IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT) .

b. Alternatively, furthermore, whether to signal the indication of enabling/disabling transform skip mode may depend on whether the video content is screen content or not.

i. In one example, it may depend on a flag signaled in picture/slice/tile group/tile/brick-level.

ii. In one example, if the video content is screen content, the indication of enabling/disabling transform skip mode may be signaled. Alternatively, if the video content is camera content, the indication of enabling/disabling transform skip mode may be skipped and the TS mode is disabled for ISP coded blocks.

5. It is proposed that only one quantization parameter, and/or one quantization step, and/or one scaling matrix may be allowed for ISP coded blocks. That is, all sub-partitions shall share the same quantization information.

a. In one example, one quantization parameter may be represented by cu_qp_delta_abs, and cu_qp_delta_sign_flag.

b. In one example, the quantization parameter information may be signaled for an ISP coded block only when there is at least one coefficient not equal to zero in at least one sub-partition.

c. In one example, the quantization parameter, and/or one quantization step, and/or one scaling matrix may be signaled once for the whole ISP coded block instead of being signaled for each sub-partition.

i. In one example, the information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition.

ii. In one example, the information may be signaled together with the first sub-partition in the encoding/decoding order.

iii. In one example, the information may be signaled together with the last sub-partition in the encoding/decoding order.

iv. In one example, the information may be signaled together with the m-th sub-partition in the encoding/decoding order wherein m is no larger than the total number of allowed sub-partitions.

6. It is proposed that reference samples located in a first sub-partition to predict a second sub-partition in an ISP coded block may be further modified (e.g., may be filtered) before being used as prediction.

a. In one example, whether to modify (e.g., filter) reference samples before being used as prediction may depend on block width and/or height.

b. In one example, whether to modify (e.g., filter) reference samples before being used as prediction may depend on the intra-prediction mode.

7. Indications of MaxTbW and/or MaxTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.

a. In one example, they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.

b. MaxTbW and/or MaxTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.

8. Indications of MinTbW and/or MinTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.

b. MinTbW and/or MinTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.

9. Mixed splitting directions may be enabled for ISP coded blocks wherein the block may be split for both horizontal and vertical directions.

a. In one example, the binary value of splitting direction coded for the ISP mode (e.g., intra_subpartitions_split_flag) may be replaced by an index of splitting directions.

b. In one example, the set of allowed splitting directions may depend on block dimension.

i. Alternatively, indications of the set of allowed splitting directions may be signaled.

c. In one example, the set of allowed splitting directions may depend on the intra prediction mode.

d. In one example, when W/MaxTbW and H/MaxTbH are both greater than M (e.g., M=1) , mixed splitting directions may be enabled, wherein both horizontal and vertical splitting may be invoked.

i. An example of mixed splitting direction is depicted in FIG. 17.

ii. Alternatively, when W/MaxTbW or H/MaxTbH is greater than M (e.g., M=1) , the mixed splitting directions may be enabled.

iii. In one example, a block may be split horizontally first followed by being split vertically when mixed ISP is applied.

1) Alternatively, a block may be split vertically first followed by being split horizontally when mixed ISP is applied. FIG. 17 shows an example of mixed splitting (also known as quad-tree splitting) .

10. Whether to and/or how to apply ISP on a block may depend on the relationship between the block dimensions W×H, and/or maximum and/or minimum transform block sizes.

a. In one example, if W/MinTbW and H/MinTbH are both equal to 1, ISP is disabled.

b. In one example, how to split the block may depend on the minimum transform block sizes.

i. In one example, if W/MinTbW is equal to K (K> 1) and H/MinTbH is equal to 1, ISP may be enabled and vertical splitting is applied.

ii. In one example, if W/MinTbW is equal to 1 and H/MinTbH is equal to K (K> 1) , ISP may be enabled and horizontal splitting is applied.

iii. Alternatively, furthermore, there is no need to signal the prediction direction.

iv. Alternatively, furthermore, the block may be split to K sub-partitions.

c. In one example, ISP mode is disabled when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1.

i. Alternatively, ISP mode is disabled when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, such as 4.

ii. Alternatively, ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than 1.

iii. Alternatively, ISP mode is disabled when either W/MaxTbW or H/MaxTbH is greater than a threshold, such as 2 or 4.

iv. Alternatively, ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than a threshold, such as 2 or 4.

d. In one example, ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater (or no smaller) than a first threshold, and no greater (or smaller) than a second threshold.

i. Alternatively, ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater than a first threshold, and smaller than a second threshold.

ii. In one example, the first and second thresholds are 1, and 4, respectively.

iii. Alternatively, furthermore, the signaling of the splitting direction (e.g., intra_subpartitions_split_flag) may be skipped and the block may be split according to certain rules.

1) In one example, the quad-tree splitting may be applied first, followed by horizontal binary tree splitting, then vertical binary tree splitting.

2) In one example, the splitting of one partition tree may be terminated once either width reaches the MaxTbW or height reaches the MaxTbH.

a. Alternatively, the splitting of one partition tree may be terminated once both width reaches the MaxTbW and height reaches the MaxTbH.

b. Alternatively, the splitting of one partition tree may be terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N wherein M and N are two positive integers.

e. When ISP mode is disabled, signaling of the related information such as intra_subpartitions_mode_flag is skipped.

11. In one example, more than 4 sub-partitions and/or more than one splitting direction (such as both horizontal and vertical splitting are invoked) may be enabled.

i. Alternatively, the above method may be enabled under certain conditions.

ii. In one example, when W/MaxTbW > 4 and/or H/MaxTbH > 4.
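As a non-limiting illustration, two of the rules in the list above can be sketched as follows: the dependency of MRL on the splitting direction and intra prediction mode (item 1.c) and the dependency of ISP on the minimum transform block size (items 10.a and 10.b). The function names and return conventions are hypothetical, the mode thresholds 50 and 18 are those quoted in item 1.c, and W and H are assumed to be integer multiples of MinTbW and MinTbH.

```python
def mrl_for_remaining(split_horizontal, intra_pred_mode):
    """Item 1.c: may MRL be applied to the sub-partitions after the first one?"""
    if split_horizontal:
        return intra_pred_mode <= 50   # modes above 50 use above-right samples
    return intra_pred_mode >= 18       # modes below 18 use bottom-left samples


def isp_split_from_min_tb(w, h, min_tbw, min_tbh):
    """Items 10.a/10.b: derive ISP usage from the minimum transform block size.

    Returns None when ISP is disabled, otherwise (direction, sub_partition_count).
    The direction "signalled" stands for cases these items do not cover.
    """
    kw, kh = w // min_tbw, h // min_tbh
    if kw == 1 and kh == 1:
        return None                    # block already at the minimum size
    if kh == 1:
        return ("vertical", kw)        # only the width can be divided
    if kw == 1:
        return ("horizontal", kh)      # only the height can be divided
    return ("signalled", None)


wide_block = isp_split_from_min_tb(16, 4, 4, 4)
```

For a 16x4 block with a 4x4 minimum transform block, only vertical splitting is possible in this sketch, so the splitting direction does not need to be signaled.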

Regarding MTS:

12. It is proposed to keep only two transforms (and the corresponding inverse transforms) for the explicit MTS design. For example, the two transforms may be DCT-II and DST-VII (and the corresponding inverse transforms) .

a. In one example, there are only two choices in terms of transform selection. Alternatively, furthermore, TS mode may be a third choice if it is applicable.

i. In one example, one choice is DCT-II for both horizontal and vertical transform; and the other one is DST-VII.

ii. One bit may be coded to indicate which of the two transforms is used.

b. Alternatively, there are four choices in terms of transform selection. Alternatively, furthermore, TS mode may be a fourth choice if it is applicable.

i. The choices include: DCT-II/DST-VII for both horizontal and vertical transform; joint usage of DCT-II and DST-VII, each one for the horizontal or vertical transforms.

ii. In one example, fixed length coding may be utilized to code the four choices.

iii. Alternatively, truncated unary may be utilized to code the four choices.

1) Some examples of bin strings for the four choices are tabulated as follows:

(hor, ver)	Method #1	Method #2	Method #3	Method #4
(DCT-II, DCT-II)	0	0	0	0
(DST-VII, DST-VII)	1 0	1 1 0	1 1 0	1 0
(DCT-II, DST-VII)	1 1 0	1 0	1 1 1	1 1 1
(DST-VII, DCT-II)	1 1 1	1 1 1	1 0	1 1 0

13. It is proposed that the allowed transform sets and/or signaling of transform index in explicit MTS may depend on the block dimension.

a. In one example, for blocks with width and/or height no larger (or smaller) than a threshold, DCT-II and DST-VII may be allowed.

b. In one example, for blocks with width and/or height larger (or no smaller) than a threshold, DCT-II, DST-VII and DCT-VIII may be allowed.

c. Alternatively, furthermore, transform skip mode may be enabled.

d. In one example, the allowed transform sets may depend on coded mode.

i. In one example, for IBC coded blocks, the two-transformation basis (TS and DST-VII) may be allowed.

ii. In one example, for non-IBC coded blocks, the two-transformation basis (DCT-II and DST-VII) may be allowed or the three-transformation basis (TS, DCT-II and DST-VII) may be allowed.

e. How to signal the transform index may be changed according to the allowed transform sets.

14. Indications of the maximum allowed transform size (non-TS mode) used in MTS may be signaled.

a. In one example, they may be signaled in sequence/picture/slice/tile group/tile/brick level, or other kinds of video unit level.

i. In one example, they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.

b. In one example, they may not be signaled, but derived from the allowed maximum TS size.

c. In one example, indications of the maximum allowed transform size (non-TS mode) may control both maximum TS sizes and maximum sizes used in other transform matrix.

i. Alternatively, furthermore, there is no need to signal maximum sizes for non-TS and TS modes separately.

d. In one example, indications of the maximum allowed transform size may control both implicit and explicit MTS transform sizes.

i. Alternatively, furthermore, whether to apply the implicit MTS may depend on the signaled sizes.

15. It is proposed to align the maximum allowed transform size (non-TS mode) used in MTS and maximum allowed block size used in TS mode.

a. In one example, the maximum allowed transform size (non-TS mode) used in MTS and maximum allowed block size used in TS mode may be the same number.

16. It is proposed that the derivation of implicit MTS enabling flag is independent from the block dimension.

a. Alternatively, furthermore, the checking of block size in the derivation of implicit MTS enabling flag is skipped.

17. The shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) may be removed.

a. In one example, if all of the following shared conditions are true, MTS information may be further signaled. Otherwise, there is no need to signal the MTS information.

- tu_cbf_luma [x0] [y0]

- treeType != DUAL_TREE_CHROMA

- !cu_sbt_flag

b. Alternatively, furthermore, when the shared condition check of other rules (e.g., those mentioned above) returns true, a condition check of the block dimension against the allowed maximum TS sizes may be applied before signaling transform_skip_flag; and a condition check of the block dimension against the maximum allowed MTS sizes (e.g., fixed to be 32x32) may be applied before signaling tu_mts_idx.

18. The shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) is kept unchanged, while the condition check of block dimension before signaling the transform matrix index (non-TS mode) may be removed.
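As a non-limiting illustration, the signaling conditions of item 17 can be sketched as follows. The function name and the string encoding of treeType are hypothetical, and other conditions that may gate these syntax elements in a complete design (for example, sequence-level enabling flags) are intentionally left out of this sketch.

```python
def mts_info_signalling(tu_cbf_luma, tree_type, cu_sbt_flag,
                        width, height, max_ts_size, max_mts_size=32):
    """Return (signal_transform_skip_flag, signal_tu_mts_idx).

    The shared check carries no block-dimension test (item 17.a); each syntax
    element then applies its own size check (item 17.b).
    """
    shared = bool(tu_cbf_luma) and tree_type != "DUAL_TREE_CHROMA" and not cu_sbt_flag
    signal_ts = shared and width <= max_ts_size and height <= max_ts_size
    signal_mts = shared and width <= max_mts_size and height <= max_mts_size
    return (signal_ts, signal_mts)


# A 16x16 luma block with a maximum transform skip size of 4.
result_16x16 = mts_info_signalling(True, "SINGLE_TREE", False, 16, 16, max_ts_size=4)
```

In this sketch the 16x16 block exceeds the assumed maximum TS size, so only tu_mts_idx remains a candidate for signaling.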

4.1 Example setting of implicit MTS flag

Some proposed changes to VVC working draft version 5 JVET_N1001_v2 are described in this example. The underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions.

In general, inputs to the transformation process for scaled transform coefficients are:

- a luma location (xTbY, yTbY ) specifying the top-left sample of the current luma transform block relative to the top left luma sample of the current picture,

- a variable nTbW specifying the width of the current transform block,

- a variable nTbH specifying the height of the current transform block,

- a variable cIdx specifying the colour component of the current block,

- an (nTbW) x (nTbH) array d [x] [y] of scaled transform coefficients with x = 0.. nTbW -1, y = 0.. nTbH -1.

Output of this process is the (nTbW) x (nTbH) array r [x] [y] of residual samples with x = 0.. nTbW -1, y = 0.. nTbH -1.

The variable implicitMtsEnabled is derived as follows:

- If sps_mts_enabled_flag is equal to 1 and one of the following conditions is true, implicitMtsEnabled is set equal to 1:

- IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT

- cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to MaxSBTSize

That is, Max (nTbW, nTbH ) is compared against MaxSBTSize instead of a fixed number 32.

- Otherwise, implicitMtsEnabled is set equal to 0.

- If cIdx is greater than 0, trTypeHor and trTypeVer are set equal to 0.

- Otherwise, if implicitMtsEnabled is equal to 1, the following applies:

- If IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified depending on intraPredMode.

- Otherwise, if cu_sbt_flag is equal to 1, trTypeHor and trTypeVer are specified depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.

- Otherwise (sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA ) , trTypeHor and trTypeVer are derived as follows:

trTypeHor = (nTbW >= 4 && nTbW <= 16 && nTbW <= nTbH ) ? 1 : 0 (8-1030)

trTypeVer = (nTbH >= 4 && nTbH <= 16 && nTbH <= nTbW ) ? 1 : 0 (8-1031)

- Otherwise, trTypeHor and trTypeVer are specified in Table 8-13 depending on tu_mts_idx [xTbY] [yTbY] .

Alternatively, the condition check ‘cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32’ in the determination of implicitMtsEnabled may be replaced by ‘cu_sbt_flag is equal to 1’.
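As a non-limiting illustration, the three forms of the sub-block transform condition discussed in this example can be compared with the following sketch. The function name and the variant labels are hypothetical, and the value 64 used for MaxSBTSize is an assumption chosen to match the 64x64 case raised in Section 3.

```python
def sbt_implicit_condition(cu_sbt_flag, n_tbw, n_tbh, variant, max_sbt_size=64):
    """Evaluate the cu_sbt_flag condition used when deriving implicitMtsEnabled."""
    if variant == "fixed_32":        # current design: fixed number 32
        return bool(cu_sbt_flag) and max(n_tbw, n_tbh) <= 32
    if variant == "max_sbt_size":    # proposal: compare against MaxSBTSize
        return bool(cu_sbt_flag) and max(n_tbw, n_tbh) <= max_sbt_size
    if variant == "flag_only":       # alternative: no size check at all
        return bool(cu_sbt_flag)
    raise ValueError("unknown variant")


# A 64x32 transform block coded with sub-block transform.
outcomes = {v: sbt_implicit_condition(1, 64, 32, v)
            for v in ("fixed_32", "max_sbt_size", "flag_only")}
```

Under these assumptions, the fixed threshold rejects the 64x32 block while both proposed forms accept it.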

4.2 Example setting of explicit MTS flag

Some proposed changes to VVC working draft version 5 JVET_N1001_v2 are described in this example. The underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions. This section provides examples for redundant check removal during the MTS signaling process.

Alternatively, the following may apply:

FIG. 16 is a block diagram of a video processing apparatus 1600. The apparatus 1600 may be used to implement one or more of the methods described herein. The apparatus 1600 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 1600 may include one or more processors 1602, one or more memories 1604 and video processing hardware 1606. The processor (s) 1602 may be configured to implement one or more methods described in the present document. The memory (memories) 1604 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 1606 may be used to implement, in hardware circuitry, some techniques described in the present document.

FIG. 18 is a flowchart for a method 1800 of video processing in accordance with one or more examples of the present technology. The method 1800 includes, at operation 1802, partitioning a block of video data into sub-blocks using a partitioning pattern. The method 1800 includes, at operation 1804, performing prediction for one sub-block using at least one line of reference video data not adjacent to the current sub-block. The method 1800 also includes, at operation 1806, generating a residual signal for the sub-block based on the prediction.

FIG. 19 is a flowchart for a method 1900 of video processing in accordance with one or more examples of the present technology. The method 1900 includes, at operation 1902, partitioning a block of video data into sub-blocks using a partitioning pattern. The method 1900 includes, at operation 1904, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks. The method 1900 also includes, at operation 1906, generating a residual signal for the sub-blocks based on the predictions.

FIG. 20 is a flowchart for a method 2000 of video processing in accordance with one or more examples of the present technology. The method 2000 includes, at operation 2002, performing predictions for a block of video data to generate a residual signal. The method 2000 includes, at operation 2004, performing an explicit transformation of the residual signal using one of two transformations. The method 2000 includes, at operation 2006, encoding an output from the explicit transformation.

Additional embodiments and techniques are described in the following examples.

1. A video processing method, comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and generating a residual signal for the current sub-block based on the predictions.

2. The method of example 1, wherein all the sub-blocks use a same reference line index of reference video data for the predictions.

3. The method of example 1, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that corresponds to the closest line of reference video data to the sub-block.

4. The method of example 1, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that is determined based on a partitioning direction of the sub-blocks, a prediction mode, or the dimension of the block.

Further embodiments of examples 1-4 are described in items 1 and 2 in Section 4.

5. A video processing method, comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and generating a residual signal for the sub-blocks based on the predictions.

6. The method of example 5, wherein the matrix vector of a sub-block is selected based on an intra mode or a dimension of the sub-block.

Further embodiments of examples 5-6 are described in items 3-4 in Section 4.

7. A video processing method, comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and reconstructing the current sub-block using the predictions.

8. The method of example 7, wherein all the sub-blocks use a same reference line index of reference video data for the predictions.

9. The method of example 7, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that corresponds to the closest line of reference video data to the sub-block.

10. The method of example 7, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that is determined based on a partitioning direction of the sub-blocks, a prediction mode, or the dimension of the block.

Further embodiments of examples 7-10 are described in items 1 and 2 in Section 4.

11. A video processing method, comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and reconstructing the sub-blocks using the predictions.

12. The method of example 11, wherein the matrix vector of a sub-block is selected based on an intra mode or a dimension of the sub-block.

Further embodiments of examples 11-12 are described in items 3-4 in Section 4.

13. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern; and transforming a residual signal for the sub-blocks based on the predictions, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.

14. A video processing method, comprising: receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing inverse transform on a residual signal of the sub-blocks, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream; and reconstructing the sub-blocks using an output from the inverse transform.

15. The method of example 13 or 14, wherein the maximum transform block dimension or the minimum transform block dimension is set to different values in different profiles, levels, or tiers.

Further embodiments of examples 13-15 are described in items 7-8 in Section 4.

16. A video processing method, comprising: receiving or transmitting a bitstream representing a block of video data for performing video processing, wherein the block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream, and wherein the sub-blocks share same quantization information.

17. The method of example 16, wherein the quantization information comprises a quantization parameter, a quantization step, or a scaling matrix.

18. The method of example 16, wherein the quantization information for all the sub-blocks in the block of video data is coded once in the bitstream.

Further embodiments of examples 16-18 are described in item 5 in Section 4.

19. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block; and encoding or reconstructing the block of video data based on the predictions.

Further embodiments of example 19 are described in item 6 in Section 4.

20. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein the sub-blocks are partitioned in multiple partitioning directions; and encoding or reconstructing the block of video data based on the predictions.

21. The method of example 20, wherein the multiple partitioning directions are determined by the dimension of the block.

Further embodiments of examples 20-21 are described in item 9 in Section 4.

22. The method of any of examples 1 to 21, wherein the sub-blocks are partitioned based on a minimum transform block dimension or a maximum transform block dimension.

Further embodiments of example 22 are described in items 10 and 11 in Section 4.

23. A video processing method, comprising: performing predictions for a block of video data to generate a residual signal; performing an explicit transformation of the residual signal using one of two transformations; and encoding an output from the explicit transformation.

24. The method of example 23, comprising: coding a transformation option in a bitstream representing the block of video data based on variations of the two transformations.

Further embodiments of examples 23-24 are described in items 12-13 in Section 4.

25. The method of example 23 or 24, comprising: signaling information about the explicit transformation without checking a dimension of the block of video data.

Further embodiments of example 25 are described in items 16-18 in Section 4.

26. A video processing method, comprising: receiving a block of video data partitioned into one or more sub-blocks; performing an explicit transformation of the block of video data using one of two inverse transformations; and reconstructing the block of video data based on the explicit transformation.

27. The method of any of examples 23 to 26, wherein the explicit transformation is performed in one or more transformation directions that include a horizontal direction and a vertical direction.

28. The method of example 27, wherein different transform directions use different transformations.

29. The method of example 27, wherein different transform directions use a same transformation.

Further embodiments of examples 27-29 are described in items 12-13 in Section 4.

30. The method of any of examples 23 to 29, wherein a maximum allowed transform size is coded in a level of a sequence, a picture, a slice, a tile, or a brick in a bitstream representing the block of video data.

31. The method of any of examples 23 to 30, comprising: deriving a maximum allowed transform size based on an allowed maximum transform skip size.

Further embodiments of examples 30-31 are described in items 14-15 in Section 4.

32. A video processing apparatus comprising a processor configured to implement one or more of examples 1 to 31.

33. A computer-readable medium having code stored thereon, the code, when executed by a processor, causing the processor to implement a method recited in any one or more of examples 1 to 31.

FIG. 21 is a flowchart for a method 2100 of video processing in accordance with one or more examples of the present technology. The method 2100 includes, at 2102, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; at 2104, enabling a second mode different from the ISP mode for the block; and at 2106, performing the conversion based on the ISP mode and the second mode.

In some examples, the second mode is multiple reference line (MRL) mode.

In some examples, in the MRL mode, a reference line which is not the closest one of multiple reference lines is available for intra prediction of the block.

In some examples, all sub-partitions of the block use a same reference line index for intra prediction of the block.

In some examples, only first K sub-partitions of all sub-partitions of the block use a same reference line index for intra prediction, and the remaining sub-partitions use the closest reference line for intra prediction of the block, K being an integer.

In some examples, K = 1.
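As a non-limiting illustration, the assignment of reference line indices to sub-partitions may be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes that index 0 denotes the closest reference line.

```python
def reference_line_indices(num_sub_partitions, signaled_index, k=1):
    """Return the reference line index used by each sub-partition.

    The first k sub-partitions use the signaled index; the remaining
    ones use index 0, assumed here to denote the closest reference line.
    """
    if not 0 <= k <= num_sub_partitions:
        raise ValueError("k must lie between 0 and the number of sub-partitions")
    return [signaled_index if i < k else 0 for i in range(num_sub_partitions)]


# With K = 1, only the first sub-partition uses the signaled line.
print(reference_line_indices(4, 2))
```

Setting k equal to the number of sub-partitions reproduces the variant in which all sub-partitions share the same reference line index.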

In some examples, the reference line index is signaled in the bitstream.

In some examples, whether MRL mode is applied for the remaining sub-partitions depends on splitting direction in ISP mode or/and intra prediction mode or/and size of the block.

In some examples, if the block is split in horizontal direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference samples of the block are used in the intra prediction of a first sub-partition.

In some examples, if the block is split in horizontal direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.

In some examples, if the block is split in vertical direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.

In some examples, if the block is split in vertical direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when bottom-left neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
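As a non-limiting illustration, the four direction-dependent conditions above may be combined into a single decision, sketched below. The function name, the string labels for splitting directions and the labels for neighboring regions are hypothetical and are introduced only for this sketch.

```python
HORIZONTAL, VERTICAL = "horizontal", "vertical"


def mrl_for_remaining(split_direction, used_neighbors):
    """Decide whether MRL applies to the sub-partitions after the first.

    used_neighbors is the set of neighboring regions read by the intra
    prediction of the first sub-partition, drawn from "bottom_left",
    "left", "above_left", "above" and "above_right".
    """
    if split_direction == HORIZONTAL:
        blocking = "above_right"   # horizontal split: above-right disables MRL
    elif split_direction == VERTICAL:
        blocking = "bottom_left"   # vertical split: bottom-left disables MRL
    else:
        raise ValueError("unknown splitting direction")
    return blocking not in used_neighbors


print(mrl_for_remaining(HORIZONTAL, {"left", "above"}))
print(mrl_for_remaining(VERTICAL, {"bottom_left", "left"}))
```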

In some examples, indications of ISP mode information are signaled before the signaling of MRL mode related information.

In some examples, the ISP mode information includes at least one of on/off flag and splitting direction, and the MRL mode related information includes the reference line index.

In some examples, when ISP mode is enabled for the block, the signaling of MRL related information is skipped.

In some examples, when ISP mode is enabled for the block, the reference line index is inferred to be 0.

In some examples, the second mode is a matrix based intra prediction (MIP) mode.

In some examples, matrix selection of one sub-partition depends on intra mode and/or size of the sub-partition.

In some examples, indications of the MIP modes and indications of the ISP modes are signaled for the block.

In some examples, the second mode is Transform skip (TS) mode.

In some examples, indication of enabling/disabling TS mode is further signaled even when ISP mode is enabled.

In some examples, whether to signal the indication of enabling/disabling TS mode depends on whether video content of the video is screen content or not.

In some examples, whether to signal the indication of enabling/disabling TS mode depends on a flag signaled in at least one of picture, slice, tile group, tile and brick-level.

In some examples, if the video content is screen content, the indication of enabling/disabling transform skip mode is signaled.

In some examples, if the video content is camera content, the indication of enabling/disabling transform skip mode is skipped and the TS mode is disabled for the blocks.
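As a non-limiting illustration, content-dependent parsing of the TS indication for an ISP-coded block may be sketched as follows. The function name and the read_bit callback are hypothetical, and how the content type is determined (for example, from a higher-level flag) is outside the sketch.

```python
def parse_ts_enabled(is_screen_content, read_bit):
    """Return whether TS mode is enabled for an ISP-coded block.

    read_bit is a callable returning the next bit of the bitstream; it
    is invoked only when the TS indication is actually signaled.
    """
    if is_screen_content:
        return bool(read_bit())   # indication is present in the bitstream
    return False                  # camera content: not signaled, TS disabled


bits = iter([1])
print(parse_ts_enabled(True, lambda: next(bits)))
print(parse_ts_enabled(False, lambda: next(bits)))
```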

In some examples, all sub-partitions share the same quantization information including at least one of quantization parameter, quantization step and scaling matrix.

In some examples, the quantization parameter is represented by cu_qp_delta_abs and cu_qp_delta_sign_flag.

In some examples, the quantization information is signaled for the block only when there is at least one coefficient not equal to zero in at least one sub-partition.

In some examples, the quantization information is signaled once for the whole block instead of being signaled for each sub-partition.
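As a non-limiting illustration, block-level signaling of a shared quantization parameter may be sketched as follows. The function name and the event-list representation are hypothetical. The sketch assumes the usual convention that cu_qp_delta_sign_flag is coded only for a non-zero delta and that a value of 1 denotes a negative delta.

```python
def block_qp_syntax(sub_partition_coeffs, qp_delta):
    """List the quantization syntax elements coded for one ISP block.

    sub_partition_coeffs holds one list of coefficients per
    sub-partition. The delta is coded once for the whole block, and only
    when at least one sub-partition has a non-zero coefficient.
    """
    has_nonzero = any(c != 0 for coeffs in sub_partition_coeffs for c in coeffs)
    if not has_nonzero:
        return []
    elements = [("cu_qp_delta_abs", abs(qp_delta))]
    if qp_delta != 0:
        elements.append(("cu_qp_delta_sign_flag", 1 if qp_delta < 0 else 0))
    return elements


print(block_qp_syntax([[0, 0], [0, 3]], -2))
print(block_qp_syntax([[0, 0], [0, 0]], -2))
```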

In some examples, the quantization information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition, where m is an integer.

In some examples, the quantization information is signaled together with a first sub-partition in encoding or decoding order.

In some examples, the quantization information is signaled together with the last sub-partition in encoding or decoding order.

In some examples, the quantization information is signaled together with the m-th sub-partition in encoding or decoding order, wherein m is an integer no larger than the total number of allowed sub-partitions.

FIG. 22 is a flowchart for a method 2200 of video processing in accordance with one or more examples of the present technology. The method 2200 includes, at 2202, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition and used to predict a second sub-partition in the block are further modified before being used as prediction; and at 2204, performing the conversion based on the ISP mode.

In some examples, the reference samples are filtered before being used as prediction.

In some examples, whether to modify the reference samples before being used as prediction depends on width and/or height of the block.

In some examples, whether to modify the reference samples before being used as prediction depends on intra-prediction mode of the block.

In some examples, block size of the block is denoted by W*H, wherein W is the block width and H is the block height, a maximum transform block size of the block is denoted by MaxTbW *MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and maximum transform block height, respectively, and a minimum transform block size of the block is denoted by MinTbW *MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and minimum transform block height, respectively.

In some examples, indications of MaxTbW and/or MaxTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.

In some examples, the indications of MaxTbW and/or MaxTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.

In some examples, MaxTbW and/or MaxTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.

In some examples, indications of MinTbW and/or MinTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.

In some examples, the indications of MinTbW and/or MinTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.

In some examples, MinTbW and/or MinTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.

FIG. 23 is a flowchart for a method 2300 of video processing in accordance with one or more examples of the present technology. The method 2300 includes, at 2302, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions for both horizontal and vertical directions; and at 2304, performing the conversion based on the ISP mode.

In some examples, binary value of splitting direction coded for the ISP mode is replaced by an index of splitting directions.

In some examples, the set of allowed splitting directions depends on block size.

In some examples, indications of set of allowed splitting directions are signaled.

In some examples, the set of allowed splitting directions depends on intra prediction mode of the block.

In some examples, when W/MaxTbW and H/MaxTbH are both greater than M, the mixed splitting directions are enabled, M being an integer.

In some examples, when W/MaxTbW or H/MaxTbH is greater than M, the mixed splitting directions are enabled, M being an integer.

In some examples, M=1.
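As a non-limiting illustration, the two alternative conditions for enabling mixed splitting directions may be sketched as follows. The function name and the require_both parameter are hypothetical. The ratios are compared by cross-multiplication so that only integer arithmetic is used.

```python
def mixed_split_enabled(w, h, max_tb_w, max_tb_h, m=1, require_both=True):
    """Return whether mixed splitting directions are enabled.

    require_both=True models the variant in which W/MaxTbW and H/MaxTbH
    must both exceed M; require_both=False models the variant in which
    either ratio exceeding M suffices.
    """
    wide = w > m * max_tb_w    # equivalent to W/MaxTbW > M
    tall = h > m * max_tb_h    # equivalent to H/MaxTbH > M
    return (wide and tall) if require_both else (wide or tall)


print(mixed_split_enabled(128, 128, 64, 64))
print(mixed_split_enabled(128, 64, 64, 64))
print(mixed_split_enabled(128, 64, 64, 64, require_both=False))
```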

In some examples, the block is split by using quad-tree splitting.

In some examples, the block is split horizontally first followed by being split vertically when the mixed splitting directions are applied.

In some examples, the block is split vertically first followed by being split horizontally when the mixed splitting directions are applied.

In some examples, whether and/or how to apply ISP mode to the block depends on the relationship between the block size W*H and the maximum transform block size MaxTbW*MaxTbH and/or the minimum transform block size MinTbW*MinTbH.

In some examples, if W/MinTbW and H/MinTbH are both equal to 1, ISP mode is disabled for the block.

In some examples, how to split the block depends on the minimum transform block size of the block.

In some examples, if W/MinTbW is equal to K and H/MinTbH is equal to 1, ISP mode is enabled for the block and vertical splitting is applied to the block, K being an integer larger than 1.

In some examples, if W/MinTbW is equal to 1 and H/MinTbH is equal to K, ISP mode is enabled for the block and horizontal splitting is applied to the block, K being an integer larger than 1.

In some examples, the prediction direction is not needed to be signaled.

In some examples, the block is split to K sub-partitions.
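As a non-limiting illustration, deriving the ISP decision from the minimum transform block size may be sketched as follows. The function name and the returned tuple layout are hypothetical, and the sketch assumes that the block dimensions are positive multiples of the minimum transform block dimensions. Cases not covered by the examples above return None.

```python
def isp_split_from_min_tb(w, h, min_tb_w, min_tb_h):
    """Return (isp_enabled, split_direction, num_sub_partitions).

    Returns None when both ratios exceed 1, a case the rules sketched
    here leave undetermined.
    """
    if w <= 0 or h <= 0 or w % min_tb_w or h % min_tb_h:
        raise ValueError("dimensions must be positive multiples of the minimum transform size")
    kw, kh = w // min_tb_w, h // min_tb_h
    if kw == 1 and kh == 1:
        return (False, None, 1)          # block cannot be split further
    if kh == 1:
        return (True, "vertical", kw)    # only the width can be divided
    if kw == 1:
        return (True, "horizontal", kh)  # only the height can be divided
    return None


print(isp_split_from_min_tb(16, 4, 4, 4))
print(isp_split_from_min_tb(4, 8, 4, 4))
```

Because the splitting direction follows from the dimensions in these cases, it would not need to be coded in the bitstream.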

In some examples, when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1, ISP mode is disabled for the block.

In some examples, when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 4.

In some examples, when both W/MaxTbW and H/MaxTbH are greater than 1, ISP mode is disabled for the block.

In some examples, when either W/MaxTbW or H/MaxTbH is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.

In some examples, when both W/MaxTbW and H/MaxTbH are greater than or equal to a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.

In some examples, when both W/MaxTbW and H/MaxTbH are greater than or equal to a first threshold and smaller than or equal to a second threshold, ISP mode is disabled for the block.

In some examples, when both W/MaxTbW and H/MaxTbH are greater than a first threshold and smaller than a second threshold, ISP mode is disabled for the block.

In some examples, the first threshold is 1 and the second threshold is 4.
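As a non-limiting illustration, several of the alternative rules for disabling ISP mode may be sketched as one function. The function name and the rule labels are hypothetical. Ratios are compared by cross-multiplication to stay in integer arithmetic.

```python
def isp_disabled(w, h, max_tb_w, max_tb_h, rule="either_gt", threshold=1):
    """Return whether ISP mode is disabled under the selected rule.

    either_gt: W/MaxTbW or H/MaxTbH is greater than the threshold.
    both_gt:   both ratios are greater than the threshold.
    both_ge:   both ratios are greater than or equal to the threshold.
    area_gt:   W*H/(MaxTbW*MaxTbH) is greater than the threshold.
    """
    tw, th = threshold * max_tb_w, threshold * max_tb_h
    if rule == "either_gt":
        return w > tw or h > th
    if rule == "both_gt":
        return w > tw and h > th
    if rule == "both_ge":
        return w >= tw and h >= th
    if rule == "area_gt":
        return w * h > threshold * max_tb_w * max_tb_h
    raise ValueError("unknown rule")


print(isp_disabled(128, 64, 64, 64))
print(isp_disabled(128, 64, 64, 64, rule="both_gt"))
print(isp_disabled(256, 128, 64, 64, rule="area_gt", threshold=4))
```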

In some examples, signaling of splitting direction is skipped and the block is split according to certain rules.

In some examples, the quad-tree splitting is applied to the block first, followed by horizontal binary tree splitting, then vertical binary tree splitting.

In some examples, the splitting of one partition tree is terminated once either width reaches the MaxTbW or height reaches the MaxTbH.

In some examples, the splitting of one partition tree is terminated once both width reaches the MaxTbW and height reaches the MaxTbH.

In some examples, the splitting of one partition tree is terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N, wherein M and N are two positive integers.
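As a non-limiting illustration, one possible reading of the rule-based splitting is sketched below: quad-tree splitting while both dimensions exceed the maximum transform block size, then horizontal binary splitting while only the height exceeds it, then vertical binary splitting while only the width exceeds it. The function name and the (x, y, width, height) tuple layout are hypothetical, and power-of-two block dimensions are assumed.

```python
def split_block(x, y, w, h, max_tb_w, max_tb_h):
    """Split a block until both width and height fit the maximum transform size.

    Returns the resulting sub-partitions as (x, y, width, height) tuples.
    """
    if w > max_tb_w and h > max_tb_h:
        hw, hh = w // 2, h // 2            # quad-tree split
        parts = [(x, y, hw, hh), (x + hw, y, hw, hh),
                 (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    elif h > max_tb_h:
        hh = h // 2                        # horizontal binary split (top, bottom)
        parts = [(x, y, w, hh), (x, y + hh, w, hh)]
    elif w > max_tb_w:
        hw = w // 2                        # vertical binary split (left, right)
        parts = [(x, y, hw, h), (x + hw, y, hw, h)]
    else:
        return [(x, y, w, h)]              # both dimensions fit: stop
    result = []
    for px, py, pw, ph in parts:
        result.extend(split_block(px, py, pw, ph, max_tb_w, max_tb_h))
    return result


print(split_block(0, 0, 64, 128, 64, 64))
```

Replacing max_tb_w and max_tb_h by MaxTbW/M and MaxTbH/N gives the variant with the finer termination size.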

In some examples, when ISP mode is disabled for the block, signaling of related information including intra_subpartitions_mode_flag is skipped.

In some examples, when W/MaxTbW > 4 and/or H/MaxTbH > 4, more than 4 sub-partitions and/or more than one splitting direction are enabled.

FIG. 24 is a flowchart for a method 2400 of video processing in accordance with one or more examples of the present technology. The method 2400 includes, at 2402, determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and at 2404, performing the conversion based on the determined MTS scheme.

In some examples, the MTS scheme is explicit MTS where transform index of the MTS is signaled in the bitstream of the block.

In some examples, the MTS scheme is revised to allow only two transforms, wherein the two transforms are DCT-II and DST-VII.

In some examples, the MTS scheme includes two modes in terms of transform selection.

In some examples, the MTS scheme includes a third mode of transform skip (TS) mode in addition to the two modes of transform selection when the TS mode is applicable.

In some examples, a first mode of the two modes is DCT-II for both horizontal and vertical transform of the block, and a second mode of the two modes is DST-VII for both horizontal and vertical transform of the block.

In some examples, one bit is coded to indicate which mode of the two modes is used.

In some examples, the MTS scheme includes four modes in terms of transform selection.

In some examples, the MTS scheme includes a fifth mode of transform skip (TS) mode in addition to the four modes of transform selection when the TS mode is applicable.

In some examples, a first mode of the four modes is DCT-II for both horizontal and vertical transform of the block, a second mode of the four modes is DST-VII for both horizontal and vertical transform of the block, a third mode of the four modes is DCT-II for horizontal transform of the block and DST-VII for vertical transform of the block, and a fourth mode of the four modes is DST-VII for horizontal transform of the block and DCT-II for vertical transform of the block.

In some examples, fixed length coding is utilized to code the four modes.

In some examples, truncated unary is utilized to code the four modes.
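As a non-limiting illustration, the four transform-selection modes and the two binarization options may be sketched as follows. The table and function names are hypothetical, and the bin polarity of the truncated unary code (ones terminated by a zero) is an assumption of the sketch.

```python
# (horizontal transform, vertical transform) for each of the four modes
MTS_MODES = [
    ("DCT-II", "DCT-II"),
    ("DST-VII", "DST-VII"),
    ("DCT-II", "DST-VII"),
    ("DST-VII", "DCT-II"),
]


def fixed_length_code(index, num_modes=4):
    """Binarize a mode index with a fixed number of bits."""
    if not 0 <= index < num_modes:
        raise ValueError("mode index out of range")
    bits = max(1, (num_modes - 1).bit_length())
    return format(index, "0{}b".format(bits))


def truncated_unary_code(index, num_modes=4):
    """Binarize a mode index as ones followed by a terminating zero.

    The terminating zero is omitted for the last index.
    """
    if not 0 <= index < num_modes:
        raise ValueError("mode index out of range")
    return "1" * index + ("0" if index < num_modes - 1 else "")


for i, mode in enumerate(MTS_MODES):
    print(i, mode, fixed_length_code(i), truncated_unary_code(i))
```

Fixed length coding spends two bits on every mode, whereas truncated unary coding spends one bit on the first mode and up to three bits on the last two.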

In some examples, the allowed transform sets and/or signaling of transform index in explicit MTS depend on the block size.

In some examples, for blocks with width and/or height smaller than or equal to a threshold, DCT-II and DST-VII are allowed.

In some examples, for blocks with width and/or height larger than or equal to a threshold, DCT-II, DST-VII and DCT-VIII are allowed.

In some examples, transform skip (TS) mode is allowed.

In some examples, the allowed transform sets depend on coded mode of the block.

In some examples, for intra block copy (IBC) mode coded blocks, a transform set of two-transformation basis including TS mode and DST-VII is allowed.

In some examples, for non-IBC mode coded blocks, a transform set of two-transformation basis including DCT-II and DST-VII is allowed, or a transform set of three-transformation basis including TS mode, DCT-II and DST-VII is allowed.
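As a non-limiting illustration, selecting the allowed transform set from the coded mode of the block may be sketched as follows. The function name and the ts_applicable parameter, which chooses between the two-transform and three-transform sets for non-IBC blocks, are hypothetical.

```python
def allowed_transform_set(is_ibc, ts_applicable=False):
    """Return the allowed transform set for a block.

    IBC-coded blocks use TS and DST-VII. Other blocks use DCT-II and
    DST-VII, optionally extended by TS when ts_applicable is set.
    """
    if is_ibc:
        return ["TS", "DST-VII"]
    if ts_applicable:
        return ["TS", "DCT-II", "DST-VII"]
    return ["DCT-II", "DST-VII"]


print(allowed_transform_set(True))
print(allowed_transform_set(False, ts_applicable=True))
```

The length of the returned set determines how many values the transform index needs to distinguish.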

In some examples, how to signal the transform index is changed according to the allowed transform sets.

In some examples, indications of the maximum allowed transform size used in non-TS mode of the MTS scheme are signaled.

In some examples, the indications are signaled in at least one of sequence, picture, slice, tile group, tile, brick level or other kinds of video unit level.

In some examples, the indications are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.

In some examples, the indications are derived from the allowed maximum TS size.

In some examples, the indications are used to control both maximum TS sizes and maximum sizes used in other transform matrix.

In some examples, maximum sizes for non-TS and TS modes are not needed to be signaled separately.

In some examples, the indications of the maximum allowed transform size are used to control both implicit MTS transform sizes and explicit MTS transform sizes.

In some examples, whether to apply the implicit MTS depends on the signaled maximum allowed transform sizes.

In some examples, the maximum allowed transform size used in non-TS mode of the MTS scheme is aligned with the maximum allowed transform size used in TS mode.

In some examples, the maximum allowed transform size used in non-TS mode of the MTS scheme is same as the maximum allowed transform size used in TS mode.

In some examples, the MTS scheme is implicit MTS where transform matrix of the MTS is directly derived according to transform block sizes of the block.

In some examples, derivation of implicit MTS enabling flag indicating whether implicit MTS is enabled is independent from block size of the block.

In some examples, checking of the block size in the derivation of implicit MTS enabling flag is skipped.

In some examples, shared condition check of block size before signaling MTS information is removed, the MTS information includes transform_skip_flag and tu_mts_idx.

In some examples, if all of the following shared conditions are true, the MTS information is further signaled:

- tu_cbf_luma[x0][y0]

- treeType != DUAL_TREE_CHROMA

- !cu_sbt_flag,

otherwise, the MTS information is not signaled.

In some examples, when the shared conditions check of certain rules returns true, condition check of block size compared to the allowed maximum TS sizes is applied before signaling transform_skip_flag; and condition check of block size compared to the maximum allowed MTS sizes is applied before signaling tu_mts_idx.
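As a non-limiting illustration, the revised signaling conditions may be sketched as follows: the shared conditions contain no block-size check, and each syntax element is gated by its own size limit. The function name, the parameter names and the use of a single square size limit per element are hypothetical. Other conditions that may apply to tu_mts_idx in a full syntax are omitted.

```python
def mts_elements_to_signal(tu_cbf_luma, tree_type, cu_sbt_flag,
                           tb_w, tb_h, max_ts_size, max_mts_size):
    """Return the MTS-related syntax elements signaled for a transform block."""
    shared = (bool(tu_cbf_luma)
              and tree_type != "DUAL_TREE_CHROMA"
              and not cu_sbt_flag)
    if not shared:
        return []                                    # MTS information not signaled
    elements = []
    if tb_w <= max_ts_size and tb_h <= max_ts_size:
        elements.append("transform_skip_flag")       # gated by the maximum TS size
    if tb_w <= max_mts_size and tb_h <= max_mts_size:
        elements.append("tu_mts_idx")                # gated by the maximum MTS size
    return elements


print(mts_elements_to_signal(1, "SINGLE_TREE", 0, 16, 16, 4, 32))
```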

In some examples, the shared condition check of block size before signaling the MTS information is kept unchanged, while the condition check of block size before signaling the transform matrix index used in non-TS mode of MTS is removed.

In some examples, the conversion generates the block of video from the bitstream representation.

In some examples, the conversion generates the bitstream representation from the block of video.

It will be appreciated that the disclosed techniques may be embodied in video encoders or decoders to improve compression efficiency using techniques that include the use of a reduced dimension secondary transform.

The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims

A method for processing video, comprising:

enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode;

enabling a second mode different from the ISP mode for the block; and

performing the conversion based on the ISP mode and the second mode.
The method of claim 1, wherein the second mode is multiple reference line (MRL) mode.
The method of claim 2, wherein, in the MRL mode, a reference line which is not the closest one of multiple reference lines is available for intra prediction of the block.
The method of any of claims 2-3, wherein all sub-partitions of the block use a same reference line index for intra prediction of the block.
The method of any of claims 2-3, wherein only first K sub-partitions of all sub-partitions of the block use a same reference line index for intra prediction, and the remaining sub-partitions use the closest reference line for intra prediction of the block, K being an integer.
The method of claim 5, wherein K = 1.
The method of any of claims 4-6, wherein the reference line index is signaled in the bitstream.
The method of any of claims 5-7, wherein whether MRL mode is applied for the remaining sub-partitions depends on splitting direction in ISP mode or/and intra prediction mode or/and size of the block.
The method of claim 8, wherein if the block is split in horizontal direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
The method of claim 8, wherein if the block is split in horizontal direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
The method of claim 8, wherein if the block is split in vertical direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
The method of claim 8, wherein if the block is split in vertical direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when bottom-left neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
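The four split-direction conditions above can be read as one rule: MRL is withheld from the remaining sub-partitions only when the first sub-partition draws on the reference region named in the corresponding "not applied" condition. The following is a minimal Python sketch of that reading. The function name, the string labels for directions and reference regions, and the set-based interface are hypothetical and are not part of the claims.

```python
# Hypothetical labels for the neighboring reference regions of the block.
BOTTOM_LEFT, LEFT, ABOVE_LEFT, ABOVE, ABOVE_RIGHT = (
    "bottom_left", "left", "above_left", "above", "above_right")

# Region whose use by the first sub-partition rules out MRL for the
# remaining sub-partitions, keyed by ISP splitting direction.
_EXCLUDING_REGION = {"horizontal": ABOVE_RIGHT, "vertical": BOTTOM_LEFT}


def mrl_for_remaining_subpartitions(split_direction, regions_used):
    """Return True if MRL may be applied to sub-partitions after the first.

    regions_used: iterable of region labels used by the intra prediction
    of the first sub-partition (an assumed input, derived elsewhere from
    the intra prediction mode).
    """
    if split_direction not in _EXCLUDING_REGION:
        raise ValueError(f"unknown split direction: {split_direction!r}")
    return _EXCLUDING_REGION[split_direction] not in set(regions_used)
```

For example, under this reading a horizontally split block whose first sub-partition uses only left and above samples keeps MRL for the remaining sub-partitions, while one that also uses above-right samples does not.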
The method of any of claims 1-12, wherein indications of ISP mode information are signaled before the signaling of MRL mode related information.
The method of claim 13, wherein the ISP mode information includes at least one of on/off flag and splitting direction, and the MRL mode related information includes the reference line index.
The method of claim 14, wherein when ISP mode is enabled for the block, the signaling of MRL related information is skipped.
The method of claim 14, wherein when ISP mode is enabled for the block, the reference line index is inferred to be 0.
The method of claim 1, wherein the second mode is a matrix based intra prediction (MIP) mode.
The method of claim 17, wherein matrix selection of one sub-partition depends on intra mode and/or size of the sub-partition.
The method of claim 17 or 18, wherein indications of the MIP modes and indications of the ISP modes are signaled for the block.
The method of claim 1, wherein the second mode is Transform skip (TS) mode.
The method of claim 20, wherein indication of enabling/disabling TS mode is further signaled even when ISP mode is enabled.
The method of claim 20 or 21, wherein whether to signal the indication of enabling/disabling TS mode depends on whether video content of the video is screen content or not.
The method of claim 22, wherein whether to signal the indication of enabling/disabling TS mode depends on a flag signaled in at least one of picture, slice, tile group, tile and brick-level.
The method of claim 22, wherein if the video content is screen content, the indication of enabling/disabling transform skip mode is signaled.
The method of claim 22, wherein if the video content is camera content, the indication of enabling/disabling transform skip mode is skipped and the TS mode is disabled for the blocks.
The method of any of claims 1-25, wherein all sub-partitions share the same quantization information including at least one of quantization parameter, quantization step and scaling matrix.
The method of claim 26, wherein the quantization parameter is represented by cu_qp_delta_abs and cu_qp_delta_sign_flag.
The method of claim 26, wherein the quantization information is signaled for the block only when there is at least one coefficient not equal to zero in at least one sub-partition.
The method of any of claims 26-28, wherein the quantization information is signaled once for the whole block instead of being signaled for each sub-partition.
The method of claim 29, wherein the quantization information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition, where m is an integer.
The method of claim 29, wherein the quantization information is signaled together with a first sub-partition in encoding or decoding order.
The method of claim 29, wherein the quantization information is signaled together with the last sub-partition in encoding or decoding order.
The method of claim 29, wherein the quantization information is signaled together with the m-th sub-partition in encoding or decoding order, wherein m is an integer no larger than the total number of allowed sub-partitions.
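The alternative placements of the shared quantization information described above can be sketched as a small selection function. This is only an illustration that combines the "at least one non-zero coefficient" condition with three of the placement options. The function name, the policy strings and the list-of-booleans input are hypothetical.

```python
def qp_signalling_position(has_nonzero_coeff, policy="first"):
    """Return the index of the sub-partition that carries the shared
    quantization information, or None if nothing is signaled.

    has_nonzero_coeff: list of booleans, one per sub-partition in coding
    order, True when that sub-partition has a non-zero coefficient.
    policy: "first", "last" or "first_nonzero" (hypothetical names).
    """
    # Assumed gating: signal only if some sub-partition has a coefficient.
    if not any(has_nonzero_coeff):
        return None
    if policy == "first":
        return 0
    if policy == "last":
        return len(has_nonzero_coeff) - 1
    if policy == "first_nonzero":
        # One reading of signaling "with the m-th sub-partition only when"
        # that sub-partition has a non-zero coefficient, signaled once.
        return has_nonzero_coeff.index(True)
    raise ValueError(f"unknown policy: {policy!r}")
```

Whichever placement is chosen, the information is carried once and then applies to every sub-partition of the block.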
A method for processing video, comprising:

enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; and

performing the conversion based on the ISP mode.
The method of claim 34, wherein the reference samples are filtered before being used as prediction.
The method of claim 34 or 35, wherein whether to modify the reference samples before being used as prediction depends on width and/or height of the block.
The method of claim 34 or 35, wherein whether to modify the reference samples before being used as prediction depends on intra-prediction mode of the block.
The method of any of claims 1-37, wherein block size of the block is denoted by W*H, wherein W is the block width and H is the block height,

a maximum transform block size of the block is denoted by MaxTbW*MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and maximum transform block height, respectively, and

a minimum transform block size of the block is denoted by MinTbW*MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and minimum transform block height, respectively.
The method of claim 38, wherein indications of MaxTbW and/or MaxTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
The method of claim 39, wherein the indications of MaxTbW and/or MaxTbH are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
The method of any of claims 38-40, wherein MaxTbW and/or MaxTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
The method of claim 38, wherein indications of MinTbW and/or MinTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
The method of claim 42, wherein the indications of MinTbW and/or MinTbH are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
The method of any of claims 42-43, wherein MinTbW and/or MinTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
A method for processing video, comprising:

enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions for both horizontal and vertical directions; and

performing the conversion based on the ISP mode.
The method of claim 45, wherein binary value of splitting direction coded for the ISP mode is replaced by an index of splitting directions.
The method of claim 45, wherein the set of allowed splitting directions depends on block size.
The method of claim 45, wherein indications of set of allowed splitting directions are signaled.
The method of claim 45, wherein the set of allowed splitting directions depends on intra prediction mode of the block.
The method of claim 45, wherein when W/MaxTbW and H/MaxTbH are both greater than M, the mixed splitting directions are enabled, M being an integer.
The method of claim 45, wherein when W/MaxTbW or H/MaxTbH is greater than M, the mixed splitting directions are enabled, M being an integer.
The method of claim 50 or 51, wherein M=1.
The method of any of claims 50-52, wherein the block is split by using quad-tree splitting.
The method of any of claims 50-53, wherein the block is split horizontally first followed by being split vertically when the mixed splitting directions are applied.
The method of any of claims 50-53, wherein the block is split vertically first followed by being split horizontally when the mixed splitting directions are applied.
The method of any of claims 38-55, wherein whether to and/or how to apply ISP mode on the block depends on the relationship between the block size W*H of the block, and/or the maximum transform block size MaxTbW*MaxTbH and/or the minimum transform block size MinTbW*MinTbH.
The method of claim 56, wherein if W/MinTbW and H/MinTbH are both equal to 1, ISP mode is disabled for the block.
The method of claim 56, wherein how to split the block depends on the minimum transform block size of the block.
The method of claim 58, wherein if W/MinTbW is equal to K and H/MinTbH is equal to 1, ISP mode is enabled for the block and vertical splitting is applied to the block, K being an integer larger than 1.
The method of claim 58, wherein if W/MinTbW is equal to 1 and H/MinTbH is equal to K, ISP mode is enabled for the block and horizontal splitting is applied to the block, K being an integer larger than 1.
The method of claim 59 or 60, wherein the prediction direction does not need to be signaled.
The method of any of claims 59-61, wherein the block is split to K sub-partitions.
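The rules above that tie ISP to the minimum transform block size can be illustrated with a short Python sketch. The function name and the tuple it returns are hypothetical. The sketch assumes block dimensions that are positive integer multiples of MinTbW and MinTbH, and it deliberately returns no decision for the case these rules do not cover (both ratios larger than 1).

```python
def isp_split_from_min_tb(w, h, min_tb_w, min_tb_h):
    """Return (isp_enabled, split_direction, num_subpartitions).

    All three entries are None when the rules sketched here do not
    determine the outcome.
    """
    ratio_w, rem_w = divmod(w, min_tb_w)
    ratio_h, rem_h = divmod(h, min_tb_h)
    if rem_w or rem_h or ratio_w < 1 or ratio_h < 1:
        raise ValueError("block size must be a positive multiple of "
                         "the minimum transform block size")
    if ratio_w == 1 and ratio_h == 1:
        # Block already at the minimum transform size: ISP disabled.
        return (False, None, 0)
    if ratio_h == 1:
        # W/MinTbW == K > 1: vertical splitting into K sub-partitions.
        return (True, "vertical", ratio_w)
    if ratio_w == 1:
        # H/MinTbH == K > 1: horizontal splitting into K sub-partitions.
        return (True, "horizontal", ratio_h)
    # Both ratios exceed 1: not covered by these rules.
    return (None, None, None)
```

Because the direction follows from the block shape in the two covered cases, a decoder following this sketch would have nothing left to parse for the split direction there.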
The method of claim 56, wherein when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1, ISP mode is disabled for the block.
The method of claim 56, wherein when W*H/(MaxTbW*MaxTbH) is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 4.
The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than 1, ISP mode is disabled for the block.
The method of claim 56, wherein when either W/MaxTbW or H/MaxTbH is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than or equal to a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than or equal to a first threshold and smaller than or equal to a second threshold, ISP mode is disabled for the block.
The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than a first threshold and smaller than a second threshold, ISP mode is disabled for the block.
The method of claim 68 or 69, wherein the first threshold is 1 and the second threshold is 4.
The method of any of claims 68-70, wherein signaling of splitting direction is skipped and the block is split according to certain rules.
The method of claim 71, wherein the quad-tree splitting is applied to the block first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N, wherein M and N are two positive integers.
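One possible reading of the rule-based splitting above (quad-tree first, then horizontal binary, then vertical binary, stopping once both dimensions fit the maximum transform block) can be sketched in Python as follows. The function name and the leaf tuple layout are hypothetical, and the sketch assumes power-of-two block and transform sizes so that every halving is exact.

```python
def split_to_max_tb(w, h, max_tb_w, max_tb_h):
    """Return leaf sub-partitions as (x, y, width, height), in the order
    they are produced by the recursive split."""
    leaves = []

    def recurse(x, y, bw, bh):
        too_wide, too_tall = bw > max_tb_w, bh > max_tb_h
        if too_wide and too_tall:
            # Quad-tree split while both dimensions exceed the maximum.
            hw, hh = bw // 2, bh // 2
            for dy in (0, hh):
                for dx in (0, hw):
                    recurse(x + dx, y + dy, hw, hh)
        elif too_tall:
            # Horizontal binary split: halves the height.
            hh = bh // 2
            recurse(x, y, bw, hh)
            recurse(x, y + hh, bw, hh)
        elif too_wide:
            # Vertical binary split: halves the width.
            hw = bw // 2
            recurse(x, y, hw, bh)
            recurse(x + hw, y, hw, bh)
        else:
            # Both width and height fit: terminate this branch.
            leaves.append((x, y, bw, bh))

    recurse(0, 0, w, h)
    return leaves
```

Since the tree is fully determined by W, H, MaxTbW and MaxTbH, no splitting direction needs to be carried in the bitstream under this sketch. The other termination variants (stopping when either dimension fits, or at MaxTbW/M and MaxTbH/N) would change only the conditions tested in `recurse`.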
The method of any of claims 56-75, wherein when ISP mode is disabled for the block, signaling of related information including intra_subpartitions_mode_flag is skipped.
The method of any of claims 38-76, wherein when W/MaxTbW > 4 and/or H/MaxTbH > 4, more than 4 sub-partitions and/or more than one splitting direction are enabled.
A method for processing video, comprising:

determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and

performing the conversion based on the determined MTS scheme.
The method of claim 78, wherein the MTS scheme is explicit MTS where transform index of the MTS is signaled in the bitstream of the block.
The method of claim 78 or 79, wherein the MTS scheme is revised to allow only two transforms, wherein the two transforms are DCT-II and DST-VII.
The method of any of claims 78-80, wherein the MTS scheme includes two modes in terms of transform selection.
The method of claim 81, wherein the MTS scheme includes a third mode of transform skip (TS) mode in addition to the two modes of transform selection when the TS mode is applicable.
The method of claim 82, wherein a first mode of the two modes is DCT-II for both horizontal and vertical transform of the block, and a second mode of the two modes is DST-VII for both horizontal and vertical transform of the block.
The method of claim 83, wherein one bit is coded to indicate which mode of the two modes is used.
The method of any of claims 78-80, wherein the MTS scheme includes four modes in terms of transform selection.
The method of claim 85, wherein the MTS scheme includes a fifth mode of transform skip (TS) mode in addition to the four modes of transform selection when the TS mode is applicable.
The method of claim 86, wherein a first mode of the four modes is DCT-II for both horizontal and vertical transform of the block, a second mode of the four modes is DST-VII for both horizontal and vertical transform of the block, a third mode of the four modes is DCT-II for horizontal transform of the block and DST-VII for vertical transform of the block, and a fourth mode of the four modes is DST-VII for horizontal transform of the block and DCT-II for vertical transform of the block.
The method of claim 87, wherein fixed length coding is utilized to code the four modes.
The method of claim 87, wherein truncated unary is utilized to code the four modes.
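The four-mode transform selection and its two binarization options can be illustrated with a short Python sketch. The ordering of the modes in the table, the function names and the polarity of the bins ("1" as the continuation bin) are assumptions made for illustration. The claims fix only the set of four mode pairs and the type of code.

```python
# (horizontal transform, vertical transform) for each of the four modes.
# The index assignment is an assumption for illustration.
MTS_MODES = (
    ("DCT-II", "DCT-II"),
    ("DST-VII", "DST-VII"),
    ("DCT-II", "DST-VII"),
    ("DST-VII", "DCT-II"),
)


def fixed_length_bins(mode_idx, num_modes=len(MTS_MODES)):
    """Binarize mode_idx with a fixed number of bits (2 bits for 4 modes)."""
    if not 0 <= mode_idx < num_modes:
        raise ValueError("mode index out of range")
    width = max(1, (num_modes - 1).bit_length())
    return format(mode_idx, f"0{width}b")


def truncated_unary_bins(mode_idx, num_modes=len(MTS_MODES)):
    """Binarize mode_idx as mode_idx ones followed by a terminating zero,
    dropping the zero for the largest index."""
    if not 0 <= mode_idx < num_modes:
        raise ValueError("mode index out of range")
    ones = "1" * mode_idx
    return ones if mode_idx == num_modes - 1 else ones + "0"
```

With fixed-length coding every mode costs two bins, whereas truncated unary spends one bin on index 0 and three on indices 2 and 3, which favors whichever mode is placed first in the table.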
The method of claim 78 or 79, wherein the allowed transform sets and/or signaling of transform index in explicit MTS depend on the block size.
The method of claim 90, wherein for blocks with width and/or height smaller than or equal to a threshold, DCT-II and DST-VII are allowed.
The method of claim 90, wherein for blocks with width and/or height larger than or equal to a threshold, DCT-II, DST-VII and DCT-VIII are allowed.
The method of claim 90, wherein transform skip (TS) mode is allowed.
The method of claim 90, wherein the allowed transform sets depend on coded mode of the block.
The method of claim 94, wherein for intra block copy (IBC) mode coded blocks, a transform set of two-transformation basis including TS mode and DST-VII is allowed.
The method of claim 94, wherein for non-IBC mode coded blocks, a transform set of two-transformation basis including DCT-II and DST-VII is allowed, or a transform set of three-transformation basis including TS mode, DCT-II and DST-VII is allowed.
The method of any of claims 90-96, wherein how to signal the transform index is changed according to the allowed transform sets.
The method of any of claims 78-97, wherein indications of the maximum allowed transform size used in non-TS mode of the MTS scheme are signaled.
The method of claim 98, wherein the indications are signaled in at least one of sequence, picture, slice, tile group, tile, brick level or other kinds of video unit level.
The method of claim 99, wherein the indications are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
The method of claim 98, wherein the indications are derived from the allowed maximum TS size.
The method of claim 98, wherein the indications are used to control both maximum TS sizes and maximum sizes used in other transform matrix.
The method of claim 102, wherein maximum sizes for non-TS and TS modes do not need to be signaled separately.
The method of claim 98, wherein the indications of the maximum allowed transform size are used to control both implicit MTS transform sizes and explicit MTS transform sizes.
The method of claim 104, wherein whether to apply the implicit MTS depends on the signaled maximum allowed transform sizes.
The method of any of claims 78-105, wherein the maximum allowed transform size used in non-TS mode of the MTS scheme is aligned with the maximum allowed transform size used in TS mode.
The method of claim 106, wherein the maximum allowed transform size used in non-TS mode of the MTS scheme is same as the maximum allowed transform size used in TS mode.
The method of claim 78, wherein the MTS scheme is implicit MTS where transform matrix of the MTS is directly derived according to transform block sizes of the block.
The method of claim 108, wherein derivation of implicit MTS enabling flag indicating whether implicit MTS is enabled is independent from block size of the block.
The method of claim 109, wherein checking of the block size in the derivation of implicit MTS enabling flag is skipped.
The method of any of claims 78-110, wherein shared condition check of block size before signaling MTS information is removed, the MTS information includes transform_skip_flag and tu_mts_idx.
The method of claim 111, wherein if all of the following shared conditions are true, the MTS information is further signaled:

- tu_cbf_luma[x0][y0]

- treeType != DUAL_TREE_CHROMA

- !cu_sbt_flag,

otherwise, the MTS information is not signaled.
The method of claim 111, wherein when the shared conditions check of certain rules returns true, condition check of block size compared to the maximum allowed TS sizes is applied before signaling transform_skip_flag; and condition check of block size compared to the maximum allowed MTS sizes is applied before signaling tu_mts_idx.
The method of claim 111, wherein the shared condition check of block size before signaling the MTS information is kept unchanged, while the condition check of block size before signaling the transform matrix index used in non-TS mode of MTS is removed.
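The variant in which the shared conditions carry no block-size check, and each syntax element is then gated by its own size limit, can be sketched in Python as follows. The function name, the use of a single square size limit per mode and the string tree-type labels are assumptions. Further gating that a real codec may apply (for example enabling flags, or skipping tu_mts_idx when transform skip is selected) is omitted.

```python
def mts_syntax_to_signal(tu_cbf_luma, tree_type, cu_sbt_flag,
                         tb_width, tb_height, max_ts_size, max_mts_size):
    """Return the names of the MTS-related syntax elements that would be
    signaled for one transform block under the sketched conditions."""
    # Shared conditions: no block-size check at this stage.
    shared_ok = (bool(tu_cbf_luma)
                 and tree_type != "DUAL_TREE_CHROMA"
                 and not cu_sbt_flag)
    if not shared_ok:
        return []
    signaled = []
    # Size check against the maximum transform-skip size.
    if tb_width <= max_ts_size and tb_height <= max_ts_size:
        signaled.append("transform_skip_flag")
    # Separate size check against the maximum MTS size.
    if tb_width <= max_mts_size and tb_height <= max_mts_size:
        signaled.append("tu_mts_idx")
    return signaled
```

Separating the two size checks lets the maximum transform-skip size and the maximum MTS size differ without one limit suppressing the other element.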
The method of any of claims 1-114, wherein the conversion generates the block of video from the bitstream representation.
The method of any one of claims 1-114, wherein the conversion generates the bitstream representation from the block of video.
An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to implement the method in any one of claims 1 to 116.
A computer program product stored on a non-transitory computer readable medium, the computer program product including program code for carrying out the method in any one of claims 1 to 116.