WO2020221213A1 - Intra sub-block partitioning and multiple transform selection - Google Patents
Intra sub-block partitioning and multiple transform selection
- Publication number
- WO2020221213A1 (PCT/CN2020/087285)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- mode
- transform
- sub
- mts
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- This patent document relates to video coding techniques, devices and systems.
- the present document describes various embodiments and techniques in which a secondary transform is used during decoding or encoding of video or images.
- a method of video processing includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and generating a residual signal for the current sub-block based on the predictions.
- another method of video processing includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and generating a residual signal for the sub-blocks based on the predictions.
- another method of video processing includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and reconstructing the current sub-block using the predictions.
- another method of video processing includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and reconstructing the sub-blocks using the predictions.
- another method of video processing includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and transforming a residual signal for the sub-blocks based on the predictions.
- a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.
- another method of video processing includes receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing inverse transform on a residual signal of the sub-blocks, and reconstructing the sub-blocks using an output from the inverse transform.
- a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream.
- another method of video processing includes receiving or transmitting a bitstream representing a block of video data for performing video processing.
- the block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream and the sub-blocks share same quantization information.
- another method of video processing includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern. Reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block. The method also includes encoding or reconstructing the block of video data based on the predictions.
- another method of video processing includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and encoding or reconstructing the block of video data based on the predictions.
- the sub-blocks are partitioned in multiple partitioning directions.
- another method of video processing includes performing predictions for a block of video data to generate a residual signal, performing an explicit transformation of the residual signal using one of two transformations, and encoding an output from the implicit transformation.
- another method of video processing includes receiving a block of video data partitioned into one or more sub-blocks, performing an explicit transformation of the block of video data using one of two inverse transformations; and reconstructing the block of video data based on the implicit transformation.
- another method of video processing includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.
- ISP Intra Sub-block Partition
- another method of video processing includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; and performing the conversion based on the ISP mode.
- another method of video processing includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions in both horizontal and vertical directions; and performing the conversion based on the ISP mode.
- another method of video processing includes determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and performing the conversion based on the determined MTS scheme.
- MTS Multiple Transform Selection
- a video encoder comprises a processor configured to implement one or more of the above-described methods.
- a video decoder comprises a processor configured to implement one or more of the above-described methods.
- a computer readable medium includes code for implementing one or more of the above-described methods stored on the medium.
- FIG. 1 shows an example of an encoder block diagram.
- FIG. 2 shows an example of 67 intra prediction modes.
- FIG. 3A-3B show examples of reference samples for wide-angular intra prediction.
- FIG. 4 is an example illustration of a problem of discontinuity in case of directions beyond 45 degrees.
- FIG. 5A-5D show an example illustration of samples used by PDPC applied to diagonal and adjacent angular intra modes.
- FIG. 6 depicts an example of four reference lines.
- FIG. 7 is an example of division of 4×8 and 8×4 blocks.
- FIG. 8 is an example of division of all blocks except 4×8, 8×4 and 4×4.
- FIG. 9 is an example of Affine Linear Weighted Intra-Prediction (ALWIP) for 4x4 blocks.
- FIG. 10 is an example of ALWIP for 8x8 blocks.
- FIG. 11 is an example of ALWIP for 8x4 blocks.
- FIG. 12 is an example of ALWIP for 16x16 blocks.
- FIG. 13 shows an example of secondary transform in JEM.
- FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST) .
- FIG. 15 is an illustration of sub-block transform modes SBT-V and SBT-H.
- FIG. 16 is a block diagram of an example hardware platform for implementing a technique described in the present document.
- FIG. 17 shows an example of mixed splitting.
- FIG. 18 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 19 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 20 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 21 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 22 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 23 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- FIG. 24 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
- Section headings are used in the present document to facilitate ease of understanding and do not limit the embodiments disclosed in a section to only that section.
- certain embodiments are described with reference to Versatile Video Coding or other specific video codecs, the disclosed techniques are applicable to other video coding technologies also.
- video processing encompasses video coding or compression, video decoding or decompression and video transcoding in which video pixels are represented from one compressed format into another compressed format or at a different compressed bitrate.
- This patent document is related to video coding technologies. Specifically, it is related to transforms in video coding. It may be applied to an existing video coding standard such as HEVC, or to the standard (Versatile Video Coding) to be finalized. It may also be applicable to future video coding standards or video codecs.
- Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards.
- the ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards.
- AVC H.264/MPEG-4 Advanced Video Coding
- H.265/HEVC High Efficiency Video Coding
- the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized.
- Joint Video Exploration Team JVET was founded by VCEG and MPEG jointly in 2015.
- JEM Joint Exploration Model
- FIG. 1 shows an example of encoder block diagram of VVC, which contains three in-loop filtering blocks: deblocking filter (DF) , sample adaptive offset (SAO) and ALF.
- DF deblocking filter
- SAO sample adaptive offset
- SAO and ALF utilize the original samples of the current picture to reduce the mean square errors between the original samples and the reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients.
- FIR finite impulse response
- ALF is located at the last processing stage of each picture and can be regarded as a tool trying to catch and fix artifacts created by the previous stages.
- the number of directional intra modes is extended from 33, as used in HEVC, to 65.
- the additional directional modes are depicted as red dotted arrows in FIG. 2, and the planar and DC modes remain the same.
- These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.
- Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction as shown in FIG. 2.
- In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for the non-square blocks.
- the replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing.
- the total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding is unchanged.
- In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra-predictor using DC mode.
- In VVC, blocks can have a rectangular shape, which would necessitate a division operation per block in the general case. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
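The division-free DC rule can be illustrated with a minimal sketch. The function name `dc_predict` and the rounding offset are assumptions made for illustration, and reference lengths are assumed to be powers of two so the average reduces to a right shift.

```python
def dc_predict(top, left):
    """DC value for a W x H block from its top (W) and left (H) reference samples.

    Assumes W and H are powers of two (hypothetical helper, not codec source).
    """
    w, h = len(top), len(left)
    if w == h:
        # Square block: average both sides; W + H is still a power of two.
        total, count = sum(top) + sum(left), w + h
    elif w > h:
        # Wider than tall: only the longer (top) side contributes.
        total, count = sum(top), w
    else:
        # Taller than wide: only the longer (left) side contributes.
        total, count = sum(left), h
    shift = count.bit_length() - 1          # log2(count)
    return (total + (count >> 1)) >> shift  # rounded average without a division
```

For an 8x4 block, for example, only the eight top samples enter the average, so the shift is always by 3.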
- Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction.
- In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks.
- the replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing.
- the total number of intra prediction modes for a certain block is unchanged, i.e., 67, and the intra mode coding is unchanged.
- top reference with length 2W+1, and the left reference with length 2H+1 are defined as shown in FIG. 3A-3B.
- the mode number of replaced mode in wide-angular direction mode is dependent on the aspect ratio of a block.
- the replaced intra prediction modes are illustrated in Table 1.
- two vertically-adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction.
- a low-pass reference sample filter and side smoothing are applied to the wide-angle prediction to reduce the negative effect of the increased gap $\Delta p_\alpha$.
- PDPC position dependent intra prediction combination
- PDPC is an intra prediction method which invokes a combination of the un-filtered boundary reference samples and HEVC style intra prediction with filtered boundary reference samples.
- PDPC is applied to the following intra modes without signalling: planar, DC, horizontal, vertical, bottom-left angular mode and its eight adjacent angular modes, and top-right angular mode and its eight adjacent angular modes.
- the prediction sample pred (x, y) is predicted using an intra prediction mode (DC, planar, angular) and a linear combination of reference samples according to the Equation as follows:
- $\mathrm{pred}(x,y) = \left( wL \times R_{-1,y} + wT \times R_{x,-1} - wTL \times R_{-1,-1} + (64 - wL - wT + wTL) \times \mathrm{pred}(x,y) + 32 \right) \gg 6$, where $R_{x,-1}$ and $R_{-1,y}$ represent the reference samples located at the top and left of the current sample (x, y), respectively, and $R_{-1,-1}$ represents the reference sample located at the top-left corner of the current block.
- FIG. 5A-5D illustrate the definition of reference samples ($R_{x,-1}$, $R_{-1,y}$ and $R_{-1,-1}$) for PDPC applied over various prediction modes.
- the prediction sample pred (x’, y’ ) is located at (x’, y’ ) within the prediction block.
- FIGS. 5A to 5D provide definition of samples used by PDPC applied to diagonal and adjacent angular intra modes.
- the PDPC weights are dependent on prediction modes and are shown in Table 2.
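The PDPC combination can be written as a one-line integer operation. The sketch below is illustrative only: the function name `pdpc_sample` is hypothetical, and the weights wL, wT and wTL are taken as inputs because the mode-dependent weight table (Table 2) is not reproduced here.

```python
def pdpc_sample(pred, r_left, r_top, r_topleft, wL, wT, wTL):
    """Combine an intra-predicted sample with unfiltered reference samples.

    pred      : pred(x, y) from the DC, planar or angular predictor
    r_left    : R(-1, y), r_top : R(x, -1), r_topleft : R(-1, -1)
    wL/wT/wTL : mode-dependent weights (supplied by the caller)
    """
    return (wL * r_left + wT * r_top - wTL * r_topleft
            + (64 - wL - wT + wTL) * pred + 32) >> 6
```

With all three weights set to zero the expression reduces to `(64 * pred + 32) >> 6`, which returns the unmodified prediction.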
- Multiple reference line (MRL) intra prediction uses more reference lines for intra prediction.
- In FIG. 6, an example of 4 reference lines is depicted, where the samples of segments A and F are not fetched from reconstructed neighbouring samples but padded with the closest samples from Segment B and E, respectively.
- HEVC intra-picture prediction uses the nearest reference line (i.e., reference line 0) .
- reference line 0 the nearest reference line
- Two additional lines (reference line 1 and reference line 3) are used.
- the index of selected reference line (mrl_idx) is signaled and used to generate intra predictor.
- For a reference line index greater than 0, only the additional reference line modes are included in the MPM list, and only the MPM index is signaled, without the remaining modes.
- the reference line index is signaled before intra prediction modes, and Planar and DC modes are excluded from intra prediction modes in case a nonzero reference line index is signaled.
- MRL is disabled for the first line of blocks inside a CTU to prevent using extended reference samples outside the current CTU line. Also, PDPC is disabled when additional line is used.
- ISP is proposed, which divides luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size dimensions, as shown in Table 3.
- FIG. 7 and FIG. 8 show examples of the two possibilities.
- FIG. 7 shows an example of division of 4×8 and 8×4 blocks.
- FIG. 8 shows an example of division of all blocks except 4×8, 8×4 and 4×4. All sub-partitions fulfill the condition of having at least 16 samples. For block sizes 4×N or N×4 (with N>8), if allowed, a 1×N or N×1 sub-partition may exist.
- Table 3 Number of sub-partitions depending on the block size.
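The block-size rule described above (4×4 not divided, 4×8 and 8×4 split in two, all other sizes split in four) can be sketched as follows. The function names are hypothetical, and the convention that a horizontal split divides the height is an assumption of this sketch.

```python
def isp_num_sub_partitions(width, height):
    """Number of ISP sub-partitions for a luma block of the given size."""
    if (width, height) == (4, 4):
        return 1                      # 4x4 blocks are not divided
    if (width, height) in ((4, 8), (8, 4)):
        return 2
    return 4                          # all other block sizes


def isp_sub_partition_size(width, height, horizontal_split):
    """(width, height) of each sub-partition for the chosen split direction."""
    n = isp_num_sub_partitions(width, height)
    if n == 1:
        return (width, height)
    # Assumed convention: a horizontal split stacks sub-partitions vertically.
    return (width, height // n) if horizontal_split else (width // n, height)
```

A vertically split 4×16 block yields four 1×16 sub-partitions, which matches the 1×N case mentioned above, and every sub-partition keeps at least 16 samples.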
- For each sub-partition, a residual signal is generated by entropy decoding the coefficients sent by the encoder and then inverse quantizing and inverse transforming them. Then, the sub-partition is intra predicted, and finally the corresponding reconstructed samples are obtained by adding the residual signal to the prediction signal. Therefore, the reconstructed values of each sub-partition are available to generate the prediction of the next one, which repeats the process, and so on. All sub-partitions share the same intra mode.
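The sequential dependency between sub-partitions can be sketched as a simple loop. This is only an illustration of the data flow: `reconstruct_isp_block` and the `predict` callback are hypothetical names, residuals are assumed to be already inverse quantized and inverse transformed, and sub-partitions are modelled as flat sample lists.

```python
def reconstruct_isp_block(residuals, predict, intra_mode):
    """Reconstruct sub-partitions in coding order.

    residuals : list of per-sub-partition residual sample lists
    predict   : hypothetical callback (intra_mode, previous_recon, n) -> n samples
    intra_mode: the single intra mode shared by all sub-partitions
    """
    recon = []
    for res in residuals:
        # Prediction may use samples reconstructed from earlier sub-partitions.
        pred = predict(intra_mode, recon, len(res))
        recon.append([p + r for p, r in zip(pred, res)])
    return recon


def toy_predict(mode, previous, n):
    """Toy predictor: repeat the last reconstructed sample, or 128 if none exists."""
    return [previous[-1][-1] if previous else 128] * n
```

Calling `reconstruct_isp_block([[1, 2], [3, 4]], toy_predict, 0)` shows the second sub-partition being predicted from the reconstructed output of the first.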
- Table 4 shows example transform types based on intra-prediction mode (s) .
- Table 5 shows an example coding unit syntax.
- Table 6 shows an example transform unit syntax. Some of the example variables include:
- intra_subpartitions_mode_flag [x0] [y0] equal to 1 specifies that the current intra coding unit is partitioned into NumIntraSubPartitions [x0] [y0] rectangular transform block subpartitions.
- intra_subpartitions_mode_flag [x0] [y0] equal to 0 specifies that the current intra coding unit is not partitioned into rectangular transform block subpartitions.
- intra_subpartitions_split_flag [x0] [y0] specifies whether the intra subpartitions split type is horizontal or vertical. When intra_subpartitions_split_flag [x0] [y0] is not present, it is inferred as follows:
- intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 0.
- intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 1.
- IntraSubPartitionsSplitType specifies the type of split used for the current luma coding block as illustrated in Table 7. IntraSubPartitionsSplitType is derived as follows:
- IntraSubPartitionsSplitType is set equal to 0.
- IntraSubPartitionsSplitType is set equal to 1 +intra_subpartitions_split_flag [x0] [y0] .
- Table 7 shows example name association to IntraSubPartitionsSplitType
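The derivation of IntraSubPartitionsSplitType can be sketched as below. ISP_NO_SPLIT is named elsewhere in this document; the names ISP_HOR_SPLIT and ISP_VER_SPLIT for values 1 and 2, and the condition that the value 0 applies when intra_subpartitions_mode_flag is 0, are assumptions of this sketch.

```python
ISP_NO_SPLIT = 0   # named in the text
ISP_HOR_SPLIT = 1  # assumed name for value 1
ISP_VER_SPLIT = 2  # assumed name for value 2


def intra_sub_partitions_split_type(mode_flag, split_flag):
    """Derive the split type from the two coding-unit flags."""
    if mode_flag == 0:
        return ISP_NO_SPLIT      # block is not partitioned
    return 1 + split_flag        # 1 + intra_subpartitions_split_flag
```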
- NumIntraSubPartitions specifies the number of transform block subpartitions an intra luma coding block is divided into. NumIntraSubPartitions is derived as follows:
- NumIntraSubPartitions is set equal to 2 if one of the following conditions is true: cbWidth is equal to 4 and cbHeight is equal to 8; or cbWidth is equal to 8 and cbHeight is equal to 4.
- Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction)
- Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction (MIP)) is proposed.
- the neighboring reference samples are first down-sampled via averaging to generate the reduced reference signal $\mathrm{bdry}_{red}$.
- the reduced prediction signal $\mathrm{pred}_{red}$ is computed by calculating a matrix vector product and adding an offset:
- $\mathrm{pred}_{red} = A \cdot \mathrm{bdry}_{red} + b$.
- b is a vector of size $W_{red} \cdot H_{red}$.
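The two steps above (boundary averaging, then a matrix vector product plus offset) can be sketched in plain Python. The function names are hypothetical, the rounding in the averaging step is an assumption, and the fixed-point scaling and final up-sampling used in a real codec are omitted.

```python
def downsample_boundary(samples, out_len):
    """Average groups of neighbouring reference samples down to out_len values."""
    group = len(samples) // out_len
    return [(sum(samples[i * group:(i + 1) * group]) + group // 2) // group
            for i in range(out_len)]


def alwip_reduced_prediction(A, bdry_red, b):
    """pred_red = A * bdry_red + b, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, bdry_red)) + offset
            for row, offset in zip(A, b)]
```

Here `A` has one row per reduced prediction sample, so `b` and the result both have W_red · H_red entries.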
- For a 4x4 block (FIG. 9), ALWIP takes two averages along each axis of the boundary.
- the resulting four input samples enter the matrix vector multiplication.
- For an 8x8 block (FIG. 10), ALWIP takes four averages along each axis of the boundary.
- the resulting eight input samples enter the matrix vector multiplication.
- the matrices are taken from the set S_1. This yields 16 samples on the odd positions of the prediction block.
- these samples are interpolated vertically by using the reduced top boundary. Horizontal interpolation follows by using the original left boundary.
- For an 8x4 block (FIG. 11), ALWIP takes four averages along the horizontal axis of the boundary and the four original boundary values on the left boundary.
- the resulting eight input samples enter the matrix vector multiplication.
- the matrices are taken from the set S_1. This yields 16 samples on the odd horizontal and each vertical positions of the prediction block.
- For a 16x16 block (FIG. 12), ALWIP takes four averages along each axis of the boundary.
- the resulting eight input samples enter the matrix vector multiplication.
- the matrices are taken from the set S_2. This yields 64 samples on the odd positions of the prediction block.
- these samples are interpolated vertically by using eight averages of the top boundary. Horizontal interpolation follows by using the original left boundary. The interpolation process, in this case, does not add any multiplications. Therefore, totally, two multiplications per sample are required to calculate ALWIP prediction.
- the procedure is essentially the same and it is easy to check that the number of multiplications per sample is less than four.
- the transposed cases are treated accordingly.
- Table 8 shows an example coding unit syntax
- In VTM4, large block-size transforms, up to 64×64 in size, are enabled, which is primarily useful for higher resolution video, e.g., 1080p and 4K sequences.
- High frequency transform coefficients are zeroed out for the transform blocks with size (width or height, or both width and height) equal to 64, so that only the lower-frequency coefficients are retained.
- M size
- N the block height
- When transform skip mode is used for a large block, the entire block is used without zeroing out any values.
- a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple selected transforms from the DCT8/DST7.
- the newly introduced transform matrices are DST-VII and DCT-VIII.
- the table below shows the basis functions of the selected DST/DCT.
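Since the basis-function table itself is not reproduced here, the sketch below generates the commonly cited floating-point basis functions of DCT-II, DST-VII and DCT-VIII for an N-point transform. These are the textbook definitions, stated as an assumption; a codec uses integer approximations of them.

```python
import math


def dct2(n):
    """DCT-II: row i holds basis function T_i(j), j = 0..n-1."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dst7(n):
    """DST-VII basis functions."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def dct8(n):
    """DCT-VIII basis functions."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


def max_orthonormality_error(m):
    """Largest deviation of M * M^T from the identity matrix."""
    n = len(m)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(m[a][k] * m[b][k] for k in range(n))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst
```

Each matrix is orthonormal, so the inverse transform is simply the transpose, which `max_orthonormality_error` can be used to confirm numerically.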
- the transform matrices are quantized more accurately than the transform matrices in HEVC.
- In order to control the MTS scheme, separate enabling flags are specified at SPS level for intra and inter, respectively.
- a CU level flag is signalled to indicate whether MTS is applied or not.
- MTS is applied only for luma.
- the MTS CU level flag is signalled when the following conditions are satisfied.
- If the MTS CU flag is equal to zero, then DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signalled to indicate the transform type for the horizontal and vertical directions, respectively.
- Transform and signalling mapping table as shown in Table 3-10.
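The CU-level signalling can be sketched as a small lookup. The mapping of each direction flag (0 selecting DST-7, 1 selecting DCT-8) is an assumption here, since the mapping table is not reproduced in this text; the function and parameter names are illustrative.

```python
DCT2, DST7, DCT8 = "DCT2", "DST7", "DCT8"


def mts_transforms(mts_cu_flag, mts_hor_flag=0, mts_ver_flag=0):
    """Return (horizontal, vertical) transform types for a luma block."""
    if mts_cu_flag == 0:
        return (DCT2, DCT2)          # DCT2 in both directions
    # Assumed mapping: flag 0 -> DST7, flag 1 -> DCT8.
    pick = lambda flag: DCT8 if flag else DST7
    return (pick(mts_hor_flag), pick(mts_ver_flag))
```

Under this assumption the CU flag plus two direction flags select one of five horizontal/vertical combinations.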
- 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point and 32-point DCT-2. The other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point and 32-point DST-7 and DCT-8, also use 8-bit primary transform cores.
- High frequency transform coefficients are zeroed out for the DST-7 and DCT-8 blocks with size (width or height, or both width and height) equal to 32. Only the coefficients within the 16x16 lower-frequency region are retained.
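The zero-out rule for DST-7 and DCT-8 blocks can be sketched as follows. The helper names are hypothetical, and coefficients are modelled as a list of rows with the lowest frequencies at the top-left.

```python
def retained_size_dst7_dct8(width, height):
    """Size of the low-frequency region kept for a DST-7/DCT-8 block."""
    return (16 if width == 32 else width, 16 if height == 32 else height)


def zero_out_high_freq(coeffs, keep_w, keep_h):
    """Keep the top-left keep_w x keep_h coefficients and zero the rest."""
    return [[c if (x < keep_w and y < keep_h) else 0
             for x, c in enumerate(row)]
            for y, row in enumerate(coeffs)]
```

For a 32×32 block this retains the 16×16 low-frequency corner; a dimension that is not 32 is left untouched.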
- the residual of a block can be coded with transform skip mode.
- the transform skip flag is not signalled when the CU level MTS_CU_flag is not equal to zero.
- the block size limitation for transform skip is the same as that for MTS in JEM4, which indicates that transform skip is applicable for a CU when both block width and height are equal to or less than 32.
- MTS index may be signaled in the bitstream and such a design is called explicit MTS.
- An alternative way, which directly derives the matrix according to transform block sizes, is also supported as implicit MTS.
- Table 9 shows example picture parameter set RBSP syntax.
- Table 10 shows example transform unit syntax.
- Some of the example variables include:
- transform_skip_flag [x0] [y0] specifies whether a transform is applied to the luma transform block or not.
- the array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.
- transform_skip_flag [x0] [y0] equal to 1 specifies that no transform is applied to the luma transform block.
- transform_skip_flag [x0] [y0] equal to 0 specifies that the decision whether transform is applied to the luma transform block or not depends on other syntax elements. When transform_skip_flag [x0] [y0] is not present, it is inferred to be equal to 0.
- tu_mts_idx [x0] [y0] specifies which transform kernels are applied to the residual samples along the horizontal and vertical direction of the associated luma transform block.
- the array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.
- one context is used to decode transform_skip_flag, truncated unary is used to binarize the tu_mts_idx.
- Each bin of the tu_mts_idx is context coded, and for the first bin, the quad-tree depth (i.e., cqtDepth) is used to select one context; and for the remaining bins, one context is used.
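Truncated unary binarization, as used for tu_mts_idx, can be sketched as below. The function name is hypothetical and the maximum value `c_max = 4` used in the example is an assumption; context selection for the individual bins is not modelled.

```python
def truncated_unary(value, c_max):
    """Binarize value as `value` ones followed by a terminating zero.

    The terminating zero is dropped when value equals c_max.
    """
    if not 0 <= value <= c_max:
        raise ValueError("value out of range")
    bins = [1] * value
    if value < c_max:
        bins.append(0)
    return bins
```

With `c_max = 4`, the value 2 becomes the bin string 1, 1, 0 and the value 4 becomes 1, 1, 1, 1.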
- Table 11 shows example assignment of ctxInc to syntax elements.
- the implicitMtsEnabled is used to define whether implicit MTS is enabled.
- the variable implicitMtsEnabled is derived as follows:
- - cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32
- variable trTypeHor specifying the horizontal transform kernel and the variable trTypeVer specifying the vertical transform kernel are derived as follows:
- trTypeHor and trTypeVer are set equal to 0.
- If IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified in Table 8-15 depending on intraPredMode.
- trTypeHor and trTypeVer are specified in Table 8-14 depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.
- trTypeHor and trTypeVer are derived as follows:
- trTypeHor and trTypeVer are specified in Table 12 depending on tu_mts_idx [xTbY] [yTbY] .
- Table 13 shows example specification of trTypeHor and trTypeVer depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.
- the secondary transform is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder).
- whether a 4x4 or an 8x8 secondary transform is performed depends on the block size.
- a 4x4 secondary transform is applied for small blocks (i.e., min (width, height) < 8) and an 8x8 secondary transform is applied for larger blocks (i.e., min (width, height) > 4) per 8x8 block.
- FIG. 13 shows an example of secondary transform in JEM.
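- Application of a non-separable transform is described as follows, using a 4x4 input as an example. To apply the non-separable transform, the 4x4 input block $X = [X_{ij}]$, $i, j = 0, \ldots, 3$, is first represented as the vector $\vec{X} = [X_{00}\ X_{01}\ X_{02}\ X_{03}\ X_{10}\ \ldots\ X_{33}]^T$.
- the non-separable transform is calculated as $\vec{F} = T \cdot \vec{X}$, where $\vec{F}$ indicates the transform coefficient vector, and T is a 16x16 transform matrix.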
- the 16x1 coefficient vector is subsequently re-organized as 4x4 block using the scanning order for that block (horizontal, vertical or diagonal) .
- the coefficients with smaller index will be placed with the smaller scanning index in the 4x4 coefficient block.
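The vectorize, multiply, and re-organize steps may be sketched as follows. This is an illustration only: the function name is hypothetical, row-by-row vectorization of the input block is an assumption, an identity matrix stands in for the actual 16x16 transform matrix, and only the horizontal and vertical scans are shown (the diagonal scan is omitted).

```python
def non_separable_transform(block, T, scan="horizontal"):
    """Apply F = T * vec(X) to a 4x4 block and re-organize F as a 4x4 block."""
    # Vectorize the 4x4 input block row by row into 16 entries (assumed order).
    x = [block[r][c] for r in range(4) for c in range(4)]
    # Matrix-vector product with the 16x16 transform matrix T.
    f = [sum(T[i][j] * x[j] for j in range(16)) for i in range(16)]
    # Scan positions as (row, col) pairs; the diagonal scan is not sketched.
    if scan == "horizontal":
        positions = [(r, c) for r in range(4) for c in range(4)]
    elif scan == "vertical":
        positions = [(r, c) for c in range(4) for r in range(4)]
    else:
        raise ValueError("only horizontal and vertical scans are sketched")
    out = [[0] * 4 for _ in range(4)]
    # Coefficients with a smaller index go to the smaller scanning index.
    for idx, (r, c) in enumerate(positions):
        out[r][c] = f[idx]
    return out

# Placeholder data: an identity matrix instead of a real transform matrix.
identity = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
X = [[4 * r + c for c in range(4)] for r in range(4)]
```

With the identity placeholder, the horizontal scan returns the input block unchanged and the vertical scan returns its transpose, which makes the re-organization step easy to inspect.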
- the mapping from the intra prediction mode to the transform set is pre-defined.
- the selected non-separable secondary transform candidate is further specified by the explicitly signalled secondary transform index.
- the index is signalled in a bit-stream once per Intra CU after transform coefficients.
- the RST was introduced, together with a mapping to 4 transform sets (instead of 35 transform sets).
- 16x64 may further be reduced to 16x48
- 16x16 matrices are employed for 8x8 and 4x4 blocks, respectively.
- the 16x64 (may further be reduced to 16x48) transform is denoted as RST8x8 and the 16x16 one as RST4x4.
- FIG. 11 shows an example of RST.
- FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST) .
- cu_sbt_flag may be signaled to indicate whether the whole residual block or a sub-part of the residual block is decoded.
- inter MTS information is further parsed to determine the transform type of the CU.
- a part of the residual block is coded with inferred adaptive transform and the other part of the residual block is zeroed out.
- the SBT is not applied to the combined inter-intra mode.
- sub-block transform position-dependent transform is applied on luma transform blocks in SBT-V and SBT-H (chroma TB always using DCT-2) .
- the two positions of SBT-H and SBT-V are associated with different core transforms.
- the horizontal and vertical transforms for each SBT position are specified in FIG. 15.
- the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively.
- the sub-block transform jointly specifies the TU tiling, cbf, and horizontal and vertical transforms of a residual block, which may be considered a syntax shortcut for the cases that the major residual of a block is at one side of the block.
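The position-dependent transform selection for luma SBT blocks may be sketched as a lookup. Only the SBT-V position 0 entry is stated in the text; the remaining entries, the function name, and the string labels are assumptions made for illustration, and FIG. 15 remains the authoritative specification.

```python
DST7, DCT8 = "DST-7", "DCT-8"

def sbt_luma_transforms(sbt_horizontal: bool, sbt_pos: int):
    """Return (horizontal, vertical) transforms for a luma SBT block."""
    if not sbt_horizontal:  # SBT-V
        # Position 0 is stated in the text: DCT-8 horizontally, DST-7 vertically.
        # Position 1 is an assumption.
        return (DCT8, DST7) if sbt_pos == 0 else (DST7, DST7)
    # SBT-H entries are assumptions that mirror the SBT-V case.
    return (DST7, DCT8) if sbt_pos == 0 else (DST7, DST7)
```

Chroma transform blocks are not covered by this sketch, since they are described as always using DCT-2.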
- Table 14 shows an example coding unit syntax.
- Table 15 shows an example residual coding syntax.
- the current design has the following problems:
- ISP cannot be enabled when multiple reference line (MRL) is enabled.
- Transform skip (TS) cannot be enabled when ISP is used. However, enabling both ISP and TS may achieve functionality similar to BDPCM without adding an additional module for handling BDPCM.
- Delta QP is signaled per sub-partition which results in signaling it multiple times for ISP coded blocks.
- ISP mode e.g., intra_subpartitions_mode_flag
- partition direction i.e., splitting type, horizontal/vertical direction
- Such a design is based on the assumption that only width or height could be twice of the MaxTbSizeY which limits the flexibility.
- the maximum transform size is set to, for example, 32x32
- the CU size is, for example, 128x128, according to the rules, it will be split to 4 128x32 sub-partitions.
- because the maximum transform size is 32x32, coding one 128x32 sub-partition is disallowed in VVC. How to handle this case is unknown.
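The 128x128 example can be checked with a small sketch. The helper names are hypothetical, and an equal split into four sub-partitions along one direction is assumed, as in the example above.

```python
def isp_sub_partitions(width, height, horizontal_split, num_parts=4):
    """Return the (width, height) of each ISP sub-partition (equal split assumed)."""
    if horizontal_split:
        return [(width, height // num_parts)] * num_parts
    return [(width // num_parts, height)] * num_parts

def exceeds_max_transform(part, max_tb_w=32, max_tb_h=32):
    """Return True when a sub-partition is larger than the maximum transform block."""
    w, h = part
    return w > max_tb_w or h > max_tb_h

# A 128x128 CU split horizontally yields four 128x32 sub-partitions.
parts = isp_sub_partitions(128, 128, horizontal_split=True)
```

Every sub-partition in `parts` is 128 samples wide, so each one exceeds a 32x32 maximum transform size, which is the unhandled case described above.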
- ISP and sub-block transform are both treated as implicit MTS since there is no need to signal the transform matrix.
- Sub-block transform could support block sizes up to 64x64 when the MaxSbtSize. However, the setting of implicitMTS only checks Max (width, height ) is less than or equal to 32. In addition, when cu_sbt_flag is equal to 1, implicitMTS shall be set to 1 automatically, there is no need to check the transform size.
- TS is part of MTS. However, the enabling/disabling of TS and the maximum TS size are signaled in the PPS, while the MTS enabling/disabling flag is signaled in the SPS.
- Redundant check of block sizes is identified in the current VVC design for signaling transform_skip_flag and tu_mts_idx.
- one block size is denoted by W*H wherein W is the block width and H is the block height.
- the maximum transform block size is denoted by MaxTbW *MaxTbH wherein MaxTbW and MaxTbH are the maximum transform block width and height, respectively.
- the minimum transform block size is denoted by MinTbW*MinTbH wherein MinTbW and MinTbH are the minimum transform block width and height, respectively.
- MRL may represent those technologies that use non-adjacent reference lines in current picture to predict the current block
- ALWIP may represent those technologies that use matrix-based intra prediction methods. They are not limited to those mentioned in prior art.
- Intra Sub-block Partition ISP
- MRL multiple reference line
- all sub-partitions use the same reference line index for intra prediction.
- K = 1
- whether MRL is applied for the remaining sub-partitions may depend on the splitting direction in ISP or/and intra prediction mode or/and dimension of the block.
- MRL may be applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition.
- prediction modes that are less than or equal to 50 in FIG. 2.
- MRL may be not applied to the remaining sub-partitions when above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition.
- prediction modes that are greater than 50 in FIG. 2.
- MRL may be applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition.
- prediction modes that are greater than or equal to 18 in FIG. 2.
- MRL may be not applied to the remaining sub-partitions when bottom-left neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition.
- prediction modes that are less than 18 in FIG. 2.
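The mode-based conditions above may be sketched as follows. The function and variant names are hypothetical; the thresholds 50 and 18 are taken from the text, and which variant applies to a given block (for example, depending on the splitting direction) is not specified here and is left to the caller.

```python
def mrl_for_remaining_sub_partitions(intra_pred_mode: int, variant: str) -> bool:
    """Return True when MRL may be applied to the sub-partitions after the first."""
    if variant == "no_above_right":
        # Modes up to 50 do not use above-right reference samples.
        return intra_pred_mode <= 50
    if variant == "no_bottom_left":
        # Modes from 18 upward do not use bottom-left reference samples.
        return intra_pred_mode >= 18
    raise ValueError("unknown variant: " + variant)
```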
- ISP mode information e.g., on/off, splitting direction
- ISP mode information may be signaled before the signaling of MRL related information.
- the signaling of MRL related information may be skipped, e.g., the reference line index.
- the reference line index is inferred to be 0.
- ALWIP and ISP may be both enabled for one block.
- the matrix selection of one sub-partition may depend on the intra mode and/or dimension of the sub-partition.
- indications of ALWIP modes e.g., intra_lwip_flag and related intra modes
- indications of ISP modes e.g., intra_subpartitions_mode_flag and intra_subpartitions_split_flag
- Transform skip (TS) and ISP may be both enabled for one block.
- indication of enabling/disabling transform skip mode may be further signaled even when ISP mode is enabled (e.g., IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT) .
- whether to signal the indication of enabling/disabling transform skip mode may depend on whether the video content is screen content or not.
- the indication of enabling/disabling transform skip mode may be signaled.
- the indication of enabling/disabling transform skip mode may be skipped and the TS mode is disabled for ISP coded blocks.
- one quantization parameter may be represented by cu_qp_delta_abs, and cu_qp_delta_sign_flag.
- the quantization parameter information may be signaled for an ISP coded block only when there is at least one coefficient not equal to zero in at least one sub-partition.
- the quantization parameter, and/or one quantization step, and/or one scaling matrix may be signaled once for the whole ISP coded block instead of being signaled for each sub-partition.
- the information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition.
- the information may be signaled together with the first sub-partition in the encoding/decoding order.
- the information may be signaled together with the last sub-partition in the encoding/decoding order.
- the information may be signaled together with the m-th sub-partition in the encoding/decoding order wherein m is no larger than the total number of allowed sub-partitions.
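One way to combine the options above is to signal the quantization information once, together with the first sub-partition that has a non-zero coefficient. The sketch below assumes that combination; the function name and the list-of-flags input are hypothetical.

```python
def qp_signaling_position(has_nonzero_coeff):
    """Return the index of the sub-partition carrying the QP information, or None.

    has_nonzero_coeff holds one boolean per sub-partition, in coding order.
    """
    for m, coded in enumerate(has_nonzero_coeff):
        if coded:
            # Signal once, with the first sub-partition that has coefficients.
            return m
    # No sub-partition has a non-zero coefficient: nothing is signaled.
    return None
```

Under this assumption, the information is carried at most once per ISP coded block rather than once per sub-partition.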
- reference samples located in a first sub-partition to predict a second sub-partition in an ISP coded block may be further modified (e.g., may be filtered) before being used as prediction.
- whether to modify (e.g., filter) reference samples before being used as prediction may depend on block width and/or height.
- whether to modify (e.g., filter) reference samples before being used as prediction may depend on the intra-prediction mode.
- MaxTbW and/or MaxTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.
- MaxTbW and/or MaxTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.
- MinTbW and/or MinTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.
- MinTbW and/or MinTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.
- Mixed splitting directions may be enabled for ISP coded blocks wherein the block may be split for both horizontal and vertical directions.
- the binary value of splitting direction coded for the ISP mode (e.g., intra_subpartitions_split_flag) may be replaced by an index of splitting directions.
- the set of allowed splitting directions may depend on block dimension.
- the set of allowed splitting directions may depend on intra prediction mode.
- the mixed splitting directions may be enabled.
- a block may be split horizontally first followed by being split vertically when mixed ISP is applied.
- a block may be split vertically first followed by being split horizontally when mixed ISP is applied.
- FIG. 17 shows an example of mixed splitting (also known as quad-tree splitting) .
- Whether to and/or how to apply ISP on a block may depend on the relationship between the block dimensions W×H, and/or maximum and/or minimum transform block sizes.
- how to split the block may depend on the minimum transform block sizes.
- ISP may be enabled and vertical splitting is applied.
- ISP may be enabled and horizontal splitting is applied. Alternatively, furthermore, there is no need to signal the prediction direction.
- the block may be split to K sub-partitions.
- ISP mode is disabled when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1.
- ISP mode is disabled when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, such as 4.
- ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than 1.
- ISP mode is disabled when either W/MaxTbW or H/MaxTbH is greater than a threshold, such as 2 or 4.
- ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than a threshold, such as 2 or 4.
- ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater (or no smaller) than a first threshold, and no greater (or smaller) than a second threshold.
- ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater than a first threshold, and smaller than a second threshold.
- the first and second thresholds are 1, and 4, respectively.
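The alternative disabling rules above may be compared side by side with a small sketch. The function and rule names are hypothetical labels for the alternatives listed in the text; the default threshold of 4 is one of the example values.

```python
def isp_disabled(w, h, max_tb_w, max_tb_h, rule, threshold=4):
    """Return True when the chosen rule disables ISP for a W x H block."""
    ratio_w = w / max_tb_w
    ratio_h = h / max_tb_h
    if rule == "either_gt_1":
        return ratio_w > 1 or ratio_h > 1
    if rule == "both_gt_1":
        return ratio_w > 1 and ratio_h > 1
    if rule == "area_gt_threshold":
        return (w * h) / (max_tb_w * max_tb_h) > threshold
    if rule == "either_gt_threshold":
        return ratio_w > threshold or ratio_h > threshold
    if rule == "both_gt_threshold":
        return ratio_w > threshold and ratio_h > threshold
    raise ValueError("unknown rule: " + rule)
```

For instance, with a 32x32 maximum transform block, a 64x32 block is disabled under "either_gt_1" but remains allowed under "both_gt_1".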
- the signaling of the splitting direction (e.g., intra_subpartitions_split_flag) may be skipped and the block may be split according to certain rules.
- the quad-tree splitting may be applied first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
- the splitting of one partition tree may be terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
- the splitting of one partition tree may be terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
- the splitting of one partition tree may be terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N wherein M and N are two positive integers.
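A rule-based split of this kind may be sketched recursively. The sketch assumes power-of-two dimensions, takes "horizontal binary tree splitting" to mean halving the height, and terminates once both the width and the height have reached the maximum transform block size; the function name is hypothetical.

```python
def split_to_max_transform(x, y, w, h, max_tb_w, max_tb_h):
    """Return a list of (x, y, w, h) sub-partitions no larger than the maximum."""
    if w <= max_tb_w and h <= max_tb_h:
        # Both dimensions have reached the maximum: stop splitting.
        return [(x, y, w, h)]
    if w > max_tb_w and h > max_tb_h:
        # Quad-tree splitting is applied first.
        hw, hh = w // 2, h // 2
        children = [(x, y, hw, hh), (x + hw, y, hw, hh),
                    (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    elif h > max_tb_h:
        # Horizontal binary split (assumed to halve the height).
        hh = h // 2
        children = [(x, y, w, hh), (x, y + hh, w, hh)]
    else:
        # Vertical binary split (assumed to halve the width).
        hw = w // 2
        children = [(x, y, hw, h), (x + hw, y, hw, h)]
    parts = []
    for child in children:
        parts.extend(split_to_max_transform(*child, max_tb_w, max_tb_h))
    return parts
```

With a 32x32 maximum, a 128x128 block would be tiled into sixteen 32x32 sub-partitions, and no splitting direction would need to be signaled.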
- more than 4 sub-partitions and/or more than one splitting direction may be enabled.
- the above method may be enabled under certain conditions.
- the two transforms may be DCT-II and DST-VII (and corresponding inverse transforms).
- TS mode may be a third choice if it is applicable.
- one choice is DCT-II for both horizontal and vertical transform; and the other one is DST-VII.
- One bit may be coded to indicate which transform of the two is used.
- TS mode may be a fourth choice if it is applicable.
- the choices include: DCT-II/DST-VII for both horizontal and vertical transform; joint usage of DCT-II and DST-VII, each one for the horizontal or vertical transforms.
- fixed length coding may be utilized to code the four choices.
- truncated unary may be utilized to code the four choices.
- bin strings for the four choices are tabulated as follows:
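The two binarizations named above may be illustrated generically. The sketch below produces standard fixed-length and truncated unary bin strings for an index in the range 0 to 3; it is not a reconstruction of the table referenced above, and the assignment of indices to transform choices is left open.

```python
def fixed_length_bins(index, num_choices=4):
    """Return the fixed-length bin string for index among num_choices."""
    bits = max(1, (num_choices - 1).bit_length())
    return format(index, "0{}b".format(bits))

def truncated_unary_bins(index, num_choices=4):
    """Return the truncated unary bin string for index among num_choices."""
    max_val = num_choices - 1
    # 'index' ones followed by a terminating zero, which is dropped
    # for the largest index.
    return "1" * index + ("0" if index < max_val else "")
```

Fixed-length coding spends two bins on every choice, while truncated unary spends one bin on index 0 and three bins on indices 2 and 3, so it favors the choice placed first.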
- DCT-II and DST-VII may be allowed.
- DCT-II, DST-VII and DCT-VIII may be allowed.
- transform skip mode may be enabled.
- the allowed transform sets may depend on coded mode.
- the two-transformation basis (TS and DST-VII) may be allowed.
- the two-transformation basis (DCT-II and DST-VII) may be allowed or the three-transformation basis (TS, DCT-II and DST-VII) may be allowed.
- How to signal the transform index may be changed according to the allowed transform sets.
- a may be signaled in sequence/picture/slice/tile group/tile/brick level, or other kinds of video unit level.
- they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.
- b may be not signaled, but derived from the allowed maximum TS size.
- indications of the maximum allowed transform size may control both maximum TS sizes and maximum sizes used in other transform matrix.
- indications of the maximum allowed transform size may control both implicit and explicit MTS transform sizes.
- whether to apply the implicit MTS may depend on the signaled sizes.
- the maximum allowed transform size (non-TS mode) used in MTS and maximum allowed block size used in TS mode may be the same number.
- the shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) may be removed.
- MTS information may be further signaled. Otherwise, there is no need to signal the MTS information.
- condition check of block dimension compared to the maximum allowed TS sizes may be applied before signaling transform_skip_flag; and condition check of block dimension compared to the maximum allowed MTS sizes (e.g., fixed to be 32x32) may be applied before signaling tu_mts_idx.
- the shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) is kept unchanged, while the condition check of block dimension before signaling the transform matrix index (non-TS mode) may be removed.
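Separate size checks for the two syntax elements may be sketched as follows. Only the block-dimension conditions are modeled; other parsing conditions are omitted, and the function and parameter names are hypothetical. The 32x32 MTS limit is the example value given above.

```python
def mts_syntax_to_signal(w, h, max_ts_size, max_mts_size=32):
    """Return (signal transform_skip_flag, signal tu_mts_idx) from size checks only."""
    # transform_skip_flag is checked against the maximum allowed TS size.
    signal_ts_flag = w <= max_ts_size and h <= max_ts_size
    # tu_mts_idx is checked against the maximum allowed MTS size.
    signal_mts_idx = w <= max_mts_size and h <= max_mts_size
    return signal_ts_flag, signal_mts_idx
```

With a maximum TS size of 4, a 16x16 block would skip transform_skip_flag but could still carry tu_mts_idx, which a single shared check cannot express.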
- Some proposed changes to VVC working draft version 5 (JVET_N1001_v2) are described in this example.
- the underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions.
- nTbW specifying the width of the current transform block
- nTbH specifying the height of the current transform block
- variable implicitMtsEnabled is derived as follows:
- Max(nTbW, nTbH) is less than or equal to MaxSBTSize. That is, Max(nTbW, nTbH) is compared against MaxSBTSize instead of a fixed number 32.
- implicitMtsEnabled is set equal to 0.
- variable trTypeHor specifying the horizontal transform kernel and the variable trTypeVer specifying the vertical transform kernel are derived as follows:
- IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified depending on intraPredMode.
- sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA )
- trTypeHor and trTypeVer are derived as follows:
- trTypeHor and trTypeVer are specified in Table 8-13 depending on tu_mts_idx[xTbY][yTbY].
- condition check ‘cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32’ in the determination of implicitMtsEnabled may be replaced by ‘cu_sbt_flag is equal to 1’.
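The three forms of the SBT-related condition discussed in this example may be compared with a short sketch. Only the cu_sbt_flag branch of the implicitMtsEnabled derivation is modeled; the function name, the variant labels, and the default MaxSBTSize of 64 are assumptions for illustration.

```python
def implicit_mts_from_sbt(cu_sbt_flag, n_tb_w, n_tb_h, variant, max_sbt_size=64):
    """Return True when the SBT branch sets implicitMtsEnabled to 1."""
    if not cu_sbt_flag:
        return False
    if variant == "current":
        # Existing check: compare against the fixed number 32.
        return max(n_tb_w, n_tb_h) <= 32
    if variant == "max_sbt_size":
        # Proposed change: compare against MaxSBTSize instead.
        return max(n_tb_w, n_tb_h) <= max_sbt_size
    if variant == "no_size_check":
        # Alternative: cu_sbt_flag equal to 1 is sufficient.
        return True
    raise ValueError("unknown variant: " + variant)
```

For a 64x32 transform block with cu_sbt_flag equal to 1, only the "current" variant leaves implicit MTS disabled.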
- Some proposed changes to VVC working draft version 5 (JVET_N1001_v2) are described in this example.
- the underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions.
- This section provides examples for redundant check removal during the MTS signaling process.
- FIG. 16 is a block diagram of a video processing apparatus 1600.
- the apparatus 1600 may be used to implement one or more of the methods described herein.
- the apparatus 1600 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on.
- the apparatus 1600 may include one or more processors 1602, one or more memories 1604 and video processing hardware 1606.
- the processor (s) 1602 may be configured to implement one or more methods described in the present document.
- the memory (memories) 1604 may be used for storing data and code used for implementing the methods and techniques described herein.
- the video processing hardware 1606 may be used to implement, in hardware circuitry, some techniques described in the present document.
- FIG. 18 is a flowchart for a method 1800 of video processing in accordance with one or more examples of the present technology.
- the method 1800 includes, at operation 1802, partitioning a block of video data into sub-blocks using a partitioning pattern.
- the method 1800 includes, at operation 1804, performing prediction for one sub-block using at least one line of reference video data not adjacent to the current sub-block.
- the method 1800 also includes, at operation 1806, generating a residual signal for the sub-block based on the prediction.
- FIG. 19 is a flowchart for a method 1900 of video processing in accordance with one or more examples of the present technology.
- the method 1900 includes, at operation 1902, partitioning a block of video data into sub-blocks using a partitioning pattern.
- the method 1900 includes, at operation 1904, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks.
- the method 1900 also includes, at operation 1906, generating a residual signal for the sub-blocks based on the predictions.
- FIG. 20 is a flowchart for a method 2000 of video processing in accordance with one or more examples of the present technology.
- the method 2000 includes, at operation 2002, performing predictions for a block of video data to generate a residual signal.
- the method 2000 includes, at operation 2004, performing an explicit transformation of the residual signal using one of two transformations.
- the method 2000 includes, at operation 2006, encoding an output from the explicit transformation.
- a video processing method comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and generating a residual signal for the current sub-block based on the predictions.
- a video processing method comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and generating a residual signal for the sub-blocks based on the predictions.
- a video processing method comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and reconstructing the current sub-block using the predictions.
- a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that is determined based on a partitioning direction of the sub-blocks, a prediction mode, or the dimension of the block.
- a video processing method comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and reconstructing the sub-blocks using the predictions.
- a video processing method comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern; and transforming a residual signal for the sub-blocks based on the predictions, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.
- a video processing method comprising: receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing inverse transform on a residual signal of the sub-blocks, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream; and reconstructing the sub-blocks using an output from the inverse transform.
- a video processing method comprising: receiving or transmitting a bitstream representing a block of video data for performing video processing, wherein the block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream, and wherein the sub-blocks share same quantization information.
- the quantization information comprises a quantization parameter, a quantization step, or a scaling matrix.
- a video processing method comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block; and encoding or reconstructing the block of video data based on the predictions.
- example 19 is described in item 6 in Section 4.
- a video processing method comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein the sub-blocks are partitioned in multiple partitioning directions; and encoding or reconstructing the block of video data based on the predictions.
- example 22 is described in items 10 and 11 in Section 4.
- a video processing method comprising: performing predictions for a block of video data to generate a residual signal; performing an explicit transformation of the residual signal using one of two transformations; and encoding an output from the explicit transformation.
- example 25 is described in items 16-18 in Section 4.
- a video processing method comprising: receiving a block of video data partitioned into one or more sub-blocks; performing an explicit transformation of the block of video data using one of two inverse transformations; and reconstructing the block of video data based on the explicit transformation.
- example 30 is described in items 14-15 in Section 4.
- a video processing apparatus comprising a processor configured to implement one or more of examples 1 to 31.
- a computer-readable medium having code stored thereon, the code, when executed by a processor, causing the processor to implement a method recited in any one or more of examples 1 to 31.
- FIG. 21 is a flowchart for a method 2100 of video processing in accordance with one or more examples of the present technology.
- the method 2100 includes, at 2102, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; at 2104, enabling a second mode different from the ISP mode for the block; and at 2106, performing the conversion based on the ISP mode and the second mode.
- ISP Intra Sub-block Partition
- the second mode is multiple reference line (MRL) mode.
- a reference line which is not the closest one of multiple reference lines is available for intra prediction of the block.
- all sub-partitions of the block use a same reference line index for intra prediction of the block.
- only first K sub-partitions of all sub-partitions of the block use a same reference line index for intra prediction, and the remaining sub-partitions use the closest reference line for intra prediction of the block, K being an integer.
- K = 1.
- the reference line index is signaled in the bitstream.
- whether MRL mode is applied for the remaining sub-partitions depends on splitting direction in ISP mode or/and intra prediction mode or/and size of the block.
- MRL mode is applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- MRL mode is not applied to the remaining sub-partitions when above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- MRL mode is applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- MRL mode is not applied to the remaining sub-partitions when bottom-left neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- indications of ISP mode information are signaled before the signaling of MRL mode related information.
- the ISP mode information includes at least one of on/off flag and splitting direction
- the MRL mode related information includes the reference line index
- the reference line index is inferred to be 0.
- the second mode is a matrix based intra prediction (MIP) mode.
- MIP matrix based intra prediction
- matrix selection of one sub-partition depends on intra mode and/or size of the sub-partition.
- indications of the MIP modes and indications of the ISP modes are signaled for the block.
- the second mode is Transform skip (TS) mode.
- indication of enabling/disabling TS mode is further signaled even when ISP mode is enabled.
- whether to signal the indication of enabling/disabling TS mode depends on whether video content of the video is screen content or not.
- whether to signal the indication of enabling/disabling TS mode depends on a flag signaled in at least one of picture, slice, tile group, tile and brick-level.
- the indication of enabling/disabling transform skip mode is signaled.
- the indication of enabling/disabling transform skip mode is skipped and the TS mode is disabled for the blocks.
- all sub-partitions share the same quantization information including at least one of quantization parameter, quantization step and scaling matrix.
- the quantization parameter is represented by cu_qp_delta_abs and cu_qp_delta_sign_flag.
- the quantization information is signaled for the block only when there is at least one coefficient not equal to zero in at least one sub-partition.
- the quantization information is signaled once for the whole block instead of being signaled for each sub-partition.
- the quantization information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition, where m is an integer.
- the quantization information is signaled together with a first sub-partition in encoding or decoding order.
- the quantization information is signaled together with the last sub-partition in encoding or decoding order.
- the quantization information is signaled together with the m-th sub-partition in encoding or decoding order, wherein m is an integer no larger than the total number of allowed sub-partitions.
- FIG. 22 is a flowchart for a method 2200 of video processing in accordance with one or more examples of the present technology.
- the method 2200 includes, at 2202, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; at 2204, performing the conversion based on the ISP mode.
- the reference samples are filtered before being used as prediction.
- whether to modify the reference samples before being used as prediction depends on width and/or height of the block.
- whether to modify the reference samples before being used as prediction depends on intra-prediction mode of the block.
- block size of the block is denoted by W*H, wherein W is the block width and H is the block height
- a maximum transform block size of the block is denoted by MaxTbW *MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and maximum transform block height, respectively
- a minimum transform block size of the block is denoted by MinTbW *MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and minimum transform block height, respectively.
- indications of MaxTbW and/or MaxTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
- the indications of MaxTbW and/or MaxTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.
- VPS video parameter set
- SPS sequence parameter set
- PPS picture parameter set
- MaxTbW and/or MaxTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
- indications of MinTbW and/or MinTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
- the indications of MinTbW and/or MinTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.
- MinTbW and/or MinTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
- FIG. 23 is a flowchart for a method 2300 of video processing in accordance with one or more examples of the present technology.
- the method 2300 includes, at 2302, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions for both horizontal and vertical directions; at 2304, performing the conversion based on the ISP mode.
- binary value of splitting direction coded for the ISP mode is replaced by an index of splitting directions.
- the set of allowed splitting directions depends on block size.
- indications of set of allowed splitting directions are signaled.
- the set of allowed splitting directions depends on intra prediction mode of the block.
- the mixed splitting directions is enabled, M being an integer.
- the mixed splitting directions is enabled, M being an integer.
- M = 1.
- the block is split by using quad-tree splitting.
- the block is split horizontally first followed by being split vertically when the mixed splitting directions are applied.
- the block is split vertically first followed by being split horizontally when the mixed splitting directions are applied.
- whether to and/or how to apply ISP mode on the block depends on the relationship between the block size W×H of the block, and/or the maximum transform block size MaxTbW*MaxTbH and/or the minimum transform block size MinTbW*MinTbH.
- ISP mode is disabled for the block.
- how to split the block depends on the minimum transform block size of the block.
- ISP mode is enabled for the block and horizontal splitting is applied to the block, K being an integer larger than 1.
- the prediction direction is not needed to be signaled.
- the block is split to K sub-partitions.
- ISP mode is disabled for the block.
- ISP mode is disabled for the block, wherein the threshold is 4.
- ISP mode is disabled for the block.
- ISP mode is disabled for the block, wherein the threshold is 2 or 4.
- ISP mode is disabled for the block, wherein the threshold is 2 or 4.
- ISP mode is disabled for the block.
- ISP mode is disabled for the block.
- the first threshold is 1 and the second threshold is 4.
- signaling of splitting direction is skipped and the block is split according to certain rules.
- the quad-tree splitting is applied to the block first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
- the splitting of one partition tree is terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
- the splitting of one partition tree is terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
- the splitting of one partition tree is terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N, wherein M and N are two positive integers.
- when W/MaxTbW > 4 and/or H/MaxTbH > 4, more than 4 sub-partitions and/or more than one splitting direction are enabled.
- FIG. 24 is a flowchart for a method 2400 of video processing in accordance with one or more examples of the present technology.
- the method 2400 includes, at 2402, determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; at 2404, performing the conversion based on the determined MTS scheme.
- MTS Multiple Transform Selection
- the MTS scheme is explicit MTS where transform index of the MTS is signaled in the bitstream of the block.
- the MTS scheme is revised to allow only two transforms, wherein the two transforms are DCT-II and DST-VII.
- the MTS scheme includes two modes in terms of transform selection.
- the MTS scheme includes a third mode of transform skip (TS) mode in addition to the two modes of transform selection when the TS mode is applicable.
- TS transform skip
- a first mode of the two modes is DCT-II for both horizontal and vertical transform of the block
- a second mode of the two modes is DST-VII for both horizontal and vertical transform of the block.
- one bit is coded to indicate which mode of the two modes is used.
- the MTS scheme includes four modes in terms of transform selection.
- the MTS scheme includes a fifth mode of transform skip (TS) mode in addition to the four modes of transform selection when the TS mode is applicable.
- TS transform skip
- a first mode of the four modes is DCT-II for both horizontal and vertical transform of the block
- a second mode of the four modes is DST-VII for both horizontal and vertical transform of the block
- a third mode of the four modes is DCT-II for horizontal transform of the block and DST-VII for vertical transform of the block
- a fourth mode of the four modes is DST-VII for horizontal transform of the block and DCT-II for vertical transform of the block.
- fixed length coding is utilized to code the four modes.
- truncated unary is utilized to code the four modes.
- the allowed transform sets and/or signaling of transform index in explicit MTS depend on the block size.
- DCT-II, DST-VII and DCT-VIII are allowed.
- transform skip (TS) mode is allowed.
- the allowed transform sets depend on coded mode of the block.
- a transform set of two-transformation basis including TS mode and DST-VII is allowed.
- for non-IBC mode coded blocks, a transform set of two-transformation basis including DCT-II and DST-VII is allowed, or a transform set of three-transformation basis including TS mode, DCT-II and DST-VII is allowed.
- how to signal the transform index is changed according to the allowed transform sets.
- indications of the maximum allowed transform size used in non-TS mode of the MTS scheme are signaled.
- the indications are signaled in at least one of sequence, picture, slice, tile group, tile, brick level or other kinds of video unit level.
- the indications are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
- VPS video parameter set
- SPS sequence parameter set
- PPS picture parameter set
- the indications are derived from the allowed maximum TS size.
- the indications are used to control both maximum TS sizes and maximum sizes used in other transform matrix.
- maximum sizes for non-TS and TS modes are not needed to be signaled separately.
- the indications of the maximum allowed transform size are used to control both implicit MTS transform sizes and explicit MTS transform sizes.
- whether to apply the implicit MTS depends on the signaled maximum allowed transform sizes.
- the maximum allowed transform size used in non-TS mode of the MTS scheme is aligned with the maximum allowed transform size used in TS mode.
- the maximum allowed transform size used in non-TS mode of the MTS scheme is same as the maximum allowed transform size used in TS mode.
- the MTS scheme is implicit MTS where transform matrix of the MTS is directly derived according to transform block sizes of the block.
- derivation of implicit MTS enabling flag indicating whether implicit MTS is enabled is independent from block size of the block.
- checking of the block size in the derivation of implicit MTS enabling flag is skipped.
- the MTS information includes transform_skip_flag and tu_mts_idx.
- the MTS information is further signaled:
- the MTS information is not signaled.
- condition check of block size compared to the maximum allowed TS sizes is applied before signaling transform_skip_flag; and condition check of block size compared to the maximum allowed MTS sizes is applied before signaling tu_mts_idx.
- the shared condition check of block size before signaling the MTS information is kept unchanged, while the condition check of block size before signaling the transform matrix index used in non-TS mode of MTS is removed.
- the conversion generates the block of video from the bitstream representation.
- the conversion generates the bitstream representation from the block of video.
- the disclosed techniques may be embodied in video encoders or decoders to improve compression efficiency using techniques that include the use of a reduced dimension secondary transform.
- the disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them.
- the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
- the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
- data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
- a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program does not necessarily correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) .
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- the processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
- processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
- a processor will receive instructions and data from a read only memory or a random-access memory or both.
- the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
- a computer need not have such devices.
- Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Intra Sub-block Partitioning and multiple transform selection are described. In one example aspect, a video processing method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.
Description
CROSS-REFERENCE TO RELATED APPLICATION
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/084699, filed on April 27, 2019. The entire disclosure of International Patent Application No. PCT/CN2019/084699 is incorporated by reference as part of the disclosure of this application.
This patent document relates to video coding techniques, devices and systems.
In spite of the advances in video compression, digital video still accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
The present document describes various embodiments and techniques in which a secondary transform is used during decoding or encoding of video or images.
In one example aspect, a method of video processing is disclosed. The method includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and generating a residual signal for the current sub-block based on the predictions.
In yet another example aspect, another method of video processing is disclosed. The method includes partitioning a block of video data into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and generating a residual signal for the sub-blocks based on the predictions.
In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block, and reconstructing the current sub-block using the predictions.
In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks, and reconstructing the sub-blocks using the predictions.
In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and transforming a residual signal for the sub-blocks based on the predictions. A maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.
In yet another example aspect, another method of video processing is disclosed. The method includes receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern, performing inverse transform on a residual signal of the sub-blocks, and reconstructing the sub-blocks using an output from the inverse transform. A maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream.
In yet another example aspect, another method of video processing is disclosed. The method includes receiving or transmitting a bitstream representing a block of video data for performing video processing. The block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream and the sub-blocks share same quantization information.
In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern. Reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block. The method also includes encoding or reconstructing the block of video data based on the predictions.
In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern and encoding or reconstructing the block of video data based on the predictions. The sub-blocks are partitioned in multiple partitioning directions.
In yet another example aspect, another method of video processing is disclosed. The method includes performing predictions for a block of video data to generate a residual signal, performing an explicit transformation of the residual signal using one of two transformations, and encoding an output from the implicit transformation.
In yet another example aspect, another method of video processing is disclosed. The method includes receiving a block of video data partitioned into one or more sub-blocks, performing an explicit transformation of the block of video data using one of two inverse transformations; and reconstructing the block of video data based on the implicit transformation.
In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.
In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; and performing the conversion based on the ISP mode.
In yet another example aspect, another method of video processing is disclosed. The method includes enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions for both horizontal and vertical directions; and performing the conversion based on the ISP mode.
In yet another example aspect, another method of video processing is disclosed. The method includes determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and performing the conversion based on the determined MTS scheme.
In yet another example aspect, a video encoder is disclosed. The video encoder comprises a processor configured to implement one or more of the above-described methods.
In yet another example aspect, a video decoder is disclosed. The video decoder comprises a processor configured to implement one or more of the above-described methods.
In yet another example aspect, a computer readable medium is disclosed. The medium includes code for implementing one or more of the above-described methods stored on the medium.
These, and other, aspects are described in the present document.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 shows an example of an encoder block diagram.
FIG. 2 shows an example of 67 intra prediction modes.
FIGS. 3A-3B show examples of reference samples for wide-angular intra prediction.
FIG. 4 is an example illustration of a problem of discontinuity in case of directions beyond 45 degrees.
FIGS. 5A-5D show an example illustration of samples used by PDPC applied to diagonal and adjacent angular intra modes.
FIG. 6 depicts an example of four reference lines.
FIG. 7 is an example of division of 4×8 and 8×4 blocks.
FIG. 8 is an example of division of all blocks except 4×8, 8×4 and 4×4.
FIG. 9 is an example of Affine Linear Weighted Intra-Prediction (ALWIP) for 4x4 blocks.
FIG. 10 is an example of ALWIP for 8x8 blocks.
FIG. 11 is an example of ALWIP for 8x4 blocks.
FIG. 12 is an example of ALWIP for 16x16 blocks.
FIG. 13 shows an example of secondary transform in JEM.
FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST) .
FIG. 15 is an illustration of sub-block transform modes SBT-V and SBT-H.
FIG. 16 is a block diagram of an example hardware platform for implementing a technique described in the present document.
FIG. 17 shows an example of mixed splitting.
FIG. 18 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 19 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 20 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 21 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 22 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 23 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
FIG. 24 is a flowchart for a method of video processing in accordance with one or more examples of the present technology.
Section headings are used in the present document to facilitate ease of understanding and do not limit the embodiments disclosed in a section to only that section. Furthermore, while certain embodiments are described with reference to Versatile Video Coding or other specific video codecs, the disclosed techniques are applicable to other video coding technologies also. Furthermore, while some embodiments describe video coding steps in detail, it will be understood that corresponding decoding steps that undo the coding will be implemented by a decoder. Furthermore, the term video processing encompasses video coding or compression, video decoding or decompression and video transcoding in which video pixels are converted from one compressed format into another compressed format or to a different compressed bitrate.
1. Summary
This patent document is related to video coding technologies. Specifically, it is related to transforms in video coding. It may be applied to an existing video coding standard like HEVC, or to the standard (Versatile Video Coding) to be finalized. It may also be applicable to future video coding standards or video codecs.
2. Initial Discussion
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards. Since H.262, the video coding standards have been based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM). In April 2018, the Joint Video Expert Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.
2.1 Coding flow of a typical video codec
FIG. 1 shows an example of encoder block diagram of VVC, which contains three in-loop filtering blocks: deblocking filter (DF) , sample adaptive offset (SAO) and ALF. Unlike DF, which uses predefined filters, SAO and ALF utilize the original samples of the current picture to reduce the mean square errors between the original samples and the reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. ALF is located at the last processing stage of each picture and can be regarded as a tool trying to catch and fix artifacts created by the previous stages.
2.2 Intra mode coding with 67 intra prediction modes
To capture the arbitrary edge directions presented in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted as red dotted arrows in FIG. 2, and the planar and DC modes remain the same. These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.
Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction as shown in FIG. 2. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for the non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding is unchanged.
In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra-predictor using DC mode. In VTM2, blocks can have a rectangular shape that necessitates the use of a division operation per block in the general case. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
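The division-free DC average described above can be sketched as follows. This is a minimal illustration, not the reference software: the function name `dc_value`, the rounding offset of half the sample count, and the use of both sides for square blocks are assumptions made for the example.

```python
def dc_value(top, left):
    """Hypothetical helper: DC predictor from the top row (W samples)
    and the left column (H samples); W and H are powers of two."""
    w, h = len(top), len(left)
    if w == h:
        # Square block (assumption): average both sides, 2W samples in total.
        total, count = sum(top) + sum(left), w + h
    elif w > h:
        # Wide block: only the longer (top) side is averaged.
        total, count = sum(top), w
    else:
        # Tall block: only the longer (left) side is averaged.
        total, count = sum(left), h
    # count is a power of two, so the division becomes a right shift.
    shift = count.bit_length() - 1
    return (total + (count >> 1)) >> shift


print(dc_value([10, 20, 30, 40], [100, 100]))  # wide block, uses the top row only
print(dc_value([10, 10], [20, 20]))            # square block, uses both sides
```

Because the number of averaged samples is always a power of two, the average reduces to an addition and a shift for every block shape.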
2.3 Wide-angle intra prediction for non-square blocks
Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes for a certain block is unchanged, i.e., 67, and the intra mode coding is unchanged.
To support these prediction directions, the top reference with length 2W+1, and the left reference with length 2H+1, are defined as shown in FIG. 3A-3B.
The mode number of replaced mode in wide-angular direction mode is dependent on the aspect ratio of a block. The replaced intra prediction modes are illustrated in Table 1.
Table 1 -Intra prediction modes replaced by wide-angular modes
Condition | Replaced intra prediction modes |
W /H == 2 | |
W /H > 2 | |
W /H == 1 | None |
H /W == 1/2 | |
H /W < 1/2 | |
As shown in FIG. 4, two vertically-adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, low-pass reference samples filter and side smoothing are applied to the wide-angle prediction to reduce the negative effect of the increased gap $\Delta p_{\alpha}$.
2.4 Position dependent intra prediction combination
In the VTM2, the results of intra prediction of planar mode are further modified by a position dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method which invokes a combination of the un-filtered boundary reference samples and HEVC style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signalling: planar, DC, horizontal, vertical, bottom-left angular mode and its eight adjacent angular modes, and top-right angular mode and its eight adjacent angular modes.
The prediction sample pred (x, y) is predicted using an intra prediction mode (DC, planar, angular) and a linear combination of reference samples according to the Equation as follows:
$$\mathrm{pred}(x,y) = \big( wL \times R_{-1,y} + wT \times R_{x,-1} - wTL \times R_{-1,-1} + (64 - wL - wT + wTL) \times \mathrm{pred}(x,y) + 32 \big) \gg 6$$

where $R_{x,-1}$ and $R_{-1,y}$ represent the reference samples located at the top and left of the current sample (x, y), respectively, and $R_{-1,-1}$ represents the reference sample located at the top-left corner of the current block.
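As a minimal sketch of the combination above, the following applies the PDPC equation to a single sample. The function name `pdpc_sample` and its argument names are hypothetical; the weights wL, wT and wTL are taken as inputs because they depend on the prediction mode.

```python
def pdpc_sample(pred, r_left, r_top, r_topleft, wL, wT, wTL):
    """Hypothetical helper: PDPC combination for one sample.

    pred      -- intra-predicted sample pred(x, y)
    r_left    -- reference sample R(-1, y)
    r_top     -- reference sample R(x, -1)
    r_topleft -- reference sample R(-1, -1)
    """
    # Weights are expressed in units of 1/64; adding 32 rounds the
    # result before the right shift by 6.
    return (wL * r_left + wT * r_top - wTL * r_topleft
            + (64 - wL - wT + wTL) * pred + 32) >> 6


# With all weights zero the predicted sample passes through unchanged.
print(pdpc_sample(100, 60, 120, 90, 0, 0, 0))
# Non-zero weights pull the sample toward the left and top references.
print(pdpc_sample(100, 60, 120, 90, 16, 16, 0))
```

The four weights applied to the references and to the prediction always sum to 64, so the shift by 6 keeps the result in the sample range of the inputs.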
If PDPC is applied to DC, planar, horizontal, and vertical intra modes, additional boundary filters are not needed, as required in the case of HEVC DC mode boundary filter or horizontal/vertical mode edge filters.
FIGS. 5A-5D illustrate the definition of reference samples ($R_{x,-1}$, $R_{-1,y}$ and $R_{-1,-1}$) for PDPC applied over various prediction modes. The prediction sample pred (x’, y’) is located at (x’, y’) within the prediction block. The coordinate x of the reference sample $R_{x,-1}$ is given by: x = x’ + y’ + 1, and the coordinate y of the reference sample $R_{-1,y}$ is similarly given by: y = x’ + y’ + 1.
FIGS. 5A to 5D provide definition of samples used by PDPC applied to diagonal and adjacent angular intra modes.
The PDPC weights are dependent on prediction modes and are shown in Table 2.
Table 2 -Example of PDPC weights according to prediction modes
2.5. Multiple reference line
Multiple reference line (MRL) intra prediction uses more reference lines for intra prediction. In FIG. 6, an example of 4 reference lines is depicted, where the samples of segments A and F are not fetched from reconstructed neighbouring samples but padded with the closest samples from Segment B and E, respectively. HEVC intra-picture prediction uses the nearest reference line (i.e., reference line 0) . In MRL, 2 additional lines (reference line 1 and reference line 3) are used.
The index of the selected reference line (mrl_idx) is signaled and used to generate the intra predictor. When the reference line index is greater than 0, only the additional reference line modes are included in the MPM list, and only the MPM index is signaled, without the remaining mode. The reference line index is signaled before the intra prediction modes, and Planar and DC modes are excluded from the intra prediction modes in case a nonzero reference line index is signaled.
MRL is disabled for the first line of blocks inside a CTU to prevent using extended reference samples outside the current CTU line. Also, PDPC is disabled when additional line is used.
2.6 Intra subblock partitioning (ISP)
ISP is proposed, which divides luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size dimensions, as shown in Table 3. FIG. 7 and FIG. 8 show examples of the two possibilities. FIG. 7 shows an example of division of 4×8 and 8×4 blocks. FIG. 8 shows an example of division of all blocks except 4×8, 8×4 and 4×4. All sub-partitions fulfill the condition of having at least 16 samples. For block sizes 4×N or N×4 (with N>8), if allowed, 1×N or N×1 sub-partitions may exist.
Table 3: Number of sub-partitions depending on the block size.
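The sub-partition rule can be read off the descriptions of FIG. 7 and FIG. 8 and sketched as below. The function names are hypothetical, and treating a 4×4 block as not divided is an assumption inferred from its exclusion in both figures.

```python
def num_isp_subpartitions(width, height):
    """Hypothetical helper: number of ISP sub-partitions for a luma block."""
    if (width, height) == (4, 4):
        return 1  # assumption: 4x4 blocks are not divided
    if (width, height) in ((4, 8), (8, 4)):
        return 2  # FIG. 7 case
    return 4      # FIG. 8 case: all other block sizes


def isp_subpartition_size(width, height, horizontal):
    """Size (W, H) of each sub-partition for a horizontal or vertical split."""
    n = num_isp_subpartitions(width, height)
    # A horizontal split divides the height; a vertical split divides the width.
    return (width, height // n) if horizontal else (width // n, height)


print(isp_subpartition_size(8, 4, horizontal=False))   # two 4x4 sub-partitions
print(isp_subpartition_size(4, 16, horizontal=False))  # four 1x16 sub-partitions
```

In both printed cases each sub-partition holds 16 samples, which matches the stated minimum, and the 4×16 vertical split illustrates how a 1×N sub-partition can arise.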
For each of these sub-partitions, a residual signal is generated by entropy decoding the coefficients sent by the encoder and then inverse quantizing and inverse transforming them. Then, the sub-partition is intra predicted and finally the corresponding reconstructed samples are obtained by adding the residual signal to the prediction signal. Therefore, the reconstructed values of each sub-partition will be available to generate the prediction of the next one, which will repeat the process and so on. All sub-partitions share the same intra mode.
Table 4 shows example transform types based on intra-prediction mode (s) .
Table 4: Specification of trTypeHor and trTypeVer depending on predModeIntra
2.6.1 Example Syntax and Semantics
Table 5 shows an example coding unit syntax.
Table 5: Coding unit syntax
Table 6 shows an example transform unit syntax. Some of the example variables include:
intra_subpartitions_mode_flag [x0] [y0] equal to 1 specifies that the current intra coding unit is partitioned into NumIntraSubPartitions [x0] [y0] rectangular transform block subpartitions. intra_subpartitions_mode_flag [x0] [y0] equal to 0 specifies that the current intra coding unit is not partitioned into rectangular transform block subpartitions.
When intra_subpartitions_mode_flag [x0] [y0] is not present, it is inferred to be equal to 0.
intra_subpartitions_split_flag [x0] [y0] specifies whether the intra subpartitions split type is horizontal or vertical. When intra_subpartitions_split_flag [x0] [y0] is not present, it is inferred as follows:
If cbHeight is greater than MaxTbSizeY, intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 0.
Otherwise (cbWidth is greater than MaxTbSizeY) , intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 1.
The variable IntraSubPartitionsSplitType specifies the type of split used for the current luma coding block as illustrated in Table 7. IntraSubPartitionsSplitType is derived as follows:
If intra_subpartitions_mode_flag [x0] [y0] is equal to 0, IntraSubPartitionsSplitType is set equal to 0.
Otherwise, the IntraSubPartitionsSplitType is set equal to 1 +intra_subpartitions_split_flag [x0] [y0] .
Table 6 Transform unit syntax
Table 7 shows example name association to IntraSubPartitionsSplitType
Table 7 Name association to IntraSubPartitionsSplitType
IntraSubPartitionsSplitType | Name of IntraSubPartitionsSplitType |
0 | ISP_NO_SPLIT |
1 | ISP_HOR_SPLIT |
2 | ISP_VER_SPLIT |
The variable NumIntraSubPartitions specifies the number of transform block subpartitions an intra luma coding block is divided into. NumIntraSubPartitions is derived as follows:
If IntraSubPartitionsSplitType is equal to ISP_NO_SPLIT, NumIntraSubPartitions is set equal to 1.
Otherwise, if one of the following conditions is true, NumIntraSubPartitions is set equal to 2: cbWidth is equal to 4 and cbHeight is equal to 8, cbWidth is equal to 8 and cbHeight is equal to 4.
Otherwise, NumIntraSubPartitions is set equal to 4.
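The two derivations above can be sketched in a few lines. This is a minimal illustration, not normative text: the function names are hypothetical, and the constant name ISP_HOR_SPLIT for value 1 is taken from the VVC draft and is an assumption here.

```python
# Split-type values as in Table 7 (ISP_HOR_SPLIT is assumed from the VVC draft).
ISP_NO_SPLIT, ISP_HOR_SPLIT, ISP_VER_SPLIT = 0, 1, 2


def intra_sub_partitions_split_type(mode_flag: int, split_flag: int) -> int:
    """Derive IntraSubPartitionsSplitType from the two syntax elements."""
    if mode_flag == 0:
        return ISP_NO_SPLIT
    return 1 + split_flag


def num_intra_sub_partitions(split_type: int, cb_width: int, cb_height: int) -> int:
    """Derive NumIntraSubPartitions for a luma coding block."""
    if split_type == ISP_NO_SPLIT:
        return 1
    if (cb_width, cb_height) in ((4, 8), (8, 4)):
        return 2
    return 4
```

For example, a 16×16 block with intra_subpartitions_mode_flag equal to 1 and intra_subpartitions_split_flag equal to 1 yields split type 2 and four sub-partitions.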
2.7 Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction)
Affine linear weighted intra prediction (ALWIP, a.k.a. Matrix based intra prediction (MIP) ) is proposed.
2.7.1 Generation of the reduced prediction signal by matrix vector multiplication
The neighboring reference samples are first down-sampled via averaging to generate the reduced reference signal $bdry_{red}$. Then, the reduced prediction signal $pred_{red}$ is computed by calculating a matrix vector product and adding an offset:
$pred_{red} = A \cdot bdry_{red} + b.$
Here, A is a matrix that has $W_{red} \cdot H_{red}$ rows and 4 columns if W=H=4 and 8 columns in all other cases. b is a vector of size $W_{red} \cdot H_{red}$.
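The matrix vector step can be illustrated with a short sketch. The function names are hypothetical and the matrix, boundary and offset values used in the usage note are toy placeholders, not trained ALWIP parameters.

```python
def expected_columns(width: int, height: int) -> int:
    """Number of columns of A: 4 for a 4x4 block, 8 in all other cases."""
    return 4 if (width == 4 and height == 4) else 8


def reduced_prediction(A, bdry_red, b):
    """Compute pred_red = A * bdry_red + b, with A given as a list of rows."""
    if any(len(row) != len(bdry_red) for row in A):
        raise ValueError("every row of A must have len(bdry_red) entries")
    if len(A) != len(b):
        raise ValueError("b must have one offset per row of A")
    return [
        sum(a * x for a, x in zip(row, bdry_red)) + offset
        for row, offset in zip(A, b)
    ]
```

For a 4×4 block, A has 16 rows and 4 columns, so `reduced_prediction([[1, 2, 3, 4]] * 16, [1, 1, 1, 1], [5] * 16)` produces 16 reduced prediction samples.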
2.7.2. Illustration of the entire ALWIP process
The entire process of averaging, matrix vector multiplication and linear interpolation is illustrated for different shapes in FIG. 9 to FIG. 12. Note, that the remaining shapes are treated as in one of the depicted cases.
Given a 4×4 block, as shown in FIG. 9, ALWIP takes two averages along each axis of the boundary. The resulting four input samples enter the matrix vector multiplication. The matrices are taken from the set S_0. After adding an offset, this yields the 16 final prediction samples. Linear interpolation is not necessary for generating the prediction signal. Thus, a total of (4·16) / (4·4) =4 multiplications per sample are performed.
Given an 8×8 block, as shown in FIG. 10, ALWIP takes four averages along each axis of the boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_1. This yields 16 samples on the odd positions of the prediction block. Thus, a total of (8·16) / (8·8) =2 multiplications per sample are performed. After adding an offset, these samples are interpolated vertically by using the reduced top boundary. Horizontal interpolation follows by using the original left boundary.
Given an 8×4 block, as shown in FIG. 11, ALWIP takes four averages along the horizontal axis of the boundary and the four original boundary values on the left boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_1. This yields 16 samples on the odd horizontal and each vertical positions of the prediction block. Thus, a total of (8·16) / (8·4) =4 multiplications per sample are performed. After adding an offset, these samples are interpolated horizontally by using the original left boundary. The transposed case is treated accordingly.
Given a 16×16 block, as shown in FIG. 12, ALWIP takes four averages along each axis of the boundary. The resulting eight input samples enter the matrix vector multiplication. The matrices are taken from the set S_2. This yields 64 samples on the odd positions of the prediction block. Thus, a total of (8·64) / (16·16) =2 multiplications per sample are performed. After adding an offset, these samples are interpolated vertically by using eight averages of the top boundary. Horizontal interpolation follows by using the original left boundary. The interpolation process, in this case, does not add any multiplications. Therefore, totally, two multiplications per sample are required to calculate ALWIP prediction.
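The per-sample multiplication counts quoted for the four depicted shapes can be reproduced with a small helper. The helper name and the dictionary layout are illustrative only; the numbers are the ones stated in the text.

```python
def mults_per_sample(num_inputs: int, num_outputs: int, width: int, height: int) -> float:
    """Multiplications of the matrix vector product divided by the block area."""
    return (num_inputs * num_outputs) / (width * height)


# (input samples, output samples of the matrix stage, block width, block height)
CASES = {
    "4x4": (4, 16, 4, 4),
    "8x8": (8, 16, 8, 8),
    "8x4": (8, 16, 8, 4),
    "16x16": (8, 64, 16, 16),
}

COUNTS = {shape: mults_per_sample(*args) for shape, args in CASES.items()}
```

Only the matrix vector product contributes here, since the interpolation stage is stated not to add multiplications.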
For larger shapes, the procedure is essentially the same and it is easy to check that the number of multiplications per sample is less than four.
For W×8 blocks with W>8, only horizontal interpolation is necessary as the samples are given at the odd horizontal and each vertical position.
Finally, for W×4 blocks with W>8, let A_k be the matrix that arises by leaving out every row that corresponds to an odd entry along the horizontal axis of the down-sampled block. Thus, the output size is 32 and again, only horizontal interpolation remains to be performed.
The transposed cases are treated accordingly.
2.7.3 Example Syntax and Semantics
Table 8 shows an example coding unit syntax
Table 8 Coding unit syntax
2.8 Multiple Transform Set (MTS) in VVC
2.8.1 Explicit Multiple Transform Set (MTS)
In VTM4, large block-size transforms, up to 64×64 in size, are enabled, which is primarily useful for higher resolution video, e.g., 1080p and 4K sequences. High frequency transform coefficients are zeroed out for the transform blocks with size (width or height, or both width and height) equal to 64, so that only the lower-frequency coefficients are retained. For example, for an M×N transform block, with M as the block width and N as the block height, when M is equal to 64, only the left 32 columns of transform coefficients are kept. Similarly, when N is equal to 64, only the top 32 rows of transform coefficients are kept. When transform skip mode is used for a large block, the entire block is used without zeroing out any values.
In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple selected transforms from the DCT8/DST7. The newly introduced transform matrices are DST-VII and DCT-VIII. The table below shows the basis functions of the selected DST/DCT.
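As a hedged illustration of the three transform types, the sketch below builds N-point DCT-II, DCT-VIII and DST-VII matrices from the commonly cited orthonormal floating-point definitions. These formulas are an assumption of this sketch; practical codecs use scaled integer approximations of them, and the function names are hypothetical.

```python
import math


def dct2(n: int):
    """Orthonormal N-point DCT-II; row i is basis function T_i(j)."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dct8(n: int):
    """Orthonormal N-point DCT-VIII."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


def dst7(n: int):
    """Orthonormal N-point DST-VII."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def is_orthonormal(t, tol: float = 1e-9) -> bool:
    """Check that the rows of t form an orthonormal set."""
    n = len(t)
    for r in range(n):
        for s in range(n):
            dot = sum(t[r][k] * t[s][k] for k in range(n))
            if abs(dot - (1.0 if r == s else 0.0)) > tol:
                return False
    return True
```

Orthonormality of the floating-point matrices is what the more accurate quantization of the integer cores tries to preserve.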
In order to keep the orthogonality of the transform matrix, the transform matrices are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transformed coefficients within the 16-bit range, after the horizontal and after the vertical transform, all the coefficients are to have 10 bits.
In order to control MTS scheme, separate enabling flags are specified at SPS level for intra and inter, respectively. When MTS is enabled at SPS, a CU level flag is signalled to indicate whether MTS is applied or not. Here, MTS is applied only for luma. The MTS CU level flag is signalled when the following conditions are satisfied.
- Both width and height smaller than or equal to 32
- CBF flag is equal to one
If the MTS CU flag is equal to zero, then DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signalled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signalling mapping is shown in Table 3-10. When it comes to transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, 8-point, 16-point and 32-point DCT-2. Also, other transform cores including 64-point DCT-2, 4-point DCT-8, 8-point, 16-point, 32-point DST-7 and DCT-8, use 8-bit primary transform cores.
To reduce the complexity of large size DST-7 and DCT-8, high frequency transform coefficients are zeroed out for the DST-7 and DCT-8 blocks with size (width or height, or both width and height) equal to 32. Only the coefficients within the 16x16 lower-frequency region are retained.
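The two zero-out rules (64-point DCT-2 keeps 32 coefficients, 32-point DST-7/DCT-8 keeps 16) can be sketched as below. Applying the rule independently per direction is an assumption of this sketch, and the names are hypothetical; the transform type codes follow the trTypeHor/trTypeVer convention (0 for DCT-2, 1 for DST-7, 2 for DCT-8).

```python
DCT2, DST7, DCT8 = 0, 1, 2


def retained_size(size: int, tr_type: int) -> int:
    """Number of low-frequency coefficients kept along one direction."""
    if tr_type == DCT2:
        return 32 if size == 64 else size
    # DST-7 or DCT-8: a 32-point transform keeps only 16 coefficients.
    return 16 if size == 32 else size


def retained_region(width: int, height: int, tr_type_hor: int, tr_type_ver: int):
    """(columns kept, rows kept) for a width x height transform block."""
    return retained_size(width, tr_type_hor), retained_size(height, tr_type_ver)
```

For instance, a 64×16 DCT-2 block keeps the left 32 columns and all 16 rows, while a 32×32 DST-7 block keeps the 16×16 low-frequency region.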
As in HEVC, the residual of a block can be coded with transform skip mode. To avoid the redundancy of syntax coding, the transform skip flag is not signalled when the CU level MTS_CU_flag is not equal to zero. The block size limitation for transform skip is the same as that for MTS in JEM4, which indicates that transform skip is applicable for a CU when both block width and height are equal to or less than 32.
2.8.1.1 Example Syntax and Semantics
The MTS index may be signaled in the bitstream; such a design is called explicit MTS. In addition, an alternative that directly derives the matrix according to the transform block sizes is also supported, called implicit MTS.
Explicit MTS supports all coded modes, while implicit MTS supports only intra mode. Table 9 shows example picture parameter set syntax.
Table 9 picture parameter set RBSP syntax.
Table 10 shows example transform unit syntax.
Table 10 Transform unit syntax
Some of the example variables include:
transform_skip_flag [x0] [y0] specifies whether a transform is applied to the luma transform block or not. The array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.
transform_skip_flag [x0] [y0] equal to 1 specifies that no transform is applied to the luma transform block. transform_skip_flag [x0] [y0] equal to 0 specifies that the decision whether transform is applied to the luma transform block or not depends on other syntax elements. When transform_skip_flag [x0] [y0] is not present, it is inferred to be equal to 0.
tu_mts_idx [x0] [y0] specifies which transform kernels are applied to the residual samples along the horizontal and vertical direction of the associated luma transform block. The array indices x0, y0 specify the location (x0, y0 ) of the top-left luma sample of the considered transform block relative to the top-left luma sample of the picture.
When tu_mts_idx [x0] [y0] is not present, it is inferred to be equal to 0.
In the CABAC decoding process, one context is used to decode transform_skip_flag, truncated unary is used to binarize the tu_mts_idx. Each bin of the tu_mts_idx is context coded, and for the first bin, the quad-tree depth (i.e., cqtDepth) is used to select one context; and for the remaining bins, one context is used.
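Truncated unary binarization can be sketched as follows. The maximum value cMax = 4 used in the usage note is an assumption derived from the five tu_mts_idx values (0 to 4); the function name is hypothetical and context selection is not modelled.

```python
def truncated_unary(value: int, c_max: int) -> str:
    """Truncated unary bin string: `value` ones, then a terminating zero
    unless value equals c_max."""
    if not 0 <= value <= c_max:
        raise ValueError("value must lie in [0, c_max]")
    bins = "1" * value
    if value < c_max:
        bins += "0"
    return bins
```

With cMax = 4, the five index values map to `0`, `10`, `110`, `1110` and `1111`; the first bin of each string is the one whose context depends on cqtDepth.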
Table 11 shows example assignment of ctxInc to syntax elements.
Table 11 Assignment of ctxInc to syntax elements with context coded bins
2.8.2 Implicit Multiple Transform Set (MTS)
It is noted that ISP, SBT, and MTS enabled but with implicit signaling are all treated as implicit MTS.
The implicitMtsEnabled is used to define whether implicit MTS is enabled. The variable implicitMtsEnabled is derived as follows:
If sps_mts_enabled_flag is equal to 1 and one of the following conditions is true, implicitMtsEnabled is set equal to 1:
- IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT
- cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32
- sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are both equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA
Otherwise, implicitMtsEnabled is set equal to 0.
The variable trTypeHor specifying the horizontal transform kernel and the variable trTypeVer specifying the vertical transform kernel are derived as follows:
If cIdx is greater than 0, trTypeHor and trTypeVer are set equal to 0.
Otherwise, if implicitMtsEnabled is equal to 1, the following applies:
- If IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified in Table 4 depending on intraPredMode.
- Otherwise, if cu_sbt_flag is equal to 1, trTypeHor and trTypeVer are specified in Table 13 depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.
- Otherwise (sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are equal to 0) , trTypeHor and trTypeVer are derived as follows:
trTypeHor = (nTbW >= 4 && nTbW <= 16 && nTbW <= nTbH ) ? 1 : 0 (8-1030)
trTypeVer = (nTbH >= 4 && nTbH <= 16 && nTbH <= nTbW ) ? 1 : 0 (8-1031)
Otherwise, trTypeHor and trTypeVer are specified in Table 12 depending on tu_mts_idx [xTbY] [yTbY] .
Table 12 Specification of trTypeHor and trTypeVer depending on tu_mts_idx [x] [y]
tu_mts_idx [x0] [y0] | 0 | 1 | 2 | 3 | 4 |
trTypeHor | 0 | 1 | 2 | 1 | 2 |
trTypeVer | 0 | 1 | 1 | 2 | 2 |
Table 13 shows example specification of trTypeHor and trTypeVer depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.
Table 13 Specification of trTypeHor and trTypeVer depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag
cu_sbt_horizontal_flag | cu_sbt_pos_flag | trTypeHor | trTypeVer |
0 | 0 | 2 | 1 |
0 | 1 | 1 | 1 |
1 | 0 | 1 | 2 |
1 | 1 | 1 | 1 |
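The derivation of implicitMtsEnabled, trTypeHor and trTypeVer can be sketched as below. This is a non-normative illustration: the function and parameter names are hypothetical, the tu_mts_idx table is read with the trTypeHor row first (an assumption), and the intra-mode dependent table for ISP is not reproduced, so it is passed in as a caller-supplied lookup.

```python
ISP_NO_SPLIT = 0
MODE_INTRA = "MODE_INTRA"

# tu_mts_idx -> (trTypeHor, trTypeVer)
TABLE_12 = {0: (0, 0), 1: (1, 1), 2: (2, 1), 3: (1, 2), 4: (2, 2)}
# (cu_sbt_horizontal_flag, cu_sbt_pos_flag) -> (trTypeHor, trTypeVer)
TABLE_13 = {(0, 0): (2, 1), (0, 1): (1, 1), (1, 0): (1, 2), (1, 1): (1, 1)}


def implicit_mts_enabled(*, sps_mts_enabled_flag, isp_split_type, cu_sbt_flag,
                         n_tb_w, n_tb_h, explicit_intra_flag,
                         explicit_inter_flag, cu_pred_mode):
    """True when implicit MTS applies to the transform block."""
    if sps_mts_enabled_flag != 1:
        return False
    return (
        isp_split_type != ISP_NO_SPLIT
        or (cu_sbt_flag == 1 and max(n_tb_w, n_tb_h) <= 32)
        or (explicit_intra_flag == 0 and explicit_inter_flag == 0
            and cu_pred_mode == MODE_INTRA)
    )


def derive_tr_types(*, c_idx, implicit, isp_split_type, cu_sbt_flag,
                    cu_sbt_horizontal_flag, cu_sbt_pos_flag,
                    n_tb_w, n_tb_h, tu_mts_idx, isp_table_lookup=None):
    """Return (trTypeHor, trTypeVer)."""
    if c_idx > 0:                      # chroma always uses DCT-2
        return (0, 0)
    if implicit:
        if isp_split_type != ISP_NO_SPLIT:
            if isp_table_lookup is None:
                raise ValueError("ISP needs the intra-mode dependent table")
            return isp_table_lookup()
        if cu_sbt_flag == 1:
            return TABLE_13[(cu_sbt_horizontal_flag, cu_sbt_pos_flag)]
        hor = 1 if (4 <= n_tb_w <= 16 and n_tb_w <= n_tb_h) else 0
        ver = 1 if (4 <= n_tb_h <= 16 and n_tb_h <= n_tb_w) else 0
        return (hor, ver)
    return TABLE_12[tu_mts_idx]
```

Note that with cu_sbt_flag equal to 1 and a 64×32 transform block, `implicit_mts_enabled` returns False because of the Max (nTbW, nTbH ) check.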
2.9 Reduced Secondary Transform (RST)
2.9.1 Non-Separable Secondary Transform (NSST) in JEM
In JEM, a secondary transform is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder) . As shown in FIG. 13, a 4x4 (or 8x8) secondary transform is performed depending on the block size. For example, the 4x4 secondary transform is applied for small blocks (i.e., min (width, height) < 8) and the 8x8 secondary transform is applied for larger blocks (i.e., min (width, height) > 4) per 8x8 block.
FIG. 13 shows an example of secondary transform in JEM.
Application of a non-separable transform is described as follows using a 4x4 input block as an example. To apply the non-separable transform, the 4x4 input block X
$X = \begin{bmatrix} X_{00} & X_{01} & X_{02} & X_{03} \\ X_{10} & X_{11} & X_{12} & X_{13} \\ X_{20} & X_{21} & X_{22} & X_{23} \\ X_{30} & X_{31} & X_{32} & X_{33} \end{bmatrix}$
is first represented as a vector $\vec{X}$:
$\vec{X} = [X_{00}\ X_{01}\ X_{02}\ X_{03}\ X_{10}\ X_{11}\ X_{12}\ X_{13}\ X_{20}\ X_{21}\ X_{22}\ X_{23}\ X_{30}\ X_{31}\ X_{32}\ X_{33}]^T$
The non-separable transform is calculated as $\vec{F} = T \cdot \vec{X}$, where $\vec{F}$ indicates the transform coefficient vector, and T is a 16x16 transform matrix. The 16x1 coefficient vector $\vec{F}$ is subsequently re-organized as a 4x4 block using the scanning order for that block (horizontal, vertical or diagonal) . The coefficients with smaller index are placed with the smaller scanning index in the 4x4 coefficient block. In total, there are 35 transform sets, and 3 non-separable transform matrices (kernels) are used per transform set. The mapping from the intra prediction mode to the transform set is pre-defined. For each transform set, the selected non-separable secondary transform candidate is further specified by the explicitly signalled secondary transform index. The index is signalled in the bitstream once per intra CU after the transform coefficients.
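The non-separable transform steps can be sketched as flatten, multiply, and re-organize. The row-by-row flattening order, the function names, and the two scan tables are assumptions of this sketch; the diagonal scan and the trained 16x16 kernels are not reproduced.

```python
def apply_nsst_4x4(block, t):
    """Flatten a 4x4 block row by row and multiply by the 16x16 matrix t."""
    x_vec = [block[r][c] for r in range(4) for c in range(4)]
    return [sum(t[i][k] * x_vec[k] for k in range(16)) for i in range(16)]


def reorganize(f_vec, scan):
    """Place coefficient f_vec[idx] at the idx-th position of the scan."""
    out = [[0] * 4 for _ in range(4)]
    for idx, (r, c) in enumerate(scan):
        out[r][c] = f_vec[idx]
    return out


# Scan orders as lists of (row, column) positions.
HORIZONTAL_SCAN = [(r, c) for r in range(4) for c in range(4)]
VERTICAL_SCAN = [(r, c) for c in range(4) for r in range(4)]

# Placeholder kernel: the identity matrix, used only to exercise the code path.
IDENTITY_16 = [[1 if i == k else 0 for k in range(16)] for i in range(16)]
```

With the identity placeholder, the horizontal scan returns the input block unchanged and the vertical scan returns its transpose, which makes the re-organization step easy to inspect.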
2.9.2 Reduced Secondary Transform (RST)
The RST was introduced together with a mapping to 4 transform sets (instead of 35 transform sets) . 16x64 (may further be reduced to 16x48) and 16x16 matrices are employed for 8x8 and 4x4 blocks, respectively. For notational convenience, the 16x64 (may further be reduced to 16x48) transform is denoted as RST8x8 and the 16x16 one as RST4x4.
FIG. 14 shows an example of the proposed Reduced Secondary Transform (RST) .
2.10 Sub-block transform
For an inter-predicted CU with cu_cbf equal to 1, cu_sbt_flag may be signaled to indicate whether the whole residual block or a sub-part of the residual block is decoded. In the former case, inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with inferred adaptive transform and the other part of the residual block is zeroed out. The SBT is not applied to the combined inter-intra mode.
In sub-block transform, a position-dependent transform is applied on luma transform blocks in SBT-V and SBT-H (chroma TB always using DCT-2) . The two positions of SBT-H and SBT-V are associated with different core transforms. More specifically, the horizontal and vertical transforms for each SBT position are specified in FIG. 15. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the corresponding transform is set as DCT-2. Therefore, the sub-block transform jointly specifies the TU tiling, cbf, and horizontal and vertical transforms of a residual block, which may be considered a syntax shortcut for the cases where the major residual of a block is at one side of the block.
2.10.1 Example Syntax and Semantics
Table 14 shows an example coding unit syntax.
Table 14 Coding unit syntax
Table 15 shows an example residual coding syntax.
Table 15 Residual coding syntax
3. Examples of problems solved by embodiments
The current design has the following problems:
1. ISP cannot be enabled when multiple reference line (MRL) is enabled.
2. Transform skip (TS) cannot be enabled when ISP is used. However, enabling both ISP and TS may achieve similar functionality as BDPCM while there is no need to add an additional module for handling BDPCM.
3. Delta QP is signaled per sub-partition which results in signaling it multiple times for ISP coded blocks.
4. The enabling of ISP mode (e.g., intra_subpartitions_mode_flag) is signaled when either the width or the height of a block is no larger than MaxTbSizeY, while the partition direction (i.e., splitting type, horizontal/vertical direction) is signaled when both width and height are no larger than MaxTbSizeY. If one of them is larger than MaxTbSizeY, the following applies:
a. If height is greater than MaxTbSizeY, horizontal splitting (i.e., intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 0) is used.
b. Otherwise (width is greater than MaxTbSizeY) , vertical splitting (i.e., intra_subpartitions_split_flag [x0] [y0] is inferred to be equal to 1) is used.
Such a design is based on the assumption that only the width or the height could be twice MaxTbSizeY, which limits the flexibility. When the maximum transform size is set to, for example, 32x32, and the CU size is, for example, 128x128, according to the rules, it will be split into 4 128x32 sub-partitions. However, when the maximum transform size is 32x32, coding a 128x32 sub-partition is disallowed in VVC. How to handle this case is unknown.
5. ISP and sub-block transform are both treated as implicit MTS since there is no need to signal the transform matrix. Sub-block transform could support block sizes up to 64x64 when MaxSbtSize is equal to 64. However, the setting of implicitMTS only checks whether Max (width, height ) is less than or equal to 32. In addition, when cu_sbt_flag is equal to 1, implicitMTS shall be set to 1 automatically, and there is no need to check the transform size.
6. TS is part of MTS. However, the enabling/disabling of TS and the maximum TS size are signaled in the PPS, while the MTS enabling/disabling flag is signaled in the SPS.
7. Redundant check of block sizes is identified in the current VVC design for signaling transform_skip_flag and tu_mts_idx.
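Problem 4 can be made concrete with a short sketch of the inference rule and the resulting sub-partition size. The function names are hypothetical, and the sub-partition geometry (a horizontal split divides the height, a vertical split divides the width) is an assumption of this sketch.

```python
def inferred_split_flag(cb_width: int, cb_height: int, max_tb_size_y: int):
    """Inferred intra_subpartitions_split_flag, or None when it is signaled."""
    if cb_height > max_tb_size_y:
        return 0          # horizontal splitting
    if cb_width > max_tb_size_y:
        return 1          # vertical splitting
    return None           # both dimensions fit: the flag is signaled


def sub_partition_size(cb_width: int, cb_height: int, split_flag: int,
                       num_parts: int = 4):
    """(width, height) of each sub-partition for the given split flag."""
    if split_flag == 0:
        return cb_width, cb_height // num_parts
    return cb_width // num_parts, cb_height


def exceeds_max_transform(size, max_tb_size_y: int) -> bool:
    """True when either sub-partition dimension is above the maximum transform size."""
    return size[0] > max_tb_size_y or size[1] > max_tb_size_y
```

For a 128x128 CU with a maximum transform size of 32, the flag is inferred to be 0, each sub-partition is 128x32, and the width of 128 still exceeds the maximum transform size, which is the unhandled case described above.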
4. Example embodiments and techniques
The listing of embodiments below should be considered as examples to explain general concepts. These embodiments should not be interpreted in a narrow way. Furthermore, these embodiments can be combined in any manner.
In the following description, one block size is denoted by W*H wherein W is the block width and H is the block height. The maximum transform block size is denoted by MaxTbW *MaxTbH wherein MaxTbW and MaxTbH are the maximum transform block width and height, respectively. The minimum transform block size is denoted by MinTbW *MinTbH wherein MinTbW and MinTbH are the minimum transform block width and height, respectively. It is noted that MRL may represent those technologies that use non-adjacent reference lines in the current picture to predict the current block, and ALWIP may represent those technologies that use matrix-based intra prediction methods. They are not limited to those mentioned in the prior art.
Regarding ISP:
1. It is proposed that Intra Sub-block Partition (ISP) and multiple reference line (MRL) modes may be both enabled (e.g., the reference line may not be the closest one) for coding one block.
a. In one example, all sub-partitions use the same reference line index for intra prediction.
b. Alternatively, only the first K (e.g., K = 1) sub-partitions follow the reference line index (e.g., signaled in the bitstream) . The remaining sub-partitions still use the closest reference line for intra prediction.
c. In one example, whether MRL is applied for the remaining sub-partitions (e.g., sub-partitions except the first sub-partition) may depend on the splitting direction in ISP or/and intra prediction mode or/and dimension of the block.
i. For example, if the block is split in the horizontal direction in ISP, MRL may be applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition. E.g., prediction modes that are less than or equal to 50 in FIG. 2.
ii. For example, if the block is split in the horizontal direction in ISP, MRL may not be applied to the remaining sub-partitions when above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition. E.g., prediction modes that are greater than 50 in FIG. 2.
iii. For example, if the block is split in the vertical direction in ISP, MRL may be applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition. E.g., prediction modes that are greater than or equal to 18 in FIG. 2.
iv. For example, if the block is split in the vertical direction in ISP, MRL may not be applied to the remaining sub-partitions when bottom-left neighboring reference (reconstructed) samples of the block are used in the intra prediction of the first sub-partition. E.g., prediction modes that are less than 18 in FIG. 2.
2. Indications of ISP mode information (e.g., on/off, splitting direction) may be signaled before the signaling of MRL related information.
a. In one example, when ISP mode is enabled for one block, the signaling of MRL related information may be skipped, e.g., the reference line index.
i. Alternatively, furthermore, the reference line index is inferred to be 0.
3. ALWIP and ISP may be both enabled for one block.
a. Alternatively, furthermore, the matrix selection of one sub-partition may depend on the intra mode and/or dimension of the sub-partition.
b. Alternatively, furthermore, indications of ALWIP modes (e.g., intra_lwip_flag and related intra modes) and indications of ISP modes (e.g., intra_subpartitions_mode_flag and intra_subpartitions_split_flag)
4. Transform skip (TS) and ISP may be both enabled for one block.
a. Alternatively, furthermore, indication of enabling/disabling transform skip mode may be further signaled even when ISP mode is enabled (e.g., IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT) .
b. Alternatively, furthermore, whether to signal the indication of enabling/disabling transform skip mode may depend on whether the video content is screen content or not.
i. In one example, it may depend on a flag signaled in picture/slice/tile group/tile/brick-level.
ii. In one example, if the video content is screen content, the indication of enabling/disabling transform skip mode may be signaled. Alternatively, if the video content is camera content, the indication of enabling/disabling transform skip mode may be skipped and the TS mode is disabled for ISP coded blocks.
5. It is proposed that only one quantization parameter, and/or one quantization step, and/or one scaling matrix may be allowed for ISP coded blocks. That is, all sub-partitions shall share the same quantization information.
a. In one example, one quantization parameter may be represented by cu_qp_delta_abs, and cu_qp_delta_sign_flag.
b. In one example, the quantization parameter information may be signaled for an ISP coded block only when there is at least one coefficient not equal to zero in at least one sub-partition.
c. In one example, the quantization parameter, and/or one quantization step, and/or one scaling matrix may be signaled once for the whole ISP coded block instead of being signaled for each sub-partition.
i. In one example, the information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition.
ii. In one example, the information may be signaled together with the first sub-partition in the encoding/decoding order.
iii. In one example, the information may be signaled together with the last sub-partition in the encoding/decoding order.
iv. In one example, the information may be signaled together with the m-th sub-partition in the encoding/decoding order wherein m is no larger than the total number of allowed sub-partitions.
6. It is proposed that reference samples located in a first sub-partition to predict a second sub-partition in an ISP coded block may be further modified (e.g., may be filtered) before being used as prediction.
a. In one example, whether to modify (e.g., filter) reference samples before being used as prediction may depend on block width and/or height.
b. In one example, whether to modify (e.g., filter) reference samples before being used as prediction may depend on the intra-prediction mode.
7. Indications of MaxTbW and/or MaxTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.
a. In one example, they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.
b. MaxTbW and/or MaxTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.
8. Indications of MinTbW and/or MinTbH may be signaled in sequence/picture/slice/tile group/tile/brick-level.
a. In one example, they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.
b. MinTbW and/or MinTbH may be set to different numbers in different profiles/levels/tiers of a video coding standard.
9. Mixed splitting directions may be enabled for ISP coded blocks wherein the block may be split for both horizontal and vertical directions.
a. In one example, the binary value of splitting direction coded for the ISP mode (e.g., intra_subpartitions_split_flag) may be replaced by an index of splitting directions.
b. In one example, the set of allowed splitting directions may depend on block dimension.
i. Alternatively, indications of the set of allowed splitting directions may be signaled.
c. In one example, the set of allowed splitting directions may depend on the intra prediction mode.
d. In one example, when W/MaxTbW and H/MaxTbH are both greater than M (e.g., M=1) , mixed splitting directions may be enabled, wherein both horizontal and vertical splitting may be invoked.
i. An example of mixed splitting direction is depicted in FIG. 17.
ii. Alternatively, when W/MaxTbW or H/MaxTbH is greater than M (e.g., M=1) , the mixed splitting directions may be enabled.
iii. In one example, a block may be split horizontally first followed by being split vertically when mixed ISP is applied.
1) Alternatively, a block may be split vertically first followed by being split horizontally when mixed ISP is applied. FIG. 17 shows an example of mixed splitting (also known as quad-tree splitting) .
10. Whether to and/or how to apply ISP on a block may depend on the relationship between the block dimensions W×H, and/or maximum and/or minimum transform block sizes.
a. In one example, if W/MinTbW and H/MinTbH are both equal to 1, ISP is disabled.
b. In one example, how to split the block may depend on the minimum transform block sizes.
i. In one example, if W/MinTbW is equal to K (K> 1) and H/MinTbH is equal to 1, ISP may be enabled and vertical splitting is applied.
ii. In one example, if W/MinTbW is equal to 1 and H/MinTbH is equal to K (K> 1) , ISP may be enabled and horizontal splitting is applied.
iii. Alternatively, furthermore, there is no need to signal the prediction direction.
iv. Alternatively, furthermore, the block may be split to K sub-partitions.
c. In one example, ISP mode is disabled when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1.
i. Alternatively, ISP mode is disabled when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, such as 4.
ii. Alternatively, ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than 1.
iii. Alternatively, ISP mode is disabled when either W/MaxTbW or H/MaxTbH is greater than a threshold, such as 2 or 4.
iv. Alternatively, ISP mode is disabled when both W/MaxTbW and H/MaxTbH are greater than a threshold, such as 2 or 4.
d. In one example, ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater (or no smaller) than a first threshold, and no greater (or smaller) than a second threshold.
i. Alternatively, ISP mode may be enabled when both W/MaxTbW and H/MaxTbH are greater than a first threshold, and smaller than a second threshold.
ii. In one example, the first and second thresholds are 1, and 4, respectively.
iii. Alternatively, furthermore, the signaling of the splitting direction (e.g., intra_subpartitions_split_flag) may be skipped and the block may be split according to certain rules.
1) In one example, the quad-tree splitting may be applied first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
2) In one example, the splitting of one partition tree may be terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
a. Alternatively, the splitting of one partition tree may be terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
b. Alternatively, the splitting of one partition tree may be terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N wherein M and N are two positive integers.
e. When ISP mode is disabled, signaling of the related information such as intra_subpartitions_mode_flag is skipped.
11. In one example, more than 4 sub-partitions and/or more than one splitting direction (such as both horizontal and vertical splitting are invoked) may be enabled.
i. Alternatively, the above method may be enabled under certain conditions.
ii. In one example, when W/MaxTbW > 4 and/or H/MaxTbH > 4.
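Two of the rules in the list above lend themselves to a compact sketch: the mode-dependent use of MRL for the remaining sub-partitions (items 1.c.i to 1.c.iv) and the dependence of ISP on the minimum and maximum transform block sizes (items 10.a, 10.b.i, 10.b.ii and 10.c). The sketch picks one combination of the listed examples; the function names and return conventions are hypothetical, and other alternatives in the list would lead to different code.

```python
ISP_HOR_SPLIT, ISP_VER_SPLIT = 1, 2


def mrl_for_remaining_sub_partitions(split_type: int, intra_pred_mode: int) -> bool:
    """Whether MRL is applied to sub-partitions after the first one."""
    if split_type == ISP_HOR_SPLIT:
        return intra_pred_mode <= 50   # no above-right reference samples needed
    if split_type == ISP_VER_SPLIT:
        return intra_pred_mode >= 18   # no bottom-left reference samples needed
    raise ValueError("block is not ISP coded")


def isp_decision(w, h, min_tb_w, min_tb_h, max_tb_w, max_tb_h):
    """Return (isp_allowed, inferred_split_type or None)."""
    if w > max_tb_w or h > max_tb_h:        # 10.c: W/MaxTbW > 1 or H/MaxTbH > 1
        return False, None
    if w == min_tb_w and h == min_tb_h:     # 10.a: both ratios equal to 1
        return False, None
    if h == min_tb_h and w > min_tb_w:      # 10.b.i: only vertical splitting fits
        return True, ISP_VER_SPLIT
    if w == min_tb_w and h > min_tb_h:      # 10.b.ii: only horizontal splitting fits
        return True, ISP_HOR_SPLIT
    return True, None                       # direction is signaled
```

When `isp_decision` returns an inferred split type, the splitting direction does not need to be signaled; when it returns `(False, None)`, signaling of intra_subpartitions_mode_flag can be skipped as in item 10.e.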
Regarding MTS:
12. It is proposed to only keep two transforms (and corresponding invert transforms) for the explicit MTS design. For example, the two transforms may be DCT-II and DST-VII (and corresponding invert transforms) .
a. In one example, there are only two choices in terms of transform selection. Alternatively, furthermore, TS mode may be a third choice if it is applicable.
i. In one example, one choice is DCT-II for both horizontal and vertical transform; and the other one is DST-VII.
ii. One bit may be coded to indicate which transform of the two is used.
b. Alternatively, there are four choices in terms of transform selection. Alternatively, furthermore, TS mode may be a fifth choice if it is applicable.
i. The choices include: DCT-II/DST-VII for both horizontal and vertical transform; joint usage of DCT-II and DST-VII, each one for the horizontal or vertical transforms.
ii. In one example, fixed length coding may be utilized to code the four choices.
iii. Alternatively, truncated unary may be utilized to code the four choices.
1) Some examples of bin strings for the four choices are tabulated as follows:
(hor, ver) | Method #1 | Method #2 | Method #3 | Method #4 |
(DCT-II, DCT-II) | 0 | 0 | 0 | 0 |
(DST-VII, DST-VII) | 1 0 | 1 1 0 | 1 1 0 | 1 0 |
(DCT-II, DST-VII) | 1 1 0 | 1 0 | 1 1 1 | 1 1 1 |
(DST-VII, DCT-II) | 1 1 1 | 1 1 1 | 1 0 | 1 1 0 |
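As an illustration of the truncated unary option, the sketch below reproduces the first column of bin strings in the table above (0, 1 0, 1 1 0, 1 1 1) and shows how a decoder could parse them. The names `MTS_CHOICES`, `tu_encode` and `tu_decode` are hypothetical, and the ordering of the choices is simply the row order of the table; the other columns correspond to different assignments of the same codewords.

```python
# (horizontal, vertical) transform pairs in the row order of the table.
MTS_CHOICES = [
    ("DCT-II", "DCT-II"),
    ("DST-VII", "DST-VII"),
    ("DCT-II", "DST-VII"),
    ("DST-VII", "DCT-II"),
]

def tu_encode(index, c_max=3):
    """Truncated unary: `index` ones, then a terminating zero unless index == c_max."""
    return "1" * index + ("0" if index < c_max else "")

def tu_decode(bits, c_max=3):
    """Parse one truncated unary codeword; return (index, number of bins consumed)."""
    index = 0
    pos = 0
    while index < c_max and bits[pos] == "1":
        index += 1
        pos += 1
    if index < c_max:
        pos += 1  # consume the terminating zero
    return index, pos

# Codewords for the four choices, in table order.
CODEWORDS = {choice: tu_encode(i) for i, choice in enumerate(MTS_CHOICES)}
```

The most probable choice receives the shortest codeword, which is why the columns of the table differ only in which transform pair is assigned to which bin string.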
13. It is proposed that the allowed transform sets and/or signaling of transform index in explicit MTS may depend on the block dimension.
a. In one example, for blocks with width and/or height no larger (or smaller) than a threshold, DCT-II and DST-VII may be allowed.
b. In one example, for blocks with width and/or height larger (or no smaller) than a threshold, DCT-II, DST-VII and DCT-VIII may be allowed.
c. Alternatively, furthermore, transform skip mode may be enabled.
d. In one example, the allowed transform sets may depend on coded mode.
i. In one example, for IBC coded blocks, the two-transformation basis (TS and DST-VII) may be allowed.
ii. In one example, for non-IBC coded blocks, the two-transformation basis (DCT-II and DST-VII) may be allowed, or the three-transformation basis (TS, DCT-II and DST-VII) may be allowed.
e. How to signal the transform index may be changed according to the allowed transform sets.
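A minimal sketch of the dimension-dependent transform set in item 13 is given below. Several details are assumptions made only for illustration: "width and/or height no larger than a threshold" is read as max(width, height) <= threshold, the threshold is a free parameter because no value is given here, and `allowed_transforms` and `fixed_length_bits` are hypothetical helper names.

```python
def allowed_transforms(width, height, threshold, allow_ts=False):
    """Return the list of transform candidates for a block (illustrative only)."""
    if max(width, height) <= threshold:
        candidates = ["DCT-II", "DST-VII"]
    else:
        candidates = ["DCT-II", "DST-VII", "DCT-VIII"]
    if allow_ts:
        # Item 13.c: transform skip may additionally be enabled.
        candidates = ["TS"] + candidates
    return candidates

def fixed_length_bits(num_candidates):
    """Bits needed by a fixed-length index over num_candidates entries."""
    return (num_candidates - 1).bit_length()
```

Under this reading, a smaller candidate list needs a shorter index, which is one way the signaling of the transform index could change with the allowed transform set.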
14. Indications of the maximum allowed transform size (non-TS mode) used in MTS may be signaled.
a. In one example, they may be signaled in sequence/picture/slice/tile group/tile/brick level, or other kinds of video unit level.
i. In one example, they may be signaled in SPS/VPS/PPS/picture header/slice header/tile group header, etc.
b. In one example, they may not be signaled, but derived from the allowed maximum TS size.
c. In one example, indications of the maximum allowed transform size (non-TS mode) may control both maximum TS sizes and maximum sizes used in other transform matrix.
i. Alternatively, furthermore, there is no need to signal maximum sizes for non-TS and TS modes separately.
d. In one example, indications of the maximum allowed transform size may control both implicit and explicit MTS transform sizes.
i. Alternatively, furthermore, whether to apply the implicit MTS may depend on the signaled sizes.
15. It is proposed to align the maximum allowed transform size (non-TS mode) used in MTS and maximum allowed block size used in TS mode.
a. In one example, the maximum allowed transform size (non-TS mode) used in MTS and maximum allowed block size used in TS mode may be the same number.
16. It is proposed that the derivation of implicit MTS enabling flag is independent from the block dimension.
a. Alternatively, furthermore, the checking of block size in the derivation of implicit MTS enabling flag is skipped.
17. The shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) may be removed.
a. In one example, if all of the following shared conditions are true, MTS information may be further signaled. Otherwise, there is no need to signal the MTS information.
- tu_cbf_luma [x0] [y0]
- treeType != DUAL_TREE_CHROMA
- !cu_sbt_flag
b. Alternatively, furthermore, when the shared condition check of other rules (e.g., those mentioned above) returns true, a condition check of the block dimension against the allowed maximum TS size may be applied before signaling transform_skip_flag; and a condition check of the block dimension against the maximum allowed MTS size (e.g., fixed to be 32x32) may be applied before signaling tu_mts_idx.
18. The shared condition check of block dimension before signaling the MTS information (e.g., transform_skip_flag and tu_mts_idx) is kept unchanged, while the condition check of block dimension before signaling the transform matrix index (non-TS mode) may be removed.
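The separated checks of item 17 can be sketched as follows. This is a simplified illustration, not the syntax table of the working draft: other conditions that gate these syntax elements (for example SPS-level enabling flags) are omitted, `max_ts_size` and `max_mts_size` are assumed to be square size limits, and the function name `mts_syntax_to_signal` is hypothetical.

```python
def mts_syntax_to_signal(tu_cbf_luma, tree_type, cu_sbt_flag,
                         tb_w, tb_h, max_ts_size, max_mts_size=32):
    """Return the MTS-related syntax elements that would be signaled."""
    # Shared conditions (item 17.a): no block-dimension check here.
    shared = bool(tu_cbf_luma) and tree_type != "DUAL_TREE_CHROMA" and not cu_sbt_flag
    if not shared:
        return []

    elements = []
    # Item 17.b: the dimension is checked separately for each syntax element.
    if tb_w <= max_ts_size and tb_h <= max_ts_size:
        elements.append("transform_skip_flag")
    if tb_w <= max_mts_size and tb_h <= max_mts_size:
        elements.append("tu_mts_idx")
    return elements
```

With separate checks, a 32x32 block can still carry tu_mts_idx when the maximum TS size is smaller than 32, without a shared dimension check being evaluated first.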
4.1 Example setting of implicit MTS flag
Some proposed changes to VVC working draft version 5 JVET_N1001_v2 are described in this example. The underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions.
In general, inputs to the transformation process for scaled transform coefficients are:
- a luma location (xTbY, yTbY ) specifying the top-left sample of the current luma transform block relative to the top left luma sample of the current picture,
- a variable nTbW specifying the width of the current transform block,
- a variable nTbH specifying the height of the current transform block,
- a variable cIdx specifying the colour component of the current block,
- an (nTbW) x (nTbH) array d [x] [y] of scaled transform coefficients with x = 0.. nTbW -1, y = 0.. nTbH -1.
Output of this process is the (nTbW) x (nTbH) array r [x] [y] of residual samples with x = 0.. nTbW -1, y = 0.. nTbH -1.
The variable implicitMtsEnabled is derived as follows:
- If sps_mts_enabled_flag is equal to 1 and one of the following conditions is true, implicitMtsEnabled is set equal to 1:
- IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT
- cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to MaxSBTSize
That is, Max (nTbW, nTbH ) is compared against MaxSBTSize instead of a fixed number 32.
- sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are both equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA
- Otherwise, implicitMtsEnabled is set equal to 0.
The variable trTypeHor specifying the horizontal transform kernel and the variable trTypeVer specifying the vertical transform kernel are derived as follows:
- If cIdx is greater than 0, trTypeHor and trTypeVer are set equal to 0.
- Otherwise, if implicitMtsEnabled is equal to 1, the following applies:
- If IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT, trTypeHor and trTypeVer are specified depending on intraPredMode.
- Otherwise, if cu_sbt_flag is equal to 1, trTypeHor and trTypeVer are specified depending on cu_sbt_horizontal_flag and cu_sbt_pos_flag.
- Otherwise (sps_explicit_mts_intra_enabled_flag and sps_explicit_mts_inter_enabled_flag are equal to 0 and CuPredMode [xTbY] [yTbY] is equal to MODE_INTRA ) , trTypeHor and trTypeVer are derived as follows:
trTypeHor = (nTbW >= 4 && nTbW <= 16 && nTbW <= nTbH ) ? 1 : 0 (8-1030)
trTypeVer = (nTbH >= 4 && nTbH <= 16 && nTbH <= nTbW ) ? 1 : 0 (8-1031)
- Otherwise, trTypeHor and trTypeVer are specified in Table 8-13 depending on tu_mts_idx [xTbY] [yTbY] .
Alternatively, the condition check ‘cu_sbt_flag is equal to 1 and Max (nTbW, nTbH ) is less than or equal to 32’ in the determination of implicitMtsEnabled may be replaced by ‘cu_sbt_flag is equal to 1’.
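The derivation in this example can be sketched in code as follows. The two functions mirror the conditions for implicitMtsEnabled and the intra branch of the trTypeHor/trTypeVer derivation in equations 8-1030 and 8-1031; the Python names and the boolean parameters (for example `isp_split` for "IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT") are illustrative assumptions, and the ISP and SBT branches of the kernel selection are not covered.

```python
def implicit_mts_enabled(sps_mts_enabled_flag, isp_split, cu_sbt_flag,
                         n_tb_w, n_tb_h, max_sbt_size,
                         explicit_intra_enabled, explicit_inter_enabled,
                         is_intra):
    """Return 1 if implicit MTS is enabled for the transform block, else 0."""
    if not sps_mts_enabled_flag:
        return 0
    if isp_split:
        return 1
    # Compared against MaxSBTSize rather than a fixed number 32.
    if cu_sbt_flag and max(n_tb_w, n_tb_h) <= max_sbt_size:
        return 1
    if not explicit_intra_enabled and not explicit_inter_enabled and is_intra:
        return 1
    return 0

def implicit_intra_tr_types(n_tb_w, n_tb_h):
    """Equations 8-1030 and 8-1031: return (trTypeHor, trTypeVer)."""
    tr_type_hor = 1 if (4 <= n_tb_w <= 16 and n_tb_w <= n_tb_h) else 0
    tr_type_ver = 1 if (4 <= n_tb_h <= 16 and n_tb_h <= n_tb_w) else 0
    return tr_type_hor, tr_type_ver
```

For example, an 8x16 block yields (1, 0): the shorter dimension, if it lies between 4 and 16, uses kernel type 1 and the other dimension uses kernel type 0.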
4.2 Example setting of explicit MTS flag
Some proposed changes to VVC working draft version 5 JVET_N1001_v2 are described in this example. The underlined sections indicate the addition to the working draft, while the strikethrough sections indicate proposed deletions. This section provides examples for redundant check removal during the MTS signaling process.
Alternatively, the following may apply:
FIG. 16 is a block diagram of a video processing apparatus 1600. The apparatus 1600 may be used to implement one or more of the methods described herein. The apparatus 1600 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 1600 may include one or more processors 1602, one or more memories 1604 and video processing hardware 1606. The processor (s) 1602 may be configured to implement one or more methods described in the present document. The memory (memories) 1604 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 1606 may be used to implement, in hardware circuitry, some techniques described in the present document.
FIG. 18 is a flowchart for a method 1800 of video processing in accordance with one or more examples of the present technology. The method 1800 includes, at operation 1802, partitioning a block of video data into sub-blocks using a partitioning pattern. The method 1800 includes, at operation 1804, performing prediction for one sub-block using at least one line of reference video data not adjacent to the current sub-block. The method 1800 also includes, at operation 1806, generating a residual signal for the sub-block based on the prediction.
FIG. 19 is a flowchart for a method 1900 of video processing in accordance with one or more examples of the present technology. The method 1900 includes, at operation 1902, partitioning a block of video data into sub-blocks using a partitioning pattern. The method 1900 includes, at operation 1904, performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks. The method 1900 also includes, at operation 1906, generating a residual signal for the sub-blocks based on the predictions.
FIG. 20 is a flowchart for a method 2000 of video processing in accordance with one or more examples of the present technology. The method 2000 includes, at operation 2002, performing predictions for a block of video data to generate a residual signal. The method 2000 includes, at operation 2004, performing an explicit transformation of the residual signal using one of two transformations. The method 2000 includes, at operation 2006, encoding an output from the explicit transformation.
Additional embodiments and techniques are described in the following examples.
1. A video processing method, comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and generating a residual signal for the current sub-block based on the predictions.
2. The method of example 1, wherein all the sub-blocks use a same reference line index of reference video data for the predictions.
3. The method of example 1, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that corresponds to the closest line of reference video data to the sub-block.
4. The method of example 1, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that is determined based on a partitioning direction of the sub-blocks, a prediction mode, or the dimension of the block.
Further embodiments of examples 1-4 are described in items 1 and 2 in Section 4.
5. A video processing method, comprising: partitioning a block of video data into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and generating a residual signal for the sub-blocks based on the predictions.
6. The method of example 5, wherein the matrix vector of a sub-block is selected based on an intra mode or a dimension of the sub-block.
Further embodiments of examples 5-6 are described in items 3-4 in Section 4.
7. A video processing method, comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for a current sub-block in the sub-blocks using at least one line of reference video data not adjacent to the current sub-block; and reconstructing the current sub-block using the predictions.
8. The method of example 7, wherein all the sub-blocks use a same reference line index of reference video data for the predictions.
9. The method of example 7, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that corresponds to the closest line of reference video data to the sub-block.
10. The method of example 7, wherein a first sub-block of the sub-blocks uses a first line index of reference data and remaining sub-blocks use a second line index of reference video data that is determined based on a partitioning direction of the sub-blocks, a prediction mode, or the dimension of the block.
Further embodiments of examples 7-10 are described in items 1 and 2 in Section 4.
11. A video processing method, comprising: receiving a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing predictions for the sub-blocks by calculating a matrix vector product based on a reference signal for each of the sub-blocks; and reconstructing the sub-blocks using the predictions.
12. The method of example 11, wherein the matrix vector of a sub-block is selected based on an intra mode or a dimension of the sub-block.
Further embodiments of examples 11-12 are described in items 3-4 in Section 4.
13. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern; and transforming a residual signal for the sub-blocks based on the predictions, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in a bitstream representing the block of video data.
14. A video processing method, comprising: receiving a bitstream representing a block of video data that is partitioned into sub-blocks using a partitioning pattern; performing inverse transform on a residual signal of the sub-blocks, wherein a maximum transform block dimension or a minimum transform block dimension is indicated in a level of a sequence, a picture, a slice, a tile group, a tile, or a brick in the bitstream; and reconstructing the sub-blocks using an output from the inverse transform.
15. The method of example 13 or 14, wherein the maximum transform block dimension or the minimum transform block dimension is set to different values in different profiles, levels, or tiers.
Further embodiments of examples 13-15 are described in items 7-8 in Section 4.
16. A video processing method, comprising: receiving or transmitting a bitstream representing a block of video data for performing video processing, wherein the block of video data is partitioned into sub-blocks using a partitioning pattern and a residual signal of the sub-blocks is quantized in the bitstream, and wherein the sub-blocks share same quantization information.
17. The method of example 16, wherein the quantization information comprises a quantization parameter, a quantization step, or a scaling matrix.
18. The method of example 16, wherein the quantization information for all the sub-blocks in the block of video data is coded once in the bitstream.
Further embodiments of examples 16-18 are described in item 5 in Section 4.
19. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein reference samples in a first sub-block are modified prior to being used for performing predictions for a second sub-block; and encoding or reconstructing the block of video data based on the predictions.
Further embodiments of example 19 are described in item 6 in Section 4.
20. A video processing method, comprising: performing predictions for a block of video data that is partitioned into sub-blocks using a partitioning pattern, wherein the sub-blocks are partitioned in multiple partitioning directions; and encoding or reconstructing the block of video data based on the predictions.
21. The method of example 20, wherein the multiple partitioning directions are determined by the dimension of the block.
Further embodiments of examples 20-21 are described in item 9 in Section 4.
22. The method of any of examples 1 to 21, wherein the sub-blocks are partitioned based on a minimum transform block dimension or a maximum transform block dimension.
Further embodiments of example 22 are described in items 10 and 11 in Section 4.
23. A video processing method, comprising: performing predictions for a block of video data to generate a residual signal; performing an explicit transformation of the residual signal using one of two transformations; and encoding an output from the explicit transformation.
24. The method of example 23, comprising: coding a transformation option in a bitstream representing the block of video data based on variations of the two transformations.
Further embodiments of examples 23-24 are described in items 12-13 in Section 4.
25. The method of example 23 or 24, comprising: signaling information about the explicit transformation without checking a dimension of the block of video data.
Further embodiments of example 25 are described in items 16-18 in Section 4.
26. A video processing method, comprising: receiving a block of video data partitioned into one or more sub-blocks; performing an explicit transformation of the block of video data using one of two inverse transformations; and reconstructing the block of video data based on the explicit transformation.
27. The method of any of examples 23 to 26, wherein the explicit transformation is performed in one or more transformation directions that include a horizontal direction and a vertical direction.
28. The method of example 27, wherein different transform directions use different transformations.
29. The method of example 27, wherein different transform directions use a same transformation.
Further embodiments of examples 27-29 are described in items 12-13 in Section 4.
30. The method of any of examples 23 to 29, wherein a maximum allowed transform size is coded in a level of a sequence, a picture, a slice, a tile, or a brick in a bitstream representing the block of video data.
31. The method of any of examples 23 to 30, comprising: deriving a maximum allowed transform size based on an allowed maximum transform skip size.
Further embodiments of examples 30-31 are described in items 14-15 in Section 4.
32. A video processing apparatus comprising a processor configured to implement one or more of examples 1 to 31.
33. A computer-readable medium having code stored thereon, the code, when executed by a processor, causing the processor to implement a method recited in any one or more of examples 1 to 31.
FIG. 21 is a flowchart for a method 2100 of video processing in accordance with one or more examples of the present technology. The method 2100 includes, at 2102, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; at 2104, enabling a second mode different from the ISP mode for the block; and at 2106, performing the conversion based on the ISP mode and the second mode.
In some examples, the second mode is multiple reference line (MRL) mode.
In some examples, in the MRL mode, a reference line which is not the closest one of multiple reference lines is available for intra prediction of the block.
In some examples, all sub-partitions of the block use a same reference line index for intra prediction of the block.
In some examples, only first K sub-partitions of all sub-partitions of the block use a same reference line index for intra prediction, and the remaining sub-partitions use the closest reference line for intra prediction of the block, K being an integer.
In some examples, K =1.
In some examples, the reference line index is signaled in the bitstream.
In some examples, whether MRL mode is applied for the remaining sub-partitions depends on splitting direction in ISP mode or/and intra prediction mode or/and size of the block.
In some examples, if the block is split in horizontal direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
In some examples, if the block is split in horizontal direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
In some examples, if the block is split in vertical direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
In some examples, if the block is split in vertical direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when bottom-left neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
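The four direction-dependent rules above can be condensed into a small decision function. This is a sketch under the assumption that the neighboring reference samples fall into exactly five regions (bottom-left, left, above-left, above, above-right), so that "only the first four are used" is equivalent to "the fifth is not used"; the function name and the string labels are hypothetical.

```python
REGIONS = {"bottom_left", "left", "above_left", "above", "above_right"}

def mrl_for_remaining(split_direction, regions_used_by_first):
    """Return True if MRL mode applies to the remaining sub-partitions.

    regions_used_by_first: reference regions used in the intra prediction
    of the first sub-partition.
    """
    used = set(regions_used_by_first)
    if not used <= REGIONS:
        raise ValueError("unknown reference region")
    if split_direction == "horizontal":
        # Not applied when above-right samples are used.
        return "above_right" not in used
    if split_direction == "vertical":
        # Not applied when bottom-left samples are used.
        return "bottom_left" not in used
    raise ValueError("split_direction must be 'horizontal' or 'vertical'")
```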
In some examples, indications of ISP mode information are signaled before the signaling of MRL mode related information.
In some examples, the ISP mode information includes at least one of on/off flag and splitting direction, and the MRL mode related information includes the reference line index.
In some examples, when ISP mode is enabled for the block, the signaling of MRL related information is skipped.
In some examples, when ISP mode is enabled for the block, the reference line index is inferred to be 0.
In some examples, the second mode is a matrix based intra prediction (MIP) mode.
In some examples, matrix selection of one sub-partition depends on intra mode and/or size of the sub-partition.
In some examples, indications of the MIP modes and indications of the ISP modes are signaled for the block.
In some examples, the second mode is Transform skip (TS) mode.
In some examples, indication of enabling/disabling TS mode is further signaled even when ISP mode is enabled.
In some examples, whether to signal the indication of enabling/disabling TS mode depends on whether video content of the video is screen content or not.
In some examples, whether to signal the indication of enabling/disabling TS mode depends on a flag signaled in at least one of picture, slice, tile group, tile and brick-level.
In some examples, if the video content is screen content, the indication of enabling/disabling transform skip mode is signaled.
In some examples, if the video content is camera content, the indication of enabling/disabling transform skip mode is skipped and the TS mode is disabled for the blocks.
In some examples, all sub-partitions share the same quantization information including at least one of quantization parameter, quantization step and scaling matrix.
In some examples, the quantization parameter is represented by cu_qp_delta_abs and cu_qp_delta_sign_flag.
In some examples, the quantization information is signaled for the block only when there is at least one coefficient not equal to zero in at least one sub-partition.
In some examples, the quantization information is signaled once for the whole block instead of being signaled for each sub-partition.
In some examples, the quantization information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition, where m is an integer.
In some examples, the quantization information is signaled together with a first sub-partition in encoding or decoding order.
In some examples, the quantization information is signaled together with the last sub-partition in encoding or decoding order.
In some examples, the quantization information is signaled together with the m-th sub-partition in encoding or decoding order, wherein m is an integer no larger than the total number of allowed sub-partitions.
FIG. 22 is a flowchart for a method 2200 of video processing in accordance with one or more examples of the present technology. The method 2200 includes, at 2202, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; at 2204, performing the conversion based on the ISP mode.
In some examples, the reference samples are filtered before being used as prediction.
In some examples, whether to modify the reference samples before being used as prediction depends on width and/or height of the block.
In some examples, whether to modify the reference samples before being used as prediction depends on intra-prediction mode of the block.
In some examples, block size of the block is denoted by W*H, wherein W is the block width and H is the block height, a maximum transform block size of the block is denoted by MaxTbW *MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and maximum transform block height, respectively, and a minimum transform block size of the block is denoted by MinTbW *MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and minimum transform block height, respectively.
In some examples, indications of MaxTbW and/or MaxTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
In some examples, the indications of MaxTbW and/or MaxTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.
In some examples, MaxTbW and/or MaxTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
In some examples, indications of MinTbW and/or MinTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
In some examples, the indications of MinTbW and/or MinTbH are signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) and picture parameter set (PPS) , picture header, slice header, and tile group header.
In some examples, MinTbW and/or MinTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
FIG. 23 is a flowchart for a method 2300 of video processing in accordance with one or more examples of the present technology. The method 2300 includes, at 2302, enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions in both horizontal and vertical directions; at 2304, performing the conversion based on the ISP mode.
In some examples, binary value of splitting direction coded for the ISP mode is replaced by an index of splitting directions.
In some examples, the set of allowed splitting directions depends on block size.
In some examples, indications of set of allowed splitting directions are signaled.
In some examples, the set of allowed splitting directions depends on intra prediction mode of the block.
In some examples, when W/MaxTbW and H/MaxTbH are both greater than M, the mixed splitting directions are enabled, M being an integer.
In some examples, when W/MaxTbW or H/MaxTbH is greater than M, the mixed splitting directions are enabled, M being an integer.
In some examples, M=1.
In some examples, the block is split by using quad-tree splitting.
In some examples, the block is split horizontally first followed by being split vertically when the mixed splitting directions are applied.
In some examples, the block is split vertically first followed by being split horizontally when the mixed splitting directions are applied.
In some examples, whether to and/or how to apply ISP mode on the block depends on the relationship between the block size W×H of the block, and/or the maximum transform block size MaxTbW *MaxTbH and/or the minimum transform block size MinTbW *MinTbH.
In some examples, if W/MinTbW and H/MinTbH are both equal to 1, ISP mode is disabled for the block.
In some examples, how to split the block depends on the minimum transform block size of the block.
In some examples, if W/MinTbW is equal to K and H/MinTbH is equal to 1, ISP mode is enabled for the block and vertical splitting is applied to the block, K being an integer larger than 1.
In some examples, if W/MinTbW is equal to 1 and H/MinTbH is equal to K, ISP mode is enabled for the block and horizontal splitting is applied to the block, K being an integer larger than 1.
In some examples, the prediction direction need not be signaled.
In some examples, the block is split to K sub-partitions.
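The dependence of the ISP split on the minimum transform block size can be sketched as below. The sketch assumes W and H are integer multiples of MinTbW and MinTbH; the function name `isp_split_from_min_tb` and the returned labels are hypothetical, and the case where both ratios exceed 1 is left open because the examples above do not cover it.

```python
def isp_split_from_min_tb(w, h, min_tb_w, min_tb_h):
    """Return (direction, number of sub-partitions), or None if ISP is disabled."""
    ratio_w = w // min_tb_w
    ratio_h = h // min_tb_h
    if ratio_w == 1 and ratio_h == 1:
        return None                      # ISP mode disabled for the block
    if ratio_h == 1:
        return ("vertical", ratio_w)     # W/MinTbW == K, H/MinTbH == 1
    if ratio_w == 1:
        return ("horizontal", ratio_h)   # W/MinTbW == 1, H/MinTbH == K
    return ("unspecified", None)         # not covered by the examples above
```

For a 16x4 block with a 4x4 minimum transform block, the sketch returns a vertical split into K = 4 sub-partitions, each 4x4.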
In some examples, when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1, ISP mode is disabled for the block.
In some examples, when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 4.
In some examples, when both W/MaxTbW and H/MaxTbH are greater than 1, ISP mode is disabled for the block.
In some examples, when either W/MaxTbW or H/MaxTbH is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
In some examples, when both W/MaxTbW and H/MaxTbH are greater than or equal to a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
In some examples, when both W/MaxTbW and H/MaxTbH are greater than or equal to a first threshold and smaller than or equal to a second threshold, ISP mode is disabled for the block.
In some examples, when both W/MaxTbW and H/MaxTbH are greater than a first threshold and smaller than a second threshold, ISP mode is disabled for the block.
In some examples, the first threshold is 1 and the second threshold is 4.
In some examples, signaling of splitting direction is skipped and the block is split according to certain rules.
In some examples, quad-tree splitting is applied to the block first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
In some examples, the splitting of one partition tree is terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
In some examples, the splitting of one partition tree is terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
In some examples, the splitting of one partition tree is terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N, wherein M and N are two positive integers.
In some examples, when ISP mode is disabled for the block, signaling of related information including intra_subpartitions_mode_flag is skipped.
In some examples, when W/MaxTbW > 4 and/or H/MaxTbH > 4, more than 4 sub-partitions and/or more than one splitting direction are enabled.
FIG. 24 is a flowchart for a method 2400 of video processing in accordance with one or more examples of the present technology. The method 2400 includes, at 2402, determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; at 2404, performing the conversion based on the determined MTS scheme.
In some examples, the MTS scheme is explicit MTS where transform index of the MTS is signaled in the bitstream of the block.
In some examples, the MTS scheme is revised to allow only two transforms, wherein the two transforms are DCT-II and DST-VII.
In some examples, the MTS scheme includes two modes in terms of transform selection.
In some examples, the MTS scheme includes a third mode of transform skip (TS) mode in addition to the two modes of transform selection when the TS mode is applicable.
In some examples, a first mode of the two modes is DCT-II for both horizontal and vertical transform of the block, and a second mode of the two modes is DST-VII for both horizontal and vertical transform of the block.
In some examples, one bit is coded to indicate which mode of the two modes is used.
In some examples, the MTS scheme includes four modes in terms of transform selection.
In some examples, the MTS scheme includes a fifth mode of transform skip (TS) mode in addition to the four modes of transform selection when the TS mode is applicable.
In some examples, a first mode of the four modes is DCT-II for both horizontal and vertical transform of the block, a second mode of the four modes is DST-VII for both horizontal and vertical transform of the block, a third mode of the four modes is DCT-II for horizontal transform of the block and DST-VII for vertical transform of the block, and a fourth mode of the four modes is DST-VII for horizontal transform of the block and DCT-II for vertical transform of the block.
In some examples, fixed length coding is utilized to code the four modes.
In some examples, truncated unary is utilized to code the four modes.
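As a non-limiting illustration, the four transform-selection modes and the two binarization options mentioned above may be sketched as follows. The names `MTS_MODES`, `fixed_length_bins` and `truncated_unary_bins` are hypothetical. The bin convention for truncated unary (a run of ones terminated by a zero, with the terminating zero omitted for the largest index) is an assumption of this sketch and is not specified by the examples above.

```python
# (horizontal transform, vertical transform) for mode indices 0..3,
# in the order the four modes are listed above.
MTS_MODES = [
    ("DCT-II", "DCT-II"),
    ("DST-VII", "DST-VII"),
    ("DCT-II", "DST-VII"),
    ("DST-VII", "DCT-II"),
]


def fixed_length_bins(idx, num_modes=4):
    """Binarize idx with the same number of bins for every mode."""
    bits = max(1, (num_modes - 1).bit_length())
    return format(idx, "0{}b".format(bits))


def truncated_unary_bins(idx, num_modes=4):
    """Binarize idx as idx ones, then a zero unless idx is the maximum."""
    c_max = num_modes - 1
    return "1" * idx + ("0" if idx < c_max else "")
```

With four modes, fixed-length coding spends two bins on every mode, while truncated unary spends one bin on the first mode and three on the last two. With `num_modes=2`, both helpers reduce to the single bit used for the two-mode scheme.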
In some examples, the allowed transform sets and/or signaling of transform index in explicit MTS depend on the block size.
In some examples, for blocks with width and/or height smaller than or equal to a threshold, DCT-II and DST-VII are allowed.
In some examples, for blocks with width and/or height larger than or equal to a threshold, DCT-II, DST-VII and DCT-VIII are allowed.
In some examples, transform skip (TS) mode is allowed.
In some examples, the allowed transform sets depend on coded mode of the block.
In some examples, for intra block copy (IBC) mode coded blocks, a transform set of two-transformation basis including TS mode and DST-VII is allowed.
In some examples, for non-IBC mode coded blocks, a transform set of two-transformation basis including DCT-II and DST-VII is allowed, or a transform set of three-transformation basis including TS mode, DCT-II and DST-VII is allowed.
In some examples, how to signal the transform index is changed according to the allowed transform sets.
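As a non-limiting illustration, a mode-dependent transform set and a set-dependent index binarization may be sketched as follows. The function names, the `ts_applicable` parameter, the ordering of entries within each set, and the use of truncated unary for the index are all assumptions made for this sketch.

```python
def allowed_transform_set(is_ibc, ts_applicable=False):
    """Return the hypothetical allowed transform set for a block."""
    if is_ibc:
        # IBC coded blocks: two-transformation basis with TS and DST-VII.
        return ["TS", "DST-VII"]
    if ts_applicable:
        # Non-IBC coded blocks: three-transformation basis.
        return ["TS", "DCT-II", "DST-VII"]
    # Non-IBC coded blocks: two-transformation basis.
    return ["DCT-II", "DST-VII"]


def index_bins(transform, allowed):
    """Truncated-unary bins for `transform` within the `allowed` set."""
    idx = allowed.index(transform)
    c_max = len(allowed) - 1
    return "1" * idx + ("0" if idx < c_max else "")
```

In this sketch, the index of a two-entry set costs one bin, while the three-entry set costs one or two bins, which is one way the signaling could change with the allowed transform set.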
In some examples, indications of the maximum allowed transform size used in non-TS mode of the MTS scheme are signaled.
In some examples, the indications are signaled in at least one of sequence, picture, slice, tile group, tile, brick level or other kinds of video unit level.
In some examples, the indications are signaled in at least one of a video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
In some examples, the indications are derived from the allowed maximum TS size.
In some examples, the indications are used to control both maximum TS sizes and maximum sizes used in other transform matrix.
In some examples, maximum sizes for non-TS and TS modes need not be signaled separately.
In some examples, the indications of the maximum allowed transform size are used to control both implicit MTS transform sizes and explicit MTS transform sizes.
In some examples, whether to apply the implicit MTS depends on the signaled maximum allowed transform sizes.
In some examples, the maximum allowed transform size used in non-TS mode of the MTS scheme is aligned with the maximum allowed transform size used in TS mode.
In some examples, the maximum allowed transform size used in non-TS mode of the MTS scheme is the same as the maximum allowed transform size used in TS mode.
In some examples, the MTS scheme is implicit MTS where transform matrix of the MTS is directly derived according to transform block sizes of the block.
In some examples, derivation of implicit MTS enabling flag indicating whether implicit MTS is enabled is independent from block size of the block.
In some examples, checking of the block size in the derivation of implicit MTS enabling flag is skipped.
In some examples, the shared condition check of block size before signaling MTS information is removed, wherein the MTS information includes transform_skip_flag and tu_mts_idx.
In some examples, if all of the following shared conditions are true, the MTS information is further signaled:
- tu_cbf_luma[x0][y0]
- treeType != DUAL_TREE_CHROMA
- !cu_sbt_flag,
otherwise, the MTS information is not signaled.
In some examples, when the shared conditions check of certain rules returns true, a condition check of block size compared to the maximum allowed TS sizes is applied before signaling transform_skip_flag; and a condition check of block size compared to the maximum allowed MTS sizes is applied before signaling tu_mts_idx.
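As a non-limiting illustration, the two-stage check described above may be sketched as follows. The function `mts_info_to_signal` and the parameters `max_ts_size` and `max_mts_size` are hypothetical. The sketch models only the shared conditions and the per-element size checks; any dependency of tu_mts_idx on the value of transform_skip_flag, and any other conditions of a complete syntax, are deliberately omitted.

```python
def mts_info_to_signal(tu_cbf_luma, tree_type, cu_sbt_flag,
                       tb_w, tb_h, max_ts_size, max_mts_size):
    """Return the names of the syntax elements that would be signaled."""
    signaled = []
    # Shared conditions: no block-size check at this stage.
    shared = (bool(tu_cbf_luma)
              and tree_type != "DUAL_TREE_CHROMA"
              and not cu_sbt_flag)
    if not shared:
        return signaled
    # Size check against the maximum allowed TS size.
    if tb_w <= max_ts_size and tb_h <= max_ts_size:
        signaled.append("transform_skip_flag")
    # Separate size check against the maximum allowed MTS size.
    if tb_w <= max_mts_size and tb_h <= max_mts_size:
        signaled.append("tu_mts_idx")
    return signaled
```

Separating the two size checks in this way lets the TS and MTS size limits differ, so that, for example, a 64x64 block could carry transform_skip_flag but not tu_mts_idx when the hypothetical limits are 64 and 32, respectively.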
In some examples, the shared condition check of block size before signaling the MTS information is kept unchanged, while the condition check of block size before signaling the transform matrix index used in non-TS mode of MTS is removed.
In some examples, the conversion generates the block of video from the bitstream representation.
In some examples, the conversion generates the bitstream representation from the block of video.
It will be appreciated that the disclosed techniques may be embodied in video encoders or decoders to improve compression efficiency using techniques that include the use of a reduced dimension secondary transform.
The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.
Claims (118)
- A method for processing video, comprising: enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode; enabling a second mode different from the ISP mode for the block; and performing the conversion based on the ISP mode and the second mode.
- The method of claim 1, wherein the second mode is multiple reference line (MRL) mode.
- The method of claim 2, wherein, in the MRL mode, a reference line which is not the closest one of multiple reference lines is available for intra prediction of the block.
- The method of any of claims 2-3, wherein all sub-partitions of the block use a same reference line index for intra prediction of the block.
- The method of any of claims 2-3, wherein only first K sub-partitions of all sub-partitions of the block use a same reference line index for intra prediction, and the remaining sub-partitions use the closest reference line for intra prediction of the block, K being an integer.
- The method of claim 5, wherein K =1.
- The method of any of claims 4-6, wherein the reference line index is signaled in the bitstream.
- The method of any of claims 5-7, wherein whether MRL mode is applied for the remaining sub-partitions depends on splitting direction in ISP mode or/and intra prediction mode or/and size of the block.
- The method of claim 8, wherein if the block is split in horizontal direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only bottom-left or/and left or/and above-left or/and above neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- The method of claim 8, wherein if the block is split in horizontal direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- The method of claim 8, wherein if the block is split in vertical direction in ISP mode, MRL mode is applied to the remaining sub-partitions when only left or/and above-left or/and above or/and above-right neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- The method of claim 8, wherein if the block is split in vertical direction in ISP mode, MRL mode is not applied to the remaining sub-partitions when bottom-left neighboring reference samples of the block are used in the intra prediction of a first sub-partition.
- The method of any of claims 1-12, wherein indications of ISP mode information are signaled before the signaling of MRL mode related information.
- The method of claim 13, wherein the ISP mode information include at least one of on/off flag and splitting direction, and the MRL mode related information includes the reference line index.
- The method of claim 14, wherein when ISP mode is enabled for the block, the signaling of MRL related information is skipped.
- The method of claim 14, wherein when ISP mode is enabled for the block, the reference line index is inferred to be 0.
- The method of claim 1, wherein the second mode is a matrix based intra prediction (MIP) mode.
- The method of claim 17, wherein matrix selection of one sub-partition depends on intra mode and/or size of the sub-partition.
- The method of claim 17 or 18, wherein indications of the MIP modes and indications of the ISP modes are signaled for the block.
- The method of claim 1, wherein the second mode is Transform skip (TS) mode.
- The method of claim 20, wherein indication of enabling/disabling TS mode is further signaled even when ISP mode is enabled.
- The method of claim 20 or 21, wherein whether to signal the indication of enabling/disabling TS mode depends on whether video content of the video is screen content or not.
- The method of claim 22, wherein whether to signal the indication of enabling/disabling TS mode depends on a flag signaled in at least one of picture, slice, tile group, tile and brick-level.
- The method of claim 22, wherein if the video content is screen content, the indication of enabling/disabling transform skip mode is signaled.
- The method of claim 22, wherein if the video content is camera content, the indication of enabling/disabling transform skip mode is skipped and the TS mode is disabled for the blocks.
- The method of any of claims 1-25, wherein all sub-partitions share the same quantization information including at least one of quantization parameter, quantization step and scaling matrix.
- The method of claim 26, wherein the quantization parameter is represented by cu_qp_delta_abs and cu_qp_delta_sign_flag.
- The method of claim 26, wherein the quantization information is signaled for the block only when there is at least one coefficient not equal to zero in at least one sub-partition.
- The method of any of claims 26-28, wherein the quantization information is signaled once for the whole block instead of being signaled for each sub-partition.
- The method of claim 29, wherein the quantization information is signaled with the m-th sub-partition only when there is at least one coefficient not equal to zero in the m-th sub-partition, where m is an integer.
- The method of claim 29, wherein the quantization information is signaled together with a first sub-partition in encoding or decoding order.
- The method of claim 29, wherein the quantization information is signaled together with the last sub-partition in encoding or decoding order.
- The method of claim 29, wherein the quantization information is signaled together with the m-th sub-partition in encoding or decoding order, wherein m is an integer no larger than the total number of allowed sub-partitions.
- A method for processing video, comprising: enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein the block is split into multiple sub-partitions based on the ISP mode and reference samples located in a first sub-partition to predict a second sub-partition in the block are further modified before being used as prediction; and performing the conversion based on the ISP mode.
- The method of claim 34, wherein the reference samples are filtered before being used as prediction.
- The method of claim 34 or 35, wherein whether to modify the reference samples before being used as prediction depends on width and/or height of the block.
- The method of claim 34 or 35, wherein whether to modify the reference samples before being used as prediction depends on intra-prediction mode of the block.
- The method of any of claims 1-37, wherein block size of the block is denoted by W*H, wherein W is the block width and H is the block height, a maximum transform block size of the block is denoted by MaxTbW*MaxTbH, wherein MaxTbW and MaxTbH are the maximum transform block width and maximum transform block height, respectively, and a minimum transform block size of the block is denoted by MinTbW*MinTbH, wherein MinTbW and MinTbH are the minimum transform block width and minimum transform block height, respectively.
- The method of claim 38, wherein indications of MaxTbW and/or MaxTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
- The method of claim 39, wherein the indications of MaxTbW and/or MaxTbH are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
- The method of any of claims 38-40, wherein MaxTbW and/or MaxTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
- The method of claim 38, wherein indications of MinTbW and/or MinTbH are signaled in at least one of sequence, picture, slice, tile group, tile and brick-level.
- The method of claim 42, wherein the indications of MinTbW and/or MinTbH are signaled in at least one of video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
- The method of any of claims 42-43, wherein MinTbW and/or MinTbH are set to different numbers in different profiles, levels or tiers of a video coding standard.
- A method for processing video, comprising: enabling, for a conversion between a block of the video and a bitstream representation of the block, Intra Sub-block Partition (ISP) mode for the block, wherein mixed splitting directions are enabled in the ISP mode, and the block is split into multiple sub-partitions for both horizontal and vertical directions; and performing the conversion based on the ISP mode.
- The method of claim 45, wherein binary value of splitting direction coded for the ISP mode is replaced by an index of splitting directions.
- The method of claim 45, wherein the set of allowed splitting directions depends on block size.
- The method of claim 45, wherein indications of set of allowed splitting directions are signaled.
- The method of claim 45, wherein the set of allowed splitting directions depends on intra prediction mode of the block.
- The method of claim 45, wherein when W/MaxTbW and H/MaxTbH are both greater than M, the mixed splitting directions is enabled, M being an integer.
- The method of claim 45, wherein when W/MaxTbW or H/MaxTbH is greater than M, the mixed splitting directions is enabled, M being an integer.
- The method of claim 50 or 51, wherein M=1.
- The method of any of claims 50-52, wherein the block is split by using quad-tree splitting.
- The method of any of claims 50-53, wherein the block is split horizontally first followed by being split vertically when the mixed splitting directions are applied.
- The method of any of claims 50-53, wherein the block is split vertically first followed by being split horizontally when the mixed splitting directions are applied.
- The method of any of claims 38-55, wherein whether to and/or how to apply ISP mode on the block depend on the relationship between the block size of block W×H, and/or the maximum transform block size MaxTbW *MaxTbH and/or the minimum transform block size MinTbW *MinTbH.
- The method of claim 56, wherein if W/MinTbW and H/MinTbH are both equal to 1, ISP mode is disabled for the block.
- The method of claim 56, wherein how to split the block depends on the minimum transform block size of the block.
- The method of claim 58, wherein if W/MinTbW is equal to K and H/MinTbH is equal to 1, ISP mode is enabled for the block and vertical splitting is applied to the block, K being an integer larger than 1.
- The method of claim 58, wherein, if W/MinTbW is equal to 1 and H/MinTbH is equal to K, ISP mode is enabled for the block and horizontal splitting is applied to the block, K being an integer larger than 1.
- The method of claim 59 or 60, wherein the prediction direction need not be signaled.
- The method of any of claims 59-61, wherein the block is split to K sub-partitions.
- The method of claim 56, wherein when either W/MaxTbW is greater than 1 or H/MaxTbH is greater than 1, ISP mode is disabled for the block.
- The method of claim 56, wherein when W*H/ (MaxTbW*MaxTbH) is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 4.
- The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than 1, ISP mode is disabled for the block.
- The method of claim 56, wherein when either W/MaxTbW or H/MaxTbH is greater than a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
- The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than or equal to a threshold, ISP mode is disabled for the block, wherein the threshold is 2 or 4.
- The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than or equal to a first threshold and smaller than or equal to a second threshold, ISP mode is disabled for the block.
- The method of claim 56, wherein when both W/MaxTbW and H/MaxTbH are greater than a first threshold and smaller than a second threshold, ISP mode is disabled for the block.
- The method of claim 68 or 69, wherein the first threshold is 1 and the second threshold is 4.
- The method of any of claims 68-70, wherein signaling of splitting direction is skipped and the block is split according to certain rules.
- The method of claim 71, wherein quad-tree splitting is applied to the block first, followed by horizontal binary tree splitting, then vertical binary tree splitting.
- The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once either width reaches the MaxTbW or height reaches the MaxTbH.
- The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once both width reaches the MaxTbW and height reaches the MaxTbH.
- The method of claim 71 or 72, wherein the splitting of one partition tree is terminated once both width reaches the MaxTbW/M and height reaches the MaxTbH/N, wherein M and N are two positive integers.
- The method of any of claims 56-75, wherein when ISP mode is disabled for the block, signaling of related information including intra_subpartitions_mode_flag is skipped.
- The method of any of claims 38-76, wherein when W/MaxTbW > 4 and/or H/MaxTbH > 4, more than 4 sub-partitions and/or more than one splitting direction are enabled.
- A method for processing video, comprising: determining, for a conversion between a block of the video and a bitstream representation of the block, a Multiple Transform Selection (MTS) scheme associated with the block, wherein the MTS scheme is revised to allow partial transforms and corresponding inverse transforms; and performing the conversion based on the determined MTS scheme.
- The method of claim 78, wherein the MTS scheme is explicit MTS where transform index of the MTS is signaled in the bitstream of the block.
- The method of claim 78 or 79, wherein the MTS scheme is revised to allow only two transforms, wherein the two transforms are DCT-II and DST-VII.
- The method of any of claims 78-80, wherein the MTS scheme includes two modes in terms of transform selection.
- The method of claim 81, wherein the MTS scheme includes a third mode of transform skip (TS) mode in addition to the two modes of transform selection when the TS mode is applicable.
- The method of claim 82, wherein a first mode of the two modes is DCT-II for both horizontal and vertical transform of the block, and a second mode of the two modes is DST-VII for both horizontal and vertical transform of the block.
- The method of claim 83, wherein one bit is coded to indicate which mode of the two modes is used.
- The method of any of claims 78-80, wherein the MTS scheme includes four modes in terms of transform selection.
- The method of claim 85, wherein the MTS scheme includes a fifth mode of transform skip (TS) mode in addition to the four modes of transform selection when the TS mode is applicable.
- The method of claim 86, wherein a first mode of the four modes is DCT-II for both horizontal and vertical transform of the block, a second mode of the four modes is DST-VII for both horizontal and vertical transform of the block, a third mode of the four modes is DCT-II for horizontal transform of the block and DST-VII for vertical transform of the block, and a fourth mode of the four modes is DST-VII for horizontal transform of the block and DCT-II for vertical transform of the block.
- The method of claim 87, wherein fixed length coding is utilized to code the four modes.
- The method of claim 87, wherein truncated unary is utilized to code the four modes.
- The method of claim 78 or 79, wherein the allowed transform sets and/or signaling of transform index in explicit MTS depend on the block size.
- The method of claim 90, wherein for blocks with width and/or height smaller than or equal to a threshold, DCT-II and DST-VII are allowed.
- The method of claim 90, wherein for blocks with width and/or height larger than or equal to a threshold, DCT-II, DST-VII and DCT-VIII are allowed.
- The method of claim 90, wherein transform skip (TS) mode is allowed.
- The method of claim 90, wherein the allowed transform sets depend on coded mode of the block.
- The method of claim 94, wherein for intra block copy (IBC) mode coded blocks, a transform set of two-transformation basis including TS mode and DST-VII is allowed.
- The method of claim 94, wherein, for non-IBC mode coded blocks, a transform set of two-transformation basis including DCT-II and DST-VII is allowed, or a transform set of three-transformation basis including TS mode, DCT-II and DST-VII is allowed.
- The method of any of claims 90-96, wherein how to signal the transform index is changed according to the allowed transform sets.
- The method of any of claims 78-97, wherein indications of the maximum allowed transform size used in non-TS mode of the MTS scheme are signaled.
- The method of claim 98, wherein the indications are signaled in at least one of sequence, picture, slice, tile group, tile, brick level or other kinds of video unit level.
- The method of claim 99, wherein the indications are signaled in at least one of a video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), picture header, slice header, and tile group header.
- The method of claim 98, wherein the indications are derived from the allowed maximum TS size.
- The method of claim 98, wherein the indications are used to control both maximum TS sizes and maximum sizes used in other transform matrix.
- The method of claim 102, wherein maximum sizes for non-TS and TS modes need not be signaled separately.
- The method of claim 98, wherein the indications of the maximum allowed transform size are used to control both implicit MTS transform sizes and explicit MTS transform sizes.
- The method of claim 104, wherein whether to apply the implicit MTS depends on the signaled maximum allowed transform sizes.
- The method of any of claims 78-105, wherein the maximum allowed transform size used in non-TS mode of the MTS scheme is aligned with the maximum allowed transform size used in TS mode.
- The method of claim 106, wherein the maximum allowed transform size used in non-TS mode of the MTS scheme is the same as the maximum allowed transform size used in TS mode.
- The method of claim 78, wherein the MTS scheme is implicit MTS where transform matrix of the MTS is directly derived according to transform block sizes of the block.
- The method of claim 108, wherein derivation of implicit MTS enabling flag indicating whether implicit MTS is enabled is independent from block size of the block.
- The method of claim 109, wherein checking of the block size in the derivation of implicit MTS enabling flag is skipped.
- The method of any of claims 78-110, wherein the shared condition check of block size before signaling MTS information is removed, wherein the MTS information includes transform_skip_flag and tu_mts_idx.
- The method of claim 111, wherein if all of the following shared conditions are true, the MTS information is further signaled: tu_cbf_luma[x0][y0]; treeType != DUAL_TREE_CHROMA; and !cu_sbt_flag; otherwise, the MTS information is not signaled.
- The method of claim 111, wherein when the shared conditions check of certain rules returns true, a condition check of block size compared to the maximum allowed TS sizes is applied before signaling transform_skip_flag; and a condition check of block size compared to the maximum allowed MTS sizes is applied before signaling tu_mts_idx.
- The method of claim 111, wherein the shared condition check of block size before signaling the MTS information is kept unchanged, while the condition check of block size before signaling the transform matrix index used in non-TS mode of MTS is removed.
- The method of any of claims 1-114, wherein the conversion generates the block of video from the bitstream representation.
- The method of any one of claims 1-114, wherein the conversion generates the bitstream representation from the block of video.
- An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one of claims 1 to 116.
- A computer program product stored on a non-transitory computer readable medium, the computer program product including program code for carrying out the method in any one of claims 1 to 116.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080031501.XA CN113728631B (en) | 2019-04-27 | 2020-04-27 | Intra sub-block segmentation and multiple transform selection |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNPCT/CN2019/084699 | 2019-04-27 | | |
CN2019084699 | 2019-04-27 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020221213A1 (en) | 2020-11-05 |
Family
ID=73028709
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/087285 WO2020221213A1 (en) | 2019-04-27 | 2020-04-27 | Intra sub-block partitioning and multiple transform selection |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113728631B (en) |
WO (1) | WO2020221213A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024174979A1 (en) * | 2023-02-20 | 2024-08-29 | Douyin Vision Co., Ltd. | Transform for intra block copy |
WO2024188249A1 (en) * | 2023-03-13 | 2024-09-19 | Douyin Vision Co., Ltd. | Method, apparatus, and medium for video processing |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024140853A1 (en) * | 2022-12-30 | 2024-07-04 | Douyin Vision Co., Ltd. | Method, apparatus, and medium for video processing |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018074812A1 (en) * | 2016-10-19 | 2018-04-26 | 에스케이텔레콤 주식회사 | Device and method for encoding or decoding image |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101791078B1 (en) * | 2010-04-16 | 2017-10-30 | 에스케이텔레콤 주식회사 | Video Coding and Decoding Method and Apparatus |
CN108712652A (en) * | 2012-06-29 | 2018-10-26 | 韩国电子通信研究院 | Method for video coding and computer-readable medium |
WO2015070801A1 (en) * | 2013-11-14 | 2015-05-21 | Mediatek Singapore Pte. Ltd. | Method of video coding using prediction based on intra picture block copy |
EP3202150B1 (en) * | 2014-09-30 | 2021-07-21 | Microsoft Technology Licensing, LLC | Rules for intra-picture prediction modes when wavefront parallel processing is enabled |
CN108293116A (en) * | 2015-11-24 | 2018-07-17 | 三星电子株式会社 | Video encoding/decoding method and equipment and method for video coding and equipment |
WO2018123316A1 (en) * | 2016-12-26 | 2018-07-05 | 日本電気株式会社 | Image encoding method, image decoding method, image encoding device, image decoding device and program |
EP3402190A1 (en) * | 2017-05-11 | 2018-11-14 | Thomson Licensing | Method and apparatus for intra prediction in video encoding and decoding |
WO2019009590A1 (en) * | 2017-07-03 | 2019-01-10 | 김기백 | Method and device for decoding image by using partition unit including additional region |
- 2020-04-27: WO, application PCT/CN2020/087285 (WO2020221213A1), status: Application Filing
- 2020-04-27: CN, application CN202080031501.XA (CN113728631B), status: Active
Non-Patent Citations (4)
Title |
---|
HERNANDEZ, S.L ET AL.: "Non-CE3/Non-CE8: Enable Transform Skip in CUs using ISP", JVET-N0401-V5, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 14TH MEETING, 19 March 2019 (2019-03-19), Geneva, CH, XP030203720 * |
HUNG, C.H ET AL.: "CE6-related: An Explicit MTS Design with Fast Encoder", JVET-N0424-V7, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 14TH MEETING, 19 March 2019 (2019-03-19), Geneva, CH, XP030203510 * |
LIM, S.C ET AL.: "Non-CE6: Simplification on implicit transform selection in ISP mode", JVET-N0375, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 14TH MEETING, 19 March 2019 (2019-03-19), Geneva, CH, XP030203709 * |
MA, T.C ET AL.: "Non-CE3/6: Enabling Transform Skip for ISP", JVET-N0475, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 14TH MEETING, 17 March 2019 (2019-03-17), Geneva, CH, XP030203119 * |
Also Published As
Publication number | Publication date |
---|---|
CN113728631B (en) | 2024-04-02 |
CN113728631A (en) | 2021-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020228673A1 (en) | Conditional use of reduced secondary transform for video processing | |
WO2020216296A1 (en) | Clipping operation in secondary transform based video processing | |
WO2020244656A1 (en) | Conditional signaling of reduced secondary transform in video bitstreams | |
WO2020244662A1 (en) | Simplified transform coding tools | |
JP7514354B2 (en) | Skip Conversion Mode Block Dimension Settings | |
WO2020228762A1 (en) | Context modeling for residual coding | |
WO2020182207A1 (en) | Partitions on sub-block transform mode | |
WO2020221213A1 (en) | Intra sub-block partitioning and multiple transform selection | |
JP7444970B2 (en) | Using default and user-defined scaling matrices | |
WO2021110018A1 (en) | Separable secondary transform processing of coded video | |
WO2020228716A1 (en) | Usage of transquant bypass mode for multiple color components | |
WO2020233664A1 (en) | Sub-block based use of transform skip mode | |
WO2021180022A1 (en) | Handling of transform skip mode in video coding | |
WO2020228693A1 (en) | Coding of multiple intra prediction methods | |
WO2020253642A1 (en) | Block size dependent use of secondary transforms in coded video | |
WO2020253874A1 (en) | Restriction on number of context coded bins | |
WO2021190594A1 (en) | Implicit determination of transform skip mode | |
WO2020253810A1 (en) | Coding tools for chroma components |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20798910; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.12.2021) |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20798910; Country of ref document: EP; Kind code of ref document: A1 |