CN113767627B - Cropping operations in video processing based on quadratic transforms - Google Patents
- Publication number: CN113767627B (application CN202080031268.5A)
- Authority: CN (China)
- Legal status: Active (an assumption by Google Patents, not a legal conclusion)
Classifications
- H04N19/60: transform coding
- H04N19/61: transform coding in combination with predictive coding
- H04N19/11: selection of coding mode or prediction mode among a plurality of spatial predictive coding modes
- H04N19/122: selection of transform size, e.g. 8x8 or 2x4x8 DCT
- H04N19/124: quantisation
- H04N19/132: sampling, masking or truncation of coding units
- H04N19/159: prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/176: the coding unit being an image region, e.g. a block or a macroblock
- H04N19/186: the coding unit being a colour or a chrominance component
- H04N19/70: syntax aspects related to video coding, e.g. related to compression standards
Abstract
A video processing method includes: for a conversion between a block of video and a bitstream representation of the video, determining that output values from an inverse quadratic transform (e.g., an inverse low-frequency non-separable transform) having a reduced size are constrained to the inclusive range [min, max]. The inverse quadratic transform is applied to the block between the inverse quantization step and the inverse main transform. The reduced size is smaller than the size of the block, and min and max are integer values. The method also includes performing the conversion based on the determination.
Description
Cross Reference to Related Applications
In accordance with the applicable patent law and/or rules pursuant to the Paris Convention, this application timely claims the priority to and benefit of International Patent Application No. PCT/CN2019/083853, filed on April 23, 2019. The entire disclosure of the aforementioned application is incorporated by reference as part of the disclosure of this application for all purposes under the law.
Technical Field
This patent document relates to video encoding and decoding techniques, devices and systems.
Background
Despite advances in video compression, digital video still accounts for the largest bandwidth usage on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the demand for bandwidth for digital video usage is expected to continue to grow.
Disclosure of Invention
This document describes various embodiments and techniques in which a quadratic transform (that is, a secondary transform, also referred to as a low-frequency non-separable transform, LFNST) is used during decoding or encoding of video or images.
In one example aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a block of video and a bitstream representation of the video, determining that output values from an inverse quadratic transform having a reduced size are constrained to the inclusive range [min, max]. The inverse quadratic transform is applied to the block between the inverse quantization step and the inverse main transform. The reduced size is smaller than the size of the block, and min and max are integer values. The method also includes performing the conversion based on the determination.
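As an illustrative sketch only (not the claimed implementation), the clipping constraint described above amounts to clamping each inverse secondary transform output into an inclusive integer range; the default bounds shown here (a 16-bit signed range) are hypothetical placeholder values, since the patent only requires some integer pair min, max:

```python
def clip_secondary_transform_output(values, min_val=-(1 << 15), max_val=(1 << 15) - 1):
    """Clamp each inverse secondary transform output into [min_val, max_val].

    The bounds are illustrative assumptions; the method only constrains the
    outputs to an inclusive integer range [min, max].
    """
    return [max(min_val, min(max_val, v)) for v in values]
```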
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a block of video and a bitstream representation of the video, a manner of applying a quadratic transform having a reduced size to subblocks of the block is determined based on a number of subblocks to which the quadratic transform is applicable. The quadratic transform is applied to blocks between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a block of video and a bitstream representation of the video, determining that a quadratic transform having a reduced size is applicable to a single sub-block of the block in case the size of the block satisfies a condition. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a block of video and a bitstream representation of the video, it is determined that a quadratic transform having a reduced size is applicable to a region in the block having a size of K × L. K and L are positive integers, and K is not equal to L. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a transition between a block of video and a bitstream representation of the video, a non-zero range is determined based on characteristics of the block. A non-zero range corresponds to a range outside which coefficients associated with a quadratic transform having a reduced size are set to zero. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
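A hedged sketch of the zeroing behavior described above: coefficients outside a block-dependent non-zero range are forced to zero. Modeling the range as the first N positions in scan order is a simplifying assumption for illustration; the actual derivation of the range from block characteristics is not modeled here:

```python
def apply_nonzero_range(scan_ordered_coeffs, nonzero_range):
    """Zero every coefficient at or beyond `nonzero_range` in scan order.

    `nonzero_range` would be determined from characteristics of the block
    (e.g. its dimensions); that derivation is outside this sketch.
    """
    return [c if i < nonzero_range else 0
            for i, c in enumerate(scan_ordered_coeffs)]
```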
In another example aspect, a method of video encoding is disclosed. The method includes determining that a quadratic transform having a reduced size is applicable to two adjacent sub-blocks of a block of the video. Each of the two adjacent sub-blocks has a size of M × N, M and N being positive integers. The quadratic transform is performed between the forward main transform and the quantization step. The reduced size is reduced from the size of the block. The method also includes generating a codec representation of the video based on the determination.
In another example aspect, a method of video decoding is disclosed. The method includes determining that a quadratic transform having a reduced size is applicable to two neighboring sub-blocks of a block of video. Each of the two adjacent sub-blocks has a size of M × N, M and N being positive integers. The quadratic transformation is performed between the inverse quantization step and the inverse main transformation. The reduced size is reduced from the size of the block. The method also includes generating a block of video by parsing a codec representation of the video according to the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a transition between a block of video and a bitstream representation of the video, it is determined whether to apply a quadratic transform having a reduced size to the block based on a characteristic associated with the block according to a rule. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a block of video and a bitstream representation of the video, a bit precision constraint is determined for coefficients of one or more transform matrices of a quadratic transform having a reduced size that are applicable to the block. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method also includes performing a conversion based on the determination.
In another example aspect, a method of video processing is disclosed. The method comprises the following steps: determining a constraint rule for selectively applying a quadratic transform having a reduced size during a transition between a bitstream representation of a current video block and pixels of the current video block; and performing the conversion by applying a quadratic transform having a reduced size according to the constraint rule. The quadratic transform having a reduced size has a size reduced from the size of the current video block. During the conversion, a quadratic transform with a reduced size is applied in a specific order together with the main transform.
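The ordering constraint that the aspects above repeat (at the decoder: inverse quantization, then the reduced-size inverse secondary transform, then the inverse main transform) can be sketched with placeholder stages. All function bodies here are hypothetical identity stand-ins, not the actual transforms:

```python
def decode_residual(coeffs, apply_secondary):
    """Illustrate only the ordering of decoder-side stages."""
    def inverse_quantize(x):
        return x  # stand-in; real codecs rescale coefficients here

    def inverse_secondary_transform(x):
        return x  # stand-in for the reduced-size inverse secondary transform

    def inverse_main_transform(x):
        return x  # stand-in; real codecs apply inverse DCT/DST here

    x = inverse_quantize(coeffs)
    if apply_secondary:
        # Applied between inverse quantization and the inverse main transform.
        x = inverse_secondary_transform(x)
    return inverse_main_transform(x)
```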
In another example aspect, another method of video processing is disclosed. The method comprises the following steps: determining a constraint rule for selectively applying a quadratic transform having a reduced size during transitions between bit stream representations of the current video block and the neighboring video region and pixels of the current video block and pixels of the neighboring region; and performing the conversion by applying a quadratic transform having a reduced size according to the constraint rule. The quadratic transform having a reduced size has a size reduced from the size of the current video block and the neighboring video area. During the conversion, a quadratic transform with a reduced size is applied in a specific order together with the main transform.
In yet another example aspect, another video processing method is disclosed. The method comprises the following steps: determining a zeroing rule for selectively applying a quadratic transform having a reduced size during conversion of a bitstream representation of the current video block to pixels of the current video block; and the conversion is performed by applying a quadratic transform with a reduced size according to the zeroing rule. The quadratic transform having a reduced size has a size reduced from the size of the current video block. The zeroing rule specifies the maximum number of coefficients used by a quadratic transform with a reduced size.
In yet another example aspect, another video processing method is disclosed. The method comprises the following steps: determining a condition for selectively applying a quadratic transform having a reduced size during a conversion between a bitstream representation of a current video block and pixels of the current video block; and performing the conversion by applying the quadratic transform having the reduced size according to the condition. The quadratic transform having a reduced size has a size reduced from the size of the current video block. The condition is signaled in the bitstream representation.
In yet another example aspect, another video processing method is disclosed. The method comprises the following steps: selectively applying a quadratic transform having a reduced size during a conversion between a bitstream representation of the current video block and pixels of the current video block, and performing the conversion accordingly. The quadratic transform having a reduced size has a size reduced from the size of the current video block. The conversion includes selectively applying position-dependent intra prediction combination (PDPC) based on a coexistence rule.
In yet another example aspect, another video processing method is disclosed. The method comprises the following steps: applying a quadratic transform having a reduced size during a conversion between a bitstream representation of the current video block and pixels of the current video block, and performing the conversion accordingly. The quadratic transform having a reduced size has a size reduced from the size of the current video block. The applying controls the use of neighboring samples for intra prediction during the conversion.
In yet another example aspect, another video processing method is disclosed. The method comprises the following steps: selectively applying a quadratic transform having a reduced size during a conversion between a bitstream representation of the current video block and pixels of the current video block, and performing the conversion accordingly. The quadratic transform having a reduced size has a size reduced from the size of the current video block. The selective applying controls the use of quantization matrices during the conversion.
In yet another example aspect, a video encoder is disclosed. The video encoder includes a processor configured to implement one or more of the methods described above.
In yet another example aspect, a video decoder is disclosed. The video decoder includes a processor configured to implement one or more of the above-described methods.
In yet another example aspect, a computer-readable medium is disclosed. The medium comprises code for implementing one or more of the above methods stored on the medium.
These and other aspects are described in this document.
Drawings
Fig. 1 shows an example of a block diagram of an encoder.
Fig. 2 shows an example of 67 intra prediction modes.
Fig. 3A to 3B illustrate examples of reference samples for wide-angle intra prediction.
Fig. 4 is an example illustration of the discontinuity problem in the case where the direction exceeds 45 degrees.
Fig. 5A to 5D show an example illustration of samples by PDPC applied to diagonal and adjacent angular intra modes.
Fig. 6 is an example of partitioning of 4×8 and 8×4 blocks.
Fig. 7 is an example of partitioning of all blocks except 4×8, 8×4, and 4×4.
Fig. 8 shows the division of a 4×8 block of samples into two independently decodable regions.
Fig. 9 illustrates an example order of processing pixel rows to maximize throughput for 4×N blocks with a vertical predictor.
Fig. 10 shows an example of quadratic transformation.
Fig. 11 shows an example of the proposed Reduced Secondary Transform (RST).
Fig. 12 shows an example of the forward and reverse (or inverse) reduced transforms.
Fig. 13 shows an example of a forward RST8x8 process with a 16x48 matrix.
Fig. 14 shows an example of scanning positions 17 to 64 of non-zero elements.
FIG. 15 is a diagram of sub-block transform modes SBT-V and SBT-H.
FIG. 16 is a block diagram of an example hardware platform for implementing the techniques described in this document.
FIG. 17 is a flow diagram of an example method of video processing.
FIG. 18 is a block diagram of an example video processing system in which the disclosed techniques may be implemented.
FIG. 19 is a flow diagram of an example method of video processing in accordance with the present technology.
FIG. 20 is a flow diagram of another example method of video processing in accordance with the present technology.
FIG. 21 is a flow diagram of another example method of video processing in accordance with the present technology.
FIG. 22 is a flow diagram of another example method of video processing in accordance with the present technology.
FIG. 23 is a flow diagram of another example method of video processing in accordance with the present technology.
Fig. 24A is a flow diagram of an example method of video encoding in accordance with the present technology.
Fig. 24B is a flow diagram of an example method of video decoding in accordance with the present technology.
FIG. 25 is a flow diagram of another example method of video processing in accordance with the present technology.
FIG. 26 is a flow diagram of yet another example method of video processing in accordance with the present technology.
Detailed Description
Section headings are used in this document to facilitate understanding and do not limit the embodiments disclosed in a section to only that section. Furthermore, although certain embodiments are described with reference to Versatile Video Coding or other specific video codecs, the disclosed techniques are applicable to other video codec technologies as well. Additionally, while some embodiments describe video encoding steps in detail, it should be understood that corresponding decoding steps that undo the encoding will be implemented by a decoder. In addition, the term video processing encompasses video encoding or compression, video decoding or decompression, and video transcoding in which video pixels are represented from one compressed format into another compressed format or at a different compressed bitrate.
1. Overview
This patent document relates to video coding and decoding techniques. In particular, it relates to transforms in video coding. It may be applied to existing video coding standards, such as HEVC, or to the standard to be finalized (Versatile Video Coding, VVC). It may also be applicable to future video coding standards or video codecs.
2. Preliminary discussion
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC [1] standards. Since H.262, video coding standards have been based on the hybrid video coding structure, in which temporal prediction plus transform coding is utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, many new methods have been adopted by JVET and put into a reference software named the Joint Exploration Model (JEM) [2]. In April 2018, the Joint Video Experts Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.
2.1 color space and chroma subsampling
A color space, also known as a color model (or color system), is an abstract mathematical model that simply describes the range of colors as tuples of numbers, typically as 3 or 4 values or color components (e.g. RGB). Basically speaking, a color space is an elaboration of a coordinate system and subspace.
For video compression, the most common color spaces are YCbCr and RGB.
YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y′ is the luma component, and CB and CR are the blue-difference and red-difference chroma components. Y′ (with prime) is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries.
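As a concrete illustration of the Y′CbCr family, gamma-corrected R′G′B′ values can be converted with the full-range BT.601 (JFIF) matrix, which is one of several conventions in use and is shown here purely as an example:

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) R'G'B' -> Y'CbCr for 8-bit samples."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return round(y), round(cb), round(cr)
```

For white (255, 255, 255) this yields Y′ = 255 with both chroma components at their neutral value of 128, as expected for an achromatic input.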
Chroma subsampling is the practice of encoding images with lower resolution for chroma information than for luma information, taking advantage of the human visual system's lower sensitivity to color differences than to luminance.
2.1.1 The 4:4:4 format
Each of the three Y'CbCr components has the same sampling rate, so there is no chroma subsampling. This scheme is sometimes used in high-end film scanners and cinematic post-production.
2.1.2 The 4:2:2 format
The two chroma components are sampled at half the sampling rate of luma: the horizontal chroma resolution is halved. This reduces the bandwidth of an uncompressed video signal by one-third with little to no visual difference.
2.1.3 The 4:2:0 format
In 4:2:0, the horizontal sampling is doubled compared to 4:1:1, but since the Cb and Cr channels are only sampled on alternate lines in this scheme, the vertical resolution is halved. The data rate is therefore the same. Cb and Cr are each subsampled by a factor of 2 both horizontally and vertically. There are three variants of 4:2:0 schemes, having different horizontal and vertical siting.
In MPEG-2, Cb and Cr are co-sited horizontally. Cb and Cr are sited between pixels in the vertical direction (sited interstitially).
In JPEG/JFIF, H.261, and MPEG-1, Cb and Cr are sited interstitially, halfway between alternate luma samples.
In 4:2:0 DV, Cb and Cr are co-sited in the horizontal direction. In the vertical direction, they are co-sited on alternating lines.
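As a concrete illustration (not part of the patent text; function and scheme names are ours), the per-component chroma plane dimensions implied by the subsampling schemes above can be sketched as:

```python
def chroma_plane_size(width, height, scheme):
    """Return (chroma_width, chroma_height) of one chroma component
    for a luma plane of size width x height."""
    if scheme == "4:4:4":   # no chroma subsampling
        return width, height
    if scheme == "4:2:2":   # horizontal chroma resolution halved
        return width // 2, height
    if scheme == "4:2:0":   # halved both horizontally and vertically
        return width // 2, height // 2
    raise ValueError(f"unknown scheme: {scheme}")
```

For a 1920×1080 picture, 4:2:0 yields two 960×540 chroma planes, which is the source of the one-half data-rate reduction relative to 4:4:4.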
2.2 codec flow for typical video codecs
Fig. 1 shows an example of a block diagram of a VVC encoder, which contains three in-loop filter modules: deblocking filter (DF), sample adaptive offset (SAO), and ALF. Unlike DF, which uses predefined filters, SAO and ALF utilize the original samples of the current picture to reduce the mean squared error between the original and reconstructed samples, by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. ALF is located at the last processing stage of each picture and can be regarded as a tool that tries to catch and fix artifacts created by the previous stages.
2.3 Intra mode codec with 67 Intra prediction modes
To capture the arbitrary edge directions present in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted as dashed arrows in Fig. 2, and the planar and DC modes remain the same. These denser directional intra prediction modes apply to all block sizes and to both luma and chroma intra prediction.
As shown in Fig. 2, the conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in the clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding is unchanged.
In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operation is required to generate an intra predictor using the DC mode. In VTM2, blocks may have a rectangular shape, which in general would necessitate the use of a division operation per block. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
2.4 Wide-Angle Intra prediction of non-Square blocks
The conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in the clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes for a certain block is unchanged, i.e., 67, and the intra mode coding is unchanged.
To support these prediction directions, a top reference with length 2W+1 and a left reference with length 2H+1 are defined as shown in Figs. 3A-3B.
The mode number of the replaced mode in the wide-angle direction mode depends on the aspect ratio of the block. Alternative intra prediction modes are shown in table 1.
TABLE 1 Intra prediction modes replaced by Wide-Angle mode
As shown in Fig. 4, two vertically adjacent prediction samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, a low-pass reference sample filter and side smoothing are applied to wide-angle prediction to reduce the negative effects of the increased gap Δp_α.
2.5 location-dependent Intra prediction combining
In VTM2, the result of intra prediction in planar mode is further modified by a position-dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method that invokes a combination of the unfiltered boundary reference samples and HEVC-style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signaling: planar, DC, horizontal, vertical, the bottom-left angular mode and its eight adjacent angular modes, and the top-right angular mode and its eight adjacent angular modes.
The prediction sample pred(x, y) is predicted using a linear combination of the intra prediction mode result (DC, planar, angular) and reference samples, according to the following equation:
pred(x, y) = ( wL × R(-1,y) + wT × R(x,-1) - wTL × R(-1,-1) + (64 - wL - wT + wTL) × pred(x, y) + 32 ) >> 6
where R(x,-1) and R(-1,y) denote the reference samples located at the top and left of the current sample (x, y), respectively, and R(-1,-1) denotes the reference sample located at the top-left corner of the current block.
If PDPC is applied to DC, planar, horizontal and vertical intra modes, no additional boundary filter is needed, as required in case of HEVC DC mode boundary filter or horizontal/vertical mode edge filter.
Figs. 5A to 5D illustrate the definition of the reference samples (R(x,-1), R(-1,y), and R(-1,-1)) for PDPC applied to various prediction modes. The prediction sample pred(x', y') is located at (x', y') within the prediction block. The coordinate x of the reference sample R(x,-1) is given by x = x' + y' + 1, and the coordinate y of the reference sample R(-1,y) is similarly given by y = x' + y' + 1.
Fig. 5A-5D provide definitions of samples used by PDPCs applied to diagonal and adjacent corner intra modes.
The PDPC weights depend on the prediction mode and are shown in table 2.
Table 2 example of PDPC weights according to prediction mode
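The PDPC blending equation above can be sketched as follows (a minimal illustration; the function and parameter names are ours, and the weights wL, wT, wTL would come from the mode-dependent table):

```python
def pdpc_combine(pred, w_l, w_t, w_tl, r_left, r_top, r_topleft):
    """Blend one intra-predicted sample pred(x, y) with unfiltered reference
    samples per the PDPC equation: weights are in 1/64 units, and the result
    is rounded (+32) and normalized by a right shift of 6."""
    return (w_l * r_left + w_t * r_top - w_tl * r_topleft
            + (64 - w_l - w_t + w_tl) * pred + 32) >> 6
```

With all weights zero the sample passes through unchanged, which matches the equation reducing to (64 × pred + 32) >> 6 = pred.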
2.6 Intra sub-block partitioning (ISP)
In some embodiments, the ISP is proposed to divide luma intra-predicted blocks vertically or horizontally into 2 or 4 sub-partitions depending on the block size, as shown in Table 3. Figs. 6 and 7 show examples of the two possibilities. All sub-partitions fulfill the condition of having at least 16 samples.
TABLE 3 number of sub-partitions according to Block size
Block size | Number of sub-partitions
4×4 | Not divided
4×8 and 8×4 | 2
All other cases | 4
Fig. 6 shows examples of division of 4 × 8 and 8 × 4 blocks.
Fig. 7 shows an example of division of all blocks except for 4 × 8, 8 × 4, and 4 × 4.
For each of these sub-partitions, a residual signal is generated by entropy-decoding the coefficients transmitted by the encoder, and then inverse-quantizing and inverse-transforming them. Then, intra prediction is performed on the sub-partitions, and finally the corresponding reconstructed samples are obtained by adding a residual signal to the prediction signal. Thus, the reconstructed value for each sub-partition will be available to generate a prediction for the next sub-partition, which will repeat the process, and so on. All sub-partitions share the same intra mode.
Based on the intra mode and the utilized partition, two different categories of processing orders are used, which are referred to as normal order and reverse order. In the normal order, the first sub-partition to be processed is the one that contains the upper left sample of the CU and then continues down (horizontal partitioning) or right (vertical partitioning). As a result, the reference samples used to generate the sub-partition prediction signal are located only at the left and upper sides of the line. On the other hand, the reverse processing order starts from the sub-partition containing the lower-left sample of the CU and continues upward, or starts from the sub-partition containing the upper-right sample of the CU and continues to the left.
2.7 Block differential pulse code modulation codec (BDPCM)
Due to the shape of the horizontal (resp. vertical) predictors, which use the left (A) (resp. top (B)) pixel for prediction of the current pixel, the most throughput-efficient way of processing the block is to process all the pixels of one column (resp. line) in parallel and to process these columns (resp. lines) sequentially. To increase throughput, the following process is introduced: a block of width 4 is divided into two halves with a horizontal frontier when the predictor chosen on this block is vertical, and a block of height 4 is divided into two halves with a vertical frontier when the predictor chosen on this block is horizontal.
When a block is divided, samples from one region are not allowed to use pixels from the other region to compute the prediction: if this situation occurs, the prediction pixel is replaced by the reference pixel in the prediction direction. This is illustrated in Fig. 8 for different positions of the current pixel X in a vertically predicted 4×8 block.
Fig. 8 shows an example of dividing a block having 4 × 8 samples into two independently decodable regions.
Due to this property, it is now possible to process a 4×4 block in 2 cycles, a 4×8 or 8×4 block in 4 cycles, and so on, as shown in Fig. 9.
Fig. 9 shows an example of a processing order for pixel rows to maximize throughput for a 4xN block with a vertical predictor.
Table 4 summarizes the number of cycles required to process a block, depending on the block size. It is straightforward to show that any block having both dimensions larger than or equal to 8 can be processed at 8 pixels per cycle or more.
TABLE 4 worst case throughput for 4xN, nx4 sized blocks
2.8 quantized residual Domain BDPCM
In some embodiments, quantized residual domain BDPCM (denoted RBDPCM hereinafter) is proposed. As in intra prediction, prediction is performed on the entire block by sample copying in the prediction direction (horizontal or vertical prediction). The residual is quantized, and the delta between the quantized residual and its predictor (horizontal or vertical) quantized value is coded.
For a block of size M (rows) × N (columns), let r(i, j), 0 <= i <= M-1, 0 <= j <= N-1, be the prediction residual after performing intra prediction horizontally (copying the left neighboring pixel value across the predicted block line by line) or vertically (copying the top neighboring line to each line in the predicted block), using unfiltered samples from the top or left block boundary samples. Let Q(r(i, j)), 0 <= i <= M-1, 0 <= j <= N-1, denote the quantized version of the residual r(i, j), where the residual is the difference between the original block values and the predicted block values. Block DPCM is then applied to the quantized residual samples, resulting in a modified M × N array R̃ with elements r̃(i, j). When vertical BDPCM is signaled:

r̃(i, j) = Q(r(i, j)), for i = 0, 0 <= j <= N-1
r̃(i, j) = Q(r(i, j)) - Q(r(i-1, j)), for 1 <= i <= M-1, 0 <= j <= N-1
for horizontal prediction, a similar rule is applied, and residual quantized samples are obtained by the following equation
On the decoder side, the above calculations are reversed to produce Q(r(i, j)), 0 <= i <= M-1, 0 <= j <= N-1. For the vertical prediction case,

Q(r(i, j)) = Σ_{k=0}^{i} r̃(k, j), 0 <= i <= M-1, 0 <= j <= N-1
in the case of the horizontal case,
The inverse quantized residuals Q^(-1)(Q(r(i, j))) are added to the intra block prediction values to produce the reconstructed sample values.
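The forward and inverse vertical-BDPCM recurrences above can be sketched as follows (illustrative helper names; plain Python lists stand in for the quantized residual array):

```python
def bdpcm_forward_vertical(q):
    """q: M x N list of quantized residuals Q(r[i][j]). Returns the modified
    array r~: row 0 is kept as-is, each later row stores the difference
    from the row above (the vertical-BDPCM equations)."""
    M, N = len(q), len(q[0])
    return [[q[i][j] if i == 0 else q[i][j] - q[i - 1][j] for j in range(N)]
            for i in range(M)]

def bdpcm_inverse_vertical(rt):
    """Decoder side: a cumulative sum down each column recovers Q(r[i][j])."""
    M, N = len(rt), len(rt[0])
    out = [[0] * N for _ in range(M)]
    for j in range(N):
        acc = 0
        for i in range(M):
            acc += rt[i][j]
            out[i][j] = acc
    return out
```

Round-tripping any array through the forward and inverse functions returns the original, which is the property the decoder relies on.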
The main benefit of this scheme is that the inverse DPCM can be done on the fly during coefficient parsing, simply adding the predictor as the coefficients are parsed, or it can be performed after parsing.
Transform skipping is always used in the quantized residual domain BDPCM.
2.9 Multiple transform sets in VVC
In VTM4, large block-size transforms, up to 64×64 in size, are enabled, which is primarily useful for higher-resolution video, such as 1080p and 4K sequences. For a transform block with size (width or height, or both width and height) equal to 64, the high-frequency transform coefficients are zeroed out so that only the low-frequency coefficients are retained. For example, for an M×N transform block, with M as the block width and N as the block height, when M is equal to 64, only the left 32 columns of transform coefficients are kept. Similarly, when N is equal to 64, only the top 32 rows of transform coefficients are kept. When the transform skip mode is used for a large block, the entire block is used without zeroing out any values.
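The zeroing-out rule for 64-point transform blocks described above can be sketched as follows (an illustrative helper, not the normative process text):

```python
def zero_out_high_freq(coeffs, keep=32):
    """coeffs: M x N list of transform coefficient rows. When a dimension
    equals 64, only the first `keep` (= 32) rows/columns of coefficients
    are retained; coefficients beyond that are set to zero."""
    M = len(coeffs)        # block height (number of rows)
    N = len(coeffs[0])     # block width (number of columns)
    for i in range(M):
        for j in range(N):
            if (N == 64 and j >= keep) or (M == 64 and i >= keep):
                coeffs[i][j] = 0
    return coeffs
```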
In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple selected transforms from DCT-VIII/DST-VII. The newly introduced transform matrices are DST-VII and DCT-VIII. Table 5 shows the basis functions of the selected DST/DCT.
TABLE 5 selection of basis functions for DST/DCT
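The body of Table 5 appears to have been lost in extraction. For reference, the basis functions T_i(j), i, j = 0, 1, ..., N-1, of these transforms in their standard forms are (this is our reconstruction, not text recovered from the patent):

```latex
% DCT-II:
T_i(j) = \omega_0 \sqrt{\frac{2}{N}} \cos\!\left(\frac{\pi i (2j+1)}{2N}\right),
\qquad \omega_0 = \begin{cases} \sqrt{1/2} & i = 0 \\ 1 & i \neq 0 \end{cases}

% DCT-VIII:
T_i(j) = \sqrt{\frac{4}{2N+1}} \cos\!\left(\frac{\pi (2i+1)(2j+1)}{4N+2}\right)

% DST-VII:
T_i(j) = \sqrt{\frac{4}{2N+1}} \sin\!\left(\frac{\pi (2i+1)(j+1)}{2N+1}\right)
```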
To preserve the orthogonality of the transform matrices, the transform matrices are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transform coefficients within the 16-bit range, all coefficients are kept to 10 bits after the horizontal and after the vertical transform.
To control the MTS scheme, separate enabling flags are specified at the SPS level for intra and inter, respectively. When MTS is enabled at the SPS, a CU-level flag is signaled to indicate whether MTS is applied. Here, MTS is applied only to luma. The MTS CU-level flag is signaled when the following conditions are satisfied:
-both width and height are less than or equal to 32.
The CBF flag is equal to 1.
If the MTS CU flag is equal to zero, DCT2 is applied in both directions. However, if the MTS CU flag is equal to one, two other flags are additionally signaled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signaling mapping table is shown in Table 6. As for transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point, and 32-point DCT-2. Also, the other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point, and 32-point DST-7 and DCT-8, use 8-bit primary transform cores.
Table 6 transformation and signaling mapping table
To reduce the complexity of larger sizes of DST-7 and DCT-8, the high frequency transform coefficients are zeroed out for DST-7 and DCT-8 blocks with sizes (width or height, or both width and height) equal to 32. Only the coefficients in the 16x16 low frequency region are retained.
As in HEVC, the residual of a block may be coded with a transform skip mode. To avoid redundancy in syntax coding, the transform skip flag is not signaled when the MTS _ CU _ flag at the CU level is not equal to zero. The block size restriction for transform skipping is the same as that of MTS in JEM4, which indicates that transform skipping applies to a CU when both the block width and height are equal to or less than 32.
2.10 Example reduced secondary transform (RST)
2.10.1 Example non-separable secondary transform (NSST)
In some embodiments, a secondary transform (also referred to as a non-separable transform) is applied between the forward primary transform and quantization (at the encoder) and between de-quantization and the inverse primary transform (at the decoder side). As shown in Fig. 10, a 4×4 or 8×8 secondary transform is performed depending on the block size. For example, the 4×4 secondary transform is applied to small blocks (e.g., min(width, height) < 8), and the 8×8 secondary transform is applied per 8×8 block to larger blocks (e.g., min(width, height) > 4).
Fig. 10 shows an example of the secondary transform in JEM.
The application of the non-separable transform is described below using a 4×4 input as an example. To apply the non-separable transform, the 4×4 input block X is first represented as a 16×1 vector x⃗ in row-major order:

x⃗ = [ X(0,0) X(0,1) X(0,2) X(0,3) X(1,0) X(1,1) X(1,2) X(1,3) X(2,0) X(2,1) X(2,2) X(2,3) X(3,0) X(3,1) X(3,2) X(3,3) ]^T
The non-separable transform is calculated as F⃗ = T · x⃗, where F⃗ denotes the transform coefficient vector and T is a 16×16 transform matrix. The 16×1 coefficient vector F⃗ is subsequently re-organized into a 4×4 block using the scanning order for that block (horizontal, vertical, or diagonal). Coefficients with smaller indices are placed with the smaller scanning indices in the 4×4 coefficient block. There are 35 transform sets in total, and each transform set uses 3 non-separable transform matrices (kernels). The mapping from the intra prediction mode to the transform set is pre-defined. For each transform set, the selected non-separable secondary transform candidate is further specified by an explicitly signaled secondary transform index. The index is signaled in the bitstream once per intra CU after the transform coefficients.
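The vectorize-then-multiply step can be sketched as follows (T here is any 16×16 kernel; the actual NSST kernels are defined per transform set, and the names are ours):

```python
def nsst_forward(X, T):
    """Apply a 16x16 non-separable transform T to a 4x4 block X:
    flatten X row-major into a 16x1 vector, then compute F = T . x_vec."""
    x_vec = [X[i][j] for i in range(4) for j in range(4)]
    # F[k] = sum_n T[k][n] * x_vec[n]
    return [sum(T[k][n] * x_vec[n] for n in range(16)) for k in range(16)]
```

With T set to the identity matrix, the output is simply the flattened input, which makes the vectorization order easy to verify.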
2.10.2 Example reduced secondary transform (RST) / low-frequency non-separable transform (LFNST)
A reduced secondary transform (RST), also known as the low-frequency non-separable transform (LFNST), was introduced using a mapping of 4 transform sets (instead of 35 transform sets). In some embodiments, 16×64 (which may be further reduced to 16×48) and 16×16 matrices are employed for 8×8 and 4×4 blocks, respectively. For notational convenience, the 16×64 (which may be further reduced to 16×48) transform is denoted RST8x8 and the 16×16 transform is denoted RST4x4. Fig. 11 shows an example of RST.
Fig. 11 shows an example of the proposed reduced secondary transform (RST).
RST calculation
The main idea of the Reduced Transform (RT) is to map an N-dimensional vector to an R-dimensional vector in a different space, where R/N (R < N) is the reduction factor.
The RT matrix is an R×N matrix as follows:

T(R×N) = [ t(1,1) t(1,2) t(1,3) ... t(1,N)
           t(2,1) t(2,2) t(2,3) ... t(2,N)
           ...
           t(R,1) t(R,2) t(R,3) ... t(R,N) ]
where the R rows of the transform are R bases of the N-dimensional space. The inverse transform matrix of the RT is the transpose of its forward transform. Examples of the forward and inverse RT are depicted in Fig. 12.
Fig. 12 shows an example of the forward and inverse reduced transforms.
In some embodiments, RST8x8 with a reduction factor of 4 (1/4 size) is applied. Thus, instead of 64×64, which is the conventional 8×8 non-separable transform matrix size, a 16×64 direct matrix is used. In other words, the 64×16 inverse RST matrix is used at the decoder side to generate the core (primary) transform coefficients in the top-left 8×8 region. The forward RST8x8 uses 16×64 (or 8×64 for an 8×8 block) matrices so that it produces non-zero coefficients only in the top-left 4×4 region within the given 8×8 region. In other words, if RST is applied, the 8×8 region except the top-left 4×4 region will have only zero coefficients. For RST4x4, 16×16 (or 8×16 for a 4×4 block) direct matrix multiplication is applied.
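The reduced-transform idea (an R×N forward matrix, with its transpose used as the inverse) can be sketched generically as follows (illustrative names and sizes, not the actual VVC kernel values):

```python
def rst_forward(T, x):
    """T: R x N reduced transform matrix, x: length-N vector.
    Returns the length-R coefficient vector T . x."""
    return [sum(row[n] * x[n] for n in range(len(x))) for row in T]

def rst_inverse(T, y):
    """The inverse RST multiplies by the transpose: an N x R matrix
    applied to the length-R vector y gives a length-N vector."""
    R, N = len(T), len(T[0])
    return [sum(T[r][n] * y[r] for r in range(R)) for n in range(N)]
```

For an orthonormal row set, rst_inverse(T, rst_forward(T, x)) projects x onto the R-dimensional subspace spanned by the rows of T, which is exactly the dimensionality reduction the text describes.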
The inverse RST is conditionally applied when the following two conditions are met:
a. The block size is greater than or equal to a given threshold (W >= 4 && H >= 4);
b. the transform skip mode flag is equal to zero.
If both the width (W) and height (H) of a transform coefficient block are greater than 4, RST8x8 is applied to the top-left 8×8 region of the transform coefficient block. Otherwise, RST4x4 is applied to the top-left min(8, W) × min(8, H) region of the transform coefficient block.
If the RST index is equal to 0, RST is not applied. Otherwise, RST is applied, and its kernel is chosen with the RST index. The RST selection method and the coding of the RST index are explained later.
In addition, RST is applied to intra CUs in both intra and inter slices, as well as to both luma and chroma. If dual tree is enabled, the RST indices for luma and chroma are signaled separately. For inter-frame stripes (dual tree disabled), a single RST index is signaled and used for both luma and chroma.
In some embodiments, intra Sub-Partition (ISP) is employed as the new Intra prediction mode. When ISP mode is selected, RST is disabled and RST index is not signaled, since performance improvement is negligible even if RST is applied to every feasible partition block. In addition, disabling RST for the residual of ISP prediction may reduce encoding complexity.
RST selection
The RST matrix is selected from four sets of transforms, each of which includes two transforms. Which transform set to apply is determined according to the intra prediction mode as follows:
(1) If one of the three CCLM modes is indicated, transform set0 is selected.
(2) Otherwise, transform set selection is performed according to Table 7:
table 7 transform set selection table
IntraPredMode | Transform set index
IntraPredMode < 0 | 1
0 <= IntraPredMode <= 1 | 0
2 <= IntraPredMode <= 12 | 1
13 <= IntraPredMode <= 23 | 2
24 <= IntraPredMode <= 44 | 3
45 <= IntraPredMode <= 55 | 2
56 <= IntraPredMode | 1
The index used to access the table (denoted IntraPredMode) is in the range [-14, 83], which is a transformed mode index used for wide-angle intra prediction.
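Table 7 together with the CCLM special case can be sketched as a selection function (the function name and the is_cclm parameter are ours):

```python
def rst_transform_set(intra_pred_mode, is_cclm=False):
    """Select the RST transform set index from IntraPredMode per Table 7.
    intra_pred_mode is in [-14, 83] (wide-angle remapped mode index)."""
    if is_cclm:                      # any of the three CCLM modes -> set 0
        return 0
    if intra_pred_mode < 0:
        return 1
    if intra_pred_mode <= 1:         # planar (0) / DC (1)
        return 0
    if intra_pred_mode <= 12:
        return 1
    if intra_pred_mode <= 23:
        return 2
    if intra_pred_mode <= 44:
        return 3
    if intra_pred_mode <= 55:
        return 2
    return 1                         # 56 <= IntraPredMode
```

Note the symmetry around the diagonal modes: modes on either side of mode 34 map to the same set indices, mirroring the table rows.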
Reduced size RST matrix
As a further simplification, 16×48 matrices are applied with the same transform set configuration instead of 16×64 matrices; each 16×48 matrix takes 48 input data from three 4×4 blocks in the top-left 8×8 block, excluding the bottom-right 4×4 block (Fig. 13).
Fig. 13 shows an example of a positive RST8x8 process utilizing a 16x48 matrix.
RST signaling
The forward RST8x8 with R = 16 uses 16×64 matrices so that it produces non-zero coefficients only in the top-left 4×4 region within the given 8×8 region. In other words, if RST is applied, the 8×8 region except the top-left 4×4 region generates only zero coefficients. As a result, the RST index is not coded when any non-zero element is detected within the 8×8 block region other than the top-left 4×4 region (depicted in Fig. 14), since this implies that RST was not applied. In such a case, the RST index is inferred to be zero.
Fig. 14 shows an example of scanning positions 17 to 64 of non-zero elements.
Zeroing-out range
In general, any coefficient in a 4×4 sub-block may be non-zero before the inverse RST is applied to the sub-block. However, it is constrained that, in some cases, some coefficients in the 4×4 sub-block must be zero before the inverse RST is applied to the sub-block.
Let nonZeroSize be a variable. When the coefficients are rearranged into a 1-D array before the inverse RST, any coefficient with an index not smaller than nonZeroSize must be zero.
When nonZeroSize is equal to 16, there is no zeroing constraint for the coefficients in the upper left 4 × 4 sub-block.
In some embodiments, when the current block size is 4 × 4 or 8 × 8, nonZeroSize is set equal to 8. For other block sizes, nonZeroSize is set to 16.
Example description of RST
In the tables and descriptions below, bold italic text is used to indicate that changes can be made to the current syntax to accommodate certain embodiments described in this document.
Sequence parameter set RBSP syntax
Residual coding syntax
Coding/decoding unit syntax
Sequence parameter set RBSP semantics
……
……
Coding and decoding unit semantics
……
Transformation process for scaling transform coefficients
Overview
The inputs to this process are:
-a luminance position (xTbY, yTbY) specifying an upper left sample of the current luminance transform block relative to an upper left luminance sample of the current picture,
a variable nTbW specifying the current transform block width,
a variable nTbH specifying the current transform block height,
a variable cIdx specifying the color component of the current block,
-an (nTbW) x (nTbH) array d [ x ] [ y ] of scaled transform coefficients of x =0.. NTbW-1, y =0.. NTbH-1.
The output of this process is an (nTbW) x (nTbH) array r [ x ] [ y ] of residual samples of x =0.. NTbW-1, y =0.. NTbH-1.
Secondary transform process
Secondary transform matrix derivation process
2.11 Inverse quantization clipping in HEVC
In HEVC, the scaled transform coefficient d' is calculated as d' = Clip3(coeffMin, coeffMax, d), where d is the scaled transform coefficient before clipping.
For the luma component, coeffMin = CoeffMinY and coeffMax = CoeffMaxY. For the chroma components, coeffMin = CoeffMinC and coeffMax = CoeffMaxC, where
CoeffMinY = -(1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15))
CoeffMinC = -(1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15))
CoeffMaxY = (1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15)) - 1
CoeffMaxC = (1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15)) - 1
extended_precision_processing_flag is a syntax element signaled in the SPS.
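The clipping bounds above can be sketched as follows (illustrative helper names; the same formula serves luma and chroma with the respective bit depth):

```python
def clip3(lo, hi, x):
    """HEVC-style Clip3: constrain x to the inclusive range [lo, hi]."""
    return lo if x < lo else hi if x > hi else x

def coeff_range(bit_depth, extended_precision_processing_flag):
    """Clipping bounds for scaled transform coefficients (Section 2.11):
    a 15-bit range normally, or Max(15, BitDepth + 6) bits when the
    extended-precision flag is set."""
    bits = max(15, bit_depth + 6) if extended_precision_processing_flag else 15
    return -(1 << bits), (1 << bits) - 1
```

For 8-bit video without extended precision this gives the familiar [-32768, 32767] range.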
2.12 affine Linear weighted Intra prediction (ALWIP, a.k.a. matrix based Intra prediction, MIP)
In some embodiments, two tests are conducted. In test 1, ALWIP is designed with a memory restriction of 8 Kbytes and at most 4 multiplications per sample. Test 2 is similar to test 1, but further simplifies the design in terms of memory requirement and model architecture.
* A single set of matrices and offset vectors is used for all block shapes.
* The number of modes is reduced to 19 for all block shapes.
* The memory requirement is reduced to 5760 10-bit values, i.e., 7.20 Kbytes.
* Linear interpolation of the predicted samples is carried out in a single step per direction, replacing the iterative interpolation of the first test.
2.13 sub-block transformations
For an inter-predicted CU with cu_cbf equal to 1, the cu_sbt_flag may be signaled to indicate whether the whole residual block or only a sub-part of the residual block is decoded. In the former case, inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with an inferred adaptive transform and the other part of the residual block is zeroed out. SBT is not applied to the combined intra-inter mode.
In the sub-block transform, position-dependent transforms are applied to the luma transform blocks in SBT-V and SBT-H (the chroma TB always uses DCT-2). The two positions of SBT-H and SBT-V are associated with different core transforms. More specifically, the horizontal and vertical transforms for each SBT position are specified in Fig. 15. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the corresponding transform is set to DCT-2. Therefore, the sub-block transform jointly specifies the TU tiling, cbf, and the horizontal and vertical transforms of the residual block, which may be considered a syntax shortcut for the case where the major residual of the block is at one side of the block.
FIG. 15 is a diagram of sub-block transform modes SBT-V and SBT-H.
3. Examples of problems addressed by embodiments
The current design has the following problems:
(1) The clipping and shift/rounding operations in the MTS/RST may not be optimal.
(2) RST applied on two adjacent 4x4 blocks can be expensive.
(3) RST can be performed in different ways for different color components.
(4) RST may not work well for screen content codecs.
(5) The interaction between RST and other codec tools is unclear.
(6) The transformation matrix of RST can be stored more efficiently.
(7) How to apply the quantization matrix on the RST is not clear.
4. Example embodiments and techniques
The embodiments listed below should be considered as examples for explaining the general concept. These examples should not be construed narrowly. Furthermore, the embodiments may be combined in any manner.
In the following description, the coded information may include prediction mode (e.g., intra/inter/IBC mode), motion vector, reference picture, inter prediction direction, intra prediction mode, CIIP (combined intra-inter prediction) mode, ISP mode, affine intra mode, adopted transform core, transform skip flag, etc., i.e., information required when encoding a block.
In the discussion that follows, SatShift(x, n) is defined as

SatShift(x, n) = (x + offset0) >> n, if x >= 0
SatShift(x, n) = -((-x + offset1) >> n), if x < 0
Shift (x, n) is defined as Shift (x, n) = (x + offset 0) > > n.
In one example, offset0 and/or offset1 are set to (1 << n) >> 1 or (1 << (n-1)). In another example, offset0 and/or offset1 are set to 0.
In another example, offset0 = offset1 = ((1 << n) >> 1) - 1 or ((1 << (n-1))) - 1.
Clip3(min, max, x) is defined as

Clip3(min, max, x) = min, if x < min
Clip3(min, max, x) = max, if x > max
Clip3(min, max, x) = x, otherwise
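These helper definitions can be sketched in code as follows (using one of the rounding-offset choices named above; the defaults are our selection among the listed options):

```python
def shift(x, n, offset0=None):
    """Shift(x, n) = (x + offset0) >> n; offset0 defaults to (1 << n) >> 1."""
    if offset0 is None:
        offset0 = (1 << n) >> 1
    return (x + offset0) >> n

def sat_shift(x, n, offset0=None, offset1=None):
    """Sign-symmetric shift: rounds the magnitude, then restores the sign,
    matching the two-branch SatShift definition above."""
    if offset0 is None:
        offset0 = (1 << n) >> 1
    if offset1 is None:
        offset1 = (1 << n) >> 1
    return (x + offset0) >> n if x >= 0 else -((-x + offset1) >> n)
```

Unlike a plain arithmetic right shift, SatShift treats positive and negative inputs symmetrically, so SatShift(-x, n) equals -SatShift(x, n) when offset0 and offset1 are equal.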
1. After the inverse RST, the output value should be clipped to the range [MinCoef, MaxCoef], inclusive, where MinCoef and/or MaxCoef are two integer values that may be variable.
a. In one example, suppose the de-quantized coefficients are clipped to the range [QMinCoef, QMaxCoef], inclusive; then MinCoef may be set equal to QMinCoef and/or MaxCoef may be set equal to QMaxCoef.
b. In one example, minCoef and/or MaxCoef may depend on the color component.
i. In one example, minCoef and/or MaxCoef may depend on the bit depth of the corresponding color component.
c. In one example, minCoef and/or MaxCoef may depend on the shape (e.g., square or non-square) and/or block size of the block.
d. In one example, the values of MinCoef and/or MaxCoef or the selection of candidate values may be signaled, such as in SPS, PPS, slice header/slice group header/CTU/CU.
e. In one example, for the luminance component, minCoef and/or MaxCoef may be derived as:
MinCoef = -(1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15))
MaxCoef = (1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15)) - 1
where BitDepthY is the bit depth of the luma component, and extended_precision_processing_flag may be signaled, such as in the SPS.
f. In one example, for a chroma component, MinCoef and/or MaxCoef may be derived as:
MinCoef = -(1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15))
MaxCoef = (1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15)) - 1,
where BitDepthC is the bit depth of the chroma components, and extended_precision_processing_flag may be signaled, such as in the SPS.
g. In some embodiments, MinCoef is -(1 << 15) and MaxCoef is (1 << 15) - 1.
h. In one example, a conformant bitstream should satisfy that the transform coefficients after the forward RST are within a given range.
2. It is proposed that the way the forward RST and/or inverse RST are applied to an M×N sub-block of coefficients may depend on the number of sub-blocks to which the forward RST and/or inverse RST are applied, e.g., M = N = 4.
a. In one example, the zeroing range may depend on the sub-block index to which RST is applied.
i. Alternatively, the zeroing range may depend on the number of sub-blocks to which RST is applied.
b. In one example, when the forward RST and/or inverse RST are applied to S sub-blocks in the whole coefficient block (where S > 1, e.g., S = 2), the way the forward RST and/or inverse RST are applied to the first sub-block of coefficients may differ from the way they are applied to the second sub-block of coefficients. For example, the first M×N sub-block may be the top-left M×N sub-block.
i. In one example, the nonZeroSize described in Section 2.10 may be different for a first M×N sub-block of coefficients (denoted nonZeroSize0) and a second M×N sub-block of coefficients (denoted nonZeroSize1).
1) In one example, nonZeroSize0 may be greater than nonZeroSize1. For example, nonZeroSize0=16 and nonZeroSize1=8.
ii. In one example, the nonZeroSize described in section 2.10 may be different when only one M × N sub-block is to have the forward RST and/or inverse RST applied than when more than one M × N sub-block is to have them applied.
1) In one example, nonZeroSize may be equal to 8 if there is more than one M × N sub-block to which the forward RST and/or inverse RST are to be applied.
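A minimal sketch of the sub-block-count-dependent zeroing in item 2; the function names are hypothetical, and the values 16 and 8 follow the examples given above (nonZeroSize0 = 16, nonZeroSize1 = 8 or nonZeroSize = 8 when more than one sub-block is processed).

```python
def non_zero_size(num_subblocks):
    # Assumed rule from item 2: keep 16 coefficients when a single 4x4
    # sub-block is processed, 8 when more than one sub-block is processed.
    return 16 if num_subblocks == 1 else 8

def zero_out(coeffs_1d, num_subblocks):
    # Keep the first nonZeroSize coefficients in scan order, zero the rest.
    n = non_zero_size(num_subblocks)
    return coeffs_1d[:n] + [0] * (len(coeffs_1d) - n)
```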
3. It is proposed that if the current block size is 4 × H or W × 4, where H > 8 and W > 8, then the forward RST and/or inverse RST are applied to only one M × N sub-block of coefficients (such as the top-left M × N sub-block). For example, M = N = 4.
a. In one example, if H > T1 and/or W > T2 (e.g., T1 = T2 = 16), then the forward RST and/or inverse RST are applied to only one M × N sub-block of coefficients.
b. In one example, if H < T1 and/or W < T2 (e.g., T1 = T2 = 32), then the forward RST and/or inverse RST are applied to only one M × N sub-block of coefficients.
c. In one example, for all H > 8 and/or W > 8, the forward RST and/or inverse RST are applied to only one M × N sub-block of coefficients.
d. In one example, if the current block size is M × H or W × N, where H >= N and W >= M (e.g., M = N = 4), then the forward RST and/or inverse RST are applied to only one M × N sub-block (such as the top-left M × N sub-block).
4. RST can be applied to non-square regions. Assume that the region size is denoted K × L, where K is not equal to L.
a. Alternatively, or in addition, zeroing may be applied to the transform coefficients after the forward RST such that the maximum number of non-zero coefficients is satisfied.
i. In one example, if a transform coefficient is located outside the top-left M × M region (where M is not greater than K and M is not greater than L), the transform coefficient may be set to 0.
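Item 4.a.i can be illustrated with a small sketch that zeroes every coefficient of a K × L region outside its top-left M × M corner. The helper name is hypothetical, and the region is assumed to be a row-major list of rows.

```python
def zero_outside_top_left(block, m):
    # block: K rows x L columns of coefficients; zero everything outside
    # the top-left m x m region (m <= K and m <= L per item 4.a.i).
    k, l = len(block), len(block[0])
    return [[block[r][c] if r < m and c < m else 0 for c in range(l)]
            for r in range(k)]
```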
5. It is proposed that the coefficients in two adjacent M × N sub-blocks may be covered by a single forward RST and/or inverse RST. For example, M = N = 4.
a. In one example, one or more of the following operations may be performed at an encoder. The operations may be performed in sequence.
i. The coefficients in two adjacent M × N sub-blocks are rearranged into a 1-D vector having 2 × M × N elements.
ii. A forward RST with a transformation matrix having 2 × M × N columns and M × N rows (or M × N columns and 2 × M × N rows) is applied to the 1-D vector.
iii. The transformed 1-D vector having M × N elements is rearranged into a first M × N sub-block (such as the top-left sub-block).
iv. All coefficients in the second M × N sub-block may be set to zero.
b. In one example, one or more of the following operations may be performed at a decoder. The operations may be performed in sequence.
i. The coefficients in a first M × N sub-block (such as the top-left sub-block) are rearranged into a 1-D vector having M × N elements.
ii. An inverse RST with a transformation matrix having M × N columns and 2 × M × N rows (or 2 × M × N columns and M × N rows) is applied to the 1-D vector.
iii. The transformed 1-D vector having 2 × M × N elements is rearranged into two adjacent M × N sub-blocks.
c. In one example, a block may be divided into K (K > 1) sub-blocks, and both primary and secondary transforms may be performed at the sub-block level.
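The encoder and decoder operation sequences in items 5.a and 5.b can be sketched as plain matrix-vector products. This is an illustrative sketch (function names are hypothetical, sub-blocks are passed already flattened, and any matrices used with it are toy examples, not real RST matrices).

```python
def forward_rst_two_subblocks(sb0, sb1, T):
    # Encoder side (item 5.a): sb0, sb1 are flattened M*N coefficient lists
    # of two adjacent sub-blocks; T has M*N rows and 2*M*N columns.
    v = sb0 + sb1                                      # step i: 1-D vector
    out = [sum(T[r][c] * v[c] for c in range(len(v)))  # step ii: forward RST
           for r in range(len(T))]
    return out, [0] * len(sb1)                         # steps iii-iv

def inverse_rst_two_subblocks(sb0, Tinv):
    # Decoder side (item 5.b): Tinv has 2*M*N rows and M*N columns.
    out = [sum(Tinv[r][c] * sb0[c] for c in range(len(sb0)))
           for r in range(len(Tinv))]                  # steps i-ii
    n = len(sb0)
    return out[:n], out[n:]                            # step iii: two sub-blocks
```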
6. The zeroing range (e.g., nonZeroSize described in section 2.10) may depend on the color components.
a. In one example, the range may be different for luma and chroma components for the same block size.
7. The zeroing range (e.g., nonZeroSize described in section 2.10) may depend on the codec information.
a. In one example, it may depend on the codec mode, such as intra mode or non-intra mode.
b. In one example, it may depend on the codec mode, such as intra mode or inter mode or IBC mode.
c. In one example, it may depend on reference picture/motion information.
8. It is suggested that the zeroing range (e.g., nonZeroSize described in section 2.10) for a particular block size may depend on the quantization parameter (QP).
a. In one example, assume that nonZeroSize is equal to nonZeroSizeA when QP is equal to QPA, and nonZeroSize is equal to nonZeroSizeB when QP is equal to QPB. If QPA is not smaller than QPB, then nonZeroSizeA is not larger than nonZeroSizeB.
b. Different transform/inverse transform matrices may be used for different nonZeroSize.
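Item 8's monotonicity constraint (a larger QP must never yield a larger nonZeroSize) can be illustrated with an assumed QP-to-nonZeroSize table. The thresholds and sizes below are hypothetical, not taken from the patent.

```python
# Hypothetical QP -> nonZeroSize mapping: each entry is (max QP, size).
QP_THRESHOLDS = [(30, 16), (45, 8)]

def non_zero_size_for_qp(qp):
    # Return the nonZeroSize for the first threshold the QP falls under;
    # QPs above all thresholds get the smallest (assumed) size.
    for qp_max, size in QP_THRESHOLDS:
        if qp <= qp_max:
            return size
    return 4

# Sanity check of item 8.a: nonZeroSize is non-increasing in QP.
sizes = [non_zero_size_for_qp(q) for q in range(64)]
assert all(a >= b for a, b in zip(sizes, sizes[1:]))
```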
9. It is proposed that the zeroing range (e.g., nonZeroSize described in section 2.10) may be signaled, such as in the SPS, PPS, picture header, slice header, CTU row, CTU, CU, or any video data unit.
a. Alternatively, multiple ranges may be defined, and an indication of which candidate nonZeroSize is selected may be signaled, such as in the SPS, PPS, picture header, slice header, CTU row, CTU, or CU.
10. Whether and/or how RST is applied may depend on the color format, and/or the use of separate plane coding, and/or the color components.
a. In one example, RST may not be applied to chroma components (such as Cb and/or Cr).
b. In one example, if the color format is 4.
c. In one example, RST may not be applied to the chroma components if separate plane coding is used.
d. In one example, nonZeroSize for a particular block size may depend on the color components.
i. In one example, for the same block size, the nonZeroSize for a chroma component may be smaller than the nonZeroSize for the luma component.
11. It is proposed that RST control information (such as whether RST is applied and/or which set of transform matrices is selected) can be signaled separately for the luma and chroma components when they are coded with a single coding tree structure.
12. Whether and how the RST is applied may depend on the coding information (such as the coding mode) of the current block and/or the neighboring blocks.
a. In one example, the RST cannot be used for one or more particular intra prediction modes.
i. For example, RST cannot be used for LM mode.
ii. For example, RST cannot be used in LM-T mode.
iii. For example, RST cannot be used in LM-A mode.
iv. For example, RST cannot be used for wide-angle intra prediction mode.
v. For example, RST cannot be used for the BDPCM mode and/or DPCM mode and/or RBDPCM mode.
vi. For example, RST cannot be used in the ALWIP mode.
vii. For example, RST cannot be used for certain specific angular intra prediction modes (such as DC, planar, vertical, horizontal, etc.).
viii. For example, RST can be used for the luma component but not for the chroma component in LM mode and/or LM-T mode and/or LM-A mode.
ix. For example, RST may not be used for the chroma component when joint chroma residual coding is applied.
b. If RST cannot be applied, the syntax element indicating RST-related information in the current block may not be signaled.
13. It is proposed to apply RST to blocks that are not intra coded.
a. In one example, RST may be applied to the inter-coded block.
b. In one example, RST may be applied to Intra Block Copy (IBC) coded blocks.
c. In one example, the RST can be applied to a block that is coded with combined inter-frame intra prediction (CIIP).
14. It is suggested that RST can be controlled at different levels.
a. For example, information indicating whether RST is applicable (such as a control flag) may be signaled in the PPS, slice header, picture header, slice group header, slice, CTU row, or CTU.
b. Whether RST is applicable may depend on the profile/level/tier of the standard.
15. It is suggested that whether to apply position-dependent intra prediction combination (PDPC) may depend on whether RST is applied.
a. In one example, if the current block has RST applied, then PDPC may not be applied.
b. In one example, if the current block has RST applied, PDPC may be applied.
c. Alternatively, whether RST is applicable may depend on whether PDPC is applicable or not.
i. In one example, when PDPC is applied, RST is not applied.
ii. If RST cannot be applied, the syntax element indicating RST-related information in the current block may not be signaled.
16. It is proposed that whether filtering is applied to neighboring samples for intra prediction may depend on whether RST is applied.
a. In one example, if the current block has RST applied, the neighboring samples may not be filtered.
b. In one example, if the current block has RST applied, the neighboring samples may be filtered.
c. Alternatively, whether to apply RST may depend on whether neighboring samples used for intra prediction are filtered.
i. In one example, RST is not applied when neighboring samples for intra prediction are filtered.
ii. In one example, RST is not applied when neighboring samples for intra prediction are not filtered.
iii. If RST cannot be applied, the syntax element indicating RST-related information in the current block may not be signaled.
17. It is suggested that RST can be applied when the current block is coded with transform skip.
a. For example, the primary transform is skipped, but the secondary transform may still be applied.
b. The quadratic transform matrix used in the transform skip mode may be different from the quadratic transform matrix used in the no transform skip mode.
18. It is proposed that the transformation matrix for RST can be stored with a bit width of less than 8. For example, the transformation matrix for RST may be stored at bit widths of 6 or 4.
19. It is proposed that the transformation matrix for RST can be stored in a predictive manner.
a. In one example, a first element in the first transformation matrix of the RST can be predicted by a second element in the first transformation matrix of the RST.
i. For example, the difference between two elements may be stored.
For example, the difference may be stored at a bit width of less than 8, such as 6 or 4.
b. In one example, a first element in a first transformation matrix of the RST can be predicted by a second element in a second transformation matrix of the RST.
i. For example, the difference between two elements may be stored.
ii. For example, the difference may be stored with a bit width of less than 8, such as 6 or 4.
20. It is proposed that a first transformation matrix of RST can be derived from a second transformation matrix of RST.
a. In one example, a portion of elements of a second transformation matrix of the RST can be picked to construct a first transformation matrix of the RST.
b. In one example, a first transformation matrix of the RST is derived by rotating or flipping all or a portion of a second transformation matrix of the RST.
c. In one example, a first transformation matrix of RST is derived by downsampling or upsampling a second transformation matrix of RST.
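Items 18-19 (reduced bit width and predictive storage of matrix elements) can be sketched as storing the first element plus successive differences clamped to a signed reduced-width range. The function names and the 6-bit default are illustrative assumptions; clamping makes the scheme lossy whenever a difference exceeds the reduced range.

```python
def pack_matrix_predictive(row, bits=6):
    # Store row[0] plus successive differences, each clamped to a signed
    # `bits`-wide value (sketch of items 18-19; lossy if a delta overflows).
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    deltas, prev = [], row[0]
    for x in row[1:]:
        d = min(max(x - prev, lo), hi)
        deltas.append(d)
        prev = prev + d
    return row[0], deltas

def unpack_matrix_predictive(first, deltas):
    # Reconstruct the row by accumulating the stored differences.
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out
```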
21. It is proposed that the syntax element indicating RST-related information in the current block may be signaled before signaling the residual (which may be transformed).
a. In one example, the signaling of RST-related information may not depend on the number of non-zero or zero coefficients counted when parsing the residual.
b. In one example, non-zero or zero coefficients may not be counted when parsing the residual.
c. In one example, the coded block flag (cbf) of a sub-block that is set to all zeros by RST may not be signaled and may be inferred to be 0.
d. In one example, the significance flag for a coefficient set to 0 by RST may not be signaled and may be inferred to be 0.
e. The scanning order for parsing the residual block may depend on whether and how RST is applied.
i. In one example, the coefficients set to zero by RST may not be scanned.
f. The arithmetic coding context used for parsing the residual block may depend on whether and how RST is applied.
22. It is suggested that whether and how to apply the quantization matrix may depend on whether and how RST is applied.
a. In one example, different quantization matrices may be applied depending on whether RST is applied.
b. Alternatively, whether and how RST is applied may depend on whether and how the quantization matrix is applied.
i. In one example, RST may not be applied when applying the quantization matrix to the block.
23. It is suggested that RST can be applied to the quantized coefficients/residuals.
a. In one example, RST can be applied to the residual when using transform skipping.
b. In one example, RST may be applied to the quantized transformed coefficients of the block.
24. It is proposed to apply RST to the sub-block transform block.
a. In one example, RST may be applied to the top left coefficient generated by the sub-block transform.
Fig. 16 is a block diagram of the video processing apparatus 1600. The apparatus 1600 may be used to implement one or more of the methods described herein. The apparatus 1600 may be implemented in a smartphone, tablet, computer, Internet of Things (IoT) receiver, etc. The apparatus 1600 may include one or more processors 1602, one or more memories 1604, and video processing hardware 1606. The processor(s) 1602 may be configured to implement one or more of the methods described in this document. The memory(s) 1604 may be used to store data and code for implementing the methods and techniques described herein. The video processing hardware 1606 may be used to implement some of the techniques described in this document in hardware circuits.
Fig. 17 is a flow diagram of an example method 1700 of video processing. Method 1700 includes determining (1702) a constraint rule for selectively applying a quadratic transform having a reduced size during a transition between a bitstream representation of a current video block and pixels of the current video block. The method 1700 includes performing (1704) the transformation by applying a quadratic transformation having a reduced size according to the constraint rule. The quadratic transform having a reduced size has a size reduced from the size of the current video block. During the conversion, a quadratic transform with a reduced size is applied in a specific order together with the main transform.
Additional embodiments and techniques are described in the following examples.
1. A video processing method, comprising: determining a constraint rule for selectively applying a quadratic transform having a reduced size during a transition between a bitstream representation of a current video block and pixels of the current video block; and performing the conversion by applying a quadratic transform having a reduced size according to the constraint rule; wherein the secondary transform having the reduced size has a size reduced from the size of the current video block, and wherein during the conversion, the secondary transform having the reduced size is applied in a particular order with the primary transform.
2. The method of example 1, wherein the converting comprises encoding the current video block into a bitstream representation, and wherein the particular order comprises first applying a primary transform in a positive direction, then selectively applying a secondary transform having a reduced size in the positive direction, and then quantizing an output of the secondary transform having the reduced size in the positive direction.
3. The method of example 1, wherein converting comprises decoding the current video block from the bitstream representation, and wherein the particular order comprises first applying inverse quantization to the bitstream representation, then selectively applying a secondary transform having a reduced size in an inverse direction, and then applying a primary transform to an output of the secondary transform having the reduced size in the inverse direction.
4. The method of any of examples 1-3, wherein the constraint rule specifies clipping the output range of the quadratic transform having a reduced size in an inverse direction to [MinCoef, MaxCoef], inclusive, wherein MinCoef and/or MaxCoef are two integer values that are a function of a condition of the current video block.
5. The method of example 4, wherein the condition of the current video block is a type of color or luminance component represented by the current video block.
6. The method of example 1, wherein the constraint rule specifies applying a quadratic transform having a reduced size to one or more MxN subblocks of the current video block and zeroing remaining subblocks of the current video block.
7. The method of example 1, wherein the constraint rule specifies that a quadratic transform having a reduced size be applied differently to different sub-blocks of the current video block.
8. The method of any of examples 1-5, wherein, since the current video block has a size of 4×H or W×4, the constraint rule specifies that a quadratic transform having a reduced size is applied to exactly one M×N sub-block of the current video block, where H is a height in integer pixels and W is a width in integer pixels.
9. The method of example 8, wherein H >8 or W >8.
10. The method of any of examples 1-9, wherein the current video block is a non-square region of the video.
11. The method according to example 2 or 3, wherein the constraint rule specifies zeroing the transform coefficients of the primary transform in a positive direction or padding zero coefficients to the output of the secondary transform in a reverse direction.
Additional embodiments of examples 1-5 are described in section 4, item 1. Additional embodiments of examples 6-7 are described in section 4, item 2. Additional embodiments of examples 8-9 are described in section 4, item 3. Additional embodiments of examples 10 to 11 are described in section 4, item 4.
12. A video processing method, comprising: determining a constraint rule for selectively applying a quadratic transform having a reduced size during transitions between bit stream representations of the current video block and the neighboring video region and pixels of the current video block and pixels of the neighboring video region; and performing the conversion by applying a quadratic transform having a reduced size according to the constraint rule; wherein the secondary transform having the reduced size has a size reduced from the size of the current video block and the neighboring video region, and wherein during the transition the secondary transform having the reduced size is applied in a particular order together with the primary transform.
13. The method of example 12, wherein the adjacent video region comprises a top-left block of the current video block.
14. The method of example 12, wherein the current video block and the neighboring video region correspond to sub-blocks of a parent video block.
Additional embodiments of examples 12-14 are described in section 4, item 5.
15. A video processing method, comprising: determining a zeroing rule for selectively applying a quadratic transform having a reduced size during conversion of a bitstream representation of the current video block to pixels of the current video block; and performing the conversion by applying a quadratic transform having a reduced size according to the zeroing rule; wherein the quadratic transform having a reduced size has a size reduced from the size of the current video block; wherein the zeroing rule specifies a maximum number of coefficients used by the quadratic transform having a reduced size.
16. The method of example 15, wherein the maximum number of coefficients is a function of a component identification of the current video block.
17. The method of example 16, wherein the maximum number of coefficients is different for luminance video blocks and chrominance video blocks.
18. The method according to any one of examples 15 to 17, wherein the zeroing rule specifies a zeroing range that is a function of the codec information of the current video block.
19. The method of any of examples 15 to 17, wherein the zeroing rule specifies a zeroing range that is a function of a quantization parameter of the current video block.
20. The method according to any of examples 15 to 19, wherein the zeroing range is indicated in the bitstream representation by a field comprised at a sequence parameter set level, or a picture header, or a slice group header, or a codec tree element row, or a codec tree element, or a codec element, or at a video data element level.
Additional embodiments of examples 15-17 are described in section 4, item 6. Other embodiments of example 18 are described in section 4, item 7. Other embodiments of example 19 are described in section 4, item 8. Other embodiments of example 20 are described in section 4, item 9.
21. A video processing method, comprising: determining conditions for selectively applying a quadratic transform having a reduced size during a transition between a bitstream representation of a current video block and pixels of the current video block; and performing a conversion by applying a quadratic conversion having a reduced size according to the condition; wherein the quadratic transform having a reduced size has a size reduced from the size of the current video block; and wherein the condition is signaled in a bit stream representation.
22. The method of example 21, wherein the condition is a color format or use of a split plane codec or based on color identification of the current video block.
Additional embodiments of examples 21-22 are described in section 4, item 10.
23. The method of any of examples 21 to 22, wherein the condition is signaled in a bitstream representation for the chroma component and the luma component, respectively.
Other embodiments of example 23 are described in section 4, item 11.
24. The method according to any one of examples 21 to 23, wherein the condition depends on codec information of the current video block and the neighboring video area.
25. The method of example 24, wherein the condition excludes application to a current video block that is coded using a particular intra prediction mode.
Additional embodiments of examples 24-25 are described in section 4, item 12.
26. The method of example 24, wherein the condition specifies an application to a current video block that is inter-coded.
27. The method of example 24, wherein the condition specifies an application to a current video block that is coded using intra block copy mode.
Additional embodiments of examples 26 to 27 are described in section 4, item 13.
28. The method of example 21, wherein the condition is signaled at a level in the bitstream representation such that all blocks within the level comply with the condition, wherein the level is a sequence parameter set level, or a picture header, or a slice group header, or a codec tree element row, or a codec tree element, or a codec element or a video data element level.
Other embodiments of example 28 are described in section 4, item 14.
29. The method of example 21, wherein the condition is to codec the current video block using a transform skip mode.
Other embodiments of example 29 are described in section 4, item 17.
30. A video processing method, comprising: selectively applying a quadratic transform having a reduced size during a transition between a bitstream representation of a current video block and pixels of the current video block, and performing the transition by applying the quadratic transform having the reduced size according to a condition; wherein the quadratic transform having a reduced size has a size reduced from the size of the current video block; and wherein the converting comprises selectively applying position-dependent intra prediction combination (PDPC) based on the coexistence rule.
31. The method of example 30, wherein the coexistence rule does not include applying PDPC to the current video block due to applying the quadratic transform.
32. The method of example 30, wherein the coexistence rule specifies that PDPC be applied to the current video block as a result of applying a quadratic transform.
33. The method of example 30, wherein selectively applying the quadratic transform is performed on a current video block using the PDPC.
Additional embodiments of examples 30 to 33 are described in section 4, item 15.
34. A video processing method, comprising: applying a quadratic transform having a reduced size during a conversion between a bitstream representation of a current video block and pixels of the current video block, and performing the conversion by applying the quadratic transform having the reduced size according to a condition; wherein the quadratic transform having a reduced size has a size reduced from the size of the current video block; and wherein the application controls the use of adjacent samples for intra prediction during the transition.
Other embodiments of example 34 are described in section 4, item 16.
35. A video processing method, comprising: selectively applying a quadratic transform having a reduced size during a conversion between a bitstream representation of a current video block and pixels of the current video block, and performing the conversion by applying the quadratic transform having the reduced size according to a condition; wherein the quadratic transform having a reduced size has a size reduced from the size of the current video block; and wherein the selective application controls the use of quantization matrices during the conversion.
36. The method of example 35, wherein the use of the quantization matrix occurs only as a result of applying a quadratic transform.
Additional embodiments of examples 35 to 36 are described in section 4, item 22.
37. The method of any one of examples 1 to 36, wherein the primary and secondary transforms are stored as transform matrices having a bit width of less than 8.
38. The method of any of examples 1 to 36, wherein the primary transform and the secondary transform are stored as a predictive transform matrix.
39. The method according to any one of examples 1 to 36, wherein the primary transformation is derivable from the secondary transformation using a first rule, or wherein the secondary transformation is derivable from the primary transformation using a second rule.
40. The method of any of examples 1 to 36, wherein the bitstream representation comprises information on a secondary transform or a primary transform prior to residual information of the current video block.
Additional embodiments of examples 37-40 are described in section 4, items 18, 19, 20, and 21.
41. A video processing apparatus comprising a processor configured to implement one or more of examples 1 to 40.
42. A computer-readable medium having code stored thereon, which, when executed by a processor, causes the processor to implement the method set forth in any one or more of examples 1-40.
It should be appreciated that the disclosed techniques may be implemented in a video encoder or decoder to improve compression efficiency using techniques that include using a quadratic transform of a reduced size.
Fig. 18 is a block diagram illustrating an example video processing system 1800 in which various techniques disclosed herein may be implemented. Various embodiments may include some or all of the components of the system 1800. The system 1800 may include an input 1802 for receiving video content. The video content may be received in a raw or uncompressed format, such as 8- or 10-bit multi-component pixel values, or may be in a compressed or encoded format. The input 1802 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces (such as Ethernet, Passive Optical Network (PON), etc.) and wireless interfaces (such as Wi-Fi or cellular interfaces).
The system 1800 may include a codec component 1804 that may implement various codec or encoding methods described in this document. The codec component 1804 may reduce the average bit rate of the video from the input 1802 to the output of the codec component 1804 to produce a codec representation of the video. Thus, codec techniques are sometimes referred to as video compression or video transcoding techniques. The output of the codec component 1804 may be stored or transmitted via a connected communication, as represented by the component 1806. The stored or communicated bitstream (or codec) representation of the video received at the input 1802 may be used by the component 1808 to generate pixel values or displayable video that is sent to a display interface 1810. The process of generating user-viewable video from the bitstream representation is sometimes referred to as video decompression. Additionally, while certain video processing operations are referred to as "codec" operations or tools, it should be understood that codec tools or operations are used at the encoder, and the corresponding decoding tools or operations that reverse the results of the codec will be performed by the decoder.
Examples of a peripheral bus interface or a display interface may include a Universal Serial Bus (USB) or a High Definition Multimedia Interface (HDMI) or DisplayPort, etc. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be implemented in various electronic devices, such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and/or video display.
Fig. 19 is a flow diagram of an example method 1900 of video processing in accordance with the present technology. The method 1900 includes, at operation 1910, determining, for a transition between a block of video and a bitstream representation of the video, that output values from the inverse quadratic transform having the reduced size are constrained within a range [min, max], inclusive. The inverse quadratic transform is applied to the block between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block; min and max are integer values. The method 1900 includes, at operation 1920, performing the conversion based on the determination. In some embodiments, the inverse quadratic transform having a reduced size comprises an inverse low-frequency non-separable transform, wherein the low frequencies correspond to the reduced size.
In some embodiments, the coefficients after the dequantization step are constrained to [ qmin, qmax ], including qmin, qmax, qmin and qmax being positive integers. At least one of (1) min equals qmin, or (2) max equals qmax is satisfied. In some embodiments, the range is based on the color components of the block. In some embodiments, at least one of min or max is based on a bit depth of the color component. In some embodiments, the range is based on the shape of the block. In some embodiments, the range is based on whether the block has a square shape or a non-square shape. In some embodiments, the range is based on the size of the block. In some embodiments, at least one of min or max is signaled in the bitstream representation. In some embodiments, the range is signaled in a sequence parameter set, a picture parameter set, a slice header, a slice group header, a codec tree unit, or a codec unit.
In some embodiments, for the luma component of the block, min is -(1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15)) and max is (1 << (extended_precision_processing_flag ? Max(15, BitDepthY + 6) : 15)) - 1. BitDepthY is the bit depth of the luma component, and extended_precision_processing_flag is a variable signaled in the bitstream representation.
In some embodiments, for the chroma component of the block, min is -(1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15)) and max is (1 << (extended_precision_processing_flag ? Max(15, BitDepthC + 6) : 15)) - 1. BitDepthC is the bit depth of the chroma component, and extended_precision_processing_flag is a variable signaled in the bitstream representation.
In some embodiments, min is equal to -(1 << 15), and max is equal to (1 << 15) - 1.
In some embodiments, extended _ precision _ processing _ flag is signaled in the sequence parameter set. In some embodiments, the coefficients applicable to the block after the quadratic transform between the primary transform and the quantization step are limited in range.
Fig. 20 is a flow diagram of an example method 2000 of video processing in accordance with the present technology. The method 2000 includes, for a transition between a block of video and a bitstream representation of the video, determining a manner of applying a quadratic transform having a reduced size to sub-blocks of the block based on a number of sub-blocks to which the quadratic transform is applied, in operation 2010. The quadratic transform is applied to the blocks between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2000 further includes, at operation 2020, performing a conversion based on the determination.
In some embodiments, the quadratic transform having a reduced size comprises a low-frequency non-separable transform, wherein the low frequencies correspond to the reduced size. In some embodiments, the reduced size corresponds to the size of the sub-block.
In some embodiments, the sub-blocks have a size of 4 × 4. In some embodiments, the sub-block is associated with a sub-block index. Coefficients outside the non-zero range of the sub-block are set to zero, and the non-zero range is determined based on the sub-block index. In some embodiments, coefficients outside the non-zero range of the sub-block are set to zero. The non-zero range is determined based on the number of sub-blocks to which the quadratic transform is applicable.
In some embodiments, the number of sub-blocks to which the quadratic transform is applicable is greater than 1. The quadratic transform is applied to the first sub-block in a first manner and to the second sub-block in a second manner different from the first manner. In some embodiments, coefficients outside the first non-zero range of the first sub-block are set to zero. Coefficients outside a second non-zero range of the second sub-block are set to zero, and the first non-zero range is different from the second non-zero range. In some embodiments, the first non-zero range is greater than the second non-zero range. In some embodiments, the first non-zero range is 16 and the second non-zero range is 8.
In some embodiments, where the quadratic transform is applied to only one sub-block, coefficients outside the first non-zero range of that sub-block are set to zero. Where the quadratic transform is applied to multiple sub-blocks, coefficients outside the second non-zero range of the sub-blocks are set to zero. In some embodiments, the first non-zero range is different from the second non-zero range. In some embodiments, the second non-zero range is 8.
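The per-sub-block zero-out described above can be sketched as follows; the function names are hypothetical, the values 16 and 8 are the example non-zero ranges from the embodiments, and coefficients are assumed to be listed in scan order:

```python
def apply_nonzero_range(subblock_coeffs, nonzero_range):
    # Keep the first `nonzero_range` scan-order coefficients; zero the rest.
    return [c if i < nonzero_range else 0
            for i, c in enumerate(subblock_coeffs)]

def zero_out(subblocks, single_range=16, multi_range=8):
    # A single sub-block keeps up to 16 coefficients; when the quadratic
    # transform covers several sub-blocks, each keeps only the first 8.
    rng = single_range if len(subblocks) == 1 else multi_range
    return [apply_nonzero_range(sb, rng) for sb in subblocks]
```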
FIG. 21 is a flow diagram of another example method of video processing in accordance with the present technology. The method 2100 includes, in operation 2110, determining, for a conversion between a block of video and a bitstream representation of the video, that a quadratic transform having a reduced size is applicable to a single sub-block of the block, if the size of the block satisfies a condition. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2100 further includes, at operation 2120, performing a conversion based on the determination.
In some embodiments, the reduced size corresponds to the size of the sub-block. In some embodiments, the single sub-block to which the quadratic transform applies is the top-left sub-block of the current block. In some embodiments, the single sub-block has a size of M × N, M and N being positive integers. In some embodiments, M = N = 4. In some embodiments, the condition specifies that the size of the block is 4 × H or W × 4, where H > 8 and W > 8. In some embodiments, at least one of (1) H > T1 or (2) W > T2 is satisfied, T1 and T2 being greater than 8. In some embodiments, T1 = T2 = 16. In some embodiments, at least one of (1) H < T1 or (2) W < T2 is satisfied, T1 and T2 being greater than 8. In some embodiments, T1 = T2 = 32. In some embodiments, the condition specifies that the size of the block is M × H or W × N, where H ≥ N and W ≥ M.
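One of the conditions above (a 4 × H or W × 4 block with H > 8 and W > 8) can be expressed as a small predicate; the function name is hypothetical and it covers only that one embodiment:

```python
def rst_single_subblock(width, height):
    # True when the quadratic transform should apply to only the
    # top-left 4x4 sub-block: the block is a narrow 4xH or Wx4 strip.
    return (width == 4 and height > 8) or (height == 4 and width > 8)
```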
FIG. 22 is a flow diagram of another example method of video processing in accordance with the present technology. The method 2200 includes, at operation 2210, determining, for a conversion between a block of video and a bitstream representation of the video, that a quadratic transform having a reduced size is applicable to a region in the block having a size of K × L. K and L are positive integers, and K is not equal to L. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2200 also includes, at operation 2220, performing a transformation based on the determination.
In some embodiments, the reduced size corresponds to the size of the region. In some embodiments, coefficients outside the non-zero range of the region are set to zero. In some embodiments, the non-zero range is represented as an upper left region in the block, the upper left region having a size of M × M, M being less than or equal to K and L.
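A sketch of the K × L region zero-out, assuming row-major coefficients and a hypothetical function name; only the top-left M × M coefficients survive:

```python
def zero_outside_topleft(region, m):
    # region: K rows of L coefficients; keep the top-left m x m block
    # (m <= K and m <= L per the text) and zero everything else.
    return [[c if r < m and col < m else 0
             for col, c in enumerate(row)]
            for r, row in enumerate(region)]
```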
FIG. 23 is a flow diagram of another example method of video processing in accordance with the present technology. The method 2300 includes, at operation 2310, determining a non-zero range based on characteristics of a block for a transition between the block of video and a bitstream representation of the video. A non-zero range corresponds to a range outside which coefficients associated with a quadratic transform having a reduced size are set to zero. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2300 also includes, at operation 2320, performing a conversion based on the determination.
In some embodiments, the characteristics of the block include a color component of the block. In some embodiments, the first non-zero range of the luma component of the block is different from the second non-zero range of the chroma component of the block. In some embodiments, the characteristics of the block include codec information of the block. In some embodiments, the codec information comprises information indicating whether the block is coded in intra mode or non-intra mode. In some embodiments, the codec information comprises information indicating whether the block is coded in intra mode or intra block copy mode. In some embodiments, the codec information comprises motion information such as reference pictures. In some embodiments, the characteristic of the block comprises a quantization parameter of the block. In some embodiments, the first non-zero range corresponds to a first quantization parameter and the second non-zero range corresponds to a second quantization parameter, and wherein the first non-zero range is less than or equal to the second non-zero range if the first quantization parameter is greater than or equal to the second quantization parameter.
In some embodiments, different non-zero ranges are associated with different transform matrices of the quadratic transform. In some embodiments, the non-zero range is signaled in the bitstream representation in a sequence parameter set, a picture header, a slice group header, a codec tree unit (CTU) row, a CTU, or a codec unit. In some embodiments, multiple non-zero ranges are applicable to the secondary transform, and a value indicating the selection of one of the multiple non-zero ranges is signaled in the bitstream representation in a sequence parameter set, a picture header, a slice group header, a codec tree unit (CTU) row, a CTU, or a codec unit.
In some embodiments, performing the conversion includes generating a bitstream representation based on the blocks of the video. In some embodiments, performing the conversion includes generating a block of video from the bitstream representation.
FIG. 24A is a flow diagram of an example method of video encoding in accordance with the present technology. The method 2400 includes, in operation 2410, determining that a quadratic transform having a reduced size is applicable to two adjacent sub-blocks of a block of a video. Each of the two adjacent sub-blocks has a size of M × N, M and N being positive integers. The quadratic transform is performed between the forward main transform and the quantization step. The reduced size is reduced from the size of the block. The method 2400 further includes, at operation 2420, generating a codec representation of the video based on the determination.
In some embodiments, the reduced size corresponds to the size of the two adjacent sub-blocks. In some embodiments, the method includes arranging the coefficients of the two adjacent sub-blocks into a one-dimensional vector having 2 × M × N elements. In some embodiments, the method includes obtaining M × N transformed elements by applying the quadratic transform to the one-dimensional vector using a transform matrix. The transform matrix has a first dimension of 2 × M × N elements and a second dimension of M × N elements. In some embodiments, the method includes rearranging the M × N transformed elements into a first sub-block of the two adjacent sub-blocks. In some embodiments, the method includes setting elements in a second sub-block of the two adjacent sub-blocks to zero. In some embodiments, both the forward primary transform and the secondary transform are performed at the sub-block level.
Fig. 24B is a flow diagram of an example method of video decoding in accordance with the present technology. The method 2450 includes, at operation 2460, determining that a quadratic transform having a reduced size is applicable to two adjacent sub-blocks of a block of the video. Each of the two adjacent sub-blocks has a size of M × N, M and N being positive integers. The quadratic transformation is performed between the inverse quantization step and the inverse main transformation. The reduced size is reduced from the size of the block. The method 2450 further includes, at operation 2470, generating a block of the video by parsing the codec representation of the video according to the determination.
In some embodiments, the reduced size corresponds to the size of the two adjacent sub-blocks. In some embodiments, the method includes arranging coefficients of a first sub-block of the two adjacent sub-blocks into a one-dimensional vector having M × N elements. In some embodiments, the method includes obtaining 2 × M × N transformed elements by applying the quadratic transform to the one-dimensional vector using a transform matrix. The transform matrix has a first dimension of M × N elements and a second dimension of 2 × M × N elements. In some embodiments, the method includes rearranging the 2 × M × N transformed elements into the two adjacent sub-blocks. In some embodiments, M = N = 4.
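The forward and inverse directions described in methods 2400 and 2450 amount to multiplying the vectorized coefficients by an (M·N) × (2·M·N) matrix and its transpose. The sketch below uses pure Python with tiny illustrative dimensions; all function names and the toy matrix are assumptions, not the normative transform:

```python
def matvec(matrix, vec):
    # matrix is a list of rows, each as long as vec.
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def forward_rst(coeffs_2mn, t):
    # t has M*N rows and 2*M*N columns: 2*M*N inputs -> M*N outputs,
    # which are then rearranged into the first sub-block.
    return matvec(t, coeffs_2mn)

def inverse_rst(coeffs_mn, t):
    # The inverse direction uses the transposed matrix:
    # M*N inputs -> 2*M*N outputs, rearranged into two sub-blocks.
    t_transposed = [list(col) for col in zip(*t)]
    return matvec(t_transposed, coeffs_mn)
```

With an orthonormal `t`, `inverse_rst(forward_rst(x, t), t)` reconstructs the component of `x` lying in the row space of `t`.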
In some embodiments, the quadratic transform having a reduced size comprises a low-frequency non-separable transform, the low frequency corresponding to the reduced size.
FIG. 25 is a flow diagram of another example method of video processing in accordance with the present technology. The method 2500 includes, at operation 2510, determining, for a transition between a block of video and a bitstream representation of the video, whether to apply a quadratic transform having a reduced size to the block based on a characteristic associated with the block according to a rule. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2500 includes, at operation 2520, performing a conversion based on the determination.
In some embodiments, the characteristics associated with a block include the codec information of the block or the codec information of neighboring blocks. In some embodiments, the rule specifies that in the event that the codec information indicates that the block or a neighboring block is coded in one or more particular codec modes, the quadratic transform is not applicable to the block. In some embodiments, the one or more particular codec modes include at least one of: a linear model (LM) mode, an LM-T mode, an LM-A mode, one or more wide-angle intra prediction modes, a block differential pulse-code modulation (BDPCM) mode, a differential pulse-code modulation (DPCM) mode, a residual-domain block differential pulse-code modulation (RBDPCM) mode, a matrix-based intra prediction (MIP) mode, or one or more angular intra prediction modes. In some embodiments, the rule specifies that in case the block is coded in the joint chroma residual coding mode, the quadratic transform does not apply to the chroma components of the block. Coding the block in the joint chroma residual coding mode includes determining a joint residual that is an average of residuals associated with the chroma components of the block. In some embodiments, the rules specify that the quadratic transform applies to the luma component of the block and not to a chroma component of the block that is coded in LM mode, LM-T mode, or LM-A mode.
In some embodiments, the characteristics associated with a block include coefficients or residuals of the block after a quantization or dequantization step. In some embodiments, the rule specifies that a secondary transform is applied to the residual if the block is coded using the transform skip mode. The transform skip mode is a mode that skips the forward or inverse main transform. In some embodiments, the rule specifies that a quadratic transform is applied to the quantized transform coefficients of the block.
In some embodiments, the characteristics associated with a block include whether the block is coded using intra coding tools. In some embodiments, the rule specifies that in the case of a block being coded using an inter-coding tool, a quadratic transform applies to the block. In some embodiments, the rule specifies that in the case of a block being coded using an intra-block-copy coding tool, a quadratic transform is applicable to the block. In some embodiments, the rule specifies that in the case of a block being coded using a combined inter-frame intra prediction coding tool, a quadratic transform is applicable to the block.
In some embodiments, the characteristics associated with the block include information associated with a chroma format of the block. In some embodiments, the rule specifies that the quadratic transform does not apply to the chroma components of the block. In some embodiments, the rule specifies that in the case where the chroma format of the block is 4. In some embodiments, the rules specify that in the case where the chroma components of the chroma format are separately coded, the quadratic transform does not apply to the chroma components of the block. In some embodiments, the rule specifies that a quadratic transform applies to the block. A non-zero range of the quadratic transform associated with the size of the block is determined based on the color components of the block, the non-zero range being a range outside of which coefficients of the block are set to zero. In some embodiments, for a same size block, a first non-zero range of chroma components of the block is smaller than a second non-zero range of luma components of the block.
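The mode-based rules above can be gathered into a single predicate. All mode labels and the exact exclusion set here are illustrative assumptions drawn from the embodiments, not a normative list:

```python
# Hypothetical mode labels; the exclusion set follows the modes named above.
EXCLUDED_MODES = {"LM", "LM-T", "LM-A", "BDPCM", "DPCM", "RBDPCM", "MIP"}

def secondary_transform_allowed(mode, is_chroma, joint_chroma_residual):
    if mode in EXCLUDED_MODES:
        return False
    # Joint chroma residual coding disables the transform for chroma only.
    if is_chroma and joint_chroma_residual:
        return False
    return True
```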
In some embodiments, a determination of whether a position-dependent intra prediction combining (PDPC) codec step is applicable to the block is made based on whether a quadratic transform is applicable. In some embodiments, in the case where a quadratic transform is applied to the block, no PDPC codec step is applied. In some embodiments, a PDPC codec step is applied in case a quadratic transform is applied to the block.
In some embodiments, the characteristic associated with the block includes whether a position-dependent intra prediction combining (PDPC) codec step is applicable to the block. In some embodiments, the rule specifies that no quadratic transform is applied to the block if the PDPC codec step applies. In some embodiments, whether to filter neighboring samples of a block for an intra-prediction coding step is determined based on whether a quadratic transform is applicable to the block. In some embodiments, where a quadratic transform is applied to the block, no neighboring samples are filtered. In some embodiments, where a quadratic transform is applied to the block, the neighboring samples are filtered.
In some embodiments, the characteristics associated with the block include whether neighboring samples of the block are filtered for an intra prediction codec step applied to the block. In some embodiments, the rule specifies that the quadratic transform does not apply in the case where the neighboring samples are filtered. In some embodiments, the rule specifies that the quadratic transform does not apply in the case where the neighboring samples are not filtered.
In some embodiments, the characteristic associated with the block includes whether the block is coded with a transform skip mode that skips the forward or inverse primary transform. In some embodiments, the block is coded with a transform skip mode, and wherein a quadratic transform is applied to the block. In some embodiments, a first transform matrix for a quadratic transform if the transform skip mode is enabled is different from a second transform matrix for the quadratic transform if the transform skip mode is disabled. In some embodiments, whether a quantization matrix is applicable to the block is determined based on whether a quadratic transform is applied. In some embodiments, the first quantization matrix is applied if a quadratic transform is applicable, and wherein a different second quantization matrix is applied if a quadratic transform is not applicable.
In some embodiments, the characteristic associated with a block includes whether a quantization matrix is applicable to the block. In some embodiments, the rule specifies that the quadratic transform is not applicable if a quantization matrix is applied. In some embodiments, the characteristic associated with a block includes whether a sub-block level transform is applicable to the block. In some embodiments, the rule specifies that the quadratic transform applies to coefficients of the top-left sub-block of the block generated by the sub-block level transform. In some embodiments, the order in which the residual block is scanned is determined based on whether a quadratic transform is applied to the block after the quantization or dequantization step. In some embodiments, coefficients set to zero by the quadratic transform are not scanned. In some embodiments, the arithmetic codec context used to resolve the residual block after the quantization or dequantization step is determined based on whether a quadratic transform is applied to the block.
In some embodiments, information related to the secondary transform is signaled in the bitstream representation at one or more levels including a picture parameter set, a slice header, a picture header, a slice group header, a slice, a codec tree unit row, or a codec tree unit. In some embodiments, whether the quadratic transform is applicable is based on the information signaled at the one or more levels. In some embodiments, the information is signaled separately for the luma component and the chroma components that are coded within the codec tree unit. In some embodiments, in the case that the quadratic transform is not applicable to the block, one or more syntax elements related to the quadratic transform are excluded from the bitstream representation of the block. In some embodiments, one or more syntax elements related to the quadratic transform are signaled prior to the quantized transform residual in the bitstream representation. In some embodiments, the one or more syntax elements are signaled independently of the number of coefficients determined when parsing the quantized residual. In some embodiments, the number of coefficients is not counted when the quantized residual is parsed. In some embodiments, a syntax flag indicating that all sub-blocks are set to zero by the quadratic transform is excluded from the bitstream representation, and the value of the syntax flag is inferred to be 0. In some embodiments, a syntax flag indicating that coefficients are set to zero by the quadratic transform is excluded from the bitstream representation, and the value of the syntax flag is inferred to be 0.
FIG. 26 is a flow diagram of another example method of video processing in accordance with the present technology. The method 2600 includes, at operation 2610, determining, for a conversion between a block of video and a bitstream representation of the video, a bit precision constraint for coefficients of one or more transform matrices of a quadratic transform having a reduced size that are applicable to the block. The quadratic transform is performed between the forward main transform and the quantization step or between the inverse quantization step and the inverse main transform. The reduced size is reduced from the size of the block. The method 2600 further includes, at operation 2620, performing the conversion based on the determination.
In some embodiments, the bit precision constraints include that coefficients of one or more transform matrices may be stored with a bit width of less than 8. In some embodiments, the bit precision constraints include that coefficients of the one or more transform matrices may be stored based on correlations between the one or more transform matrices. In some embodiments, a difference between a first element and a second element in the transformation matrix is stored, wherein the first element is derived based on the second element. In some embodiments, a difference between a first element in the first transformation matrix and a second element in the second transformation matrix is stored, wherein the first element is derived based on the second element. In some embodiments, the difference is represented by a bit width of less than 8. In some embodiments, the bit width is 6 or 4.
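The correlation-based storage above could look like the following sketch: one matrix is stored in full and a second one as signed deltas checked against a reduced bit width (6 here). The function names and the flat-list matrix representation are assumptions:

```python
def encode_deltas(base, other, bit_width=6):
    # Represent `other` as signed differences from `base`; each delta
    # must fit in a signed `bit_width`-bit field.
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    deltas = [b - a for a, b in zip(base, other)]
    if not all(lo <= d <= hi for d in deltas):
        raise ValueError("delta exceeds the reduced bit width")
    return deltas

def decode_deltas(base, deltas):
    # Reconstruct the second matrix from the stored differences.
    return [a + d for a, d in zip(base, deltas)]
```

With 6-bit deltas each stored difference must lie in [-32, 31], so two strongly correlated matrices cost far fewer bits than two full 8-bit tables.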
In some embodiments, the quadratic transform having a reduced size comprises a low-frequency non-separable transform, the low frequency corresponding to the reduced size.
In some embodiments, performing the conversion includes generating a bitstream representation based on the blocks of the video. In some embodiments, performing the conversion includes generating a block of video from the bitstream representation.
Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode. In an example, when a video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of blocks of video, but the resulting bitstream is not necessarily modified based on the use of the tool or mode. That is, when a video processing tool or mode is enabled based on the decision or determination, the conversion from a block of video to a bitstream representation of the video will use the video processing tool or mode. In another example, when a video processing tool or mode is enabled, the decoder will process the bitstream knowing that the bitstream has been modified based on the video processing tool or mode. That is, the conversion from a bitstream representation of the video to blocks of the video will be performed using a video processing tool or mode that is enabled based on the decision or determination.
Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode. In an example, when a video processing tool or mode is disabled, the encoder will not use that tool or mode in the conversion of blocks of video to bitstream representations of video. In another example, when a video processing tool or mode is disabled, the decoder will process the bitstream knowing that the bitstream was not modified using the video processing tool or mode that was enabled based on the decision or determination.
In this document, the term "video processing" may refer to video encoding, video decoding, video compression, or video decompression. For example, a video compression algorithm may be applied during a transition from a pixel representation of a video to a corresponding bitstream representation, and vice versa. The bitstream representation of the current video block may, for example, correspond to bits in the bitstream that are co-located or distributed at different locations, as defined by the syntax. For example, a macroblock may be encoded according to transformed and codec error residual values and also encoded using bits in a header and other fields in the bitstream.
The disclosed and other aspects, examples, embodiments, modules, and functional operations described in this document may be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments may be implemented as one or more computer program products, such as one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or claim, but rather as descriptions of features specific to particular embodiments of particular technologies. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few embodiments and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.
Claims (20)
1. A method of processing video data, comprising:
determining, for a transition between a current block of video and a bitstream of the video, whether an inverse quadratic transform is applied to the current block, wherein output values from the inverse quadratic transform are constrained to a first range of [min, max] inclusive of min and max, wherein the inverse quadratic transform is applicable to the current block between dequantization and an inverse main transform, and wherein min and max are integer values; and
performing the conversion based on the determination,
wherein the coefficients after dequantization are constrained to a second range of [qmin, qmax] inclusive of qmin and qmax, qmin and qmax being integers, and wherein the first range and the second range have at least one relationship as follows: (1) min equals qmin, or (2) max equals qmax,
wherein, in response to the current block being coded using a prediction mode other than an intra prediction mode, the inverse secondary transform is not applied to the current block, and
wherein a quantization matrix used in the dequantization is determined based on whether the inverse quadratic transform is applied.
2. The method of claim 1, wherein the inverse quadratic transform comprises an inverse low-frequency non-separable transform.
3. The method of claim 1, wherein a clipping operation clips the output values of the inverse quadratic transform to within the first range of [min, max] inclusive of min and max.
4. The method of claim 1, wherein min is equal to -(1 << 15) and max is equal to (1 << 15) - 1.
5. The method of claim 1, wherein the matrix for the inverse quadratic transform is selected from four sets of transforms, each of the four sets of transforms including two transform matrices.
6. The method of claim 5, wherein transform set 0 is selected for the current block in response to the current block being a chroma block and one of three cross-component linear model intra prediction modes being used for the current block.
7. The method of claim 1, wherein the inverse secondary transform is applied to dequantized transform coefficients of the current block.
8. The method of claim 1, wherein the inverse secondary transform is not applied to the current block in response to the current block being coded with a transform skip mode.
9. The method of claim 1, further comprising:
determining that a forward secondary transform is applied to the current block, wherein input values of the forward secondary transform are constrained to the first range [min, max], inclusive of min and max, wherein the forward secondary transform is applicable to the current block between a forward primary transform and quantization.
10. The method of claim 1, wherein coefficients of the current block after a forward secondary transform applied between a forward primary transform and a quantization step are constrained to a third range.
11. The method of claim 1, wherein, in response to the inverse secondary transform not being applied to the current block, a syntax element indicating information related to a secondary transform of the current block is not included in the bitstream.
12. The method of any of claims 1-11, wherein the conversion comprises encoding the video into the bitstream.
13. The method of any of claims 1-11, wherein the conversion comprises decoding the video from the bitstream.
14. An apparatus for processing video data comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to:
determine, for a conversion between a current block of a video and a bitstream of the video, whether an inverse secondary transform is applied to the current block, wherein output values of the inverse secondary transform are constrained to a first range [min, max], inclusive of min and max, wherein the inverse secondary transform is applicable to the current block between dequantization and an inverse primary transform, and wherein min and max are integer values; and
perform the conversion based on the determination,
wherein coefficients after the dequantization are constrained to a second range [qmin, qmax], inclusive of qmin and qmax, qmin and qmax being integers, and wherein the first range and the second range satisfy at least one of the following relationships: (1) min is equal to qmin, or (2) max is equal to qmax,
wherein, in response to the current block being coded using a prediction mode other than an intra prediction mode, the inverse secondary transform is not applied to the current block, and
wherein a quantization matrix used in the dequantization is determined based on whether the inverse secondary transform is applied.
15. The apparatus of claim 14, wherein the inverse secondary transform comprises an inverse low-frequency non-separable transform (LFNST).
16. The apparatus of claim 14, wherein min is equal to -(1 << 15) and max is equal to (1 << 15) - 1.
17. A non-transitory computer-readable storage medium storing instructions that cause a processor to:
determine, for a conversion between a current block of a video and a bitstream of the video, whether an inverse secondary transform is applied to the current block, wherein output values of the inverse secondary transform are constrained to a first range [min, max], inclusive of min and max, wherein the inverse secondary transform is applicable to the current block between dequantization and an inverse primary transform, and wherein min and max are integer values; and
perform the conversion based on the determination,
wherein coefficients after the dequantization are constrained to a second range [qmin, qmax], inclusive of qmin and qmax, qmin and qmax being integers, and wherein the first range and the second range satisfy at least one of the following relationships: (1) min is equal to qmin, or (2) max is equal to qmax,
wherein, in response to the current block being coded using a prediction mode other than an intra prediction mode, the inverse secondary transform is not applied to the current block, and
wherein a quantization matrix used in the dequantization is determined based on whether the inverse secondary transform is applied.
18. A method of storing a bitstream of video, comprising:
determining whether an inverse secondary transform is applied to a current block of a video, wherein output values of the inverse secondary transform are constrained to a first range [min, max], inclusive of min and max, wherein the inverse secondary transform is applicable to the current block between dequantization and an inverse primary transform, and wherein min and max are integer values;
generating the bitstream of the video based on the determining, wherein coefficients after the dequantization are constrained to a second range [qmin, qmax], inclusive of qmin and qmax, qmin and qmax being integers, and wherein the first range and the second range satisfy at least one of the following relationships: (1) min is equal to qmin, or (2) max is equal to qmax, wherein the inverse secondary transform is not applied to the current block in response to the current block being coded using a prediction mode other than an intra prediction mode, and wherein a quantization matrix used in the dequantization is determined based on whether the inverse secondary transform is applied; and
storing the generated bitstream into a non-transitory computer-readable recording medium.
19. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of claims 1-13.
20. A non-transitory computer-readable medium having stored thereon a computer program comprising program code for carrying out the method according to any one of claims 1-13.
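The clipping constraint recited in claims 1, 3, and 4 can be illustrated with a minimal sketch of the claimed decoder-side ordering: dequantization, then the inverse secondary transform with its output clipped to the first range [min, max], then the inverse primary transform. The values min = -(1 << 15) and max = (1 << 15) - 1 follow claim 4; the helper names (`clip3`, `inverse_secondary_transform`) and the toy 2-point matrix are hypothetical and not part of the patent text.

```python
# Illustrative sketch only; not the patented implementation.
COEFF_MIN = -(1 << 15)      # min from claim 4: -32768
COEFF_MAX = (1 << 15) - 1   # max from claim 4:  32767

def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi] (VVC-style Clip3)."""
    return max(lo, min(hi, x))

def inverse_secondary_transform(coeffs, matrix):
    """Hypothetical stand-in for an inverse secondary (e.g. LFNST-like)
    transform: a small matrix multiply whose output values are clipped
    to the first range [min, max], as recited in claims 1 and 3."""
    out = []
    for row in matrix:
        acc = sum(m * c for m, c in zip(row, coeffs))
        out.append(clip3(COEFF_MIN, COEFF_MAX, acc))  # claimed clipping step
    return out

# Tiny 2-point example: an identity matrix passes values through, and
# out-of-range dequantized inputs are clamped to the 16-bit range.
print(inverse_secondary_transform([40000, -40000], [[1, 0], [0, 1]]))
# → [32767, -32768]
```

Clamping to a signed 16-bit range after the inverse secondary transform means the subsequent inverse primary transform can operate on 16-bit intermediate buffers, which is the practical motivation for fixing min and max at these values.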
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2019083853 | 2019-04-23 | ||
CNPCT/CN2019/083853 | 2019-04-23 | ||
PCT/CN2020/086421 WO2020216296A1 (en) | 2019-04-23 | 2020-04-23 | Clipping operation in secondary transform based video processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113767627A CN113767627A (en) | 2021-12-07 |
CN113767627B true CN113767627B (en) | 2022-11-25 |
Family
ID=72940845
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310488397.2A Pending CN116743994A (en) | 2019-04-23 | 2020-04-23 | Method and apparatus for processing video data |
CN202080031192.6A Active CN113785576B (en) | 2019-04-23 | 2020-04-23 | Use of secondary transforms in codec video |
CN202080031268.5A Active CN113767627B (en) | 2019-04-23 | 2020-04-23 | Cropping operations in video processing based on quadratic transforms |
CN202080031341.9A Active CN113728636B (en) | 2019-04-23 | 2020-04-23 | Selective use of quadratic transforms in codec video |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310488397.2A Pending CN116743994A (en) | 2019-04-23 | 2020-04-23 | Method and apparatus for processing video data |
CN202080031192.6A Active CN113785576B (en) | 2019-04-23 | 2020-04-23 | Use of secondary transforms in codec video |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202080031341.9A Active CN113728636B (en) | 2019-04-23 | 2020-04-23 | Selective use of quadratic transforms in codec video |
Country Status (5)
Country | Link |
---|---|
US (3) | US11546636B2 (en) |
EP (1) | EP3932061A4 (en) |
JP (2) | JP7256293B2 (en) |
CN (4) | CN116743994A (en) |
WO (3) | WO2020216296A1 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116743994A (en) | 2019-04-23 | 2023-09-12 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing video data |
WO2020251254A1 (en) * | 2019-06-10 | 2020-12-17 | 주식회사 엑스리스 | Method for encoding/decoding image signal and device therefor |
EP3962082A4 (en) * | 2019-06-12 | 2022-10-05 | Sony Group Corporation | Image processing device and method |
GB2585030A (en) * | 2019-06-25 | 2020-12-30 | British Broadcasting Corp | Method of signalling in a video codec |
CN112135148B (en) * | 2019-06-25 | 2022-05-10 | 华为技术有限公司 | Non-separable transformation method and device |
WO2021040941A1 (en) * | 2019-08-30 | 2021-03-04 | Alibaba Group Holding Limited | Matrix weighted intra prediction of video signals |
WO2021052832A1 (en) * | 2019-09-20 | 2021-03-25 | Nokia Technologies Oy | An apparatus, a method and a computer program for video coding and decoding |
JP7536484B2 (en) * | 2020-03-18 | 2024-08-20 | キヤノン株式会社 | Image encoding device, image encoding method and program, image decoding device, image decoding method and program |
US12114014B2 (en) * | 2021-10-01 | 2024-10-08 | Tencent America LLC | Secondary transforms for compound inter-intra prediction modes |
WO2024208638A1 (en) * | 2023-04-06 | 2024-10-10 | Interdigital Ce Patent Holdings, Sas | Non-separable transforms for low delay applications |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108141596A (en) * | 2015-09-29 | 2018-06-08 | 高通股份有限公司 | For the non-separable quadratic transformation of video coding |
CN108141594A (en) * | 2015-10-13 | 2018-06-08 | 三星电子株式会社 | For being encoded to image or decoded method and apparatus |
CN109076230A (en) * | 2016-05-03 | 2018-12-21 | 高通股份有限公司 | Binaryzation quadratic transformation index |
CN109076242A (en) * | 2016-05-13 | 2018-12-21 | 索尼公司 | Image processing equipment and method |
CN109644269A (en) * | 2016-08-24 | 2019-04-16 | 索尼公司 | Image processing equipment, image processing method and program |
Family Cites Families (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2721787B1 (en) | 1994-06-22 | 1996-07-26 | Thomson Consumer Electronics | Method for quantifying coefficients. |
US6389072B1 (en) * | 1998-12-23 | 2002-05-14 | U.S. Philips Corp. | Motion analysis based buffer regulation scheme |
JP2002094989A (en) * | 2000-09-14 | 2002-03-29 | Pioneer Electronic Corp | Video signal encoder and video signal encoding method |
US7280597B2 (en) | 2003-06-24 | 2007-10-09 | Mitsubishi Electric Research Laboratories, Inc. | System and method for determining coding modes, DCT types and quantizers for video coding |
US20130003856A1 (en) * | 2011-07-01 | 2013-01-03 | Samsung Electronics Co. Ltd. | Mode-dependent transforms for residual coding with low latency |
KR101892329B1 (en) * | 2011-11-03 | 2018-08-27 | 톰슨 라이센싱 | Video encoding and decoding based on image refinement |
CN104488270B (en) * | 2012-06-29 | 2018-05-18 | 韩国电子通信研究院 | A kind of video encoding/decoding method using decoding device |
AU2013202653A1 (en) | 2013-04-05 | 2014-10-23 | Canon Kabushiki Kaisha | Method, apparatus and system for generating intra-predicted samples |
CN105516730B (en) * | 2014-09-24 | 2018-04-24 | 晨星半导体股份有限公司 | Video coding device and video decoded device and its coding and coding/decoding method |
KR102600756B1 (en) * | 2015-03-06 | 2023-11-10 | 한국과학기술원 | Video encoding and decoding method based on low-complexity transformation and device using the same |
ITUB20155295A1 (en) * | 2015-10-16 | 2017-04-16 | Torino Politecnico | Apparatuses and methods for encoding and decoding images |
WO2017173593A1 (en) | 2016-04-06 | 2017-10-12 | Mediatek Singapore Pte. Ltd. | Separate coding secondary transform syntax elements for different color components |
US10931947B2 (en) * | 2016-05-04 | 2021-02-23 | Sharp Kabushiki Kaisha | Systems and methods for coding transform data |
CN113411578B (en) * | 2016-05-13 | 2024-04-12 | 夏普株式会社 | Image decoding device and method, image encoding device and method |
CN109076222B9 (en) | 2016-05-13 | 2021-10-15 | 索尼公司 | Image processing apparatus and method |
US11350127B2 (en) * | 2016-05-13 | 2022-05-31 | Sony Corporation | Apparatus and method for image processing |
US11095893B2 (en) * | 2016-10-12 | 2021-08-17 | Qualcomm Incorporated | Primary transform and secondary transform in video coding |
KR102416804B1 (en) * | 2016-10-14 | 2022-07-05 | 세종대학교산학협력단 | Image encoding method/apparatus, image decoding method/apparatus and and recording medium for storing bitstream |
EP3567858A4 (en) * | 2017-01-03 | 2020-06-17 | LG Electronics Inc. -1- | Method and device for encoding/decoding video signal using secondary transform |
EP3349451A1 (en) | 2017-01-11 | 2018-07-18 | Thomson Licensing | Method and apparatus for selecting a coding mode used for encoding/decoding a residual block |
EP3586511B1 (en) | 2017-03-16 | 2022-01-05 | MediaTek Inc. | Method and apparatus of enhanced multiple transforms and non-separable secondary transform for video coding |
US20200177889A1 (en) | 2017-03-21 | 2020-06-04 | Lg Electronics Inc. | Transform method in image coding system and apparatus for same |
US10855997B2 (en) * | 2017-04-14 | 2020-12-01 | Mediatek Inc. | Secondary transform kernel size selection |
US10805641B2 (en) * | 2017-06-15 | 2020-10-13 | Qualcomm Incorporated | Intra filtering applied together with transform processing in video coding |
US11134272B2 (en) * | 2017-06-29 | 2021-09-28 | Qualcomm Incorporated | Memory reduction for non-separable transforms |
EP3643065A1 (en) * | 2017-07-24 | 2020-04-29 | ARRIS Enterprises LLC | Intra mode jvet coding |
EP4395317A3 (en) * | 2017-07-28 | 2024-08-14 | Panasonic Intellectual Property Corporation of America | Encoding device and encoding method |
CN108322745B (en) * | 2018-02-28 | 2019-12-03 | 中南大学 | Fast selecting method in a kind of frame based on inseparable quadratic transformation mode |
WO2020013541A1 (en) * | 2018-07-12 | 2020-01-16 | 엘지전자 주식회사 | Methods and apparatuses for processing video signal |
KR102452108B1 (en) | 2018-09-02 | 2022-10-07 | 엘지전자 주식회사 | Method for encoding/decoding video signals and device therefor |
US11172211B2 (en) * | 2019-04-04 | 2021-11-09 | Tencent America LLC | Method and apparatus for video coding |
US11991393B2 (en) * | 2019-04-16 | 2024-05-21 | Hfi Innovation Inc. | Methods and apparatuses for coding video data with secondary transform |
CN116743994A (en) | 2019-04-23 | 2023-09-12 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing video data |
2020
- 2020-04-23 CN CN202310488397.2A patent/CN116743994A/en active Pending
- 2020-04-23 WO PCT/CN2020/086421 patent/WO2020216296A1/en active Application Filing
- 2020-04-23 JP JP2021561889A patent/JP7256293B2/en active Active
- 2020-04-23 EP EP20795742.4A patent/EP3932061A4/en active Pending
- 2020-04-23 WO PCT/CN2020/086458 patent/WO2020216303A1/en active Application Filing
- 2020-04-23 WO PCT/CN2020/086444 patent/WO2020216299A1/en unknown
- 2020-04-23 CN CN202080031192.6A patent/CN113785576B/en active Active
- 2020-04-23 CN CN202080031268.5A patent/CN113767627B/en active Active
- 2020-04-23 CN CN202080031341.9A patent/CN113728636B/en active Active

2021
- 2021-08-19 US US17/406,260 patent/US11546636B2/en active Active
- 2021-08-19 US US17/406,242 patent/US11647229B2/en active Active

2023
- 2023-03-29 JP JP2023054241A patent/JP7509944B2/en active Active
- 2023-03-30 US US18/193,131 patent/US20230262263A1/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108141596A (en) * | 2015-09-29 | 2018-06-08 | 高通股份有限公司 | For the non-separable quadratic transformation of video coding |
CN108141594A (en) * | 2015-10-13 | 2018-06-08 | 三星电子株式会社 | For being encoded to image or decoded method and apparatus |
CN109076230A (en) * | 2016-05-03 | 2018-12-21 | 高通股份有限公司 | Binaryzation quadratic transformation index |
CN109076242A (en) * | 2016-05-13 | 2018-12-21 | 索尼公司 | Image processing equipment and method |
CN109644269A (en) * | 2016-08-24 | 2019-04-16 | 索尼公司 | Image processing equipment, image processing method and program |
Also Published As
Publication number | Publication date |
---|---|
JP2023089032A (en) | 2023-06-27 |
EP3932061A4 (en) | 2022-06-01 |
US20220182675A1 (en) | 2022-06-09 |
US11647229B2 (en) | 2023-05-09 |
CN113785576B (en) | 2023-05-16 |
KR20210154151A (en) | 2021-12-20 |
CN113728636B (en) | 2022-11-04 |
CN113767627A (en) | 2021-12-07 |
WO2020216299A1 (en) | 2020-10-29 |
US11546636B2 (en) | 2023-01-03 |
JP7256293B2 (en) | 2023-04-11 |
WO2020216303A1 (en) | 2020-10-29 |
JP7509944B2 (en) | 2024-07-02 |
CN113785576A (en) | 2021-12-10 |
JP2022529055A (en) | 2022-06-16 |
EP3932061A1 (en) | 2022-01-05 |
WO2020216296A1 (en) | 2020-10-29 |
CN116743994A (en) | 2023-09-12 |
US20230262263A1 (en) | 2023-08-17 |
CN113728636A (en) | 2021-11-30 |
US20220109876A1 (en) | 2022-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113767627B (en) | Cropping operations in video processing based on quadratic transforms | |
CN113812154A (en) | Multiple quadratic transform matrices for video processing | |
US12081758B2 (en) | Block dimension settings of transform skip mode | |
US20220394259A1 (en) | Residual Coding for Transform Skipped Blocks | |
US11991358B2 (en) | Indication of multiple transform matrices in coded video | |
WO2021110018A1 (en) | Separable secondary transform processing of coded video | |
US12096013B2 (en) | Signaling for transform skip mode | |
US11546595B2 (en) | Sub-block based use of transform skip mode | |
WO2021104409A1 (en) | Cross-component adaptive filtering and subblock coding | |
WO2020228716A1 (en) | Usage of transquant bypass mode for multiple color components | |
CN113728640A (en) | Intra-prediction and residual coding | |
CN113728631A (en) | Intra sub-block partitioning and multi-transform selection | |
WO2020253642A1 (en) | Block size dependent use of secondary transforms in coded video | |
WO2021190593A1 (en) | Coded video processing using enhanced secondary transform | |
KR102727219B1 (en) | Using quadratic transforms in coded video | |
WO2020253810A1 (en) | Coding tools for chroma components | |
WO2024174979A1 (en) | Transform for intra block copy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||