WO2021137445A1 - Method for determining transform kernels for video signal processing, and apparatus therefor - Google Patents

Method for determining transform kernels for video signal processing, and apparatus therefor

Info

Publication number
WO2021137445A1
Authority
WO
WIPO (PCT)
Prior art keywords
transform
current block
kernel
transform kernel
dct
Prior art date
Application number
PCT/KR2020/017198
Other languages
English (en)
Korean (ko)
Inventor
이범식
Original Assignee
(주)휴맥스
조선대학교 산학협력단
Priority date
Filing date
Publication date
Application filed by (주)휴맥스 and 조선대학교 산학협력단
Publication of WO2021137445A1 publication Critical patent/WO2021137445A1/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Definitions

  • the present invention relates to a video signal processing method and an apparatus therefor, and more particularly, to a method for determining a transform kernel for video signal processing and an apparatus therefor.
  • Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or storing it in a form suitable for a storage medium.
  • Targets of compression encoding include audio, video, and text.
  • a technique for performing compression encoding on an image is called video compression.
  • Compression encoding of a video signal is performed by removing redundant information in consideration of spatial correlation, temporal correlation, stochastic correlation, and the like.
  • a method and apparatus for processing a video signal with higher efficiency are required.
  • An object of the present specification is to increase coding efficiency of a video signal by providing a video signal processing method and an apparatus therefor.
  • the present specification provides a video signal processing apparatus.
  • a video signal decoding apparatus includes a processor, wherein the processor determines a transform kernel for horizontal transformation of a current block and a transform kernel for vertical transformation of the current block based on a preset condition, and obtains a residual signal for the current block using the transform kernels, wherein the preset condition is a condition based on whether an intra sub-block partitioning (ISP) prediction method is applied to the current block and whether a low frequency non-separable transform (LFNST) is applied to the current block.
  • a video signal encoding apparatus includes a processor, wherein the processor determines a transform kernel for horizontal transformation of a current block and a transform kernel for vertical transformation of the current block based on a preset condition, and obtains a transform block for the current block using the transform kernels, wherein the preset condition is a condition based on whether an intra sub-block partitioning (ISP) prediction method is applied to the current block and whether a low frequency non-separable transform (LFNST) is applied to the current block.
  • in a computer-readable recording medium storing a bitstream, the bitstream is encoded through an encoding method including: determining a transform kernel for horizontal transformation of a current block and a transform kernel for vertical transformation of the current block based on a preset condition; and obtaining a transform block for the current block using the transform kernels, wherein the preset condition is a condition based on whether an intra sub-block partitioning (ISP) prediction method is applied to the current block and whether a low frequency non-separable transform (LFNST) is applied to the current block.
  • both the transform kernel for horizontal transformation of the current block and the transform kernel for vertical transformation of the current block are discrete cosine transform (DCT) type 2 (DCT-2) transform kernels.
  • the transform kernel is determined based on a horizontal size and a vertical size of the current block.
  • the transform kernel for the horizontal transform of the current block is a DCT type 2 (DCT-2) transform kernel
  • the transformation kernel for horizontal transformation of the current block is not a DCT type 2 (DCT-2) transformation kernel
  • a transform kernel for vertical transformation of the current block is a DCT type 2 (DCT-2) transform kernel
  • the transformation kernel for vertical transformation of the current block is not a DCT type 2 (DCT-2) transformation kernel.
  • the preset condition is a condition based on a division direction of the current block when a sub-block transform (SBT) is applied to the current block. When the current block is divided into two subblocks in the vertical direction and SBT is applied to the left subblock of the two subblocks, the transform kernel for the horizontal direction of the left subblock is a DCT type 8 (DCT-8) transform kernel, and the transform kernel for the vertical direction of the left subblock is a discrete sine transform (DST) type 7 (DST-7) transform kernel.
  • the transform kernel for the horizontal direction of the left subblock and the transform kernel for the vertical direction of the left subblock are both DST type 7 (DST-7).
  • a transform kernel for the horizontal direction of the upper subblock is a DST type 7 (DST-7) transform kernel
  • the transform kernel for the vertical direction of the upper subblock is a DCT type 8 (DCT-8) transform kernel.
  • the transform kernel for the horizontal direction of the lower subblock and the transform kernel for the vertical direction of the lower subblock are DST type 7 (DST-7) transform kernels.
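  • As an illustration, the position-dependent SBT kernel rule described above can be sketched as follows. This is a non-normative Python sketch: the DST-7/DST-7 fallback for the left subblock mentioned above typically depends on a size constraint that is not reproduced here, and the right-subblock case is assumed symmetric with the lower-subblock case.

```python
def sbt_kernels(split_vertical, first_subblock):
    """Position-dependent transform kernels for SBT (sketch of the rule
    described above). 'first_subblock' means the left subblock for a
    vertical split, or the upper subblock for a horizontal split.
    Returns (horizontal kernel, vertical kernel)."""
    if split_vertical:  # block divided into left/right subblocks
        return ("DCT-8", "DST-7") if first_subblock else ("DST-7", "DST-7")
    # horizontal split: block divided into upper/lower subblocks
    return ("DST-7", "DCT-8") if first_subblock else ("DST-7", "DST-7")

# Left subblock of a vertical split: DCT-8 horizontally, DST-7 vertically.
print(sbt_kernels(True, True))   # ('DCT-8', 'DST-7')
print(sbt_kernels(False, False)) # ('DST-7', 'DST-7')
```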
  • the present specification has an effect that efficient video signal processing is possible by providing a method and an apparatus for determining a transform kernel for video signal decoding/encoding.
  • FIG. 1 is a schematic block diagram of a video signal encoding apparatus according to an embodiment of the present invention.
  • FIG. 2 is a schematic block diagram of a video signal decoding apparatus according to an embodiment of the present invention.
  • FIG. 3 shows an embodiment in which a coding tree unit (CTU) within a picture is divided into coding units (CUs).
  • FIG. 4 shows a method of signaling the division of a quad tree and a multi-type tree according to an embodiment of the present invention.
  • FIG. 5 shows a general directional intra prediction method according to an embodiment of the present invention.
  • FIG. 6 illustrates a matrix-based intra prediction method according to an embodiment of the present invention.
  • FIG. 7 illustrates a low frequency non-separable transform (LFNST) according to an embodiment of the present invention.
  • FIG. 8 shows an MTS kernel according to ISP block size, according to an embodiment of the present invention.
  • FIG. 10 illustrates a main transformation kernel according to a block size when an explicit primary transformation kernel selection is applied according to an embodiment of the present invention.
  • FIG. 11 illustrates a transform kernel determined according to a subblock transform size and a partition type according to an embodiment of the present invention.
  • Coding can be interpreted as encoding or decoding as the case may be.
  • an apparatus for generating a video signal bitstream by encoding a video signal is referred to as an encoding apparatus or an encoder.
  • an apparatus for reconstructing a video signal by decoding a video signal bitstream is referred to as a decoding apparatus or a decoder.
  • a video signal processing apparatus is used as a term that includes both an encoder and a decoder.
  • Information is a term including all values, parameters, coefficients, elements, and the like, and the meaning may be interpreted differently in some cases, so the present invention is not limited thereto.
  • the 'unit' is used to refer to a basic unit of image processing or a specific position of a picture, and refers to an image area including both a luma component and a chroma component.
  • 'block' refers to an image region including a specific component among the luma component and the chroma component (ie, Cb and Cr).
  • terms such as 'unit', 'block', 'partition' and 'region' may be used interchangeably according to embodiments.
  • a unit may be used as a concept including all of a coding unit, a prediction unit, and a transform unit.
  • a picture indicates a field or a frame, and according to embodiments, the terms may be used interchangeably.
  • the encoding apparatus 100 of the present invention includes a transform unit 110, a quantization unit 115, an inverse quantization unit 120, an inverse transform unit 125, a filtering unit 130, a prediction unit 150, and an entropy coding unit 160.
  • the transform unit 110 converts a residual signal that is a difference between the input video signal and the prediction signal generated by the prediction unit 150 to obtain a transform coefficient value.
  • For example, a discrete cosine transform (DCT) or a discrete sine transform (DST) may be used for the transform.
  • the transform is performed by dividing the input picture signal into blocks.
  • the coding efficiency may vary according to the distribution and characteristics of values in the transform region.
  • the quantization unit 115 quantizes the transform coefficient values output from the transform unit 110 .
  • the picture signal is not coded as it is; instead, a picture is predicted using a region already coded through the prediction unit 150, and a reconstructed picture is obtained by adding a residual value between the original picture and the predicted picture to the predicted picture.
  • the encoder performs a process of reconstructing the encoded current block.
  • the inverse quantization unit 120 inversely quantizes the transform coefficient value, and the inverse transform unit 125 restores the residual value using the inverse quantized transform coefficient value.
  • the filtering unit 130 performs a filtering operation for improving the quality of the reconstructed picture and improving the encoding efficiency.
  • For example, a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter may be included.
  • the filtered picture is output or stored in a decoded picture buffer (DPB, 156) to be used as a reference picture.
  • the prediction unit 150 includes an intra prediction unit 152 and an inter prediction unit 154 .
  • the intra prediction unit 152 performs intra prediction within the current picture, and the inter prediction unit 154 performs inter prediction of the current picture using the reference picture stored in the decoded picture buffer 156.
  • the intra prediction unit 152 performs intra prediction on reconstructed samples in the current picture, and transmits intra encoding information to the entropy coding unit 160 .
  • the intra encoding information may include at least one of an intra prediction mode, a Most Probable Mode (MPM) flag, and an MPM index.
  • the inter prediction unit 154 may include a motion estimation unit 154a and a motion compensation unit 154b.
  • the motion estimator 154a obtains a motion vector value of the current region by referring to a specific region of the reconstructed reference picture.
  • the motion estimation unit 154a transfers motion information (reference picture index, motion vector information, etc.) on the reference region to the entropy coding unit 160 .
  • the motion compensation unit 154b performs motion compensation using the motion vector value transmitted from the motion estimation unit 154a.
  • the inter prediction unit 154 transmits inter encoding information including motion information on the reference region to the entropy coding unit 160 .
  • the transform unit 110 obtains a transform coefficient value by transforming a residual value between the original picture and the predicted picture.
  • the transformation may be performed in units of a specific block within the picture, and the size of the specific block may vary within a preset range.
  • the quantization unit 115 quantizes the transform coefficient values generated by the transform unit 110 and transmits the quantized values to the entropy coding unit 160 .
  • the entropy coding unit 160 entropy-codes the quantized transform coefficients, intra-encoding information, inter-encoding information, and the like to generate a video signal bitstream.
  • a Variable Length Coding (VLC) scheme and an arithmetic coding scheme may be used.
  • the variable length coding (VLC) method converts input symbols into continuous codewords, and the length of the codewords may be variable. For example, symbols that occur frequently are expressed as short codewords, and symbols that do not occur frequently are expressed as long codewords.
  • a context-based adaptive variable length coding (CAVLC) scheme may be used as the variable length coding scheme.
  • Arithmetic coding converts consecutive data symbols into a single fractional number, and can obtain the optimal fractional number of bits required to represent each symbol.
  • Context-based adaptive binary arithmetic coding (CABAC) may be used as the arithmetic coding scheme.
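  • As a simple illustration of the "optimal fractional bit" property mentioned above, the ideal code length for a symbol of probability p is -log2(p) bits. This is a sketch of the information-theoretic bound that arithmetic coding can approach, not part of the CABAC specification:

```python
import math

def optimal_bits(prob):
    """Ideal code length, in bits, for a symbol of probability `prob`
    (the fractional bit cost that arithmetic coding can approach)."""
    return -math.log2(prob)

# A frequent symbol costs well under one bit; a rare one costs several.
print(round(optimal_bits(0.9), 3))   # frequent symbol
print(round(optimal_bits(0.01), 3))  # rare symbol
```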
  • the generated bitstream is encapsulated in a Network Abstraction Layer (NAL) unit as a basic unit.
  • the NAL unit includes an integer number of coded coding tree units.
  • the bitstream is divided into NAL units, and then each divided NAL unit must be decoded.
  • information necessary for decoding a video signal bitstream may be transmitted through a raw byte sequence payload (RBSP) of a higher-level set such as a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
  • FIG. 1 shows the encoding apparatus 100 according to an embodiment of the present invention. Separately displayed blocks are logically separated and illustrated elements of the encoding apparatus 100 . Accordingly, the elements of the above-described encoding apparatus 100 may be mounted as one chip or a plurality of chips according to the design of the device. According to an embodiment, an operation of each element of the above-described encoding apparatus 100 may be performed by a processor (not shown).
  • the decoding apparatus 200 of the present invention includes an entropy decoding unit 210 , an inverse quantization unit 220 , an inverse transform unit 225 , a filtering unit 230 , and a prediction unit 250 .
  • the entropy decoding unit 210 entropy-decodes the video signal bitstream to extract transform coefficients for each region, intra-encoding information, inter-encoding information, and the like.
  • the inverse quantizer 220 inverse quantizes the entropy-decoded transform coefficient, and the inverse transform unit 225 restores a residual value using the inverse quantized transform coefficient.
  • the video signal processing apparatus 200 restores the original pixel value by adding the residual value obtained by the inverse transform unit 225 with the prediction value obtained by the prediction unit 250 .
  • the filtering unit 230 improves picture quality by filtering the picture.
  • This may include a deblocking filter for reducing block distortion and/or an adaptive loop filter for removing distortion from the entire picture.
  • the filtered picture is output or stored in the decoded picture buffer DPB 256 to be used as a reference picture for the next picture.
  • the prediction unit 250 includes an intra prediction unit 252 and an inter prediction unit 254 .
  • the prediction unit 250 generates a prediction picture by using the encoding type decoded through the entropy decoding unit 210, transform coefficients for each region, intra/inter encoding information, and the like.
  • a current picture including the current block or a decoded area of other pictures may be used.
  • a picture (or tile/slice) that uses only the current picture for reconstruction, that is, performs only intra prediction, is called an intra picture or I picture (or tile/slice), and a picture (or tile/slice) that can perform both intra prediction and inter prediction is called an inter picture (or tile/slice).
  • among inter pictures (or tiles/slices), a picture (or tile/slice) that uses at most one motion vector and one reference picture index to predict the sample values of each block is called a predictive picture or P picture (or tile/slice), and a picture (or tile/slice) that uses up to two motion vectors and reference picture indexes is called a bi-predictive picture or B picture (or tile/slice).
  • a P picture (or tile/slice) uses at most one set of motion information to predict each block
  • a B picture (or tile/slice) uses up to two sets of motion information to predict each block.
  • the motion information set includes one or more motion vectors and one reference picture index.
  • the intra prediction unit 252 generates a prediction block by using the intra encoding information and reconstructed samples in the current picture.
  • the intra encoding information may include at least one of an intra prediction mode, a Most Probable Mode (MPM) flag, and an MPM index.
  • the intra prediction unit 252 predicts pixel values of the current block by using the reconstructed pixels located on the left and/or above the current block as reference pixels.
  • the reference pixels may be pixels adjacent to a left boundary and/or pixels adjacent to an upper boundary of the current block.
  • the reference pixels may be pixels adjacent within a preset distance from a left boundary of the current block among pixels of a neighboring block of the current block and/or pixels adjacent within a preset distance from an upper boundary of the current block.
  • the neighboring blocks of the current block may include a left (L) block, an above (A) block, a below-left (BL) block, an above-right (AR) block, or an above-left (AL) block adjacent to the current block.
  • the inter prediction unit 254 generates a prediction block by using the reference picture stored in the decoded picture buffer 256 and the inter encoding information.
  • the inter encoding information may include motion information (reference picture index, motion vector information, etc.) of the current block with respect to the reference block.
  • Inter prediction may include L0 prediction, L1 prediction, and bi-prediction.
  • L0 prediction is prediction using one reference picture included in the L0 picture list
  • L1 prediction means prediction using one reference picture included in the L1 picture list.
  • For L0 or L1 prediction, one set of motion information (e.g., a motion vector and a reference picture index) may be used.
  • a maximum of two reference regions may be used, and the two reference regions may exist in the same reference picture or in different pictures, respectively.
  • a maximum of two sets of motion information (e.g., a motion vector and a reference picture index) may be used, and the two motion vectors may correspond to the same reference picture index or to different reference picture indexes.
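  • A minimal sketch of combining two motion-compensated predictor blocks in bi-prediction. The rounded average below is an assumption for illustration; actual codecs may apply weighted prediction with explicit weights, offsets, and different rounding:

```python
def bi_predict(pred0, pred1):
    """Combine two motion-compensated predictor blocks (lists of rows)
    by a simple rounded average, as a sketch of bi-prediction."""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred0, pred1)]

p0 = [[100, 102], [98, 96]]   # predictor from reference list L0
p1 = [[104, 100], [100, 98]]  # predictor from reference list L1
print(bi_predict(p0, p1))     # [[102, 101], [99, 97]]
```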
  • the reference pictures may be temporally displayed (or output) before or after the current picture.
  • the inter prediction unit 254 may obtain the reference block of the current block by using the motion vector and the reference picture index.
  • the reference block exists in the reference picture corresponding to the reference picture index.
  • a pixel value of a block specified by the motion vector or an interpolated value thereof may be used as a predictor of the current block.
  • an 8-tap interpolation filter may be used for a luma signal and a 4-tap interpolation filter may be used for a chroma signal.
  • however, the interpolation filter used for motion prediction in units of sub-pels is not limited thereto.
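  • For illustration, a 1-D half-pel interpolation can be sketched as follows. The 8-tap coefficients below are the HEVC-style half-pel luma taps, used here only as an assumption, since the actual filters depend on the codec and the sub-pel phase:

```python
# HEVC-style 8-tap half-pel luma filter taps (assumed for illustration).
HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]  # taps sum to 64

def interp_half_pel(samples, pos):
    """Half-pel value between samples[pos] and samples[pos + 1], using
    8 surrounding integer-pel samples; (acc + 32) >> 6 divides by 64
    with rounding."""
    acc = sum(t * samples[pos - 3 + i] for i, t in enumerate(HALF_PEL_TAPS))
    return (acc + 32) >> 6

row = [50] * 16
print(interp_half_pel(row, 7))  # a flat signal interpolates to itself: 50
```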
  • the inter prediction unit 254 performs motion compensation for predicting the texture of the current unit from the previously reconstructed picture using motion information.
  • a reconstructed video picture is generated by adding the prediction value output from the intra prediction unit 252 or the inter prediction unit 254 and the residual value output from the inverse transform unit 225 . That is, the video signal decoding apparatus 200 reconstructs the current block by using the prediction block generated by the prediction unit 250 and the residual obtained from the inverse transform unit 225 .
  • FIG. 2 shows the decoding apparatus 200 according to an embodiment of the present invention. Separately displayed blocks are logically separated and illustrated elements of the decoding apparatus 200 . Accordingly, the elements of the decoding apparatus 200 described above may be mounted as one chip or a plurality of chips according to the design of the device. According to an embodiment, the operation of each element of the above-described decoding apparatus 200 may be performed by a processor (not shown).
  • a Coding Tree Unit is divided into Coding Units (CUs) within a picture.
  • a picture may be divided into a sequence of coding tree units (CTUs).
  • a coding tree unit consists of an NXN block of luma samples and two blocks of corresponding chroma samples.
  • a coding tree unit may be divided into a plurality of coding units.
  • the coding unit refers to a basic unit for processing a picture in the process of processing the video signal described above, that is, intra/inter prediction, transformation, quantization, and/or entropy coding.
  • the size and shape of the coding unit in one picture may not be constant.
  • the coding unit may have a square or rectangular shape.
  • the rectangular coding unit includes a vertical coding unit (or a vertical block) and a horizontal coding unit (or a horizontal block).
  • a vertical block is a block having a height greater than a width
  • a horizontal block is a block having a width greater than a height.
  • a non-square block may refer to a rectangular block, but the present invention is not limited thereto.
  • the coding tree unit is first divided into a quad tree (QT) structure. That is, in the quad tree structure, one node having a size of 2NX2N may be divided into four nodes having a size of NXN.
  • a quad tree may also be referred to as a quaternary tree. Quad tree partitioning can be performed recursively, and not all nodes need to be partitioned to the same depth.
  • a leaf node of the aforementioned quad tree may be further divided into a multi-type tree (MTT) structure.
  • in the multi-type tree structure, one node may be divided into a horizontally or vertically split binary or ternary tree structure. That is, in the multi-type tree structure, there are four partitioning structures: vertical binary partitioning, horizontal binary partitioning, vertical ternary partitioning, and horizontal ternary partitioning.
  • both a width and a height of a node in each tree structure may have a value of a power of two.
  • a node having a size of 2NX2N may be divided into two NX2N nodes by vertical binary division and divided into two 2NXN nodes by horizontal binary division.
  • a node of size 2NX2N is divided into nodes of (N/2)X2N, NX2N, and (N/2)X2N by vertical ternary division, and into nodes of 2NX(N/2), 2NXN, and 2NX(N/2) by horizontal ternary division. This multi-type tree splitting can be performed recursively.
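  • The split shapes above can be summarized as a child-size computation. This is an illustrative sketch; `split_sizes` is a hypothetical helper, not part of any codec specification:

```python
def split_sizes(width, height, mode):
    """Child block sizes for the splits described above.
    mode: 'qt' (quad), 'bt_v'/'bt_h' (binary), 'tt_v'/'tt_h' (ternary)."""
    if mode == "qt":    # quad split: four square quadrants
        return [(width // 2, height // 2)] * 4
    if mode == "bt_v":  # vertical binary: two side-by-side halves
        return [(width // 2, height)] * 2
    if mode == "bt_h":  # horizontal binary: two stacked halves
        return [(width, height // 2)] * 2
    if mode == "tt_v":  # vertical ternary: 1/4, 1/2, 1/4 of the width
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "tt_h":  # horizontal ternary: 1/4, 1/2, 1/4 of the height
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(mode)

print(split_sizes(32, 32, "tt_v"))  # [(8, 32), (16, 32), (8, 32)]
```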
  • a leaf node of a multi-type tree may be a coding unit. If the coding unit is not too large for the maximum transform length, the coding unit is used as a unit of prediction and transform without further splitting. Meanwhile, in the aforementioned quad tree and multi-type tree, at least one of the following parameters may be predefined or transmitted through an RBSP of a higher level set such as PPS, SPS, or VPS.
  • Preset flags may be used to signal the division of the aforementioned quad tree and multi-type tree.
  • a flag 'qt_split_flag' indicating whether to split a quad tree node
  • a flag 'mtt_split_flag' indicating whether to split a multi-type tree node
  • a flag 'mtt_split_vertical_flag' indicating a split direction of a multi-type tree node, or a flag 'mtt_split_binary_flag' indicating a split shape of a multi-type tree node
  • a coding tree unit is a root node of a quad tree, and may be first divided into a quad tree structure.
  • 'qt_split_flag' is signaled for each node 'QT_node'.
  • when the value of 'qt_split_flag' is 1, the corresponding node is divided into four square nodes, and when the value of 'qt_split_flag' is 0, the corresponding node becomes a leaf node 'QT_leaf_node' of the quad tree.
  • Each quad tree leaf node 'QT_leaf_node' may be further divided into a multi-type tree structure.
  • 'mtt_split_flag' is signaled for each node 'MTT_node'.
  • when the value of 'mtt_split_flag' is 1, the corresponding node is divided into a plurality of rectangular nodes, and when the value of 'mtt_split_flag' is 0, the corresponding node becomes a leaf node 'MTT_leaf_node' of the multi-type tree.
  • when the value of 'mtt_split_binary_flag' is 1, the node 'MTT_node' is divided into two rectangular nodes, and when the value of 'mtt_split_binary_flag' is 0, the node 'MTT_node' is divided into three rectangular nodes.
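  • The flag-based signaling above can be sketched as a small decision function. This is illustrative only; `parse_split` is a hypothetical helper mirroring the flag semantics described above:

```python
def parse_split(qt_split, mtt_split=0, vertical=0, binary=0):
    """Interpret the split flags described above and return a short
    description of how the node is divided."""
    if qt_split:               # 'qt_split_flag' == 1
        return "quad split into 4 square nodes"
    if not mtt_split:          # 'mtt_split_flag' == 0 -> leaf
        return "leaf node (no further split)"
    direction = "vertical" if vertical else "horizontal"   # 'mtt_split_vertical_flag'
    shape = "binary (2 nodes)" if binary else "ternary (3 nodes)"  # 'mtt_split_binary_flag'
    return f"{direction} {shape}"

print(parse_split(0, mtt_split=1, vertical=1, binary=0))  # vertical ternary (3 nodes)
```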
  • This specification relates to multiple transform selection (MTS) applied to a residual signal generated by a matrix-based prediction method and an intra sub-partition (ISP) intra prediction method in a video codec.
  • the distribution of the residual signal of a picture may be different for each region. For example, a distribution of values of a residual signal within a specific region may vary according to a prediction method.
  • coding efficiency may vary for each transform region according to distribution and characteristics of values in the transform region. Accordingly, when a transform kernel used for transforming a specific transform block is adaptively selected from among a plurality of available transform kernels, coding efficiency may be further improved.
  • the encoder and the decoder may set a transform kernel other than the basic transform kernel to be additionally usable in transforming the video signal.
  • a method of adaptively selecting a transform kernel is referred to as adaptive multiple core transform (AMT) or multiple transform selection (MTS).
  • the matrix-based intra prediction (MIP) method in the present specification refers to an intra prediction method that, unlike existing prediction methods having directionality from the pixels of neighboring blocks, obtains a prediction signal by applying a predefined matrix and offset values to the pixels of neighboring blocks, and generates a residual signal therefrom.
  • FIG. 5 shows a general directional intra prediction method according to an embodiment of the present invention
  • FIG. 6 shows a matrix-based intra prediction method according to an embodiment of the present invention.
  • in order to generate a prediction signal in the current block, the prediction is obtained using a matrix B including pixels of neighboring blocks, a predefined matrix A_k, and an offset value o_k.
  • the residual signal obtained through the prediction method through MIP has a characteristic that the directionality of the residual is weak and the signal characteristics are uniform compared to the residual signal obtained through the existing directional prediction method.
  • for such residual signals, it is very advantageous to use a transform kernel such as DCT-2, which shows higher energy compaction performance when the input signal is uniform, rather than the DST-7 and DCT-8 transform kernels, which show strong compression performance for directional intra prediction.
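  • A minimal sketch of the matrix-based prediction of the form A_k·B + o_k described above, in pure Python. The matrix A, boundary vector b, and offset o below are hypothetical values chosen only for illustration:

```python
def mip_predict(A, b, o):
    """Matrix-based intra prediction sketch: pred = A @ b + o, where b
    is a (downsampled) vector of neighboring boundary samples and A, o
    come from a predefined set (hypothetical values used below)."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) + o
            for row in A]

# Toy example: 2 boundary samples predicting 4 block samples.
A = [[0.5, 0.5], [0.75, 0.25], [0.25, 0.75], [1.0, 0.0]]
b = [100, 120]
o = 2
print(mip_predict(A, b, o))  # [112.0, 107.0, 117.0, 102.0]
```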
  • a low frequency non-separable transform (LFNST) refers to a secondary transform technique.
  • the application of the LFNST kernel varies according to the intra prediction mode.
  • LFNST is a transform method that achieves higher compression performance by applying a predefined secondary transform kernel to the low-frequency region of the transform coefficients obtained through a primary transform, such as DCT-2 or DST-7, of the residual signal obtained through a prediction method.
  • the LFNST kernel is a transform kernel obtained through offline learning, and is defined and applied differently depending on the intra prediction mode and the size of the block to which the secondary transform is applied.
  • indexes for LFNST kernels are defined from 0 to 3
  • intra prediction modes are mapped to each index
  • a total of two LFNST kernels are defined and applied to each index.
  • the LFNST index mapped according to the intra prediction mode is defined as shown in FIG. 7 and Table 1.
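The secondary-transform step described above can be sketched as follows: the low-frequency primary-transform coefficients are flattened into a vector and multiplied by a dense (non-separable) kernel matrix. The identity kernel below is only a placeholder for the offline-trained LFNST kernels, which are selected by intra prediction mode and block size.

```python
# Illustrative sketch of LFNST as a non-separable secondary transform over the
# top-left (low-frequency) 4x4 region of the primary-transform coefficients.

def lfnst_forward(coeffs4x4, kernel):
    """Flatten a 4x4 low-frequency region and apply a 16x16 non-separable kernel."""
    v = [c for row in coeffs4x4 for c in row]                 # 16-element vector
    out = [sum(k * x for k, x in zip(row, v)) for row in kernel]
    return [out[i * 4:(i + 1) * 4] for i in range(4)]         # back to 4x4

# Placeholder kernel: the 16x16 identity (a real LFNST kernel is dense).
identity16 = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
block = [[16, 8, 4, 2], [8, 4, 2, 1], [4, 2, 1, 0], [2, 1, 0, 0]]
assert lfnst_forward(block, identity16) == block  # identity kernel is a no-op
```

Because the kernel acts on the flattened vector, it mixes horizontal and vertical frequencies jointly, which is what "non-separable" means here.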
  • LFNST: low-frequency non-separable transform
  • MTS: multiple transform selection
  • MTS selects among multiple transform kernels such as DCT-2, DST-7, and DCT-8.
  • MTS means using the DCT-2 and DCT-2 transform kernel pair, or a combination of DST-7 and DCT-8, for the horizontal and vertical directions of the transform block.
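The DCT-2, DST-7, and DCT-8 kernels named above can be illustrated with their real-valued basis functions; the codec itself uses scaled integer approximations, so this sketch is for intuition only. It builds each N x N basis and checks that the rows are orthonormal.

```python
import math

# Real-valued basis functions of the three MTS transform families.

def dct2(N):
    return [[math.sqrt(2 / N) * (math.sqrt(0.5) if k == 0 else 1.0)
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
             for n in range(N)] for k in range(N)]

def dst7(N):
    return [[math.sqrt(4 / (2 * N + 1))
             * math.sin(math.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))
             for n in range(N)] for k in range(N)]

def dct8(N):
    return [[math.sqrt(4 / (2 * N + 1))
             * math.cos(math.pi * (2 * k + 1) * (2 * n + 1) / (2 * (2 * N + 1)))
             for n in range(N)] for k in range(N)]

def is_orthonormal(T, tol=1e-9):
    """Check that the rows of T form an orthonormal set."""
    N = len(T)
    for i in range(N):
        for j in range(N):
            dot = sum(T[i][n] * T[j][n] for n in range(N))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

for make in (dct2, dst7, dct8):
    assert is_orthonormal(make(4)) and is_orthonormal(make(8))
```

Note that the DC (k = 0) row of DCT-2 is constant, which is why it compacts uniform residuals well, while the first DST-7 basis row ramps up from zero, matching residuals that grow with distance from the prediction boundary.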
  • the MTS transform kernel may be selected differently according to the method used to generate the intra prediction residual.
  • when generating an intra prediction residual using MIP, only the DCT-2 and DCT-2 transform pair is used as the MTS transform kernel.
  • that is, the MTS primary transform kernel uses only the DCT-2 and DCT-2 transform pair.
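The restriction described above can be sketched as a candidate-list rule: when the residual comes from MIP, only the (DCT-2, DCT-2) pair is allowed; otherwise the DST-7/DCT-8 combinations remain available. The exact candidate list below is illustrative, not normative.

```python
# Hedged sketch of MTS candidate restriction for MIP residuals.

MTS_PAIRS = [("DCT-2", "DCT-2"), ("DST-7", "DST-7"),
             ("DST-7", "DCT-8"), ("DCT-8", "DST-7"), ("DCT-8", "DCT-8")]

def mts_candidates(is_mip: bool):
    """Return the (horizontal, vertical) kernel pairs allowed for this block."""
    if is_mip:
        # MIP residuals are near-uniform, so only DCT-2 x DCT-2 is used.
        return [("DCT-2", "DCT-2")]
    return MTS_PAIRS

assert mts_candidates(True) == [("DCT-2", "DCT-2")]
assert ("DST-7", "DCT-8") in mts_candidates(False)
```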
  • Intra sub-block partitioning (ISP) refers to an intra prediction method in which a block is divided in the horizontal or vertical direction and intra prediction is performed on each partition. For example, a 4x4 block is divided into two 4x2 subblocks (horizontal division) or into two 2x4 subblocks (vertical division) to perform intra prediction.
  • ISP has the effect of increasing compression efficiency because the prediction distance is short and the prediction accuracy is therefore high.
  • different transform kernels may be applied to each subblock.
  • when ISP and LFNST are used, a method for determining the type of the transform kernel according to the block size is proposed.
  • FIG. 8 shows MTS kernel selection according to the ISP block size according to an embodiment of the present invention.
  • when the horizontal and vertical sizes of the current block are greater than or equal to 4 and less than or equal to 16, the transform kernel is determined to be an MTS kernel other than DCT-2, and when the horizontal or vertical size of the current block is less than 4 or greater than 16, the transform kernel may be determined to be DCT-2.
  • FIG. 9 shows an example of using a transform kernel according to an ISP block size.
  • a transform kernel other than DCT-2, such as DST-7, may be used for the horizontal and vertical directions
  • for other blocks, the DCT-2 MTS transform kernel may be used.
  • for smaller blocks, a kernel suited to the intra prediction residual signal, such as DST-7, may be used.
  • for larger blocks, since the residual signal has a uniform property, DCT-2 may be used.
  • in FIG. 9, only blocks of 4x4, 4x8, 8x4, and 8x8 sizes are separately indicated.
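The size rule described above for ISP blocks can be sketched as follows: a dimension in [4, 16] gets DST-7, otherwise DCT-2. Applying the rule independently per dimension is an assumption here, consistent with the 4x8 and 8x4 cases indicated in FIG. 9.

```python
# Sketch of implicit transform-kernel selection by ISP subblock size.

def isp_kernel(length: int) -> str:
    """DST-7 for a dimension in [4, 16], DCT-2 otherwise (assumed per-dimension)."""
    return "DST-7" if 4 <= length <= 16 else "DCT-2"

def isp_kernels(width: int, height: int):
    """Return the (horizontal, vertical) transform kernels for an ISP subblock."""
    return isp_kernel(width), isp_kernel(height)

assert isp_kernels(8, 8) == ("DST-7", "DST-7")    # within [4, 16]: DST-7
assert isp_kernels(2, 8) == ("DCT-2", "DST-7")    # width < 4: DCT-2 horizontally
assert isp_kernels(32, 16) == ("DCT-2", "DST-7")  # width > 16: DCT-2 horizontally
```

This kind of implicit rule needs no signaling, since the decoder can derive the same kernels from the block dimensions alone.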
  • Explicit MTS means a method of explicitly signaling which transform kernel is used in a transform block by transmitting information on the transform kernel used in the MTS.
  • FIG. 10 illustrates the primary transform kernel according to the block size when explicit primary transform kernel selection is applied according to an embodiment of the present invention.
  • a transform kernel other than DCT-2, such as DST-7, is used for the horizontal and vertical directions, and for other blocks, the DCT-2 MTS transform kernel is used.
  • for smaller blocks, a kernel suited to the intra prediction residual signal, such as DST-7, may be used; when the block size is large, since the residual signal has a uniform property, DCT-2 may be used.
  • FIG. 11 illustrates a transform kernel determined according to a subblock transform size and a partition type according to an embodiment of the present invention.
  • transform kernels for the horizontal and vertical directions are determined to be DCT-2 and DCT-2, respectively, and the horizontal and vertical lengths of the transform block are
  • the type of the transform kernel may be changed according to the division type of the subblock. For example, when the subblock to which the transform is applied results from a vertical division and is located on the left, the transform kernel for the horizontal direction may be determined to be DCT-8, and the transform kernel for the vertical direction may be determined to be DST-7.
  • the transform kernel for the horizontal direction may be determined as DST-7
  • the transform kernel for the vertical direction may be determined as DST-7
  • the transform kernel for the vertical direction may be determined as DCT-8.
  • the transform kernel for the horizontal direction is determined to be DST-7
  • the transform kernel for the vertical direction is determined to be DST-7.
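Only the vertical-division/left case is fully specified in the bullets above (horizontal: DCT-8, vertical: DST-7). The remaining pairings in the sketch below follow the common position-dependent subblock-transform convention and are assumptions, not statements of this specification.

```python
# Hedged sketch of position-dependent kernel selection for a transformed
# subblock. Only the ("vertical", first) case is taken directly from the text;
# the other cases assume the usual subblock-transform convention.

def subblock_kernels(split: str, first: bool):
    """Return (horizontal, vertical) kernels for a subblock.

    split: "vertical" or "horizontal" division of the block;
    first: True for the left (vertical split) or top (horizontal split) subblock.
    """
    if split == "vertical":
        # Left subblock: DCT-8 horizontally; right subblock: DST-7 (assumed).
        return ("DCT-8" if first else "DST-7", "DST-7")
    # Horizontal split: top subblock gets DCT-8 vertically (assumed).
    return ("DST-7", "DCT-8" if first else "DST-7")

assert subblock_kernels("vertical", True) == ("DCT-8", "DST-7")
assert subblock_kernels("horizontal", False) == ("DST-7", "DST-7")
```

The intuition behind such rules is that the residual tends to grow with distance from the prediction boundary, so the kernel whose basis rises toward that edge (DST-7 or DCT-8, depending on orientation) compacts it better.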
  • the method according to embodiments of the present invention may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
  • the method according to the embodiments of the present invention may be implemented in the form of a module, procedure, or function that performs the functions or operations described above.
  • the software code may be stored in the memory and executed by the processor.
  • the memory may be located inside or outside the processor, and data may be exchanged with the processor by various known means.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed is a video signal decoding apparatus comprising a processor. The processor determines a transform kernel for the width-direction transform of a current block and a transform kernel for the height-direction transform of the current block on the basis of predefined conditions, and obtains a residual signal of the current block using the transform kernels, wherein the predefined conditions are based on whether an intra sub-block partitioning (ISP) prediction method has been applied to the current block and whether a low-frequency non-separable transform (LFNST) has been applied to the current block.
PCT/KR2020/017198 2019-12-31 2020-11-27 Procédé de détermination de noyaux de transformée de traitement de signal vidéo et appareil associé WO2021137445A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20190179485 2019-12-31
KR10-2019-0179485 2019-12-31

Publications (1)

Publication Number Publication Date
WO2021137445A1 true WO2021137445A1 (fr) 2021-07-08

Family

ID=76685895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/017198 WO2021137445A1 (fr) 2019-12-31 2020-11-27 Procédé de détermination de noyaux de transformée de traitement de signal vidéo et appareil associé

Country Status (1)

Country Link
WO (1) WO2021137445A1 (fr)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3389274A1 (fr) * 2016-02-04 2018-10-17 Samsung Electronics Co., Ltd. Procédé et appareil de décodage de vidéo par transformation multiple de chrominance, et procédé et appareil de codage de vidéo par transformation multiple de chrominance
US20190320203A1 (en) * 2018-04-13 2019-10-17 Mediatek Inc. Implicit Transform Settings
US20190387241A1 (en) * 2018-06-03 2019-12-19 Lg Electronics Inc. Method and apparatus for processing video signals using reduced transform


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DE-LUXÁN-HERNÁNDEZ (FRAUNHOFER) S; GEORGE V; VENUGOPAL G; BRANDENBURG J; BROSS B; SCHWARZ H; MARPE D; WIEGAND (HHI) T; KOO M; SALE: "Non-CE6: Combination of JVET-P0196 and JVET-P0392 on applying LFNST for ISP Blocks and simplifying the transform signalling", 16. JVET MEETING; 20191001 - 20191011; GENEVA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 10 October 2019 (2019-10-10), Geneva, pages 1 - 4, XP030218433 *
F. LE LEANNEC (INTERDIGITAL), K. NASER (INTERDIGITAL), F. GALPIN (INTERDIGITAL): "CE6-related: LFNST applied to ISP mode", 16. JVET MEETING; 20191001 - 20191011; GENEVA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 4 October 2019 (2019-10-04), Geneva, pages 1 - 4, XP030217002 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023044919A1 (fr) * 2021-09-27 2023-03-30 Oppo广东移动通信有限公司 Procédé, dispositif, et système de codage et de décodage vidéo, et support de stockage
CN114422782A (zh) * 2021-12-23 2022-04-29 北京达佳互联信息技术有限公司 视频编码方法、装置、存储介质及电子设备
CN114422782B (zh) * 2021-12-23 2023-09-19 北京达佳互联信息技术有限公司 视频编码方法、装置、存储介质及电子设备

Similar Documents

Publication Publication Date Title
WO2019244116A1 (fr) Partition de bordure de codage vidéo
CN112740681A (zh) 自适应多重变换译码
WO2011133002A2 (fr) Dispositif et procédé de codage d'image
WO2010095915A2 (fr) Procédé de codage vidéo pour le codage d'un bloc de division, procédé de décodage vidéo pour le décodage d'un bloc de division, et moyen d'enregistrement aux fins d'application desdits procédés
WO2015005621A1 (fr) Procédé et appareil de traitement de signal vidéo
WO2013062191A1 (fr) Procédé et appareil de décodage d'image à mode de prédiction intra
WO2012134085A2 (fr) Procédé pour décoder une image dans un mode de prévision interne
WO2013062192A1 (fr) Procédé et appareil de codage d'informations de prédiction intra
WO2010087620A2 (fr) Procédé et appareil de codage et de décodage d'images par utilisation adaptative d'un filtre d'interpolation
WO2012057528A2 (fr) Procédé de codage et de décodage à prédiction intra adaptative
CN113196780A (zh) 使用多变换核处理视频信号的方法和设备
TW202218422A (zh) 用於在視訊譯碼期間進行濾波的多個神經網路模型
KR20210084567A (ko) 화면 내 예측 필터링을 이용한 비디오 신호 처리 방법 및 장치
WO2021137445A1 (fr) Procédé de détermination de noyaux de transformée de traitement de signal vidéo et appareil associé
JP2023105114A (ja) イントラ予測ベースのビデオ信号処理方法及び装置
WO2014171770A1 (fr) Procédé et appareil de traitement de signal vidéo
KR102480967B1 (ko) 영상 부호화/복호화 방법 및 장치
TW202209881A (zh) 多個自我調整迴路濾波器組
WO2018074626A1 (fr) Procédé et appareil de codage vidéo utilisant un filtre d'interpolation adaptatif
CN114830673A (zh) 用于多个层的共享解码器图片缓冲器
KR20200057991A (ko) 비디오 신호를 위한 dst-7, dct-8 변환 커널 생성 유도 방법 및 장치
WO2018169267A1 (fr) Dispositif et procédé de codage ou de décodage d'image
IL293448A (en) A history-based motion vector contract constraint for a merge evaluation area
WO2012091517A2 (fr) Dispositif de numérisation adaptatif et son procédé de numérisation
WO2013162272A1 (fr) Procédé et dispositif de traitement du signal vidéo

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20909594

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20909594

Country of ref document: EP

Kind code of ref document: A1