WO2022265415A1 - Method and apparatus for designing low-frequency non-separable transform - Google Patents
Method and apparatus for designing low-frequency non-separable transform
- Publication number
- WO2022265415A1 (PCT/KR2022/008512)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- lfnst
- transform
- matrix
- current block
- transform coefficients
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
- H04N19/463—Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
Definitions
- This document relates to video/image coding technology and, more particularly, to an image coding method and apparatus based on a low-frequency non-separable transform (LFNST) in a video or image coding system.
- A highly efficient video/image compression technology is required to effectively compress, transmit, store, and reproduce high-resolution, high-quality video/image information having various characteristics.
- The technical problem addressed by this document is to provide a method and apparatus for increasing image coding efficiency.
- Another technical problem addressed by this document is to provide an image coding method and apparatus for setting an LFNST kernel in consideration of computational complexity for samples.
- Another technical problem addressed by this document is to provide an image coding method and apparatus that apply LFNST while improving coding performance and minimizing complexity.
- A video encoding method performed by an encoding device includes: deriving modified transform coefficients by applying a low-frequency non-separable transform (LFNST) to the transform coefficients; and encoding image information including residual information derived based on the modified transform coefficients. An LFNST set for applying the LFNST is derived based on the intra prediction mode, and an LFNST matrix is derived based on the size of the current block and the LFNST set. If the width or height of the current block is 4 and both the width and height are 4 or more, the LFNST matrix is derived as a 16x16 matrix; if the width or height of the current block is 8 and both the width and height are 8 or more, the LFNST matrix may be derived as a 32x64 matrix.
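The two block-size conditions above can be sketched as a small lookup. This is only an illustration: the function name `lfnst_matrix_shape` is hypothetical, the returned tuple simply mirrors the "16x16" and "32x64" wording without asserting which number counts rows or columns, and block sizes not covered by the text return `None` rather than a guessed value.

```python
from typing import Optional, Tuple


def lfnst_matrix_shape(width: int, height: int) -> Optional[Tuple[int, int]]:
    """Return the LFNST matrix dimensions for the two cases stated in the text.

    "Width or height is 4 and both are 4 or more" is the same as saying
    that the smaller side equals 4; likewise for 8.
    Any other block size is not covered here and yields None.
    """
    smaller_side = min(width, height)
    if smaller_side == 4:
        return (16, 16)
    if smaller_side == 8:
        return (32, 64)
    return None


# Example: a 4x16 block falls in the first case, a 16x8 block in the second.
shape_4x16 = lfnst_matrix_shape(4, 16)
shape_16x8 = lfnst_matrix_shape(16, 8)
```

Comparing on the smaller side keeps the two conditions mutually exclusive, which matches how the text phrases them.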
- LFNST Low-Frequency Non-Separable Transform
- A digital storage medium storing image data, including encoded image information and/or a bitstream generated according to an image encoding method performed by an encoding device, may be provided, as may a method for transmitting such image information and/or bitstream.
- LFNST may be applied based on various conditions.
- The LFNST set index can be efficiently derived based on the intra prediction mode.
- FIG. 1 schematically shows an example of a video/image coding system to which embodiments of this document can be applied.
- FIG. 2 is a diagram schematically illustrating a configuration of a video/image encoding apparatus to which embodiments of the present document may be applied.
- FIG. 3 is a diagram schematically illustrating a configuration of a video/image decoding device to which embodiments of the present document may be applied.
- FIG. 4 exemplarily shows intra directional modes with 65 prediction directions.
- FIG. 5 exemplarily shows 93 prediction directions associated with wide-angle intra prediction modes.
- FIG. 6 is a diagram for explaining intra prediction for a non-square block according to an embodiment of the present document.
- FIG. 8 is a diagram for explaining RST according to an embodiment of the present document.
- FIG. 9 is a diagram illustrating a forward LFNST input area according to an embodiment of the present document.
- FIG. 11 is a diagram illustrating an input sequence of input data according to another embodiment of the present document.
- FIG. 12 is a diagram illustrating a non-square ROI according to an embodiment of the present document.
- FIG. 13 is a diagram illustrating a scanning sequence of transform coefficients according to an embodiment of the present document.
- FIG. 14 schematically illustrates an example of a video/image decoding method according to embodiments of the present document.
- FIG. 15 schematically illustrates an example of a video/image encoding method according to an embodiment of the present document.
- FIG. 16 shows an example of a content streaming system to which the embodiments disclosed in this document can be applied.
- Each component in the drawings described in this document is shown independently for convenience in describing its distinct function; this does not mean that each component is implemented as separate hardware or separate software.
- Two or more of the components may be combined to form one component, or one component may be divided into a plurality of components.
- Embodiments in which each configuration is integrated and/or separated are also included in the scope of rights of this document as long as they do not deviate from the essence of this document.
- FIG. 1 schematically shows an example of a video/image coding system to which embodiments of this document can be applied.
- A video/image coding system may include a first device (source device) and a second device (receiving device).
- The source device may transmit encoded video/image information or data to the receiving device in file or streaming form through a digital storage medium or a network.
- The source device may include a video source, an encoding device, and a transmitter.
- The receiving device may include a receiver, a decoding device, and a renderer.
- The encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device.
- The transmitter may be included in the encoding device.
- The receiver may be included in the decoding device.
- The renderer may include a display unit, and the display unit may be configured as a separate device or an external component.
- The video source may acquire video/images through a process of capturing, synthesizing, or generating video/images.
- The video source may include a video/image capture device and/or a video/image generation device.
- The video/image capture device may include, for example, one or more cameras, a video/image archive containing previously captured video/images, and the like.
- The video/image generation device may include, for example, computers, tablets, and smartphones, and may (electronically) generate video/images.
- A virtual video/image may be generated through a computer or the like, in which case the video/image capture process may be replaced by a process of generating related data.
- The encoding device may encode an input video/image.
- The encoding device may perform a series of procedures such as prediction, transformation, and quantization for compression and coding efficiency.
- Encoded data (encoded video/image information) may be output in the form of a bitstream.
- The transmitter may transmit the encoded video/image information or data output in the form of a bitstream to the receiver of the receiving device, in file or streaming form, through a digital storage medium or a network.
- Digital storage media may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD.
- The transmitter may include an element for generating a media file in a predetermined file format, and may include an element for transmission through a broadcasting/communication network.
- The receiver may receive/extract the bitstream and pass it to the decoding device.
- The decoding device may decode the video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operations of the encoding device.
- The renderer may render the decoded video/image.
- The rendered video/image may be displayed through the display unit.
- This document is about video/image coding.
- The method/embodiment disclosed in this document may be applied to a method disclosed in the versatile video coding (VVC) standard.
- The method/embodiment disclosed in this document may also be applied to a method disclosed in the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the 2nd generation of audio video coding standard (AVS2), or a next-generation video/image coding standard (e.g., H.267 or H.268).
- EVC essential video coding
- AV1 AOMedia Video 1
- AVS2 2nd generation of audio video coding standard
- next-generation video/image coding standard ex. H.267 or H.268, etc.
- A video may mean a set of a series of images over time.
- A picture generally means a unit representing one image in a specific time period.
- A slice/tile is a unit constituting a part of a picture in coding.
- A slice/tile may include one or more coding tree units (CTUs).
- CTUs coding tree units
- One picture may consist of one or more slices/tiles.
- A tile is a rectangular region of CTUs within a particular tile column and a particular tile row in a picture.
- The tile column is a rectangular region of CTUs having a height equal to the height of the picture and a width specified by syntax elements in the picture parameter set.
- The tile row is a rectangular region of CTUs having a height specified by syntax elements in the picture parameter set and a width equal to the width of the picture.
- A tile scan is a specific sequential ordering of CTUs partitioning a picture in which the CTUs are ordered consecutively in a CTU raster scan within a tile, whereas tiles in a picture are ordered consecutively in a raster scan of the tiles of the picture.
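The tile scan ordering described above can be illustrated with a short sketch. The function name `tile_scan_order` and its parameters are hypothetical: tile column widths and tile row heights are passed in directly (in CTUs) instead of being parsed from picture parameter set syntax elements.

```python
from typing import List, Tuple


def tile_scan_order(col_widths: List[int], row_heights: List[int]) -> List[Tuple[int, int]]:
    """List CTU positions (x, y) in tile-scan order.

    col_widths:  width of each tile column, in CTUs (assumed input).
    row_heights: height of each tile row, in CTUs (assumed input).
    """
    order = []
    y0 = 0
    for tile_height in row_heights:        # tiles visited in raster order: row by row
        x0 = 0
        for tile_width in col_widths:      # left to right within a tile row
            # CTU raster scan inside the current tile
            for y in range(y0, y0 + tile_height):
                for x in range(x0, x0 + tile_width):
                    order.append((x, y))
            x0 += tile_width
        y0 += tile_height
    return order


# A picture 4 CTUs wide and 2 CTUs high, split into two 2x2 tiles side by side.
two_tiles = tile_scan_order([2, 2], [2])
```

In this example all four CTUs of the left tile come before any CTU of the right tile, which is what distinguishes a tile scan from a plain raster scan of the picture.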
- A slice may contain an integer number of complete tiles or an integer number of consecutive complete CTU rows within a tile of a picture, and may be contained exclusively in a single NAL unit.
- One picture may be divided into two or more subpictures.
- A subpicture may be a rectangular region of one or more slices within a picture.
- A pixel or pel may mean the minimum unit constituting one picture (or image). Also, 'sample' may be used as a term corresponding to a pixel.
- A sample may generally represent a pixel or a pixel value, and may represent only a pixel/pixel value of a luma component or only a pixel/pixel value of a chroma component.
- In this document, “A or B” may mean “only A”, “only B”, or “both A and B”.
- In other words, “A or B” in this document may be interpreted as “A and/or B”.
- “A, B or C” in this document means “only A”, “only B”, “only C”, or “any combination of A, B and C”.
- A slash (/) or comma used in this document may mean “and/or”.
- “A/B” may mean “A and/or B”. Accordingly, “A/B” may mean “only A”, “only B”, or “both A and B”.
- “A, B, C” may mean “A, B or C”.
- “At least one of A and B” may mean “only A”, “only B”, or “both A and B”.
- The expression “at least one of A or B” or “at least one of A and/or B” may be interpreted the same as “at least one of A and B”.
- “At least one of A, B and C” may mean “only A”, “only B”, “only C”, or “any combination of A, B and C”. Also, “at least one of A, B or C” or “at least one of A, B and/or C” may mean “at least one of A, B and C”.
- Parentheses used in this document may mean “for example”. Specifically, when “prediction (intra prediction)” is indicated, “intra prediction” may be suggested as an example of “prediction”. In other words, “prediction” in this document is not limited to “intra prediction”, and “intra prediction” may be suggested as an example of “prediction”. Also, even when “prediction (i.e., intra prediction)” is indicated, “intra prediction” may be suggested as an example of “prediction”.
- An encoding device may include an image encoding device and/or a video encoding device.
- The encoding device 200 may include an image partitioner 210, a predictor 220, a residual processor 230, an entropy encoder 240, an adder 250, a filter 260, and a memory 270.
- The predictor 220 may include an inter predictor 221 and an intra predictor 222.
- The residual processor 230 may include a transformer 232, a quantizer 233, a dequantizer 234, and an inverse transformer 235.
- The residual processor 230 may further include a subtractor 231.
- The adder 250 may be called a reconstructor or a reconstructed block generator.
- The above-described image partitioner 210, predictor 220, residual processor 230, entropy encoder 240, adder 250, and filter 260 may be configured by one or more hardware components (for example, an encoder chipset or processor). Also, the memory 270 may include a decoded picture buffer (DPB) and may be configured by a digital storage medium. The hardware component may further include the memory 270 as an internal/external component.
- DPB decoded picture buffer
- The image partitioner 210 may divide an input image (or picture or frame) input to the encoding device 200 into one or more processing units.
- The processing unit may be called a coding unit (CU).
- The coding unit may be partitioned recursively from a coding tree unit (CTU) or a largest coding unit (LCU) according to a quad-tree binary-tree ternary-tree (QTBTTT) structure.
- CTU coding tree unit
- LCU largest coding unit
- QTBTTT quad-tree binary-tree ternary-tree
- One coding unit may be divided into a plurality of coding units of deeper depth based on a quad-tree structure, a binary-tree structure, and/or a ternary-tree structure.
- A quad-tree structure may be applied first, and a binary-tree structure and/or ternary-tree structure may be applied later.
- Alternatively, a binary-tree structure may be applied first.
- A coding procedure according to this document may be performed based on a final coding unit that is not further divided. In this case, based on coding efficiency according to image characteristics, the largest coding unit may be used directly as the final coding unit, or the coding unit may be recursively divided into coding units of deeper depth as needed so that a coding unit of optimal size is used as the final coding unit.
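The recursive division into final coding units can be sketched for the quad-tree case only. This is a simplified illustration under stated assumptions: binary-tree and ternary-tree splits are omitted, the split decision is a caller-supplied callback (`should_split`, a hypothetical stand-in for whatever coding-efficiency criterion the encoder uses), and `min_size` is an assumed lower bound that the text does not specify.

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def quad_split(block: Rect,
               should_split: Callable[[Rect], bool],
               min_size: int = 4) -> List[Rect]:
    """Recursively split a block into four equal quadrants.

    Returns the final coding units, i.e. the blocks that are not divided further.
    """
    x, y, w, h = block
    # Stop when the block is already at the assumed minimum size
    # or the (hypothetical) decision callback declines to split.
    if w <= min_size or h <= min_size or not should_split(block):
        return [block]
    half_w, half_h = w // 2, h // 2
    leaves: List[Rect] = []
    for cx, cy in ((x, y), (x + half_w, y), (x, y + half_h), (x + half_w, y + half_h)):
        leaves.extend(quad_split((cx, cy, half_w, half_h), should_split, min_size))
    return leaves


# Split only the 16x16 root once; its four 8x8 children become final coding units.
final_units = quad_split((0, 0, 16, 16), lambda b: b[2] == 16)
```

When the callback never asks for a split, the root block itself is returned, which corresponds to using the largest coding unit directly as the final coding unit.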
- The coding procedure may include procedures such as prediction, transformation, and reconstruction, which will be described later.
- The processing unit may further include a prediction unit (PU) or a transform unit (TU).
- The prediction unit and the transform unit may each be divided or partitioned from the above-described final coding unit.
- The prediction unit may be a unit of sample prediction.
- The transform unit may be a unit for deriving transform coefficients and/or a unit for deriving a residual signal from transform coefficients.
- The encoding device 200 may subtract the prediction signal (predicted block, prediction sample array) output from the inter predictor 221 or the intra predictor 222 from the input image signal (original block, original sample array) to generate a residual signal (residual block, residual sample array), and the generated residual signal is transmitted to the transformer 232.
- The unit that subtracts the prediction signal (predicted block, prediction sample array) from the input image signal (original block, original sample array) in the encoding device 200 may be called the subtractor 231.
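The subtraction described above amounts to a sample-by-sample difference between two equally sized arrays. A minimal sketch, with blocks represented as nested Python lists and with a hypothetical function name and made-up sample values:

```python
from typing import List

Block = List[List[int]]


def subtract_prediction(original: Block, prediction: Block) -> Block:
    """Residual block = original block minus predicted block, sample by sample."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]


# Illustrative 2x2 blocks (values are invented for the example).
original = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual = subtract_prediction(original, prediction)
```

Residual samples can be negative, as the last sample here is, so the residual needs a signed representation.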
- The predictor may perform prediction on a block to be processed (hereinafter referred to as the current block) and generate a predicted block including prediction samples for the current block.
- A reference picture including the reference block and a reference picture including the temporal neighboring block may be the same or different.
- The temporal neighboring block may be called a collocated reference block, a collocated CU (colCU), or the like, and a reference picture including the temporal neighboring block may be called a collocated picture (colPic).
- The inter predictor 221 may construct a motion information candidate list based on neighboring blocks and generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes. For example, in skip mode and merge mode, the inter predictor 221 may use motion information of a neighboring block as motion information of the current block.
- The predictor 220 may generate a prediction signal based on various prediction methods described later.
- The predictor may apply intra prediction or inter prediction to predict one block, and may also apply intra prediction and inter prediction simultaneously. This may be called combined inter and intra prediction (CIIP).
- The predictor may also be based on an intra block copy (IBC) prediction mode or a palette mode for prediction of a block.
- IBC intra block copy
- The IBC prediction mode or the palette mode may be used for image/video coding of content such as games, for example, screen content coding (SCC).
- SCC screen content coding
- IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document.
- Palette mode can be viewed as an example of intra coding or intra prediction. When the palette mode is applied, a sample value within a picture may be signaled based on information
- The quantizer 233 may quantize the transform coefficients and transmit them to the entropy encoder 240, and the entropy encoder 240 may encode the quantized signal (information on the quantized transform coefficients) and output it as a bitstream. Information about the quantized transform coefficients may be referred to as residual information.
- The quantizer 233 may rearrange the block-form quantized transform coefficients into a one-dimensional vector based on a coefficient scan order, and may generate information about the quantized transform coefficients based on the quantized transform coefficients in one-dimensional vector form.
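The rearrangement of a coefficient block into a one-dimensional vector can be sketched as follows. The text does not say which coefficient scan order is used, so the anti-diagonal scan below is purely an assumption chosen for illustration; both function names are hypothetical.

```python
from typing import List, Tuple


def diagonal_scan_positions(width: int, height: int) -> List[Tuple[int, int]]:
    """Positions (x, y) visited diagonal by diagonal, starting at the top-left.

    Each anti-diagonal is walked from its bottom-left end to its top-right end.
    This particular order is an assumed example, not taken from the text.
    """
    positions = []
    for d in range(width + height - 1):
        for y in range(d, -1, -1):
            x = d - y
            if y < height and x < width:
                positions.append((x, y))
    return positions


def block_to_vector(block: List[List[int]]) -> List[int]:
    """Rearrange a 2-D coefficient block into a 1-D vector using the scan above."""
    height, width = len(block), len(block[0])
    return [block[y][x] for x, y in diagonal_scan_positions(width, height)]


# Illustrative 2x2 block of quantized coefficients (values are invented).
vector = block_to_vector([[9, 3], [2, 0]])
```

Swapping in a different scan only requires replacing `diagonal_scan_positions`; the rearrangement step itself stays the same.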
- The entropy encoder 240 may perform various encoding methods such as exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC).
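Of the methods named above, exponential Golomb coding is simple enough to show in a few lines. The sketch below covers the order-0 code for non-negative integers; the text does not state which order or which signed/unsigned mapping is used, so that choice is an assumption, and the function names are hypothetical. Code words are returned as strings of "0" and "1" for readability.

```python
def exp_golomb_encode(value: int) -> str:
    """Order-0 exponential-Golomb code word for a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]                 # binary digits of value + 1
    return "0" * (len(binary) - 1) + binary     # leading-zero prefix, then the digits


def exp_golomb_decode(bits: str) -> int:
    """Decode a single order-0 exponential-Golomb code word."""
    zeros = len(bits) - len(bits.lstrip("0"))   # number of leading zeros
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codeword_for_3 = exp_golomb_encode(3)
```

Small values get short code words and the length grows logarithmically, which suits syntax elements whose small values are the most frequent.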
- The entropy encoder 240 may encode information necessary for video/image reconstruction (e.g., values of syntax elements) together with or separately from the quantized transform coefficients.
- Encoded information (e.g., encoded video/image information)
- NAL network abstraction layer
- The video/image information may further include information on various parameter sets such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- The video/image information may further include general constraint information.
- Information and/or syntax elements transmitted/signaled from the encoding device to the decoding device may be included in the video/image information.
- The video/image information may be encoded through the above-described encoding procedure and included in the bitstream.
- The bitstream may be transmitted through a network or stored in a digital storage medium.
- The network may include a broadcasting network and/or a communication network.
- The digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD.
- A transmitter (not shown) for transmitting the signal output from the entropy encoder 240 and/or a storage unit (not shown) for storing it may be configured as internal/external elements of the encoding device 200, or the transmitter may be included in the entropy encoder 240.
- The quantized transform coefficients output from the quantizer 233 may be used to generate a prediction signal.
- A residual signal (residual block or residual samples)
- The adder 250 may add the reconstructed residual signal to the prediction signal output from the inter predictor 221 or the intra predictor 222 to generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array).
- A predicted block may be used as a reconstructed block.
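The adder's operation can be sketched as the counterpart of the subtraction step: the reconstructed residual is added back to the prediction. In this illustration the function name is hypothetical, blocks are nested Python lists, a missing residual is modeled as `None` so that the predicted block is returned as the reconstructed block, and any clipping to the valid sample range is left out.

```python
from typing import List, Optional

Block = List[List[int]]


def reconstruct(prediction: Block, residual: Optional[Block] = None) -> Block:
    """Reconstructed block = predicted block plus reconstructed residual.

    With no residual (None), a copy of the predicted block is used as is.
    """
    if residual is None:
        return [row[:] for row in prediction]
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


# Illustrative 2x2 blocks (values are invented for the example).
reconstructed = reconstruct([[50, 50], [60, 60]], [[2, 5], [1, -1]])
prediction_only = reconstruct([[50, 50], [60, 60]])
```

Because quantization is lossy, the reconstructed residual generally differs from the encoder's original residual, so the reconstructed block approximates the original block and does not reproduce it exactly.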
- The adder 250 may be called a reconstructor or a reconstructed block generator.
- The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, or may be used for inter prediction of the next picture after filtering as described below.
- LMCS luma mapping with chroma scaling
- the filtering unit 260 may improve subjective/objective picture quality by applying filtering to the reconstructed signal.
- the filtering unit 260 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may store the modified reconstructed picture in the memory 270, specifically in the DPB of the memory 270.
- the various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and the like.
- the filtering unit 260 may generate various filtering-related information and transmit them to the entropy encoding unit 240, as will be described later in the description of each filtering method. Filtering-related information may be encoded in the entropy encoding unit 240 and output in the form of a bitstream.
- the modified reconstructed picture transmitted to the memory 270 may be used as a reference picture in the inter prediction unit 221.
- the encoding device can avoid prediction mismatch between the encoding device 200 and the decoding device, and can also improve encoding efficiency.
- the DPB of the memory 270 may store the modified reconstructed picture to be used as a reference picture in the inter prediction unit 221.
- the memory 270 may store motion information of a block in a current picture from which motion information is derived (or encoded) and/or motion information of blocks in a previously reconstructed picture.
- the stored motion information may be transmitted to the inter prediction unit 221 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory 270 may store reconstructed samples of reconstructed blocks in the current picture and transfer them to the intra predictor 222.
- a decoding device may include an image decoding device and/or a video decoding device.
- the decoding device 300 may include an entropy decoder 310, a residual processor 320, a predictor 330, an adder 340, a filtering unit (filter) 350, and a memory 360.
- the prediction unit 330 may include an inter prediction unit 331 and an intra prediction unit 332.
- the residual processing unit 320 may include a dequantizer 321 and an inverse transformer 322.
- the above-described entropy decoding unit 310, residual processing unit 320, prediction unit 330, adder 340, and filtering unit 350 may be configured by one hardware component (for example, a decoder chipset or processor) according to an embodiment. Also, the memory 360 may include a decoded picture buffer (DPB) and may be configured by a digital storage medium.
- the hardware component may further include a memory 360 as an internal/external component.
- the decoding device 300 may restore an image corresponding to a process in which the video/image information is processed by the encoding device of FIG. 2.
- the decoding device 300 may derive units/blocks based on block division related information obtained from the bitstream.
- the decoding device 300 may perform decoding using a processing unit applied in the encoding device.
- a processing unit of decoding may be a coding unit, for example, and a coding unit may be partitioned from a coding tree unit or a largest coding unit along a quad tree structure, a binary tree structure and/or a ternary tree structure.
- One or more transform units may be derived from a coding unit.
- the restored video signal decoded and output through the decoding device 300 may be reproduced through a playback device.
- the decoding device 300 may receive a signal output from the encoding device of FIG. 2 in the form of a bitstream, and the received signal may be decoded through the entropy decoding unit 310.
- the entropy decoding unit 310 may parse the bitstream to derive information (eg, video/image information) necessary for image restoration (or picture restoration).
- the video/image information may further include information on various parameter sets such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- the video/image information may further include general constraint information.
- the decoding device may decode a picture further based on the information about the parameter set and/or the general constraint information.
- Residual values, that is, quantized transform coefficients and related parameter information, may be input to the residual processing unit 320.
- the residual processor 320 may derive a residual signal (residual block, residual samples, residual sample array). Also, among information decoded by the entropy decoding unit 310, information about filtering may be provided to the filtering unit 350. Meanwhile, a receiving unit (not shown) receiving a signal output from the encoding device may be further configured as an internal/external element of the decoding device 300, or the receiving unit may be a component of the entropy decoding unit 310.
- the decoding device may be referred to as a video/image/picture decoding device, and the decoding device may be divided into an information decoder (video/image/picture information decoder) and a sample decoder (video/image/picture sample decoder).
- the information decoder may include the entropy decoding unit 310, and the sample decoder may include at least one of the inverse quantization unit 321, the inverse transform unit 322, the adder 340, the filtering unit 350, the memory 360, the inter prediction unit 332, and the intra prediction unit 331.
- the inverse quantization unit 321 may inversely quantize the quantized transform coefficients and output transform coefficients.
- the inverse quantization unit 321 may rearrange the quantized transform coefficients in a 2D block form. In this case, the rearrangement may be performed based on a coefficient scanning order performed by the encoding device.
- the inverse quantization unit 321 may perform inverse quantization on quantized transform coefficients using a quantization parameter (e.g., quantization step size information) and obtain transform coefficients.
- a quantization parameter (e.g., quantization step size information)
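As a rough illustration of the inverse quantization step described above, the following sketch scales each quantized level by a single step size. The function name `dequantize` and the use of one uniform step size for the whole block are assumptions made for illustration; the text only states that a quantization parameter such as step size information is used.

```python
def dequantize(levels, step_size):
    """Scale a 2D block of quantized levels by one uniform step size.

    Illustrative only: a real codec derives the scaling from the
    quantization parameter and may apply per-position scaling.
    """
    return [[level * step_size for level in row] for row in levels]


quantized = [[3, -1], [0, 2]]
coeffs = dequantize(quantized, step_size=4)
```

The 2D layout assumes the quantized coefficients have already been rearranged into block form according to the coefficient scanning order.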
- a residual signal (residual block, residual sample array) is obtained by inverse transforming the transform coefficients.
- the prediction unit 330 may perform prediction on a current block and generate a predicted block including predicted samples of the current block.
- the prediction unit 330 may determine whether intra prediction or inter prediction is applied to the current block based on the information about the prediction output from the entropy decoding unit 310, and determine a specific intra/inter prediction mode.
- the prediction unit 330 may generate a prediction signal based on various prediction methods described later. For example, the prediction unit 330 may apply intra-prediction or inter-prediction to predict one block, and may simultaneously apply intra-prediction and inter-prediction. This may be called combined inter and intra prediction (CIIP). Also, the prediction unit 330 may be based on an intra block copy (IBC) prediction mode or a palette mode for block prediction.
- IBC (intra block copy)
- the IBC prediction mode or the palette mode may be used for image/video coding of content such as a game, for example, screen content coding (SCC). IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document.
- Palette mode can be viewed as an example of intra coding or intra prediction. When the palette mode is applied, information on a palette table and a palette index may be included in the video/image information and signaled.
- the intra predictor 331 may predict a current block by referring to samples in the current picture.
- the referenced samples may be located in the neighborhood of the current block or may be located apart from each other according to a prediction mode.
- prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
- the intra prediction unit 331 may determine a prediction mode applied to the current block by using a prediction mode applied to neighboring blocks.
- the inter prediction unit 332 may derive a predicted block for a current block based on a reference block (reference sample array) specified by a motion vector on a reference picture.
- motion information may be predicted in units of blocks, subblocks, or samples based on correlation of motion information between neighboring blocks and the current block.
- the motion information may include a motion vector and a reference picture index.
- the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
- a neighboring block may include a spatial neighboring block present in the current picture and a temporal neighboring block present in the reference picture.
- the inter predictor 332 may construct a motion information candidate list based on neighboring blocks and derive a motion vector and/or reference picture index of the current block based on the received candidate selection information. Inter prediction may be performed based on various prediction modes, and the prediction information may include information indicating an inter prediction mode for the current block.
- the adder 340 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the obtained residual signal to the prediction signal (predicted block, prediction sample array) output from the prediction unit (including the inter prediction unit 332 and/or the intra prediction unit 331). When there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as the reconstructed block.
- the adder 340 may be called a restoration unit or a restoration block generation unit.
- the generated reconstruction signal may be used for intra prediction of the next processing target block in the current picture, output after filtering as described later, or may be used for inter prediction of the next picture.
- the filtering unit 350 may improve subjective/objective picture quality by applying filtering to the reconstructed signal.
- the filtering unit 350 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and store the modified reconstructed picture in the memory 360, specifically the DPB of the memory 360.
- the various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and the like.
- a (modified) reconstructed picture stored in the DPB of the memory 360 may be used as a reference picture in the inter prediction unit 332.
- the memory 360 may store motion information of a block in the current picture from which motion information is derived (or decoded) and/or motion information of blocks in a previously reconstructed picture.
- the stored motion information may be transmitted to the inter prediction unit 332 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory 360 may store reconstructed samples of reconstructed blocks in the current picture and transfer them to the intra prediction unit 331.
- the embodiments described for the filtering unit 260, the inter prediction unit 221, and the intra prediction unit 222 of the encoding device 200 may be applied in the same or a corresponding manner to the filtering unit 350, the inter prediction unit 332, and the intra prediction unit 331 of the decoding device 300, respectively.
- the predicted block includes prediction samples in the spatial domain (or pixel domain).
- the predicted block is derived identically in the encoding device and the decoding device, and the encoding device can increase image coding efficiency by signaling to the decoding device information about the residual between the original block and the predicted block (residual information), rather than the original sample values of the original block.
- the decoding device may derive a residual block including residual samples based on the residual information, generate a reconstructed block including reconstructed samples by combining the residual block and the predicted block, and generate a reconstructed picture including the reconstructed blocks.
- the residual information may be generated through transformation and quantization procedures.
- the encoding apparatus derives a residual block between the original block and the predicted block, and derives transform coefficients by performing a transform procedure on residual samples (residual sample array) included in the residual block. And, by performing a quantization procedure on the transform coefficients, quantized transform coefficients may be derived, and related residual information may be signaled to the decoding device (through a bitstream).
- the residual information may include information such as value information of the quantized transform coefficients, location information, transform technique, transform kernel, and quantization parameter.
- the decoding device may perform an inverse quantization/inverse transform procedure based on the residual information and derive residual samples (or residual blocks).
- the decoding device may generate a reconstructed picture based on the predicted block and the residual block.
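The reconstruction step described above (adding the residual block to the predicted block) can be sketched as follows. The function name `reconstruct_block`, the 8-bit default, and the clipping to the valid sample range are assumptions for illustration; the text itself only states that the two blocks are combined.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add residual samples to predicted samples, sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


pred = [[100, 250], [5, 128]]
resid = [[10, 20], [-9, 0]]
recon = reconstruct_block(pred, resid)
```

When there is no residual, as in the skip mode case, passing an all-zero residual block returns the predicted block unchanged.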
- the encoding device may also derive a residual block by inverse quantizing/inverse transforming the quantized transform coefficients for reference for inter prediction of a later picture, and generate a reconstructed picture based on the residual block.
- At least one of quantization/inverse quantization and/or transform/inverse transform may be omitted. If the quantization/inverse quantization is omitted, the quantized transform coefficient may be referred to as a transform coefficient. If the transform/inverse transform is omitted, the transform coefficients may be called coefficients or residual coefficients, or may still be called transform coefficients for unity of expression.
- quantized transform coefficients and transform coefficients may be referred to as transform coefficients and scaled transform coefficients, respectively.
- the residual information may include information on transform coefficient(s), and the information on the transform coefficient(s) may be signaled through residual coding syntax.
- Transform coefficients may be derived based on the residual information (or information about the transform coefficient(s)), and scaled transform coefficients may be derived through inverse transform (scaling) of the transform coefficients.
- Residual samples may be derived based on an inverse transform (transform) of the scaled transform coefficients. This may be applied/expressed in other parts of this document as well.
- Intra prediction may indicate prediction for generating prediction samples for a current block based on reference samples within a picture to which the current block belongs (hereinafter referred to as current picture).
- neighboring reference samples to be used for intra prediction of the current block may be derived.
- the neighboring reference samples of the current block of size nWxnH may include a total of 2xnH samples adjacent to the left boundary and the bottom-left of the current block, a total of 2xnW samples adjacent to the top boundary and the top-right of the current block, and one sample adjacent to the top-left of the current block.
- the neighboring reference samples of the current block may include a plurality of columns of upper neighboring samples and a plurality of rows of left neighboring samples.
- the neighboring reference samples of the current block may include a total of nH samples adjacent to the right boundary of the current block of size nWxnH, a total of nW samples adjacent to the bottom boundary of the current block, and one sample adjacent to the bottom-right of the current block.
- neighboring reference samples of the current block may not be decoded yet or may not be available.
- the decoder may construct neighboring reference samples to be used for prediction by substituting unavailable samples with available samples.
- neighboring reference samples to be used for prediction may be configured through interpolation of available samples.
- (i) a prediction sample may be derived based on the average or interpolation of neighboring reference samples of the current block, and (ii) a prediction sample may be derived based on a reference sample existing in a specific (prediction) direction with respect to the prediction sample among the neighboring reference samples of the current block.
- Case (i) may be called a non-directional mode or non-angular mode, and case (ii) may be called a directional mode or angular mode.
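Case (i) can be illustrated with a DC-style predictor that fills the block with the average of the neighboring reference samples. This is a minimal sketch: the function name `dc_predict`, the choice of using only the top and left neighbors, and the rounding rule are assumptions, not details taken from this document.

```python
def dc_predict(top, left, width, height):
    """Fill a width x height block with the rounded mean of the neighbors.

    Assumption: only the `width` top and `height` left reference samples
    contribute, and the mean is rounded to the nearest integer.
    """
    refs = list(top[:width]) + list(left[:height])
    dc = (sum(refs) + len(refs) // 2) // len(refs)
    return [[dc] * width for _ in range(height)]


block = dc_predict(top=[10, 12, 14, 16], left=[20, 22, 24, 26], width=4, height=4)
```

A directional mode, case (ii), would instead copy or interpolate the reference sample lying along the prediction direction for each position.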
- Prediction samples may be generated.
- LIP (linear interpolation intra prediction)
- chroma prediction samples may be generated based on luma samples using a linear model (LM). This case may be called an LM mode or a chroma component LM (CCLM) mode.
- a temporary prediction sample of the current block may be derived based on the filtered neighboring reference samples, and a prediction sample of the current block may be derived as a weighted sum of the temporary prediction sample and at least one reference sample derived according to the intra prediction mode among the existing, that is, unfiltered, neighboring reference samples. This case may be called position dependent intra prediction (PDPC).
- a reference sample line having the highest prediction accuracy may be selected among multiple reference sample lines adjacent to the current block, a prediction sample may be derived using a reference sample located in the prediction direction in the corresponding line, and intra prediction encoding may be performed by indicating (signaling) the used reference sample line to the decoding device.
- the above case may be referred to as multi-reference line intra prediction or MRL-based intra prediction.
- intra prediction may be performed based on the same intra prediction mode by dividing the current block into vertical or horizontal sub-partitions, but neighboring reference samples may be derived and used in units of sub-partitions. That is, in this case, the intra-prediction mode for the current block is equally applied to the sub-partitions, but intra-prediction performance can be improved in some cases by deriving and using neighboring reference samples in units of sub-partitions.
- This prediction method may be called intra prediction based on intra sub-partitions (ISP).
- the aforementioned intra prediction methods may be referred to as an intra prediction type to be distinguished from an intra prediction mode.
- the intra prediction type may be called various terms such as an intra prediction technique or an additional intra prediction mode.
- the intra prediction type (or additional intra prediction mode, etc.) may include at least one of the aforementioned LIP, PDPC, MRL, and ISP.
- a general intra prediction method excluding specific intra prediction types such as LIP, PDPC, MRL, and ISP may be referred to as a normal intra prediction type.
- the normal intra prediction type may be generally applied when the specific intra prediction type as described above is not applied, and prediction may be performed based on the aforementioned intra prediction mode. Meanwhile, post-processing filtering may be performed on the derived prediction samples as needed.
- the intra prediction modes may include two non-directional intra prediction modes and 65 directional prediction modes.
- the non-directional intra prediction modes may include a planar intra prediction mode and a DC intra prediction mode, and the directional intra prediction modes may include intra prediction modes numbered 2 to 66.
- FIG. 4 exemplarily shows 65 directional intra prediction modes.
- a residual block (residual samples) may be obtained by receiving (inverse quantized) transform coefficients and performing the primary (separable) inverse transform.
- the encoding device and the decoding device may generate a reconstructed block based on the residual block and the predicted block and generate a reconstructed picture based on the reconstructed block.
- a reduced secondary transform (RST) with a reduced size of the transformation matrix (kernel) can be applied within the concept of NSST in order to reduce the amount of computation and memory required for the non-separable secondary transform.
- RST (reduced secondary transform)
- transformation matrix (kernel)
- since the RST is mainly performed in a low-frequency region including non-zero coefficients in a transform block, it may be referred to as a low-frequency non-separable transform (LFNST).
- the transform index may be named the LFNST index.
- LFNST may mean a transform performed on residual samples of a target block based on a transform matrix having a reduced size.
- when the simplified transform is performed, the amount of computation required for the transform may be reduced due to the reduction in the size of the transformation matrix. That is, LFNST can be used to address the computational complexity issue that arises when transforming large blocks or applying non-separable transforms.
- the inverse transform unit 235 of the encoding apparatus 200 and the inverse transform unit 322 of the decoding apparatus 300 may include an inverse RST unit for deriving modified transform coefficients based on the inverse RST of the transform coefficients, and an inverse primary transform unit for deriving residual samples for the target block based on the inverse primary transform of the modified transform coefficients.
- the inverse primary transform means an inverse transform of the primary transform applied to the residual.
- deriving a transform coefficient based on a transform may mean deriving a transform coefficient by applying a corresponding transform.
- FIG. 8 is a diagram for explaining RST or LFNST to which RST is applied according to an embodiment of the present document.
- a “target block” may mean a current block, residual block, or transform block on which coding is performed.
- a reduced transformation matrix may be determined by mapping an N dimensional vector to an R dimensional vector located in another space, where R is less than N.
- N may mean the square of the length of one side of a block to which a transform is applied or the total number of transform coefficients corresponding to a block to which a transform is applied.
- the simplification factor may mean an R/N value.
- the simplification factor may be referred to by various terms such as reduced factor, reduction factor, simplified factor, and simple factor.
- R may be referred to as a reduced coefficient, but in some cases, a reduced factor may mean R.
- the simplification factor may mean an N/R value.
- the size of the simplified transform matrix according to an embodiment is RxN, which is smaller than the size NxN of the normal transform matrix, and the matrix may be defined as in Equation 1 below.
- matrix operation can be understood as an operation that obtains a column vector by placing the matrix on the left of the column vector and multiplying the matrix by the column vector.
- the size of the inverse RST matrix T_{NxR} is NxR, which is smaller than the size NxN of a normal inverse transform matrix, and it has a transpose relationship with the simplified transform matrix T_{RxN} shown in Equation 1.
- the matrix T^t in the transform block may mean the inverse RST matrix T_{RxN}^T (the superscript T means transpose).
- T means transpose
- modified transform coefficients of the target block or residual samples of the target block may be derived.
- the inverse RST matrix T_{RxN}^T may be expressed as (T_{RxN})^T_{NxR}.
- modified transform coefficients for the target block may be derived when transform coefficients for the target block are multiplied by the inverse RST matrix T_{RxN}^T.
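The relationship between the forward RxN matrix and its NxR transpose can be sketched as below. The helper names (`matvec`, `transpose`, `forward_rst`, `inverse_rst`) and the toy 2x4 matrix are assumptions chosen for illustration; actual LFNST kernels are trained matrices that are not reproduced here.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a column vector placed on its right."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def forward_rst(t_rxn, x):
    """Map an N-dimensional input vector to R coefficients."""
    return matvec(t_rxn, x)


def inverse_rst(t_rxn, y):
    """Map R coefficients back to N values using the transposed (NxR) matrix."""
    return matvec(transpose(t_rxn), y)


# Toy example with R = 2, N = 4 and orthonormal rows (hypothetical values).
T = [[0.5, 0.5, 0.5, 0.5],
     [0.5, 0.5, -0.5, -0.5]]
x = [4.0, 4.0, 2.0, 2.0]
y = forward_rst(T, x)
x_hat = inverse_rst(T, y)
```

In this toy case the input lies in the span of the two rows, so the transpose recovers it exactly; in general, information outside that span is discarded by the reduced transform.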
- the size of the normal inverse transformation matrix is 64x64 (NxN), but the size of the simplified inverse transformation matrix is reduced to 64x16 (NxR).
- memory usage can be reduced by the R/N ratio.
- the number of multiplication operations (NxR) can likewise be reduced by the R/N ratio.
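The savings for the 64x64 versus 64x16 example above can be worked out directly. The variable names are illustrative, and the count assumes one stored entry and one multiplication per matrix element.

```python
N, R = 64, 16

full_entries = N * N      # normal inverse transform matrix (NxN)
reduced_entries = N * R   # simplified inverse transform matrix (NxR)

# Both the stored entries and the multiplications per transformed vector
# scale with the number of matrix elements.
ratio = reduced_entries / full_entries
```

With these values the reduced matrix needs 1024 entries instead of 4096, which corresponds to R/N = 1/4.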
- a simplified inverse transform matrix or inverse transform matrix may also be named a simplified transform matrix or a transform matrix if it is not confusing whether it is a transform or an inverse transform.
- a maximum 16 x 48 transform kernel matrix may be applied by selecting only 48 data, instead of a 16 x 64 transform kernel matrix for the 64 data constituting an 8 x 8 area.
- maximum means that the maximum value of m is 16 for an m x 48 transform kernel matrix capable of generating m coefficients.
- a transposed matrix of the transformation kernel matrix described above may be used. That is, when inverse RST or inverse LFNST is performed as an inverse transformation process performed by the decoding device, the input coefficient data to which inverse RST is applied is composed of a 1-dimensional vector according to a predetermined arrangement order (diagonal scanning order), and the modified coefficient vector obtained by multiplying the 1-dimensional vector by the corresponding inverse RST matrix from the left side may be arranged in a two-dimensional block according to a predetermined arrangement order.
- the size of the transformation matrix of Equation 4 is 48 x 16
- the column vectors are c 1 to c 16
- forward LFNST receives as input transform coefficients to which the primary transform has been applied.
- transform coefficients belonging to a specific region predefined in the transform block may be received as inputs.
- FIG. 9 is a diagram illustrating a forward LFNST input area according to an embodiment of the present document.
- this input region that is, the region of input transform coefficients input for the forward LFNST may be referred to as Region Of Interest or ROI.
- when an Rx96 matrix is generated from a 96x96 square matrix, it can be generated by sampling R rows from the 96x96 matrix based on the forward LFNST. If the rows constituting the 96x96 matrix are arranged in order of importance from the top, the Rx96 matrix can be constructed by sequentially sampling R rows from the top.
- the ROIs in FIG. 9(a) and FIG. 9(b) are composed of 48 and 96 input samples (input transform coefficients or input data), respectively.
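Sampling the top R rows of a square matrix whose rows are already sorted by importance can be sketched as follows. The function name `sample_rows` and the placeholder matrix contents are assumptions; a real 96x96 kernel would hold trained values.

```python
def sample_rows(square, r):
    """Return the first r rows of a square matrix as an r x N matrix.

    Assumes the rows are already arranged in order of importance from the top.
    """
    if not 0 < r <= len(square):
        raise ValueError("r must be between 1 and the number of rows")
    return [row[:] for row in square[:r]]


# Placeholder 96x96 matrix; entry (i, j) is just i * 96 + j.
full = [[i * 96 + j for j in range(96)] for i in range(96)]
reduced = sample_rows(full, 32)
```

The same routine would apply to selecting R rows from a 64x64 matrix for an 8x8 ROI.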
- the order in which input samples are read from the ROI can be set in advance, but the order can basically be set arbitrarily. More specifically, when the forward LFNST matrix applied to an arbitrary ROI is an RxN matrix (i.e., the ROI consists of N input samples), even if the reading order of the input samples is changed, the output values (R transform coefficients) are the same as before the change, as long as the N column vectors are rearranged according to the changed order.
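The invariance described above follows from the fact that permuting the input entries and the matrix columns in the same way leaves every dot product unchanged. A small sketch with hypothetical values (`T`, `x`, and `order` are made up for illustration):

```python
def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def permute_columns(matrix, order):
    """Reorder the columns of a matrix so that new column k is old column order[k]."""
    return [[row[i] for i in order] for row in matrix]


T = [[1, 2, 3, 4],
     [5, 6, 7, 8]]          # R = 2, N = 4
x = [1, 0, 2, 1]            # input samples in the original reading order
order = [2, 0, 3, 1]        # a different reading order

x_perm = [x[i] for i in order]
T_perm = permute_columns(T, order)

out = matvec(T, x)
out_perm = matvec(T_perm, x_perm)
```

Both products yield the same R output coefficients, which is why the reading order is a free design choice once the kernel columns are arranged to match.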
- FIGS. 10(a) and 11(a) correspond to FIG. 9(a)
- FIGS. 10(b) and 11(b) correspond to FIG. 9(b).
- for modes -14 to -1 and 2 to 33, row priority order may be applied as shown in FIG. 10(a) and FIG. 10(b), and for modes 35 to 80, column priority order may be applied as shown in FIG. 11(a) and FIG. 11(b).
- for modes 0 and 1, which indicate the planar mode and the DC mode, and for mode 34, the order of FIGS. 10(a) and 10(b) may be applied as is, or either the order of FIGS. 10(a) and 10(b) or that of FIGS. 11(a) and 11(b) may be applied for each mode.
- the upper left quadrangular area of the transform block may be set as the ROI. That is, in the MxN transform block, the upper left m x n (m ≤ M, n ≤ N) region can be set as the ROI, and the number of input samples (transform coefficients that have undergone the primary transform) is m x n in terms of the forward LFNST.
- both m and n may be 8, and the dimension of the forward LFNST matrix may be R x 64 (R is less than or equal to 64, and examples of R values are 16, 32, 48, 64, etc.).
- a method of selecting R rows from an mn x mn square matrix (eg, a 64x64 matrix) may be the same as the method of generating an Rx96 matrix from a 96x96 described above.
- a 4x4 subblock may correspond to a coefficient group (CG) for transform blocks to which LFNST can be applied, but this CG is not necessarily a 4x4 subblock.
- the CG may be any predefined p x q sub block other than a 4x4 sub block.
- input samples may be read in a specific order, such as that of FIGS. 10(a) and 10(b) or FIGS. 11(a) and 11(b), and the transform coefficients output by the forward LFNST may be arranged according to the scan order for the corresponding CGs and transform coefficients.
- FIG. 12 is a diagram illustrating a non-square ROI according to an embodiment of the present document. As shown in (a) of FIG. 12, when the ROI is non-square, when input samples are read in the row-major direction and in the column-major direction, the phases of the transform coefficients are not aligned in the two cases.
- for the second mode and the 66th mode, input samples can be read in the order shown in FIG. 12(b). That is, for the second mode, input samples may be read according to the left order of FIG. 12(b), and for the 66th mode, input samples may be read according to the right order of FIG. 12(b). If the input samples are read using the symmetry of the two prediction modes in this way, the same LFNST kernel can be applied to the two ROIs of FIG. 12(b).
- for a transform block in which both the horizontal and vertical lengths are greater than or equal to 4 and the horizontal or vertical length is 4, an LFNST kernel in the form of a 16x16 matrix applicable to the upper left 4x4 region can be applied (this can be named LFNST_4x4).
- LFNST_4x4 and LFNST_8x8 each consist of 4 sets, each set consists of 2 transform kernels, and which set of kernels to apply may be determined by the intra prediction mode. Which of the two kernels is to be applied to the determined set and whether to apply the LFNST may be designated through signaling of the LFNST index. If the LFNST index value is 0, LFNST is not applied, if it is 1, the first kernel is applied, and if it is 2, the second kernel can be applied.
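The LFNST index semantics above (0 means no LFNST, 1 the first kernel, 2 the second kernel of the chosen set) can be sketched as a small lookup. The function name `select_kernel` and the string placeholders standing in for kernel matrices are hypothetical.

```python
def select_kernel(kernel_set, lfnst_index):
    """Pick a kernel from a two-kernel LFNST set, or None when LFNST is off.

    kernel_set is the pair of kernels already chosen by intra prediction mode.
    """
    if lfnst_index == 0:
        return None                      # LFNST is not applied
    if lfnst_index in (1, 2):
        return kernel_set[lfnst_index - 1]
    raise ValueError("LFNST index must be 0, 1, or 2")


kernels = ("first_kernel", "second_kernel")   # placeholders for real matrices
```

How the intra prediction mode maps to one of the four sets is not spelled out in this passage, so the sketch takes the set as an input.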
- LFNST_4x4 and LFNST_8x8 each consist of four LFNST sets
- a group of LFNST sets named LFNST_4x4 or LFNST_8x8 can be represented as an LFNST set list for convenience of description below.
- LFNST_8x8 may indicate an LFNST set list applied to a transform block in which both the horizontal and vertical lengths are greater than or equal to 8 and the horizontal or vertical length is 8.
- additionally, the LFNST set list applied to transform blocks in which both the horizontal and vertical lengths are greater than or equal to 16 can be named LFNST_16x16.
- the matrix dimensions and ROIs that LFNST_4x4, LFNST_8x8, and LFNST_16x16 may have are as follows.
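The block-size rules for the three set lists can be sketched as below. The function name `lfnst_set_list` and the choice to return `None` for sizes not covered by the stated rules are assumptions of this sketch.

```python
def lfnst_set_list(width, height):
    """Return the name of the LFNST set list for a width x height transform block.

    Follows the rules stated in the text: the shorter side decides the list.
    """
    shorter = min(width, height)
    if shorter >= 16:
        return "LFNST_16x16"   # both sides >= 16
    if shorter == 8:
        return "LFNST_8x8"     # both sides >= 8 and one side equal to 8
    if shorter == 4:
        return "LFNST_4x4"     # both sides >= 4 and one side equal to 4
    return None                # not covered by the rules above
```

For example, a 4x32 block falls under LFNST_4x4, while a 16x64 block falls under LFNST_16x16.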
- the transformation matrix is based on when forward transformation is applied.
- LFNST_4x4 can have a 16x16 matrix, and the ROI can be the upper left 4x4 area.
- LFNST_8x8 can have an Rx48 matrix or an Sx64 matrix, and 16, 32, and 48 are possible as R values, and 16, 32, 48, and 64 are possible as S values.
- the ROI for the Rx48 matrix may be (a) of FIG. 9, and the ROI for the Sx64 matrix may be an 8x8 area in the upper left corner.
- LFNST_16x16 can have Rx96 matrix, Sx64 matrix or Tx48 matrix, R value can be 16, 32, 48, 64, 80, 96, S value can be 16, 32, 48, 64 and T value 16, 32, and 48 are possible.
- the ROI for the Rx96 matrix may be (b) of FIG. 9, the ROI for the Sx64 matrix may be an 8x8 area in the upper left corner, and the ROI for the Tx48 matrix may be (a) of FIG.
- As an architecture for LFNST_4x4, LFNST_8x8, and LFNST_16x16, any combination of the matrix dimensions and ROIs suggested in Nos. 1, 2, and 3 above is possible.
- the ROI of the upper left 4x4 area is applied to a 16x16 matrix
- the ROI of the upper left 8x8 area is applied to the 32x64 matrix
- the ROI as shown in (b) in FIG. 9 is applied to the 32x96 matrix
- N output samples (output transform coefficients) are generated.
- the NxR matrix becomes a transposed matrix of the RxN matrix in the forward LFNST, and N output samples may be arranged in the ROIs of FIGS. 9 to 12.
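The relationship between the forward RxN kernel and its NxR transpose can be sketched with plain integer matrix products. This is a minimal illustration only: the type and function names are hypothetical, and the fixed-point scaling, rounding, and clipping that a real codec would apply are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel type: kernel[r][n], with R rows and N columns.
typedef std::vector<std::vector<int> > LfnstMatrixSketch;

// Forward LFNST: N input samples produce R output coefficients.
std::vector<long long> forwardLfnstSketch(const LfnstMatrixSketch& kernel,
                                          const std::vector<int>& in) {
    std::vector<long long> out(kernel.size(), 0);
    for (std::size_t r = 0; r < kernel.size(); ++r) {
        assert(kernel[r].size() == in.size());
        for (std::size_t n = 0; n < in.size(); ++n) {
            out[r] += static_cast<long long>(kernel[r][n]) * in[n];
        }
    }
    return out;
}

// Inverse LFNST: the transposed (NxR) kernel maps R coefficients to N samples.
// The transpose is applied implicitly by swapping the roles of rows and columns.
std::vector<long long> inverseLfnstSketch(const LfnstMatrixSketch& kernel,
                                          const std::vector<int>& coeffs) {
    assert(kernel.size() == coeffs.size());
    const std::size_t n = kernel.empty() ? 0 : kernel[0].size();
    std::vector<long long> out(n, 0);
    for (std::size_t r = 0; r < kernel.size(); ++r) {
        assert(kernel[r].size() == n);
        for (std::size_t c = 0; c < n; ++c) {
            out[c] += static_cast<long long>(kernel[r][c]) * coeffs[r];
        }
    }
    return out;
}
```

Storing only the forward kernel and transposing on the fly means one table can serve both the encoder and the decoder in this sketch.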
- the order shown in FIG. 10 or 11 may be followed according to the intra prediction mode value. For example, when symmetry between intra prediction modes is utilized, the row priority order of FIG. 10 can be applied to intra prediction modes -14 to -1 and 2 to 33, and the column priority order of FIG. 11 can be applied to modes 35 to 80.
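The mode-dependent choice of arrangement order in the example above can be written as a small lookup. The enum and function names are hypothetical. Modes 0, 1, and 34 are not covered by the example in the text, so the sketch reports them as unspecified rather than guessing.

```cpp
#include <cassert>

// Hypothetical labels for the two arrangement orders of FIGS. 10 and 11.
enum class LfnstOutputOrder { RowFirst, ColumnFirst, Unspecified };

// Maps an intra prediction mode (including wide-angle modes, -14..80)
// to the arrangement order given in the example.
LfnstOutputOrder outputOrderForMode(int intraMode) {
    assert(intraMode >= -14 && intraMode <= 80);
    if ((intraMode >= -14 && intraMode <= -1) ||
        (intraMode >= 2 && intraMode <= 33)) {
        return LfnstOutputOrder::RowFirst;      // row priority order (FIG. 10)
    }
    if (intraMode >= 35 && intraMode <= 80) {
        return LfnstOutputOrder::ColumnFirst;   // column priority order (FIG. 11)
    }
    return LfnstOutputOrder::Unspecified;       // modes 0, 1, 34: not stated in the example
}
```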
- the output data area may be configured as follows based on forward LFNST.
- When the number of output transform coefficients based on the forward LFNST is R and the number of input samples is N, R may be set to be less than or equal to N.
- When a transform coefficient is parsed in a region other than the region in which LFNST transform coefficients may exist, signaling of the LFNST index may be omitted and it may be inferred that the LFNST is not applied.
- When the area in which LFNST transform coefficients may exist is configured in units of 4x4 subblocks and residual coding is performed in units of the corresponding 4x4 subblocks, checking whether transform coefficients exist outside that area can be done more simply.
- The CG may have a shape other than a 4x4 subblock, in which case (e.g., an mxn block, m ≠ n) the R value can be set to a multiple of mxn.
- CGs in which forward LFNST output transform coefficients may exist may be composed of first k CGs arranged according to the scanning order of CGs.
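The check described above, whether any coefficient lies outside the first k coefficient groups, can be sketched as follows. The function name is hypothetical, and the sketch assumes the caller has already listed the coefficients of the block in coefficient scanning order with 4x4 CGs, so the LFNST region is simply the first R = 16 * k entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when every coefficient outside the first numLfnstCgs 4x4 CGs
// is zero, i.e. when an LFNST index may still be signaled for this block.
// Assumption: coeffsInScanOrder is already ordered by the coefficient scan.
bool lfnstIndexMayBeSignaled(const std::vector<int>& coeffsInScanOrder,
                             std::size_t numLfnstCgs) {
    assert(numLfnstCgs > 0);
    const std::size_t kCgSize = 16;                       // coefficients per 4x4 CG
    const std::size_t regionEnd = numLfnstCgs * kCgSize;  // R = 16 * k
    for (std::size_t i = regionEnd; i < coeffsInScanOrder.size(); ++i) {
        if (coeffsInScanOrder[i] != 0) {
            return false;  // coefficient outside the LFNST region: infer no LFNST
        }
    }
    return true;
}
```

Because R is a multiple of the CG size, the loop boundary always falls on a CG boundary, which is the simplification the text points to.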
- the output coefficients of the forward LFNST can be arranged according to the transform coefficient scanning order.
- Row vectors of the forward LFNST kernel are arranged from top to bottom in order of importance, so if the transform coefficients constituting the output vector are arranged in order from top to bottom (assuming here that the output vector is a column vector), the coefficients are arranged sequentially, starting with the more significant coefficients.
- The scanning order of transform coefficients scans from the most important coefficient: scanning starts from the DC position at the upper left, transform coefficients of lesser importance are placed farther from the DC position, and those coefficients mainly have a value of 0 or close to 0.
- the residual coding part is designed to increase coding efficiency when transform coefficients having 0 or values close to 0 frequently appear as the distance from the DC position increases.
- the output transform coefficients of the forward LFNST do not necessarily have to be arranged according to one fixed scan order. That is, according to another embodiment, the output transform coefficients of the LFNST may be sorted according to an order other than the scan order.
- An LFNST-specific scan order, rather than the predetermined scan order, may be applied.
- A different scan order may be applied to the forward LFNST output transform coefficients for each intra prediction mode (or group of intra prediction modes).
- an LFNST set list, an LFNST set, and an LFNST kernel may be applied based on the size of a transform block.
- the LFNST set list, LFNST set, and LFNST kernel can be further subdivided and applied according to the size of the transform block.
- the LFNST kernel configuration per set may indicate how many candidate kernels each LFNST set consists of.
- a different LFNST set list may be applied to every possible transform block shape (i.e., every possible MxN block), and the corresponding set list may be expressed as, for example, LFNST_MxN.
- a corresponding LFNST set list may be applied to each group by grouping transform block shapes.
- two types of LFNST set lists, namely LFNST_4x4 and LFNST_8x8, are applied to two groups divided according to the shape of the transform block. Examples of other groupings are as follows.
- a separate group may be set for cases where both the horizontal and vertical lengths of the transform block are equal to or greater than 16, and an LFNST set list applied to the group may be allocated.
- the LFNST set list may be named LFNST_16x16.
- When combined with the grouping of the VVC standard, the transform blocks can be divided into three groups: (Group 1) 4x4 and 4xN/Nx4 (N ≥ 8) transform blocks, (Group 2) 8x8 and 8xN/Nx8 (N ≥ 16) transform blocks, and (Group 3) transform blocks whose width and height are both greater than or equal to 16. Each group and/or the LFNST set list applied to the group can be named LFNST_4x4, LFNST_8x8, and LFNST_16x16, respectively.
- Group 1 can be further divided into 4x4 transform blocks and 4xN/Nx4 (N ≥ 8) transform blocks, classified as Group 1A and Group 1B.
- Group 2 can likewise be divided into 8x8 transform blocks and 8xN/Nx8 (N ≥ 16) transform blocks, classified as Group 2A and Group 2B.
- Group 3 can be divided into Group 3A and Group 3B through a specific criterion. For example, 16x16 and 16xN/Nx16 (N ≥ 16) transform blocks can be set to Group 3A, and the remaining cases can be classified as Group 3B.
- Group 1, Group 2, and Group 3 may or may not be divided into detailed groups as described above.
- the entire group may be configured as Group 1A, Group 1B, Group 2, Group 3A, and Group 3B.
- When Group 1, Group 2, and Group 3 are all divided, the groups can be classified as Group 1A, Group 1B, Group 2A, Group 2B, Group 3A, and Group 3B.
- Group 1 (LFNST_4x4) consists of 18 LFNST sets, and each LFNST set consists of 3 kernels, and the dimension of the corresponding kernel matrix may be 16x16.
- Group 2 (LFNST_8x8) consists of 18 LFNST sets, and each LFNST set consists of 3 kernels, and the dimension of the corresponding kernel matrix may be 16x48.
- Group 3 (LFNST_16x16) consists of 18 LFNST sets, and each LFNST set consists of 3 kernels, and the dimension of the corresponding kernel matrix may be 32x96.
- All LFNST set lists in the above configuration can be configured with a different number of sets than 18.
- the LFNST set list may consist of 16, 15, 10, 6, or 4 transform sets.
- the dimensions of the kernel matrices constituting LFNST_8x8 may be set to 32x48 to 48x48.
- Nos. 2, 3, 4, and 5 above can be freely combined.
- For example, by applying No. 3 the number of LFNST sets may be set to 15, and by applying No. 4 the dimensions of the kernel matrices constituting LFNST_8x8 may be set to 32x48.
- whether to apply LFNST may be determined based on a color component, and when it is determined that LFNST is applied, an LFNST set list, an LFNST set, and an LFNST kernel may be applied based on the color component.
- when the tree type of a coding unit is a single tree, LFNST is applied only to the luma component. In the case of a separate tree, that is, a dual tree, LFNST is applied to the luma component for the luma tree (dual tree luma) and to the chroma component for the chroma tree (dual tree chroma).
- LFNST can be applied only to the luma component. If LFNST is applied only to the luma component, the LFNST index indicates only the LFNST kernel applied to the luma component because LFNST is applied only to the luma component and not to the chroma component in the single tree, as in the VVC standard. When the LFNST is applied only to the luma component, the LFNST index is not signaled since the LFNST is not applied to the split tree for the chroma component (if the LFNST index is not signaled, it can be assumed that the LFNST is not applied by default).
- LFNST may be applied to both the luma component and the chroma component when a single tree is used.
- it can be implemented in two ways: 1) configuring image information so that one LFNST index is signaled and the corresponding LFNST kernel is selected for both the luma and chroma components, or 2) configuring image information so that separate LFNST indices are signaled for the luma and chroma components, so that the LFNST kernel most suitable for each component can be selected.
- When the LFNST set list, LFNST set, and LFNST kernel for the luma and chroma components are set differently and one LFNST index is signaled to select the LFNST kernels for both components, the LFNST kernels for the luma component and the chroma component designated by the one signaled LFNST index may be different because they are selected from different LFNST set lists and LFNST sets.
- the same LFNST set list is applied to luma and chroma components.
- different LFNST set lists, different LFNST sets, and different LFNST kernels may be applied to R, G, and B components, respectively.
- the same LFNST set list, LFNST set, and LFNST kernel can be applied to the three components.
- LFNST set lists, LFNST sets, and LFNST kernels may be applied according to a range of quantization parameter (QP) values.
- the LFNST set list applied to the low QP range and the LFNST set list applied to the high QP range may be separately used.
- the low QP range may represent a case below a predefined threshold QP value
- the high QP range may represent a case exceeding a predefined QP value.
- As the threshold QP value, 27, 28, 29, 30, 31, etc. may be used.
- all possible QP values can be partitioned into N sets.
- the N sets may not include overlapping values. That is, when two different sets are selected among N sets, the intersection between the two sets may be an empty set.
- a different LFNST set list, LFNST set, and LFNST kernel may be applied to each of the N sets.
- When there are M possible LFNST set lists, a mapping relationship between the N sets and the M LFNST set lists may be formed. That is, the LFNST set list mapped to each of the N sets may be any one of the M LFNST set lists. Naturally, the LFNST set lists mapped to the N sets may overlap each other.
- When LFNST_4x4 and LFNST_8x8 exist as LFNST set lists, or LFNST_4x4, LFNST_8x8, and LFNST_16x16 exist, and M LFNST set lists exist as described above, M LFNST set lists may exist for each of LFNST_4x4, LFNST_8x8, and LFNST_16x16.
- The M LFNST set lists for LFNST_4x4 may be denoted LFNST_4x4_i (i = 1, 2, ..., M).
- The M LFNST set lists for LFNST_8x8 may be denoted LFNST_8x8_i (i = 1, 2, ..., M).
- For the low QP range, LFNST_4x4_1, LFNST_8x8_1, or LFNST_16x16_1 may be mapped according to the transform block size, and for the high QP range, LFNST_4x4_2, LFNST_8x8_2, or LFNST_16x16_2 may be mapped according to the transform block size.
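The two-range QP mapping in this example can be sketched as a single comparison. The function name is hypothetical, and the sketch assumes that a QP equal to the threshold belongs to the low range, since the text only defines the high range as exceeding the threshold.

```cpp
#include <cassert>

// Returns the LFNST set list number i (1 or 2) so that the caller can pick
// LFNST_4x4_i, LFNST_8x8_i, or LFNST_16x16_i by transform block size.
// Assumption: qp == thresholdQp counts as the low QP range.
int lfnstSetListNumberForQp(int qp, int thresholdQp) {
    assert(qp >= 0 && thresholdQp >= 0);
    return (qp > thresholdQp) ? 2   // high QP range
                              : 1;  // low QP range
}
```

With N disjoint QP sets instead of two ranges, the same idea generalizes to a lookup table from QP value to set list number.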
- When the tuple of LFNST_4x4, LFNST_8x8, and LFNST_16x16 is expressed as (LFNST_4x4, LFNST_8x8, LFNST_16x16), or the pair of LFNST_4x4 and LFNST_8x8 is expressed as (LFNST_4x4, LFNST_8x8), the tuple or pair of the i-th LFNST set list can be expressed as (LFNST_4x4_i, LFNST_8x8_i, LFNST_16x16_i) or (LFNST_4x4_i, LFNST_8x8_i).
- mapping to the i th LFNST set list means that they are mapped to (LFNST_4x4_i, LFNST_8x8_i, LFNST_16x16_i) or (LFNST_4x4_i, LFNST_8x8_i).
- mapping to one of the M LFNST set lists according to QP values may mean that an LFNST set list mapped to each QP value exists and that the corresponding LFNST set list is applied. For example, if the corresponding LFNST set list is the jth LFNST set list, (LFNST_4x4_j, LFNST_8x8_j, LFNST_16x16_j) may be applied.
- When there are M applicable LFNST set lists as described above, the LFNST set list may be specified through a high-level syntax element (hereinafter also referred to as an HLS element), rather than by applying the LFNST set list associated with a specific condition (a range of QP values) based on whether that condition is satisfied.
- the corresponding HLS element may be located in the Sequence Parameter Set (SPS), Picture Parameter Set (PPS), Picture Header (PH), Slice Header (SH), or the like, which are syntax tables that collect high-level syntax elements. In this regard, a corresponding HLS element may have a value from 0 to M-1 to designate one of M possible LFNST set lists.
- the corresponding HLS element may be related to LFNST set list index information.
- the LFNST set list index information may be located in SPS, PPS, PH, SH, and the like.
- the LFNST set list index information may have a value from 0 to M-1 to designate one of M possible LFNST set lists.
- When LFNST_4x4, LFNST_8x8, and LFNST_16x16, or LFNST_4x4 and LFNST_8x8, exist and the i-th LFNST set list is designated by the LFNST set list index information, (LFNST_4x4_i, LFNST_8x8_i, LFNST_16x16_i) or (LFNST_4x4_i, LFNST_8x8_i) may be applied.
- the value of the LFNST set list index information can be inferred as a specific value, and the LFNST set list designated by the inferred value will be the default LFNST set list.
- When the default LFNST set list is the k-th LFNST set list, the default LFNST set list may be (LFNST_4x4_k, LFNST_8x8_k, LFNST_16x16_k) or (LFNST_4x4_k, LFNST_8x8_k).
- a different LFNST set list may be applied for each transform block size.
- Instead of using only LFNST_4x4 and LFNST_8x8 as in the VVC standard, different LFNST set lists can be applied to each of the cases 4x4, 4xN/Nx4 (N ≥ 8), 8x8, 8xN/Nx8 (N ≥ 16), 16x16, 16xN/Nx16 (N ≥ 32), and 32x32 or more (both horizontal and vertical lengths are 32 or more).
- LFNST set list for each block size set can be expressed as LFNST_4x4, LFNST_4xN_Nx4, LFNST_8x8, LFNST_8xN_Nx8, LFNST_16x16, LFNST_16xN_Nx16, LFNST_32x32.
- LFNST_4x4 and LFNST_8x8, each indicating an LFNST set list, may each consist of four LFNST sets, and the four LFNST sets may be distinguished by index values of 0, 1, 2, and 3. That is, the LFNST sets can be classified as the 0th LFNST set, the 1st LFNST set, the 2nd LFNST set, and the 3rd LFNST set, and the LFNST set index for each LFNST set can have a value of 0, 1, 2, or 3.
- In Tables 5 to 10 above, Intra pred. mode may indicate the intra prediction mode for the current block, and if the value of Intra pred. mode is one of -14 to -1 and 67 to 80, it may indicate that the intra prediction mode for the current block is the WAIP mode.
- LFNST_4x4 and LFNST_8x8 that is, a LFNST matrix
- LFNST matrix may be applied when the ROI for LFNST_8x8 is an 8x8 region at the top left of the block to be transformed.
- Tables 11 to 20 below show examples of kernel coefficient data for LFNST_4x4 applicable to 4xN or Nx4 blocks (N ≥ 4).
- In the g_lfnst4x4[ 36 ][ 3 ][ 16 ][ 16 ] array of Tables 11 to 20, [ 36 ] indicates that the number of LFNST sets is 36, [ 3 ] indicates that the number of LFNST kernel candidates per LFNST set is 3, and [ 16 ][ 16 ] represents a 16x16 matrix based on the forward LFNST (the corresponding array definitions in Tables 11 to 20 are described according to C/C++ grammar).
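The array shape described above can be illustrated with a zero-filled placeholder. The names `g_lfnst4x4Sketch` and `lfnst4x4KernelAt` are hypothetical, the `int` element type is an assumption, and the actual coefficients of Tables 11 to 20 are not reproduced here.

```cpp
#include <cassert>

// Shape taken from the text: 36 sets, 3 kernel candidates per set,
// and a 16x16 forward matrix per kernel.
const int kNumSets4x4 = 36;
const int kNumKernels4x4 = 3;
const int kKernelDim4x4 = 16;

typedef int Lfnst4x4Kernel[kKernelDim4x4][kKernelDim4x4];

// Placeholder storage only: zero-initialised, not the patent's coefficients.
static const int g_lfnst4x4Sketch[kNumSets4x4][kNumKernels4x4]
                                 [kKernelDim4x4][kKernelDim4x4] = {};

// Returns the 16x16 kernel for a set index (0..35) and a kernel candidate
// index (0..2). Each row of the result is one transform basis vector.
const Lfnst4x4Kernel& lfnst4x4KernelAt(int setIdx, int kernelIdx) {
    assert(setIdx >= 0 && setIdx < kNumSets4x4);
    assert(kernelIdx >= 0 && kernelIdx < kNumKernels4x4);
    return g_lfnst4x4Sketch[setIdx][kernelIdx];
}
```

With 35 sets instead of 36, only `kNumSets4x4` would change in this sketch.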
- 32 may represent the horizontal (x-axis) length of the matrix
- 64 may represent the vertical (y-axis) length of the matrix.
- When Tables 11 to 20 are used for LFNST in which the number of LFNST sets is 35, the array can be represented as g_lfnst4x4[ 35 ][ 3 ][ 16 ][ 16 ].
- An ROI to which the LFNST kernels of Tables 11 to 20 can be applied may be an upper left 4x4 region.
- Each LFNST kernel consists of 16 transform basis vectors (row vectors), and one vector has a length of 16 ([ 16 ][ 16 ]).
- the row-direction basis vector of the LFNST kernel (16X16 dimensional matrix) of Tables 11 to 20 and the transform coefficient may be multiplied during matrix operation.
- the transform coefficient may be multiplied by the row-direction basis vector of the kernel (16X16 dimensional matrix) in which the LFNST kernel below is transposed.
- Tables 11-20 may represent some of the 36 LFNST sets. As described above, the LFNST set may be selected according to the intra prediction mode and may be mapped according to Table 5 or Table 6. According to Table 5, 35 LFNST sets are used, and according to Table 6, 36 LFNST sets are used. Tables 11 to 20 may be kernels corresponding to specific set numbers among 35 or 36 sets.
- An ROI to which the LFNST kernels in Tables 21 to 50 can be applied may be an 8x8 upper left region.
- Each LFNST kernel consists of 32 transform basis vectors (row vectors), and one vector has a length of 64 ([ 32 ][ 64 ]).
- the transform coefficient may be multiplied by the row-direction basis vector of the LFNST kernel (32X64 dimensional matrix) of Tables 21 to 50 during matrix operation.
- the transform coefficient may be multiplied by the row-direction basis vector of the kernel (64X32-dimensional matrix) in which the LFNST kernel below is transposed.
- Tables 21 to 26 show three LFNST kernels applied when the LFNST set index of Table 5 or Table 6 is 0 (when the intra prediction mode is planar mode).
- Tables 27 to 32 show three LFNST kernels applied when the LFNST set index of Table 5 or Table 6 is 1 (when the intra prediction mode is DC mode).
- Tables 33 to 38 show three LFNST kernels applied when the LFNST set index of Table 5 or Table 6 is 2.
- Tables 39 to 44 apply when the LFNST set index of Table 5 or Table 6 is 18 (when the intra prediction mode indicates the horizontal or vertical direction).
- Tables 45 to 50 may show three LFNST kernels applied when the LFNST set index of Table 5 or Table 6 is 34 (when the intra prediction mode indicates the upper left direction).
- One LFNST kernel is presented across two tables.
- Table 21 and Table 22 may be the first LFNST kernel applied when the LFNST set index is 0.
- Table 23 and Table 24 may be the second LFNST kernel applied when the LFNST set index is 0.
- Table 25 and Table 26 may be a third LFNST kernel applied when the LFNST set index is 0.
- Each step disclosed in FIG. 14 is based on some of the contents described above in FIGS. 4 to 13. Accordingly, detailed descriptions of overlapping details with those described above in FIGS. 3 to 13 will be omitted or simplified.
- the decoding device may obtain information about quantized transform coefficients from residual information and may receive various information for image decoding. More specifically, the decoding apparatus 300 may decode information about quantized transform coefficients of the target block from the bitstream, and may derive the quantized transform coefficients based on the information about the quantized transform coefficients of the target block.
- information on the LFNST applied to the target block may be received, and the information on the LFNST may be included in a Sequence Parameter Set (SPS) or a slice header.
- This information may include at least one of information on whether LFNST is applied, information on the minimum transform size to which LFNST is applied, information on the maximum transform size to which LFNST is applied, and information about a transform index indicating one of the transform kernels included in the transform set.
- the decoding device may further receive information related to the intra prediction mode from the bitstream.
- the decoding apparatus 300 may derive transform coefficients for the current block based on the residual information (S1420).
- the decoding device may derive transform coefficients by performing inverse quantization on the quantized transform coefficients of the target block.
- the LFNST set index may be derived as one predetermined value.
- the LFNST set index may be derived as one of N predetermined values.
- a transform kernel may be derived as a 64x32 matrix, and based on both the horizontal and vertical lengths of the current block being 8, a 64x16 matrix sampled from the 64x32 matrix can be applied to the inverse secondary transform of the current block.
- the step of deriving the modified transform coefficients includes deriving an input array by arranging the transform coefficients according to the forward diagonal scanning order, deriving a number of modified transform coefficients greater than the number of input transform coefficients through a matrix operation of the input array and the transform kernel, and arranging the modified transform coefficients in an output region.
- the input array is arranged in units of 4x4 subblocks that may be arranged in forward diagonal scanning order from the DC position of the target block, and may be arranged according to the forward diagonal scanning order within the 4x4 subblock. Therefore, R, the number of transform coefficients constituting the input array, may be set to a multiple of 16, which is the number of transform coefficients in the 4x4 sub-block.
- the size of the inverse secondary transform may be set to a third value based on the fact that both the horizontal and vertical lengths of the target block are greater than or equal to 16.
- the third value may be set to 4.
- the LFNST is applied to the 16x16 region at the upper left of the target block, which may correspond to the aforementioned LFNST_16x16.
- At least one of the number of transform sets applied to the target block, the number of transform kernels constituting the transform set, and the dimension of the transform kernel may be derived based on grouping according to the size of the inverse secondary transform, that is, the size to which LFNST is applied.
- the number of transform sets, the number of transform kernels constituting the transform set, and the dimensions of the transform kernels may be set and configured in various ways according to the size of the inverse secondary transform or the size of the target block.
- the encoding apparatus 200 may derive transform coefficients for the current block based on a primary transform for residual samples (S1530).
- Intra pred. mode may indicate the intra prediction mode for the current block, and if the value of Intra pred. mode is one of -14 to -1 and 67 to 80, it may indicate that the intra prediction mode for the current block is the WAIP mode.
- a transform kernel may be derived as a 32x64 matrix, and based on both the horizontal and vertical lengths of the target block being 8, a 16x64 matrix sampled from the 32x64 matrix may be applied to the secondary transform of the target block.
- An input region, which means a region of input transform coefficients subject to the secondary transform in the encoding device, may correspond to the output region described in the decoding method and the ROI described with reference to the above-described drawings. Therefore, redundant description of the ROI is omitted.
- the modified transform coefficients may be arranged in units of 4x4 subblocks that may be arranged in a forward diagonal scanning order from the DC position of the target block, and may be arranged according to a forward diagonal scanning order within the 4x4 subblock. Accordingly, R, the number of modified transform coefficients, may be set to a multiple of 16, which is the number of transform coefficients in the 4x4 subblock.
- the size of the secondary transformation may be set to a first value.
- the first value may be set to 2.
- LFNST is applied to the upper left 4x4 region of the target block, which may correspond to the aforementioned LFNST_4x4.
- the size of the secondary transformation may be set to a second value based on the fact that both the horizontal and vertical lengths of the target block are greater than or equal to 8 and the horizontal or vertical length is 8.
- the second value may be set to 3.
- LFNST is applied to the upper left 8x8 region of the target block, which may correspond to the aforementioned LFNST_8x8.
- At least one of the number of transform sets applied to the target block, the number of transform kernels constituting the transform set, and the dimension of the transform kernel may be derived based on grouping according to the size of the secondary transform, that is, the size to which LFNST is applied.
- the number of transformation sets, the number of transformation kernels constituting the transformation set, and the dimensions of the transformation kernels may be set and configured in various ways corresponding to the size of the secondary transformation or the size of the target block.
- the dimension of the transform kernel can be set to 16x16.
- the dimension of the transform kernel can be set to Rx48 or Sx64, where R can be set to any one of 16, 32, and 48, and S can be set to any one of 16, 32, 48, and 64.
- the dimension of the transform kernel can be set to any one of Rx96, Sx64, or Tx48, where R can be set to any one of 16, 32, 48, 64, 80, and 96, S to any one of 16, 32, 48, and 64, and T to any one of 16, 32, and 48.
- the transform kernel applied for LFNST_4x4 may be a 16x16 matrix.
- the encoding apparatus 200 may encode image information including residual information about a target block (S1560).
- the encoding device 200 may encode video/image information including intra prediction mode related information, residual information, and LFNST index information.
- the encoding device 200 may encode video/image information including information related to the intra prediction mode, the residual information, and the LFNST index information.
- the image information may further include LFNST set list index information.
- the LFNST set list index information may be information for designating one of a plurality of LFNST set lists. For example, when M LFNST set lists exist, the LFNST set list index information may have a value from 0 to M-1.
- the encoding device 200 may determine the LFNST set list for the current block related to the LFNST matrix applied to derive the modified transform coefficients, and may generate and encode LFNST set list index information related to the LFNST set list.
- the LFNST set list index information may be generated so that the LFNST set can be derived based on the LFNST set index in the LFNST set list. That is, the LFNST set list index information may be related to the LFNST set list for the current block. That is, a plurality of LFNST set lists may exist, one LFNST set list may include a plurality of LFNST sets, and one LFNST set may include a plurality of LFNST matrices.
- the LFNST matrix applied to the current block may be determined based on the LFNST set list index information, the LFNST set index, and the LFNST index information.
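The three-level selection described above (set list, then set, then matrix) can be sketched with nested containers. All type and function names are hypothetical. The sketch assumes the convention stated earlier in the text that an LFNST index of 0 means LFNST is not applied and an index of k >= 1 selects the k-th matrix of the set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > LfnstKernelSketch;   // one matrix
typedef std::vector<LfnstKernelSketch> LfnstKernelSetSketch; // one LFNST set
typedef std::vector<LfnstKernelSetSketch> LfnstSetListSketch; // one set list

// Returns the selected matrix, or a null pointer when lfnstIdx is 0.
const LfnstKernelSketch* selectLfnstMatrix(
        const std::vector<LfnstSetListSketch>& setLists,
        std::size_t setListIdx,   // from LFNST set list index information (0..M-1)
        std::size_t setIdx,       // LFNST set index, derived from the intra mode
        std::size_t lfnstIdx) {   // signaled LFNST index
    if (lfnstIdx == 0) {
        return 0;  // LFNST is not applied
    }
    assert(setListIdx < setLists.size());
    assert(setIdx < setLists[setListIdx].size());
    assert(lfnstIdx - 1 < setLists[setListIdx][setIdx].size());
    return &setLists[setListIdx][setIdx][lfnstIdx - 1];
}
```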
- the LFNST set list index may be generated/encoded with different values according to a range of quantization parameter values.
- the range of the quantization parameter values may be divided based on a predetermined threshold value. For example, 27, 28, 29, 30, and 31 may be used as threshold values.
- the LFNST set list index may be generated/encoded with different values according to ranges of transform block sizes.
- different LFNST set lists can be configured to apply to each of the cases where the transform block size is 4x4, 4xN/Nx4 (N ≥ 8), 8x8, 8xN/Nx8 (N ≥ 16), 16x16, 16xN/Nx16 (N ≥ 32), or 32x32 or more (both horizontal and vertical lengths are 32 or more).
- the LFNST set list index may be generated/encoded with a different value depending on whether a specific coding tool is applied to the current block.
- the specific coding tool may be at least one of PDPC, MIP mode, ISP mode, ACT or MTS.
- the LFNST set list index may be generated/encoded with a different value according to the color format of the current block.
- a decoding device and an encoding device to which the embodiment(s) of this document are applied may be included in multimedia broadcasting transceiving devices, mobile communication terminals, home cinema video devices, digital cinema video devices, surveillance cameras, video conversation devices, video communication devices, and the like.
- over-the-top (OTT) video devices may include game consoles, Blu-ray players, Internet-connected TVs, home theater systems, smart phones, tablet PCs, digital video recorders (DVRs), and the like.
- the processing method to which the embodiment(s) of this document is applied may be produced in the form of a program executed by a computer and stored in a computer-readable recording medium.
- Multimedia data having a data structure according to the embodiment(s) of this document may also be stored in a computer-readable recording medium.
- the computer-readable recording medium includes all types of storage devices and distributed storage devices in which computer-readable data is stored.
- the computer-readable recording medium may include, for example, a Blu-ray Disc (BD), Universal Serial Bus (USB), ROM, PROM, EPROM, EEPROM, RAM, CD-ROM, magnetic tape, floppy disk, and an optical data storage device.
- the computer-readable recording medium includes media implemented in the form of a carrier wave (eg, transmission through the Internet).
- the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.
- embodiment(s) of this document may be implemented as a computer program product using program codes, and the program code may be executed on a computer by the embodiment(s) of this document.
- the program code may be stored on a carrier readable by a computer.
- a content streaming system to which embodiments of this document are applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- the bitstream may be generated by an encoding method or a bitstream generation method to which the embodiments of this document are applied, and the streaming server may temporarily store the bitstream in a process of transmitting or receiving the bitstream.
- the streaming server transmits multimedia data to a user device based on a user request through a web server, and the web server serves as a medium informing a user of what kind of service is available.
- the web server transmits the request to the streaming server, and the streaming server transmits multimedia data to the user.
- the content streaming system may include a separate control server, and in this case, the control server serves to control commands/responses between devices in the content streaming system.
- the streaming server may receive content from a media storage and/or encoding server. For example, when content is received from the encoding server, the content can be received in real time. In this case, in order to provide smooth streaming service, the streaming server may store the bitstream for a certain period of time.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Claims (12)
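- An image decoding method performed by a decoding apparatus, the method comprising: obtaining image information from a bitstream; deriving transform coefficients for a current block from the residual information; deriving modified transform coefficients by applying an LFNST (Low-Frequency Non-Separable Transform) to the transform coefficients; and generating residual samples for the current block based on the modified transform coefficients, wherein an LFNST set for applying the LFNST is derived based on an intra prediction mode for the current block, and an LFNST matrix is derived based on the size of the current block and the LFNST set, wherein, when the width or height of the current block is 4 and both the width and the height are 4 or greater, the LFNST matrix is derived as a 16x16 matrix, and wherein, when the width or height of the current block is 8 and both the width and the height are 8 or greater, the LFNST matrix is derived as a 64x32 matrix.
- An image encoding method performed by an image encoding apparatus, the method comprising: generating prediction samples for a current block based on an intra prediction mode for the current block; deriving residual samples for the current block based on the prediction samples; deriving transform coefficients for the current block based on a primary transform of the residual samples; deriving modified transform coefficients by applying an LFNST (Low-Frequency Non-Separable Transform) to the transform coefficients; and encoding image information including residual information derived based on the modified transform coefficients, wherein an LFNST set for applying the LFNST is derived based on the intra prediction mode, and an LFNST matrix is derived based on the size of the current block and the LFNST set, wherein, when the width or height of the current block is 4 and both the width and the height are 4 or greater, the LFNST matrix is derived as a 16x16 matrix, and wherein, when the width or height of the current block is 8 and both the width and the height are 8 or greater, the LFNST matrix is derived as a 32x64 matrix.
- A computer-readable digital storage medium storing a bitstream generated according to a predetermined method, the method comprising: generating prediction samples for a current block based on an intra prediction mode for the current block; deriving residual samples for the current block based on the prediction samples; deriving transform coefficients for the current block based on a primary transform of the residual samples; deriving modified transform coefficients by applying an LFNST (Low-Frequency Non-Separable Transform) to the transform coefficients; and encoding image information including residual information derived based on the modified transform coefficients in order to generate the bitstream, wherein an LFNST set for applying the LFNST is derived based on the intra prediction mode, and an LFNST matrix is derived based on the size of the current block and the LFNST set, wherein, when the width or height of the current block is 4 and both the width and the height are 4 or greater, the LFNST matrix is derived as a 16x16 matrix, and wherein, when the width or height of the current block is 8 and both the width and the height are 8 or greater, the LFNST matrix is derived as a 32x64 matrix.
- A method of transmitting data for an image, the method comprising: obtaining a bitstream for the image, wherein the bitstream is generated by performing the steps of generating prediction samples for a current block based on an intra prediction mode for the current block; deriving residual samples for the current block based on the prediction samples; deriving transform coefficients for the current block based on a primary transform of the residual samples; deriving modified transform coefficients by applying an LFNST (Low-Frequency Non-Separable Transform) to the transform coefficients; and encoding image information including residual information derived based on the modified transform coefficients; and transmitting the data including the bitstream, wherein an LFNST set for applying the LFNST is derived based on the intra prediction mode, and an LFNST matrix is derived based on the size of the current block and the LFNST set, wherein, when the width or height of the current block is 4 and both the width and the height are 4 or greater, the LFNST matrix is derived as a 16x16 matrix, and wherein, when the width or height of the current block is 8 and both the width and the height are 8 or greater, the LFNST matrix is derived as a 32x64 matrix.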
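The size conditions recited in the claims above can be illustrated with a minimal sketch. The function name `lfnst_matrix_shape` and the `decoder` flag are hypothetical and introduced only for illustration. The sketch assumes that "width or height is N and both are N or greater" is equivalent to the smaller dimension being exactly N, and it returns `None` for block sizes the claims listed here do not address. It covers only the matrix dimensions, not the matrix entries or the derivation of the LFNST set from the intra prediction mode.

```python
from typing import Optional, Tuple


def lfnst_matrix_shape(width: int, height: int, decoder: bool) -> Optional[Tuple[int, int]]:
    """Return the (rows, cols) of the LFNST matrix for a block, per the claim conditions.

    Hypothetical helper: returns None when neither claimed condition applies.
    """
    smaller = min(width, height)
    if smaller == 4:
        # Width or height is 4 and both are at least 4: 16x16 on both sides.
        return (16, 16)
    if smaller == 8:
        # Width or height is 8 and both are at least 8:
        # 64x32 in the decoding claim, 32x64 in the encoding claims.
        return (64, 32) if decoder else (32, 64)
    # Sizes not covered by the claims reproduced in this section.
    return None


if __name__ == "__main__":
    for w, h in [(4, 4), (4, 16), (8, 8), (32, 8), (16, 16)]:
        print((w, h), lfnst_matrix_shape(w, h, decoder=True),
              lfnst_matrix_shape(w, h, decoder=False))
```

Note that a 4x8 block falls under the first case, because its smaller dimension is 4 even though one side is 8.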
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280047809.2A CN117597934A (zh) | 2021-06-16 | 2022-06-16 | Method and apparatus for designing low-frequency non-separable transform |
EP22825338.1A EP4358517A1 (en) | 2021-06-16 | 2022-06-16 | Method and device for designing low-frequency non-separable transform |
KR1020237043359A KR20240010480A (ko) | 2021-06-16 | 2022-06-16 | Method and apparatus for designing low-frequency non-separable transform |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163211529P | 2021-06-16 | 2021-06-16 | |
US63/211,529 | 2021-06-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022265415A1 (ko) | 2022-12-22 |
Family
ID=84526669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2022/008512 WO2022265415A1 (ko) | Method and apparatus for designing low-frequency non-separable transform | 2021-06-16 | 2022-06-16 |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4358517A1 (ko) |
KR (1) | KR20240010480A (ko) |
CN (1) | CN117597934A (ko) |
WO (1) | WO2022265415A1 (ko) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
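KR20200028373A (ko) * | 2010-09-28 | 2020-03-16 | Samsung Electronics Co., Ltd. | Video encoding method and apparatus, and video decoding method and apparatus |
KR20200057991A (ko) * | 2018-11-19 | 2020-05-27 | Humax Co., Ltd. | Method and apparatus for deriving and generating DST-7 and DCT-8 transform kernels for a video signal |
KR20200086735A (ko) * | 2018-09-02 | 2020-07-17 | LG Electronics Inc. | Method and apparatus for processing an image signal |
KR20200086732A (ko) * | 2018-09-05 | 2020-07-17 | LG Electronics Inc. | Method for encoding/decoding a video signal and apparatus therefor |
KR102231975B1 (ko) * | 2014-07-25 | 2021-03-24 | Intel Corporation | Techniques for performing a forward transform by a video encoder using a forward transform matrix |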
2022
- 2022-06-16 KR: application KR1020237043359A (KR20240010480A, ko), status unknown
- 2022-06-16 WO: application PCT/KR2022/008512 (WO2022265415A1, ko), active, Application Filing
- 2022-06-16 EP: application EP22825338.1A (EP4358517A1, en), active, Pending
- 2022-06-16 CN: application CN202280047809.2A (CN117597934A, zh), active, Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
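KR20200028373A (ko) * | 2010-09-28 | 2020-03-16 | Samsung Electronics Co., Ltd. | Video encoding method and apparatus, and video decoding method and apparatus |
KR102231975B1 (ko) * | 2014-07-25 | 2021-03-24 | Intel Corporation | Techniques for performing a forward transform by a video encoder using a forward transform matrix |
KR20200086735A (ko) * | 2018-09-02 | 2020-07-17 | LG Electronics Inc. | Method and apparatus for processing an image signal |
KR20200086732A (ko) * | 2018-09-05 | 2020-07-17 | LG Electronics Inc. | Method for encoding/decoding a video signal and apparatus therefor |
KR20200057991A (ko) * | 2018-11-19 | 2020-05-27 | Humax Co., Ltd. | Method and apparatus for deriving and generating DST-7 and DCT-8 transform kernels for a video signal |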
Also Published As
Publication number | Publication date |
---|---|
EP4358517A1 (en) | 2024-04-24 |
CN117597934A (zh) | 2024-02-23 |
KR20240010480A (ko) | 2024-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
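WO2020149648A1 (ko) | Image coding method and apparatus using a transform skip flag | |
WO2020046091A1 (ko) | Image coding method based on multiple transform selection and apparatus therefor | |
WO2020246806A1 (ko) | Matrix-based intra prediction apparatus and method | |
WO2021206445A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021086055A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2020050651A1 (ko) | Image coding method based on multiple transform selection and apparatus therefor | |
WO2020159316A1 (ko) | Image coding method based on secondary transform and apparatus therefor | |
WO2020130661A1 (ko) | Image coding method based on secondary transform and apparatus therefor | |
WO2020242183A1 (ko) | Image coding method based on wide-angle intra prediction and transform, and apparatus therefor | |
WO2021167421A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021040410A1 (ko) | Image decoding method for residual coding and apparatus therefor | |
WO2021025530A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021201549A1 (ko) | Image decoding method for residual coding and apparatus therefor | |
WO2021137556A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021096293A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021010680A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2020145720A1 (ko) | Image coding method based on secondary transform and apparatus therefor | |
WO2020145582A1 (ko) | Image coding method based on secondary transform and apparatus therefor | |
WO2020130581A1 (ko) | Image coding method based on secondary transform and apparatus therefor | |
WO2022220546A1 (ko) | Method and apparatus for designing low-frequency non-separable transform | |
WO2022182083A1 (ko) | Image coding method and apparatus therefor | |
WO2021141478A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021060827A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021071283A1 (ko) | Transform-based image coding method and apparatus therefor | |
WO2021086064A1 (ko) | Transform-based image coding method and apparatus therefor |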
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22825338; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 20237043359; Country of ref document: KR; Kind code of ref document: A |
 | WWE | WIPO information: entry into national phase | Ref document number: 1020237043359; Country of ref document: KR |
 | WWE | WIPO information: entry into national phase | Ref document number: 202280047809.2; Country of ref document: CN |
 | WWE | WIPO information: entry into national phase | Ref document number: 2022825338; Country of ref document: EP |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2022825338; Country of ref document: EP; Effective date: 20240116 |