WO2020050668A1 - Method and apparatus for processing image signal - Google Patents
- Publication number
- WO2020050668A1 (PCT/KR2019/011517)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transform
- nsst
- block
- current block
- unit
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
- H04N19/625—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- The present invention relates to a method and apparatus for processing a video signal, and more particularly, to a method and apparatus for encoding or decoding a video signal by performing a transform.
- Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or storing it in a form suitable for a storage medium.
- Media such as video, image, and audio may be the subject of compression encoding, and a technique for performing compression encoding on an image is referred to as video image compression.
- Next-generation video content will have high spatial resolution, a high frame rate, and high dimensionality of scene representation. Processing such content will require a huge increase in memory storage, memory access rate, and processing power.
- The video codec standard following the high efficiency video coding (HEVC) standard requires an efficient transform technique for converting a video signal from the spatial domain into the frequency domain, along with a prediction technique of higher accuracy.
- Embodiments of the present invention are to provide a video signal processing method and apparatus that applies a transform having high coding efficiency and low complexity.
- A method for decoding a video signal includes: determining an input length and an output length of a non-separable transform based on the height and width of a current block; determining a non-separable transform matrix corresponding to the input length and the output length of the non-separable transform; and applying the non-separable transform matrix to as many coefficients of the current block as the input length, wherein the height and width of the current block are greater than or equal to 8, and, if the height and width of the current block are each 8, the input length of the non-separable transform is determined to be 8.
- The input length of the non-separable transform may be determined to be 16.
- The output length may be determined to be 48 or 64.
- The step of applying the non-separable transform matrix to the current block may include applying the non-separable transform matrix to the top-left 4x4 region of the current block when the height and the width are each 8 and the product of the width and the height is less than a threshold value.
- The determining of the non-separable transform matrix may include: determining a non-separable transform set index based on the intra prediction mode of the current block; determining a non-separable transform kernel corresponding to a non-separable transform index within the non-separable transform set indicated by the non-separable transform set index; and determining the non-separable transform matrix from the non-separable transform kernel based on the input length and the output length.
- An image signal processing apparatus includes a memory for storing the image signal and a processor coupled with the memory, wherein the processor is configured to determine an input length and an output length of a non-separable transform based on the height and width of a current block, determine a non-separable transform matrix corresponding to the input length and the output length of the non-separable transform, and apply the non-separable transform matrix to as many coefficients of the current block as the input length.
- The input length of the non-separable transform is 8.
- The output length is determined to be greater than the input length and less than or equal to 64.
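- The length selection and matrix application summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the placeholder kernel, and the choice of 16 as the input length for every block other than 8x8 are assumptions made only for illustration (the text states only that 16 "may" be used).

```python
import numpy as np

def nsst_lengths(height, width, output_length=48):
    """Pick input/output lengths of the inverse non-separable transform.

    Assumption: every block other than 8x8 uses an input length of 16.
    The output length is taken from the caller and must be 48 or 64.
    """
    if height < 8 or width < 8:
        raise ValueError("this sketch only covers height and width >= 8")
    if output_length not in (48, 64):
        raise ValueError("output length is expected to be 48 or 64")
    input_length = 8 if (height == 8 and width == 8) else 16
    return input_length, output_length

def inverse_non_separable(coeffs, kernel, input_length, output_length):
    """Map input_length coefficients to output_length values.

    kernel is a hypothetical forward kernel (rows = outputs of the forward
    transform); the inverse uses the transpose of its top-left sub-matrix.
    """
    matrix = kernel[:input_length, :output_length]
    return matrix.T @ np.asarray(coeffs[:input_length], dtype=float)

# Placeholder 16x64 kernel with orthonormal rows (not a trained kernel).
kernel = np.eye(64)[:16]
in_len, out_len = nsst_lengths(8, 8)
restored = inverse_non_separable(np.arange(1, 17), kernel, in_len, out_len)
```

- With an 8x8 block, only the first 8 coefficients are consumed and 48 values are produced, which matches the "output length greater than input length" relation stated above.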
- FIG. 1 is an embodiment to which the present invention is applied, and shows a schematic block diagram of an encoding device in which encoding of a video / image signal is performed.
- FIG. 2 is an embodiment to which the present invention is applied, and shows a schematic block diagram of a decoding apparatus in which decoding of a video signal is performed.
- FIG. 3 shows embodiments to which the present invention can be applied: FIG. 3A illustrates a partitioning structure based on a quadtree (QT), FIG. 3B on a binary tree (BT), FIG. 3C on a ternary tree (TT), and FIG. 3D on an asymmetric tree (AT).
- FIGS. 4 and 5 are embodiments to which the present invention is applied; FIG. 4 shows a schematic block diagram of the transform and quantization unit and the inverse quantization and inverse transform unit in the encoding apparatus of FIG. 1, and FIG. 5 shows a schematic block diagram of an inverse quantization and inverse transform unit.
- FIG. 6 is an embodiment to which the present invention is applied, and shows a flowchart for encoding a video signal through primary and secondary transforms.
- FIG. 7 is an embodiment to which the present invention is applied, and shows a flowchart for decoding a video signal through secondary inverse transform and primary inverse transform.
- FIG. 8 shows an example of a transform configuration group to which an adaptive multiple transform (AMT) is applied according to an embodiment of the present invention.
- FIG. 9 shows an example of an encoding flowchart to which AMT is applied according to an embodiment of the present invention.
- FIG. 10 shows an example of a decoding flowchart to which AMT is applied according to an embodiment of the present invention.
- FIG. 11 shows an example of a flowchart for encoding an AMT flag and an AMT index according to an embodiment of the present invention.
- FIG. 12 shows an example of a decoding flowchart for performing transformation based on the AMT flag and AMT index.
- FIGS. 13 and 14 are embodiments to which the present invention is applied; FIG. 13 shows a diagram for explaining a Givens rotation, and FIG. 14 shows a 4x4 non-separable secondary transform (NSST) composed of a Givens rotation layer and permutations.
- FIG. 15 shows an example of a configuration of a non-separated transform set for each intra prediction mode according to an embodiment of the present invention.
- FIG. 16 shows three forward scan sequences for transform coefficients or transform coefficient blocks applied in a high efficiency video coding (HEVC) standard, (a) is a diagonal scan, (b) is a horizontal scan, (c) shows a vertical scan.
- FIGS. 17 and 18 are embodiments to which the present invention is applied; FIG. 17 shows the positions of transform coefficients under a forward diagonal scan when a 4x4 RST is applied to a 4x8 block, and FIG. 18 shows an example in which the valid transform coefficients of two 4x4 blocks are merged into one block.
- FIG. 19 is an embodiment to which the present invention is applied, and shows an example of a method of configuring a mixed NSST set for each intra prediction mode.
- FIG. 20 is an embodiment to which the present invention is applied, and shows an example of a method of selecting an NSST set (or kernel) in consideration of an intra prediction mode and a transform block size.
- FIGS. 21A and 21B show forward and inverse reduced transforms as an embodiment to which the present invention is applied.
- FIG. 22 shows an example of a decoding flowchart using a reduced transform according to an embodiment of the present invention.
- FIG. 23 shows an example of a flow chart for application of a conditional reduced transform according to an embodiment of the present invention.
- FIG. 24 shows an example of a decoding flowchart for a secondary inverse transform to which a conditional reduced transform according to an embodiment of the present invention is applied.
- FIGS. 25A, 25B, 26A, and 26B show examples of a reduced transform and a reduced inverse transform according to an embodiment of the present invention.
- FIG. 27 shows an example of a region to which a reduced secondary transform is applied according to an embodiment of the present invention.
- FIG. 29 is an embodiment to which the present invention is applied, and shows an example of an encoding flowchart for performing a transform.
- FIG. 30 is an embodiment to which the present invention is applied, and shows an example of a decoding flowchart for performing a transform.
- FIG. 31 is an embodiment to which the present invention is applied, and shows an example of a detailed block diagram of a transform unit in an encoding apparatus.
- FIG. 32 is an embodiment to which the present invention is applied, and shows an example of a detailed block diagram of an inverse transform unit in a decoding apparatus.
- FIG. 34 is an embodiment to which the present invention is applied, and shows an example of a block diagram of an apparatus for processing a video signal.
- FIG. 35 shows an example of a video coding system as an embodiment to which the present invention is applied.
- FIG. 36 is an embodiment to which the present invention is applied, and is a structural diagram of a content streaming system.
- The term 'processing unit' in this specification means a unit in which an encoding/decoding process such as prediction, transformation, and/or quantization is performed.
- The processing unit may be interpreted to include a unit for a luminance (luma) component and a unit for a chrominance (chroma) component.
- For example, the processing unit may correspond to a block, a coding unit (CU), a prediction unit (PU), or a transform unit (TU).
- The processing unit may also be interpreted as a unit for a luma component or a unit for a chroma component.
- For example, the processing unit may correspond to a coding tree block (CTB), a coding block (CB), a PU, or a transform block (TB) for the luma component.
- Alternatively, the processing unit may correspond to a CTB, CB, PU, or TB for the chroma component.
- The present invention is not limited thereto, and the processing unit may be interpreted to include a unit for a luma component and a unit for a chroma component.
- The processing unit is not necessarily limited to square blocks and may have a polygonal shape with three or more vertices.
- In this specification, a pixel, a pel, or a coefficient (a transform coefficient, or a transform coefficient that has undergone a primary transform) is collectively referred to as a sample, and using a sample may mean using a pixel value or a coefficient (a transform coefficient, or a transform coefficient that has undergone a primary transform).
- Embodiments of the present invention provide an image and video compression method and apparatus.
- the compressed data has the form of a bitstream, and the bitstream may be stored in various types of storage or streamed through a network and delivered to a terminal having a decoder.
- the decoded image may be displayed by the display device or simply bitstream data may be stored.
- The method and apparatus proposed in the embodiments of the present invention can be applied to both an encoder and a decoder, to a device that generates a bitstream or a device that receives a bitstream, and can be applied regardless of whether the terminal outputs through a display device.
- The video compression device is largely composed of a prediction unit, a transform and quantization unit, and an entropy coding unit, and schematic block diagrams of the encoding device and the decoding device are shown in FIGS. 1 and 2.
- The transform and quantization unit converts the residual signal, obtained by subtracting the prediction signal from the original signal, into a frequency-domain signal through a transform such as the DCT-2 (discrete cosine transform type 2), and then applies quantization to significantly reduce the number of non-zero values.
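- The residual, transform, and quantization steps described above can be sketched as follows. This is a minimal illustration, assuming an orthonormal separable 2D DCT-2 and plain uniform quantization by rounding; the function names and the step size are hypothetical and are not taken from the specification.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-2 basis; row k holds the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)
    return basis

def transform_and_quantize(original, prediction, step):
    """Residual -> separable 2D DCT-2 -> uniform quantization by rounding."""
    residual = np.asarray(original, dtype=float) - np.asarray(prediction, dtype=float)
    rows, cols = residual.shape
    coeffs = dct2_matrix(rows) @ residual @ dct2_matrix(cols).T
    return np.rint(coeffs / step).astype(int)

# A flat residual of 3 concentrates all energy in the DC coefficient.
original = np.full((4, 4), 103.0)
prediction = np.full((4, 4), 100.0)
levels = transform_and_quantize(original, prediction, step=2.0)
```

- For this flat residual, 16 non-zero residual samples become a single non-zero quantized level, which is the reduction in non-zero values that the paragraph refers to.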
- FIG. 1 is an embodiment to which the present invention is applied, and shows a schematic block diagram of an encoding device in which encoding of a video / image signal is performed.
- the image splitter 110 may divide the input image (or picture, frame) input to the encoding apparatus 100 into one or more processing units.
- the processing unit may be referred to as a coding unit (CU).
- the coding unit may be recursively divided according to a quad-tree binary-tree (QTBT) structure from a coding tree unit (CTU) or a largest coding unit (LCU).
- one coding unit may be divided into a plurality of coding units of a deeper depth based on a quad tree structure and / or a binary tree structure.
- a quad tree structure may be applied first, and a binary tree structure may be applied later.
- a binary tree structure may be applied first.
- the coding procedure according to the present invention can be performed based on the final coding unit that is no longer split.
- The largest coding unit may be used directly as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of deeper depth so that a coding unit of optimal size is used as the final coding unit.
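- The recursive partitioning described above, with the quad tree applied first and the binary tree applied later, can be sketched as follows. The `choose_split` callback stands in for the encoder's decision (for example, a rate-distortion comparison); it and all other names here are hypothetical.

```python
def partition(x, y, w, h, choose_split, allow_quad=True):
    """Return the final coding units as (x, y, width, height) tuples.

    choose_split(x, y, w, h, allow_quad) returns "quad", "hor", "ver" or None.
    Once a binary split is taken, quad splits are no longer allowed below it.
    """
    split = choose_split(x, y, w, h, allow_quad)
    if split == "quad" and allow_quad:
        hw, hh = w // 2, h // 2
        leaves = []
        for dy in (0, hh):
            for dx in (0, hw):
                leaves += partition(x + dx, y + dy, hw, hh, choose_split, True)
        return leaves
    if split == "hor":  # split into top and bottom halves
        hh = h // 2
        return (partition(x, y, w, hh, choose_split, False)
                + partition(x, y + hh, w, hh, choose_split, False))
    if split == "ver":  # split into left and right halves
        hw = w // 2
        return (partition(x, y, hw, h, choose_split, False)
                + partition(x + hw, y, hw, h, choose_split, False))
    return [(x, y, w, h)]  # final coding unit, not split further

def example_choice(x, y, w, h, allow_quad):
    """Toy decision rule used only for this example."""
    if allow_quad and w > 8:
        return "quad"
    if (x, y, w, h) == (0, 0, 8, 8):
        return "ver"
    return None

leaves = partition(0, 0, 16, 16, example_choice)
```

- Here a 16x16 unit is quad-split into four 8x8 units, and only the top-left one is split further by a vertical binary split, giving five final coding units.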
- the coding procedure may include procedures such as prediction, transformation, and reconstruction, which will be described later.
- The processing unit may further include a prediction unit (PU) or a transform unit (TU).
- The prediction unit and the transform unit may each be split or partitioned from the above-described final coding unit.
- The prediction unit may be a unit of sample prediction.
- The transform unit may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from the transform coefficient.
- The term unit may be used interchangeably with terms such as block or area, depending on the case.
- An MxN block may represent a set of samples or transform coefficients consisting of M columns and N rows.
- A sample may generally represent a pixel or a pixel value, and may represent only a pixel/pixel value of the luma component or only a pixel/pixel value of the chroma component.
- Sample may be used as a term corresponding to a pixel or pel of one picture (or image).
- The encoding apparatus 100 may generate a residual signal (residual block, residual sample array) by subtracting the prediction signal (predicted block, prediction sample array) output from the inter prediction unit 180 or the intra prediction unit 185 from the input image signal (original block, original sample array).
- The unit that subtracts the prediction signal (prediction block, prediction sample array) from the input image signal (original block, original sample array) in the encoding apparatus 100 may be referred to as the subtraction unit 115.
- the prediction unit may perform prediction on a block to be processed (hereinafter referred to as a current block), and generate a predicted block including prediction samples for the current block.
- the prediction unit may determine whether intra prediction or inter prediction is applied in units of a current block or CU. As described later in the description of each prediction mode, the prediction unit may generate various information regarding prediction, such as prediction mode information, and transmit it to the entropy encoding unit 190.
- the prediction information may be encoded by the entropy encoding unit 190 and output in the form of a bitstream.
- the intra prediction unit 185 may predict the current block by referring to samples in the current picture.
- The referenced samples may be located in the neighborhood of the current block or away from it, depending on the prediction mode.
- Prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
- The non-directional modes may include, for example, a DC mode and a planar mode.
- The directional modes may include, for example, 33 directional prediction modes or 65 directional prediction modes, depending on how finely the prediction direction is resolved. However, this is an example, and more or fewer directional prediction modes may be used depending on the setting.
- the intra prediction unit 185 may determine a prediction mode applied to the current block using a prediction mode applied to neighboring blocks.
- the inter prediction unit 180 may derive a predicted block for the current block based on a reference block (reference sample array) specified by a motion vector on the reference picture.
- motion information may be predicted in units of blocks, subblocks, or samples based on the correlation of motion information between a neighboring block and a current block.
- the motion information may include a motion vector and a reference picture index.
- the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
- the neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block present in the reference picture.
- the reference picture including the reference block and the reference picture including the temporal neighboring block may be the same or different.
- the temporal neighboring block may be referred to by a name such as a collocated reference block or a colCU, and a reference picture including a temporal neighboring block may also be called a collocated picture (colPic).
- The inter prediction unit 180 may construct a motion information candidate list based on neighboring blocks, and may generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes. For example, in the case of the skip mode and the merge mode, the inter prediction unit 180 may use motion information of a neighboring block as motion information of the current block.
- The residual signal may not be transmitted.
- The motion vector of the current block may be indicated by using the motion vector of a neighboring block as a motion vector predictor and signaling a motion vector difference.
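- The two ways of deriving motion described above (reusing a neighbor's motion vector, or adding a signaled difference to a predictor) can be sketched as follows. The candidate ordering and the removal of duplicate candidates are assumptions of this sketch, and all function names are hypothetical.

```python
def build_candidates(neighbor_mvs):
    """Collect the distinct motion vectors of available neighbors, in order.

    None marks an unavailable neighbor. Dropping duplicates is an assumption
    of this sketch, not something stated in the text.
    """
    candidates = []
    for mv in neighbor_mvs:
        if mv is not None and mv not in candidates:
            candidates.append(mv)
    return candidates

def derive_motion_vector(neighbor_mvs, candidate_index, mvd=None):
    """Derive the current block's motion vector from the signaled index.

    With no difference (skip/merge-like case) the neighbor's vector is reused;
    otherwise the signaled difference mvd is added to the predictor.
    """
    mvp = build_candidates(neighbor_mvs)[candidate_index]
    if mvd is None:
        return mvp
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

neighbors = [(4, -2), None, (4, -2), (0, 1)]
merged = derive_motion_vector(neighbors, candidate_index=0)
predicted = derive_motion_vector(neighbors, candidate_index=1, mvd=(2, 3))
```

- Only the candidate index and, where used, the difference need to be signaled, since the decoder can rebuild the same candidate list from already decoded neighbors.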
- the prediction signal generated by the inter prediction unit 180 or the intra prediction unit 185 may be used to generate a reconstructed signal or may be used to generate a residual signal.
- the transform unit 120 may generate transform coefficients by applying a transform technique to the residual signal.
- the transform technique may include at least one of DCT, Discrete Sine Transform (DST), Karhunen-Loeve Transform (KLT), Graph-Based Transform (GBT), or Conditionally Non-linear Transform (CNT).
- GBT refers to a transform obtained from a graph when the relationship information between pixels is represented as a graph.
- CNT refers to a transform obtained based on a prediction signal generated using all previously reconstructed pixels.
- The transform process may be applied to square pixel blocks of the same size, or to non-square blocks of variable size.
- The quantization unit 130 quantizes the transform coefficients and transmits them to the entropy encoding unit 190, and the entropy encoding unit 190 may encode the quantized signal (information about the quantized transform coefficients) and output it as a bitstream.
- Information about the quantized transform coefficients may be referred to as residual information.
- The quantization unit 130 may rearrange the block-form quantized transform coefficients into a one-dimensional vector based on a coefficient scan order, and may generate the information about the quantized transform coefficients based on the quantized transform coefficients in one-dimensional vector form.
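- Rearranging a block of coefficients into a one-dimensional vector can be sketched as follows. The order used here is a simple anti-diagonal traversal chosen for illustration; it is not claimed to match the exact scan tables of any particular standard, and the function names are hypothetical.

```python
def diagonal_scan_order(height, width):
    """List (row, col) positions diagonal by diagonal, starting at the top-left.

    Within each anti-diagonal the traversal runs from bottom-left to top-right.
    """
    positions = [(r, c) for r in range(height) for c in range(width)]
    return sorted(positions, key=lambda rc: (rc[0] + rc[1], rc[1]))

def scan_to_vector(block):
    """Flatten a 2D block (list of rows) into a 1D list along the scan order."""
    height, width = len(block), len(block[0])
    return [block[r][c] for r, c in diagonal_scan_order(height, width)]

vector = scan_to_vector([[1, 2],
                         [3, 4]])
```

- A horizontal or vertical scan would differ only in the sort key, for example `(row, col)` or `(col, row)`.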
- The entropy encoding unit 190 may perform various encoding methods such as exponential Golomb coding, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC).
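- Of the methods listed above, exponential Golomb coding is the simplest to show in a few lines. The sketch below covers the order-0 code for unsigned integers, with bits represented as a string for readability; the function names are hypothetical.

```python
def exp_golomb_encode(value):
    """Order-0 exponential Golomb code of an unsigned integer, as a bit string."""
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    binary = bin(value + 1)[2:]                 # binary form of value + 1
    return "0" * (len(binary) - 1) + binary     # zero prefix, then the binary form

def exp_golomb_decode(bits):
    """Decode one codeword from the front; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":                   # count the zero prefix
        zeros += 1
    codeword = bits[zeros:2 * zeros + 1]        # zeros + 1 bits, starting at the first 1
    return int(codeword, 2) - 1, bits[2 * zeros + 1:]

stream = "".join(exp_golomb_encode(v) for v in (0, 3, 1))
first, rest = exp_golomb_decode(stream)
```

- Small values get short codewords (0 maps to a single bit), which suits syntax elements whose small values are the most frequent.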
- The entropy encoding unit 190 may encode information necessary for video/image reconstruction (e.g., values of syntax elements) together with the quantized transform coefficients or separately from them.
- The encoded information (e.g., encoded video/image information) may be transmitted or stored in units of network abstraction layer (NAL) units in the form of a bitstream.
- the bitstream can be transmitted over a network or stored on a digital storage medium.
- the network may include a broadcasting network and / or a communication network
- the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD.
- A transmitting unit (not shown) that transmits the signal output from the entropy encoding unit 190 and/or a storage unit (not shown) that stores it may be configured as an internal or external element of the encoding apparatus 100, or the transmitting unit may be a component of the entropy encoding unit 190.
- the quantized transform coefficients output from the quantization unit 130 may be used to generate a prediction signal.
- the residual signal may be reconstructed by applying inverse quantization and inverse transform to the quantized transform coefficients through the inverse quantization unit 140 and the inverse transform unit 150 in the loop.
- The adder 155 adds the reconstructed residual signal to the prediction signal output from the inter prediction unit 180 or the intra prediction unit 185, so that a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) can be generated. If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as the reconstructed block.
- The adder 155 may be called a reconstruction unit or a reconstructed block generator.
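- The in-loop reconstruction path described above (inverse quantization, inverse transform, then addition to the prediction) can be sketched as follows. It assumes an orthonormal separable DCT-2 and uniform quantization with a single step size; these choices and the function names are assumptions of the sketch, not details given in the text.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-2 basis; row k holds the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)
    return basis

def reconstruct(prediction, levels, step):
    """Rebuild a block from its prediction and quantized coefficient levels.

    levels=None models a block without residual (e.g. skip mode): the
    prediction itself is returned as the reconstructed block.
    """
    prediction = np.asarray(prediction, dtype=float)
    if levels is None:
        return prediction.copy()
    coeffs = np.asarray(levels, dtype=float) * step                # inverse quantization
    rows, cols = coeffs.shape
    residual = dct2_matrix(rows).T @ coeffs @ dct2_matrix(cols)    # inverse 2D DCT-2
    return prediction + residual                                   # adder

prediction = np.full((4, 4), 100.0)
levels = np.zeros((4, 4), dtype=int)
levels[0, 0] = 6                      # only a DC level survived quantization
reconstructed = reconstruct(prediction, levels, step=2.0)
skipped = reconstruct(prediction, None, step=2.0)
```

- Because the encoder runs the same steps as the decoder, both sides hold the same reconstructed samples for later prediction.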
- the generated reconstructed signal may be used for intra prediction of the next processing target block in the current picture, or may be used for inter prediction of the next picture through filtering as described below.
- The filtering unit 160 may apply filtering to the reconstructed signal to improve subjective/objective image quality.
- the filtering unit 160 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may transmit the modified reconstructed picture to the decoded picture buffer 170.
- Various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and the like.
- the filtering unit 160 may generate various information regarding filtering as described later in the description of each filtering method and transmit it to the entropy encoding unit 190.
- the filtering information may be encoded by the entropy encoding unit 190 and output in the form of a bitstream.
- the modified reconstructed picture transmitted to the decoded picture buffer 170 may be used as a reference picture in the inter prediction unit 180.
- When inter prediction is applied in this way, prediction mismatch between the encoding apparatus 100 and the decoding apparatus can be avoided, and encoding efficiency can be improved.
- The decoded picture buffer 170 may store the modified reconstructed picture for use as a reference picture in the inter prediction unit 180.
- FIG. 2 is an embodiment to which the present invention is applied, and shows a schematic block diagram of a decoding apparatus in which decoding of a video signal is performed.
- the decoding apparatus 200 includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an adding unit 235, a filtering unit 240, and a decoded picture buffer (DPB). 250, an inter prediction unit 260, and an intra prediction unit 265.
- the inter prediction unit 260 and the intra prediction unit 265 may be collectively called a prediction unit. That is, the prediction unit may include an inter prediction unit 180 and an intra prediction unit 185.
- the inverse quantization unit 220 and the inverse conversion unit 230 may be collectively referred to as a residual processing unit. That is, the residual processing unit may include an inverse quantization unit 220 and an inverse conversion unit 230.
- the entropy decoding unit 210, the inverse quantization unit 220, the inverse transform unit 230, the addition unit 235, the filtering unit 240, the inter prediction unit 260, and the intra prediction unit 265 described above are embodiments. It may be configured by one hardware component (for example, a decoder or processor). Also, the decoded picture buffer 250 may be implemented by one hardware component (eg, a memory or digital storage medium) according to an embodiment.
- the decoding apparatus 200 may reconstruct an image in a manner corresponding to the process by which the video / image information was processed in the encoding apparatus 100 of FIG. 1.
- the decoding apparatus 200 may perform decoding using a processing unit applied by the encoding apparatus 100.
- the processing unit of decoding may be, for example, a coding unit, and the coding unit may be divided along a quad tree structure and / or a binary tree structure from a coding tree unit or a largest coding unit. Then, the decoded video signal decoded and output through the decoding apparatus 200 may be reproduced through the reproduction apparatus.
- the decoding apparatus 200 may receive the signal output from the encoding apparatus 100 of FIG. 1 in the form of a bitstream, and the received signal may be decoded through the entropy decoding unit 210.
- the entropy decoding unit 210 may parse the bitstream to derive information (eg, video / image information) necessary for image reconstruction (or picture reconstruction).
- the entropy decoding unit 210 may decode information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements required for image reconstruction and quantized values of transform coefficients for the residual.
- the CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream, determines a context model using the syntax element information to be decoded, the decoding information of neighboring blocks and the decoding target block, or information on symbols / bins decoded in a previous step, predicts the probability of occurrence of the bin according to the determined context model, and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
- the CABAC entropy decoding method may update the context model using the decoded symbol / bin information for the next symbol / bin context model after determining the context model.
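- information about prediction among the information decoded by the entropy decoding unit 210 may be provided to the prediction unit (the inter prediction unit 260 and the intra prediction unit 265), and the residual values on which entropy decoding has been performed by the entropy decoding unit 210, that is, the quantized transform coefficients and related parameter information, may be input to the inverse quantization unit 220.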
- information related to filtering among information decoded by the entropy decoding unit 210 may be provided to the filtering unit 240.
- a receiving unit (not shown) that receives the signal output from the encoding apparatus 100 may be further configured as an internal / external element of the decoding apparatus 200, or the receiving unit may be a component of the entropy decoding unit 210.
- the inverse quantization unit 220 may inverse quantize the quantized transform coefficients to output transform coefficients.
- the inverse quantization unit 220 may rearrange the quantized transform coefficients in a two-dimensional block form. In this case, reordering may be performed based on the coefficient scan order performed by the encoding apparatus 100.
- the inverse quantization unit 220 may perform inverse quantization on quantized transform coefficients by using a quantization parameter (for example, quantization step size information), and obtain transform coefficients.
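- As a rough illustration of the two steps above (rearranging the coefficients into a two-dimensional block, then scaling by the quantization step), the following Python sketch may help. The function names, the anti-diagonal scan, and the single uniform step size are assumptions made for illustration and are not taken from the specification.

```python
def diagonal_scan(width, height):
    """Hypothetical scan order: walk each anti-diagonal from bottom-left to top-right."""
    order = []
    for s in range(width + height - 1):
        for row in range(min(s, height - 1), -1, -1):
            col = s - row
            if col < width:
                order.append((row, col))
    return order


def coefficients_to_block(levels, width, height, scan):
    """Place a 1-D list of quantized levels into a 2-D block following the scan order."""
    block = [[0] * width for _ in range(height)]
    for value, (row, col) in zip(levels, scan):
        block[row][col] = value
    return block


def inverse_quantize(block, step_size):
    """Scale every quantized level by one uniform step size (a simplifying assumption)."""
    return [[level * step_size for level in row] for row in block]


levels = [5, -2, 3, 0]
block = coefficients_to_block(levels, 2, 2, diagonal_scan(2, 2))
coefficients = inverse_quantize(block, step_size=4)
```

- In this sketch the scan used for reordering must match the scan the encoder used to serialize the coefficients; any other agreed order could be passed in place of `diagonal_scan`.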
- the inverse transform unit 230 obtains a residual signal (residual block, residual sample array) by inverse transforming the transform coefficients.
- the prediction unit may perform prediction on the current block and generate a predicted block including prediction samples for the current block.
- the prediction unit may determine whether intra prediction or inter prediction is applied to the current block based on the information on the prediction output from the entropy decoding unit 210, and may determine a specific intra / inter prediction mode.
- the intra prediction unit 265 may predict the current block by referring to a sample in the current picture.
- the referenced sample may be located in the neighborhood of the current block or spaced apart depending on the prediction mode.
- prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
- the intra prediction unit 265 may determine a prediction mode applied to the current block using a prediction mode applied to neighboring blocks.
- the inter prediction unit 260 may derive the predicted block for the current block based on the reference block (reference sample array) specified by the motion vector on the reference picture.
- motion information may be predicted in units of blocks, subblocks, or samples based on the correlation of motion information between neighboring blocks and the current block.
- the motion information may include a motion vector and a reference picture index.
- the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
- the neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block present in the reference picture.
- the inter prediction unit 260 may configure a motion information candidate list based on information related to prediction of neighboring blocks, and derive a motion vector and / or reference picture index of the current block based on the received candidate selection information.
- Inter prediction may be performed based on various prediction modes, and information regarding prediction may include information indicating a mode of inter prediction for a current block.
- the adding unit 235 adds the obtained residual signal to the prediction signal (predicted block, prediction sample array) output from the inter prediction unit 260 or the intra prediction unit 265, thereby generating a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array). If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as the reconstructed block.
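- The addition performed by the adding unit can be sketched as follows. This is a minimal Python illustration; the function name `reconstruct_block` and the use of `None` to represent a missing residual (for example, skip mode) are assumptions.

```python
def reconstruct_block(predicted, residual=None):
    """Add the residual to the prediction sample by sample.

    When no residual is available (for example, skip mode), the predicted
    block itself is returned as the reconstructed block.
    """
    if residual is None:
        return [row[:] for row in predicted]
    return [
        [p + r for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


predicted = [[100, 102], [98, 101]]
residual = [[3, -2], [0, 1]]
reconstructed = reconstruct_block(predicted, residual)
skipped = reconstruct_block(predicted)
```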
- the adding unit 235 may be referred to as a restoration unit or a restoration block generation unit.
- the generated reconstructed signal may be used for intra prediction of the next processing target block in the current picture, or may be used for inter prediction of the next picture through filtering as described below.
- the filtering unit 240 may improve subjective / objective image quality by applying filtering to the reconstructed signal. For example, the filtering unit 240 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may transmit the modified reconstructed picture to the decoded picture buffer 250.
- Various filtering methods may include, for example, deblocking filtering, sample adaptive offset (SAO), adaptive loop filter (ALF), bilateral filter.
- the corrected reconstructed picture transmitted to the decoded picture buffer 250 may be used as a reference picture by the inter prediction unit 260.
- the embodiments described for the filtering unit 160, the inter prediction unit 180, and the intra prediction unit 185 of the encoding apparatus 100 may be applied in the same or a corresponding manner to the filtering unit 240, the inter prediction unit 260, and the intra prediction unit 265 of the decoding apparatus 200, respectively.
- FIG. 3 shows embodiments to which the present invention can be applied:
- FIG. 3a illustrates block partitioning by QT (quadtree),
- FIG. 3b illustrates block partitioning by BT (binary tree),
- FIG. 3c illustrates block partitioning by TT (ternary tree), and
- FIG. 3d illustrates block partitioning by AT (asymmetric tree).
- one block can be divided based on QT.
- one subblock divided by QT may be further divided recursively using QT.
- a leaf block that is no longer QT split may be split by at least one of BT, TT, or AT.
- BT may have two types of splitting: horizontal BT (2NxN, 2NxN) and vertical BT (Nx2N, Nx2N).
- TT may have two types of splitting: horizontal TT (2Nx1/2N, 2NxN, 2Nx1/2N) and vertical TT (1/2Nx2N, Nx2N, 1/2Nx2N).
- AT may have four types of splitting: horizontal-up AT (2Nx1/2N, 2Nx3/2N), horizontal-down AT (2Nx3/2N, 2Nx1/2N), vertical-left AT (1/2Nx2N, 3/2Nx2N), and vertical-right AT (3/2Nx2N, 1/2Nx2N).
- Each BT, TT, AT can be further divided recursively using BT, TT, AT.
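- The sub-block sizes produced by each split type listed above can be tabulated with a short Python sketch. The function name `split_block` and the string labels for the split types are hypothetical; the sketch also assumes the block width and height are divisible by 4 so that all sizes are integers.

```python
def split_block(width, height, split_type):
    """Return the (width, height) of each sub-block for one split of a 2Nx2N-style block."""
    w, h = width, height
    table = {
        "QT": [(w // 2, h // 2)] * 4,
        "BT_HOR": [(w, h // 2)] * 2,
        "BT_VER": [(w // 2, h)] * 2,
        "TT_HOR": [(w, h // 4), (w, h // 2), (w, h // 4)],
        "TT_VER": [(w // 4, h), (w // 2, h), (w // 4, h)],
        "AT_HOR_UP": [(w, h // 4), (w, 3 * h // 4)],
        "AT_HOR_DOWN": [(w, 3 * h // 4), (w, h // 4)],
        "AT_VER_LEFT": [(w // 4, h), (3 * w // 4, h)],
        "AT_VER_RIGHT": [(3 * w // 4, h), (w // 4, h)],
    }
    return table[split_type]


# A 16x16 block (N = 8) split by horizontal TT gives 16x4, 16x8, 16x4.
tt_parts = split_block(16, 16, "TT_HOR")
```

- Recursive partitioning can then be modeled by calling `split_block` again on any returned sub-block size.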
- Block A can be divided into four sub-blocks (A0, A1, A2, A3) by QT.
- the sub-block A1 may be divided into four sub-blocks (B0, B1, B2, and B3) again by QT.
- Block B3 which is no longer divided by QT, may be divided into vertical BT (C0, C1) or horizontal BT (D0, D1). Each sub-block, such as block C0, may be further divided recursively, such as in the form of horizontal BT (E0, E1) or vertical BT (F0, F1).
- FIG. 3c shows an example of TT partitioning.
- Block B3 which is no longer divided by QT, may be divided into vertical TT (C0, C1, C2) or horizontal TT (D0, D1, D2).
- each sub-block may be further divided recursively in the form of a horizontal TT (E0, E1, E2) or a vertical TT (F0, F1, F2).
- Block B3, which is no longer divided by QT, may be divided into vertical AT (C0, C1) or horizontal AT (D0, D1). As in block C1, each sub-block may be further recursively divided in the form of horizontal AT (E0, E1) or vertical AT (F0, F1).
- BT, TT, AT partitioning can be used together.
- sub-blocks divided by BT can be divided by TT or AT.
- sub-blocks divided by TT can be divided by BT or AT.
- the sub-block divided by AT can be divided by BT or TT.
- after horizontal BT splitting, each sub-block may be divided into vertical BTs, or after vertical BT splitting, each sub-block may be divided into horizontal BTs. In this case, the order of division differs, but the final partition shape is the same.
- a sequence of searching for blocks may be variously defined.
- in general, the search is performed from left to right and from top to bottom. Searching for blocks may mean the order of determining whether each divided sub-block is further divided, the coding order of the blocks when each block is no longer divided, or the search order when sub-blocks refer to information of other neighboring blocks.
- transformation may be performed for each processing unit (or transform block) divided by the partitioning structure, and in particular, a transform matrix may be applied separately for the row direction and the column direction.
- different transform types may be used depending on the length of the processing unit (or transform block) in the row direction or the column direction.
- FIG. 4 shows a schematic block diagram of the transform and quantization unit 120/130 and the inverse quantization and inverse transform unit 140/150 in the encoding apparatus 100, and FIG. 5 shows a schematic block diagram of the inverse quantization and inverse transform unit 220/230 in the decoding apparatus 200.
- the transform and quantization unit 120/130 may include a primary transform unit 121, a secondary transform unit 122, and a quantization unit 130.
- the inverse quantization and inverse transform unit 140/150 may include an inverse quantization unit 140, an inverse secondary transform unit 151, and an inverse primary transform unit 152.
- the inverse quantization and inverse transform unit 220/230 may include an inverse quantization unit 220, an inverse secondary transform unit 231, and an inverse primary transform unit 232.
- when performing the transform, the transform may be carried out through a plurality of stages.
- two stages of a primary transform and a secondary transform may be applied, or further transform stages may be used according to an algorithm.
- the primary transform may be referred to as a core transform.
- the primary transform unit 121 may apply a primary transform to the residual signal, where the primary transform may be defined as a table in the encoder and / or decoder.
- the secondary transform unit 122 may apply a secondary transform to the primary-transformed signal, where the secondary transform may be defined as a table in the encoder and / or decoder.
- a non-separable secondary transform (NSST) may be applied conditionally as the secondary transform.
- NSST is applied only in the case of an intra prediction block, and may have a transform set applicable to each prediction mode group.
- the prediction mode group may be set based on symmetry with respect to the prediction direction. For example, since prediction mode 52 and prediction mode 16 are symmetric based on prediction mode 34 (diagonal direction), the same transform set may be applied by forming a group. At this time, when applying the transform for the prediction mode 52, the input data is transposed and applied, because the transform set is the same as the prediction mode 16.
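- The symmetry rule in the example above (mode 52 reuses the transform set of mode 16 with transposed input) can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes the numbering in which mode 34 is the diagonal direction, so that a mode m above 34 mirrors to 68 - m.

```python
DIAGONAL_MODE = 34  # diagonal direction, as stated in the text


def nsst_lookup_mode(intra_mode):
    """Return (mode used to look up the transform set, whether to transpose the input)."""
    if intra_mode > DIAGONAL_MODE:
        return 2 * DIAGONAL_MODE - intra_mode, True
    return intra_mode, False


def transpose(block):
    """Transpose a 2-D block given as a list of rows."""
    return [list(column) for column in zip(*block)]


lookup_mode, needs_transpose = nsst_lookup_mode(52)
block = [[1, 2], [3, 4]]
transform_input = transpose(block) if needs_transpose else block
```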
- each has a set of transformations, and the set of transformations may be composed of two transformations.
- the set of transformations may be composed of 3 transforms per transform set.
- the quantization unit 130 may perform quantization on the secondary-transformed signal.
- the inverse quantization and inverse transformation unit 140/150 performs the above-described process in reverse, and redundant description is omitted.
- FIG. 5 shows a schematic block diagram of an inverse quantization and inverse transform unit 220/230 in the decoding apparatus 200.
- the inverse quantization and inverse transform unit 220/230 may include an inverse quantization unit 220, an inverse secondary transform unit 231, and an inverse primary transform unit 232.
- the inverse quantization unit 220 obtains a transform coefficient from an entropy decoded signal using quantization step size information.
- the inverse secondary transform unit 231 performs an inverse secondary transform on the transform coefficients.
- the inverse secondary transform is the inverse of the secondary transform described with reference to FIG. 4.
- the inverse primary transform unit 232 performs an inverse primary transform on the inverse-secondary-transformed signal (or block), and obtains a residual signal.
- the inverse primary transform represents the inverse transform of the primary transform described in FIG. 4.
- FIG. 6 is an embodiment to which the present invention is applied, and shows a flowchart for encoding a video signal through primary and secondary transforms. Each operation illustrated in FIG. 6 may be performed by the transform unit 120 of the encoding apparatus 100.
- the encoding apparatus 100 may determine (or select) a forward secondary transform based on at least one of a prediction mode, a block shape, and / or a block size of the current block (S610).
- the encoding apparatus 100 may determine an optimal forward secondary transform through RD optimization (rate-distortion optimization).
- the optimal forward secondary transform may correspond to one of a plurality of transform combinations, and the plurality of transform combinations may be defined by a transform index.
- the encoding apparatus 100 may compare the results of all of the forward secondary transform, quantization, and residual coding for each candidate.
- the encoding apparatus 100 may signal a secondary transform index corresponding to an optimal forward secondary transform (S620).
- other embodiments described in the specification may be applied to the secondary transform index.
- the encoding apparatus 100 may perform forward primary transform on the current block (residual block) (S630).
- the encoding apparatus 100 may perform forward secondary transform on the current block using the optimal forward secondary transform (S640). Meanwhile, the forward secondary transform may be the reduced secondary transform (RST) described below.
- RST means a transform in which N residual data (an Nx1 residual vector) are input and R (R < N) transform coefficient data (an Rx1 transform coefficient vector) are output.
- RST may be applied to a specific area of the current block.
- a specific region may mean an upper left N / 2xN / 2 region.
- the present invention is not limited to this, and may be set differently according to at least one of a prediction mode, a block shape, or a block size.
- a specific region may mean an upper left MxM region (M ≤ N).
- the encoding apparatus 100 may generate a transform coefficient block by performing quantization on the current block (S650).
- the encoding apparatus 100 may generate a bitstream by performing entropy encoding on the transform coefficient block.
- FIG. 7 is an embodiment to which the present invention is applied, and shows a flowchart for decoding a video signal through secondary inverse transform and primary inverse transform. Each operation illustrated in FIG. 7 may be performed by the inverse transform unit 230 of the decoding apparatus 200.
- the decoding apparatus 200 may obtain a secondary transform index from the bitstream (S710).
- the decoding apparatus 200 may derive a secondary transform corresponding to the secondary transform index (S720).
- steps S710 and S720 are examples, and the present invention is not limited thereto.
- the decoding apparatus 200 may derive a secondary transform based on at least one of a prediction mode, a block shape, and / or a block size of the current block without obtaining a secondary transform index.
- the decoding apparatus 200 may entropy decode the bitstream to obtain a transform coefficient block, and perform inverse quantization on the transform coefficient block (S730).
- the decoding apparatus 200 may perform an inverse secondary transform on the inverse-quantized transform coefficient block (S740).
- the inverse secondary transform may be the inverse RST.
- the inverse RST is the transpose matrix of the RST described with reference to FIG. 6, and means a transform in which R transform coefficient data (an Rx1 transform coefficient vector) are input and N residual data (an Nx1 residual vector) are output.
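- The relation between the forward RST (an RxN matrix mapping N inputs to R coefficients) and the inverse RST (its NxR transpose) can be sketched in Python. The function names and the small 2x4 example matrix are assumptions chosen only so that the arithmetic is easy to follow; they are not kernels from the specification.

```python
def forward_rst(matrix, residual):
    """Multiply an R x N matrix by an N x 1 vector, giving R coefficients."""
    return [sum(m * x for m, x in zip(row, residual)) for row in matrix]


def inverse_rst(matrix, coefficients):
    """Multiply the N x R transpose of the same matrix by an R x 1 vector."""
    n = len(matrix[0])
    return [
        sum(matrix[r][i] * coefficients[r] for r in range(len(matrix)))
        for i in range(n)
    ]


# Hypothetical kernel with R = 2, N = 4 and orthonormal rows.
RST_2X4 = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]

coefficients = forward_rst(RST_2X4, [3.0, 3.0, 1.0, 1.0])
restored = inverse_rst(RST_2X4, coefficients)
```

- Because R < N, the transpose restores the input exactly only when the input lies in the span of the R basis rows, as it does in this example; in general some information is discarded.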
- the reduced secondary transform may be applied to a specific area of the current block.
- a specific region may mean an upper left N / 2xN / 2 region.
- the present invention is not limited to this, and may be set differently according to at least one of a prediction mode, a block shape, or a block size.
- the decoding apparatus 200 may perform an inverse primary transform on the result of the inverse secondary transform (S750).
- the decoding apparatus 200 generates a residual block through step S750, and generates a reconstructed block by adding the residual block and the prediction block.
- FIG. 8 shows an example of a transform configuration group to which an adaptive multiple transform (AMT) is applied according to an embodiment of the present invention.
- the transform setting group is determined based on the prediction mode, and the number of groups may be six (G0 to G5) in total.
- G0 to G4 correspond to a case where intra prediction is applied
- G5 indicates transformation combinations (or transformation sets, transformation combination sets) applied to a residual block generated by inter prediction.
- one transform combination may be composed of a horizontal transform (or row transform) applied to the rows of the corresponding 2D block and a vertical transform (or column transform) applied to the columns.
- each of all transform setting groups may include four transform combination candidates.
- the four transform combination candidates may be selected or determined through a transform combination index of 0 to 3, and a transform combination index from the encoding apparatus 100 to the decoding apparatus 200 may be transmitted through an encoding procedure.
- statistical characteristics of residual data (or residual signals) obtained through intra prediction may be different according to intra prediction modes. Therefore, as shown in FIG. 8, other transforms than the normal cosine transform may be applied for each intra prediction mode.
- the transform type may be denoted as, for example, DCT-Type 2, DCT-II, or DCT-2.
- a transform set configuration for a case where 35 intra prediction modes are used and a case where 67 intra prediction modes are used is illustrated, respectively.
- a plurality of transform combinations may be applied to each transform setup group classified in the intra prediction mode column.
- the plurality of transformation combinations may be composed of four combinations. More specifically, in group 0, since DST-7 and DCT-5 can be applied to both the row (horizontal) direction and the column (vertical) direction, four combinations are possible.
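- The count of four combinations for group 0 follows from pairing two candidate kernels per direction, which can be checked with a short Python sketch. The function name and the ordering of the resulting pairs are assumptions; the actual index assignment is defined by FIG. 8, which is not reproduced here.

```python
from itertools import product


def transform_combinations(horizontal_candidates, vertical_candidates):
    """Enumerate (horizontal, vertical) kernel pairs; the ordering is illustrative only."""
    return list(product(horizontal_candidates, vertical_candidates))


group0 = transform_combinations(["DST-7", "DCT-5"], ["DST-7", "DCT-5"])
```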
- the transform combination index may be referred to as an AMT index (AMT index), and may be expressed as amt_idx.
- DCT-2 may be optimal for both the row direction and the column direction due to the characteristics of a residual signal. Therefore, transformation can be performed adaptively by defining an AMT flag for each coding unit. Here, if the AMT flag is 0, DCT-2 is applied to both the row direction and the column direction, and if the AMT flag is 1, one of four combinations can be selected or determined through the AMT index.
- the transform kernels of FIG. 8 may not be applied, and DST-7 may instead be applied to both the row direction and the column direction.
- the amount of additional information may be reduced by applying DST-7 without parsing the AMT index.
- AMT can be applied only when both the width and height of the transform unit are 32 or less.
- FIG. 8 may be set in advance through off-line training.
- the AMT index may be defined by one index that can simultaneously indicate a combination of horizontal and vertical transformations.
- the AMT index may be separately defined by a horizontal transform index and a vertical transform index.
- the technique of applying a transform selected from a plurality of transform kernels may be referred to as multiple transform selection (MTS) or enhanced multiple transform (EMT).
- AMT index may be referred to as an MTS index.
- FIG. 9 shows an example of an encoding flowchart to which AMT is applied according to an embodiment of the present invention.
- the operations illustrated in FIG. 9 may be performed by the transform unit 120 of the encoding apparatus 100.
- This document basically describes an embodiment in which transforms are applied separately in the horizontal direction and the vertical direction, but the combination of transforms may be configured as non-separable transforms.
- it may also consist of a mixture of separable transforms and non-separable transforms.
- when a non-separable transform is used, selecting a transform per row / column or per horizontal / vertical direction becomes unnecessary, and the transform combinations of FIG. 8 can be used only when a separable transform is selected.
- the schemes proposed in the present specification can be applied regardless of whether the transform is a primary transform or a secondary transform. That is, there is no restriction that they be applied to only one of the two, and they can be applied to both.
- the primary transform may mean a transform for initially transforming the residual block.
- the secondary transform may mean a transform applied to the block generated as a result of the primary transform.
- the encoding apparatus 100 may determine a transform setting group corresponding to the current block (S910).
- the transform configuration group may be composed of combinations as shown in FIG. 8.
- the encoding apparatus 100 may perform transformation on candidate transformation combinations available in the transformation setup group (S920).
- the encoding apparatus 100 may determine or select a transformation combination having the lowest rate distortion (RD) cost (S930).
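- Steps S920 and S930 amount to evaluating each candidate and keeping the cheapest one. A minimal Python sketch follows; it assumes the commonly used cost form J = D + λ·R, and the function name, the tuple layout, and the value of the Lagrange multiplier are all hypothetical, since the text does not specify how the RD cost is computed.

```python
def select_best_combination(candidates, lagrange_multiplier):
    """Pick the transform combination index with the lowest rate-distortion cost.

    candidates: list of (index, distortion, rate_in_bits) tuples.
    The cost J = D + lambda * R is an assumed, commonly used formulation.
    """
    best = min(candidates, key=lambda c: c[1] + lagrange_multiplier * c[2])
    return best[0]


candidates = [(0, 100.0, 10), (1, 80.0, 30), (2, 90.0, 12), (3, 120.0, 8)]
best_index = select_best_combination(candidates, lagrange_multiplier=1.0)
```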
- the encoding apparatus 100 may encode a transform combination index corresponding to the selected transform combination (S940).
- FIG. 10 shows an example of a decoding flowchart to which AMT is applied according to an embodiment of the present invention.
- the operations illustrated in FIG. 10 may be performed by the inverse transform unit 230 of the decoding apparatus 200.
- the decoding apparatus 200 may determine a transform setting group for the current block (S1010).
- the decoding apparatus 200 may parse (or acquire) a transform combination index from a video signal, where the transform combination index may correspond to any one of a plurality of transform combinations in a transform setup group (S1020).
- the transform configuration group may include DCT-2, DST-7, or DCT-8.
- the decoding apparatus 200 may derive a transform combination corresponding to the transform combination index (S1030).
- the transform combination is composed of horizontal transform and vertical transform, and may include at least one of DCT-2, DST-7, or DCT-8.
- the transform combination described in FIG. 8 may be used as the transform combination.
- the decoding apparatus 200 may perform an inverse transform on the current block based on the derived transform combination (S1040). If the transform combination is composed of a row (horizontal) transform and a column (vertical) transform, the row (horizontal) transform can be applied first, followed by the column (vertical) transform. However, the present invention is not limited to this; the reverse order may be applied, or, when the combination consists of non-separable transforms, the non-separable transform may be applied directly.
- the inverse transform of DST-7 or the inverse transform of DCT-8 may be applied for each column and then for each row.
- different transformations may be applied to the vertical transformation or the horizontal transformation for each row and / or for each column.
- the transform combination index may be obtained based on the AMT flag indicating whether AMT is performed. That is, the transform combination index can be obtained only when AMT is performed according to the AMT flag. Also, the decoding apparatus 200 may determine whether the number of non-zero coefficients is greater than a threshold value. At this time, the transform combination index may be parsed only when the number of non-zero transform coefficients is greater than a threshold value.
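- The parsing condition described above can be written as a small predicate. This Python sketch is illustrative only: the function name is hypothetical and the threshold value is left as a parameter because the text does not state it.

```python
def should_parse_transform_index(amt_flag, coefficients, threshold):
    """Return True when the transform combination index is expected in the bitstream.

    The index is parsed only if AMT is enabled by the flag and the number of
    non-zero transform coefficients exceeds the threshold.
    """
    nonzero_count = sum(1 for row in coefficients for c in row if c != 0)
    return amt_flag == 1 and nonzero_count > threshold


block = [[7, 0, 0, 0], [3, 0, 0, 0], [0, 0, 0, 0], [-1, 0, 0, 0]]
parse_index = should_parse_transform_index(1, block, threshold=2)
```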
- the AMT flag or AMT index may be defined at at least one level of a sequence, picture, slice, block, coding unit, transform unit, or prediction unit.
- step S1010 may be preset and omitted in the encoding device 100 and / or the decoding device 200.
- FIG. 11 shows an example of a flowchart for encoding an AMT flag and an AMT index according to an embodiment of the present invention.
- the operations of FIG. 11 may be performed by the transform unit 120 of the encoding apparatus 100.
- the encoding apparatus 100 may determine whether AMT is applied to the current block (S1110).
- the encoding apparatus 100 may determine the AMT index based on at least one of a prediction mode, a horizontal transform, and a vertical transform of the current block (S1130).
- the AMT index indicates an index indicating any one of a plurality of transform combinations for each intra prediction mode, and the AMT index may be transmitted for each transform unit.
- the encoding apparatus 100 may encode the AMT index (S1140).
- FIG. 12 shows an example of a decoding flowchart for performing transformation based on the AMT flag and AMT index.
- the decoding apparatus 200 may parse the AMT flag from the bitstream (S1210).
- the AMT flag may indicate whether AMT is applied to the current block.
- the decoding apparatus 200 may check whether AMT is applied to the current block based on the AMT flag (S1220). For example, it is possible to check whether the AMT flag is 1.
- the decoding apparatus 200 may parse the AMT index (S1230).
- the AMT index refers to an index indicating any one of a plurality of transform combinations for each intra prediction mode, and the AMT index may be transmitted for each transform unit.
- the AMT index may refer to an index indicating any one of the transform combinations defined in a preset transform combination table, where the preset transform combination table may be that of FIG. 8, but the present invention is not limited thereto.
- the decoding apparatus 200 may derive or determine a horizontal transform and a vertical transform based on at least one of the AMT index or prediction mode (S1240).
- the decoding apparatus 200 may derive a transform combination corresponding to the AMT index.
- the decoding apparatus 200 may derive or determine a horizontal transform and a vertical transform corresponding to the AMT index.
- the decoding apparatus 200 may apply a preset vertical inverse transformation for each column (S1250).
- the vertical inverse transform may be an inverse transform of DCT-2.
- the decoding apparatus 200 may apply a predetermined horizontal inverse transform for each row (S1260).
- the horizontal inverse transform may be an inverse transform of DCT-2. That is, when the AMT flag is 0, a transform kernel preset in the encoding apparatus 100 or the decoding apparatus 200 may be used. For example, a commonly used transform kernel may be used instead of one defined in a transform combination table such as that of FIG. 8.
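- The kernel derivation in the flow of FIG. 12 can be summarized as: use the preset kernel when the AMT flag is 0, otherwise look up the pair selected by the AMT index. The Python sketch below assumes DCT-2 as the preset kernel (as in the example above) and uses a hypothetical table for one transform configuration group; the real table contents and index order are defined by FIG. 8.

```python
DEFAULT_KERNEL = "DCT-2"  # preset kernel used when the AMT flag is 0

# Hypothetical table for one group; entries are (horizontal, vertical) kernels.
EXAMPLE_GROUP_TABLE = [
    ("DST-7", "DST-7"),
    ("DCT-5", "DST-7"),
    ("DST-7", "DCT-5"),
    ("DCT-5", "DCT-5"),
]


def derive_transforms(amt_flag, amt_index=None, combination_table=None):
    """Return the (horizontal, vertical) kernel names for the current block."""
    if amt_flag == 0:
        return DEFAULT_KERNEL, DEFAULT_KERNEL
    return combination_table[amt_index]


without_amt = derive_transforms(0)
with_amt = derive_transforms(1, amt_index=2, combination_table=EXAMPLE_GROUP_TABLE)
```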
- Secondary transform refers to applying the transform kernel once again with the result of applying the primary transform as input.
- the primary transform may include DCT-2, DST-7 in HEVC, or AMT described above.
- the non-separable transform does not apply an NxN transform kernel sequentially in the row direction and the column direction; instead, it treats the NxN two-dimensional residual block as an N²x1 vector and applies an N²xN² transform kernel to this vector only once.
- NSST may refer to a non-separable square matrix applied to a vector composed of coefficients of a transform block.
- the embodiments of the present document mainly describe NSST as an example of a non-separable transform applied to the upper left region (low frequency region) determined according to the size of the block, but the embodiments of the present invention are not limited to the term NSST, and any type of non-separable transform can be applied to the embodiments of the present invention.
- the non-separable transform applied to the upper left region (low frequency region) determined according to the size of the block may be referred to as a low frequency non-separable transform (LFNST).
- MxN transformation or transformation matrix refers to a matrix of M rows and N columns.
- in NSST, the 2D block data obtained by applying the primary transform is divided into MxM blocks, and then an M²xM² non-separable transform is applied to each MxM block.
- M can be 4 or 8.
- NSST is not applied to all regions of the 2D block obtained by the first transform, but only to some regions. For example, NSST can be applied only to a top-left 8x8 block.
- a 64x64 non-separable transform can be applied to the upper left 8x8 region only when the width and height of the 2D block obtained through the primary transform are both 8 or greater; in the remaining cases, the block may be divided into 4x4 blocks and a 16x16 non-separable transform applied to each.
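- The size rule above can be expressed as a small helper that returns the sub-block size M and the corresponding M²xM² kernel dimensions. The function name is hypothetical and the sketch covers only the rule as stated in the text.

```python
def nsst_kernel_shape(width, height):
    """Return (M, (rows, cols)) for the non-separable kernel chosen by block size.

    M is 8 (64x64 kernel) when both dimensions are at least 8, otherwise 4
    (16x16 kernel).
    """
    m = 8 if width >= 8 and height >= 8 else 4
    return m, (m * m, m * m)


large = nsst_kernel_shape(16, 8)
narrow = nsst_kernel_shape(4, 16)
```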
- the M²xM² non-separable transform may be applied in the form of a matrix product, but may be approximated by combinations of Givens rotation layers and permutation layers to reduce computational load and memory requirements.
- FIG. 13 shows one Givens rotation, which can be described by a single angle.
- FIGS. 13 and 14 are embodiments to which the present invention is applied: FIG. 13 is a diagram for explaining Givens rotation, and FIG. 14 shows the configuration of one round of a 4x4 NSST composed of Givens rotation layers and permutations.
- Both 8x8 NSST and 4x4 NSST can be configured in a hierarchical combination of Givens rotations.
- the matrix corresponding to one Givens rotation is as shown in Equation 1, and the matrix product is represented in FIG. 13.
- t m and t n output by the Givens rotation may be calculated as in Equation 2.
- One more permutation is finally performed on the data output through the Givens rotation layers, and information about the permutation is stored separately for each transform.
- The permutation is performed at the end of the forward NSST, and in the inverse NSST the inverse permutation is applied first.
- The inverse NSST performs the Givens rotation layers and permutations applied in the forward NSST in reverse order, and rotates by the negative (-) of each Givens rotation angle.
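The forward and inverse structure described above can be sketched as follows. This is a minimal illustration, not the patent's kernel: Equations 1 and 2 are not reproduced in this text, so the sign convention of the rotation, the function names, and the rotation list and permutation used in the test are all assumptions.

```python
import math

def givens_rotate(x, m, n, theta):
    """Rotate elements m and n of x by angle theta (assumed sign convention)."""
    t = list(x)
    c, s = math.cos(theta), math.sin(theta)
    t[m] = c * x[m] - s * x[n]
    t[n] = s * x[m] + c * x[n]
    return t

def nsst_forward(x, rotations, perm):
    """Apply the rotations in order, then the final permutation."""
    for m, n, theta in rotations:
        x = givens_rotate(x, m, n, theta)
    return [x[p] for p in perm]

def nsst_inverse(y, rotations, perm):
    """Undo the permutation first, then the rotations in reverse with negated angles."""
    x = [0.0] * len(y)
    for i, p in enumerate(perm):
        x[p] = y[i]
    for m, n, theta in reversed(rotations):
        x = givens_rotate(x, m, n, -theta)
    return x
```

Because each rotation is orthonormal and the permutation only reorders elements, the inverse recovers the input up to floating-point error and the signal energy is preserved.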
- FIG. 15 shows an example of a configuration of a non-separated transform set for each intra prediction mode according to an embodiment of the present invention.
- Intra prediction modes to which the same NSST or NSST set is applied may form a group.
- FIG. 15 classifies 67 intra prediction modes into 35 groups.
- the 20th mode and the 48th mode belong to the 20th group (hereinafter, a mode group).
- a plurality of NSSTs instead of one NSST may be configured as a set.
- Each set may include the case in which NSST is not applied.
- For example, when three different NSSTs can be applied to one mode group, it can be configured to select one of four cases, including the case in which no NSST is applied.
- an index may be transmitted in TU units to distinguish one of the four cases.
- the number of NSSTs may be configured differently for each mode group. For example, mode groups 0 and 1 may signal to select one of three cases, respectively, including the case where NSST is not applied.
- Embodiment 1 RST applicable to 4x4 block
- A non-separable transform that can be applied to one 4x4 block is a 16x16 transform. That is, if the data elements constituting the 4x4 block are arranged in a line in row-first or column-first order, a 16x1 vector is obtained, and the non-separable transform can be applied to this 16x1 vector.
- The forward 16x16 transform is composed of 16 row-direction transform basis vectors, and taking the inner product of the 16x1 vector with each transform basis vector yields the transform coefficient for that basis vector.
- the process of obtaining the corresponding transform coefficients for all 16 transform base vectors is the same as multiplying the 16x16 non-separation transform matrix and the input 16x1 vector.
- The transform coefficients obtained by the matrix multiplication form a 16x1 vector, and the statistical characteristics may differ for each transform coefficient. For example, when the 16x1 transform coefficient vector consists of the 0th to 15th elements, the variance of the 0th element may be greater than that of the 15th element. That is, the closer an element is to the front, the larger its variance and the larger its energy.
- When the inverse 16x16 non-separable transform is applied to the 16x1 transform coefficient vector, the original 4x4 block signal can be restored.
- If the forward 16x16 non-separable transform is an orthonormal transform, the corresponding inverse 16x16 transform can be obtained by transposing the matrix of the forward 16x16 transform. Put simply, multiplying the inverse 16x16 non-separable transform matrix by the 16x1 transform coefficient vector yields data in the form of a 16x1 vector, and the 4x4 block signal can be restored by arranging it in the row-first or column-first order that was originally used.
- elements constituting the 16x1 transform coefficient vector may have different statistical characteristics.
- Even if the inverse transform is applied only to some of the transform coefficients that appear first, without using all transform coefficients, a signal very close to the original signal can be restored.
- For example, when the inverse 16x16 non-separable transform is composed of 16 column basis vectors, only L column basis vectors are kept to construct a 16xL matrix, and only the L most important transform coefficients are kept (an Lx1 vector); multiplying the 16xL matrix by the Lx1 vector restores a 16x1 vector that differs little from the original 16x1 input data.
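The effect of keeping only L of the 16 basis vectors can be sketched as follows. The actual NSST/RST kernels are not given in this text, so an orthonormal DCT-II matrix is used purely as a stand-in 16x16 kernel; the function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here only as a stand-in for a 16x16 kernel."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def forward_reduced(T, x, L):
    """Use only the first L rows of the forward matrix: yields L coefficients."""
    return [sum(T[k][i] * x[i] for i in range(len(x))) for k in range(L)]

def inverse_reduced(T, coeffs):
    """Multiply the 16xL matrix (first L columns of T transposed) by the Lx1 vector."""
    n = len(T[0])
    return [sum(T[k][i] * coeffs[k] for k in range(len(coeffs))) for i in range(n)]
```

With L = 16 the input is recovered exactly (up to rounding); with a smaller L the reconstruction error equals the energy of the dropped coefficients, which is small when the energy is concentrated in the leading coefficients.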
- Example 2 Setting the application area of 4x4 RST and arranging the transform coefficients
- 4x4 RST may be applied as a secondary transform, and at this time, may be applied secondary to a block to which a primary transform such as DCT-type 2 is applied.
- 4x4 RST may be applied to each divided block.
- the above methods 1) and 2) can be mixed and applied. For example, after dividing into 4x4 blocks only for the upper left MxM area, 4x4 RST may be applied.
- When the secondary transform is applied only to the upper-left 8x8 region, 8x8 RST is applied if the NxN block is equal to or larger than 8x8, and if the NxN block is smaller than 8x8 (4x4, 8x4, 4x8), the block is divided into 4x4 blocks as in 2) above and 4x4 RST is applied to each of them.
- FIG. 16 shows three forward scan orders for transform coefficients or transform coefficient blocks applied in the HEVC standard: (a) is a diagonal scan, (b) is a horizontal scan, and (c) is a vertical scan.
- The transform coefficient block in FIG. 16 is a 4x4 block (coefficient group, CG), and residual coding is performed in the reverse of the scan order of (a), (b), or (c) (ie, coded in the order of 16 to 1).
- Since the three scan orders shown in (a), (b), and (c) are selected according to the intra prediction mode, the scan order for the L transform coefficients may likewise be configured to be determined according to the intra prediction mode.
- FIGS. 17 and 18 are embodiments to which the present invention is applied. FIG. 17 shows the positions of transform coefficients under a forward diagonal scan when 4x4 RST is applied to a 4x8 block, and FIG. 18 shows an example in which the valid transform coefficients of two 4x4 blocks are merged into one block.
- L transform coefficients arranged in two 4x4 blocks may be configured as one block.
- When the L value is 8, one of the two 4x4 blocks contains no valid transform coefficients after merging, so the flag (coded_sub_block_flag) indicating whether residual coding is applied to that block may be coded as 0.
- the combination method for the positions of the transform coefficients of two 4x4 blocks may vary. For example, the positions may be combined in any order, but the following method may also be applied.
- the transform coefficients for the first 4x4 block may be arranged first, and then the transform coefficients for the second 4x4 block may be arranged.
- The coefficients can be arranged by connecting them as follows. Naturally, the order can also be changed as follows:
- Example 3 Method of coding a non-separable secondary transform (NSST) index for 4x4 RST
- A value of 0 may be filled from the L + 1th position to the 16th position according to the transform coefficient scan order for each 4x4 block. Therefore, if either of the two 4x4 blocks has a non-zero value anywhere from the L + 1th position to the 16th position, it is derived that 4x4 RST is not applied. If the 4x4 RST has a structure that applies a transform selected from a transform set prepared as in the joint exploration model (JEM) NSST, an index (hereinafter referred to as an NSST index) indicating which transform to apply may be signaled.
- the NSST index can be known through bitstream parsing, and bitstream parsing can be performed after residual coding. In this case, if a non-zero transform coefficient exists between the L + 1th position and the 16th position by residual decoding, the decoder may not parse the NSST index because it is certain that 4x4 RST is not applied. Therefore, the signaling cost can be reduced by selectively parsing the NSST index only when necessary.
- When 4x4 RST is applied to a plurality of 4x4 blocks in a specific area as shown in FIG. 17 (in this case, the same 4x4 RST may be applied to all of them, or different 4x4 RSTs may be applied), the 4x4 RST (same or different) applied to all the 4x4 blocks can be specified through one NSST index.
- Because one NSST index determines whether 4x4 RST is applied, and which 4x4 RST is applied, for all the 4x4 blocks, it is checked whether non-zero transform coefficients exist at positions L + 1 through 16 in all the 4x4 blocks. If, as a result of this check during the residual decoding process, a non-zero transform coefficient exists at a disallowed position (L + 1 to 16) in any 4x4 block, the encoding apparatus 100 may be set not to code the NSST index.
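The check on positions L + 1 through 16 can be sketched as follows. This is only an illustration under assumptions: each block is represented as a list of 16 coefficients already arranged in scan order, and the function name is hypothetical.

```python
def nsst_index_may_be_coded(blocks_in_scan_order, L):
    """Return False if any 4x4 block has a non-zero coefficient at scan
    positions L+1..16 (list indices L..15), since 4x4 RST then cannot
    have been applied and the NSST index need not be coded or parsed."""
    for coeffs in blocks_in_scan_order:
        if len(coeffs) != 16:
            raise ValueError("each 4x4 block must hold 16 coefficients")
        if any(c != 0 for c in coeffs[L:16]):
            return False
    return True
```

A single shared index covers all blocks in the region, so one disallowed coefficient in any block is enough to skip the index.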
- the encoding apparatus 100 may signal each NSST index separately for a luminance block and a chrominance block, or may signal separate NSST indexes for a Cb component and a Cr component, respectively, in the case of a chrominance block.
- One common NSST index can be used.
- In this case, signaling of the NSST index is also performed only once.
- When one NSST index is shared, the 4x4 RST indicated by the same NSST index may be applied. In this case, the 4x4 RST itself may be the same for the Cb component and the Cr component, or the NSST index may be the same while separate 4x4 RSTs are set for the Cb component and the Cr component.
- For the conditional signaling described above, it is checked whether a non-zero transform coefficient is present from the L + 1th position to the 16th position, and when a non-zero transform coefficient is found in that range, signaling of the NSST index may be omitted.
- The encoding apparatus 100 may determine whether to signal the NSST index after checking whether a non-zero transform coefficient exists at a position where no valid transform coefficient can exist. In particular, when the L value is 8 as shown in FIG. 18 and 4x4 RST is applied, one 4x4 block (the block indicated by X in FIG. 18 (b)) has no valid transform coefficients, so the flag indicating whether residual coding is performed for that block (coded_sub_block_flag) is checked, and if it is 1, the NSST index may be set not to be signaled.
- NSST is mainly described as an example of non-separation transformation, but other known terms (eg, LFNST) may be used for non-separation transformation.
- the NSST set (NSST set) and the NSST index may be used by replacing the LFNST set and the LFNST index.
- The RST described in this document refers to a non-separable transform in the form of a non-square transform matrix with a reduced input length, derived from a square non-separable transform matrix applied to at least a portion of the transform block (the top-left 4x4 or 8x8 region, or the rest of the 8x8 block except the bottom-right 4x4 region).
- RST may also be replaced with LFNST and used.
- Example 4 Optimization method for coding 4x4 index before residual coding
- Whether to apply 4x4 RST may be determined through the NSST index value (for example, 4x4 RST is not applied if the NSST index is 0), or may be signaled through a separate syntax element (eg, an NSST flag).
- For example, when the separate syntax element is an NSST flag, the decoding apparatus 200 first determines whether to apply 4x4 RST by parsing the NSST flag, and if the NSST flag value is 1, residual coding (decoding) may be omitted for positions where no valid transform coefficient can exist, as described above.
- When residual coding is performed, the position of the last non-zero coefficient in the TU is coded first. If coding of the NSST index is performed after coding of the position of the last non-zero coefficient, and that position is one where a non-zero coefficient cannot exist under the assumption that 4x4 RST is applied,
- the decoding apparatus 200 may be configured not to code the NSST index and not to apply 4x4 RST. For example, in the case of positions indicated by X in FIG.
- the decoding apparatus 200 may omit coding for the NSST index. If the last non-zero coefficient is not located in the area indicated by X, the decoding apparatus 200 may perform coding for the NSST index.
- The remaining residual coding part can be processed in the following two ways.
- Coding of the NSST index is omitted when the x position (Px) and y position (Py) of the last non-zero coefficient are less than Tx and Ty, respectively. In this case, 4x4 RST may not be applied.
- the method of determining whether to encode the NSST index through comparison with the threshold may be differently applied to the luminance component and the chrominance component, for example, different Tx and Ty may be applied to the luminance component and the chrominance component, respectively.
- A threshold value may be applied to the luminance component and not to the color difference component. Conversely, a threshold value may be applied to the color difference component and not to the luminance component.
- The two methods described above may be applied together (NSST index coding is omitted when the last non-zero coefficient is located in a region where a valid transform coefficient does not exist, and NSST index coding is omitted when the X and Y coordinates of the last non-zero coefficient are each less than the threshold). For example, the threshold check on the position coordinates of the last non-zero coefficient can be performed first, followed by the check of whether the last non-zero coefficient is located in a region where a valid transform coefficient does not exist. The order can be changed.
- The methods presented in Example 4 can also be applied to 8x8 RST. That is, if the last non-zero coefficient is located in a region of the upper-left 8x8 area other than the upper-left 4x4, coding for the NSST index may be omitted; otherwise coding for the NSST index may be performed. In addition, if the X and Y coordinate values of the position of the last non-zero coefficient are less than a certain threshold, coding for the NSST index may be omitted. Both methods can be applied simultaneously.
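The combined decision can be sketched as follows. The function name and the boolean argument `in_invalid_region` (whether the last non-zero coefficient lies where no valid coefficient can exist under 4x4 RST) are hypothetical; how that region is derived depends on the figures, which are not reproduced here.

```python
def should_code_nsst_index(px, py, tx, ty, in_invalid_region):
    """Decide whether the NSST index is coded, given the position (px, py)
    of the last non-zero coefficient and the thresholds (tx, ty)."""
    # Threshold check: index coding is omitted when both coordinates are small.
    if px < tx and py < ty:
        return False
    # Region check: index coding is omitted when RST cannot have been applied.
    if in_invalid_region:
        return False
    return True
```

The two checks are independent, so they can be evaluated in either order, as the text notes.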
- Example 5 When applying RST, different NSST index coding and residual coding schemes are applied to luminance components and color difference components, respectively.
- Embodiments 3 and 4 can be applied differently to luminance components and color difference components, respectively. That is, NSST index coding and residual coding may be applied differently to luminance components and color difference components.
- the method described in Example 4 may be applied to the luminance component
- the method described in Example 3 may be applied to the color difference component.
- The conditional NSST index coding proposed in Example 3 or 4 may be applied to the luminance component and not to the color difference component, and the opposite is also possible (conditional NSST index coding is applied to the color difference component and not to the luminance component).
- A mixed NSST transform set (MNTS) for applying various NSST conditions in the process of applying NSST, and a method of configuring the MNTS, are provided.
- the 4x4 NSST set includes only 4x4 kernels and the 8x8 NSST set includes only 8x8 kernels according to the size of the pre-selected sub-block.
- the embodiment of the present invention further proposes a method of configuring the mixed NSST set as follows.
- The NSST kernels available in the NSST set are not fixed in size, and NSST kernels of one or more different sizes may be included in the NSST set (eg, 4x4 NSST kernels and 8x8 NSST kernels are included in one NSST set).
- The number of NSST kernels available in the NSST set may be variable rather than fixed (eg, the first set includes 3 kernels, and the second set includes 4 kernels).
- For example, NSST kernels 1, 2, and 3 in the first set are mapped to NSST indexes 1, 2, and 3, respectively, but in the second set NSST kernels 3, 2, and 1 are mapped to NSST indexes 1, 2, and 3, respectively.
- the priority of NSST kernels available in the NSST transform set may be determined according to the size of the NSST kernel (eg, 4x4 NSST and 8x8 NSST).
- the 8x8 NSST kernel may be more important than the 4x4 NSST kernel, so a low value NSST index is assigned to the 8x8 NSST kernel.
- the priority of NSST kernels available in the NSST transformation set may be determined according to the order of the NSST kernels.
- a given 4x4 NSST first kernel may take precedence over a 4x4 NSST second kernel.
- the NSST index can be signaled with a smaller number of bits by assigning a higher priority (smaller index) to the frequently occurring NSST kernel.
- Table 1 and Table 2 below show examples of the mixed NSST set proposed in this embodiment.
- a method of determining an NSST set is proposed in consideration of an intra prediction mode and a block size in the process of determining a secondary transform set.
- The method proposed in this embodiment configures a transform set suitable for the intra prediction mode and, in conjunction with the sixth embodiment, configures kernels of various sizes to be applied to blocks.
- FIG. 19 is an embodiment to which the present invention is applied, and shows an example of a method of configuring a mixed NSST set for each intra prediction mode.
- FIG. 19 is an example of a table for applying the method proposed in Example 2 in connection with Example 6. That is, as shown in FIG. 19, an index ('Mixed Type') indicating whether to follow the existing method of configuring the NSST set or another method of configuring the NSST set may be defined for each intra prediction mode.
- the NSST set is configured using the NSST set configuration method defined in the system instead of following the JEM NSST set configuration method.
- the method of configuring the NSST set defined in the system may refer to the mixed NSST set proposed in Example 6.
- The table of FIG. 19 describes two ways of constructing transform sets based on mixed type information (a flag) related to the intra prediction mode (the JEM-based NSST set configuration and the mixed type NSST set configuration proposed in an embodiment of the present invention), but there may be more than one mixed type NSST configuration method, in which case the mixed type information may be represented by N (N > 2) different values.
- FIG. 20 is an embodiment to which the present invention is applied, and shows an example of a method of selecting an NSST set (or kernel) in consideration of an intra prediction mode and a transform block size.
- the decoding apparatus 200 may determine the used NSST kernel using NSST index information.
- A method is provided for efficiently encoding the NSST index by considering changes in the statistical distribution of NSST index values transmitted after encoding.
- An embodiment of the present invention provides a method of selecting a kernel to be applied using a syntax indicating a kernel size.
- Table 3 shows a method of binarizing NSST index values; since the number of NSST kernels available differs for each transform set, the NSST index can be binarized according to the maximum NSST index value.
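Table 3 itself is not reproduced in this text. As one way to binarize an index according to its maximum value, the sketch below assumes a truncated unary code; this choice, and the function name, are assumptions made for illustration and may differ from the actual table.

```python
def truncated_unary(value, max_value):
    """Truncated unary binarization: 'value' ones followed by a terminating
    zero, where the zero is dropped when value equals max_value."""
    if not 0 <= value <= max_value:
        raise ValueError("value must lie in [0, max_value]")
    bits = "1" * value
    if value < max_value:
        bits += "0"
    return bits
```

Under this assumption a transform set with fewer kernels (a smaller maximum index) uses at most as many bins as a larger set for the same index value.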
- Due to complexity issues in transforms (eg, large block transforms or non-separable transforms), a reduced transform is provided that can be applied to core transforms (eg, DCT, DST) and secondary transforms (eg, NSST).
- The main idea of a reduced transform is to map an N-dimensional vector to an R-dimensional vector in another space, where R/N (R < N) is the reduction factor.
- the reduced transform is an RxN matrix as in Equation 3 below.
- In Equation 3, the R rows of the transform are R bases of the new N-dimensional space.
- The reason it is referred to as a reduced transform is that the number of elements of the vector output by the transform is smaller than the number of elements of the input vector (R < N).
- the inverse transform matrix for the reduced transform is the transpose of the forward transform. The reduced transforms in the forward and reverse directions will be described with reference to FIGS. 21A and 21B.
- FIGS. 21A and 21B show forward and inverse reduced transforms as an embodiment to which the present invention is applied.
- The number of elements of the reduced transform is RxN, which is R/N times the size of the full matrix (NxN), so the required memory is R/N of that of the full matrix.
- The number of multiplications required is also RxN, which is R/N times the original NxN.
- R coefficients are obtained after applying the reduced transform, which means that only R values need to be transmitted in place of the original N coefficients.
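The memory and multiplication counts above, and the forward/inverse relationship by transposition, can be sketched as follows. The function names are hypothetical and the matrix used in the usage note is a trivial placeholder, not a real kernel.

```python
def reduced_transform_cost(n, r):
    """Return (reduced element count, full element count, ratio R/N)."""
    if not 0 < r <= n:
        raise ValueError("require 0 < R <= N")
    return r * n, n * n, r / n

def apply_forward(T, x):
    """Forward reduced transform: RxN matrix times Nx1 vector -> R coefficients."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]

def apply_inverse(T, y):
    """Inverse reduced transform: transpose (NxR) times Rx1 vector -> N values."""
    n = len(T[0])
    return [sum(T[k][i] * y[k] for k in range(len(T))) for i in range(n)]
```

For example, with N = 64 and R = 16 the reduced matrix has 1024 elements instead of 4096, a ratio of 0.25, and the same ratio applies to the multiplication count.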
- FIG. 22 shows an example of a decoding flowchart using a reduced transform according to an embodiment of the present invention.
- the proposed reduced transform (inverse transform in the decoder) can be applied to coefficients (inverse quantized coefficients) as shown in FIG.
- A predetermined reduction factor (R, or R/N) and a transform kernel for performing the transform may be required.
- The transform kernel may be determined based on available information such as block size (width, height), intra prediction mode, and CIdx. If the current coding block is a luma block, CIdx is equal to 0. Otherwise (Cb or Cr block), CIdx is a non-zero value such as 1.
- FIG. 23 shows an example of a flow chart for application of a conditional reduced transform according to an embodiment of the present invention.
- the operations of FIG. 23 may be performed by the inverse quantization unit 140 and the inverse transformation unit 150 of the decoding apparatus 200.
- a reduced transform may be used if certain conditions are met.
- the reduced transform can be applied to blocks larger than a certain size as shown below.
- TH is a predefined value (eg 4)
- The reduced transform may be applied when the width of the current block is greater than the predefined value TH and the height of the current block is greater than the predefined value TH, as in the above conditions. Alternatively, the reduced transform may be applied if the product of the width and height of the current block is greater than a predefined value (K) and the smaller of the width and height of the current block is greater than a predefined value (TH).
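The two size conditions just described can be written as a small predicate. The function name and the way the alternative condition is selected (by passing `k`) are illustrative assumptions; the default TH = 4 follows the example value mentioned in the text.

```python
def reduced_transform_size_ok(width, height, th=4, k=None):
    """Size condition for applying the reduced transform.

    With k=None: width > TH and height > TH.
    With k given: width * height > K and min(width, height) > TH.
    """
    if k is None:
        return width > th and height > th
    return width * height > k and min(width, height) > th
```

For instance, an 8x8 block satisfies the first condition with TH = 4, while a 4x8 block does not because its width is not greater than TH.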
- the reduced transform can be applied to a group of predetermined blocks as follows.
- a reduced transformation may be applied.
- the normal transform may be a predefined and usable transform in a video coding system. Examples of common transformations are:
- The reduced transform condition may depend on an index (Transform_idx) indicating which transform (eg, DCT-4, DST-1) is used or which kernel is applied (when multiple kernels are available).
- Transform_idx may be transmitted twice.
- One is an index indicating a horizontal transform (Transform_idx_h) and the other is an index indicating a vertical transform (Transform_idx_v).
- the decoding apparatus 200 performs inverse quantization on the input bitstream (S2305). Thereafter, the decoding apparatus 200 determines whether to apply the transform (S2310). The decoding apparatus 200 may determine whether to apply the transform through a flag indicating whether to skip the transform.
- the decoding apparatus 200 parses the transform index (Transform_idx) indicating the transform to be applied (S2315). Also, the decoding apparatus 200 may select a transform kernel (S2330). For example, the decoding apparatus 200 may select a transform kernel corresponding to the transform index (Transform_idx). Also, the decoding apparatus 200 may select a transform kernel in consideration of block size (width, height), intra prediction mode, and CIdx (luma, chroma).
- the decoding apparatus 200 determines whether a condition for applying the reduced transform is satisfied (S2320).
- the conditions for the application of the reduced transform may include conditions as described above.
- the decoding apparatus 200 may apply a normal inverse transform (S2325).
- the decoding apparatus 200 may determine an inverse transform matrix from the transform kernel selected in step S2330, and apply the determined inverse transform matrix to a current block including transform coefficients.
- the decoding apparatus 200 may apply the reduced inverse transform (S2335).
- the decoding apparatus 200 may determine a reduced inverse transform matrix in consideration of a reduction factor from the transform kernel selected in step S2330, and apply the reduced inverse transform matrix to a current block including transform coefficients.
- FIG. 24 shows an example of a decoding flowchart for a second inverse transform to which a conditional reduced transform according to an embodiment of the present invention is applied.
- the operations of FIG. 24 may be performed by the inverse transform unit 230 of the decoding apparatus 200.
- The reduced transform may be applied to a secondary transform as shown in FIG. 24.
- a reduced inverse transform may be applied.
- the decoding apparatus 200 performs inverse quantization (S2405). For transform coefficients generated through inverse quantization, the decoding apparatus 200 determines whether to apply NSST (S2410). That is, the decoding apparatus 200 determines whether parsing of the NSST index (NSST_idx) is necessary according to whether NSST is applied.
- the decoding apparatus 200 parses the NSST index (S2415) and determines whether the NSST index is greater than 0 (S2420).
- the NSST index can be restored by a technique such as CABAC by the entropy decoding unit 210.
- the decoding apparatus 200 may omit the secondary inverse transform and apply a core inverse transform or a primary inverse transform (S2445).
- the decoding apparatus 200 selects a transform kernel for the second inverse transform (S2435). For example, the decoding apparatus 200 may select a transform kernel corresponding to the NSST index (NSST_idx). Also, the decoding apparatus 200 may select a transform kernel in consideration of block size (width, height), intra prediction mode, and CIdx (luma, chroma).
- the decoding apparatus 200 determines whether a condition for applying the reduced transform is satisfied (S2425).
- the conditions for the application of the reduced transform may include conditions as described above.
- the decoding apparatus 200 may apply a normal secondary inverse transform (S2430). For example, the decoding apparatus 200 may determine a secondary inverse transform matrix from the transform kernel selected in step S2435, and apply the determined secondary inverse transform matrix to the current block including the transform coefficients.
- The decoding apparatus 200 may apply the reduced secondary inverse transform (S2440). For example, the decoding apparatus 200 may determine a reduced inverse transform matrix by considering a reduction factor from the transform kernel selected in step S2435, and apply the reduced inverse transform matrix to a current block including transform coefficients. Thereafter, the decoding apparatus 200 applies a core inverse transform or a primary inverse transform (S2445).
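The order of steps in FIG. 24 can be traced with a short sketch. It only returns step labels and performs no actual decoding; the function name and its three inputs (whether NSST is applicable, the parsed index value, and whether the reduced transform condition holds) are hypothetical stand-ins for the decisions described above.

```python
def secondary_inverse_steps(nsst_applicable, nsst_idx, reduced_condition):
    """Return the sequence of FIG. 24 steps taken for the given decisions."""
    steps = ["S2405 inverse quantization"]
    if nsst_applicable:
        steps.append("S2415 parse NSST index")
        if nsst_idx > 0:
            steps.append("S2435 select transform kernel")
            if reduced_condition:
                steps.append("S2440 reduced secondary inverse transform")
            else:
                steps.append("S2430 normal secondary inverse transform")
    # The primary (core) inverse transform is applied in every path.
    steps.append("S2445 primary inverse transform")
    return steps
```

When the index is 0 or NSST is not applicable, the secondary inverse transform is skipped and only the primary inverse transform follows inverse quantization.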
- Example 10 Reduced Transform as a Secondary Transform with Different Block Size
- FIGS. 25A, 25B, 26A, and 26B show examples of a reduced transform and a reduced inverse transform according to an embodiment of the present invention.
- a reduced transform in a video codec for different block sizes such as 4x4, 8x8, 16x16, etc. may be used as a secondary transform and a secondary inverse transform.
- For an 8x8 block size and a reduction factor R = 16, the secondary transform and the secondary inverse transform may be set as shown in FIGS. 25A and 25B.
- the pseudocode of the reduced transform and the reduced inverse transform may be set as shown in FIG. 26.
- Example 11 Reduced Transform as a Secondary Transform with Non-Rectangular Shape
- FIG. 27 shows an example of a region to which a reduced secondary transform is applied according to an embodiment of the present invention.
- The secondary transform can be applied to 4x4 and 8x8 corners.
- the reduced transform can also be applied to non-squares.
- RST may be applied only to a partial area (hatched area) of the block.
- Each square in FIG. 27 represents a 4x4 area, and RST may be applied to 10 4x4 blocks (ie, 160 pixels).
- RST 16 pixels
- the entire RST matrix is a 16x16 matrix, which may be an acceptable amount of computation.
- Non-separated transform can be applied only to three 4x4 blocks (a total of 48 transform coefficients).
- Changing the reduction factor can change memory and multiplication complexity.
- SPS sequence parameter set
- Reduced_transform_enabled_flag equal to 1 specifies that the reduced transform is enabled and applied. Reduced_transform_enabled_flag equal to 0 specifies that the reduced transform is not enabled. When Reduced_transform_enabled_flag is not present, it is inferred to be equal to 0.
- Reduced_transform_factor specifies the number of reduced dimensions to keep for the reduced transform. When Reduced_transform_factor is not present, it is inferred to be equal to R.
- min_reduced_transform_size specifies the minimum transform size to which the reduced transform is applied. When min_reduced_transform_size is not present, it is inferred to be equal to 0.
- max_reduced_transform_size specifies the maximum transform size to which the reduced transform is applied. When max_reduced_transform_size is not present, it is inferred to be equal to 0.
- reduced_transform_size specifies the number of reduced dimensions to keep for the reduced transform. When reduced_transform_size is not present, it is inferred to be equal to 0.
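The inference rules for absent syntax elements can be sketched as follows. The container class, the function name, and the dictionary-based input are hypothetical; the value of R used when Reduced_transform_factor is absent is not specified in this text, so it is passed in as a parameter.

```python
from dataclasses import dataclass

@dataclass
class ReducedTransformSps:
    enabled_flag: int
    factor: int
    min_size: int
    max_size: int
    size: int

def infer_reduced_transform_sps(parsed, default_r):
    """Fill in inferred values for syntax elements missing from 'parsed'."""
    return ReducedTransformSps(
        enabled_flag=parsed.get("Reduced_transform_enabled_flag", 0),
        factor=parsed.get("Reduced_transform_factor", default_r),
        min_size=parsed.get("min_reduced_transform_size", 0),
        max_size=parsed.get("max_reduced_transform_size", 0),
        size=parsed.get("reduced_transform_size", 0),
    )
```

An empty input therefore yields a disabled reduced transform with all sizes inferred as 0 and the factor inferred as R.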
- The non-separable secondary transform (4x4 NSST) that can be applied to a 4x4 block is a 16x16 transform.
- 4x4 NSST is applied secondary to a block to which a primary transform such as DCT-2, DST-7, or DCT-8 is applied.
- When the size of the block to which the primary transform is applied is NxM and the 4x4 NSST is applied to the NxM block, the following methods may be considered.
- 4x4 NSST is applied to the NxM area, but may be applied to only some areas.
- 4x4 NSST may be applied only to the upper left KxJ region. The conditions for this case are as a) and b) below.
- 4x4 NSST may be applied to each divided block.
- the computational complexity of the 4x4 NSST is a very important consideration factor of the encoder and decoder, so we will analyze it in detail.
- The 16x16 secondary transform is composed of 16 row-direction transform basis vectors, and taking the dot product of the 16x1 vector with each transform basis vector yields the transform coefficient for that basis vector.
- the process of obtaining all transform coefficients for the 16 transform base vectors is the same as multiplying the input 16x1 vector with a 16x16 non-separated transform matrix. Therefore, the total number of multiplications required for 4x4 forward NSST is 256.
- the coefficient of the original 4x4 primary transform block can be restored.
- Multiplying the inverse 16x16 non-separable transform matrix by a 16x1 transform coefficient vector yields data in the form of a 16x1 vector, and arranging the data in the row-first or column-first order that was originally applied restores the 4x4 block signal (primary transform coefficients). Therefore, the total number of multiplications required for the 4x4 inverse NSST is 256.
- The number of multiplications required per sample is 16. This is the total number of multiplications (256) performed in the inner products of the 16x1 vector with each transform basis vector during 4x4 NSST, divided by the total number of samples, 16. The number of multiplications per sample is 16 for both the forward 4x4 NSST and the inverse 4x4 NSST.
- the number of multiplications per sample required when 4x4 NSST is applied is determined as follows according to the area to which 4x4 NSST is applied.
- the range in which 4x4 NSST is applied may be reduced in order to reduce the worst-case number of multiplications required per sample.
- a method for reducing worst-case complexity may be as follows.
- the Lx16 transform matrix is constructed by selecting L row transform vectors from the forward 16x16 non-separable transform matrix, and L transform coefficients are obtained by multiplying the Lx16 transform matrix by the 16x1 input vector.
- the worst-case number of multiplications per sample in a 4x4 block according to the value of L is shown in Table 7 below.
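The reduced transform can be sketched as follows. This is a minimal illustration, assuming the L rows are simply the first L rows of the kernel (the text only says L rows are selected) and again using a scaled Hadamard matrix as a placeholder kernel; the function names are hypothetical.

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of size n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

# Placeholder orthonormal 16x16 kernel (assumption: NOT a real NSST kernel).
KERNEL_16 = [[v / 4 for v in row] for row in hadamard(16)]

def forward_rst_4x4(block, L):
    """Apply an Lx16 reduced transform to a 4x4 block.

    Returns the L coefficients and the number of multiplications used.
    """
    if not 1 <= L <= 16:
        raise ValueError("L must be between 1 and 16")
    x = [v for row in block for v in row]      # row-first 16x1 input vector
    reduced = KERNEL_16[:L]                    # assumption: keep the first L rows
    coeffs = [sum(b * v for b, v in zip(basis, x)) for basis in reduced]
    return coeffs, L * 16

def mults_per_sample(L, num_samples=16):
    """Multiplications per sample when an Lx16 matrix covers a 4x4 block."""
    return L * 16 / num_samples

block = [[r * 4 + c for c in range(4)] for r in range(4)]
coeffs8, mults8 = forward_rst_4x4(block, 8)
```

With L = 8 the sketch uses 128 multiplications, or 8 per sample, which is half the cost of the full 16x16 transform.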
- 4x4 NSST and 4x4 RST may be used in combination to reduce multiplication complexity in the worst case.
- the example below describes the conditions for applying 4x4 NSST and 4x4 RST in the case where 4x4 NSST is applicable (that is, when the width and height of the current block are both greater than or equal to 4).
- 4x4 NSST for a 4x4 block is a square (16x16) transform matrix that receives 16 data and outputs 16 data.
- on the encoder side, 4x4 RST is a non-square (8x16) transform matrix that receives 16 data and outputs R data (e.g., 8), where R is less than 16.
- on the decoder side, 4x4 RST is a non-square (16x8) transform matrix that receives R data (e.g., 8), where R is less than 16, and outputs 16 data.
- if the width and height of the current block are both 4, a 4x4 RST based on an 8x16 matrix is applied to the current block; otherwise (if either the width or the height of the current block is not 4), 4x4 NSST may be applied to the upper-left 4x4 area of the current block. More specifically, when the size of the current block is 4x4, a non-separable transform having an input length of 16 and an output length of 8 may be applied. In the case of the inverse non-separable transform, a non-separable transform having an input length of 8 and an output length of 16 may be applied.
- 4x4 NSST and 4x4 RST may be used in combination as shown in Table 11 below to reduce multiplication complexity in the worst case.
- Table 11 describes the conditions for applying 4x4 NSST and 4x4 RST in the case where 4x4 NSST is applicable (that is, when the width and height of the current block are both greater than or equal to 4).
- if the width and height of the current block are both 4, a 4x4 RST based on an 8x16 matrix is applied to the current block.
- if the product of the width and height of the current block is less than the threshold, 4x4 NSST is applied to the upper-left 4x4 area of the current block.
- if the width of the current block is greater than or equal to the height, 4x4 NSST is applied to the upper-left 4x4 area of the current block and to the 4x4 area located to the right of the upper-left 4x4 area.
- in the remaining case (the product of the width and height of the current block is greater than or equal to the threshold and the width of the current block is less than the height), 4x4 NSST is applied to the upper-left 4x4 area of the current block and to the 4x4 area located below the upper-left 4x4 area.
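The Table 11 conditions described above can be sketched as a small selection function. The function name, the string labels, the `(x, y)` offset convention, and the threshold value of 64 are all illustrative assumptions; the text does not give a concrete threshold.

```python
def select_4x4_secondary_transform(width, height, threshold):
    """Pick the 4x4 secondary transform and its target regions.

    Returns (transform_label, regions), where each region is the (x, y)
    offset of a 4x4 area inside the current block, or None when the
    block is too small for a 4x4 secondary transform.
    """
    if width < 4 or height < 4:
        return None
    if width == 4 and height == 4:
        # Worst case: use the reduced 8x16 matrix instead of the full 16x16.
        return ("RST_8x16", [(0, 0)])
    if width * height < threshold:
        return ("NSST_16x16", [(0, 0)])
    if width >= height:
        # Upper-left area plus the 4x4 area to its right.
        return ("NSST_16x16", [(0, 0), (4, 0)])
    # Upper-left area plus the 4x4 area below it.
    return ("NSST_16x16", [(0, 0), (0, 4)])

TH = 64  # hypothetical threshold, for illustration only
choice_4x4 = select_4x4_secondary_transform(4, 4, TH)
choice_4x8 = select_4x4_secondary_transform(4, 8, TH)
choice_16x8 = select_4x4_secondary_transform(16, 8, TH)
choice_8x16 = select_4x4_secondary_transform(8, 16, TH)
```

Restricting the 4x4 block to the reduced matrix is what lowers the worst case: it is the only block size in which every sample is covered by the secondary transform.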
- 4x4 RST (e.g., an 8x16 matrix) can be applied to a 4x4 block instead of 4x4 NSST to reduce the worst-case number of multiplications.
- the non-separable secondary transform (8x8 NSST) that can be applied to an 8x8 block is a 64x64 transform.
- 8x8 NSST is applied as a secondary transform to a block to which a primary transform such as DCT-2, DST-7, or DCT-8 has been applied.
- NxM denotes the size of the block to which the primary transform is applied.
- When the 8x8 NSST is applied to the NxM block, the following methods may be considered.
- 8x8 NSST need not be applied to the entire NxM area and may be applied to only part of it.
- 8x8 NSST may be applied only to the upper-left KxJ region. The conditions for this case are given in c) and d) below.
- 8x8 NSST may be applied to each divided block.
- the computational complexity of the 8x8 NSST is a very important consideration for the encoder and decoder, so it is analyzed in detail below.
- the computational complexity of 8x8 NSST is analyzed based on the number of multiplications.
- the 64x64 non-separable secondary transform consists of 64 row-direction transform basis vectors; taking the dot product of the 64x1 input vector with each transform basis vector yields the transform coefficient for that basis vector.
- the process of obtaining all transform coefficients for the 64 transform basis vectors is equivalent to multiplying the 64x64 non-separable transform matrix by the 64x1 input vector. Therefore, the total number of multiplications required for the forward 8x8 NSST is 4096.
- when the inverse 64x64 non-separable transform is applied to the 64x1 transform coefficient vector, the coefficients of the original 8x8 primary transform block can be restored.
- multiplying the inverse 64x64 non-separable transform matrix by the 64x1 transform coefficient vector yields data in the form of a 64x1 vector, and arranging that data in the row-first or column-first order that was originally applied restores the 8x8 block signal (the primary transform coefficients). Therefore, the total number of multiplications required for the inverse 8x8 NSST is 4096.
- the number of multiplications required per sample is 64. This is the number obtained by dividing the total number of multiplications (4096) incurred by the dot products of the 64x1 vector with each transform basis vector, which constitute the 8x8 NSST, by the total number of samples (64). The forward 8x8 NSST and the inverse 8x8 NSST both require 64 multiplications per sample.
- the number of multiplications per sample required when 8x8 NSST is applied is determined as follows according to the area to which 8x8 NSST is applied.
- the range in which 8x8 NSST is applied may be reduced in order to reduce the worst-case number of multiplications required per sample.
- RST need not be applied to all 64 transform coefficients included in the 8x8 block and may instead be applied to only part of it (e.g., the area of the 8x8 block excluding the lower-right 4x4 area).
- the Lx64 transform matrix is constructed by selecting L row transform vectors from the forward 64x64 non-separable transform matrix, and L transform coefficients are obtained by multiplying the Lx64 transform matrix by the 64x1 input vector.
- the worst-case number of multiplications per sample in an 8x8 block according to the value of L is shown in Table 10 below.
- 8x8 RSTs having different L values may be used in combination as shown in Table 13 below to reduce multiplication complexity in the worst case.
- Table 13 describes the conditions for applying 8x8 RST in the case where 8x8 NSST is applicable (that is, when the width and height of the current block are both greater than or equal to 8).
- if the width and height of the current block are both 8, an 8x8 RST based on an 8x64 matrix is applied to the current block; otherwise (if either the width or the height of the current block is not 8), an 8x8 RST based on a 16x64 matrix may be applied to the current block. More specifically, if the size of the current block is 8x8, a non-separable transform having an input length of 64 and an output length of 8 may be applied; otherwise, a non-separable transform having an input length of 64 and an output length of 16 may be applied.
- in the case of the inverse non-separable transform, if the size of the current block is 8x8, a non-separable transform having an input length of 8 and an output length of 64 may be applied; otherwise, a non-separable transform having an input length of 16 and an output length of 64 may be applied.
- since RST need not be applied to the entire 8x8 block and can be applied to only part of it, for example when RST is applied to the area of the 8x8 block excluding the lower-right 4x4 area, an 8x8 RST based on an 8x48 or 16x48 matrix may be applied. That is, if the width and height of the current block are both 8, an 8x8 RST based on the 8x48 matrix is applied; otherwise (when the width or height of the current block is not 8), an 8x8 RST based on the 16x48 matrix is applied.
- in the case of the forward non-separable transform, if the size of the current block is 8x8, a non-separable transform having an input length of 48 and an output length of 8 may be applied; otherwise, a non-separable transform having an input length of 48 and an output length of 16 may be applied.
- in the case of the inverse non-separable transform, if the size of the current block is 8x8, a non-separable transform having an input length of 8 and an output length of 48 may be applied; otherwise, a non-separable transform having an input length of 16 and an output length of 48 may be applied.
- when RST is applied to a block of size 8x8 or larger, on the decoder side, if the height and width of the block are both 8, a non-separable transform matrix (a 48x8 or 64x8 matrix) having an input length less than 64 (e.g., 8) and an output length less than or equal to 64 (e.g., 48 or 64) may be applied; if the height or width of the block is not 8, a non-separable transform matrix (a 48x16 or 64x16 matrix) having an input length less than 64 (e.g., 16) and an output length less than or equal to 64 (e.g., 48 or 64) may be applied.
- Table 12 is an example of applying various 8x8 RSTs in the case where 8x8 NSST is applicable (i.e., when the width and height of the current block are both greater than or equal to 8).
- if the width and height of the current block are both 8, an 8x8 RST based on an 8x64 matrix (or 8x48 matrix) is applied; if the product of the width and height of the current block is less than the threshold (TH), an 8x8 RST based on a 16x64 matrix (or 16x48 matrix) is applied to the upper-left 8x8 region of the current block; and in the remaining case (the width or height of the current block is not 8 and the product of the width and height of the current block is greater than or equal to the threshold), an 8x8 RST based on a 32x64 matrix (or 32x48 matrix) is applied to the upper-left 8x8 area of the current block.
- TH denotes the threshold.
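The three-way choice of 8x8 RST matrix described for Table 12 can be sketched as below. The function name, the `use_48_inputs` flag, and the threshold value of 256 are assumptions made for illustration; the text does not state a concrete threshold.

```python
def select_8x8_rst_matrix(width, height, threshold, use_48_inputs=False):
    """Return the (rows, cols) shape of the forward 8x8 RST matrix.

    rows is the number of output coefficients (L); cols is the number of
    input samples: 64 for the full 8x8 area, or 48 when the lower-right
    4x4 area is excluded. Returns None if the block is smaller than 8x8.
    """
    if width < 8 or height < 8:
        return None
    cols = 48 if use_48_inputs else 64
    if width == 8 and height == 8:
        rows = 8            # worst case: the whole block is transformed
    elif width * height < threshold:
        rows = 16
    else:
        rows = 32
    return (rows, cols)

TH = 256  # hypothetical threshold, for illustration only
shape_8x8 = select_8x8_rst_matrix(8, 8, TH)
shape_8x16 = select_8x8_rst_matrix(8, 16, TH)
shape_16x16 = select_8x8_rst_matrix(16, 16, TH)
shape_8x8_48 = select_8x8_rst_matrix(8, 8, TH, use_48_inputs=True)
```

The smallest matrix goes to the 8x8 block, where the secondary transform touches every sample, and larger matrices are allowed only when the transformed 8x8 region is a fraction of the block.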
- FIG. 29 illustrates an embodiment to which the present invention is applied and shows an example of an encoding flowchart for performing a transform.
- the encoding apparatus 100 performs a primary transform on the residual block (S2910).
- the primary transform may be referred to as a core transform.
- the encoding apparatus 100 may perform primary transformation using the above-described MTS.
- the encoding apparatus 100 may transmit an MTS index indicating a specific MTS among MTS candidates to the decoding apparatus 200.
- the MTS candidate may be configured based on the intra prediction mode of the current block.
- the encoding apparatus 100 determines whether to apply the secondary transform (S2920). For example, the encoding apparatus 100 may determine whether to apply the secondary transform based on the transform coefficients of the primary-transformed residual block.
- the secondary transform may be NSST or RST.
- the encoding apparatus 100 determines a secondary transform (S2930). At this time, the encoding apparatus 100 may determine the secondary transform based on the NSST (or RST) transform set specified according to the intra prediction mode.
- the encoding apparatus 100 may determine an area to which the secondary transform is applied based on the size of the current block prior to step S2930.
- the encoding apparatus 100 performs a secondary transform using the secondary transform determined in step S2930 (S2940).
- FIG. 30 is an embodiment to which the present invention is applied, and shows an example of a decoding flowchart for performing transformation.
- the decoding apparatus 200 determines whether to apply the secondary inverse transform (S3010).
- the secondary inverse transform may be NSST or RST.
- the decoding apparatus 200 may determine whether to apply the secondary inverse transform based on the secondary transform flag received from the encoding apparatus 100.
- the decoding apparatus 200 determines a secondary inverse transform (S3020). At this time, the decoding apparatus 200 may determine the secondary inverse transform applied to the current block based on the NSST (or RST) transform set specified according to the intra prediction mode.
- the decoding apparatus 200 may determine an area to which the secondary inverse transform is applied based on the size of the current block prior to step S3020.
- the decoding apparatus 200 performs secondary inverse transform on the inverse quantized residual block using the secondary inverse transform determined in step S3020 (S3030).
- the decoding apparatus 200 performs a primary inverse transform on the secondary inverse transformed residual block (S3040).
- the primary inverse transform may be referred to as an inverse core transform.
- the decoding apparatus 200 may perform primary inverse transform using the above-described MTS.
- the decoding apparatus 200 may determine whether MTS is applied to the current block prior to step S3040. In this case, a step of determining whether MTS is applied to the decoding flowchart of FIG. 30 may be further included.
- the decoding apparatus 200 may configure the MTS candidate based on the intra prediction mode of the current block. In this case, configuring the MTS candidate may be further included in the decoding flowchart of FIG. 30. Then, the decoding apparatus 200 may determine a primary inverse transform applied to the current block using mts_idx indicating a specific MTS among the configured MTS candidates.
- FIG. 31 is an embodiment to which the present invention is applied, and shows an example of a detailed block diagram of a transform unit 120 in the encoding device 100.
- the encoding apparatus 100 to which the embodiment of the present invention is applied may include a primary transform unit 3110, a secondary transform application determining unit 3120, a secondary transform determining unit 3130, and a secondary transform unit 3140.
- the primary transform unit 3110 may perform a primary transform on the residual block.
- the primary transform may be referred to as a core transform.
- the primary transform unit 3110 may perform primary transform using the above-described MTS.
- the primary transform unit 3110 may transmit an MTS index indicating a specific MTS among MTS candidates to the decoding apparatus 200. At this time, the MTS candidate may be configured based on the intra prediction mode of the current block.
- the secondary transform application determining unit 3120 may determine whether to apply the secondary transform.
- the secondary transform application determining unit 3120 may determine whether to apply the secondary transform based on the transform coefficient of the primary transformed residual block.
- the secondary transform may be NSST or RST.
- the secondary transform determining unit 3130 determines a secondary transform. At this time, the secondary transform determining unit 3130 may determine the secondary transform based on the specified NSST (or RST) transform set according to the intra prediction mode as described above.
- the secondary transform determining unit 3130 may determine an area to which the secondary transform is applied based on the size of the current block.
- the secondary transform unit 3140 may perform a secondary transform using the determined secondary transform.
- FIG. 32 illustrates an embodiment to which the present invention is applied and shows an example of a detailed block diagram of an inverse transform unit 230 in the decoding apparatus 200.
- the decoding apparatus 200 to which the present invention is applied includes a secondary inverse transform application determining unit 3210, a secondary inverse transform determining unit 3220, a secondary inverse transform unit 3230, and a primary inverse transform unit 3240.
- the secondary inverse transform application determining unit 3210 may determine whether to apply the secondary inverse transform.
- the secondary inverse transform may be NSST or RST.
- the secondary inverse transform application determining unit 3210 may determine whether to apply the secondary inverse transform based on the secondary transform flag received from the encoding apparatus 100.
- the secondary inverse transform application determining unit 3210 may determine whether to apply the secondary inverse transform based on the transform coefficients of the residual block.
- the secondary inverse transform determining unit 3220 may determine a secondary inverse transform.
- the secondary inverse transform determining unit 3220 may determine the secondary inverse transform applied to the current block based on the NSST (or RST) transform set specified according to the intra prediction mode.
- the secondary inverse transform determining unit 3220 may determine an area to which the secondary inverse transform is applied based on the size of the current block.
- the secondary inverse transform unit 3230 may perform a secondary inverse transform on an inverse quantized residual block using the determined secondary inverse transform.
- the primary inverse transform unit 3240 may perform a primary inverse transform on the secondary inverse transformed residual block. As an embodiment, the primary inverse transform unit 3240 may perform the primary inverse transform using the above-described MTS. Further, as an example, the primary inverse transform unit 3240 may determine whether MTS is applied to the current block.
- the primary inverse transform unit 3240 may configure the MTS candidate based on the intra prediction mode of the current block. Then, the primary inverse transform unit 3240 may determine the primary inverse transform applied to the current block using mts_idx indicating a specific MTS among the configured MTS candidates.
- FIG. 33 shows an example of a decoding flowchart to which transformation is applied according to an embodiment of the present invention.
- the operations of FIG. 33 may be performed by the inverse transform unit 230 of the decoding apparatus 200.
- in step S3305, the decoding apparatus 200 determines the input length and output length of the non-separable transform based on the height and width of the current block.
- if the height and width of the current block are each 8, the input length of the non-separable transform is determined to be 8, and the output length may be determined to be greater than the input length and less than or equal to 64 (e.g., 48 or 64).
- when the non-separable transform is applied to all transform coefficients of the 8x8 block, the output length is determined to be 64; when the non-separable transform is applied to only part of the transform coefficients of the 8x8 block (e.g., the part excluding the lower-right 4x4 region of the 8x8 block), the output length may be determined to be 48.
- the decoding apparatus 200 determines a non-separable transform matrix corresponding to the input length and output length of the non-separable transform. For example, if the input length of the non-separable transform is 8 and the output length is 48 or 64 (when the size of the current block is 8x8), a 48x8 or 64x8 matrix derived from the transform kernel is determined as the non-separable transform; if the input length of the non-separable transform is 16 and the output length is 48 or 64 (e.g., when the current block is at least 8x8 in size but is not 8x8), a 48x16 or 64x16 matrix derived from the transform kernel may be determined as the non-separable transform.
- the decoding apparatus 200 may determine a non-separable transform set index (e.g., an NSST index) based on the intra prediction mode of the current block, determine a non-separable transform kernel corresponding to the non-separable transform index within the non-separable transform set indicated by the non-separable transform set index, and determine a non-separable transform matrix from the non-separable transform kernel based on the input length and the output length determined in step S3305.
- in step S3315, the decoding apparatus 200 applies the determined non-separable transform matrix to as many coefficients of the current block as the input length (8 or 16) determined for the current block. For example, if the input length of the non-separable transform is 8 and the output length is 48 or 64, a 48x8 or 64x8 matrix derived from the transform kernel is applied to 8 coefficients included in the current block; if the input length is 16 and the output length is 48 or 64, a 48x16 or 64x16 matrix derived from the transform kernel can be applied to the 16 coefficients of the upper-left 4x4 region of the current block.
- the coefficients to which the non-separable transform is applied are the coefficients from the DC position of the current block up to the position corresponding to the input length (e.g., 8 or 16) according to a predetermined scan order (e.g., (a), (b), or (c) of FIG. 16).
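- if the height and width of the current block are not each 8 and the product of the width and height is less than a threshold, the decoding apparatus 200 may apply the non-separable transform matrix to the upper-left 4x4 area of the current block.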
- when the output length is 64, the 64 transformed data (transformed coefficients) produced by applying the non-separable transform matrix are arranged in the 8x8 block.
- when the output length is 48, the 48 transformed data (transformed coefficients) produced by applying the non-separable transform matrix are arranged in the regions of the 8x8 block other than the lower-right 4x4 region.
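The decoder-side steps above (length selection, matrix application, and placement of the outputs) can be sketched as follows. All function names are hypothetical, the matrix is a placeholder that merely copies its inputs, and the row-first placement order is an assumption; the text does not fix the arrangement order here.

```python
def nsst_lengths(width, height, exclude_lower_right=True):
    """Step S3305 (sketch): input/output length of the inverse transform."""
    if width < 8 or height < 8:
        raise ValueError("block must be at least 8x8")
    input_len = 8 if (width == 8 and height == 8) else 16
    output_len = 48 if exclude_lower_right else 64
    return input_len, output_len

def apply_inverse_nsst(coeffs, matrix):
    """Step S3315 (sketch): multiply an output_len x input_len matrix."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]

def arrange_in_8x8(values):
    """Place 64 values in an 8x8 block, or 48 values in the 8x8 block
    with the lower-right 4x4 region left at zero (row-first order assumed)."""
    if len(values) not in (48, 64):
        raise ValueError("expected 48 or 64 values")
    skip = len(values) == 48
    it = iter(values)
    block = [[0] * 8 for _ in range(8)]
    for r in range(8):
        for c in range(8):
            if skip and r >= 4 and c >= 4:
                continue
            block[r][c] = next(it)
    return block

in_len, out_len = nsst_lengths(8, 8)            # 8x8 block, 48 outputs
# Placeholder matrix (assumption): copies the inputs, NOT a real kernel.
matrix = [[1 if i == j else 0 for j in range(in_len)] for i in range(out_len)]
coeffs = list(range(1, in_len + 1))             # first 8 coefficients in scan order
out = apply_inverse_nsst(coeffs, matrix)
block = arrange_in_8x8(out)
```

In this sketch an 8x8 block takes 8 input coefficients and yields 48 outputs, so the lower-right 4x4 region of the resulting block stays zero.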
- the video signal processing apparatus 3400 of FIG. 34 may correspond to the encoding apparatus 100 of FIG. 1 or the decoding apparatus 200 of FIG. 2.
- the image processing apparatus 3400 for processing the image signal includes a memory 3420 for storing the image signal and a processor 3410 that is coupled to the memory and processes the image signal.
- the processor 3410 may include at least one processing circuit for processing a video signal, and may process a video signal by executing instructions for encoding or decoding the video signal. That is, the processor 3410 may encode the original image data or decode the encoded image signal by executing the above-described encoding or decoding methods.
- FIG. 35 shows an example of a video coding system as an embodiment to which the present invention is applied.
- the video coding system may include a source device and a receiving device.
- the source device may deliver the encoded video/image information or data to a receiving device through a digital storage medium or network in the form of a file or streaming.
- the source device may include a video source, an encoding device, and a transmitter.
- the receiving device can include a receiver, a decoding apparatus and a renderer.
- the encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device.
- the transmitter can be included in the encoding device.
- the receiver may be included in the decoding device.
- the renderer may include a display unit, and the display unit may be configured as a separate device or an external component.
- the video source may acquire a video / image through a capture, synthesis, or generation process of the video / image.
- the video source may include a video / image capture device and / or a video / image generation device.
- the video / image capture device may include, for example, one or more cameras, a video / image archive including previously captured video / images, and the like.
- the video / image generating device may include, for example, a computer, a tablet, or a smartphone, and may (electronically) generate the video / image.
- a virtual video / image may be generated through a computer or the like, and in this case, the video / image capture process may be replaced by a process in which related data is generated.
- the encoding device can encode the input video/image.
- the encoding apparatus may perform a series of procedures such as prediction, transformation, and quantization for compression and coding efficiency.
- the encoded data (encoded video/image information) may be output in the form of a bitstream.
- the transmitting unit may transmit the encoded video/image information or data output in the form of a bitstream to a receiving unit of a receiving device through a digital storage medium or a network in a file or streaming format.
- the digital storage media may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD.
- the transmission unit may include an element for generating a media file through a predetermined file format, and may include an element for transmission through a broadcast / communication network.
- the receiver can extract the bitstream and deliver it to the decoding device.
- the decoding apparatus may decode a video / image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding apparatus.
- the renderer can render the decoded video / image.
- the rendered video / image may be displayed through the display unit.
- FIG. 36 illustrates an embodiment to which the present invention is applied and is a structural diagram of a content streaming system.
- the content streaming system to which the present invention is applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- the encoding server plays a role of compressing the content input from multimedia input devices such as a smartphone, a camera, and a camcorder into digital data to generate a bitstream and transmit it to a streaming server.
- when multimedia input devices such as a smartphone, camera, and camcorder directly generate a bitstream, the encoding server may be omitted.
- the bitstream may be generated by an encoding method or a bitstream generation method to which the present invention is applied, and the streaming server may temporarily store the bitstream in the process of transmitting or receiving the bitstream.
- the streaming server transmits multimedia data to the user device based on a user request through the web server, and the web server serves as a medium for informing the user of the service.
- when the user requests a desired service from the web server, the web server forwards the request to the streaming server, and the streaming server transmits multimedia data to the user.
- the content streaming system may include a separate control server, in which case the control server serves to control commands / responses between devices in the content streaming system.
- the streaming server can receive content from the media storage and / or encoding server. For example, when content is received from an encoding server, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server may store the bitstream for a certain period of time.
- Examples of user devices include mobile phones, smart phones, laptop computers, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, slate PCs, tablet PCs, wearable devices (e.g., smartwatches, smart glasses, and head mounted displays (HMDs)), digital TVs, desktop computers, digital signage, and the like.
- Each server in the content streaming system can be operated as a distributed server, and in this case, data received from each server can be distributed.
- the processing method to which the present invention is applied can be produced in the form of a computer-implemented program, and can be stored in a computer-readable recording medium.
- Multimedia data having a data structure according to the present invention can also be stored in a computer-readable recording medium.
- the computer-readable recording medium includes all kinds of storage devices and distributed storage devices in which computer-readable data is stored.
- the computer-readable recording medium may include, for example, Blu-ray Disc (BD), Universal Serial Bus (USB), ROM, PROM, EPROM, EEPROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage devices.
- the computer-readable recording medium includes media implemented in the form of a carrier wave (for example, transmission via the Internet).
- the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.
- an embodiment of the present invention may be implemented as a computer program product by program code, and the program code may be executed on a computer by an embodiment of the present invention.
- the program code can be stored on a computer readable carrier.
- the embodiments described in the present invention may be implemented and performed on a processor, microprocessor, controller, or chip.
- the functional units illustrated in each drawing may be implemented and performed on a computer, processor, microprocessor, controller, or chip.
- the decoder and encoder to which the present invention is applied may be included in a multimedia broadcast transmission / reception device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video communication device, a real-time communication device such as one for video communication, a mobile streaming device, a storage medium, a camcorder, a video-on-demand (VoD) service providing device, an over-the-top (OTT) video device, an Internet streaming service providing device, a three-dimensional (3D) video device, a video telephony device, or a medical video device, and can be used to process video signals or data signals.
- the OTT video (Over the top video) device may include a game console, a Blu-ray player, an Internet-connected TV, a home theater system, a smartphone, a tablet PC, and a digital video recorder (DVR).
- Embodiments according to the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof.
- an embodiment of the invention may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
- an embodiment of the present invention may be implemented in the form of a module, procedure, function, etc. that performs the functions or operations described above.
- the software code can be stored in memory and driven by a processor.
- the memory is located inside or outside the processor, and can exchange data with the processor by various means already known.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Picture Signal Circuits (AREA)
- Closed-Circuit Television Systems (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Claims (10)
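- A method for decoding an image signal, the method comprising: determining an input length and an output length of a non-separable transform based on a height and a width of a current block; determining a non-separable transform matrix corresponding to the input length and the output length of the non-separable transform; and applying the non-separable transform matrix to as many coefficients in the current block as the input length, wherein the height and the width of the current block are greater than or equal to 8, and wherein, when the height and the width of the current block are each 8, the input length of the non-separable transform is determined as 8.
- The method of claim 1, wherein, when the height and the width of the current block are not both 8, the input length of the non-separable transform is determined as 16.
- The method of claim 1, wherein the output length is determined as 48 or 64.
- The method of claim 1, wherein applying the non-separable transform matrix to the current block comprises applying the non-separable transform matrix to a top-left 4x4 region of the current block when the height and the width are not each 8 and the product of the width and the height is less than a threshold.
- The method of claim 1, wherein determining the non-separable transform matrix comprises: determining a non-separable transform set index based on an intra prediction mode of the current block; determining a non-separable transform kernel corresponding to a non-separable transform index within the non-separable transform set included in the non-separable transform set index; and determining the non-separable transform matrix from the non-separable transform kernel based on the input length and the output length.
- An apparatus for decoding an image signal, the apparatus comprising: a memory storing the image signal; and a processor coupled to the memory, wherein the processor is configured to: determine an input length and an output length of a non-separable transform based on a height and a width of a current block; determine a non-separable transform matrix corresponding to the input length and the output length of the non-separable transform; and apply the non-separable transform matrix to as many coefficients in the current block as the input length, wherein the height and the width of the current block are greater than or equal to 8, and wherein, when the height and the width of the current block are each 8, the input length of the non-separable transform is determined as 8 and the output length is determined as a value greater than the input length and less than or equal to 64.
- The apparatus of claim 6, wherein, when the height and the width of the current block are not both 8, the input length of the non-separable transform is determined as 16 and the output length is determined as a value greater than the input length and less than or equal to 64.
- The apparatus of claim 6, wherein the output length is determined as 48.
- The apparatus of claim 6, wherein the processor applies the non-separable transform matrix to a top-left 4x4 region of the current block when the height and the width are not each 8 and the product of the width and the height is less than a threshold.
- The apparatus of claim 10, wherein the processor is configured to: determine a non-separable transform set index based on an intra prediction mode of the current block; determine a non-separable transform kernel corresponding to a non-separable transform index within the non-separable transform set included in the non-separable transform set index; and determine the non-separable transform matrix from the non-separable transform kernel based on the input length and the output length.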
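The size rules recited in claims 1, 2, and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applicant's implementation: the function names (`nsst_input_length`, `uses_top_left_4x4`, `apply_nsst_matrix`), the plain-list matrix layout with one row per output sample, and the choice of taking the first `input_length` coefficients are all hypothetical, and the threshold is passed in by the caller because the claims do not state its value.

```python
def nsst_input_length(height, width):
    """Input length of the non-separable transform, following claims 1 and 2."""
    if height < 8 or width < 8:
        raise ValueError("the claims cover blocks with height and width >= 8")
    # 8x8 blocks use an input length of 8; every other covered size uses 16.
    return 8 if (height == 8 and width == 8) else 16


def uses_top_left_4x4(height, width, threshold):
    """Claim 4: a non-8x8 block whose area is below the threshold
    has the matrix applied to its top-left 4x4 region."""
    return not (height == 8 and width == 8) and width * height < threshold


def apply_nsst_matrix(matrix, coeffs, input_length):
    """Multiply an (output_length x input_length) matrix by the first
    input_length coefficients of the block (hypothetical coefficient order)."""
    x = coeffs[:input_length]
    if len(x) != input_length:
        raise ValueError("not enough coefficients for the input length")
    if any(len(row) != input_length for row in matrix):
        raise ValueError("every matrix row must have input_length entries")
    # One output sample per matrix row.
    return [sum(m * c for m, c in zip(row, x)) for row in matrix]
```

A caller would first pick the input length from the block size, then pass a matrix with 48 or 64 rows (the output lengths named in claim 3) to `apply_nsst_matrix`; the actual kernel values come from the transform set selected in claim 5 and are not reproduced here.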
Priority Applications (16)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201980014843.8A CN111771378B (zh) | 2018-09-05 | 2019-09-05 | Method for encoding/decoding image signal by device and bitstream transmission method |
KR1020207017920A KR102443501B1 (ko) | 2018-09-05 | 2019-09-05 | Method and apparatus for processing image signal |
KR1020237024123A KR20230112741A (ko) | 2018-09-05 | 2019-09-05 | Method and apparatus for processing image signal |
EP19858300.7A EP3723374A4 (en) | 2018-09-05 | 2019-09-05 | METHOD AND APPARATUS FOR PROCESSING A VIDEO SIGNAL |
JP2020537505A JP7106652B2 (ja) | 2018-09-05 | 2019-09-05 | Method and apparatus for processing image signal |
CN202310080184.6A CN116074508A (zh) | 2018-09-05 | 2019-09-05 | Device for encoding/decoding image signal and device for transmitting image signal |
CN202310072429.0A CN116055718A (zh) | 2018-09-05 | 2019-09-05 | Method for encoding/decoding image signal by device and bitstream transmission method |
KR1020227031341A KR102557256B1 (ko) | 2018-09-05 | 2019-09-05 | Method and apparatus for processing image signal |
CN202310078267.1A CN116055719A (zh) | 2018-09-05 | 2019-09-05 | Device for encoding/decoding image signal and device for transmitting image signal |
US16/901,818 US11082694B2 (en) | 2018-09-05 | 2020-06-15 | Method and apparatus for processing image signal |
US17/360,164 US11589051B2 (en) | 2018-09-05 | 2021-06-28 | Method and apparatus for processing image signal |
JP2022112263A JP7328414B2 (ja) | 2018-09-05 | 2022-07-13 | Method and apparatus for processing image signal |
US18/097,775 US11818352B2 (en) | 2018-09-05 | 2023-01-17 | Method and apparatus for processing image signal |
JP2023126442A JP7508664B2 (ja) | 2018-09-05 | 2023-08-02 | Method and apparatus for processing image signal |
US18/369,493 US20240031573A1 (en) | 2018-09-05 | 2023-09-18 | Method and apparatus for processing image signal |
JP2024098614A JP2024120019A (ja) | 2018-09-05 | 2024-06-19 | Method and apparatus for processing image signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862727526P | 2018-09-05 | 2018-09-05 | |
US62/727,526 | 2018-09-05 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/901,818 Continuation US11082694B2 (en) | 2018-09-05 | 2020-06-15 | Method and apparatus for processing image signal |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020050668A1 (ko) | 2020-03-12 |
Family
ID=69723094
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2019/011517 WO2020050668A1 (ko) | 2018-09-05 | 2019-09-05 | Method and apparatus for processing image signal |
Country Status (6)
Country | Link |
---|---|
US (4) | US11082694B2 (ko) |
EP (1) | EP3723374A4 (ko) |
JP (4) | JP7106652B2 (ko) |
KR (3) | KR20230112741A (ko) |
CN (4) | CN116055719A (ko) |
WO (1) | WO2020050668A1 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2022546895A (ja) * | 2019-09-17 | 2022-11-10 | キヤノン株式会社 | Method, apparatus, and system for encoding and decoding a block of video samples |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116055719A (zh) * | 2018-09-05 | 2023-05-02 | Lg电子株式会社 | Device for encoding/decoding image signal and device for transmitting image signal |
CN113365052B (zh) * | 2019-03-09 | 2022-03-25 | 杭州海康威视数字技术股份有限公司 | Method for encoding and decoding, decoding end, encoding end, and system |
CN117354521A (zh) | 2019-06-07 | 2024-01-05 | 北京字节跳动网络技术有限公司 | Conditional signaling of reduced secondary transform in video bitstreams |
EP3790275A4 (en) * | 2019-06-25 | 2022-03-16 | Wilus Institute of Standards and Technology Inc. | VIDEO SIGNAL PROCESSING METHOD AND APPARATUS USING SECONDARY TRANSFORMATION |
EP3994887A4 (en) | 2019-08-03 | 2022-09-28 | Beijing Bytedance Network Technology Co., Ltd. | MATRIX SELECTION FOR A REDUCED SECONDARY TRANSFORM IN VIDEO CODING |
WO2024123148A1 (ko) * | 2022-12-09 | 2024-06-13 | 엘지전자 주식회사 | Image encoding/decoding method and apparatus, and recording medium storing bitstream |
WO2024136471A1 (ko) * | 2022-12-20 | 2024-06-27 | 엘지전자 주식회사 | Image encoding/decoding method and apparatus, and recording medium storing bitstream |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017019649A1 (en) * | 2015-07-28 | 2017-02-02 | Microsoft Technology Licensing, Llc | Reduced size inverse transform for video decoding and encoding |
US20170094313A1 (en) * | 2015-09-29 | 2017-03-30 | Qualcomm Incorporated | Non-separable secondary transform for video coding |
KR20180014655A (ko) * | 2016-08-01 | 2018-02-09 | 한국전자통신연구원 | Image encoding/decoding method |
KR20180085526A (ko) * | 2017-01-19 | 2018-07-27 | 가온미디어 주식회사 | Image decoding and encoding method for processing efficient transform |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8483285B2 (en) * | 2008-10-03 | 2013-07-09 | Qualcomm Incorporated | Video coding using transforms bigger than 4×4 and 8×8 |
US10448053B2 (en) * | 2016-02-15 | 2019-10-15 | Qualcomm Incorporated | Multi-pass non-separable transforms for video coding |
WO2017191782A1 (en) * | 2016-05-04 | 2017-11-09 | Sharp Kabushiki Kaisha | Systems and methods for coding transform data |
US11758136B2 (en) * | 2016-06-24 | 2023-09-12 | Electronics And Telecommunications Research Institute | Method and apparatus for transform-based image encoding/decoding |
WO2018012830A1 (ko) * | 2016-07-13 | 2018-01-18 | 한국전자통신연구원 | Image encoding/decoding method and apparatus |
US11095893B2 (en) * | 2016-10-12 | 2021-08-17 | Qualcomm Incorporated | Primary transform and secondary transform in video coding |
WO2018097691A2 (ko) * | 2016-11-28 | 2018-05-31 | 한국전자통신연구원 | Image encoding/decoding method and apparatus, and recording medium storing bitstream |
US20180288439A1 (en) * | 2017-03-31 | 2018-10-04 | Mediatek Inc. | Multiple Transform Prediction |
US10855997B2 (en) * | 2017-04-14 | 2020-12-01 | Mediatek Inc. | Secondary transform kernel size selection |
US11134272B2 (en) * | 2017-06-29 | 2021-09-28 | Qualcomm Incorporated | Memory reduction for non-separable transforms |
TWI794129B (zh) * | 2017-07-13 | 2023-02-21 | 美商松下電器(美國)知識產權公司 | Encoding device, encoding method, decoding device, decoding method, and computer-readable non-transitory medium |
US10516885B1 (en) * | 2018-07-11 | 2019-12-24 | Tencent America LLC | Method and apparatus for video coding |
US11259052B2 (en) * | 2018-07-16 | 2022-02-22 | Qualcomm Incorporated | Transform variations of multiple separable transform selection |
PL3723373T3 (pl) * | 2018-09-02 | 2023-11-06 | Lg Electronics Inc. | Method for decoding image signal, method for encoding image signal, and data carrier |
CN116055719A (zh) * | 2018-09-05 | 2023-05-02 | Lg电子株式会社 | Device for encoding/decoding image signal and device for transmitting image signal |
Filing Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2019-09-05 | CN | CN202310078267.1A | CN116055719A (zh) | Active (Pending) |
2019-09-05 | JP | JP2020537505A | JP7106652B2 (ja) | Active |
2019-09-05 | CN | CN202310080184.6A | CN116074508A (zh) | Active (Pending) |
2019-09-05 | KR | KR1020237024123A | KR20230112741A (ko) | Not active (Application Discontinuation) |
2019-09-05 | EP | EP19858300.7A | EP3723374A4 (en) | Not active (Ceased) |
2019-09-05 | KR | KR1020207017920A | KR102443501B1 (ko) | Active (IP Right Grant) |
2019-09-05 | CN | CN202310072429.0A | CN116055718A (zh) | Active (Pending) |
2019-09-05 | CN | CN201980014843.8A | CN111771378B (zh) | Active |
2019-09-05 | WO | PCT/KR2019/011517 | WO2020050668A1 (ko) | Unknown |
2019-09-05 | KR | KR1020227031341A | KR102557256B1 (ko) | Active (IP Right Grant) |
2020-06-15 | US | US16/901,818 | US11082694B2 (en) | Active |
2021-06-28 | US | US17/360,164 | US11589051B2 (en) | Active |
2022-07-13 | JP | JP2022112263A | JP7328414B2 (ja) | Active |
2023-01-17 | US | US18/097,775 | US11818352B2 (en) | Active |
2023-08-02 | JP | JP2023126442A | JP7508664B2 (ja) | Active |
2023-09-18 | US | US18/369,493 | US20240031573A1 (en) | Active (Pending) |
2024-06-19 | JP | JP2024098614A | JP2024120019A (ja) | Active (Pending) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017019649A1 (en) * | 2015-07-28 | 2017-02-02 | Microsoft Technology Licensing, Llc | Reduced size inverse transform for video decoding and encoding |
US20170094313A1 (en) * | 2015-09-29 | 2017-03-30 | Qualcomm Incorporated | Non-separable secondary transform for video coding |
KR20180014655A (ko) * | 2016-08-01 | 2018-02-09 | 한국전자통신연구원 | Image encoding/decoding method |
KR20180085526A (ko) * | 2017-01-19 | 2018-07-27 | 가온미디어 주식회사 | Image decoding and encoding method for processing efficient transform |
Non-Patent Citations (1)
Title |
---|
KOO, MOONMO ET AL.: "Description of SDR video coding technology proposal by LG Electronics", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11. 10TH MEETING, no. JVET-J0017-v1, 11 April 2018 (2018-04-11), San Diego , CA, pages 1 - 67, XP030151178 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2022546895A (ja) * | 2019-09-17 | 2022-11-10 | キヤノン株式会社 | Method, apparatus, and system for encoding and decoding a block of video samples |
JP7394875B2 (ja) | 2019-09-17 | 2023-12-08 | キヤノン株式会社 | Method, apparatus, and system for encoding and decoding a block of video samples |
Also Published As
Publication number | Publication date |
---|---|
JP2023133520A (ja) | 2023-09-22 |
JP7328414B2 (ja) | 2023-08-16 |
US20200314425A1 (en) | 2020-10-01 |
US11082694B2 (en) | 2021-08-03 |
US20240031573A1 (en) | 2024-01-25 |
JP2021510253A (ja) | 2021-04-15 |
EP3723374A1 (en) | 2020-10-14 |
CN111771378B (zh) | 2023-02-17 |
KR20200086733A (ko) | 2020-07-17 |
CN116055718A (zh) | 2023-05-02 |
CN116055719A (zh) | 2023-05-02 |
US20230164319A1 (en) | 2023-05-25 |
US11589051B2 (en) | 2023-02-21 |
JP2024120019A (ja) | 2024-09-03 |
US20210337201A1 (en) | 2021-10-28 |
CN116074508A (zh) | 2023-05-05 |
US11818352B2 (en) | 2023-11-14 |
CN111771378A (zh) | 2020-10-13 |
JP7508664B2 (ja) | 2024-07-01 |
JP7106652B2 (ja) | 2022-07-26 |
JP2022132405A (ja) | 2022-09-08 |
EP3723374A4 (en) | 2021-02-24 |
KR102443501B1 (ko) | 2022-09-14 |
KR20230112741A (ko) | 2023-07-27 |
KR20220127389A (ko) | 2022-09-19 |
KR102557256B1 (ko) | 2023-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
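WO2020060364A1 (ko) | | Method for encoding/decoding video signal and apparatus therefor |
WO2020050665A1 (ko) | | Method for encoding/decoding video signal and apparatus therefor |
WO2019235797A1 (ko) | | Method and apparatus for processing video signal using reduced transform |
WO2021054796A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2020050668A1 (ko) | | Method and apparatus for processing image signal |
WO2020046086A1 (ko) | | Method and apparatus for processing image signal |
WO2019235887A1 (ko) | | Method for performing transform index coding based on intra prediction mode and apparatus therefor |
WO2020009434A1 (ko) | | Method and apparatus for processing video signal based on secondary transform |
WO2020162690A1 (ko) | | Method and apparatus for processing video signal using reduced transform |
WO2020046092A1 (ko) | | Method for encoding/decoding video signal and apparatus therefor |
WO2021206445A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2020162732A1 (ko) | | Method and apparatus for processing video signal |
WO2020166977A1 (ko) | | Method and apparatus for processing video signal |
WO2020050651A1 (ko) | | Image coding method based on multiple transform selection and apparatus therefor |
WO2020013541A1 (ko) | | Method and apparatus for processing video signal |
WO2021066598A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2020036390A1 (ko) | | Method and apparatus for processing image signal |
WO2021096290A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2020071736A1 (ko) | | Method for encoding/decoding video signal and apparatus therefor |
WO2021054798A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2021167421A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2021096295A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2021060905A1 (ko) | | Transform-based image coding method and apparatus therefor |
WO2019194515A1 (ko) | | Image processing method and apparatus therefor |
WO2021201649A1 (ko) | | Transform-based image coding method and apparatus therefor |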
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19858300; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 20207017920; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2020537505; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2019858300; Country of ref document: EP; Effective date: 20200710 |
| | NENP | Non-entry into the national phase | Ref country code: DE |