WO2020046085A1 - Method and device for processing an image signal - Google Patents

Method and device for processing an image signal

Info

Publication number
WO2020046085A1
Authority
WO
WIPO (PCT)
Prior art keywords
transform
unit
mts
block
index
Prior art date
Application number
PCT/KR2019/011249
Other languages
English (en)
Korean (ko)
Inventor
구문모
살레후메디
팔루리시달
김승환
임재현
Original Assignee
엘지전자 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 엘지전자 주식회사
Publication of WO2020046085A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present invention relates to a method and apparatus for processing a video signal, and more particularly, to a method and apparatus for encoding or decoding a video signal by performing a transformation.
  • Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or for storing in a form suitable for a storage medium.
  • Media such as video, images, and voice may be targets of compression encoding.
  • In particular, a technique of performing compression encoding on video is referred to as video compression.
  • Next-generation video content will be characterized by high spatial resolution, a high frame rate, and high dimensionality of scene representation. Processing such content will result in a huge increase in memory storage, memory access rate, and processing power requirements.
  • Accordingly, the video codec standard following the high efficiency video coding (HEVC) standard requires a higher-accuracy prediction technique along with an efficient transform technique for converting a spatial-domain video signal into the frequency domain.
  • Embodiments of the present invention provide an image signal processing method and apparatus for reducing the computational complexity of the transform.
  • According to an embodiment, a method of processing an image signal comprises: obtaining a transform index indicating a transform kernel for transforming a current block, determining a transform matrix corresponding to the transform index, and generating an array of residual samples by applying the transform matrix to the transform coefficients of the current block, wherein the components of the transform matrix are implemented by shift operations and additions.
  • each component of the transform matrix may be implemented as a sum of terms, each term being a left shift of 1 (that is, a power of two).
  • the number of terms constituting each component of the transform matrix may be set to be fewer than three.
  • each component of the transform matrix may be set to a value that approximates the corresponding component of DCT-4, DST-7, or DCT-8 within an allowable error range.
  • each of the components of the transformation matrix may be determined in consideration of the allowed error range and the number of terms.
  • An apparatus for processing an image signal includes a memory for storing the image signal and a processor coupled to the memory, wherein the processor obtains a transform index indicating a transform kernel for transforming a current block, determines a transform matrix corresponding to the transform index, and applies the transform matrix to the transform coefficients of the current block, the components of the transform matrix being implemented by shift operations and additions.
  • the computational complexity may be reduced by performing a transformation using a shift operation and an addition operation without a multiplication operation.
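  • As a minimal sketch of this multiplierless idea, the following code approximates each matrix component as a signed sum of at most two powers of two and evaluates an inner product using only shifts and additions. The component values, the greedy decomposition, and the two-term budget are illustrative assumptions, not the patent's actual tables.

```python
def decompose(value, max_terms=2):
    """Greedily approximate `value` as a signed sum of powers of two."""
    terms, residual = [], value
    for _ in range(max_terms):
        if residual == 0:
            break
        shift = max(abs(residual).bit_length() - 1, 0)
        # Round to the nearest power of two rather than truncating.
        if abs(residual) - (1 << shift) > (1 << (shift + 1)) - abs(residual):
            shift += 1
        terms.append((1 if residual > 0 else -1, shift))
        residual -= (1 << shift) if residual > 0 else -(1 << shift)
    return terms  # list of (sign, shift) pairs

def shift_add_dot(row_terms, vector):
    """Inner product realized with shifts and additions only."""
    acc = 0
    for terms, x in zip(row_terms, vector):
        for sign, shift in terms:
            acc += sign * (x << shift)  # no multiplication by matrix entries
    return acc

row = [64, 83, 36, 18]                   # illustrative integer basis values
row_terms = [decompose(c) for c in row]  # e.g. 83 ~ 64 + 16 within the budget
print(shift_add_dot(row_terms, [1, 2, 3, 4]))  # close to 64*1 + 83*2 + 36*3 + 18*4
```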
  • FIG. 1 shows an example of an image coding system as an embodiment to which the present invention is applied.
  • FIG. 2 is a schematic block diagram of an encoding apparatus in which an encoding of a video / image signal is performed, according to an embodiment to which the present invention is applied.
  • FIG. 3 is an embodiment to which the present invention is applied and shows a schematic block diagram of a decoding apparatus in which decoding of a video signal is performed.
  • FIG. 4 is a structural diagram of a content streaming system according to an embodiment to which the present invention is applied.
  • FIG. 5 illustrates embodiments to which the present invention may be applied: FIG. 5A is a quadtree (QT), FIG. 5B is a binary tree (BT), FIG. 5C is a ternary tree (TT), and FIG. 5D is an asymmetric tree (AT); FIGS. 5A to 5D are diagrams for describing the block division structures by QT, BT, TT, and AT.
  • FIG. 6 is a schematic block diagram of the transform and quantization unit and the inverse quantization and inverse transform unit in the encoding apparatus 100 of FIG. 2, and FIG. 7 is a schematic block diagram of the inverse quantization and inverse transform unit in the decoding apparatus 200.
  • FIG. 8 is a flowchart illustrating a process of performing adaptive multiple transform (AMT).
  • FIG. 9 is a flowchart illustrating a decoding process in which AMT is performed.
  • FIG. 10 shows three forward scan orders for transform coefficients or transform coefficient blocks applied in the HEVC standard: (a) a diagonal scan, (b) a horizontal scan, and (c) a vertical scan.
  • FIGS. 11 and 12 illustrate embodiments to which the present invention is applied: FIG. 11 shows the positions of transform coefficients when a forward diagonal scan is applied with 4x4 RST applied to a 4x8 block, and FIG. 12 shows an example of merging the valid transform coefficients of two 4x4 blocks into one block.
  • FIG. 13 is a flowchart illustrating an inverse transform process based on multiple transform selection (MTS) according to an embodiment of the present invention.
  • FIG. 14 is a block diagram of an apparatus for performing decoding based on an MTS according to an embodiment of the present invention.
  • FIG. 15 shows an example of a decoding flowchart for performing a conversion process according to an embodiment of the present invention.
  • FIG. 16 shows a flowchart for processing a video signal according to an embodiment to which the present invention is applied.
  • FIG. 17 shows an example of a block diagram of an apparatus for processing a video signal as an embodiment to which the present invention is applied.
  • a 'processing unit' refers to a unit in which a process of encoding / decoding such as prediction, transformation, and / or quantization is performed.
  • the processing unit may be interpreted to include a unit for a luma component and a unit for a chroma component.
  • the processing unit may correspond to a block, a coding unit (CU), a prediction unit (PU), or a transform unit (TU).
  • the processing unit may be interpreted as a unit for the luminance component or a unit for the chrominance component.
  • the processing unit may correspond to a CTB, CB, PU or TB for the luminance component.
  • the processing unit may correspond to a CTB, CB, PU or TB for the chrominance component.
  • the present invention is not limited thereto, and the processing unit may be interpreted to include a unit for a luminance component and a unit for a color difference component.
  • processing unit is not necessarily limited to square blocks, but may also be configured in a polygonal form having three or more vertices.
  • Hereinafter, a pixel or a pel is referred to as a sample.
  • using a sample may mean using a pixel value.
  • FIG. 1 shows an example of an image coding system as an embodiment to which the present invention is applied.
  • the image coding system can include a source device 10 and a receiving device 20.
  • the source device 10 may transmit the encoded video / video information or data to the receiving device 20 through a digital storage medium or a network in a file or streaming form.
  • Source device 10 may include a video source 11, an encoding device 12, and a transmitter 13.
  • the receiving device 20 may include a receiver 21, a decoding device 22 and a renderer 23.
  • the encoding device 12 may be called a video/image encoding device, and the decoding device 22 may be called a video/image decoding device.
  • the transmitter 13 may be included in the encoding device 12.
  • the receiver 21 may be included in the decoding device 22.
  • the renderer 23 may include a display unit, and the display unit may be configured as a separate device or an external component.
  • the video source may acquire the video / image through a process of capturing, synthesizing, or generating the video / image.
  • the video source may comprise a video / image capture device and / or a video / image generation device.
  • the video / image capture device may include, for example, one or more cameras, video / image archives including previously captured video / images, and the like.
  • Video / image generation devices may include, for example, computers, tablets and smartphones, and may (electronically) generate video / images.
  • a virtual video / image may be generated through a computer or the like. In this case, the video / image capturing process may be replaced by a process of generating related data.
  • the encoding device 12 may encode the input video / image.
  • the encoding device 12 may perform a series of procedures such as prediction, transform, and quantization for compression and coding efficiency.
  • the encoded data (encoded video / image information) may be output in the form of a bitstream.
  • the transmitter 13 may transmit the encoded video / video information or data output in the form of a bitstream to the receiver of the receiving device through a digital storage medium or a network in the form of a file or streaming.
  • the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, and the like.
  • the transmission unit 13 may include an element for generating a media file through a predetermined file format, and may include an element for transmission through a broadcast / communication network.
  • the receiver 21 may extract the bitstream and transfer it to the decoding device 22.
  • the decoding device 22 may decode the video / image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding device 12.
  • the renderer 23 may render the decoded video / image.
  • the rendered video / image may be displayed through the display unit.
  • FIG. 2 is a schematic block diagram of an encoding apparatus in which an encoding of a video / image signal is performed, according to an embodiment to which the present invention is applied.
  • the encoding apparatus 100 of FIG. 2 may correspond to the encoding apparatus 12 of FIG. 1.
  • the image divider 110 may divide an input image (or a picture or a frame) input to the encoding apparatus 100 into one or more processing units.
  • the processing unit may be called a coding unit (CU).
  • the coding unit may be recursively divided according to a quad-tree binary-tree (QTBT) structure from a coding tree unit (CTU) or a largest coding unit (LCU).
  • one coding unit may be divided into a plurality of coding units of a deeper depth based on a quad tree structure and / or a binary tree structure.
  • the quad tree structure may be applied first and the binary tree structure may be applied later.
  • the binary tree structure may be applied first.
  • the coding procedure according to the present invention may be performed based on the final coding unit that is no longer split.
  • the maximum coding unit may be used directly as the final coding unit based on coding efficiency according to the image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of lower depths so that a coding unit of optimal size is used as the final coding unit.
  • the coding procedure may include a procedure of prediction, transform, and reconstruction, which will be described later.
  • the processing unit may further include a prediction unit (PU) or a transform unit (TU).
  • the prediction unit and the transform unit may each be partitioned from the final coding unit described above.
  • the prediction unit may be a unit of sample prediction
  • the transformation unit may be a unit for deriving a transform coefficient and / or a unit for deriving a residual signal from the transform coefficient.
  • an M ⁇ N block may represent a set of samples or transform coefficients composed of M columns and N rows.
  • a sample may generally represent a pixel or a value of a pixel, and may represent only a pixel / pixel value of a luma component or only a pixel / pixel value of a chroma component.
  • the term sample may be used to correspond to a pixel or a pel of one picture (or image).
  • the encoding apparatus 100 may subtract the prediction signal (predicted block, prediction sample array) output from the inter prediction unit 180 or the intra prediction unit 185 from the input image signal (original block, original sample array) to generate a residual signal (residual block, residual sample array), and the generated residual signal is transmitted to the transformer 120.
  • a unit for subtracting a prediction signal (prediction block, prediction sample array) from an input image signal (original block, original sample array) in the encoder 100 may be referred to as a subtraction unit 115.
  • the prediction unit may perform a prediction on a block to be processed (hereinafter, referred to as a current block) and generate a predicted block including prediction samples for the current block.
  • the prediction unit may determine whether intra prediction or inter prediction is applied on a current block or CU basis. As described later in the description of each prediction mode, the prediction unit may generate various information related to prediction, such as prediction mode information, and transmit the generated information to the entropy encoding unit 190. The information about the prediction may be encoded in the entropy encoding unit 190 and output in the form of a bitstream.
  • the intra predictor 185 may predict the current block by referring to the samples in the current picture.
  • the referenced samples may be located in the neighborhood of the current block or may be located apart according to the prediction mode.
  • the prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
  • the non-directional modes may include, for example, a DC mode and a planar mode.
  • the directional modes may include, for example, 33 directional prediction modes or 65 directional prediction modes, depending on the granularity of the prediction direction. However, more or fewer directional prediction modes may be used depending on the setting.
  • the intra predictor 185 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
  • the inter predictor 180 may derive the predicted block with respect to the current block based on the reference block (reference sample array) specified by the motion vector on the reference picture.
  • the motion information may be predicted in units of blocks, subblocks, or samples based on the correlation of the motion information between the neighboring block and the current block.
  • the motion information may include a motion vector and a reference picture index.
  • the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
  • the neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block present in the reference picture.
  • a reference picture including a reference block and a reference picture including a temporal neighboring block may be the same or different.
  • the temporal neighboring block may be referred to as a collocated reference block, a collocated CU (colCU), or the like.
  • a reference picture including a temporal neighboring block may be referred to as a collocated picture (colPic).
  • the inter prediction unit 180 may construct a motion information candidate list based on neighboring blocks and generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes.
  • the inter prediction unit 180 may use motion information of a neighboring block as motion information of a current block.
  • in the case of the skip mode, unlike the merge mode, the residual signal may not be transmitted.
  • in the motion vector prediction (MVP) mode, the motion vector of a neighboring block is used as a motion vector predictor, and the motion vector of the current block may be indicated by signaling a motion vector difference.
  • the prediction signal generated by the inter predictor 180 or the intra predictor 185 may be used to generate a reconstruction signal or may be used to generate a residual signal.
  • the transformer 120 may apply transform techniques to the residual signal to generate transform coefficients.
  • the transformation technique may include at least one of a discrete cosine transform (DCT), a discrete sine transform (DST), a karhunen-loeve transform (KLT), a graph-based transform (GBT), or a conditionally non-linear transform (CNT).
  • GBT means a transform obtained from a graph when the relationship information between pixels is represented by the graph.
  • CNT refers to a transform obtained based on a prediction signal generated using all previously reconstructed pixels.
  • the transform process may be applied to square pixel blocks of the same size, or may be applied to non-square blocks of variable size.
  • the quantization unit 130 quantizes the transform coefficients and transmits them to the entropy encoding unit 190.
  • the entropy encoding unit 190 encodes the quantized signal (information about the quantized transform coefficients) and outputs it in the form of a bitstream.
  • Information about the quantized transform coefficients may be referred to as residual information.
  • the quantization unit 130 may rearrange the block-form quantized transform coefficients into a one-dimensional vector form based on a coefficient scan order, and may generate information about the quantized transform coefficients based on the one-dimensional vector form.
  • the entropy encoding unit 190 may perform various encoding methods such as, for example, exponential Golomb, context-adaptive variable length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), and the like.
  • the entropy encoding unit 190 may encode information necessary for video / image reconstruction other than quantized transform coefficients (for example, values of syntax elements) together or separately.
  • the encoded information (e.g., video/picture information) may be transmitted or stored in the form of a bitstream in units of network abstraction layer (NAL) units.
  • the bitstream may be transmitted over a network or may be stored in a digital storage medium.
  • the network may include a broadcasting network and / or a communication network
  • the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, and the like.
  • a transmitting unit (not shown) for transmitting the signal output from the entropy encoding unit 190 and/or a storage unit (not shown) for storing it may be configured as internal/external elements of the encoding apparatus 100, or the transmitting unit may be a component of the entropy encoding unit 190.
  • the quantized transform coefficients output from the quantization unit 130 may be used to generate a prediction signal.
  • the quantized transform coefficients may be reconstructed into the residual signal by applying inverse quantization and inverse transform through the inverse quantization unit 140 and the inverse transform unit 150 in the loop.
  • the adder 155 adds the reconstructed residual signal to the prediction signal output from the inter predictor 180 or the intra predictor 185 to generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array). If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as the reconstructed block.
  • the adder 155 may be called a restoration unit or a restoration block generation unit.
  • the generated reconstruction signal may be used for intra prediction of the next block to be processed in the current picture, and may be used for inter prediction of the next picture through filtering as described below.
  • the filtering unit 160 may improve subjective / objective image quality by applying filtering to the reconstruction signal. For example, the filtering unit 160 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and transmit the modified reconstructed picture to the decoded picture buffer 170. Various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and the like. The filtering unit 160 may generate various information related to the filtering and transmit the generated information to the entropy encoding unit 190 as described later in the description of each filtering method. The filtering information may be encoded in the entropy encoding unit 190 and output in the form of a bitstream.
  • the modified reconstructed picture transmitted to the decoded picture buffer 170 may be used as the reference picture in the inter predictor 180.
  • the encoding apparatus may avoid prediction mismatch between the encoding apparatus 100 and the decoding apparatus, and may improve encoding efficiency.
  • the decoded picture buffer 170 may store the modified reconstructed picture for use as a reference picture in the inter prediction unit 180.
  • FIG. 3 is an embodiment to which the present invention is applied and shows a schematic block diagram of a decoding apparatus in which decoding of a video signal is performed.
  • the decoding device 200 of FIG. 3 may correspond to the decoding device 22 of FIG. 1.
  • the decoding apparatus 200 includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an adder 235, a filtering unit 240, and a decoded picture buffer (DPB). 250, an inter predictor 260, and an intra predictor 265 may be configured.
  • the inter predictor 260 and the intra predictor 265 may be collectively called a predictor. That is, the predictor may include the inter predictor 260 and the intra predictor 265.
  • the inverse quantization unit 220 and the inverse transform unit 230 may be collectively called a residual processing unit. That is, the residual processor may include an inverse quantization unit 220 and an inverse transform unit 230.
  • the entropy decoding unit 210, the inverse quantization unit 220, the inverse transformer 230, the adder 235, the filtering unit 240, the inter prediction unit 260, and the intra prediction unit 265 described above may be configured by one hardware component (e.g., a decoder or a processor) according to an embodiment.
  • the decoded picture buffer 250 may be implemented by one hardware component (for example, a memory or a digital storage medium) according to an exemplary embodiment.
  • the decoding apparatus 200 may reconstruct an image corresponding to a process in which the video / image information is processed in the encoding apparatus 100 of FIG. 2.
  • the decoding apparatus 200 may perform decoding using a processing unit applied in the encoding apparatus 100.
  • the processing unit of decoding may thus be a coding unit, for example, and the coding unit may be divided along the quad tree structure and / or the binary tree structure from the coding tree unit or the largest coding unit.
  • the reconstructed video signal decoded and output through the decoding apparatus 200 may be reproduced through the reproducing apparatus.
  • the decoding apparatus 200 may receive a signal output from the encoding apparatus 100 of FIG. 2 in the form of a bitstream, and the received signal may be decoded through the entropy decoding unit 210.
  • the entropy decoding unit 210 may parse the bitstream to derive information (eg, video / image information) necessary for image reconstruction (or picture reconstruction).
  • the entropy decoding unit 210 decodes the information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements required for image reconstruction and quantized values of transform coefficients for the residual.
  • the CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream, determines a context model using the syntax element information to be decoded, the decoding information of neighboring and decoding-target blocks, or the information of symbols/bins decoded in a previous step, predicts the occurrence probability of a bin according to the determined context model, and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
  • the CABAC entropy decoding method may update the context model by using the information of the decoded symbol / bin for the context model of the next symbol / bin after determining the context model.
  • the information related to prediction among the information decoded by the entropy decoding unit 210 is provided to the predictor (the inter predictor 260 and the intra predictor 265), and the residual values on which entropy decoding has been performed by the entropy decoding unit 210, that is, the quantized transform coefficients and related parameter information, may be input to the inverse quantization unit 220.
  • information on filtering among information decoded by the entropy decoding unit 210 may be provided to the filtering unit 240.
  • a receiver (not shown) that receives the signal output from the encoding apparatus 100 may be further configured as an internal/external element of the decoding apparatus 200, or the receiver may be a component of the entropy decoding unit 210.
  • the inverse quantization unit 220 may dequantize the quantized transform coefficients and output the transform coefficients.
  • the inverse quantization unit 220 may rearrange the quantized transform coefficients in the form of a two-dimensional block. In this case, reordering may be performed based on the coefficient scan order performed in the encoding apparatus 100.
  • the inverse quantization unit 220 may perform inverse quantization on quantized transform coefficients using a quantization parameter (for example, quantization step size information), and may obtain transform coefficients.
  • the inverse transformer 230 inversely transforms the transform coefficients to obtain a residual signal (residual block, residual sample array).
  • the prediction unit may perform prediction on the current block and generate a predicted block including prediction samples for the current block.
  • the prediction unit may determine whether intra prediction or inter prediction is applied to the current block based on the information about the prediction output from the entropy decoding unit 210, and may determine a specific intra / inter prediction mode.
  • the intra predictor 265 may predict the current block by referring to the samples in the current picture.
  • the referenced samples may be located in the neighbor of the current block or may be spaced apart according to the prediction mode.
  • the prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
  • the intra predictor 265 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
  • the inter prediction unit 260 may construct a motion information candidate list based on neighboring blocks and derive a motion vector and / or a reference picture index of the current block based on the received candidate selection information. Inter prediction may be performed based on various prediction modes, and the information about the prediction may include information indicating a mode of inter prediction for the current block.
  • the adder 235 adds the obtained residual signal to the prediction signal (predicted block, prediction sample array) output from the inter predictor 260 or the intra predictor 265 to generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array). If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as the reconstructed block.
  • the adder 235 may be called a restoration unit or a restoration block generation unit.
  • the generated reconstruction signal may be used for intra prediction of the next block to be processed in the current picture, and may be used for inter prediction of the next picture through filtering as described below.
  • the filtering unit 240 may improve subjective / objective image quality by applying filtering to the reconstruction signal. For example, the filtering unit 240 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and transmit the modified reconstructed picture to the decoded picture buffer 250.
  • Various filtering methods may include, for example, deblocking filtering, sample adaptive offset (SAO), adaptive loop filter (ALF), bilateral filter, and the like.
  • the modified reconstructed picture transmitted to the decoded picture buffer 250 may be used as the reference picture by the inter predictor 260.
  • the embodiments described for the filtering unit 160, the inter prediction unit 180, and the intra prediction unit 185 of the encoding apparatus 100 may be applied equally or correspondingly to the filtering unit 240, the inter prediction unit 260, and the intra prediction unit 265 of the decoding apparatus, respectively.
  • FIG. 4 is a structural diagram of a content streaming system according to an embodiment to which the present invention is applied.
  • the content streaming system to which the present invention is applied may largely include an encoding server 410, a streaming server 420, a web server 430, a media storage 440, a user device 450, and a multimedia input device 460.
  • the encoding server 410 compresses content input from multimedia input devices such as a smartphone, a camera, a camcorder, etc. into digital data to generate a bitstream and transmit the bitstream to the streaming server 420.
  • when multimedia input devices such as a smartphone, a camera, or a camcorder directly generate a bitstream, the encoding server 410 may be omitted.
  • the bitstream may be generated by an encoding method or a bitstream generation method to which the present invention is applied, and the streaming server 420 may temporarily store the bitstream in the process of transmitting or receiving the bitstream.
  • the streaming server 420 transmits the multimedia data to the user device 450 based on a user request made through the web server 430, and the web server 430 serves as an intermediary informing the user of the available services.
  • the web server 430 transmits the request to the streaming server 420, and the streaming server 420 transmits multimedia data to the user.
  • the content streaming system may include a separate control server, in which case the control server serves to control the command / response between each device in the content streaming system.
  • the streaming server 420 may receive content from the media store 440 and / or the encoding server 410. For example, when the content is received from the encoding server 410, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server 420 may store the bitstream for a predetermined time.
  • Examples of the user device 450 include a mobile phone, a smartphone, a laptop computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation device, a slate PC, a tablet PC, an ultrabook, a wearable device (e.g., a smartwatch, smart glasses, or a head mounted display), a digital TV, a desktop computer, digital signage, and the like.
  • Each server in the content streaming system may operate as a distributed server.
  • data received from each server may be distributedly processed.
  • FIG. 5 illustrates embodiments to which the present invention may be applied: FIG. 5A is a quadtree (QT), FIG. 5B is a binary tree (BT), FIG. 5C is a ternary tree (TT), and FIG. 5D is an asymmetric tree (AT); FIGS. 5A to 5D are diagrams for describing the block division structures by QT, BT, TT, and AT.
  • one block may be divided on a QT basis.
  • one subblock divided by QT may be further divided recursively using QT.
  • Leaf blocks that are no longer QT split may be split by at least one of BT, TT, or AT.
  • BT may have two types of divisions: horizontal BT (2NxN, 2NxN) and vertical BT (Nx2N, Nx2N).
  • TT may have two types of divisions: horizontal TT (2Nx(1/2)N, 2NxN, 2Nx(1/2)N) and vertical TT ((1/2)Nx2N, Nx2N, (1/2)Nx2N).
  • AT may have four types of divisions: horizontal-up AT (2Nx(1/2)N, 2Nx(3/2)N), horizontal-down AT (2Nx(3/2)N, 2Nx(1/2)N), vertical-left AT ((1/2)Nx2N, (3/2)Nx2N), and vertical-right AT ((3/2)Nx2N, (1/2)Nx2N).
  • Each BT, TT, AT may be further recursively divided using BT, TT, AT.
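  • As a concrete reading of the split shapes listed above, the following illustrative helper (the names and API are assumptions, not from the patent) returns the sub-block sizes each split type produces for a width x height block:

```python
def split_block(width, height, mode):
    """Sub-block sizes produced by each split type for a (width x height) block."""
    if mode == "QT":           # four quadrants
        return [(width // 2, height // 2)] * 4
    if mode == "BT_H":         # horizontal BT: 2NxN + 2NxN
        return [(width, height // 2)] * 2
    if mode == "BT_V":         # vertical BT: Nx2N + Nx2N
        return [(width // 2, height)] * 2
    if mode == "TT_H":         # horizontal TT: 2Nx(1/2)N, 2NxN, 2Nx(1/2)N
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    if mode == "TT_V":         # vertical TT: (1/2)Nx2N, Nx2N, (1/2)Nx2N
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "AT_H_UP":      # horizontal-up AT: 2Nx(1/2)N + 2Nx(3/2)N
        return [(width, height // 4), (width, 3 * height // 4)]
    if mode == "AT_H_DOWN":    # horizontal-down AT: 2Nx(3/2)N + 2Nx(1/2)N
        return [(width, 3 * height // 4), (width, height // 4)]
    if mode == "AT_V_LEFT":    # vertical-left AT: (1/2)Nx2N + (3/2)Nx2N
        return [(width // 4, height), (3 * width // 4, height)]
    if mode == "AT_V_RIGHT":   # vertical-right AT: (3/2)Nx2N + (1/2)Nx2N
        return [(3 * width // 4, height), (width // 4, height)]
    raise ValueError(mode)

print(split_block(32, 32, "TT_H"))  # [(32, 8), (32, 16), (32, 8)]
```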
  • Block A may be divided into four sub-blocks A0, A1, A2, A3 by QT.
  • the sub block A1 may be further divided into four sub blocks B0, B1, B2, and B3 by QT.
  • Block B3 which is no longer divided by QT, may be divided into vertical BT (C0, C1) or horizontal BT (D0, D1). Like the block C0, each subblock may be further recursively divided into the form of horizontal BT (E0, E1) or vertical BT (F0, F1).
  • Block B3 which is no longer divided by QT, may be divided into vertical TT (C0, C1, C2) or horizontal TT (D0, D1, D2). Like block C1, each subblock may be further recursively divided into a form of horizontal TT (E0, E1, E2) or vertical TT (F0, F1, F2).
  • Block B3, which is no longer divided by QT, may be divided into vertical AT (C0, C1) or horizontal AT (D0, D1). Like block C1, each subblock may be further recursively divided in the form of horizontal AT (E0, E1) or vertical AT (F0, F1).
  • BT, TT, AT splitting can be used together to split.
  • a sub block divided by BT may be divided by TT or AT.
  • the sub-block divided by TT can be divided by BT or AT.
  • a sub block divided by AT may be divided by BT or TT.
  • For example, after horizontal BT division, each sub-block may be divided by vertical BT; or after vertical BT division, each sub-block may be divided by horizontal BT. In this case, the division order differs, but the shape of the final partition is the same.
  • When a block is divided, it may be searched in a determined order, for example from left to right and from top to bottom. Searching a block means the order of determining whether each divided sub-block is further divided, the coding order of each sub-block when it is no longer divided, or the search order when a sub-block refers to information of another neighboring block.
  • the transform may be performed for each processing unit (or transform block) divided by a division structure such as those illustrated in FIGS. 5A to 5D, and in particular, the transform matrix may be applied separately in the row direction and the column direction.
  • different transform types may be used depending on the length of the processing unit (or transform block) in the row direction or the column direction.
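  • The sketch below illustrates such a separable, length-dependent transform in NumPy. The kernel-selection rule (DST-7 for short sides, DCT-2 otherwise) is an assumption for illustration, not the patent's mapping:

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-2 basis; row k is the k-th frequency."""
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def dst7_matrix(n):
    """Orthonormal DST-7 basis; row k is the k-th frequency."""
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.sqrt(4.0 / (2 * n + 1)) * np.sin(np.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))

def kernel_for(length):
    # Assumed rule for illustration only: DST-7 for short sides, DCT-2 otherwise.
    return dst7_matrix(length) if length <= 4 else dct2_matrix(length)

def forward_transform(residual):
    h, w = residual.shape
    col_t, row_t = kernel_for(h), kernel_for(w)
    return col_t @ residual @ row_t.T   # column pass, then row pass (separable)

print(forward_transform(np.random.randn(8, 4)).shape)  # (8, 4)
```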
  • FIG. 6 is a schematic block diagram of the transform and quantization unit 120/130 and the inverse quantization and inverse transform unit 140/150 in the encoding apparatus 100 of FIG. 2, and FIG. 7 shows a schematic block diagram of the inverse quantization and inverse transform unit 220/230 in the decoding apparatus 200.
  • the transform and quantization unit 120/130 may include a primary transform unit 121, a secondary transform unit 122, and a quantization unit 130.
  • the inverse quantization and inverse transform unit 140/150 may include an inverse quantization unit 140, an inverse secondary transform unit 151, and an inverse primary transform unit 152.
  • the inverse quantization and inverse transform unit 220/230 of the decoding apparatus 200 may include an inverse quantization unit 220, an inverse secondary transform unit 231, and an inverse primary transform unit 232.
  • when performing the transform, the transform may be performed through a plurality of stages.
  • two stages of a primary transform and a secondary transform may be applied, or more transformation steps may be used according to an algorithm.
  • the primary transform may be referred to as a core transform.
  • the primary transform unit 121 may apply a primary transform to the residual signal, where the primary transform may be defined as a table at the encoder and / or the decoder.
  • the secondary transform unit 122 may apply a secondary transform on the primary transformed signal, where the secondary transform may be defined as a table at the encoder and / or the decoder.
  • a non-separable secondary transform (NSST) may be conditionally applied as a secondary transform.
  • NSST is applied only to intra prediction blocks and may have a transform set applicable to each prediction mode group.
  • the prediction mode group may be set based on symmetry with respect to the prediction direction. For example, since prediction mode 52 and prediction mode 16 are symmetric with respect to prediction mode 34 (the diagonal direction), they may form one group and the same transform set may be applied to them. When the transform for prediction mode 52 is applied, the input data is transposed before the transform is applied, because prediction mode 52 shares the transform set of prediction mode 16.
  • In one embodiment, each group has a transform set, and the transform set may be composed of two transforms.
  • In another embodiment, three transforms may be configured per transform set.
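  • A hedged sketch of this symmetry-based grouping (the mode ranges and return convention are illustrative assumptions; the actual transform-set tables are not reproduced here):

```python
def nsst_set_for_mode(intra_mode):
    """Map an intra mode (0..66) to (set_id, transpose_input)."""
    if intra_mode <= 1:           # planar / DC: non-directional modes
        return intra_mode, False
    if intra_mode <= 34:          # modes 2..34 use their own set directly
        return intra_mode, False
    # Modes 35..66 reuse the set of the mode mirrored about 34 and
    # transpose the input before the secondary transform is applied.
    return 68 - intra_mode, True

print(nsst_set_for_mode(16))  # (16, False)
print(nsst_set_for_mode(52))  # (16, True): shares prediction mode 16's set
```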
  • the quantization unit 130 may perform quantization on the secondary-transformed signal.
  • the inverse quantization and inverse transform unit 140/150 performs the above-described process in reverse, and redundant description thereof will be omitted.
  • FIG. 7 shows a schematic block diagram of an inverse quantization and inverse transform unit 220/230 in the decoding apparatus 200.
  • the inverse quantization and inverse transform units 220 and 230 may include an inverse quantization unit 220, an inverse secondary transform unit 231, and an inverse primary transform unit. 232 may include.
  • the inverse quantization unit 220 obtains a transform coefficient from the entropy decoded signal using the quantization step size information.
  • the inverse secondary transform unit 231 performs an inverse secondary transform on the transform coefficients.
  • the inverse secondary transform indicates an inverse transform of the secondary transform described with reference to FIG. 6.
  • the inverse primary transform unit 232 performs inverse primary transform on the inverse secondary transformed signal (or block) and obtains a residual signal.
  • the inverse primary transform indicates an inverse transform of the primary transform described with reference to FIG. 6.
  • FIG. 8 is a flowchart illustrating a process of performing adaptive multiple transform (AMT).
  • AMT adaptive multiple transform
  • a combination of transforms may be constructed from a mixture of separable and non-separable transforms.
  • when a non-separable transform is selected, the row/column transform selection or the horizontal/vertical direction selection is unnecessary, and the transform combinations of Table 4 may be used only when a separable transform is selected.
  • the schemes proposed in this specification may be applied to either the primary transform or the secondary transform. That is, there is no restriction that they should be applied to only one of the two, and both may be applied.
  • the primary transform may mean a transform applied first to the residual block
  • the secondary transform may mean a transform for applying the transform to a block generated as a result of the primary transform.
  • the encoding apparatus 100 may determine a transform group corresponding to the current block (S805).
  • the transform group may mean the transform group of Table 4, but the present invention is not limited thereto and may be configured with other transform combinations.
  • the encoding apparatus 100 may perform a transform for each candidate transform combination available in the transform group (S810). As a result, the encoding apparatus 100 may determine or select the transform combination having the lowest rate-distortion (RD) cost (S815). The encoding apparatus 100 may encode the transform combination index corresponding to the selected transform combination (S820).
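  • A rough encoder-side sketch of steps S805 to S820 (the transform, cost, and group objects are placeholders, not the patent's definitions):

```python
def choose_transform_combination(block, transform_group, transform, rd_cost):
    """S805 has already selected `transform_group` for the current block."""
    best_index, best_cost = None, float("inf")
    for index, (h_kernel, v_kernel) in enumerate(transform_group):
        coeffs = transform(block, h_kernel, v_kernel)  # S810: forward transform
        cost = rd_cost(block, coeffs)                  # rate-distortion cost
        if cost < best_cost:                           # S815: keep the minimum
            best_index, best_cost = index, cost
    return best_index                                  # S820: encode this index
```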
  • FIG. 9 is a flowchart illustrating a decoding process in which AMT is performed.
  • the decoding apparatus 200 may determine a transform group for the current block (S905).
  • the decoding apparatus 200 may parse the transform combination index, where the transform combination index may correspond to any one of a plurality of transform combinations in the transform group (S910).
  • the decoding apparatus 200 may derive the transform combination corresponding to the transform combination index (S915).
  • the transform combination may mean the transform combination described in Table 4, but the present invention is not limited thereto; a configuration with other transform combinations is also possible.
  • the decoding apparatus 200 may perform an inverse transform on the current block based on the transform combination (S920). If the transform combination consists of a row transform and a column transform, the column transform may be applied first and then the row transform. However, the present invention is not limited thereto; the order may be reversed, and when a non-separable transform is used, the non-separable transform may be applied directly.
  • the process of determining the transform group and the process of parsing the transform combination index may be performed at the same time.
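  • The decoder-side counterpart of steps S905 to S920 might look like the following sketch (parsing and kernel helpers are assumed placeholders; the pass order follows the column-then-row reading above):

```python
def decode_transform(bitstream, coeffs, transform_group, inverse_1d):
    """Hedged sketch of S910-S920 for a separable transform combination."""
    index = bitstream.parse_transform_combination_index()  # S910
    v_kernel, h_kernel = transform_group[index]            # S915
    tmp = inverse_1d(coeffs, v_kernel, axis=0)             # column pass first
    return inverse_1d(tmp, h_kernel, axis=1)               # then row pass (S920)
```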
  • Example 1 Reduced secondary transform (RST) that can be applied to a 4x4 block
  • a non-separable transform that can be applied to one 4x4 block is a 16x16 transform. That is, when the data elements constituting the 4x4 block are arranged in row-first or column-first order, a 16x1 vector is formed, and the corresponding non-separable transform may be applied to this vector.
  • the forward 16x16 transform consists of 16 row-wise transform basis vectors. Taking the inner product of the 16x1 vector with each transform basis vector yields the transform coefficient for that basis vector. The process of obtaining the transform coefficients for all 16 basis vectors is equivalent to multiplying the 16x16 non-separable transform matrix by the input 16x1 vector.
  • the transform coefficients obtained by the matrix product form a 16x1 vector, and the statistical characteristics may differ for each transform coefficient. For example, when the 16x1 transform coefficient vector is composed of the 0th to 15th elements, the variance of the 0th element may be greater than the variance of the 15th element. In other words, the earlier an element appears, the greater its variance value tends to be.
  • Applying the inverse 16x16 non-separation transform from the 16x1 transform coefficients can restore the original 4x4 block signal (when ignoring effects such as quantization or integer calculations).
  • since the forward 16x16 non-separable transform is an orthonormal transform, the inverse 16x16 transform can be obtained by transposing the matrix of the forward 16x16 transform. The inverse 16x16 non-separable transform matrix is simply multiplied by the 16x1 transform coefficient vector to obtain 16x1 vector data, and the 4x4 block signal is restored by arranging the data in the row-first or column-first order applied initially.
  • elements constituting the 16x1 transform coefficient vector may have different statistical characteristics.
  • a signal fairly close to the original signal can therefore be restored by using only some of the transform coefficients that appear first, without using all of the transform coefficients. For example, suppose the inverse 16x16 non-separable transform consists of 16 column basis vectors; leaving only L column basis vectors forms a 16xL matrix, and only the L more important transform coefficients (the Lx1 vector that appears first, as in the previous example) are kept. Multiplying the 16xL matrix by the Lx1 vector can restore the original 16x1 input vector data with little error.
  • At the forward side, the Lx1 transform coefficient vector is obtained instead of the 16x1 transform coefficient vector. That is, L significant transform coefficients can be obtained by selecting the L corresponding row-direction transform vectors from the forward 16x16 non-separable transform matrix to construct an Lx16 transform and multiplying it by the 16x1 input vector.
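  • A numerical sketch of this reduced transform with L = 8, using a random orthonormal 16x16 matrix as a stand-in for the actual non-separable kernel (which this text does not reproduce):

```python
import numpy as np

rng = np.random.default_rng(0)
full_t, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # orthonormal 16x16 stand-in

L = 8
reduced_fwd = full_t[:L, :]    # Lx16: keep L row-direction basis vectors
reduced_inv = reduced_fwd.T    # 16xL: transpose of the kept rows

x = rng.standard_normal(16)    # a 4x4 block flattened row-first
coeffs = reduced_fwd @ x       # forward: only L coefficients are produced
x_hat = reduced_inv @ coeffs   # inverse: approximate reconstruction

# With a real RST kernel the first L coefficients carry most of the energy,
# so this error would be small; here the kernel is random, so it is not.
print(coeffs.shape, float(np.linalg.norm(x - x_hat)))
```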
  • FIG. 10 shows three forward scan orders for transform coefficients or transform coefficient blocks (4x4 blocks, coefficient groups (CGs)) applied in the HEVC standard: (a) a diagonal scan, (b) a horizontal scan, and (c) a vertical scan. Residual coding proceeds in the reverse of the scan order of (a), (b), or (c) (i.e., coding from position 16 to position 1). Since the three scan orders of (a), (b), and (c) are selected according to the intra prediction mode, the scan order for the L transform coefficients can likewise be determined according to the intra prediction mode.
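  • For reference, the three 4x4 scans of FIG. 10 can be generated as below; the diagonal scan follows HEVC's up-right diagonal pattern, with coordinates given as (row, col):

```python
def diagonal_scan(n=4):
    """HEVC-style up-right diagonal scan for an nxn coefficient group."""
    order = []
    for s in range(2 * n - 1):                    # anti-diagonals from top-left
        for row in range(min(s, n - 1), -1, -1):  # bottom-left to top-right
            col = s - row
            if col < n:
                order.append((row, col))
    return order

def horizontal_scan(n=4):
    return [(r, c) for r in range(n) for c in range(n)]

def vertical_scan(n=4):
    return [(r, c) for c in range(n) for r in range(n)]

print(diagonal_scan()[:6])  # [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
```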
  • FIGS. 11 and 12 illustrate embodiments to which the present invention is applied: FIG. 11 shows the positions of transform coefficients when a forward diagonal scan is applied with 4x4 RST applied to a 4x8 block, and FIG. 12 shows an example of merging the valid transform coefficients of two 4x4 blocks into one block.
  • when 4x4 RST is applied as described above, transform coefficients may be located as shown in FIG. 11. Only half of each 4x4 block may have transform coefficients, and the positions marked with X may be filled with a value of 0 by default. Accordingly, L transform coefficients are placed in each 4x4 block following the scan order shown in FIG. 10(a), the remaining (16-L) positions of each 4x4 block are filled with zeros, and residual coding (e.g., residual coding in HEVC) may be applied.
  • as shown in FIG. 12, the L transform coefficients arranged in two 4x4 blocks may be combined into one block.
  • in particular, when L = 8, the transform coefficients of the two 4x4 blocks completely fill one 4x4 block, and no transform coefficients remain in the other block.
  • accordingly, for the emptied block, a flag (coded_sub_block_flag) indicating whether residual coding is applied to that block may be coded as 0, as in HEVC.
  • the scheme for combining the positions of the transform coefficients of the two 4x4 blocks may vary. For example, the positions may be combined in any order, but the following method may also be applied.
  • the transform coefficients for the first 4x4 block may be arranged first, and then the transform coefficients for the second 4x4 block may be arranged after them; in other words, the two coefficient sets may be concatenated into one sequence. Naturally, the order may also be changed so that the coefficients for the second block come first.
  • values of 0 may be filled from the (L+1)-th to the 16th positions according to the transform coefficient scan order for each 4x4 block. Accordingly, if a non-zero value occurs at any of the (L+1)-th to 16th positions of either of the two 4x4 blocks, it can be concluded that 4x4 RST is not applied. If the 4x4 RST has a structure in which one transform from a prepared transform set is selected and applied, as in JEM NSST, the index specifying which transform to apply (named the NSST index in this document) may be signaled. Suppose a decoder obtains the NSST index through bitstream parsing and performs this parsing after residual decoding.
  • when 4x4 RST is applied to several 4x4 blocks in a specific region as shown in FIG. 11 (the same 4x4 RST may be applied to all of them, or different 4x4 RSTs may be applied), the (same or separate) 4x4 RST applied to all those 4x4 blocks may be specified through one NSST index. Since one NSST index determines the 4x4 RST for all the 4x4 blocks and whether it is applied, the decoding process can check whether a non-zero transform coefficient exists at the (L+1)-th to 16th positions of every 4x4 block, and can be configured not to code the NSST index if a non-zero transform coefficient is found at a disallowed position (the (L+1)-th to 16th positions) in even one 4x4 block.
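  • A hedged sketch of that parsing condition (the data layout and function name are assumptions):

```python
def may_parse_nsst_index(blocks_in_scan_order, L):
    """`blocks_in_scan_order`: one 16-entry coefficient list per 4x4 block,
    each already arranged in its scan order. Returns True only if every
    block has zeros at positions L+1..16, i.e. RST may have been applied."""
    for coeffs in blocks_in_scan_order:
        if any(c != 0 for c in coeffs[L:]):
            return False   # disallowed position is non-zero: skip NSST index
    return True
```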
  • the NSST index may be signaled separately for the luminance block and the chrominance block; for the chrominance block, separate NSST indexes may be signaled for Cb and Cr, or one NSST index may be shared (signaled only once).
  • when one NSST index is shared, the 4x4 RST specified by the same NSST index is applied (the 4x4 RSTs for Cb and Cr may be the same, or they may have separate 4x4 RSTs even though the NSST index is the same).
  • in the shared case, whether a non-zero transform coefficient exists at the (L+1)-th to 16th positions is checked for all 4x4 blocks of both Cb and Cr, and if even one non-zero transform coefficient is found there, signaling of the NSST index may be omitted.
  • since whether 4x4 RST is applied is determined in advance, residual coding can be omitted for the positions that are certain to be filled with zero transform coefficients.
  • whether 4x4 RST is applied may be configured to be known through the NSST index value (for example, 4x4 RST is not applied when the NSST index is 0), or it may be signaled through a separate syntax element (for example, an NSST flag). For example, if the separate syntax element is an NSST flag, the NSST flag is parsed first to determine whether 4x4 RST is applied, and if the NSST flag value is 1, residual coding can be omitted for the positions where no valid transform coefficient can exist, as described above.
  • otherwise, it can be configured that the NSST index is not coded and 4x4 RST is not applied.
  • since the positions marked with X in FIG. 11 cannot have valid transform coefficients when 4x4 RST is applied (e.g., they may be filled with zero values), coding of the NSST index can be omitted when the last non-zero coefficient is located in the region marked with X. If the last non-zero coefficient is not located in the region marked with X, coding of the NSST index may be performed.
  • the remaining residual coding part may be processed in the following two ways.
  • when it is known that 4x4 RST is applied, no valid transform coefficient can exist at a specific position or in a specific 4x4 block (for example, the X positions of FIG. 11), so that position or block can be filled with 0 by default and residual coding for it can be skipped. For example, upon reaching a position marked with X in FIG. 11, coding of sig_coeff_flag (a flag, present in HEVC, indicating whether a non-zero coefficient exists at that position) can be omitted; and when the transform coefficients of two blocks are combined as in FIG. 12, coding of coded_sub_block_flag (present in HEVC) for the 4x4 block emptied to zero can be omitted, and its value can be derived as 0 and the block filled with zeros without separate coding.
  • the method of determining NSST index coding by comparison with a threshold may be applied differently to luminance and chrominance. For example, different Tx and Ty values may be applied to luminance and chrominance, or the threshold may be applied to luminance and not to chrominance (or vice versa).
  • the two methods described above (omitting NSST index coding when the last non-zero coefficient is located in an area where no valid transform coefficient exists, and omitting it when the X and Y coordinates of the last non-zero coefficient are each less than some threshold) may also be applied together. For example, the threshold check on the coordinates of the last non-zero coefficient position may be performed first, and then it may be checked whether the last non-zero coefficient is located in an area where no valid transform coefficient exists (the order may be changed).
  • The methods presented in this Example 4) can also be applied to 8x8 RST. That is, if the last non-zero coefficient is located in the top-left 8x8 region but outside its top-left 4x4 region, coding of the NSST index may be omitted; otherwise, the NSST index may be coded. In addition, if the X and Y coordinate values of the last non-zero coefficient position are both smaller than a certain threshold, coding of the NSST index may be omitted. Naturally, the two methods can be applied together, as in the sketch below.
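  • As a minimal sketch (function name and thresholds are illustrative, not from the patent text), the combined decision could be written as follows.

```cpp
// Hypothetical sketch combining the two checks of Example 4) for 8x8 RST:
// (1) omit the NSST index when both coordinates of the last non-zero
//     coefficient are below chosen thresholds;
// (2) omit it when that coefficient lies inside the top-left 8x8 region but
//     outside its top-left 4x4 sub-region, where no valid RST coefficient
//     can exist.
bool shouldCodeNsstIndex(int lastX, int lastY, int thrX, int thrY)
{
    if (lastX < thrX && lastY < thrY)
        return false;  // check (1): coordinates below the thresholds

    bool inTopLeft8x8 = (lastX < 8 && lastY < 8);
    bool inTopLeft4x4 = (lastX < 4 && lastY < 4);
    if (inTopLeft8x8 && !inTopLeft4x4)
        return false;  // check (2): only zero-filled positions there under RST

    return true;       // otherwise the NSST index is coded
}
```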
  • The NSST index coding and residual coding schemes may be applied differently to luminance and chrominance.
  • For example, the luminance may follow the scheme described in Example 4), while the scheme of Example 3) is applied to the chrominance.
  • The conditional NSST index coding described in Example 3) or Example 4) may be applied to the luminance and not to the chrominance, or vice versa (applied to the chrominance and not to the luminance).
  • Table 1 shows three examples of a reduced adaptive multiple transform (RAMT) using a predefined R value for each primary transform size.
  • Example 7: Reduced adaptive (or explicit) multiple transform based on the primary transform
  • The reduced transform factor R may be determined depending on the corresponding primary transform. For example, if the primary transform is DCT-2, its computation is relatively simple compared to the other primary transforms, so the loss in coding performance can be minimized by not using the reduced transform for small blocks or by using a relatively large R value. For example, for DCT-2 and the other transforms, different reduced transform factors may be used as shown in Table 2 (a sketch of this selection follows).
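  • As a minimal illustration, the selection could look as follows; the names and the concrete R values are placeholders, since Table 2 defines the actual mapping.

```cpp
// Hypothetical sketch: choose the reduced transform factor R depending on the
// primary transform. DCT-2 is computationally cheap, so the reduction is
// skipped for small blocks and kept mild otherwise; other primary transforms
// use a stronger reduction. All values below are illustrative placeholders.
enum class PrimaryTransform { DCT2, DST7, DCT8 };

int reducedFactor(PrimaryTransform t, int blockSize)
{
    if (t == PrimaryTransform::DCT2) {
        if (blockSize <= 8)
            return blockSize;       // no reduction for small blocks
        return blockSize * 3 / 4;   // relatively large R (placeholder)
    }
    return blockSize / 2;           // stronger reduction for DST-7 / DCT-8
}
```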
  • Example 8: EMT (AMT) core transform mapping depending on the intra prediction mode
  • One of the four EMT index combinations (0, 1, 2, 3) is selected through the 2-bit EMT_TU_index, and the corresponding primary transform is selected based on the given EMT index.
  • Table 3 is an example of a mapping table for selecting a corresponding primary transform for horizontal and vertical directions based on an EMT index value.
  • The present invention analyzes the statistics of the primary transforms selected according to the intra prediction mode and proposes a more efficient EMT core transform mapping based on those statistics.
  • Table 4 shows the distribution (%) of EMT_TU_index by intra prediction mode.
  • The Hor mode group represents the angular modes 2 through 33 when the JEM is based on the 67-mode configuration,
  • and the Ver mode group represents the angular modes 34 through 66.
  • Table 5 shows an example of using different mappings for the Hor mode group.
  • In this method of deriving the primary transform based on EMT_TU_index, a different mapping table is used depending on the intra prediction direction.
  • The present invention proposes a method in which the available EMT_TU_index values for each intra prediction mode are not all the same but may be defined differently.
  • For some intra prediction modes, certain combinations occur with relatively low probability, and more efficient coding is possible by excluding them.
  • Table 6 specifies an example in which the available EMT_TU_index values depend on the intra prediction mode (a signaling sketch follows).
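  • As a minimal sketch (the per-mode counts are placeholders; Table 6 defines the actual availability), restricting the available indices directly reduces the bits spent on EMT_TU_index:

```cpp
#include <cmath>

// Hypothetical availability: non-directional modes (planar = 0, DC = 1) keep
// all four EMT_TU_index combinations; angular modes keep a reduced set.
int numAvailableEmtIndices(int intraMode)
{
    return (intraMode <= 1) ? 4 : 2;  // placeholder counts
}

// Bits needed for a fixed-length binarization of the index: 2 bits when four
// values are available, 1 bit when two, none when only one remains.
int emtIndexBits(int intraMode)
{
    int n = numAvailableEmtIndices(intraMode);
    return (n <= 1) ? 0 : static_cast<int>(std::ceil(std::log2(n)));
}
```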
  • the context model is determined using the information of the intra prediction mode.
  • Table 8 shows some examples.
  • the intra prediction mode context modeling method specified in the present invention may be considered along with other factors such as block size.
  • The AMT term is redefined as MTS.
  • The relevant syntax and semantics in VVC (Versatile Video Coding, JVET-K1001-v4.docx) are summarized as in Table 9 below.
  • the residual coding syntax is shown in Tables 13 and 14 below.
  • FIG. 13 is a flowchart illustrating an inverse transformation process based on an MTS according to an embodiment of the present invention.
  • the decoding apparatus 200 to which the present invention is applied may acquire sps_mts_intra_enabled_flag or sps_mts_inter_enabled_flag (S1305).
  • sps_mts_intra_enabled_flag indicates whether cu_mts_flag exists in the residual coding syntax of the intra coding unit.
  • When sps_mts_intra_enabled_flag is 0, cu_mts_flag is not present in the residual coding syntax of the intra coding unit.
  • When sps_mts_intra_enabled_flag is 1, cu_mts_flag is present in the residual coding syntax of the intra coding unit.
  • sps_mts_inter_enabled_flag indicates whether cu_mts_flag exists in the residual coding syntax of the inter coding unit.
  • mts_idx indicates which transform kernel is applied to luma residual samples along the horizontal and / or vertical direction of the current transform block.
  • For mts_idx, for example, at least one of the embodiments described herein may be applied.
  • The decoding apparatus 200 may derive a transform kernel corresponding to mts_idx (S1320).
  • A transform kernel corresponding to mts_idx may be defined separately as a horizontal transform and a vertical transform (a derivation sketch follows).
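  • As a minimal sketch (assuming cu_mts_flag and mts_idx were already parsed in the order described above; the index-to-kernel values are placeholders, since Table 16 defines the actual mapping):

```cpp
// 0 stands here for the default kernel; 1 and 2 stand for DST-7/DCT-8 style
// kernels. The horizontal and vertical types are derived separately.
struct MtsKernels { int trTypeHor; int trTypeVer; };

MtsKernels deriveMtsKernels(bool cuMtsFlag, int mtsIdx)
{
    if (!cuMtsFlag)
        return {0, 0};  // MTS not applied: default transform in both directions

    static const MtsKernels kMap[4] = {{1, 1}, {2, 1}, {1, 2}, {2, 2}};
    return kMap[mtsIdx & 3];  // one (horizontal, vertical) pair per mts_idx
}
```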
  • the decoding apparatus 200 may configure an MTS candidate based on the intra prediction mode of the current block.
  • the decoding flowchart of FIG. 10 may further include configuring the MTS candidate.
  • the decoding apparatus 200 may determine the MTS candidate applied to the current block by using mts_idx among the configured MTS candidates.
  • Different transform kernels may be applied to the horizontal transform and the vertical transform.
  • However, the present invention is not limited thereto, and the same transform kernel may be applied to the horizontal transform and the vertical transform.
  • the decoding apparatus 200 may perform inverse transformation based on the transform kernel.
  • MTS may also be expressed as AMT or EMT.
  • mts_idx may also be expressed as AMT_idx, EMT_idx, AMT_TU_idx, or EMT_TU_idx, and the present invention is not limited thereto.
  • FIG. 14 is a block diagram of an apparatus for performing decoding based on an MTS according to an embodiment of the present invention.
  • the decoding apparatus 200 to which the present invention is applied may include a sequence parameter obtaining unit 1405, an MTS flag obtaining unit 1410, an MTS index obtaining unit 1415, and a transform kernel deriving unit 1420.
  • the sequence parameter obtainer 1405 may acquire sps_mts_intra_enabled_flag or sps_mts_inter_enabled_flag.
  • sps_mts_intra_enabled_flag indicates whether cu_mts_flag exists in the residual coding syntax of the intra coding unit
  • sps_mts_inter_enabled_flag indicates whether cu_mts_flag exists in the residual coding syntax of the inter coding unit.
  • the description associated with FIG. 10 may be applied.
  • cu_mts_flag indicates whether the MTS is applied to the residual sample of the luma transform block. As a specific example, the description associated with FIG. 10 may be applied.
  • mts_idx indicates which transform kernel is applied to luma residual samples along the horizontal and / or vertical direction of the current transform block. As a specific example, the description of FIG. 10 may be applied.
  • The transform kernel deriving unit 1420 may derive the transform kernel corresponding to mts_idx. In addition, the decoding apparatus 200 may perform the inverse transform based on the derived transform kernel.
  • The transform process for the scaled transform coefficients may be as shown in Table 15 below.
  • The horizontal transform (trTypeHor) and the vertical transform (trTypeVer) according to the MTS index (mts_idx) and the prediction mode (CuPredMode) of the current CU may be set as shown in Table 16 below.
  • For example, two MTS candidates may be used for the directional modes and four MTS candidates for the non-directional modes, as follows (a code sketch follows this list).
  • For the non-directional modes: DST-7 is used for both the horizontal and vertical transforms (MTS index 0); DST-7 is used for the vertical transform and DCT-8 for the horizontal transform (MTS index 1); DCT-8 is used for the vertical transform and DST-7 for the horizontal transform (MTS index 2); DCT-8 is used for both the horizontal and vertical transforms (MTS index 3).
  • For the horizontal group modes: DST-7 is used for both the horizontal and vertical transforms (MTS index 0); DCT-8 is used for the vertical transform and DST-7 for the horizontal transform (MTS index 1).
  • For the vertical group modes: DST-7 is used for both the horizontal and vertical transforms (MTS index 0); DST-7 is used for the vertical transform and DCT-8 for the horizontal transform (MTS index 1).
  • Here, the horizontal group modes include intra prediction modes 2 to 34, and the vertical group modes include intra prediction modes 35 to 66.
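  • A minimal sketch of these candidate sets (kernel names symbolic; the grouping and per-index pairs follow the list above):

```cpp
enum class Kern { DST7, DCT8 };
struct Pair { Kern hor; Kern ver; };

// Candidate tables: four pairs for the non-directional modes, two pairs each
// for the horizontal (modes 2..34) and vertical (modes 35..66) groups.
Pair mtsCandidate(int intraMode, int mtsIdx)
{
    static const Pair nonDir[4] = {{Kern::DST7, Kern::DST7}, {Kern::DCT8, Kern::DST7},
                                   {Kern::DST7, Kern::DCT8}, {Kern::DCT8, Kern::DCT8}};
    static const Pair horGrp[2] = {{Kern::DST7, Kern::DST7}, {Kern::DST7, Kern::DCT8}};
    static const Pair verGrp[2] = {{Kern::DST7, Kern::DST7}, {Kern::DCT8, Kern::DST7}};

    if (intraMode >= 2 && intraMode <= 34)  return horGrp[mtsIdx & 1];
    if (intraMode >= 35 && intraMode <= 66) return verGrp[mtsIdx & 1];
    return nonDir[mtsIdx & 3];  // non-directional (planar/DC)
}
```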
  • Table 17 below shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for a non-angular mode.
  • Table 18 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for the horizontal group mode.
  • Table 19 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for the vertical group mode.
  • DST-7 is used for both the horizontal and vertical transforms.
  • DST-7 is used for the vertical transform and DCT-8 is used for the horizontal transform.
  • DCT-8 is used for the vertical transform and DST-7 is used for the horizontal transform.
  • Table 20 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to a prediction mode (CuPredMode) and an MTS index (mts_idx).
  • three MTS candidates are used for all intra prediction modes.
  • DST-7 is used for both the horizontal and vertical transforms.
  • DST-7 is used for the vertical transform and DCT-8 is used for the horizontal transform.
  • DCT-8 is used for the vertical transform and DST-7 is used for the horizontal transform.
  • two MTS candidates are used for directional prediction modes and three MTS candidates for non-directional prediction modes.
  • For the non-directional prediction modes: when the MTS index is 0, DST-7 is used for both the horizontal and vertical transforms; when the MTS index is 1, DST-7 is used for the vertical transform and DCT-8 for the horizontal transform; when the MTS index is 2, DCT-8 is used for the vertical transform and DST-7 for the horizontal transform.
  • For the horizontal group modes: when the MTS index is 0, DST-7 is used for both transforms; when the MTS index is 1, DCT-8 is used for the vertical transform and DST-7 for the horizontal transform.
  • For the vertical group modes: when the MTS index is 0, DST-7 is used for both transforms; when the MTS index is 1, DST-7 is used for the vertical transform and DCT-8 for the horizontal transform.
  • Here, the horizontal group modes include intra prediction modes 2 to 34, and the vertical group modes include intra prediction modes 35 to 66.
  • Table 21 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for non-directional modes.
  • Table 22 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for the horizontal group mode.
  • Table 23 shows a horizontal transform type (trTypeHor) and a vertical transform type (trTypeVer) according to the MTS index (mts_idx) for the vertical group mode.
  • One MTS candidate (e.g., DST-7) may be used for all modes.
  • In this case, the MTS index (mts_idx[x][y]) is not required, and the MTS flag cu_mts_flag[x][y] may instead be used, as shown in Table 24, to indicate the transform type.
  • The transform process in VTM 2.0 can be summarized as shown in Table 25 below.
  • FIG. 15 shows an example of a decoding flowchart for performing a conversion process according to an embodiment of the present invention.
  • the decoding apparatus 200 to which the present invention is applied may check the transform size nTbS (S1505).
  • the transform size nTbS may be a variable representing a horizontal sample size of scaled transform coefficients.
  • the decoding apparatus 200 may check the transform kernel type trType (S1510).
  • the transform kernel type trType may be a variable indicating the type of the transform kernel, and various embodiments of the present disclosure may be applied.
  • The decoding apparatus 200 may perform transform matrix multiplication based on at least one of the transform size nTbS or the transform kernel type (S1515). For example, if the transform kernel type is 0, (Equation 15-1) of Table 25 may be applied, and if the transform kernel type is 1 or 2, (Equation 15-2) of Table 25 may be applied.
  • the transform matrix of (Equation 15-3) in Table 25 may be applied when performing the transform matrix multiplication.
  • the transform matrix of (Equation 15-4) in Table 25 may be applied.
  • the transform matrix shown in (Equation 15-3) of Table 25 may be applied when performing the transform matrix multiplication.
  • the predefined transform matrix may be applied.
  • the decoding apparatus 200 may derive the transform sample based on the transform matrix multiplication (S1520).
  • the transformation matrix contains a predefined number of specific coefficients, which are repeated in several rows in the transformation matrix.
  • 4x4 DST-7 is defined as Equation 1 below.
  • the first row contains four coefficients 117, 219, 296, and 336.
  • The same four coefficients are repeated in the remaining rows.
  • a coefficient approximation procedure for eliminating multiplication operations is introduced.
  • Equation 2 represents the transform matrix multiplication, where transMatrix[i][j] and x[j] denote the transform coefficients and the input values, respectively (a direct code rendering follows).
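  • As a sketch, with a fixed 4-point size for brevity:

```cpp
#include <cstdint>

// Equation 2 as code: each output value is the inner product of one matrix row
// with the input vector, costing one multiplication per coefficient before any
// approximation is applied.
void transformMultiply4(const int transMatrix[4][4], const int x[4], int64_t y[4])
{
    for (int i = 0; i < 4; ++i) {
        int64_t acc = 0;
        for (int j = 0; j < 4; ++j)
            acc += static_cast<int64_t>(transMatrix[i][j]) * x[j];
        y[i] = acc;
    }
}
```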
  • The multiplication between a transform coefficient and an input value can be eliminated by efficient means (i.e., shift and add operations) if the transform coefficient is expressed as a polynomial in powers of two.
  • For example, if a transform coefficient is 65, it can be approximated by 64, so that 64 * x[j] is computed instead of 65 * x[j].
  • The multiplication can then be eliminated because 64 * x[j] is equivalent to x[j] << 6.
  • Similarly, the transform coefficient 280 may be approximated by 272 (= 2^8 + 2^4).
  • 280 * x[j] can then be replaced by 272 * x[j], which is equivalent to (x[j] << 8) + (x[j] << 4).
  • All transform coefficient values can be approximated by a combination of powers of two, eliminating the multiplication process in an efficient manner (i.e., using a small number of terms that minimizes the approximation error); a minimal sketch follows.
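  • A minimal sketch of the two replacements above (shift-and-add only):

```cpp
#include <cstdint>

int64_t times64(int64_t x)  { return x << 6; }               // 65*x approximated by 64*x
int64_t times272(int64_t x) { return (x << 8) + (x << 4); }  // 280*x approximated by 272*x
```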
  • the difference between the original transform coefficients and the approximated value needs to be minimized to reduce coding performance loss.
  • Transform coefficient approximation can reduce energy-compaction efficiency because it can damage the orthogonality of the basis vectors.
  • the difference between the original value and the approximate value should be minimized to maintain orthogonality (to maintain coding performance).
  • Table 26 shows how the approximation error (Diff in Table 26) changes with the number of approximation terms. For example, with a two-term approximation the maximum error is 30, whereas with a three-term approximation the maximum error can be reduced to 3. A search sketch follows.
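  • As an illustration of that trade-off, a hypothetical greedy search (ours, not the patent's procedure; its tables list the chosen values) could look as follows.

```cpp
#include <cstdlib>
#include <initializer_list>
#include <vector>

// Greedily pick up to maxTerms signed powers of two whose sum approximates
// coeff; the leftover magnitude is the approximation error (Diff).
std::vector<int> approximate(int coeff, int maxTerms, int* errOut)
{
    std::vector<int> terms;
    int remainder = coeff;
    for (int t = 0; t < maxTerms && remainder != 0; ++t) {
        int best = 0;
        for (int s = 0; s < 12; ++s)
            for (int sign : {+1, -1}) {
                int cand = sign * (1 << s);
                if (std::abs(remainder - cand) < std::abs(remainder - best))
                    best = cand;
            }
        terms.push_back(best);
        remainder -= best;
    }
    if (errOut) *errOut = std::abs(remainder);
    return terms;
}
// e.g. approximate(280, 2, &e) yields {256, 16} with e == 8, while
// approximate(280, 3, &e) reaches {256, 16, 8} with e == 0.
```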
  • Equation 3 may be applied.
  • Eight distinct transform coefficients are used repeatedly in the 4x4 transform matrix, and an approximation based on up to three terms may be as shown in Table 27 below.
  • The 4x4 transformation matrix may be approximated as shown in Equation 4 below.
  • The transformation matrix when trType is 1 and nTbS is 4 is given by Equation 5 below.
  • Its approximated counterpart may be as shown in Equation 6 below.
  • The transformation matrix when trType is 1 and nTbS is 16 is given by Equation 7 below.
  • 16 transform coefficients are repeatedly used in the matrix, and an approximation based on up to three terms may be as shown in Table 29 below.
  • the 16x16 transform matrix may be approximated as shown in Equation 8 below.
  • In VTM 2.0, 32 distinct transform coefficients are used repeatedly in the 32x32 matrix, and an approximation based on up to three terms may be as shown in Table 29 below.
  • the 32x32 transformation matrix can be arranged as described above.
  • The DST-7 and DCT-8 coefficients are parameterized and summarized. As mentioned in the previous embodiments, the respective coefficients (parameters) may be approximated in the form of Equations 9 to 16 below.
  • FIG. 16 shows a flowchart for processing a video signal according to an embodiment to which the present invention is applied.
  • the flowchart of FIG. 16 may be performed by the decoding apparatus 200 or the inverse transform unit 230.
  • the decoding apparatus 200 confirms a transform index indicating a transform kernel for transforming the current block.
  • the decoding apparatus 200 determines a transform matrix corresponding to the transform index.
  • The components of the transform matrix are implemented by shift operations and additions of 1.
  • Specifically, each component of the transform matrix may be implemented as the sum of terms, each term being a 1 shifted left by some number of bits (i.e., a power of two).
  • The number of terms constituting each component of the transform matrix may be limited to three or fewer.
  • each of the components of the transformation matrix may be set to a value approximated within an allowable error range from DCT-4, DST-7, or DCT-8.
  • each of the components of the transformation matrix may be determined in consideration of the allowable error range and the number of terms.
  • The decoding apparatus 200 generates an array of residual samples by applying the transform matrix, whose coefficients are approximated by shift and addition operations, to the transform coefficients of the current block (a sketch follows).
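  • A minimal end-to-end sketch (the power-of-two decompositions below were computed here under a three-term limit; Table 27 lists the patent's own choices for the Equation 1 coefficients):

```cpp
#include <cstdint>
#include <vector>

struct Term { int shift; int sign; };      // contributes sign * (x << shift)
using ApproxCoeff = std::vector<Term>;

// Multiply by an approximated coefficient using shifts and adds only.
int64_t mulShiftAdd(const ApproxCoeff& c, int64_t x)
{
    int64_t acc = 0;
    for (const Term& t : c)
        acc += t.sign * (x << t.shift);
    return acc;
}

// Inner product of one approximated matrix row with the input coefficients,
// producing one residual sample (before any rounding/clipping stages).
int64_t rowDotShiftAdd(const std::vector<ApproxCoeff>& row, const int64_t* x)
{
    int64_t acc = 0;
    for (size_t j = 0; j < row.size(); ++j)
        acc += mulShiftAdd(row[j], x[j]);
    return acc;
}

// Illustrative decompositions for the first-row coefficients 117, 219, 296,
// 336 of Equation 1: 296 = 256+32+8 and 336 = 256+64+16 are exact; 117 is
// approximated by 116 = 128-8-4 and 219 by 220 = 256-32-4 (error 1 each).
const std::vector<ApproxCoeff> kRow0 = {
    {{7, +1}, {3, -1}, {2, -1}},   // ~117
    {{8, +1}, {5, -1}, {2, -1}},   // ~219
    {{8, +1}, {5, +1}, {3, +1}},   //  296
    {{8, +1}, {6, +1}, {4, +1}},   //  336
};
```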
  • FIG. 17 shows an example of a block diagram of an apparatus for processing a video signal as an embodiment to which the present invention is applied.
  • the video signal processing apparatus of FIG. 17 may correspond to the encoding apparatus of FIG. 1 or the decoding apparatus of FIG. 2.
  • the image processing apparatus 1700 for processing an image signal includes a memory 1720 storing an image signal and a processor 1710 coupled to the memory and processing the image signal.
  • the processor 1710 may be configured with at least one processing circuit for processing an image signal, and may process the image signal by executing instructions for encoding or decoding the image signal. That is, the processor 1710 may encode the original image data or decode the encoded image signal by executing the above-described encoding or decoding methods.
  • the processing method to which the present invention is applied can be produced in the form of a program executed by a computer, and can be stored in a computer-readable recording medium.
  • Multimedia data having a data structure according to the present invention can also be stored in a computer-readable recording medium.
  • the computer readable recording medium includes all kinds of storage devices and distributed storage devices in which computer readable data is stored.
  • The computer-readable recording medium may include, for example, a Blu-ray disc (BD), a universal serial bus (USB) storage, a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.
  • the computer-readable recording medium also includes media embodied in the form of a carrier wave (for example, transmission over the Internet).
  • the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.
  • Embodiments of the present invention may be implemented as a computer program product using program code, and the program code may be executed on a computer according to an embodiment of the present invention.
  • the program code may be stored on a carrier readable by a computer.
  • the embodiments described herein may be implemented and performed on a processor, microprocessor, controller, or chip.
  • the functional units shown in each drawing may be implemented and performed on a computer, processor, microprocessor, controller, or chip.
  • the decoder and encoder to which the present invention is applied include a multimedia broadcasting transmitting and receiving device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video chat device, a real time communication device such as video communication, a mobile streaming device, Storage media, camcorders, video on demand (VoD) service providing devices, OTT video (Over the top video) devices, Internet streaming service providing devices, three-dimensional (3D) video devices, video telephony video devices, and medical video devices. It can be used to process video signals or data signals.
  • the OTT video device may include a game console, a Blu-ray player, an Internet access TV, a home theater system, a smartphone, a tablet PC, a digital video recorder (DVR), and the like.
  • Embodiments according to the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof.
  • An embodiment of the present invention may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
  • An embodiment of the present invention may be implemented in the form of a module, procedure, function, or the like that performs the functions or operations described above.
  • the software code may be stored in memory and driven by the processor.
  • the memory may be located inside or outside the processor, and may exchange data with the processor by various known means.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

According to embodiments, the present invention relates to a method and a device for processing a video signal. A method for processing an image signal according to an embodiment of the present invention comprises the steps of: confirming a transform index indicating a transform kernel for transforming a current block; determining a transform matrix corresponding to the transform index; and generating an array of residual samples by applying the transform matrix to the transform coefficients of the current block, the components of the transform matrix being implemented by a shift operation and the addition of 1.
PCT/KR2019/011249 2018-09-02 2019-09-02 Procédé et dispositif de traitement d'un signal d'image WO2020046085A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201862726302P 2018-09-02 2018-09-02
US62/726,302 2018-09-02
US201862727528P 2018-09-05 2018-09-05
US62/727,528 2018-09-05

Publications (1)

Publication Number Publication Date
WO2020046085A1 true WO2020046085A1 (fr) 2020-03-05

Family

ID=69645328

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/011249 WO2020046085A1 (fr) 2018-09-02 2019-09-02 Procédé et dispositif de traitement d'un signal d'image

Country Status (1)

Country Link
WO (1) WO2020046085A1 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120098500A (ko) * 2011-02-25 2012-09-05 삼성전자주식회사 영상의 변환 및 역변환 방법, 및 이를 이용한 영상의 부호화 및 복호화 장치
KR20180063186A (ko) * 2015-09-29 2018-06-11 퀄컴 인코포레이티드 비디오 코딩을 위한 비-분리가능한 2 차 변환
KR20180085526A (ko) * 2017-01-19 2018-07-27 가온미디어 주식회사 효율적 변환을 처리하는 영상 복호화 및 부호화 방법

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIANLE CHEN: "Algorithm Description of Joint Exploration Test Model 1", JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, DOCUMENT: JVET-A1001, 21 October 2015 (2015-10-21) *
MEHDI SALEHIFAR: "CE 6.2.6: Reduced Secondary Transform (RST", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 , JVET-K0099, 18 July 2018 (2018-07-18) *

Similar Documents

Publication Publication Date Title
WO2020149648A1 (fr) Procédé et dispositif de codage d'images utilisant un drapeau de saut de transformation
WO2020009556A1 (fr) Procédé et dispositif de codage d'image à base de transformée
WO2020046091A1 (fr) Procédé de codage d'image basé sur une sélection d'une transformée multiple et dispositif associé
WO2020046092A1 (fr) Procédé de codage/décodage de signaux vidéo et dispositif pour celui-ci
WO2020046086A1 (fr) Procédé et appareil de traitement d'un signal d'image
WO2019203610A1 (fr) Procédé de traitement d'images et dispositif pour sa mise en œuvre
WO2020162690A1 (fr) Procédé et dispositif de traitement de signal vidéo en utilisant une transformée réduite
WO2020050651A1 (fr) Procédé de codage d'image basé sur une sélection multiple de transformée et dispositif associé
WO2020116961A1 (fr) Procédé de codage d'image basé sur une une transformée secondaire et dispositif associé
WO2019235822A1 (fr) Procédé et dispositif de traitement de signal vidéo à l'aide de prédiction de mouvement affine
WO2019216714A1 (fr) Procédé de traitement d'image fondé sur un mode de prédiction inter et appareil correspondant
WO2020171673A1 (fr) Procédé et appareil de traitement de signal vidéo pour prédiction intra
WO2020130661A1 (fr) Procédé de codage vidéo sur la base d'une transformée secondaire et dispositif associé
WO2019194463A1 (fr) Procédé de traitement d'image et appareil associé
WO2020180122A1 (fr) Codage de vidéo ou d'images sur la base d'un modèle à alf analysé conditionnellement et d'un modèle de remodelage
WO2020046084A1 (fr) Procédé et dispositif de traitement d'un signal d'image
WO2021040487A1 (fr) Procédé de décodage d'image pour codage de données résiduelles dans un système de codage d'image, et appareil associé
WO2020256482A1 (fr) Procédé de codage d'image basé sur une transformée et dispositif associé
WO2021025530A1 (fr) Procédé et appareil de codage d'images sur la base d'une transformation
WO2021006700A1 (fr) Procédé de décodage d'image faisant appel à un fanion servant à un procédé de codage résiduel dans un système de codage d'image, et dispositif associé
WO2020130581A1 (fr) Procédé permettant de coder une image sur la base d'une transformée secondaire et dispositif associé
WO2019245228A1 (fr) Procédé et dispositif de traitement de signal vidéo utilisant la prédiction affine de mouvement
WO2021158048A1 (fr) Procédé de décodage d'image associé à la signalisation d'un drapeau indiquant si tsrc est disponible, et dispositif associé
WO2021025526A1 (fr) Procédé de codage vidéo sur la base d'une transformée et dispositif associé
WO2020185005A1 (fr) Procédé de codage d'images basé sur une transformée et dispositif associé

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19856018

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19856018

Country of ref document: EP

Kind code of ref document: A1