WO2019066202A1 - Image processing method and apparatus therefor - Google Patents

Image processing method and apparatus therefor

Info

Publication number
WO2019066202A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
block
filtering
current
reference picture
Prior art date
Application number
PCT/KR2018/007094
Other languages
English (en)
Korean (ko)
Inventor
박내리
남정학
서정동
이재호
Original Assignee
엘지전자(주) (LG Electronics Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 엘지전자(주) (LG Electronics Inc.)
Publication of WO2019066202A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present invention relates to a still image or moving picture processing method and, more particularly, to a method of encoding/decoding a moving picture based on an inter prediction mode and a device supporting the same.
  • Compressive encoding refers to a series of signal processing techniques for transmitting digitized information over a communication line or for storing it in a form suitable for a storage medium.
  • Media such as video, image, and audio can be subject to compression coding.
  • a technique for performing compression coding on an image is referred to as video image compression.
  • Next-generation video content will feature high spatial resolution, high frame rates, and high dimensionality of scene representation. Processing such content will bring a tremendous increase in memory storage, memory access rate, and processing power requirements.
  • Prediction block filtering derives Wiener filter coefficients between the original block and the prediction block and applies them to the prediction block, increasing the accuracy of the prediction block and reducing the residual signal.
  • However, this method is not suitable for the latest video codecs because the filter coefficients must be transmitted on a block-by-block basis, which increases the amount of side information. A method of deriving the filter coefficients so as to reduce this side information has therefore been proposed.
  • In the APBF scheme, Wiener filter coefficients are derived between the reconstructed block of a neighboring block and the prediction block of that neighboring block, as substitutes for the original block and the prediction block in the encoding/decoding process, and the derived coefficients are then applied to the prediction block of the current block.
  • However, the APBF scheme is limited in that it uses filter coefficients derived from a neighboring block rather than from the current block itself.
  • According to an aspect of the present invention, an inter-prediction-based image processing method comprises: determining whether to apply bi-prediction-based filtering to a first prediction block and a second prediction block of a current block; applying the bi-prediction-based filtering to the first prediction block and the second prediction block if it is determined to apply the bi-prediction-based filtering; and generating a final prediction block of the current block using the filtered first prediction block and the filtered second prediction block, wherein the first prediction block is generated by performing inter prediction based on a list 0 reference picture, and the second prediction block is generated by performing inter prediction based on a list 1 reference picture.
  • Applying the bi-prediction-based filtering to the first and second prediction blocks may include: generating an average block using the first and second prediction blocks; deriving first Wiener filter coefficients that minimize a difference between the first prediction block and the average block; deriving second Wiener filter coefficients that minimize a difference between the second prediction block and the average block; filtering the first prediction block using the derived first Wiener filter coefficients; and filtering the second prediction block using the derived second Wiener filter coefficients.
  • Generating the average block may include: generating a first interpolation block based on the size of the first prediction block and the number of taps of the Wiener filter; generating a second interpolation block based on the size of the second prediction block and the number of taps of the Wiener filter; and generating the average block as the average value of the first interpolation block and the second interpolation block. A sketch of the overall filtering procedure is given below.
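  • As a minimal illustration of the filtering steps described above, the following Python sketch derives least-squares (Wiener) coefficients that map each prediction block toward the average block and then applies them. It is a sketch under stated assumptions (edge padding stands in for the interpolation blocks; all names are hypothetical), not the codec's actual implementation.

```python
import numpy as np

def derive_wiener_coeffs(pred, target, n=1):
    """Derive (2n+1)x(2n+1) coefficients minimizing ||target - filter(pred)||^2."""
    h, w = target.shape
    taps = 2 * n + 1
    padded = np.pad(pred, n, mode='edge')   # stand-in for the interpolation block
    rows, rhs = [], []
    for y in range(h):                      # one least-squares row per sample
        for x in range(w):
            rows.append(padded[y:y + taps, x:x + taps].ravel())
            rhs.append(target[y, x])
    c, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    return c.reshape(taps, taps)

def apply_filter(pred, coeffs):
    n = coeffs.shape[0] // 2
    padded = np.pad(pred, n, mode='edge')
    out = np.empty(pred.shape, dtype=float)
    for y in range(pred.shape[0]):
        for x in range(pred.shape[1]):
            out[y, x] = np.sum(padded[y:y + 2 * n + 1, x:x + 2 * n + 1] * coeffs)
    return out

p0 = np.random.rand(8, 8)        # prediction block from a list 0 reference picture
p1 = np.random.rand(8, 8)        # prediction block from a list 1 reference picture
avg = (p0 + p1) / 2.0            # average block
p0_f = apply_filter(p0, derive_wiener_coeffs(p0, avg))    # refine P0 toward avg
p1_f = apply_filter(p1, derive_wiener_coeffs(p1, avg))    # refine P1 toward avg
final_pred = (p0_f + p1_f) / 2.0  # final prediction block of the current block
```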
  • Determining whether to apply the bi-prediction-based filtering may include: obtaining a bi-prediction-based filtering flag when an AMVP mode is applied to the current block, wherein the AMVP mode is a mode for deriving a motion vector prediction value of the current block from a neighboring block of the current block; and determining to apply the bi-prediction-based filtering to the first and second prediction blocks when the bi-prediction-based filtering flag indicates that the bi-prediction-based filtering is applied to the current block.
  • Determining whether to apply the bi-prediction-based filtering may include: constructing a merge candidate list based on motion information of neighboring blocks of the current block when a merge mode is applied to the current block, wherein the merge mode is a mode for deriving the motion information of the current block using blocks spatially or temporally neighboring the current block; obtaining a merge index indicating the selected merge candidate; and determining whether to apply the bi-prediction-based filtering to the first and second prediction blocks based on the selected merge candidate indicated by the merge index.
  • If the selected merge candidate is a merge candidate generated by combining other merge candidates, a zero motion vector, a candidate derived in units of sub-blocks, or a temporal merge candidate, the bi-prediction-based filtering is not applied to the first prediction block and the second prediction block.
  • Conversely, if the bi-prediction-based filtering was applied to the selected merge candidate, the bi-prediction-based filtering is applied to the first prediction block and the second prediction block. A sketch of this decision logic is given below.
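  • The block-level decision described above can be summarized with the following sketch; the mode names, dictionary keys, and candidate types are hypothetical placeholders, not syntax from the patent or any standard.

```python
COMBINED, ZERO_MV, SUBBLOCK, TEMPORAL = 'combined', 'zero_mv', 'subblock', 'temporal'

def bpbf_enabled_for_merge(candidate):
    """Candidate types listed above rule BPBF out; spatial candidates inherit it."""
    if candidate['type'] in (COMBINED, ZERO_MV, SUBBLOCK, TEMPORAL):
        return False
    return candidate['bpbf_applied']    # inherit the merged block's decision

def bpbf_enabled(block):
    if not block['is_bi_predicted']:
        return False                    # BPBF needs both P0 and P1
    if block['mode'] == 'amvp':
        return block['bpbf_flag']       # explicitly signaled flag
    if block['mode'] == 'merge':
        return bpbf_enabled_for_merge(block['merge_candidate'])
    return False
```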
  • According to another aspect of the present invention, an inter-prediction-based image processing apparatus includes: a filtering determination unit that determines whether to apply bi-prediction-based filtering to a first prediction block and a second prediction block of a current block; a filtering unit that applies the bi-prediction-based filtering to each of the first and second prediction blocks if it is determined to apply the bi-prediction-based filtering; and a prediction block generation unit that generates a final prediction block of the current block using the filtered first prediction block and the filtered second prediction block, wherein the first prediction block is generated by performing inter prediction based on a list 0 reference picture, and the second prediction block is generated by performing inter prediction based on a list 1 reference picture.
  • By filtering the two prediction blocks of bi-prediction so that each becomes closer to the average block of the two prediction blocks, the accuracy of the prediction block can be increased, the amount of information in the residual signal can be reduced, and the coding efficiency can thereby be improved.
  • prediction performance and compression efficiency can be further improved by determining whether to apply the filtering in units of blocks or samples.
  • FIG. 1 is a schematic block diagram of an encoder in which still image or moving picture signal encoding is performed according to an embodiment of the present invention.
  • FIG. 2 is a schematic block diagram of a decoder in which still image or moving picture signal decoding is performed according to an embodiment of the present invention.
  • FIG. 3 is a diagram for explaining a division structure of a coding unit applicable to the present invention.
  • FIG. 4 is a diagram for explaining a prediction unit that can be applied to the present invention.
  • FIG. 5 is a diagram illustrating the direction of inter prediction, which is an embodiment to which the present invention can be applied.
  • Figure 6 illustrates integer and fractional sample locations for 1/4 sample interpolation as an embodiment to which the present invention may be applied.
  • Figure 7 illustrates the location of spatial candidates as an embodiment to which the present invention may be applied.
  • FIG. 8 is a diagram illustrating an inter prediction method according to an embodiment to which the present invention is applied.
  • FIG. 9 is a diagram illustrating a motion compensation process according to an embodiment to which the present invention can be applied.
  • FIG. 10 schematically illustrates a method of applying adaptive loop filtering, in accordance with an embodiment of the present invention.
  • FIG. 11 schematically shows a method of applying prediction block filtering and adaptive prediction block filtering according to an embodiment of the present invention.
  • FIG. 12 is a flowchart illustrating a motion compensation process in an inter-prediction mode for applying the bi-predictive block filtering according to an embodiment of the present invention.
  • FIG. 13 shows a flowchart of an inter prediction based image processing method according to an embodiment of the present invention.
  • FIG. 14 shows a block diagram of an inter prediction unit according to an embodiment of the present invention.
  • FIG. 15 shows a structure of a contents streaming system according to an embodiment of the present invention.
  • In this specification, 'block' or 'unit' means a unit in which encoding/decoding processes such as prediction, transform, and/or quantization are performed, and may be composed of a multi-dimensional array of samples (or pixels).
  • a 'block' or 'unit' may refer to a multidimensional array of samples for a luma component, or a multidimensional array of samples for a chroma component. It may also be collectively referred to as a multidimensional array of samples for a luma component and a multidimensional array of samples for a chroma component.
  • A 'block' or 'unit' may be interpreted as including a coding block (CB) indicating an array of samples to be encoded/decoded, a coding tree block (CTB) composed of a plurality of coding blocks, a prediction block (PB) indicating an array of samples to which the same prediction is applied, and a transform block (TB) indicating an array of samples to which the same transform is applied.
  • Further, a 'block' or 'unit' may be interpreted as including not only an array of samples for a luma component and/or a chroma component but also a syntax structure used in encoding/decoding it.
  • Here, the syntax structure means zero or more syntax elements existing in the bitstream in a specific order, and a syntax element means an element of data represented in the bitstream.
  • For example, a 'block' or 'unit' may be interpreted as including a coding unit (CU) comprising a coding block (CB) and a syntax structure used for encoding the coding block, a coding tree unit (CTU) composed of a plurality of coding units, a prediction unit (PU) comprising a prediction block (PB) and a syntax structure used for its prediction, and a transform unit (TU) comprising a transform block (TB) and a syntax structure used for its transform.
  • In addition, a 'block' or 'unit' is not necessarily limited to an array of samples (or pixels) in the form of a square or rectangle, and may also mean a polygonal array of samples (or pixels) having three or more vertices. In this case, it may be referred to as a polygon block or a polygon unit.
  • FIG. 1 is a schematic block diagram of an encoder in which still image or moving picture signal encoding is performed according to an embodiment of the present invention.
  • An encoder 100 includes an image divider 110, a subtractor 115, a transform unit 120, a quantization unit 130, an inverse quantization unit 140, an inverse transform unit 150, a filtering unit 160, a decoded picture buffer (DPB) 170, a prediction unit 180, and an entropy encoding unit 190.
  • the prediction unit 180 may include an inter prediction unit 181 and an intra prediction unit 182.
  • the image divider 110 divides an input video signal (or a picture or a frame) input to the encoder 100 into one or more blocks.
  • The subtractor 115 subtracts the prediction signal (or prediction block) output from the prediction unit 180 (i.e., the inter prediction unit 181 or the intra prediction unit 182) from the input video signal to generate a residual signal (or residual block).
  • The generated residual signal (or residual block) is transmitted to the transform unit 120.
  • The transform unit 120 generates transform coefficients by applying a transform technique (for example, DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), GBT (Graph-Based Transform), or KLT (Karhunen-Loève Transform)) to the residual signal (or residual block).
  • Here, the transform unit 120 may generate the transform coefficients by performing the transform using a transform technique determined according to the prediction mode applied to the residual block and the size of the residual block; a sketch of such a transform is given below.
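  • For illustration, the sketch below applies an orthonormal 2-D DCT-II, one of the transform techniques listed above, to a residual block; it is a minimal numpy sketch, not the transform specified by any particular codec.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] *= np.sqrt(1.0 / n)
    m[1:] *= np.sqrt(2.0 / n)
    return m

residual = np.random.randint(-16, 16, (8, 8)).astype(float)  # residual block
D = dct_matrix(8)
coeffs = D @ residual @ D.T        # forward 2-D transform (rows, then columns)
restored = D.T @ coeffs @ D        # inverse transform restores the block
assert np.allclose(restored, residual)
```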
  • the quantization unit 130 quantizes the transform coefficients and transmits the quantized transform coefficients to the entropy encoding unit 190.
  • the entropy encoding unit 190 entropy-codes the quantized signals and outputs them as a bitstream.
  • the quantized signal output from the quantization unit 130 may be used to generate a prediction signal.
  • the quantized signal can be reconstructed by applying inverse quantization and inverse transformation through the inverse quantization unit 140 and the inverse transform unit 150 in the loop.
  • a reconstructed signal (or reconstruction block) can be generated by adding the reconstructed difference signal to the prediction signal output from the inter prediction unit 181 or the intra prediction unit 182.
  • The filtering unit 160 applies filtering to the reconstructed signal and outputs it to a playback device or transmits it to the decoded picture buffer 170.
  • The filtered signal transmitted to the decoded picture buffer 170 may be used as a reference picture in the inter prediction unit 181. As described above, using a filtered picture as a reference picture in the inter-picture prediction mode can improve not only picture quality but also coding efficiency.
  • the decoded picture buffer 170 may store the filtered picture for use as a reference picture in the inter-prediction unit 181.
  • the inter-prediction unit 181 performs temporal prediction and / or spatial prediction to remove temporal redundancy and / or spatial redundancy with reference to a reconstructed picture.
  • Since the reference picture used for prediction is a transformed signal that has undergone block-wise quantization and inverse quantization during earlier encoding/decoding, blocking artifacts or ringing artifacts may exist.
  • Accordingly, the inter prediction unit 181 can apply a low-pass filter to interpolate the signal between pixels at the sub-pixel level in order to mitigate the performance degradation caused by such signal discontinuity or quantization.
  • Here, a sub-pixel means a virtual pixel generated by applying an interpolation filter, and an integer pixel means an actual pixel existing in the reconstructed picture.
  • As the interpolation method, linear interpolation, bi-linear interpolation, a Wiener filter, and the like can be applied.
  • the interpolation filter may be applied to a reconstructed picture to improve the accuracy of the prediction.
  • the inter prediction unit 181 may apply an interpolation filter to an integer pixel to generate an interpolation pixel, and may perform prediction using an interpolated block composed of interpolated pixels.
  • the intra predictor 182 predicts a current block by referring to samples in the vicinity of a block to be currently encoded.
  • The intra prediction unit 182 may perform the following procedure to perform intra prediction. First, reference samples necessary for generating a prediction signal are prepared. Then, the prediction signal (prediction block) is generated using the prepared reference samples. Thereafter, the prediction mode is encoded. At this time, the reference samples can be prepared through reference sample padding and/or reference sample filtering. Since the reference samples have undergone prediction and reconstruction processes, quantization errors may exist; therefore, a reference sample filtering process can be performed for each prediction mode used for intra prediction to reduce such errors. A simple sketch of this procedure is given below.
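  • A minimal sketch of this procedure for a simple DC mode, assuming all reference samples are available (reference sample padding and mode-dependent filtering are simplified away):

```python
import numpy as np

def intra_dc_predict(above, left, size):
    """Predict a size x size block as the mean of its neighboring reference samples."""
    refs = np.concatenate([above[:size], left[:size]])
    # Real codecs pad unavailable neighbors with the nearest available sample;
    # here all reference samples are assumed available.
    dc = int(np.round(refs.mean()))
    return np.full((size, size), dc, dtype=refs.dtype)

above = np.array([120, 122, 125, 127], dtype=np.int32)  # reconstructed row above
left = np.array([118, 119, 121, 124], dtype=np.int32)   # reconstructed column left
pred = intra_dc_predict(above, left, 4)                  # 4x4 block filled with 122
```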
  • The prediction signal (or prediction block) generated through the inter prediction unit 181 or the intra prediction unit 182 is used to generate a reconstructed signal (or reconstructed block), or to generate a residual signal (or residual block).
  • FIG. 2 is a schematic block diagram of a decoder in which still image or moving picture signal decoding is performed according to an embodiment of the present invention.
  • The decoder 200 includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an adder 235, a filtering unit 240, a decoded picture buffer (DPB) unit 250, and a prediction unit 260.
  • the prediction unit 260 may include an inter prediction unit 261 and an intra prediction unit 262.
  • the reconstructed video signal output through the decoder 200 may be reproduced through a reproducing apparatus.
  • The decoder 200 receives the signal (i.e., a bitstream) output from the encoder 100 of FIG. 1, and the received signal is entropy-decoded through the entropy decoding unit 210.
  • the inverse quantization unit 220 obtains a transform coefficient from the entropy-decoded signal using the quantization step size information.
  • the inverse transform unit 230 obtains a residual signal (or a difference block) by inverse transforming the transform coefficient by applying an inverse transform technique.
  • The adder 235 adds the obtained residual signal (or residual block) to the prediction signal (or prediction block) output from the prediction unit 260 (i.e., the inter prediction unit 261 or the intra prediction unit 262), whereby a reconstructed signal (or reconstructed block) is generated.
  • the filtering unit 240 applies filtering to a reconstructed signal (or a reconstructed block) and outputs it to a reproducing apparatus or transmits the reconstructed signal to a decoding picture buffer unit 250.
  • the filtered signal transmitted to the decoding picture buffer unit 250 may be used as a reference picture in the inter prediction unit 261.
  • The embodiments described for the filtering unit 160, the inter prediction unit 181, and the intra prediction unit 182 of the encoder 100 can be applied in the same way to the filtering unit 240, the inter prediction unit 261, and the intra prediction unit 262 of the decoder, respectively.
  • A block-based image compression method is used in still image or moving image compression techniques (for example, HEVC).
  • A block-based image compression method divides an image into specific block units, and can reduce memory usage and the amount of computation.
  • FIG. 3 is a diagram for explaining a division structure of a coding unit applicable to the present invention.
  • The encoder divides one image (or picture) into rectangular coding tree units (CTU: Coding Tree Unit), and encodes the CTUs one by one according to a raster scan order.
  • The size of a CTU can be set to 64×64, 32×32, or 16×16.
  • the encoder can select the size of the CTU according to the resolution of the input image or characteristics of the input image.
  • A CTU includes a coding tree block (CTB) for the luma component and CTBs for the two corresponding chroma components.
  • One CTU can be partitioned in a quad-tree structure. That is, one CTU can be divided into four square units, each with half the horizontal size and half the vertical size, to generate coding units (CUs). This quad-tree division can be performed recursively; that is, CUs are hierarchically partitioned from one CTU in a quad-tree structure.
  • the CU means a basic unit of coding in which processing of an input image, for example, intra / inter prediction is performed.
  • A CU includes a coding block (CB) for the luma component and CBs for the two corresponding chroma components.
  • The size of a CU can be set to 64×64, 32×32, 16×16, or 8×8.
  • the root node of the quad-tree is associated with the CTU.
  • the quad-tree is divided until it reaches the leaf node, and the leaf node corresponds to the CU.
  • Depending on the characteristics of the input image, the CTU may not be divided at all; in this case, the CTU corresponds to a CU.
  • a node that is not further divided in the lower node having a depth of 1 corresponds to a CU.
  • CU (a), CU (b), and CU (j) corresponding to nodes a, b, and j in FIG. 3B are divided once in the CTU and have a depth of one.
  • a node that is not further divided in the lower node having a depth of 2 corresponds to a CU.
  • CU (c), CU (h) and CU (i) corresponding to nodes c, h and i in FIG. 3B are divided twice in the CTU and have a depth of 2.
  • a node that is not further divided in the lower node having a depth of 3 corresponds to a CU.
  • the maximum size or the minimum size of the CU can be determined according to the characteristics of the video image (for example, resolution) or considering the efficiency of encoding. Information on this or information capable of deriving the information may be included in the bitstream.
  • A CU having the maximum size is referred to as the largest coding unit (LCU: Largest Coding Unit), and a CU having the minimum size is referred to as the smallest coding unit (SCU: Smallest Coding Unit).
  • a CU having a tree structure can be hierarchically divided with a predetermined maximum depth information (or maximum level information).
  • Each divided CU can have depth information.
  • the depth information indicates the number and / or degree of division of the CU, and therefore may include information on the size of the CU.
  • the size of the SCU can be obtained by using the LCU size and the maximum depth information. Conversely, by using the size of the SCU and the maximum depth information of the tree, the size of the LCU can be obtained.
  • Information indicating whether the corresponding CU is divided (for example, a split flag such as split_cu_flag) may be transmitted to the decoder. This split information is included in all CUs except the SCU. For example, if the value of the flag indicating division is '1', the corresponding CU is divided again into four CUs; if the flag indicating division is '0', the corresponding CU is not divided further, and the coding process for that CU can be performed. A sketch of this recursive parsing is given below.
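  • A sketch of this recursive split-flag parsing; read_flag() and handle_cu() are hypothetical stand-ins for a bitstream reader and a per-CU coding routine.

```python
def parse_cu(x, y, size, min_cu_size, read_flag, handle_cu):
    # The SCU carries no split flag; every larger CU signals one.
    split = read_flag() if size > min_cu_size else 0
    if split:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                parse_cu(x + dx, y + dy, half, min_cu_size, read_flag, handle_cu)
    else:
        handle_cu(x, y, size)   # leaf node: this region is coded as one CU

# Example: a 64x64 CTU with an 8x8 SCU, so SCU size = CTU size >> max depth.
flags = iter([1, 0, 0, 0, 1, 0, 0, 0, 0])
parse_cu(0, 0, 64, 8, lambda: next(flags), lambda x, y, s: print(x, y, s))
```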
  • the CU is a basic unit of coding in which intra prediction or inter prediction is performed.
  • HEVC divides a CU into prediction units (PU: Prediction Unit) in order to code the input image more effectively.
  • A PU is the basic unit for generating a prediction block, and different prediction blocks can be generated in units of PUs within a single CU. However, intra prediction and inter prediction are not mixed among the PUs belonging to one CU; the PUs belonging to one CU are coded by the same prediction method (i.e., either intra prediction or inter prediction).
  • the PU is not divided into a quad-tree structure, and is divided into a predetermined form in one CU. This will be described with reference to the following drawings.
  • FIG. 4 is a diagram for explaining a prediction unit that can be applied to the present invention.
  • the PU is divided according to whether the intra prediction mode is used or the inter prediction mode is used in the coding mode of the CU to which the PU belongs.
  • FIG. 4A illustrates a PU when an intra prediction mode is used
  • FIG. 4B illustrates a PU when an inter prediction mode is used.
  • In the intra prediction mode, one CU can be divided into PUs of two types (i.e., 2N×2N or N×N). When one CU is divided into PUs of the N×N type, it is divided into four PUs, and a different prediction block is generated for each PU.
  • However, such N×N PU division can be performed only when the size of the CB for the luma component of the CU is the minimum size (i.e., when the CU is the SCU).
  • In the inter prediction mode, one CU can be divided into PUs of eight types (i.e., 2N×2N, N×N, 2N×N, N×2N, nL×2N, nR×2N, 2N×nU, 2N×nD).
  • As in intra prediction, PU division of the N×N type can be performed only when the size of the CB for the luma component of the CU is the minimum size (i.e., when the CU is the SCU).
  • Inter prediction supports AMP (Asymmetric Motion Partition), which refers to the asymmetric partition types nL×2N, nR×2N, 2N×nU, and 2N×nD. Here, 'n' means a 1/4 value of 2N.
  • AMP cannot be used when the CU to which the PU belongs is a CU of the minimum size.
  • To efficiently encode an input image, the optimal division structure of the coding unit (CU), prediction unit (PU), and transform unit (TU) within one CTU is determined based on a minimum rate-distortion value. For example, looking at the optimal CU division process within a 64×64 CTU, the rate-distortion cost can be calculated while dividing from the 64×64 CU down to 8×8 CUs.
  • the concrete procedure is as follows.
  • a prediction mode is selected in units of PU, and prediction and reconstruction are performed in units of actual TUs for the selected prediction mode.
  • the TU means the basic unit on which the actual prediction and reconstruction are performed.
  • the TU includes a transform block (TB) for the luma component and a TB for the two chroma components corresponding thereto.
  • Just as one CTU is divided in a quad-tree structure to generate CUs, a TU is hierarchically divided in a quad-tree structure from one CU to be coded.
  • the TUs segmented from the CUs can be further divided into smaller lower TUs.
  • The size of a TU can be set to any one of 32×32, 16×16, 8×8, or 4×4.
  • the root node of the quadtree is associated with a CU.
  • the quad-tree is divided until it reaches a leaf node, and the leaf node corresponds to TU.
  • Depending on the characteristics of the input image, the CU may not be divided at all; in this case, the CU corresponds to a TU.
  • TU (a), TU (b), and TU (j) corresponding to nodes a, b, and j in FIG. 3B are once partitioned in the CU and have a depth of one.
  • TU (c), TU (h) and TU (i) corresponding to nodes c, h and i in FIG. 3B are divided twice in CU and have a depth of 2.
  • A node that is not further divided among the lower nodes having a depth of 3 corresponds to a TU. For example, TU(d), TU(e), TU(f), and TU(g) corresponding to nodes d, e, f, and g in FIG. 3B have been divided three times in the CU and have a depth of 3.
  • a TU having a tree structure can be hierarchically divided with predetermined maximum depth information (or maximum level information). Then, each divided TU can have depth information.
  • the depth information indicates the number and / or degree of division of the TU, and therefore may include information on the size of the TU.
  • information indicating whether the corresponding TU is divided may be communicated to the decoder.
  • This partitioning information is included in all TUs except the minimum size TU. For example, if the value of the flag indicating whether or not to divide is '1', the corresponding TU is again divided into four TUs, and if the flag indicating the division is '0', the corresponding TU is no longer divided.
  • The decoded parts of the current picture, or of other pictures, that contain the current processing unit may be used to reconstruct the current processing unit on which decoding is performed.
  • A picture (slice) that uses only the current picture for reconstruction, that is, performs only intra prediction, is referred to as an intra picture or I picture (slice); a picture (slice) that uses at most one motion vector and one reference index to predict each unit is referred to as a predictive picture or P picture (slice); and a picture (slice) that uses at most two motion vectors and reference indices is referred to as a bi-predictive picture or B picture (slice).
  • Intra prediction (or intra-picture prediction) refers to a prediction method that derives the current processing block from data elements (e.g., sample values) of the same decoded picture (or slice). That is, it means a method of predicting the pixel values of the current processing block by referring to reconstructed areas in the current picture.
  • Inter prediction (or inter-picture prediction) refers to a prediction method that derives the current processing block based on data elements (e.g., sample values or motion vectors) of pictures other than the current picture. That is, it means a method of predicting the pixel values of the current processing block by referring to reconstructed areas in reconstructed pictures other than the current picture.
  • Inter prediction (or inter picture prediction) is a technique for eliminating the redundancy existing between pictures, and is mostly performed through motion estimation and motion compensation.
  • FIG. 5 is a diagram illustrating the direction of inter prediction, which is an embodiment to which the present invention can be applied.
  • Inter prediction includes uni-directional prediction, which uses only one past or future picture as a reference picture on the time axis for one block, and bi-directional prediction, which refers to past and future pictures at the same time.
  • Uni-directional prediction includes forward direction prediction, which uses one reference picture displayed (or output) temporally before the current picture, and backward direction prediction, which uses one reference picture displayed (or output) temporally after the current picture.
  • In the inter prediction process (i.e., uni-directional or bi-directional prediction), a motion parameter (or motion information) is used to specify which reference region (or reference block) is used to predict the current block.
  • The motion parameter includes an inter prediction mode (where the inter prediction mode may indicate the reference direction (i.e., uni-directional or bi-directional) and the reference list (i.e., L0, L1, or bidirectional)), a reference index (or reference picture index or reference list index), and motion vector information.
  • the motion vector information may include a motion vector, a motion vector predictor (MVP), or a motion vector difference (MVD).
  • the motion vector difference value means a difference value between the motion vector and the motion vector predictor.
  • In uni-directional prediction, a motion parameter for one direction is used. That is, one motion parameter may be needed to specify the reference region (or reference block).
  • In bi-directional prediction, motion parameters for both directions are used.
  • a maximum of two reference areas can be used. These two reference areas may exist in the same reference picture or in different pictures. That is, in the bi-directional prediction method, a maximum of two motion parameters can be used, and two motion vectors may have the same reference picture index or different reference picture indexes.
  • the reference pictures may be all displayed (or output) temporally before the current picture, or all displayed (or output) thereafter.
  • the encoder performs motion estimation (Motion Estimation) for finding a reference region most similar to the current block from the reference pictures.
  • the encoder may then provide motion parameters for the reference region to the decoder.
  • the encoder / decoder can obtain the reference area of the current block using motion parameters.
  • the reference area exists in the reference picture having the reference index.
  • a pixel value or an interpolated value of a reference region specified by the motion vector may be used as a predictor of the current processing block. That is, motion compensation for predicting an image of a current processing block from a previously decoded picture is performed using motion information.
  • In this case, a method of obtaining a motion vector predictor (mvp) using the motion information of previously coded blocks and transmitting only the difference value (mvd) between them may be used. That is, the decoder obtains the motion vector predictor of the current block using the motion information of other decoded blocks, and obtains the motion vector value of the current processing block using the difference value transmitted from the encoder. In obtaining the motion vector predictor, the decoder may construct various motion vector candidates using the motion information of other decoded blocks and select one of them as the motion vector predictor, as sketched below.
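  • A minimal sketch of this predictor-plus-difference reconstruction (hypothetical names; quarter-sample motion vector units assumed):

```python
from typing import List, Tuple

MV = Tuple[int, int]   # (horizontal, vertical), e.g., in quarter-sample units

def reconstruct_mv(candidates: List[MV], mvp_index: int, mvd: MV) -> MV:
    """Pick the signaled predictor from the candidate list and add the MVD."""
    mvp = candidates[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# Candidates gathered from already-decoded neighboring blocks (assumed values).
candidates = [(4, -8), (6, -8)]
mv = reconstruct_mv(candidates, mvp_index=0, mvd=(-1, 2))   # -> (3, -6)
```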
  • In the decoding process, previously decoded pictures are stored in a decoded picture buffer (DPB) so that they can be used for inter prediction.
  • a reference picture refers to a picture including samples that can be used for inter prediction in the decoding process of the next picture in the decoding order.
  • a reference picture set refers to a set of reference pictures associated with a picture, and is composed of all the pictures previously associated in the decoding order.
  • The reference picture set may be used for inter prediction of the associated picture or of pictures following the associated picture in decoding order. That is, the reference pictures held in the decoded picture buffer (DPB) may be referred to as a reference picture set.
  • the encoder can provide the decoder with reference picture set information in a sequence parameter set (SPS) (i.e., a syntax structure composed of syntax elements) or in each slice header.
  • a reference picture list refers to a list of reference pictures used for inter prediction of a P picture (or a slice) or a B picture (or a slice).
  • the reference picture list can be divided into two reference picture lists and can be referred to as a reference picture list 0 (or L0) and a reference picture list 1 (or L1), respectively.
  • the reference picture belonging to the reference picture list 0 can be referred to as a reference picture 0 (or L0 reference picture)
  • the reference picture belonging to the reference picture list 1 can be referred to as a reference picture 1 (or L1 reference picture).
  • For decoding a P picture (slice), one reference picture list (i.e., reference picture list 0) is used, and for decoding a B picture (slice), two reference picture lists (reference picture list 0 and reference picture list 1) can be used.
  • Information for identifying the reference picture list for each reference picture may be provided to the decoder through the reference picture set information.
  • the decoder adds the reference picture to the reference picture list 0 or the reference picture list 1 based on the reference picture set information.
  • a reference picture index (or a reference index) is used to identify any one specific reference picture in the reference picture list.
  • a sample of a prediction block for an inter-predicted current block is obtained from a sample value of a corresponding reference area in a reference picture identified by a reference picture index.
  • the corresponding reference area in the reference picture indicates a region of a position indicated by a horizontal component and a vertical component of a motion vector.
  • Fractional sample interpolation is used to generate prediction samples at non-integer sample coordinates whenever the motion vector does not have an integer value. For example, motion vectors with a precision of one quarter of the distance between samples may be supported.
  • fractional sample interpolation of the luminance component applies the 8-tap filter in the horizontal and vertical directions, respectively.
  • the fractional sample interpolation of the chrominance components applies the 4-tap filter in the horizontal direction and the vertical direction, respectively.
  • Figure 6 illustrates integer and fractional sample locations for 1/4 sample interpolation as an embodiment to which the present invention may be applied.
  • In FIG. 6, the shaded blocks labeled with an upper-case letter (A_i,j) represent integer sample positions, and the blocks labeled with a lower-case letter (x_i,j) represent fractional sample positions.
  • A fractional sample is generated by applying an interpolation filter to integer sample values in the horizontal direction and the vertical direction, respectively.
  • For example, in the horizontal direction, an 8-tap filter can be applied to the four integer samples to the left and the four integer samples to the right of the fractional sample to be generated, as sketched below.
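  • The sketch below shows 1-D 8-tap interpolation at a half-sample position. The tap values are the well-known HEVC luma half-sample filter, used purely as an illustration; the patent does not fix these coefficients.

```python
import numpy as np

HALF_PEL_TAPS = np.array([-1, 4, -11, 40, 40, -11, 4, -1])   # taps sum to 64

def interpolate_half_pel(samples, pos):
    """Half-sample value between integer positions pos and pos + 1."""
    window = samples[pos - 3:pos + 5]        # four samples left, four right
    value = int(np.dot(window, HALF_PEL_TAPS))
    return (value + 32) >> 6                 # normalize by 64 with rounding

row = np.arange(100, 116)                    # one row of integer samples
print(interpolate_half_pel(row, 8))          # 109, midway between 108 and 109

# A quarter-sample motion vector splits into an integer part and a fractional
# part, e.g. mv_int = mv >> 2 and mv_frac = mv & 3.
```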
  • Methods for deriving the motion information of the current block include a merge mode and an AMVP (Advanced Motion Vector Prediction) mode.
  • the merge mode refers to a method of deriving a motion parameter (or information) from a neighboring block spatially or temporally.
  • the set of candidates available in the merge mode consists of spatial neighbor candidates, temporal candidates, and generated candidates.
  • Figure 7 illustrates the location of spatial candidates as an embodiment to which the present invention may be applied.
  • The availability of each spatial candidate block is checked according to the order {A1, B1, B0, A0, B2}. If a candidate block is encoded in the intra prediction mode and therefore has no motion information, or if the candidate block is located outside the current picture (or slice), that candidate block cannot be used.
  • The spatial merge candidates can be constructed by excluding unnecessary candidate blocks from the candidate blocks of the current block. For example, a candidate block that belongs to the same coding block as the current prediction block, or a candidate block that has the same motion information as a previously added candidate, may be excluded.
  • the temporal merge candidate configuration process proceeds according to the order of ⁇ T0, T1 ⁇ .
  • If the right-bottom block T0 of the collocated block in the reference picture is available, that block is used as the temporal merge candidate. A collocated block refers to a block existing at the position corresponding to the current block in the selected reference picture. Otherwise, the block T1 located at the center of the collocated block is used as the temporal merge candidate.
  • The maximum number of merge candidates can be specified in the slice header. If the number of merge candidates is greater than the maximum number, fewer spatial candidates and temporal candidates than the maximum number are retained. Otherwise, additional merge candidates (i.e., combined bi-predictive merge candidates) are generated by combining the candidates added so far, until the number of merge candidates reaches the maximum number; a sketch of the overall list construction is given below.
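  • A simplified sketch of this list construction; the candidate representation, pruning test, and combination rule are simplified assumptions, not the exact standardized process.

```python
def build_merge_list(spatial, temporal, max_candidates):
    """spatial: candidates in {A1, B1, B0, A0, B2} order; temporal: {T0, T1}."""
    merge_list = []
    for cand in spatial:
        if cand is None:                          # intra-coded or outside picture
            continue
        if any(c == cand for c in merge_list):    # prune duplicated motion info
            continue
        merge_list.append(cand)
    for cand in temporal:                         # T0 first, T1 as fallback
        if cand is not None and len(merge_list) < max_candidates:
            merge_list.append(cand)
            break
    i = 0                                         # fill with combined candidates
    while len(merge_list) < max_candidates and i + 1 < len(merge_list):
        merge_list.append({'type': 'combined',
                           'l0': merge_list[i], 'l1': merge_list[i + 1]})
        i += 1
    return merge_list[:max_candidates]
```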
  • The encoder constructs a merge candidate list by performing the above-described method, performs motion estimation, and signals to the decoder the merge index (for example, merge_idx[x0][y0]) of the candidate block selected from the merge candidate list.
  • FIG. 7B illustrates a case where the B1 block is selected in the merge candidate list. In this case, "Index 1" can be signaled to the decoder as a merge index.
  • the decoder constructs a merge candidate list in the same way as the encoder and derives the motion information for the current block from the motion information of the candidate block corresponding to the merge index received from the encoder in the merge candidate list. Then, the decoder generates a prediction block for the current block based on the derived motion information (i.e., motion compensation).
  • the AMVP mode refers to a method of deriving motion vector prediction values from neighboring blocks.
  • the horizontal and vertical motion vector difference (MVD), reference index, and inter prediction mode are signaled to the decoder.
  • the horizontal and vertical motion vector values are calculated using the derived motion vector prediction value and the motion vector difference (MVD) provided from the encoder.
  • The encoder constructs a motion vector predictor candidate list, performs motion estimation, and signals to the decoder the motion vector predictor flag (i.e., candidate block information) (for example, mvp_lX_flag[x0][y0]) of the motion vector predictor selected from the list.
  • The decoder constructs a motion vector predictor candidate list in the same way as the encoder, and derives the motion vector predictor of the current processing block using the motion information of the candidate block indicated by the motion vector predictor flag received from the encoder.
  • the decoder obtains a motion vector value for the current processing block using the derived motion vector predictor and the motion vector difference value transmitted from the encoder.
  • the decoder generates a predicted block (i.e., an array of predicted samples) for the current block based on the derived motion information (i.e., motion compensation).
  • In constructing the candidate list, the motion vector of a candidate may be scaled (for example, when its reference picture differs from that of the current block). When the required number of spatial candidates has been selected, the candidate composition is terminated; if the number of selected candidates is less than two, temporal motion candidates are added.
  • FIG. 8 is a diagram illustrating an inter prediction method according to an embodiment to which the present invention is applied.
  • a decoder (specifically, the inter-prediction unit 261 of the decoder in FIG. 2) decodes a motion parameter for a processing block (for example, a prediction block) (S801).
  • the decoder can decode the signaled merge index from the encoder. Then, the decoder can derive the motion parameter of the current block from the motion parameter of the candidate block indicated in the merge index.
  • the decoder can decode the horizontal and vertical motion vector difference (MVD) signaled from the encoder, the reference index and the inter prediction mode.
  • the motion vector predictor is derived from the motion parameter of the candidate block indicated by the motion vector predictor flag, and the motion vector value of the current block can be derived using the motion vector predictor and the received motion vector difference value.
  • the decoder performs motion compensation on the current block using the decoded motion parameter (or information) (S802).
  • the encoder / decoder performs motion compensation for predicting an image of a current block from a previously decoded picture (i.e., generating a prediction block for a current unit) using the decoded motion parameters.
  • the encoder / decoder can derive a predicted block (i.e., an array of predicted samples) of the current block from a sample of the area corresponding to the current block in the previously decoded reference picture.
  • FIG. 9 is a diagram illustrating a motion compensation process according to an embodiment to which the present invention can be applied.
  • In FIG. 9, it is assumed that the motion parameters for the current block to be coded in the current picture are uni-directional prediction, the second picture in LIST0 as the reference picture, and the motion vector (-a, b).
  • In this case, the current block is predicted using the values at the position displaced by (-a, b) from the current block in the second picture of LIST0 (i.e., the sample values of the reference block).
  • In the case of bi-directional prediction, another reference list (for example, LIST1), a reference index, and a motion vector difference value are additionally transmitted, and the decoder derives two reference blocks and predicts the current block based on them.
  • FIG. 10 schematically illustrates a method of applying adaptive loop filtering, in accordance with an embodiment of the present invention.
  • Adaptive loop filtering is a technique for acquiring an image similar to an original image by applying a filter to a reconstructed picture to compensate for errors due to prediction and quantization.
  • When ALF is applied, the encoder derives the Wiener filter coefficients using the original block and the reconstructed block, and applies the derived Wiener filter coefficients to the reconstructed block, whereby a filtered reconstructed block is obtained.
  • In FIG. 10, the reconstructed block is obtained by adding the prediction block and the residual block.
  • a circular symbol + (10010) represents addition.
  • the circular symbol M (10020) indicates that the coefficients of the Wiener filter are calculated (or the Wiener filter is applied).
  • the filter coefficient of FIG. 10 represents the Wiener filter coefficient.
  • the ALF method uses the restored block and the original block as inputs to calculate the Wiener filter coefficients.
  • the coefficients of the obtained Wiener filter are applied to the reconstruction block, whereby the filtered reconstruction block is obtained.
  • the coefficients of the obtained Wiener filter are transmitted to the decoder.
  • the ALF technique can improve the peak signal-to-noise ratio (PSNR) by applying a filter to the reconstructed block (picture).
  • The filter coefficients are calculated on a picture-by-picture basis, and the encoder transmits the calculated picture-level filter coefficients to the decoder. Since ALF performs filtering on the reconstructed block of the current block, it cannot improve coding efficiency by reducing the data amount of the residual signal of the current picture. Instead, ALF can improve coding efficiency by using the filtered reconstructed picture as an enhanced reference picture for pictures decoded after the current picture (future pictures).
  • FIG. 11 schematically shows a method of applying prediction block filtering and adaptive prediction block filtering according to an embodiment of the present invention.
  • FIG. 11(a) is a schematic diagram of the prediction block filtering (PBF) technique, and FIG. 11(b) is a schematic diagram of the adaptive prediction block filtering (APBF) technique.
  • Prediction block filtering improves prediction accuracy and coding efficiency by applying a filter to the prediction block to compensate for errors due to prediction and quantization.
  • the encoder calculates the Wiener filter coefficients between the original block and the prediction block, and applies the calculated Wiener filter coefficients to the prediction block to improve the accuracy of the prediction block and the coding efficiency.
  • the coefficients of the Wiener filter between the prediction block and the original block are acquired (computed).
  • the circular symbol M (11010) indicates that the coefficient of the Wiener filter is calculated (or the Wiener filter is applied).
  • the PBF method uses the prediction block and the original block as inputs to calculate the Wiener filter coefficients.
  • the filter coefficient indicates the Wiener filter coefficient.
  • the obtained Wiener filter coefficients are applied to the prediction block, whereby the filtered prediction block is obtained.
  • A modified reconstructed block is obtained by adding the modified residual block to the filtered prediction block.
  • the modified residual block indicates that the residual block has also changed because the prediction block is changed due to filtering (application of the Wiener filter).
  • the residual block is obtained by subtracting the prediction block from the original block.
  • the circular symbol + (11020) represents addition.
  • Since the PBF scheme calculates the filter coefficients in units of blocks, it has the disadvantage that the per-block filter coefficients must be transmitted to the decoder; using the PBF scheme therefore increases the amount of side information needed to transmit the filter coefficients.
  • In FIG. 11A, the "X" marked on the filter coefficients under the modified reconstructed block means that the PBF scheme is not suitable for improving coding efficiency because of the increase in the amount of side information to be transmitted.
  • Adaptive prediction block filtering (APBF) derives the filter coefficients using information from neighboring blocks of the current block rather than from the current block itself, and applies the derived filter coefficients to the prediction block to improve prediction accuracy and coding efficiency.
  • the decoder does not have information on the original block, which is the target block for deriving the filter coefficients. Therefore, in order to replace the original block, the APBF method derives a filter coefficient that can improve the accuracy of a prediction block of a neighboring block by using a reconstruction block of a neighboring block.
  • the filter coefficients derived in this way are used for the prediction block of the current block.
  • That is, instead of using the original block and the prediction block of the current block, the decoder derives the Wiener filter coefficients between the reconstructed block of a neighboring block and the prediction block of that neighboring block, and the derived coefficients are applied to the prediction block of the current block.
  • a Wiener filter coefficient is obtained (calculated) between a prediction block of a neighboring block and a restoration block of a neighboring block.
  • Circular symbol M indicates that the coefficients of the Wiener filter are calculated (or the Wiener filter is applied). That is, the APBF scheme uses the prediction block of the neighboring block and the restoration block of the neighboring block as an input for calculating the Wiener filter coefficient.
  • the filter coefficient indicates the Wiener filter coefficient. The obtained Wiener filter coefficients are applied to the prediction block of the current block, whereby the filtered prediction block is obtained.
  • a Wiener filter is a filter that transforms the input as closely as possible to the desired output.
  • the meaning of 'as close as possible' means that the sum of squares of the difference between the filter input and the desired result is minimized. That is, the Wiener filter is a filter that minimizes the mean square error between the input and the desired output.
  • Equation (1) is an example of an equation for calculating the coefficients of the Wiener filter.
  • In Equation (1), C represents the Wiener filter coefficients, and x and y represent the coordinates of a sample in the block. i and j denote coordinates within the Wiener filter, and c_(i,j) represents the coefficient at position (i, j) among the Wiener filter coefficients. N determines the filter size, where the number of filter taps is 2N + 1.
  • R denotes a reconstruction block, and P denotes a prediction block.
  • the restoration block R and the prediction block P correspond to the inputs of Equation (1). That is, Equation (1) corresponds to a formula for obtaining a Wiener filter coefficient (C) that minimizes an error between a reconstruction block (R) and a prediction block (P).
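  • Equation (1) itself is not reproduced above; based on the definitions just given, it can plausibly be written as the least-squares criterion that selects the coefficients mapping the prediction block P as closely as possible to the reconstruction block R:

```latex
C = \arg\min_{c} \sum_{x,y} \left( R(x,y) - \sum_{i=-N}^{N} \sum_{j=-N}^{N} c_{i,j}\, P(x+i,\, y+j) \right)^{2}
```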
  • For reference, the adaptive loop filter computes the Wiener filter coefficients C using the original block O and the reconstructed block R as inputs, instead of the reconstructed block R and the prediction block P (see the description of FIG. 10).
  • Prediction block filtering uses the original block O and the prediction block P as inputs (see the description of FIG. 11 (a)).
  • Adaptive prediction block filtering uses a reconstruction block R and a prediction block P as inputs (see the description of FIG. 11 (b)).
  • Bi-prediction block filtering (BPBF) proposed in this specification uses an average block (avg (P0, P1)) and a prediction block (P0 / P1) as inputs. Details of the BPBF will be described later.
  • In the APBF, the filter coefficient information is not transmitted to the decoder; the decoder derives the filter coefficients itself, which reduces the amount of additional information transmitted.
  • However, because the APBF uses filter coefficients derived from the information of adjacent blocks rather than from the information of the current block, it is limited in how much it can increase the accuracy of the prediction block.
  • Bi-prediction block filtering (BPBF)
  • The decoder can improve prediction accuracy and coding efficiency by filtering the two prediction blocks obtained from different reference picture lists by bidirectional prediction so that each becomes similar to the average block of the two prediction blocks. This approach may be referred to as bi-prediction block filtering or bi-prediction-based filtering.
  • Bi-prediction block filtering refines the prediction block P0 of reference picture list 0 and the prediction block P1 of reference picture list 1 so that each becomes similar to the average block avg(P0, P1) of the two prediction blocks, thereby improving prediction accuracy and coding efficiency.
  • A prediction block obtained based on reference picture list 0 in bidirectional prediction may be referred to as P0 (or the P0 block), and a prediction block obtained based on reference picture list 1 may be referred to as P1 (or the P1 block).
  • Avg (P0, P1) corresponds to an average value (or an average block) of the P1 block and the P0 block, and may also be referred to as an average prediction block.
  • The operation of generating one block whose samples are the average values of two blocks (i.e., generating the average block of two blocks) may be referred to as an average sum.
  • The average block Avg(P0, P1) of the two prediction blocks can be regarded as the block most similar to the original block. Therefore, prediction performance and coding efficiency can be further improved by refining the P0 block and the P1 block so that they become more similar to the average prediction block Avg(P0, P1).
  • The proposed bi-prediction block filtering can further improve the accuracy of the prediction blocks by refining the two prediction blocks P0 and P1 so that they become similar to the average prediction block Avg(P0, P1).
  • The bi-prediction block filtering derives the Wiener filter coefficients that minimize the error between the average prediction block Avg(P0, P1) and the P0 block, and applies the derived filter coefficients to the P0 block to refine it.
  • Likewise, the bi-prediction block filtering derives the Wiener filter coefficients that minimize the error between the average prediction block Avg(P0, P1) and the P1 block, and applies the derived filter coefficients to the P1 block to refine it.
  • Bi-prediction block filtering can increase prediction accuracy by using the refined prediction blocks, thereby reducing the amount of information in the residual signal and improving coding efficiency; a sketch of the refinement follows below.
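  For illustration only, the refinement just described can be sketched in Python as follows. The helper names (derive_wiener_coeffs, apply_filter, bpbf_refine) and the dense least-squares solve are assumptions made for exposition, not the normative decoder implementation; edge padding stands in here for the extended interpolated block described with FIG. 12 below. These helpers are reused in later sketches.

  ```python
  import numpy as np

  def derive_wiener_coeffs(pred, target, n=2):
      """Least-squares solve for the (2n+1)x(2n+1) filter c minimizing
      the squared error between the filtered prediction and the target."""
      h, w = target.shape
      taps = 2 * n + 1
      padded = np.pad(pred, n, mode='edge')
      # One row per sample, one column per filter tap.
      rows = [padded[y:y + taps, x:x + taps].ravel()
              for y in range(h) for x in range(w)]
      A = np.asarray(rows, dtype=np.float64)
      b = target.ravel().astype(np.float64)
      c, *_ = np.linalg.lstsq(A, b, rcond=None)
      return c.reshape(taps, taps)

  def apply_filter(pred, coeffs):
      """Apply the derived coefficients to every sample of the block."""
      taps = coeffs.shape[0]
      n = taps // 2
      padded = np.pad(pred, n, mode='edge')
      h, w = pred.shape
      out = np.empty((h, w), dtype=np.float64)
      for y in range(h):
          for x in range(w):
              out[y, x] = np.sum(padded[y:y + taps, x:x + taps] * coeffs)
      return out

  def bpbf_refine(p0, p1, n=2):
      """Refine P0 and P1 toward Avg(P0, P1), then average the results."""
      avg = (p0 + p1) / 2.0                    # first average sum
      c0 = derive_wiener_coeffs(p0, avg, n)    # minimize error vs. the average block
      c1 = derive_wiener_coeffs(p1, avg, n)
      return (apply_filter(p0, c0) + apply_filter(p1, c1)) / 2.0  # 2nd average sum
  ```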
  • the BPBF can be adaptively applied to various sequences by determining whether refinement is applied in block or sample units.
  • the decoder can determine whether to apply the BPBF on a block-by-block basis.
  • the encoder can signal information indicating whether the BPBF is applied to the decoder on a block-by-block basis.
  • the flag indicating whether or not the BPBF is applied may be referred to as a BPBF flag (bpbf_flag).
  • the encoder can determine whether to signal the BPBF flag depending on whether the inter prediction mode of the current block is the AMVP mode or the merge mode.
  • In the AMVP mode, the BPBF flag (bpbf_flag) is signaled to the decoder only when the prediction direction of the corresponding block is bidirectional prediction, and is not signaled otherwise.
  • In the merge mode, whether to apply the BPBF to the current block can be determined according to the 'bpbf_flag' of the selected candidate block.
  • The decoder may not apply the BPBF to the current block if the selected candidate is a combined motion vector predictor (MVP), a candidate derived on a divided sub-block basis, a temporal motion vector predictor (TMVP), or the like.
  • FIG. 12 is a flowchart illustrating a motion compensation process in an inter-prediction mode for applying the bi-predictive block filtering according to an embodiment of the present invention.
  • Referring to FIG. 12, the decoder performs different motion compensation processes depending on whether the current block satisfies a specific condition, that is, depending on whether the bi-prediction block filtering (BPBF) is applied or not.
  • the decoder confirms (or determines) whether or not the current block satisfies a predefined condition (S12010). Examples of conditions for determining whether to apply the BPBF will be described below.
  • The condition may be whether the BPBF flag (bpbf_flag) indicates that the BPBF is applied to the current block (Condition 1). If the obtained 'bpbf_flag' indicates that the BPBF is applied to the current block, the decoder can decide to apply the BPBF to the current block. If no syntax for the BPBF (e.g., 'bpbf_flag') is present, the following conditions may be used instead.
  • The condition may be whether the current block has been predicted in the bi-prediction mode (Condition 2).
  • the decoder can decide to apply the BPBF if the current block is predicted in the bi-prediction mode.
  • The condition may be whether the current block is predicted in the bi-prediction mode and the picture order counts (POCs) of the two reference pictures lie in different directions with respect to the current picture (Condition 3).
  • the POC is the same as the display order.
  • the decoder can decide to apply the BPBF to the current block if the two reference pictures are a past picture and a future picture, respectively, with respect to the time axis of the current picture.
  • the specific condition may be the size of the current block (condition 4).
  • The characteristics of the block can be considered. For example, the decoder may decide to apply the BPBF if the size of the current block (e.g., a coding unit) is greater than 8×8, and not to apply the BPBF if it is smaller than 8×8. If the current block size is small, a relatively optimal prediction block can already be generated through the motion estimation process, so the decoder may skip the BPBF in consideration of the signaling overhead of the BPBF flag ('bpbf_flag').
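  A minimal sketch of this block-level decision, assuming a hypothetical block descriptor with fields bpbf_flag, is_bi_pred, poc_ref0, poc_ref1, width, and height; Conditions 1 through 4 are checked in the order described above, and the size cutoff in Condition 4 is an assumption:

  ```python
  def should_apply_bpbf(block, poc_cur, min_size=8):
      # Condition 1: an explicit flag, when the syntax element is present.
      if block.bpbf_flag is not None:
          return block.bpbf_flag
      # Condition 2: the block must be bi-predicted.
      if not block.is_bi_pred:
          return False
      # Condition 3: the two reference pictures lie on opposite sides of
      # the current picture in display (POC) order.
      if (block.poc_ref0 - poc_cur) * (block.poc_ref1 - poc_cur) >= 0:
          return False
      # Condition 4: skip small blocks (8x8 or smaller in this sketch).
      if block.width <= min_size and block.height <= min_size:
          return False
      return True
  ```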
  • the decoder decides to apply the BPBF to the current block if the current block satisfies a certain condition (or if the condition is true), and performs the following S12020 to S12050.
  • The decoder performs interpolation on a block of size (W + T_W) × (H + T_H) in each reference picture list (S12020).
  • Here, the shape of the Wiener filter is taken into account. The shape of the Wiener filter may be M×1, M×N, N×M, or M×M, and may also be a 5×5, 7×7, or 9×9 diamond shape.
  • W represents the width of the current block
  • H represents the height of the current block.
  • T_W and T_H represent values derived from the number of horizontal filter taps and the number of vertical filter taps, respectively.
  • For example, when the Wiener filter is 5×5 and the block is 8×8, (W + T_W) × (H + T_H) is 12×12, corresponding to (8 + 4) × (8 + 4); that is, T_W and T_H are both 4 in this case.
  • When the Wiener filter is 7×5 and the block is 8×8, (W + T_W) × (H + T_H) is 14×12, corresponding to (8 + 6) × (8 + 4); that is, T_W is 6 and T_H is 4 in this case.
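  In other words, the extension in each dimension equals the number of filter taps in that dimension minus one. A small sanity check of the two examples above (the helper name is for illustration only):

  ```python
  def interp_size(block_w, block_h, taps_w, taps_h):
      # T_W = taps_w - 1, T_H = taps_h - 1
      return block_w + taps_w - 1, block_h + taps_h - 1

  assert interp_size(8, 8, 5, 5) == (12, 12)   # 5x5 filter on an 8x8 block
  assert interp_size(8, 8, 7, 5) == (14, 12)   # 7x5 filter on an 8x8 block
  ```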
  • the decoder repeats the process of S12020 in the reference picture list 0 (L0) and the reference picture list 1 (L1), respectively.
  • the P0 block and the P1 block used for generating the average prediction block (Avg (P0, P1)) are obtained.
  • the decoder generates an average prediction block (Avg (P0, P1)) which is an average sum of the two prediction blocks obtained in step S12020 (S12030). That is, the decoder generates the average value (or average block) of the P0 block and the P1 block as an average prediction block (Avg (P0, P1)).
  • The decoder then calculates the Wiener filter coefficients between the prediction block (P0, P1) in each direction and the average prediction block Avg(P0, P1), and applies the calculated filter coefficients to the prediction block in each direction (S12040). Specifically, the decoder calculates the filter coefficients that minimize the error between the P0 block and the average prediction block Avg(P0, P1), and then applies the calculated coefficients to the P0 block. Likewise, the decoder calculates the filter coefficients that minimize the error between the P1 block and the average prediction block Avg(P0, P1), and then applies the calculated coefficients to the P1 block.
  • In step S12040, the P0 block and the P1 block are thus filtered by the BPBF.
  • By calculating and applying the filter coefficients to the P0 block and the P1 block, respectively, the decoder refines the P0 block and the P1 block so that they become similar to the average prediction block Avg(P0, P1).
  • Finally, the decoder generates the final prediction block by performing a second average sum (2nd average sum) of the filtered P0 block and the filtered P1 block (S12050).
  • the decoder determines not to apply the BPBF to the current block if the current block does not satisfy a certain condition (or if the condition is false), and performs steps S12060 to S12070.
  • Steps S12060 and S12070 are the same as those of the existing inter prediction.
  • The decoder performs interpolation for each reference picture in reference picture lists 0 and 1 (S12060). Thereafter, the decoder generates a prediction block by calculating the average sum of the two reference blocks selected from the interpolated reference pictures (S12070). That is, the average value (average block) of the two reference blocks is generated as the final prediction block.
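  Combining the two branches of FIG. 12 schematically; should_apply_bpbf() and bpbf_refine() are the hypothetical helpers sketched above, interpolate() is a stub for codec-specific motion-compensated interpolation, and a full implementation would crop the filtered extended block back to W × H, which is omitted here:

  ```python
  def interpolate(ref_list, block, extend):
      """Hypothetical stand-in: fetch (and sub-pel interpolate) the prediction
      block from the given reference list; extend=True returns the
      (W + T_W) x (H + T_H) block needed for filtering."""
      raise NotImplementedError  # codec-specific interpolation

  def motion_compensation(block, poc_cur, ref_list0, ref_list1):
      if should_apply_bpbf(block, poc_cur):
          # S12020: interpolate extended blocks in each reference picture list.
          p0 = interpolate(ref_list0, block, extend=True)
          p1 = interpolate(ref_list1, block, extend=True)
          # S12030-S12050: average, refine toward the average, average again.
          return bpbf_refine(p0, p1)
      # S12060-S12070: conventional bi-prediction.
      p0 = interpolate(ref_list0, block, extend=False)
      p1 = interpolate(ref_list1, block, extend=False)
      return (p0 + p1) / 2.0
  ```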
  • the decoder can determine whether to apply BPBF in units of samples (or pixels).
  • To determine this, the amount of variation of the sample values within a certain region may be used.
  • the decoder can decide not to apply the BPBF to the sample if the amount of change in the sample in a particular region (or window) around one sample in one block is greater than a predetermined threshold.
  • For example, the size of the specific region may be 5×5.
  • the threshold value for the variation of the sample value within a specific region may be different depending on the quantization parameter (QP).
  • The first method of calculating the variation of the samples in a specific region computes, as the variation amount of the sample, the sum of the differences between the average of the sample values in the window and each individual sample value.
  • Equation (2) below is an example of a formula for calculating the amount of change of the sample using the first calculation method.
  • In Equation (2), x̄ (the barred term on the right side) represents the average value within the window. i and j represent positions in the window. The size of the window is N×N. The factor 1/N² may be omitted in consideration of the computational complexity and the loss of information due to down-scaling.
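  The equation image is not reproduced in this text. From the description, Equation (2) plausibly has the following form; whether the differences are taken as absolute values or squares is not recoverable from the text, and absolute differences are assumed here:

  $$A = \frac{1}{N^{2}} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| \bar{x} - x_{i,j} \right|$$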
  • The second method of calculating the variation of the samples in a specific region computes, as the variation amount of the sample, the sum of the differences between the sample value at the intermediate position (the target sample) in the window and each of the other sample values in the window. Equation (3) below is an example of a formula for calculating the variation using the second method.
  • In Equation (3), i and j denote positions in the window, and k and l denote the intermediate position in the window (i.e., the position of the target sample). The size of the window is N×N. The factor 1/N² may be omitted in consideration of the computational complexity and the loss of information due to down-scaling.
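  Similarly, Equation (3) plausibly has the following form, where (k, l) is the intermediate (target-sample) position; the same assumption of absolute differences applies:

  $$A = \frac{1}{N^{2}} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| x_{k,l} - x_{i,j} \right|$$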
  • The decoder can also compute the amount of variation in the window separately along the horizontal axis and the vertical axis.
  • the decoder can change the shape of the Wiener filter in consideration of the amount of change in each direction, and apply the changed Wiener filter to the prediction block. That is, the decoder can determine the shape of the Wiener filter according to the variation of the sample value in the horizontal or vertical direction.
  • For example, when the variation along the horizontal axis (or in the horizontal direction) is large, the decoder may change the shape of the Wiener filter from 5×5 to 7×5. Likewise, when the variation along the vertical axis (or in the vertical direction) is large, the shape of the Wiener filter may be changed from 5×5 to 5×7.
  • The directionality of the motion vector can also be considered as a method for determining the shape of the Wiener filter. For example, when a 5×5 window is used and the x component of the motion vector (x, y) is large, the shape of the Wiener filter may be changed from 5×5 to 7×5. When the y component of the motion vector is large, the shape of the Wiener filter may be changed from 5×5 to 5×7.
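  A sketch of the direction-dependent shape selection described above; the adjacent-difference measure of horizontal and vertical variation is an assumption for illustration, since the text does not pin down the exact directional measure:

  ```python
  import numpy as np

  def directional_variation(window):
      """Sum of absolute differences between horizontally adjacent samples
      (horizontal variation) and vertically adjacent samples (vertical)."""
      horiz = np.abs(np.diff(window, axis=1)).sum()
      vert = np.abs(np.diff(window, axis=0)).sum()
      return horiz, vert

  def choose_filter_shape(window, base_w=5, base_h=5):
      """Widen the filter along the axis with the larger variation."""
      horiz, vert = directional_variation(window)
      if horiz > vert:
          return base_w + 2, base_h   # e.g. 5x5 -> 7x5
      if vert > horiz:
          return base_w, base_h + 2   # e.g. 5x5 -> 5x7
      return base_w, base_h
  ```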
  • FIG. 13 shows a flowchart of an inter prediction based image processing method according to an embodiment of the present invention.
  • Referring to FIG. 13, the encoder/decoder determines whether to apply bi-prediction-based filtering to the first prediction block and the second prediction block of the current block (S13010).
  • The bi-prediction-based filtering corresponds to filtering (i.e., BPBF) that approximates each of the first and second prediction blocks to an average block generated based on the first prediction block and the second prediction block.
  • the first prediction block is generated by performing inter prediction on the basis of the list 0 reference picture
  • the second prediction block is generated by performing inter prediction on the basis of the list 1 reference picture.
  • When it is determined to apply the bi-prediction-based filtering, it is applied to the first prediction block and the second prediction block (S13020).
  • For a detailed description of the process of applying the bi-prediction-based filtering (BPBF), refer to the description of FIG. 12 above.
  • the encoder / decoder generates a final predicted block of the current block using the filtered first predictive block and the filtered second predictive block (S13030).
  • The encoder can generate a residual block using the final prediction block, and the decoder can generate a reconstruction block using the final prediction block.
  • FIG. 14 shows a block diagram of an inter prediction unit according to an embodiment of the present invention.
  • the inter prediction unit implements the functions, processes and / or methods proposed in the description related to Figs. 11 to 13 above.
  • the inter prediction unit may include a filtering determination unit 14010, a filtering application unit 14020, and a prediction block generation unit 14030.
  • the filtering determination unit 14010 may determine whether to apply the bi-prediction-based filtering to the first prediction block and the second prediction block of the current block.
  • The bi-prediction-based filtering may correspond to filtering (i.e., BPBF) for approximating each of the first and second prediction blocks to an average block generated based on the first prediction block and the second prediction block.
  • If it is determined that the bi-prediction-based filtering is applied, the filtering application unit 14020 may apply it to the first prediction block and the second prediction block, respectively.
  • the first prediction block may be generated by applying inter prediction on the basis of the list 0 reference picture
  • the second prediction block may be generated by performing inter prediction on the basis of the list 1 reference picture.
  • the prediction block generator 14030 may generate a final prediction block of the current block using the filtered first predictive block and the filtered second predictive block.
  • The filtering application unit 14020 generates the average block using the first prediction block and the second prediction block, derives first Wiener filter coefficients that minimize the difference between the first prediction block and the average block, derives second Wiener filter coefficients that minimize the difference between the second prediction block and the average block, filters the first prediction block using the derived first Wiener filter coefficients, and filters the second prediction block using the derived second Wiener filter coefficients.
  • In generating the average block, the filtering application unit 14020 may generate a first interpolation block based on the size of the first prediction block and the number of taps of the Wiener filter, generate a second interpolation block based on the size of the second prediction block and the number of taps of the Wiener filter, and generate the average of the first interpolation block and the second interpolation block as the average block.
  • When the AMVP mode is applied to the current block, the filtering determination unit 14010 can obtain a bi-prediction-based filtering flag indicating whether the bi-prediction-based filtering is applied. The filtering determination unit 14010 can then determine to apply the bi-prediction-based filtering to the first and second prediction blocks when the flag indicates that the filtering is applied to the current block.
  • When the merge mode is applied, the filtering determination unit 14010 may construct a merge candidate list based on the motion information of neighboring blocks of the current block. Thereafter, the filtering determination unit 14010 obtains a merge index indicating the selected merge candidate, and determines whether to apply the bi-prediction-based filtering to the first prediction block and the second prediction block based on the selected merge candidate indicated by the merge index.
  • If the selected merge candidate is a merge candidate generated by combining other merge candidates, a zero motion vector, a candidate derived in units of sub-blocks, or a temporal merge candidate, the bi-prediction-based filtering may not be applied to the first prediction block and the second prediction block.
  • the filtering determination unit 14010 may determine to apply the bi-prediction-based filtering to the first and second prediction blocks when the bi-prediction-based filtering is applied to the selected merge candidate.
  • The filtering determination unit 14010 may determine to apply the bi-prediction-based filtering to the first and second prediction blocks when the current block is predicted in the bi-prediction mode.
  • The filtering determination unit 14010 may determine to apply the bi-prediction-based filtering to the first prediction block and the second prediction block when the picture order counts (POCs) of the two reference pictures lie in different directions with respect to the current picture.
  • the filtering determination unit 14010 may determine to apply the bi-prediction-based filtering to the first and second prediction blocks if the size of the current block is greater than a predetermined threshold value.
  • the filtering determination unit 14010 may determine whether to apply bi-prediction-based filtering of the current sample based on the amount of change of the sample value within a specific area around the current sample.
  • the variation amount of the sample value may include a variation amount of the horizontal direction sample value in the specific region and a variation amount of the vertical direction sample value.
  • The filtering determination unit 14010 may decide not to apply the bi-prediction-based filtering to the current sample if the sum of the differences between the average of the sample values in the specific region and each sample value in the specific region is greater than a predetermined threshold. Further, the filtering determination unit 14010 may decide not to apply the bi-prediction-based filtering to the current sample if the sum of the differences between the sample value at the intermediate position in the specific region and each sample value in the specific region is greater than a predetermined threshold.
  • the threshold value may be determined based on the quantization parameter value.
  • The filtering determination unit 14010 can change the shape of the Wiener filter according to whichever of the horizontal and vertical sample-value variations is larger.
  • FIG. 15 shows a structure of a contents streaming system according to an embodiment of the present invention.
  • the content streaming system to which the present invention is applied may include an encoding server, a streaming server, a web server, a media repository, a user device, and a multimedia input device.
  • The encoding server compresses content input from multimedia input devices such as a smartphone, a camera, or a camcorder into digital data, generates a bitstream, and transmits it to the streaming server.
  • When multimedia input devices such as a smartphone, a camera, or a camcorder directly generate a bitstream, the encoding server may be omitted.
  • the bitstream may be generated by an encoding method or a bitstream generating method to which the present invention is applied, and the streaming server may temporarily store the bitstream in the process of transmitting or receiving the bitstream.
  • the streaming server transmits multimedia data to a user device based on a user request through the web server, and the web server serves as a medium for informing the user of what services are available.
  • When the user requests a desired service from the web server, the web server delivers the request to the streaming server, and the streaming server transmits the multimedia data to the user.
  • the content streaming system may include a separate control server. In this case, the control server controls commands / responses among the devices in the content streaming system.
  • the streaming server may receive content from a media repository and / or an encoding server. For example, when receiving the content from the encoding server, the content can be received in real time. In this case, in order to provide a smooth streaming service, the streaming server can store the bit stream for a predetermined time.
  • Examples of the user device include a mobile phone, a smartphone, a laptop computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation device, a slate PC, a tablet PC, an ultrabook, a wearable device (e.g., a smartwatch, smart glasses, or a head mounted display (HMD)), a digital TV, a desktop computer, and digital signage.
  • Each of the servers in the content streaming system can be operated as a distributed server. In this case, data received at each server can be distributed.
  • the embodiments described in the present invention can be implemented and executed on a processor, a microprocessor, a controller, or a chip.
  • In addition, the functional units depicted in the figures may be implemented and executed on a computer, a processor, a microprocessor, a controller, or a chip.
  • The decoder and encoder to which the present invention is applied may be included in multimedia communication devices such as a multimedia broadcasting transmitting and receiving device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video chatting device, a storage medium, a camcorder, a video on demand (VoD) service providing device, an over-the-top (OTT) video device, a three-dimensional (3D) video device, a video telephony video device, a medical video device, and the like, and may be used to process video signals or data signals.
  • For example, the OTT (over-the-top) video device may include a game console, a Blu-ray player, an Internet-access TV, a home theater system, a smartphone, a tablet PC, a digital video recorder (DVR), and the like.
  • the processing method to which the present invention is applied may be produced in the form of a computer-executed program, and may be stored in a computer-readable recording medium.
  • the multimedia data having the data structure according to the present invention can also be stored in a computer-readable recording medium.
  • the computer-readable recording medium includes all kinds of storage devices and distributed storage devices in which computer-readable data is stored.
  • The computer-readable recording medium may include, for example, a Blu-ray Disc (BD), a Universal Serial Bus (USB) storage, a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, and optical data storage devices.
  • the computer-readable recording medium includes media implemented in the form of a carrier wave (for example, transmission over the Internet).
  • the bit stream generated by the encoding method can be stored in a computer-readable recording medium or transmitted over a wired or wireless communication network.
  • an embodiment of the present invention may be embodied as a computer program product by program code, and the program code may be executed in a computer according to an embodiment of the present invention.
  • the program code may be stored on a carrier readable by a computer.
  • Embodiments in accordance with the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof.
  • For a hardware implementation, an embodiment of the present invention may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
  • an embodiment of the present invention may be implemented in the form of a module, a procedure, a function, or the like for performing the functions or operations described above.
  • the software code can be stored in memory and driven by the processor.
  • the memory is located inside or outside the processor and can exchange data with the processor by various means already known.

Abstract

The present invention relates to an image processing method based on inter prediction. More specifically, an inter-prediction-based image processing method may comprise the steps of: determining whether bi-prediction-based filtering is applied to a first prediction block and a second prediction block of a current block; when it is determined that the bi-prediction-based filtering is applied, applying the bi-prediction-based filtering to the first prediction block and the second prediction block; and generating a final prediction block of the current block using the filtered first prediction block and the filtered second prediction block.
PCT/KR2018/007094 2017-09-26 2018-06-22 Procédé de traitement d'image et appareil s'y rapportant WO2019066202A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762563573P 2017-09-26 2017-09-26
US62/563,573 2017-09-26

Publications (1)

Publication Number Publication Date
WO2019066202A1 (fr)

Family

ID=65901588

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/007094 WO2019066202A1 (fr) 2017-09-26 2018-06-22 Procédé de traitement d'image et appareil s'y rapportant

Country Status (1)

Country Link
WO (1) WO2019066202A1 (fr)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012044116A2 (fr) * 2010-09-30 2012-04-05 한국전자통신연구원 Appareil et procédé pour coder et décoder une vidéo via la réalisation d'un filtrage adaptatif sur un bloc de prédiction
WO2012148128A2 (fr) * 2011-04-24 2012-11-01 엘지전자 주식회사 Procédé de prédiction inter, et procédés de codage et de décodage et dispositif les utilisant
WO2012177052A2 (fr) * 2011-06-21 2012-12-27 한국전자통신연구원 Procédé inter-prédictions et appareil associé
WO2017034089A1 (fr) * 2015-08-23 2017-03-02 엘지전자(주) Procédé de traitement d'image basé sur un mode d'inter-prédiction et appareil associé

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
C. ROSEWARNE ET AL.: "High Efficiency Video Coding (HEVC) Test Model 16 (HM 16) Improved Encoder Description Update 9", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP3 AND ISO/IEC JTC1/SC29/WG11, JCTVC-AB1002, 28TH MEETING, 21 July 2017 (2017-07-21), Torino, IT, XP030118276 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110740319A (zh) * 2019-10-30 2020-01-31 腾讯科技(深圳)有限公司 视频编解码方法、装置、电子设备及存储介质
CN110740319B (zh) * 2019-10-30 2024-04-05 腾讯科技(深圳)有限公司 视频编解码方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18862616

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18862616

Country of ref document: EP

Kind code of ref document: A1