CN113287302A - Method and apparatus for image encoding/decoding
Abstract
The method and apparatus for image encoding/decoding according to the present invention may generate a candidate list of a current block and perform inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list. Here, the plurality of candidates include at least one of a spatial candidate, a temporal candidate, and a candidate based on the reconstruction information, and the candidate based on the reconstruction information may be added from a buffer storing motion information decoded before the current block.
Description
Technical Field
The present disclosure relates to an image encoding/decoding method and apparatus.
Background
Recently, the demand for multimedia data such as video on the Internet has been increasing rapidly. However, growth in channel bandwidth has not kept pace with the rapidly increasing amount of multimedia data. Accordingly, the VCEG (Video Coding Experts Group) of ITU-T and the MPEG (Moving Picture Experts Group) of ISO/IEC promulgated version 1 of HEVC (High Efficiency Video Coding), a video compression standard, in February 2014.
HEVC defines techniques such as intra prediction, inter prediction, transform, quantization, entropy coding, and in-loop filters.
Disclosure of Invention
Technical problem
The present disclosure proposes a method in which prediction efficiency can be improved by efficiently deriving motion information for generating a MERGE/AMVP candidate list.
The present disclosure provides a method and apparatus for searching for a motion vector predictor of a current block in reconstructed motion information around the current block when a prediction block of the current block is generated.
The present disclosure provides a method and apparatus for efficiently transmitting motion information of a current block.
The present disclosure provides a method and apparatus for predicting a current block more efficiently by using reconstruction information in a current picture.
The present disclosure provides a method and apparatus for encoding/decoding a transform coefficient of a current block.
Technical scheme
The image encoding/decoding method and apparatus according to the present disclosure may generate a candidate list of a current block and perform inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the plurality of candidates may include at least one of a spatial candidate, a temporal candidate, or a candidate based on the reconstruction information, and the candidate based on the reconstruction information may be added from a buffer storing motion information decoded before the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the motion information stored in the buffer may be added to the candidate list starting from the motion information stored last in the buffer (last-in-first-out order), or starting from the motion information stored first in the buffer (first-in-first-out order).
In the image encoding/decoding method and apparatus according to the present disclosure, the number or order of adding motion information stored in the buffer to the candidate list may be differently determined according to the inter prediction mode of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidate list may be filled by using the motion information stored in the buffer until the maximum number of candidates in the candidate list is reached, or may be filled by using the motion information stored in the buffer until the maximum number of candidates minus 1 is reached.
In the image encoding/decoding method and apparatus according to the present disclosure, the buffer may be initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
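To make the buffer-based candidate described above concrete, the following is a minimal C++ sketch of a history buffer that stores motion information decoded before the current block and appends its entries to a candidate list in either LIFO or FIFO order. The buffer size, the duplicate pruning, and the MotionInfo fields are illustrative assumptions, not the normative design.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Illustrative motion information: one motion vector plus a reference index.
struct MotionInfo {
    int16_t mvx = 0, mvy = 0;
    int8_t refIdx = -1;
    bool operator==(const MotionInfo& o) const {
        return mvx == o.mvx && mvy == o.mvy && refIdx == o.refIdx;
    }
};

// History buffer holding motion information decoded before the current block.
class HistoryBuffer {
public:
    explicit HistoryBuffer(size_t maxSize) : maxSize_(maxSize) {}

    // Called after each inter block is reconstructed.
    void push(const MotionInfo& mi) {
        // Drop an identical older entry so the buffer keeps unique motion.
        for (auto it = buf_.begin(); it != buf_.end(); ++it)
            if (*it == mi) { buf_.erase(it); break; }
        if (buf_.size() == maxSize_) buf_.pop_front();  // evict the oldest
        buf_.push_back(mi);                             // newest at the back
    }

    // Re-initialized per CTU, CTU row, slice, or picture, as described above.
    void reset() { buf_.clear(); }

    // Append buffer entries to a candidate list until it holds maxCand
    // candidates (or maxCand - 1, per the variant above). newestFirst
    // selects LIFO order; otherwise entries are added in FIFO order.
    void appendTo(std::vector<MotionInfo>& candList, size_t maxCand,
                  bool newestFirst) const {
        auto tryAdd = [&](const MotionInfo& mi) {
            if (candList.size() >= maxCand) return;
            for (const auto& c : candList) if (c == mi) return;
            candList.push_back(mi);
        };
        if (newestFirst)
            for (auto it = buf_.rbegin(); it != buf_.rend(); ++it) tryAdd(*it);
        else
            for (const auto& mi : buf_) tryAdd(mi);
    }

private:
    std::deque<MotionInfo> buf_;
    size_t maxSize_;
};
```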
The computer-readable recording medium according to the present disclosure may store a bitstream to be decoded by an image decoding method.
In the computer-readable recording medium according to the present disclosure, the image decoding method may include generating a candidate list of the current block, and performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the computer-readable recording medium according to the present disclosure, the plurality of candidates may include at least one of a spatial candidate, a temporal candidate, or a candidate based on the reconstruction information, and the candidate based on the reconstruction information may be added from a buffer storing motion information decoded before the current block.
In the computer-readable recording medium according to the present disclosure, the motion information stored in the buffer may be added to the candidate list starting from the motion information stored last in the buffer (last-in-first-out order), or starting from the motion information stored first in the buffer (first-in-first-out order).
In the computer-readable recording medium according to the present disclosure, the number or order of adding motion information stored in the buffer to the candidate list may be differently determined according to the inter prediction mode of the current block.
In the computer-readable recording medium according to the present disclosure, the candidate list may be filled by using the motion information stored in the buffer until the maximum number of candidates in the candidate list is reached, or may be filled by using the motion information stored in the buffer until the maximum number of candidates minus 1 is reached.
In the computer-readable recording medium according to the present disclosure, the buffer may be initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
The image encoding/decoding method and apparatus according to the present disclosure may generate a candidate list of a current block and perform inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the plurality of candidates may include a temporal candidate in units of sub-blocks. The temporal candidate in units of sub-blocks may be a candidate for deriving motion information of each sub-block of the current block, and may have motion information of a target block temporally adjacent to the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the sub-block may be an N×M block having a fixed size preset in the decoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the sub-block of the target block may be determined as a block at a position offset from the position of the sub-block of the current block by a predetermined temporal motion vector.
In the image encoding/decoding method and apparatus according to the present disclosure, the temporal motion vector may be set by using only a surrounding block at a specific position in a spatial surrounding block of the current block, and the surrounding block at the specific position may be a left-side block of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the temporal motion vector may be set only when the reference picture of the surrounding block at the specific position is identical to the target picture to which the target block belongs.
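As an illustration of the sub-block temporal candidate described above, the C++ sketch below derives one motion vector per fixed-size sub-block by reading the target picture's motion field at positions offset by a temporal motion vector taken from the left spatial neighbor. The 8×8 motion-field granularity, the center-sample rule, and all names are assumptions for illustration only.

```cpp
#include <algorithm>
#include <vector>

struct MV { int x = 0, y = 0; };

// Motion field of the target (co-located) picture, assumed to be stored on
// an 8x8 sample grid; targetField[gy][gx] is the motion of one grid cell.
using MotionField = std::vector<std::vector<MV>>;

std::vector<MV> deriveSubblockTemporalMvs(
    int blkX, int blkY, int blkW, int blkH,  // current block, in samples
    int subW, int subH,                      // fixed NxM sub-block size
    MV temporalMv,                           // from the left spatial neighbor
    const MotionField& targetField)          // assumed non-empty
{
    std::vector<MV> subMvs;
    for (int y = blkY; y < blkY + blkH; y += subH) {
        for (int x = blkX; x < blkX + blkW; x += subW) {
            // Center of the sub-block, shifted by the temporal motion vector.
            int tx = x + subW / 2 + temporalMv.x;
            int ty = y + subH / 2 + temporalMv.y;
            // Clamp to the picture and read the target block's motion.
            int gx = std::clamp(tx / 8, 0, (int)targetField[0].size() - 1);
            int gy = std::clamp(ty / 8, 0, (int)targetField.size() - 1);
            subMvs.push_back(targetField[gy][gx]);
        }
    }
    return subMvs;
}
```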
The computer-readable recording medium according to the present disclosure may store a bitstream to be decoded by an image decoding method.
In the computer-readable recording medium according to the present disclosure, the image decoding method may include generating a candidate list of the current block, and performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the computer-readable recording medium according to the present disclosure, the plurality of candidates may include a temporal candidate in units of sub-blocks. The temporal candidate in units of sub-blocks may be a candidate for deriving motion information of each sub-block of the current block, and may have motion information of a target block temporally adjacent to the current block.
In the computer-readable recording medium according to the present disclosure, the sub-block may be an N×M block having a fixed size preset in the decoding apparatus.
In the computer-readable recording medium according to the present disclosure, the sub-block of the target block may be determined as a block at a position offset from the position of the sub-block of the current block by a predetermined temporal motion vector.
In the computer-readable recording medium according to the present disclosure, the temporal motion vector may be set by using only a surrounding block at a specific position in a spatial surrounding block of the current block, and the surrounding block at the specific position may be a left-side block of the current block.
In the computer-readable recording medium according to the present disclosure, the temporal motion vector may be set only when the reference picture of the surrounding block at the specific position is the same as the target picture to which the target block belongs.
The image encoding/decoding method and apparatus according to the present disclosure may configure a merge candidate list of a current block, set any one of a plurality of merge candidates belonging to the merge candidate list as motion information of the current block, derive a final motion vector of the current block by adding a predetermined motion vector difference value (MVD) to a motion vector in the motion information of the current block, and generate a prediction block of the current block by performing motion compensation based on the final motion vector.
In the image encoding/decoding method and apparatus according to the present disclosure, the merge candidate list may be configured with k merge candidates, and k may be a natural number such as 4, 5, 6, or more.
In the image encoding/decoding method and apparatus according to the present disclosure, motion information of the current block may be set by using any one of the first merge candidate or the second merge candidate belonging to the merge candidate list according to the merge candidate index information transmitted from the encoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the motion vector difference value may be derived based on a predetermined offset vector, and the predetermined offset vector may be derived based on at least one of its length or its direction.
In the image encoding/decoding method and apparatus according to the present disclosure, the length of the predetermined offset vector may be determined based on at least one of a distance index or a predetermined flag, and the flag may represent information indicating whether the motion vector uses integer pixel accuracy in the merge mode of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the direction of the predetermined offset vector may be determined based on a direction index, and the direction may represent any one of left, right, top, bottom, upper left, lower left, upper right, or lower right directions.
In the image encoding/decoding method and apparatus according to the present disclosure, the predetermined offset vector may be modified by considering a POC difference between a reference picture of the current block and a current picture to which the current block belongs.
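A minimal sketch of the merge-with-offset derivation just described: the final motion vector is the merge candidate's motion vector plus an offset vector given by a distance index, a direction index, and the integer-pel flag. The distance table, the direction order, and the quarter-pel units are assumptions, not normative values.

```cpp
#include <array>

struct MV { int x, y; };  // motion vector in quarter-pel units (assumed)

MV applyMergeOffset(MV base, int distIdx, int dirIdx, bool integerPelFlag) {
    // Offset lengths in quarter-pel units (assumed table); when the flag
    // selects integer-pel accuracy, lengths are scaled up by 4.
    static const std::array<int, 8> dist = {1, 2, 4, 8, 16, 32, 64, 128};
    // left, right, top, bottom, top-left, bottom-left, top-right, bottom-right
    static const std::array<MV, 8> dir = {{{-1, 0}, {1, 0}, {0, -1}, {0, 1},
                                           {-1, -1}, {-1, 1}, {1, -1}, {1, 1}}};
    int len = dist[distIdx] << (integerPelFlag ? 2 : 0);
    // The resulting offset may additionally be scaled by the POC difference
    // between the current picture and its reference picture, as noted above.
    return {base.x + dir[dirIdx].x * len, base.y + dir[dirIdx].y * len};
}
```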
An image encoding/decoding method and apparatus according to the present disclosure may determine a prediction block of a current block belonging to a current picture by using a previously reconstructed region in the current picture, encode/decode a transform block of the current block, and reconstruct the current block based on the prediction block and the transform block.
In the image encoding/decoding method and apparatus according to the present disclosure, determining the prediction block may include determining a candidate for deriving motion information of the current block, configuring a candidate list of the current block based on the candidate, and determining the motion information of the current block from the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidates may represent motion information of a surrounding block spatially adjacent to the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, there may be a restriction that a prediction block belongs to the same Coding Tree Unit (CTU) or CTU row as a current block.
In the image encoding/decoding method and apparatus according to the present disclosure, motion information of surrounding blocks may be selectively added to the candidate list based on whether the size of the current block is greater than a predetermined threshold size.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidate list may additionally include motion information stored in a buffer of the encoding/decoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the current block may be divided into a plurality of sub-blocks, and encoding/decoding the transform block may include: encoding/decoding sub-block information of a sub-block of the current block; and, when at least one non-zero coefficient exists in the sub-block according to the sub-block information, encoding/decoding at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3 for a current coefficient in the sub-block.
In the image encoding/decoding method and apparatus according to the present disclosure, number information of the sub-block may be encoded/decoded, and the number information may represent the maximum number of pieces of coefficient information allowed for the sub-block.
In the image encoding/decoding method and apparatus according to the present disclosure, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3.
In the image encoding/decoding method and apparatus according to the present disclosure, the number information may be increased or decreased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3 is encoded/decoded.
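The counting scheme above can be pictured with a small sketch: a per-sub-block budget (the "number information") limits how many pieces of coefficient information may be coded, decrementing by 1 for each coded flag, after which remaining data falls back to another coding path. The budget value and the four-flags-per-coefficient loop are illustrative assumptions.

```cpp
#include <cstdio>

// Illustrative budget on how many pieces of coefficient information
// (greater-than-0, greater-than-1, parity, greater-than-3) may be coded
// with context models for one sub-block.
struct CoeffBudget {
    int remaining;  // "number information": max allowed coefficient flags

    // Returns true if one more flag may be context-coded; decrements by 1
    // each time a flag is actually coded, as the text describes.
    bool consume() {
        if (remaining <= 0) return false;  // fall back to bypass coding
        --remaining;
        return true;
    }
};

int main() {
    CoeffBudget budget{28};  // budget for a 4x4 sub-block (assumed value)
    int contextCoded = 0, bypassCoded = 0;
    for (int coeff = 0; coeff < 16; ++coeff) {
        // Up to 4 flags per coefficient: >0, >1, parity, >3.
        for (int flag = 0; flag < 4; ++flag)
            (budget.consume() ? contextCoded : bypassCoded)++;
    }
    std::printf("context-coded flags: %d, bypass flags: %d\n",
                contextCoded, bypassCoded);
}
```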
Technical effects
According to the present disclosure, prediction efficiency may be improved by efficiently deriving motion information for generating a MERGE/AMVP candidate list.
The present disclosure can improve encoding efficiency by selecting a motion vector predictor using reconstructed motion information around a current block and efficiently transmitting the motion information.
The present disclosure may improve the accuracy of the prediction signal by searching for the motion information of a block in the current picture as well, rather than only in previously reconstructed pictures, and may provide an image encoding/decoding method and apparatus that thereby transmit transform coefficients more efficiently.
Drawings
Fig. 1 is a flowchart schematically showing an image encoding apparatus.
Fig. 2 is a diagram for describing a prediction unit of an image encoding apparatus in detail.
Fig. 3 is a diagram for describing a method of deriving candidate motion information for the SKIP and MERGE modes.
Fig. 4 is a flowchart illustrating a method of deriving candidate motion information of an AMVP mode.
Fig. 5 is a flowchart illustrating a method of encoding prediction information.
Fig. 6 is a flowchart schematically showing an image decoding apparatus.
Fig. 7 is a diagram for describing a prediction unit of an image decoding apparatus.
Fig. 8 is a flowchart illustrating a method of decoding prediction information.
Fig. 9 is a flowchart for describing a method of configuring a MERGE/AMVP candidate list according to the present embodiment.
Fig. 10 is a diagram for describing a method of deriving temporal candidate motion information according to the present embodiment.
Fig. 11 is a diagram for describing a first method of determining a target block in a target picture when deriving temporal candidate motion information according to the present embodiment.
Fig. 12 is a diagram for describing a second method of determining a target block in a target picture when deriving temporal candidate motion information according to the present embodiment.
Fig. 13 is a diagram for describing a third method of determining a target block in a target picture when deriving temporal candidate motion information according to the present embodiment.
Fig. 14 is a diagram for describing a first method of deriving history-based candidate motion information according to the present embodiment.
Fig. 15 is a diagram for describing a second method of deriving history-based candidate motion information according to the present embodiment.
Fig. 16 is a diagram for describing a first method of deriving average candidate motion information according to the present embodiment.
Fig. 17 is a diagram for describing a second method of deriving average candidate motion information according to the present embodiment.
Fig. 18 is a diagram for describing a method of predicting MVD information.
Fig. 19 is an example table for describing a method of configuring a Merge/AMVP candidate list of motion vectors according to an embodiment of the present disclosure.
Fig. 20 is an example table for describing a method of configuring a Merge/AMVP candidate list for an MVD according to an embodiment of the present disclosure.
Fig. 21 is a flowchart illustrating a procedure of encoding prediction information including MVD candidate motion information according to an embodiment of the present disclosure.
Fig. 22 is a flowchart illustrating a flow of decoding prediction information including MVD candidate motion information according to an embodiment of the present disclosure.
Fig. 23 is a flowchart illustrating a procedure of encoding prediction information including additional MVD information according to an embodiment of the present disclosure.
Fig. 24 is a flowchart illustrating a procedure of decoding prediction information including additional MVD information according to an embodiment of the present disclosure.
Fig. 25 is a table showing a configuration example of a reference picture set according to an embodiment of the present disclosure.
Fig. 26 is a table describing a method of adaptively determining an inter prediction direction and binarization of reference picture index information according to a configuration state of a reference picture set according to an embodiment of the present disclosure.
Fig. 27 is a table illustrating information on the initial occurrence probabilities of the MPS and LPS according to the context of the corresponding binary bit when transmitting binary bits of inter prediction direction information according to an embodiment of the present disclosure.
Fig. 28 is a diagram illustrating an update rule of an LPS occurrence probability according to an embodiment of the present disclosure.
Fig. 29 is a block diagram illustrating an intra prediction unit of the image encoding apparatus.
Fig. 30 is a block diagram illustrating an inter prediction unit of the image encoding apparatus.
Fig. 31 illustrates a method of encoding prediction mode information.
Fig. 32 shows an intra prediction unit of the image decoding apparatus.
Fig. 33 shows an inter prediction unit of the image decoding apparatus.
Fig. 34 illustrates a method of decoding prediction mode information.
Fig. 35 is a flowchart illustrating an encoding method of a transform block.
Fig. 36 is a flowchart illustrating a decoding method of a transform block.
Fig. 37 is a flowchart showing a context-adaptive binary arithmetic encoding method.
Fig. 38 is a flowchart showing a context-adaptive binary arithmetic decoding method.
Fig. 39 is a diagram showing an example in which probability information is applied differently according to information of surrounding coefficients.
Fig. 40 is a diagram showing an example in which probability information is applied differently according to information of surrounding coefficients.
Fig. 41 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present disclosure.
Fig. 42 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 43 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 44 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 45 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 46 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 47 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 48 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 49 illustrates a method of generating a prediction block by using intra block copy prediction according to an embodiment of the present disclosure.
Fig. 50 is a block diagram illustrating an intra block copy prediction unit of an image encoding apparatus according to an embodiment of the present disclosure.
Fig. 51 shows the positions of neighboring spatial candidates around the current block.
Fig. 52 illustrates a method of encoding prediction mode information according to an embodiment of the present disclosure.
Fig. 53 is a block diagram illustrating an intra block copy prediction unit of an image decoding apparatus according to an embodiment of the present disclosure.
Fig. 54 is a block diagram illustrating an intra block copy prediction unit of an image decoding apparatus according to an embodiment of the present disclosure.
Fig. 55 illustrates a method of decoding prediction mode information according to an embodiment of the present disclosure.
Fig. 56 is a flowchart illustrating a method of encoding quantized transform coefficients according to an embodiment of the present disclosure.
Fig. 57 is a flowchart illustrating a method of decoding quantized transform coefficients according to an embodiment of the present disclosure.
Detailed Description
Best mode for carrying out the invention
The image encoding/decoding method and apparatus according to the present disclosure may generate a candidate list of a current block and perform inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the plurality of candidates may include at least one of a spatial candidate, a temporal candidate, or a candidate based on the reconstruction information, and the candidate based on the reconstruction information may be added from a buffer storing motion information decoded before the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the motion information stored in the buffer may be added to the candidate list starting from the motion information stored last in the buffer (last-in-first-out order), or starting from the motion information stored first in the buffer (first-in-first-out order).
In the image encoding/decoding method and apparatus according to the present disclosure, the number or order of adding motion information stored in the buffer to the candidate list may be differently determined according to the inter prediction mode of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidate list may be filled by using the motion information stored in the buffer until the maximum number of candidates in the candidate list is reached, or may be filled by using the motion information stored in the buffer until the maximum number of candidates minus 1 is reached.
In the image encoding/decoding method and apparatus according to the present disclosure, the buffer may be initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
The computer-readable recording medium according to the present disclosure may store a bitstream to be decoded by an image decoding method.
In the computer-readable recording medium according to the present disclosure, the image decoding method may include generating a candidate list of the current block, and performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the computer-readable recording medium according to the present disclosure, the plurality of candidates may include at least one of a spatial candidate, a temporal candidate, or a candidate based on the reconstruction information, and the candidate based on the reconstruction information may be added from a buffer storing motion information decoded before the current block.
In the computer-readable recording medium according to the present disclosure, the motion information stored in the buffer may be added to the candidate list starting from the motion information stored last in the buffer (last-in-first-out order), or starting from the motion information stored first in the buffer (first-in-first-out order).
In the computer-readable recording medium according to the present disclosure, the number or order of adding motion information stored in the buffer to the candidate list may be differently determined according to the inter prediction mode of the current block.
In the computer-readable recording medium according to the present disclosure, the candidate list may be filled by using the motion information stored in the buffer until the maximum number of candidates in the candidate list is reached, or may be filled by using the motion information stored in the buffer until the maximum number of candidates minus 1 is reached.
In the computer-readable recording medium according to the present disclosure, the buffer may be initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
The image encoding/decoding method and apparatus according to the present disclosure may generate a candidate list of a current block and perform inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the plurality of candidates may include a temporal candidate in units of sub-blocks. The temporal candidate in units of sub-blocks may be a candidate for deriving motion information of each sub-block of the current block, and may have motion information of a target block temporally adjacent to the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the sub-block may be an N×M block having a fixed size preset in the decoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the sub-block of the target block may be determined as a block at a position offset from the position of the sub-block of the current block by a predetermined temporal motion vector.
In the image encoding/decoding method and apparatus according to the present disclosure, the temporal motion vector may be set by using only a surrounding block at a specific position in a spatial surrounding block of the current block, and the surrounding block at the specific position may be a left-side block of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the temporal motion vector may be set only when the reference picture of the surrounding block at the specific position is identical to the target picture to which the target block belongs.
The computer-readable recording medium according to the present disclosure may store a bitstream to be decoded by an image decoding method.
In the computer-readable recording medium according to the present disclosure, the image decoding method may include generating a candidate list of the current block, and performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list.
In the computer-readable recording medium according to the present disclosure, the plurality of candidates may include a temporal candidate in units of sub-blocks. The temporal candidate in units of sub-blocks may be a candidate for deriving motion information of each sub-block of the current block, and may have motion information of a target block temporally adjacent to the current block.
In the computer-readable recording medium according to the present disclosure, the sub-block may be an N×M block having a fixed size preset in the decoding apparatus.
In the computer-readable recording medium according to the present disclosure, the sub-block of the target block may be determined as a block at a position offset from the position of the sub-block of the current block by a predetermined temporal motion vector.
In the computer-readable recording medium according to the present disclosure, the temporal motion vector may be set by using only a surrounding block at a specific position in a spatial surrounding block of the current block, and the surrounding block at the specific position may be a left-side block of the current block.
In the computer-readable recording medium according to the present disclosure, the temporal motion vector may be set only when the reference picture of the surrounding block at the specific position is the same as the target picture to which the target block belongs.
The image encoding/decoding method and apparatus according to the present disclosure may configure a merge candidate list of a current block, set any one of a plurality of merge candidates belonging to the merge candidate list as motion information of the current block, derive a final motion vector of the current block by adding a predetermined motion vector difference value (MVD) to a motion vector in the motion information of the current block, and generate a prediction block of the current block by performing motion compensation based on the final motion vector.
In the image encoding/decoding method and apparatus according to the present disclosure, the merge candidate list may be configured with k merge candidates, and k may be a natural number such as 4, 5, 6, or more.
In the image encoding/decoding method and apparatus according to the present disclosure, motion information of the current block may be set by using any one of the first merge candidate or the second merge candidate belonging to the merge candidate list according to the merge candidate index information transmitted from the encoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the motion vector difference value may be derived based on a predetermined offset vector, and the predetermined offset vector may be derived based on at least one of its length or its direction.
In the image encoding/decoding method and apparatus according to the present disclosure, the length of the predetermined offset vector may be determined based on at least one of a distance index or a predetermined flag, and the flag may represent information indicating whether the motion vector uses integer pixel accuracy in the merge mode of the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, the direction of the predetermined offset vector may be determined based on a direction index, and the direction may represent any one of left, right, top, bottom, upper left, lower left, upper right, or lower right directions.
In the image encoding/decoding method and apparatus according to the present disclosure, the predetermined offset vector may be modified by considering a POC difference between a reference picture of the current block and a current picture to which the current block belongs.
An image encoding/decoding method and apparatus according to the present disclosure may determine a prediction block of a current block belonging to a current picture by using a previously reconstructed region in the current picture, encode/decode a transform block of the current block, and reconstruct the current block based on the prediction block and the transform block.
In the image encoding/decoding method and apparatus according to the present disclosure, determining the prediction block may include determining a candidate for deriving motion information of the current block, configuring a candidate list of the current block based on the candidate, and determining the motion information of the current block from the candidate list.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidates may represent motion information of a surrounding block spatially adjacent to the current block.
In the image encoding/decoding method and apparatus according to the present disclosure, there may be a restriction that a prediction block belongs to the same Coding Tree Unit (CTU) or CTU row as a current block.
In the image encoding/decoding method and apparatus according to the present disclosure, motion information of surrounding blocks may be selectively added to the candidate list based on whether the size of the current block is greater than a predetermined threshold size.
In the image encoding/decoding method and apparatus according to the present disclosure, the candidate list may additionally include motion information stored in a buffer of the encoding/decoding apparatus.
In the image encoding/decoding method and apparatus according to the present disclosure, the current block may be divided into a plurality of sub-blocks, and encoding/decoding the transform block may include: encoding/decoding sub-block information of a sub-block of the current block; and, when at least one non-zero coefficient exists in the sub-block according to the sub-block information, encoding/decoding at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3 for a current coefficient in the sub-block.
In the image encoding/decoding method and apparatus according to the present disclosure, number information of the sub-block may be encoded/decoded, and the number information may represent the maximum number of pieces of coefficient information allowed for the sub-block.
In the image encoding/decoding method and apparatus according to the present disclosure, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3.
In the image encoding/decoding method and apparatus according to the present disclosure, the number information may be increased or decreased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, parity information, or coefficient information greater than 3 is encoded/decoded.
Embodiments of the present disclosure
Embodiments of the present disclosure are described in detail with reference to the drawings attached to the specification so that those of ordinary skill in the art to which the present disclosure pertains can easily implement the present disclosure. However, the present disclosure may be embodied in various different forms and is not limited to the embodiments described herein. In the drawings, parts not relevant to the description are omitted in order to describe the present disclosure clearly, and like reference numerals are attached to like parts throughout the specification.
In this specification, when one component is referred to as being "connected to" another component, this includes both the case where it is directly connected and the case where it is electrically connected with another element interposed between them.
In addition, in this specification, when a component is referred to as "including" another component, this means that still other components may additionally be included, rather than being excluded, unless specifically stated otherwise.
Also, terms such as first, second, etc. may be used to describe various components, but these components should not be limited by these terms. These terms are only used to distinguish one component from another.
In addition, in the embodiments of the apparatus and method described in the present specification, some configurations of the apparatus or some steps of the method may be omitted. In addition, some configurations of the apparatus or the order of some steps of the method may be changed. Additionally, another configuration or another step may be inserted in some configurations of the device or some steps of the method.
In addition, some configurations or some steps in the first embodiment of the present disclosure may be added to or replaced with some configurations or some steps in the second embodiment of the present disclosure.
In addition, although the constituent units shown in the embodiments of the present disclosure are illustrated independently to represent different characteristic functions, this does not mean that each constituent unit is implemented as separate hardware or as a single software unit. In other words, each constituent unit is listed separately for convenience of description; at least two constituent units may be combined into one constituent unit, or one constituent unit may be divided into a plurality of constituent units that perform functions. Such integrated and separated embodiments of each constituent unit are also included in the scope of the present disclosure as long as they do not depart from the essence of the present disclosure.
In this specification, a block may be variously represented as a unit, a region, a partition, or the like, and a sample may be variously represented as a pixel, a pixel point, a pel, or the like.
Hereinafter, embodiments of the present disclosure will be described in more detail with reference to the accompanying drawings. In describing the present disclosure, a repetitive description of the same components is omitted.
Fig. 1 is a block diagram schematically showing the configuration of an image encoding apparatus. As a device for encoding an image, an image encoding apparatus may mainly include a block partitioning unit, a prediction unit, a transform unit, a quantization unit, an entropy encoding unit, an inverse quantization unit, an inverse transform unit, an addition unit, an in-loop filter unit, a memory unit, and a subtraction unit.
The block partitioning unit 101 partitions a block of the maximum coding size (hereinafter referred to as a maximum coding block) down to blocks of the minimum coding size (hereinafter referred to as minimum coding blocks). There are a variety of block partitioning methods. A quadtree partition (hereinafter referred to as a QT partition) partitions a current coding block into four equal parts. A binary tree partition (hereinafter referred to as a BT partition) partitions a coding block into two equal parts in either the horizontal or the vertical direction. A ternary tree partition partitions a coding block into three parts in either the horizontal or the vertical direction. When a coding block is partitioned in the horizontal direction, the ratio of the heights of the partitioned blocks may be {1:n:1}; when it is partitioned in the vertical direction, the ratio of the widths of the partitioned blocks may be {1:n:1}. Here, n may be a natural number such as 1, 2, 3, or more. Various other partitioning methods may exist, and the partitioning may be performed by considering several partitioning methods at the same time.
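A small C++ sketch of the partition shapes just described, under the assumption n = 2 for the ternary split (the text allows other natural numbers); block sizes are in samples, and all names are illustrative.

```cpp
#include <cstdio>
#include <vector>

// A quadtree split yields four equal sub-blocks; a binary split halves one
// dimension; a ternary split divides one dimension in a 1:n:1 ratio.
struct Block { int w, h; };

std::vector<Block> split(Block b, char type, bool vertical, int n = 2) {
    switch (type) {
        case 'Q':  // quadtree: four equal parts
            return {{b.w / 2, b.h / 2}, {b.w / 2, b.h / 2},
                    {b.w / 2, b.h / 2}, {b.w / 2, b.h / 2}};
        case 'B':  // binary: two equal parts along one direction
            return vertical
                ? std::vector<Block>{{b.w / 2, b.h}, {b.w / 2, b.h}}
                : std::vector<Block>{{b.w, b.h / 2}, {b.w, b.h / 2}};
        case 'T': {  // ternary: 1:n:1 ratio along the chosen direction
            int unit = (vertical ? b.w : b.h) / (n + 2);
            return vertical
                ? std::vector<Block>{{unit, b.h}, {n * unit, b.h}, {unit, b.h}}
                : std::vector<Block>{{b.w, unit}, {b.w, n * unit}, {b.w, unit}};
        }
    }
    return {b};  // unknown type: no split
}

int main() {
    for (Block s : split({32, 32}, 'T', /*vertical=*/true))
        std::printf("%dx%d ", s.w, s.h);  // prints: 8x32 16x32 8x32
}
```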
The prediction unit 102 generates a prediction block by using pixels surrounding the block currently being predicted (hereinafter referred to as a prediction block) in the current original block, or pixels in a reference picture that has already been encoded/decoded. One or more prediction blocks may be generated within a coding block; when the number of prediction blocks in a coding block is one, the prediction block has the same shape as the coding block. Prediction of a video signal mainly consists of intra prediction and inter prediction: intra prediction is a method of generating a prediction block by using pixels surrounding the current block, and inter prediction is a method of generating a prediction block by finding the block most similar to the current block in a reference picture that has already been encoded/decoded. Then, for the residual block obtained by subtracting the prediction block from the original block, the optimal prediction mode of the prediction block is determined by using methods such as RDO (rate-distortion optimization). The formula for calculating the RDO cost is shown in Equation 1.
[Equation 1]
J(Φ, λ) = D(Φ) + λ·R(Φ)
D, R, and J denote the degradation (distortion) caused by quantization, the bit rate of the compressed stream, and the RD cost, respectively; Φ is the encoding mode, and λ is the Lagrangian multiplier, which is used as a scaling coefficient to match the units between the error amount and the bit amount. For a mode to be selected as the optimal coding mode in the encoding process, the J (i.e., RD cost) value when the corresponding mode is applied should be smaller than the J value when the other modes are applied; that is, the RD cost value is calculated by considering both the bit rate and the error.
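As a worked example of Equation 1, the sketch below evaluates J = D + λ·R for a few candidate modes and keeps the minimum. The distortion, rate, and λ values are made up purely for illustration.

```cpp
#include <cstdio>

// Each candidate mode Φ has a distortion D(Φ) and a bit cost R(Φ);
// the mode minimizing J = D + λ·R is selected.
struct ModeCost { const char* name; double D; double R; };

int main() {
    const double lambda = 20.0;  // Lagrange multiplier (illustrative value)
    ModeCost modes[] = {{"intra DC", 1500.0, 40.0},
                        {"merge",     900.0, 75.0},
                        {"AMVP",      700.0, 98.0}};
    const ModeCost* best = nullptr;
    double bestJ = 1e300;
    for (const ModeCost& m : modes) {
        double J = m.D + lambda * m.R;  // Equation 1
        if (J < bestJ) { bestJ = J; best = &m; }
    }
    std::printf("best mode: %s (J = %.1f)\n", best->name, bestJ);
}
```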
An intra prediction unit (not shown) may generate a prediction block based on reference pixel information around the current block, which is pixel information in the current picture. When the prediction mode of a surrounding block of the current block on which intra prediction is to be performed is inter prediction, reference pixels in other surrounding blocks to which intra prediction was applied may be used instead of the reference pixels included in the surrounding block to which inter prediction was applied. In other words, when a reference pixel is unavailable, it may be replaced with at least one of the available reference pixels.
In intra prediction, the prediction modes may include directional prediction modes, which use reference pixel information according to a prediction direction, and non-directional modes, which do not use directional information when performing prediction. The mode for predicting luminance information may be different from the mode for predicting chrominance information, and chrominance information may be predicted by using the intra prediction mode information used to predict the luminance information, or by using the predicted luminance signal information.
The intra prediction unit may include an Adaptive Intra Smoothing (AIS) filter, a reference pixel interpolation unit, and a DC filter. As part of performing filtering on the reference pixels of the current block, the AIS filter may adaptively determine whether to apply the filter according to a prediction mode in the current prediction unit. When the prediction mode of the current block is a mode in which the AIS filtering is not performed, the AIS filter may not be applied.
When the prediction mode of the prediction unit is a mode that performs intra prediction based on the pixel values of interpolated reference pixels, the reference pixel interpolation unit in the intra prediction unit may interpolate the reference pixels to generate reference pixels at fractional positions. When the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixels, the reference pixels may not be interpolated. When the prediction mode of the current block is the DC mode, the DC filter may generate the prediction block through filtering.
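To illustrate the DC mode and the reference-pixel substitution described above, here is a minimal sketch that first replaces unavailable reference pixels with the nearest available one and then fills the prediction block with the average. The substitution rule and the mid-gray fallback are simplifying assumptions.

```cpp
#include <cstdint>
#include <vector>

// refs: the reference pixel row/column, with -1 marking unavailable samples
// (assumed encoding; refs must be non-empty). Returns a size x size
// prediction block filled with the DC value.
std::vector<uint8_t> predictDC(std::vector<int> refs, int size) {
    int lastValid = 128;  // mid-gray fallback if nothing is available yet
    for (int& r : refs) {
        if (r < 0) r = lastValid;  // substitute an available neighbor
        else lastValid = r;
    }
    long sum = 0;
    for (int r : refs) sum += r;
    uint8_t dc = static_cast<uint8_t>(sum / (long)refs.size());
    return std::vector<uint8_t>((size_t)size * size, dc);  // flat DC block
}
```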
An inter prediction unit (not shown) generates a prediction block by using the motion information stored in the memory 110 and a previously reconstructed reference picture. For example, the motion information may include a motion vector, a reference picture index, a list-1 prediction flag, a list-0 prediction flag, and the like.
The inter prediction unit may derive the prediction block based on information of at least one of a previous picture or a subsequent picture of the current picture. In addition, a prediction block for the current block may be derived based on information of some regions encoded in the current picture. An inter prediction unit according to an embodiment of the present disclosure may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.
In the reference picture interpolation unit, reference picture information may be provided from the memory 110, and pixel information at positions finer than integer pixels may be generated in the reference picture. For luminance pixels, a DCT-based 8-tap interpolation filter with different filter coefficients per fractional phase may be used to generate pixel information in units of 1/4 pixel. For chrominance signals, a DCT-based 4-tap interpolation filter with different filter coefficients per fractional phase may be used to generate pixel information in units of 1/8 pixel.
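The following sketch shows one fractional luma position being interpolated with a DCT-based 8-tap filter. The tap values are the well-known HEVC-style half-pel coefficients (sum 64); each quarter-pel phase would use its own tap set, and 8-bit samples are assumed.

```cpp
#include <cstdint>
#include <vector>

// Interpolate the half-pel sample between row[x] and row[x + 1].
// The caller must guarantee a 3-sample margin on each side of x.
int interpolateHalfPel(const std::vector<uint8_t>& row, int x) {
    static const int taps[8] = {-1, 4, -11, 40, 40, -11, 4, -1};
    int acc = 0;
    for (int k = 0; k < 8; ++k)
        acc += taps[k] * row[x - 3 + k];
    int val = (acc + 32) >> 6;  // round and normalize by the tap sum 64
    return val < 0 ? 0 : (val > 255 ? 255 : val);  // clip to 8-bit range
}
```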
The motion prediction unit may perform motion prediction based on the reference picture interpolated by the reference picture interpolation unit. As methods for calculating the motion vector, various methods such as FBMA (full-search-based block matching algorithm), TSS (three-step search), and NTS (new three-step search algorithm) may be used. The motion vector may have a motion vector value in units of 1/2 or 1/4 pixel based on the interpolated pixels. In the motion prediction unit, the prediction block of the current block may be predicted by varying the motion prediction method. As the motion prediction method, various methods such as the Skip mode, the Merge mode, and the AMVP (advanced motion vector prediction) mode may be used.
Fig. 2 is a flowchart describing the flow in the prediction unit of the image encoding apparatus. When intra prediction is performed (201) by using original information and reconstruction information, the optimal intra prediction mode is determined (202) by using the RD cost value of each prediction mode, and a prediction block is generated. When inter prediction is performed (203) by using the original information and the reconstruction information, the RD cost values of the SKIP mode, the MERGE mode, and the AMVP mode are calculated. In the MERGE candidate search unit (204), candidate motion information sets for the SKIP mode and the MERGE mode are configured, and the optimal motion information is determined (205) by using the RD cost values over the corresponding candidate motion information set. In the AMVP candidate search unit (206), a candidate motion information set for the AMVP mode is configured, motion prediction is performed (207) by using the corresponding candidate motion information set, and the optimal motion information is determined. The prediction block is generated by performing motion compensation (208) using the optimal motion information determined in each mode.
The above inter prediction may be configured with three modes (SKIP mode, MERGE mode, and AMVP mode). Each prediction mode finds a prediction block of the current block by using motion information (prediction direction information, reference picture information, and a motion vector), and there may be additional prediction modes that use the motion information.
The SKIP mode determines optimal prediction information by using the motion information of a previously reconstructed region. A motion information candidate group is configured within the reconstructed region, and a prediction block is generated by using the candidate having the smallest RD cost value in the corresponding candidate group as the prediction information. Here, the method of configuring the motion information candidate group is the same as the method of configuring the motion information candidate group of the MERGE mode, so its description is omitted in this specification.
The MERGE mode is the same as the SKIP mode in that optimal prediction information is determined by using motion information of a previously reconstructed region. However, they are different in that the SKIP mode searches for motion information having a prediction error of zero in a motion information candidate group, and the MERGE mode searches for motion information having a prediction error of non-zero in a motion information candidate group. Similar to the SKIP mode, a motion information candidate group is configured in the reconstruction region to generate a prediction block by using a candidate having the smallest RD cost value in the corresponding candidate group as prediction information.
[Equation 2]
MVscale = MV × (TD / TB)

MV denotes the motion vector of the motion information of the temporal candidate block, MVscale denotes the scaled motion vector, TB denotes the temporal distance between the co-located picture and reference picture B, and TD denotes the temporal distance between the current picture and reference picture A. In addition, reference picture A and reference picture B may be the same reference picture. The motion information of the temporal candidate is derived by determining the scaled motion vector as the motion vector of the temporal candidate and determining the reference picture of the current picture as the reference picture information of the temporal candidate motion information. Step S307 is performed only when the maximum number of motion information candidates is not filled in steps S305 and S306; step S307 is a step of adding new bi-directional motion information candidates through combinations of the motion information candidates derived in the previous steps. A bi-directional motion information candidate is generated by taking the motion information of each of the previously derived past and future directions and combining them into a new candidate. The table 304 of Fig. 3 indicates the priority of bi-directional motion information candidate combinations. Combinations other than those in the table may additionally be used; the table represents only one example. When the maximum number of motion information candidates is still not filled although the bi-directional motion information candidates are used, step S308 is performed. In step S308, the motion vector of the motion information candidate is fixed to the zero motion vector, and the maximum number of motion information candidates is filled by making the reference picture different according to the prediction direction.
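A minimal sketch of the scaling in Equation 2, under the assumption that the temporal distances TD and TB are signed POC differences and that simple integer division stands in for the codec's exact rounding:

```cpp
struct MV { int x, y; };

// td: temporal distance between the current picture and reference picture A
// tb: temporal distance between the co-located picture and reference picture B
MV scaleTemporalMv(MV mv, int td, int tb) {
    if (tb == 0 || td == tb) return mv;  // nothing to scale
    MV out;
    out.x = mv.x * td / tb;  // MVscale = MV * (TD / TB)
    out.y = mv.y * td / tb;
    return out;
}
```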
The AMVP mode determines optimal motion information through motion estimation for each reference picture according to the prediction direction. Here, the prediction direction may be unidirectional, using only one of the past/future directions, or may be bidirectional, using both the past and future directions. The prediction block is generated by performing motion compensation using the optimal motion information determined through motion estimation. Here, the motion information candidate group for motion estimation is derived for each reference picture according to the prediction direction, and the corresponding motion information candidate group is used as the starting point of motion estimation. The method of deriving the motion information candidate group for motion estimation of the AMVP mode is described with reference to Fig. 4.
The maximum number of motion information candidates may be determined identically in the image encoding apparatus and the image decoding apparatus, and the corresponding number information may be transmitted in advance in an upper header of the image encoding apparatus. As in the description of steps S401 and S402, motion information derived by using a spatial candidate block or a temporal candidate block is included in the motion information candidate group only when the corresponding block is encoded in an inter prediction mode. In step S401, unlike the description of step S305, the number of derived spatial candidates (2) may be different, and the priority for selecting the spatial candidates may also be different; the remaining description is the same as that of step S305. Step S402 is the same as the description of step S306. In step S403, when duplicate motion information exists among the candidates derived so far, it is removed. Step S404 is the same as the description of step S308. Among the motion information candidates derived in this manner, the motion information candidate having the smallest RD cost value is selected as the optimal motion information candidate, and the optimal motion information of the AMVP mode is obtained through a motion estimation process based on the corresponding candidate.
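The flow of Fig. 4 can be summarized in a short sketch: gather spatial then temporal candidates (S401, S402), remove duplicates (S403), and pad with zero motion vectors (S404). Candidate sources are stubbed out; only the list-building flow is illustrated, and the maximum list size is an assumption.

```cpp
#include <vector>

struct MV {
    int x = 0, y = 0;
    bool operator==(const MV& o) const { return x == o.x && y == o.y; }
};

std::vector<MV> buildAmvpList(const std::vector<MV>& spatial,
                              const std::vector<MV>& temporal,
                              size_t maxCand /* e.g. 2 */) {
    std::vector<MV> list;
    for (const MV& mv : spatial)                       // S401: spatial
        if (list.size() < maxCand) list.push_back(mv);
    for (const MV& mv : temporal)                      // S402: temporal
        if (list.size() < maxCand) list.push_back(mv);
    for (size_t i = 0; i + 1 < list.size(); ++i)       // S403: remove dups
        for (size_t j = list.size() - 1; j > i; --j)
            if (list[j] == list[i]) list.erase(list.begin() + j);
    while (list.size() < maxCand)                      // S404: zero-MV fill
        list.push_back(MV{});
    return list;
}
```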
The transform unit 103 generates a transform block by transforming a residual block, which is the difference between the original block and the prediction block. The transform block is the smallest unit used for the transform and quantization processes. The transform unit generates a transform block having transform coefficients by transforming the residual signal to the frequency domain. Various methods may be used to transform the residual signal to the frequency domain, such as a DCT (discrete cosine transform)-based transform, a DST (discrete sine transform), or a KLT (Karhunen-Loeve transform), and the transform coefficients are generated by transforming the residual signal to the frequency domain using one of them. Each transform is conveniently implemented as a matrix operation using its basis vectors, and the transforms may be mixed in various ways according to the prediction mode in which the prediction block is encoded. For example, in intra prediction, a discrete cosine transform may be used in the horizontal direction and a discrete sine transform in the vertical direction according to the prediction mode.
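A minimal floating-point sketch of such a separable mixed transform is shown below; the function names are illustrative, the basis matrices follow the textbook DCT-II and DST-VII definitions, and one transform is applied per direction. Real codecs use scaled integer approximations, so this illustrates the matrix-operation idea rather than any normative transform.

import numpy as np

def dct2_matrix(n):
    # Orthonormal DCT-II basis: row k is basis vector k.
    k, m = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    t[0, :] /= np.sqrt(2.0)
    return t

def dst7_matrix(n):
    # Orthonormal DST-VII basis.
    k, m = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    return np.sqrt(4.0 / (2 * n + 1)) * np.sin(np.pi * (2 * m + 1) * (k + 1) / (2 * n + 1))

def forward_transform(residual, horiz, vert):
    # Separable 2D transform: vertical transform on columns, horizontal on rows.
    return vert @ residual @ horiz.T

residual = np.random.randint(-64, 64, (8, 8)).astype(float)
coeffs = forward_transform(residual, horiz=dct2_matrix(8), vert=dst7_matrix(8))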
The quantization unit 104 generates a quantized transform block by quantizing the transform block. In other words, the quantization unit generates a quantized transform block (with quantized transform coefficients) by quantizing the transform coefficients of the transform block generated by the transform unit 103. As the quantization method, DZUTQ (dead-zone uniform threshold quantization) or a quantization weighting matrix may be used, but various quantization methods, including improved variants thereof, may also be used.
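The dead-zone idea can be illustrated with a short sketch: a rounding offset smaller than half the step widens the zero bin around the origin. The step size and offset below are illustrative assumptions, not values from any standard.

def dz_quantize(coeff, step=16.0, f=16.0 / 3):
    # Because f < step / 2, all coefficients with |coeff| < step - f
    # fall into a widened zero bin (the dead zone).
    sign = 1 if coeff >= 0 else -1
    return sign * int((abs(coeff) + f) / step)

def dz_dequantize(level, step=16.0):
    # Uniform reconstruction of the quantized level.
    return level * step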
On the other hand, although the image encoding apparatus is shown and described above as including the transform unit and the quantization unit, the transform unit and the quantization unit may be selectively included or omitted. In other words, the image encoding apparatus may generate the transform block by transforming the residual block without performing the quantization process, may perform only the quantization process without transforming the residual block into frequency coefficients, or may perform neither transform nor quantization. Even when all or some of the processes of the transform unit and the quantization unit are not performed in the image encoding apparatus, the block input to the entropy encoding unit is generally referred to as a "quantized transform block".
The entropy encoding unit 105 outputs a bitstream by encoding the quantized transform block. In other words, the entropy encoding unit encodes the coefficients of the quantized transform block output from the quantization unit by using various encoding methods such as entropy encoding, and generates and outputs a bitstream that includes the additional information required to decode the corresponding block in the image decoding apparatus described below, such as information on the prediction mode (which may include the motion information or intra prediction mode information determined in the prediction unit) and the quantized coefficients.
The inverse quantization unit 106 reconstructs an inverse quantized transform block by inversely performing a quantization method used in quantization of the quantized transform block.
The inverse transform unit 107 reconstructs a residual block by inversely transforming the inversely quantized transform block using the same method as used in the transformation, and performs inverse transformation by inversely performing the transformation method used in the transform unit.
On the other hand, the inverse quantization unit and the inverse transformation unit may perform inverse quantization and inverse transformation by inversely using the quantization method used in the quantization unit and the transformation method used in the transformation unit. In addition, when the transform unit and the quantization unit perform only quantization without performing transform, only inverse quantization may be performed without performing inverse transform. When both transform and quantization are not performed, the inverse quantization unit and the inverse transform unit may not perform both inverse transform and inverse quantization, or the inverse quantization unit and the inverse transform unit may be omitted without being included in the image encoding device.
The addition unit 108 reconstructs the current block by adding the residual signal generated in the inverse transform unit to the prediction block generated through prediction.
The filter unit 109 performs additional filtering on the picture after all blocks in the current picture are reconstructed; the filtering includes deblocking filtering, SAO (sample adaptive offset), ALF (adaptive loop filter), and the like. Deblocking filtering refers to an operation of reducing the block distortion generated when an image is encoded in units of blocks, and SAO refers to an operation of minimizing the difference between the reconstructed image and the original image by subtracting a specific value from, or adding a specific value to, a reconstructed pixel. ALF may be performed based on values generated by comparing the filtered reconstructed image with the original image. Pixels included in the image may be divided into predetermined groups, one filter to be applied to each group may be determined, and filtering may be performed differently per group. Information on whether ALF is applied may be transmitted for each coding unit (CU), and the shape and/or filter coefficients of the ALF filter to be applied may differ per block. Alternatively, an ALF filter of the same shape (fixed shape) may be applied regardless of the characteristics of the block to which it is applied.
The memory 110 may store a current block reconstructed through additional filtering in the in-loop filter unit after adding the residual signal generated in the inverse transform unit and the prediction block generated through prediction, and it may be used to predict a subsequent block or a subsequent picture, etc.
The subtraction unit 111 generates a residual block by subtracting the prediction block from the current original block.
Fig. 5 is a flowchart showing the flow of encoding the encoding information in the entropy encoding unit of the image encoding apparatus. In step S501, the operation information of the SKIP mode is encoded. In step S502, it is determined whether the SKIP mode operates. When the SKIP mode operates in step S502, the flowchart ends after the merge candidate index information of the SKIP mode is encoded in step S507. When the SKIP mode does not operate in step S502, the prediction mode is encoded in step S503. In step S504, it is determined whether the prediction mode is the inter prediction mode or the intra prediction mode. When the prediction mode is the inter prediction mode in step S504, the operation information of the MERGE mode is encoded in step S505. In step S506, it is determined whether the MERGE mode operates. When the MERGE mode operates in step S506, the flowchart ends after moving to step S507 and encoding the merge candidate index information of the MERGE mode. When the MERGE mode does not operate in step S506, the prediction direction is encoded in step S508. In this case, the prediction direction may be one of the past direction, the future direction, and the bidirectional direction. In step S509, it is determined whether the prediction direction is the future direction. When the prediction direction is not the future direction in step S509, reference picture index information in the past direction is encoded in step S510. In step S511, MVD (motion vector difference) information in the past direction is encoded. In step S512, MVP (motion vector predictor) information in the past direction is encoded. When the prediction direction is the future direction or the bidirectional direction in step S509, or when step S512 ends, it is determined whether the prediction direction is the past direction in step S513. When the prediction direction is not the past direction in step S513, reference picture index information in the future direction is encoded in step S514. In step S515, MVD information in the future direction is encoded. After the MVP information in the future direction is encoded in step S516, the flowchart ends. When the prediction mode is the intra prediction mode in step S504, the flowchart ends after the intra prediction mode information is encoded in step S517.
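The same signaling order can be condensed into a short sketch. The encoder object and its encode_* helpers are hypothetical stand-ins for the entropy coder; only the order of operations mirrors the flowchart.

INTER, INTRA = 'inter', 'intra'
PAST, FUTURE, BIDIR = 'past', 'future', 'bidir'

def encode_block_info(enc, blk):
    enc.encode_flag(blk.skip)                        # S501
    if blk.skip:                                     # S502
        enc.encode_index(blk.merge_idx)              # S507
        return
    enc.encode_mode(blk.pred_mode)                   # S503
    if blk.pred_mode == INTER:                       # S504
        enc.encode_flag(blk.merge)                   # S505
        if blk.merge:                                # S506
            enc.encode_index(blk.merge_idx)          # S507
            return
        enc.encode_direction(blk.direction)          # S508
        if blk.direction != FUTURE:                  # S509: past or bidirectional
            enc.encode_ref_idx(blk.ref_idx[PAST])    # S510
            enc.encode_mvd(blk.mvd[PAST])            # S511
            enc.encode_mvp(blk.mvp[PAST])            # S512
        if blk.direction != PAST:                    # S513: future or bidirectional
            enc.encode_ref_idx(blk.ref_idx[FUTURE])  # S514
            enc.encode_mvd(blk.mvd[FUTURE])          # S515
            enc.encode_mvp(blk.mvp[FUTURE])          # S516
    else:
        enc.encode_intra_mode(blk.intra_mode)        # S517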
Fig. 6 is a block diagram schematically showing the configuration of the image decoding apparatus 600.
The image decoding apparatus 600 is an apparatus that decodes an image, and may mainly include an entropy decoding unit, an inverse quantization unit, an inverse transform unit, a prediction unit, an addition unit, an in-loop filter unit, and a memory. A block called an encoding block in the image encoding apparatus is referred to as a decoding block in the image decoding apparatus.
The entropy decoding unit 601 reads quantized transform coefficients and various information required to decode a corresponding block by interpreting a bitstream transmitted from an image encoding apparatus.
The inverse quantization unit 602 reconstructs an inverse quantization block having inversely quantized coefficients by inversely performing a quantization method used in quantizing the decoded quantized coefficients in the entropy decoding unit.
The inverse transform unit 603 reconstructs a residual block having a difference signal by inversely transforming the inversely quantized transform block using the same method as that used in the transform, and performs inverse transform by inversely performing the transform method used in the transform unit.
The prediction unit 604 generates a prediction block by using the prediction mode information decoded in the entropy decoding unit using the same method as the prediction method performed in the prediction unit of the image encoding apparatus.
The addition unit 605 reconstructs the current block by adding the residual signal reconstructed in the inverse transform unit to the prediction block generated through prediction.
The filter unit 606 performs additional filtering on the picture after all blocks in the current picture are reconstructed; the filtering includes deblocking filtering, SAO (sample adaptive offset), ALF, and the like, and the detailed description is the same as that of the in-loop filter unit of the image encoding apparatus described above.
The memory 607 may store the current block reconstructed through additional filtering in the in-loop filter unit after adding the residual signal generated in the inverse transform unit and the prediction block generated through prediction, and it may be used to predict a subsequent block or a subsequent picture, etc.
Fig. 7 is a flowchart describing the flow in the prediction unit of the image decoding apparatus. When the prediction mode is intra prediction, optimal intra prediction mode information is determined (701) and a prediction block is generated by performing intra prediction (702). When the prediction mode is inter prediction, the optimal prediction mode among the SKIP mode, the MERGE mode, and the AMVP mode is determined (703). When decoding in the SKIP mode or the MERGE mode, candidate motion information sets for the SKIP mode and the MERGE mode are configured in the MERGE candidate search unit 704. Optimal motion information is determined from the corresponding candidate motion information set by using the transmitted candidate index (e.g., merge index) (705). When decoding in the AMVP mode, a candidate motion information set for the AMVP mode is configured in the AMVP candidate search unit 706. Among the corresponding motion information candidates, optimal motion information is determined by using the transmitted candidate index (e.g., MVP information) (707). Then, a prediction block is generated by performing motion compensation (708) using the optimal motion information determined in each mode.
Fig. 8 is a flowchart showing the decoding flow of the encoded information in the image decoding apparatus. In step S801, the operation information of the SKIP mode is decoded. In step S802, it is determined whether the SKIP mode operates. When the SKIP mode operates in step S802, the flowchart ends after the merge candidate index information for the SKIP mode is decoded in step S807. When the SKIP mode does not operate in step S802, the prediction mode is decoded in step S803. In step S804, it is determined whether the prediction mode is the inter prediction mode or the intra prediction mode. When the prediction mode is the inter prediction mode in step S804, the operation information of the MERGE mode is decoded in step S805. In step S806, it is determined whether the MERGE mode operates. When the MERGE mode operates in step S806, the flowchart ends after moving to step S807 and decoding the merge candidate index information of the MERGE mode. When the MERGE mode does not operate in step S806, the prediction direction is decoded in step S808. In this case, the prediction direction may be one of the past direction, the future direction, and the bidirectional direction. In step S809, it is determined whether the prediction direction is the future direction. When the prediction direction is not the future direction in step S809, reference picture index information in the past direction is decoded in step S810. In step S811, MVD (motion vector difference) information in the past direction is decoded. In step S812, MVP (motion vector predictor) information in the past direction is decoded. When the prediction direction is the future direction or the bidirectional direction in step S809, or when step S812 ends, it is determined whether the prediction direction is the past direction in step S813. When the prediction direction is not the past direction in step S813, reference picture index information in the future direction is decoded in step S814. In step S815, MVD information in the future direction is decoded. After the MVP information in the future direction is decoded in step S816, the flowchart ends. When the prediction mode is the intra prediction mode in step S804, the flowchart ends after the intra prediction mode information is decoded in step S817.
The following embodiments will describe methods of deriving candidate motion information for inter prediction of a current block in the merge candidate search units 204, 704 and AMVP candidate search units 206, 706 of prediction units of image encoding and decoding apparatuses. The candidate motion information is immediately determined as the motion information of the current block in the merge candidate search unit and is used as a predictor for transmitting the optimal motion information of the current block in the AMVP candidate search unit.
Fig. 9 is a flowchart illustrating a method of deriving candidate motion information for the MERGE/AMVP mode. The flowchart shows the derivation for the MERGE mode and the AMVP mode together, but some candidates may not be used in each mode. Thus, the candidate motion information derived for each mode may be different, and the number of derived candidates may also be different. For example, the MERGE mode may select B = 4 of A = 5 spatial candidates, and the AMVP mode may select only B = 2 of A = 4 spatial candidates. In steps S901 and S902, A, B, C, and D (each an integer equal to or greater than 1) respectively represent the number of spatial candidates, the number of selected spatial candidates, the number of temporal candidates, and the number of selected temporal candidates.
The description of step S901 is the same as the description of steps S305 and S401 above. However, the positions of the surrounding blocks used as spatial candidates may be different. In addition, the surrounding blocks of the spatial candidates may belong to at least one of a first group, a second group, and a third group. In this case, the first group may include at least one of the left block (A1) and the lower-left block (A4) of the current block, the second group may include at least one of the top block (A2) and the upper-right block (A3) of the current block, and the third group may include at least one of the upper-left block (A5) of the current block, the block adjacent to the bottom of the upper-left block, and the block adjacent to the left side of the upper-left block.
The description of step S902 is the same as that of steps S306 and S402 described above. Likewise, the locations of the blocks of the temporal candidates may be different.
In step S903, time candidates in units of sub-blocks are added. However, when temporal candidates in units of subblocks are added in the AMVP candidate list, according to the above-described method of deriving a motion vector of the AMVP mode, only candidate motion information of one arbitrary subblock should be used as a predictor, but in some cases, candidate motion information of two or more subblocks may be used as a predictor. The contents of this step will be described in detail in example 1 below.
In step S904, history-based candidates are added. The contents of this step will be described in detail in example 2 below.
In step S905, an average candidate between candidate motion information of the MERGE/AMVP list is added. The contents of this step will be described in detail in example 3 below.
After step S905, when the candidate motion information of the MERGE/AMVP list does not reach the maximum number, the flowchart ends after the maximum number is filled by adding zero motion information in step S906 and the candidate motion information list of each mode is configured. The candidate motion information described in this embodiment may be used in various prediction modes other than the MERGE/AMVP mode. In addition, in fig. 9, the candidate list does not limit the order of the added candidates. For example, temporal candidates in units of subblocks may be added to the candidate list in preference to spatial candidates. Alternatively, the average candidate may be added to the candidate list in preference to the history-based candidate. In the present specification, the candidate motion information list, the candidate motion information set, the motion information candidate group, and the candidate list may be understood to have the same meaning.
In the present embodiment, the method of deriving temporal candidates and sub-block-unit temporal candidates in steps S902 and S903 of fig. 9 will be described in detail. A temporal candidate means a temporal candidate in units of blocks and is distinguishable from a temporal candidate in units of sub-blocks. In this case, a sub-block is obtained by dividing the block currently being encoded or decoded (hereinafter, the current block) into blocks of arbitrary NxM (N, M ≧ 0) size, and a sub-block denotes the basic block unit used to derive the motion information of the current block. The sub-blocks may have a size preset in the encoder and/or the decoder. For example, the sub-blocks may have a square shape with a fixed size, such as 4 x 4 or 8 x 8. However, it is not limited thereto; the shape of the sub-block may be non-square, and at least one of the width and the height of the sub-block may be greater than 8. There may be a limitation that the temporal candidate in units of sub-blocks is added to the candidate list only when the current block is greater than NxM. For example, when N and M are each 8, temporal candidates in units of sub-blocks may be added to the candidate list only when the width and height of the current block are each greater than 8.
Fig. 10 is a basic conceptual diagram for describing this embodiment. One current (sub) block of the current picture is shown in fig. 10. The target (sub) block corresponding to the (sub) block is searched in the target picture. In this case, information on the target picture and the target (sub) block may be transmitted in units of a higher header or a current block, respectively, and the target picture and the target (sub) block may be specified under the same condition in the image encoding apparatus and the image decoding apparatus. After determining the target picture and the target (sub) block of the current (sub) block, the motion information of the current (sub) block is derived by using the motion information of the target (sub) block.
Specifically, each sub-block of the current block has a correspondence with each sub-block of the target block. The temporal candidate in units of sub-blocks may have motion information of each sub-block in the current block, and the motion information of each sub-block may be derived by using the motion information of the sub-block having a correspondence relationship in the target block. However, there may be a case where the motion information of the sub-block having the correspondence is not available. In this case, the motion information of the corresponding sub-block may be set as default motion information. In this case, the default motion information may represent motion information of surrounding sub-blocks adjacent to the corresponding sub-block in a horizontal direction or a vertical direction. Alternatively, the default motion information may represent motion information of a sub-block including a center sample of the target block. However, it is not limited thereto, and the default motion information may represent motion information of sub-blocks including any one of n angular samples of the target block. n may be 1, 2, 3 or 4. Alternatively, among the sub-blocks including the center sample and/or the sub-blocks including the n corner samples, sub-blocks having available motion information may be searched according to a predetermined priority, and the motion information of the first searched sub-block may be set as default motion information.
Alternatively, it may be determined whether the default motion information is available. As a result of the determination, when default motion information is not available, a process of deriving motion information of a temporal candidate in units of sub-blocks and adding it to a candidate list may be omitted. In other words, only when default motion information is available, temporal candidates in units of sub-blocks may be derived and may be added to the candidate list.
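A compact sketch of this per-sub-block derivation with the default-motion fallback might look as follows; the block/sub-block accessors are hypothetical, and the fallback order (co-located motion first, then default motion, otherwise omit the candidate) follows the description above.

def subblock_temporal_candidate(cur_block, target_block, default_mi):
    # If even the default motion information is unavailable, the
    # sub-block temporal candidate is not derived at all.
    if default_mi is None:
        return None
    candidate = {}
    for sub in cur_block.subblocks():                 # e.g. 4x4 or 8x8 units
        mi = target_block.colocated(sub).motion_info  # corresponding sub-block
        candidate[sub.position] = mi if mi is not None else default_mi
    return candidate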
On the other hand, the motion vector of the motion information may be a scaled motion vector. The temporal distance between the target picture of the target (sub)block and its reference picture is determined as TD, the temporal distance between the current picture of the current (sub)block and its reference picture is determined as TB, and the motion vector (MV) of the target (sub)block is scaled by using equation 2. The scaled motion vector (MV_scale) may be used to indicate a prediction (sub)block of the current (sub)block in the reference picture, or may be used as the motion vector of a temporal candidate of the current (sub)block or of a temporal candidate in units of sub-blocks of the current (sub)block. However, when deriving the scaled motion vector, the variable MV used in equation 2 represents the motion vector of the target (sub)block, and MV_scale represents the scaled motion vector of the current (sub)block.
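Under the usual form of such scaling, equation 2 amounts to multiplying the target block's motion vector by the distance ratio TB/TD; that reading is an assumption here, since the equation itself appears earlier in the document. A floating-point sketch is shown below; actual codecs clip and round this in fixed-point arithmetic.

def scale_mv(mv, tb, td):
    # mv: motion vector of the target (sub)block, as (x, y).
    # tb: temporal distance between the current picture and its reference.
    # td: temporal distance between the target picture and its reference.
    return (mv[0] * tb / td, mv[1] * tb / td)

mv_scale = scale_mv((8, -4), tb=2, td=4)   # -> (4.0, -2.0)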
In addition, the reference picture information of the current (sub) block may be specified by the image encoding apparatus and the image decoding apparatus under the same condition, and the reference picture information of the current (sub) block may also be transmitted in units of the current (sub) block.
A method of determining a target (sub) block of a current (sub) block under the same condition in an image encoding apparatus and an image decoding apparatus will be described in more detail below. The target (sub-) block of the current (sub-) block may be indicated by using one of the candidate motion information in the MERGE/AMVP candidate list. In more detail, after determining the prediction mode of the candidate motion information in the candidate list, the target (sub) block of the current (sub) block may be determined by prioritizing the prediction mode. For example, the target (sub) block may be indicated by selecting one of the motion information in the candidate list according to the priority of AMVP mode, MERGE mode, SKIP mode.
In addition, simply, the target (sub) block may be indicated by unconditionally selecting the first candidate motion information in the candidate list. For candidate motion information encoded by the same prediction mode, a variety of priority conditions may be used, such as selection according to priority in the candidate list. However, when the reference picture and the target picture of the candidate motion information are different, the corresponding candidate motion information may be excluded. Alternatively, it may be determined as a target (sub) block in the target picture corresponding to the same position as the current (sub) block.
Specifically, the target (sub) block may be determined as a block at a position offset from the position of the current (sub) block by a predetermined time motion vector (temporal MV). In this case, the temporal motion vector may be set as a motion vector of a surrounding block spatially adjacent to the current block. The surrounding block may be any one of a left block, a top block, a lower left block, an upper right block, and an upper left block of the current block. Alternatively, the temporal motion vector may be derived by using only surrounding blocks at a predetermined fixed position in the encoding/decoding apparatus. For example, the surrounding block at the fixed position may be a left side block (a) of the current block1). Alternatively, the surrounding block at the fixed position may be the top block (a) of the current block2). Alternatively, the surrounding block at the fixed position may be a lower left block (a) of the current block3). Alternatively, the surrounding block at the fixed position may be an upper right block (a) of the current block4). Alternatively, the surrounding block at the fixed position may be an upper left block (a) of the current block5)。
This setting may be performed only when the reference picture and the target picture of the surrounding block are the same (e.g., when the POC difference between the reference picture and the target picture is 0). When the reference picture of the surrounding block is not identical to the target picture, the temporal motion vector may be set to (0, 0).
The set temporal motion vector may be rounded based on at least one of a predetermined offset (offset) or a shift value. In this case, the offset may be derived based on the shift value, and the shift value may include at least one of a shift value in a right direction (rightShift) or a shift value in a left direction (leftShift). The shift value may be an integer preset in the encoding/decoding apparatus. For example, rightShift may be set to 4 and leftShift may be set to 0, respectively. For example, rounding of the temporal motion vector may be performed as in the following equation 3.
[formula 3]
offset = (rightShift == 0) ? 0 : (1 << (rightShift - 1))
mvXR[0] = ((mvX[0] + offset - (mvX[0] >= 0)) >> rightShift) << leftShift
mvXR[1] = ((mvX[1] + offset - (mvX[1] >= 0)) >> rightShift) << leftShift
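For reference, the same rounding can be written out in a short sketch; note that (mvX[i] >= 0) in formula 3 evaluates to 1 for a non-negative component and 0 otherwise, and that the right shift is an arithmetic (floor) shift.

def round_mv_component(v, right_shift=4, left_shift=0):
    # Rounding per formula 3; right_shift=4 and left_shift=0 follow the
    # example values given in the text.
    offset = 0 if right_shift == 0 else 1 << (right_shift - 1)
    return ((v + offset - (1 if v >= 0 else 0)) >> right_shift) << left_shift

rounded = (round_mv_component(-13), round_mv_component(19))   # -> (-16, 16)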
For a more detailed description, two conditions are assumed. First, motion information is stored and held in units of 4 × 4 sub-blocks in an encoded picture (hereinafter, the boundary in units of 4 × 4 sub-blocks for storing motion information matches the boundary of a target (sub) block in a target picture). Second, the size of the sub-block in the current block is set to 4 × 4. The size of the above block may be determined differently. In this case, when determining a position of a target (sub) block in the target picture corresponding to the same position in the current (sub) block or a position of a target (sub) block indicated in the target picture by using motion information in a MERGE/AMVP candidate list of the current (sub) block, the basic coordinates of each sub-block may not correspond to the basic coordinates of the current (sub) block in units of 4 × 4 sub-blocks storing motion information in the target picture. For example, a mismatch may occur, such as the coordinate of the top left pixel of the current (sub) block being (12,12) and the top left coordinate of the target (sub) block being (8, 8). This is an inevitable phenomenon that occurs because the block partition structure of the target picture is different from that of the current picture.
Fig. 11 is a schematic diagram for describing a method of determining a target block when temporal candidate motion information is derived in units of the current block rather than in units of sub-blocks. After a motion vector (MV) indicating the target position from the basic coordinates of the current block is determined (a zero motion vector for the same position, or a motion vector derived from the candidate list), the point indicated by the corresponding motion vector is found in the target picture. In deriving the scaled motion vector of the current block, the motion information of the 4 × 4 target sub-block including the corresponding target point in the target picture may be used. Alternatively, when the target points indicated from each of a plurality of basic coordinates of the current block fall in the same 4 × 4 target sub-block, the motion information of the corresponding target sub-block is used in deriving the scaled motion vector of the current block. However, when they indicate a plurality of 4 × 4 target sub-blocks, the average motion information of the target sub-blocks may be used to derive the scaled motion vector of the current block. As for the target positions, two target positions in the central area of the current block may be used as in the example of fig. 11, but more than two target positions may be used, and any other pixel position in the current block may be used. Naturally, the number of pieces of motion information used for calculating the average motion information may be two or more. Alternatively, the final prediction block may be generated by deriving a plurality of scaled motion vectors using each of the plurality of target sub-blocks, generating a plurality of prediction blocks, and performing a weighted summation of the corresponding prediction blocks.
Fig. 12 and 13 are diagrams for describing a method of determining a target block when temporal candidate motion information is derived in units of sub-blocks in a current block. Fig. 12 is a schematic diagram of a case where there is one basic position in units of sub-blocks, and fig. 13 is a schematic diagram of a case where a plurality of basic positions are used in units of sub-blocks.
Fig. 12 shows the case where the basic position in units of sub-blocks is the upper-left pixel position of the sub-block. A 4 × 4 target sub-block of the target picture is found based on the basic coordinates of a lower-right sub-block of the current block; after a target block having the same size as the current block is determined based on that target sub-block, the scaled motion vector of each current sub-block may be derived by using the motion information of the target sub-block that is co-located, within the target block, with the corresponding sub-block of the current block. Alternatively, after the target position in the target picture is calculated from a motion vector indicating the target position at the basic coordinates of the sub-block, the motion information of the 4 × 4 target sub-block including that target position may be used in deriving the scaled motion vector of the current sub-block.
Fig. 13 is a schematic diagram for describing a method of deriving a scaled motion vector by using a plurality of target sub-blocks for each current sub-block. When the scaled motion vector of sub-block D is derived as in the example of fig. 13, target positions in the target picture are calculated based on a plurality of basic coordinates in sub-block D. Then, when the target positions indicate the same target sub-block, the scaled motion vector of the current sub-block may be derived by using the motion information of the corresponding target sub-block; but when the target positions indicate different target sub-blocks as shown in fig. 13, the scaled motion vector of the current sub-block may be derived by calculating the average motion information of the target sub-blocks. In addition, after different scaled motion vectors are derived by using the motion information of each target sub-block, the final prediction block may be generated by generating prediction sub-blocks respectively and performing a weighted summation of the prediction sub-blocks. The other sub-blocks in the current block (sub-blocks A, B, C) may also generate prediction sub-blocks in the manner described above.
In the present embodiment, step S904 of fig. 9 will be described in detail. For the history-based candidates (hereinafter referred to as "reconstruction information-based candidates"), there is a motion candidate storage buffer based on reconstruction information (hereinafter referred to as the "H-buffer") for storing motion information encoded/decoded before the current block in units of a sequence, a picture, or a slice. The buffer manages the encoded motion information while being updated by a FIFO (first in, first out) method. The H-buffer may be initialized in units of CTUs, CTU rows, slices, or pictures, and the corresponding motion information is updated in the H-buffer when the current block is predicted and encoded with motion information. In step S904 of fig. 9, the motion information stored in the H-buffer may be used as a MERGE/AMVP candidate. When candidate motion information from the H-buffer is added to the candidate list, it may be added starting from the most recently updated motion information in the H-buffer, or vice versa. Alternatively, the order in which motion information from the H-buffer is added to the candidate list may be determined according to the inter prediction mode.
In particular, in an example, motion information of the H-buffer may be added to the candidate list after a redundancy check between the H-buffer and the candidate list. In the case of the MERGE mode, the redundancy check may be performed between some merge candidates of the candidate list and some motion information of the H-buffer. The former may include the left and top blocks among the spatial merge candidates. However, it is not limited thereto; it may be limited to any one block of the spatial merge candidates, or may further include at least one of the lower-left block, the upper-right block, the upper-left block, and the temporal merge candidate. On the other hand, the latter may be the m pieces of motion information most recently added to the H-buffer. In this case, m may be 1, 2, 3, or more, and may be a fixed value agreed in advance in the encoding/decoding apparatus. Assume that 5 pieces of motion information are stored in the H-buffer and indexes 1 to 5 are assigned to them, a larger index representing motion information stored later. In this case, a redundancy check between the motion information having indexes 5, 4, and 3 and the merge candidates of the candidate list may be performed. Alternatively, a redundancy check between the motion information with indexes 5 and 4 and the merge candidates of the candidate list may be performed. Alternatively, excluding the motion information of index 5 added last, a redundancy check between the motion information with indexes 4 and 3 and the merge candidates of the candidate list may be performed. As a result of the redundancy check, when even one identical piece of motion information exists, the motion information of the H-buffer may not be added to the candidate list. On the other hand, when no identical motion information exists, the motion information of the H-buffer may be added to the last position of the candidate list. In this case, the motion information may be added to the candidate list starting from the motion information most recently stored in the H-buffer (i.e., in order from the largest index to the smallest index). However, there may be a limitation that the motion information stored last in the H-buffer (the motion information having the largest index) is not added to the candidate list.
On the other hand, in case of the AMVP mode, motion information (particularly, motion vectors) first stored in the H-buffer may be added to the candidate list in the order of the motion information. In other words, motion information having a small index among motion information stored in the H buffer may be added to the candidate list before motion information having a large index.
On the other hand, the motion vector stored in the H-buffer may be added to the candidate list as-is, or the motion vector to which the above rounding processing is applied may be added to the candidate list. Rounding is used to adjust the accuracy of the candidate motion information so that it corresponds to the accuracy of the motion vector of the current block. With reference to formula 3, mvXR represents a motion vector to which the rounding processing is applied, and mvX represents a motion vector stored in the H-buffer. In addition, at least one of rightShift and leftShift (the shift values) may be determined by considering the accuracy (or resolution) of the motion vector. For example, when the accuracy of the motion vector is 1/4 sample, the shift value may be determined as 2; when it is 1/2 sample, the shift value may be determined as 3. The shift value may be determined as 4 when the accuracy of the motion vector is 1 sample, and as 6 when the accuracy is 4 samples. rightShift and leftShift may be set to the same value.
When motion information stored in the H-buffer is added to the Merge/AMVP candidate list, the number of pieces of motion information that can be added may be limited. For example, the Merge/AMVP candidate list may be filled up to its maximum number of candidates by using the motion information in the H-buffer, or it may be filled only up to (maximum number of candidates - 1).
The number of candidate motion information stored in the H-buffer may be determined under the same condition in the image encoding apparatus and the image decoding apparatus, and may be transmitted to the image decoding apparatus through a higher header.
Specifically, in the case of the merge candidate list, only up to (maximum number of candidates - n) entries may be filled by using the motion information of the H-buffer. In this case, n may be an integer such as 1, 2, or more. The maximum number of candidates may be a fixed number predefined in the encoding/decoding apparatus (e.g., 5, 6, 7, 8) or may be variably determined based on signaled information indicating the maximum number of candidates. On the other hand, in the case of the AMVP candidate list, the maximum number of candidates may be filled by using the motion information of the H-buffer. The maximum number of candidates of the AMVP candidate list may be 2, 3, 4, or more. Unlike the merge candidate list, the maximum number of candidates of the AMVP candidate list may not be variable.
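Pulling the above rules together, a sketch of adding H-buffer entries to a merge candidate list could look like this. The representation of motion information, the choice of m = 2 recent entries, and checking against the whole list (the text limits the check to some spatial candidates) are simplifying assumptions for illustration.

def add_h_buffer_candidates(cand_list, h_buf, max_cands, n_reserved=1, m=2):
    # Check the m most recent H-buffer entries against the existing
    # candidates; stop once (max_cands - n_reserved) entries are filled.
    recent = list(h_buf)[-m:][::-1]          # newest first (largest index first)
    for mi in recent:
        if len(cand_list) >= max_cands - n_reserved:
            break
        if mi not in cand_list:              # simplified redundancy check
            cand_list.append(mi)             # appended at the last position
    return cand_list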
A first method of updating the H-buffer is shown in fig. 14. The H-buffer updates motion information in the coding order of the blocks in the first CTU row. For the second CTU row, motion information is likewise updated in the coding order of the blocks, but the H-buffer may additionally be updated by considering the motion information stored in the reconstructed blocks of the top CTU adjacent to the current CTU row. In fig. 14, mi in a CTU is an abbreviation of motion information, and denotes the reconstructed motion information stored in the bottom blocks of the CTU rows other than the last CTU row. Up to P (P is an integer equal to or greater than 1) pieces of motion information may be updated in the H-buffer, and the update method may differ according to the unit in which the H-buffer is initialized. In the present embodiment, the method of updating the H-buffer is described for the case where the H-buffer is initialized in units of CTUs and the case where it is initialized in units of CTU rows. First, when the H-buffer is initialized in units of CTUs, each CTU in the second CTU row may initialize the H-buffer by using mi before its encoding starts. In this case, initialization means that motion information is re-updated into a completely empty H-buffer. For example, before encoding CTU8, the bottom four mi of CTU3 may be used to fill the H-buffer. The order in which the mi are updated may also be determined differently: the update may start from the mi at the left position or, conversely, from the mi at the right position. When the H-buffer is initialized in units of CTU rows, only the first CTU in each CTU row may be updated by using the mi of the top CTU. In addition, the H-buffer may be completely emptied at each initialization. On the other hand, when the most recently encoded/decoded motion information is identical to motion information previously stored in the H-buffer, the most recent motion information may not be added to the H-buffer. Alternatively, the stored motion information identical to the latest motion information may be removed from the H-buffer, and the latest motion information may be stored in the H-buffer. In this case, the latest motion information is stored at the last position of the H-buffer.
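A minimal sketch of this FIFO update, implementing the second duplicate-handling variant (remove the duplicate, re-append the latest entry as the newest), is shown below; the buffer size of 5 is an assumption.

from collections import deque

def update_h_buffer(h_buf, new_mi, max_size=5):
    # FIFO update of the H-buffer with duplicate handling: a duplicate is
    # removed and re-appended so the latest entry sits at the last position.
    if new_mi in h_buf:
        h_buf.remove(new_mi)
    elif len(h_buf) == max_size:
        h_buf.popleft()                      # evict the oldest entry
    h_buf.append(new_mi)

h_buf = deque()
for mi in [(1, 0), (2, 2), (1, 0), (0, -1)]:
    update_h_buffer(h_buf, mi)               # ends as [(2,2), (1,0), (0,-1)]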
A second method of updating the H-buffer is shown in fig. 15. The second method has an additional motion candidate memory buffer that does not include an H-buffer. The buffer is a buffer (hereinafter, referred to as a "V buffer") that stores reconstructed motion information (hereinafter, referred to as "Vmi") of the top CTU. The V buffer may be used when the above H buffer is initialized in units of CTU rows or stripes and the V buffer may be initialized in units of CTU rows. Vmi in the V buffer should be updated for the bottom of the other CTU rows except the last CTU row in the current picture. In this case, up to Q (Q is an integer equal to or greater than 1) pieces of motion information may be updated in the V buffer, and the motion information corresponding to Vmi may be determined by various methods. For example, Vmi may be reconstructed motion information for a block that includes the center coordinates of the CTU, or may be the most recent motion information included in the H-buffer when encoding a block that includes the center coordinates in the top CTU. The number of Vmi to be updated in one CTU may be one or more, and the updated Vmi is used when updating the H-buffer of the bottom CTU. Vmi stored in the V buffer of the top row of CTUs is updated in the current H buffer according to each of the remaining rows of CTUs except the first row of CTUs. When multiple Vmi are stored in the V buffer of the top CTU row, the first updated motion information may be fetched and updated first in the H buffer, or vice versa. Vmi stored in the top CTU in the V buffer may be updated at various times in the H buffer, such as before encoding each CTU or before encoding the block bordering the top CTU in each CTU. Additionally, Vmi in the top left CTU, the top right CTU, and the top CTU may also be updated in the H buffer.
The candidate motion information of the V-buffer described above can be independently added to the process of deriving the MERGE/AMVP candidate list of fig. 9. When adding V buffer candidate motion information, the corresponding priority in the MERGE/AMVP candidate list may be determined differently. For example, when there is valid motion information in the V buffer after step S904, corresponding candidate motion information may be added between step S904 and step S905.
In the present embodiment, step S905 of fig. 9 will be described in detail. When the number of pieces of motion information filled in the Merge/AMVP candidate list through steps S901 to S904 is equal to or less than 1, this step is omitted. When it is equal to or greater than 2, the candidate list may be filled by generating average candidate motion information between candidates. A motion vector of the average candidate motion information is derived for each prediction direction (list 0 (List0) or list 1 (List1)) and represents a motion vector generated by averaging the motion vectors of the same direction stored in the candidate list. When the reference picture index information of the motion information used in the averaging differs, the reference picture information of the motion information having the higher priority may be determined as the reference picture index information of the average candidate motion information. Alternatively, the reference picture information of the motion information having the lower priority may be determined as the reference picture index information of the average candidate motion information.
When average candidate motion information is added to the Merge/AMVP candidate list, the generated average candidate motion information may itself be used in generating further average candidate motion information. This example is described with reference to fig. 16. In fig. 16, the left table is the Merge/AMVP candidate list before step S905 of fig. 9, and the right table is the Merge/AMVP candidate list after step S905 of fig. 9. In the left table, two entries, the candidate motion information No. 0 and No. 1, are filled in the candidate list. The average candidate motion information No. 2 can be generated by using these 2 pieces of motion information. In the example of fig. 16, the motion information in the list 0 direction of candidate No. 2 is filled by averaging the motion vector (MV) (1,1) in the list 0 direction of candidate No. 0 (motion information whose reference picture index (Refidx) is 1) and the motion vector (3,-1) in the list 0 direction of candidate No. 1 (motion information whose reference picture index is 0). For the list 1 direction, candidate No. 0 has no motion information. In this case, the motion information in the list 1 direction of candidate No. 2 is filled by bringing in the motion information of candidate No. 1 (since it exists as a candidate in the list 1 direction). When neither candidate has motion information in a given direction when deriving the average candidate motion information, that direction is not derived. Additional average candidate motion information may be generated by using the average candidate motion information No. 2 derived in this way: the average candidate motion information No. 3 is the average of candidates No. 0 and No. 2, and the average candidate motion information No. 4 is the average of candidates No. 1 and No. 2. The method of generating the average candidate motion information is the same as described above.
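A sketch of this pairwise averaging, consistent with the fig. 16 example, is given below. Candidates are modeled as dicts mapping 'L0'/'L1' to (motion vector, reference index) pairs; keeping the higher-priority candidate's reference index is one of the two options described above, and candidate No. 1's list 1 values are invented for illustration.

def average_candidate(c0, c1):
    # c0 has higher priority than c1. Floating-point averaging is used for
    # clarity; real codecs average integer motion vectors with shifts.
    avg = {}
    for d in ('L0', 'L1'):
        m0, m1 = c0.get(d), c1.get(d)
        if m0 and m1:
            (x0, y0), (x1, y1) = m0[0], m1[0]
            avg[d] = (((x0 + x1) / 2, (y0 + y1) / 2), m0[1])
        elif m0 or m1:
            avg[d] = m0 or m1        # bring in the direction that exists
        # if neither direction exists, it is not derived
    return avg

c0 = {'L0': ((1, 1), 1)}                           # candidate No. 0 in fig. 16
c1 = {'L0': ((3, -1), 0), 'L1': ((0, 2), 0)}       # candidate No. 1 (L1 assumed)
c2 = average_candidate(c0, c1)                     # L0 -> ((2.0, 0.0), 1)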
Before step S905 of fig. 9, duplicate candidate motion information may exist in the Merge/AMVP candidate list. The average candidate motion information may be used to replace such duplicate candidate motion information. Fig. 17 shows such an example; the left and right tables are as described for fig. 16. In the left table, the candidate motion information No. 0 and No. 2 are identical. In this case, the candidate motion information No. 2, which has the lower priority, may be replaced with average candidate motion information. In the example of fig. 17, the existing candidate No. 2 is replaced with the average of candidates No. 0 and 1, entry No. 3 is filled with the average of candidates No. 0 and 2, and entry No. 4 is filled with the average of candidates No. 1 and 2.
The number of entries of the Merge/AMVP candidate list that may be filled with average candidate motion information may also be limited. For example, the list may be filled up to its maximum number of candidates by using average candidate motion information, or only up to (maximum number of candidates - 1). In addition, the number of pieces of candidate motion information used when calculating the average candidate motion information may be 3 or more, and median information (the median rather than the average of the 3 or more pieces of candidate motion information) may be determined as the average candidate motion information.
In the following embodiments, a method of efficiently transmitting the motion vector of motion information will be described. In the above-described AMVP mode, motion vector difference (MVD in figs. 5 and 8) information, obtained by subtracting the motion vector predictor (MVP in figs. 5 and 8) information from the motion vector of the current block determined in the image encoding apparatus, is transmitted to the image decoding apparatus. In the above-described MERGE mode, the motion information of the selected merge candidate is set as the motion information of the current block without transmitting MVD information. However, if additional MVD information is transmitted in the MERGE mode, prediction efficiency may be improved by increasing the accuracy of the motion information. MVD information is generally regarded as random information in which no particular pattern can be found. However, under certain assumptions, such MVD information can also become predictable.
The above-described specific case will be described by using fig. 18. In fig. 18, the current block to be encoded exists in the current picture, and reconstructed blocks A to D exist around it. In this case, when it is assumed that a specific object passes through the current block while moving with uniform acceleration from the top toward the bottom, it is predictable that the motion vector will increase in magnitude through block C, block B, and the current block. The same principle can also be considered to apply to uniform deceleration. In other words, if the motion vector increases by a certain amount while a specific object passes through the current block under uniform acceleration, the motion vector indicating the point obtained by adding the MVD of a surrounding block to the motion vector of that surrounding block is likely to be determined as the optimal motion information of the current block. On this premise, fig. 18 is examined in detail from the viewpoint of the AMVP mode. If the MVD between the optimal motion vector of block B and the motion vector of block C (i.e., the MVP of block B) is similar to the MVD between the optimal motion vector of the current block and its MVP (the motion vector of block B), this can be used to encode the motion information more efficiently. Here, the motion vector of block C is used as the MVP information of block B, and the current block uses the motion vector of block B as its MVP. From the viewpoint of the MERGE mode, more efficient encoding can be performed by additionally transmitting MVD information.
In the following embodiments, a method in which such MVD information is effectively predicted and encoded/decoded will be described in detail.
Fig. 19 is a candidate list example table of a merge candidate search unit and an AMVP candidate search unit in prediction units in an image encoding apparatus and an image decoding apparatus.
The left table of fig. 19 is a Merge/AMVP candidate list that may be generated after step S307 of fig. 3 or step S403 of fig. 4. The right table is the candidate list after new motion candidates are generated by using the motion information, among the motion information already existing in the Merge/AMVP candidate list, whose MVD is not (0,0). New candidate motion information is generated by adding the MVD to the motion vector of the motion information whose MVD is not (0,0), and is filled into the candidate list. For bi-directionally predicted candidate motion information, new candidate motion information may be generated through the above-described method when even one of its MVDs is not (0,0). Since this method does not require new encoding and decoding information, motion information can be encoded by using the above-described flowcharts of figs. 5 and 8. Alternatively, without generating such new candidate motion information, a motion vector obtained by adding the MVD to a motion vector in the Merge/AMVP candidate list may be determined as the final candidate motion vector; in this case, information indicating that the candidate motion vector is obtained by adding the MVD to the motion vector of the current candidate motion information may be additionally transmitted.
Fig. 20 is an example table for describing a method of using the MVD of a reconstructed area as the MVD of the current block by additionally generating a Merge/AMVP candidate list for MVDs alongside the Merge/AMVP candidate list for motion vectors. The left table of fig. 20 is an example of a completed motion vector Merge/AMVP candidate list, and the right table is an MVD Merge/AMVP candidate list made by using the MVD information in the motion vector Merge/AMVP candidate list. MVDs No. 0 and 1 of the right table are determined by using the MVDs of candidate motion information No. 0 and 1 of the left table, and MVDs No. 2 and 3 of the right table are determined by using the MVDs of candidate motion information No. 4 and 5. In addition, the image encoding apparatus and the image decoding apparatus may add other candidate MVD information to the MVD Merge/AMVP candidate list by using reconstructed motion information. By using the MVD Merge/AMVP candidate list information derived in this way, MVD information may be additionally transmitted in the Merge mode, or MVD information may be derived without transmission in the AMVP mode. The detailed encoding/decoding flow of the motion information is described with reference to figs. 21 and 22.
Fig. 21 is a flowchart for encoding motion information using MVD Merge/AMVP candidate list information. The description in steps S2101 to S2107 is the same as that in steps S501 to S507 of fig. 5. In step S2108, operation information indicating whether to perform MVD merging is encoded. MVD merging may mean deriving a final motion vector by adding a predetermined MVD to a motion vector reconstructed through a MERGE mode or an AMVP mode. In step S2109, it is determined whether the corresponding operation information is true or false, and if the corresponding operation information is false, the flowchart ends, and if the corresponding operation information is true, candidate index information indicating which MVD information in the MVD merge candidate list is added to the current motion vector is encoded, and the flowchart ends. Steps S2111 to S2113 are the same as those described in steps S508 to S510 of fig. 5. In step S2114, operation information for determining whether to transmit an MVD in the past direction is encoded. In step S2115, it is determined whether the corresponding operation information is true or false, and if true, the process moves to step S2116, and if false, the process moves to step S2117. The description in steps S2116, S2117 is the same as that in steps S511, S512 of fig. 5. In step S2118, candidate index information of MVDs indicating motion information in the past direction in the MVD AMVP candidate list is encoded. The description in steps S2119, S2120 is the same as that in steps S513, S514 of fig. 5. In step S2121, operation information for determining whether to transmit an MVD in a future direction is encoded. In step S2122, it is determined whether the corresponding operation information is true or false, and if true, the process moves to step S2123, and if false, the process moves to step S2124. In step S2125, candidate index information of an MVD indicating motion information in a future direction in the MVD AMVP candidate list is encoded, and the flowchart ends. The description in step S2126 is the same as that in step S517.
Fig. 22 is a flowchart for decoding motion information by using MVD Merge/AMVP candidate list information. The description of steps S2201 to S2207 is the same as that of steps S801 to S807 of fig. 8. In step S2208, operation information indicating whether to perform MVD merging is decoded. MVD merging may mean deriving a final motion vector by adding a predetermined MVD to a motion vector reconstructed through the Merge mode or the AMVP mode. In step S2209, it is determined whether the corresponding operation information is true or false; if false, the flowchart ends, and if true, candidate index information indicating which MVD information in the MVD Merge candidate list is added to the current motion vector is decoded, and the flowchart ends. Steps S2211 to S2213 are the same as steps S808 to S810 of fig. 8. In step S2214, operation information determining whether to transmit an MVD in the past direction is decoded. In step S2215, it is determined whether the corresponding operation information is true or false; if true, the process moves to step S2216, and if false, to step S2217. The description of steps S2216 and S2217 is the same as that of steps S811 and S812 of fig. 8. In step S2218, candidate index information indicating the MVD of the motion information in the past direction in the MVD AMVP candidate list is decoded. The description of steps S2219 and S2220 is the same as that of steps S813 and S814 of fig. 8. In step S2221, operation information determining whether to transmit an MVD in the future direction is decoded. In step S2222, it is determined whether the corresponding operation information is true or false; if true, the process moves to step S2223, and if false, to step S2224. In step S2225, candidate index information indicating the MVD of the motion information in the future direction in the MVD AMVP candidate list is decoded, and the flowchart ends. The description of step S2226 is the same as that of step S817.
In the Merge mode, a motion vector may be determined by additionally transmitting MVD information together with the motion information indicated by the merge candidate index information, and adding the MVD to the motion vector of the motion information indicated by the merge candidate index. In this case, the candidate list of the merge mode may be configured with k merge candidates, where k may be a natural number such as 4, 5, 6, or more. An index is assigned to each merge candidate, and the index has a value from 0 to (k-1). However, when MVD merging is applied, the merge candidate index information may have only the value 0 or 1. In other words, when MVD merging is applied, the motion information of the current block may be derived from either the first or the second merge candidate belonging to the candidate list, according to the merge candidate index information. The additional MVD information may be transmitted in various forms. Instead of transmitting the MVD in a vector form such as (x, y), the MVD may be expressed by direction information (such as the top, bottom, left, right, lower-right diagonal, lower-left diagonal, upper-right diagonal, and upper-left diagonal directions) and distance information indicating how far the MVD is spaced in each direction from the motion vector of the motion information indicated by the current merge candidate index information.
In particular, the MVD of the current block may be derived based on an offset vector (offsetMV). The MVD may include at least one of an MVD in an L0 direction (MVD0) or an MVD in an L1 direction (MVD1), and each of the MVD0 and the MVD1 may be derived by using an offset vector.
The offset vector may be determined based on a length (mvdDistance) and a direction (mvdDirection) of the offset vector. For example, an offset vector (offsetMV) may be determined as in equation 4 below.
[ formula 4]
offsetMV[x0][y0][0]=(mvdDistance[x0][y0]<<2)*mvdDirection[x0][y0][0]
offsetMV[x0][y0][1]=(mvdDistance[x0][y0]<<2)*mvdDirection[x0][y0][1]
In this case, mvdDistance may be determined by considering at least one of a distance index (distance_idx) and a predetermined flag (pic_fpel_mmvd_enabled_flag). The distance index (distance_idx) may represent an index coded to specify the length or distance of the MVD. pic_fpel_mmvd_enabled_flag may indicate whether the motion vector uses integer pixel accuracy in the merge mode for the current block. For example, when pic_fpel_mmvd_enabled_flag is a first value, the merge mode for the current block uses integer pixel accuracy; in other words, the motion vector resolution of the current block is integer-sample (integer-pel). On the other hand, when pic_fpel_mmvd_enabled_flag is a second value, the merge mode for the current block may use fractional pixel accuracy. In other words, when pic_fpel_mmvd_enabled_flag is a second value, the merge mode for the current block may use both integer pixel accuracy and fractional pixel accuracy. Alternatively, when pic_fpel_mmvd_enabled_flag is a second value, the merge mode of the current block may be limited to using only fractional pixel accuracy. Examples of fractional pixel accuracy include 1/2 sample, 1/4 sample, 1/8 sample, 1/16 sample, and so forth. At least one of the distance index (distance_idx) or the flag (pic_fpel_mmvd_enabled_flag) may be encoded and transmitted by the encoding apparatus.
For example, mvdDistance can be determined as in table 1 below.
[ Table 1]
In addition, mvdDirection may represent the direction of the offset vector and may be determined based on a direction index (direction_idx). In this case, the direction may include at least one of the left, right, top, bottom, upper-left, lower-left, upper-right, and lower-right directions. For example, mvdDirection may be determined as in table 2 below. The direction index (direction_idx) may be encoded and transmitted by the encoding apparatus.
[ Table 2]
In table 2, mvdDirection[x0][y0][0] may represent the sign of the x-component of the MVD, and mvdDirection[x0][y0][1] may represent the sign of the y-component of the MVD. Specifically, when direction_idx is 0, the direction of the MVD may be determined as the right direction; when direction_idx is 1, as the left direction; when direction_idx is 2, as the bottom direction; and when direction_idx is 3, as the top direction.
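Since tables 1 and 2 themselves are not reproduced in the text, the following sketch fills them with the values used by the MMVD design in VVC as an assumption; only the four direction entries of table 2 are confirmed by the description above, and all function and variable names are illustrative.

```python
# Table 1 (assumed): mvdDistance per distance_idx, for the fractional-pel
# and integer-pel settings of pic_fpel_mmvd_enabled_flag.
MVD_DISTANCE_FRAC = [1, 2, 4, 8, 16, 32, 64, 128]      # fractional accuracy allowed
MVD_DISTANCE_FPEL = [4, 8, 16, 32, 64, 128, 256, 512]  # integer-pel only

# Table 2: (x sign, y sign) per direction_idx -> right, left, bottom, top.
MVD_DIRECTION = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def derive_offset_mv(distance_idx, direction_idx, fpel_enabled):
    dist = (MVD_DISTANCE_FPEL if fpel_enabled else MVD_DISTANCE_FRAC)[distance_idx]
    sign_x, sign_y = MVD_DIRECTION[direction_idx]
    # formula 4: offsetMV = (mvdDistance << 2) * mvdDirection
    return ((dist << 2) * sign_x, (dist << 2) * sign_y)
```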
On the other hand, the MVD may be set to be the same as the offset vector determined above. Alternatively, the offset vector may be modified by considering the POC difference (PocDiff) between a reference picture of the current block and the current picture to which the current block belongs, and the modified offset vector may be set as the MVD. In this case, the current block may be encoded/decoded by bi-directional prediction, and the reference pictures of the current block may include a first reference picture (the reference picture in the L0 direction) and a second reference picture (the reference picture in the L1 direction). For convenience of description, hereinafter, the POC difference between the first reference picture and the current picture is referred to as PocDiff0, and the POC difference between the second reference picture and the current picture is referred to as PocDiff1.
When PocDiff0 and PocDiff1 are the same, MVD0 and MVD1 of the current block may each be set equal to the offset vector.
When PocDiff0 and PocDiff1 are not the same and the absolute value of PocDiff0 is greater than or equal to the absolute value of PocDiff1, MVD0 may be set equal to the offset vector, while MVD1 may be derived based on the MVD0 set in this way. For example, when the first and second reference pictures are long-term reference pictures, MVD1 may be derived by applying a first scaling factor to MVD0. The first scaling factor may be determined based on PocDiff0 and PocDiff1. On the other hand, when at least one of the first reference picture and the second reference picture is a short-term reference picture, MVD1 may be derived by applying a second scaling factor to MVD0. The second scaling factor may be a fixed value (e.g., -1/2, -1, etc.) agreed in advance in the encoding/decoding apparatus. However, the second scaling factor may be applied only when the sign of PocDiff0 differs from the sign of PocDiff1. If the signs of PocDiff0 and PocDiff1 are the same, MVD1 may be set equal to MVD0 and no separate scaling is performed.
On the other hand, when PocDiff0 and PocDiff1 are not the same and the absolute value of PocDiff0 is smaller than the absolute value of PocDiff1, MVD1 may be set equal to the offset vector, while MVD0 may be derived based on the MVD1 set in this way. For example, when the first and second reference pictures are long-term reference pictures, MVD0 may be derived by applying a first scaling factor to MVD1. The first scaling factor may be determined based on PocDiff0 and PocDiff1. On the other hand, when at least one of the first reference picture and the second reference picture is a short-term reference picture, MVD0 may be derived by applying a second scaling factor to MVD1. The second scaling factor may be a fixed value (e.g., -1/2, -1, etc.) agreed in advance in the encoding/decoding apparatus. However, the second scaling factor may be applied only when the sign of PocDiff0 differs from the sign of PocDiff1. If the signs of PocDiff0 and PocDiff1 are the same, MVD0 may be set equal to MVD1 and no separate scaling is performed. The detailed encoding and decoding flows for the MVD are shown in figs. 23 and 24.
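The PocDiff-based derivation above can be summarized in the following hedged sketch; taking the first scaling factor as the PocDiff ratio and truncating the result are simplifying assumptions for illustration.

```python
def derive_bi_mvd(offset, poc_diff0, poc_diff1, long_term0, long_term1):
    """Derive (MVD0, MVD1) from the offset vector, per the rules above."""
    def scale(mv, factor):
        # simplified rounding; the normative rounding is not specified here
        return (int(mv[0] * factor), int(mv[1] * factor))

    if poc_diff0 == poc_diff1:
        return offset, offset                     # both MVDs equal the offset

    if abs(poc_diff0) >= abs(poc_diff1):
        mvd0 = offset
        if long_term0 and long_term1:             # first scaling factor
            mvd1 = scale(mvd0, poc_diff1 / poc_diff0)
        elif (poc_diff0 > 0) != (poc_diff1 > 0):  # second (fixed) factor, e.g. -1
            mvd1 = scale(mvd0, -1)
        else:
            mvd1 = mvd0                           # same sign: no separate scaling
        return mvd0, mvd1

    mvd1 = offset
    if long_term0 and long_term1:
        mvd0 = scale(mvd1, poc_diff0 / poc_diff1)
    elif (poc_diff0 > 0) != (poc_diff1 > 0):
        mvd0 = scale(mvd1, -1)
    else:
        mvd0 = mvd1
    return mvd0, mvd1
```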
Fig. 23 is a flowchart for encoding motion information including an additional MVD in the merge mode. The description of steps S2301 to S2307 is the same as that of steps S501 to S507 of fig. 5. In step S2308, operation information indicating whether the additional MVD information is encoded in the skip mode or the merge mode is encoded. In step S2309, it is determined whether the corresponding operation information is true or false; if true, the flowchart ends after the additional MVD information is encoded in step S2310, and if false, the flowchart ends immediately. The description of steps S2311 to S2320 is the same as that of steps S508 to S517 of fig. 5.
Fig. 24 is a flowchart for decoding motion information including an additional MVD in the merge mode. The description of steps S2401 to S2407 is the same as that of steps S801 to S807 of fig. 8. In step S2408, operation information indicating whether the additional MVD information is decoded in the skip mode or the merge mode is decoded. In step S2409, it is determined whether the corresponding operation information is true or false; if true, the flowchart ends after the additional MVD information is decoded in step S2410, and if false, the flowchart ends immediately. The description of steps S2411 to S2420 is the same as that of steps S808 to S817 of fig. 8.
In the present embodiment, a binarization method for the reference picture index information and the prediction direction information, among the components of motion information, will be described in detail.
For the prediction direction information and the reference picture index information, the binarization method may be changed according to the configuration state of the reference picture set (hereinafter referred to as "RPS"). The RPS information may be transmitted in the higher header. The components of the RPS information may include the number of reference pictures for each prediction direction, the reference picture corresponding to each reference picture index, POC difference information between the corresponding reference picture and the current picture, and the like. Fig. 25 is an example of RPS information and shows how RPSs may be configured; an RPS is configured with reference pictures for the list 0 direction and the list 1 direction, respectively. The binarization method of the prediction direction information and the reference picture index information for each example of fig. 25 will be described by using the examples in fig. 26.
There are 3 steps to check the RPS configuration status. The first step (hereinafter referred to as the "first RPS check") determines whether the reference pictures in the list 0 direction and the list 1 direction are stored in the RPS in the same index order; here, the number of reference pictures in the list 0 direction should be greater than or equal to the number of reference pictures in the list 1 direction. The second step (hereinafter referred to as the "second RPS check") determines whether all reference pictures in the list 1 direction are included in the list 0 direction of the RPS, regardless of reference picture index order. The third step (hereinafter referred to as the "third RPS check") determines whether the number of reference pictures in the list 0 direction is the same as the number of reference pictures in the list 1 direction. The binarization method of the prediction direction information and the reference picture index information may be changed based on these three determinations.
As a binarization method of the prediction direction information, the first RPS check, a restriction of bi-directional prediction according to block size, and the like may be considered. For example, bi-prediction may be restricted when the sum of the width and height of a block is less than or equal to a predetermined threshold length. In this case, the threshold length is a value preset in the encoding/decoding apparatus and may be 8, 12, 16, etc. For blocks for which the first RPS check is false and bi-prediction is allowed, binarization may be performed by assigning 1 to bi-prediction, 00 to the list 0 direction, and 01 to the list 1 direction. For blocks for which the first RPS check is false and bi-prediction is restricted, binarization may be performed by assigning 0 to the list 0 direction and 1 to the list 1 direction. For blocks for which the first RPS check is true and bi-prediction is allowed, binarization may be performed by assigning 1 to bi-prediction and 0 to the list 0 direction; this is because the reference pictures in the list 1 direction already exist in the list 0 direction, so list 1 direction prediction need not be performed. For blocks for which the first RPS check is true and bi-prediction is restricted, there is no need to transmit prediction direction information or to binarize the corresponding information. In this case, RPS A in fig. 25 is an example where the first RPS check is false, and RPS B and RPS C in fig. 25 are examples where it is true.
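For clarity, the four assignment rules above can be sketched as follows (illustrative names; the function returns the bin string to signal, or None when nothing needs to be transmitted):

```python
def binarize_pred_direction(direction, first_rps_check, bi_pred_allowed):
    """direction is one of 'bi', 'L0', 'L1'."""
    if not first_rps_check:
        if bi_pred_allowed:
            return {'bi': '1', 'L0': '00', 'L1': '01'}[direction]
        return {'L0': '0', 'L1': '1'}[direction]
    if bi_pred_allowed:
        # list 1 references all exist in list 0, so 'L1' never occurs
        return {'bi': '1', 'L0': '0'}[direction]
    return None  # first RPS check true and bi-prediction restricted
```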
The prediction direction information may also be binarized only by the method used when the first RPS check is false, regardless of the first RPS check result. In this case, when bi-prediction is not restricted, the second bin indicating whether the prediction direction is list 0 or list 1 should be encoded, and entropy encoding/decoding using CABAC may be performed by considering the first RPS check. For example, when the first RPS check condition is regarded as the context of the second bin of the prediction direction information, the occurrence probability states of the MPS (most probable symbol) and LPS (least probable symbol) may be updated by using the initial probability of context index No. 4 in the context initial probability table of fig. 27, because list 1 prediction cannot occur if the corresponding condition is true. If the corresponding condition is false, the occurrence probability states of the MPS and LPS may be updated by using the initial probability of context index No. 1. In this example, the second bin indicating list 0 is 0, so the MPS information is 0 and the LPS information is 1. The probability information can be updated by the rule of LPS probability change in fig. 28. In fig. 28, the probability state index (σ) on the horizontal axis is index information representing preset levels of the LPS occurrence probability, and the vertical axis represents the LPS occurrence probability. For example, when σ is 5, the LPS occurrence probability is about 40%, and if an update is performed to increase the LPS occurrence probability, it may be updated to a probability of about 44% (the LPS occurrence probability when σ is 3) according to the rule of LPS occurrence probability change in fig. 28. Referring again to fig. 27 in this way, when the context index is No. 4, the initial occurrence probability of the LPS is 5%, which corresponds to σ = 31 in fig. 28, and when the context index is No. 1, the initial occurrence probability of the LPS is 35%, which corresponds to σ = 7 in fig. 28. With this occurrence probability state as the initial information, the occurrence probability states of the MPS and LPS may be continuously updated by considering the context of the second bin of the prediction direction information.
The reference picture index information may be binarized by considering all of the first, second, and third RPS checks, and may be binarized based on the number of reference pictures in the RPS for each prediction direction. Referring to fig. 26, the binarization method of the reference picture index information when the first RPS check is false, the second RPS check is true, and the third RPS check is false differs from that of all other conditions, for which it is the same.
In this case, for the other conditions, binarization may be performed according to the index order of the reference pictures and the number of reference pictures. For example, when the number of reference pictures is 5, the reference picture index information may be binarized into 0, 10, 110, 1110, and 1111.
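This is the usual truncated unary code; a minimal sketch with illustrative names:

```python
def binarize_ref_idx(ref_idx, num_ref):
    # truncated unary: ref_idx ones plus a terminating zero, except that
    # the last index (num_ref - 1) drops the terminator
    if ref_idx == num_ref - 1:
        return '1' * ref_idx
    return '1' * ref_idx + '0'

# e.g. [binarize_ref_idx(i, 5) for i in range(5)]
#      -> ['0', '10', '110', '1110', '1111']
```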
For the remaining case (the first RPS check is false, the second RPS check is true, and the third RPS check is false), the reference pictures in the list 1 direction also exist among the reference pictures in the list 0 direction, but the index order of each reference picture is different. In this case, binarization may be performed by two methods.
In the first method, binarization may be performed separately by dividing the RPS into a common reference picture group and a non-common reference picture group according to the prediction direction. In the table representing the binarization method of the reference picture index information, the RPS common POC is the common reference picture group, and the RPS non-common POC is the non-common reference picture group. Referring to RPS D in fig. 25, there are 3 reference pictures, numbered 1, 3, and 4, in the common reference picture group, and 2 reference pictures, numbered 0 and 2, in the non-common reference picture group. Thus, for reference pictures No. 1, 3, and 4 in the RPS common POC, the reference picture index information can be binarized to 00, 010, and 011, and for reference pictures No. 0 and 2 in the RPS non-common POC, the reference picture index information can be binarized to 10 and 11.
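A sketch of this grouped binarization under the RPS D example (group contents are passed in by the caller; all names are illustrative):

```python
def truncated_unary(pos, size):
    return '1' * pos if pos == size - 1 else '1' * pos + '0'

def binarize_grouped_ref_idx(ref_idx, common_group, non_common_group):
    # first bin selects the group: 0 = RPS common POC, 1 = RPS non-common POC
    if ref_idx in common_group:
        return '0' + truncated_unary(common_group.index(ref_idx), len(common_group))
    return '1' + truncated_unary(non_common_group.index(ref_idx), len(non_common_group))

# with common_group = [1, 3, 4] and non_common_group = [0, 2], indexes
# 1, 3, 4 map to 00, 010, 011 and indexes 0, 2 map to 10, 11
```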
The second method corresponds to the case where the prediction direction is not bi-directional prediction. As in the first method, the reference pictures for each prediction direction of the RPS are divided into a common reference picture group and a non-common reference picture group. However, the first bin of the reference picture index information indicating the group to which the current reference picture belongs (the underlined bin in the table of the binarization method for reference picture index information in fig. 26) is not transmitted; instead, the bin of the prediction direction information indicating whether the direction is list 0 or list 1 (the underlined bin in the table of the binarization method for prediction direction information in fig. 26) is reused for this purpose. That is, the underlined bin in the table of the binarization method for prediction direction information in fig. 26 is used as information indicating whether the reference picture of the current block belongs to the common reference picture group, rather than indicating whether the prediction direction is the list 0 direction or the list 1 direction. In this case, when the prediction direction information is binarized, only the bin indicating whether the prediction direction is bi-directional prediction is transmitted, and when bi-prediction is restricted, no prediction direction information is transmitted.
Fig. 29 is a block diagram illustrating an intra prediction unit of the image encoding apparatus.
When intra prediction is selected as the prediction mode of the current block, reference pixels surrounding the current block are derived and filtered in the reference pixel generation unit 2901. The reference pixels are determined by using reconstructed pixels around the current block. When some reconstructed pixels cannot be used, or there are no reconstructed pixels around the current block, the unavailable region may be filled with an available reference pixel or the middle value of the range of values that a pixel may have. After all reference pixels are derived, filtering is performed by using an AIS (adaptive intra smoothing) filter.
The optimal intra prediction mode determination unit 2902 determines one prediction mode among M intra prediction modes, where M denotes the total number of intra prediction modes. A prediction block is generated for each directional and non-directional intra prediction mode by using the filtered reference pixels, and the one intra prediction mode having the lowest cost value is selected by comparing the RD costs of the individual intra prediction modes.
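The mode decision can be pictured with the following minimal sketch, where predict() and rd_cost() are placeholders standing in for the encoder's prediction and cost computations:

```python
def select_intra_mode(modes, ref_pixels, original_block, predict, rd_cost):
    best_mode, best_cost = None, float('inf')
    for mode in modes:                        # the M intra prediction modes
        pred_block = predict(mode, ref_pixels)
        cost = rd_cost(original_block, pred_block, mode)
        if cost < best_cost:                  # keep the lowest RD cost
            best_mode, best_cost = mode, cost
    return best_mode
```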
Fig. 30 is a block diagram showing the inter prediction unit 3000 of the image encoding apparatus in detail.
The inter prediction unit may be divided into the merge candidate search unit 3002 and the AMVP candidate search unit 3004 according to the method of deriving motion information. The merge candidate search unit 3002 sets reference blocks using inter prediction among the reconstructed blocks around the current block as merge candidates. The merge candidates are derived in the encoding/decoding apparatus by the same method and in the same number, and the number of merge candidates is transmitted from the encoding apparatus to the decoding apparatus. In this case, when the agreed number of merge candidates cannot be set from the reconstructed reference blocks around the current block, motion information of a block at the same position as the current block is brought in from a picture other than the current picture. Alternatively, motion information in the past direction and the future direction relative to the current picture is combined and filled in as a candidate, or a block at the same position in another reference picture is set as motion information to form the merge candidates.
The AMVP candidate search unit 3004 determines motion information of the current block in a motion estimation (motion prediction) unit 3005. The motion estimation unit 3005 finds a prediction block most similar to the current block among the reconstructed images.
In the inter prediction unit, after determining motion information of the current block by using one of the merge candidate search unit and the AMVP candidate search unit, a prediction block is generated by motion compensation 3006.
Fig. 31 illustrates a method of encoding prediction mode information.
The skip mode operation information encoding (S3101) encodes information indicating whether the prediction mode information of the current block uses the merge information of inter prediction and whether the prediction block is used as the reconstructed block in the decoding apparatus.
If the skip mode operates, the determined merge candidate index is encoded (S3103); if the skip mode does not operate, prediction mode encoding is performed (S3104).
The prediction mode encoding (S3104) encodes whether the prediction mode of the current block is inter prediction or intra prediction. When the inter prediction mode is selected, the merge mode operation information is encoded (S3106). When the merge mode operates (S3107), merge candidate index encoding is performed (S3103). When the merge mode does not operate, prediction direction encoding is performed (S3108). The prediction direction encoding (S3108) indicates whether the direction of the reference picture used is the past direction, the future direction, or both directions, based on the current picture. When the prediction direction is the past direction or bi-directional (S3109), the inter prediction motion information of the current block may be indicated by encoding the reference picture index information in the past direction (S3110), the MVD information in the past direction (S3111), and the MVP information in the past direction (S3112); when the prediction direction is the future direction or bi-directional (S3113), the inter prediction motion information of the current block may be indicated by encoding the reference picture index information in the future direction (S3114), the MVD information in the future direction (S3115), and the MVP information in the future direction (S3116). The information encoded in the inter prediction process is referred to as inter prediction unit mode information encoding.
When the prediction mode is the intra prediction mode, the MPM operation information is encoded (S3117). The MPM operation information indicates that, when a reconstructed block surrounding the current block has the same prediction mode information as the current block, that prediction mode information is used without encoding the prediction mode information of the current block. When the MPM operates (S3118), which reconstructed block's prediction mode is used as the prediction mode of the current block is indicated by MPM index encoding (S3119); when the MPM does not operate (S3118), residual prediction mode encoding is performed (S3120). The residual prediction mode encoding encodes the index of the prediction mode used as the prediction mode of the current block among the remaining prediction modes excluding the prediction modes selected as MPM candidates. The information encoded in the intra prediction process is referred to as intra prediction unit mode information encoding.
Fig. 32 and 33 show an intra prediction unit and an inter prediction unit of the image decoding apparatus.
For the intra prediction unit 3200, only the process of determining the optimal prediction mode of fig. 29 is omitted, and the process of generating a prediction block based on the optimal prediction mode operates in substantially the same manner as the intra prediction unit of the image encoding apparatus.
For the inter prediction unit 3300, only the process of determining the optimal prediction mode of fig. 30 is omitted, and the process of generating a prediction block based on the optimal prediction mode operates in substantially the same manner as the inter prediction unit of the image encoding apparatus.
Fig. 34 illustrates a method of decoding prediction mode information. It basically operates in the same manner as the method of encoding prediction mode information in fig. 31.
Fig. 35 is a flowchart illustrating an encoding method of a transform block.
The encoding method of the transform block in fig. 35 may be performed by the entropy encoding unit 105 of the image encoding device 100.
First, when the transform coefficients are scanned in the reverse scan order, the first non-zero coefficient is determined as the basic coefficient, and its position information Last_sig is encoded (S3501).
A sub-block containing the basic coefficient is selected (S3502), and the transform coefficient information in the corresponding sub-block is encoded. When a sub-block does not contain the basic coefficient, sub-block information is encoded before encoding the coefficients in the transform block (S3503). Coded_sub_blk_flag (sub-block information) is a flag indicating whether at least one non-zero coefficient exists in the current sub-block. Subsequently, the non-zero coefficient information is encoded (S3504). In this case, Sig_coeff_flag (non-zero coefficient information) indicates whether the value of each coefficient in the sub-block is 0.
Further, coefficient information greater than N is encoded (S3505). In this case, the coefficient information greater than N indicates, for all coefficients in the sub-block, whether the absolute value of each coefficient is greater than each of the values from 1 to N, respectively. N may be any value preset in encoding and decoding, or the value of N may be encoded so that the same value is used in encoding and decoding. The number of pieces of coefficient information greater than N may be any preset value, or may differ according to the position of the basic coefficient. Coefficient information greater than N may be encoded for all or some of the coefficients in the sub-block, and may be encoded sequentially in the scan order of the coefficients.
For example, when N is set to 3, whether the absolute value of each coefficient is greater than 1 is encoded for all non-zero coefficients in the sub-block. For this purpose, a flag Abs_greater1_flag indicating whether the absolute value of the coefficient is greater than 1 is used. Subsequently, only for coefficients determined to have a value greater than 1, whether the value is greater than 2 is encoded. For this purpose, a flag Abs_greater2_flag indicating whether the absolute value of the coefficient is greater than 2 is used. Finally, only for coefficients determined to have a value greater than 2, whether the value is greater than 3 is encoded. For this purpose, a flag Abs_greater3_flag indicating whether the absolute value of the coefficient is greater than 3 is used.
Alternatively, for the non-zero coefficients in the sub-block, whether the absolute value of each coefficient is greater than 1 is encoded, using the flag Abs_greater1_flag. Then, whether the coefficient is even or odd may be encoded only for coefficients determined to have a value greater than 1; for this, parity information indicating whether the coefficient is even or odd may be used. Further, whether the absolute value of the coefficient is greater than 3 may be encoded, using the flag Abs_greater3_flag.
As described above, the coefficient information greater than N may include at least one of Abs_greaterN_flag and a flag indicating whether the coefficient is even. In this case, N may be 1, 2, or 3, but is not limited thereto; N may be a natural number greater than 3, such as 4, 5, 6, 7, 8, 9, and so forth.
Subsequently, for each coefficient determined to be non-zero, sign information indicating whether it is negative or positive is encoded (S3506). For the sign information, Sign_flag may be used.
Further, the value obtained by subtracting N from a coefficient whose absolute value is determined to be greater than N is defined as residual coefficient information, and the residual value information remaining_coeff of the coefficient is encoded (S3507). In this case, the information of each coefficient may be encoded by moving to the subsequent coefficient after performing the processes S3504, S3505, S3506, and S3507 on each coefficient. Alternatively, the information of the coefficients in the sub-block may be encoded one step at a time. For example, when there are 16 coefficients in a sub-block, S3504 may first be performed for each of the 16 coefficients, then the S3505 process may be performed only on the coefficients whose absolute values were determined to be non-zero in S3504, followed by the S3506 process. Subsequently, when the absolute value of the current coefficient cannot be fully represented in the processing of S3505, the processing of S3507 may be performed. The absolute value of a non-zero coefficient may be derived by decoding at least one of Sig_coeff_flag, one or more Abs_greaterN_flags, the parity information, and the residual value information.
After all coefficient information of the current sub-block is encoded, it is checked whether a subsequent sub-block exists (S3509). When there is a subsequent sub-block, the process moves to that sub-block (S3510) and its sub-block information is encoded (S3503). The sub-block information Coded_sub_blk_flag is then checked (S3508); when the value of Coded_sub_blk_flag is true, the non-zero coefficient information Sig_coeff_flag is encoded, and when it is false, there is no coefficient to be encoded in the corresponding sub-block, so it is checked whether a subsequent sub-block exists. Alternatively, for the sub-block located at the lowest frequency, the sub-block information may be set to true equally in encoding and decoding without being encoded and decoded, on the assumption that a non-zero coefficient will exist there.
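To make the flag derivation concrete, the following sketch computes the symbols of fig. 35 for one sub-block with N = 3; the entropy coding of each symbol is omitted, and the names follow the flags above.

```python
def derive_subblock_symbols(coeffs, n=3):
    """coeffs: coefficient levels of one sub-block in scan order."""
    symbols = []
    for level in coeffs:
        syms = {'Sig_coeff_flag': 1 if level != 0 else 0}
        if level != 0:
            abs_level = abs(level)
            for k in range(1, n + 1):
                if abs_level > k - 1:             # Abs_greaterK_flag is coded
                    syms[f'Abs_greater{k}_flag'] = 1 if abs_level > k else 0
            syms['Sign_flag'] = 1 if level < 0 else 0
            if abs_level > n:                     # residual value information
                syms['remaining_coeff'] = abs_level - n
        symbols.append(syms)
    return symbols
```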
In fig. 35, for convenience of description, the encoding of the sign information (S3506) is described as a process following S3505, but the S3506 process may also be performed between S3504 and S3505, or after S3507.
Fig. 36 is a flowchart illustrating a decoding method of a transform block.
The method of decoding the transform block in fig. 36 corresponds to the method of encoding the transform block in fig. 35. The decoding method of the transform block in fig. 36 may be performed by the entropy decoding unit 601 of the image decoding device 600 in fig. 6.
The information to be encoded is subjected to context-adaptive binary arithmetic processing after binarization. Context-adaptive binary arithmetic processing refers to a process of symbolizing and encoding the coding information in a block by applying different occurrence probabilities to the symbols, using probability information that depends on the situation. In this example, for convenience of description, only 0 and 1 are used as symbols, but the number of symbols may be N (N is a natural number equal to or greater than 2).
The probability information refers to the occurrence probabilities of 0 and 1 in the binarized information. The occurrence probabilities of the two symbols may be set equally or differently according to the previous reconstruction information, and depending on the information, there may be M pieces of probability information. In this case, the M pieces of probability information may be implemented as probability tables.
Fig. 37 is a flowchart showing a context-adaptive binary arithmetic encoding method. First, probability initialization is performed (S3701). The probability initialization is a process of dividing the probability interval for the binarized information according to the probability set in the probability information. As to which probability information is to be used, the same condition may be applied by a rule preset in the encoding apparatus or the decoding apparatus, or the probability information may be encoded separately. The initial probability interval may likewise be determined by a preset rule in the encoding/decoding process, or may be newly encoded and used. Alternatively, the probability interval and probability information of a previously used coding parameter may be carried over without performing probability initialization.
When the binary information of the current encoding parameter to be encoded is determined (S3702), it is encoded by using the probability interval state up to the previous step and the previous probability information of the same encoding parameter (S3703). Then, the probability information and the probability interval may be updated for the binary information to be encoded subsequently (S3704). Further, when there is encoding parameter information to be encoded later (S3705), the above process is repeated after moving to the subsequent encoding parameter information (S3706). If there is no encoding parameter information to be encoded subsequently, the flowchart ends.
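The adaptive loop of fig. 37 can be sketched as follows; the arithmetic-coding core is abstracted into a callback, and the probability update model (exponential decay with an assumed rate) is illustrative only, not the normative state machine.

```python
class BinProbabilityModel:
    def __init__(self, p_one=0.5):
        self.p_one = p_one                 # current probability of symbol 1

    def update(self, bin_value, rate=0.05):
        # S3704: move the estimate toward the bin just coded
        self.p_one += rate * ((1.0 if bin_value else 0.0) - self.p_one)

def encode_parameter_bins(bins, model, arith_encode_bin):
    for b in bins:                         # S3702: bins of the parameter
        arith_encode_bin(b, model.p_one)   # S3703: code with current model
        model.update(b)                    # S3704: adapt for the next bin
```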
Fig. 38 is a flowchart showing a context-adaptive binary arithmetic decoding method. Unlike the encoding apparatus, the decoding apparatus determines the information of the current encoding parameter (S3803) after decoding the binary information of the encoding parameter by using the probability information and the probability interval (S3802). Since the decoding method in fig. 38 corresponds to the encoding method in fig. 37, a detailed description is omitted.
In steps S3703 of fig. 37 and S3802 of fig. 38 described above, encoding or decoding may be performed by selectively using the optimal probability information among the M pieces of probability information, which is preset by using the information (or encoding parameters) already reconstructed around each encoding parameter.
For example, probability information having a high occurrence probability of information depending on the size of the transform block is used as the probability information of the coding parameter.
Alternatively, the probability information may be differently applied according to information of surrounding coefficients of a coefficient to be currently encoded or decoded, and the probability information of the information to be currently encoded or decoded may be selected by using the probability information of the previously encoded or decoded information.
Fig. 39 and 40 are diagrams showing examples in which probability information is differently applied according to information of surrounding coefficients.
Fig. 39 is an example of a probability information table for encoding or decoding the Sig_coeff_flag information value of the current coefficient. When the number of coefficients having the same information value as the Sig_coeff_flag information value of the current coefficient, among the coefficients adjacent to the coefficient to be currently encoded or decoded, is 1, index 8 is assigned to the current coefficient; in this case, the probability of symbol 1 (the Sig_coeff_flag binary information of the current coefficient) is 61%, and the probability of symbol 0 is 39%. When the number of such surrounding coefficients is 2, index 5 is assigned to the current coefficient; in this case, the probability of symbol 1 is 71%, and the probability of symbol 0 is 29%. When the number of such surrounding coefficients is 3, index 2 is assigned to the current coefficient; in this case, the probability of symbol 1 is 87%, and the probability of symbol 0 is 13%.
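The quoted values can be arranged as a small lookup, which is how such a context selection might look in code; only the three rows quoted above are reproduced, since the rest of the fig. 39 table is not given in the text.

```python
# neighbour count -> (probability table index, P(symbol 1))
SIG_CTX_BY_NEIGHBOURS = {
    1: (8, 0.61),   # one matching neighbour    -> index 8, P(1) = 61 %
    2: (5, 0.71),   # two matching neighbours   -> index 5, P(1) = 71 %
    3: (2, 0.87),   # three matching neighbours -> index 2, P(1) = 87 %
}

def sig_coeff_context(num_matching_neighbours):
    return SIG_CTX_BY_NEIGHBOURS[num_matching_neighbours]
```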
After encoding or decoding the current coefficient by using the probability information table shown in fig. 39, the probability information may be updated as in fig. 40.
On the other hand, for the non-zero coefficient information Sig_coeff_flag, probability information with a higher occurrence probability of Sig_coeff_flag may be used as the position is closer to the low-frequency domain.
Further, the probability information of the current coefficient information greater than N may be set by using the probability information of the coefficient information greater than N encoded/decoded immediately before, or the probability information of the coefficient information greater than N first encoded/decoded in units of sub-blocks may be used as it is. As described above, the coefficient information greater than N may include at least one of Abs_greater1_flag, Abs_greater2_flag, Abs_greater3_flag, ..., and Abs_greaterN_flag.
In addition, the sub-block information Coded_sub_blk_flag may use the probability information of M surrounding encoded/decoded sub-blocks, or the probability information of the sub-block encoded/decoded immediately before.
Fig. 41 is a diagram of adding an intra block copy prediction unit to the image encoding apparatus 100 of fig. 1. The intra block copy prediction unit may generate a prediction block of a block to be currently encoded by using a reconstructed region in a current picture.
Figs. 42 to 47 are examples of generating a prediction block in the intra block copy prediction unit. In the figures, CB denotes the current block, and PB denotes the prediction block.
The motion search range may be limited to the reconstructed region. For example, as in fig. 42, the motion search range may be limited to lie only within the reconstructed region in the current picture, or, as in fig. 43, a position may be included in the motion search range when even only part of the prediction block belongs to the reconstructed region. Alternatively, as in the example of fig. 44, a position where the current block and the prediction block partially overlap may be included in the motion search range. The partially overlapping region may be filled by using neighboring reconstructed pixels, and the overlapping region may be predicted by using the reconstructed pixels.
Figs. 45 to 47 are diagrams illustrating examples of a method of generating a prediction block when the prediction block and the current block overlap. A denotes the region where the current block and the prediction block overlap, and B denotes the neighboring reconstructed pixels around the region A to be predicted. The usable reconstructed pixels may differ according to the direction (top-left, top-right, left, etc. of the current block) in which the prediction block exists. In this case, the region A may be predicted by using M (M is an integer equal to or greater than 1) intra prediction modes, where M is the number of prediction modes available for intra prediction of the current block. Fig. 48 is a schematic diagram illustrating prediction of the pixels in region A by using a reconstructed pixel line available for that prediction and the upper-left direction mode among the M available intra prediction modes.
Instead of indicating a prediction block of the current block within the reconstructed region of the current picture, the motion vector of the current block may be used to indicate a reconstructed pixel line used to derive a reference pixel line of the current block, as in the diagram of fig. 49. Intra prediction may be performed by using a reference pixel line of the reconstructed region that is not adjacent to the current block, based on the M prediction modes, and the prediction mode that generates the optimal prediction block may be selected. In this case, prediction blocks may be generated by using W (W is an integer equal to or greater than 1) reference pixel lines and the prediction mode of the optimal reference pixel line, and the optimal prediction block may be generated by performing a weighted sum of the prediction blocks generated by using different prediction modes or the same prediction mode for the W reference pixel lines.
Alternatively, after prediction blocks are respectively generated by using intra block copy prediction and intra prediction, an optimal prediction block may be generated by performing weighted summation on the generated prediction blocks. After prediction blocks are respectively generated by using intra block copy prediction and inter prediction, an optimal prediction block may be generated by performing weighted summation on the generated prediction blocks.
In addition, the reference pixel line may use only the reconstructed pixels on the top or the reconstructed pixels on the left. In this case, when the prediction block and the current block overlap, the prediction block may be generated by using a reference pixel line used in generating the prediction block.
In the diagrams of figs. 42 to 48, the motion search range is the current picture as an example, but it may be limited to the CTU or CTU line to which the current block belongs, or to neighboring CTUs other than the current CTU. For example, the prediction block (PB) indicated by the motion vector of the current block (CB) may be restricted to belong to the same CTU or CTU line as the current block.
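A simplified validity test for a block vector under these restrictions might look as follows; raster coding order is assumed for the reconstructed-region test, and all names and the exact region approximation are illustrative rather than normative.

```python
def bv_is_valid(cur_x, cur_y, bv_x, bv_y, blk_w, blk_h, ctu_size):
    pb_x, pb_y = cur_x + bv_x, cur_y + bv_y   # top-left of the prediction block
    if pb_x < 0 or pb_y < 0:
        return False
    # reconstructed-region test (simplified): PB must end above the rows of
    # CB, or lie entirely to the left of CB without extending below it
    fully_above = pb_y + blk_h <= cur_y
    left_of_cb = pb_x + blk_w <= cur_x and pb_y + blk_h <= cur_y + blk_h
    if not (fully_above or left_of_cb):
        return False
    # optional restriction: PB must belong to the same CTU as CB
    return (pb_x // ctu_size == cur_x // ctu_size and
            pb_y // ctu_size == cur_y // ctu_size)
```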
Fig. 50 is a block diagram illustrating the intra block copy prediction unit of the image encoding apparatus in detail; the intra block copy prediction S5001 may be divided into the CPR_Merge candidate search unit S5002 and the CPR_AMVP candidate search unit S5004. The CPR_Merge candidate search unit S5002 may use a reconstructed block as a CPR_Merge candidate. The reconstructed block may be a block encoded/decoded by inter prediction, or may be limited to a block encoded/decoded by the intra block copy (IBC) mode among the surrounding blocks. The maximum number of CPR_Merge candidates may be used equally in the encoding/decoding apparatus or may be transmitted in the higher header. In this case, the maximum number may be 2, 3, 4, 5, or more. The higher header represents higher-level header information containing picture and block information, such as the video parameter level, the sequence parameter level, the picture parameter level, the slice level, etc. A method of deriving the CPR_Merge candidates is described by using fig. 51.
Fig. 51 shows spatial candidates adjacent to the current block. AL, A, AR, L, and BL are the positions of reconstructed blocks belonging to the same picture as the current block, and they can be used as CPR_Merge candidates. For example, when inter prediction or intra block copy prediction is used in the reconstructed blocks at the positions AL, A, AR, L, and BL, they can be used as CPR_Merge candidates. The order of the reconstructed blocks may be determined by the order L, A, AR, BL, AL or by other various priorities. Spatially neighboring reconstructed blocks may be used as CPR_Merge candidates only if the size of the current block is larger than a predetermined threshold size. The size of the current block may be expressed as the width, the height, the sum of the width and the height, the product of the width and the height, the minimum/maximum value of the width and the height, etc. For example, when the product of the width and the height of the current block is greater than 16, the reconstructed block may be used as a CPR_Merge candidate; otherwise, it may not be used as a CPR_Merge candidate.
When the CPR_Merge candidate list is not filled with the maximum number of candidates, the motion information stored in the H buffer may be added to the CPR_Merge candidate list. The H buffer may store the motion information of blocks encoded/decoded before the current block. Alternatively, when the maximum number of candidates is not reached in the CPR_Merge candidate list and the intra block copy prediction technique was used for the reconstructed block at the same position as the current block in a previously reconstructed picture, the motion information of the corresponding reconstructed block may be added as a CPR_Merge candidate.
Alternatively, a default vector candidate may be added when the number of CPR_Merge candidates added so far is less than the maximum number of candidates. The default vector represents a vector equally determined in the encoding/decoding apparatus. For example, when the default vectors are (0,0), (-10,0), (0,-10), (-15,0), and (0,-15) and 2 CPR_Merge candidates are missing, 2 default vectors may be added to the CPR_Merge candidate list in order from the front. Subsequently, the RD cost of each piece of motion information in the CPR_Merge candidate list is calculated, and the motion information having the optimal RD cost is determined (S5003).
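The list construction order described above (spatial candidates, then the H buffer, then default vectors) can be sketched as follows; the deduplication and the candidate ordering are illustrative assumptions.

```python
DEFAULT_VECTORS = [(0, 0), (-10, 0), (0, -10), (-15, 0), (0, -15)]

def build_cpr_merge_list(spatial_mvs, h_buffer, max_cands):
    cands = []
    for mv in spatial_mvs:             # e.g. in L, A, AR, BL, AL order
        if mv is not None and mv not in cands and len(cands) < max_cands:
            cands.append(mv)
    for mv in h_buffer:                # block vectors stored before this block
        if mv not in cands and len(cands) < max_cands:
            cands.append(mv)
    for mv in DEFAULT_VECTORS:         # fill remaining slots in order
        if len(cands) >= max_cands:
            break
        if mv not in cands:
            cands.append(mv)
    return cands
```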
The CPR_AMVP candidate search unit S5004 may determine at least one of the CPR_MVP candidate or the CPR_MVD information by using the motion information of surrounding blocks after generating a prediction block within the motion search range. The maximum number of CPR_MVP candidates may be used equally in the encoding/decoding apparatus or may be transmitted in the higher header. In this case, the maximum number may be 2, 3, 4, 5, or more. The amount of CPR_MVP information may be used equally in the encoding/decoding apparatus or may be transmitted in the higher header. A method of deriving the CPR_MVP candidates is described by using fig. 51. AL, A, AR, L, and BL are the positions of reconstructed blocks belonging to the same picture as the current block, and they can be used as CPR_MVP candidates. When inter prediction or intra block copy prediction is used in the reconstructed blocks at the positions AL, A, AR, L, and BL, they can be used as CPR_MVP candidates. The order of the reconstructed blocks may be determined by the order L, A, AR, BL, AL or by various priorities. Spatially neighboring reconstructed blocks may be used as CPR_MVP candidates only when the size of the current block is greater than a predetermined threshold size. The size of the current block may be expressed as the width, the height, the sum of the width and the height, the product of the width and the height, the minimum/maximum value of the width and the height, etc. For example, when the product of the width and the height of the current block is greater than 16, the reconstructed block may be used as a CPR_MVP candidate; otherwise, it may not be used as a CPR_MVP candidate.
When the CPR_MVP candidate list is not filled with the maximum number of candidates, the motion information stored in the H buffer may be added to the CPR_MVP candidate list. The H buffer may store the motion information of blocks encoded/decoded before the current block. Alternatively, when the maximum number of candidates is not reached in the CPR_MVP candidate list and the intra block copy prediction technique was used for the reconstructed block at the same position as the current block in a previously reconstructed picture, the motion information of the corresponding reconstructed block may be added as a CPR_MVP candidate.
When the number of CPR_MVP candidates added so far is less than the maximum number of candidates, a default vector may be added to the CPR_MVP candidate list. The CPR_MVD information may be the difference between the motion information of the current block and the motion information stored in the CPR_MVP candidate. For example, when the motion vector of the current block is (-14,-14) and the motion vector of the CPR_MVP candidate is (-13,-13), the CPR_MVD information may be the difference (-1,-1), i.e., (-14,-14) - (-13,-13). Alternatively, when the current block and the prediction block cannot overlap within the motion search range, the motion vector may be expressed as in the following formulas 5 and 6 according to the size of the current block.
[ formula 5]
MV.x = Curr_MV.x - Curr_blk_width
[ formula 6]
MV.y = Curr_MV.y - Curr_blk_height
In formulas 5 and 6, Curr_MV.x and Curr_MV.y are the x and y components of the motion vector of the current block. Curr_blk_width and Curr_blk_height may be determined as various values, such as the horizontal size of the current block, the vertical size, 1/2 of the horizontal size, 1/2 of the vertical size, and the like. MV is the finally derived motion vector of the current block. For example, when the motion vector of the current block is (-14,-14) and the size of the current block is (4,4), the transmitted motion vector may be set to (-10,-10). Alternatively, a vector obtained by subtracting only half of the horizontal and vertical lengths of the current block from the motion vector of the current block may be determined as the motion vector of the current block. Subsequently, the RD cost of each piece of motion information in the CPR_MVP candidate list is calculated, and the motion information having the optimal RD cost is determined (S5005).
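Formulas 5 and 6 can be checked against the example above with the following sketch, where the transmitted vector omits the block size and the final block vector is recovered by subtracting it (names illustrative):

```python
def derive_final_bv(coded_mv, blk_width, blk_height):
    mv_x = coded_mv[0] - blk_width    # formula 5: MV.x = Curr_MV.x - Curr_blk_width
    mv_y = coded_mv[1] - blk_height   # formula 6: MV.y = Curr_MV.y - Curr_blk_height
    return (mv_x, mv_y)

# derive_final_bv((-10, -10), 4, 4) == (-14, -14)
```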
In the intra block copy prediction unit, after the motion information of the current block is determined by using one of the CPR_Merge candidate search unit and the CPR_AMVP candidate search unit, a prediction block is generated by motion compensation (S5006).
Fig. 52 illustrates a method of encoding prediction mode information.
The skip mode operation information encoding (S5201) encodes information indicating whether the prediction block is used as the reconstructed block in the decoding apparatus.
The prediction mode encoding (S5202) may encode whether the prediction mode of the current block is inter prediction, intra prediction, or intra block copy prediction. When encoding by inter prediction is selected (S5203), the inter prediction unit mode information may be encoded (S5204). The inter prediction unit mode information encoding (S5204) may play the same role as the inter prediction unit mode information encoding in fig. 31. When the prediction mode is encoded as intra prediction (S5205), the intra prediction unit mode information may be encoded (S5206); this encoding may play the same role as the intra prediction unit mode information encoding in fig. 31. When the intra block copy prediction mode is selected, the CPR_Merge mode operation information may be encoded (S5207). When the CPR_Merge mode operates (S5208), CPR_Merge candidate index encoding may be performed (S5209). When the CPR_Merge mode does not operate, CPR_MVD information encoding may be performed (S5210) and the CPR_MVP candidate may be encoded (S5211). When it is determined by using the CPR_MVP candidate and the CPR_MVD information that the current block and the prediction block overlap, a prediction mode for the overlapping region may be additionally encoded. In addition, when intra prediction is performed as in the example of fig. 49, intra prediction mode information encoding (S5206) may be performed after the CPR_Merge candidate index encoding (S5209) and the CPR_MVP candidate encoding (S5211).
In this case, when there is no previously reconstructed picture that can be used by the current picture due to the higher header setting, the inter prediction unit mode information may be omitted in the prediction mode encoding (S5202).
The prediction mode information encoding may be performed by using fig. 31.
The intra block copy prediction unit mode information may also be represented as inter prediction unit mode information. It can be represented by adding current picture information to the reference picture index information set for the inter prediction information. For example, when there are reference picture indexes No. 0 to No. 4, No. 0 to No. 3 may represent previously reconstructed pictures and No. 4 may represent the current picture. In the merge candidate index coding (S3103), when the past direction information is used and the past direction reference picture index information indicates the current picture, the intra block copy prediction technique may be performed; for other cases, the inter prediction technique may be performed. In addition, in encoding the AMVP mode information, when the prediction direction information is encoded as the past direction (S3108) and the past direction reference picture index information is encoded as the current picture (S3110), the past direction MVD information (S3111) and the past direction MVP candidate (S3112) may be information for intra block copy prediction; for other cases, they may be information for the inter prediction technique. In this case, when there is no previously reconstructed picture that can be used by the current picture due to the higher header setting, the processes of prediction direction encoding (S3108), past direction reference picture index information encoding (S3110), future direction reference picture index information encoding (S3114), future direction MVD information encoding (S3115), and future direction MVP information encoding (S3116) may be omitted, and when inter prediction is encoded in the prediction mode encoding step, it may represent intra block copy prediction instead of inter prediction.
Fig. 53 is a diagram of adding an intra block copy prediction unit to the image decoding apparatus 600 of fig. 6.
Fig. 54 shows an intra block copy prediction unit of the image decoding apparatus.
For the intra block copy prediction unit of the decoding apparatus, only the process of determining the optimal prediction mode in fig. 50 is omitted; the process of generating a prediction block from the received prediction mode determined to be optimal operates in substantially the same manner as the intra block copy prediction unit of the image encoding apparatus.
Fig. 55 is a method of decoding prediction mode information.
The skip mode operation information decoding (S5501) decodes information indicating whether the prediction block is used as the reconstructed block in the decoding apparatus.
The prediction mode decoding (S5502) may decode whether the prediction mode of the current block is inter prediction, intra prediction, or intra block copy prediction. When the prediction mode is decoded as inter prediction (S5503), inter prediction unit mode information may be decoded (S5504). The inter prediction unit mode information decoding (S5504) may play the same role as the inter prediction unit mode information decoding in fig. 34. When the prediction mode is decoded as intra prediction (S5505), intra prediction unit mode information may be decoded (S5506). The intra prediction unit mode information decoding may play the same role as the intra prediction unit mode information decoding in fig. 34. When the intra block copy prediction mode is selected, CPR_Merge mode operation information may be decoded (S5507). When the CPR_Merge mode operates (S5508), CPR_Merge candidate index decoding may be performed (S5509). When the CPR_Merge mode does not operate, CPR_MVD information decoding may be performed (S5510), and a CPR_MVP candidate may be decoded (S5511). When it is determined, by using the CPR_MVP candidate and the CPR_MVD information, that the current block and the prediction block overlap, a prediction mode for the overlapping region may be additionally decoded. In addition, when intra prediction is performed as in the example of fig. 49, intra prediction mode information decoding (S5506) may be performed after the CPR_Merge candidate index decoding (S5509) and the CPR_MVP candidate decoding (S5511).
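The parsing order of fig. 55 can be summarized as in the sketch below. The `read_*` helpers on a bitstream object `bs` are hypothetical stand-ins for the entropy-decoding steps, named here only to mirror the flow; they do not correspond to any normative syntax functions.

```python
# Sketch of the fig. 55 parsing flow (hypothetical read_* helpers).
def parse_prediction_info(bs):
    mode = bs.read_prediction_mode()              # S5502: inter/intra/IBC
    if mode == "inter":
        return bs.read_inter_unit_mode_info()     # S5504, as in fig. 34
    if mode == "intra":
        return bs.read_intra_unit_mode_info()     # S5506, as in fig. 34
    # Intra block copy: merge-style index or MVP + MVD signaling.
    if bs.read_cpr_merge_flag():                  # S5507, tested in S5508
        return {"cpr_merge_idx": bs.read_cpr_merge_index()}      # S5509
    return {"cpr_mvd": bs.read_cpr_mvd(),                        # S5510
            "cpr_mvp_idx": bs.read_cpr_mvp_index()}              # S5511
```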
In this case, when no previously reconstructed picture is available to the current picture due to a higher header setting, the inter prediction unit mode information may be omitted in the prediction mode decoding (S5502).
The prediction mode information decoding may be performed by using fig. 34.
The intra block copy prediction unit mode information may be represented as inter prediction unit mode information. It can be represented by adding current picture information to the reference picture index information set for inter prediction. For example, when there are reference picture indexes from No. 0 to No. 4, No. 0 to No. 3 may represent previously reconstructed pictures and No. 4 may represent the current picture. In the merge candidate index decoding (S3403), when the past direction is used and the past direction reference picture index information indicates the current picture, the intra block copy prediction technique may be performed; in other cases, the inter prediction technique may be performed. In addition, in decoding the AMVP mode information, when the prediction direction information is decoded as the past direction (S3408) and the past direction reference picture index information is decoded as the current picture (S3410), the past direction MVD information (S3411) and the past direction MVP candidate (S3412) may be information for intra block copy prediction; in other cases, they may be information for the inter prediction technique. In this case, when no previously reconstructed picture is available to the current picture due to a higher header setting, the processes of prediction direction decoding (S3408), past direction reference picture index information decoding (S3410), future direction reference picture index information decoding (S3414), future direction MVD information decoding (S3415), and future direction MVP information decoding (S3416) may be omitted, and when inter prediction is decoded in the prediction mode decoding step, it may represent intra block copy prediction instead of inter prediction.
Fig. 56 is a flowchart illustrating an encoding method of quantized transform coefficients (hereinafter referred to as "transform coefficients"), which may be performed by an entropy encoding unit of an image encoding apparatus.
First, when the transform coefficients are scanned according to a reverse scan order, the first non-zero coefficient may be determined as the basic coefficient, and its position information (Last_sig) may be encoded (S5601).
A sub-block including the basic coefficient is selected (S5602), and transform coefficient information in the sub-block may be encoded. For a sub-block that does not include the basic coefficient, sub-block information may be encoded before the coefficients in the sub-block are encoded (S5603). Coded_sub_blk_flag (the sub-block information) is a flag indicating whether at least one non-zero transform coefficient exists in the current sub-block. The first encoded information amount and the second encoded information amount may be initialized to 0 before the coefficient information in the sub-block is encoded. The first encoded information amount is the number of encoded pieces of coefficient information greater than 0 (S5606), coefficient information greater than 1 (S5607), and parity information (S5608). The second encoded information amount is the number of encoded pieces of coefficient information greater than 3 (S5610). The first-step coefficient information encoding comprises steps S5606, S5607, and S5608, in which coefficient information greater than 0, coefficient information greater than 1, and parity information are encoded. The second-step coefficient information encoding is step S5610, in which coefficient information greater than 3 is encoded.
Subsequently, a transform coefficient to be currently encoded may be selected in the reverse scan order (S5604). PosL denotes the first position, in the reverse scan order, of a transform coefficient in the current sub-block that has not been encoded by the first-step coefficient information encoding process. After the transform coefficient to be encoded first in the sub-block is selected, coefficient information greater than 0, indicating whether the absolute value of the current transform coefficient is greater than 0, may be encoded (S5606). Subsequently, when the current transform coefficient is determined to be non-zero, coefficient information greater than 1, indicating whether the absolute value of the current transform coefficient is greater than 1, may be encoded (S5607). Subsequently, when it is determined from the coefficient information greater than 1 that the absolute value of the current transform coefficient is greater than 1, parity information indicating the parity of the current transform coefficient is encoded (S5608). For example, the parity information may indicate whether the absolute value of the current transform coefficient is even or odd.
In this case, when the coefficient information greater than 0, the coefficient information greater than 1, and the parity information are encoded, the first encoded information amount increases (S5606, S5607, S5608). For example, when at least one of coefficient information greater than 0, coefficient information greater than 1, and parity information is encoded, the first encoded information amount may be increased by 1. Alternatively, the first encoded information amount may be increased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, and parity information is encoded.
In other words, the first encoding information amount may represent the maximum amount of coefficient information allowed for one block. In this case, the block may represent a transform block or a sub-block of a transform block. In addition, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, and parity information. The first encoding information amount may be defined in units of video sequences, pictures, slices, coding tree blocks (CTUs), coding blocks (CUs), transform blocks (TUs), or subblocks of transform blocks. In other words, the same first encoding information amount may be determined/set for all transform blocks or sub-blocks belonging to the corresponding unit.
Subsequently, the transform coefficient to be encoded is changed to the subsequent coefficient by subtracting 1 from the PosL value. In this case, when the first encoded information amount exceeds the first threshold or the first-step coefficient information encoding in the current sub-block is completed, the process may move to the step of encoding coefficient information greater than 3. Otherwise, the subsequent coefficient information may be encoded. The first threshold is the maximum number of pieces of coefficient information greater than 0, coefficient information greater than 1, and parity information (S5606, S5607, S5608) that can be encoded in units of sub-blocks.
Coefficient information greater than 3 may be encoded, in the reverse scan order, only for transform coefficients whose parity information has been encoded (S5610). When coefficient information greater than 3 is encoded, the second encoded information amount may increase. When the second encoded information amount exceeds the second threshold or the second-step coefficient information encoding in the current sub-block is completed, the process may move to the subsequent step S5611. The second threshold is the maximum number of pieces of coefficient information greater than 3 that can be encoded in units of sub-blocks.
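Putting the two passes together, the sketch below shows one possible shape of the per-sub-block loop. The entropy-coder interface (`enc.code_*`) is an illustrative assumption, and the budget is counted once per coded flag, which matches the second counting alternative described above.

```python
# Sketch of the two-pass sub-block coefficient coding with budgets.
# enc.code_* are illustrative entropy-coding stand-ins.
def encode_subblock_flags(coeffs, first_threshold, second_threshold, enc):
    first_amount, second_amount = 0, 0
    parity_coded = []                    # coefficients whose parity was coded
    for pos in reversed(range(len(coeffs))):       # reverse scan order
        if first_amount > first_threshold:         # first budget exhausted,
            break                                  # PosL stops at this position
        level = abs(coeffs[pos])
        enc.code_gt0(level > 0); first_amount += 1           # S5606
        if level > 0:
            enc.code_gt1(level > 1); first_amount += 1       # S5607
            if level > 1:
                enc.code_parity(level & 1); first_amount += 1  # S5608
                parity_coded.append(pos)
    for pos in parity_coded:             # second pass, same reverse order
        if second_amount > second_threshold:       # second budget exhausted
            break
        enc.code_gt3(abs(coeffs[pos]) > 3); second_amount += 1  # S5610
```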
Alternatively, the first encoded information amount may represent the maximum number of pieces of coefficient information that can be encoded in a predetermined unit. The coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3. In this case, the step of encoding the coefficient information greater than 3 may be included in the first-step coefficient information encoding step.
Specifically, coefficient information greater than 0 indicating whether the absolute value of the current transform coefficient is greater than 0 may be encoded. When the current transform coefficient is determined to be non-zero, coefficient information greater than 1 indicating whether the absolute value of the current transform coefficient is greater than 1 may be encoded. Then, when it is determined from the coefficient information greater than 1 that the absolute value of the current transform coefficient is greater than 1, parity information may be encoded, and coefficient information greater than 3 may be encoded.
In this case, when coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 are encoded, the first encoded information amount increases. For example, when at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 is encoded, the first encoded information amount may be increased by 1. Alternatively, the first encoded information amount may be increased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 is encoded.
In other words, the first encoding information amount may represent the maximum amount of coefficient information allowed for one block. In this case, the block may represent a transform block or a sub-block of a transform block. In addition, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3. The first encoding information amount may be defined in units of video sequences, pictures, slices, coding tree blocks (CTUs), coding blocks (CUs), transform blocks (TUs), or subblocks of transform blocks. In other words, the same first encoding information amount may be determined/set for all transform blocks or sub-blocks belonging to the corresponding unit.
PosC denotes the position of the transform coefficient currently to be encoded. When PosL is smaller than PosC (S5612), it indicates that the current transform coefficient was encoded by the first-step coefficient information encoding. After coefficient information greater than N is encoded, the absolute value of a difference coefficient, obtained by subtracting from the current coefficient value the minimum absolute value of the current transform coefficient known from its parity information, may be encoded (S5613). In this case, N represents a number equal to or greater than 3, and the same value may be used in the encoding/decoding apparatus or may be transmitted in a higher header. When the value of N is 5, coefficient information greater than 4 may be encoded for a coefficient whose absolute value is determined to be 4 or greater. When it is determined from the coefficient information greater than 4 that the absolute value of the current coefficient is 5 or greater, coefficient information greater than 5 may be encoded. When the value of the current transform coefficient is completely encoded by encoding up to the coefficient information greater than N, the step of encoding the absolute value of the difference coefficient (S5613) may be omitted. When PosL is greater than PosC, the absolute value of the current transform coefficient itself may be encoded (S5614). Subsequently, sign information representing the sign of the current transform coefficient may be encoded (S5615). When all information of the current transform coefficient has been encoded, the subsequent transform coefficient in the sub-block may be selected as the current transform coefficient by subtracting 1 from the PosC value (S5617), and when the current transform coefficient is the last transform coefficient in the sub-block, the first and second thresholds may be updated (S5618).
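For the difference-coefficient step, the sketch below derives the minimum absolute value implied by the flags already coded for the coefficient (greater-than-1, parity, greater-than-3, with N = 3) and codes only the difference from it. The flag-to-minimum mapping is an assumption made for illustration; the text above only requires that the minimum be consistent with the coded parity.

```python
# Sketch of S5613 with N = 3: code the difference between the absolute
# level and the minimum value consistent with the already-coded flags.
def code_abs_difference(level, enc):
    # Reaching this step implies gt0 = gt1 = 1; the parity and gt3 flags
    # of this coefficient are already coded, so the decoder knows them.
    parity = level & 1
    gt3 = level > 3
    base = 2 + parity + (2 if gt3 else 0)  # minimum level: 2, 3, 4, or 5
    if not gt3:
        return        # level (2 or 3) is fully determined by the flags
    # Since the parity is already coded, (level - base) is even; an
    # implementation could equally transmit (level - base) // 2
    # (an assumption, not stated in the text above).
    enc.code_difference(level - base)      # S5613
```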
For the first and second thresholds, when the number of transform coefficients in the current sub-block for which the absolute value of the coefficient itself is encoded is equal to or greater than C (C being an integer equal to or greater than 0), the corresponding thresholds may be adjusted. For example, when the first threshold is 13, the first encoded information amount is 15, the second threshold is 2, and the second encoded information amount is 2, both the first and second encoded information amounts have reached their thresholds, and thus the first and second thresholds may be updated so that both are increased. As another example, when the first threshold is 13, the first encoded information amount is 15, the second threshold is 2, and the second encoded information amount is 1, the first encoded information amount exceeds the first threshold but the second encoded information amount does not reach the second threshold, and thus the thresholds may be updated so that the first threshold is increased and the second threshold is decreased. Alternatively, when neither the first nor the second encoded information amount reaches its threshold, the thresholds may be updated so that both are decreased. Optionally, the first and second thresholds may be kept unchanged.
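One way to realize this update rule is sketched below. The step size of 1 and the handling of the case where only the second budget is exhausted are illustrative assumptions; the text above fixes only the direction of each adjustment for the cases it enumerates.

```python
# Sketch of the threshold update in S5618; the step size and the
# "only the second budget exhausted" branch are assumptions.
def update_thresholds(t1, t2, amount1, amount2, step=1):
    first_full = amount1 >= t1
    second_full = amount2 >= t2
    if first_full and second_full:
        return t1 + step, t2 + step          # both reached: raise both
    if first_full:
        return t1 + step, max(t2 - step, 0)  # only the first reached
    if second_full:
        return max(t1 - step, 0), t2 + step  # assumed symmetric case
    return max(t1 - step, 0), max(t2 - step, 0)  # neither reached: lower both
```

The decoder would apply the identical rule in S5718 so that encoder and decoder keep the same thresholds.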
When the current sub-block is not the last sub-block (S5619), the process may move to the subsequent sub-block (S5620); when the current sub-block is the last sub-block (S5619), the transform block encoding may end.
Fig. 57 is a flowchart showing a decoding method of quantized transform coefficients, which may be performed by an entropy decoding unit of an image decoding apparatus.
First, by decoding Last_sig, the first non-zero coefficient when the transform coefficients are scanned according to the reverse scan order may be determined as the basic coefficient (S5701).
A sub-block including the basic coefficient may be selected (S5702), and transform coefficient information in the sub-block may be decoded. For a sub-block that does not include the basic coefficient, sub-block information may be decoded before the coefficients in the sub-block are decoded (S5703). Coded_sub_blk_flag is a flag indicating whether at least one non-zero coefficient exists in the current sub-block. The first and second decoded information amounts may be initialized to 0 before the coefficient information in the sub-block is decoded. The first decoded information amount is the number of decoded pieces of coefficient information greater than 0 (S5706), coefficient information greater than 1 (S5707), and parity information (S5708). The second decoded information amount is the number of decoded pieces of coefficient information greater than 3 (S5710).
Subsequently, the transform coefficient to be currently decoded may be selected in the reverse scan order (S5704). PosL indicates the first position, in the reverse scan order, of a transform coefficient in the current sub-block that has not been decoded by the first-step coefficient information decoding process. After the transform coefficient to be decoded first in the sub-block is selected, coefficient information greater than 0, indicating whether the absolute value of the current transform coefficient is greater than 0, may be decoded (S5706). When the current transform coefficient is determined to be non-zero, coefficient information greater than 1, indicating whether the absolute value of the current transform coefficient is greater than 1, may be decoded (S5707). Subsequently, when it is determined from the coefficient information greater than 1 that the absolute value of the current transform coefficient is greater than 1, the parity of the current transform coefficient may be obtained by decoding the parity information (S5708). In this case, when the coefficient information greater than 0, the coefficient information greater than 1, and the parity information are decoded, the first decoded information amount increases (S5706, S5707, S5708). For example, when at least one of coefficient information greater than 0, coefficient information greater than 1, and parity information is decoded, the first decoded information amount may be increased by 1. Alternatively, the first decoded information amount may be increased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, and parity information is decoded. In other words, the first decoded information amount may represent the maximum amount of coefficient information transmitted for one block. In this case, the block may represent a transform block or a sub-block of a transform block. In addition, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, and parity information. The first decoded information amount may be defined in units of video sequences, pictures, slices, coding tree blocks (CTUs), coding blocks (CUs), transform blocks (TUs), or sub-blocks of a transform block. In other words, the same first decoded information amount may be set for all transform blocks or sub-blocks belonging to the corresponding unit.
Subsequently, the coefficient to be decoded is changed to the subsequent transform coefficient by decreasing the PosL value by 1. In this case, when the first decoded information amount exceeds the first threshold or the first-step coefficient information decoding in the current sub-block is completed, the process may move to the step of decoding coefficient information greater than 3. Otherwise, the subsequent transform coefficient information may be decoded. The first threshold is the maximum number of pieces of coefficient information greater than 0, coefficient information greater than 1, and parity information (S5706, S5707, S5708) that can be decoded in units of sub-blocks. The first-step coefficient information decoding represents steps S5706, S5707, and S5708, in which coefficient information greater than 0, coefficient information greater than 1, and parity information are decoded.
Coefficient information greater than 3 may be decoded, in the reverse scan order, only for transform coefficients whose parity information has been decoded (S5710). When coefficient information greater than 3 is decoded, the second decoded information amount may increase. When the second decoded information amount exceeds the second threshold or the second-step coefficient information decoding in the current sub-block is completed, the process may move to the subsequent step S5711. The second threshold is the maximum number of pieces of coefficient information greater than 3 that can be decoded in units of sub-blocks. The second-step coefficient information decoding is step S5710, in which coefficient information greater than 3 is decoded.
Alternatively, the first decoding information amount may represent the maximum amount of coefficient information that can be transmitted in a predetermined unit. In this case, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3. In this case, the step of decoding the coefficient information greater than 3 may be included in the first step coefficient information decoding step.
Specifically, coefficient information greater than 0 indicating whether the absolute value of the current transform coefficient is greater than 0 may be decoded. When the current transform coefficient is determined to be non-zero, coefficient information greater than 1 indicating whether the absolute value of the current transform coefficient is greater than 1 may be decoded. Then, when it is determined that the absolute value of the current transform coefficient is greater than 1 from the coefficient information greater than 1, the parity information and the coefficient information greater than 3 may be decoded.
In this case, when coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 are decoded, the first decoded information amount increases. For example, when at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 is decoded, the first decoded information amount may be increased by 1. Alternatively, the first decoded information amount may be increased by 1 each time any one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3 is decoded.
In other words, the first decoding information amount may represent the maximum amount of coefficient information allowed for one block. In this case, the block may represent a transform block or a sub-block of a transform block. In addition, the coefficient information may include at least one of coefficient information greater than 0, coefficient information greater than 1, parity information, and coefficient information greater than 3. The first decoding information amount may be defined in units of video sequences, pictures, slices, coding tree blocks (CTUs), coding blocks (CUs), transform blocks (TUs), or subblocks of a transform block. In other words, the same first decoding information amount may be set for all transform blocks or sub-blocks belonging to the corresponding unit.
PosC denotes the position of the transform coefficient currently to be decoded. When PosL is smaller than PosC (S5712), it indicates that information on the current transform coefficient was decoded in the first-step coefficient information decoding. After coefficient information greater than N is decoded, the absolute value of a difference coefficient, obtained by subtracting from the current coefficient value the minimum absolute value of the current transform coefficient known from its parity information, may be decoded (S5713). When the value of the current coefficient is completely decoded by decoding up to the coefficient information greater than N, the step of decoding the absolute value of the difference coefficient (S5713) may be omitted. When PosL is greater than PosC, the absolute value of the current transform coefficient itself may be decoded (S5714). Subsequently, sign information representing the sign of the current transform coefficient may be decoded (S5715). When all information of the current transform coefficient has been decoded, the subsequent coefficient in the sub-block may be selected as the current coefficient by decreasing the PosC value by 1 (S5717), and when the current transform coefficient is the last coefficient in the sub-block, the first and second thresholds may be updated (S5718).
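Conversely, the decoder can rebuild the absolute level from the decoded flags and the difference coefficient. The sketch below assumes the same flag-to-minimum mapping as the encoding sketch above (N = 3); that mapping is an illustrative assumption.

```python
# Sketch: rebuilding one absolute level from the decoded flags
# (S5706-S5710) and the difference coefficient (S5713), with N = 3.
def reconstruct_abs_level(gt0, gt1=False, parity=0, gt3=False, difference=0):
    if not gt0:
        return 0
    if not gt1:
        return 1
    base = 2 + parity + (2 if gt3 else 0)  # minimum level matching the flags
    return base + (difference if gt3 else 0)

# Example: level 7 -> gt0=gt1=1, parity=1, gt3=1, difference=7-5=2.
assert reconstruct_abs_level(True, True, parity=1, gt3=True, difference=2) == 7
```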
For the first and second thresholds, when the number of transform coefficients in the current sub-block for which the absolute value of the coefficient itself is decoded is equal to or greater than C (C being an integer equal to or greater than 0), the corresponding thresholds may be adjusted. For example, when the first threshold is 13, the first decoded information amount is 15, the second threshold is 2, and the second decoded information amount is 2, both the first and second decoded information amounts have reached their thresholds, and thus the first and second thresholds may be updated so that both are increased. As another example, when the first threshold is 13, the first decoded information amount is 15, the second threshold is 2, and the second decoded information amount is 1, the first decoded information amount exceeds the first threshold but the second decoded information amount does not reach the second threshold, and thus the thresholds may be updated so that the first threshold is increased and the second threshold is decreased. Alternatively, when neither the first nor the second decoded information amount reaches its threshold, the thresholds may be updated so that both are decreased. Optionally, the first and second thresholds may be kept unchanged.
When the current sub-block is not the last sub-block (S5719), the process may move to the subsequent sub-block (S5720); when the current sub-block is the last sub-block (S5719), the transform block decoding may end.
The various embodiments of the present disclosure do not list all possible combinations but describe representative aspects of the present disclosure, and the matters described in the various embodiments may be applied independently or in a combination of two or more.
In addition, various embodiments of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof, and the like. For implementation by hardware, implementation may be performed by one or more ASICs (application specific integrated circuits), DSPs (digital signal processors), DSPDs (digital signal processing devices), PLDs (programmable logic devices), FPGAs (field programmable gate arrays), general-purpose processors, controllers, micro-controllers, microprocessors, or the like.
The scope of the present disclosure includes software or machine-executable instructions (e.g., operating systems, applications, firmware, programs, etc.) that perform actions in accordance with the methods of various embodiments in a device or computer, as well as non-transitory computer-readable media that cause such software or instructions, etc. to be stored and executed in a device or computer.
INDUSTRIAL APPLICABILITY
The present disclosure may be used to encode/decode an image.
Claims (15)
1. An image decoding method, comprising:
generating a candidate list of the current block; and
performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list,
wherein the plurality of candidates include at least one of spatial candidates, temporal candidates, and candidates based on reconstruction information, and
wherein the candidates based on the reconstruction information are added from a buffer storing motion information decoded before the current block.
2. The method of claim 1, wherein the motion information stored in the buffer is added to the candidate list in order from the motion information most recently stored in the buffer or in order from the motion information first stored in the buffer.
3. The method of claim 2, wherein the number or order of adding the motion information stored in the buffer to the candidate list is differently determined according to an inter prediction mode of the current block.
4. The method of claim 3, wherein the candidate list is filled by using the motion information stored in the buffer until a maximum number of candidates of the candidate list is reached, or the candidate list is filled by using the motion information stored in the buffer until a number obtained by subtracting 1 from the maximum number of candidates is reached.
5. The method according to claim 1, wherein the buffer is initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
6. An image encoding method comprising:
generating a candidate list of the current block; and
performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list,
wherein the plurality of candidates include at least one of spatial candidates, temporal candidates, and candidates based on reconstruction information, and
wherein the candidates based on the reconstruction information are added from a buffer storing motion information decoded before the current block.
7. The method of claim 6, wherein the motion information stored in the buffer is added to the candidate list in order from the motion information most recently stored in the buffer or in order from the motion information first stored in the buffer.
8. The method of claim 7, wherein the number or order of adding the motion information stored in the buffer to the candidate list is differently determined according to an inter prediction mode of the current block.
9. The method of claim 8, wherein the candidate list is filled by using the motion information stored in the buffer until a maximum number of candidates of the candidate list is reached, or the candidate list is filled by using the motion information stored in the buffer until a number obtained by subtracting 1 from the maximum number of candidates is reached.
10. The method according to claim 6, wherein the buffer is initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
11. A computer-readable recording medium storing a bitstream decoded by an image decoding method, the image decoding method comprising:
generating a candidate list of the current block; and
performing inter prediction of the current block by using any one of a plurality of candidates belonging to the candidate list,
wherein the plurality of candidates include at least one of spatial candidates, temporal candidates, and candidates based on reconstruction information, and
wherein the candidates based on the reconstruction information are added from a buffer storing motion information decoded before the current block.
12. The method of claim 11, wherein the motion information stored in the buffer is added to the candidate list in order from the motion information most recently stored in the buffer or in order from the motion information first stored in the buffer.
13. The method of claim 12, wherein the number or order of adding the motion information stored in the buffer to the candidate list is differently determined according to an inter prediction mode of the current block.
14. The method of claim 13, wherein the candidate list is filled by using the motion information stored in the buffer until a maximum number of candidates of the candidate list is reached, or the candidate list is filled by using the motion information stored in the buffer until a number obtained by subtracting 1 from the maximum number of candidates is reached.
15. The method of claim 11, wherein the buffer is initialized in units of any one of a Coding Tree Unit (CTU), a CTU row, a slice, or a picture.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
KR10-2019-0001414 | 2019-01-04 | |
KR20190001414 | 2019-01-04 | |
KR20190001416 | 2019-01-04 | |
KR10-2019-0001416 | 2019-01-04 | |
KR10-2019-0001730 | 2019-01-07 | |
KR20190001730 | 2019-01-07 | |
PCT/KR2020/000211 (WO2020141962A1) | 2019-01-04 | 2020-01-06 | Method and apparatus for image encoding/decoding
Publications (1)
Publication Number | Publication Date
---|---
CN113287302A (en) | 2021-08-20
Family
ID=77275694
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202080007897.4A (CN113287302A, pending) | Method and apparatus for image encoding/decoding | 2019-01-04 | 2020-01-06
Country Status (1)
Country | Link
---|---
CN (1) | CN113287302A (en)
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
WO2024008093A1 | 2022-07-05 | 2024-01-11 | Beijing Bytedance Network Technology Co., Ltd. | Method, apparatus, and medium for video processing
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 