WO2020005032A1

WO2020005032A1 - Inter-prediction method using motion information compression, and device for same

Info

Publication number: WO2020005032A1
Application number: PCT/KR2019/007925
Authority: WO
Inventors: 유선미; 장형문; 최장원; 허진
Original assignee: 엘지전자 주식회사
Priority date: 2018-06-28
Filing date: 2019-06-28
Publication date: 2020-01-02

Abstract

A video decoding method performed by a decoding device according to the present invention comprises: a step for deriving temporal motion information about a current block on the basis of a temporally adjacent block of the current block; a step for compiling a motion information candidate list for the current block, the list including the temporal motion information; a step for deriving motion information about the current block on the basis of the motion information candidate list; and a step for making a prediction about the current block on the basis of the motion information about the current block, wherein the temporally adjacent block is a collocated block (col block) which is positioned corresponding to the current block in a reference picture, and the temporal motion information is the same as representative motion information stored in storage units corresponding to the col block among pieces of representative motion information compressed and stored in prescribed storage units with respect to the reference picture.

Description

Inter prediction method using motion information compression and device therefor

The present invention relates to an image coding technology, and more particularly, to an inter prediction method and apparatus using motion information compression in an image coding system.

Recently, the demand for high resolution and high quality images such as high definition (HD) images and ultra high definition (UHD) images is increasing in various fields. As image data becomes higher in resolution and quality, the amount of information or the number of bits to be transmitted increases relative to existing image data. Therefore, when image data is transmitted over a medium such as a conventional wired / wireless broadband line or stored in a conventional storage medium, the transmission cost and the storage cost increase.

Accordingly, a high efficiency image compression technique is required to effectively transmit, store, and reproduce high resolution, high quality image information.

An object of the present invention is to provide a method and apparatus for improving image coding efficiency.

Another object of the present invention is to provide a method and apparatus for improving the efficiency of image coding based on inter prediction.

Another technical problem of the present invention is to provide an inter prediction method and apparatus using motion information compression.

Another technical problem of the present invention is to provide a method and apparatus for reducing the memory capacity by compressing and storing information for deriving temporal motion information.

Another technical problem of the present invention is to provide a method and apparatus for reducing computation complexity and improving compression performance in motion information compression.

According to an embodiment of the present invention, there is provided an image decoding method performed by a decoding apparatus. The method may include deriving temporal motion information for a current block based on a temporal neighboring block of the current block, constructing a motion information candidate list for the current block including the temporal motion information, deriving motion information of the current block based on the motion information candidate list, and performing prediction on the current block based on the motion information of the current block, wherein the temporal neighboring block is a collocated block (col block) positioned corresponding to the current block in a reference picture, and the temporal motion information is the same as the representative motion information stored in the storage unit corresponding to the col block, among the representative motion information compressed and stored in predetermined storage units for the reference picture.
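The lookup described above can be illustrated with a minimal sketch. It is not the claimed method itself: the 16-sample storage unit, the choice of the top-left entry as the representative, and the names `compress_motion_field` and `temporal_motion_info` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

MotionInfo = Tuple[int, int, int]  # (mv_x, mv_y, ref_idx), a hypothetical layout

STORAGE_UNIT = 16  # assumed storage-unit size in luma samples


def compress_motion_field(
    motion_field: Dict[Tuple[int, int], MotionInfo],
    unit: int = STORAGE_UNIT,
) -> Dict[Tuple[int, int], MotionInfo]:
    """Keep one representative motion info per unit x unit storage unit.

    motion_field maps the (x, y) position of each minimum block to its
    motion info. Assumption: the representative is the entry located at
    the top-left sample of the storage unit.
    """
    compressed = {}
    for (x, y), info in motion_field.items():
        if x % unit == 0 and y % unit == 0:
            compressed[(x // unit, y // unit)] = info
    return compressed


def temporal_motion_info(
    compressed: Dict[Tuple[int, int], MotionInfo],
    col_x: int,
    col_y: int,
    unit: int = STORAGE_UNIT,
) -> Optional[MotionInfo]:
    """Return the representative motion info of the storage unit that
    covers the col block position, or None if nothing is stored there."""
    return compressed.get((col_x // unit, col_y // unit))
```

With this layout, every position inside the same storage unit of the reference picture resolves to the same stored entry, which is why the memory needed for temporal motion information shrinks with the size of the storage unit.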

According to another embodiment of the present invention, an image encoding method performed by an encoding apparatus is provided. The method may include deriving temporal motion information for a current block based on a temporal neighboring block of the current block, constructing a motion information candidate list for the current block including the temporal motion information, deriving motion information of the current block based on the motion information candidate list, performing prediction on the current block based on the motion information of the current block, and encoding the motion information of the current block, wherein the temporal neighboring block is a collocated block (col block) positioned corresponding to the current block in a reference picture, and the temporal motion information is the same as the representative motion information stored in the storage unit corresponding to the col block, among the representative motion information compressed and stored in predetermined storage units for the reference picture.

According to the present invention, the overall image / video compression efficiency can be improved.

According to the present invention, the efficiency of image coding based on inter prediction can be improved.

According to the present invention, the efficiency of inter prediction and image coding using motion information compression can be improved.

According to the present invention, it is possible to reduce the computational complexity in motion information compression and to increase the compression performance.

According to the present invention, hardware cost can be reduced by reducing memory capacity as motion information compression is performed.

According to the present invention, it is possible to minimize the performance reduction due to motion information compression by adaptively using spatial location for motion information compression.

FIG. 1 is a diagram schematically illustrating a configuration of a video encoding apparatus to which the present invention may be applied.

FIG. 2 is a diagram schematically illustrating a configuration of a video decoding apparatus to which the present invention may be applied.

FIG. 3 exemplarily illustrates a neighboring block referred to in order to derive motion information of a current block.

FIG. 4 is a diagram schematically illustrating a method of compressing and storing motion information that may be applied when constructing a motion information candidate list according to the present invention.

FIG. 5 exemplarily shows blocks partitioned for prediction within a picture.

FIG. 6 is a diagram schematically illustrating a method of storing motion information in units of a minimum prediction block.

FIG. 7 is a diagram schematically illustrating a method of compressing and storing motion information in a predetermined storage unit according to the present invention.

FIG. 8 exemplarily illustrates a case in which one or more prediction blocks are included in a storage unit having a predetermined size.

FIG. 9 exemplarily shows spatial position candidates used to determine representative motion information stored on behalf of one storage unit according to the present invention.

FIG. 10 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit having a predetermined size based on a spatial position candidate according to the present invention.

FIG. 11 is a flowchart illustrating another embodiment of a method of determining representative motion information in a storage unit of a predetermined size based on a spatial position candidate according to the present invention.

FIG. 12 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit having a predetermined size with reference to an optimized spatial position candidate among spatial position candidates according to the present invention.

FIG. 13 is a flowchart illustrating another embodiment of a method of determining representative motion information in a storage unit having a predetermined size with reference to an optimized spatial position candidate among spatial position candidates according to the present invention.

FIG. 14 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit of a predetermined size with reference to two optimized spatial position candidates among spatial position candidates according to the present invention.

FIG. 15 exemplarily illustrates prediction blocks included in one storage unit to illustrate a method of determining representative motion information based on an area of prediction blocks in a storage unit having a predetermined size according to the present invention.

FIG. 16 is a diagram illustrating an embodiment of exception processing for representative motion information stored on behalf of one storage unit according to the present invention.

FIG. 17 is a flowchart schematically illustrating an image encoding method by an encoding apparatus according to the present invention.

FIG. 18 is a flowchart schematically illustrating an image decoding method by a decoding apparatus according to the present invention.

FIG. 19 exemplarily shows a structure diagram of a content streaming system to which the present invention is applied.

As the present invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the invention to the specific embodiments. The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to limit the spirit of the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "comprise" or "have" in this specification are intended to indicate the presence of a feature, number, step, operation, component, part, or combination thereof described in the specification, and are to be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Meanwhile, each configuration in the drawings described in the present invention is shown independently for convenience of describing its distinct characteristic function; this does not mean that each configuration is implemented by separate hardware or separate software. For example, two or more of the configurations may be combined to form one configuration, or one configuration may be divided into a plurality of configurations. Embodiments in which the configurations are integrated and / or separated are also included in the scope of the present invention without departing from the spirit of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and redundant description of the same components is omitted.

Meanwhile, the present invention relates to video / image coding. For example, the methods / embodiments disclosed in the present invention may be applied to methods disclosed in the versatile video coding (VVC) standard, the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the second generation of audio video coding standard (AVS2), or a next-generation video / image coding standard (e.g., H.267, H.268, etc.).

In the present specification, a video may mean a series of images over time. A picture generally refers to a unit representing one image at a specific time instant, and a slice is a unit constituting a part of a picture in coding. One picture may be composed of a plurality of slices, and if necessary, the terms picture and slice may be used interchangeably.

A pixel or a pel may refer to a minimum unit constituting one picture (or image). Also, 'sample' may be used as a term corresponding to a pixel. A sample may generally represent a pixel or a value of a pixel, and may only represent pixel / pixel values of the luma component, or only pixel / pixel values of the chroma component.

A unit represents the basic unit of image processing. The unit may include at least one of a specific region of the picture and information related to the region. The unit may be used interchangeably with terms such as block or area in some cases. In a general case, an M × N block may represent a set of samples or transform coefficients composed of M columns and N rows.

Referring to FIG. 1, the video encoding apparatus 100 may include a picture divider 105, a prediction unit 110, a residual processing unit 120, an entropy encoding unit 130, an adder 140, a filter unit 150, and a memory 160. The residual processing unit 120 may include a subtraction unit 121, a transform unit 122, a quantization unit 123, a reordering unit 124, an inverse quantization unit 125, and an inverse transform unit 126.

The picture divider 105 may divide the input picture into at least one processing unit.

As an example, the processing unit may be called a coding unit (CU). In this case, the coding unit may be recursively split from the largest coding unit (LCU) according to a quad-tree binary-tree (QTBT) structure. For example, one coding unit may be split into a plurality of coding units of a deeper depth based on a quad tree structure and / or a binary tree structure. In this case, for example, the quad tree structure may be applied first and the binary tree structure may be applied later. Alternatively, the binary tree structure may be applied first. The coding procedure according to the present invention may be performed based on the final coding unit that is no longer split. In this case, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to the image characteristics, or, if necessary, the coding unit may be recursively split into coding units of deeper depth so that a coding unit of optimal size is used as the final coding unit. Here, the coding procedure may include procedures such as prediction, transform, and reconstruction, which will be described later.
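The recursive splitting can be sketched as follows, assuming the ordering in which the quad tree is applied first and the binary tree later. The `decide` callback, the mode names (`"quad"`, `"bin_hor"`, `"bin_ver"`), and the minimum size of 4 are hypothetical; a real encoder would typically drive the decision by rate-distortion cost.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def split_qtbt(
    block: Block,
    decide: Callable[[Block, bool], str],
    min_size: int = 4,
    allow_quad: bool = True,
) -> List[Block]:
    """Recursively split a block and return the final (leaf) coding units.

    Once a binary split has been taken, quad splits are no longer allowed
    below it, which models the quad-tree-first ordering.
    """
    x, y, w, h = block
    mode = decide(block, allow_quad)
    if mode == "quad" and allow_quad and w == h and w // 2 >= min_size:
        half = w // 2
        children = [
            (x, y, half, half),
            (x + half, y, half, half),
            (x, y + half, half, half),
            (x + half, y + half, half, half),
        ]
        child_quad = True
    elif mode == "bin_hor" and h // 2 >= min_size:
        children = [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
        child_quad = False
    elif mode == "bin_ver" and w // 2 >= min_size:
        children = [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
        child_quad = False
    else:
        return [block]  # no further split: this is a final coding unit
    leaves: List[Block] = []
    for child in children:
        leaves.extend(split_qtbt(child, decide, min_size, child_quad))
    return leaves
```

The leaves always tile the original block exactly, so the sum of their areas equals the area of the largest coding unit.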

As another example, the processing unit may include a coding unit (CU), a prediction unit (PU), or a transform unit (TU). The coding unit may be split from the largest coding unit (LCU) into coding units of deeper depths along the quad tree structure. In this case, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to the image characteristics, or, if necessary, the coding unit may be recursively split into coding units of deeper depth so that a coding unit of optimal size is used as the final coding unit. If a smallest coding unit (SCU) is set, the coding unit may not be split into coding units smaller than the smallest coding unit. Here, the final coding unit refers to a coding unit that serves as the basis for partitioning or splitting into prediction units or transform units. The prediction unit is a unit partitioned from the coding unit and may be a unit of sample prediction. In this case, the prediction unit may be divided into sub blocks. The transform unit may be divided along the quad tree structure from the coding unit, and may be a unit for deriving a transform coefficient and / or a unit for deriving a residual signal from the transform coefficient. Hereinafter, a coding unit may be called a coding block (CB), a prediction unit may be called a prediction block (PB), and a transform unit may be called a transform block (TB). A prediction block or prediction unit may mean a specific area in the form of a block within a picture, and may include an array of prediction samples. In addition, a transform block or a transform unit may mean a specific area in a block form within a picture, and may include an array of transform coefficients or residual samples.

The prediction unit 110 may perform a prediction on a block to be processed (hereinafter, referred to as a current block) and generate a predicted block including prediction samples of the current block. The unit of prediction performed by the prediction unit 110 may be a coding block, a transform block, or a prediction block.

The prediction unit 110 may determine whether intra prediction or inter prediction is applied to the current block. As an example, the prediction unit 110 may determine whether intra prediction or inter prediction is applied on a CU basis.

In the case of intra prediction, the prediction unit 110 may derive a prediction sample for the current block based on reference samples outside the current block in the picture to which the current block belongs (hereinafter, referred to as the current picture). In this case, the prediction unit 110 may (i) derive the prediction sample based on the average or interpolation of neighboring reference samples of the current block, or (ii) derive the prediction sample based on a reference sample that lies in a specific (prediction) direction with respect to the prediction sample among the neighboring reference samples of the current block. Case (i) may be called a non-directional mode or non-angular mode, and case (ii) may be called a directional mode or angular mode. In intra prediction, the prediction mode may have, for example, 33 directional prediction modes and at least two non-directional modes. The non-directional modes may include a DC prediction mode and a planar mode. The prediction unit 110 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
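As a minimal sketch of the two cases, the following shows a DC (average-based, non-directional) predictor and the two simplest directional predictors, vertical and horizontal. The function names and the rounding rule in `predict_dc` are assumptions, and reference-sample filtering and the remaining angular modes are omitted.

```python
from typing import List, Sequence


def predict_dc(top: Sequence[int], left: Sequence[int], size: int) -> List[List[int]]:
    """Non-directional case: fill the block with the rounded average of the
    neighboring reference samples above and to the left."""
    total = sum(top[:size]) + sum(left[:size])
    dc = (total + size) // (2 * size)  # rounded integer average
    return [[dc] * size for _ in range(size)]


def predict_vertical(top: Sequence[int], size: int) -> List[List[int]]:
    """Directional case: copy each top reference sample down its column."""
    return [list(top[:size]) for _ in range(size)]


def predict_horizontal(left: Sequence[int], size: int) -> List[List[int]]:
    """Directional case: copy each left reference sample along its row."""
    return [[left[row]] * size for row in range(size)]
```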

In the case of inter prediction, the prediction unit 110 may derive the prediction sample for the current block based on the sample specified by the motion vector on the reference picture. The prediction unit 110 may apply one of a skip mode, a merge mode, and a motion vector prediction (MVP) mode to derive a prediction sample for the current block. In the skip mode and the merge mode, the prediction unit 110 may use the motion information of the neighboring block as the motion information of the current block. In the skip mode, unlike the merge mode, the difference (residual) between the prediction sample and the original sample is not transmitted. In the MVP mode, the motion vector of the current block may be derived using the motion vector of the neighboring block as a motion vector predictor.

In the case of inter prediction, the neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block present in the reference picture. A reference picture including the temporal neighboring block may be called a collocated picture (colPic). The motion information may include a motion vector and a reference picture index. Information such as prediction mode information and motion information may be (entropy) encoded and output in the form of a bitstream.

When the motion information of the temporal neighboring block is used in the skip mode and the merge mode, the highest picture on the reference picture list may be used as the reference picture. Reference pictures included in a reference picture list may be sorted based on a difference in a picture order count (POC) between a current picture and a corresponding reference picture. The POC corresponds to the display order of the pictures and may be distinguished from the coding order.
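The POC-based ordering can be sketched as below. Sorting in ascending order of absolute POC difference, with ties kept in their input order, is an assumption; the text only states that the difference in POC is the sorting criterion. The function name is hypothetical.

```python
from typing import List, Sequence


def sort_reference_pictures(current_poc: int, ref_pocs: Sequence[int]) -> List[int]:
    """Order reference pictures by their POC distance to the current picture.

    Python's sort is stable, so pictures at equal distance keep their
    original relative order.
    """
    return sorted(ref_pocs, key=lambda poc: abs(current_poc - poc))
```

Under this ordering, the first entry of the returned list is the picture closest in display order to the current picture.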

The subtraction unit 121 generates a residual sample which is a difference between the original sample and the prediction sample. When the skip mode is applied, residual samples may not be generated as described above.

The transform unit 122 generates transform coefficients by transforming the residual sample in units of transform blocks. The transform unit 122 may perform the transform according to the size of the transform block and the prediction mode applied to the coding block or the prediction block that spatially overlaps the transform block. For example, if intra prediction is applied to the coding block or the prediction block that overlaps the transform block, and the transform block is a 4×4 residual array, the residual sample is transformed using a discrete sine transform (DST) kernel. In other cases, the residual sample may be transformed using a discrete cosine transform (DCT) kernel.
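The kernel selection rule above can be sketched as follows. The text names only "DST" and "DCT"; the use of the DST-VII and DCT-II basis functions in floating point is an assumption for illustration (practical codecs use scaled integer approximations), and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def select_transform_kernel(is_intra: bool, width: int, height: int) -> str:
    """DST for intra-predicted 4x4 residual blocks, DCT otherwise."""
    return "DST" if is_intra and width == 4 and height == 4 else "DCT"


def kernel_matrix(kind: str, n: int) -> List[List[float]]:
    """Return an orthonormal n x n basis; row i is the i-th basis function.

    Assumption: "DST" means DST-VII and "DCT" means DCT-II.
    """
    if kind == "DST":
        scale = math.sqrt(4.0 / (2 * n + 1))
        return [
            [scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1)) for j in range(n)]
            for i in range(n)
        ]
    rows = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)])
    return rows


def transform_1d(samples: Sequence[float], matrix: List[List[float]]) -> List[float]:
    """Apply the kernel to one row or column of residual samples."""
    return [sum(row[j] * samples[j] for j in range(len(samples))) for row in matrix]
```

A two-dimensional block transform would apply `transform_1d` to every row and then to every column of the residual array.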

The quantization unit 123 may quantize the transform coefficients to generate quantized transform coefficients.

The reordering unit 124 rearranges the quantized transform coefficients. The reordering unit 124 may reorder the quantized transform coefficients in the form of a block into a one-dimensional vector form through a coefficient scanning method. Although the reordering unit 124 has been described in a separate configuration, the reordering unit 124 may be part of the quantization unit 123.
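A minimal sketch of coefficient scanning follows. The diagonal order used here is an assumption, since the text only refers to "a coefficient scanning method"; `inverse_scan` shows the matching rearrangement back into a two-dimensional block, as a decoder would perform it. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def diagonal_scan_order(size: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions of a size x size block, visiting one
    anti-diagonal after another starting from the top-left corner."""
    order = []
    for diag in range(2 * size - 1):
        for row in range(size):
            col = diag - row
            if 0 <= col < size:
                order.append((row, col))
    return order


def scan(block: Sequence[Sequence[int]]) -> List[int]:
    """Flatten a 2D coefficient block into a 1D vector."""
    return [block[r][c] for r, c in diagonal_scan_order(len(block))]


def inverse_scan(vector: Sequence[int], size: int) -> List[List[int]]:
    """Rebuild the 2D block from the 1D vector (decoder side)."""
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(vector, diagonal_scan_order(size)):
        block[r][c] = value
    return block
```

Because quantized coefficients tend to cluster near the top-left corner, such an order groups the nonzero values at the front of the vector.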

The entropy encoding unit 130 may perform entropy encoding on the quantized transform coefficients. Entropy encoding may include, for example, encoding methods such as exponential Golomb, context-adaptive variable length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), and the like. The entropy encoding unit 130 may encode information necessary for video reconstruction other than the quantized transform coefficients (for example, a value of a syntax element) together or separately according to entropy encoding or a predetermined method. The encoded information may be transmitted or stored in units of network abstraction layer (NAL) units in the form of bitstreams. The bitstream may be transmitted over a network or may be stored in a digital storage medium. The network may include a broadcasting network and / or a communication network, and the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, and the like.
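Of the entropy coding methods listed, exponential Golomb coding is simple enough to show in full. The sketch below implements the order-0 unsigned variant on bit strings; the function names are hypothetical, and using text strings of "0" and "1" instead of a packed bitstream is a simplification.

```python
from typing import Tuple


def exp_golomb_encode(value: int) -> str:
    """Encode a non-negative integer as an order-0 Exp-Golomb codeword:
    the binary form of value + 1, preceded by (length - 1) zero bits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def exp_golomb_decode(bits: str, pos: int = 0) -> Tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end
```

Small values get short codewords (0 maps to a single bit), which suits syntax elements whose small values are the most frequent.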

The inverse quantization unit 125 inverse quantizes the values quantized by the quantization unit 123 (the quantized transform coefficients), and the inverse transform unit 126 inverse transforms the values inverse quantized by the inverse quantization unit 125 to generate a residual sample.
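Quantization and inverse quantization can be sketched with a uniform scalar quantizer. The single step size and the round-half-away-from-zero rule are assumptions; the text does not specify how the quantization parameter maps to a step size, and the function names are hypothetical.

```python
from typing import List, Sequence


def quantize(coeffs: Sequence[float], step: float) -> List[int]:
    """Map each transform coefficient to an integer level by dividing by
    the step size and rounding half away from zero."""
    return [int(c / step + (0.5 if c >= 0 else -0.5)) for c in coeffs]


def dequantize(levels: Sequence[int], step: float) -> List[float]:
    """Scale integer levels back to approximate coefficient values."""
    return [level * step for level in levels]
```

The round trip is lossy: a coefficient of -37 with a step of 10 comes back as -40, and this difference is the quantization distortion.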

The adder 140 reconstructs the picture by combining the residual sample and the prediction sample. The residual sample and the prediction sample may be added in units of blocks to generate a reconstructed block. Although the adder 140 has been described as a separate configuration, the adder 140 may be part of the prediction unit 110. Meanwhile, the adder 140 may be called a reconstruction unit or a reconstructed block generation unit.
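The block-wise addition can be sketched as below. Clipping the sum to the valid sample range for an assumed bit depth of 8 is an addition not stated in the text, and the function name is hypothetical.

```python
from typing import List, Sequence


def reconstruct_block(
    pred: Sequence[Sequence[int]],
    resid: Sequence[Sequence[int]],
    bit_depth: int = 8,
) -> List[List[int]]:
    """Add residual samples to prediction samples, sample by sample, and
    clip each result to [0, 2**bit_depth - 1] (clipping is an assumption)."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```

When no residual is available, as in skip mode, passing an all-zero residual returns the prediction samples unchanged.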

The filter unit 150 may apply a deblocking filter and / or a sample adaptive offset to the reconstructed picture. Through deblocking filtering and / or sample adaptive offset, the artifacts of the block boundaries in the reconstructed picture or the distortion in the quantization process can be corrected. The sample adaptive offset may be applied on a sample basis and may be applied after the process of deblocking filtering is completed. The filter unit 150 may apply an adaptive loop filter (ALF) to the reconstructed picture. ALF may be applied to the reconstructed picture after the deblocking filter and / or sample adaptive offset is applied.

The memory 160 may store reconstructed pictures (decoded pictures) or information necessary for encoding / decoding. Here, the reconstructed picture may be a reconstructed picture after the filtering process is completed by the filter unit 150. The stored reconstructed picture may be used as a reference picture for (inter) prediction of another picture. For example, the memory 160 may store (reference) pictures used for inter prediction. In this case, pictures used for inter prediction may be designated by a reference picture set or a reference picture list.

Referring to FIG. 2, the video decoding apparatus 200 may include an entropy decoding unit 210, a residual processing unit 220, a prediction unit 230, an adder 240, a filter unit 250, and a memory 260. Here, the residual processing unit 220 may include a reordering unit 221, an inverse quantization unit 222, and an inverse transform unit 223.

When a bitstream including video information is input, the video decoding apparatus 200 may reconstruct the video in correspondence with the process by which the video information was processed in the video encoding apparatus.

For example, the video decoding apparatus 200 may perform video decoding using a processing unit applied in the video encoding apparatus. Thus, the processing unit block of video decoding may be, for example, a coding unit, and in another example, a coding unit, a prediction unit, or a transform unit. The coding unit may be split along the quad tree structure and / or binary tree structure from the largest coding unit.

The prediction unit and the transform unit may be further used in some cases, in which case the prediction block is a block derived or partitioned from the coding unit and may be a unit of sample prediction. At this point, the prediction unit may be divided into subblocks. The transform unit may be divided along the quad tree structure from the coding unit, and may be a unit for deriving a transform coefficient or a unit for deriving a residual signal from the transform coefficient.

The entropy decoding unit 210 may parse the bitstream and output information necessary for video reconstruction or picture reconstruction. For example, the entropy decoding unit 210 may decode information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements necessary for video reconstruction and quantized values of transform coefficients for the residual.

More specifically, the CABAC entropy decoding method may receive a bin corresponding to each syntax element in the bitstream, determine a context model using the information of the syntax element to be decoded, the decoding information of neighboring blocks and the block to be decoded, or the information of symbols / bins decoded in a previous step, predict the probability of occurrence of the bin according to the determined context model, and perform arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element. In this case, after determining the context model, the CABAC entropy decoding method may update the context model by using the information of the decoded symbol / bin for the context model of the next symbol / bin.
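The context model update can be illustrated with a deliberately simplified sketch. It is not the CABAC state machine and contains no arithmetic decoder: the `ContextModel` class, the floating-point probability, and the adaptation rate of 1/16 are all assumptions chosen only to show how each decoded bin shifts the probability estimate used for the next bin.

```python
class ContextModel:
    """Hypothetical adaptive estimate of the probability that a bin is 1."""

    def __init__(self, p_one: float = 0.5, rate: float = 1.0 / 16.0) -> None:
        self.p_one = p_one  # current probability estimate for bin value 1
        self.rate = rate    # assumed adaptation speed

    def update(self, bin_value: int) -> None:
        """Move the estimate a fraction of the way toward the decoded bin."""
        target = 1.0 if bin_value else 0.0
        self.p_one += self.rate * (target - self.p_one)
```

After a run of identical bins the estimate drifts toward that value, so an arithmetic coder driven by this model would spend fewer bits on the bins it sees most often.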

The information related to the prediction among the information decoded by the entropy decoding unit 210 is provided to the prediction unit 230, and the residual value on which the entropy decoding has been performed by the entropy decoding unit 210, that is, the quantized transform coefficient, may be input to the reordering unit 221.

The reordering unit 221 may rearrange the quantized transform coefficients in a two-dimensional block form. The reordering unit 221 may perform reordering in correspondence with the coefficient scanning performed by the encoding apparatus. Although the reordering unit 221 has been described as a separate configuration, the reordering unit 221 may be part of the inverse quantization unit 222.

The inverse quantization unit 222 may dequantize the quantized transform coefficients based on the (inverse) quantization parameter and output the transform coefficients. In this case, information for deriving a quantization parameter may be signaled from the encoding apparatus.

The inverse transform unit 223 may inversely transform transform coefficients to derive residual samples.

The prediction unit 230 may perform prediction on the current block and generate a predicted block including prediction samples for the current block. The unit of prediction performed by the prediction unit 230 may be a coding block, a transform block, or a prediction block.

The prediction unit 230 may determine whether to apply intra prediction or inter prediction based on the information about the prediction. In this case, a unit for determining which of intra prediction and inter prediction is to be applied and a unit for generating a prediction sample may be different. In addition, the unit for generating a prediction sample in inter prediction and intra prediction may also be different. For example, whether to apply inter prediction or intra prediction may be determined in units of CUs. In addition, for example, in inter prediction, a prediction mode may be determined and a prediction sample may be generated in PU units, and in intra prediction, a prediction mode may be determined in PU units and a prediction sample may be generated in TU units.

In the case of intra prediction, the prediction unit 230 may derive the prediction sample for the current block based on the neighbor reference samples in the current picture. The prediction unit 230 may derive the prediction sample for the current block by applying the directional mode or the non-directional mode based on the neighbor reference samples of the current block. In this case, the prediction mode to be applied to the current block may be determined using the intra prediction mode of the neighboring block.

In the case of inter prediction, the prediction unit 230 may derive the prediction sample for the current block based on the sample specified on the reference picture by the motion vector. The prediction unit 230 may apply any one of a skip mode, a merge mode, and an MVP mode to derive a prediction sample for the current block. In this case, motion information required for inter prediction of the current block provided by the video encoding apparatus, for example, information about a motion vector, a reference picture index, and the like may be obtained or derived based on the prediction information.

In the skip mode and the merge mode, the motion information of the neighboring block may be used as the motion information of the current block. In this case, the neighboring block may include a spatial neighboring block and a temporal neighboring block.

The prediction unit 230 may construct a merge candidate list using motion information of available neighboring blocks, and may use information indicated by the merge index on the merge candidate list as a motion vector of the current block. The merge index may be signaled from the encoding device. The motion information may include a motion vector and a reference picture. When the motion information of the temporal neighboring block is used in the skip mode and the merge mode, the highest picture on the reference picture list may be used as the reference picture.

In the skip mode, unlike the merge mode, the difference (residual) between the prediction sample and the original sample is not transmitted.

In the MVP mode, the motion vector of the current block may be derived using the motion vector of the neighboring block as a motion vector predictor. In this case, the neighboring block may include a spatial neighboring block and a temporal neighboring block.

For example, when the merge mode is applied, a merge candidate list may be generated by using a motion vector of a reconstructed spatial neighboring block and / or a motion vector corresponding to a Col block, which is a temporal neighboring block. In the merge mode, the motion vector of the candidate block selected from the merge candidate list is used as the motion vector of the current block. The information about the prediction may include a merge index indicating a candidate block having an optimal motion vector selected from candidate blocks included in the merge candidate list. In this case, the prediction unit 230 may derive the motion vector of the current block by using the merge index.
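A minimal sketch of merge-mode derivation follows. The candidate order (spatial candidates first, then the temporal candidate), the duplicate pruning, the limit of five candidates, and the function names are assumptions for illustration; `None` stands for an unavailable neighboring block.

```python
from typing import List, Optional, Sequence, Tuple

MotionInfo = Tuple[int, int, int]  # (mv_x, mv_y, ref_idx), a hypothetical layout


def build_merge_candidates(
    spatial: Sequence[Optional[MotionInfo]],
    temporal: Optional[MotionInfo],
    max_candidates: int = 5,
) -> List[MotionInfo]:
    """Collect available, distinct motion information from the spatial
    neighboring blocks and the Col block into a merge candidate list."""
    candidates: List[MotionInfo] = []
    for cand in list(spatial) + [temporal]:
        if cand is None:        # neighboring block not available
            continue
        if cand in candidates:  # assumed pruning of duplicates
            continue
        candidates.append(cand)
        if len(candidates) == max_candidates:
            break
    return candidates


def merge_motion(candidates: Sequence[MotionInfo], merge_index: int) -> MotionInfo:
    """Use the candidate indicated by the signaled merge index as the
    motion information of the current block."""
    return candidates[merge_index]
```

Because encoder and decoder build the same list from already reconstructed data, only the merge index needs to be signaled.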

As another example, when the Motion Vector Prediction (MVP) mode is applied, a motion vector predictor candidate list may be generated using a motion vector of a reconstructed spatial neighboring block and / or a motion vector corresponding to a Col block, which is a temporal neighboring block. That is, the motion vector of the reconstructed spatial neighboring block and / or the motion vector corresponding to the Col block, which is a temporal neighboring block, may be used as a motion vector candidate. The prediction information may include a prediction motion vector index indicating an optimal motion vector selected from the motion vector candidates included in the list. In this case, the prediction unit 230 may select the predicted motion vector of the current block from the motion vector candidates included in the motion vector candidate list using the motion vector index. The prediction unit of the encoding apparatus may obtain a motion vector difference (MVD) between the motion vector of the current block and the motion vector predictor, and may encode and output the MVD in bitstream form. That is, the MVD may be obtained by subtracting the motion vector predictor from the motion vector of the current block. In this case, the prediction unit 230 may obtain the motion vector difference included in the information about the prediction, and derive the motion vector of the current block by adding the motion vector difference and the motion vector predictor. The prediction unit may also obtain or derive a reference picture index or the like indicating a reference picture from the information about the prediction.
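The relationship between the motion vector, the motion vector predictor, and the MVD can be written as a short round trip. The function names and the sample values below are assumptions for illustration only.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def compute_mvd(mv: MV, mvp: MV) -> MV:
    """Encoder side: MVD = motion vector - motion vector predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvp_candidates: List[MV], mvp_idx: int, mvd: MV) -> MV:
    """Decoder side: motion vector = selected predictor + MVD."""
    mvp = mvp_candidates[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Illustrative values: one spatial candidate and one temporal (Col block) candidate.
mvp_list = [(6, -4), (5, -1)]
true_mv = (7, -2)
mvd = compute_mvd(true_mv, mvp_list[1])   # difference signaled to the decoder
decoded_mv = reconstruct_mv(mvp_list, 1, mvd)
```

Adding the difference back to the same predictor recovers the original motion vector, which is why only the index and the MVD need to be signaled.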

The adder 240 may reconstruct the current block or the current picture by adding the residual sample and the prediction sample. The adder 240 may reconstruct the current picture by adding the residual sample and the prediction sample in block units. Since the residual is not transmitted when the skip mode is applied, the prediction sample may be used as the reconstruction sample. Although the adder 240 has been described as a separate component, the adder 240 may be part of the prediction unit 230. Meanwhile, the adder 240 may be called a reconstruction unit or a reconstructed block generation unit.
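As a rough sketch of the adder's role, the following adds residual samples to prediction samples block by block. Clipping to the valid sample range and the `bit_depth` parameter are assumptions of this example; the text above does not describe them.

```python
from typing import List, Optional

Block = List[List[int]]


def reconstruct_block(pred: Block, residual: Optional[Block] = None, bit_depth: int = 8) -> Block:
    """Reconstruct a block as prediction + residual.

    When no residual is available (skip mode), the prediction samples are
    used directly as the reconstruction samples.
    """
    if residual is None:
        return [row[:] for row in pred]
    max_val = (1 << bit_depth) - 1  # assumed clipping range [0, 2^bit_depth - 1]
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]


recon = reconstruct_block([[100, 250]], [[-5, 10]])
recon_skip = reconstruct_block([[100, 250]])
```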

The filter unit 250 may apply deblocking filtering, sample adaptive offset, and / or ALF to the reconstructed picture. In this case, the sample adaptive offset may be applied in units of samples and may be applied after deblocking filtering. ALF may be applied after deblocking filtering and / or sample adaptive offset.

The memory 260 may store reconstructed pictures (decoded pictures) or information necessary for decoding. Here, the reconstructed picture may be a reconstructed picture after the filtering process is completed by the filter unit 250. For example, the memory 260 may store pictures used for inter prediction. In this case, pictures used for inter prediction may be designated by a reference picture set or a reference picture list. The reconstructed picture can be used as a reference picture for another picture. In addition, the memory 260 may output the reconstructed picture in an output order.

Meanwhile, as described above, the prediction unit 110 of the video encoding apparatus 100 and / or the prediction unit 230 of the video decoding apparatus 200 may derive the prediction sample by performing inter prediction on the current block. When inter prediction is applied to the current block, in order to derive motion information of the current block, the motion information candidate list may be configured based not only on motion information of neighboring blocks spatially adjacent to the current block (i.e., spatial neighboring blocks), but also on motion information of a neighboring block in a previously decoded reference picture (i.e., a temporal neighboring block). Here, the motion information may include a motion vector and a reference picture index, and the motion information candidate list may indicate a merge candidate list or a motion vector predictor (MVP) candidate list.

In particular, when deriving motion information of a current block by referring to a reference picture that is already decoded during inter prediction, all motion information of predicted blocks in the reference picture should be stored. However, a significant amount of data is required to store all the motion information in such a reference picture. Therefore, the memory for storing the reference picture (i.e., a decoded picture buffer (DPB)) also needs a considerable capacity, which may increase hardware costs.

Accordingly, the present invention proposes a method of saving memory and improving performance in storing motion information of predicted blocks in a decoded reference picture.

(a), (b), and (c) of FIG. 3 may represent spatial neighboring blocks, and (d) of FIG. 3 may represent temporal neighboring blocks.

Referring to FIG. 3A, when the size of the current block is 2N × 2N, the spatial neighboring blocks may include the left neighboring block A, the upper neighboring block B, the upper right corner neighboring block C, the lower left corner neighboring block D, and / or the upper left corner neighboring block E.

For example, if the size of the current block is 2Nx2N, and the x component of the top-left sample position of the current block is 0 and the y component is 0, the left neighboring block A may be a block including a sample at coordinates (-1, 2N-1), the upper neighboring block B may be a block including a sample at coordinates (2N-1, -1), the upper right corner neighboring block C may be a block including a sample at coordinates (2N, -1), the lower left corner neighboring block D may be a block including a sample at coordinates (-1, 2N), and the upper left corner neighboring block E may be a block including a sample at coordinates (-1, -1).
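The sample coordinates listed above for a 2Nx2N block can be tabulated with a small helper. The function name and the choice of N = 8 in the usage line are assumptions made for illustration; coordinates are relative to the top-left sample of the current block at (0, 0).

```python
from typing import Dict, Tuple


def spatial_neighbor_samples_2Nx2N(n: int) -> Dict[str, Tuple[int, int]]:
    """Return the (x, y) sample that identifies each spatial neighboring block
    of a 2Nx2N current block whose top-left sample is at (0, 0)."""
    size = 2 * n
    return {
        "A": (-1, size - 1),  # left neighboring block
        "B": (size - 1, -1),  # upper neighboring block
        "C": (size, -1),      # upper right corner neighboring block
        "D": (-1, size),      # lower left corner neighboring block
        "E": (-1, -1),        # upper left corner neighboring block
    }


neighbors_16x16 = spatial_neighbor_samples_2Nx2N(8)  # 2N = 16
```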

Referring to FIG. 3B, when the size of the current block is 2N × N, the spatial neighboring blocks may include the upper neighboring block A, the upper right corner neighboring block B, the lower left corner neighboring block C, and / or the upper left corner neighboring block D.

For example, if the size of the current block is 2NxN, and the x component of the top-left sample position of the current block is 0 and the y component is 0, the upper neighboring block A may be a block including a sample at coordinates (N-1, -1), the upper right corner neighboring block B may be a block including a sample at coordinates (N, -1), the lower left corner neighboring block C may be a block including a sample at coordinates (-1, 2N), and the upper left corner neighboring block D may be a block including a sample at coordinates (-1, -1).

Referring to FIG. 3C, when the size of the current block is Nx2N, the spatial neighboring blocks may include the left neighboring block A, the upper right corner neighboring block B, the lower left corner neighboring block C, and / or the upper left corner neighboring block D.

For example, when the size of the current block is Nx2N, and the x component of the top-left sample position of the current block is 0 and the y component is 0, the left neighboring block A may be a block including a sample at coordinates (-1, N-1), the upper right corner neighboring block B may be a block including a sample at coordinates (2N, -1), the lower left corner neighboring block C may be a block including a sample at coordinates (-1, N), and the upper left corner neighboring block D may be a block including a sample at coordinates (-1, -1).

Referring to FIG. 3D, the temporal neighboring block may represent a block located at the same position as the current block in a picture different from the current picture including the current block (i.e., the reference picture). Here, the other picture may be before or after the current picture in picture order count (POC). In addition, a reference picture used when deriving a temporal neighboring block may be referred to as a collocated picture (col picture). In addition, a collocated block may indicate a block located at a position in the col picture corresponding to the position of the current block, and may be referred to as a col block.

For example, the temporal neighboring block may include an A block located in the col picture corresponding to the lower right corner peripheral block and / or a B block located corresponding to the center lower right block of the current block.

Referring to FIG. 4, the encoding apparatus 100 and / or the video decoding apparatus 200 may derive the spatial motion information candidate based on the spatial neighboring blocks of the current block as described with reference to FIG. 3 (S400).

The encoding apparatus 100 and / or the video decoding apparatus 200 may derive the temporal motion information candidate based on the temporal neighboring block of the current block as described with reference to FIG. 3 (S410).

In this case, when motion information compression is applied, the encoding apparatus 100 and / or the video decoding apparatus 200 may compress and store motion information of pictures that are already decoded (S415). For example, when prediction is performed on a reference picture (ie, col picture), predicted blocks may be generated and motion information for each predicted block may be derived. In this case, instead of storing all the motion information for each predicted block, the motion information in the reference picture may be compressed and stored in a predetermined storage unit.

Therefore, in deriving a temporal motion candidate, the encoding apparatus 100 and / or the video decoding apparatus 200 may use motion information of a reference picture (ie, col picture) compressed and stored in a memory.

The encoding apparatus 100 and / or the video decoding apparatus 200 may compare the number of current candidates (spatial motion information candidates and temporal motion information candidates) derived in steps S400 to S410 with the number of candidates required to construct the motion information candidate list, and may add a combined bi-predictive candidate and a zero vector candidate to the motion information candidate list according to the comparison result (S420 and S430).
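A simplified sketch of the list construction in FIG. 4 is given below. It only shows the comparison against the required number of candidates and the padding with zero vector candidates; the combined bi-predictive candidates are omitted for brevity, and the function name, the `required_count` parameter, and the `(mv, ref_idx)` tuple layout are assumptions of this example.

```python
from typing import List, Tuple

Candidate = Tuple[Tuple[int, int], int]  # ((mv_x, mv_y), ref_idx), hypothetical layout

ZERO_CANDIDATE: Candidate = ((0, 0), 0)


def build_candidate_list(spatial: List[Candidate],
                         temporal: List[Candidate],
                         required_count: int) -> List[Candidate]:
    """Collect spatial then temporal candidates and pad with zero vector
    candidates until the required number of candidates is reached."""
    candidates = (list(spatial) + list(temporal))[:required_count]
    while len(candidates) < required_count:
        candidates.append(ZERO_CANDIDATE)
    return candidates


cand_list = build_candidate_list([((1, 2), 0)], [((3, 4), 0)], required_count=5)
```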

The motion information compressed and stored in a predetermined storage unit as described above may be used in an inter prediction method of deriving a temporal motion information candidate with reference to a temporal neighboring block.

For example, when the merge mode is applied, a merge candidate list may be generated using a motion vector corresponding to a col block which is a temporal neighboring block. In this case, the temporal merge candidate in the merge candidate list may use motion information compressed and stored for the col block in the col picture. In addition, when a motion vector prediction (MVP) mode is applied, a motion vector predictor candidate list may be generated using a motion vector corresponding to a col block which is a temporal neighboring block. In this case, the temporal motion vector predictor candidate in the motion vector predictor candidate list may use the motion information compressed and stored for the col block in the col picture.
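The lookup of compressed motion information for a col block can be sketched as an index computation. The sketch assumes that the compressed motion field is a row-major 2D list with one representative motion vector per storage unit and that the col block is identified by a sample position `(x, y)` in the col picture; these names and the 16x16 default are illustrative assumptions.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def fetch_temporal_motion(compressed_field: List[List[MV]],
                          x: int, y: int, unit_size: int = 16) -> MV:
    """Return the representative motion vector of the storage unit that
    covers sample position (x, y) in the col picture."""
    return compressed_field[y // unit_size][x // unit_size]


# A 32x32 col picture compressed into 2x2 storage units of 16x16 samples.
field = [[(1, 0), (2, 0)],
         [(3, 0), (4, 0)]]
temporal_mv = fetch_temporal_motion(field, x=20, y=5)
```

Every sample position inside the same storage unit maps to the same entry, so the temporal candidate equals the representative motion information stored for that unit.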

FIG. 5 exemplarily shows blocks partitioned for prediction within a picture.

The picture 500 illustrated in FIG. 5 represents a decoded picture, and the decoded picture may be stored in the memories 160 and 260 of the encoding apparatus 100 and / or the video decoding apparatus 200. The decoded picture stored in the memory 160, 260 may be used as a reference picture (col picture) to derive motion information of the current block.

The picture 500 may be partitioned or divided into prediction blocks (ie, prediction units) to perform prediction in the decoding process as shown in FIG. 5. In this way, inter prediction or intra prediction may be performed on the partitioned or divided prediction blocks, and prediction related information about each prediction block may be derived as a prediction result. For example, when inter prediction is applied to the prediction block 510 in the picture 500, prediction related information of the prediction block 510 may be derived. In this case, the prediction related information of the prediction block 510 may include motion information regarding a motion vector, a reference picture index, and the like.

As described above, a picture may be partitioned or divided into prediction blocks to perform prediction. When prediction blocks in a picture are divided as illustrated in FIG. 5, motion information may be stored in units of prediction blocks according to an implementation method of a decoder. However, when storing the motion information of each of the prediction blocks having different sizes in this way, in order to manage and use the motion information of each prediction block, not only the motion information but also the block partitioning information must be stored together. In this case, the overhead in the amount of data to be stored and the computational complexity incurred in using the motion information of the corresponding prediction block are considerable.

Therefore, most decoders do not store motion information in units of prediction blocks that are actually divided as shown in FIG. 5, but store motion information in units of minimum prediction blocks as shown in FIG. 6. The minimum prediction block refers to the prediction block having the smallest size. For example, if the minimum prediction block has a 4x4 size, motion information may be stored in units of 4x4 size as shown in FIG. 6. However, even when the motion information is stored in the minimum prediction block unit, there is a considerable amount of data to be stored in the picture.
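Storing motion information on a fixed grid of minimum prediction blocks can be sketched as follows. Each prediction block writes its motion information into every grid cell it covers, so no partitioning information is needed when reading it back. The tuple layout `(x, y, width, height, motion)` and the function name are assumptions for this example.

```python
from typing import Any, List, Tuple

PredictionBlock = Tuple[int, int, int, int, Any]  # (x, y, width, height, motion)


def store_on_min_grid(pic_width: int, pic_height: int,
                      blocks: List[PredictionBlock],
                      min_size: int = 4) -> List[List[Any]]:
    """Replicate each prediction block's motion information into all
    min_size x min_size grid cells that the block covers."""
    cols, rows = pic_width // min_size, pic_height // min_size
    grid: List[List[Any]] = [[None] * cols for _ in range(rows)]
    for x, y, w, h, motion in blocks:
        for gy in range(y // min_size, (y + h) // min_size):
            for gx in range(x // min_size, (x + w) // min_size):
                grid[gy][gx] = motion
    return grid


# A 16x8 picture split into one 8x8 block and two 8x4 blocks.
grid = store_on_min_grid(16, 8, [(0, 0, 8, 8, "mvA"),
                                 (8, 0, 8, 4, "mvB"),
                                 (8, 4, 8, 4, "mvC")])
```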

For example, the high efficiency video coding (HEVC) standard, which is one of the techniques related to video coding, requires 74 bits to store motion information in units of minimum prediction blocks, and the motion information is stored in a 128-bit storage according to the hardware specification of the decoder. Expressed in terms of pixel values, this is equivalent to 16 pixels of an 8-bit image (8 bits * 16 = 128 bits). Since this amount of data is stored in memory (i.e., DPB), it can increase the hardware cost of the decoder.

In order to solve this problem, the present invention proposes a method of compressing and storing motion information in a storage unit larger than the minimum prediction block shown in FIG. 6.

The motion information of the prediction blocks split to perform the prediction in the picture may be compressed and stored in a storage unit having a size larger than the minimum prediction block size as shown in FIG. 7.

For example, when the minimum prediction block size is 4x4, the storage unit for storing the motion information according to the present invention may be set to an NxN block larger than 4x4. Here, N may be an integer. For example, the NxN storage unit may be set to a block having a size of 8x8, 16x16, 32x32, or 64x64.
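The effect of the storage unit size on memory can be estimated with simple arithmetic. The sketch below assumes one 128-bit record per storage unit (the 128-bit figure is taken from the HEVC example above; applying it unchanged to larger units is an assumption) and a hypothetical 1920x1080 picture.

```python
import math

BITS_PER_RECORD = 128  # assumed size of one stored motion information record


def motion_storage_bits(pic_width: int, pic_height: int, unit_size: int) -> int:
    """Total bits needed to store one record per unit_size x unit_size unit."""
    units_x = math.ceil(pic_width / unit_size)
    units_y = math.ceil(pic_height / unit_size)
    return units_x * units_y * BITS_PER_RECORD


baseline = motion_storage_bits(1920, 1080, 4)
for n in (8, 16, 32, 64):
    bits = motion_storage_bits(1920, 1080, n)
    print(f"{n}x{n}: {bits} bits, about {baseline / bits:.1f}x smaller than 4x4")
```

Under these assumptions a 4x4 unit needs as many bits as the 8-bit luma samples of the picture itself, and each doubling of N reduces the motion storage by roughly a factor of four.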

The tables below show the results of experimentally verifying the case where the motion information is compressed in a storage unit larger than the minimum prediction block size, as proposed in the present invention.

Table 1 shows the performance when motion information is compressed in the 16x16 storage unit proposed by the present invention, compared with the 4x4 unit, which is the minimum prediction block size.

	Y	U	V
Class A1	0.01%	-0.13%	0.00%
Class A2	0.00%	-0.08%	-0.02%
Class B	-0.01%	0.14%	-0.08%
Class C	-0.05%	0.12%	-0.03%
Class D	-0.19%	-0.14%	-0.31%
Overall	-0.05%	0.00%	-0.09%

Table 2 shows the performance when motion information is compressed in the 32x32 storage unit proposed by the present invention, compared with the 4x4 unit, which is the minimum prediction block size.

	Y	U	V
Class A1	0.06%	-0.27%	0.08%
Class A2	0.07%	0.08%	0.14%
Class B	0.12%	0.13%	0.12%
Class C	0.19%	0.23%	0.32%
Class D	0.38%	0.39%	0.28%
Overall	0.17%	0.13%	0.19%

Table 3 shows the performance when motion information is compressed in the 64x64 storage unit proposed by the present invention, compared with the result of compressing in the 4x4 unit, which is the minimum prediction block size.

	Y	U	V
Class A1	0.26%	0.25%	0.28%
Class A2	0.45%	0.47%	0.61%
Class B	0.53%	0.43%	0.32%
Class C	0.78%	0.70%	0.68%
Class D	0.98%	0.79%	0.68%
Overall	0.62%	0.54%	0.51%

According to the present invention, since the storage unit for storing the motion information has a size larger than the minimum prediction block size, as illustrated in FIG. 8, one storage unit may include one or more prediction blocks. In this case, each of the one or more prediction blocks includes motion information. For example, referring to FIG. 8, an n-th prediction block in a storage unit may include an n-th motion vector (e.g., MV_n) as motion information. Here, n may be 0, 1, ..., 10.

That is, when one or more prediction blocks are included in a storage unit of a predetermined size, it is important to determine which prediction block's motion information is to be compressed and stored. Accordingly, the present invention proposes a method of determining representative motion information to be stored on behalf of a corresponding storage unit among motion information of prediction blocks included in a storage unit having a predetermined size.

In one embodiment, a spatial position candidate may be determined by designating a specific position within one storage unit. For example, as shown in FIG. 9, the spatial position candidates may include at least one of a block including a sample located at the center of the storage unit (C candidate block), a block including a sample located at the top-left (TL candidate block), a block located at the top-right (TR candidate block), a block located at the bottom-left (BL candidate block), and a block located at the bottom-right (BR candidate block). In this case, the spatial position candidates may be predefined.
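One possible way to map the five spatial position candidates to sample positions inside an NxN storage unit is sketched below. The exact sample positions are not specified in the description above, so the coordinates used here (corner samples and the sample at (N/2, N/2) for the center) are assumptions of this example, as is the function name.

```python
from typing import Dict, Tuple


def candidate_sample_positions(unit_size: int) -> Dict[str, Tuple[int, int]]:
    """Return an assumed (x, y) sample position, relative to the top-left
    sample of the storage unit, for each spatial position candidate."""
    last = unit_size - 1
    mid = unit_size // 2
    return {
        "C": (mid, mid),     # center
        "TL": (0, 0),        # top-left
        "TR": (last, 0),     # top-right
        "BL": (0, last),     # bottom-left
        "BR": (last, last),  # bottom-right
    }


positions_16 = candidate_sample_positions(16)
```

The prediction block that contains each of these samples would then serve as the corresponding candidate block.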

Hereinafter, a method of determining representative motion information to be stored on behalf of the storage unit based on the spatial position candidate in the storage unit as illustrated in FIG. 9 will be described in detail.

FIG. 10 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit having a predetermined size based on a spatial position candidate according to the present invention. The method of FIG. 10 may be performed by the encoding apparatus 100 and the decoding apparatus 200. However, the method of FIG. 10 will be described as being performed by the decoding apparatus 200 for convenience of description.

In one embodiment, the decoding apparatus 200 may traverse the spatial position candidates in the storage unit in a predetermined order, determine whether each spatial position candidate is available, and set the motion information of the first available spatial position candidate as the representative motion information. In this case, the spatial position candidates may include a C candidate block, a TL candidate block, a TR candidate block, a BL candidate block, and a BR candidate block as illustrated in FIG. 9. The predetermined traversal order of the spatial position candidates may be the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block.

Referring to FIG. 10, the decoding apparatus 200 may determine whether the C candidate block in a storage unit having a predetermined size is an available spatial position candidate (S1000). According to an embodiment, the decoding apparatus 200 may determine whether the C candidate block is a block predicted in the inter prediction mode, and may determine that the C candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the C candidate block is available, the decoding apparatus 200 may determine the motion information of the C candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1005). In this case, the decoding apparatus 200 may skip the process of determining whether the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block are available spatial position candidates.

If the C candidate block is not available, the decoding apparatus 200 may determine whether the TL candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1010). In an embodiment, the decoding apparatus 200 may determine whether the TL candidate block is a block predicted in the inter prediction mode, and may determine that the TL candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the TL candidate block is available, the decoding apparatus 200 may determine the motion information of the TL candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1015). In this case, the decoding apparatus 200 may skip the process of determining whether the TR candidate block, the BL candidate block, and the BR candidate block are available spatial position candidates.

If the TL candidate block is not available, the decoding apparatus 200 may determine whether the TR candidate block in the storage unit having a predetermined size is an available spatial position candidate (S1020). According to an embodiment, the decoding apparatus 200 may determine whether the TR candidate block is a block predicted in the inter prediction mode, and may determine that the TR candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the TR candidate block is available, the decoding apparatus 200 may determine the motion information of the TR candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1025). In this case, the decoding apparatus 200 may skip the process of determining whether the BL candidate block and the BR candidate block are available spatial position candidates.

If the TR candidate block is not available, the decoding apparatus 200 may determine whether the BL candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1030). In an embodiment, the decoding apparatus 200 may determine whether the BL candidate block is a block predicted in the inter prediction mode, and may determine that the BL candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the BL candidate block is available, the decoding apparatus 200 may determine the motion information of the BL candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1035). In this case, the decoding apparatus 200 may skip the process of determining whether the BR candidate block is an available spatial position candidate.

If the BL candidate block is not available, the decoding apparatus 200 may determine whether the BR candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1040). According to an embodiment, the decoding apparatus 200 may determine whether the BR candidate block is a block predicted in the inter prediction mode, and may determine that the BR candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

When the BR candidate block is available, the decoding apparatus 200 may determine the motion information of the BR candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1045).

If the BR candidate block is not available, the decoding apparatus 200 may determine that there are no available candidates among the spatial position candidates, and set a default value as the representative motion information in the storage unit (S1050). In one embodiment, the default value may be a motion vector having a value of zero.

As described above, the decoding apparatus 200 may traverse the candidates in the order of the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block, and may compress and store the finally determined representative motion information for the corresponding storage unit (S1060).
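The traversal of FIG. 10 can be summarized as a first-available search with a zero motion vector as the default. In the sketch below, each candidate block is represented by a hypothetical dictionary with an `is_inter` flag and an `mv` entry; this data layout and the function name are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

MV = Tuple[int, int]

ZERO_MV: MV = (0, 0)                         # default value when no candidate is available (S1050)
FIG10_ORDER = ("C", "TL", "TR", "BL", "BR")  # traversal order of FIG. 10


def select_representative_mv(candidates: Dict[str, Optional[dict]],
                             order: Sequence[str] = FIG10_ORDER) -> MV:
    """Return the motion vector of the first available candidate, where a
    candidate is available if it was predicted in the inter prediction mode."""
    for name in order:
        block = candidates.get(name)
        if block is not None and block["is_inter"]:
            return block["mv"]  # later candidates are not checked
    return ZERO_MV


unit = {
    "C": {"is_inter": False, "mv": None},   # intra-predicted, not available
    "TL": {"is_inter": False, "mv": None},  # intra-predicted, not available
    "TR": {"is_inter": True, "mv": (3, -1)},
    "BL": {"is_inter": True, "mv": (7, 7)},
    "BR": {"is_inter": True, "mv": (0, 2)},
}
representative = select_representative_mv(unit)
```

Here the C and TL candidate blocks are unavailable, so the TR candidate block supplies the representative motion information and the BL and BR candidate blocks are never examined.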

FIG. 11 is a flowchart illustrating another embodiment of a method of determining representative motion information in a storage unit of a predetermined size based on a spatial position candidate according to the present invention. Although the method of FIG. 11 may be performed by the encoding apparatus 100 and the decoding apparatus 200, the method of FIG. 11 will be described as being performed by the decoding apparatus 200 for convenience of description.

In one embodiment, the decoding apparatus 200 may traverse the spatial position candidates in the storage unit in a predetermined order, determine whether each spatial position candidate is available, and set the motion information of the first available spatial position candidate as the representative motion information. In this case, the spatial position candidates may include a C candidate block, a TL candidate block, a TR candidate block, a BL candidate block, and a BR candidate block as illustrated in FIG. 9. The predetermined traversal order of the spatial position candidates may be the TL candidate block, the C candidate block, the BL candidate block, the TR candidate block, and the BR candidate block.

Specifically, referring to FIG. 11, the decoding apparatus 200 may determine whether the TL candidate block in a storage unit having a predetermined size is an available spatial position candidate (S1100). In an embodiment, the decoding apparatus 200 may determine whether the TL candidate block is a block predicted in the inter prediction mode, and may determine that the TL candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the TL candidate block is available, the decoding apparatus 200 may determine the motion information of the TL candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1105). In this case, the decoding apparatus 200 may skip the process of determining whether the C candidate block, the BL candidate block, the TR candidate block, and the BR candidate block are available spatial position candidates.

If the TL candidate block is not available, the decoding apparatus 200 may determine whether the C candidate block in the storage unit having a predetermined size is an available spatial position candidate (S1110). According to an embodiment, the decoding apparatus 200 may determine whether the C candidate block is a block predicted in the inter prediction mode, and may determine that the C candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the C candidate block is available, the decoding apparatus 200 may determine the motion information of the C candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1115). In this case, the decoding apparatus 200 may skip the process of determining whether the BL candidate block, the TR candidate block, and the BR candidate block are available spatial position candidates.

If the C candidate block is not available, the decoding apparatus 200 may determine whether the BL candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1120). In an embodiment, the decoding apparatus 200 may determine whether the BL candidate block is a block predicted in the inter prediction mode, and may determine that the BL candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the BL candidate block is available, the decoding apparatus 200 may determine the motion information of the BL candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1125). In this case, the decoding apparatus 200 may skip the process of determining whether the TR candidate block and the BR candidate block are available spatial position candidates.

If the BL candidate block is not available, the decoding apparatus 200 may determine whether the TR candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1130). According to an embodiment, the decoding apparatus 200 may determine whether the TR candidate block is a block predicted in the inter prediction mode, and may determine that the TR candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

If the TR candidate block is available, the decoding apparatus 200 may determine the motion information of the TR candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1135). In this case, the decoding apparatus 200 may skip the process of determining whether the BR candidate block is an available spatial position candidate.

If the TR candidate block is not available, the decoding apparatus 200 may determine whether the BR candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1140). According to an embodiment, the decoding apparatus 200 may determine whether the BR candidate block is a block predicted in the inter prediction mode, and may determine that the BR candidate block is an available spatial position candidate when it is a block predicted in the inter prediction mode.

When the BR candidate block is available, the decoding apparatus 200 may determine the motion information of the BR candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1145).

If the BR candidate block is not available, the decoding apparatus 200 may determine that there are no available candidates among the spatial position candidates, and set a default value as the representative motion information in the storage unit (S1150). In one embodiment, the default value may be a motion vector having a value of zero.

As described above, the decoding apparatus 200 may traverse the candidates in the order of the TL candidate block, the C candidate block, the BL candidate block, the TR candidate block, and the BR candidate block, and may compress and store the finally determined representative motion information for the corresponding storage unit (S1160).

The above-described traversal order of the spatial position candidates in FIGS. 10 and 11 is merely an example, and the spatial position candidates may be traversed in a different order from that of FIGS. 10 and 11.
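To illustrate how the traversal order affects the stored result, the following sketch compresses a whole motion field, kept on a grid of minimum prediction blocks, into one representative motion vector per storage unit. It is a simplified model under several assumptions: grid cells hold a motion vector for inter-predicted blocks and `None` otherwise, candidate positions are taken at the corner and center cells of each unit, and all names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

MV = Tuple[int, int]

ZERO_MV: MV = (0, 0)
FIG10_ORDER = ("C", "TL", "TR", "BL", "BR")
FIG11_ORDER = ("TL", "C", "BL", "TR", "BR")


def compress_motion_field(min_grid: List[List[Optional[MV]]],
                          cells_per_unit: int,
                          order: Sequence[str] = FIG11_ORDER) -> List[List[MV]]:
    """Keep one representative motion vector per storage unit, chosen as the
    first available candidate cell in the given traversal order."""
    last, mid = cells_per_unit - 1, cells_per_unit // 2
    # Assumed (dx, dy) cell offsets of each candidate inside a storage unit.
    offsets = {"C": (mid, mid), "TL": (0, 0), "TR": (last, 0),
               "BL": (0, last), "BR": (last, last)}
    rows, cols = len(min_grid), len(min_grid[0])
    compressed = []
    for uy in range(0, rows, cells_per_unit):
        out_row = []
        for ux in range(0, cols, cells_per_unit):
            representative = ZERO_MV  # default when no candidate is available
            for name in order:
                dx, dy = offsets[name]
                x, y = ux + dx, uy + dy
                if y < rows and x < cols and min_grid[y][x] is not None:
                    representative = min_grid[y][x]
                    break
            out_row.append(representative)
        compressed.append(out_row)
    return compressed


# One 16x16 storage unit made of 4x4 cells; only the TL and C cells are inter-predicted.
grid = [[None] * 4 for _ in range(4)]
grid[0][0] = (1, 1)  # TL cell
grid[2][2] = (5, 5)  # C cell
by_fig11 = compress_motion_field(grid, 4, FIG11_ORDER)
by_fig10 = compress_motion_field(grid, 4, FIG10_ORDER)
```

With the same input, the order of FIG. 11 keeps the TL motion vector while the order of FIG. 10 keeps the C motion vector, which shows why the traversal order is a design parameter.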

In the present invention, as described above, all spatial position candidates shown in FIG. 9 may be traversed in a predetermined order, and the motion information of the first available spatial position candidate may be determined and stored as representative motion information of the corresponding storage unit. In this case, when all spatial position candidates are traversed in order, hardware complexity may increase. Accordingly, in the present invention, the representative motion information may be determined with reference to the optimized spatial position candidate in consideration of hardware complexity.

The table below shows the results of the experimental verification of the case of determining the representative motion information for the storage unit of the predetermined size using the optimized spatial position candidate proposed in the present invention.

Table 4 shows the performance of using the spatial position candidate (TL candidate block) optimized for the 16x16 storage unit proposed by the present invention, compared to the compression result of the 4x4 unit, which is the minimum prediction block size.

	Y	U	V
Class A1	0.06%	-0.15%	0.13%
Class A2	0.04%	0.12%	0.10%
Class B	0.05%	0.13%	-0.11%
Class C	0.02%	-0.01%	0.10%
Class D	0.01%	0.09%	0.00%
Overall	0.03%	0.05%	0.03%

Table 5 shows the performance when the spatial position candidate (C candidate block) optimized for the 16x16 storage unit proposed by the present invention is compared with the result of compressing the 4x4 unit, which is the minimum prediction block size.

	Y	U	V
Class A1	0.07%	-0.06%	0.25%
Class A2	0.02%	0.05%	0.07%
Class B	0.05%	0.23%	0.00%
Class C	-0.06%	-0.08%	0.09%
Class D	-0.03%	0.09%	-0.19%
Overall	0.01%	0.06%	0.03%

As shown in the experimental verification results of Tables 4 and 5, TL candidate blocks and C candidate blocks may be used as optimized spatial position candidates.

FIG. 12 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit having a predetermined size with reference to an optimized spatial position candidate among spatial position candidates according to the present invention. Although the method of FIG. 12 may be performed by the encoding apparatus 100 and the decoding apparatus 200, the method of FIG. 12 will be described as being performed by the decoding apparatus 200 for convenience of description.

In an embodiment, the decoding apparatus 200 may determine the representative motion information using the TL candidate block among the spatial position candidates.

Referring to FIG. 12, the decoding apparatus 200 may determine whether a TL candidate block in a storage unit of a predetermined size is an available spatial position candidate (S1200). In an embodiment, the decoding apparatus 200 may determine whether the TL candidate block is a block predicted in the inter prediction mode, and may determine that it is an available spatial position candidate when it is predicted in the inter prediction mode.

If the TL candidate block is available, the decoding apparatus 200 may determine the motion information of the TL candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1205).

If the TL candidate block is not available, the decoding apparatus 200 may set a default value as the representative motion information in the storage unit (S1210). In one embodiment, the default value may be a motion vector having a value of zero.

The decoding apparatus 200 may compress and store the finally determined representative motion information for the corresponding storage unit (S1220).

FIG. 13 is a flowchart illustrating another embodiment of a method of determining representative motion information in a storage unit having a predetermined size with reference to an optimized spatial position candidate among spatial position candidates according to the present invention. Although the method of FIG. 13 may be performed by the encoding apparatus 100 and the decoding apparatus 200, the method of FIG. 13 will be described as being performed by the decoding apparatus 200 for convenience of description.

In an embodiment, the decoding apparatus 200 may determine the representative motion information using the C candidate block among the spatial position candidates.

Referring to FIG. 13, the decoding apparatus 200 may determine whether a C candidate block in a storage unit having a predetermined size is an available spatial position candidate (S1300). According to an embodiment, the decoding apparatus 200 may determine whether the C candidate block is a block predicted in the inter prediction mode, and may determine that it is an available spatial position candidate when it is predicted in the inter prediction mode.

If the C candidate block is available, the decoding apparatus 200 may determine the motion information of the C candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1305).

If the C candidate block is not available, the decoding apparatus 200 may set a default value as the representative motion information in the storage unit (S1310). In one embodiment, the default value may be a motion vector having a value of zero.

The decoding apparatus 200 may compress and store the finally determined representative motion information for the corresponding storage unit in operation S1320.
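
The single-candidate methods of FIGS. 12 and 13 may be sketched for a whole picture as follows. The sketch assumes that motion information is held per 4x4 block, that a 16x16 storage unit therefore spans 4 blocks per side, and that the TL and C positions correspond to block offsets (0, 0) and (2, 2) inside the unit. These offsets and all names are assumptions for illustration, since the exact positions are defined by FIG. 9.

```python
from typing import List, Optional, Tuple

MotionVector = Tuple[int, int]
# One entry per 4x4 block; None marks a block that is not inter-predicted.
Field = List[List[Optional[MotionVector]]]

TL_OFFSET = (0, 0)  # assumed top-left block of the storage unit
C_OFFSET = (2, 2)   # assumed center block of a 4x4-block storage unit


def compress_motion_field(field: Field,
                          blocks_per_unit: int = 4,
                          offset: Tuple[int, int] = TL_OFFSET) -> Field:
    """Keep one representative motion vector per storage unit."""
    rows, cols = len(field), len(field[0])
    compressed: Field = []
    for r0 in range(0, rows, blocks_per_unit):
        out_row = []
        for c0 in range(0, cols, blocks_per_unit):
            # Clamp so partial units at the picture edge stay in range.
            r = min(r0 + offset[0], rows - 1)
            c = min(c0 + offset[1], cols - 1)
            mv = field[r][c]
            # Unavailable candidate: store the zero motion vector.
            out_row.append(mv if mv is not None else (0, 0))
        compressed.append(out_row)
    return compressed
```

Only one position is read per storage unit, which is the source of the reduced complexity compared with traversing all five candidates.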

In the present invention, as described above, one optimized spatial position candidate may be used in consideration of hardware complexity; however, one or more optimized spatial position candidates may also be used in consideration of the trade-off between compression performance and complexity.

The table below shows the results of the experimental verification of the case of determining the representative motion information for the storage unit of the predetermined size using one or more optimized spatial position candidates proposed in the present invention.

Table 6 shows the performance when two optimized spatial position candidates (C candidate block and TL candidate block) are used for the 16x16 storage unit proposed by the present invention, compared with the result of compression in 4x4 units, the minimum prediction block size. In this case, the traversal order of the spatial candidate blocks is the C candidate block followed by the TL candidate block.

	Y	U	V
Class A1	0.06%	-0.19%	-0.02%
Class A2	-0.01%	-0.01%	0.01%
Class B	0.03%	0.01%	-0.17%
Class C	-0.03%	0.06%	0.12%
Class D	-0.08%	-0.13%	-0.44%
Overall	0.00%	-0.04%	-0.11%

Table 7 shows the performance when all spatial position candidates (C candidate block, TL candidate block, TR candidate block, BL candidate block, and BR candidate block) are used. In this case, the traversal order of the spatial candidate blocks is the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block.

FIG. 14 is a flowchart illustrating an embodiment of a method of determining representative motion information in a storage unit of a predetermined size with reference to two optimized spatial position candidates among spatial position candidates according to the present invention. Although the method of FIG. 14 may be performed by the encoding apparatus 100 and the decoding apparatus 200, it is described as being performed by the decoding apparatus 200 for convenience of description.

In an embodiment, the decoding apparatus 200 may determine the representative motion information using the C candidate block and the TL candidate block among the spatial position candidates. In this case, the spatial position candidates may be traversed in the order of the C candidate block and then the TL candidate block.

Referring to FIG. 14, the decoding apparatus 200 may determine whether a C candidate block in a storage unit having a predetermined size is an available spatial position candidate (S1400). According to an embodiment, the decoding apparatus 200 may determine whether the C candidate block is a block predicted in the inter prediction mode, and may determine that it is an available spatial position candidate when it is predicted in the inter prediction mode.

If the C candidate block is available, the decoding apparatus 200 may determine the motion information of the C candidate block as the representative motion information in the storage unit, and may compress and store the motion information (S1405). In this case, the decoding apparatus 200 may not perform a process of determining whether it is an available spatial position candidate for the TL candidate block.

If the C candidate block is not available, the decoding apparatus 200 may determine whether the TL candidate block in the storage unit of the predetermined size is an available spatial position candidate (S1410). In an embodiment, the decoding apparatus 200 may determine whether the TL candidate block is a block predicted in the inter prediction mode, and may determine that it is an available spatial position candidate when it is predicted in the inter prediction mode.

When the TL candidate block is available, the decoding apparatus 200 may determine the motion information of the TL candidate block as representative motion information in the storage unit, and compress and store the motion information of the TL candidate block in operation S1415.

If the TL candidate block is not available, the decoding apparatus 200 may determine that there are no available candidates among the spatial position candidates, and set a default value as the representative motion information in the storage unit (S1420). In one embodiment, the default value may be a motion vector having a value of zero.

As described above, the decoding apparatus 200 may compress and store the finally determined representative motion information for the corresponding storage unit while iterating in the order of the C candidate block and the TL candidate block (S1430).
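
The two-candidate flow of FIG. 14 may be sketched as follows, with the number of availability checks returned alongside the result to make the early exit visible. The function name, the check counter, and the use of `None` for an unavailable candidate are assumptions for illustration.

```python
from typing import Optional, Tuple

MotionVector = Tuple[int, int]


def representative_from_c_then_tl(
        c_mv: Optional[MotionVector],
        tl_mv: Optional[MotionVector]) -> Tuple[MotionVector, int]:
    """Return (representative motion vector, availability checks performed)."""
    checks = 1                # S1400: check the C candidate block
    if c_mv is not None:
        return c_mv, checks   # S1405: the TL candidate is never examined
    checks += 1               # S1410: check the TL candidate block
    if tl_mv is not None:
        return tl_mv, checks  # S1415
    return (0, 0), checks     # S1420: default zero motion vector
```

At most two checks are performed per storage unit, compared with up to five when all spatial position candidates are traversed.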

The present invention may determine the representative motion information stored on behalf of the storage unit based on the spatial position candidate(s) as described above, or may determine it based on the area of the prediction blocks included in the storage unit of the predetermined size.

One storage unit may include a plurality of prediction blocks, and the prediction blocks may have different areas, as shown in FIG. 15.

In an embodiment, the motion information of the prediction block having the largest area among the prediction blocks included in the storage unit having a predetermined size may be used as the representative motion information to be stored in the storage unit. For example, in FIG. 15, the prediction block having the motion vector MV₀ occupies the largest area in the storage unit, about 25%. In this case, the motion vector MV₀ may be used as the representative motion information of the corresponding storage unit.

As such, determining the representative motion information based on the area of the prediction blocks included in the storage unit may be more efficient in terms of hardware implementation than searching the spatial position candidates. That is, a hardware implementation can eliminate the position computation required to find a predefined position (i.e., a spatial position candidate) and can obtain the area information of a prediction block with a simple read operation, which is advantageous in terms of hardware.
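
The area-based selection may be sketched as follows. The `PredictionBlock` type and its fields are hypothetical, and breaking ties in favor of the first block listed is an assumption, since the description does not specify tie handling.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

MotionVector = Tuple[int, int]


@dataclass(frozen=True)
class PredictionBlock:
    width: int        # width in samples
    height: int       # height in samples
    mv: MotionVector  # motion vector of the block


def representative_by_area(blocks: Sequence[PredictionBlock]) -> MotionVector:
    """Return the motion vector of the block covering the largest area."""
    # max() keeps the first block among equal areas (assumed tie rule).
    largest = max(blocks, key=lambda b: b.width * b.height)
    return largest.mv
```

For a 16x16 storage unit, an 8x8 prediction block covers 64 of 256 samples, that is, 25% of the unit, and would be selected over smaller blocks such as 8x4 or 4x4.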

When the representative motion information is not determined in the storage unit by the method proposed in the present invention, exception handling may be performed for the storage unit. This will be described with reference to FIG. 16.

FIG. 16 is a diagram illustrating an embodiment of exception processing for representative motion information stored on behalf of one storage unit according to the present invention. Although the method of FIG. 16 may be performed by the encoding apparatus 100 and the decoding apparatus 200, the method of FIG. 16 is described as being performed by the decoding apparatus 200 for convenience of description.

Referring to FIG. 16, the decoding apparatus 200 may traverse the spatial position candidates in a predetermined order, determine whether each spatial position candidate is available, and set the motion information of an available spatial position candidate as the representative motion information. In this case, the spatial position candidates may include a C candidate block, a TL candidate block, a TR candidate block, a BL candidate block, and a BR candidate block as illustrated in FIG. 9. The predetermined traversal order of the spatial position candidates may be the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block.

When the decoding apparatus 200 traverses the candidates in the order of the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block and determines that none of the spatial position candidates is available, the decoding apparatus 200 may perform exception handling for the corresponding storage unit. That is, since none of the spatial position candidates is a block predicted in the inter prediction mode, the decoding apparatus 200 may set the representative motion information of the corresponding storage unit to the intra prediction mode. In other words, because all spatial position candidates are blocks predicted in the intra prediction mode (rather than the inter prediction mode), the intra prediction mode may be set, unlike the above-described embodiments in which the representative motion information of the storage unit is set to a motion vector having a value of zero.
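
The exception handling of FIG. 16 may be sketched as follows. The `INTRA` marker, the function name, and the use of `None` for a candidate that is not inter-predicted are assumptions for illustration; the description states only that the storage unit is set to the intra prediction mode instead of a zero motion vector.

```python
from typing import Dict, Optional, Sequence, Tuple, Union

MotionVector = Tuple[int, int]

INTRA = "INTRA"  # hypothetical marker meaning "no motion information stored"
FIG16_ORDER = ("C", "TL", "TR", "BL", "BR")


def representative_or_intra(
        candidates: Dict[str, Optional[MotionVector]],
        order: Sequence[str] = FIG16_ORDER) -> Union[MotionVector, str]:
    """Return the first available motion vector, or INTRA if none exists."""
    for name in order:
        mv = candidates.get(name)
        if mv is not None:
            return mv
    # Every candidate is intra-predicted: mark the whole storage unit as
    # intra rather than storing a zero motion vector.
    return INTRA
```

A consumer of the compressed motion field can then treat a storage unit marked `INTRA` as having no usable temporal motion information.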

The method of FIG. 17 may be performed by the encoding apparatus 100 of FIG. 1. More specifically, steps S1700 to S1730 may be performed by the prediction unit 110 or the memory 160 disclosed in FIG. 1, and step S1740 may be performed by the entropy encoding unit 130 illustrated in FIG. 1. In addition, the detailed description overlapping with the contents described with reference to FIGS. 1 to 16 will be omitted or simplified. In addition, the encoding apparatus 100 of FIG. 17 may perform operations corresponding to the decoding apparatus 200 of FIG. 18, which will be described later.

The encoding apparatus 100 may derive temporal motion information on the current block based on the temporal neighboring block of the current block (S1700).

The temporal neighboring block refers to a col block located in the reference picture at a position corresponding to the current block. In one embodiment, as described with reference to FIG. 3, the temporal neighboring block may include an A block located in the reference picture (i.e., the col picture) at a position corresponding to the lower-right corner neighboring block of the current block and/or a B block located at a position corresponding to the lower-right block of the current block. For example, the encoding apparatus 100 may first determine whether the A block is available as the col block of the current block in the reference picture. That is, the A block may be determined to be unavailable when it is used only for intra prediction or lies outside a picture boundary. Only when the A block is not available may the encoding apparatus 100 use the B block as the col block of the current block in the reference picture.
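
The selection between the A block and the B block may be sketched as follows. The `ColCandidate` type, its fields, and the function names are hypothetical; the availability test follows the two conditions named above (intra-only coding and position outside the picture boundary).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColCandidate:
    name: str       # "A" or "B"
    x: int          # sample position of the block in the col picture
    y: int
    is_intra: bool  # True when the block is used only for intra prediction


def is_available(block: ColCandidate, pic_width: int, pic_height: int) -> bool:
    """A candidate is unavailable if it is intra-only or outside the picture."""
    inside = 0 <= block.x < pic_width and 0 <= block.y < pic_height
    return inside and not block.is_intra


def select_col_block(a: ColCandidate, b: ColCandidate,
                     pic_width: int, pic_height: int) -> ColCandidate:
    """Prefer the A block; fall back to the B block only when A is unavailable."""
    return a if is_available(a, pic_width, pic_height) else b
```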

The temporal motion information may be identical to the representative motion information stored in the storage unit corresponding to the temporal neighboring block (i.e., the col block), among the pieces of representative motion information compressed and stored for the reference picture (i.e., the col picture) in storage units of a predetermined size.

Here, the storage unit having a predetermined size may be a block having a size larger than the minimum prediction block size. For example, when the minimum prediction block size is 4x4, the storage unit having a predetermined size may be an NxN block larger than 4x4, and the NxN size may be 8x8, 16x16, 32x32, or 64x64. In addition, the size of the storage unit may be predetermined or may be information signaled from the encoding apparatus 100 to the decoding apparatus 200.

On the other hand, when one picture is decoded, it may be stored in the DPB, and the picture stored in the DPB may be used as a reference picture referenced when decoding another picture. In this case, when motion information compression is applied, the motion information of the picture stored in the DPB (i.e., the reference picture) is compressed and stored for each storage unit having a predetermined size, instead of all the motion information in the picture being stored. In other words, one picture may include as many pieces of representative motion information as there are storage units in the picture.
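
As a worked illustration of the storage reduction, the number of representative motion information entries per picture may be counted as follows. The 1920x1080 picture size is an assumed example, and counting partial storage units at the picture boundary as whole units is also an assumption.

```python
def storage_unit_count(pic_width: int, pic_height: int, unit_size: int) -> int:
    """Number of storage units (one representative entry each) in a picture."""
    units_x = -(-pic_width // unit_size)   # ceiling division
    units_y = -(-pic_height // unit_size)
    return units_x * units_y


# Assumed example: a 1920x1080 picture.
per_4x4 = storage_unit_count(1920, 1080, 4)     # 480 * 270 entries
per_16x16 = storage_unit_count(1920, 1080, 16)  # 120 * 68 entries
```

Under these assumptions the 4x4 granularity yields 129,600 entries and the 16x16 storage unit yields 8,160 entries, roughly a sixteenfold reduction.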

Therefore, temporal motion information derived based on temporal neighboring blocks refers to representative motion information stored on behalf of a storage unit including temporal neighboring blocks (ie, col blocks) in a reference picture.
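
The lookup of temporal motion information may be sketched as follows, assuming the compressed motion field of the reference picture is held as a two-dimensional list with one entry per storage unit. The function name and the data layout are assumptions for illustration.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]


def temporal_motion(compressed: List[List[MotionVector]],
                    col_x: int, col_y: int,
                    unit_size: int = 16) -> MotionVector:
    """Return the representative motion of the storage unit containing the col block.

    (col_x, col_y) is the sample position of the col block in the col picture.
    """
    return compressed[col_y // unit_size][col_x // unit_size]
```

Every col block position that falls inside the same storage unit maps to the same entry, so all such blocks share one representative motion vector.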

In compressing and storing the representative motion information of the reference picture for each storage unit, the encoding apparatus 100 may determine the representative motion information to be compressed and stored in a storage unit of a predetermined size for the blocks (i.e., prediction blocks) in the reference picture, and may store the determined representative motion information as the motion information of the blocks included in the storage unit of the predetermined size.

In an embodiment, the encoding apparatus 100 may determine an available spatial position candidate among the spatial position candidates included in the storage unit of the predetermined size, and may determine the motion information of the available spatial position candidate as the representative motion information of the storage unit of the predetermined size.

Here, the spatial position candidates may include at least one of a C candidate block located at the center (C) of the storage unit of the predetermined size, a TL candidate block located at the top-left (TL), a TR candidate block located at the top-right (TR), a BL candidate block located at the bottom-left (BL), and a BR candidate block located at the bottom-right (BR). In addition, the spatial position candidates may be predefined.

In addition, in determining an available spatial position candidate, the encoding apparatus 100 may traverse the spatial position candidates, which include at least one of the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block, in a predetermined order and determine whether each spatial position candidate is available.

If there are no available spatial position candidates among the spatial position candidates included in the storage unit of the predetermined size, the encoding apparatus 100 may determine the representative motion information stored in the storage unit of the predetermined size as a default value. For example, the default value may be a motion vector having a value of zero, or the intra mode.

In another embodiment, the encoding apparatus 100 may determine the representative motion information stored in the storage unit of the predetermined size based on the area of the blocks (that is, the prediction blocks) included in the storage unit of the predetermined size.

The method of compressing and storing the representative motion information of the reference picture for each storage unit has been described in detail with reference to FIGS. 7 to 16, and a detailed description thereof is therefore omitted here. Of course, the various methods of compressing and storing the representative motion information for each storage unit of the reference picture disclosed in FIGS. 7 to 16 may be applied in the present embodiment.

The encoding apparatus 100 may construct a motion information candidate list for the current block including temporal motion information (S1710).

In an embodiment, when the merge mode is applied, the encoding apparatus 100 may generate a merge candidate list using a motion vector corresponding to a col block which is a temporal neighboring block. That is, when the merge mode is applied, the merge candidate list may be configured as the motion information candidate list. A temporal merge candidate (i.e., temporal motion information) in the merge candidate list may use motion information compressed and stored for the col block in the col picture. In other words, the temporal merge candidate (i.e., temporal motion information) refers to the representative motion information compressed and stored for the storage unit including the col block in the col picture.

In another embodiment, when a motion vector prediction (MVP) mode is applied, the encoding apparatus 100 may generate a motion vector predictor candidate list using a motion vector corresponding to a col block which is a temporal neighboring block. That is, when the MVP mode is applied, the motion vector predictor candidate list may be configured as the motion information candidate list. The temporal motion vector predictor candidate (ie, temporal motion information) in the motion vector predictor candidate list may use motion information compressed and stored for the col block in the col picture. In other words, the temporal motion vector predictor candidate (ie, temporal motion information) refers to representative motion information compressed and stored with respect to the corresponding storage unit including the col block in the col picture.

The encoding apparatus 100 may derive motion information of the current block based on the motion information candidate list in operation S1720.

According to an embodiment, the encoding apparatus 100 may select an optimal motion information candidate from among the motion information candidates included in the motion information candidate list based on a rate-distortion (RD) cost, and may derive the selected motion information candidate as the motion information of the current block.
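
Selection by RD cost may be sketched as follows, using the commonly used Lagrangian form J = D + λ·R. The description does not specify the cost function, so this form, the λ value, and the per-candidate distortion and rate figures in the usage note are assumptions for illustration.

```python
from typing import Sequence, Tuple


def select_candidate(costs: Sequence[Tuple[float, float]], lam: float) -> int:
    """Return the index of the candidate with the lowest RD cost.

    Each entry is (distortion, rate_in_bits) for one motion information
    candidate; the cost is assumed to be J = D + lam * R.
    """
    rd = [d + lam * r for d, r in costs]
    return min(range(len(rd)), key=rd.__getitem__)
```

For candidates with (distortion, rate) of (100, 2), (90, 10), and (95, 3) and λ = 2, the costs are 104, 110, and 101, so the third candidate (index 2) is selected.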

The encoding apparatus 100 may perform prediction on the current block based on the motion information of the current block (S1730).

That is, the encoding apparatus 100 may derive the prediction samples of the current block by performing inter prediction on the current block based on the motion information of the current block. In addition, the encoding apparatus 100 may derive residual samples of the current block based on the prediction samples of the current block. In this case, the encoding apparatus 100 may encode residual information regarding the residual samples and signal it to the decoding apparatus 200.

The encoding apparatus 100 may encode motion information of the current block (S1740).

According to an embodiment, the encoding apparatus 100 may encode information about the motion information candidate selected from the motion information candidate list based on a rate-distortion (RD) cost. For example, the encoding apparatus 100 may encode candidate index information indicating the motion information candidate to be used as the motion information of the current block in the motion information candidate list, and may signal it to the decoding apparatus 200.

The method of FIG. 18 may be performed by the decoding apparatus 200 of FIG. 2. More specifically, steps S1800 to S1830 may be performed by the predictor 230 or the memory 260 disclosed in FIG. 2. In addition, some operations may be performed by the entropy decoding unit 210 disclosed in FIG. 2. In addition, the detailed description overlapping with the contents described with reference to FIGS. 1 to 16 will be omitted or simplified. In addition, the decoding apparatus 200 of FIG. 18 may perform operations corresponding to the encoding apparatus 100 of FIG. 17.

The decoding apparatus 200 may derive temporal motion information on the current block based on the temporal neighboring block of the current block (S1800).

The temporal neighboring block refers to a col block located in the reference picture at a position corresponding to the current block. In one embodiment, as described with reference to FIG. 3, the temporal neighboring block may include an A block located in the reference picture (i.e., the col picture) at a position corresponding to the lower-right corner neighboring block of the current block and/or a B block located at a position corresponding to the lower-right block of the current block. For example, the decoding apparatus 200 may first determine whether the A block is available as the col block of the current block in the reference picture. That is, the A block may be determined to be unavailable when it is used only for intra prediction or lies outside a picture boundary. Only when the A block is not available may the decoding apparatus 200 use the B block as the col block of the current block in the reference picture.

On the other hand, when one picture is decoded, it may be stored in the DPB, and the picture stored in the DPB may be used as a reference picture referred to when decoding another picture. In this case, when motion information compression is applied, the motion information of the picture stored in the DPB (i.e., the reference picture) is compressed and stored for each storage unit having a predetermined size, instead of all the motion information in the picture being stored. In other words, one picture may include as many pieces of representative motion information as there are storage units in the picture.

In compressing and storing the representative motion information of the reference picture for each storage unit, the decoding apparatus 200 may determine the representative motion information to be compressed and stored in a storage unit of a predetermined size for the blocks (i.e., prediction blocks) in the reference picture, and may store the determined representative motion information as the motion information of the blocks included in the storage unit of the predetermined size.

In an embodiment, the decoding apparatus 200 may determine an available spatial position candidate among the spatial position candidates included in the storage unit of the predetermined size, and may determine the motion information of the available spatial position candidate as the representative motion information of the storage unit of the predetermined size.

In addition, in determining an available spatial position candidate, the decoding apparatus 200 may traverse the spatial position candidates, which include at least one of the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block, in a predetermined order and determine whether each spatial position candidate is available.

When there are no available spatial position candidates among the spatial position candidates included in the storage unit of the predetermined size, the decoding apparatus 200 may determine the representative motion information stored in the storage unit of the predetermined size as a default value. For example, the default value may be a motion vector having a value of zero, or the intra mode.

In another embodiment, the decoding apparatus 200 may determine the representative motion information stored in the storage unit of the predetermined size based on the area of the blocks (that is, the prediction blocks) included in the storage unit of the predetermined size.

The decoding apparatus 200 may construct a motion information candidate list for the current block including temporal motion information (S1810).

In one embodiment, when the merge mode is applied, the decoding apparatus 200 may generate a merge candidate list using a motion vector corresponding to a col block which is a temporal neighboring block. That is, when the merge mode is applied, the merge candidate list may be configured as the motion information candidate list. A temporal merge candidate (i.e., temporal motion information) in the merge candidate list may use motion information compressed and stored for the col block in the col picture. In other words, the temporal merge candidate (i.e., temporal motion information) refers to the representative motion information compressed and stored for the storage unit including the col block in the col picture.

In another embodiment, when the Motion Vector Prediction (MVP) mode is applied, the decoding apparatus 200 may generate a motion vector predictor candidate list using a motion vector corresponding to a col block which is a temporal neighboring block. That is, when the MVP mode is applied, the motion vector predictor candidate list may be configured as the motion information candidate list. The temporal motion vector predictor candidate (ie, temporal motion information) in the motion vector predictor candidate list may use motion information compressed and stored for the col block in the col picture. In other words, the temporal motion vector predictor candidate (ie, temporal motion information) refers to representative motion information compressed and stored with respect to the corresponding storage unit including the col block in the col picture.

The decoding apparatus 200 may derive the motion information of the current block based on the motion information candidate list (S1820).

In an embodiment, the decoding apparatus 200 may select the motion information candidate indicated by the candidate index from among the motion information candidates included in the motion information candidate list, and may derive the selected candidate as the motion information of the current block. In this case, the candidate index information may be signaled from the encoding apparatus 100.

The decoding apparatus 200 may perform prediction on the current block based on the motion information of the current block (S1830).

That is, the decoding apparatus 200 may derive the prediction samples of the current block by performing inter prediction on the current block based on the motion information of the current block. In addition, the decoding apparatus 200 may derive the residual samples based on the residual information of the current block, and generate a reconstructed picture based on the derived residual samples and the prediction samples. In this case, the residual information may be signaled from the encoding apparatus 100.
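
The reconstruction step may be sketched as follows: each reconstructed sample is the sum of a prediction sample and a residual sample. Clipping the result to the valid sample range and the 8-bit default are assumptions added for the example, and the function name is hypothetical.

```python
from typing import List

Samples = List[List[int]]


def reconstruct(pred: Samples, resid: Samples, bit_depth: int = 8) -> Samples:
    """Add residual samples to prediction samples, clipped to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```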

In the above-described embodiment, the methods are described based on a flowchart as a series of steps or blocks, but the present invention is not limited to the order of steps, and any steps may occur in a different order or simultaneously from other steps as described above. have. In addition, those skilled in the art will appreciate that the steps shown in the flowcharts are not exclusive and that other steps may be included or one or more steps in the flowcharts may be deleted without affecting the scope of the present invention.

The embodiments described herein may be implemented and performed on a processor, microprocessor, controller, or chip. For example, the functional units shown in each drawing may be implemented and performed on a computer, processor, microprocessor, controller, or chip. In this case, information for implementation (ex. Information on instructions) or an algorithm may be stored in a digital storage medium.

In addition, the decoding apparatus and encoding apparatus to which the present invention is applied include a multimedia broadcasting transmitting and receiving device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video chat device, a real time communication device such as video communication, and mobile streaming. Devices, storage media, camcorders, video on demand (VoD) service providing devices, OTT video (Over the top video) devices, internet streaming service providing devices, 3D (3D) video devices, video telephony video devices, vehicle terminals (ex Vehicle terminals, airplane terminals, ship terminals, etc.) and medical video devices, etc., and may be used to process video signals or data signals. For example, the OTT video device may include a game console, a Blu-ray player, an internet access TV, a home theater system, a smartphone, a tablet PC, a digital video recorder (DVR), and the like.

In addition, the processing method to which the present invention is applied can be produced in the form of a program executed by a computer, and can be stored in a computer-readable recording medium. Multimedia data having a data structure according to the present invention can also be stored in a computer-readable recording medium. The computer-readable recording medium includes all kinds of storage devices and distributed storage devices in which computer-readable data is stored. The computer-readable recording medium may include, for example, a Blu-ray disc (BD), a universal serial bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device. The computer-readable recording medium also includes media embodied in the form of a carrier wave (for example, transmission over the Internet). In addition, the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.

In addition, an embodiment of the present invention may be implemented as a computer program product using program code, and the program code may be executed on a computer according to an embodiment of the present invention. The program code may be stored on a computer-readable carrier.

The content streaming system to which the present invention is applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.

The encoding server compresses content input from multimedia input devices such as a smartphone, a camera, or a camcorder into digital data to generate a bitstream, and transmits the bitstream to the streaming server. As another example, when multimedia input devices such as smartphones, cameras, or camcorders directly generate a bitstream, the encoding server may be omitted.

The bitstream may be generated by an encoding method or a bitstream generation method to which the present invention is applied, and the streaming server may temporarily store the bitstream in the process of transmitting or receiving the bitstream.

The streaming server transmits multimedia data to the user device based on a user request made through the web server, and the web server serves as an intermediary that informs the user of what services are available. When a user requests a desired service from the web server, the web server forwards the request to the streaming server, and the streaming server transmits multimedia data to the user. The content streaming system may include a separate control server, in which case the control server controls commands and responses between devices in the content streaming system.

The streaming server may receive content from the media storage and/or the encoding server. For example, when the content is received from the encoding server, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server may store the bitstream for a predetermined time.

Examples of the user device include a mobile phone, a smartphone, a laptop computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation device, a slate PC, a tablet PC, an ultrabook, a wearable device (for example, a smartwatch, smart glasses, or a head mounted display (HMD)), a digital TV, a desktop computer, digital signage, and the like.

Each server in the content streaming system may be operated as a distributed server, in which case data received by each server may be processed in a distributed manner.

Claims

An image decoding method performed by a decoding apparatus, the method comprising:

deriving temporal motion information for a current block based on a temporal neighboring block of the current block;

constructing a motion information candidate list for the current block, the list including the temporal motion information;

deriving motion information of the current block based on the motion information candidate list; and

performing prediction on the current block based on the motion information of the current block,

wherein the temporal neighboring block is a collocated block (col block) positioned corresponding to the current block in a reference picture, and

wherein the temporal motion information is the same as the representative motion information stored in the storage unit corresponding to the col block, among the representative motion information compressed and stored in predetermined storage units for the reference picture.
The method of claim 1, further comprising:

determining representative motion information to be stored for each predetermined storage unit with respect to blocks in the reference picture; and

storing the determined representative motion information as the motion information of the blocks included in the predetermined storage unit,

wherein the determining of the representative motion information comprises

determining available spatial position candidates among spatial position candidates included in the predetermined storage unit, and determining motion information of the available spatial position candidates as the representative motion information stored in the predetermined storage unit.
The method of claim 2,

wherein the spatial position candidates include

at least one of a C candidate block located at the center (C) of the predetermined storage unit, a TL candidate block located at the top-left (TL), a TR candidate block located at the top-right (TR), a BL candidate block located at the bottom-left (BL), and a BR candidate block located at the bottom-right (BR).
The method of claim 3,

wherein the determining of the representative motion information comprises

determining whether each spatial position candidate is available while traversing, in a predetermined order, the spatial position candidates including at least one of the C candidate block, the TL candidate block, the TR candidate block, the BL candidate block, and the BR candidate block.
The method of claim 2,

wherein the determining of the representative motion information comprises,

when there is no available spatial position candidate among the spatial position candidates included in the predetermined storage unit, determining the representative motion information stored in the predetermined storage unit to be a default value.
The method of claim 5,

wherein the default value is 0 or intra mode.
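
The candidate-based selection recited above can be sketched as follows. This is only an illustration under stated assumptions: the traversal order C, TL, TR, BL, BR, the `(mv_x, mv_y, ref_idx)` tuple layout, the use of `None` to mark an unavailable candidate, and the zero-motion default are all hypothetical choices; the text states only that the order is predetermined and that the default is 0 or intra mode.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical layout of one piece of motion information: (mv_x, mv_y, ref_idx).
MotionInfo = Tuple[int, int, int]

ASSUMED_ORDER: Sequence[str] = ("C", "TL", "TR", "BL", "BR")  # assumed order
ZERO_MOTION: MotionInfo = (0, 0, 0)  # assumed default value


def pick_representative(candidates: Dict[str, Optional[MotionInfo]],
                        order: Sequence[str] = ASSUMED_ORDER,
                        default: MotionInfo = ZERO_MOTION) -> MotionInfo:
    """Return the motion information of the first available candidate.

    A candidate is treated as unavailable when it is missing or None
    (for example, because that position holds no inter motion information).
    """
    for position in order:
        info = candidates.get(position)
        if info is not None:
            return info
    # No available spatial position candidate: fall back to the default value.
    return default


# Hypothetical storage unit in which C and TL are unavailable.
unit = {"C": None, "TL": None, "TR": (4, -2, 0), "BL": (1, 1, 1), "BR": None}
chosen = pick_representative(unit)
fallback = pick_representative({})
```

Here `chosen` is the TR candidate's motion information because TR is the first available position in the assumed order, and `fallback` is the assumed zero-motion default.
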
The method of claim 1, further comprising:

determining representative motion information to be stored for each predetermined storage unit with respect to blocks in the reference picture; and

storing the determined representative motion information as the motion information of the blocks included in the predetermined storage unit,

wherein the determining of the representative motion information comprises

determining the representative motion information to be stored in the predetermined storage unit based on the areas of the blocks included in the predetermined storage unit.
The method of claim 7,

wherein the determining of the representative motion information comprises

determining the block having the largest area among the blocks included in the predetermined storage unit, and determining the motion information of the block having the largest area as the representative motion information stored in the predetermined storage unit.
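
The area-based selection recited above can be sketched in a few lines. The representation of a block as a `(width, height, motion_info)` tuple and the function name `representative_by_area` are assumptions made for illustration; how ties between equally large blocks are resolved is not specified in the text, so this sketch simply keeps the first such block.

```python
from typing import List, Optional, Tuple

MotionInfo = Tuple[int, int, int]     # hypothetical (mv_x, mv_y, ref_idx)
Block = Tuple[int, int, MotionInfo]   # hypothetical (width, height, motion_info)


def representative_by_area(blocks: List[Block]) -> Optional[MotionInfo]:
    """Return the motion information of the block with the largest area.

    Returns None when the storage unit contains no blocks. On a tie,
    the first block with the largest area is kept (an arbitrary choice).
    """
    if not blocks:
        return None
    _, _, info = max(blocks, key=lambda b: b[0] * b[1])
    return info


# Hypothetical storage unit covered by three blocks of different sizes.
blocks = [(4, 8, (1, 0, 0)), (16, 8, (3, -1, 0)), (8, 8, (5, 5, 1))]
rep = representative_by_area(blocks)
```

The 16x8 block has the largest area (128 samples), so its motion information becomes the representative value for the unit.
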
The method of claim 1,

wherein the size of the predetermined storage unit is larger than the size of a minimum prediction unit.
The method of claim 1,

wherein the size of the predetermined storage unit is larger than 4x4.
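
How a col block position maps to a storage unit larger than 4x4 can be sketched as follows. The 16x16 unit size, the use of the top-left sample position of the col block, and the row-major nested-list layout of the compressed motion field are assumptions of this sketch and are not fixed by the text.

```python
from typing import List, Tuple

MotionInfo = Tuple[int, int, int]  # hypothetical (mv_x, mv_y, ref_idx)

ASSUMED_UNIT = 16  # assumed storage unit size in samples (larger than 4x4)


def temporal_motion(compressed: List[List[MotionInfo]],
                    x: int, y: int,
                    unit: int = ASSUMED_UNIT) -> MotionInfo:
    """Look up the representative motion information for sample (x, y).

    `compressed` holds one entry per storage unit, indexed [row][column],
    so every position inside the same unit returns the same entry.
    """
    return compressed[y // unit][x // unit]


# Hypothetical 32x32 reference picture stored as 2x2 storage units.
compressed = [
    [(1, 0, 0), (2, 0, 0)],
    [(0, 3, 1), (0, 0, 0)],
]
a = temporal_motion(compressed, 20, 5)   # falls in row 0, column 1
b = temporal_motion(compressed, 3, 31)   # falls in row 1, column 0
```

With this layout only one entry per 16x16 area needs to be kept for the reference picture instead of one entry per minimum-size block.
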
A video encoding method performed by an encoding apparatus, the method comprising:

deriving temporal motion information for a current block based on a temporal neighboring block of the current block;

constructing a motion information candidate list for the current block, the list including the temporal motion information;

deriving motion information of the current block based on the motion information candidate list;

performing prediction on the current block based on the motion information of the current block; and

encoding the motion information of the current block,

wherein the temporal neighboring block is a collocated block (col block) positioned corresponding to the current block in a reference picture, and

wherein the temporal motion information is the same as the representative motion information stored in the storage unit corresponding to the col block, among the representative motion information compressed and stored in predetermined storage units for the reference picture.