WO2012033327A2

WO2012033327A2 - Image-decoding method and apparatus including a method for configuring a reference picture list

Info

Publication number: WO2012033327A2
Application number: PCT/KR2011/006587
Authority: WO
Inventors: 임재현; 전용준; 박승욱; 김정선; 전병문; 박준영; 최영희; 성재원
Original assignee: 엘지전자 주식회사
Priority date: 2010-09-08
Filing date: 2011-09-06
Publication date: 2012-03-15
Also published as: WO2012033327A3

Abstract

Provided is an image-decoding method. Image information, including information indicating whether or not a current slice is a GPB slice, is received from an encoder, and reference picture list 0 and reference picture list 1 are generated on the basis of the information indicating whether or not a current slice is a GPB slice. According to the present invention, the amount of information transmitted from an encoder to a decoder is reduced, and encoding efficiency is enhanced.

Description

An image decoding method and apparatus including a method of constructing a reference picture list

The present invention relates to image processing, and more particularly, to a decoding method and apparatus including a method of constructing a reference picture list.

Recently, demand for high-resolution, high-quality images such as high definition (HD) and ultra high definition (UHD) images has been increasing in various fields. The higher the resolution and quality of image data, the greater the amount of information or the bit rate that must be transmitted relative to existing image data. Therefore, when image data is transmitted over a medium such as a conventional wired/wireless broadband line, or stored on a conventional storage medium, transmission and storage costs increase. High-efficiency image compression techniques can be used to solve these problems.

Image compression techniques include inter prediction, which predicts pixel values in the current picture from a picture before and/or after the current picture; intra prediction, which predicts pixel values in the current picture using pixel information within the current picture; and entropy encoding, which assigns short codes to values with a high frequency of appearance and long codes to values with a low frequency of appearance. By using such image compression techniques, image data can be effectively compressed and then transmitted or stored.

In the inter prediction technology, since a pixel value included in a current picture is predicted from a picture before and / or after a current picture, that is, a reference picture, information about a reference picture list may be generated and transmitted.

An object of the present invention is to provide an image decoding method and apparatus including a method of constructing a reference picture list.

Another object of the present invention is to provide an image decoding method and apparatus including a method of constructing a reference picture list capable of reducing the amount of information of the maximum number of reference pictures transmitted to a decoder.

Another technical problem of the present invention is to provide an image decoding method and apparatus including a combined list construction method capable of reducing the amount of information transmitted to a decoder.

Another object of the present invention is to provide an image decoding method and apparatus including a method of constructing a reference picture list capable of reducing the amount of inter prediction mode information transmitted to a decoder without generating a combined list.

Another technical problem of the present invention is to provide an image decoding method and apparatus including a method of constructing a reference picture list capable of reducing the amount of information transmitted to a decoder for a generalized P and B (GPB) slice.

Another object of the present invention is to provide an image decoding method and apparatus capable of reducing the complexity of deriving a temporal motion information candidate for a low delay B slice.

Another object of the present invention is to provide an image decoding method and apparatus including a method of constructing a reference picture list capable of reducing the amount of information of reference picture index information transmitted to a decoder.

One embodiment of the present invention is a video decoding method. The method includes receiving reference picture list information including first flag information, deriving information on the maximum number of reference pictures in a reference picture list based on the first flag information, and generating the reference picture list by using the maximum reference picture number information.

The first flag information indicates whether a first maximum number of reference pictures allowed for the current picture and a second maximum number of reference pictures allowed for the current slice are the same, and the first flag information is information obtained from Sequence Parameter Set (SPS) information or Picture Parameter Set (PPS) information.

The first flag information may indicate whether the maximum number of reference pictures allowed for all pictures and slices in the current sequence is the same. The first flag information may be information obtained from sequence parameter set information.
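The derivation step can be sketched as follows. This is a minimal illustration, not the patent's syntax: the names `same_max_flag`, `picture_max` and `read_slice_max` are hypothetical, and the sketch assumes that the slice-level maximum is simply not transmitted when the flag says both maxima are equal.

```python
def derive_max_num_ref_pics(same_max_flag, picture_max, read_slice_max):
    """Derive the maximum number of reference pictures for the current slice.

    same_max_flag  -- hypothetical name for the first flag information
    picture_max    -- maximum allowed for the current picture
    read_slice_max -- callable that parses a slice-level value (assumed to be
                      present in the bitstream only when the flag is not set)
    """
    if same_max_flag:
        # Nothing extra is parsed: the picture-level value is reused.
        return picture_max
    return read_slice_max()


def build_reference_list(candidates, max_num_ref_pics):
    """Keep at most max_num_ref_pics entries of the candidate pictures."""
    return list(candidates)[:max_num_ref_pics]
```

When the flag is set, the decoder reuses a value it already has, which is where the saving in transmitted information would come from under this assumption.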

Another embodiment of the present invention is a video decoding method. The method includes generating reference picture list 0 and reference picture list 1 for the current picture; determining, for each reference picture included in reference picture list 0 and reference picture list 1, its temporal level and its temporal distance from the current picture; and generating a combined list by inserting the reference pictures based on the temporal level and the temporal distance. In generating the combined list, a reference picture having a small temporal distance is inserted first, a reference picture having a low temporal level is inserted first among reference pictures having the same temporal distance, and a reference picture included in reference picture list 0 may be inserted first among reference pictures having the same temporal distance and the same temporal level.
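The three-level insertion priority (temporal distance, then temporal level, then list 0 before list 1) can be expressed as a sort key. The sketch below is illustrative only: the `RefPic` type is hypothetical, temporal distance is assumed to be the absolute difference in picture order count (POC), and dropping a picture that already appears in the combined list is an assumption rather than something stated in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefPic:
    poc: int             # picture order count (assumed measure of time)
    temporal_level: int  # level in the hierarchical coding structure
    list_id: int         # 0 for reference picture list 0, 1 for list 1


def build_combined_list(current_poc, list0, list1):
    """Order reference pictures by distance, then level, then source list."""
    ordered = sorted(
        list0 + list1,
        key=lambda p: (abs(current_poc - p.poc), p.temporal_level, p.list_id),
    )
    combined, seen = [], set()
    for pic in ordered:
        if pic.poc in seen:   # assumption: one entry per distinct picture
            continue
        seen.add(pic.poc)
        combined.append(pic)
    return combined


list0 = [RefPic(3, 2, 0), RefPic(2, 1, 0)]
list1 = [RefPic(5, 1, 1), RefPic(6, 1, 1)]
combined = build_combined_list(4, list0, list1)
```

Here POC 5 and POC 3 are equally close to the current picture (POC 4), so the lower temporal level of POC 5 places it first.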

The method may further include receiving second flag information indicating whether a reference picture having a low temporal level is inserted first or a reference picture included in reference picture list 0 is always inserted first.

Another embodiment of the present invention is a video decoding method. The method includes generating reference picture list 0 and reference picture list 1 for a current picture; determining, for each reference picture included in reference picture list 0 and reference picture list 1, its quantization parameter (QP) and its temporal distance from the current picture; and generating a combined list by inserting the reference pictures based on the quantization parameter and the temporal distance. In generating the combined list, a reference picture having a small temporal distance is inserted first, a reference picture having a low quantization parameter is inserted first among reference pictures having the same temporal distance, and a reference picture included in reference picture list 0 may be inserted first among reference pictures having the same temporal distance and the same quantization parameter.

Another embodiment of the present invention is a video decoding method using inter prediction in uni-prediction mode. The method includes generating reference picture list 0 and reference picture list 1 for a current picture; selecting a reference picture for a current prediction target block in the current picture by scanning the reference pictures included in reference picture list 0 and reference picture list 1; and deriving motion information on the current prediction target block using the selected reference picture. In selecting the reference picture, the reference pictures of reference picture list 0 and reference picture list 1 are scanned alternately, starting from a reference picture of reference picture list 0, in descending order of the reference picture index. When the reference picture at the current position in the scan order is the same as the reference picture at the previous position, scanning of the reference picture at the current position may be skipped.
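A minimal sketch of the alternating scan follows. It makes two assumptions that go beyond the text: pictures are identified by plain integers (for example POC values), and "the previous order" is read as the immediately preceding scanned picture. The function consumes each list in the order it is given, so the caller decides the index order.

```python
from itertools import zip_longest


def scan_reference_pictures(list0, list1):
    """Visit list 0 and list 1 alternately, starting with list 0.

    A picture identical to the one scanned immediately before it is
    skipped, so no combined list has to be built or stored.
    """
    scanned = []
    previous = None
    for pic0, pic1 in zip_longest(list0, list1):
        for pic in (pic0, pic1):
            if pic is None:        # one list is shorter than the other
                continue
            if pic == previous:    # same picture as the previous position
                continue
            scanned.append(pic)
            previous = pic
    return scanned


order = scan_reference_pictures([8, 4], [8, 12])
```

In this example both lists start with picture 8, so the list 1 entry is skipped and the scan visits 8, 4 and 12.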

Another embodiment of the present invention is a video decoding method. The method includes receiving, from an encoder, image information including information indicating whether a current slice is a Generalized P and B (GPB) slice, and generating reference picture list 0 and reference picture list 1 based on the information indicating whether the current slice is a GPB slice. In generating reference picture list 0 and reference picture list 1, if the current slice is a GPB slice, reference picture list 1 is generated using the reference picture list 0 related information included in the image information. A GPB slice is a slice in which reference picture list 0 and reference picture list 1 are the same.

The information indicating whether the current slice is a GPB slice may be information indicated by a third flag. The third flag is a flag generated by the encoder and transmitted to the decoder, and when the third flag indicates that the current slice is a GPB slice, the image information may not include reference picture list 1 related information.
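The decoder-side behaviour for a GPB slice can be sketched as below. The function and parameter names are hypothetical, and the list "related information" is modelled simply as a sequence of picture identifiers.

```python
def build_reference_lists(is_gpb_slice, list0_info, list1_info=None):
    """Generate reference picture list 0 and list 1 for the current slice.

    is_gpb_slice -- value of the flag (or slice type) marking a GPB slice
    list0_info   -- list 0 related information from the bitstream
    list1_info   -- list 1 related information; absent for a GPB slice
    """
    list0 = list(list0_info)
    if is_gpb_slice:
        # Both lists are identical for a GPB slice, so list 1 is derived
        # from list 0 and no list 1 information needs to be transmitted.
        list1 = list(list0)
    else:
        if list1_info is None:
            raise ValueError("list 1 information is required for a non-GPB slice")
        list1 = list(list1_info)
    return list0, list1
```

For a GPB slice the call `build_reference_lists(True, [3, 2, 1])` yields two equal lists from a single set of transmitted information.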

The third flag may be a flag generated by an encoder and transmitted to a decoder only when the current slice is a B picture.

The information indicating whether the current slice is a GPB slice may be information indicated by a fourth flag indicating whether reference picture list 0 and reference picture list 1 are the same for the current slice. The fourth flag is a flag generated by the encoder and transmitted to the decoder, and when the fourth flag indicates that reference picture list 0 and reference picture list 1 are the same for the current slice, the image information may not include reference picture list 1 related information.

The GPB slice may be a slice type defined separately from the I slice, the P slice, and the B slice, in which case the B slice is a slice in which reference picture list 0 and reference picture list 1 are not the same. The information indicating whether the current slice is a GPB slice is then the information indicated by the slice type, and when the slice type of the current slice is the GPB slice, the image information may not include reference picture list 1 related information.

Another embodiment of the present invention is a video decoding method. The method includes receiving image information including information indicating whether a current slice is a low delay B slice; deriving, based on the information indicating whether the current slice is a low delay B slice, a temporal motion information candidate by checking a reference picture list of a co-located block for a current prediction target block; and performing inter prediction using the temporal motion information candidate and the image information. The low delay B slice is a slice having only forward reference pictures, and the co-located block is a block located within a reference picture of the current prediction target block. When the current slice is a low delay B slice, in deriving the temporal motion information candidate, the reference picture list of the co-located block that is the same as the reference picture list of the current prediction target block is checked first.

Another embodiment of the present invention is a video decoding method. The method includes receiving, from the encoder, image information including fifth flag information indicating whether index L0 and index L1 indicate the same reference picture for the current prediction target block, and generating index L0 and index L1 based on the fifth flag information. Index L0 is the reference picture index of the reference picture referenced by the current prediction target block in reference picture list 0, and index L1 is the reference picture index of the reference picture referenced by the current prediction target block in reference picture list 1.

In generating index L0 and index L1, when the reference pictures indicated by index L0 and index L1 are the same, the image information does not include index L1 related information, and index L1 may be generated using the index L0 related information included in the image information.

In generating index L0 and index L1, when the reference pictures indicated by index L0 and index L1 are not the same, index L1 is generated as follows. If the index value of index L1 is larger than index Co, index L1 is generated by adding 1 to the index value of the index L1 related information included in the image information. If the index value of index L1 is smaller than index Co, index L1 may be generated by copying the index value of the index L1 related information included in the image information. Here, index Co may be the index value in reference picture list 1 that indicates the same picture as the reference picture indicated by index L0.
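The index L1 rule can be illustrated with a small round trip. Only the decoder-side rule comes from the text; the encoder-side function is an inferred counterpart, and testing the transmitted value with `>=` is an assumption that makes the two rules consistent (an actual index above Co is transmitted as one less, so it arrives as a value of at least Co).

```python
def encode_index_l1(index_l1, index_co):
    """Inferred encoder side: remove Co from the range of possible values."""
    if index_l1 == index_co:
        raise ValueError("equal pictures are signalled by the flag instead")
    return index_l1 - 1 if index_l1 > index_co else index_l1


def decode_index_l1(coded_value, index_co):
    """Decoder side: add 1 above Co, copy the value below Co."""
    if coded_value >= index_co:
        return coded_value + 1   # the actual index L1 was larger than Co
    return coded_value           # the actual index L1 was smaller than Co


index_co = 2
round_trip = [decode_index_l1(encode_index_l1(i, index_co), index_co)
              for i in (0, 1, 3, 4)]
```

Because index L1 is known to differ from Co in this branch, the transmitted value never needs to represent Co itself, which shortens the range of values to be coded by one.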

In the encoding and decoding of an image, the amount of information transmitted is reduced and the coding efficiency is improved.

FIG. 1 is a block diagram schematically illustrating an image encoding apparatus according to an embodiment of the present invention.

FIG. 2 is a conceptual diagram schematically illustrating a predictor according to an embodiment of the present invention.

FIG. 3 is a conceptual diagram schematically illustrating an example of a quad tree structure of a processing unit in a system to which the present invention is applied.

FIG. 4 is a block diagram schematically illustrating an image decoding apparatus according to an embodiment of the present invention.

FIG. 5 is a conceptual diagram schematically illustrating a predictor of an image decoding apparatus according to an embodiment of the present invention.

FIG. 6 is a conceptual diagram schematically illustrating an embodiment of a prediction method that may be used in a P picture and a B picture.

FIG. 7 is a conceptual diagram schematically comparing the prediction methods of a conventional P picture, a conventional B picture, and a GPB.

FIG. 8 is a flowchart schematically illustrating a method for constructing a reference picture list in the encoder and transmitting reference picture list information according to an embodiment of the present invention.

FIG. 9 is a flowchart schematically illustrating a method of generating a reference picture list by receiving reference picture list information in a decoder according to an embodiment of the present invention.

FIG. 10 is a conceptual diagram schematically illustrating an embodiment of a method of constructing a combined list.

FIG. 11 is a conceptual diagram schematically illustrating an embodiment of a hierarchical coding structure.

FIG. 12 is a conceptual diagram schematically illustrating an embodiment of a method of generating a combined list using a temporal level according to the present invention.

FIG. 13 is a flowchart illustrating an embodiment of a method for generating a combined list according to the present invention.

FIG. 14 is a conceptual diagram schematically illustrating a reference picture selection method according to an embodiment of the present invention.

FIG. 15 is a flowchart schematically illustrating a method for deriving motion information according to an embodiment of the present invention.

FIG. 16 is a conceptual diagram schematically illustrating an embodiment of a reference picture and a reference picture list for a GPB and low delay B picture.

FIG. 17 is a flowchart schematically illustrating a method of transmitting a reference picture list of an encoder according to an embodiment of the present invention.

FIG. 18 is a flowchart schematically illustrating a method of generating a reference picture list of a decoder according to an embodiment of the present invention.

FIG. 19 is a conceptual diagram schematically illustrating an embodiment of a picture in which a reference picture index of list 0 and a reference picture index of list 1 indicate the same reference picture.

FIG. 20 is a flowchart schematically illustrating a method of transmitting a reference picture index of an encoder according to an embodiment of the present invention.

FIG. 21 is a flowchart schematically illustrating a reference picture index generation method of a decoder according to an embodiment of the present invention.

As the present invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the invention to the specific embodiments. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the spirit of the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to exclude in advance the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Meanwhile, the components in the drawings described in the present invention are shown independently for convenience in describing their different characteristic functions in the image encoding/decoding apparatus; this does not mean that each component is implemented as separate hardware or separate software. For example, two or more of the components may be combined to form one component, or one component may be divided into a plurality of components. Embodiments in which the components are integrated and/or separated are also included in the scope of the present invention as long as they do not depart from the spirit of the present invention.

In addition, some of the components may not be essential components for performing the essential functions of the present invention, but may be optional components for improving performance. The present invention can be implemented with only the components essential for realizing its essence, excluding the components used merely for improving performance, and a structure including only the essential components, excluding the optional components used for improving performance, is also included within the scope of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and redundant description of the same components is omitted.

FIG. 1 is a block diagram schematically illustrating an image encoding apparatus according to an embodiment of the present invention. Referring to FIG. 1, the image encoding apparatus 100 may include a picture division unit 105, a predictor 110, a transform unit 115, a quantization unit 120, a reordering unit 125, an entropy encoder 130, an inverse quantization unit 135, an inverse transformer 140, a filter unit 145, and a memory 150.

The picture division unit 105 may divide the input picture into at least one processing unit. In this case, the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU).

The predictor 110 includes an inter prediction unit that performs inter prediction and an intra prediction unit that performs intra prediction. The predictor 110 generates a prediction block by performing prediction on a processing unit of the picture provided by the picture division unit 105. The processing unit of the picture in the predictor 110 may be a coding unit, a transform unit, or a prediction unit. The predictor 110 may also determine whether the prediction performed on the processing unit is inter prediction or intra prediction, and determine the specific details (e.g., the prediction mode) of each prediction method. In this case, the processing unit on which prediction is performed may differ from the processing unit for which the prediction method and its details are determined. For example, the prediction method and the prediction mode may be determined in units of prediction units, while prediction itself may be performed in units of transform units. The residual value (residual block) between the generated prediction block and the original block is input to the transform unit 115. In addition, the prediction mode information and motion vector information used for prediction are encoded by the entropy encoder 130 together with the residual value and transmitted to the decoder.

The transform unit 115 performs a transform on the residual block in units of transform units and generates transform coefficients. The unit of transformation in the transform unit 115 may be a transform unit (TU) and may have a quad tree structure. In this case, the size of the transform unit may be determined within a range of a predetermined maximum and minimum size. The transform unit 115 may transform the residual block using a discrete cosine transform (DCT) and/or a discrete sine transform (DST).

The quantization unit 120 may generate quantization coefficients by quantizing the residual values transformed by the transformation unit 115. The value calculated by the quantization unit 120 is provided to the inverse quantization unit 135 and the reordering unit 125.

The reordering unit 125 rearranges the quantization coefficients provided from the quantization unit 120. By rearranging the quantization coefficients, the efficiency of encoding in the entropy encoder 130 may be increased. The reordering unit 125 may rearrange the quantization coefficients in the form of a two-dimensional block into a one-dimensional vector form through a coefficient scanning method. The reordering unit 125 may increase the entropy coding efficiency of the entropy encoder 130 by changing the order of coefficient scanning based on probabilistic statistics of coefficients transmitted from the quantization unit.
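The text does not name a particular coefficient scanning method. As one common example, the sketch below uses a zigzag scan to turn a square two-dimensional block into a one-dimensional vector, together with the inverse operation a decoder would apply. The choice of zigzag and all function names are assumptions made for illustration.

```python
def zigzag_order(n):
    """Return the (row, column) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        s = r + c                      # index of the anti-diagonal
        return (s, r if s % 2 else c)  # alternate direction per diagonal
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def scan_block(block):
    """2-D block of quantized coefficients -> 1-D vector."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def unscan_vector(vector, n):
    """1-D vector -> 2-D block (the decoder-side inverse scan)."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(vector, zigzag_order(n)):
        block[r][c] = value
    return block


block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
vector = scan_block(block)
```

A scan like this tends to place low-frequency coefficients first and group trailing zeros together, which is what makes the subsequent entropy coding more efficient.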

The entropy encoder 130 may perform entropy encoding on the quantized coefficients rearranged by the reordering unit 125. Entropy encoding may use, for example, an encoding method such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC). The entropy encoder 130 may encode various information received from the reordering unit 125 and the predictor 110, such as quantization coefficient information, block type information, prediction mode information, division unit information, prediction unit information, transmission unit information, motion vector information, reference picture information, block interpolation information, and filtering information of the coding unit.
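Of the methods listed, Exponential Golomb coding is simple enough to show in a few lines. The sketch below implements the unsigned, order-0 variant on bit strings; representing bits as a Python string is a simplification chosen for readability, not how a real bitstream writer works.

```python
def exp_golomb_encode(value):
    """Encode a non-negative integer as an order-0 Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb needs a non-negative value")
    binary = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(binary) - 1) + binary   # prefix of leading zeros


def exp_golomb_decode(bits):
    """Decode one order-0 Exp-Golomb codeword from the start of bits."""
    zeros = 0
    while bits[zeros] == "0":                 # count the zero prefix
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codes = [exp_golomb_encode(v) for v in range(4)]
```

Small values receive short codewords ("1" for 0, "010" for 1), which suits syntax elements whose small values are the most frequent.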

The inverse quantization unit 135 inverse quantizes the quantized values in the quantization unit 120, and the inverse transformer 140 inversely transforms the inverse quantized values in the inverse quantization unit 135. The residual value generated by the inverse quantization unit 135 and the inverse transformer 140 may be combined with the prediction block predicted by the prediction unit 110 to generate a reconstructed block.
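The reason the encoder contains its own inverse path is that it must predict from the same reconstructed data the decoder will have. The sketch below illustrates this with a uniform scalar quantizer and a single step size, both of which are assumptions, and it omits the transform entirely so that only the quantize, dequantize and add-prediction steps remain.

```python
def quantize(values, step):
    """Map residual values to integer levels (assumed uniform quantizer)."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Map integer levels back to approximate residual values."""
    return [level * step for level in levels]


def reconstruct(prediction, residual):
    """Reconstructed samples = prediction + (lossy) residual."""
    return [p + r for p, r in zip(prediction, residual)]


prediction = [100, 100, 100]
residual = [9, -7, 3]
levels = quantize(residual, step=4)
recon = reconstruct(prediction, dequantize(levels, step=4))
```

The reconstructed samples differ from the original `prediction + residual` because quantization is lossy; storing this reconstruction keeps the encoder's reference data aligned with the decoder's.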

The filter unit 145 may apply a deblocking filter and / or an adaptive loop filter (ALF) to the reconstructed picture.

The deblocking filter may remove block distortion generated at the boundary between blocks in the reconstructed picture. The adaptive loop filter (ALF) may perform filtering based on a value obtained by comparing a reconstructed image with an original image after the block is filtered through a deblocking filter. ALF may be performed only when high efficiency is applied.

Meanwhile, the filter unit 145 may not apply filtering to the reconstructed block used for inter prediction.

The memory 150 may store the reconstructed block or the picture calculated by the filter unit 145. The reconstructed block or picture stored in the memory 150 may be provided to the predictor 110 that performs inter prediction.

A coding unit (CU) is a unit in which coding / decoding of a picture is performed and may be divided with a depth based on a quad tree structure. The coding unit may have various sizes, such as 64x64, 32x32, 16x16, and 8x8.

The encoder may transmit information about a largest coding unit (LCU) and a smallest coding unit (SCU) to the decoder. Information on the number of times splitting is possible (depth information) may be transmitted to the decoder together with the information about the largest coding unit and/or the smallest coding unit. Information on whether a coding unit is split based on the quad tree structure may be transmitted from the encoder to the decoder through flag information such as a split flag.

FIG. 2 is a conceptual diagram schematically illustrating a predictor according to an embodiment of the present invention. Referring to FIG. 2, the predictor 200 may include an inter prediction unit 210 and an intra prediction unit 220.

The inter prediction unit 210 may generate a prediction block by performing prediction based on information of at least one of a previous picture or a subsequent picture of the current picture. In addition, the intra prediction unit 220 may generate a prediction block by performing prediction based on pixel information in the current picture.

The inter prediction unit 210 selects a reference picture for the prediction unit and selects a reference block having the same size as the prediction unit in units of integer pixel samples. Subsequently, the inter prediction unit 210 generates, in sub-integer sample units such as 1/2-pixel and 1/4-pixel sample units, a prediction block that is most similar to the current prediction unit, so that the residual signal is minimized and the magnitude of the motion vector to be encoded is also minimized. In this case, the motion vector may be expressed in units finer than an integer pixel. For example, the motion vector may be expressed in units of 1/4 pixel for luminance pixels and in units of 1/8 pixel for chrominance pixels.
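As a small worked example of sub-pixel motion vectors, the sketch below splits a vector component stored as an integer count of fractional units into its whole-pixel offset and its fractional phase. Storing the vector as an integer in 1/4-pixel or 1/8-pixel units is an assumed representation, and the function name is hypothetical.

```python
def split_motion_vector(component, fractional_bits):
    """Split a motion vector component into (whole pixels, fractional phase).

    fractional_bits = 2 models 1/4-pixel units (luminance),
    fractional_bits = 3 models 1/8-pixel units (chrominance).
    """
    whole = component >> fractional_bits               # floor division
    phase = component & ((1 << fractional_bits) - 1)   # 0 .. 2**bits - 1
    return whole, phase


luma = split_motion_vector(9, 2)       # 9/4 pixel  = 2 + 1/4
negative = split_motion_vector(-5, 2)  # -5/4 pixel = -2 + 3/4
chroma = split_motion_vector(9, 3)     # 9/8 pixel  = 1 + 1/8
```

The whole-pixel part selects the reference block position, and a non-zero phase indicates that the prediction samples must be interpolated between integer positions.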

Information about the index and the motion vector of the reference picture selected by the inter prediction unit 210 is encoded and transmitted to the decoder.

Referring to FIG. 3, the largest coding unit (LCU) 300 may have a hierarchical structure composed of smaller coding units 310 obtained through division, and the hierarchical structure of the coding unit may be specified based on size information, depth information, split flag information, and the like. The size information of the largest coding unit, the split depth information, and the information on whether the current coding unit is split may be included in a sequence parameter set (SPS) of the bitstream and transmitted to the image decoder. However, since the smallest coding unit is not divided into smaller coding units, a split flag may not be transmitted for the smallest coding unit.

Meanwhile, whether inter prediction or intra prediction is performed may be determined in units of coding units. When performing inter prediction, inter prediction may be performed in units of prediction units. When performing intra prediction, the prediction mode may be determined in units of prediction units, and prediction may be performed in units of prediction units. Alternatively, the prediction mode may be determined in units of prediction units, and intra prediction may be performed in units of transform units.

Referring to FIG. 3, in the case of intra prediction, the prediction unit 320 may be 2N × 2N or N × N (N is an integer), and in the case of inter prediction, the prediction unit 330 may be 2N × 2N, 2N × N, N × 2N, or N × N. In this case, for example, N × N may be determined to be applied only to the minimum size coding unit or to be applied only to intra prediction. In addition to these prediction block sizes, N × mN, mN × N, 2N × mN, or mN × 2N (m < 1) may be further defined and used.
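How split flags describe the quad tree can be sketched as a recursive parse. The function below is illustrative: `read_split_flag` stands in for reading one flag from the bitstream, the z-order traversal of the four children is an assumption, and no flag is read once a unit reaches the minimum size.

```python
def parse_coding_tree(size, min_size, read_split_flag, x=0, y=0):
    """Return the leaf coding units of one largest coding unit.

    Each leaf is reported as (x, y, size). A split flag is read only while
    the unit is larger than min_size, because the smallest coding unit
    cannot be divided further.
    """
    if size > min_size and read_split_flag():
        half = size // 2
        leaves = []
        for dy in (0, half):          # assumed z-order: top row first
            for dx in (0, half):
                leaves += parse_coding_tree(half, min_size, read_split_flag,
                                            x + dx, y + dy)
        return leaves
    return [(x, y, size)]


flags = iter([1, 0, 1, 0, 0])         # hypothetical flag sequence
leaves = parse_coding_tree(32, 8, lambda: next(flags))
```

In this example a 32x32 unit is split once, and its second 16x16 child is split again into four 8x8 units for which no further flags are consumed.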
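The four basic partition shapes can be enumerated directly from the coding unit size. In the sketch below the mode names are plain strings chosen for illustration, each partition is reported as (width, height), and the asymmetric shapes with m < 1 are left out.

```python
def pu_partitions(cu_size, mode):
    """List the (width, height) of each prediction unit in a 2N x 2N CU."""
    n = cu_size // 2
    shapes = {
        "2Nx2N": [(2 * n, 2 * n)],        # one unit covering the whole CU
        "2NxN": [(2 * n, n)] * 2,         # two units stacked vertically
        "Nx2N": [(n, 2 * n)] * 2,         # two units side by side
        "NxN": [(n, n)] * 4,              # four square units
    }
    return shapes[mode]


def allowed_modes(is_intra):
    """Intra prediction uses only the square shapes described above."""
    return ["2Nx2N", "NxN"] if is_intra else ["2Nx2N", "2NxN", "Nx2N", "NxN"]
```

Whatever the mode, the partitions tile the coding unit exactly, so their areas always sum to the area of the coding unit.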

FIG. 4 is a block diagram schematically illustrating an image decoding apparatus according to an embodiment of the present invention. Referring to FIG. 4, the image decoder 400 may include an entropy decoder 410, a reordering unit 415, an inverse quantization unit 420, an inverse transformer 425, a predictor 430, a filter unit 435, and a memory 440.

When an image bitstream is input from the image encoder, the input bitstream may be decoded according to the procedure by which the image information was processed in the image encoder.

For example, when variable length coding (hereinafter, 'VLC') is used to perform entropy encoding in the image encoder, the entropy decoder 410 may perform entropy decoding by implementing the same VLC table as the one used in the encoder. Likewise, when CABAC is used to perform entropy encoding in the image encoder, the entropy decoder 410 may perform entropy decoding using CABAC correspondingly.

Among the information decoded by the entropy decoder 410, the information for generating a prediction block may be provided to the predictor 430, and the residual value on which entropy decoding has been performed by the entropy decoder may be input to the reordering unit 415.

The reordering unit 415 may reorder the bitstream entropy-decoded by the entropy decoder 410 based on the reordering method used by the image encoder. The reordering unit 415 may reorder the coefficients by restoring coefficients expressed in the form of a one-dimensional vector to coefficients in the form of a two-dimensional block. The reordering unit 415 may receive information related to the coefficient scanning performed by the encoder and perform the reordering by scanning in reverse, based on the scanning order used by the encoder.

The inverse quantization unit 420 may perform inverse quantization based on the quantization parameter provided by the encoder and the coefficient values of the rearranged block.

The inverse transformer 425 may perform, on the quantization result produced by the image encoder, an inverse DCT and/or inverse DST corresponding to the DCT and DST performed by the transform unit of the encoder. The inverse transform may be performed based on a transmission unit determined by the encoder or a division unit of an image. In the transform unit of the encoder, the DCT and/or DST may be selectively performed according to a plurality of pieces of information, such as the prediction method, the size of the current block, and the prediction direction, and the inverse transformer 425 of the decoder may perform the inverse transform based on information about the transform performed by the transform unit of the encoder.

The prediction unit 430 may generate the prediction block based on the prediction block generation related information provided by the entropy decoding unit 410 and previously decoded blocks and / or picture information provided by the memory 440. The reconstruction block may be generated using the prediction block generated by the predictor 430 and the residual block provided by the inverse transform unit 425.

The reconstructed block and / or picture may be provided to the filter unit 435. The filter unit 435 applies deblocking filtering, sample adaptive offset (SAO), and / or adaptive loop filtering (ALF) to the reconstructed block and / or picture.

The memory 440 may store the reconstructed picture or block to use as a reference picture or reference block, and may provide the reconstructed picture to the output unit.

Referring to FIG. 5, the predictor 500 may include an intra prediction unit 510 and an inter prediction unit 520.

The intra prediction unit 510 may generate a prediction block based on pixel information in the current picture when the prediction mode for the prediction unit is an intra prediction mode.

When the prediction mode for the corresponding prediction unit is an inter prediction mode, the inter prediction unit 520 may perform inter prediction on the current prediction unit using information required for inter prediction of the current prediction unit provided by the image encoder, for example, information on a motion vector, a reference picture index, and the like. The inter prediction may be performed based on information included in at least one of a previous picture or a subsequent picture of the current picture that includes the current prediction unit.

In this case, the motion information may be derived in response to the skip flag, the merge flag, and the like of the coding unit received from the encoder.

Hereinafter, where "image" or "screen" can carry the same meaning as "picture" according to the configuration or expression of the invention, "picture" may be written as "image" or "screen". In addition, inter prediction and inter-picture prediction have the same meaning, and intra prediction and intra-picture prediction have the same meaning.

As described above, in the inter prediction mode, the inter prediction unit may generate a prediction block by performing prediction on the current unit using information of at least one of a previous picture or a subsequent picture of the current picture, rather than the current picture itself. In this case, a picture used for prediction is referred to as a reference picture. The reference picture used for prediction of the current unit may be represented using a reference picture index and motion vector information indicating the reference picture. The P picture and the B picture may be encoded in the inter prediction mode; the I picture, the P picture, and the B picture will be described later.

FIG. 6 is a conceptual diagram schematically illustrating an embodiment of prediction methods that may be used in a P picture and a B picture. FIG. 6 includes a prediction method 610 for a P picture and prediction methods 620, 630, and 640 for a B picture.

Hereinafter, a block to be predicted currently is called a prediction target block, a block generated by the prediction is a prediction block, and a block used for prediction of the prediction target block in a reference picture is referred to as a reference block.

There are three types of pictures used for video encoding and decoding, such as an I picture, a P picture, and a B picture. Hereinafter, POC (Picture Order Count) means display order or time order of pictures.

An I picture is a picture that is encoded independently in the picture regardless of the picture before and after. Prediction in the time direction is not applied to the I picture, and only intra-picture information is used for the encoding process.

The P picture is a picture that can be encoded by unidirectional prediction between pictures using one reference picture. FIG. 6 shows unidirectional prediction 610 in a P picture. The P picture requires one reference picture list, which is referred to as reference picture list 0. The inter prediction using the reference picture selected from the reference picture list 0 is called L0 prediction, and L0 prediction is mainly used for forward prediction. In the P picture, intra prediction or L0 prediction may be performed.

In the P picture, prediction may be performed on a prediction target block using one reference block in a picture that precedes the current picture in decoding order. Accordingly, as shown at 610 in FIG. 6, prediction using a reference picture that lies in the past in time order may be used to perform forward inter prediction. In addition, prediction using a reference picture that lies in the future in time order may be used to perform backward inter prediction. Therefore, the P picture uses only one piece of motion information (motion vector, reference picture index) in one direction for the block to be predicted.

A B picture is a picture that can be encoded by forward, backward, or bidirectional prediction between pictures using one or more, for example, two reference pictures. One or more reference pictures may be used in the B picture.

The B picture requires two reference picture lists, and the two reference picture lists are referred to as reference picture list 0 and reference picture list 1, respectively. The inter prediction using the reference picture selected from the reference picture list 0 is called L0 prediction, and L0 prediction is mainly used for forward prediction. The inter prediction using the reference picture selected from the reference picture list 1 is called L1 prediction, and L1 prediction is mainly used for backward prediction. In addition, inter prediction using two reference pictures selected from reference picture list 0 and reference picture list 1 is called bi prediction.

In the B picture, intra prediction, L0 prediction, L1 prediction, or bi prediction using two pieces of motion information may be performed.

FIG. 6 shows bi-prediction 620 in a B picture. In the B picture, prediction may be performed on a prediction target block by using both a forward reference block in a past reference picture and a backward reference block in a future reference picture in time order. That is, bidirectional prediction using both the forward reference block and the backward reference block may be used to perform inter prediction.

FIG. 6 also shows forward prediction 630 using two past reference pictures and backward prediction 640 using two future reference pictures in a B picture. As illustrated in FIG. 6, in the B picture, two past reference pictures in time order may be used to perform inter prediction on the prediction target block. In addition, in the B picture, two future reference pictures in time order may be used to perform inter prediction on the prediction target block.

In the B picture, not only the case described above but also the forward prediction using one past reference picture and the backward prediction using one future reference picture may be used.

Accordingly, the B picture may use up to two pieces of motion information in the forward and / or backward direction for the block to be predicted. B pictures involve delay but are often used to achieve high coding performance. The structure of the reference picture lists mentioned above will be described later.

FIG. 7 is a conceptual diagram schematically illustrating a comparison between the prediction methods of a conventional P picture, a conventional B picture, and a GPB. Referring to FIG. 7, each picture is shown in display order (POC).

In the case of a P picture, since unidirectional prediction is possible from one reference picture, forward prediction may be performed from a past picture as shown in FIG. 7. In addition, backward prediction may be performed from a future picture. In the case of a B picture, since bi-prediction using two reference pictures is allowed, up to two pieces of motion information can be used. In addition, as described above, a reference picture list may be composed of pictures used for inter prediction of the current picture. The B picture requires two reference picture lists: reference picture list 0 and reference picture list 1.

Among B pictures, a picture whose reference picture list 0 and reference picture list 1 are the same is referred to as a Generalized P and B (GPB) picture or a Generalized B picture. In one embodiment, only forward prediction may be allowed in the GPB, in which case backward prediction is not performed.

Referring to FIG. 7, in the GPB, prediction may be performed using two or more pieces of motion information on a block to be predicted, similarly to a B picture. If only forward prediction is allowed in GPB, no delay due to backward prediction is involved. In GPB, low delay coding is possible while maintaining high coding performance.

The characteristics of the I picture, the P picture, and the B picture may be different for each slice in one picture, not in a picture unit. A slice having a feature of an I picture, a slice having a feature of a P picture, and a slice having a feature of a B picture in a slice unit may be referred to as an I slice, a P slice, and a B slice, respectively. In slice units, GPBs may be referred to as GPB slices or Generalized B slices.

As described above, two reference picture lists are used for inter prediction, that is, inter-picture prediction: reference picture list 0 and reference picture list 1. Hereinafter, reference picture list 0 is referred to as list 0, and reference picture list 1 is referred to as list 1.

The information about the list may be generated by the inter-prediction unit of the encoder and encoded to form a bitstream and transmitted to the decoder. The transmitted information may be decoded and used to perform prediction in the inter prediction unit of the decoder.

The maximum number of reference pictures for constructing the reference picture list may be defined by syntax elements as included in [Table 1] below in one embodiment.

TABLE 1

Using num_ref_idx_l0_default_active_minus1 and num_ref_idx_l1_default_active_minus1 in the Picture Parameter Set (PPS), the maximum number of reference pictures allowed in List 0 and List 1 may be defined first. Next, when the current slice is a P or B slice, num_ref_idx_active_override_flag in the slice header (slice_header) may indicate whether the information about the maximum number of reference pictures from the PPS is to be updated in the current slice. When num_ref_idx_active_override_flag is 1, that is, when it is determined to update the information, the maximum number of reference pictures allowed in the current slice may be defined using num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1.

Each of the pieces of information may be included in the PPS and the slice header and transmitted from the encoder to the decoder through the bitstream.
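
The derivation described for [Table 1] can be sketched as follows. This is a minimal sketch, not the normative parsing process: the dictionary representation of the PPS and slice header, the "slice_type" key, and the function name max_ref_counts are hypothetical, and restricting the list 1 override to B slices is an assumption.

```python
def max_ref_counts(pps, slice_header):
    """Return (count for list 0, count for list 1) in effect for one slice."""
    # Defaults are signalled once in the PPS; "_minus1" elements store count - 1.
    l0 = pps["num_ref_idx_l0_default_active_minus1"] + 1
    l1 = pps["num_ref_idx_l1_default_active_minus1"] + 1

    is_inter_slice = slice_header["slice_type"] in ("P", "B")
    override = slice_header.get("num_ref_idx_active_override_flag", 0) == 1
    if is_inter_slice and override:
        # The slice header replaces the PPS defaults for this slice only.
        l0 = slice_header["num_ref_idx_l0_active_minus1"] + 1
        # Assumption: only B slices carry an override for list 1.
        if slice_header["slice_type"] == "B":
            l1 = slice_header["num_ref_idx_l1_active_minus1"] + 1
    return l0, l1
```

With PPS defaults of 4 and 2 pictures, a slice without the override flag keeps (4, 2), while a B slice that sets the flag and sends both minus1 values as 0 uses (1, 1).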

Referring to [Table 1], the maximum number of reference pictures is determined based on the PPS and may be adaptively set to a different value for each slice. However, slices belonging to the same picture do not always need to have different numbers of reference pictures. In that case, num_ref_idx_active_override_flag need not be transmitted in every slice header. In addition, the maximum number of reference pictures may be fixed for a whole sequence, a scope wider than the PPS unit. In such cases, if the maximum number of reference pictures is defined and transmitted in both the PPS and the slice header, information is duplicated unnecessarily.

Table 2 shows an embodiment of a method of newly defining the maximum number of reference pictures in the syntax in order to prevent waste of such information.

TABLE 2

Referring to [Table 2], a flag called num_ref_idx_no_override_flag is added to the Sequence Parameter Set (SPS). The flag is a flag indicating whether the maximum number of reference pictures defined in the PPS is applied to all slices in the same picture as they are.

A condition on the flag is added to the slice header so that the information about the maximum number of reference pictures in the current slice is updated only when the flag is 0. Therefore, when the flag is set, that is, 1, the maximum number of reference pictures defined in the PPS may be commonly applied to all slices in the same picture.

Therefore, since num_ref_idx_active_override_flag need not be transmitted in each slice header, the amount of information transmitted through the bitstream can be reduced.

Table 3 shows another embodiment of a method of defining the maximum number of reference pictures in the syntax for constructing the reference picture list.

TABLE 3

Referring to [Table 3], a flag called num_ref_idx_no_override_flag is added to a picture parameter set (PPS). The flag is a flag indicating whether the maximum number of reference pictures defined in the PPS is applied to all slices in the picture as it is.

A condition on the flag is added to the slice header so that the information about the maximum number of reference pictures in the current slice is updated only when the flag is 0. Therefore, if the flag is set, that is, 1, the maximum number of reference pictures defined in the PPS is applied in common to all slices in the same picture.

When there is one slice defined in the PPS, num_ref_idx_no_override_flag defined in the PPS replaces the role played by num_ref_idx_active_override_flag in the slice header. Therefore, when a new flag is defined in the PPS, the amount of information transmitted through the bitstream is the same as when a new flag is not defined in the PPS.

However, when a plurality of slices are defined in the PPS and no new flag is defined in the PPS, num_ref_idx_active_override_flag must be transmitted or signaled once per slice. On the other hand, when the new flag is defined in the PPS, the newly defined flag needs to be transmitted only once, so the amount of information transmitted through the bitstream may be reduced.
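
The saving can be illustrated by counting the override-related flags sent for one picture. This is only a counting sketch: the function name override_flags_signalled is hypothetical, and it assumes that each slice header still carries num_ref_idx_active_override_flag when the PPS-level flag is 0.

```python
def override_flags_signalled(num_slices, no_override_flag=None):
    """Count override-related flags transmitted for one picture.

    no_override_flag is None when num_ref_idx_no_override_flag is not defined
    in the PPS, otherwise 0 or 1.
    """
    if no_override_flag is None:
        # One num_ref_idx_active_override_flag per slice header.
        return num_slices
    if no_override_flag == 1:
        # Only the single PPS-level flag is sent.
        return 1
    # Assumption: PPS flag plus one override flag per slice header.
    return 1 + num_slices
```

For a picture with one slice the count is 1 either way, while for four slices it drops from 4 to 1 when the PPS flag is set.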

Table 4 shows another embodiment of a method of defining the maximum number of reference pictures in the syntax for constructing the reference picture list.

TABLE 4

Referring to [Table 4], a flag called num_ref_idx_infer_flag is added to the Sequence Parameter Set (SPS). The flag indicates whether the same maximum number of reference pictures is used for the entire sequence. If the flag is 1, the maximum numbers of reference pictures, num_ref_idx_l0_minus1 and num_ref_idx_l1_minus1, are defined in the SPS.

In the PPS and slice headers, the maximum number of reference pictures for each is transmitted only when the flag num_ref_idx_infer_flag added to the SPS is zero. Therefore, since the same maximum number of reference pictures for the entire sequence can be defined using the flags and syntax elements added to the SPS, the amount of information transmitted through the bitstream can be reduced.
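
A decoder-side view of [Table 4] might look like the sketch below. The dictionary layout and the function name resolve_max_ref_counts are hypothetical, the list 1 element in the SPS is assumed to be named num_ref_idx_l1_minus1, and the slice-level override is left out for brevity.

```python
def resolve_max_ref_counts(sps, pps):
    """Return (count for list 0, count for list 1) for pictures using this SPS/PPS."""
    if sps.get("num_ref_idx_infer_flag", 0) == 1:
        # Sequence-wide values: the PPS and slice headers carry no counts.
        return (
            sps["num_ref_idx_l0_minus1"] + 1,
            sps["num_ref_idx_l1_minus1"] + 1,  # assumed element name
        )
    # Flag is 0: fall back to the counts transmitted in the PPS.
    return (
        pps["num_ref_idx_l0_default_active_minus1"] + 1,
        pps["num_ref_idx_l1_default_active_minus1"] + 1,
    )
```

When the flag is 1 the PPS argument is not consulted at all, which mirrors the statement that no per-picture or per-slice count needs to be transmitted.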

FIG. 8 is a flowchart schematically illustrating a method of constructing a reference picture list in an encoder and transmitting reference picture list information according to an embodiment of the present invention.

Referring to FIG. 8, the encoder generates a reference picture list (S810). By generating the reference picture list, information related to the reference picture list may be generated together, and the information related to the reference picture list may include the maximum number of reference pictures for the list 0 and / or the list 1.

The maximum number of reference pictures may be defined first in the PPS as shown in [Table 1]. If it is determined to update it in the current slice, the number of reference pictures finally allowed in the current slice can be determined. In addition, as shown in [Table 2], a flag indicating whether the maximum number of reference pictures defined in the PPS is applied as it is to all slices in the same picture may be defined in the SPS.

As shown in Table 3, for all slices in the picture, a flag indicating whether the maximum number of reference pictures defined in the PPS is applied as it is may be defined in the PPS. In addition, as shown in [Table 4], a flag indicating whether the same maximum number of reference pictures is used for the entire sequence may be defined in the SPS.

Information about the reference picture list is entropy encoded (S820). The coded reference picture list information forms a bitstream and is transmitted from the encoder to the decoder (S830).

In this case, in the case of [Table 1], only when it is determined that the maximum number of reference pictures is updated for each slice, information about the maximum number of reference pictures allowed in each slice may be included in the slice header and transmitted. In the case of Table 2, when the newly defined flag is 1 in the SPS, the maximum number of reference pictures defined in the PPS may be commonly applied to all slices in the sequence. At this time, information about the maximum number of reference pictures allowed in each slice may not be transmitted.

In the case of Table 3, when the newly defined flag becomes 1 in the PPS, the maximum number of reference pictures defined in the PPS may be commonly applied to all slices in the current picture. At this time, information about the maximum number of reference pictures allowed in each slice may not be transmitted. In the case of Table 4, when the newly defined flag is 1 in the SPS, the maximum reference picture number information is included and transmitted in the SPS, and the maximum reference picture number may be commonly applied to the entire sequence. In this case, information about the number of separate reference pictures may not be transmitted in each PPS or slice header.

Referring to FIG. 9, the decoder receives information about a reference picture list (S910). The information related to the reference picture list may include the maximum number of reference pictures for List0 and / or List1. In this case, the received maximum reference picture number information is the information that can be transmitted by the encoder of FIG. 8, and may differ according to the above-described embodiments of Tables 1 to 4. For example, in the embodiment of Table 4, when the newly defined flag is 1 in the SPS, only the maximum reference picture number information for the entire sequence included in the SPS is received, and separate maximum reference picture number information for each picture or slice may not be received.

Information about the reference picture list is entropy decoded (S920). The decoder may generate a reference picture list by using the decoded reference picture list information (S930).

A B slice or a GPB slice may have three prediction modes as modes used for inter prediction. The three modes are a bi-prediction mode (Pred_BI) using two reference pictures from List0 and List1, a uni-prediction mode (Pred_L0) using one reference picture from List0, and a uni-prediction mode (Pred_L1) using one reference picture from List1. Hereinafter, uni-prediction refers to prediction using one reference picture, and bi-prediction refers to prediction using two reference pictures.

In one embodiment, each of the modes Pred_BI, Pred_L0, and Pred_L1 may be indicated by a syntax element called inter_pred_idc. In this case, when the current prediction unit is in the inter prediction mode, information indicating which of the three modes applies should be transmitted from the encoder to the decoder through inter_pred_idc.

If the modes are reduced to two types, a bi-prediction mode and a uni-prediction mode, the inter_pred_idc index transmitted per PU may be replaced with a flag such as inter_pred_flag, and the amount of information and overhead transmitted can be reduced. A combined list can be used for this purpose. The combined list may also be referred to as a combined reference picture list or LC.

The structure of the combined list will be described later. Hereinafter, "LC", "combine list", and "bind list" all have the same meaning as the combined list.

FIG. 10 is a conceptual diagram schematically illustrating an embodiment of a method of constructing a combined list. FIG. 10 schematically illustrates a method of generating a combined list from lists 0 and 1.

Referring to FIG. 10, starting from list 0, reference pictures of list 0 and list 1 may be alternately mapped or inserted into the combined list. At this time, the pictures may be mapped or inserted into the combined list in order from pictures temporally close to the current picture to pictures temporally far from it. In addition, when the reference picture indicated by the current list is the same as the reference picture indicated by the previous list, the mapping or insertion into the combined list may be skipped. Hereinafter, the above-described combined list generation order is referred to as a default order.
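
The default order can be sketched as follows. Pictures are identified here by their POC values, list 0 and list 1 are assumed to be already ordered from temporally near to far, and the skip rule is interpreted as skipping any picture already present in the combined list; the function name default_combined_list is hypothetical.

```python
def default_combined_list(list0, list1):
    """Build the combined list (LC) by alternating list 0 and list 1 entries."""
    lc = []
    for n in range(max(len(list0), len(list1))):
        # For each index n, take the list 0 entry first, then the list 1 entry.
        for ref_list in (list0, list1):
            if n < len(ref_list) and ref_list[n] not in lc:
                lc.append(ref_list[n])  # duplicates are skipped
    return lc


# Sample POC values: list 0 holds past pictures, list 1 future pictures.
print(default_combined_list([2, 0], [4, 6]))
```

When both lists are identical, as in a GPB, every list 1 entry is skipped and the combined list reduces to list 0.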

In addition, syntax elements such as pic_from_list_0_flag and ref_idx_list_curr may be transmitted for more efficient optimal combine list construction. The meaning of each syntax element is mentioned later.

By creating a combined list, the three modes can be reduced to two modes. The two modes are the bi-prediction mode Pred_BI using two reference pictures from list 0 and list 1 and the uni-prediction mode Pred_LC using one reference picture from the combined list. Therefore, the amount of information transmitted may be reduced for the reasons described above.

In order to form a combined reference picture list, information as shown in Table 5 below may be transmitted in a slice header.

TABLE 5

In Table 5, each syntax may be defined as follows.

When ref_pic_list_combination_flag is 1, list 0 and list 1 are combined to generate a combined list for uni-prediction. When the flag is 0, list 0 and list 1 are the same, and list 0 is used as the combined list.

In addition, since a GPB is a picture in which list 0 and list 1 are the same, a flag value of 0 corresponds to the GPB case.

num_ref_idx_lc_active_minus2 +2 indicates the number of reference pictures selected from list 0 or list 1 in the bind list. When the number of reference pictures is less than or equal to 1, since a bind list does not need to be generated, the value of the corresponding syntax may be at least 2, and thus minus2 may be applied.

When ref_pic_list_modification_flag_lc is 1, it indicates that there are syntax elements pic_from_list_0_flag and ref_idx_list_curr for specifying the mapping from the combine list to list 0 and list 1. When the flag is 0, the syntax element does not exist, and a bind list may be generated according to a predetermined default order or method.

pic_from_list_0_flag indicates whether the current reference picture added to the combined list is from list 0 or from list 1. If the flag is 1, the current reference picture is added from list 0. If the flag is 0, the current reference picture is added from list 1.

ref_idx_list_curr indicates a reference picture index of a picture in the current reference picture list (list 0 or list 1) added at the end of the combined list.
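
The semantics of the Table 5 syntax elements can be sketched as a small construction routine. This is an illustrative sketch only: the dictionary layout, the "lc_entries" key holding (pic_from_list_0_flag, ref_idx_list_curr) pairs, the default_lc argument, and the function name build_combined_list are all hypothetical.

```python
def build_combined_list(hdr, list0, list1, default_lc):
    """Construct the combined list (LC) from slice-header syntax elements.

    default_lc is the list produced by the predetermined default order.
    """
    if hdr["ref_pic_list_combination_flag"] == 0:
        # List 0 and list 1 are the same, so list 0 itself serves as the LC.
        return list(list0)

    num_entries = hdr["num_ref_idx_lc_active_minus2"] + 2
    if hdr["ref_pic_list_modification_flag_lc"] == 1:
        lc = []
        # Each pair names the source list and the index inside that list.
        for from_list0, ref_idx in hdr["lc_entries"][:num_entries]:
            source = list0 if from_list0 == 1 else list1
            lc.append(source[ref_idx])
        return lc

    # No explicit mapping: use the default order, limited to num_entries.
    return list(default_lc[:num_entries])
```

Because every explicit entry is a (list, index) pair, the same pairs also give the mapping from a combined-list index back to list 0 or list 1.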

Referring to the above, the LC (List Combined) is generated for uni-prediction from pictures of List 0 and List 1. LC generation methods may include using a predefined default order followed by truncation of the LC length or using other syntax that is not based on the associated syntax element in the slice header. In the former method, the LC may be generated by alternately inserting entries of List 0 and List 1 in ascending order starting from the smallest index of List 0 as described above with reference to FIG. 10.

However, if list 0 is always inserted before list 1 as in the above method, coding efficiency may not be sufficiently reflected. Therefore, a method of generating the LC using a temporal level and / or a quantization parameter (QP) value can be used.

In image encoding, a quantization parameter (QP) may be encoded instead of directly encoding the quantization step. In this case, the encoded QP may be transmitted from the encoder to the decoder, and the decoder may derive the quantization step from the QP. As the QP increases, the quantization step may increase, and the signal-to-noise ratio (SNR) may decrease.
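
The description does not state how the quantization step is derived from the QP. As one hedged illustration, the sketch below uses the approximate relation known from H.264/HEVC-style codecs, in which the step is about 1 at QP 4 and doubles for every increase of 6 in QP; both the formula and the function name quantization_step are assumptions, not part of this description.

```python
def quantization_step(qp):
    """Approximate quantization step for a given QP.

    Assumption: H.264/HEVC-style mapping, step of 1.0 at QP 4,
    doubling for every increase of 6 in QP.
    """
    return 2.0 ** ((qp - 4) / 6.0)


for qp in (22, 28, 34):
    # A larger QP gives a coarser step and therefore a lower SNR.
    print(qp, quantization_step(qp))
```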

In addition, in image encoding, a hierarchical coding structure may be used.

FIG. 11 is a conceptual diagram schematically illustrating an embodiment of a hierarchical coding structure. The hierarchical coding structure according to the embodiment of FIG. 11 has four hierarchical levels, but the number of hierarchical levels is not limited to four. In addition, the size of the GOP is 8 in FIG. 11. GOP (Group Of Pictures) means a group of pictures. Reference pictures referred to by each picture in FIG. 11 may be selected differently from those shown in FIG. 11.

In FIG. 11, pictures of different hierarchical levels may have different temporal levels. In a hierarchical coding structure, a picture with a lower temporal level can be coded using a smaller QP. This means that a picture with a lower temporal level can provide better coding efficiency than a picture with a higher temporal level.

Thus, for higher coding efficiency, combined list generation that takes the temporal level and / or QP into account may be considered.

FIG. 12 is a conceptual diagram schematically showing an embodiment of a method of generating a combined list using a temporal level according to the present invention. In FIG. 12, pictures are shown in POC order, that is, display order or temporal order, and the size of the GOP is eight. In the embodiment of FIG. 12, there are four temporal levels. The temporal level is lower toward the top and higher toward the bottom. In FIG. 12, LC represents a combined list constructed according to the default order, and LC (prop.) represents a combined list constructed according to an embodiment of the present invention.

According to an embodiment of the present invention, a reference picture having a lower temporal level may be preferentially inserted when an entry of List0 or List1 is inserted into the LC. In more detail, the following rule may be applied when generating a combine list.

First, for each list, a reference picture that is closer in time to the current picture can be inserted into the LC first. This means that a reference picture with a smaller POC difference is inserted first. For reference pictures having the same temporal distance from the current picture among entries of List0 and List1, a picture having a lower temporal level may be inserted first. When entries from List0 and List1 have the same temporal distance from the current picture and the same temporal level, the reference picture from List0 may be inserted first.

In the case of the picture 1210 having the POC value of 3 in the embodiment of FIG. 12, the combined list built by alternately inserting entries of List 0 and List 1 in the default order, starting from List 0, has entries of 2, 4, 0, and 6 in order. However, according to the embodiment of the present invention, since picture 4, which has a lower temporal level, is inserted before picture 2, the LC may have entries of 4, 2, 0, and 6 in order.

In the embodiment of FIG. 12, for the picture 1220 having a POC value of 5, picture 8 has a lower temporal level than picture 2. Therefore, according to an embodiment of the present invention, picture 8 may be inserted before picture 2 in the combined list.

In the case of the picture 1230 having a POC value of 6 in the embodiment of FIG. 12, referring to the LC generated according to the default order, picture 4 at reference picture index (refIdx) 1 of List 1 is the same as picture 4 from List 0, which has already been inserted into the LC, and therefore may not be inserted. In addition, according to the embodiment of the present invention, since picture 8, which has a lower temporal level, may be inserted before picture 4, the LC may have entries of 8, 4, and 2 in order.

In the embodiment of FIG. 12, for the picture 1240 having a POC value of 7, picture 8 has a lower temporal level than picture 6. Therefore, according to an embodiment of the present invention, picture 8 may be inserted before picture 6 in the combined list.
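
The insertion rule and the FIG. 12 examples above can be sketched as follows. This is a minimal sketch under stated assumptions: pictures are identified by POC, the n-th entries of List0 and List1 are compared pairwise, the temporal levels in LEVEL follow a dyadic hierarchy for a GOP of 8 (an assumption consistent with the examples, not a value given in the text), and the name combined_list_by_level is hypothetical.

```python
# Assumed temporal levels for a GOP of 8 with four levels (lower = more important).
LEVEL = {0: 0, 8: 0, 4: 1, 2: 2, 6: 2, 1: 3, 3: 3, 5: 3, 7: 3}


def combined_list_by_level(cur_poc, list0, list1, level=LEVEL):
    """Build the LC, preferring nearer pictures, then lower temporal levels."""
    lc = []
    for n in range(max(len(list0), len(list1))):
        candidates = []
        if n < len(list0):
            candidates.append((0, list0[n]))
        if n < len(list1):
            candidates.append((1, list1[n]))
        # Order: smaller POC distance, then lower temporal level, then List0.
        candidates.sort(key=lambda e: (abs(e[1] - cur_poc), level[e[1]], e[0]))
        for _, poc in candidates:
            if poc not in lc:  # skip pictures already in the LC
                lc.append(poc)
    return lc


print(combined_list_by_level(3, [2, 0], [4, 6]))
print(combined_list_by_level(6, [4, 2], [8, 4]))
```

Passing a table of QP values instead of temporal levels as the level argument gives the corresponding ordering based on QP.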

Also, as described above, a picture having a lower temporal level can be coded using a lower QP. Therefore, the combined list may be generated using the QP value instead of the temporal level.

According to the embodiment of the present invention, when the entry of List0 or List1 is inserted into the LC, the reference picture having a lower QP value is inserted first. In more detail, the following rule may be applied when generating a combine list.

First, for each list, a reference picture that is closer in time to the current picture can be inserted into the LC first. This means that a reference picture with a smaller POC difference is inserted first. For reference pictures having the same temporal distance from the current picture among the entries of List0 and List1, a picture having a lower QP value may be inserted first. When the entries from List0 and List1 have the same temporal distance from the current picture and the same QP value, the reference picture from List0 may be inserted first.

In the case where the combine list can be configured according to the above-described embodiment, a flag indicating whether a combine list generation method using a temporal level or a QP value is enabled may be provided.

If the flag is 1, a picture having a lower QP or lower temporal level may be preferentially inserted when the nth entry of List0 (RefPicList0) and List1 (RefPicList1) is inserted into the combined list. When the flag is 0, when the n-th entry of List0 (RefPicList0) and List1 (RefPicList1) is inserted into the combined list, the picture of List0 may be inserted before the picture of List1. That is, when the flag is 0, a bind list may be generated according to a default order.

The flag may be defined in an SPS, PPS or slice header. The flag generated by the encoder may be encoded and transmitted to the decoder. In the decoder, a bind list may be generated using a temporal level or QP or in a default order according to the information of the flag.

FIG. 13 is a flowchart illustrating an embodiment of a method for generating a combined list according to the present invention. FIG. 13 shows an embodiment for the case where the new flag described above with reference to FIG. 12 is 1.

Referring to FIG. 13, the encoder or the decoder generates a reference picture list 0 and a reference picture list 1 (S1310).

The encoder or decoder determines the temporal distance from the current picture and the temporal level of each picture with respect to the picture from the reference picture list 0 and the picture from the reference picture list 1 (S1320). In this case, if the new flag described above in FIG. 12 is used, the temporal distance and temporal level may be determined when the flag is 1. In addition, a QP value may be used instead of the temporal level.

The encoder or decoder generates an LC using either the temporal distance from the current picture and the temporal level information of each picture, or the temporal distance from the current picture and the QP information (S1330). In this case, a picture having a shorter temporal distance from the current picture may be preferentially inserted into the combined list. If the picture of list 0 and the picture of list 1 have the same temporal distance from the current picture, the picture with the lower temporal level and / or lower QP value may be preferentially inserted into the combined list. When the temporal distance and temporal level are the same, or the temporal distance and the QP value are the same, the picture of list 0 may be preferentially inserted into the combined list.

When a combined list is generated, the number of values that the inter_pred_idc syntax element may take may be reduced from three (L0, L1, Bi) to two (Uni, Bi). Accordingly, inter_pred_idc may be replaced by inter_pred_flag indicating whether bi-prediction or uni-prediction is used, and the amount of information transmitted may be reduced. However, unnecessary duplication or burden may occur due to syntax elements or information added by the generation of the combined list.

In the prediction unit prediction_unit (), a new syntax element for the combined list needs to be defined, in addition to ref_idx_lX, mvd_lX, and mvp_idx_lX, which are syntax elements for List0 and List1. The new syntax element added to prediction_unit () may include a reference picture index ref_idx_lc of LC, motion vector difference mvd_lc of LC, motion vector predictor index mvp_idx_lc of LC, and the like.

In addition, in the slice header, a new syntax element for modification of the combined list needs to be defined. New syntax elements added to the slice header may include num_ref_idx_lc_active_minus1, ref_pic_list_modification_flag_lc, pic_from_list_0_flag, ref_idx_list_curr, and the like.

There is also a need for a mapping process for mapping LC related syntax to list 0 or list 1 for prediction units on which uni-prediction is performed. For example, a mapping process from refIdxLX to ref_idx_lc, mvd_lX to mvd_lc, mvp_idx_lX to mvp_idx_lc and LcToLx to LX may be required.

Here, x and X in Lx, lX, or LX may be 0 or 1, and L and l denote a list. The same applies below.

As described above, when the combined list is generated, the burden for the definition of the LC-related syntax, the mapping to the list 0 or the list 1, or the duplication of the process may occur.

Moreover, the LC related syntax substantially indicates list 0 or list 1, which can already be signaled or transmitted using existing syntax elements. ref_idx_lc may be unnecessarily redundant with ref_idx_l0 and ref_idx_l1, mvd_lc with mvd_l0 and mvd_l1, and mvp_idx_lc with mvp_idx_l0 and mvp_idx_l1.

The slice header also contains syntax for modification of the combined list. Such syntax allows the order of entries in the combined list to be optimized for coding efficiency. However, such syntax can also be unnecessarily redundant. For example, num_ref_idx_lc_active_minus1 may substantially refer to num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1, and ref_pic_list_modification_flag_lc may substantially refer to ref_pic_list_modification_flag_l0 and ref_pic_list_modification_flag_l1.

Since the combined list can reduce the number of values that inter_pred_idc may take from three (L0, L1, Bi) to two (Uni, Bi), inter_pred_idc can be replaced with inter_pred_flag.

However, this benefit can also be obtained without creating a combined list. Instead of defining new syntax for the mapping of the lists, the existing List0 related syntax can be used. In this case, list 0 (L0) can be utilized like LC. List 0 related syntax that may be used instead of generating LC may include, for example, ref_idx_l0, mvd_l0, mvp_idx_l0, and the like.

A new reference picture selection process may be provided to reduce the number of bits required for transmission of inter prediction mode information (e.g., inter_pred_flag) without generating a new list. This new reference picture selection process may be applied when the following conditions are satisfied: inter_pred_flag indicates uni-prediction (inter_pred_flag == Pred_Uni), the slice is a B slice (slice_type == B_SLICE), and reference picture list 0 and reference picture list 1 are not identical (ref_pic_list_combination_flag == 1).

FIG. 14 is a conceptual diagram schematically illustrating a reference picture selection method according to an embodiment of the present invention. FIG. 14 schematically illustrates a reference picture selection method in the case where List0 related syntax is used instead of the combined list, without generating the combined list, as described above.

Referring to 1410 of FIG. 14, pictures are shown in POC order, that is, display order or temporal order. The picture having a POC value of 7 is the current picture. 1420 of FIG. 14 shows, for the reference pictures included in reference picture list 0 and reference picture list 1, the pictures indicated by the reference picture indices refIdxL0 and refIdxL1, respectively. That is, 1420 of FIG. 14 illustrates the relationship between refIdxL0 and POC and the relationship between refIdxL1 and POC. 1430 of FIG. 14 illustrates a method of selecting a reference picture, which will be described later.

In the reference picture selection process, the input may be a reference picture index refIdxLX and the output may be a two-dimensional array of luma samples, that is, the reference picture refPicLX. If the present invention does not apply, the output of the process may be RefPicListX [refIdxLX]. Here, RefPicListX means a reference picture list and refIdxLX means a reference picture index indicating a reference picture in the reference picture list. X may mean 0 or 1 in some cases.

A new reference picture selection process provided according to an embodiment of the present invention is as follows.

The variable idxMax is set as follows.

idxMax = ref_idx_l0

LX is set to a variable that points to L0 or L1.

idxL0 and idxL1 are set to the indexes in the reference picture lists RefPicList0 and RefPicList1. idxL0 and idxL1 are both set to zero.

The variable idxLX is set to zero.

The following process is repeated and ends when idxLX is equal to idxMax.

If RefPicList0 [refIdxL0] is the first reference picture to appear, that is, if no duplicate of it has appeared before,

If idxLX is equal to idxMax, then LX = L0 and the process ends.

Otherwise, idxL0 ++ and idxLX ++ operations are performed.

If RefPicList1 [refIdxL1] is the first reference picture to appear, that is, if no duplicate of it has appeared before,

If idxLX is equal to idxMax, then LX = L1 and the process ends.

Otherwise, idxL1 ++ and idxLX ++ operations are performed.

When the process ends, the output of the reference picture selection process is as follows.

If LX is L0, RefPicList0 [refIdxL0] is output.

If LX is L1, RefPicList1 [refIdxL1] is output.

A schematic concept of the reference picture selection process is shown at 1430 of FIG. 14. Referring to 1430 of FIG. 14, as the value of the reference picture index ref_idx_l0 increases, the reference pictures of list 0 and list 1 may be scanned in a certain order to select the reference picture. Scanning starts from the first entry of reference picture list 0 (RefPicList0), after which reference picture list 0 (RefPicList0) and reference picture list 1 (RefPicList1) may be scanned alternately. Therefore, the order in which the reference pictures are scanned may be “6, 8, 4, 6, 2, 4, 0, 2” based on the POC of each picture.

However, already scanned reference pictures are not scanned but skipped. For example, in 1430 of FIG. 14, a reference picture having refIdxL1 of 1 and POC of 6 in reference picture list 1 is not scanned. This is because the reference picture is already scanned when refIdxL0 is 0 in the reference picture list 0. Therefore, referring to FIG. 14, the order in which the reference pictures are scanned is “6, 8, 4, 2, 0”.
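The alternating scan with skipping of duplicates can be sketched as follows. This is only an illustrative sketch: the function name select_reference_picture is hypothetical, reference pictures are represented by their POC values, and the example lists [6, 4, 2, 0] and [8, 6, 4, 2] are inferred from the scan order “6, 8, 4, 6, 2, 4, 0, 2” given above rather than read from FIG. 14 itself.

```python
def select_reference_picture(ref_pic_list0, ref_pic_list1, ref_idx_l0):
    """Return (list_name, index_in_list, poc) selected by the signalled ref_idx_l0.

    The two lists are scanned alternately, starting from the first entry of
    list 0, and a picture that was already scanned is skipped.
    """
    seen = set()   # POCs of pictures already scanned
    idx_lx = 0     # running index over distinct pictures (idxLX in the text)
    longest = max(len(ref_pic_list0), len(ref_pic_list1))
    for i in range(longest):
        for name, ref_list in (("L0", ref_pic_list0), ("L1", ref_pic_list1)):
            if i >= len(ref_list):
                continue
            poc = ref_list[i]
            if poc in seen:
                continue  # duplicate: skipped without consuming an index
            if idx_lx == ref_idx_l0:
                return name, i, poc
            seen.add(poc)
            idx_lx += 1
    raise ValueError("ref_idx_l0 exceeds the number of distinct reference pictures")
```

With the inferred lists, ref_idx_l0 values 0 to 4 select the pictures with POC 6, 8, 4, 2, and 0 in turn, which matches the scan order “6, 8, 4, 2, 0” stated above.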

By the above-described process and the reference picture selection process according to the embodiment of FIG. 14, it may be confirmed that the List0 related syntax may be used instead of the LC related syntax without generating a separate combined list. That is, ref_idx_l0 may be used instead of ref_idx_lc.

In the uni-prediction mode, the entries of list 0 and list 1 may be used as the reference picture by the above-described reference picture selection process. In addition, in the above-described reference picture selection process, inter_pred_flag only needs to indicate whether the inter prediction mode is bi-prediction or uni-prediction, without generating LC. Therefore, the reference picture selection process can provide the advantages of the combined list as they are.

If List0 related syntax is used instead of combined list related syntax without creating a combined list, a new process for deriving the motion vector and the reference picture index may be provided to map list 0 to list X.

One embodiment of a new motion vector and reference picture index derivation process provided according to the present invention is as follows. Hereinafter, in the variables predFlagLX, mvLX, refIdxLX, Pred_LX, and syntax elements ref_idx_lX and mvd_lX, X may mean 0 or 1 in some cases.

1. The variables refIdxLX and predFlagLX can be derived as follows.

When inter_pred_flag indicates uni-prediction (inter_pred_flag == Pred_Uni), the slice is a B slice (slice_type == B_SLICE), and reference picture list 0 and reference picture list 1 are not the same (ref_pic_list_combination_flag == 1), the following process applies.

The variable idxLC is set to zero.

The following process is repeated to derive the variable LC.

If idxLC is equal to ref_idx_l0 [xP] [yP], then LC = L0 and the process ends.

Otherwise, idxL0 ++ and idxLC ++ operations are performed.

If idxLC is equal to ref_idx_l0 [xP] [yP], LC = L1 and the process ends.

Otherwise, idxL1 ++ and idxLC ++ operations are performed.

According to the LC, the following procedure applies.

If LC is L0,

refIdxL0 = idxL0, refIdxL1 = -1

predFlagL0 = 1, predFlagL1 = 0

mvd_l0 [xP] [yP] [0] = mvd_l0 [xP] [yP] [0] (horizontal component of the motion vector difference)

mvd_l0 [xP] [yP] [1] = mvd_l0 [xP] [yP] [1] (vertical component of the motion vector difference)

mvp_idx_l0 [xP] [yP] [0] = mvp_idx_l0 [xP] [yP]

If LC is L1

refIdxL0 = -1, refIdxL1 = idxL1

predFlagL0 = 0, predFlagL1 = 1

mvd_l1 [xP] [yP] [0] = mvd_l0 [xP] [yP] [0] (horizontal component of the motion vector difference)

mvd_l1 [xP] [yP] [1] = mvd_l0 [xP] [yP] [1] (vertical component of the motion vector difference)

mvp_idx_l1 [xP] [yP] [0] = mvp_idx_l0 [xP] [yP]

If inter_pred_flag [xP] [yP] is equal to Pred_LX or Pred_BI,

refIdxLX = ref_idx_lX [xP] [yP], predFlagLX = 1,

If inter_pred_flag is not the same as Pred_Uni, Pred_LX, and Pred_BI,

refIdxLX = -1 and predFlagLX = 0.

2. The variable mvdLX can be derived as follows.

mvdLX [0] = mvd_lX [xP] [yP] [0]

mvdLX [1] = mvd_lX [xP] [yP] [1]

3. When predFlagLX is 1, the variable mvpLX can be derived.

4. When predFlagLX is 1, the luminance motion vector mvLX can be derived as follows.

mvLX [0] = mvpLX [0] + mvdLX [0]

mvLX [1] = mvpLX [1] + mvdLX [1]
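The mapping of List0 related syntax onto the selected list in steps 1 to 4 can be sketched as follows. The function name derive_uni_pred_motion and the returned dictionary are hypothetical, and the motion vector predictor mvp is assumed to have been derived already (step 3 does not specify how); only the uni-prediction branch is shown.

```python
def derive_uni_pred_motion(lc, idx_in_list, mvd_l0, mvp):
    """Map list-0 syntax onto the list chosen by the scan.

    lc          -- "L0" or "L1", the list selected by the scan (LC in the text)
    idx_in_list -- idxL0 or idxL1 reached when the scan ended
    mvd_l0      -- (horizontal, vertical) motion vector difference signalled
                   with list-0 syntax
    mvp         -- (horizontal, vertical) predictor, assumed already derived
    """
    if lc not in ("L0", "L1"):
        raise ValueError("lc must be 'L0' or 'L1'")
    use_l0 = lc == "L0"
    # Step 4: mvLX = mvpLX + mvdLX, where mvdLX is copied from mvd_l0.
    mv = (mvp[0] + mvd_l0[0], mvp[1] + mvd_l0[1])
    return {
        "refIdxL0": idx_in_list if use_l0 else -1,
        "refIdxL1": -1 if use_l0 else idx_in_list,
        "predFlagL0": 1 if use_l0 else 0,
        "predFlagL1": 0 if use_l0 else 1,
        "mvL0": mv if use_l0 else None,
        "mvL1": None if use_l0 else mv,
    }
```

In this sketch the unused list receives a reference index of -1 and a prediction flag of 0, mirroring the assignments listed for the L0 and L1 cases above.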

In the above-described motion vector and reference picture index derivation process, the order or method in which the reference pictures are scanned is the same as described in the embodiment of FIG. 14. In addition, in the above process, it may be confirmed that List0 related syntax is used instead of LC related syntax without generating a separate combined list. That is, ref_idx_l0, mvd_l0, and mvp_idx_l0 may be used instead of ref_idx_lc, mvd_lc, and mvp_idx_lc, respectively. Therefore, motion information such as a motion vector and a reference picture index may be derived in this process by copying from list 0, not from a combined list.

In the uni-prediction mode, the entries of list 0 and list 1 may be used as the reference picture by the above-described reference picture selection process. In the above-described motion vector and reference picture index derivation process, inter_pred_flag only needs to indicate whether the inter prediction mode is bi-prediction or uni-prediction, without generating LC. Thus, the process can provide the benefits of the combined list as they are.

FIG. 15 is a flowchart schematically illustrating a method for deriving motion information according to an embodiment of the present invention. The motion information may include a motion vector, a reference picture index, and the like. The embodiment of FIG. 15 may include the method of selecting a reference picture according to the embodiment of FIG. 14 described above, and in the embodiment of FIG. 15, a combined list may not be generated.

The encoder or the decoder generates a reference picture list 0 and a reference picture list 1 (S1510). In this case, the encoder or the decoder may not generate a separate list such as a combined list.

The encoder or the decoder scans the reference pictures using reference picture list 0 and reference picture list 1 (S1520). At this time, the first entry of reference picture list 0 may be scanned first. In addition, reference picture list 0 and reference picture list 1 may be scanned alternately in order of increasing reference picture index. Also, a reference picture that has already been scanned may be skipped.

The encoder or the decoder may derive motion information such as a motion vector and a reference picture index from the selected reference picture through a scan process (S1530).

In the embodiment of FIG. 15, the List0 related syntax may be used instead of the LC related syntax without generating the combined list, as in the above-described motion vector and reference picture index derivation process. In addition, in the uni-prediction mode, entries of both List 0 and List 1 can be used as reference pictures, and inter_pred_flag only needs to indicate whether the inter prediction mode is bi-prediction or uni-prediction, without generating LC. Therefore, the amount of information transmitted may be reduced, as when a combined list is generated.

When the combined list is generated, a flag called ref_pic_list_combination_flag may be used as described above. When ref_pic_list_combination_flag is 1, list 0 and list 1 are combined to generate a combined list for uni-prediction. When the flag is 0, it means that list 0 and list 1 are the same, and list 0 may be used as the combined list.

In addition, when the flag is 0, it may mean GPB. As described above, a picture having the same reference picture list 0 and reference picture list 1 among the B pictures is referred to as a generalized P and B or a generalized B picture. In addition, in one embodiment, only forward prediction may be allowed in the GPB.

FIG. 16 is a conceptual diagram schematically illustrating an embodiment of reference pictures and reference picture lists for a GPB and a low delay B picture. In FIG. 16, pictures are shown from the left in POC order, that is, in display order.

Referring to 1610 of FIG. 16, List 0 and List 1 are the same. Accordingly, 1610 of FIG. 16 illustrates a case in which the current picture is a GPB, that is, a generalized B picture.

Referring to 1620 of FIG. 16, the reference picture indices of List 0 and List 1 may not be the same. However, the reference pictures of List 0 and List 1 are both reference pictures used for forward prediction.

A P picture having only a forward reference picture or a B picture having only a forward reference picture may be used for low delay coding. Hereinafter, a B picture having only a forward reference picture may be referred to as a low delay B picture, and may be referred to as a low delay B slice in a slice unit.

1620 of FIG. 16 illustrates a case where the current picture is a low delay B picture.

In the encoding and decoding process, a reference picture list is generated through a process that includes initialization and modification. In the reference picture list generation process, reference picture list 0 and reference picture list 1 are always generated together for a B slice. The slice header includes information transmitted for an inter picture (P picture, B picture). In the case of a B picture or a B slice, information for list 1 is additionally included and transmitted.

Information for list 1 may include num_ref_idx_l1_active_minus1, collocated_from_l0_flag, ref_pic_list_modification_flag_l1, and the like. num_ref_idx_l1_active_minus1 indicates the number of reference pictures present in list 1, and collocated_from_l0_flag indicates whether the co-located block used for inter prediction is obtained from list 0 or list 1. ref_pic_list_modification_flag_l1 has the same meaning as ref_pic_list_modification_flag_lc described above, except that lc is replaced by l1.

The information for List 1 may also include switched interpolation filter with offset (SIFO) information indicating an interpolation filter and offset information to be used when reference pictures in List 1 are used.

In a typical B picture, all information for list 0 and list 1 should be transmitted. However, in a GPB picture or a GPB slice, reference picture list 0 and reference picture list 1 are the same. Therefore, in GPB, list 1 does not need to be transmitted separately, since list 0 has substantially all the information about the reference pictures.

This duplication of information can be resolved by copying all entries of list 0 to list 1 without using list 1 in the case of GPB slices. That is, if the current slice is a B slice and List0 and List1 are the same, all entries of List0 can be copied to List1 in the same order.

If duplication of information related to List1 information in the GPB is eliminated, the amount of information transmitted is reduced and the compression efficiency is increased, thereby improving coding efficiency. The coding method of the slice header may be modified to remove duplication of information in the GPB, and a syntax element called generalized_b_slice may be provided. generalized_b_slice indicates whether the current slice is a GPB slice. In addition, a syntax element indicating whether or not it is a GPB is not limited to its name (generalized_b_slice), and a syntax element indicating whether it is a GPB under any other name is included in the spirit of the present invention.

For example, the slice header may include information as shown in Table 6 below.

TABLE 6

Referring to [Table 6], when a slice is a B slice, num_ref_idx_l1_active_minus1 and collocated_from_l0_flag related to List 1 are transmitted in the slice header regardless of whether the slice is a GPB slice. The information about list 1 does not need to be transmitted for a GPB slice. However, in [Table 6], whether the slice is a GPB slice can be recognized only after all information about List 0 and List 1 has been parsed. Therefore, even for a GPB slice, information about List 1 is always included in the slice header and must be transmitted, which is a burden.

Table 7 below shows a modified slice header according to an embodiment of the present invention.

TABLE 7

Referring to [Table 7], whether the current slice is a GPB slice may be recognized through generalized_b_slice during parsing of the slice header. In addition, by using generalized_b_slice, num_ref_idx_l1_active_minus1 and collocated_from_l0_flag, which are related to list 1, may be transmitted only when the current slice is not a GPB slice. Thus, duplication of unnecessary information for the GPB slice can be eliminated.

Table 8 below is an example of the SIFO parameter sifo_param () to which the present invention is not applied.

TABLE 8

Table 8 also has the same problems as in Table 6.

Table 9 below shows an embodiment of the SIFO parameter sifo_param () to which the present invention is applied.

TABLE 9

Referring to [Table 9], by using generalized_b_slice, information related to list 1 may be transmitted only when the current slice is not a GPB slice. Thus, duplication of unnecessary information for the GPB slice can be eliminated.

Table 10 below is an example of syntax to which the present invention is not applied.

TABLE 10

Table 10 also has the same problems as in Table 6.

Table 11 below shows an embodiment of syntax to which the present invention is applied.

TABLE 11

Referring to [Table 11], by using generalized_b_slice, the information sifo_offset_l1 related to list 1 may be transmitted only when the current slice is not a GPB slice. Thus, duplication of unnecessary information for the GPB slice can be eliminated.

The method for preventing transmission of List 1 related information from a GPB slice is not limited to the embodiments of Tables 7, 9, and 11.

Since GPB refers to a slice in which List0 and List1 are the same among B slices, it may not be necessary to determine whether the current picture is a GPB in a P picture that does not need information about List1. Therefore, the generalized_b_slice flag may be transmitted only when the current slice is a B slice.

In addition, for a GPB slice, the encoder may omit the List1 related information and also not separately transmit the generalized_b_slice syntax element to the decoder. In this case, the decoder may compare the reference pictures of list 0 with the reference pictures of list 1 to determine whether the slice is a GPB slice, and may infer the value of the syntax element to be 1 and use it. This case also has the advantage that transmission of unnecessary information regarding list 1 is prevented.

In addition, in the case of GPB, when information for reference pictures is added in addition to the details described above in the above embodiment, the amount of information transmitted using the same principle may be reduced. At this time, the amount of information transmitted with respect to the reference picture list can be reduced by half, and coding efficiency can be increased.

According to an embodiment of the present invention, when the current slice is a generalized B slice, the list 1 related information that would otherwise be included in the slice header may be omitted without being transmitted. In this case, the omitted slice header information may be obtained by directly copying the information of the reference pictures of List0 having the same POC. Therefore, the decoder can copy and use the omitted information from the list 0 related information.

The syntax below illustrates one embodiment of a method for deriving information about List1.

if (POC (ref_pic (refidx_l0)) == POC (ref_pic (refidx_l1))

&& generalized_b_slice)

{

num_ref_idx_l1_active_minus1 = num_ref_idx_l0_active_minus1;

sifo_info_l1 = sifo_info_l0;

…

}
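The derivation above can also be sketched in runnable form. In this illustrative sketch the slice header is represented by a plain dictionary, the key ref_pic_list0 and the function name derive_list1_info are hypothetical, and the POC comparison of the syntax above is assumed to hold already because list 0 and list 1 are identical in a GPB slice.

```python
def derive_list1_info(slice_header, generalized_b_slice):
    """Return a copy of slice_header with list-1 fields filled in for a GPB slice.

    slice_header is a dict holding the parsed list-0 fields; the list-1 fields
    are assumed to be absent from the bitstream when the slice is GPB.
    """
    derived = dict(slice_header)
    if generalized_b_slice:
        # Omitted list-1 information is copied from the list-0 information.
        derived["num_ref_idx_l1_active_minus1"] = derived["num_ref_idx_l0_active_minus1"]
        derived["sifo_info_l1"] = derived["sifo_info_l0"]
        # All entries of list 0 are copied to list 1 in the same order.
        derived["ref_pic_list1"] = list(derived["ref_pic_list0"])
    return derived
```

The input dictionary is left unmodified, so the parsed header and the derived header can be compared when checking the copy.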

In the case of the low delay B picture, since list 0 and list 1 may not be the same, the above method for reducing the amount of information transmitted for a GPB picture may not be applicable. Instead, a method can be provided that reduces the complexity of the operation for determining temporal candidates in inter prediction.

In inter prediction, for example, a merge skip method, a prediction unit merge method, an AMVP method, or the like may be used. In the AMVP method, motion prediction related information of one AMVP candidate block among two spatial AMVP candidate blocks and one temporal AMVP candidate block may be used as motion vector related information of the current prediction target block. Additionally, the difference between the motion vector value of the current prediction target block and the motion vector value of the selected AMVP candidate block may be used as the motion prediction related information of the current block.

In order to generate a temporal AMVP candidate block, a co-located block may be derived from a reference picture based on the current prediction target block. In order to derive the best motion prediction related information from the co-located block, both list 0 and list 1 of the co-located block can be checked. In this case, when pictures are arranged in POC order, a motion vector that crosses the current picture among the motion vectors of the co-located block may be selected for temporal motion vector scaling.

However, if the motion information is selected only after both List 0 and List 1 are checked, the complexity of the operation can be large. Accordingly, a method of checking motion information in one list according to a condition and then checking motion information in another list when there is no valid motion information in the list may be used.

In the case of the low delay B picture, since the POC of the co-located reference picture is always smaller than the POC of the current picture, the motion vector of the co-located block may not cross the current picture. In this case, preferentially selecting a motion vector that crosses the current picture may be meaningless, and complexity may increase since both list 0 and list 1 must be checked.

Accordingly, in the case of the low delay B picture, a method may be used in which, among the reference picture lists of the co-located block, the list identical to the reference picture list of the current picture is checked first. At this time, if there is no valid motion information in the list checked first, the other list of the co-located block may be checked. The embodiment of the present invention is not limited to the above, and the list to be checked first may be selected in other ways.

Information on whether the picture is a low delay B picture may be included in a flag and transmitted from the encoder to the decoder. Alternatively, the encoder may not transmit information about whether the picture is a low delay B picture. In this case, the decoder may compare the reference pictures of list 0 and list 1 to determine whether the picture is a low delay B picture.

According to the above method, when valid motion information exists in the list checked first, the other list of the co-located block need not be checked, so the complexity may be reduced.
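The checking order for the low delay B case can be sketched as follows. The function name first_valid_colocated_motion and the dictionary representation of the co-located block's motion information are hypothetical; a value of None is assumed to stand for "no valid motion information in that list".

```python
def first_valid_colocated_motion(col_motion, current_list):
    """Pick motion information of the co-located block for a low delay B picture.

    col_motion   -- {"L0": mv_or_None, "L1": mv_or_None} for the co-located block
    current_list -- "L0" or "L1", the reference picture list of the current picture
    Returns (list_name, mv), or None when neither list holds valid motion.
    """
    other = "L1" if current_list == "L0" else "L0"
    # The list identical to the current reference picture list is checked first;
    # the other list is consulted only if the first one has no valid motion.
    for name in (current_list, other):
        mv = col_motion.get(name)
        if mv is not None:
            return name, mv
    return None
```

Because the loop returns on the first valid entry, the second list is never inspected when the first one already holds valid motion information, which is where the reduction in complexity comes from.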

As another embodiment for preventing the transmission of unnecessary information in the GPB slice, a method using the aforementioned ref_pic_list_combination_flag may be provided. ref_pic_list_combination_flag specifies whether reference picture list 0 and reference picture list 1 are the same or not. If List0 and List1 are equal, they correspond to GPB slices. Therefore, when the current slice is a GPB slice according to the flag information, transmission of List 1 related information may be skipped.

In some cases, ref_pic_list_combination_flag may be signaled after List1 related syntax. In this case, before signaling of the list1 related information, whether the current slice is a GPB slice cannot be determined, and therefore, transmission of the list1 related information cannot be skipped. However, if the position of the flag is changed on the syntax so that the flag can be signaled before the List1 related syntax, the signaling of the List1 related information in the GPB slice may be skipped.

In addition, ref_pic_list_identical_flag may be used instead of ref_pic_list_combination_flag as the flag name. When ref_pic_list_identical_flag is 0, list 0 and list 1 are combined to generate a combined list for uni-prediction. When the flag is 1, list 0 and list 1 are the same, and list 0 is used as the combined list.

The new flag may indicate whether List0 and List1 are the same. If the flag is 1, it means that the current slice is a GPB slice. However, the flag is not limited to the name and may be included in the spirit of the present invention as long as the syntax element indicates whether the list 0 and the list 1 are the same regardless of any other names.

Table 12 below shows an embodiment of a slice header configured using ref_pic_list_identical_flag according to the above-described method.

TABLE 12

Table 13 below shows an embodiment of the ref_pic_list_modification () configuration in the slice header of Table 12.

TABLE 13

Table 14 below shows an embodiment of the ref_pic_list_combination () configuration in the slice header of Table 12.

TABLE 14

Referring to [Table 12], [Table 13], and [Table 14], when the syntax element ref_pic_list_combination_flag is located after the syntax elements related to List1, the ref_pic_list_combination_flag information cannot be used to decode the List1 related information. Therefore, the flag may be removed as in the embodiment of Table 14, and may be replaced with !ref_pic_list_identical_flag.

In the slice header, ref_pic_list_identical_flag is defined before List1 related syntax elements num_ref_idx_l1_active_minus1, collocated_from_l0_flag, and ref_pic_list_modification_flag_l1. In addition, a condition is added to each List1 related syntax element so that List1 related syntax elements num_ref_idx_l1_active_minus1, collocated_from_l0_flag and ref_pic_list_modification_flag_l1 may be transmitted to the decoder only when ref_pic_list_identical_flag is 0. In other words, List1 related syntax elements are not transmitted in the GPB slice. Therefore, transmission of unnecessary information in the GPB slice can be prevented.

In the embodiment of Table 12, the semantics of collocated_from_l0_flag may be changed by ref_pic_list_identical_flag.

If ref_pic_list_identical_flag is not present, collocated_from_l0_flag equal to 1 means that the co-located picture is derived from list 0, otherwise it is derived from list 1.

If ref_pic_list_identical_flag is present, collocated_from_l0_flag equal to 1 means that the co-located picture is derived from list 0; otherwise it is derived from list 1. If collocated_from_l0_flag is not present, it is inferred to be 1. Referring to Table 12, collocated_from_l0_flag is not present when the current slice is not a B slice or when it is a GPB slice. In this case, since all the reference pictures can be derived from list 0, the flag can be inferred to be 1.

According to the embodiments of [Table 12], [Table 13], and [Table 14], when the current slice is a GPB slice, reference picture list 1 related information is not transmitted to the decoder. When the current slice is a B slice and ref_pic_list_identical_flag is 1, all entries of the reference picture list 0 may be copied to the reference picture list 1 in the same order. Therefore, the decoder can derive information about list 1 from list 0.
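The conditional presence of the List1 related syntax elements can be sketched as follows. This sketch does not reproduce Table 12: the element order, the function name parse_list1_fields, and the representation of the bitstream as an iterable of already-decoded values are assumptions made for illustration. What it is meant to show is that ref_pic_list_identical_flag is read before the List1 elements, that those elements are read only when the flag is 0, and that collocated_from_l0_flag is inferred to be 1 when absent.

```python
def parse_list1_fields(values, slice_type, ref_pic_list0):
    """Read list-1 related fields from an iterable of decoded syntax values.

    values        -- decoded values in the (assumed) order they are signalled
    slice_type    -- "I", "P" or "B"
    ref_pic_list0 -- already constructed reference picture list 0
    """
    it = iter(values)
    hdr = {"collocated_from_l0_flag": 1}  # inferred to be 1 when not present
    if slice_type != "B":
        return hdr  # no list-1 information for non-B slices
    hdr["ref_pic_list_identical_flag"] = next(it)
    if hdr["ref_pic_list_identical_flag"] == 0:
        # Ordinary B slice: list-1 syntax elements are present.
        hdr["num_ref_idx_l1_active_minus1"] = next(it)
        hdr["collocated_from_l0_flag"] = next(it)
        hdr["ref_pic_list_modification_flag_l1"] = next(it)
    else:
        # GPB slice: nothing is read; list 1 is a copy of list 0.
        hdr["ref_pic_list1"] = list(ref_pic_list0)
    return hdr
```

For a GPB slice only the single flag is consumed from the input, so in this sketch the List1 related elements cost no bits at all.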

As another embodiment for preventing transmission of unnecessary information in a GPB slice, a method of separately defining a GPB in a slice type (slice_type) may be used.

In the slice type currently used, three kinds of slices are defined: I slice, P slice, and B slice. This can be confirmed in the following [Table 15].

TABLE 15

inter_pred_flag is used to specify whether bi-prediction or uni-prediction is used for the current prediction unit. The slice type and the prediction mode or name corresponding to the inter_pred_flag value may be checked in the following [Table 16].

TABLE 16

In Table 16, when the inter_pred_flag does not exist, the prediction mode is estimated to be Pred_L0 if the slice type is P and Pred_BI if the slice type is B.

Table 17 shows an embodiment of a method of defining a slice type when a GPB is defined in the slice type.

TABLE 17

Table 18 shows an embodiment of a method of defining inter_pred_flag when GPB is defined in a slice type.

TABLE 18

In Table 18, when the inter_pred_flag is not present, the prediction mode is estimated to be Pred_L0 if the slice type is P and Pred_BI if the slice type is B or GPB.

Referring to Table 17 and Table 18, a new slice type called GPB slice is added. As described above, the GPB slice means a slice in which the reference picture list 0 and the reference picture list 1 are the same.

The definition of the B slice may be changed by defining a separate slice type called GPB. A conventional B slice refers to a slice decoded using intra prediction or inter prediction using at most two reference pictures. When GPB is defined as a separate slice type, a B slice means a slice that is decoded using intra prediction or inter prediction using at most two reference pictures, and in which reference picture list 0 and reference picture list 1 are not the same. That is, in the new definition of a B slice, the GPB slice is not included in the B slice.

Table 19 below shows an embodiment of a slice header configured using a GPB slice type according to the present invention.

TABLE 19

Table 20 below shows an embodiment of the ref_pic_list_modification () configuration in the slice header of Table 19.

TABLE 20

Referring to [Table 19], the case where the slice type is GPB is added as a condition for defining num_ref_idx_active_override_flag. This is because the newly defined B slice does not include the GPB slice. In addition, List1 related information is transmitted only in the B slice, not in the GPB slice. Therefore, in the case of a GPB slice, unnecessary List1 related information may not be transmitted.

Referring to [Table 20], slice_type % 5 != 3 means that the current slice is not an I slice. In the conventional slice type definition, 2 is used instead of 3 because the slice_type value of the I slice is 2, but 3 may be used instead of 2 since the slice_type value of the I slice is 3 in the newly defined slice type.

FIG. 17 is a flowchart schematically illustrating a method by which an encoder transmits a reference picture list according to an embodiment of the present invention. FIG. 17 schematically illustrates the flow of a method of transmitting a reference picture list according to the embodiments of Tables 7, 9, 11, 14, 17, and 20 described above.

The encoder determines whether the current slice is GPB (S1710).

If the current slice is a GPB slice, the encoder encodes and transmits information indicating whether the current slice is GPB (S1720).

In this case, the information indicating whether the current slice is a GPB slice may be included in the slice header as generalized_b_slice and transmitted, as in the embodiments of Tables 7, 9, and 11. Alternatively, as in the embodiments of Tables 12, 13, and 14, ref_pic_list_identical_flag or a modified ref_pic_list_combination_flag may be included in the slice header and transmitted to convey this information.

In addition, the information indicating whether the current slice is a GPB slice may be conveyed by defining GPB as a separate slice type, as in the embodiments of Tables 17 to 20, and reflecting it in the syntax of the slice header.

In this case, however, the List 1 related information need not be transmitted. This is because List 0 and List 1 are the same in a GPB slice, so List 0 already carries substantially all the information about the reference pictures. When the List 1 related information is not transmitted, redundant information is eliminated, and coding efficiency can be improved by reducing the amount of transmitted information.

If the current slice is not a GPB slice, the encoder encodes and transmits information indicating whether the current slice is GPB and List 1 related information to the decoder (S1730).
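The encoder-side flow of steps S1710 to S1730 can be sketched roughly as below. This is a minimal sketch under stated assumptions: reference pictures are represented by integer identifiers, and the dictionary keys `is_gpb`, `list0_info`, and `list1_info` are hypothetical stand-ins for the actual slice header syntax elements (for example generalized_b_slice or a GPB slice type).

```python
from typing import Dict, List


def build_slice_list_info(list0: List[int], list1: List[int]) -> Dict[str, object]:
    """Collect the reference picture list information to send for one slice.

    list0 and list1 hold picture identifiers in list order (an assumption
    made for this sketch).
    """
    # S1710: a GPB slice is one whose List 0 and List 1 are the same.
    is_gpb = list0 == list1
    info: Dict[str, object] = {"is_gpb": is_gpb, "list0_info": list(list0)}
    if not is_gpb:
        # S1730: List 1 related information is sent only for non-GPB slices.
        info["list1_info"] = list(list1)
    # S1720: for a GPB slice, only the GPB indication and List 0 are sent.
    return info
```

For a GPB slice such as `build_slice_list_info([8, 4], [8, 4])`, the returned information carries no List 1 entry, which is where the saving in transmitted information comes from.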

FIG. 18 is a flowchart schematically illustrating a method by which a decoder generates a reference picture list according to an embodiment of the present invention. FIG. 18 schematically illustrates the flow of a method of generating a reference picture list according to the embodiments of Tables 7, 9, 11, 14, 17, and 20 described above.

The decoder receives information indicating whether the current slice is GPB from the encoder (S1810). The received information can be decoded at the decoder.

The decoder determines whether the current slice is GPB based on the information received from the encoder (S1820). When generalized_b_slice is used as in the embodiments of Tables 7, 9, and 11, the decoder may determine whether the current slice is a GPB slice through generalized_b_slice. When the flag ref_pic_list_identical_flag is used as in the embodiments of Tables 12, 13, and 14, the decoder may determine whether the current slice is a GPB slice through ref_pic_list_identical_flag. When the embodiments of Tables 17 to 20 are applied, it may be determined whether the current slice is a GPB slice based on a separately defined GPB slice type.

If the current slice is a GPB slice, the decoder derives the reference picture list 1 related information from the reference picture list 0 related information (S1830). As described above with reference to FIG. 17, in the case of a GPB slice, the encoder may not transmit List 1 related information in order to remove unnecessary redundancy. However, since List 0 and List 1 are the same in the GPB slice, List 1 related information can be derived using the information about List 0.

If the current slice is not GPB, the encoder transmits List 1 related information to the decoder. The decoder decodes the information received from the encoder to derive list 1 related information (S1840).

The decoder may derive the reference picture list related information as described above, and generate the reference picture lists using the derived information.
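The decoder-side flow of steps S1810 to S1840 can be sketched in the same spirit. The input dictionary and its keys (`is_gpb`, `list0_info`, `list1_info`) are hypothetical and only stand in for the decoded slice header information; the sketch assumes reference pictures are identified by integers.

```python
from typing import Dict, List, Tuple


def derive_reference_lists(info: Dict[str, object]) -> Tuple[List[int], List[int]]:
    """Rebuild List 0 and List 1 from decoded slice-level information."""
    list0 = list(info["list0_info"])
    if info["is_gpb"]:
        # S1830: List 1 was not transmitted; it is derived from List 0.
        list1 = list(list0)
    else:
        # S1840: List 1 related information was transmitted explicitly.
        list1 = list(info["list1_info"])
    return list0, list1
```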

If the inter prediction mode of the prediction target block in the current picture is a bi-prediction mode and the current slice is a GPB slice, the reference picture index (refIdxL0) indicating the List 0 reference picture referenced by the prediction target block and the reference picture index (refIdxL1) indicating the List 1 reference picture referenced by the prediction target block may indicate the same reference picture.

FIG. 19 is a conceptual diagram schematically illustrating an embodiment of a picture in which a reference picture index of list 0 and a reference picture index of list 1 indicate the same reference picture. In FIG. 19, pictures are shown in display order, that is, in temporal order.

In biprediction mode, both List0 and List1 exist. Therefore, the prediction target block in the current picture has both a reference picture index refIdxL0 for indicating the reference picture of List0 and a reference picture index refIdxL1 for indicating the reference picture of List1. Hereinafter, the reference picture index for the reference picture referenced by the current prediction target block in list 0 is referred to as index L0 and the reference picture index for the reference picture referred to by the current prediction target block in list 1 is referred to as index L1.

Referring to the embodiment of FIG. 19, both the index L0 and the index L1 of the prediction target block in the current picture indicate the reference picture List0 [0] closest in temporal distance.

When the inter prediction mode of the current prediction target block is a bi-prediction mode and the reference pictures indicated by the index L0 and the index L1 are the same, the index L1 may not be transmitted. In this case, however, the encoder needs to inform the decoder whether the reference pictures indicated by the index L0 and the index L1 are the same. In an embodiment, a flag called ref_idx_l1_present_flag may be used to indicate whether the reference pictures indicated by the index L0 and the index L1 are the same.

The flag is not limited to this name; any flag that indicates whether the reference pictures indicated by the index L0 and the index L1 are the same falls within the spirit of the present invention.

[Table 21] below shows a syntax structure related to the reference picture index of the prediction unit according to the embodiment of the present invention.

TABLE 21

In Table 21, if ref_idx_l1_present_flag is 1, ref_idx_l1 is present. If ref_idx_l1_present_flag is 0, ref_idx_l1 is not present. If ref_idx_l1_present_flag itself is not present, it is inferred to be 1. When ref_idx_l1_present_flag is 0, ref_idx_l1 is inferred to be a value indicating the same reference picture as the reference picture indicated by ref_idx_l0.

The meaning of ref_pic_list_combination_flag has been described above.

Referring to the embodiment of Table 21, ref_idx_l1_present_flag is defined only when the prediction mode of the current unit is bi-prediction and the current slice is GPB. In addition, ref_idx_l1 is defined only when ref_idx_l1_present_flag is 1.

When ref_idx_l1_present_flag is 0, information about index L1 may not be transmitted, where index L0 and index L1 indicate the same reference picture. According to an embodiment of the present invention, since the transmission of index L1 (ref_idx_l1) information may be skipped when the index L0 and the index L1 indicate the same reference picture, the amount of information to be transmitted may be reduced.

In this case, the index L1 may be derived at the decoder by the following process.

The variable refIdxCo is set to an index value of List 1 indicating the same reference picture as the reference picture indicated by ref_idx_l0. If ref_idx_l1_present_flag is 0, refIdxL1 = refIdxCo, otherwise refIdxL1 = ref_idx_l1.

That is, when the information about the index L1 is transmitted, the decoder uses it to derive the index L1; otherwise, the decoder may derive the index L1 from the index L0 information. This is because the reference pictures indicated by index L0 and index L1 are the same when ref_idx_l1_present_flag is 0.
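The derivation of refIdxL1 described above can be sketched as follows. The function name and the representation of the lists as Python lists of picture identifiers are assumptions for illustration; refIdxCo is computed here by looking up, in List 1, the picture that ref_idx_l0 indicates in List 0.

```python
from typing import List, Optional


def derive_ref_idx_l1(
    list0: List[int],
    list1: List[int],
    ref_idx_l0: int,
    ref_idx_l1_present_flag: int,
    ref_idx_l1: Optional[int] = None,
) -> int:
    """Return refIdxL1 for a bi-predicted block."""
    if ref_idx_l1_present_flag:
        if ref_idx_l1 is None:
            raise ValueError("ref_idx_l1 must be provided when the flag is 1")
        # The index was transmitted, so it is used directly.
        return ref_idx_l1
    # refIdxCo: the List 1 index of the picture that ref_idx_l0 points to.
    ref_idx_co = list1.index(list0[ref_idx_l0])
    return ref_idx_co
```

In a GPB slice the two lists are identical, so refIdxCo simply equals ref_idx_l0; the lookup only matters if the same idea is applied to lists that differ.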

If the reference pictures indicated by the index L0 and the index L1 are not the same, the index L1 information must be transmitted. In this case, a method capable of reducing the amount of information, or the number of bits, needed to transmit the index L1 information may be provided. This method may be applied when the following conditions are all satisfied: ref_pic_list_combination_flag is 1, that is, the current slice is a GPB slice; inter_pred_flag is Pred_BI, that is, the inter prediction mode is the bi-prediction mode; and ref_idx_l1_present_flag is present and has a value of 1. These conditions are satisfied when the reference pictures indicated by the index L0 and the index L1 are different. Hereinafter, a method for reducing the bit amount of the transmitted index L1 information is described.

Since the reference pictures indicated by the index L0 and the index L1 are different from each other, the encoder can generate the index L1 information using only the List 1 indexes that remain after excluding the index indicating the same picture as the reference picture indicated by the index L0. In this case, 1 is subtracted from each List 1 index value that is higher than the index indicating the same picture as the reference picture indicated by the index L0, and the index L1 information may be generated using the changed index values. The bit amount of the transmitted index L1 information may thereby be reduced.

In an embodiment, it is assumed that List 1 may indicate four reference pictures, and values of 0, 1, 2, and 3 are assigned to an index indicating each reference picture. In this case, it is assumed that the index of the list 1 indicating the same picture as the reference picture indicated by the index L0 is 1. The index L1 may have an index value of 0, 2 or 3, and the encoder may transmit the indicated index L1 information to the decoder as it is.

However, when ref_idx_l1_present_flag is 1, the decoder can recognize that the reference pictures indicated by index L0 and index L1 are different from each other. Therefore, the index L1 information may be generated using index values of 0, 1, or 2, obtained by subtracting 1 from the index values 2 and 3, which are greater than 1. Index L1 may then have an index value of 0, 1, or 2 instead of 0, 2, or 3, and the amount of information or bits to be transmitted may be reduced.
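The worked example with four List 1 entries can be reproduced with a short sketch of the encoder-side index mapping. The function name `encode_ref_idx_l1` is hypothetical, and the sketch assumes the same-picture case has already been signalled separately through ref_idx_l1_present_flag equal to 0.

```python
def encode_ref_idx_l1(ref_idx_l1_actual: int, ref_idx_co: int) -> int:
    """Map the actual List 1 index to the value that is transmitted.

    ref_idx_co is the List 1 index indicating the same picture as the one
    indicated by index L0; that index is never transmitted in this mode.
    """
    if ref_idx_l1_actual == ref_idx_co:
        raise ValueError("same picture: signalled with ref_idx_l1_present_flag = 0")
    if ref_idx_l1_actual > ref_idx_co:
        # Indexes above the excluded one shift down by 1.
        return ref_idx_l1_actual - 1
    # Indexes below the excluded one are unchanged.
    return ref_idx_l1_actual
```

With the excluded index equal to 1, the actual values 0, 2, and 3 map to 0, 1, and 2, so the largest value to be coded drops from 3 to 2.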

When the index L1 information is transmitted to the decoder by the above method, the process for deriving the index L1 information on the decoder side may be performed as follows.

If ref_idx_l1 is smaller than refIdxCo, refIdxL1 = ref_idx_l1. Otherwise, that is, if ref_idx_l1 is greater than or equal to refIdxCo, refIdxL1 = ref_idx_l1 + 1. Here, refIdxL1 is the reference picture index for List 1 derived at the decoder.

The variable refIdxCo is an index value of List 1 indicating the same reference picture as the reference picture indicated by the index L0 (ref_idx_l0). Referring to the process of the encoder, since index values of List 1 having a value lower than the index value of refIdxCo are not changed, ref_idx_l1 information transmitted from the encoder may be copied to refIdxL1 as it is. However, in the case of index values of List 1 having a value higher than that of refIdxCo, ref_idx_l1 information may be generated after 1 is subtracted from the index values. Therefore, the decoder can copy ref_idx_l1 plus 1 to refIdxL1.

Referring to the encoder example above, when the List 1 index value indicating the same picture as the reference picture indicated by the index L0 is 1 and the index value of index L1 is 0, the index value is not changed. Therefore, the decoder can use the ref_idx_l1 information received from the encoder as it is. When the List 1 index value indicating the same picture as the reference picture indicated by index L0 is 1 and the index value of index L1 is 2 or 3, the value 1 or 2, obtained by subtracting 1 from the index value, is used to generate the index L1 information. Therefore, the decoder can derive the index L1 information by copying the received ref_idx_l1 plus 1 to refIdxL1.

When the index value of index L1 is larger than the List 1 index value indicating the same picture as the reference picture indicated by index L0, the value obtained by subtracting 1 from the index value is used to generate the index L1 information, so ref_idx_l1 and refIdxCo may be equal. Even in this case, the decoder can derive the index L1 information by copying the received ref_idx_l1 plus 1 to refIdxL1.
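The decoder-side recovery described in the preceding paragraphs can be sketched as below. The function name is hypothetical. Following the paragraph just above, a received ref_idx_l1 equal to refIdxCo is handled the same way as a greater value, that is, by adding 1.

```python
def decode_ref_idx_l1(ref_idx_l1: int, ref_idx_co: int) -> int:
    """Recover refIdxL1 from the transmitted ref_idx_l1 when the flag is 1."""
    if ref_idx_l1 < ref_idx_co:
        # Values below refIdxCo were transmitted unchanged.
        return ref_idx_l1
    # Values at or above refIdxCo had 1 subtracted at the encoder.
    return ref_idx_l1 + 1
```

With refIdxCo equal to 1, the received values 0, 1, and 2 are restored to 0, 2, and 3, and the result can never equal refIdxCo.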

The above-described embodiments for reducing the amount of reference picture index information transmitted are not necessarily applicable only to a GPB slice, in which reference picture list 0 and reference picture list 1 are the same. The embodiments of FIG. 19 and Table 21 may also be applied when reference picture list 0 and reference picture list 1 are not the same, that is, when the current slice is not a GPB slice.

In this case, the condition if (!ref_pic_list_combination_flag), which specifies whether List 0 and List 1 are the same in the syntax of [Table 21], may be modified or removed. In addition, the GPB slice condition may be removed from the above-described method of deriving refIdxL1 for the case where the reference pictures indicated by the index L0 and the index L1 are not the same.

FIG. 20 is a flowchart schematically illustrating a method by which an encoder transmits a reference picture index according to an embodiment of the present invention. FIG. 20 schematically illustrates the flow of a reference picture index transmission method according to the above-described embodiments of FIG. 19 and Table 21. In the embodiment of FIG. 20, the inter prediction mode may be bi-prediction. The reference picture index transmission method according to the embodiment of FIG. 20 may also be applied when the inter prediction mode is bi-prediction and the current slice is a GPB slice.

The encoder determines whether the reference picture index for the list 0 and the reference picture index for the list 1 indicate the same reference picture (S2010). In this case, as in the above-described embodiment, a flag called ref_idx_l1_present_flag may be used. According to an embodiment, the encoder may set the flag value to 0 when the reference picture index for list 0 and the reference picture index for list 1 indicate the same reference picture, and otherwise set the flag value to 1.

If the reference picture index for list 0 and the reference picture index for list 1 indicate the same reference picture, the encoder transmits a flag such as ref_idx_l1_present_flag to the decoder to inform the decoder of this information (S2020).

At this time, the encoder may not transmit the reference picture index for List1. This is because the index L1 can be derived using the information about the index L0 since the reference picture index for the list 0 and the reference picture index for the list 1 indicate the same reference picture. Since the transmission of the reference picture index for the list 1 is omitted, the amount of information transmitted may be reduced.

If the reference picture index for List 0 and the reference picture index for List 1 do not indicate the same reference picture, the encoder transmits to the decoder not only the flag indicating this but also the reference picture index for List 1 (S2030).

In this case, the encoder may subtract 1 from each List 1 index value that is higher than the List 1 index value indicating the same picture as the reference picture indicated by index L0, and then generate the index L1 information using the changed index value. The bit amount of the transmitted index L1 information may thereby be reduced.
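Steps S2010 to S2030 can be put together in a rough sketch of the encoder-side decision. The function name and the returned dictionary are hypothetical, and lists are modelled as Python lists of picture identifiers. The branch for the case where the List 0 picture does not appear in List 1 at all is an assumption of this sketch; the text does not describe that case.

```python
from typing import Dict, List


def build_ref_idx_syntax(
    list0: List[int], list1: List[int], idx_l0: int, idx_l1: int
) -> Dict[str, int]:
    """Decide which reference index syntax elements to send for one block."""
    # S2010: do index L0 and index L1 indicate the same reference picture?
    same_picture = list0[idx_l0] == list1[idx_l1]
    syntax = {
        "ref_idx_l0": idx_l0,
        "ref_idx_l1_present_flag": 0 if same_picture else 1,
    }
    if same_picture:
        # S2020: only the flag is sent; ref_idx_l1 is omitted.
        return syntax
    # S2030: send ref_idx_l1, shifted down past the excluded index.
    picture_l0 = list0[idx_l0]
    if picture_l0 in list1:
        ref_idx_co = list1.index(picture_l0)
        syntax["ref_idx_l1"] = idx_l1 - 1 if idx_l1 > ref_idx_co else idx_l1
    else:
        # Assumption: with no matching List 1 entry, nothing is excluded.
        syntax["ref_idx_l1"] = idx_l1
    return syntax
```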

FIG. 21 is a flowchart schematically illustrating a reference picture index generation method of a decoder according to an embodiment of the present invention. FIG. 21 schematically illustrates the flow of a reference picture index generation method according to the embodiments of FIG. 19 and Table 21.

The decoder receives from the encoder a flag indicating whether the reference picture index for the list 0 and the reference picture index for the list 1 indicate the same reference picture (S2110). The received flag information can be decoded in the decoder.

The decoder determines whether the reference picture index for the list 0 and the reference picture index for the list 1 indicate the same reference picture based on the flag information received from the encoder (S2120). In this case, as in the embodiments of FIG. 19 and Table 21, a flag called ref_idx_l1_present_flag may be used. When the flag is 0, it may indicate that the index L0 and the index L1 indicate the same reference picture. Otherwise, the index L0 and the index L1 do not indicate the same reference picture.

When the index L0 and the index L1 indicate the same reference picture, the decoder derives the index L1 from the index L0 information (S2130). As described above with reference to FIG. 20, when the index L0 and the index L1 indicate the same reference picture, the encoder may not transmit the index L1 information in order to remove unnecessary duplication. However, since the index L0 and the index L1 indicate the same reference picture, the index L1 may be derived from the index L0 information.

If the index L0 and the index L1 do not indicate the same reference picture, the decoder decodes the information received from the encoder to derive the index L1 (S2140).

The encoder may generate the index L1 information using the changed index value after subtracting 1 from each List 1 index value that is higher than the List 1 index value indicating the same picture as the reference picture indicated by index L0. Therefore, when the index value of index L1 is larger than the List 1 index value indicating the same picture as the reference picture indicated by index L0, the decoder can derive index L1 by adding 1 to the index value of the index L1 information received from the encoder.

In the above-described embodiments, the methods are described based on flowcharts as a series of steps or blocks, but the present invention is not limited to the order of the steps, and any step may occur in a different order from, or simultaneously with, other steps described above. In addition, those skilled in the art will appreciate that the steps shown in the flowcharts are not exclusive and that other steps may be included or one or more steps in the flowcharts may be deleted without affecting the scope of the present invention.

The above-described embodiments include examples of various aspects. While not all possible combinations may be described to represent the various aspects, one of ordinary skill in the art would recognize that other combinations are possible. Accordingly, the invention is intended to embrace all other replacements, modifications and variations that fall within the scope of the following claims.

Claims

Receiving reference picture list information including first flag information;
Deriving maximum reference picture number information in the reference picture list based on the first flag information; And
Generating a reference picture list using the maximum reference picture number information.
The method according to claim 1,
The first flag information indicates whether the number of first maximum reference pictures allowed for the current picture and the number of second maximum reference pictures allowed for the current slice are equal to each other.
The first flag information is information obtained from sequence parameter set (SPS) information or picture parameter set (PPS) information.
The method according to claim 1,
The first flag information indicates whether the maximum number of reference pictures allowed for all pictures and slices in the current sequence is the same;
And the first flag information is information obtained from sequence parameter set information.
Generating a reference picture list 0 and a reference picture list 1 for the current picture;
Determining a temporal level of a reference picture included in the reference picture list 0 and the reference picture list 1 and a temporal distance from the current picture; And
Generating a combined list by inserting the reference picture based on the temporal level and the temporal distance;
In the generating of the combined list,
The reference picture having a small temporal distance is inserted first,
Among the reference pictures having the same temporal distance, a reference picture having a low temporal level is preferentially inserted.
Reference pictures included in reference picture list 0 are preferentially inserted among the reference pictures having the same temporal distance and the temporal level.
Image Decoding Method.
The method according to claim 4,
And receiving second flag information indicating whether a reference picture having a low temporal level is preferentially inserted or always a reference picture included in reference picture list 0 is preferentially inserted.
Generating a reference picture list 0 and a reference picture list 1 for the current picture;
Determining a quantization parameter of a reference picture included in the reference picture list 0 and the reference picture list 1 and a temporal distance from the current picture; And
Generating a combined list by inserting the reference picture based on the quantization parameter (QP) and the temporal distance;
In the generating of the combined list,
The reference picture having a small temporal distance is inserted first,
Among the reference pictures having the same temporal distance, a reference picture having a low quantization parameter is inserted first,
Reference pictures included in reference picture list 0 are preferentially inserted among the reference pictures having the same temporal distance and the quantization parameter.
Image Decoding Method.
A video encoding method using inter prediction in uni-prediction mode,
Generating a reference picture list 0 and a reference picture list 1 for the current picture;
Scanning reference pictures included in the reference picture list 0 and the reference picture list 1 and selecting a reference picture for a current prediction target block in the current picture; And
Deriving motion information on the current prediction target block using the selected reference picture,
In the step of selecting a reference picture for the current prediction target block in the current picture,
Starting from the reference picture of the reference picture list 0, the reference pictures of the reference picture list 0 and the reference pictures of the reference picture list 1 are alternately scanned in descending order of the reference picture index.
If the reference pictures in the current order are the same as the reference pictures in the previous order, the scan for the reference pictures in the current order is skipped.
Image Decoding Method.
Receiving image information including information indicating whether a current slice is a generalized P and B (GPB) slice from an encoder; And
Generating a reference picture list 0 and a reference picture list 1 based on the information indicating whether the current slice is a GPB slice;
In the step of generating the reference picture list 0 and the reference picture list 1,
If the current slice is a GPB slice, the reference picture list 1 is generated using the reference picture list 0 related information included in the image information.
The GPB slice is a slice in which the reference picture list 0 and the reference picture list 1 are the same.
Image Decoding Method.
The method according to claim 8,
The information indicating whether the current slice is a GPB slice is information indicated by a third flag indicating whether the current slice is a GPB slice,
The third flag is a flag generated by the encoder and transmitted to the decoder,
When the third flag indicates that the current slice is a GPB slice, the image information does not include reference picture list 1 related information.
Image Decoding Method.
The method according to claim 9,
The third flag is a flag generated in an encoder and transmitted to a decoder only when the current slice is a B picture.
Image Decoding Method.
The method according to claim 8,
The information indicating whether the current slice is a GPB slice is information indicated by a fourth flag indicating whether the reference picture list 0 and the reference picture list 1 are the same for the current slice.
The fourth flag is a flag generated by the encoder and transmitted to the decoder.
When the fourth flag indicates that the reference picture list 0 and the reference picture list 1 are the same for the current slice, the image information does not include reference picture list 1 related information.
Image Decoding Method.
The method according to claim 8,
The GPB slice is a slice type defined separately from an I slice, a P slice, and a B slice, and the B slice is a slice in which reference picture list 0 and reference picture list 1 are not the same.
Information indicating whether the current slice is a GPB slice is information indicated by a slice type of the GPB slice,
If the slice type of the current slice is a GPB slice, the image information does not include reference picture list 1 related information.
Image Decoding Method.
Receiving image information including information indicating whether a current slice is a low delay B slice;
Deriving a temporal motion information candidate by checking, based on the information indicating whether the current slice is a low delay B slice, a reference picture list of a co-located block with respect to a current prediction target block; And
Performing inter prediction using the temporal motion information candidate and the image information;
The low delay B slice is a slice having only a forward reference picture, the co-located block is a block located within a reference picture of the current prediction target block,
In the deriving of the temporal motion information candidate when the current slice is a low delay B slice, a reference picture list identical to a reference picture list of the current prediction target block among the reference picture lists of the co-located block is checked first.
Video decoding device.
Receiving image information including fifth flag information indicating whether index L0 and index L1 indicate the same reference picture from the encoder for the current prediction target block; And
Generating the index L0 and the index L1 based on the fifth flag information.
The index L0 is a reference picture index of a reference picture referenced by the current prediction target block in reference picture list 0, and the index L1 is a reference picture index of a reference picture referenced by the current prediction target block in reference picture list 1.
Image Decoding Method.
The method of claim 14, wherein in the generating of the index L0 and the index L1, when the reference picture indicated by the index L0 and the index L1 is the same,
The image information does not include the index L1 related information, and generates the index L1 using the index L0 related information included in the image information.
Image Decoding Method.
The method of claim 14, wherein in the generating of the index L0 and the index L1, when the reference pictures indicated by the index L0 and the index L1 are not the same,
When the index value of the index L1 is larger than the index Co, the index L1 is generated by adding 1 to the index value of the index L1 related information included in the image information, and when the index value of the index L1 is smaller than the index Co, the index L1 is generated by copying the index value of the index L1 related information included in the image information,
The index Co is an index value of the reference picture list 1 indicating the same picture as the reference picture indicated by the index L0.
Image Decoding Method.