WO2013065431A1

WO2013065431A1 - Image decoding device, image decoding method, and image encoding device

Info

Publication number: WO2013065431A1
Application number: PCT/JP2012/075195
Authority: WO
Inventors: 山本　智幸; 知宏猪飼; 将伸八杉
Original assignee: シャープ株式会社
Priority date: 2011-11-04
Filing date: 2012-09-28
Publication date: 2013-05-10

Abstract

The object is to reduce the code amount when asymmetric partitions are used and to achieve efficient encoding/decoding that exploits the properties of asymmetric partitions. Provided is an image decoding device that decodes an image in prediction units using, as the prediction scheme for inter-picture prediction, either single prediction, which refers to one reference image, or bi-prediction, which refers to two reference images. The device includes bi-prediction restriction means that restricts bi-prediction for a prediction unit whose size is equal to or smaller than a predetermined size.

Description

Image decoding apparatus, image decoding method, and image encoding apparatus

The present invention relates to an image decoding apparatus that decodes encoded data representing an image, an image decoding method, and an image encoding apparatus that generates encoded data by encoding an image.

In order to efficiently transmit or record a moving image, a moving image encoding device that generates encoded data by encoding the moving image, and a moving image decoding device that generates a decoded image by decoding the encoded data, are used.

Specific examples of moving image coding schemes include H.264/MPEG-4 AVC, the scheme adopted in KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), the scheme adopted in TMuC (Test Model under Consideration) software, and the scheme proposed for the successor codec HEVC (High-Efficiency Video Coding) (Non-Patent Documents 1 and 4).

In such a moving image coding scheme, an image (picture) constituting a moving image is managed by a hierarchical structure consisting of slices obtained by dividing the image, coding units (Coding Units) obtained by dividing a slice, and blocks and partitions obtained by dividing a coding unit, and is normally encoded/decoded block by block.

In such a moving image coding scheme, a predicted image is usually generated based on a locally decoded image obtained by encoding/decoding an input image, and the prediction residual obtained by subtracting the predicted image from the input image (original image) is encoded. The prediction residual is sometimes referred to as a "difference image" or "residual image". Methods for generating a predicted image include inter-screen prediction (inter prediction) and intra-screen prediction (intra prediction).

In intra prediction, predicted images in a corresponding frame are sequentially generated based on a locally decoded image in the same frame.

In inter prediction, on the other hand, motion compensation using a motion vector is applied to a reference image in a reference frame (a decoded image in which the entire frame has been decoded), so that a predicted image in the prediction target frame is generated for each prediction unit (for example, for each block).

Regarding inter prediction, at the recently held 6th meeting of JCT-VC (Torino, IT, 14-22 July, 2011), a technique was adopted that, when inter prediction is used, divides the coding unit serving as the unit of the coding process into asymmetric partitions (PUs) (AMP; Asymmetric Motion Partition, Non-Patent Documents 2 and 3).

In addition, when the partition type is an asymmetric partition, it has been proposed to perform non-square quadtree transformation (NSQT; Non-Square Quadtree Transform) (Non-Patent Document 2).

However, the addition of the asymmetric partitions described above increases the code amount of side information in inter prediction. In addition, although the newly added asymmetric partitions differ in nature from the conventional symmetric partitions, those properties are not sufficiently exploited in the encoding process.

The present invention has been made in view of the above problems, and its object is to provide an image decoding apparatus, an image decoding method, and an image encoding apparatus that can reduce the code amount when asymmetric partitions are used and realize efficient encoding/decoding that exploits the characteristics of asymmetric partitions.

In order to solve the above problem, an image decoding apparatus according to an aspect of the present invention decodes an image in prediction units using, as the prediction method for inter-screen prediction, either single prediction that refers to one reference image or bi-prediction that refers to two reference images, and is characterized by including bi-prediction restriction means for restricting bi-prediction on a prediction unit when that prediction unit has a size equal to or smaller than a predetermined size.

In order to solve the above-described problem, an image decoding method according to an aspect of the present invention decodes an image in prediction units using, as the prediction method for inter-screen prediction, either single prediction that refers to one reference image or bi-prediction that refers to two reference images, and includes at least a step of determining whether or not a prediction unit has a size equal to or smaller than a predetermined size, and a step of imposing a restriction so that bi-prediction is not used for that prediction unit.
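The two steps of the decoding method (size determination, then restriction) can be illustrated with a minimal sketch. It is not taken from the specification: the area-based threshold `MAX_RESTRICTED_AREA`, the `PredictionUnit` fields, and the choice to fall back to single prediction by dropping the second reference are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical threshold: PUs whose area is at or below this value are
# restricted. The text only says "a predetermined size or smaller".
MAX_RESTRICTED_AREA = 8 * 4


@dataclass(frozen=True)
class PredictionUnit:
    width: int
    height: int
    uses_l0: bool  # refers to a first reference image (assumed name)
    uses_l1: bool  # refers to a second reference image (assumed name)


def is_bipred(pu: PredictionUnit) -> bool:
    """A PU is bi-predicted when it refers to two reference images."""
    return pu.uses_l0 and pu.uses_l1


def restrict_bipred(pu: PredictionUnit,
                    max_area: int = MAX_RESTRICTED_AREA) -> PredictionUnit:
    """Step 1: test the PU size. Step 2: forbid bi-prediction if it is small."""
    if pu.width * pu.height <= max_area and is_bipred(pu):
        # Assumed fallback: keep only the first reference (single prediction).
        return replace(pu, uses_l1=False)
    return pu


small = restrict_bipred(PredictionUnit(4, 8, True, True))
large = restrict_bipred(PredictionUnit(16, 16, True, True))
print(is_bipred(small), is_bipred(large))
```

With these assumed values, the 4×8 PU ends up single-predicted while the 16×16 PU keeps bi-prediction.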

In order to solve the above-described problem, an image encoding apparatus according to an aspect of the present invention encodes an image in prediction units using, as the prediction method for inter-screen prediction, either single prediction that refers to one reference image or bi-prediction that refers to two reference images, and is characterized by including bi-prediction restriction means for restricting bi-prediction on a prediction unit when that prediction unit has a size equal to or smaller than a predetermined size.

In order to solve the above-described problem, an image decoding apparatus according to an aspect of the present invention decodes encoded image data and generates a decoded image for each coding unit, and includes a CU information decoding unit that decodes information designating the division type into which the coding unit is divided, and an arithmetic decoding unit that decodes binary values from the encoded image data by arithmetic decoding that uses a context or arithmetic decoding that does not use a context. When the CU information decoding unit decodes information designating asymmetric division (AMP; Asymmetric Motion Partition) as the division type, the arithmetic decoding unit performs decoding by switching between arithmetic decoding that uses the context and arithmetic decoding that does not use the context according to the position of the binary value.

According to one aspect of the present invention, it is possible to realize a reduction in the amount of code when using an asymmetric partition and an efficient encoding / decoding process utilizing the characteristics of the asymmetric partition.

It is a functional block diagram showing a configuration example of the CU information decoding unit and the decoding module provided in the video decoding device according to an embodiment of the present invention. It is a functional block diagram showing the schematic configuration of the video decoding device. FIG. 3 is a diagram showing the data structure of the encoded data generated by the video encoding device according to an embodiment of the present invention and decoded by the video decoding device, in which (a) to (d) show the picture layer, the slice layer, the tree block layer, and the CU layer, respectively. It is a diagram showing the patterns of PU partition types. (a) to (h) show the partition shapes when the PU partition type is 2N×2N, 2N×N, 2N×nU, 2N×nD, N×2N, nL×2N, nR×2N, and N×N, respectively. It is a diagram showing a specific configuration example of a PU size table in which the number and sizes of PUs are defined in association with the CU size and the PU partition type. It is a diagram showing a 2N×N CU and a 2N×nU CU in which an inclined edge is present. It is a table showing an example of binarization information that defines the correspondence between combinations of CU prediction type and PU partition type and bin strings. It is a diagram showing an example of the binarization information defined for CUs of 8×8 size. It is a diagram showing another example of the binarization information defined for CUs of 8×8 size. It is a table showing another example of binarization information that defines the correspondence between combinations of CU prediction type and PU partition type and bin strings. It is a table showing yet another example of binarization information that defines the correspondence between combinations of CU prediction type and PU partition type and bin strings.
It is a functional block diagram showing a configuration example of the PU information decoding unit and the decoding module provided in the video decoding device. It is a diagram showing a CU for which an asymmetric partition is selected. It is a diagram showing the priority order of merge candidates in a CU for which a symmetric partition is selected. It is a diagram showing the priority order of merge candidates in a CU for which an asymmetric partition is selected. (a) and (b) both show a CU whose PU partition type is 2N×nU; (a) shows the priority order of merge candidates in the smaller partition, and (b) shows the priority order of merge candidates in the larger partition. (c) and (d) both show a CU whose PU partition type is 2N×nD; (c) shows the priority order of merge candidates in the larger partition, and (d) shows the priority order of merge candidates in the smaller partition. It is a functional block diagram showing a configuration example of the TU information decoding unit and the decoding module provided in the video decoding device. It is a diagram showing an example of transform size determination information in which TU partition patterns are defined according to the CU size, the TU partition depth (trafoDepth), and the PU partition type of the target PU. It is a diagram showing partitioning schemes that divide a square node into square or non-square nodes by quadtree partitioning. (a) shows square partitioning, (b) shows horizontally long rectangular partitioning, and (c) shows vertically long rectangular partitioning. It is a diagram showing partitioning schemes that divide a non-square node into square or non-square nodes by quadtree partitioning.
(a) shows horizontally long partitioning of a horizontally long node, (b) shows square partitioning of a horizontally long node, (c) shows vertically long partitioning of a vertically long node, and (d) shows square partitioning of a vertically long node. It is a diagram showing an example of TU partitioning in a 32×32 CU with PU partition type 2N×N. It is a diagram showing an example of TU partitioning in a 32×32 CU with PU partition type 2N×nU. It is a diagram showing the flow of TU partitioning performed according to the transform size determination information illustrated earlier. (a) shows the case where the PU partition type is 2N×2N, and (b) shows the case where the PU partition type is 2N×nU. It is a diagram showing an example of the flow of TU partitioning when a region whose PU partition type is 2N×2N is divided. It is a flowchart showing an example of the flow of the CU decoding process. It is a functional block diagram showing the schematic configuration of the video encoding device according to an embodiment of the present invention. It is a flowchart showing an example of the flow of the CU encoding process. It is a diagram showing the configurations of a transmitting apparatus equipped with the video encoding device and a receiving apparatus equipped with the video decoding device. (a) shows the transmitting apparatus equipped with the video encoding device, and (b) shows the receiving apparatus equipped with the video decoding device. It is a diagram showing the configurations of a recording apparatus equipped with the video encoding device and a reproducing apparatus equipped with the video decoding device. (a) shows the recording apparatus equipped with the video encoding device, and (b) shows the reproducing apparatus equipped with the video decoding device.
It is a functional block diagram showing a detailed configuration example of the motion compensation parameter derivation unit of the PU information decoding unit provided in the video decoding device. It is a functional block diagram showing a detailed configuration example of the motion information decoding unit of the decoding module provided in the video decoding device. It is an example of a PU syntax table in the prior art, showing the structure of the encoded data when bi-prediction restriction is not performed. It is a diagram showing the meaning of the inter prediction flag. (a) shows the meaning of the inter prediction flag when it is a binary flag, and (b) shows the meaning of the inter prediction flag when it is a ternary flag. It is a diagram showing examples of the PU syntax table; (a) and (b) each show the structure of the encoded data when bi-prediction restriction is performed, in particular the part concerning the inter prediction flag inter_pred_flag. It is a diagram showing examples of syntax tables relating to bi-prediction restriction. (a) shows a case where the sequence parameter set includes a flag disable_bipred_in_small_PU that specifies whether or not to restrict bi-prediction. (b) is an example in which a prediction restriction flag use_restricted_prediction is provided as a common flag. (c) is an example in which disable_bipred_size, indicating the size of PUs for which bi-prediction is prohibited, is included in the encoded data. It is a diagram showing the correspondence between the range over which bi-prediction restriction is applied and the bi-prediction restriction method. It is a diagram showing an example of a syntax table relating to bi-prediction restriction.
It is a diagram explaining the combined table relating to bi-prediction restriction; (a), (b), and (c) are diagrams for explaining examples of the value of combined_inter_pred_ref_idx, and (d) is a diagram showing a table and pseudocode for the method of deriving the maximum value MaxPredRef. It is a diagram explaining the variable tables for the combined table; (a) is a diagram showing an example of the conversion variable table EncTable and the inverse conversion variable table DecTable, and (b) is a diagram showing the inverse conversion variable table DecTable. It is a diagram explaining the decoding of inter_pred_flag in relation to bi-prediction restriction. It is a diagram explaining the decoding of the combined inter prediction reference index flag combined_inter_pred_ref_idx in relation to bi-prediction restriction. It is pseudocode showing the decoding process of combined_inter_pred_ref_idx when the inverse conversion variable table is used. It is pseudocode showing the encoding process of combined_inter_pred_ref_idx when the conversion variable table is used. It is a block diagram showing the configuration of the merge motion compensation parameter derivation unit. It is a flowchart showing the operation of the merge motion compensation parameter derivation unit. It is a diagram explaining the operation of the adjacent merge candidate derivation unit 1212A. (a) to (c) are diagrams explaining the operation of the temporal merge candidate derivation unit 1212B. It is a diagram explaining the operation of the unique candidate derivation unit 1212C. (a) to (c) are diagrams explaining the operation of the combined bi-predictive merge candidate derivation unit 1212D. (a) and (b) are diagrams explaining the operation of the non-scaled bi-predictive merge candidate derivation unit 1212E.
It is a diagram explaining the operation of the zero vector merge candidate derivation unit 1212F. It is a diagram explaining the operation of bi-single prediction conversion. FIG. 7 is a diagram for explaining examples of the bi-prediction restriction method: (a) shows an example in which the bi-prediction restriction of basic inter PUs, the bi-prediction restriction of merge PUs, and the skipping of bi-predictive merge candidate derivation are uniformly applied to PUs of sizes 4×4, 4×8, and 8×4; (b) and (c) show examples in which bi-prediction restriction is applied only to basic inter PUs, without the bi-prediction restriction of merge PUs or the skipping of bi-predictive merge candidate derivation; and (d) shows an example in which the bi-prediction restriction of basic inter PUs is uniformly applied to PUs of sizes 4×4, 4×8, and 8×4, and the bi-prediction restriction of merge PUs and the skipping of bi-predictive merge candidate derivation are applied to PUs of size 8×8. It is a diagram explaining examples of the bi-prediction restriction method; (a) shows an example in which the bi-prediction restriction of basic inter PUs and the skipping of bi-predictive merge candidate derivation are applied to 4×4, 4×8, 8×4, and 8×8, and (b) shows an example in which the skipping of bi-predictive merge candidate derivation is applied to 4×4, 4×8, 8×4, and 8×8. It is a block diagram showing the configuration of the basic motion compensation parameter derivation unit. It is a block diagram showing the configuration of the PU information generation unit. It is a block diagram showing the configuration of the merge motion compensation parameter generation unit.
It is a block diagram showing the configuration of the basic motion compensation parameter generation unit. It is a table defining the level limits in H.264/AVC. It is a table defining the level limits in H.264/AVC. It is a diagram showing adaptive PU size restriction and bi-prediction restriction. (a) shows the case of a 16×16 CU, and (b) shows the case of an 8×8 CU. It is a block diagram showing a configuration example of the merge motion compensation parameter derivation unit and related units provided in the PU information decoding unit. It is a diagram showing an example of a syntax table relating to bi-prediction restriction. It is a diagram showing an example of pseudocode describing the operation of the bi-prediction restricted PU determination unit. It is a diagram showing another example of a syntax table relating to bi-prediction restriction. It is a diagram showing another example of pseudocode describing the operation of the bi-prediction restricted PU determination unit. It is a diagram showing yet another example of a syntax table relating to bi-prediction restriction. It is a diagram showing yet another example of pseudocode describing the operation of the bi-prediction restricted PU determination unit. It is a diagram showing a modification of yet another example of pseudocode describing the operation of the bi-prediction restricted PU determination unit. It is a diagram showing yet another example of a syntax table relating to bi-prediction restriction. It is a diagram showing yet another example of pseudocode describing the operation of the bi-prediction restricted PU determination unit.
It is a flowchart showing an example of the flow of processing by the merge motion compensation parameter derivation unit and the bi-single prediction conversion unit. It is a block diagram showing a configuration example of the merge motion compensation parameter derivation unit and related units provided in the PU information decoding unit. It is a flowchart showing a modification of the flow of processing by the merge motion compensation parameter derivation unit and the bi-single prediction conversion unit. It is a time chart of a series of processes consisting of merge candidate derivation, bi-single conversion, and list creation. It is a time chart of a series of processes consisting of merge candidate derivation, bi-single conversion, and list creation. It is a block diagram showing a configuration example of the merge motion compensation parameter derivation unit and related units provided in the PU information decoding unit. It is a diagram explaining a specific example of the integerization process that converts the X coordinate into an integer. It is a diagram explaining a specific example of the integerization process that converts the Y coordinate into an integer. It is a diagram explaining a specific example of the integerization process that converts the X coordinate and the Y coordinate into integers. It is a diagram explaining a specific example of the integerization process that converts the X coordinate and the Y coordinate into integers for only one list. It is a block diagram showing a configuration example of the merge motion compensation parameter generation unit and related units provided in the PU information generation unit. It is a block diagram showing a configuration example of the merge motion compensation parameter generation unit and related units provided in the PU information generation unit.
It is a block diagram showing a configuration example of the merge motion compensation parameter generation unit and related units provided in the PU information generation unit. It is a table defining the level limits in the present invention. It is a table defining another example of the level limits in the present invention. It is a diagram showing a modification of yet another example of pseudocode describing the operation of the bi-prediction restricted PU determination unit. It is a diagram showing an example of pseudocode describing the operation of the motion compensation parameter restriction unit. It is a block diagram showing another configuration of the PU information generation unit 30.

An embodiment of the present invention will be described with reference to the drawings. First, an overview of the video decoding device (image decoding apparatus) 1 and the video encoding device (image encoding apparatus) 2 will be described with reference to FIG. 2. FIG. 2 is a functional block diagram showing a schematic configuration of the video decoding device 1.

The video decoding device 1 and the video encoding device 2 shown in the figure implement the technology adopted in the H.264/MPEG-4 AVC standard, the technology adopted in KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), the technology adopted in TMuC (Test Model under Consideration) software, and the technology proposed for the successor codec HEVC (High-Efficiency Video Coding).

The video encoding device 2 generates encoded data #1 by entropy encoding the syntax values that these video coding schemes define to be transmitted from the encoder to the decoder.

Context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC) are known as entropy coding methods.

In encoding/decoding by CAVLC and CABAC, processing adapted to the context is performed. The context is the encoding/decoding situation, and is determined by the past encoding/decoding results of related syntax. Examples of related syntax include various syntax elements related to intra prediction and inter prediction, various syntax elements related to luminance (luma) and color difference (chroma), and various syntax elements related to CU (Coding Unit) size. In CABAC, the position of the binary value to be encoded/decoded within the binary data (binary string) corresponding to the syntax may also be used as the context.

In CAVLC, various syntax elements are encoded by adaptively changing the VLC table used for encoding. In CABAC, on the other hand, binarization is applied to syntax elements that can take multiple values, such as the prediction mode and transform coefficients, and the binary data obtained by this binarization is adaptively arithmetic-coded according to its occurrence probability. Specifically, multiple buffers that hold the occurrence probability of a binary value (0 or 1) are prepared, one buffer is selected according to the context, and arithmetic coding is performed based on the occurrence probability recorded in that buffer. Further, by updating the occurrence probability in the buffer based on the binary value that is decoded/encoded, an appropriate occurrence probability can be maintained for each context.
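The per-context probability buffers described above can be sketched as follows. This is a simplified illustration, not the CABAC state machine: the exponential update rule, the `rate` value, the clamping bounds, and the context names are assumptions, and the arithmetic coder itself is replaced by the ideal code length of each bin.

```python
import math


class ContextModel:
    """One probability buffer: an estimate that the next bin equals 1."""

    def __init__(self, p_one: float = 0.5, rate: float = 1 / 16):
        self.p_one = p_one
        self.rate = rate  # adaptation speed (assumed value)

    def cost_bits(self, bin_value: int) -> float:
        """Ideal code length of a bin under the current estimate."""
        p = self.p_one if bin_value == 1 else 1.0 - self.p_one
        return -math.log2(p)

    def update(self, bin_value: int) -> None:
        """Move the estimate toward the observed bin, kept away from 0 and 1."""
        self.p_one += self.rate * (bin_value - self.p_one)
        self.p_one = min(max(self.p_one, 0.01), 0.99)


def total_cost(bins_with_context, models):
    """Select a buffer by context for each bin, charge its cost, then adapt."""
    bits = 0.0
    for ctx, bin_value in bins_with_context:
        model = models.setdefault(ctx, ContextModel())
        bits += model.cost_bits(bin_value)
        model.update(bin_value)
    return bits


models = {}
skewed = [("ctxA", 1)] * 50  # a context whose bins are almost always 1
bits = total_cost(skewed, models)
print(round(bits, 2), round(models["ctxA"].p_one, 3))
```

Because the estimate for the hypothetical context `ctxA` adapts toward 1, the 50 bins cost well under 50 bits in this model, which is the effect the adaptive update is meant to achieve.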

The video decoding device 1 receives encoded data #1 obtained by encoding a moving image with the video encoding device 2. The video decoding device 1 decodes the input encoded data #1 and outputs moving image #2 to the outside. Before the video decoding device 1 is described in detail, the configuration of the encoded data #1 is described below.

[Configuration of encoded data]
A configuration example of the encoded data #1 that is generated by the video encoding device 2 and decoded by the video decoding device 1 will be described with reference to FIG. 3. The encoded data #1 includes, by way of example, a sequence and a plurality of pictures constituting the sequence.

FIG. 3 shows the hierarchical structure at and below the picture layer in the encoded data #1. (a) to (d) of FIG. 3 respectively show the picture layer that defines the picture PICT, the slice layer that defines the slice S, the tree block layer that defines the tree block TBLK, and the CU layer that defines the coding unit (Coding Unit; CU) included in the tree block TBLK.

(Picture layer)
In the picture layer, a set of data referred to by the video decoding device 1 for decoding a picture PICT to be processed (hereinafter also referred to as a target picture) is defined. As shown in (a) of FIG. 3, the picture PICT includes a picture header PH and slices S1 to SNS (NS is the total number of slices included in the picture PICT).

In the following description, when it is not necessary to distinguish the individual slices S1 to SNS, the subscripts may be omitted. The same applies to other subscripted data included in the encoded data #1 described below.

The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture. For example, the encoding mode information (entropy_coding_mode_flag) indicating the variable length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH.

When entropy_coding_mode_flag is 0, the picture PICT is encoded by CAVLC (Context-based Adaptive Variable Length Coding). When entropy_coding_mode_flag is 1, the picture PICT is encoded by CABAC (Context-based Adaptive Binary Arithmetic Coding).

Note that the picture header PH is also referred to as a picture parameter set (PPS).

(Slice layer)
In the slice layer, a set of data referred to by the video decoding device 1 for decoding the slice S to be processed (also referred to as a target slice) is defined. As shown in (b) of FIG. 3, the slice S includes a slice header SH and tree blocks TBLK1 to TBLKNC (NC is the total number of tree blocks included in the slice S).

The slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.

The slice types that can be specified by the slice type specification information include (1) an I slice, which uses only intra prediction at the time of encoding, (2) a P slice, which uses single prediction or intra prediction at the time of encoding, and (3) a B slice, which uses single prediction, bi-prediction, or intra prediction at the time of encoding.
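The three slice types can be summarized as a lookup from slice type to the prediction methods it permits. The following is a minimal sketch; the string labels and the helper name `is_allowed` are illustrative and do not correspond to actual `slice_type` code values.

```python
# Prediction methods permitted per slice type, as listed in the text.
ALLOWED_PREDICTION = {
    "I": {"intra"},
    "P": {"intra", "single"},
    "B": {"intra", "single", "bi"},
}


def is_allowed(slice_type: str, prediction: str) -> bool:
    """Return True if the prediction method may be used in the slice type."""
    return prediction in ALLOWED_PREDICTION[slice_type]


print(is_allowed("B", "bi"), is_allowed("P", "bi"))
```

Only B slices admit bi-prediction, so any bi-prediction restriction on small prediction units is relevant only there.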

Further, the slice header SH may include a filter parameter referred to by a loop filter (not shown) included in the video decoding device 1.

(Tree block layer)
In the tree block layer, a set of data referred to by the video decoding device 1 for decoding a processing target tree block TBLK (hereinafter also referred to as a target tree block) is defined.

The tree block TBLK includes a tree block header TBLKH and coding unit information CU1 to CUNL (NL is the total number of pieces of coding unit information included in the tree block TBLK). First, the relationship between the tree block TBLK and the coding unit information CU is described below.

The tree block TBLK is divided into units for specifying the block size for each of the processes of intra prediction or inter prediction and of transform.

The tree block TBLK is divided into these units by recursive quadtree partitioning. The tree structure obtained by this recursive quadtree partitioning is hereinafter referred to as a coding tree.

Hereinafter, a unit corresponding to a leaf, that is, a terminal node of the coding tree, is referred to as a coding node. Since the coding node is the basic unit of the coding process, it is hereinafter also referred to as a coding unit (CU).

That is, the coding unit information CU1 to CUNL is information corresponding to each coding node (coding unit) obtained by recursively dividing the tree block TBLK into quadtrees.

Also, the root of the coding tree is associated with the tree block TBLK. In other words, the tree block TBLK is associated with the highest node of the quadtree structure that recursively contains a plurality of coding nodes.

Note that the size of each coding node is half, both horizontally and vertically, the size of the coding node to which it directly belongs (that is, the unit of the node one layer above that coding node).

Also, the sizes that each coding node can take depend on the coding node size designation information and the maximum hierarchical depth included in the sequence parameter set SPS of the encoded data #1. For example, when the size of the tree block TBLK is 64×64 pixels and the maximum hierarchical depth is 3, a coding node in the hierarchy at or below the tree block TBLK can take any of four sizes: 64×64 pixels, 32×32 pixels, 16×16 pixels, or 8×8 pixels.
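The relationship between tree block size, maximum hierarchical depth, and the available coding node sizes can be checked with a short sketch. The function name is illustrative, and the sketch assumes square blocks whose side is halved at each depth, as stated above.

```python
def coding_node_sizes(tree_block_size: int, max_depth: int) -> list:
    """Side lengths available at depths 0..max_depth (halved per level)."""
    return [tree_block_size >> depth for depth in range(max_depth + 1)]


sizes = coding_node_sizes(64, 3)
print(sizes)  # [64, 32, 16, 8]
```

For the example in the text (64×64 tree block, maximum depth 3), this reproduces the four sizes 64, 32, 16, and 8.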

(Tree block header)
The tree block header TBLKH includes encoding parameters referred to by the video decoding device 1 in order to determine the decoding method of the target tree block. Specifically, as shown in (c) of FIG. 3, it includes tree block division information SP_TBLK that designates the division pattern of the target tree block into CUs, and a quantization parameter difference Δqp (qp_delta) that designates the size of the quantization step.

The tree block division information SP_TBLK is information representing the coding tree for dividing the tree block. Specifically, it is information that designates the shape and size of each CU included in the target tree block and its position within the target tree block.

Note that the tree block division information SP_TBLK may not explicitly include the shape or size of the CU. For example, the tree block division information SP_TBLK may be a set of flags (split_coding_unit_flag) indicating whether or not the entire target tree block or a partial area of the tree block is divided into four. In that case, the shape and size of each CU can be specified by using the shape and size of the tree block together.
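As a non-limiting illustration, deriving the position and size of each CU from a set of split_coding_unit_flag values and the tree block size can be sketched as follows. The function name parse_coding_tree, the upper-left-first traversal order, and the rule that no flag is read for a node of the minimum size are assumptions made for this sketch and are not taken from the description above.

```python
from typing import Iterator


def parse_coding_tree(flags: Iterator[int], x: int, y: int, size: int,
                      min_size: int) -> list[tuple[int, int, int]]:
    """Return (x, y, size) of every CU in the order the quadtree is traversed."""
    # Assumption: a node of the minimum size cannot be split, so no flag is read.
    if size > min_size and next(flags) == 1:
        half = size // 2
        cus: list[tuple[int, int, int]] = []
        # Assumed order: upper left, upper right, lower left, lower right.
        for dy in (0, half):
            for dx in (0, half):
                cus += parse_coding_tree(flags, x + dx, y + dy, half, min_size)
        return cus
    return [(x, y, size)]


# Root is split; only its upper-right 32 x 32 node is split again.
layout = parse_coding_tree(iter([1, 0, 1, 0, 0, 0, 0, 0, 0]), 0, 0, 64, 8)
print(len(layout))  # 7
```

In this sketch only the flags and the tree block size are supplied, and the shape and size of each CU follow from them.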

Further, the quantization parameter difference Δqp is a difference qp−qp ′ between the quantization parameter qp in the target tree block and the quantization parameter qp ′ in the tree block encoded immediately before the target tree block.

(CU layer)
In the CU layer, a set of data referred to by the video decoding device 1 for decoding a CU to be processed (hereinafter also referred to as a target CU) is defined.

Here, before explaining the specific contents of the data included in the coding unit information CU, the tree structure of the data included in the CU will be described. The coding node is the root node of a prediction tree (PT) and a transform tree (TT). The prediction tree and the transform tree are described as follows.

In the prediction tree, the coding node is divided into one or a plurality of prediction blocks, and the position and size of each prediction block are defined. In other words, the prediction blocks are one or a plurality of non-overlapping areas constituting the coding node. The prediction tree includes one or a plurality of prediction blocks obtained by the above division.

Prediction processing is performed for each prediction block. Hereinafter, a prediction block that is a unit of prediction is also referred to as a prediction unit (PU).

There are roughly two types of division in the prediction tree: intra prediction and inter prediction.

In the case of intra prediction, there are two division methods: 2N × 2N (the same size as the coding node) and N × N.

In the case of inter prediction, the division methods include 2N × 2N (the same size as the coding node), 2N × N, N × 2N, N × N, and the like.

Also, in the transform tree, the coding node is divided into one or a plurality of transform blocks, and the position and size of each transform block are defined. In other words, the transform blocks are one or a plurality of non-overlapping areas constituting the coding node. The transform tree includes one or a plurality of transform blocks obtained by the above division.

Transform processing is performed for each transform block. Hereinafter, a transform block, which is a unit of transform, is also referred to as a transform unit (TU).

(Data structure of encoding unit information)
Next, the specific contents of the data included in the coding unit information CU will be described. As shown in (d) of FIG. 3, the coding unit information CU specifically includes a skip flag SKIP, CU prediction type information Pred_type, PT information PTI, and TT information TTI.
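As a non-limiting illustration, the four items of the coding unit information CU can be held in a record such as the following. The class names, field names, and string values are hypothetical, and the choice to represent omitted items as None or as an empty list is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PredType:
    pred_mode: str  # CU prediction method information PredMode: "intra" or "inter"
    part_mode: str  # PU partition type information PartMode, e.g. "2Nx2N"


@dataclass
class CodingUnitInfo:
    skip: bool                                # skip flag SKIP
    pred_type: Optional[PredType] = None      # Pred_type (None when not decoded)
    pti: list = field(default_factory=list)   # PT information PTI (one entry per PU)
    tti: list = field(default_factory=list)   # TT information TTI (one entry per TU)


# A skip CU carries no PT information in this sketch.
skip_cu = CodingUnitInfo(skip=True)
inter_cu = CodingUnitInfo(skip=False, pred_type=PredType("inter", "2NxN"))
```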

[Skip flag]
The skip flag SKIP is a flag indicating whether or not the skip mode is applied to the target CU. When the value of the skip flag SKIP is 1, that is, when the skip mode is applied to the target CU, the PT information PTI in the coding unit information CU is omitted. Note that the skip flag SKIP is omitted for I slices.

[CU prediction type information]
The CU prediction type information Pred_type includes CU prediction method information PredMode and PU partition type information PartMode.

The CU prediction method information PredMode specifies whether to use intra prediction (intra CU) or inter prediction (inter CU) as a predicted image generation method for each PU included in the target CU. Hereinafter, the types of skip, intra prediction, and inter prediction in the target CU are referred to as a CU prediction mode.

The PU partition type information PartMode specifies a PU partition type that is a pattern of partitioning the target coding unit (CU) into each PU. Hereinafter, dividing the target coding unit (CU) into each PU according to the PU division type in this way is referred to as PU division.

For example, the PU partition type information PartMode may be an index indicating the type of PU partition pattern, or may specify the shape, size, and position of each PU included in the target prediction tree.

Note that selectable PU partition types differ depending on the CU prediction method and the CU size. Furthermore, the PU partition types that can be selected are different in each case of inter prediction and intra prediction. Details of the PU partition type will be described later.

If the slice is not an I slice, the value of the PU partition type information PartMode may be specified by an index (cu_split_pred_part_mode) that designates a combination of the tree block partition, the prediction method, and the CU split method.

[PT information]
The PT information PTI is information related to the PT included in the target CU. In other words, the PT information PTI is a set of information on each of one or more PUs included in the PT. As described above, since the generation of the predicted image is performed in units of PUs, the PT information PTI is referred to when the moving image decoding apparatus 1 generates a predicted image. As shown in FIG. 3 (d), the PT information PTI includes PU information PUI1 to PUINP (NP is the total number of PUs included in the target PT) including prediction information in each PU.

The prediction information PUI includes intra prediction information or inter prediction information depending on which prediction method the prediction type information Pred_mode specifies. Hereinafter, a PU to which intra prediction is applied is also referred to as an intra PU, and a PU to which inter prediction is applied is also referred to as an inter PU.

The inter prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an inter prediction image by inter prediction.

Examples of the inter prediction parameters include a merge flag (merge_flag), a merge index (merge_idx), an estimated motion vector index (mvp_idx), a reference image index (ref_idx), an inter prediction flag (inter_pred_flag), and a motion vector residual (mvd).

The intra prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction.

Examples of intra prediction parameters include an estimated prediction mode flag, an estimated prediction mode index, and a residual prediction mode index.

In the intra prediction information, a PCM mode flag indicating whether to use the PCM mode may be encoded. When the PCM mode flag is encoded and indicates that the PCM mode is used, the prediction process (intra), the transform process, and the entropy encoding process are omitted.

[TT information]
The TT information TTI is information regarding the TT included in the CU. In other words, the TT information TTI is a set of information regarding each of one or a plurality of TUs included in the TT, and is referred to when the moving image decoding apparatus 1 decodes residual data. Hereinafter, a TU may be referred to as a block.

As shown in FIG. 3 (d), the TT information TTI includes TT division information SP_TU that designates the division pattern of the target CU into transform blocks, and TU information TUI1 to TUINT (NT is the total number of blocks included in the target CU).

TT division information SP_TU is information for determining the shape and size of each TU included in the target CU and the position in the target CU. For example, the TT division information SP_TU can be realized from information (split_transform_flag) indicating whether or not the target node is to be divided and information (trafoDepth) indicating the depth of the division.

Also, for example, when the size of the CU is 64 × 64, each TU obtained by the division can take a size from 32 × 32 pixels to 4 × 4 pixels.

TU information TUI1 to TUINT are individual information regarding one or more TUs included in the TT. For example, the TU information TUI includes a quantized prediction residual.

Each quantized prediction residual is encoded data generated by the video encoding device 2 performing the following processes 1 to 3 on a target block that is a processing target block.

Process 1: Apply a DCT (Discrete Cosine Transform) to the prediction residual obtained by subtracting the predicted image from the encoding target image;
Process 2: Quantize the transform coefficients obtained in Process 1;
Process 3: Apply variable length coding to the transform coefficients quantized in Process 2.

The quantization parameter qp described above represents the magnitude of the quantization step QP used when the video encoding device 2 quantizes the transform coefficients (QP = 2^(qp/6)).
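As a non-limiting illustration, the relationship QP = 2^(qp/6), together with the recovery of qp from the quantization parameter difference Δqp = qp − qp′ of the tree block header, can be sketched as follows. The function names are hypothetical.

```python
def reconstruct_qp(prev_qp: int, delta_qp: int) -> int:
    """Recover qp of the target tree block from qp' of the preceding one."""
    # delta_qp = qp - qp', hence qp = qp' + delta_qp
    return prev_qp + delta_qp


def quantization_step(qp: int) -> float:
    """Quantization step QP = 2^(qp/6); it doubles each time qp grows by 6."""
    return 2.0 ** (qp / 6)


qp = reconstruct_qp(prev_qp=10, delta_qp=2)
print(quantization_step(qp))  # 4.0
```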

(PU split type)
In the PU division type, if the size of the target CU is 2N × 2N pixels, there are the following eight patterns in total: four symmetric splittings of 2N × 2N pixels, 2N × N pixels, N × 2N pixels, and N × N pixels, and four asymmetric splittings of 2N × nU pixels, 2N × nD pixels, nL × 2N pixels, and nR × 2N pixels. Here, N = 2^m (m is an arbitrary integer of 1 or more). Hereinafter, an area obtained by dividing the target CU is also referred to as a partition.

(a) to (h) of FIG. 4 specifically show the positions of the boundaries of PU division in the CU for each division type.

FIG. 4A shows a 2N × 2N PU partition type that does not perform CU partitioning.

Also, (b), (c), and (d) of FIG. 4 show the partition shapes when the PU partition types are 2N × N, 2N × nU, and 2N × nD, respectively. Hereinafter, partitions when the PU partition type is 2N × N, 2N × nU, and 2N × nD are collectively referred to as horizontally long partitions.

Further, (e), (f), and (g) of FIG. 4 show the shapes of partitions when the PU partition types are N × 2N, nL × 2N, and nR × 2N, respectively. Hereinafter, partitions when the PU partition type is N × 2N, nL × 2N, and nR × 2N are collectively referred to as vertically long partitions.

Also, the horizontally long partition and the vertically long partition are collectively referred to as a rectangular partition.

Further, (h) in FIG. 4 shows the shape of the partition when the PU partition type is N × N. The PU partition types shown in FIGS. 4A and 4H are also referred to as square partitioning based on the shape of the partition. The PU partition types shown in FIGS. 4B to 4G are also referred to as non-square partitions.

Also, in (a) to (h) of FIG. 4, the numbers given to the respective regions indicate the identification numbers of the regions, and the processing is performed on the regions in the order of the identification numbers. That is, the identification number represents the scan order of the area.

In FIGS. 4A to 4H, the upper left is the reference point (origin) of the CU.

[Partition type for inter prediction]
In the inter PU, seven types other than N × N ((h) in FIG. 4) are defined among the above eight division types. The four asymmetric partitions may be called AMP (Asymmetric Motion Partition).

The specific value of N described above is defined by the size of the CU to which the PU belongs, and the specific values of nU, nD, nL, and nR are determined according to the value of N. For example, a 128 × 128 pixel inter CU can be divided into inter PUs of 128 × 128 pixels, 128 × 64 pixels, 64 × 128 pixels, 64 × 64 pixels, 128 × 32 pixels, 128 × 96 pixels, 32 × 128 pixels, and 96 × 128 pixels.
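As a non-limiting illustration, the position and size of each partition for the eight PU partition types can be derived from the CU size as follows. The function name pu_partitions and the string labels of the partition types are hypothetical. The short side of an asymmetric partition is taken as one quarter of the CU side, which agrees with the 128 × 32 and 128 × 96 example above.

```python
def pu_partitions(cu_size: int, part_mode: str) -> list[tuple[int, int, int, int]]:
    """Return (x, y, width, height) per PU; the origin is the upper left of the CU."""
    s = cu_size        # 2N
    h = cu_size // 2   # N
    q = cu_size // 4   # short side of an asymmetric partition
    table = {
        "2Nx2N": [(0, 0, s, s)],
        "2NxN":  [(0, 0, s, h), (0, h, s, h)],
        "2NxnU": [(0, 0, s, q), (0, q, s, s - q)],
        "2NxnD": [(0, 0, s, s - q), (0, s - q, s, q)],
        "Nx2N":  [(0, 0, h, s), (h, 0, h, s)],
        "nLx2N": [(0, 0, q, s), (q, 0, s - q, s)],
        "nRx2N": [(0, 0, s - q, s), (s - q, 0, q, s)],
        "NxN":   [(0, 0, h, h), (h, 0, h, h), (0, h, h, h), (h, h, h, h)],
    }
    return table[part_mode]


# 2N x nU on a 128 x 128 CU: a 128 x 32 PU above a 128 x 96 PU
print(pu_partitions(128, "2NxnU"))  # [(0, 0, 128, 32), (0, 32, 128, 96)]
```

The list order of each entry follows the identification numbers of the regions, that is, the scan order from the upper left.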

[Partition type for intra prediction]
In the intra PU, the following two types of division patterns are defined. That is, there are a division pattern 2N × 2N in which the target CU is not divided, that is, the target CU itself is handled as one PU, and a pattern N × N in which the target CU is symmetrically divided into four PUs.

Therefore, in the intra PU, the division patterns (a) and (h) can be taken in the example shown in FIG.

For example, a 128 × 128 pixel intra CU can be divided into 128 × 128 pixel and 64 × 64 pixel intra PUs.

In the case of an I slice, the coding unit information CU may include an intra partition mode (intra_part_mode) for specifying the PU partition type PartMode.

(TU partitioning and TU order within a node)
Next, TU partitioning and the order of TUs within a node will be described with reference to FIGS. The TU partition pattern is determined by the CU size, the partition depth (trafoDepth), and the PU partition type of the target PU.

TU partition patterns include square quadtree partition and non-square quadtree partition. Specific examples of the TU partition pattern are as shown in FIGS.

FIG. 18 shows a division method for dividing a square node into a square or a non-square by quadtree division.

(a) of FIG. 18 shows a division method in which a square node is divided into squares by quadtree division. (b) of the same figure shows a division method in which a square node is divided into horizontally long rectangles by quadtree division. (c) of the same figure shows a division method in which a square node is divided into vertically long rectangles by quadtree division.

Further, FIG. 19 shows a division method for dividing a non-square node into a square or non-square by quadtree division.

(a) of FIG. 19 shows a division method in which a horizontally long rectangular node is divided into horizontally long rectangles by quadtree division. (b) of the same figure shows a division method in which a horizontally long rectangular node is divided into squares by quadtree division. (c) of the same figure shows a division method in which a vertically long rectangular node is divided into vertically long rectangles by quadtree division. (d) of the same figure shows a division method in which a vertically long rectangular node is divided into squares by quadtree division.

FIG. 20 shows an example of 32 × 32 CU TU partitioning of PU partition type 2N × N. In the drawing, “depth” indicates a division depth (trafoDepth). Further, “split” indicates the value of split_transform_flag in the depth. If “split” is “1”, TU partitioning is performed for the depth node, and if “0”, TU partitioning is not performed.

Details of the correspondence between the CU size, the division depth (trafoDepth), the PU division type of the target PU, and the TU division pattern will be described later.

[Video decoding device]
Hereinafter, the configuration of the video decoding device 1 according to the present embodiment will be described with reference to FIGS.

(Outline of video decoding device)
The video decoding device 1 generates a predicted image for each PU, generates a decoded image # 2 by adding the generated predicted image and the prediction residual decoded from the encoded data # 1, and outputs the generated decoded image # 2 to the outside.

Here, the generation of the predicted image is performed with reference to the encoding parameters obtained by decoding the encoded data # 1. An encoding parameter is a parameter referred to in order to generate a predicted image. In addition to prediction parameters such as a motion vector referred to in inter-screen prediction and a prediction mode referred to in intra-screen prediction, the encoding parameters include the size and shape of the PU, the size and shape of the block, and residual data between the original image and the predicted image. Hereinafter, the set of all information included in the encoding parameters, excluding the residual data, is referred to as side information.

In the following, a picture (frame), a slice, a tree block, a block, and a PU to be decoded are referred to as a target picture, a target slice, a target tree block, a target block, and a target PU, respectively.

Note that the size of the tree block is, for example, 64 × 64 pixels, and the size of the PU is, for example, 64 × 64 pixels, 32 × 32 pixels, 16 × 16 pixels, 8 × 8 pixels, 4 × 4 pixels, or the like. However, these sizes are merely examples, and the sizes of the tree block and PU may be other than the sizes shown above.

(Configuration of video decoding device)
Referring to FIG. 2 again, the schematic configuration of the moving picture decoding apparatus 1 will be described as follows. FIG. 2 is a functional block diagram showing a schematic configuration of the moving picture decoding apparatus 1.

As shown in FIG. 2, the moving picture decoding apparatus 1 includes a decoding module 10, a CU information decoding unit 11, a PU information decoding unit 12, a TU information decoding unit 13, a predicted image generation unit 14, an inverse quantization / inverse transform unit 15, a frame memory 16, and an adder 17.

[Decoding module]
The decoding module 10 performs a decoding process for decoding a syntax value from binary. More specifically, based on the encoded data and the syntax type supplied from a supplier, the decoding module 10 decodes a syntax value encoded by an entropy encoding method such as CABAC or CAVLC, and returns the decoded syntax value to the supplier.

In the example shown below, the sources of encoded data and syntax type are the CU information decoding unit 11, the PU information decoding unit 12, and the TU information decoding unit 13.

As an example of the decoding process in the decoding module 10, a case where a binary (bit string) of encoded data and the syntax type "split_coding_unit_flag" are supplied from the CU information decoding unit 11 to the decoding module 10 is described as follows. In this case, the decoding module 10 refers to the association between the bit string related to "split_coding_unit_flag" and the syntax value, derives the syntax value from the binary, and returns the derived syntax value to the CU information decoding unit 11.

[CU information decoding unit]
The CU information decoding unit 11 uses the decoding module 10 to perform decoding processing at the tree block and CU level on the encoded data # 1 for one frame input from the moving image encoding device 2. Specifically, the CU information decoding unit 11 decodes the encoded data # 1 according to the following procedure.

First, the CU information decoding unit 11 refers to various headers included in the encoded data # 1, and sequentially separates the encoded data # 1 into slices and tree blocks.

Here, the various headers include (1) information about the method of dividing the target picture into slices, and (2) information about the size, shape, and position of the tree blocks belonging to the target slice.

Then, the CU information decoding unit 11 divides the target tree block into CUs with reference to the tree block division information SP_TBLK included in the tree block header TBLKH.

Next, the CU information decoding unit 11 acquires coding unit information (hereinafter referred to as CU information) corresponding to the CU obtained by the division. The CU information decoding unit 11 performs the decoding process of the CU information corresponding to the target CU, with each CU included in the tree block as the target CU in order.

That is, the CU information decoding unit 11 demultiplexes the TT information TTI related to the transform tree obtained for the target CU and the PT information PTI related to the prediction tree obtained for the target CU.

The TT information TTI includes the TU information TUI corresponding to the TUs included in the transform tree, as described above. Further, as described above, the PT information PTI includes the PU information PUI corresponding to the PUs included in the target prediction tree.

The CU information decoding unit 11 supplies the PT information PTI obtained for the target CU to the PU information decoding unit 12. Further, the CU information decoding unit 11 supplies the TT information TTI obtained for the target CU to the TU information decoding unit 13.

[PU information decoding unit]
The PU information decoding unit 12 uses the decoding module 10 to perform decoding processing at the PU level for the PT information PTI supplied from the CU information decoding unit 11. Specifically, the PU information decoding unit 12 decodes the PT information PTI by the following procedure.

The PU information decoding unit 12 refers to the PU partition type information Part_type, and determines the PU partition type in the target prediction tree. Subsequently, the PU information decoding unit 12 performs a decoding process of PU information corresponding to the target PU, with each PU included in the target prediction tree as a target PU in order.

That is, the PU information decoding unit 12 performs a decoding process on each parameter used for generating a predicted image from PU information corresponding to the target PU.

The PU information decoding unit 12 supplies the PU information decoded for the target PU to the predicted image generation unit 14.

[TU information decoding unit]
The TU information decoding unit 13 uses the decoding module 10 to perform decoding processing at the TU level for the TT information TTI supplied from the CU information decoding unit 11. Specifically, the TU information decoding unit 13 decodes the TT information TTI by the following procedure.

The TU information decoding unit 13 refers to the TT division information SP_TU and divides the target transform tree into nodes or TUs. Note that the TU information decoding unit 13 performs the TU division processing recursively if further division is specified for the target node.

When the division process ends, the TU information decoding unit 13 executes the decoding process of the TU information corresponding to the target TU, with each TU included in the target transform tree taken as the target TU in order.

That is, the TU information decoding unit 13 performs a decoding process on each parameter used for restoring the transform coefficient from the TU information corresponding to the target TU.

The TU information decoding unit 13 supplies the TU information decoded for the target TU to the inverse quantization / inverse transform unit 15.

[Predicted image generator]
The predicted image generation unit 14 generates a predicted image based on the PT information PTI for each PU included in the target CU. Specifically, for each target PU included in the target prediction tree, the predicted image generation unit 14 performs intra prediction or inter prediction according to the parameters included in the PU information PUI corresponding to the target PU, thereby generating a predicted image Pred from a local decoded image P′, which is an already decoded image. The predicted image generation unit 14 supplies the generated predicted image Pred to the adder 17.

Note that a method in which the predicted image generation unit 14 generates a predicted image of a PU included in the target CU based on motion compensation prediction parameters (motion vector, reference image index, inter prediction flag) is as follows.

When the inter prediction flag indicates single prediction, the predicted image generation unit 14 generates a predicted image corresponding to the decoded image located at the location indicated by the motion vector of the reference image indicated by the reference image index.

On the other hand, when the inter prediction flag indicates bi-prediction, the predicted image generation unit 14 generates a predicted image by motion compensation for each of the two sets of reference image indexes and motion vectors, and generates the final predicted image either by averaging the two predicted images or by weighting and adding them based on the display time interval between the target picture and each reference image.
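As a non-limiting illustration, combining two motion-compensated predicted images into the final predicted image can be sketched as follows. The function name bi_predict, the representation of an image as a list of rows, and the rounding are assumptions made for this sketch. How the weights are derived from the display time intervals is not specified here, so they are simply passed in as parameters.

```python
def bi_predict(pred0: list[list[int]], pred1: list[list[int]],
               w0: float = 0.5, w1: float = 0.5) -> list[list[int]]:
    """Combine two predicted images sample by sample with weights w0 and w1."""
    # With the default weights this is the plain average of the two images.
    return [
        [int(w0 * a + w1 * b + 0.5) for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred0, pred1)
    ]


p0 = [[100, 100], [100, 100]]
p1 = [[110, 110], [110, 110]]
print(bi_predict(p0, p1))              # [[105, 105], [105, 105]]
print(bi_predict(p0, p1, 0.75, 0.25))  # [[103, 103], [103, 103]]
```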

[Inverse quantization / inverse transform unit]
The inverse quantization / inverse transform unit 15 performs an inverse quantization / inverse transform process on each TU included in the target CU based on the TT information TTI. Specifically, for each target TU included in the target transform tree, the inverse quantization / inverse transform unit 15 restores the prediction residual D for each pixel by performing inverse quantization and inverse orthogonal transform on the quantized prediction residual included in the TU information TUI corresponding to the target TU. Here, the orthogonal transform refers to an orthogonal transform from the pixel domain to the frequency domain. Therefore, the inverse orthogonal transform is a transform from the frequency domain to the pixel domain. Examples of the inverse orthogonal transform include the inverse DCT (Inverse Discrete Cosine Transform) and the inverse DST (Inverse Discrete Sine Transform). The inverse quantization / inverse transform unit 15 supplies the restored prediction residual D to the adder 17.
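As a non-limiting illustration, inverse quantization followed by an inverse DCT can be sketched in one dimension as follows. The function names are hypothetical, and the floating-point orthonormal DCT used here is an assumption chosen for readability; it is not stated above which exact transform or arithmetic the device uses. For a two-dimensional block, the same one-dimensional transform would be applied to the rows and then to the columns.

```python
import math


def _scale(k: int, n: int) -> float:
    # Normalization that makes the DCT basis orthonormal.
    return math.sqrt((1 if k == 0 else 2) / n)


def dct(x: list[float]) -> list[float]:
    """Forward transform: pixel domain to frequency domain."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)) for k in range(n)]


def idct(coeffs: list[float]) -> list[float]:
    """Inverse transform: frequency domain back to pixel domain."""
    n = len(coeffs)
    return [sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n)) for i in range(n)]


def restore_residual(levels: list[int], step: float) -> list[float]:
    """Inverse quantization (level * step) followed by the inverse DCT."""
    return idct([level * step for level in levels])


# A block with only a DC level restores to a flat residual.
print(restore_residual([2, 0, 0, 0], step=4.0))
```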

[Frame memory]
Decoded images P are sequentially recorded in the frame memory 16 together with the parameters used for decoding them. At the time of decoding the target tree block, the frame memory 16 holds the decoded images corresponding to all tree blocks decoded before the target tree block (for example, all tree blocks preceding it in raster scan order). Examples of the decoding parameters recorded in the frame memory 16 include the CU prediction method information PredMode.

[Adder]
The adder 17 generates the decoded image P for the target CU by adding the predicted image Pred supplied from the predicted image generation unit 14 and the prediction residual D supplied from the inverse quantization / inverse transform unit 15.
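As a non-limiting illustration, the addition performed by the adder can be sketched as follows. The function name add_residual and the representation of an image as a list of rows are hypothetical. Clipping the sum to the valid sample range for a given bit depth is an assumption added for this sketch and is not stated above.

```python
def add_residual(pred: list[list[int]], resid: list[list[int]],
                 bit_depth: int = 8) -> list[list[int]]:
    """Return the decoded samples P = Pred + D for one block."""
    max_val = (1 << bit_depth) - 1
    # Assumption: the result is clipped to [0, max_val].
    return [
        [min(max(p + d, 0), max_val) for p, d in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


print(add_residual([[120, 250], [10, 64]], [[5, 10], [-20, 0]]))
# [[125, 255], [0, 64]]
```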

In the video decoding device 1, when the decoded image generation processing in units of tree blocks has been completed for all tree blocks in the image, the decoded image # 2 corresponding to the one frame of encoded data # 1 input to the video decoding device 1 is output to the outside.

Hereinafter, the configurations of (1) the CU information decoding unit 11, (2) the PU information decoding unit 12, and (3) the TU information decoding unit 13 will be described in detail together with the configuration of the decoding module 10 corresponding to each configuration.

(1) Details of CU Information Decoding Unit
Next, configuration examples of the CU information decoding unit 11 and the decoding module 10 will be described with reference to FIG. 1. FIG. 1 is a functional block diagram illustrating a configuration for decoding CU prediction information, that is, the configuration of the CU information decoding unit 11 and the decoding module 10 in the moving image decoding apparatus 1.

Hereinafter, the configuration of each unit will be described in the order of the CU information decoding unit 11 and the decoding module 10.

(CU information decoding unit)
As illustrated in FIG. 1, the CU information decoding unit 11 includes a CU prediction mode determination unit 111, a PU size determination unit 112, and a PU size table 113.

The CU prediction mode determination unit 111 supplies the encoded data and syntax type of the CU prediction mode and the encoded data and syntax type of the PU partition type to the decoding module 10. Further, the CU prediction mode determination unit 111 acquires the decoded CU prediction mode syntax value and the PU partition type syntax value from the decoding module 10.

Specifically, the CU prediction mode determination unit 111 determines the CU prediction mode and the PU partition type as follows.

First, the CU prediction mode determination unit 111 decodes the skip flag SKIP by the decoding module 10 and determines whether or not the target CU is a skip CU.

If the target CU is not a skip CU, the decoding module 10 decodes the CU prediction type information Pred_type. The CU prediction mode determination unit 111 then determines whether the target CU is an intra CU or an inter CU based on the CU prediction method information PredMode included in the CU prediction type information Pred_type, and determines the PU partition type based on the PU partition type information PartMode.

The PU size determination unit 112 refers to the PU size table 113 and determines the number and size of PUs from the size of the target CU and from the CU prediction type and PU partition type determined by the CU prediction mode determination unit 111.

The PU size table 113 is a table that associates the number and size of PUs with a combination of a CU size and a CU prediction type-PU partition type.

Here, a specific configuration example of the PU size table 113 will be described with reference to FIG.

In the PU size table 113 shown in FIG. 5, the number and size of PUs are defined according to the CU size and the PU partition type (intra CU and inter CU). Note that “d” in the table indicates the division depth of the CU.

In the PU size table 113, four sizes of 64 × 64, 32 × 32, 16 × 16, and 8 × 8 are defined as CU sizes.

In the PU size table 113, the number and size of PUs in each PU partition type are defined for the size of the CU.

For example, in the case of a 64 × 64 inter CU and 2N × N division, there are two PUs and the size is both 64 × 32.

Further, in the case of a 64 × 64 inter CU and 2N × nU division, there are two PUs and the sizes are 64 × 16 and 64 × 48.

Also, in the case of an 8 × 8 intra CU and N × N division, there are 4 PUs and all the sizes are 4 × 4.

Note that the PU partition type of the skip CU is estimated to be 2N × 2N. Further, in the table, a portion indicated by “−” indicates that the PU partition type cannot be selected.

That is, when the CU size is 8 × 8, PU partition types of asymmetric partitions (2N × nU, 2N × nD, nL × 2N, and nR × 2N) cannot be selected in the inter CU. In the case of an inter CU, an N × N PU partition type cannot be selected.

In intra prediction, an N × N PU partition type can be selected only when the CU size is 8 × 8.
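As a non-limiting illustration, the selection restrictions described above for the PU size table 113 can be expressed as follows. The function name selectable_part_modes and the string labels are hypothetical, and the sketch follows only the restrictions stated in the text, not the full contents of the table in FIG. 5.

```python
SYMMETRIC_INTER = ["2Nx2N", "2NxN", "Nx2N"]
ASYMMETRIC = ["2NxnU", "2NxnD", "nLx2N", "nRx2N"]


def selectable_part_modes(cu_size: int, pred_mode: str) -> list[str]:
    """Return the PU partition types selectable for a CU of the given size."""
    if pred_mode == "skip":
        # The PU partition type of a skip CU is estimated to be 2N x 2N.
        return ["2Nx2N"]
    if pred_mode == "intra":
        # N x N is selectable only when the CU size is 8 x 8.
        return ["2Nx2N", "NxN"] if cu_size == 8 else ["2Nx2N"]
    if pred_mode == "inter":
        # N x N is never selectable; asymmetric partitions are excluded at 8 x 8.
        return SYMMETRIC_INTER if cu_size == 8 else SYMMETRIC_INTER + ASYMMETRIC
    raise ValueError(f"unknown prediction mode: {pred_mode}")


print(selectable_part_modes(8, "inter"))   # ['2Nx2N', '2NxN', 'Nx2N']
print(selectable_part_modes(16, "intra"))  # ['2Nx2N']
```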

(Decoding module)
As shown in FIG. 1, the decoding module 10 includes a CU prediction mode decoding unit (decoding unit, changing unit) 1011, a binarized information storage unit 1012, a context storage unit 1013, and a probability setting storage unit 1014.

The CU prediction mode decoding unit 1011 decodes the syntax value from the binary included in the encoded data according to the encoded data and the syntax type supplied from the CU prediction mode determination unit 111. Specifically, the CU prediction mode decoding unit 1011 performs CU prediction mode and PU partition type decoding processing according to the binarization information stored in the binarization information storage unit 1012. The CU prediction mode decoding unit 1011 performs a skip flag decoding process.

The binarization information storage unit 1012 stores binarization information for the CU prediction mode decoding unit 1011 to decode a syntax value from binary. The binarization information is information indicating a correspondence between a binary (bin sequence) and a syntax value.
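As a non-limiting illustration, binarization information can be held as a table from bin sequences to syntax values and consulted as follows. The table contents, the syntax type name "example_part_mode", and the function name decode_syntax are hypothetical and do not reproduce any table of the embodiment. The sketch also assumes that the bins have already been obtained from the encoded data and that no bin sequence in a table is a prefix of another.

```python
# Hypothetical binarization information: bin sequence -> syntax value.
BINARIZATION = {
    "split_coding_unit_flag": {"0": 0, "1": 1},
    "example_part_mode": {"1": "2Nx2N", "01": "2NxN", "00": "Nx2N"},
}


def decode_syntax(bins: list[int], syntax_type: str) -> tuple[object, int]:
    """Return (syntax value, number of bins consumed) for the given syntax type."""
    table = BINARIZATION[syntax_type]
    prefix = ""
    for b in bins:
        prefix += str(b)
        # Stop as soon as the bins read so far match an entry of the table.
        if prefix in table:
            return table[prefix], len(prefix)
    raise ValueError("bin sequence does not match the binarization information")


print(decode_syntax([0, 1, 1], "example_part_mode"))  # ('2NxN', 2)
```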

The context storage unit 1013 stores a context that the CU prediction mode decoding unit 1011 refers to in the decoding process.

The probability setting storage unit 1014 stores probability setting values that the CU prediction mode decoding unit 1011 refers to when decoding a bin sequence from the encoded data by arithmetic decoding. The probability setting values include a probability setting value recorded for each context and a predetermined probability setting value. The probability setting value corresponding to each context is updated based on the result of arithmetic decoding. In contrast, the predetermined probability setting value is constant and is not updated by the result of arithmetic decoding. Note that a probability setting value may be expressed not as the probability value itself but as a state indicated by an integer value corresponding to the probability value.

[Specific configuration example]
[1-1] Example of Configuration that Limits Reference of Context
When the PU partition type is an asymmetric partition, the CU prediction mode decoding unit 1011 may decode the information indicating the partition type of the asymmetric partition without using a CABAC context. In other words, when a bin string corresponding to the information indicating the partition type of the asymmetric partition is decoded from the encoded data by arithmetic decoding, the decoding may be performed using the predetermined probability setting value (for example, a probability setting value in which 0 and 1 occur with equal probability) rather than the probability setting value recorded for each context in the probability setting storage unit 1014.

Hereinafter, with reference to FIG. 7, an example of a configuration for restricting context reference in this way will be described.

The CU prediction mode decoding unit 1011 decodes information indicating the type of asymmetric partition division assuming a specified probability.

Referring to FIG. 7, a more specific example of this configuration is as follows. In the association table BT1 shown in FIG. 7, for rectangular divisions, the prefix part indicates whether the division direction is horizontally long (horizontal) or vertically long (vertical), and the suffix part indicates the type of division.

For example, when the prefix part indicates that the PU partition type is a horizontally long partition, the suffix part indicates which of the three horizontally long partitions 2N × N, 2N × nU, and 2N × nD is selected.

When the PU partition type is a rectangular division, the CU prediction mode decoding unit 1011 performs arithmetic decoding of each bin of the suffix part by referring to the predetermined probability setting value set in the probability setting storage unit 1014 instead of the probability setting value recorded for each context. Note that this probability setting value can be set assuming an equal probability, for example.

Here, CABAC arithmetic decoding using a context refers to a process of recording and updating the occurrence probability of a binary value (a state indicating that probability) according to the position of the binary (the context), and performing arithmetic decoding based on that occurrence probability (state). In contrast, CABAC arithmetic decoding without a context refers to performing arithmetic decoding based on a fixed probability determined by a probability setting value, without updating the occurrence probability (state) of the binary value. When no context is used, the occurrence probability (state) is not updated in the encoding and decoding processes, so the processing load is reduced and throughput improves. In addition, no memory is needed for accumulating the occurrence probabilities (states) corresponding to the contexts. When a probability of 0.5 is used as the fixed probability, the scheme may be called EP coding (Equal-Probability coding) or bypass.
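The difference between the two modes can be illustrated with a deliberately simplified Python sketch. It is not the CABAC state machine: the exponential update rule, the update rate of 1/16, and the function names are assumptions chosen only to show that a context carries updatable state while bypass does not. The cost is the ideal code length in bits.

```python
import math


def context_cost(bins, p_one=0.5, rate=1 / 16):
    """Ideal code length in bits when P(bin == 1) is stored and updated per bin.

    Simplified stand-in for a context: p_one is the state that must be kept
    in memory and updated after every decoded bin.
    """
    bits = 0.0
    for b in bins:
        p = p_one if b == 1 else 1.0 - p_one
        bits += -math.log2(p)
        p_one += rate * (b - p_one)   # move the estimate toward the observed bin
    return bits


def bypass_cost(bins):
    """Ideal code length in bits with a fixed probability of 0.5 and no state."""
    return float(len(bins))


repeated = [1] * 50          # the same bin value occurs consecutively
alternating = [0, 1] * 25    # the same bin value never repeats
print(context_cost(repeated), bypass_cost(repeated))
print(context_cost(alternating), bypass_cost(alternating))
```

Under these assumptions the adaptive estimate pays off only for the repeated sequence. For the alternating sequence it is no better than the fixed probability of 0.5.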

The operation and effect of the above configuration will be described with reference to FIG. First, a context is effective for improving coding efficiency when the same code occurs consecutively in a specific situation. Decoding the suffix part with reference to a context improves coding efficiency specifically when 2N × N, 2N × nU, or 2N × nD is selected consecutively in a situation where horizontally long division is selected. An example is the case where 2N × N is selected in the prediction unit following a prediction unit in which 2N × N was selected.

On the other hand, the partition is often set so as not to cross the edge boundary as shown in FIG.

That is, as shown in FIG. 6, when an edge E1 having an inclination exists in the region, the PU partition types of CU10 and CU20 are determined so as not to cross the edge E1.

More specifically, in CU10, the edge E1 exists near the center in the vertical direction of the region, and in CU20, the edge E1 exists above the region.

When the edge E1 having such an inclination exists in the region, the CU 10 is divided into PU11 and PU12 that are symmetrical by the 2N × N PU partition type so as not to cross the edge E1.

In addition, the CU 20 is divided into asymmetric PU 21 and PU 22 by a 2N × nU division type so as not to cross the edge E1.

In this way, when the edge E1 having an inclination exists in the region, partitions of the same shape may not appear consecutively.

Such a case does not correspond to the case where 2N × N, 2N × nU, or 2N × nD is selected consecutively. In such a case, coding efficiency may not decrease even if no context is used.

As in the above configuration, decoding the above information assuming a prescribed probability for the prefix part can simplify the decoding process of pred_type while maintaining the encoding efficiency.

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, wherein the division types into prediction units include division into rectangular prediction units, and the codes for specifying the division into rectangles include a code indicating whether the rectangle is vertically long or horizontally long and a code indicating the type of the rectangle, the apparatus comprising decoding means for decoding the code indicating the type of the rectangle without using a context.

Therefore, it is possible to simplify the processing by not referring to the context while maintaining the encoding efficiency.

Note that the above example can also be expressed as follows: when decoding information for selecting any PU partition type from a set of PU partition types that corresponds to PU partitioning including a plurality of rectangular partitions and that includes both symmetric and asymmetric partition types, decoding may be performed without using a context.

Further, in the above example, instead of using no context for any bin, a context may be used for some of the bins in the bin sequence corresponding to the information related to the selection of the asymmetric partition. For example, in the example of FIG. 7 described above, when a division including a rectangular partition is selected in a CU larger than 8 × 8, a bin string of up to two digits is decoded. Of these, the first digit is information indicating whether the division is symmetric or asymmetric. The second digit is a bin that is decoded when the first digit is “0”, that is, when it indicates an asymmetric division, and represents the positional relationship between the small PU and the large PU in the asymmetric division. For the first digit, it is preferable not to set a context because the same code may not occur consecutively, for the reason described with reference to FIG. For the second digit, on the other hand, given the precondition that an asymmetric partition is used, the small PU tends locally to lie toward one side (for example, when the second digit represents the selection between 2N × nU and 2N × nD), so it is preferable to set a context.

[1-2] Configuration for Decoding CU Prediction Type Information (pred_type)
The CU prediction mode decoding unit 1011 may be configured to decode the CU prediction type information by referring to the binarization information stored in the binarization information storage unit 1012, as described below.

A configuration example of the binarized information stored in the binarized information storage unit 1012 will be described with reference to FIG. FIG. 7 is a table showing an example of binarization information that defines associations between combinations of CU prediction types and PU partition types and bin sequences.

In FIG. 7, for example, the binarization information is shown in a table format that associates a bin column with a CU prediction type-PU partition type, but is not limited thereto. The binarization information may be a derivation formula for deriving the PU partition type and the CU prediction type. The same applies to binarization information described later.

In addition, the binarized information does not have to be stored as data, and may be realized as a program logic for performing a decoding process.

In the association table BT1 illustrated in FIG. 7, bin columns are associated according to the CU prediction type, the PU partition type, and the CU size.

First, the definition of the CU size will be described. In the association table BT1, two types of associations are defined according to the CU size: the non-8 × 8 CU 1012B for the case where the CU size is larger than 8 × 8 (CU > 8 × 8), and the 8 × 8 CU 1012A for the case where the CU size is 8 × 8 (CU == 8 × 8).

Note that the bin sequence associated with each of the non-8 × 8 CU 1012B and the 8 × 8 CU 1012A includes a prefix part (prefix) and a suffix part (suffix).

In addition, in the association table BT1, for the definition of each CU size, two systems are defined as CU prediction types: the above-described intra CU (shown as “Intra”) and inter CU (shown as “Inter”). Furthermore, PU partition types are defined according to each CU prediction type.

Specifically, it is as follows. First, in the case of an intra CU, two PU partition types of 2N × 2N and N × N are defined.

2N × 2N is described as follows. In the non-8 × 8 CU 1012B, only the prefix part is defined, and the bin string is “000”. The suffix part is not encoded. In the 8 × 8 CU 1012A, the prefix part is “000” and the suffix part is “0”.

On the other hand, N × N is defined only in the 8 × 8 CU 1012A. In this case, the prefix part is “000” and the suffix part is “1”.

Thus, in the case of an intra CU, the prefix part is “000” in common.

Next, in the case of an inter CU, seven PU partition types of 2N × 2N, 2N × N, 2N × nU, 2N × nD, N × 2N, nL × 2N, and nR × 2N are defined.

When the PU partition type is 2N × 2N, only the prefix part is defined in both the non-8 × 8 CU 1012B and the 8 × 8 CU 1012A, and the bin string is “1”.

In the non-8 × 8 CU 1012B, a common prefix part “01” is assigned to the PU partition type 2N × N, 2N × nU, and 2N × nD of the horizontally long partition that performs horizontal partitioning.

Also, the suffix portions of 2N × N, 2N × nU, and 2N × nD are “1”, “00”, and “01”, respectively.

Also, a common prefix part “001” is assigned to the PU partition types of vertical partitions that perform vertical partitioning, N × 2N, nL × 2N, and nR × 2N.

The suffix parts of N × 2N, nL × 2N, and nR × 2N are “1”, “00”, and “01”, respectively. The suffix portion is the same as in the case of the PU partition type that performs the horizontal partition described above.

That is, in the definitions of the horizontally long and vertically long partitions, the suffix part represents the type of division. In the case of symmetric division, the bin is “1”. “00” indicates that the division boundary is closer to the origin than in the case of symmetric division, and “01” indicates that the division boundary is farther from the origin than in the case of symmetric division.

Subsequently, in the 8 × 8 CU 1012A, only the prefix portion is defined for 2N × 2N, 2N × N, and N × 2N. The prefix portions of 2N × 2N, 2N × N, and N × 2N are “1”, “01”, and “001”, respectively.
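The associations described for table BT1 can be sketched in Python as a lookup keyed by the concatenated prefix and suffix bins. The dictionary and function names are hypothetical, bins are modeled as the characters '0' and '1' that have already been arithmetic-decoded, and the intra N × N entry is placed in the 8 × 8 table on the assumption that it follows the rule that intra N × N is selectable only for 8 × 8 CUs.

```python
# Hypothetical encoding of table BT1: full bin string (prefix + suffix)
# mapped to (CU prediction type, PU partition type).
BT1_NON_8X8 = {
    "000": ("intra", "2Nx2N"),
    "1": ("inter", "2Nx2N"),
    "011": ("inter", "2NxN"),     # prefix "01",  suffix "1"
    "0100": ("inter", "2NxnU"),   # prefix "01",  suffix "00"
    "0101": ("inter", "2NxnD"),   # prefix "01",  suffix "01"
    "0011": ("inter", "Nx2N"),    # prefix "001", suffix "1"
    "00100": ("inter", "nLx2N"),  # prefix "001", suffix "00"
    "00101": ("inter", "nRx2N"),  # prefix "001", suffix "01"
}
BT1_8X8 = {
    "0000": ("intra", "2Nx2N"),   # prefix "000", suffix "0"
    "0001": ("intra", "NxN"),     # prefix "000", suffix "1" (assumed placement)
    "1": ("inter", "2Nx2N"),
    "01": ("inter", "2NxN"),
    "001": ("inter", "Nx2N"),
}


def decode_pred_type(bins, cu_is_8x8):
    """Read bins one at a time until they match exactly one table entry.

    Returns the matched entry and the number of bins consumed. Both tables
    are prefix-free, so the first match is the only possible match.
    """
    table = BT1_8X8 if cu_is_8x8 else BT1_NON_8X8
    code = ""
    for b in bins:
        code += b
        if code in table:
            return table[code], len(code)
    raise ValueError("bin string matches no entry: " + code)


print(decode_pred_type("0100", cu_is_8x8=False))
print(decode_pred_type("0001", cu_is_8x8=True))
```

A table lookup is only one possible realization. As noted above, the binarization information may equally be implemented as a derivation formula or as program logic.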

The CU prediction mode decoding unit 1011 may use different contexts for the respective bin positions of the prefix part and the suffix part in the decoding process according to the above binarization information.

When different contexts are used for each bin position of the prefix part and the suffix part, there are eight contexts in total, as follows.

First, since the prefix part defines bins of up to 3 bits, it has three contexts.

Next, the suffix part has one context for 2N × 2N and N × N, two for the horizontally long partitions (2N × N, 2N × nU, and 2N × nD), and two for the vertically long partitions (N × 2N, nL × 2N, and nR × 2N).

[1-3] Configuration of decoding short code for intra CU in small size CU
The CU prediction mode decoding unit 1011 may be configured to decode a short code for an intra CU in a small size CU. A small size CU is a CU having a size equal to or smaller than a predetermined size. In the following, it is assumed to be a CU of size 8 × 8.

[Configuration Example 1-3-1]
Therefore, the binarization information stored in the binarization information storage unit 1012 may be configured as shown in FIG. FIG. 8 shows another configuration example of 8 × 8 CU 1012A which is the definition of binarization information. That is, 8 × 8 CU 1012A_1 illustrated in FIG. 8 is another configuration example of 8 × 8 CU 1012A included in association table BT1 illustrated in FIG.

As shown in FIG. 8, in the 8 × 8 CU 1012A_1 that is the definition of the binarization information, a short code is assigned to the intra CU in the 8 × 8 size CU that is a small size CU.

In the 8 × 8 CU 1012A_1 shown in FIG. 8, the intra CU is assigned a code shorter than the code assigned to the intra CU in a large size CU (see the non-8 × 8 CU 1012B in FIG. 7). A large size CU is a CU that is not a small size CU, and specifically means a CU having a size larger than 8 × 8.

Also, in 8 × 8 CU 1012A_1, the code assigned to the intra CU is shorter than the code assigned to the inter CU. In other words, in a CU of the same size, a shorter code is assigned to an intra CU than other PU partition types that are not intra CUs.

For example, in the 8 × 8 CU 1012A_1, a 1-bit code is assigned to the intra CU, and a 2-bit or 3-bit code is assigned to the inter CU.

In a region where inter prediction is unlikely to be accurate, intra prediction with a small CU tends to be used. Therefore, in a small CU, the usage rate of the intra CU is high. In the configuration example shown in FIG. 7, however, a long code is assigned to the intra CU. In contrast, according to the above data configuration, a short code is assigned to a small size intra CU.

Thereby, in a region where inter prediction is unlikely to be accurate, the CU prediction mode decoding unit 1011 decodes a short code for a small size intra CU. This has the effect of improving coding efficiency.

In the above configuration, it is preferable that the CU prediction mode decoding unit 1011 sets different contexts for the prefix portion of the large CU and the prefix portion of the small CU.

Therefore, the context storage unit 1013 may store an 8 × 8 CU prefix 1013A, which is a context for decoding the prefix portion of a small CU, and a non-8 × 8 CU prefix 1013B, which is a context for decoding the prefix portion of a large CU. Here, the 8 × 8 CU prefix 1013A and the non-8 × 8 CU prefix 1013B are different contexts.

The meaning of each bin in the prefix portion is different between a small size CU (CU == 8 × 8) and a large size CU (CU> 8 × 8).

For example, in a small size CU, the first bit of the prefix portion is information indicating whether the CU prediction type is an intra CU or an inter CU, whereas in a large size CU, it is information indicating whether the CU is a 2N × 2N inter CU or another inter CU.

Also, bins with different meanings have different appearance tendencies. For this reason, when the same context is set in both the large-size CU and the small-size CU, the appearance tendency of bins is different, which may reduce the coding efficiency.

According to the above configuration, since different contexts can be set according to bins having different appearance tendencies, the encoding efficiency of bins can be improved.

[Configuration Example 1-3-2]
Alternatively, the binarization information stored in the binarization information storage unit 1012 may be configured as shown in FIG. 9. FIG. 9 shows another configuration example of 8 × 8 CU 1012A which is the definition of binarization information. That is, 8 × 8 CU 1012A_2 illustrated in FIG. 9 is another configuration example of 8 × 8 CU 1012A included in association table BT1 illustrated in FIG.

In the 8 × 8 CU 1012A_2 shown in FIG. 9, the bin string is composed of three parts: a flag, a prefix part, and a suffix part.

In the case of an intra CU, the flag is “1”, and in the case of an inter CU, the flag is “0”.

Also, in the case of intra CU, only the suffix part is defined. That is, when the PU partition type is 2N × 2N, the suffix part is “0”, and when it is N × N, the suffix part is “1”.

On the other hand, in the case of inter CU, only the prefix part is defined. That is, when 2N × 2N, 2N × N, and N × 2N, they are “1”, “01”, and “00”, respectively.
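The three-part structure of the 8 × 8 CU 1012A_2 (flag, prefix part, suffix part) can be sketched as follows. The function name is hypothetical, and the bins are assumed to be given as a string of '0' and '1' characters that have already been arithmetic-decoded.

```python
def decode_8x8_cu_a2(bins):
    """Decode (CU prediction type, PU partition type) per the FIG. 9 layout.

    Hypothetical sketch: one flag bin, then either a one-bin suffix (intra)
    or a prefix of up to two bins (inter).
    """
    it = iter(bins)
    flag = next(it)
    if flag == "1":
        # Intra CU: only the suffix part follows ("0" or "1").
        return ("intra", "2Nx2N" if next(it) == "0" else "NxN")
    # Inter CU: only the prefix part follows ("1", "01", or "00").
    if next(it) == "1":
        return ("inter", "2Nx2N")
    return ("inter", "2NxN" if next(it) == "1" else "Nx2N")


for code in ("10", "11", "01", "001", "000"):
    print(code, decode_8x8_cu_a2(code))
```

Because the flag alone separates intra from inter, the remaining prefix bins keep the same meaning as in the large size CU.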

In the 8 × 8 CU 1012A_2 illustrated in FIG. 9, as in the 8 × 8 CU 1012A_1 illustrated in FIG. 8, the intra CU is assigned a code shorter than the code assigned to the intra CU in a large size CU, and the code assigned to the intra CU is shorter than the code assigned to the inter CU.

By configuring the 8 × 8 CU 1012A_2 as shown in FIG. 9, the CU prediction mode decoding unit 1011 decodes a short code for a small size intra CU in a region where inter prediction is unlikely to be accurate. This has the effect of improving coding efficiency.

In the above configuration, it is preferable to set for the flag a unique context different from the contexts set for the prefix part and the suffix part. For the prefix part, it is preferable to set the same context for the small CU and the large CU.

For example, the context storage unit 1013 may store one context in which the 8 × 8 CU prefix 1013A and the non-8 × 8 CU prefix 1013B are integrated.

In the above configuration, each bin is designed to have the same meaning between the prefix portion of the small CU and the prefix portion of the large CU. For this reason, by setting the same context between the two, it is possible to improve the encoding efficiency of bin.

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by decoding, for each coding unit, information for restoring the image from encoded image data, the apparatus comprising decoding means that, regarding the codes assigned to combinations of a coding unit size and a prediction scheme applied to the coding unit, decodes, for a combination in which the intra prediction scheme is applied to a coding unit of a size equal to or smaller than a predetermined size, a code shorter than the codes assigned to combinations other than that combination.

Therefore, it is possible to assign a short code to a combination having a high probability of occurrence in a coding unit having a size equal to or smaller than a predetermined size, and the effect of improving the coding efficiency is achieved.

[1-4] Configuration for Changing the Interpretation of a Bin Sequence According to an Adjacent Prediction Parameter
The CU prediction mode decoding unit 1011 may be configured to change the interpretation of a bin sequence with reference to a prediction parameter assigned to an adjacent region.

[Configuration Example 1-4-1]
Therefore, the binarization information stored in the binarization information storage unit 1012 may be configured as shown in FIG.

FIG. 10 is a diagram showing still another configuration example of the binarization information stored in the binarization information storage unit 1012.

In the binarization information association table BT20 shown in FIG. 10, the 8 × 8 CU 1012A shown in FIG. 7 is replaced with the definition of the inter CU (1012D) and the definition of the intra CU (1012C), so that the interpretation of the bin sequence changes according to the value of the prediction parameter of the adjacent region.

Specifically, in the association table BT20, the definition for a small size CU changes the interpretation of the bin sequence between the inter CU 1012D, which is the definition of the binarization information used when at least one of the adjacent CUs is an inter CU, and the intra CU 1012C, which is the definition of the binarization information used when the adjacent CUs are both intra CUs.

In the inter CU 1012D (when at least one of the adjacent CUs is an inter CU), the target CU is interpreted as an intra CU (2N × 2N or N × N) when the bin string of the prefix part is “000”, and as a 2N × 2N inter CU when the bin string of the prefix part is “1”.

On the other hand, in the intra CU 1012C (when both adjacent CUs are intra CUs), the target CU is interpreted as an intra CU (2N × 2N or N × N) when the bin string of the prefix part is “1”, and as a 2N × 2N inter CU when the bin string of the prefix part is “000”.

When a nearby CU is an intra CU, there is a high possibility that the target CU is also an intra CU due to spatial correlation. Therefore, when the neighboring CU is an intra CU, the code amount can be reduced by assigning a short code to the intra CU.

In addition, in a small size CU, the frequency of intra CU generation is high. For this reason, in a small size CU, by assigning a short code to an intra CU, the encoding efficiency can be further improved.

On the other hand, a CU that is not a small size CU (for example, a large size CU) need not adopt the configuration of “assigning a short code to the intra CU when the adjacent CUs are both intra”, as shown in FIG. 10. The CU sizes to which this configuration is applied can be determined according to the frequency of occurrence of intra CUs. In general, the smaller the CU, the higher the intra CU selection rate tends to be. It is therefore preferable to assign a short code to the intra CU in CUs of a predetermined size (for example, 16 × 16) or smaller, including the minimum size CU. According to the above configuration, when the neighboring CUs are intra CUs, the CU prediction mode decoding unit 1011 refers to the intra CU 1012C and assigns a short code to the intra CU. On the other hand, when an inter CU is included in the neighboring CUs, the CU prediction mode decoding unit 1011 refers to the inter CU 1012D and assigns a short code to the inter CU. As a result, it is possible to reduce the code amount and improve the coding efficiency.
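The switching between the inter CU 1012D and the intra CU 1012C can be sketched as follows. The function name and return labels are hypothetical. Treating the left and upper CUs as the two adjacent CUs is an assumption, and only the two prefix values discussed above are handled.

```python
def interpret_small_cu_prefix(prefix, left_is_intra, upper_is_intra):
    """Interpret the prefix bins of a small size CU from its neighbors.

    Hypothetical sketch: the short code "1" goes to the intra CU when both
    adjacent CUs are intra (1012C), and to the 2Nx2N inter CU otherwise (1012D).
    """
    both_intra = left_is_intra and upper_is_intra
    if both_intra:
        short_meaning, long_meaning = "intra", "inter 2Nx2N"   # intra CU 1012C
    else:
        short_meaning, long_meaning = "inter 2Nx2N", "intra"   # inter CU 1012D
    if prefix == "1":
        return short_meaning
    if prefix == "000":
        return long_meaning
    raise ValueError("prefix not covered by this sketch: " + prefix)


print(interpret_small_cu_prefix("1", True, True))
print(interpret_small_cu_prefix("1", True, False))
```

The same bin string therefore yields a different CU prediction type depending only on already-decoded neighboring parameters.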

[Configuration Example 1-4-2]
Alternatively, the binarization information stored in the binarization information storage unit 1012 may be configured as shown in FIG. 11.

FIG. 11 is a diagram showing still another configuration example of the binarization information stored in the binarization information storage unit 1012.

In the binarization information association table BT30 shown in FIG. 11, the non-8 × 8 CU 1012B shown in FIG. 7 is replaced with the definition “the size of the upper CU is equal to or larger than the target CU” (1012B_1) and the definition “the size of the upper CU is smaller than the target CU” (1012B_2), so that the interpretation of the bin sequence changes according to the value of the prediction parameter of the adjacent region.

That is, in the association table BT30, the definition for a large size CU is configured so that the interpretation of the bin string changes between the case where the upper adjacent CU is equal to or larger than the target CU and the case where the upper adjacent CU is smaller than the target CU.

In “the size of the upper CU is equal to or larger than the target CU” 1012B_1 (when the upper adjacent CU is equal to or larger than the target CU), the target CU is interpreted as a vertically long partition when the bin string of the prefix part is “001”, and as a horizontally long partition when the bin string of the prefix part is “01”.

On the other hand, in “the size of the upper CU is smaller than the target CU” 1012B_2 (when the upper adjacent CU is smaller than the target CU), when the bin column of the prefix part is “01”, the target CU is a vertically long partition. When the bin string in the prefix part is “001”, the target CU is interpreted as a horizontally long partition.

When the size of the adjacent CU is smaller than the target CU, there is a high possibility that an edge exists in the adjacent CU.

Therefore, in such a case, there is a high possibility that the target CU is partitioned in a direction perpendicular to the side corresponding to the boundary with the adjacent CU. For this reason, when the upper adjacent CU is smaller than the target CU, there is a high possibility that the vertically long partition is selected.

Therefore, when the upper adjacent CU is smaller than the target CU, encoding efficiency can be improved by assigning a short code to a vertically long partition that is highly likely to be selected.

According to the above configuration, when the upper adjacent CU is smaller than the target CU, the CU prediction mode decoding unit 1011 refers to “the size of the upper CU is smaller than the target CU” 1012B_2 and assigns a short code to the vertically long partition.

On the other hand, when the upper adjacent CU is equal to or larger than the target CU, the CU prediction mode decoding unit 1011 refers to “the size of the upper CU is equal to or larger than the target CU” 1012B_1 and assigns a short code to the horizontally long partition. As a result, it is possible to reduce the code amount and improve the coding efficiency.
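A minimal Python sketch of the switching between 1012B_1 and 1012B_2 is shown below. The function name, the return labels, and the use of log2 sizes as arguments are assumptions made for illustration.

```python
def interpret_large_cu_prefix(prefix, upper_log2_size, target_log2_size):
    """Interpret the prefix bins of a large size CU from the upper CU size.

    Hypothetical sketch: the shorter prefix "01" is given to the vertically
    long partition when the upper adjacent CU is smaller than the target CU
    (1012B_2), and to the horizontally long partition otherwise (1012B_1).
    """
    upper_is_smaller = upper_log2_size < target_log2_size
    if prefix == "01":
        return "vertically long" if upper_is_smaller else "horizontally long"
    if prefix == "001":
        return "horizontally long" if upper_is_smaller else "vertically long"
    raise ValueError("prefix not covered by this sketch: " + prefix)


print(interpret_large_cu_prefix("01", upper_log2_size=3, target_log2_size=4))
print(interpret_large_cu_prefix("01", upper_log2_size=4, target_log2_size=4))
```

Only the direction indicated by the prefix part is swapped. The suffix part is left out of the sketch because its decoding does not depend on the upper CU size.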

Furthermore, it is preferable that the interpretation of the suffix part be the same regardless of how the prefix part is interpreted based on the adjacent CU. In the association table BT30, the interpretation of a given suffix part is the same regardless of whether the prefix part indicates a vertically long partition or a horizontally long partition. In other words, in the example of the association table BT30, the decoding process of the suffix part need not be changed depending on whether the prefix part indicates a vertically long partition or a horizontally long partition.

That is, the association table BT30 is configured such that the PU partition type (number of partitions) does not depend on the parameter to be referred to.

Since the number of divisions does not depend on the value of the parameter to be referenced, even if an error has occurred in the reference parameter, there is little effect on the subsequent variable length decoding process. Specifically, even when there is an error in the size of the adjacent CU and the interpretation of whether the prefix portion indicates a vertically long or horizontally long partition is incorrect, decoding of the subsequent syntax including the suffix portion can be continued.

That is, since the suffix part can be decoded regardless of the size of the adjacent CU, the error resistance is improved without being influenced by the error of the adjacent parameter.

In addition, when the left adjacent CU is smaller than the target CU, there is a high possibility that the horizontally long partition is selected. Therefore, when the left adjacent CU is smaller than the target CU, a short code may be assigned to a horizontally long partition that is easily selected. As a result, the same effect as described above can be obtained.

In addition, it is preferable not to perform the interpretation switching process based on the size of the adjacent CU for the minimum size CU. When the target CU is a CU having the minimum size, the target CU is always smaller than or equal to the adjacent CU. Therefore, the decoding process can be simplified by omitting the interpretation switching process.

In addition, the size of the upper adjacent CU being smaller than the size of the target CU can also be expressed as a CU boundary perpendicular to the upper side of the target CU existing on that upper side (excluding the vertices).

Therefore, when there is a CU boundary or PU boundary that is perpendicular to the upper side of the target CU (excluding the vertex), a short code may be assigned to the vertically long partition.

In the above description, the adjacent CU adjacent to the target CU has been described, but the present invention is not limited thereto. The same can be said for CUs located in the vicinity to such an extent that spatial correlation is recognized.

The above configuration can be generalized as follows. That is, for a plurality of binary strings and a plurality of corresponding pred_type values having the same number of divisions, the above configuration determines the priority order of occurrence of each pred_type according to the adjacent prediction parameter, and associates a pred_type of higher priority with a shorter binary string.

In the above description, the condition that the size of the upper adjacent CU is smaller than the size of the target CU can also be expressed as follows.

(1) Let the upper left pixel in the target CU be (xc, yc).

(2) A CU including (xc, yc-1) is derived as the upper adjacent CU, and the upper left pixel of the upper adjacent CU is (xu, yu).

(3) If “log2CUSize [xu] [yu] <log2CUSize [xc] [yc]” holds, it is determined that the size of the upper adjacent CU is smaller than the size of the target CU. Here, log2CUSize [x] [y] is a logarithmic value with 2 as the base of the size of the CU having the pixel (x, y) as the upper left pixel.

Here, it is preferable that only the size of the CU positioned above the upper left pixel of the target CU is compared with the size of the target CU as described above.

The above description covers the upper adjacent CU. When determining the size of the left adjacent CU as well, it is preferable to compare only the size of the CU located to the left of the upper left pixel of the target CU with the size of the target CU.

In the above determination procedure (3), an example in which the values of the CU size are directly compared has been described. However, another value associated with the size of the CU may be compared instead. For example, the above condition (3) may be determined using the CU division depth (cuDepth [x] [y]), which corresponds to how many times the tree block (LCU) has been divided, by checking whether “cuDepth [xu] [yu] > cuDepth [xc] [yc]” holds.
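Steps (1) to (3) can be sketched in Python as follows. The helper build_maps and the dictionary-based lookup are hypothetical stand-ins for the decoder's internal arrays, and returning False when no upper CU exists is an assumption that the text does not address.

```python
def build_maps(cus):
    """Build lookup tables from a list of (x0, y0, log2_size) square CUs.

    Hypothetical helper: cu_origin maps every pixel to the upper left pixel
    of the CU containing it, and log2_cu_size maps an upper left pixel to
    the base-2 logarithm of that CU's size.
    """
    cu_origin, log2_cu_size = {}, {}
    for x0, y0, log2_size in cus:
        log2_cu_size[(x0, y0)] = log2_size
        side = 1 << log2_size
        for y in range(y0, y0 + side):
            for x in range(x0, x0 + side):
                cu_origin[(x, y)] = (x0, y0)
    return cu_origin, log2_cu_size


def upper_cu_is_smaller(xc, yc, cu_origin, log2_cu_size):
    """Apply steps (1) to (3) for a target CU whose upper left pixel is (xc, yc)."""
    above = (xc, yc - 1)                 # step (2): pixel just above (xc, yc)
    if above not in cu_origin:
        return False                     # no upper CU available (assumption)
    xu, yu = cu_origin[above]            # upper left pixel of the upper adjacent CU
    return log2_cu_size[(xu, yu)] < log2_cu_size[(xc, yc)]   # step (3)


# Four 8x8 CUs above one 16x16 target CU whose upper left pixel is (0, 16).
layout = [(0, 0, 3), (8, 0, 3), (0, 8, 3), (8, 8, 3), (0, 16, 4)]
origin, size = build_maps(layout)
print(upper_cu_is_smaller(0, 16, origin, size))
```

Only the CU containing the pixel directly above the upper left pixel is examined, in line with the preference stated above. The depth-based variant would replace the last comparison with a check that the upper CU's division depth is greater than that of the target CU.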

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, the apparatus comprising changing means for changing a plurality of codes, which are associated with a plurality of combinations of a prediction scheme and a division type that is a type of dividing a target coding unit, which is the coding unit to be decoded, into the prediction units, according to a decoded parameter assigned to a decoded prediction unit in the vicinity of a target prediction unit, which is the prediction unit to be decoded.

Therefore, a shorter code can be assigned to a combination of a prediction scheme and a division type having a higher probability of occurrence according to a decoded parameter assigned to a neighboring decoded prediction unit, and coding efficiency can thereby be improved.

(2) Details of PU Information Decoding Unit
Next, configuration examples of the PU information decoding unit 12 and the decoding module 10 will be described with reference to FIG. 12. FIG. 12 is a functional block diagram illustrating the configuration for decoding motion information, that is, the configurations of the PU information decoding unit 12 and the decoding module 10 in the video decoding device 1.

Hereinafter, the configuration of each unit will be described in the order of the PU information decoding unit 12 and the decoding module 10.

(PU information decoding unit)
As shown in FIG. 12, the PU information decoding unit 12 includes a motion compensation parameter derivation unit (bi-prediction restriction unit, candidate determination unit, estimation unit) 121, a merge candidate priority order information storage unit 122, and a reference frame setting information storage unit 123.

The motion compensation parameter deriving unit 121 derives the motion compensation parameter of each PU included in the target CU from the encoded data.

Specifically, the motion compensation parameter deriving unit 121 derives a motion compensation parameter in the following procedure. Here, when the target CU is a skip CU, a skip index may be decoded instead of the merge index, and a prediction parameter in the skip CU may be derived based on the value.

First, the motion compensation parameter derivation unit 121 determines a skip flag. As a result, if the target CU is a non-skip CU, the motion information decoding unit 1021 is used to decode the merge flag.

Here, when the target CU is a skip CU or the target PU is a merge PU, the motion compensation parameter deriving unit 121 decodes the merge index and derives the prediction parameters (motion vector, reference image index, inter prediction flag) based on the decoded merge index value. The motion compensation parameter deriving unit 121 determines the merge candidate specified by the merge index according to the merge candidate information stored in the merge candidate priority information storage unit 122.

On the other hand, when the target CU is not a skip CU and the target PU is not a merge PU, the motion compensation parameter derivation unit 121 decodes the prediction parameters (inter prediction flag, reference image index, motion vector difference, estimated motion vector index).

Furthermore, the motion compensation parameter deriving unit 121 derives an estimated motion vector based on the value of the estimated motion vector index, and derives a motion vector based on the motion vector difference and the estimated motion vector.
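The derivation procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: `reader` stands in for the motion information decoding unit 1021, and its `decode` method, the `FakeReader` class, and the dictionary layout of the parameters are hypothetical. The motion vector is formed as the usual sum of the estimated motion vector and the motion vector difference.

```python
# Illustrative sketch only. `reader` stands in for the motion information
# decoding unit 1021; its method names are hypothetical.

def derive_motion_params(reader, is_skip_cu, merge_candidates, mvp_candidates):
    """Derive the motion compensation parameters of one PU."""
    # The merge flag is decoded only for a non-skip CU.
    merge_flag = False if is_skip_cu else reader.decode("merge_flag")

    if is_skip_cu or merge_flag:
        # Skip CU or merge PU: the merge index selects a candidate whose
        # parameters (motion vector, reference image index, inter prediction
        # flag) are reused.
        return dict(merge_candidates[reader.decode("merge_idx")])

    # Otherwise the prediction parameters are decoded explicitly.
    inter_pred_flag = reader.decode("inter_pred_flag")
    ref_idx = reader.decode("ref_idx")
    mvd = reader.decode("mvd")
    mvp = mvp_candidates[reader.decode("mvp_idx")]
    # Motion vector = estimated motion vector + motion vector difference.
    mv = (mvp[0] + mvd[0], mvp[1] + mvd[1])
    return {"inter_pred_flag": inter_pred_flag, "ref_idx": ref_idx, "mv": mv}


class FakeReader:
    """Hypothetical stand-in that returns preset syntax values."""

    def __init__(self, values):
        self.values = values

    def decode(self, name):
        return self.values[name]


reader = FakeReader({"merge_flag": 0, "inter_pred_flag": "Pred_LC",
                     "ref_idx": 0, "mvd": (1, -2), "mvp_idx": 1})
params = derive_motion_params(reader, False, [], [(0, 0), (4, 4)])
```

In this example the PU is neither skipped nor merged, so the motion vector is reconstructed from the second estimated motion vector candidate and the decoded difference.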

The merge candidate priority information storage unit 122 stores merge candidate information including information indicating which region is a merge candidate and information indicating the priority of the merge candidate.

The reference frame setting information storage unit 123 stores reference frame setting information used to determine which inter prediction scheme to use: single prediction, which refers to one reference image, or bi-prediction, which refers to two reference images.

(Decoding module)
As illustrated in FIG. 12, the decoding module 10 includes a motion information decoding unit 1021. The motion information decoding unit 1021 decodes syntax values from the binary data included in the encoded data according to the encoded data and syntax type supplied from the motion compensation parameter deriving unit 121. The motion compensation parameters decoded by the motion information decoding unit 1021 are a merge flag (merge_flag), a merge index (merge_idx), an estimated motion vector index (mvp_idx), a reference image index (ref_idx), an inter prediction flag (inter_pred_flag), and a motion vector difference (mvd).

[Configuration Example for Deriving Prediction Parameters in Merge PU]
[2-1] Example of Merge Candidate Position and Priority Order
Derivation of prediction parameters in the merge PU will be described with reference to FIGS. 13 to 15.

The motion compensation parameter deriving unit 121 may be configured such that when the PU partition type is asymmetric, the priority order of the merge candidates is determined by a method different from that when the PU partition type is symmetric.

First, the characteristics of the asymmetric partition will be described. In an asymmetric partition, there is a high possibility that an edge in the long side direction exists in the smaller partition. In addition, there is a high possibility that an accurate motion vector is derived in a region where an edge exists.

This will be described in detail with reference to FIG. 13. FIG. 13 shows a CU for which an asymmetric partition has been selected. As illustrated in FIG. 13, in the target CU 30, an edge E1 having an inclination exists in the region, and a 2N × nU PU partition type is selected.

The target CU includes PU31 and PU32. Here, the target PU is PU31. An edge E1 having an inclination crosses the region of the target PU 31.

In the example shown in FIG. 13, there is a high possibility that the same edge as the edge existing in the region of the target PU 31 also exists in the region R10 near the short side of the target PU 31. Therefore, there is a high possibility that the same motion vector (mv) as that of the target PU 31 is assigned to the region R10.

Therefore, the accuracy of the motion vector can be improved by referring, in a region where an edge is likely to exist (that is, in the small partition), to the motion vector assigned to the region on the short side, and by referring, in the large partition, to the motion vector assigned to a region around the small partition.

For this reason, the merge candidate priority information stored in the merge candidate priority information storage unit 122 is configured to include two pieces of merge candidate priority information: one for the symmetric PU partition type 122A and one for the asymmetric PU partition type 122B.

The merge candidate priority information of the symmetric PU partition type 122A will be described with reference to FIG. 14.

FIG. 14 shows a CU in which a symmetric partition is selected. As shown in FIG. 14, a 2N × N PU partition type is selected in the symmetric CU. In the same figure, the target PU is indicated by “Curr PU”. In the target PU, priorities are assigned in the order of left (L), top (U), top right (UR), bottom left (BL), top left (UL) as merge candidates.

The merge candidate priority information of the asymmetric PU partition type 122B will be described with reference to FIG. 15. (a) and (b) of FIG. 15 show the setting of the priority order in the smaller partition of 2N × nU and the larger partition of 2N × nU, respectively. (c) and (d) of FIG. 15 show the setting of the priority order in the larger partition of 2N × nD and the smaller partition of 2N × nD, respectively.

As shown in (a) and (d) of FIG. 15, in the smaller partition of an asymmetric partition, a higher priority is assigned to the merge candidate on the short side.

Specifically, in the smaller PU of 2N × nU and 2N × nD, as shown in (a) and (d) respectively, priorities are assigned in the order of the candidate adjacent to the short side (L), the candidates adjacent to the vertices (UR, BL, UL), and the candidate adjacent to the long side (U).

Further, as shown in (b) and (c) of FIG. 15, in the larger partition of an asymmetric partition, a higher priority is assigned to the merge candidates located near the smaller partition.

Specifically, in the larger PU of 2N × nU, as shown in (b) of FIG. 15, priorities are assigned in the order of the merge candidate (U) in the smaller PU, the merge candidates (UR, UL) close to the smaller PU, and the other merge candidates (L, BL).

Further, in the larger PU of 2N × nD, as shown in (c) of FIG. 15, priorities are assigned in the order of the merge candidates (L, BL) close to the smaller PU and then the other merge candidates (U, UR, UL).

Note that a small merge index is assigned to a candidate with high priority, and a short code is assigned to a small merge index. Alternatively, only the high-priority candidates may be made selectable as merge candidates.
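The priority orders described above can be written as lookup tables. The following Python sketch is illustrative only; the table layout, the key names, and the function `merge_index_of` are assumptions. For the larger PU of 2N × nD, the order of the candidates after (L, BL) is assumed to be the remaining positions U, UR, UL.

```python
# Illustrative sketch only: the table layout and key names are assumptions.
# Candidate positions: L (left), U (top), UR (top right), BL (bottom left),
# UL (top left).
SYMMETRIC_ORDER = ["L", "U", "UR", "BL", "UL"]

ASYMMETRIC_ORDER = {
    # Smaller partition: short side first, then vertices, then long side.
    ("2NxnU", "smaller"): ["L", "UR", "BL", "UL", "U"],
    ("2NxnD", "smaller"): ["L", "UR", "BL", "UL", "U"],
    # Larger partition: candidates near the smaller partition first.
    ("2NxnU", "larger"): ["U", "UR", "UL", "L", "BL"],
    # Assumption: after (L, BL), the remaining positions follow as U, UR, UL.
    ("2NxnD", "larger"): ["L", "BL", "U", "UR", "UL"],
}


def merge_index_of(candidate, part_type, which=None):
    """Return the merge index (0 = highest priority) of a candidate position."""
    order = ASYMMETRIC_ORDER.get((part_type, which), SYMMETRIC_ORDER)
    return order.index(candidate)


idx_sym = merge_index_of("U", "2NxN")                 # symmetric partition
idx_small = merge_index_of("U", "2NxnU", "smaller")   # long-side candidate
idx_large = merge_index_of("U", "2NxnU", "larger")    # candidate in smaller PU
```

Because a smaller merge index receives a shorter code, the same candidate position costs a different number of bits depending on the partition it is referenced from.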

In the above description, prediction parameter derivation in the merge PU has been described. However, a similar derivation method may be used for derivation of an estimated motion vector used for restoring a motion vector in a non-merged PU in an inter CU. In general, the above method can be applied when deriving an estimated value or predicted value of a motion parameter corresponding to an adjacent region in each PU in an asymmetric PU.

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a prediction image, using an inter-screen prediction scheme, for each prediction unit obtained by dividing a coding unit into one or more parts. The division types into the prediction units include asymmetric division, which divides the coding unit into a plurality of prediction units of different sizes, and symmetric division, which divides the coding unit into a plurality of prediction units of the same size. The apparatus includes estimation means for estimating a prediction parameter for inter-screen prediction, when the division type is asymmetric division, by an estimation method different from that used when the division type is symmetric division.

Therefore, by using different estimation methods for asymmetric division and symmetric division, the prediction parameter for inter-screen prediction can be estimated by an estimation method suited to the division type.

[2-2] Change of Merge Candidate by Combination of CU Size and Skip / Merge
The motion compensation parameter derivation unit 121 may be configured to change the merge candidates depending on the combination of the CU size and whether or not the CU is a CU that performs skip / merge. Accordingly, the merge candidate information stored in the merge candidate priority information storage unit 122 is configured to include two pieces of definition information: the small PU size 122C and the large PU size 122D.

The merge candidate information of the small PU size 122C defines the number of merge candidates applied to a small PU. The merge candidate information of the large PU size 122D defines the number of merge candidates applied to a large PU.

As an example, in the merge candidate information, the number of merge candidates defined for the small PU size 122C (the number of merge candidates for small PUs) is defined to be smaller than the number of merge candidates defined for the large PU size 122D (the number of merge candidates for large PUs).

In the area where a small PU is selected, the movement is often complicated. For this reason, there is a tendency that the correlation of motion vectors assigned to neighboring PUs becomes small.

For this reason, even if merge candidates are increased, the estimation accuracy may not be improved as compared with a large PU.

Therefore, it is preferable to reduce the code amount of the side information by reducing the number of merge candidates.

In the above example, the merge candidate information makes the number of merge candidates for small PUs, whose motion is often complicated, smaller than the number of merge candidates for large PUs, so the code amount of the side information can be reduced.

Examples of combinations of small size PUs and large size PUs are listed below.
The small size PU is a PU having at least one side smaller than a predetermined threshold (for example, 8), and the large size PU is a PU other than that. For example, PUs having a size of 16 × 4, 4 × 16, 8 × 4, 4 × 8, and 4 × 4 are small size PUs, and PUs having a size of 8 × 8 and 16 × 16 are large size PUs.
The small size PU is a PU whose area is smaller than a predetermined threshold (for example, 64), and the large size PU is any other PU. For example, PUs with a size of 8 × 4, 4 × 8, and 4 × 4 are small size PUs, and PUs with a size of 8 × 8, 16 × 4, 4 × 16, 16 × 16, and so on are large size PUs.
The small size PU is a PU included in a CU having a predetermined size (for example, 8 × 8) or less, and the large size PU is a PU included in a larger CU. For example, 8 × 8, 8 × 4, 4 × 8, and 4 × 4 size PUs included in the 8 × 8 CU are small size PUs.
A small size PU is a smaller PU in a CU to which asymmetric partitioning is applied, and a large size PU is a larger PU in a CU to which asymmetric partitioning is applied.
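The first three classification rules listed above can be sketched as simple predicates. The Python code below is illustrative only; the function names are assumptions, and the default thresholds are the example values given in the text.

```python
# Illustrative sketch only: function names are assumptions; the default
# thresholds are the example values from the text and are not normative.

def is_small_by_side(width, height, threshold=8):
    """Small if at least one side is smaller than the threshold."""
    return width < threshold or height < threshold


def is_small_by_area(width, height, threshold=64):
    """Small if the area is smaller than the threshold."""
    return width * height < threshold


def is_small_by_cu_size(cu_size, max_small_cu=8):
    """Small if the PU is included in a CU of the given size or smaller."""
    return cu_size <= max_small_cu


sizes = [(16, 4), (4, 16), (8, 4), (4, 8), (4, 4), (8, 8), (16, 16)]
small_by_side = [s for s in sizes if is_small_by_side(*s)]
small_by_area = [s for s in sizes if is_small_by_area(*s)]
```

The two lists differ only in the 16 × 4 and 4 × 16 PUs, which have a short side of 4 but an area of 64.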

As another example, in the merge candidate information, it is preferable that the number of merge candidates by temporal prediction in a small PU is smaller than the number of merge candidates by temporal prediction in a large PU. For a small PU, the merge candidate information may also be defined so that it does not include any merge candidate by temporal prediction.

Also, in a region where the motion is complex such that a small size PU is selected, the correlation between the correlated PU used for temporal prediction and the target PU is small, so the possibility that temporal prediction is selected is low. Therefore, it is preferable to reduce the number of merge candidates based on temporal prediction or not to include merge candidates based on temporal prediction.

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a prediction image, using an inter-screen prediction scheme, for each prediction unit obtained by dividing a coding unit into one or more parts. The apparatus includes candidate determination means for determining, when the target prediction unit (the prediction unit to be decoded) is a prediction unit whose prediction parameter is estimated from prediction parameters assigned to regions in the vicinity of the target prediction unit, the candidate regions to be used for the estimation according to the size of the target prediction unit.

Therefore, side information can be reduced by reducing the number of candidates, and as a result, encoding efficiency can be improved.

[2-3] Determination of the number of reference frames
The motion compensation parameter derivation unit 121 may be configured as shown in [2-3-1] to [2-3-4] below, and may thereby determine which prediction scheme, single prediction or bi-prediction, is applied in inter prediction.

[2-3-1] Bi-Prediction Restriction in Small Size PU
The motion compensation parameter derivation unit 121 may refer to the reference frame setting information stored in the reference frame setting information storage unit 123 and determine which prediction scheme, single prediction or bi-prediction, is applied in inter prediction.

The motion compensation parameter deriving unit 121 may be configured to restrict bi-prediction for small PUs. For this purpose, the reference frame setting information is configured to include two pieces of definition information: the small PU size 123A and the large PU size 123B.

The large PU size 123B defines a prediction method that can be selected for a large PU. The large PU size 123B is defined so that both bi-prediction and uni-prediction prediction methods can be selected without limitation for a large PU.

The small PU size 123A defines a prediction method that can be selected in a small PU. The small PU size 123A is defined such that bi-prediction is limited in a small size PU.

As one example, the small PU size 123A defines that, for a PU that is included in an inter CU, to which merging is not applied, and whose size is less than 16 × 16, decoding of the inter prediction flag is omitted and single prediction is applied.

Further, as another example of the definition of the small PU size 123A, it is defined that the single prediction is applied to a PU that is included in the inter CU and to which merge is applied and whose size is less than 16 × 16.

Also, as another example of the definition of the small PU size 123A, single prediction is applied to each PU included in the skip CU.

As yet another example of the definition of the small PU size 123A, weighted prediction is not applied to PUs included in the inter CU that do not apply merging and have a size of less than 16 × 16. That is, information regarding weighted prediction is omitted.

Hereinafter, regarding the case where bi-prediction is limited based on the reference frame setting information, details of the configuration of the encoded data and the configuration of the video decoding device will be described using a syntax table and a block diagram.

(Type of bi-prediction restriction)
The PU types are a PU in a target CU that is skipped (skip PU), a PU to which merging is applied (merge PU), and a PU that is neither skipped nor merged (basic inter PU, or PU in which motion information is not omitted). In the basic inter PU, an inter prediction flag indicating whether bi-prediction or uni-prediction is used is decoded from the encoded data, and the motion compensation parameters are derived. In the skip PU and the merge PU, on the other hand, the motion compensation parameters are derived without decoding the inter prediction flag. In these PUs, the candidate used for motion compensation is selected from the skip candidates or merge candidates based on the skip index or merge index, and the motion compensation parameters of the target PU are derived based on the motion compensation parameters of the selected candidate. Usually, the motion compensation parameter derivation method of the skip PU is the same as that of the merge PU. When the use of merging is restricted by a flag in the sequence parameter set or the like, the skip PU may use the same method as the basic inter PU, except that the motion vector residual (mvd) is not decoded. In this case, the bi-prediction restriction operation of the skip PU is the same as that of the basic inter PU.

FIG. 35 (a) shows examples of bi-prediction restriction in each PU. Bi-prediction restriction may be performed only on the basic inter PU, or on all PUs to which motion compensated prediction is applied. When the restriction is performed only on the basic inter PU, it is not performed on the skip PU and the merge PU. In either case, the processing amount and the circuit scale of the video encoding device and the video decoding device can be reduced.

FIG. 35 (b) shows a bi-prediction restriction method for each PU. In the case of skip PUs and merge PUs, bi-prediction restriction is performed by deriving information indicating that bi-prediction is not performed in motion compensation parameter derivation based on skip candidates or merge candidates. Specifically, as will be described later in the motion compensation parameter derivation unit, bi-prediction restriction is performed by converting the value of the inter prediction flag included in the motion compensation parameter from bi-prediction to single prediction. When bi-prediction restriction is performed on a basic inter PU, whether or not bi-prediction restriction is applied is determined according to PU size information. When the bi-prediction restriction is not applied, the inter prediction flag is decoded. In addition, when the bi-prediction restriction is applied, decoding of the inter prediction flag is omitted, and further, a process of estimating the value of the inter prediction flag as single prediction is performed.

Here, the PU size information is information for determining whether or not the PU is a small PU. The size of the target CU and the PU partition type, the size of the target CU and the number of PU partitions, the width or height of the PU, or the area of the PU can be used as the PU size information.

The skip PU and merge PU differ from the basic inter PU not only in the motion compensation parameter decoding method but also in the scenes in which they are used. The skip PU and the merge PU reduce the code amount by limiting the selectable motion compensation parameters. Such PUs are mainly used in areas where the motion is uniform. When the motion is uniform, the two prediction images are close to each other, and the noise removal effect of bi-prediction is often large. Therefore, bi-prediction restriction is more likely to lower the coding efficiency in the skip PU and the merge PU than in the basic inter PU. For this reason, restricting bi-prediction only in the basic inter PU as described above, or using different restricted PU sizes for the basic inter PU and for the skip PU and merge PU as described later, is also suitable. In addition, from the viewpoint of the structure of the encoded data, the bi-prediction restriction in the basic inter PU is more effective because not encoding the inter prediction flag also reduces the code amount.
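The two restriction paths described above (omitting the inter prediction flag in a basic inter PU, and converting a skip or merge candidate to single prediction) can be sketched as follows. This Python sketch is illustrative only: the area-based size test, its threshold of 64, the callback `read_flag`, and the dictionary layout are assumptions, and the actual conversion method is the one described later in the text.

```python
# Illustrative sketch only. The area-based size test, its threshold, and the
# dictionary layout are assumptions made for illustration.
PRED_UNI, PRED_BI = "uni", "bi"


def is_small_pu(width, height, area_threshold=64):
    """Hypothetical PU size test based on the PU area."""
    return width * height < area_threshold


def inter_pred_flag_basic_inter(read_flag, width, height):
    """Basic inter PU: decode the flag, or infer single prediction if restricted."""
    if is_small_pu(width, height):
        return PRED_UNI      # decoding is omitted and the value is inferred
    return read_flag()       # the flag is decoded from the encoded data


def restrict_candidate(params, width, height):
    """Skip/merge PU: convert a bi-predictive candidate to single prediction."""
    if is_small_pu(width, height) and params["inter_pred_flag"] == PRED_BI:
        return dict(params, inter_pred_flag=PRED_UNI)
    return params


calls = []


def read_flag():
    """Hypothetical decoder callback; records each call and returns bi-prediction."""
    calls.append(1)
    return PRED_BI


small_result = inter_pred_flag_basic_inter(read_flag, 8, 4)
large_result = inter_pred_flag_basic_inter(read_flag, 16, 16)
merged = restrict_candidate({"inter_pred_flag": PRED_BI, "ref_idx": 0}, 4, 8)
```

Note that the flag is read only for the 16 × 16 PU; for the restricted 8 × 4 PU no bits are consumed.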

(PU motion compensation parameters)
The motion compensation parameters of the PU are expressed by the prediction list use flags predFlagL0 and predFlagL1, the reference index numbers refIdxL0 and refIdxL1, and the motion vectors mvL0 and mvL1. The prediction list use flags predFlagL0 and predFlagL1 indicate whether or not the corresponding reference prediction list is used. In the following, 1 means the list is used and 0 means it is not used. The case where two reference prediction lists are used, that is, predFlagL0 = 1 and predFlagL1 = 1, corresponds to bi-prediction. The case where one reference prediction list is used, that is, (predFlagL0, predFlagL1) = (1, 0) or (predFlagL0, predFlagL1) = (0, 1), corresponds to single prediction. Whether or not bi-prediction is used can also be expressed by the inter prediction flag described later, which is used when information on whether the number of reference pictures is 1 (uni-prediction) or 2 (bi-prediction) is decoded from the encoded data.

When the prediction list use flag predFlagL0 is 1, the reference picture in the L0 list is specified by the reference index number refIdxL0, and the motion vector for the specified reference picture is specified by the motion vector mvL0.

When the prediction list use flag predFlagL1 is 1, the reference picture in the L1 list is specified by the reference index number refIdxL1, and the motion vector for the specified reference picture is specified by the motion vector mvL1.

When the list X is not used (X is 0 or 1), that is, when the prediction list use flag predFlagLX is 0, the value of the reference index number refIdxLX is basically set to -1 and the value of the motion vector mvLX is set to (0, 0).
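The parameter set described above can be represented as a small record. The following Python sketch is illustrative only; the class name `MotionParams` and its helper methods are assumptions, while the field names and default values follow the text.

```python
from dataclasses import dataclass

# Illustrative sketch only: the class and helper names are assumptions.


@dataclass
class MotionParams:
    predFlagL0: int = 0
    predFlagL1: int = 0
    refIdxL0: int = -1        # -1 while list L0 is not used
    refIdxL1: int = -1        # -1 while list L1 is not used
    mvL0: tuple = (0, 0)      # (0, 0) while list L0 is not used
    mvL1: tuple = (0, 0)      # (0, 0) while list L1 is not used

    def is_bi_pred(self):
        """Both reference prediction lists are used."""
        return self.predFlagL0 == 1 and self.predFlagL1 == 1

    def is_uni_pred(self):
        """Exactly one reference prediction list is used."""
        return (self.predFlagL0, self.predFlagL1) in ((1, 0), (0, 1))


uni = MotionParams(predFlagL0=1, refIdxL0=2, mvL0=(3, -1))
bi = MotionParams(1, 1, 0, 0, (1, 1), (-1, -1))
```

Here `uni` uses only the L0 list, so its L1 fields keep the default values for an unused list.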

(Details of inter prediction flag)
Here, details of the inter prediction flag will be described. The inter prediction flag inter_pred_flag may be a binary flag indicating whether uni-prediction or bi-prediction is used, or it may additionally contain information for selecting, from a plurality of reference lists, the list of reference images (reference list) to be referred to in uni-prediction. For example, the inter prediction flag may be defined as a ternary flag that includes a flag for selecting one of two types of reference lists (L0 list and L1 list). Each case is described below.

The decoding module 10 decodes the combined list flag ref_pic_list_combination_flag that selects whether to use the L0, L1 list as the reference frame list or the combined list (LC list) from the slice header or the like. The reference frame determination method in the case of single prediction differs depending on the value of the combined list flag. When the combined list flag is 1, the combined list LC is used as a reference list used for specifying a single prediction reference frame, and a flag for specifying a reference list in each PU is unnecessary. Therefore, the inter prediction flag inter_pred_flag may be a binary flag. When the combined list flag is 0, it is necessary to select the reference list from the L0 list or the L1 list in each PU. Therefore, the inter prediction flag inter_pred_flag is a ternary flag.

FIG. 32 (a) shows the meaning of the inter prediction flag when it is a binary flag, and FIG. 32 (b) shows the meaning of the inter prediction flag when it is a ternary flag.

(Example of syntax table in bi-prediction restriction)
FIG. 31 is an example of a PU syntax table in the prior art, and shows the configuration of the encoded data when bi-prediction restriction is not performed. FIG. 33 is an example of a PU syntax table, and (a) and (b) each show the part relating to the inter prediction flag inter_pred_flag in the configuration of the encoded data when bi-prediction restriction is performed. (a) of FIG. 33 is an example of a syntax table when the inter prediction flag is always a binary flag. In this case, inter_pred_flag distinguishes two types: Pred_LC, which means single prediction, and Pred_Bi, which means bi-prediction. When the slice is a B slice and bi-prediction is valid (DisableBiPred = false), the encoded data includes the inter prediction flag inter_pred_flag in order to distinguish between single prediction and bi-prediction. When bi-prediction is not valid (DisableBiPred = true), single prediction is always used, so the encoded data does not include the inter prediction flag inter_pred_flag.
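The presence condition of the binary inter prediction flag described above can be sketched as follows. This Python sketch is illustrative only; `read_flag` is a hypothetical callback that decodes one inter_pred_flag value, and the function name is an assumption.

```python
# Illustrative sketch only: `read_flag` is a hypothetical callback that decodes
# one inter_pred_flag value from the encoded data.
PRED_LC, PRED_BI = "Pred_LC", "Pred_Bi"


def parse_inter_pred_flag(read_flag, is_b_slice, disable_bi_pred):
    """Return the prediction type of a PU for the binary-flag case."""
    if is_b_slice and not disable_bi_pred:
        # The flag is present and distinguishes Pred_LC from Pred_Bi.
        return read_flag()
    # The flag is absent: single prediction is always used.
    return PRED_LC


allowed = parse_inter_pred_flag(lambda: PRED_BI, True, False)
restricted = parse_inter_pred_flag(lambda: PRED_BI, True, True)
```

In the restricted case the callback is never invoked, which corresponds to the flag being absent from the encoded data.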

(b) of FIG. 33 is an example of a syntax table when the inter prediction flag is a ternary flag. When a combined list is used, inter_pred_flag distinguishes two types: Pred_LC, which means uni-prediction using one reference frame of the LC list, and Pred_Bi, which means bi-prediction. Otherwise, three types are distinguished: Pred_L0, which means uni-prediction using the L0 list, Pred_L1, which means uni-prediction using the L1 list, and Pred_Bi, which means bi-prediction. When the slice is a B slice and bi-prediction is valid (DisableBiPred = false), the encoded data includes a first inter prediction flag inter_pred_flag0 for specifying uni-prediction or bi-prediction. When bi-prediction is not applied, the encoded data includes a second inter prediction flag inter_pred_flag1 for specifying the reference list, but only when the combined list is not used. More specifically, the case where the combined list is not used is determined by !UsePredRefLC && !NoBackPredFlag, as shown in the figure. That is, the determination uses a flag UsePredRefLC indicating whether or not to use the combined list (the combined list is used when the value of UsePredRefLC is true) and a flag NoBackPredFlag indicating whether or not to use backward prediction (backward prediction is not used when the value of NoBackPredFlag is true). When the combined list is used, the list is known to be the combined list without any list selection. When backward prediction is not used, Pred_L1 is prohibited. Therefore, even when the second inter prediction flag inter_pred_flag1 is not encoded, the list to be used is known to be either the combined list (Pred_LC) or the L0 list (Pred_L0). Note that the name NoL1PredFlag, meaning that the L1 list is not used, can also be used instead of NoBackPredFlag.

Whether or not the bi-prediction restriction is performed, and the threshold value used for determining the PU size when the restriction is performed, may be included in the encoded data. FIG. 34 is an example of a syntax table regarding bi-prediction restriction. (a) of FIG. 34 shows a case where the sequence parameter set includes a flag disable_bipred_in_small_PU that indicates whether or not bi-prediction restriction is performed. As shown in the figure, the bi-prediction restriction flag may be encoded independently of a flag disable_inter_4x4 that prohibits a small size PU (here, a 4 × 4 size PU). Note that the purpose of the flag prohibiting a small PU is, as with bi-prediction restriction, to reduce the worst-case processing amount when generating a PU prediction image. For this reason, the flag prohibiting a small PU and the flag prohibiting bi-prediction may be provided as a single common flag. (b) of FIG. 34 is an example in which a prediction constraint flag use_restricted_prediction is provided as such a common flag. In this case, when the prediction constraint flag is true, the use of the small size PU and bi-prediction in the small size PU are prohibited at the same time. (c) of FIG. 34 is an example in which the encoded data includes disable_bipred_size, which indicates the size of the PU for which bi-prediction is prohibited. As disable_bipred_size, a value such as the base-2 logarithm of the threshold TH used in the bi-prediction restriction determination method described later can be used. Note that these flags may be encoded in a parameter set other than the sequence parameter set, or may be encoded in a slice header.

The syntax in the case of CABAC has been described above. Next, the syntax in the case of CAVLC will be described. As described above, FIG. 33 is a syntax table for the case of CABAC. FIG. 36, on the other hand, shows a syntax table that includes the case of CAVLC. The combined_inter_pred_ref_idx in FIG. 36 is a flag obtained by combining the inter prediction flag inter_pred_flag and the reference picture index (ref_idx_l0, ref_idx_lc, ref_idx_l1). As shown in FIG. 36, when the encoding mode information (entropy_coding_mode_flag) is 0, that is, in the case of CAVLC, the inter prediction flag inter_pred_flag and the reference picture index are not encoded and decoded separately; instead, the combined flag combined_inter_pred_ref_idx is encoded. Therefore, the encoded data includes combined_inter_pred_ref_idx.

Further, when bi-prediction is available (when bi-prediction restriction flag DisableBipred is false) and combined_inter_pred_ref_idx is a predetermined value MaxPredRef (described later), the encoded data further includes an inter prediction flag inter_pred_flag. DisableBipred is set to true when the PU size is small. The method of deriving DisableBipred in FIG. 36 is an example. In this case, DisableBipred = true is set for 4 × 4, 4 × 8, and 8 × 4 PU sizes.
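The CAVLC parsing order described above can be sketched as follows. This Python sketch is illustrative only: the two reader callbacks and the set-based size test are assumptions, and FIG. 36 defines the actual derivation of DisableBipred.

```python
# Illustrative sketch only: the callbacks and the set-based size test are
# assumptions; FIG. 36 defines the actual derivation of DisableBipred.
SMALL_PU_SIZES = {(4, 4), (4, 8), (8, 4)}


def disable_bipred(width, height):
    """True for the PU sizes named in the text as bi-prediction restricted."""
    return (width, height) in SMALL_PU_SIZES


def parse_cavlc_inter_pred(read_combined_idx, read_inter_pred_flag,
                           width, height, max_pred_ref):
    """Return (combined_inter_pred_ref_idx, inter_pred_flag or None)."""
    combined = read_combined_idx()
    inter_pred_flag = None
    if not disable_bipred(width, height) and combined == max_pred_ref:
        # inter_pred_flag follows only when the index equals MaxPredRef
        # and bi-prediction is available.
        inter_pred_flag = read_inter_pred_flag()
    return combined, inter_pred_flag


large = parse_cavlc_inter_pred(lambda: 8, lambda: 1, 16, 16, 8)
small = parse_cavlc_inter_pred(lambda: 8, lambda: 1, 4, 8, 8)
```

For the 4 × 8 PU the second callback is not invoked, so no inter_pred_flag is read even though the index equals MaxPredRef.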

(Combined inter prediction reference index)
FIG. 37 is a diagram for explaining the combined inter prediction reference index combined_inter_pred_ref_idx. (a), (b), and (c) of FIG. 37 are diagrams for explaining examples of the value of combined_inter_pred_ref_idx. (d) of FIG. 37 shows a table TBL37 and pseudo code CODE37 that show a method for deriving the maximum value MaxPredRef of combined_inter_pred_ref_idx. combined_inter_pred_ref_idx is used to encode, with a small number of bits, a combination of the inter prediction flag inter_pred_flag and the reference picture index that has a high probability of occurrence. combined_inter_pred_ref_idx is an index used to select a reference picture managed in a reference list LC (the C in LC stands for "combined") whose elements are combinations of reference pictures with a high probability of occurrence.

In the examples of (a) and (c) of FIG. 37, the value range of combined_inter_pred_ref_idx is 0 to 8 (= MaxPredRef). As shown in (c) of FIG. 37, the values 0 to 7 are assigned to combinations that are likely to occur (combined reference picture sets). Any other combination of the inter prediction flag inter_pred_flag and the reference picture index is assigned 8 (= MaxPredRef). The method of deriving the maximum value MaxPredRef shown in (d) of FIG. 37 will be described later.

In CAVLC, combined_inter_pred_ref_idx is converted into a code number codeNum in a table called a conversion variable table EncTable, and the code number codeNum is encoded with a truncated unary code with MaxPredRef as the maximum value. That is, codeNum = EncTable [combined_inter_pred_ref_idx] is encoded. In truncated unary code, the smaller the value, the shorter the number of bits. Also, by explicitly using the maximum value, encoding can be performed without wasted bits. Note that the decoding of the code number (codeNum) by the truncated unary code in the case of the maximum value cMax can be realized by the process defined by the following pseudo code. Here, read_bits (1) is a function that reads a 1-bit binary from the encoded data and returns the value.
leadingZeroBits = -1
for (b = 0; !b && leadingZeroBits < cMax; leadingZeroBits++)
    b = read_bits(1)
codeNum = leadingZeroBits
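The pseudo code above can be transcribed line by line into Python to check its behaviour on short bit strings. This is only a sketch: the class BitReader and the function name decode_code_num are hypothetical helpers introduced for illustration, while read_bits and cMax follow the pseudo code.

```python
class BitReader:
    """Hypothetical bit source that serves bits from a string such as "001"."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read_bits(self, n):
        # Read n bits and return them as an integer, like read_bits(n) in the text.
        value = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return value


def decode_code_num(reader, c_max):
    """Line-by-line transcription of the pseudo code (for loop written as while)."""
    leading_zero_bits = -1
    b = 0
    while not b and leading_zero_bits < c_max:
        b = reader.read_bits(1)
        leading_zero_bits += 1
    return leading_zero_bits
```

For example, with cMax = 8 the bit strings 1, 01, and 0001 decode to the code numbers 0, 1, and 3, respectively.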
(a) and (b) of FIG. 38 show examples of the conversion variable table EncTable and the inverse conversion variable table DecTable, respectively. As shown in the example of (a) of FIG. 38, by using the conversion variable table EncTable, small values are assigned to the combinations in the combined reference picture set that are more likely to occur, so that they are encoded with a short code length. Each time one combined_inter_pred_ref_idx is encoded or decoded, the conversion variable table and the inverse conversion variable table are updated so that the value that has just occurred is given a shorter code number. As a result, encoding can be performed with a shorter code length than when fixed variable tables are used. The decoded codeNum is converted into combined_inter_pred_ref_idx by the inverse conversion variable table DecTable shown in (b) of FIG. 38. That is, combined_inter_pred_ref_idx = DecTable[codeNum]. Details of the decoding operation will be described later in the description of the inter prediction flag decoding unit 1028.
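The interplay of the two tables can be sketched as follows. The update rule used here (swapping the entry that has just occurred with the entry whose code number is one smaller) is an assumption made purely for illustration; the text only states that the tables are updated so that the value that occurred receives a shorter code number. The class name AdaptiveTables, its methods, and the identity initial state are likewise hypothetical.

```python
class AdaptiveTables:
    """Hypothetical EncTable/DecTable pair with an assumed swap-based update."""

    def __init__(self, size):
        # Assumed starting state: code number and index coincide.
        self.enc_table = list(range(size))  # index -> codeNum
        self.dec_table = list(range(size))  # codeNum -> index

    def _promote(self, code_num):
        # Assumed update: swap with the neighbour one code number smaller.
        if code_num == 0:
            return
        idx = self.dec_table[code_num]
        other = self.dec_table[code_num - 1]
        self.dec_table[code_num - 1] = idx
        self.dec_table[code_num] = other
        self.enc_table[idx] = code_num - 1
        self.enc_table[other] = code_num

    def encode(self, idx):
        code_num = self.enc_table[idx]  # codeNum = EncTable[idx]
        self._promote(code_num)
        return code_num

    def decode(self, code_num):
        idx = self.dec_table[code_num]  # idx = DecTable[codeNum]
        self._promote(code_num)
        return idx
```

Because the encoder and the decoder apply the same update after every symbol, their tables stay synchronized, and an index that occurs repeatedly is given progressively smaller code numbers.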

(Motion compensation parameter derivation unit for bi-prediction restriction)
FIG. 29 shows the configuration of the motion compensation parameter derivation unit 121. The motion compensation parameter derivation unit 121 comprises a skip motion compensation parameter derivation unit 1211, a merge motion compensation parameter derivation unit 1212, a basic motion compensation parameter derivation unit 1213, a bi-prediction restricted PU determination unit 1218, and a bi-prediction conversion unit 1219.

In particular, the motion compensation parameter derivation unit 121 performs bi-prediction restriction on skip PUs and merge PUs, which are PUs for which the inter prediction flag is not decoded.

The skip motion compensation parameter derivation unit 1211 derives the motion compensation parameters of a skip PU when the target CU is skipped, and inputs them to the bi-prediction conversion unit 1219. The bi-prediction conversion unit 1219 converts the motion compensation parameters in accordance with the bi-prediction restriction condition and returns them to the skip motion compensation parameter derivation unit 1211. The bi-prediction restriction condition is a condition for determining whether bi-prediction restriction is performed; as described later, the determination is made by the bi-prediction restricted PU determination unit 1218. Details of the method by which the bi-prediction conversion unit 1219 converts bi-prediction into single prediction (the bi-prediction conversion method) will also be described later. The skip motion compensation parameter derivation unit 1211 outputs the motion compensation parameters converted in accordance with the bi-prediction restriction condition to the outside as the motion compensation parameters of the target PU. When the motion compensation parameters are determined by a skip index, the bi-prediction conversion unit 1219 may convert each skip candidate, and the converted skip candidate may then be selected by the skip index. When the method of deriving skip candidates for a skip PU is the same as the method of deriving merge candidates for a merge PU, the skip motion compensation parameter derivation unit 1211 is replaced with the merge motion compensation parameter derivation unit 1212, and the derivation is performed with merge candidates read as skip candidates.

The merge motion compensation parameter derivation unit 1212 derives the motion compensation parameters of the target PU when the target PU is a merge PU, and inputs them to the bi-prediction conversion unit 1219. The bi-prediction conversion unit 1219 converts the motion compensation parameters in accordance with the bi-prediction restriction condition and returns them to the merge motion compensation parameter derivation unit 1212. The merge motion compensation parameter derivation unit 1212 outputs the motion compensation parameters converted in accordance with the bi-prediction restriction condition to the outside as the motion compensation parameters of the target PU. When the motion compensation parameters are determined by a merge index, the bi-prediction conversion unit 1219 may convert each merge candidate, and the converted merge candidate may then be selected by the merge index.

The basic motion compensation parameter deriving unit 1213 derives the motion compensation parameter of the target PU when the target PU is neither skipped nor merged, and outputs it to the outside.

The bi-prediction restricted PU determination unit 1218 refers to the PU size information of the target PU and determines whether to perform bi-prediction restriction, that is, whether to prohibit the use of bi-prediction in the target PU. Whether to perform bi-prediction restriction on skip PUs and merge PUs and whether to perform bi-prediction restriction on basic inter PUs may be determined independently. For example, bi-prediction restriction may be performed using the same PU size as the threshold for all PUs, or using a larger PU size as the threshold for skip PUs and merge PUs. Alternatively, bi-prediction restriction may be omitted for skip PUs and merge PUs and performed only on basic inter PUs.

Note that when the inter prediction flag is decoded for skip PUs as well, for example when merging is restricted, whether to perform bi-prediction restriction may be determined independently for each of skip PUs, merge PUs, and basic inter PUs.

In the above configuration, the bi-prediction / uni-prediction setting made in the skip motion compensation parameter derivation unit 1211 is determined in the bi-prediction conversion unit 1219 on the basis of the bi-prediction restricted PU determination unit 1218, but the configuration is not limited to this. For example, the determination result of the bi-prediction restricted PU determination unit 1218 may be input directly to the skip motion compensation parameter derivation unit 1211, which then sets bi-prediction / uni-prediction.

Hereinafter, among the components of the motion compensation parameter derivation unit 121, the merge motion compensation parameter derivation unit 1212, the basic motion compensation parameter derivation unit 1213, the bi-prediction restricted PU determination unit 1218, and the bi-prediction conversion unit 1219 will be described in detail in this order.

(Details of merge motion compensation parameter deriving unit 1212)
FIG. 43 is a block diagram showing the configuration of the merge motion compensation parameter derivation unit 1212. When the unit is used for a skip PU, it operates with the merge candidates described below read as skip candidates.

The merge motion compensation parameter derivation unit 1212 comprises an adjacent merge candidate derivation unit 1212A, a temporal merge candidate derivation unit 1212B, a unique candidate derivation unit 1212C, a combined bi-predictive merge candidate derivation unit 1212D, a non-scale bi-predictive merge candidate derivation unit 1212E, a zero vector merge candidate derivation unit 1212F, a merge candidate derivation control unit 1212G, a merge candidate storage unit 1212H, and a merge candidate selection unit 1212J. Although not shown in FIG. 43, the adjacent merge candidate derivation unit 1212A and the temporal merge candidate derivation unit 1212B are supplied with the decoding parameters of already decoded CUs and PUs stored in the frame memory 16, in particular the motion compensation parameters in units of PUs. In the following, the adjacent merge candidate derivation unit 1212A, the temporal merge candidate derivation unit 1212B, the unique candidate derivation unit 1212C, the combined bi-predictive merge candidate derivation unit 1212D, the non-scale bi-predictive merge candidate derivation unit 1212E, and the zero vector merge candidate derivation unit 1212F are collectively referred to as merge candidate derivation means.

In the merge motion compensation parameter derivation unit 1212, the merge candidate derivation control unit 1212G controls each merge candidate derivation means, derives a predetermined number MRG_MAX_NUM_CANDS of merge candidates, and stores them in the merge candidate storage unit 1212H. Here, a merge candidate consists of the motion compensation parameters of a PU: the prediction list use flags predFlagL0 and predFlagL1, the reference index numbers refIdxL0 and refIdxL1, and the motion vectors mvL0 and mvL1. The merge candidate storage unit 1212H stores each such set of motion parameters as a merge candidate. The stored merge candidates are managed as a list (merge candidate list) ordered in the order of storage. The merge candidate selection unit 1212J selects the merge candidate specified by the merge index and outputs it as prediction information PUI.

The combined bi-predictive merge candidate derivation unit 1212D and the non-scale bi-predictive merge candidate derivation unit 1212E particularly derive bi-predictive merge candidates, and are referred to as bi-predictive merge candidate derivation means.

FIG. 44 is a flowchart showing the operation of the merge motion compensation parameter derivation unit 1212. The method by which each merge candidate derivation means derives merge candidates will be described later with reference to other drawings. First, the adjacent merge candidate derivation unit 1212A obtains merge candidates A0 to B2 using the motion compensation parameters of adjacent blocks (S101). Subsequently, the temporal merge candidate derivation unit 1212B obtains a merge candidate T using the motion compensation parameters of an already decoded reference picture (S102). In S103, duplicates among the derived merge candidates A0 to T are removed, and the remaining candidates are stored in the merge candidate storage unit 1212H. If the number of non-duplicate merge candidates at this point is MRG_MAX_NUM_CANDS or more (YES in S104), the derivation of merge candidates is terminated. Otherwise (NO in S104), the process proceeds to S105. If the slice is a B slice (YES in S105), the process proceeds to S106. If not (NO in S105), S107 and S108 are skipped and the process proceeds to S109. When bi-prediction restriction is performed and the target PU is a small-size PU for which bi-predictive merge candidate derivation is to be skipped, the bi-predictive merge candidate derivation of S107 and S108 is likewise skipped and the process proceeds to S109 (S106). In S107, the combined bi-predictive merge candidate derivation unit 1212D derives combined bi-predictive merge candidates and stores them in the merge candidate storage unit 1212H. In S108, the non-scale bi-predictive merge candidate derivation unit 1212E derives non-scale bi-predictive merge candidates and stores them in the merge candidate storage unit 1212H. If the number of merge candidates is MRG_MAX_NUM_CANDS or more (YES in S109), the derivation of merge candidates is terminated. Although not shown, when the number of merge candidates reaches MRG_MAX_NUM_CANDS during S107 or S108, the respective process is stopped and the derivation of merge candidates is terminated. In S110, the zero vector merge candidate derivation unit 1212F derives zero vector merge candidates until the number of merge candidates reaches MRG_MAX_NUM_CANDS, and stores them in the merge candidate storage unit 1212H.
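The control flow of S101 to S110 can be summarized in a short sketch. Everything here is illustrative: the function derive_merge_candidates, its parameters, and the value chosen for MRG_MAX_NUM_CANDS are assumptions, and the individual derivation steps are passed in as callables that return lists rather than being implemented.

```python
MRG_MAX_NUM_CANDS = 5  # assumed value; the text does not fix the number


def derive_merge_candidates(spatial, temporal, is_b_slice, skip_bipred_derivation,
                            derive_combined, derive_non_scale, make_zero):
    """Sketch of the control flow S101 to S110 with pluggable derivation steps."""
    cands = []
    # S101, S102: spatial candidates (A0..B2) and the temporal candidate T.
    # S103: store them while dropping duplicates.
    for cand in list(spatial) + list(temporal):
        if cand not in cands:
            cands.append(cand)
    # S104: stop when enough candidates exist.
    if len(cands) >= MRG_MAX_NUM_CANDS:
        return cands
    # S105, S106: bi-predictive candidates are derived only for B slices and
    # only when the derivation is not skipped for a small-size PU.
    if is_b_slice and not skip_bipred_derivation:
        for derive in (derive_combined, derive_non_scale):  # S107, S108
            for cand in derive(list(cands)):
                if len(cands) >= MRG_MAX_NUM_CANDS:
                    break
                if cand not in cands:
                    cands.append(cand)
    # S109, S110: pad with zero vector candidates.
    while len(cands) < MRG_MAX_NUM_CANDS:
        cands.append(make_zero(len(cands)))
    return cands
```

With skip_bipred_derivation set for a small-size PU, the two bi-predictive derivation steps are never called, which is where the reduction in processing described in the text comes from.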

In the above process, the merge candidate derivation related to bi-predictive merge candidates is omitted for small-size PUs, so the processing amount required for merge candidate derivation can be reduced. The combined bi-predictive merge candidate derivation (S107) and the non-scale bi-predictive merge candidate derivation (S108) are heavy processes that require repeated determinations. Being able to omit them for small-size PUs, for which the time (processing amount) available for decoding is limited, is therefore particularly effective in a device that must perform decoding in real time. The omission of bi-predictive merge candidates is not limited to the operations of the combined bi-predictive merge candidate derivation unit 1212D and the non-scale bi-predictive merge candidate derivation unit 1212E; it can also be applied to other merge candidate derivation processes that mainly generate bi-predictive merge candidates.

Hereinafter, the details of each merge candidate derivation means will be described. FIG. 45 is a diagram for explaining the operation of the adjacent merge candidate derivation unit 1212A. As shown in FIG. 45, each merge candidate is derived by copying the motion compensation parameters of the adjacent blocks that contain the positions A0, A1, B0, B1, and B2. The derivation order is assumed to be A1, B1, B0, A0, B2. Each derived merge candidate is converted by the bi-prediction conversion unit 1219 and then stored in the merge candidate storage unit 1212H. When bi-prediction restriction is performed, the bi-prediction conversion unit 1219 converts the input merge candidate into single prediction. When the conversion into single prediction yields a plurality of merge candidates (for example, two candidates, one for L0 prediction and one for L1 prediction), all of them are stored in the merge candidate storage unit 1212H. The output of the bi-prediction conversion unit 1219 may likewise consist of a plurality of merge candidates for the other merge candidate derivation units described later.

If an adjacent block is unavailable or is an intra block, the corresponding merge candidate is not derived. A block is unavailable when it lies outside the picture, lies outside the slice, or has not yet been decoded in the block scan order. The positions A0 to B2 can be expressed as follows, where the upper-left coordinates of the PU are (xP, yP) and the width and height of the PU are nPSW and nPSH.
A0: (xP - 1, yP + nPSH)
A1: (xP - 1, yP + nPSH - 1)
B0: (xP + nPSW, yP - 1)
B1: (xP + nPSW - 1, yP - 1)
B2: (xP - 1, yP - 1)
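As a small worked example, the positions can be computed for a concrete PU. The function name neighbor_positions is hypothetical, and the sketch assumes that the horizontal offsets of B0 and B1 are given by the PU width nPSW, since both positions lie in the row above the PU.

```python
def neighbor_positions(xP, yP, nPSW, nPSH):
    """Return the neighbouring positions in the derivation order A1, B1, B0, A0, B2."""
    return {
        "A1": (xP - 1, yP + nPSH - 1),   # left, bottom-most row of the PU
        "B1": (xP + nPSW - 1, yP - 1),   # above, right-most column of the PU
        "B0": (xP + nPSW, yP - 1),       # above right (assumes width nPSW)
        "A0": (xP - 1, yP + nPSH),       # below left
        "B2": (xP - 1, yP - 1),          # above left
    }
```

For a PU with upper-left corner (16, 8), width 8, and height 4, this gives A1 = (15, 11), B1 = (23, 7), B0 = (24, 7), A0 = (15, 12), and B2 = (15, 7).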
When merge candidates corresponding to all of the positions A0, A1, B0, and B1 have been derived, the merge candidate corresponding to the position B2 is not derived. In the derivation of merge candidates when the PU partition type is 2N×N or N×2N and the PU index is 1, a merge candidate is derived and stored in the merge candidate storage unit 1212H only when its motion compensation parameters do not match the motion compensation parameters of the PU whose index is 0. The operation of the function equalMotion(A, B), which determines whether the motion compensation parameters of block A and block B match, can be defined as follows.

equalMotion(A, B) = (predFlagL0A == predFlagL0B) && (predFlagL1A == predFlagL1B) && (mvL0A[0] == mvL0B[0]) && (mvL0A[1] == mvL0B[1]) && (mvL1A[0] == mvL1B[0]) && (mvL1A[1] == mvL1B[1])
Here, predFlagL0A and predFlagL1A are 1 when the reference picture lists L0 and L1, respectively, are used in block A, and 0 otherwise. mvL0[0] and mvL0[1] are the horizontal and vertical components of the L0 motion vector, and mvL1[0] and mvL1[1] are the horizontal and vertical components of the L1 motion vector. For block B, A is replaced with B.
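The comparison can be written out as a small function. The container MotionParams is a hypothetical structure introduced only for this sketch. Following the definition as given, only the prediction list use flags and the motion vector components are compared; the reference index fields are carried along but not compared.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MotionParams:
    """Hypothetical container for the motion compensation parameters of a PU."""
    predFlagL0: int = 0
    predFlagL1: int = 0
    refIdxL0: int = -1
    refIdxL1: int = -1
    mvL0: Tuple[int, int] = (0, 0)
    mvL1: Tuple[int, int] = (0, 0)


def equal_motion(a, b):
    """Compare the fields named in the definition of equalMotion(A, B)."""
    return (a.predFlagL0 == b.predFlagL0 and a.predFlagL1 == b.predFlagL1
            and a.mvL0[0] == b.mvL0[0] and a.mvL0[1] == b.mvL0[1]
            and a.mvL1[0] == b.mvL1[0] and a.mvL1[1] == b.mvL1[1])
```

Two candidates that differ in a single motion vector component are reported as different, while two candidates with identical flags and vectors are reported as equal.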

FIG. 46 is a diagram for explaining the operation of the temporal merge candidate derivation unit 1212B. Referring to (a) of FIG. 46, when the current picture is currPic, the temporal merge candidate is derived by copying the motion compensation parameters of a PU that occupies almost the same spatial position as the target PU in the current picture, taken from the reference picture specified by the reference index number refIdxL0 or from the reference picture specified by the reference index number refIdxL1. The method for deriving the reference index numbers refIdxL0 and refIdxL1 is described below. The reference index number refIdxLX (where X is 0, 1, or C) is obtained as follows using the reference index numbers refIdxLXA, refIdxLXB, and refIdxLXC of the blocks A, B, and C, which are PUs adjacent to the target PU.
(1) When refIdxLXA = refIdxLXB = refIdxLXC:
  When refIdxLXA = -1, refIdxLX = 0
  Otherwise, refIdxLX = refIdxLXA
(2) When refIdxLXA = refIdxLXB:
  When refIdxLXA = -1, refIdxLX = refIdxLXC
  Otherwise, refIdxLX = refIdxLXA
(3) When refIdxLXB = refIdxLXC:
  When refIdxLXB = -1, refIdxLX = refIdxLXA
  Otherwise, refIdxLX = refIdxLXB
(4) When refIdxLXA = refIdxLXC:
  When refIdxLXA = -1, refIdxLX = refIdxLXB
  Otherwise, refIdxLX = refIdxLXA
(5) When refIdxLXA = -1:
  refIdxLX = min(refIdxLXB, refIdxLXC)
(6) When refIdxLXB = -1:
  refIdxLX = min(refIdxLXA, refIdxLXC)
(7) When refIdxLXC = -1:
  refIdxLX = min(refIdxLXA, refIdxLXB)
(8) In other cases:
  refIdxLX = min(refIdxLXA, refIdxLXB, refIdxLXC)
Here, min is a function that takes a minimum value.
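Rules (1) to (8) can be expressed compactly in code. The sketch below assumes that the rules are tested in the listed order and that the first matching rule applies; the function name derive_ref_idx is hypothetical.

```python
def derive_ref_idx(a, b, c):
    """Apply rules (1) to (8) to refIdxLXA, refIdxLXB, refIdxLXC (-1 = no reference)."""
    if a == b == c:                      # (1)
        return 0 if a == -1 else a
    if a == b:                           # (2)
        return c if a == -1 else a
    if b == c:                           # (3)
        return a if b == -1 else b
    if a == c:                           # (4)
        return b if a == -1 else a
    if a == -1:                          # (5)
        return min(b, c)
    if b == -1:                          # (6)
        return min(a, c)
    if c == -1:                          # (7)
        return min(a, b)
    return min(a, b, c)                  # (8)
```

For example, when none of the three blocks has a reference (all -1) the result is 0, and when A has no reference while B and C refer to indexes 2 and 3, the result is 2.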
The coordinates of the blocks A and B are as follows.
A: (xP-1, yP + nPSH-1)
B: (xP + nPSW-1, yP-1)
The coordinates of the block C are any of the following C0, C1, and C2. If the PU corresponding to each position is available and other than intra, the refIdxLX of the PU at that position is set as refIdxLXC.
C0: (xP + nPSW-1, yP-1)
C1: (xP-1, yP + nPSH)
C2: (xP-1, yP-1)
When refIdxL0 and refIdxL1 have been derived as described above, the temporal merge candidate is derived by determining the L0 motion vector using the motion compensation parameters at the position (xP + nPSW, yP + nPSH) of the reference picture indicated by refIdxL0, and determining the L1 motion vector using the motion compensation parameters at the position (xP + nPSW, yP + nPSH) of the reference picture indicated by refIdxL1. That is, the motion vectors mvLXCol[0] and mvLXCol[1] for each reference picture list LX (X = 0, 1, or C) are calculated from the reference picture indicated by the LX list and refIdxLX. Specifically, if the PU at the position (xP + nPSW, yP + nPSH) of the reference picture indicated by refIdxLX is unavailable or is in the intra prediction mode, the LX motion vectors mvLXCol[0] and mvLXCol[1] of the temporal merge candidate are set to 0. Otherwise, when PredFlagL0 of that PU is 0, the L1 motion vector MvL1 of that PU is used as the LX motion vectors mvLXCol[0] and mvLXCol[1] of the temporal merge candidate. In all other cases, the L0 motion vector MvL0 of that PU is used as the LX motion vectors mvLXCol[0] and mvLXCol[1] of the temporal merge candidate.
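The three-way selection of the collocated motion vector can be sketched as follows. The function name temporal_candidate_mv and the dictionary keys are hypothetical, and the subsequent scaling by POC is deliberately left out.

```python
def temporal_candidate_mv(col_pu):
    """Pick mvLXCol from the collocated PU at (xP + nPSW, yP + nPSH).

    col_pu is None when the PU is unavailable; otherwise it is a dict with the
    hypothetical keys 'intra', 'predFlagL0', 'mvL0', and 'mvL1'.
    """
    if col_pu is None or col_pu["intra"]:
        return (0, 0)                    # unavailable or intra: zero vector
    if col_pu["predFlagL0"] == 0:
        return tuple(col_pu["mvL1"])     # L0 unused: take the L1 motion vector
    return tuple(col_pu["mvL0"])         # otherwise take the L0 motion vector
```

The same function is applied once per reference picture list LX, each time with the collocated PU taken from the reference picture indicated by refIdxLX.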

Subsequently, the motion vector is scaled using the POC (Picture Order Count) of the current picture and the POC of the reference picture. As with adjacent merge candidates, a temporal merge candidate is input to the bi-prediction conversion unit 1219 and converted before it is stored in the merge candidate storage unit 1212H. The converted merge candidate is stored in the merge candidate storage unit 1212H as the temporal merge candidate.

The unique candidate derivation unit 1212C updates the merge candidate list so that the merge candidates in the list are unique with respect to each other. When the merge candidates stored in the merge candidate list run from index 0 to index CANDX, a unique merge candidate list can be obtained by the steps shown in the pseudo code of FIG. 47. The merge candidate list is managed using an array motion_cand[] that stores the merge candidates. If the number of merge candidates is NumCand, then CANDX = NumCand - 1. Each step S of the pseudo code in FIG. 47 is described below.
S4701: All validity flags from index 0 to index CANDX are initialized to be valid. Here, motion_valid [] is an array for storing the validity flag.
S4702: For the loop variable i (i = 1 to CANDX), if a merge candidate motion_cand[j] with an index j smaller than i (0 <= j < i) has the same motion compensation parameters as the merge candidate motion_cand[i] of index i, the validity flag motion_valid[i] of index i is invalidated. In S4702-1, the motion compensation parameters of indexes i and j are compared. The equalMotion function is used for this comparison. Here, equalMotion(A, B) is a function that determines whether the input motion compensation parameters A and B are identical (in the figure, it is written as "hasEqualMotion"). If the motion compensation parameters match, the validity flag motion_valid[i] of index i is invalidated.
S4703: The merge candidates motion_cand whose validity flag motion_valid is true are stored in the merge candidate list. The merge candidate list is reconstructed by copying the valid merge candidates, in ascending order of index, into the array motion_cand that holds the list. Here, copy(A, B) is a function that copies B to A.
S4704: The validity flag motion_valid is reset.
S4705: The number of valid merge candidates NumCand is updated.
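Steps S4701 to S4705 can be sketched as follows, based only on the step descriptions above (the pseudo code of the figure itself is not reproduced here). The function name make_unique is hypothetical, and the comparison function is passed in as a parameter.

```python
def make_unique(motion_cand, equal_motion):
    """Sketch of steps S4701 to S4705 on a list of merge candidates."""
    num_cand = len(motion_cand)
    motion_valid = [True] * num_cand                 # S4701: all flags valid
    for i in range(1, num_cand):                     # S4702
        for j in range(i):
            if equal_motion(motion_cand[i], motion_cand[j]):  # S4702-1
                motion_valid[i] = False              # duplicate of an earlier entry
                break
    # S4703: keep the valid candidates in ascending index order.
    unique = [c for c, ok in zip(motion_cand, motion_valid) if ok]
    motion_valid = [True] * len(unique)              # S4704: reset the flags
    num_cand = len(unique)                           # S4705: update NumCand
    return unique, num_cand
```

Applied to the list [1, 2, 1, 3, 2] with plain equality as the comparison, the sketch returns the list [1, 2, 3] and the count 3, with the first occurrence of each candidate kept.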

FIG. 48 is a diagram for explaining the operation of the combined bi-predictive merge candidate derivation unit 1212D. A combined bi-predictive merge candidate is derived from two reference merge candidates stored in the merge candidate list, by copying the motion compensation parameters of the list L0 from one reference merge candidate and the motion compensation parameters of the list L1 from the other. FIG. 48C is a table for determining the two reference merge candidates to be extracted from the list. The index of the combined bi-predictive merge candidate to be derived is denoted combCand_k. combCand_k takes the value obtained by adding 1 to the last index of the merge candidate list already derived. k is an index starting from 0 and is incremented by 1 each time a combined bi-predictive merge candidate is added to the merge candidate list. The index combIdx is a temporary index used when deriving combined bi-predictive merge candidates and takes values from 0 to 11. For each combIdx from 0 to 11, the reference merge candidates indicated by the two indexes l0CandIdx and l1CandIdx are selected from the merge candidate list. Here, selecting the candidate of index lXCandIdx (X = 0 or X = 1) means extracting the candidate indicated by lXCandIdx from the merge candidates of index 0 to CANDX stored in the merge candidate list. FIG. 48A shows the determination formulas for determining whether to derive a combined bi-predictive merge candidate. If the L0 motion compensation parameters predFlagL0l0Cand, refIdxL0l0Cand, and mvL0l0Cand of the merge candidate selected by l0CandIdx and the L1 motion compensation parameters predFlagL1l1Cand, refIdxL1l1Cand, and mvL1l1Cand of the merge candidate selected by l1CandIdx satisfy all of the determination formulas in FIG. 48A, a combined bi-predictive merge candidate is derived. FIG. 48B is a diagram illustrating the method for deriving the combined bi-predictive merge candidate indicated by the index combCand_k. The motion compensation parameters refIdxL0combCand_k, refIdxL1combCand_k, predFlagL0combCand_k, predFlagL1combCand_k, mvL0combCand_k[0], mvL0combCand_k[1], mvL1combCand_k[0], and mvL1combCand_k[1] of the combined bi-predictive merge candidate are derived by copying the above L0 motion compensation parameters and L1 motion compensation parameters. When the derived combined bi-predictive merge candidate does not match any of the merge candidates stored in the merge candidate list of the merge candidate storage unit 1212H, it is stored at the end of the merge candidate list. The function equalMotion described above is used for the match determination.

If the number of merge candidates has reached MRG_MAX_NUM_CANDS, the operation of the combined bi-predictive merge candidate derivation unit 1212D is terminated. If not, combIdx is incremented by 1, the next two reference merge candidates are extracted using the table of FIG. 48C, and merge candidate derivation continues. When all entries of the table have been used, the operation of the combined bi-predictive merge candidate derivation unit 1212D is terminated.
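The loop over combIdx can be sketched as follows. Several parts are assumptions made for illustration and may differ from the figure: the order of the (l0CandIdx, l1CandIdx) pairs is modelled on the final HEVC standard, the determination formula is reduced to a check of the two prediction list use flags, and plain dictionary equality stands in for equalMotion. The function name add_combined_candidates and the dictionary keys are hypothetical.

```python
# Assumed pair order (l0CandIdx, l1CandIdx) for combIdx = 0..11.
COMB_PAIRS = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1),
              (0, 3), (3, 0), (1, 3), (3, 1), (2, 3), (3, 2)]


def add_combined_candidates(cands, max_cands):
    """Append combined bi-predictive candidates to cands (a list of dicts)."""
    num_orig = len(cands)
    for l0_idx, l1_idx in COMB_PAIRS:
        if len(cands) >= max_cands:
            break
        if l0_idx >= num_orig or l1_idx >= num_orig:
            continue
        l0_cand, l1_cand = cands[l0_idx], cands[l1_idx]
        # Simplified stand-in for the determination formulas: the first
        # candidate must use L0 and the second must use L1.
        if not (l0_cand["predFlagL0"] and l1_cand["predFlagL1"]):
            continue
        comb = {
            "predFlagL0": 1, "refIdxL0": l0_cand["refIdxL0"], "mvL0": l0_cand["mvL0"],
            "predFlagL1": 1, "refIdxL1": l1_cand["refIdxL1"], "mvL1": l1_cand["mvL1"],
        }
        if comb not in cands:   # dict equality stands in for equalMotion
            cands.append(comb)
    return cands
```

With one L0-only candidate and one L1-only candidate in the list, the sketch appends a single bi-predictive candidate that takes its L0 parameters from the first and its L1 parameters from the second.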

FIG. 49 is a diagram for explaining the operation of the non-scale bi-predictive merge candidate derivation unit 1212E. FIG. 49A shows the determination formulas for determining whether to derive a non-scale bi-predictive merge candidate. FIG. 49B is a diagram illustrating the method for deriving the non-scale bi-predictive merge candidate indicated by the index nscaleCand_l. Here, the index nscaleCand_l takes the value obtained by adding 1 to the last index of the merge candidate list already derived. l is an index starting from 0 and is incremented by 1 each time a non-scale bi-predictive merge candidate is added to the merge candidate list. The non-scale bi-predictive merge candidate derivation unit 1212E uses the motion vectors of the merge candidates already derived and stored in the merge candidate storage unit 1212H to derive merge candidates in which the motion vectors for the two reference pictures are the inverse of each other. When the index of the merge candidate to be referenced is origCand and all of the determination formulas in FIG. 49A are satisfied, the non-scale bi-predictive merge candidate is derived according to FIG. 49B. As with combined bi-predictive merge candidates, when the function equalMotion shows that the derived non-scale bi-predictive merge candidate does not match any of the merge candidates stored in the merge candidate list of the merge candidate storage unit 1212H, the candidate is stored at the end of the merge candidate list. If the number of merge candidates has reached MRG_MAX_NUM_CANDS, the operation is terminated; if not, the process is repeated.

FIG. 50 is a diagram illustrating the operation of the zero vector merge candidate derivation unit 1212F. If the number of merge candidates in the merge candidate storage unit 1212H has reached MRG_MAX_NUM_CANDS, no processing is performed. If not, zero vectors are stored until the number of merge candidates reaches MRG_MAX_NUM_CANDS. That is, with zeroCand_m as the index of the merge candidate to be derived, a candidate is derived in which the L0 motion vector (mvL0zeroCand_m[0], mvL0zeroCand_m[1]) and the L1 motion vector (mvL1zeroCand_m[0], mvL1zeroCand_m[1]) are both 0. Here, the index zeroCand_m takes the value obtained by adding 1 to the last index of the merge candidate list already derived. m is an index starting from 0 and is incremented by 1 each time a zero vector merge candidate is added to the merge candidate list. Note that the zero vector merge candidate derivation unit 1212F can also derive merge candidates that use two reference pictures, that is, merge candidates with predFlagL0 = 1 and predFlagL1 = 1; however, when bi-prediction restriction is performed on merge PUs, only single prediction merge candidates are derived for the small-size PUs to which the restriction applies.
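The padding step can be sketched as follows. The function name fill_zero_candidates and the dictionary keys are hypothetical, and two details are assumptions not stated in the text: reference index 0 is used for the zero vector candidates, and under bi-prediction restriction the single prediction candidate uses the list L0.

```python
def fill_zero_candidates(cands, max_cands, restrict_bipred):
    """Append zero vector merge candidates until max_cands is reached."""
    while len(cands) < max_cands:
        cands.append({
            "predFlagL0": 1,
            # Under bi-prediction restriction only L0 is used (assumed choice).
            "predFlagL1": 0 if restrict_bipred else 1,
            "refIdxL0": 0,                             # assumed reference index
            "refIdxL1": -1 if restrict_bipred else 0,
            "mvL0": (0, 0),
            "mvL1": (0, 0),
        })
    return cands
```

If the list is already full, the loop body never runs and the list is returned unchanged, which matches the "no processing is performed" case.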

The merge candidate derivation control unit 1212G performs the operation shown in the flowchart of FIG. 44 to derive merge candidates.

The merge candidate storage unit 1212H stores the derived merge candidate.

(Details of Basic Motion Compensation Parameter Deriving Unit 1213)
FIG. 54 is a block diagram showing the configuration of the basic motion compensation parameter derivation unit 1213. The basic motion compensation parameter derivation unit 1213 comprises an adjacent motion vector candidate derivation unit 1213A, a temporal motion vector candidate derivation unit 1213B, a zero vector merge candidate derivation unit 1213F, a motion vector candidate derivation control unit 1213G, a motion vector candidate storage unit 1213H, a motion vector candidate selection unit 1213I, and a motion vector restoration unit 1213J. Hereinafter, the adjacent motion vector candidate derivation unit 1213A, the temporal motion vector candidate derivation unit 1213B, and the zero vector merge candidate derivation unit 1213F are collectively referred to as motion vector / merge candidate derivation means.

In the basic motion compensation parameter derivation unit 1213, the motion vector candidate derivation control unit 1213G controls each motion vector / merge candidate derivation means, derives a predetermined number PMV_MAX_NUM_CANDS of predicted motion vector candidates, and stores them in the motion vector candidate storage unit 1213H. Here, a predicted motion vector candidate consists of the motion vectors mvL0 and mvL1. The motion vector candidate storage unit 1213H stores each such set of motion parameters as a predicted motion vector candidate. The stored predicted motion vector candidates are managed as a list (predicted motion vector candidate list) ordered in the order of storage.

Similar to the adjacent merge candidate derivation unit 1212A, the adjacent motion vector candidate derivation unit 1213A derives each motion vector predictor candidate by copying the motion compensation parameter of the adjacent block.

Similar to the temporal merge candidate derivation unit 1212B, the temporal motion vector candidate derivation unit 1213B derives temporal prediction motion vector candidates by copying the motion compensation parameters of the already decoded pictures.

The zero vector merge candidate derivation unit 1213F derives a zero vector as a motion vector predictor candidate.

The motion vector candidate derivation control unit 1213G ends the derivation when the predetermined number PMV_MAX_NUM_CANDS of predicted motion vector candidates has been derived. In addition, using the internal unique candidate determination unit 1213C, it stores the predicted motion vectors derived by the adjacent motion vector candidate derivation unit 1213A and the temporal motion vector candidate derivation unit 1213B in the motion vector candidate storage unit 1213H in such a way that they do not match each other (that is, they are unique). Specifically, the motion vector candidate derivation control unit 1213G inputs two motion vectors A and B to the unique candidate determination unit 1213C and has it determine whether motion vector A and motion vector B match. The unique candidate determination unit 1213C determines whether the two input motion vectors match each other.

(Details of bi-prediction restriction PU determination unit 1218: bi-prediction restriction determination method)
Preferred examples of the method by which the bi-prediction restricted PU determination unit 1218 determines whether the target PU is a small-size PU that should be subjected to bi-prediction restriction are described below. The determination method is not limited to the following examples, and other parameters can be used as the PU size information.

(Example 1 of determination method)
In Example 1 of the determination method, where TH is the threshold used for the PU size determination, bi-prediction restriction is performed for PUs smaller than TH×TH. The determination formula in this case, using the target CU size (here, CU Width) and the PU partition type, is as follows.

DisableBiPred = ((CU Width == TH && PU partition type != 2Nx2N) || CU Width < TH) ? true : false
Specifically:
When TH = 16, bi-prediction restriction is performed in each of the 16×8, 8×16, 12×16, 4×16, 16×12, 16×4, 8×8, 8×4, 4×8, and 4×4 PUs.
When TH = 8, bi-prediction restriction is performed in each of the 8×4, 4×8, and 4×4 PUs.
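The determination formula of Example 1 can be checked against the PU sizes listed above with a short sketch. The function name disable_bipred_example1 and the use of strings such as "2Nx2N" for the PU partition type are assumptions made for illustration.

```python
def disable_bipred_example1(cu_width, part_type, th):
    """Example 1: restrict bi-prediction for PUs smaller than TH x TH."""
    return (cu_width == th and part_type != "2Nx2N") or cu_width < th
```

For TH = 8, an 8×8 CU split as 2NxN (two 8×4 PUs) is restricted, while an 8×8 CU with partition type 2Nx2N (one 8×8 PU) and any 16×16 CU are not.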

The base-2 logarithm log2CUSize of the CU size (CU Width) may be used instead of CU Width as the CU size used for the PU size determination. In this case, the determination formula for applying bi-prediction restriction to 16×8, 8×16, 12×16, 4×16, 16×12, 16×4, 8×8, 8×4, 4×8, and 4×4 PUs is as follows.

DisableBiPred = ((log2CUSize == 4 && PU partition type != 2Nx2N) || log2CUSize < 4) ? true : false
The determination formula for applying bi-prediction restriction to 8×4, 4×8, and 4×4 PUs is as follows.

DisableBiPred = (log2CUSize == 3 && PU partition type != 2Nx2N) ? true : false
The determination can also be made using parameters other than the target CU size and the PU partition type. For example, the following determination is possible using the PU division number NumPart.

DisableBiPred = ((CU Width == TH && NumPart > 1) || CU Width < TH) ? true : false
(Example 2 of determination method)
In example 2 of the determination method, bi-prediction restriction is performed on PUs of TH×TH or smaller. The determination formula in this case is as follows.

DisableBiPred = ((CU Width == 2 * TH && PU partition type == NxN) || CU Width < 2 * TH) ? true : false
Specifically:
In the case of TH = 16, bi-prediction restriction is performed in each PU of 16 × 16, 16 × 8, 8 × 16, 12 × 16, 4 × 16, 16 × 12, 16 × 4, 8 × 8, 8 × 4, 4 × 8, and 4 × 4.
In the case of TH = 8, bi-prediction restriction is performed in each PU of 8 × 8, 8 × 4, 4 × 8, and 4 × 4.
In the case of TH = 4, bi-prediction restriction is performed in the 4 × 4 PU.

When log2CUSize is used for the PU size determination, the determination formula when bi-prediction restriction is applied to 8 × 8, 8 × 4, 4 × 8, and 4 × 4 is as follows.

DisableBiPred = ((log2CUSize == 4 && PU partition type == NxN) || log2CUSize < 4) ? true : false
The determination formula for 4 × 4 is as follows.

DisableBiPred = (log2CUSize == 3 && PU partition type == NxN) ? true : false
The following determination can also be made using the PU division number NumPart, since the NxN partition type corresponds to NumPart being 4.

DisableBiPred = ((CU Width == 2 * TH && NumPart == 4) || CU Width < 2 * TH) ? true : false
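
As a cross-check of example 2, the CU Width form can be compared with the two log2CUSize forms. This is a sketch only; the function names are introduced here, and the minimum CU size is assumed to be 8 × 8 (log2CUSize of 3 or more).

```python
def disable_bipred_width(cu_width, part_mode, th):
    """Example 2 using CU Width and the threshold TH."""
    return (cu_width == 2 * th and part_mode == "NxN") or cu_width < 2 * th


def disable_bipred_log2_8x8(log2_cu_size, part_mode):
    """log2CUSize form restricting 8x8, 8x4, 4x8, and 4x4 (TH = 8)."""
    return (log2_cu_size == 4 and part_mode == "NxN") or log2_cu_size < 4


def disable_bipred_log2_4x4(log2_cu_size, part_mode):
    """log2CUSize form restricting only 4x4 (TH = 4)."""
    return log2_cu_size == 3 and part_mode == "NxN"


def forms_agree():
    """Compare both forms for every CU size from 8 to 64 and partition type."""
    for log2_cu_size in range(3, 7):
        for part_mode in ("2Nx2N", "2NxN", "Nx2N", "NxN"):
            cu_width = 1 << log2_cu_size
            if disable_bipred_width(cu_width, part_mode, 8) != \
                    disable_bipred_log2_8x8(log2_cu_size, part_mode):
                return False
            if disable_bipred_width(cu_width, part_mode, 4) != \
                    disable_bipred_log2_4x4(log2_cu_size, part_mode):
                return False
    return True
```
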
In the above example, different PU sizes (threshold values TH) may be used for the skip PU, the merge PU, and the basic inter PU. Further, as already shown in FIG. 34C, the PU size (threshold value TH) used for the determination may be encoded.

(Details of bi-predictive conversion unit 1219)
When the input motion compensation parameter indicates bi-prediction and the bi-prediction restricted PU determination unit 1218 has determined that bi-prediction restriction is to be performed for the skip PU and the merge PU, the bi-prediction conversion unit 1219 converts the motion compensation parameter input to it into single prediction.

The bi-prediction conversion unit 1219 can switch among a plurality of bi-prediction conversion methods, and may perform the conversion using the bi-prediction conversion method specified by the merge candidate derivation unit. In addition, whether or not to perform bi-prediction conversion may be input to the bi-prediction conversion unit 1219, and the operation may be switched accordingly.

Also, as will be described later, the bi-prediction conversion unit 1219 may sequentially output two motion compensation parameters when the input motion compensation parameter is bi-prediction.

When the inter prediction flag inter_pred_flag of a motion compensation parameter, derived by copying the motion compensation parameter of a temporally or spatially close PU or by combining the motion compensation parameters of such PUs, is 2 (indicating bi-prediction), it is converted to 1 (indicating uni-prediction). Further, when the inter prediction flag used for internal processing (the internal inter prediction flag) takes the values 1 indicating L0 prediction, 2 indicating L1 prediction, and 3 indicating bi-prediction, the following operation is performed. When the internal inter prediction flag is 3, its value is converted to 1, meaning L0 prediction, or to 2, meaning L1 prediction. When converting to L0 prediction, the motion compensation parameters related to L1 prediction may be refreshed, for example to zero. When converting to L1 prediction, the motion compensation parameters related to L0 prediction may be refreshed, for example to zero. Note that the internal inter prediction flag and the prediction list use flags predFlagL0 and predFlagL1 can be converted into each other as follows.

Internal inter prediction flag = (predFlagL1 << 1) + predFlagL0
predFlagL0 = Internal inter prediction flag & 1
predFlagL1 = Internal inter prediction flag >> 1
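
The mutual conversion above can be written out as a short sketch; the function names are introduced here for illustration.

```python
def to_internal_flag(pred_flag_l0, pred_flag_l1):
    """Combine the prediction list use flags into the internal flag (1, 2, or 3)."""
    return (pred_flag_l1 << 1) + pred_flag_l0


def to_pred_flags(internal_flag):
    """Split the internal flag back into (predFlagL0, predFlagL1)."""
    return internal_flag & 1, internal_flag >> 1
```

With these definitions, an internal flag of 1 corresponds to (predFlagL0, predFlagL1) = (1, 0), 2 to (0, 1), and 3 to (1, 1).
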
As a method of converting to single prediction, besides changing the inter prediction flag (and the internal inter prediction flag), the conversion can also be performed as follows: when both prediction list use flags predFlagL0 and predFlagL1 are 1, the value indicating that the list is used, one of them is changed to 0, the value indicating that the list is not used.

FIG. 51 is a diagram for explaining examples of the method of conversion to single prediction (bi-prediction conversion method). The L0 selection changes the prediction list use flag predFlagL1 to 0. The L1 selection changes the prediction list use flag predFlagL0 to 0. In the reference index number selection, the prediction list use flag of the list having the larger of the reference index numbers refIdxL0 and refIdxL1 is changed to 0. As an expression, X = (ref_idx_L1 < ref_idx_L0) ? 0 : 1, predFlagLX = 0. As a result, the list with the smaller reference index number is used. In the POC selection, the difference between the POC of the current picture (POC_curr) and the POC of the reference picture indicated by the reference index number refIdxL0 (POC_L0) is compared with the difference between the POC of the current picture and the POC of the reference picture indicated by the reference index number refIdxL1 (POC_L1), and the prediction list use flag of the list with the larger difference is changed to 0. As an expression, X = (|POC_L1 - POC_curr| < |POC_L0 - POC_curr|) ? 0 : 1, predFlagLX = 0. As a result, the reference picture whose POC is closer to that of the current picture is used.

In the both selection, the case where the L0 motion compensation parameter is used and the case where the L1 motion compensation parameter is used are both used as candidates. In other words, when the input merge candidate is bi-prediction, two merge candidates, one changed to use L0 (predFlagL1 = 0) and one changed to use L1 (predFlagL0 = 0), are output to the merge candidate derivation unit from which the input was received.
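
The L0, L1, reference index number, and POC selections can be sketched as below. The class, field, and function names, as well as the string labels for the methods, are assumptions introduced for illustration; the reference index numbers are left unchanged by the conversion in this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MotionParams:
    pred_flag_l0: int
    pred_flag_l1: int
    ref_idx_l0: int
    ref_idx_l1: int


def list_to_disable(method, mp, poc_curr=0, poc_l0=0, poc_l1=0):
    """Return X, the list whose prediction list use flag is set to 0."""
    if method == "L0":
        return 1  # keep L0, disable L1
    if method == "L1":
        return 0  # keep L1, disable L0
    if method == "ref_idx":
        return 0 if mp.ref_idx_l1 < mp.ref_idx_l0 else 1
    if method == "poc":
        return 0 if abs(poc_l1 - poc_curr) < abs(poc_l0 - poc_curr) else 1
    raise ValueError(f"unknown conversion method: {method}")


def convert_to_uni(method, mp, **poc):
    """Convert bi-prediction parameters to single prediction."""
    if not (mp.pred_flag_l0 == 1 and mp.pred_flag_l1 == 1):
        return mp  # already single prediction; nothing to convert
    x = list_to_disable(method, mp, **poc)
    return replace(mp, pred_flag_l0=0) if x == 0 else replace(mp, pred_flag_l1=0)
```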

In experiments by the inventors, it has been confirmed that, when bi-prediction is restricted for the PU sizes 8 × 4 and 4 × 8, the method of performing the L1 selection as the bi-prediction conversion causes almost no decrease in coding efficiency. In this case, the bi-prediction conversion can be expressed as follows. When the CU size log2CUSize is 3 and the PU partition type PartMode is other than 2N × 2N, the prediction list use flag predFlagL0 is set to 0 if both the prediction list use flag predFlagL0 and the prediction list use flag predFlagL1 are 1.

Generally, in deriving motion compensation parameters, reference pictures in the L0 list are often given priority. Conversely, using reference pictures in the L1 list instead of those in the L0 list differentiates the conversion from derivation processes that prioritize the L0 list. When a plurality of derivation processes can be selected according to some encoding parameter, making the derivation process of one group give priority to the L0 list and that of another group give priority to the L1 list lets the derivation processes complement each other, so that they work effectively for sequences and regions with a wider variety of motion characteristics. For this reason, high encoding efficiency can be obtained by using the L1 list in the bi-prediction conversion.

Further examples of the method of conversion to single prediction, other than those of FIG. 51, are described below. For each merge candidate that is bi-prediction (both predFlagL0 and predFlagL1 are 1), whether predFlagL1 is set to 0 (the L0 list is used) or predFlagL0 is set to 0 (the L1 list is used) may be switched. For example, the adjacent merge candidate derivation unit 1212A derives merge candidates in the order of A1, B1, B0, A0, and B2, and one method is to set predFlagL1 = 0 for A1, B0, and B2, and predFlagL1 = 1 (with predFlagL0 = 0) for B1 and A0. In this example, the case where the L0 list is used and the case where the L1 list is used are selected alternately each time a merge candidate is derived. Also in this example, for the merge candidates A0 and A1 calculated from blocks adjacent in the left direction, and for the merge candidates B0 and B1 calculated from blocks adjacent in the upward direction, the L0 list is used for one and the L1 list for the other. By converting some of the adjacent merge candidates to uni-prediction with predFlagL1 = 0 and others with predFlagL0 = 0 in this way, the motion compensation parameters of both reference lists can be used in a balanced manner, and high encoding efficiency can be obtained. It is also preferable to select different reference lists for merge candidates adjacent in the left direction (A0, A1) and those adjacent in the upward direction (B0, B1, B2). Which merge candidates are given predFlagL1 = 0 and which are given predFlagL0 = 0 may be decided in other ways. For example, predFlagL1 = 1 may be set for A1, B0, and B2, and predFlagL1 = 0 for B1 and A0. The opposite reference list may also be used for each successive index in the order of derivation or in the order of storage in the merge candidate list.

When bi-prediction conversion is performed by setting the prediction list use flag predFlagLX of the reference list X to 0, the reference index number refIdxLX and the motion vector mvLX are not refreshed to initial values such as -1 and (0, 0), respectively, although refreshing them is also possible. Experiments by the inventors have confirmed that higher encoding efficiency is achieved when these values are not refreshed. When they are not refreshed, the reference index number and the motion vector of the restricted list can still be used in subsequent processing even though the use of that reference picture list is restricted. Therefore, high encoding efficiency can be obtained.

Note that the bi-prediction conversion unit 1219 may be configured as means included in the skip motion compensation parameter derivation unit 1211 and the merge motion compensation parameter derivation unit 1212. In addition, when bi-prediction restriction is performed only on the basic inter PU, a configuration without the bi-prediction conversion unit 1219 may be used.

(Motion information decoding unit 1021 in bi-prediction restriction)
FIG. 30 is a block diagram illustrating the configuration of the motion information decoding unit 1021. The motion information decoding unit 1021 includes at least an inter prediction flag decoding unit 1028. In particular, the motion information decoding unit 1021 performs bi-prediction restriction on the basic inter PU, which is the PU for which the inter prediction flag is decoded. The inter prediction flag decoding unit 1028 changes whether or not to decode the inter prediction flag depending on whether or not the bi-prediction restricted PU determination unit 1218 described above performs bi-prediction restriction on the basic inter PU.

Note that when the inter prediction flag is decoded for a skip PU, for example when the use of merge is restricted, bi-prediction restriction is performed on the skip PU here as well.

(Inter prediction flag decoding unit 1028)
FIG. 39 is a flowchart showing the operation of the inter prediction flag decoding unit 1028 in the case of CABAC. When the slice is a B slice (YES in S131), the inter prediction flag decoding unit 1028 proceeds to S132. Otherwise (NO in S131), the process ends without decoding the inter prediction flag inter_pred_flag. When the PU size is a small PU size (when DisableBiPred = true) (YES in S132), the process ends without decoding the inter prediction flag inter_pred_flag. In other cases (NO in S132), the inter prediction flag inter_pred_flag is decoded (S133).

FIG. 40 is a flowchart showing the operation of the inter prediction flag decoding unit 1028 in the case of CAVLC. If the slice is a B slice (YES in S141), the inter prediction flag decoding unit 1028 proceeds to S142. Otherwise (NO in S141), the process ends without decoding the inter prediction flag inter_pred_flag. When the PU size is other than the small PU size (when DisableBiPred != true) (NO in S142), the combined inter prediction reference index combined_inter_pred_ref_idx is decoded through S143, S144, and S145. When the PU size is the small PU size (YES in S142), the combined inter prediction reference index combined_inter_pred_ref_idx is decoded through S146, S147, and S148.

In S143 and S146, the maximum value MaxPredRef is calculated. The maximum value MaxPredRef is as shown in the table TBL37 and the pseudo code CODE37 of the figure. Specifically, for PUs other than small-size PUs, that is, when there is no bi-prediction restriction (DisableBiPred != true), MaxPredRef is calculated as NumPredRefLC + NumPredRefL0 * NumPredRefL1 or NumPredRefL0 + NumPredRefL0 * NumPredRefL1, that is, as the sum of the number of combined reference picture sets for single prediction (NumPredRefLC or NumPredRefL0) and the number of combined reference picture sets for bi-prediction (NumPredRefL0 * NumPredRefL1) (S143). With bi-prediction restriction (DisableBiPred = true), MaxPredRef is calculated as NumPredRefLC or NumPredRefL0, that is, as the number of combined reference picture sets for single prediction, and the combined reference picture sets for bi-prediction are not included (S146). This eliminates useless codes. Note that num_ref_idx_lc_active_minus1 is the number of reference list numbers managed in the reference list LC (the size of the reference list LC) minus 1. When num_ref_idx_lc_active_minus1 is greater than 0, it indicates that the reference list LC is used. Similarly, num_ref_idx_l0_active_minus1 is the number of reference list numbers managed in the reference list L0 (the size of the reference list L0) minus 1, and indicates that the reference list L0 is used. Similarly, num_ref_idx_l1_active_minus1 is the number of reference list numbers managed in the reference list L1 (the size of the reference list L1) minus 1. The number of combined reference picture sets for single prediction is obtained by clipping the reference list size to a maximum value (4 for the reference list LC and 2 for the other reference lists), as in the following equations.
NumPredRefLC = Min(4, num_ref_idx_lc_active_minus1 + 1)
NumPredRefL0 = Min(2, num_ref_idx_l0_active_minus1 + 1)
NumPredRefL1 = Min(2, num_ref_idx_l1_active_minus1 + 1)
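
The calculation of MaxPredRef in S143 and S146 can be sketched as follows. The function name and argument names are introduced here. The choice between NumPredRefLC and NumPredRefL0 is an assumption: this sketch selects NumPredRefLC when num_ref_idx_lc_active_minus1 is greater than 0, following the statement that this indicates the reference list LC is used.

```python
def max_pred_ref(lc_active_minus1, l0_active_minus1, l1_active_minus1,
                 disable_bipred):
    """Return MaxPredRef with or without bi-prediction restriction."""
    num_pred_ref_lc = min(4, lc_active_minus1 + 1)
    num_pred_ref_l0 = min(2, l0_active_minus1 + 1)
    num_pred_ref_l1 = min(2, l1_active_minus1 + 1)
    # Assumption: the reference list LC is used when its size exceeds 1.
    num_uni = num_pred_ref_lc if lc_active_minus1 > 0 else num_pred_ref_l0
    if disable_bipred:
        return num_uni  # S146: bi-prediction sets are not counted
    return num_uni + num_pred_ref_l0 * num_pred_ref_l1  # S143
```

For example, with a reference list LC of size 4 and reference lists L0 and L1 of size 2 each, the maximum value is 4 + 2 * 2 = 8 without restriction and 4 with restriction, so the unary code for a small PU is shorter.
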
In the decoding of combined_inter_pred_ref_idx in S144 and S147, codeNum, which is encoded with a unary code whose maximum value is MaxPredRef, is decoded. In S145 and S148, codeNum is converted into combined_inter_pred_ref_idx. For PUs other than small-size PUs, codeNum is converted into combined_inter_pred_ref_idx by the inverse conversion variable table DecTable. That is, combined_inter_pred_ref_idx = DecTable[codeNum] (S145). For small-size PUs, codeNum is used as it is as the value of combined_inter_pred_ref_idx. That is, combined_inter_pred_ref_idx = codeNum (S148). Then, it is determined whether combined_inter_pred_ref_idx matches the maximum value MaxPredRef (S149). If they match (YES in S149) and the PU is not a small-size PU, inter_pred_flag is decoded (S150). This operation corresponds to the decoding of the syntax table shown in the figure.
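
Steps S144/S147, S145/S148, and S149 can be sketched as follows. This is a sketch under assumptions, not the decoding process of the apparatus: the function names are introduced here, the bit source is a plain Python iterator, and the unary code is assumed to be a run of 1 bits terminated by a 0 bit, with the terminating bit omitted when the maximum value is reached.

```python
def read_unary_max(bits, max_val):
    """Decode a unary code with a maximum value (assumed bit convention)."""
    code_num = 0
    while code_num < max_val and next(bits) == 1:
        code_num += 1
    return code_num


def decode_combined_idx(bits, max_pred_ref, dec_table, small_pu):
    """Return (combined_inter_pred_ref_idx, whether inter_pred_flag follows)."""
    code_num = read_unary_max(bits, max_pred_ref)        # S144 / S147
    if small_pu:
        idx = code_num                                   # S148
    else:
        idx = dec_table[code_num]                        # S145
    # S149 / S150: inter_pred_flag is decoded only when the index equals
    # MaxPredRef and the PU is not a small-size PU.
    needs_inter_pred_flag = idx == max_pred_ref and not small_pu
    return idx, needs_inter_pred_flag
```
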

In the above example, decoding is simplified by not using the inverse conversion variable table for small-size PUs. However, by replacing step S148 with the process shown by the pseudo code in FIG. 41, combined_inter_pred_ref_idx can also be decoded using the inverse conversion variable table. The inverse conversion variable table and the conversion variable table are the same tables whether or not the bi-prediction restriction is used. When bi-prediction is not used, that is, under the bi-prediction restriction, the table entries corresponding to bi-prediction are invalid. It is therefore necessary to skip the invalid entries. Specifically, after decoding the code number, the decoding unit scans the entries of the inverse conversion variable table DecTable in descending order of occurrence probability, that is, in order from the smallest number to the largest. An entry whose content is bi-prediction, that is, one that uses two prediction lists, is invalid and is skipped, and only valid entries are counted. When this count value matches the decoded code number, the parameter of the entry at that count is taken as the value of combined_inter_pred_ref_idx to be decoded. Further, the inverse conversion variable table DecTable is updated by regarding the count value that includes the invalid entries as the code number. Specifically, the operation of FIG. 41 is performed. FIG. 41 is pseudo code illustrating the decoding process of combined_inter_pred_ref_idx when the inverse conversion variable table is used. Each step S of the pseudo code illustrated in FIG. 41 is described below. In FIG. 41, the maximum value MaxPredRef is written as uiMaxVal to follow the coding style. NumPredRefLC, NumPredRefL0, and NumPredRefL1 are written as uiValNumRefIdxOfLC, uiValNumRefIdxOfL0, and uiValNumRefIdxOfL1, respectively.
S501: The maximum value MaxPredRef is obtained.
S502: The maximum value MaxPredRef and the maximum value uiBipredVal under bi-prediction restriction are obtained.
S503: The code number tmp is obtained by calling the unary decoding process xReadUnaryMaxSymbol with the maximum value MaxPredRef as an argument.
S504: combined_inter_pred_ref_idx is obtained from the code number tmp using the inverse conversion variable table m_uiMITableD. When bi-prediction restriction is not performed, this value is the final value of combined_inter_pred_ref_idx.
S505: This is the branch for entering the process of obtaining combined_inter_pred_ref_idx from the code number when bi-prediction restriction is performed.
S506: The maximum value MaxPredRef for the case where bi-prediction restriction is not performed is obtained for later use in the determination in S509.
S507: A loop is executed from 0 to the maximum value MaxPredRef using the temporary code number tmp2 as the loop variable. The second temporary code number cx is initialized to 0.
S508: A temporary combined_inter_pred_ref_idx value x is obtained by converting the temporary code number tmp2 with the inverse conversion variable table.
S509: It is determined whether or not the temporary combined_inter_pred_ref_idx value x is within the valid range. The value is valid when it does not exceed the maximum value uiBipredVal under bi-prediction restriction, or when it is the maximum value MaxPredRef for the case where bi-prediction restriction is not performed.
S510: When the second temporary code number cx matches the decoded code number tmp, the loop is terminated. The temporary code number tmp2 at the end of the loop corresponds to the code number for the case where bi-prediction restriction is not performed. It is therefore assigned to the code number tmp.
S511: The second temporary code number cx is incremented.
S512: The temporary combined_inter_pred_ref_idx value x at the end of the loop is obtained as the decoded value of combined_inter_pred_ref_idx.
S513: The process adaptCodeword for updating the inverse conversion variable table is called using the code number tmp for the case where bi-prediction restriction is not performed.
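
The scan of S507 to S512 can be sketched as follows. This is not the pseudo code of FIG. 41: the function and argument names are introduced here, the table update of S513 is omitted, and the validity test is an assumption of this sketch, namely that index values below the number of single-prediction sets are valid, the value equal to the unrestricted maximum is valid, and all other values denote bi-prediction sets.

```python
def decode_with_shared_table(code_num, dec_table, num_uni, max_unrestricted):
    """Map a code number decoded under restriction onto the shared table.

    Returns (combined_inter_pred_ref_idx, code number without restriction).
    """
    cx = 0  # counts valid entries only
    for tmp2, x in enumerate(dec_table):  # tmp2: unrestricted code number
        # Assumed validity test (see the lead-in above).
        valid = x < num_uni or x == max_unrestricted
        if not valid:
            continue  # bi-prediction entry: skipped and not counted
        if cx == code_num:
            return x, tmp2
        cx += 1
    raise ValueError("code number exceeds the number of valid entries")
```

The second returned value corresponds to the code number that would be passed to the table update, since the shared table is indexed by code numbers that include the invalid entries.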

FIG. 42 is a pseudo code showing the encoding process of combined_inter_pred_ref_idx when a variable table is used.

As described above, the bi-prediction restriction of the basic inter PU (changing the decoding method of the inter prediction flag and the combined inter prediction reference index), the bi-prediction restriction of the merge PU (bi-prediction conversion in merge candidate derivation), and the skipping of bi-prediction merge candidate derivation have been described as methods of reducing the processing amount for small PU sizes. These restrictions may each be used alone, and the PU sizes to which they are applied may differ from one another. FIGS. 52 and 53 show examples of reducing the processing amount of bi-prediction. In the figures, ◯ indicates that the process is performed, and × indicates that the process is not performed.

FIG. 52(a) is an example in which the bi-prediction restriction of the basic inter PU, the bi-prediction restriction of the merge PU, and the bi-prediction merge candidate derivation skip are applied uniformly to PUs of sizes 4 × 4, 4 × 8, and 8 × 4. FIGS. 52(b) and 52(c) are examples in which the bi-prediction restriction of the merge PU and the bi-prediction merge candidate derivation skip are not performed, and bi-prediction restriction is performed only on the basic inter PU. In general, performing the bi-prediction restriction of the merge PU may lower the encoding efficiency, so an example in which the bi-prediction restriction is applied only to the basic inter PU is appropriate.

FIG. 52(d) is an example in which the bi-prediction restriction of the basic inter PU is applied uniformly to PUs of sizes 4 × 4, 4 × 8, and 8 × 4, and the bi-prediction restriction of the merge PU and the bi-prediction merge candidate derivation skip are applied to PUs of size 8 × 8. It is appropriate in terms of coding efficiency to loosen the bi-prediction restriction of the merge PU compared to the bi-prediction restriction of the basic inter PU.

FIG. 53(a) is an example in which the bi-prediction restriction of the basic inter PU and the bi-prediction merge candidate derivation skip are applied to 4 × 4, 4 × 8, 8 × 4, and 8 × 8. Although the bi-prediction restriction of the merge PU is not performed, the processing amount related to bi-prediction in the merge PU can be reduced by simplifying the merge candidate derivation used for the motion compensation parameters of the merge PU. FIG. 53(b) is an example in which the bi-prediction merge candidate derivation skip is applied to 4 × 4, 4 × 8, 8 × 4, and 8 × 8. In this way, the bi-prediction merge candidate derivation skip can also be used alone.

To realize such cases, the determinations may be managed with separate flags. For example, flags DisableBiPredFlag, DisableBiPredMerge, and DisableBiPredMergeDerive, each indicating that the corresponding restriction is performed, are provided, and the restrictions can be realized by the following operations.

For example, the bi-prediction restricted PU determination unit 1218 derives the three flags DisableBiPredFlag, DisableBiPredMerge, and DisableBiPredMergeDerive. The example shown in FIG. 52(d) can be derived as follows.

DisableBiPredFlag = (log2CUSize == 3 && PU partition type != 2Nx2N) ? true : false
DisableBiPredMerge, DisableBiPredMergeDerive = ((log2CUSize == 4 && PU partition type == NxN) || log2CUSize < 4) ? true : false
The inter prediction flag decoding unit 1028 changes the decoding method of the inter prediction flag and the combined inter prediction reference index when DisableBiPredFlag is true.

The merge motion compensation parameter derivation unit 1212 performs bi-prediction conversion using the bi-prediction conversion unit 1219 in merging candidate derivation when DisableBiPredMerge is true.

The merge motion compensation parameter derivation unit 1212 skips the bi-predictive merge candidate derivation in the merge candidate derivation when DisableBiPredMergeDerive is true.
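
The derivation of the three flags for the example of FIG. 52(d) can be sketched as follows. The tuple type and the function name are introduced here for illustration; the conditions are those of the two formulas given above.

```python
from typing import NamedTuple


class BiPredFlags(NamedTuple):
    disable_bipred_flag: bool          # used by the inter prediction flag decoding unit 1028
    disable_bipred_merge: bool         # triggers bi-prediction conversion in merge candidate derivation
    disable_bipred_merge_derive: bool  # triggers the bi-prediction merge candidate derivation skip


def derive_flags(log2_cu_size, part_mode):
    """Derive the three restriction flags for the example of FIG. 52(d)."""
    basic = log2_cu_size == 3 and part_mode != "2Nx2N"
    merge = (log2_cu_size == 4 and part_mode == "NxN") or log2_cu_size < 4
    return BiPredFlags(basic, merge, merge)
```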

The actions and effects obtained when the motion compensation parameter derivation unit 121 performs bi-prediction restriction with reference to the small PU size 123A are as follows. Bi-prediction requires a larger processing amount than single prediction, and a small PU requires a larger processing amount per unit area than a large PU. Bi-prediction in a small PU can therefore become a processing bottleneck. For this reason, suppressing bi-prediction in small PUs prevents the processing amount from increasing excessively. In particular, the worst-case processing amount, which arises when PUs of the smallest size are processed, can be suppressed.

The following supplements the description of the inter prediction flag. In Non-Patent Document 1, the inter prediction flag (inter_pred_flag) is basically a flag for selecting bi-prediction or uni-prediction. However, when the combined list is not used and the backward prediction prohibition flag does not indicate prohibition, inter_pred_flag may also transmit a flag for selecting either L0 or L1 as the reference frame list used for single prediction.

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention restores an image in a prediction unit using, as the inter-screen prediction method, either uni-prediction that refers to one reference image or bi-prediction that refers to two reference images, and includes bi-prediction restriction means for restricting bi-prediction for a target prediction unit, which is a prediction unit of a predetermined size or smaller to which the inter-screen prediction is applied.

By performing the above restriction, there is an effect that it is possible to reduce the amount of processing that becomes a bottleneck of decoding processing.

[2-3-2] Determination of Size for Limiting Bi-Prediction
Hereinafter, a configuration for determining the size for limiting bi-prediction will be disclosed using FIGS. 58 to 70.

(Level regulation)
First, the level restrictions in H.264/AVC will be described with reference to FIGS. 58 and 59. FIGS. 58 and 59 are tables that define the level restrictions in H.264/AVC.

The level restrictions will be described with reference to FIG. 58. The level in the level regulation defines the performance of the decoder and the complexity of the bit stream.

The level is specified by an integer part and a non-integer part. The integer part of the level mainly represents a rough division according to the resolution of the image to be handled. As shown in FIG. 58, a number from 1 to 5 is designated as the integer part. Level 1, level 2, and level 3 correspond to the resolutions of QCIF, CIF, and SDTV (standard television), respectively.

Level 4 corresponds to the resolution of HDTV (high definition television). Level 5 corresponds to Super HDTV resolution.

Further, as shown in FIG. 58, within each integer level, an intermediate level may be further specified by a non-integer part (see the item "Level number" indicated by COL581).

For these levels, parameters representing decoder performance and bit stream complexity are defined.

The parameters defined in the table shown in FIG. 58 are, in order from the Level number (COL581) toward the right side of the table, MaxMBPS, MaxFS, MaxDPB, MaxBR, MaxCPB, MaxVmvR, MinCR, and MaxMvsPer2Mb.

Here, MaxFS indicated by the reference number COL582 will be described as follows. MaxFS defines the maximum frame size by the number of macroblocks (MBs).

For example, at levels 2.2 and 3, MaxFS = 1620. Further, at level 3.1, MaxFS = 3600, and at level 3.2, MaxFS = 5120.

Screen sizes include 480p, 720p, 1080p, 4k, and so on, and MaxFS determines at which levels these screen sizes can be processed.

One macro block is composed of 16 × 16 = 256 pixels. Therefore, for example, the number of macroblocks included in 480p (720 × 480) is 720 × 480/256 = 1350 (MBs). As described above, at level 2.2, since MaxFS = 1620, 480p can be processed.

For example, for 720p (1280 × 720), it is only necessary to process 1280 × 720/256 = 3600 (MBs) macroblocks per frame. Therefore, at levels 3.1 and 3.2, 720p can be processed. In the calculation of the processing capability of the image size, when an image size that is not divisible by 16 is a target, the calculation is performed by rounding up the image size to a value divisible by 16. For example, “1080p” is calculated after being divided into macroblocks as a screen corresponding to 1920 × 1088.
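
The macroblock counts above, including the rounding up of image sizes that are not divisible by 16, can be reproduced with the following sketch. The function names are introduced here; the MaxFS values in the table are those quoted in the text.

```python
# MaxFS values quoted in the text, keyed by level number.
MAX_FS = {"2.2": 1620, "3": 1620, "3.1": 3600, "3.2": 5120}


def macroblocks_per_frame(width, height, mb_size=16):
    """Count 16x16 macroblocks, rounding each dimension up to a multiple of 16."""
    mbs_wide = -(-width // mb_size)   # ceiling division
    mbs_high = -(-height // mb_size)
    return mbs_wide * mbs_high


def can_process(level, width, height):
    """Check whether the frame size fits within MaxFS of the given level."""
    return macroblocks_per_frame(width, height) <= MAX_FS[level]
```

With this sketch, 720 × 480 gives 1350 macroblocks and 1280 × 720 gives 3600, matching the figures above, while 1920 × 1080 is counted as 1920 × 1088 and gives 120 × 68 = 8160 macroblocks.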

In the following, in FIG. 58, screen sizes that can be processed at each level of 4 to 5.1 are shown on the right side of the table.

In level 4 to 4.2, 1080p can be processed. At level 5, 2560 × 1600, which is the size used in the PC display, can be processed. At level 5.1, 4k can be processed.

As indicated by reference numeral COL591 in FIG. 59, at level 3.1 or higher (screen size of 720p or larger), the minimum block size for which luminance bi-prediction is allowed (MinLumaBiPredSize) is 8 × 8. That is, bi-prediction in sub-macroblock units (8 × 4, 4 × 8, and 4 × 4) is prohibited at level 3.1 or higher.

A macroblock in H.264/AVC is a unit corresponding to the CU, TU, and PU in HEVC.

Incidentally, H.264/AVC restricts only bi-prediction, but with small-size PUs the processing amount required for the motion compensation filter processing and the transfer amount of reference pixels are large even in single prediction. It is therefore appropriate to restrict the use of small PU sizes at the same time as restricting bi-prediction at small PU sizes. As the level increases, the resolution (image size) that is typically used increases, and the constraints on the processing amount and the transfer amount become more severe as the resolution increases. At the same time, when the resolution is high, the object size also increases accordingly (the spatial correlation of motion increases), so that high coding efficiency can be achieved even with relatively large PUs alone. Conversely, the use of small PU sizes and bi-prediction at small PU sizes can be restricted without a significant decrease in coding efficiency. Since the resolution basically corresponds to the level, the PU sizes to be restricted and the prediction units (PUs) in which bi-prediction is to be limited differ depending on the level. For example, at level 3.1 (720p), it is preferable to restrict 4 × 4 PUs and to limit bi-prediction in 8 × 4 PUs and 4 × 8 PUs. At level 5 (equivalent to 2560 × 1600 and 4k), it is preferable to restrict 8 × 4 PUs and 4 × 8 PUs and to limit bi-prediction in 8 × 8 PUs.

In HEVC, the minimum CU size can be controlled by the value log2_min_coding_block_size_minus3 in the encoded data, which will be described later. Since the purpose of controlling the minimum CU size is to reduce the processing amount and the transfer amount, it is appropriate to limit at the same time bi-prediction, which greatly affects these quantities. In addition, since the degree to which the processing amount and the transfer amount should be restricted changes with the minimum CU size, it is preferable to change the restriction of bi-prediction adaptively according to the minimum CU size. An example of adaptive restriction/limitation will be described with reference to FIG. 60. FIG. 60(a) illustrates the limitation of bi-prediction in the case of a 16 × 16 CU, and FIG. 60(b) illustrates the limitation of bi-prediction in the case of an 8 × 8 CU. As illustrated in FIG. 60(a), a 16 × 16 CU can take 16 × 16 PU, 16 × 8 PU, 8 × 16 PU, 8 × 8 PU, and the like. Here, bi-prediction restriction is not performed on 16 × 16 PU, 16 × 8 PU, and 8 × 16 PU, while bi-prediction restriction is performed on 8 × 8 PU.

Also, as illustrated in FIG. 60B, for an 8 × 8 CU, 8 × 8 PU, 8 × 4 PU, and 4 × 8 PU can be taken. Here, while bi-prediction restriction is not performed on 8 × 8 PU, bi-prediction restriction is performed on 8 × 4 PU and 4 × 8 PU.

(Restriction of encoded data due to level restrictions)
The adaptive constraints and restrictions shown in FIG. 60 can also be realized, without any special configuration of the moving image decoding apparatus 1, by a level restriction that limits, according to the level, the values of the motion compensation parameters derived when decoding the encoded data. FIG. 84 is an example of the level restriction of the present invention. In the table of FIG. 84, MaxLog2MinCUSize, MinPUSize, and MinBipredPUSize are the logarithmic value of the minimum CU size, the minimum PU size, and the minimum bi-predictive PU size, respectively, and indicate the minimum CU size and PU size values that can be used at a specific level.

As shown in FIG. 84, when the level level_idc is less than the predetermined threshold TH1, the logarithmic value of the minimum CU size and the minimum PU size are 3 and 4 × 4, respectively, and are not particularly limited. The minimum bi-predictive PU size is 8 × 4 and 4 × 8, so bi-prediction of 4 × 4 PU is disabled.

Next, when the level level_idc is equal to or greater than the predetermined threshold TH1 and less than the predetermined threshold TH2, the logarithmic value of the minimum CU size is 3 and is not particularly limited, but the minimum PU size is 8 × 4 and 4 × 8. That is, 4 × 4 PU cannot be used. Furthermore, the minimum bi-predictive PU size is 8 × 4 and 4 × 8, and bi-prediction of 4 × 4 PU is disabled.

Finally, when the level level_idc is equal to or greater than the predetermined threshold TH2, the logarithmic value of the minimum CU size is 4, and the minimum PU size is limited to 8 × 8. That is, 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU cannot be used. Furthermore, the minimum bi-predictive PU size is 16 × 8, and bi-prediction of 8 × 8 PU is disabled. Note that bi-prediction of 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU is already unavailable because of the minimum PU size restriction.
The threshold TH1 is suitably level 2.1, which corresponds to 720p, and the threshold TH2 is suitably level 5, which corresponds to 2560 × 1600, but other thresholds may be used.
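
The three level ranges described for FIG. 84 can be sketched as a lookup. This is only an illustrative sketch: the function name level_limits, the LevelLimits container, and the numeric values passed for TH1 and TH2 are assumptions introduced here, and the sizes are copied from the description above rather than from the figure itself.

```python
from typing import NamedTuple, Tuple

Size = Tuple[int, int]  # (width, height) of a PU


class LevelLimits(NamedTuple):
    max_log2_min_cu_size: int             # MaxLog2MinCUSize
    min_pu_sizes: Tuple[Size, ...]        # MinPUSize
    min_bipred_pu_sizes: Tuple[Size, ...]  # MinBipredPUSize


def level_limits(level_idc: int, th1: int, th2: int) -> LevelLimits:
    """Return the limits described for FIG. 84 (th1 and th2 are caller-supplied)."""
    if level_idc < th1:
        # No PU size limit; bi-prediction of 4x4 PU is disabled.
        return LevelLimits(3, ((4, 4),), ((8, 4), (4, 8)))
    if level_idc < th2:
        # 4x4 PU is unusable; bi-prediction of 4x4 PU is disabled.
        return LevelLimits(3, ((8, 4), (4, 8)), ((8, 4), (4, 8)))
    # 8x4, 4x8 and 4x4 PU are unusable; bi-prediction of 8x8 PU is disabled.
    return LevelLimits(4, ((8, 8),), ((16, 8),))


# Hypothetical numeric thresholds, chosen only for this example.
print(level_limits(30, th1=21, th2=50))
```

Keeping the PU size limit and the bi-predictive PU size limit in one record reflects that both switch at the same thresholds.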

FIG. 85 is another example of level regulation according to the present invention. This example is almost the same as the example of FIG. 84, except that when the level level_idc is less than the predetermined threshold TH0, the logarithmic value of the minimum CU size, the minimum PU size, and the minimum bi-predictive PU size are 3, 4 × 4, and 4 × 4, respectively, all without restriction. In this way, a level that imposes no restriction may be provided. The level restriction only constrains the encoded data. That is, it is not necessary to perform some or all of the following: skipping the decoding of the inter prediction flag, bi-single conversion of merge candidates, and skipping the derivation of bi-predictive merge candidates in merge candidate derivation for small PU sizes. Conversely, some or all of the skipping of the decoding of the inter prediction flag, the bi-single conversion of merge candidates, and the skipping of bi-predictive merge candidate derivation for small PU sizes may be used together with the level restriction.

[Action / Effect]
According to this level regulation, both the usable PU size and the usable bi-predictive PU size are limited according to the level, so the processing amount required for the filter processing for motion compensation and the transfer amount of reference pixels can be limited appropriately. When only the usable bi-predictive PU size is restricted, as in the level restriction of H.264/AVC, the usable PU size is not restricted, so there is a problem that uni-prediction still requires a large processing amount and transfer amount, as in the case of bi-prediction. That is, this level regulation avoids the imbalance in which the processing amount and transfer amount associated with bi-prediction are limited while those associated with uni-prediction are not.

Furthermore, according to this level regulation, both the usable PU size and the usable bi-predictive PU size are switched according to the level using the same threshold value. For example, when the minimum PU size MinPUSize changes across the threshold TH1, the minimum bi-predictive PU size MinBipredPUSize is changed using the same threshold TH1. By using the same threshold value for the usable PU size and the usable bi-predictive PU size in this way, the processing amount and the transfer amount can be limited appropriately in moving picture decoding means that supports decoding below a specific level. It also makes clear at which level the required limits change.

Furthermore, according to this level regulation, the following combinations of usable PU size and usable bi-predictive PU size are used.

4 × 4 PU restriction and 8 × 4 PU bi-prediction restriction
8 × 4 PU restriction and 8 × 8 PU bi-prediction restriction

With these combinations, the PU restriction and the bi-prediction restriction limit the processing amount and the transfer amount to the same extent, so they are well balanced.

(Description of configuration)
First, referring to FIGS. 62 to 70 in addition to FIG. 61, a configuration for determining a size for performing bi-prediction restriction will be described. The configuration shown in FIG. 61 is obtained by changing the bi-prediction restricted PU determination unit 1218 to the bi-prediction restricted PU determination unit 1218A in the configuration shown in FIG.

The bi-prediction restriction PU determination unit 1218A determines the bi-prediction restriction condition, and determines whether or not bi-prediction restriction that does not use bi-prediction is performed in the target PU. The bi-prediction restricted PU determination unit 1218A may independently determine whether to perform bi-prediction restriction on the skip CU and the merge PU and whether to perform bi-prediction restriction on the basic inter PU.

As in the configuration shown in FIG. 43, the above configuration may be one in which the bi-prediction / uni-prediction setting made in the skip motion compensation parameter derivation unit 1211 is decided in the bi-prediction conversion unit 1219A based on the determination of the bi-prediction restricted PU determination unit 1218A. The configuration is not limited to this; for example, the determination result of the bi-prediction restricted PU determination unit 1218A may be input directly to the skip motion compensation parameter derivation unit 1211, which then performs the bi-prediction / uni-prediction setting.

The determination of the bi-prediction limited condition performed by the bi-prediction limited PU determination unit 1218A is performed based on various flags and parameters set from the encoder side. Specifically, the bi-prediction restricted PU determination unit 1218A can be configured as exemplified in the following (1A) to (1E). In the following, each configuration example (1A) to (1E) will be described with reference to a syntax table and pseudo code. Note that the level regulation and each configuration example are not in conflict and can be used together.

(1A) Provide a flag indicating whether or not to restrict bi-prediction, and directly specify the size of the bi-prediction restriction

[Syntax table]
An example of a syntax table related to bi-prediction restriction will be described with reference to FIG. FIG. 62 is a diagram illustrating an example of a syntax table related to bi-prediction restriction. As shown in FIG. 62, log2_min_coding_block_size_minus3 (SYN621), inter_4x4_enabled_flag (SYN622), restrict_bipred_flag (SYN623), and log2_min_bipred_coding_block_size_minus3 (SYN625) are encoded in the RBSP (Raw Byte Sequence Payload) of the sequence parameter set of the encoded data.

log2_min_coding_block_size_minus3 is a flag for determining the minimum CU size. It stores the value obtained by subtracting 3 from the logarithmic value of the minimum CU size to be specified. For example, when the minimum CU size is 8 × 8, log2_min_coding_block_size_minus3 = 0, and when the minimum CU size is 16 × 16, log2_min_coding_block_size_minus3 = 1.

inter_4x4_enabled_flag is, as the name suggests, a flag indicating whether inter 4 × 4 PU is enabled; when it is 0, inter 4 × 4 PU is prohibited.

log2_min_coding_block_size_minus3 and inter_4x4_enabled_flag indicate the logarithmic value Log2MinCUSize of the minimum CU size and the availability of 4 × 4 PU, respectively, and constrain the usable PU sizes. The logarithmic value Log2MinCUSize of the minimum CU size is derived as log2_min_coding_block_size_minus3 + 3. For example, when Log2MinCUSize = 3, the minimum CU size is 8 × 8, and when Log2MinCUSize = 4, the minimum CU size is 16 × 16. When the minimum CU size is 8 × 8, the PUs obtained by dividing the minimum CU, namely 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU, can be used. However, when inter_4x4_enabled_flag is 0, 4 × 4 PU cannot be used.

Also, for example, when Log2MinCUSize = 4 (log2_min_coding_block_size_minus3 = 1), that is, when the minimum CU size is 16 × 16 CU, an 8 × 8 CU size cannot be used. Therefore, 8 × 8 PU can be used, but 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU, which are PUs obtained only by dividing 8 × 8, cannot be used.
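
The relationship between the two syntax elements and the usable PU sizes can be sketched as follows. The function name smallest_cu_pu_sizes is an assumption made for illustration, and the sketch only enumerates the symmetric partitions (2N × 2N, 2N × N, N × 2N, N × N) of the minimum CU; other partition types are outside its scope.

```python
from typing import List, Tuple


def smallest_cu_pu_sizes(log2_min_coding_block_size_minus3: int,
                         inter_4x4_enabled_flag: int) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (Log2MinCUSize, PU sizes obtainable from the minimum CU)."""
    log2_min_cu_size = log2_min_coding_block_size_minus3 + 3
    n = 1 << log2_min_cu_size   # side length of the minimum CU
    half = n // 2
    sizes = [(n, n), (n, half), (half, n)]
    # The N x N partition of an 8x8 CU is the 4x4 PU, which is usable
    # only when inter_4x4_enabled_flag is nonzero.
    if half > 4 or inter_4x4_enabled_flag:
        sizes.append((half, half))
    return log2_min_cu_size, sizes


print(smallest_cu_pu_sizes(0, 1))  # minimum CU 8x8, 4x4 PU enabled
print(smallest_cu_pu_sizes(0, 0))  # minimum CU 8x8, 4x4 PU prohibited
print(smallest_cu_pu_sizes(1, 0))  # minimum CU 16x16
```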

restrict_bipred_flag and log2_min_bipred_coding_block_size_minus3 are information on bi-prediction restriction.

restrict_bipred_flag is a flag indicating whether or not bi-prediction should be restricted. The value of this flag is determined in accordance with the level in the video encoding device 2. When restrict_bipred_flag is “1”, bi-prediction is restricted. When restrict_bipred_flag is “0”, bi-prediction is not restricted.

log2_min_bipred_coding_block_size_minus3 directly specifies the minimum CU size for which bi-prediction restriction is performed (hereinafter referred to as the minimum bi-prediction restriction CU size). The method of specifying the size in log2_min_bipred_coding_block_size_minus3 is the same as in log2_min_coding_block_size_minus3. Also, log2_min_bipred_coding_block_size_minus3 is decoded when restrict_bipred_flag is encoded (SYN625).

[Pseudo code]
Next, the operation of the bi-prediction restricted PU determination unit 1218A will be described using the pseudo code illustrated in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 63 will be described.

S631: The bi-prediction restricted PU determination unit 1218A determines whether or not restrict_bipred_flag is “0”.

S632: When restrict_bipred_flag is “0”, the bi-prediction restricted PU determination unit 1218A sets “0” in the DisableBipred variable. The DisableBipred variable is a variable indicating whether or not bi-prediction restriction is performed. When “0” is set in the DisableBipred variable, bi-prediction is not restricted. If the DisableBipred variable is set to “1”, bi-prediction restriction can be performed.

S633: On the other hand, when restrict_bipred_flag is not “0”, the bi-prediction restricted PU determination unit 1218A further determines whether or not Log2MinBipredCUSize is “3”. Here, Log2MinBipredCUSize = log2_min_bipred_coding_block_size_minus3 + 3. That is, in S633, it is determined whether or not the minimum CU size for performing bi-prediction restriction is 8 × 8 CU.

S634: When Log2MinBipredCUSize is “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted for a PU in which the logarithmic value (log2CUSize) of the CU size matches Log2MinBipredCUSize (= 3) and the PU mode is other than 2N × 2N.

That is, in S634, when the minimum bi-prediction restricted CU size is 8 × 8 CU, bi-prediction in a PU other than 8 × 8 PU (2N × 2N) is limited for 8 × 8 CU.

In S634, the operator “&&” indicates a logical product. That is, the left term of “&&” determines whether or not the logarithmic value (log2CUSize) of the CU size of the target block matches Log2MinBipredCUSize (here, “3”). The right term of “&&” determines whether the PU mode (PartMode) is not 2N × 2N. Note that “!=” is a relational operator indicating a “not equal” relationship.

S635: On the other hand, when Log2MinBipredCUSize is not “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted for a PU (minimum PU) in which the logarithmic value (log2CUSize) of the CU size matches Log2MinBipredCUSize and the PU mode is N × N.

That is, in S635, when the minimum bi-prediction restricted CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is restricted for a CU having a size that matches the minimum bi-prediction restricted CU size. In the figure, “restrict B” means “restrict bi-predictive prediction”.

Note that S635 may be modified as in the following S635 '.

S635 ': When Log2MinBipredCUSize is not "3", the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted “when the logarithmic value (log2CUSize) of the CU size matches Log2MinBipredCUSize and the PU mode is N × N” or “when the logarithmic value (log2CUSize) of the CU size is smaller than Log2MinBipredCUSize”.

In S635 ', in addition to the restriction in S635, bi-prediction is restricted for all modes when the CU size is smaller than the minimum bi-prediction restriction CU size.
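
Steps S631 to S635 and the variant S635 ' can be sketched as a single function. This is a sketch of the behavior described in the text, not a reproduction of FIG. 63; the function name disable_bipred_1a, the string constants for PartMode, and the use_s635_prime switch are assumptions introduced for illustration.

```python
PART_2Nx2N = "2Nx2N"
PART_NxN = "NxN"


def disable_bipred_1a(restrict_bipred_flag: int,
                      log2_min_bipred_coding_block_size_minus3: int,
                      log2_cu_size: int,
                      part_mode: str,
                      use_s635_prime: bool = False) -> bool:
    """Return the DisableBipred value for the target PU in configuration (1A)."""
    # S631, S632: no restriction when the flag is 0.
    if restrict_bipred_flag == 0:
        return False
    log2_min_bipred_cu_size = log2_min_bipred_coding_block_size_minus3 + 3
    at_min = log2_cu_size == log2_min_bipred_cu_size
    # S633, S634: minimum bi-prediction restricted CU size is 8x8.
    if log2_min_bipred_cu_size == 3:
        return at_min and part_mode != PART_2Nx2N
    # S635: only the N x N PU of the matching CU size is restricted.
    disable = at_min and part_mode == PART_NxN
    # S635': additionally restrict every mode of smaller CUs.
    if use_s635_prime:
        disable = disable or log2_cu_size < log2_min_bipred_cu_size
    return disable


# 8x8 CU split as 2N x N while the restricted CU size is 8x8.
print(disable_bipred_1a(1, 0, 3, "2NxN"))
```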

[Action / Effect]
In the present configuration example, the bi-prediction restriction can be adaptively performed according to the intention of the moving image encoding device 2 (information included in the encoded data).

The moving image encoding device 2 may encode information regarding the bi-prediction restriction in accordance with, for example, the resolution of the moving image and the performance of the moving image decoding device 1.

Thereby, the moving picture decoding apparatus 1 can finely adjust the bi-prediction restriction in accordance with the resolution of the moving picture and the performance of the moving picture decoding apparatus 1.

(1B) The size of the bi-prediction restriction is determined in conjunction with the minimum CU size without providing an additional flag related to the bi-prediction restriction

[Syntax table]
Another example of the syntax table related to bi-prediction restriction will be described with reference to FIG. FIG. 64 is a diagram illustrating another example of the syntax table regarding the bi-prediction restriction. As shown in FIG. 64, log2_min_coding_block_size_minus3 only needs to be encoded in the RBSP of the sequence parameter set of encoded data (SYN641).

[Pseudo code]
Next, the operation of the bi-prediction restricted PU determination unit 1218A will be described using the pseudo code illustrated in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 65 will be described.

S651: The bi-prediction restricted PU determination unit 1218A determines whether or not Log2MinCUSize is “3”. Here, Log2MinCUSize = log2_min_coding_block_size_minus3 + 3. That is, in S651, it is determined whether or not the minimum CU size is 8 × 8 CU.

S652: When Log2MinCUSize is “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted for a PU in which the logarithmic value (log2CUSize) of the CU size matches Log2MinCUSize (= 3) and the PU mode is other than 2N × 2N.

That is, in S652, when the minimum CU size is 8 × 8 CU, bi-prediction in a PU other than 8 × 8 PU (2N × 2N) is restricted for a CU having a size that matches the minimum CU size.

S653: On the other hand, when Log2MinCUSize is not "3", the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted for a PU (minimum PU) in which the logarithmic value (log2CUSize) of the CU size matches Log2MinCUSize and the PU mode is N × N.

That is, in S653, when the minimum CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is restricted for a CU having a size that matches the minimum CU size.
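
As a rough illustration of S651 to S653, the following sketch derives the restriction from the minimum CU size alone and lists which partition modes end up restricted. The function name disable_bipred_1b and the string labels for the partition modes are assumptions introduced here.

```python
def disable_bipred_1b(log2_min_coding_block_size_minus3: int,
                      log2_cu_size: int,
                      part_mode: str) -> bool:
    """Return the DisableBipred value for the target PU in configuration (1B)."""
    log2_min_cu_size = log2_min_coding_block_size_minus3 + 3
    # Only CUs whose size matches the minimum CU size are affected.
    if log2_cu_size != log2_min_cu_size:
        return False
    # S651, S652: minimum CU is 8x8, restrict every mode except 2N x 2N.
    if log2_min_cu_size == 3:
        return part_mode != "2Nx2N"
    # S653: otherwise restrict only the minimum PU (N x N).
    return part_mode == "NxN"


MODES = ("2Nx2N", "2NxN", "Nx2N", "NxN")
for minus3 in (0, 1):
    restricted = [m for m in MODES if disable_bipred_1b(minus3, minus3 + 3, m)]
    print(f"minimum CU {8 << minus3}x{8 << minus3}: restricted modes {restricted}")
```

No syntax element beyond log2_min_coding_block_size_minus3 is consulted, which is the point of this configuration example.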

[Action / Effect]
In this configuration example, bi-prediction is restricted according to the minimum CU size. Therefore, it is possible to restrict bi-prediction without additionally encoding information related to bi-prediction restriction.

(1C) A flag indicating whether or not to perform bi-prediction restriction is provided, and the size of the bi-prediction restriction is determined in conjunction with the minimum CU size

[Syntax table]
Another example of the syntax table related to bi-prediction restriction will be described with reference to FIG. FIG. 66 is a diagram illustrating another example of the syntax table regarding the bi-prediction restriction. As shown in FIG. 66, in the RBSP of the sequence parameter set of encoded data, log2_min_coding_block_size_minus3 (SYN661) and restrict_bipred_flag (SYN663) are encoded. Note that inter_4x4_enabled_flag may be encoded in the RBSP of the sequence parameter set of encoded data (SYN 662).

[Pseudo code]
Next, the operation of the bi-prediction restricted PU determination unit 1218A will be described using the pseudo code illustrated in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 67 will be described.

S671: The bi-prediction restricted PU determination unit 1218A determines whether or not restrict_bipred_flag is “0”.

S672: When restrict_bipred_flag is “0”, the bi-prediction restricted PU determination unit 1218A sets “0” in the DisableBipred variable.

S673: On the other hand, if restrict_bipred_flag is not “0”, the bi-prediction restricted PU determination unit 1218A further determines whether or not Log2MinCUSize is “3”.

S674: When Log2MinCUSize is “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

That is, in S674, similarly to S652, when the minimum CU size is 8 × 8 CU, bi-prediction in a PU other than 8 × 8 PU (2N × 2N) is restricted for a CU having a size that matches the minimum CU size.

S675: When Log2MinCUSize is not “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

That is, in S675, as in S653, when the minimum CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is restricted for a CU having a size that matches the minimum CU size.

[Action / Effect]
In this configuration example, bi-prediction restriction is performed in accordance with the minimum CU size, depending on the flag indicating whether or not bi-prediction restriction is performed. Therefore, bi-prediction can be restricted without additionally encoding information that directly specifies the minimum bi-prediction restriction CU size. In this configuration example, both the usable PU size and the usable bi-predictive PU size are controlled by the same Log2MinCUSize, so the processing amount and transfer amount associated with bi-prediction and those associated with uni-prediction can be limited in a balanced manner.

Note that this configuration example may be modified as shown below (1C ').

(1C ′) In 1C, the flag indicating whether or not bi-prediction restriction is to be performed is changed to a ternary flag

In the following, an example in which restrict_bipred_flag in 1C is changed to a ternary flag will be described using the pseudo code shown in FIG. 68. In this modification, it is assumed that restrict_bipred_flag can take 0, 1, and other values (for example, 2). Hereinafter, each step S of the pseudo code illustrated in FIG. 68 will be described.

S681: When restrict_bipred_flag is “0”, the bi-prediction restricted PU determination unit 1218A sets “0” in the DisableBipred variable.

S682: When restrict_bipred_flag is “1”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted “when the logarithmic value (log2CUSize) of the CU size matches Log2MinCUSize and the PU mode is other than 2N × 2N” or “when the logarithmic value (log2CUSize) of the CU size is smaller than Log2MinCUSize”.

S683: When restrict_bipred_flag is a value other than the above (for example, “2”), the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

Bi-prediction is restricted “when the logarithmic value (log2CUSize) of the CU size matches Log2MinCUSize and the PU mode is N × N” or “when the logarithmic value (log2CUSize) of the CU size is smaller than Log2MinCUSize”.

[Action / Effect]
According to this modification, restrict_bipred_flag is configured as a ternary flag, so that finer adjustment regarding bi-prediction restriction can be realized.

For example, in 16 × 16 CU, only bi-prediction limitation in 8 × 8 PU can be performed, or bi-prediction limitation in 8 × 8 PU, 16 × 8 PU, and 8 × 16 PU can be performed. Thus, according to the present modification, a wide range of options can be provided for bi-prediction restriction.
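
The three-way behavior of restrict_bipred_flag in this modification can be sketched as follows. The function name disable_bipred_1c_prime and the string labels for the partition modes are assumptions made for illustration; the conditions follow the step descriptions above.

```python
def disable_bipred_1c_prime(restrict_bipred_flag: int,
                            log2_min_cu_size: int,
                            log2_cu_size: int,
                            part_mode: str) -> bool:
    """Return the DisableBipred value when restrict_bipred_flag is ternary."""
    # Flag 0: bi-prediction is not restricted.
    if restrict_bipred_flag == 0:
        return False
    at_min = log2_cu_size == log2_min_cu_size
    below_min = log2_cu_size < log2_min_cu_size
    if restrict_bipred_flag == 1:
        # Wider restriction: every mode except 2N x 2N at the minimum CU size.
        return (at_min and part_mode != "2Nx2N") or below_min
    # Other values (for example 2): only the N x N PU at the minimum CU size.
    return (at_min and part_mode == "NxN") or below_min


# 16x16 minimum CU (Log2MinCUSize = 4): compare the two nonzero flag values.
for flag in (1, 2):
    hit = [m for m in ("2Nx2N", "2NxN", "Nx2N", "NxN")
           if disable_bipred_1c_prime(flag, 4, 4, m)]
    print(flag, hit)
```

With a 16 × 16 minimum CU, the value 2 restricts only the 8 × 8 PU, while the value 1 also restricts the 16 × 8 PU and the 8 × 16 PU.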

(1D) A flag indicating whether or not to perform bi-prediction restriction also serves as a flag indicating that inter 4 × 4 PU is prohibited

[Syntax table]
With reference to FIG. 69, another example of the syntax table regarding the bi-prediction restriction will be described. FIG. 69 is a diagram illustrating still another example of the syntax table for bi-prediction restriction. As illustrated in FIG. 69, log2_min_coding_block_size_minus3 (SYN691) and restrict_motion_compensation_flag (SYN692) are encoded in the RBSP of the sequence parameter set of encoded data.

Here, a comparison between the (1C) syntax table shown in FIG. 66 and the (1D) syntax table shown in FIG. 69 is as follows.

That is, in the RBSP of the sequence parameter set of encoded data according to this configuration example, instead of inter_4x4_enabled_flag (SYN 662) and restrict_bipred_flag (SYN 663) illustrated in FIG. 66, restrict_motion_compensation_flag (SYN 692) is encoded.

restrict_motion_compensation_flag is a combination of !inter_4x4_enabled_flag (“!” represents the logical negation operator) and restrict_bipred_flag. That is, restrict_motion_compensation_flag is a flag indicating whether or not inter 4 × 4 is prohibited, and is also a flag indicating whether or not bi-prediction should be restricted.

[Pseudo code]
Next, the operation of the bi-prediction restricted PU determination unit 1218A will be described using the pseudo code illustrated in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 70 will be described.

S701: The bi-prediction restricted PU determination unit 1218A determines whether or not restrict_motion_compensation_flag is “0”.

S702: When restrict_motion_compensation_flag is “0”, the bi-prediction restricted PU determination unit 1218A sets “0” in the DisableBipred variable.

S703: On the other hand, when restrict_motion_compensation_flag is not “0”, the bi-prediction restricted PU determination unit 1218A further determines whether or not Log2MinCUSize is “3”.

S704: When Log2MinCUSize is “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

That is, in S704, similarly to S674, when the minimum CU size is 8 × 8 CU, bi-prediction in PUs other than 8 × 8 PU (2N × 2N) is restricted for a CU having a size that matches the minimum CU size.

S705: When Log2MinCUSize is not “3”, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

That is, in S705, as in S675, when the minimum CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is restricted for a CU having a size that matches the minimum CU size.
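
The double role of restrict_motion_compensation_flag can be sketched as below: one flag yields both the inter 4 × 4 availability and the DisableBipred value. The function name apply_restrict_motion_compensation and the returned pair are assumptions introduced for illustration.

```python
from typing import Tuple


def apply_restrict_motion_compensation(restrict_motion_compensation_flag: int,
                                       log2_min_cu_size: int,
                                       log2_cu_size: int,
                                       part_mode: str) -> Tuple[bool, bool]:
    """Return (inter 4x4 PU enabled, DisableBipred) for configuration (1D)."""
    # The flag replaces !inter_4x4_enabled_flag: when set, inter 4x4 is prohibited.
    inter_4x4_enabled = restrict_motion_compensation_flag == 0
    # S701, S702: no bi-prediction restriction when the flag is 0.
    if restrict_motion_compensation_flag == 0:
        return inter_4x4_enabled, False
    at_min = log2_cu_size == log2_min_cu_size
    # S703, S704: minimum CU is 8x8, restrict every mode except 2N x 2N.
    if log2_min_cu_size == 3:
        return inter_4x4_enabled, at_min and part_mode != "2Nx2N"
    # S705: otherwise restrict only the minimum PU (N x N).
    return inter_4x4_enabled, at_min and part_mode == "NxN"


print(apply_restrict_motion_compensation(1, 3, 3, "Nx2N"))
```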

[Action / Effect]
In this configuration example, a flag indicating whether or not bi-prediction restriction is performed also serves as a flag indicating that inter 4 × 4 PU is prohibited. According to this configuration example, the number of flags can be reduced, and bi-prediction restriction can be realized relatively easily.

(1E) Determine the size of the bi-prediction restriction based on the level value

[Syntax table]
The syntax table of FIG. 64 is used. As shown in FIG. 64, level_idc need only be encoded in the RBSP of the sequence parameter set of encoded data (SYN642).

[Pseudo code]
Next, the operation of the bi-prediction restricted PU determination unit 1218A will be described using the pseudo code illustrated in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 86 will be described.

S861: The bi-prediction restricted PU determination unit 1218A determines whether the value of the level level_idc is less than a predetermined threshold value TH1.

S862: When the value of the level level_idc is less than TH1, the bi-prediction restricted PU determination unit 1218A performs no particular process.

S863: On the other hand, when the level level_idc value is not less than TH1, the bi-prediction restricted PU determination unit 1218A further determines whether or not the level level_idc value is less than a predetermined threshold value TH2.

S864: When the value of the level level_idc is less than TH2, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

That is, in S864, bi-prediction in PUs other than 8 × 8 PUs (2N × 2N) is limited for the 8 × 8 CU that is the minimum CU size.

S865: When the value of the level level_idc is not less than TH2, the bi-prediction restricted PU determination unit 1218A sets the DisableBipred variable as follows.

The bi-prediction is limited for a PU (minimum PU) in which the logarithmic value (log2CUSize) of the CU size matches Log2MinCUSize (= 4) and the PU mode is N × N.

That is, in S865, when the minimum CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is restricted for a CU having a size that matches the minimum CU size. In S864 and S865, the bi-prediction restricted PU determination unit 1218A may determine Log2MinCUSize with reference to MaxLog2MinCUSize in the table of FIG. 84 according to the value of the level level_idc. For example, as shown in FIG. 84, when the value of level_idc is equal to or greater than TH1 and less than TH2, MaxLog2MinCUSize = 3. Therefore, in S864, the bi-prediction restricted PU determination unit 1218A can use MaxLog2MinCUSize = 3 as the value of Log2MinCUSize. Similarly, in S865, the bi-prediction restricted PU determination unit 1218A can use MaxLog2MinCUSize = 4 as the value of Log2MinCUSize.
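
Steps S861 to S865 can be sketched as follows, taking Log2MinCUSize from the MaxLog2MinCUSize values stated for FIG. 84. The function name disable_bipred_1e is an assumption, the thresholds are passed in by the caller, and returning False in S862 assumes that DisableBipred starts at 0, which the text does not state explicitly.

```python
def disable_bipred_1e(level_idc: int, th1: int, th2: int,
                      log2_cu_size: int, part_mode: str) -> bool:
    """Return the DisableBipred value for the target PU in configuration (1E)."""
    # S861, S862: below TH1 no particular process is performed
    # (assumed here to leave DisableBipred at 0).
    if level_idc < th1:
        return False
    # S863, S864: TH1 <= level_idc < TH2, MaxLog2MinCUSize = 3.
    if level_idc < th2:
        log2_min_cu_size = 3
        return log2_cu_size == log2_min_cu_size and part_mode != "2Nx2N"
    # S865: level_idc >= TH2, MaxLog2MinCUSize = 4.
    log2_min_cu_size = 4
    return log2_cu_size == log2_min_cu_size and part_mode == "NxN"


# Hypothetical numeric thresholds, chosen only for this example.
TH1, TH2 = 21, 50
print(disable_bipred_1e(30, TH1, TH2, 3, "2NxN"))
print(disable_bipred_1e(51, TH1, TH2, 4, "NxN"))
```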

[Action / Effect]
In this configuration example, the size for performing bi-prediction restriction is changed according to the level value. According to this configuration example, it is possible to realize the restriction of bi-prediction according to the target environment indicated by the level without using a flag for performing the restriction of bi-prediction.

By configuring as shown in (1A) to (1E) above, the adaptive constraints and restrictions shown in FIG. 60 can be realized. As shown in FIGS. 84 and 85, the adaptive constraints and restrictions shown in FIG. 60 can also be realized by a level restriction that constrains the encoded data.

[2-3-3] Configuration for Performing Partial Bi-Prediction Restriction

Hereinafter, a configuration for performing partial bi-prediction restriction will be described with reference to FIG. 61 again. The configuration shown in FIG. 61 is obtained by changing the bi-prediction conversion unit 1219 to a bi-prediction conversion unit 1219A in the configuration shown in FIG.

The bi-prediction conversion unit 1219A performs partial bi-prediction restriction on the derived merge candidate. More specifically, it is as shown below.

First, the bi-prediction conversion unit 1219A acquires the merge candidates derived by the adjacent merge candidate derivation unit 1212A and the temporal merge candidate derivation unit 1212B, respectively.

Then, when the motion compensation parameter of an acquired merge candidate indicates bi-prediction and the bi-prediction restricted PU determination unit 1218A has determined that bi-prediction restriction is to be performed, the bi-prediction conversion unit 1219A restricts bi-prediction for at least some of the acquired merge candidates.

Here, one method of selecting the merge candidates for which bi-prediction is restricted is to select the first N merge candidates. The bi-prediction conversion unit 1219A may select, for example, the first one or the first two merge candidates (bi-prediction) from the adjacent merge candidates.

Alternatively, one or two of the adjacent merge candidates and one of the temporal merge candidates may be selected. The inventors' experiments have shown that, in bi-prediction of 8 × 8 PU, it is useful to perform bi-single conversion of temporal merge candidates, so a configuration in which a temporal merge candidate is included among the merge candidates subjected to bi-single conversion is effective.

The bi-prediction conversion unit 1219A then performs bi-single conversion on the selected merge candidates. Since the bi-single conversion is as described above, its description is omitted here.
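
The selection of the first N merge candidates for bi-single conversion can be sketched as follows. Several points are assumptions made only for this sketch: the MergeCandidate record and its field names, the choice to keep the L0 reference and drop L1 when converting to uni-prediction, and counting the first N positions of the list regardless of whether each candidate is bi-predictive.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class MergeCandidate:
    pred_flag_l0: bool
    pred_flag_l1: bool
    is_temporal: bool = False

    @property
    def is_bipred(self) -> bool:
        return self.pred_flag_l0 and self.pred_flag_l1


def bi_to_uni(cand: MergeCandidate) -> MergeCandidate:
    """Convert a bi-predictive candidate to uni-prediction (assumed: keep L0)."""
    return replace(cand, pred_flag_l1=False) if cand.is_bipred else cand


def convert_first_n(cands: List[MergeCandidate], n: int,
                    disable_bipred: bool) -> List[MergeCandidate]:
    """Apply bi-single conversion to the first n candidates when restricted."""
    if not disable_bipred:
        return list(cands)
    return [bi_to_uni(c) if i < n else c for i, c in enumerate(cands)]


cands = [MergeCandidate(True, True), MergeCandidate(True, False),
         MergeCandidate(True, True, is_temporal=True)]
print([c.is_bipred for c in convert_first_n(cands, 2, disable_bipred=True)])
```

To include the temporal merge candidate in the conversion, a caller could apply bi_to_uni to the candidates whose is_temporal field is set in addition to the first N.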

[Process flow]
The processing flow of the bi-predictive conversion unit 1219A will be described with reference to FIG. FIG. 71 is a flowchart illustrating an example of a process flow of the merge motion compensation parameter derivation unit 1212 and the bi-prediction conversion unit 1219A.

As shown in FIG. 71, when the adjacent merge candidate deriving unit 1212A derives an adjacent merge candidate (S711), the bi-predictive conversion unit 1219A performs bi-single conversion processing on the derived adjacent merge candidate (S714). In S714, when the bi-prediction restricted PU determination unit 1218A has determined that bi-prediction restriction is to be performed, the bi-prediction conversion unit 1219A performs bi-single conversion processing on the first N adjacent merge candidates. In FIG. 71, the dotted line indicates that S714 is executed in parallel with the processing of S711 to S713. In addition, the bi-predictive conversion unit 1219A sequentially acquires the derived merge candidates one by one from the adjacent merge candidate deriving unit 1212A. However, the present invention is not limited to this, and the bi-prediction conversion unit 1219A may acquire all the merge candidates derived in the adjacent merge candidate deriving unit 1212A at once.

Subsequently, when the temporal merge candidate deriving unit 1212B derives the temporal merge candidate (S712), the bi-predictive conversion unit 1219A performs bi-single conversion processing on the derived temporal merge candidate (S714). At this time, in S714, the bi-prediction conversion unit 1219A may perform bi-single conversion processing on the first N temporal merge candidates, or may perform bi-single conversion processing on all temporal merge candidates. In addition, the bi-single conversion process may be omitted.

Then, other merge candidates are derived (S713), and the process ends when the bi-single conversion process in the bi-predictive conversion unit 1219A is completed.

[Action / Effect]
The bi-prediction conversion unit 1219A sequentially acquires adjacent merge candidates from the adjacent merge candidate derivation unit 1212A and, when bi-prediction restriction is performed, performs bi-single conversion processing on the first N merge candidates.

According to the above configuration, the bi-single conversion process is performed for only some merge candidates, so the processing load of the bi-single conversion process is reduced compared with the case where it is performed for all merge candidates. In addition, if bi-single conversion is performed on at least one temporal merge candidate, then even when bi-prediction restriction is applied to an 8 × 8 PU, which is a relatively large PU, the temporal merge candidate of the 8 × 8 PU is guaranteed to be uni-predictive, that is, usable, so the decrease in coding efficiency can be minimized. In addition, since the merge candidate derivation process and the bi-single conversion process are executed in parallel, the processing can be performed efficiently.

(Modification)
Preferred modifications 1 to 3 of this configuration will be described.

(Modification 1)
The bi-prediction conversion unit 1219A may be configured to perform bi-single conversion processing after all merge candidates are derived and the merge candidate list is stored in the merge candidate storage unit 1212H.

Therefore, the configuration is changed as shown in FIG. The configuration shown in FIG. 72 is obtained by changing, in the configuration shown in FIG., the bi-prediction conversion unit to a bi-predictive conversion unit 1219B that performs bi-single conversion processing on the merge candidate list stored in the merge candidate storage unit 1212H, based on a merge candidate list completion notification from the merge candidate derivation control unit 1212G.

In addition, since the bi-single conversion process itself in the bi-predictive conversion unit 1219B is the same as that of the bi-predictive conversion unit 1219A, the description thereof is omitted.

[Process flow]
The process flow of the bi-predictive conversion unit 1219B shown in FIG. 72 will be described with reference to FIG. FIG. 73 is a flowchart illustrating an example of a processing flow of the merge motion compensation parameter derivation unit 1212 and the bi-prediction conversion unit 1219B.

Since S731 to S733 are the same as S101 to S103 shown in FIG. 44, description thereof is omitted.

If it is a B slice in S734 following S733 (YES in S734), the bi-prediction restricted PU determination unit 1218A determines whether or not bi-prediction restriction is performed (S735).

When bi-prediction restriction is to be performed (YES in S735), the bi-prediction conversion unit 1219B performs bi-single conversion on the first N merge candidates in the merge candidate list (S736). Then, after the bi-single conversion has been executed, the other merge candidates are derived (S737).

On the other hand, when the slice is not a B slice (NO in S734) or when bi-prediction restriction is not performed (NO in S735), other merge candidates are derived without performing bi-single conversion (S737).
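The branch structure of S734 to S737 can be sketched as follows. This is only an illustrative sketch: the `MergeCandidate` class, the function names, and the rule of keeping the L0 motion vector when a candidate is converted to uni-prediction are assumptions introduced here, not details given in this passage.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MotionVector = Tuple[int, int]


@dataclass
class MergeCandidate:
    mv_l0: Optional[MotionVector]  # None when list L0 is unused
    mv_l1: Optional[MotionVector]  # None when list L1 is unused

    def is_bi_pred(self) -> bool:
        return self.mv_l0 is not None and self.mv_l1 is not None


def bi_single_convert(cand: MergeCandidate) -> MergeCandidate:
    # Assumption: the converted candidate keeps L0 and drops L1.
    if cand.is_bi_pred():
        return MergeCandidate(cand.mv_l0, None)
    return cand


def convert_first_n(merge_list: List[MergeCandidate], is_b_slice: bool,
                    bi_pred_restricted: bool, n: int) -> List[MergeCandidate]:
    # S734 / S735: convert only for a B slice under bi-prediction restriction.
    if not (is_b_slice and bi_pred_restricted):
        return list(merge_list)
    # S736: only the first n entries of the list are converted.
    return [bi_single_convert(c) if i < n else c
            for i, c in enumerate(merge_list)]
```

With N = 2 and a list of three bi-prediction candidates, only the first two entries become uni-prediction; the third is left untouched.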

[Action / Effect]
When bi-prediction is restricted, after the merge candidate list has been generated in the merge candidate storage unit 1212H by executing S731 to S733, the bi-prediction conversion unit 1219B performs the bi-single conversion process on the first N merge candidates in the merge candidate list.

According to the above configuration, the bi-single conversion process is performed for some merge candidates, so that the processing load of the bi-single conversion process is reduced as compared with the case where the bi-single conversion process is performed for all merge candidates.

In addition, since S731 to S733 can be configured in the same manner as S101 to S103, the configuration of the merge candidate generation process does not need to be significantly changed from FIG. 44. Thus, a partial bi-prediction restriction process can be realized by a simple configuration change.

(Comparison)
Here, a comparison between the configuration shown in FIGS. 61 and 71 and the configuration shown in FIGS. 72 and 73 is as follows.

The configurations shown in FIGS. 61 and 71 are configurations in which merge candidate derivation processing and bi-single conversion processing are executed in parallel, as already described.

Therefore, a time chart for the series of processes is, for example, as shown in FIG. 74. In the time chart shown in FIG. 74, merge candidates A to E are derived. Of these, two merge candidates undergo bi-single conversion because of the bi-prediction restriction. Further, it is assumed that it takes more time to derive merge candidates C and E than to derive merge candidates A, B, and D.

As shown in FIG. 74, the merge candidates A and B are the first two of the merge candidates A to E and are therefore subject to bi-single conversion. The bi-single conversion of the merge candidates A and B is performed in parallel while the merge candidates C and E, which take a long time to process, are being derived. For this reason, in the example shown in FIG. 74, the list creation process starts when the derivation of the merge candidates C and D ends, and the entire process ends when the list creation process ends.

On the other hand, the configuration shown in FIGS. 72 and 73 is a configuration that performs bi-single conversion processing after the merge candidate list is generated.

Therefore, a time chart of the series of processes is as shown in FIG. 75, for example. In the time chart shown in FIG. 75, merge candidates A to E are derived as in FIG. 74, and two of these merge candidates undergo bi-single conversion because of the bi-prediction restriction. Further, it is assumed that it takes more time to derive merge candidates C and E than to derive merge candidates A, B, and D.

In the example shown in FIG. 75, after the merge candidates A to E are derived, the merge candidate list is completed, and then bi-single conversion is performed.

Therefore, comparing the examples shown in FIG. 74 and FIG. 75, the entire process completes later in the example shown in FIG. 75 than in the example shown in FIG. 74.

On the other hand, in the example shown in FIG. 75, it is only necessary to add the bi-single conversion process after the merge candidate list is created, and it is not necessary to change the logic from merge candidate derivation to merge candidate list creation.

In the example shown in FIG. 74, the uniqueness check is performed after the merge candidates that have undergone bi-single conversion are stored in the merge candidate storage unit 1212H as the merge candidate list. Therefore, compared with the example shown in FIG. 75, the uniqueness of the merge candidates included in the merge candidate list is maintained.

Moreover, although it is disadvantageous in terms of the time chart, a configuration combining the bi-prediction conversion unit 1219A and the bi-prediction conversion unit 1219B may be employed. That is, bi-single conversion may be performed at the time of merge candidate derivation (before the merge candidate list is stored) for temporal merge candidates only, and after the merge candidate list is stored for the other merge candidates. In this case, bi-single conversion can be performed with a simple configuration while ensuring that it is applied to the temporal merge candidate, which is effective under the bi-prediction restriction for 8 × 8 PUs.

(Modification 2)
In the bi-prediction conversion unit 1219B, bi-single conversion may be performed according to the number of uni-prediction candidates included in the merge candidate list.

Specifically, the bi-prediction conversion unit 1219B may count the number of uni-prediction candidates at the time of merge candidate derivation and perform bi-single conversion on the first N merge candidates only when the merge candidate list contains fewer than N uni-prediction candidates. Note that N is a positive integer, for example, N = 1.

According to the above configuration, when there are N or more uni-predictions in the merge candidate list, it is not necessary to perform bi-single conversion, so the load of merge candidate derivation processing can be reduced.
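The counting rule of Modification 2 can be sketched as follows. The dictionary representation of a merge candidate and the rule of keeping L0 when converting are assumptions made for illustration only.

```python
def is_uni_pred(cand):
    # A candidate is uni-prediction when exactly one reference list is used.
    return (cand["mv_l0"] is None) != (cand["mv_l1"] is None)


def to_uni_pred(cand):
    # Assumption: keep L0 and drop L1 when both lists are used.
    if cand["mv_l0"] is not None and cand["mv_l1"] is not None:
        return {"mv_l0": cand["mv_l0"], "mv_l1": None}
    return cand


def convert_if_needed(merge_list, n=1):
    uni_count = sum(1 for c in merge_list if is_uni_pred(c))
    if uni_count >= n:
        # Enough uni-prediction candidates already: skip the conversion.
        return list(merge_list)
    # Otherwise convert only the first n candidates.
    return [to_uni_pred(c) if i < n else c for i, c in enumerate(merge_list)]
```

With N = 1, a list that already contains one uni-prediction candidate is returned unchanged, whereas a list consisting only of bi-prediction candidates has its first entry converted.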

(Modification 3)
The bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may perform bi-single conversion depending on whether or not the two motion vectors of a bi-prediction merge candidate are non-integer motion vectors.

Here, a non-integer motion vector is a motion vector in which at least one component is a non-integer when pixel positions are expressed as integer values. In contrast, a motion vector whose components are all integers when pixel positions are expressed as integer values is called an integer motion vector.

For non-integer motion vectors, an interpolation filter for generating an interpolated image is applied, which increases the processing load and enlarges the reference pixel range necessary for motion compensation, so that the transfer amount tends to be high. On the other hand, in the case of an integer motion vector, such filter processing is not essential.

In the case of integer motion vectors, the reference range required for motion compensation is the same as that of the target block. Therefore, when an integer motion vector is included in bi-prediction, even if bi-prediction is performed, the transfer amount and the processing amount do not increase so much.

Therefore, the bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may omit bi-single conversion when at least one of the two motion vectors of a bi-prediction merge candidate is an integer motion vector.

That is, the bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may perform bi-single conversion only when the two motion vectors of the bi-prediction merge candidate are both non-integer motion vectors.

Alternatively, the bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may be configured as follows. That is, the bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may omit bi-single conversion when the two motion vectors of the bi-prediction merge candidate are both integer motion vectors.

In this case, the bi-prediction conversion unit 1219A or the bi-prediction conversion unit 1219B may perform bi-single conversion when at least one of the two motion vectors of the bi-prediction merge candidate is a non-integer motion vector.

According to the above configuration, since it is not necessary to perform bi-single conversion for all of bi-prediction, conversion of merge candidates related to bi-prediction restriction can be minimized.
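The two decision rules of Modification 3 can be sketched as follows. The sketch assumes that motion vector components are stored in quarter-pixel units, so that the lower 2 bits of a component hold its fractional part; the function names are hypothetical.

```python
def is_integer_mv(mv):
    # Quarter-pixel units: a component is an integer when its lower 2 bits are 0.
    return (mv[0] & 3) == 0 and (mv[1] & 3) == 0


def needs_conversion_both(mv_l0, mv_l1):
    # First variant: convert only when both vectors are non-integer.
    return not is_integer_mv(mv_l0) and not is_integer_mv(mv_l1)


def needs_conversion_any(mv_l0, mv_l1):
    # Second variant: convert when at least one vector is non-integer.
    return not is_integer_mv(mv_l0) or not is_integer_mv(mv_l1)
```

The first variant converts fewer candidates and so preserves more bi-prediction; the second variant exempts only candidates whose two motion vectors need no interpolation at all.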

[2-3-4] Conversion of Motion Vectors into Integers
Hereinafter, the conversion of motion vectors into integers will be described with reference to FIGS. 76 to 80. First, a configuration for converting a motion vector into an integer will be described with reference to FIG. 76. The configuration shown in FIG. 76 is obtained by replacing the bi-prediction conversion unit 1219A in the configuration described above with a motion vector integer converting unit 1220.

The motion vector integer converting unit 1220 converts at least one component of one or more non-integer components included in the non-integer motion vector into an integer component. Hereinafter, the conversion by the motion vector integer converting unit 1220 is referred to as motion vector integer conversion.

More specifically, when bi-prediction is restricted and a merge candidate input from the adjacent merge candidate derivation unit 1212A or the temporal merge candidate derivation unit 1212B is bi-prediction, the motion vector integer converting unit 1220 determines whether the two motion vectors of the bi-prediction are non-integer motion vectors. Then, when at least one of the two motion vectors is a non-integer motion vector, the motion vector integer converting unit 1220 converts that non-integer motion vector into an integer.

Next, specific examples of the integer conversion processing will be described with reference to FIGS. 77 to 80. FIGS. 77 to 80 are diagrams illustrating specific examples of the integer conversion processing in the motion vector integer converting unit 1220.

In the following, it is assumed that a motion vector is represented in two-dimensional coordinates as (X, Y). In the following description, for convenience of explanation, a non-integer motion vector refers to a motion vector in which both the X component and the Y component are non-integers. mv_Lx indicates the motion vector of list Lx (x = 0 or 1). Further, mv_Lx[0] indicates the X component of the motion vector, and mv_Lx[1] indicates the Y component of the motion vector.

[Make X coordinate integer]
The motion vector integer converting unit 1220 may convert the X coordinate of the motion vector into an integer as illustrated in FIG. 77. Hereinafter, each step S of the pseudo code illustrated in FIG. 77 will be described.

S771: It is determined whether or not the motion vector of L0 is a non-integer vector. The lower 2 bits of each coordinate component of the motion vector represent the fractional position. The expression “mv_L0[x] & 3” (x = 0 or 1) masks the coordinate component with “11” (3) to extract its lower 2 bits, and thereby determines whether the coordinate indicates a fractional position.

S772: When the motion vector of L0 is a non-integer vector, the X coordinate of the L0 motion vector is converted to an integer by setting the lower 2 bits of the X coordinate of L0 to “00”. Note that “~” is the bitwise negation operator, and “~3” is the bitwise negation of “11”, that is, a mask whose lower 2 bits are “00”. “&=” is the bitwise AND assignment operator. For example, “A &= B” means “A = A & B”.

S773: It is determined whether or not the motion vector of L1 is a non-integer vector.

S774: If the L1 motion vector is a non-integer vector, the X coordinate of the L1 motion vector is converted to an integer.
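Steps S771 to S774 can be sketched as follows. FIG. 77 itself is not reproduced in this text, so the sketch is a reconstruction from the step descriptions; in particular, testing that both components are fractional follows the convention stated above for a non-integer motion vector and is an assumption about the exact condition used in the figure.

```python
def is_non_integer(mv):
    # Both X and Y have a fractional part (lower 2 bits are non-zero).
    return (mv[0] & 3) != 0 and (mv[1] & 3) != 0


def integerize_x(mv_l0, mv_l1):
    mv_l0, mv_l1 = list(mv_l0), list(mv_l1)
    if is_non_integer(mv_l0):  # S771
        mv_l0[0] &= ~3         # S772: clear the lower 2 bits of X
    if is_non_integer(mv_l1):  # S773
        mv_l1[0] &= ~3         # S774
    return mv_l0, mv_l1
```

Clearing the lower 2 bits rounds toward negative infinity on the quarter-pixel grid, so an X component of 5 becomes 4 and an X component of -5 becomes -8.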

[Make Y coordinate integer]
The motion vector integer converting unit 1220 may convert the Y coordinate of the motion vector into an integer as illustrated in FIG. 78. Hereinafter, each step S of the pseudo code illustrated in FIG. 78 will be described.

S781: It is determined whether or not the motion vector of L0 is a non-integer vector.

S782: When the motion vector of L0 is a non-integer vector, the Y coordinate of the motion vector of L0 is converted to an integer.

S783: It is determined whether or not the motion vector of L1 is a non-integer vector.

S784: If the L1 motion vector is a non-integer vector, the Y coordinate of the L1 motion vector is converted to an integer.

[X and Y coordinates are converted into integers]
Further, the motion vector integer converting unit 1220 may convert the X coordinate and Y coordinate of the motion vector into integers as illustrated in FIG. 79. Hereinafter, each step S of the pseudo code illustrated in FIG. 79 will be described.

S791: It is determined whether or not the motion vector of L0 is a non-integer vector.

S792: When the motion vector of L0 is a non-integer vector, the X coordinate and the Y coordinate of the motion vector of L0 are converted into integers.

S793: It is determined whether or not the motion vector of L1 is a non-integer vector.

S794: When the motion vector of L1 is a non-integer vector, the X coordinate and the Y coordinate of the motion vector of L1 are converted into integers.

[X and Y coordinates are converted to integers for only one list]
Further, as illustrated in FIG. 80, the motion vector integer converting unit 1220 may convert the X coordinate and Y coordinate of the motion vector of one list into integers. Hereinafter, each step S of the pseudo code illustrated in FIG. 80 will be described.

S801: It is determined whether or not the motion vector of LX is a non-integer vector (where X = 0 or 1).

S802: If the LX motion vector is a non-integer vector, the X and Y coordinates of the LX motion vector are converted to integers.
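The remaining variants differ only in which components and which list are modified. A minimal parameterized sketch is given below; the function names and the `components` parameter are hypothetical, and the non-integer test again assumes that both components must be fractional.

```python
def integerize(mv, components=(0, 1)):
    """Return mv with the listed components (0 = X, 1 = Y) made integer."""
    out = list(mv)
    if (out[0] & 3) != 0 and (out[1] & 3) != 0:  # non-integer vector test
        for c in components:
            out[c] &= ~3
    return out


def integerize_one_list(mv_by_list, target_list):
    # Only the motion vector of list LX (target_list = 0 or 1) is modified.
    result = [list(mv) for mv in mv_by_list]
    result[target_list] = integerize(mv_by_list[target_list])
    return result
```

Passing `components=(1,)` corresponds to converting only the Y coordinate, the default corresponds to converting both coordinates, and `integerize_one_list` corresponds to restricting the conversion to a single list.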

[Action / Effect]
As already described, in the case of an integer motion vector, it is not necessary to perform filter processing using an interpolation filter. Therefore, the reference range referenced in motion compensation matches the target block. For this reason, even if bi-prediction is performed, the processing amount of filter processing and the transfer amount of reference pixels do not increase so much.

Therefore, when at least one of the bi-prediction motion vectors is a non-integer motion vector in which both the X coordinate and the Y coordinate are non-integer components, converting at least one of those non-integer components into an integer component may suppress the processing amount and the transfer amount compared with the case where the non-integer motion vector is processed as it is.

Also, if the non-integer motion vector is converted to an integer motion vector, the reference range matches the target block, so that the processing amount and the transfer amount can be further reduced.
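The effect on the reference range can be illustrated numerically. The sketch below assumes a separable interpolation filter with `taps` taps (8 is used as a default purely for illustration; the filter length is not stated in this passage), which needs taps - 1 extra pixels along each axis whose motion vector component is fractional.

```python
def reference_block_size(width, height, mv, taps=8):
    # Assumption: taps - 1 extra pixels are fetched along each axis whose
    # motion vector component has a fractional part (quarter-pixel units).
    extra_x = taps - 1 if (mv[0] & 3) else 0
    extra_y = taps - 1 if (mv[1] & 3) else 0
    return width + extra_x, height + extra_y
```

Under these assumptions, an 8 × 8 block needs a 15 × 15 reference region when both components are fractional, a 15 × 8 region when only X is fractional, and exactly 8 × 8 when the motion vector is an integer motion vector.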

Note that the motion vector integer converting unit 1220 may also apply the integer conversion processing to a non-integer motion vector in which one of the X coordinate and the Y coordinate is an integer component.

(3) Details of TU Information Decoding Unit
Next, configuration examples of the TU information decoding unit 13 and the decoding module 10 will be described with reference to FIG. 16. FIG. 16 is a functional block diagram illustrating the configuration for performing the TU partition decoding process, the transform coefficient decoding process, and the prediction residual derivation process in the moving image decoding apparatus 1, that is, the configurations of the TU information decoding unit 13 and the decoding module 10.

Hereinafter, the configuration of each unit will be described in the order of the TU information decoding unit 13 and the decoding module 10.

[TU information decoding unit]
As illustrated in FIG. 16, the TU information decoding unit 13 includes a TU partition setting unit 131 and a transform coefficient restoration unit 132.

The TU partition setting unit 131 sets the TU partition method based on the parameters decoded from the encoded data, the CU size, and the PU partition type. Further, the transform coefficient restoration unit 132 restores the prediction residual of each TU according to the TU partition set by the TU partition setting unit 131.

[TU division setting section]
First, the details of the TU partition setting unit 131 will be described with reference to FIG. More specifically, the TU division setting unit 131 includes a target region setting unit 1311, a division determination unit 1312, a division region setting unit (conversion unit division unit, division unit) 1313, and a conversion size determination information storage unit 1314.

The target area setting unit 1311 sets a target node that is a target area. The target area setting unit 1311 sets the entire target CU as the initial value of the target area when the TU partitioning process is started for the target conversion tree. The division depth is set to “0”.

The division determination unit 1312 uses the region division flag decoding unit 1031 to decode information (split_transform_flag) indicating whether or not to divide the target node set by the target region setting unit 1311, and decides, based on the decoded information, whether to divide the target node.

The divided region setting unit 1313 sets a divided region for the target node determined to be divided by the division determination unit 1312. Specifically, the divided region setting unit 1313 adds 1 to the division depth for the target node determined to be divided, and divides the target node based on the conversion size determination information stored in the conversion size determination information storage unit 1314.

Note that each target node obtained by the division is further set as a target region by the target region setting unit 1311.

That is, in TU partitioning, the series of processes “target region setting”, “division determination”, and “divided region setting” is repeated recursively by the target region setting unit 1311, the division determination unit 1312, and the divided region setting unit 1313 for each target node obtained by division.
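The recursion described above can be sketched as follows. The callbacks `read_split_flag` and `split_node` are hypothetical stand-ins for decoding split_transform_flag and for the lookup in the conversion size determination information, respectively; a plain square quadtree split is used as the example split rule.

```python
def partition_tu(node, depth, read_split_flag, split_node, leaves):
    """Recursively partition `node`; leaf TUs are appended to `leaves`."""
    # Division determination: decode split_transform_flag for this node.
    if read_split_flag(node, depth):
        # Divided region setting: depth + 1, then split according to the table.
        for child in split_node(node, depth + 1):
            # Each child becomes the next target region.
            partition_tu(child, depth + 1, read_split_flag, split_node, leaves)
    else:
        leaves.append((node, depth))


def square_quadtree(node, depth):
    # Example split rule: four equal squares in z-order.
    x, y, w, h = node
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
```

For example, starting from a 32 × 32 target CU at depth 0 with a flag source that splits only at depth 0, the recursion yields four 16 × 16 leaf TUs at depth 1.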

The conversion size determination information storage unit 1314 stores conversion size determination information indicating the division method of the target node. Specifically, the transform size determination information is information that defines the correspondence between the CU size, the TU partition depth (trafoDepth), the PU partition type of the target PU, and the TU partition pattern.

Here, a specific configuration example of the conversion size determination information will be described with reference to FIG. 17. In the transform size determination information shown in FIG. 17, TU partition patterns are defined according to the CU size, the TU partition depth (trafoDepth), and the PU partition type of the target PU. Note that “d” in the table indicates the division depth of the CU.

In the conversion size determination information, four sizes of 64 × 64, 32 × 32, 16 × 16, and 8 × 8 are defined as CU sizes.

In the conversion size determination information, selectable PU partition types are defined according to the size of the CU.

When the size of the CU is 64 × 64, 32 × 32, or 16 × 16, any of 2N × 2N, 2N × nU, 2N × nD, N × 2N, nL × 2N, and nR × 2N can be selected as the PU partition type.

Also, when the size of the CU is 8 × 8, any of 2N × 2N, 2N × N, and N × 2N can be selected as the PU partition type.

Also, in the transform size determination information, a TU partition pattern at each TU partition depth is defined according to the CU size and PU partition type.

For example, when the size of the CU is 64 × 64, it is as follows. First, the TU division depth “0” is not defined, and a 64 × 64 CU is forcibly divided (indicated by * 1 in FIG. 17). This is because the maximum size of the conversion unit is defined as 32 × 32.

TU partition depths “1” and “2” define different TU partition patterns depending on whether only square quadtree partitioning is included or only non-square quadtree partitioning is included.

In the case of a TU partition pattern in which the PU partition type is 2N × 2N and which includes only square quadtree partitioning, a 32 × 32 square quadtree partition is defined at TU partition depth “1”, and a 16 × 16 square quadtree partition is defined at TU partition depth “2”.

The definition for the case where the PU partition type is any of 2N × 2N, 2N × nU, 2N × nD, N × 2N, nL × 2N, and nR × 2N and the pattern includes only non-square quadtree partitioning is as follows.

First, a 32 × 32 square quadtree partition is defined at TU partition depth “1”. Subsequently, at TU partition depth “2”, a 32 × 8 non-square quadtree partition is defined for the PU partition types 2N × 2N, 2N × nU, and 2N × nD, and an 8 × 32 non-square quadtree partition is defined for the PU partition types N × 2N, nL × 2N, and nR × 2N.
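The 64 × 64 entries described above can be written as a small lookup function. This is a sketch covering only the combinations stated in the text; FIG. 17 is not reproduced here, and the function name, the string labels for PU partition types, and the `non_square` flag are hypothetical.

```python
HORIZONTAL_PU = {"2Nx2N", "2NxnU", "2NxnD"}
VERTICAL_PU = {"Nx2N", "nLx2N", "nRx2N"}


def tu_size_64x64(pu_type, depth, non_square):
    """TU size (width, height) for a 64x64 CU; None means 'not defined'."""
    if depth == 0:
        return None  # *1: a 64x64 CU is always divided
    if depth == 1:
        return (32, 32)
    if depth == 2:
        if not non_square and pu_type == "2Nx2N":
            return (16, 16)
        if non_square and pu_type in HORIZONTAL_PU:
            return (32, 8)
        if non_square and pu_type in VERTICAL_PU:
            return (8, 32)
    raise ValueError("combination not described in the text")
```

Returning `None` at depth 0 models the forced division caused by the 32 × 32 maximum transform size.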

Furthermore, the case where the size of the CU is 8 × 8 is as follows. When the size of the CU is 8 × 8, the selectable PU partition types are 2N × 2N, 2N × N, and N × 2N. For each PU partition type, an 8 × 8 square quadtree partition is defined at TU partition depth “1”, and a 4 × 4 square quadtree partition is defined at TU partition depth “2”. Note that TU partition depth “3” is not defined, and the node is forcibly left undivided (indicated by * 2 in FIG. 17).

Here, the details of the TU partitioning in the TU partition setting unit 131 will be described with reference to FIG. 20. FIG. 20 is a diagram illustrating an example of TU partitioning when the CU size is 32 × 32 and the PU partition type is 2N × N.

First, when the TU partitioning process is started, the target area setting unit 1311 sets the entire target CU as an initial value of the target area and sets depth = 0. When depth = 0, the PU boundary B1 is indicated by a dotted line at the center in the region vertical direction.

Next, the division determination unit 1312 determines whether or not the target node needs to be split based on information (split_transform_flag) indicating whether or not the target node is to be split.

Since split = 1, the division determination unit 1312 determines to divide the target node.

The division area setting unit 1313 adds 1 to the depth, and sets a TU division pattern for the target node based on the conversion size determination information. The divided region setting unit 1313 performs TU division with depth = 1 for the target CU that is the target region.

According to the definition of the transform size determination information shown in FIG. 17, the divided region setting unit 1313 quadtree-partitions the target node into 32 × 8 regions when depth = 1.

Thereby, the target node is divided into four horizontally long rectangular regions TU0, TU1, TU2, and TU3 by the division method shown in FIG.

Further, the target area setting unit 1311 sets each node of TU0, TU1, TU2, and TU3 as target areas in order at the division depth of depth = 1.

Here, since split = 1 is set for TU1, the division determination unit 1312 determines to divide TU1.

The division area setting unit 1313 executes TU division with depth = 2 for TU1. In accordance with the definition of the transform size determination information illustrated in FIG. 17, the divided region setting unit 1313 divides the target node into a 16 × 4 region by quadtree division when depth = 2.

As a result, the target node TU1 is divided into four horizontally long rectangular areas TU1-0, TU1-1, TU1-2, and TU1-3 by the division method shown in FIG.
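The geometry of this example can be checked with a few lines of code. The stacking of the 32 × 8 regions follows from their size; the 2 × 2 arrangement of the 16 × 4 regions inside TU1 is an assumption, since the corresponding figure is not reproduced in this text. Rectangles are written as (x, y, width, height).

```python
def split_into_rows(x, y, w, h):
    # depth = 1: four horizontally long strips stacked vertically.
    return [(x, y + i * (h // 4), w, h // 4) for i in range(4)]


def split_2x2(x, y, w, h):
    # depth = 2 (assumed layout): 2 x 2 grid of half-width, half-height tiles.
    hw, hh = w // 2, h // 2
    return [(x + dx, y + dy, hw, hh) for dy in (0, hh) for dx in (0, hw)]


tus = split_into_rows(0, 0, 32, 32)  # TU0..TU3, each 32x8
tu1_children = split_2x2(*tus[1])    # TU1-0..TU1-3, each 16x4
```

The strips begin at y = 0, 8, 16, and 24, so none of them crosses the PU boundary B1 at y = 16.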

[3-1] Example of Configuration for Deriving Partition Region Size When PU Partition Type Is Asymmetric
When the PU partition type is asymmetric, the divided region setting unit 1313 may be configured to apply a rectangular (non-square) transform in the smaller PU, while applying a square transform in at least part of the larger PU.

For example, in the conversion size determination information stored in the conversion size determination information storage unit 1314, a small PU size 1314A and a large PU size 1314B, which are definition information, are provided.

The small PU size 1314A is defined such that rectangular transformation is applied to a PU having a small size among PUs divided asymmetrically.

The large PU size 1314B is defined such that square transformation is applied to a PU having a large size among PUs asymmetrically divided.

Also, the divided region setting unit 1313 sets a divided region by referring to either the definition information of the small PU size 1314A or the definition information of the large PU size 1314B according to the size of the asymmetrically divided PU.

The TU partitioning according to the above configuration example will be described with reference to FIG. 21.

FIG. 21 is a diagram illustrating an example of TU partitioning according to the above configuration example when the PU partition type is 2N × nU.

First, when the TU partitioning process is started, the target area setting unit 1311 sets the entire target CU as an initial value of the target area and sets depth = 0. When depth = 0, the PU boundary B2 is indicated by a dotted line above the center in the region vertical direction.

Here, the divided region setting unit 1313 performs a horizontally-long rectangular TU partition for a small size PU among the PUs asymmetrically divided according to the small PU size 1314A.

Further, the divided region setting unit 1313 includes a square TU partition for a large size PU among the PUs asymmetrically divided according to the large PU size 1314B. Note that the divided region setting unit 1313 may include a rectangular TU divided region as shown in FIG. 21 for the region located on the PU boundary side.

As a result, the divided region setting unit 1313 quadtree-partitions the target node into two rectangular nodes and two square nodes when depth = 1.

Thereby, the target node is divided into four regions of horizontally long rectangular TU0 and TU1 and square TU2 and TU3.

Here, since split = 1 is set for TU0 and TU2, the division determination unit 1312 determines to divide TU0 and TU2.

The division region setting unit 1313 executes TU division with depth = 2 for TU0 and TU2. The division region setting unit 1313 TU-divides TU0 into four horizontally long rectangles and TU-divides TU2 into four squares when depth = 2.

Thus, TU0 is divided into four horizontally long rectangular areas, TU0-0, TU0-1, TU0-2, and TU0-3. In addition, TU2 is divided into four square areas, TU2-0, TU2-1, TU2-2, and TU2-3.

When performing TU partitioning in a CU with an asymmetric PU partition type as described above, it is preferable to partition so that no partition crosses the PU boundary and the areas of the resulting TU partition regions are the same.
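The depth = 1 partition for the 2N × nU case can be sketched as follows. The coordinates are inferred from the constraints stated above (TU0 and TU1 are horizontally long rectangles, TU2 and TU3 are squares, no TU crosses the PU boundary, and all areas are equal); FIG. 21 is not reproduced here, and the 32 × 32 CU size used in the usage note is an assumption.

```python
def split_2NxnU_depth1(cu_size):
    """Depth-1 TUs (x, y, w, h) for a 2NxnU CU: two strips, then two squares."""
    q = cu_size // 4  # the PU boundary lies at y = q
    h = cu_size // 2
    return [
        (0, 0, cu_size, q),  # TU0: strip covering the small upper PU
        (0, q, cu_size, q),  # TU1: strip on the PU-boundary side of the large PU
        (0, h, h, h),        # TU2: square, lower left
        (h, h, h, h),        # TU3: square, lower right
    ]


def crosses(tu, boundary_y):
    # True when the horizontal line y = boundary_y cuts through the TU.
    _, y, _, h = tu
    return y < boundary_y < y + h
```

For a 32 × 32 CU this gives two 32 × 8 strips and two 16 × 16 squares, each covering 256 pixels, with the PU boundary at y = 8 coinciding with the edge between TU0 and TU1.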

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention generates a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained by dividing the coding unit into one or more parts, and restores an image by adding the prediction residual to the predicted image, wherein the division types into prediction units include asymmetric division, which divides a coding unit into a plurality of prediction units of different sizes, and symmetric division, which divides a coding unit into a plurality of prediction units of the same size, and the apparatus comprises transform unit dividing means for determining, when the division type of the target coding unit, which is the coding unit to be decoded, is asymmetric division, a transform unit division method according to the size of the prediction unit included in the target coding unit.

Therefore, when the division type is asymmetric division, it is possible to select a conversion unit division method that can efficiently remove the correlation according to the size of the prediction unit included in the target coding unit.

[3-2] Example of Configuration in Which Non-Square Transform Is Applied When PU Partition Type Is Square Partition in Some CU Sizes
[Configuration Example 3-2-1]
When the PU division type is square division, the division region setting unit 1313 may divide the target node into non-squares.

For this purpose, the transform size determination information stored in the transform size determination information storage unit 1314 may define a square PU partition type 1314C, which specifies that the target node is divided into non-squares when the PU partition type is square partition.

Then, when the PU partition type is square partition, the partition region setting unit 1313 refers to the square PU partition type 1314C and divides the target node into a non-square.

When the CU size is 32 × 32 size and the PU partition type is 2N × 2N, the partition region setting unit 1313 may divide the region into 32 × 8 nodes in the TU partition.

In addition, when the CU size is 32 × 32 and the PU partition type is 2N × 2N, the divided region setting unit 1313 may additionally decode information indicating the TU partitioning scheme and, based on the decoded information, divide the target node into any of 32 × 8, 16 × 16, and 8 × 32 nodes.

Also, when the CU size is 32 × 32 and the PU partition type is 2N × 2N, the divided region setting unit 1313 may estimate the TU size used to divide the target CU based on the sizes and PU partition types of adjacent CUs. Further, the divided region setting unit 1313 may perform the estimation as described in (i) to (iii) below.

(i) If there is a CU boundary or PU boundary on the left side and no CU boundary or PU boundary on the upper side, 32 × 8 is selected.

(ii) If there is a CU boundary or PU boundary on the upper side and no CU boundary or PU boundary on the left side, 8 × 32 is selected.

(iii) In cases other than (i) and (ii) above (when there is a boundary on both the left side and the upper side, or on neither), 16 × 16 is selected.
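The three estimation rules can be sketched as a single function. The function name and the boolean inputs are hypothetical; each input states whether a CU boundary or PU boundary exists on that side of the target CU.

```python
def estimate_tu_size(left_boundary, upper_boundary):
    """Estimated TU size (width, height) for a 32x32 CU with a 2Nx2N PU."""
    if left_boundary and not upper_boundary:
        return (32, 8)   # rule (i)
    if upper_boundary and not left_boundary:
        return (8, 32)   # rule (ii)
    return (16, 16)      # rule (iii): boundaries on both sides or on neither
```

Because rules (i) and (ii) each cover exactly one of the single-sided cases, the fallback in rule (iii) is reached only when both boundaries exist or neither does.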

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention generates a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained by dividing the coding unit into one or more parts, and restores an image by adding the prediction residual to the predicted image, wherein the division methods into transform units include square division and rectangular division, and the apparatus comprises dividing means for dividing a target transform unit by the rectangular division method when the shape of the target prediction unit, which is the prediction unit to be decoded, is a square.

The actions and effects of the above configuration are as follows. A square prediction unit may be selected even though an edge exists in the region and the image has directionality. For example, when an object having a large number of horizontal edges is moving, a square prediction unit is selected because the motion is uniform within the object. In such a case, however, it is desirable in the transform process to apply a transform unit whose shape is long in the horizontal direction, along the horizontal edges.

According to the above configuration, when the shape of the target prediction unit, which is the prediction unit to be decoded, is a square, the target transform unit is divided by the rectangular division method.

Therefore, a rectangular transform unit can be selected even for a square prediction unit, and thus the coding efficiency for a region such as that described above can be improved.

[Configuration Example 3-2-2]
In addition to Configuration Example 3-2-1 above, the divided region setting unit 1313 performs division as follows at each division depth when the CU size is 16 × 16 and the PU partition type is 2N × 2N.

Dividing depth = 1 ... Divide into 16x4 TUs.

Dividing depth = 2 ... Divide into 4x4 TUs.

According to the above configuration, the scan order of the 4 × 4 TUs is unified regardless of the PU partition type of the 16 × 16 CU. If the scan order of the 4 × 4 TUs differed depending on the PU partition type of the 16 × 16 CU, the scan processing would have to be switched according to that PU partition type, which complicates the processing. Such non-uniformity in scan order can therefore become a processing bottleneck.

According to the above configuration, the processing can be simplified by unifying the scan order.

More specific description will be made with reference to FIGS. 22 and 23 as follows. First, the TU division shown in FIG. 22 will be described. FIG. 22 shows the flow of TU partitioning when partitioning is performed according to the transform size determination information shown in FIG.

FIG. 22A shows the case where the PU partition type is 2N × 2N. At depth = 1, the divided region setting unit 1313 performs square quadtree partitioning of the target node. At depth = 2, the divided region setting unit 1313 further performs square quadtree partitioning on each node divided into squares. Here, a recursive z-scan is used as the scan order. Specifically, it is as illustrated in FIG.

FIG. 22B shows the case where the PU partition type is 2N × nU. At depth = 1, the divided region setting unit 1313 performs horizontally long rectangular quadtree partitioning on the target node. At depth = 2, the divided region setting unit 1313 further performs square quadtree partitioning on each node divided into horizontally long rectangles. Here, raster scan is used as the scan order of each TU. Specifically, it is as illustrated in FIG.

Next, the TU division shown in FIG. 23 will be described. FIG. 23 shows the flow of TU partitioning when a region of PU partition type 2N × 2N is partitioned according to the square PU partition type 1314C.

PU partition type: 2N × 2N ... When depth = 1, the partition region setting unit 1313 performs horizontally-long rectangular quadtree partitioning on the target node. In addition, at depth = 2, the divided region setting unit 1313 further performs square quadtree division on each node divided into horizontally long rectangles.

As a result, raster scan is used as the scan order. Therefore, the scan order can be unified to the raster scan between the case where the PU division type is 2N × nU and the case where it is 2N × 2N.
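
The effect on the scan order can be illustrated with a small sketch. The Python below is only an illustration under assumptions: the function names and the (x, y, width, height) tuple representation are hypothetical and do not appear in this specification. It splits a 16 × 16 CU into four 16 × 4 TUs and then into 4 × 4 TUs, and compares the resulting visiting order with that of an ordinary square quadtree split.

```python
def split_horizontal_strips(x, y, w, h):
    """Depth 1: split a square node into four horizontally long TUs (16x16 -> 16x4)."""
    return [(x, y + i * (h // 4), w, h // 4) for i in range(4)]


def split_strip_into_squares(x, y, w, h):
    """Depth 2: split a horizontally long TU into four squares (16x4 -> 4x4)."""
    return [(x + i * (w // 4), y, w // 4, h) for i in range(4)]


def split_square_quadtree(x, y, w, h):
    """Ordinary square quadtree split; children are visited in z order."""
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def leaves_rectangular_first(cu_size=16):
    """4x4 TUs in visiting order when depth 1 uses horizontally long TUs."""
    leaves = []
    for strip in split_horizontal_strips(0, 0, cu_size, cu_size):
        leaves.extend(split_strip_into_squares(*strip))
    return leaves


def leaves_square_only(cu_size=16):
    """4x4 TUs in visiting order when both depths use square quadtree splits."""
    leaves = []
    for node in split_square_quadtree(0, 0, cu_size, cu_size):
        leaves.extend(split_square_quadtree(*node))
    return leaves


def is_raster_order(leaves):
    """True if the leaves are visited row by row, left to right."""
    return leaves == sorted(leaves, key=lambda r: (r[1], r[0]))
```

In this sketch the rectangular-first split visits the 4 × 4 TUs in raster order, whereas the square-only split visits them in recursive z order.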

[Transform coefficient restoration unit]
Next, details of the transform coefficient restoration unit 132 will be described with reference to FIG. 16 again. More specifically, the transform coefficient restoration unit 132 includes a non-zero coefficient determination unit 1321 and a transform coefficient derivation unit 1322.

The non-zero coefficient determination unit 1321 uses the determination information decoding unit (coefficient decoding unit) 1032 to decode the presence / absence information of non-zero transform coefficients for each TU or transform tree included in the target CU, and determines whether a non-zero transform coefficient exists in each TU.

The transform coefficient deriving unit 1322 uses the transform coefficient decoding unit (coefficient decoding unit) 1033 to restore the transform coefficients of each TU in which a non-zero transform coefficient exists, and sets the transform coefficients of each TU in which no non-zero transform coefficient exists to 0 (zero).
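
The division of work between the two units can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function name, the dictionary of flags, and the `decode_coefficients` callable are hypothetical stand-ins for the non-zero coefficient determination and the transform coefficient decoding described above.

```python
def restore_transform_coefficients(tus, has_nonzero, decode_coefficients):
    """Restore coefficients for each TU of a CU.

    tus: list of (tu_id, number_of_coefficients) pairs.
    has_nonzero: maps tu_id to the decoded presence flag (hypothetical container).
    decode_coefficients: callable standing in for transform coefficient decoding.
    """
    restored = {}
    for tu_id, count in tus:
        if has_nonzero[tu_id]:
            # A non-zero coefficient exists, so the coefficients are decoded.
            restored[tu_id] = decode_coefficients(tu_id, count)
        else:
            # No non-zero coefficient exists, so every coefficient is set to 0.
            restored[tu_id] = [0] * count
    return restored
```

Note that in this sketch the coefficient decoder is never invoked for a TU whose flag indicates that no non-zero coefficient exists.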

[Decoding module]
As illustrated in FIG. 16, the decoding module 10 includes a region division flag decoding unit 1031, a determination information decoding unit 1032, a transform coefficient decoding unit 1033, and a context storage unit 1034.

The region division flag decoding unit 1031 decodes syntax values from the binary included in the encoded data according to the encoded data and syntax type supplied from the division determination unit 1312. The region division flag decoding unit 1031 decodes information (split_transform_flag) indicating whether or not to divide the target node.

The determination information decoding unit 1032 decodes syntax values from the binary included in the encoded data according to the encoded data and syntax type of the non-zero transform coefficient presence / absence information supplied from the transform coefficient deriving unit 1322. Specifically, the syntax elements decoded by the determination information decoding unit 1032 are no_residual_data_flag, cbf_luma, cbf_cb, cbf_cr, and cbp.

The transform coefficient decoding unit 1033 decodes syntax values from the binary included in the encoded data according to the encoded data and syntax type of the transform coefficients supplied from the transform coefficient deriving unit 1322. Specifically, the syntax elements decoded by the transform coefficient decoding unit 1033 include the level (level), which is the absolute value of a transform coefficient, the sign (sign) of a transform coefficient, and the run (run), which is the length of a run of consecutive zeros.

The context storage unit 1034 stores a context that the determination information decoding unit 1032 and the transform coefficient decoding unit 1033 refer to in the decoding process.

[3-3] Specific Configuration Referencing Context at Transform Coefficient Decoding
When the PU partition type is an asymmetric partition, each of the determination information decoding unit 1032 and the transform coefficient decoding unit 1033 may decode syntax values related to transform coefficients using different contexts for TUs included in the smaller PU and TUs included in the larger PU. Such syntax types include, for example, non-zero transform coefficient flags, transform coefficient levels, transform coefficient runs, and presence / absence information of non-zero transform coefficients at each node of the TU tree. Combinations of these syntax elements may also be included.

Therefore, the context storage unit 1034 may store the small PU size 1034A, which is the set of probability setting values corresponding to the various syntax values related to transform coefficients in the contexts referred to for TUs included in the smaller PU, and the large PU size 1034B, which is the set of probability setting values in the contexts referred to for TUs included in the larger PU. Here, the small PU size 1034A and the large PU size 1034B are probability setting values corresponding to different contexts.

The determination information decoding unit 1032 arithmetically decodes the Cbf (cbf_luma, cbf_cb, cbf_cr, etc.) of the target TU with reference to the small PU size 1034A when the target TU is included in the smaller PU, and with reference to the large PU size 1034B when the target TU is included in the larger PU.

Likewise, the transform coefficient decoding unit 1033 arithmetically decodes the transform coefficients (level, sign, run, etc.) of the target TU with reference to the small PU size 1034A when the target TU is included in the smaller PU, and with reference to the large PU size 1034B when the target TU is included in the larger PU.

Note that, even when the target TU is included in the larger PU, the determination information decoding unit 1032 and the transform coefficient decoding unit 1033 may refer to the small PU size 1034A if the target TU is located close to the smaller PU.

In other words, even when the target TU is included in the larger PU, the determination information decoding unit 1032 and the transform coefficient decoding unit 1033 may refer to the small PU size 1034A if the target TU is located near the PU boundary.

The smaller PU is likely to contain an edge, so transform coefficients are likely to occur there. In contrast, transform coefficients are less likely to occur in the larger PU. By using different contexts depending on whether the target TU is included in the smaller PU or the larger PU, variable-length decoding can be performed according to the occurrence probability of transform coefficients in each region.
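
The context selection described above can be sketched as follows. This is a hedged illustration that covers only asymmetric partitions with a horizontal PU boundary (such as 2N × nU and 2N × nD). The function name, the parameters, and the reading of "near the PU boundary" as "directly adjacent to the boundary" are assumptions made for the example, not definitions taken from this specification.

```python
SMALL = "small PU size 1034A"
LARGE = "large PU size 1034B"


def select_context_set(tu_y, tu_h, boundary_y, small_is_top=True,
                       near_boundary_uses_small=False):
    """Pick the probability setting values for a TU in a horizontally split CU.

    tu_y, tu_h: top row and height of the target TU inside the CU.
    boundary_y: row at which the PU boundary lies (4 for 2NxnU on a 16x16 CU).
    small_is_top: True if the smaller PU is the upper one (2NxnU).
    near_boundary_uses_small: optional rule; a TU of the larger PU that touches
        the boundary also uses the small-PU contexts (assumed interpretation).
    """
    if small_is_top:
        in_small = tu_y + tu_h <= boundary_y
        touches_boundary = tu_y == boundary_y
    else:
        in_small = tu_y >= boundary_y
        touches_boundary = tu_y + tu_h == boundary_y
    if in_small:
        return SMALL
    if near_boundary_uses_small and touches_boundary:
        return SMALL
    return LARGE
```

For a 16 × 16 CU with 2N × nU partitioning, a 4 × 4 TU at row 0 would use the small-PU contexts, and a TU at row 4 would use the large-PU contexts unless the optional boundary rule is enabled.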

[Action / Effect]
The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention generates a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained by dividing the coding unit into one or more parts, and restores the image by adding the prediction residual to the predicted image. The division types into prediction units include division into asymmetric shapes, which yields prediction units of different sizes, and division into symmetric shapes, which yields prediction units of the same size. The apparatus includes coefficient decoding means that, when the division type of the target prediction unit, which is the prediction unit to be decoded, is division into an asymmetric shape, decodes transform coefficients with reference to different contexts for the small prediction unit and the large prediction unit obtained by the division.

Therefore, it is possible to perform variable length decoding according to the probability of occurrence of transform coefficients in the respective regions of transform units included in a small prediction unit and transform units included in a large prediction unit.

(Process flow)
The CU decoding process in the moving image decoding apparatus 1 will be described with reference to FIG. In the following, it is assumed that the target CU is an inter CU or a skip CU. FIG. 24 is a flowchart illustrating an example of a flow of CU decoding processing (inter / skip CU) in the video decoding device 1.

When the CU decoding process is started, the CU information decoding unit 11 decodes the CU prediction information for the target CU using the decoding module 10 (S11). This process is performed on a CU basis.

Specifically, in the CU information decoding unit 11, the CU prediction mode determination unit 111 decodes the skip flag SKIP using the decoding module 10. If the skip flag does not indicate a skip CU, the CU prediction mode determination unit 111 further decodes the CU prediction type information Pred_type using the decoding module 10.

Next, PU unit processing is performed. That is, the motion compensation parameter deriving unit 121 included in the PU information decoding unit 12 decodes the motion information (S12), and the predicted image generation unit 14 generates a predicted image by inter prediction based on the decoded motion information (S13).

Next, the TU information decoding unit 13 performs a TU division decoding process (S14). Specifically, in the TU information decoding unit 13, the TU partition setting unit 131 sets the TU partitioning method based on the parameters decoded from the encoded data, the CU size, and the PU partition type. This process is performed on a CU basis.

Next, TU unit processing is performed. That is, the TU information decoding unit 13 decodes the transform coefficient (S15), and the inverse quantization / inverse transform unit 15 derives a prediction residual from the decoded transform coefficient (S16).

Next, the adder 17 adds the predicted image and the prediction residual to generate a decoded image (S17). This process is performed on a CU basis.
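
The order of steps S11 to S17, and which of them run per CU, per PU, or per TU, can be summarized in a toy sketch. Everything here is hypothetical scaffolding: the callables stand in for the units named above, and images are reduced to flat lists of sample values so that the example stays self-contained.

```python
def decode_inter_cu(pus, tus, decode_motion, predict, decode_coeffs,
                    inverse_transform):
    """Walk through S11 to S17 for one inter CU and record the step order."""
    steps = ["S11"]  # CU prediction information is decoded once per CU
    predicted = []
    for pu in pus:  # PU unit processing
        motion = decode_motion(pu)
        steps.append("S12")
        predicted += predict(pu, motion)
        steps.append("S13")
    steps.append("S14")  # the TU partitioning is set once per CU
    residual = []
    for tu in tus:  # TU unit processing
        coeffs = decode_coeffs(tu)
        steps.append("S15")
        residual += inverse_transform(coeffs)
        steps.append("S16")
    # The adder combines the predicted image and the prediction residual per CU.
    decoded = [p + r for p, r in zip(predicted, residual)]
    steps.append("S17")
    return decoded, steps
```

With two PUs and one TU, this sketch records S12 and S13 twice and S15 and S16 once, which mirrors the per-PU and per-TU loops in the flow above.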

[Moving picture encoding device]
Hereinafter, the moving picture coding apparatus 2 according to the present embodiment will be described with reference to FIGS. 25 and 26.

(Outline of video encoding device)
Generally speaking, the moving image encoding device 2 is a device that generates and outputs encoded data # 1 by encoding the input image # 10.

(Configuration of video encoding device)
First, a configuration example of the video encoding device 2 will be described with reference to FIG. FIG. 25 is a functional block diagram illustrating the configuration of the moving image encoding device 2. As shown in FIG. 25, the moving image encoding device 2 includes an encoding setting unit 21, an inverse quantization / inverse transform unit 22, a predicted image generation unit 23, an adder 24, a frame memory 25, a subtractor 26, a transform / quantization unit 27, and an encoded data generation unit (encoding means) 29.

The encoding setting unit 21 generates image data related to encoding and various setting information based on the input image # 10.

Specifically, the encoding setting unit 21 generates the following image data and setting information.

First, the encoding setting unit 21 generates the CU image # 100 for the target CU by sequentially dividing the input image # 10 into slice units and tree block units.

Also, the encoding setting unit 21 generates header information H ′ based on the result of the division process. The header information H ′ includes (1) information on the size and shape of the tree blocks belonging to the target slice and their positions within the target slice, and (2) CU information CU ′ on the size and shape of the CUs belonging to each tree block and their positions within the tree block.

Further, the encoding setting unit 21 refers to the CU image # 100 and the CU information CU ′ to generate PT setting information PTI ′. The PT setting information PTI ′ includes information on all combinations of (1) the possible division patterns of the target CU into PUs and (2) the prediction modes that can be assigned to each PU.

The encoding setting unit 21 calculates the cost of the combination of each division pattern and each prediction mode, and determines the lowest cost division pattern and prediction mode.
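
The selection of the lowest-cost combination can be sketched as an exhaustive search. The text does not specify the cost measure, so the `cost` callable below is a hypothetical placeholder, as are the function name and the example labels.

```python
from itertools import product


def choose_partition_and_mode(patterns, modes, cost):
    """Return the (division pattern, prediction mode) pair with the lowest cost.

    cost: callable taking (pattern, mode); a stand-in for whatever cost the
    encoder evaluates, which is not specified here.
    """
    return min(product(patterns, modes), key=lambda pair: cost(*pair))
```

For example, with a toy cost table:

```python
table = {("2Nx2N", "uni"): 5.0, ("2Nx2N", "bi"): 4.0,
         ("2NxnU", "uni"): 3.5, ("2NxnU", "bi"): 6.0}
best = choose_partition_and_mode(["2Nx2N", "2NxnU"], ["uni", "bi"],
                                 lambda p, m: table[(p, m)])
```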

The encoding setting unit 21 supplies the CU image # 100 to the subtractor 26. In addition, the encoding setting unit 21 supplies the header information H ′ to the encoded data generation unit 29. Also, the encoding setting unit 21 supplies the PT setting information PTI ′ to the predicted image generation unit 23.

The inverse quantization / inverse transform unit 22 restores the prediction residual for each block by performing inverse quantization and inverse orthogonal transform on the quantized prediction residual for each block supplied from the transform / quantization unit 27. The inverse orthogonal transform is as already described for the inverse quantization / inverse transform unit 15 shown in FIG.

Also, the inverse quantization / inverse transform unit 22 integrates the prediction residual for each block according to the division pattern specified by the TT division information (described later), and generates the prediction residual D for the target CU. The inverse quantization / inverse transform unit 22 supplies the prediction residual D for the generated target CU to the adder 24.

The predicted image generation unit 23 refers to the local decoded image P ′ and the PT setting information PTI ′ recorded in the frame memory 25 to generate a predicted image Pred for the target CU. The predicted image generation unit 23 sets the prediction parameter obtained by the predicted image generation process in the PT setting information PTI ′, and transfers the set PT setting information PTI ′ to the encoded data generation unit 29. Note that the predicted image generation process performed by the predicted image generation unit 23 is the same as that performed by the predicted image generation unit 14 included in the video decoding device 1, and thus description thereof is omitted here.

The adder 24 adds the predicted image Pred supplied from the predicted image generation unit 23 and the prediction residual D supplied from the inverse quantization / inverse transform unit 22 to generate the decoded image P for the target CU.

The generated decoded image P is sequentially recorded in the frame memory 25. At the time of decoding the target tree block, the frame memory 25 holds the decoded images corresponding to all tree blocks decoded prior to the target tree block (for example, all tree blocks preceding it in raster scan order), together with the parameters used for decoding the decoded image P.

The subtractor 26 generates a prediction residual D for the target CU by subtracting the prediction image Pred from the CU image # 100. The subtractor 26 supplies the generated prediction residual D to the transform / quantization unit 27.

The transform / quantization unit 27 generates a quantized prediction residual by performing orthogonal transform and quantization on the prediction residual D. Here, the orthogonal transform refers to an orthogonal transform from the pixel domain to the frequency domain. Examples of the orthogonal transform include DCT (Discrete Cosine Transform) and DST (Discrete Sine Transform).

Specifically, the transform / quantization unit 27 refers to the CU image # 100 and the CU information CU 'and determines a division pattern of the target CU into one or a plurality of blocks. Further, according to the determined division pattern, the prediction residual D is divided into prediction residuals for each block.

The transform / quantization unit 27 generates a prediction residual in the frequency domain by orthogonally transforming the prediction residual for each block, and then quantizes the prediction residual in the frequency domain to generate the quantized prediction residual for each block.

In addition, the transform / quantization unit 27 generates TT setting information TTI ′ that includes the generated quantized prediction residual for each block, TT division information that specifies the division pattern of the target CU, and information about all possible division patterns of the target CU into blocks. The transform / quantization unit 27 supplies the generated TT setting information TTI ′ to the inverse quantization / inverse transform unit 22 and the encoded data generation unit 29.
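
The transform and quantization of one block, together with the corresponding inverse processing, can be sketched as follows. This is a minimal illustration that assumes an orthonormal floating-point DCT-II and a single uniform quantization step `qstep`; the function names are hypothetical, and the integer transforms and quantization rules of an actual codec are not reproduced.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def forward_transform_quantize(block, qstep):
    """Pixel domain -> frequency domain (C X C^T), then uniform quantization."""
    c = dct_matrix(len(block))
    coeffs = matmul(matmul(c, block), transpose(c))
    return [[round(v / qstep) for v in row] for row in coeffs]


def dequantize_inverse_transform(levels, qstep):
    """Inverse quantization, then frequency domain -> pixel domain (C^T Y C)."""
    c = dct_matrix(len(levels))
    coeffs = [[v * qstep for v in row] for row in levels]
    return matmul(matmul(transpose(c), coeffs), c)
```

Because the transform in this sketch is orthonormal, the reconstruction error comes only from the rounding in the quantizer, which is why the encoder can reproduce the decoder-side residual from the quantized values.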

When the prediction type indicated in the PT setting information PTI ′ is inter prediction, the PU information generation unit 30 encodes the PT setting information PTI ′ to derive the PT setting information PTI. It also generates PT setting information PTI ′ for the merge candidates and supplies it to the encoding setting unit 21.

The encoded data generation unit 29 encodes the header information H ′, the TT setting information TTI ′, and the PT setting information PTI ′, and multiplexes the encoded header information H, TT setting information TTI, and PT setting information PTI to generate and output encoded data # 1.

(Correspondence relationship with video decoding device)
The video encoding device 2 includes a configuration corresponding to each configuration of the video decoding device 1. Here, “correspondence” means that the same processing or the reverse processing is performed.

For example, as described above, the predicted image generation process of the predicted image generation unit 14 included in the video decoding device 1 and the predicted image generation process of the predicted image generation unit 23 included in the video encoding device 2 are the same.

For example, the process of decoding a syntax value from a bit string in the video decoding device 1 is the reverse of the process of encoding a syntax value into a bit string in the video encoding device 2.

In the following, it will be described how each configuration in the video encoding device 2 corresponds to the CU information decoding unit 11, the PU information decoding unit 12, and the TU information decoding unit 13 of the video decoding device 1. This will clarify the operation and function of each component of the moving image encoding device 2 in more detail.

The encoded data generation unit 29 corresponds to the decoding module 10. More specifically, the decoding module 10 derives a syntax value based on the encoded data and the syntax type, whereas the encoded data generation unit 29 generates encoded data based on the syntax value and the syntax type.
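
The inverse relationship between the two can be illustrated at the level of bin strings. The sketch below is an assumption-laden toy: the binarization table is hypothetical and does not reproduce any table in this specification, and arithmetic coding of the bins is omitted entirely.

```python
# Hypothetical prefix-free binarization table for one syntax type.
BINARIZATION = {"2Nx2N": "1", "2NxN": "01", "Nx2N": "00"}


def encode_syntax(value, table=BINARIZATION):
    """Encoder side: syntax value -> bin string."""
    return table[value]


def decode_syntax(bits, table=BINARIZATION):
    """Decoder side: bin string -> (syntax value, remaining bins)."""
    inverse = {code: value for value, code in table.items()}
    prefix = ""
    for position, bit in enumerate(bits):
        prefix += bit
        if prefix in inverse:
            return inverse[prefix], bits[position + 1:]
    raise ValueError("no codeword matches the given bins")
```

Because both sides share the same table, decoding the output of the encoder returns the original syntax value, which is the sense in which the two configurations correspond.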

The encoding setting unit 21 corresponds to the CU information decoding unit 11 of the video decoding device 1. A comparison between the encoding setting unit 21 and the CU information decoding unit 11 is as follows.

The CU information decoding unit 11 supplies the encoded data and syntax type related to the CU prediction type information to the decoding module 10, and determines the PU partition type based on the CU prediction type information decoded by the decoding module 10.

Correspondingly, the encoding setting unit 21 determines the PU partition type, generates CU prediction type information, and supplies the syntax value and syntax type related to the CU prediction type information to the encoded data generation unit 29.

Note that the encoded data generation unit 29 may have the same configuration as the binarized information storage unit 1012, the context storage unit 1013, and the probability setting storage unit 1014 included in the decoding module 10.

The PU information generation unit 30 and the predicted image generation unit 23 correspond to the PU information decoding unit 12 and the predicted image generation unit 14 of the video decoding device 1. These are compared as follows.

As described above, the PU information decoding unit 12 supplies the encoded data related to the motion information and the syntax type to the decoding module 10 and derives a motion compensation parameter based on the motion information decoded by the decoding module 10. Further, the predicted image generation unit 14 generates a predicted image based on the derived motion compensation parameter.

On the other hand, the PU information generation unit 30 and the predicted image generation unit 23 determine motion compensation parameters in the predicted image generation process, and supply the syntax values and syntax types related to the motion compensation parameters to the encoded data generation unit 29.

Also, the PU information generation unit 30 and the predicted image generation unit 23 may have the same configuration as the merge candidate priority information storage unit 122 and the reference frame setting information storage unit 123 included in the PU information decoding unit 12.

The transform / quantization unit 27 corresponds to the TU information decoding unit 13 and the inverse quantization / inverse transform unit 15 of the video decoding device 1. These are compared as follows.

The TU division setting unit 131 included in the TU information decoding unit 13 supplies the encoded data and syntax type related to information indicating whether or not to perform node division to the decoding module 10, and the node decoded by the decoding module 10 TU partitioning is performed based on information indicating whether to perform the partitioning.

Further, the transform coefficient restoration unit 132 included in the TU information decoding unit 13 supplies the encoded data and syntax types related to the determination information and the transform coefficients to the decoding module 10, and derives transform coefficients based on the determination information and transform coefficients decoded by the decoding module 10.

On the other hand, the transform / quantization unit 27 determines the division method of the TU division, and supplies the syntax value and syntax type related to the information indicating whether or not to perform node division to the encoded data generation unit 29.

Also, the transform / quantization unit 27 supplies the encoded data generation unit 29 with syntax values and syntax types related to the quantized transform coefficients obtained by transforming and quantizing the prediction residual.

Note that the transform / quantization unit 27 may have the same configuration as the transform size determination information storage unit 1314 included in the TU partition setting unit 131. The encoded data generation unit 29 may have the same configuration as the context storage unit 1034 included in the decoding module 10.

(Details of PU information generation unit 30)
FIG. 55 is a block diagram illustrating a configuration of the PU information generation unit 30. The PU information generation unit 30 includes a motion compensation parameter generation unit 301 for generating motion compensation parameters. The motion compensation parameter generation unit 301 includes a bi-prediction restricted PU determination unit 1218, a bi-prediction conversion unit 1219, a merge motion compensation parameter generation unit 3012, and a basic motion compensation parameter generation unit 3013. The merge motion compensation parameter generation unit 3012 generates merge candidates and supplies them to the encoding setting unit 21 as PT setting information PTI ′. The merge motion compensation parameter generation unit 3012 also outputs an index for selecting a merge candidate as PT setting information PTI. The basic motion compensation parameter generation unit 3013 encodes the PT setting information PTI from the input PT setting information PTI ′, here the motion compensation parameters.

Subsequently, a more detailed configuration of the merge motion compensation parameter generation unit 3012 will be described with reference to FIG. FIG. 56 is a block diagram illustrating a configuration of the merge motion compensation parameter generation unit 3012. Note that the motion parameter derivation of the skip PU is also performed by the merge motion compensation parameter generation unit 3012.

The merge motion compensation parameter generation unit 3012 includes an adjacent merge candidate derivation unit 1212A, a temporal merge candidate derivation unit 1212B, a unique candidate derivation unit 1212C, a combined bi-prediction merge candidate derivation unit 1212D, a non-scale bi-prediction merge candidate derivation unit 1212E, a zero vector merge candidate derivation unit 1212F, a merge candidate derivation control unit 1212G, and a merge candidate storage unit 1212H.

As in the merge motion compensation parameter derivation unit 1212, in the merge candidate derivation process, for a small size PU that the bi-prediction restricted PU determination unit 1218 determines to be subject to the bi-prediction restriction, the bi-prediction conversion unit 1219 converts any bi-prediction motion compensation parameters into single prediction, so that only single prediction merge candidates are derived. Details of each part have already been described and are omitted here.
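
The restriction can be sketched as a filter over the merge candidate list. The sketch rests on two loudly stated assumptions: the size rule (`max_restricted_size`) is a hypothetical stand-in for the decision made by the bi-prediction restricted PU determination unit 1218, and the conversion simply keeps the first reference list (L0) and drops the second (L1), which is only one possible conversion. The dictionary field names are likewise invented for the example.

```python
def is_bipred_restricted(pu_width, pu_height, max_restricted_size=8):
    """Hypothetical size rule: restrict PUs whose sides are both small."""
    return pu_width <= max_restricted_size and pu_height <= max_restricted_size


def to_single_prediction(candidate):
    """Convert a bi-prediction candidate to single prediction (keeps L0 here)."""
    if candidate["use_l0"] and candidate["use_l1"]:
        converted = dict(candidate)
        converted["use_l1"] = False
        converted["mv_l1"] = None
        return converted
    return candidate


def derive_merge_candidates(candidates, pu_width, pu_height):
    """Return the candidate list, converted when the PU is restricted."""
    if not is_bipred_restricted(pu_width, pu_height):
        return list(candidates)
    return [to_single_prediction(c) for c in candidates]
```

In this sketch a restricted PU never sees a candidate with both reference lists enabled, while larger PUs receive the candidate list unchanged.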

The merge motion compensation parameter generation unit 3012 generates merge candidates and supplies them to the encoding setting unit 21 as PT setting information PTI ′. Further, the merge index is supplied to the encoded data generation unit 29 as PT setting information PTI.

Subsequently, a more detailed configuration of the basic motion compensation parameter generation unit 3013 will be described with reference to FIG. FIG. 57 is a block diagram showing a configuration of the basic motion compensation parameter generation unit 3013. The basic motion compensation parameter generation unit 3013 includes an adjacent motion vector candidate derivation unit 1213A, a temporal motion vector candidate derivation unit 1213B, a zero vector merge candidate derivation unit 1213F, a motion vector candidate derivation control unit 1213G, a motion vector candidate storage unit 1213H, a motion vector candidate selection unit 3013A, and a difference motion vector calculation unit 3013B.

The process of deriving the predicted motion vector candidate is the same as that of the basic motion compensation parameter deriving unit 1213 shown in FIG. The motion vector candidate selection unit 3013A selects a predicted motion vector candidate closest to the supplied predicted motion vector from the predicted motion vector candidates stored in the motion vector candidate storage unit 1213H, and derives its index as a predicted motion vector index. The selected predicted motion vector is supplied to the difference motion vector calculation unit 3013B. The difference motion vector calculation unit 3013B calculates the difference between the supplied motion vector and the predicted motion vector as a difference motion vector. The inter prediction flag, the reference index number, the derived predicted motion vector index, and the difference motion vector are supplied to the encoded data generation unit 29 as PT setting information PTI.
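
The selection and the difference calculation can be sketched as follows. This is an illustration under assumptions: the closeness measure (sum of absolute component differences) is not specified in the text and is chosen here only for the example, the supplied vector is taken to be the motion vector to be encoded, and the function name is hypothetical.

```python
def select_mvp_and_mvd(motion_vector, candidates):
    """Pick the closest predicted motion vector candidate and the difference.

    motion_vector: (x, y) vector to be encoded.
    candidates: list of (x, y) predicted motion vector candidates.
    Returns (predicted motion vector index, difference motion vector).
    """
    def distance(candidate):
        # Assumed closeness measure: sum of absolute component differences.
        return (abs(motion_vector[0] - candidate[0])
                + abs(motion_vector[1] - candidate[1]))

    index = min(range(len(candidates)), key=lambda i: distance(candidates[i]))
    predicted = candidates[index]
    difference = (motion_vector[0] - predicted[0],
                  motion_vector[1] - predicted[1])
    return index, difference
```

A decoder that holds the same candidate list can recover the motion vector by adding the difference motion vector to the candidate selected by the index.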

(Correspondence with specific configurations)
[1] ′ Encoding Setting Unit and Encoded Data Generation Unit
[1-1] ′ Example of Configuration for Limiting Context Reference
When the PU partition type is an asymmetric partition, the encoded data generation unit 29 may encode the information indicating the type of division without using a CABAC context.

The specific configuration is the same as that described in, for example, the configuration example [1-1] of the video decoding device 1, and thus the description thereof is omitted here. However, “CU prediction mode decoding unit 1011”, “probability setting storage unit 1014”, and “decoding (performing)” in the description of the configuration example [1-1] shall be read as “encoded data generation unit 29”, “configuration corresponding to the probability setting storage unit 1014”, and “encoding (performing)”, respectively.

[1-2] ′ Configuration for Encoding CU Prediction Type Information (pred_type)
The encoded data generation unit 29 may be configured to encode CU prediction type information with reference to binarization information.

The specific configuration is the same as that described in, for example, the configuration example [1-2] of the video decoding device 1, and thus the description thereof is omitted here. However, “CU prediction mode decoding unit 1011”, “binarization information storage unit 1012”, and “decoding (performing)” in the description of the configuration example [1-2] shall be read as “encoded data generation unit 29”, “configuration corresponding to the binarization information storage unit 1012”, and “encoding (performing)”, respectively.

[1-3] ′ Configuration for Encoding a Short Code for an Intra CU in a Small Size CU
The encoded data generation unit 29 may be configured to encode a short code for an intra CU in a small size CU.

The specific configuration is the same as that described in, for example, the configuration example [1-3] of the video decoding device 1, and thus the description thereof is omitted here. However, “CU prediction mode decoding unit 1011”, “context storage unit 1013”, “binarization information storage unit 1012”, and “decoding (performing)” in the description of the configuration example [1-3] shall be read as “encoded data generation unit 29”, “configuration corresponding to the context storage unit 1013”, “configuration corresponding to the binarization information storage unit 1012”, and “encoding (performing)”, respectively.

[1-4] ′ Configuration for Changing the Interpretation of a Bin Sequence According to a Prediction Parameter in the Neighborhood
The encoded data generation unit 29 may be configured to change the interpretation of the bin sequence with reference to the prediction parameters assigned to adjacent regions.

The specific configuration is the same as that described in, for example, the configuration example [1-4] of the video decoding device 1, and thus the description thereof is omitted here. However, “CU prediction mode decoding unit 1011”, “binarization information storage unit 1012”, and “decoding (performing)” in the description of the configuration example [1-4] shall be read as “encoded data generation unit 29”, “configuration corresponding to the binarization information storage unit 1012”, and “encoding (performing)”, respectively.

[2] 'Predicted image generation unit and encoded data generation unit [2-1]' Example of merge candidate positions and priorities The PU information generation unit 30 has a symmetric PU partition type when the PU partition type is asymmetric. The priorities of the merge candidates may be determined by a method different from that in the case of.

The specific configuration is the same as that described in, for example, configuration example [2-1] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “motion compensation parameter derivation unit 121” and “merge candidate priority information storage unit 122” in the description of configuration example [2-1] shall be read as “motion compensation parameter generation unit 301” and “configuration corresponding to the merge candidate priority information storage unit 122”, respectively.

[2-2]′ Change of merge candidates by combination of CU size and skip / merge
The PU information generation unit 30 may be configured to change the merge candidates depending on the combination of the CU size and whether or not the CU is a CU that performs skip / merge.

The specific configuration is the same as that described in, for example, configuration example [2-2] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “motion compensation parameter derivation unit 121” and “merge candidate priority information storage unit 122” in the description of configuration example [2-2] shall be read as “motion compensation parameter generation unit 301” and “configuration corresponding to the merge candidate priority information storage unit 122”, respectively.

[2-3]′ Determination of the number of reference frames
The motion compensation parameter generation unit 301 may be configured as shown in [2-3-1]′ to [2-3-4]′ below, and may thereby determine which prediction method, single prediction or bi-prediction, is applied in inter prediction.

[2-3-1]′ Bi-prediction restriction for small-size PUs
The motion compensation parameter generation unit 301 may refer to the reference frame setting information to determine which prediction method, uni-prediction or bi-prediction, is applied in inter prediction.

The specific configuration is the same as that described in, for example, configuration example [2-3-1] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “motion compensation parameter derivation unit 121” and “reference frame setting information storage unit 123” in the description of configuration example [2-3-1] shall be read as “motion compensation parameter generation unit 301” and “configuration corresponding to the reference frame setting information storage unit 123”, respectively.

By limiting the prediction direction of small-size PUs to single prediction in the reference frame setting information, the amount of processing required for predicted image generation in the motion compensation parameter generation unit 301 can be reduced significantly. Further, by limiting the merge candidates to single prediction in a small-size PU, the amount of processing required for predicted image generation can be reduced. Furthermore, by omitting the derivation of bi-predictive merge candidates in a small-size PU, the amount of processing required for merge candidate derivation can be reduced.

[2-3-2]′ Determining the size for which bi-prediction is restricted
The PU information generation unit 30 may determine and encode the size for which bi-prediction is restricted. A specific configuration is as shown in FIG. The configuration shown in FIG. 81 is obtained by changing the bi-prediction restricted PU determination unit 1218 to a bi-prediction restricted PU determination unit 1218A in the PU information generation unit 30 shown in FIGS.

Since the details are the same as those described in, for example, configuration example [2-3-2] of the video decoding device 1, their description is omitted here. Note, however, that “motion compensation parameter derivation unit 121”, “merge motion compensation parameter derivation unit 1212”, and “reference frame setting information storage unit 123” in the description of configuration example [2-3-2] shall be read as “motion compensation parameter generation unit 301”, “merge motion compensation parameter generation unit 3012”, and “configuration corresponding to the reference frame setting information storage unit 123”, respectively.

[2-3-3]′ Configuration for performing partial bi-prediction restriction
The PU information generation unit 30 may perform partial bi-prediction restriction. A specific configuration is as shown in FIG. 81 and FIG. The configuration shown in FIG. 81 is obtained by changing the bi-prediction conversion unit 1219 to the bi-prediction conversion unit 1219A in the PU information generation unit 30 shown in FIGS. The configuration shown in FIG. 82 is obtained by changing the bi-prediction conversion unit 1219 to the bi-prediction conversion unit 1219B in the PU information generation unit 30 shown in FIGS.

Since the details are the same as those described in, for example, configuration example [2-3-3] of the video decoding device 1, their description is omitted here. Note, however, that “motion compensation parameter derivation unit 121”, “merge motion compensation parameter derivation unit 1212”, and “reference frame setting information storage unit 123” in the description of configuration example [2-3-3] shall be read as “motion compensation parameter generation unit 301”, “merge motion compensation parameter generation unit 3012”, and “configuration corresponding to the reference frame setting information storage unit 123”, respectively.

[2-3-4]′ Motion vector integer conversion
The PU information generation unit 30 may perform motion vector integer conversion. A specific configuration is as shown in FIG. The configuration shown in FIG. 83 is obtained by changing the bi-prediction conversion unit 1219 to the bi-prediction conversion unit 1219A in the PU information generation unit 30 shown in FIGS. The configuration shown in FIG. 82 is obtained by changing the bi-prediction conversion unit 1219 to a motion vector integer conversion unit 1220 in the PU information generation unit 30 shown in FIGS.

Since the details are the same as those described in, for example, configuration example [2-3-4] of the video decoding device 1, their description is omitted here. Note, however, that “motion compensation parameter derivation unit 121”, “merge motion compensation parameter derivation unit 1212”, and “reference frame setting information storage unit 123” in the description of configuration example [2-3-4] shall be read as “motion compensation parameter generation unit 301”, “merge motion compensation parameter generation unit 3012”, and “configuration corresponding to the reference frame setting information storage unit 123”, respectively.

By limiting bi-prediction as shown in [2-3-2]′ to [2-3-4]′, the amount of processing required for predicted image generation in the motion compensation parameter generation unit 301 can be reduced.

(Restriction of motion compensation parameters on encoded data)
For the moving picture decoding apparatus, the restriction of encoded data by level restrictions has been described. That restriction limits, for each level, the values of the motion compensation parameters derived from the encoded data. Such a restriction can be realized by the moving picture encoding apparatus 2 described below. Here, an example is described in which the skipping of inter prediction flag encoding, the bi-single conversion of merge candidates, and merge candidate derivation are not performed.

The moving picture encoding apparatus 2 of this configuration uses the PU information generation unit 30 shown in FIG. The PU information generation unit 30 includes a motion compensation parameter generation unit 301. The motion compensation parameter generation unit 301 includes a merge motion compensation parameter generation unit 3012 and a basic motion compensation parameter generation unit 3013. A motion compensation parameter restriction unit 3014 is provided instead of the bi-prediction restricted PU determination unit 1218. In this configuration, the bi-prediction conversion unit 1219 is not included because bi-single conversion of merge candidates is not performed. However, the configuration may include the bi-prediction conversion unit 1219 in order to perform bi-single conversion of merge candidates when the PU size is small.

The motion compensation parameter restriction unit 3014 receives PU size information and PTI setting information PTI′, and calculates an additional cost according to the motion compensation parameters of the PTI setting information PTI′. The additional cost is transmitted to the encoding setting unit 21. Since the transmitted additional cost is added to the minimum cost, selection of a specific motion compensation parameter can be prevented by setting a large additional cost for it.
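The effect of the additional cost can be illustrated with a minimal sketch. The function names, the candidate tuples, and the constant LARGE_COST are hypothetical and introduced only for illustration. The sketch assumes that the encoding setting unit 21 selects the candidate with the smallest sum of base cost and additional cost; the description above states only that the additional cost is added to the cost and that a large value prevents selection.

```python
# Hypothetical sketch: names and values are illustrative, not taken from the description.
LARGE_COST = 1e12  # stands in for a "sufficiently large" additional cost (assumed value)


def additional_cost(violates_restriction):
    """Return the additional cost for one candidate parameter set."""
    return LARGE_COST if violates_restriction else 0.0


def select_best(candidates):
    """Pick the candidate with the smallest (base cost + additional cost).

    candidates: iterable of (name, base_cost, violates_restriction) tuples.
    """
    best_name, best_total = None, float("inf")
    for name, base_cost, violates in candidates:
        total = base_cost + additional_cost(violates)
        if total < best_total:
            best_name, best_total = name, total
    return best_name
```

With this arrangement a restricted candidate is never chosen as long as at least one unrestricted candidate exists, even if its base cost is lower.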

[Pseudo code]
Next, the operation of the motion compensation parameter restriction unit 3014 will be described using the pseudo code shown in FIG. Hereinafter, each step S of the pseudo code illustrated in FIG. 87 will be described.

S871: The motion compensation parameter restriction unit 3014 determines whether the value of the level level_idc is less than a predetermined threshold value TH1.

S872: When the value of the level level_idc is less than TH1, the motion compensation parameter restriction unit 3014 performs no particular processing.

S873: On the other hand, if the value of the level level_idc is not less than TH1, the motion compensation parameter restriction unit 3014 further determines whether the value of the level level_idc is less than a predetermined threshold TH2.

S874: When the value of the level level_idc is less than TH2, the motion compensation parameter restriction unit 3014 sets the DisableBipred variable as follows.

That is, in S874, bi-prediction in PUs other than 8 × 8 PU (2N × 2N) is limited for 8 × 8 CU which is the minimum CU size.

Furthermore, when the PTI setting information PTI′ falls under the above restriction, the additional cost is set sufficiently large; otherwise, the additional cost is set to zero.

S874′: When the value of the level level_idc is less than TH1, the motion compensation parameter restriction unit 3014 restricts the usable PU size. That is, the flag inter_4x4_enable_flag, which restricts the motion compensation parameters, is set to 0 to restrict the use of 4 × 4 PUs. Further, when the PTI setting information PTI′ falls under the above restriction, that is, when the PU size is 4 × 4, the additional cost is set sufficiently large; otherwise, the additional cost is set to zero. Although inter_4x4_enable_flag is used above as the flag restricting the motion compensation parameters, another flag that restricts the use of 4 × 4 PUs (for example, use_restricted_prediction) may be used instead.

S875: When the value of the level level_idc is not less than TH2, the motion compensation parameter restriction unit 3014 sets the DisableBipred variable as follows.

That is, in S875, when the minimum CU size is other than 8 × 8 (for example, 16 × 16), bi-prediction in the minimum PU (N × N) is limited for a CU whose size matches the minimum CU size.

S875′: When the value of the level level_idc is not less than TH2, the motion compensation parameter restriction unit 3014 restricts the usable PU size by restricting the usable CU size. That is, the logarithmic value Log2MinCuSize of the CU size is limited to 4 or more; in other words, log2_min_cu_size_minus3 is limited to 1 or more.

Furthermore, when the PTI setting information PTI′ falls under the above restriction, that is, when the CU size is 8 × 8, the additional cost is set sufficiently large; otherwise, the additional cost is set to zero. In S874, the motion compensation parameter restriction unit 3014 may determine Log2MinCuSize by referring to MaxLog2MinCuSize in the table of FIG. 84 according to the value of the level level_idc. For example, as shown in FIG. 84, when the value of the level level_idc is greater than or equal to TH1 and less than TH2, MaxLog2MinCuSize = 3. Therefore, in S874, the motion compensation parameter restriction unit 3014 can use MaxLog2MinCuSize = 3 as the value of Log2MinCuSize.
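The branching of S871 to S875 can be sketched as follows. This is a reconstruction from the prose of S874 and S875 only, since the expressions assigned to DisableBipred are not reproduced in the text above. The numeric values of TH1 and TH2, the function name, and the string labels for the PU partition type are assumptions made for illustration.

```python
# Assumed threshold values; the description names them only as TH1 and TH2.
TH1 = 30
TH2 = 40


def disable_bipred(level_idc, log2_cu_size, log2_min_cu_size, part_mode):
    """Return True when bi-prediction is restricted for the PU in question.

    part_mode is a hypothetical label such as "2Nx2N", "2NxN", "Nx2N" or "NxN".
    """
    if level_idc < TH1:
        # S871 / S872: no particular processing, so nothing is restricted.
        return False
    is_min_cu = log2_cu_size == log2_min_cu_size
    if level_idc < TH2:
        # S873 / S874: for an 8x8 CU that is the minimum CU,
        # restrict every PU other than the 8x8 PU (2Nx2N).
        return is_min_cu and log2_cu_size == 3 and part_mode != "2Nx2N"
    # S875: when the minimum CU is not 8x8 (for example 16x16),
    # restrict the minimum PU (NxN) of a CU of the minimum size.
    return is_min_cu and log2_cu_size != 3 and part_mode == "NxN"
```

A PTI setting for which this function returns True while bi-prediction is selected would then be given the sufficiently large additional cost described above.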

[3]′ Transform / quantization unit and encoded data generation unit
[3-1]′ Example of configuration for deriving the divided region size when the PU partition type is asymmetric
When the PU partition type is asymmetric, the transform / quantization unit 27 may be configured to apply a rectangular (non-square) transform to the smaller PU while applying a square transform to the larger PU.

The specific configuration is the same as that described in, for example, configuration example [3-1] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “target region setting unit 1311”, “division determining unit 1312”, and “divided region setting unit 1313” in the description of configuration example [3-1] shall each be read as “transform / quantization unit 27”. In addition, “conversion size determination information storage unit 1314” in the description of configuration example [3-1] shall be read as “configuration corresponding to the conversion size determination information storage unit 1314”.

[3-2]′ Example of configuration in which non-rectangular transformation is applied when some of the PU partition types are square partitions
When the PU partition type is a square partition, the transform / quantization unit 27 may be configured to perform division into non-square regions. In addition to the above configuration, when the CU size is 16 × 16 and the PU partition type is 2N × 2N, the transform / quantization unit 27 may be configured to perform the division so that the scan order is uniform at each division depth.

The specific configuration is the same as that described in, for example, configuration example [3-2] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “divided region setting unit 1313” and “conversion size determination information storage unit 1314” in the description of configuration example [3-2] shall be read as “transform / quantization unit 27” and “configuration corresponding to the conversion size determination information storage unit 1314”, respectively.

[3-3]′ Specific configuration for referring to contexts when encoding transform coefficients
When the PU partition type is asymmetric partitioning, the encoded data generation unit 29 may be configured to encode at least one of the non-zero transform coefficient presence information and the transform coefficients using different contexts for a TU included in the smaller PU and a TU included in the larger PU.

The specific configuration is the same as that described in, for example, configuration example [3-3] of the video decoding device 1, and its description is therefore omitted here. Note, however, that “determination information decoding unit 1032” and “transform coefficient decoding unit 1033” in the description of configuration example [3-3] shall each be read as “encoded data generation unit 29”. Further, “decoding (performing)” and “context storage unit 1034” in the description of configuration example [3-3] shall be read as “encoding (performing)” and “configuration corresponding to the context storage unit 1034”, respectively.

(Process flow)
The CU encoding process in the moving image encoding apparatus 2 will be described with reference to FIG. In the following, it is assumed that the target CU is an inter CU or a skip CU. FIG. 24 is a flowchart illustrating an example of a flow of CU encoding processing (inter / skip CU) in the moving image encoding device 2.

When the CU encoding process is started, the encoding setting unit 21 determines CU prediction information for the target CU, and the encoded data generation unit 29 encodes the CU prediction information determined by the encoding setting unit 21 (S21). This process is performed on a CU basis.

Specifically, the encoding setting unit 21 determines whether or not the target CU is a skip CU. When the target CU is a skip CU, the encoding setting unit 21 causes the encoded data generation unit 29 to encode the skip flag SKIP. When the target CU is not a skip CU, the encoding setting unit 21 causes the encoded data generation unit 29 to encode the CU prediction type information Pred_type.

Next, PU unit processing is performed. That is, the predicted image generation unit 23 derives motion information, and the encoded data generation unit 29 encodes the motion information derived by the predicted image generation unit 23 (S22). Further, the predicted image generation unit 23 generates a predicted image by inter prediction based on the derived motion information (S23).

Next, the transform / quantization unit 27 performs TU division coding processing (S24). Specifically, the transform / quantization unit 27 sets a TU partitioning scheme based on the CU size and PU partition type of the target CU. This process is performed on a CU basis.

Next, TU unit processing is performed. That is, the transform / quantization unit 27 transforms / quantizes the prediction residual into transform coefficients (S25), and the encoded data generation unit 29 encodes the transformed / quantized transform coefficients (S26).

Next, the inverse quantization / inverse transform unit 22 performs inverse quantization / inverse transform on the transformed / quantized transform coefficients to restore the prediction residual, and the adder 24 adds the predicted image and the restored prediction residual to generate a decoded image (S27). This process is performed on a CU basis.
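The relationship between S25 and S27 can be illustrated with a simplified sketch. It is not the configuration of the transform / quantization unit 27: the frequency transform is omitted entirely and a uniform scalar quantizer with a hypothetical step size qstep stands in for transform and quantization. The function names are likewise assumptions. The point shown is only that the encoder reconstructs the decoded image from the quantized values, not from the original residual.

```python
def quantize(residual, qstep):
    """Uniform scalar quantizer standing in for transform + quantization (S25)."""
    return [round(r / qstep) for r in residual]


def dequantize(levels, qstep):
    """Stand-in for inverse quantization / inverse transform."""
    return [lv * qstep for lv in levels]


def encode_and_reconstruct(original, predicted, qstep):
    """Return (quantized levels, decoded samples) for one block of samples."""
    residual = [o - p for o, p in zip(original, predicted)]
    levels = quantize(residual, qstep)        # S25; these values go on to S26
    restored = dequantize(levels, qstep)      # restored prediction residual
    decoded = [p + r for p, r in zip(predicted, restored)]  # S27: adder
    return levels, decoded
```

Because quantization is lossy, the decoded samples generally differ from the original samples by less than one quantization step.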

[Application example]
The above-described moving image encoding device 2 and moving image decoding device 1 can be used by being mounted on various devices that perform transmission, reception, recording, and reproduction of moving images. The moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.

First, it will be described with reference to FIG. 27 that the above-described moving image encoding device 2 and moving image decoding device 1 can be used for transmission and reception of moving images.

(a) of FIG. 27 is a block diagram illustrating a configuration of a transmission device PROD_A in which the moving image encoding device 2 is mounted. As illustrated in (a) of FIG. 27, the transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2. The moving image encoding device 2 described above is used as the encoding unit PROD_A1.

The transmission device PROD_A may further include, as supply sources of the moving image input to the encoding unit PROD_A1, a camera PROD_A4 that captures a moving image, a recording medium PROD_A5 on which the moving image is recorded, an input terminal PROD_A6 for inputting the moving image from the outside, and an image processing unit A7 that generates or processes an image. (a) of FIG. 27 illustrates a configuration in which the transmission device PROD_A includes all of these, but some of them may be omitted.

The recording medium PROD_A5 may hold a non-encoded moving image, or may hold a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, a decoding unit (not shown) that decodes the encoded data read from the recording medium PROD_A5 in accordance with the recording encoding scheme may be interposed between the recording medium PROD_A5 and the encoding unit PROD_A1.

(b) of FIG. 27 is a block diagram illustrating a configuration of a reception device PROD_B in which the moving image decoding device 1 is mounted. As illustrated in (b) of FIG. 27, the reception device PROD_B includes a reception unit PROD_B1 that receives a modulated signal, a demodulation unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the reception unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulation unit PROD_B2. The moving image decoding device 1 described above is used as the decoding unit PROD_B3.

The reception device PROD_B may further include, as supply destinations of the moving image output by the decoding unit PROD_B3, a display PROD_B4 that displays the moving image, a recording medium PROD_B5 for recording the moving image, and an output terminal PROD_B6 for outputting the moving image to the outside. (b) of FIG. 27 illustrates a configuration in which the reception device PROD_B includes all of these, but some of them may be omitted.

The recording medium PROD_B5 may be used for recording a non-encoded moving image, or for recording a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit PROD_B3 in accordance with the recording encoding scheme may be interposed between the decoding unit PROD_B3 and the recording medium PROD_B5.

Note that the transmission medium for transmitting the modulated signal may be wireless or wired. Further, the transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, a transmission mode in which the transmission destination is specified in advance). That is, the transmission of the modulated signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a terrestrial digital broadcast broadcasting station (broadcasting equipment or the like) / receiving station (such as a television receiver) is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wireless broadcasting. Further, a broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) of cable television broadcasting is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by cable broadcasting.

Also, a server (workstation or the like) / client (television receiver, personal computer, smartphone, or the like) of a VOD (Video On Demand) service or a video sharing service using the Internet is an example of the transmission device PROD_A / reception device PROD_B that transmits and receives a modulated signal by communication (usually, either a wireless or a wired transmission medium is used in a LAN, and a wired transmission medium is used in a WAN). Here, the personal computer includes a desktop PC, a laptop PC, and a tablet PC. The smartphone also includes a multi-function mobile phone terminal.

In addition to the function of decoding the encoded data downloaded from the server and displaying it on the display, the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.

Next, it will be described with reference to FIG. 28 that the above-described moving image encoding device 2 and moving image decoding device 1 can be used for recording and reproduction of moving images.

(a) of FIG. 28 is a block diagram showing a configuration of a recording device PROD_C in which the above-described moving image encoding device 2 is mounted. As shown in (a) of FIG. 28, the recording device PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 to a recording medium PROD_M. The moving image encoding device 2 described above is used as the encoding unit PROD_C1.

The recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive), (2) of a type connected to the recording device PROD_C, such as an SD memory card or a USB (Universal Serial Bus) flash memory, or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc: registered trademark).

The recording device PROD_C may further include, as supply sources of the moving image input to the encoding unit PROD_C1, a camera PROD_C3 that captures a moving image, an input terminal PROD_C4 for inputting a moving image from the outside, a reception unit PROD_C5 for receiving a moving image, and an image processing unit C6 that generates or processes an image. (a) of FIG. 28 illustrates a configuration in which the recording device PROD_C includes all of these, but some of them may be omitted.

The reception unit PROD_C5 may receive a non-encoded moving image, or may receive encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, a transmission decoding unit (not shown) that decodes the encoded data encoded by the transmission encoding scheme may be interposed between the reception unit PROD_C5 and the encoding unit PROD_C1.

Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in this case, the input terminal PROD_C4 or the reception unit PROD_C5 is the main supply source of moving images). In addition, a camcorder (in this case, the camera PROD_C3 is the main supply source of moving images), a personal computer (in this case, the reception unit PROD_C5 or the image processing unit C6 is the main supply source of moving images), and a smartphone (in this case, the camera PROD_C3 or the reception unit PROD_C5 is the main supply source of moving images) are also examples of such a recording device PROD_C.

(b) of FIG. 28 is a block diagram showing a configuration of a playback device PROD_D in which the above-described moving image decoding device 1 is mounted. As shown in (b) of FIG. 28, the playback device PROD_D includes a reading unit PROD_D1 that reads encoded data written to the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1. The moving image decoding device 1 described above is used as the decoding unit PROD_D2.

Note that the recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or an SSD, (2) of a type connected to the playback device PROD_D, such as an SD memory card or a USB flash memory, or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or a BD.

In addition, the playback device PROD_D may further include, as supply destinations of the moving image output by the decoding unit PROD_D2, a display PROD_D3 that displays the moving image, an output terminal PROD_D4 for outputting the moving image to the outside, and a transmission unit PROD_D5 that transmits the moving image. (b) of FIG. 28 illustrates a configuration in which the playback device PROD_D includes all of these, but some of them may be omitted.

The transmission unit PROD_D5 may transmit a non-encoded moving image, or may transmit encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image by the transmission encoding scheme is preferably interposed between the decoding unit PROD_D2 and the transmission unit PROD_D5.

Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in this case, the output terminal PROD_D4 to which a television receiver or the like is connected is the main supply destination of moving images). In addition, a television receiver (in this case, the display PROD_D3 is the main supply destination of moving images), digital signage (also referred to as an electronic signboard or an electronic bulletin board; in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), a desktop PC (in this case, the output terminal PROD_D4 or the transmission unit PROD_D5 is the main supply destination of moving images), a laptop or tablet PC (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), and a smartphone (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images) are also examples of such a playback device PROD_D.

[Summary]
An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that decodes an image in a prediction unit using, as a prediction method for inter-screen prediction, single prediction that refers to one reference image or bi-prediction that refers to two reference images, and includes a bi-prediction restriction unit that restricts the bi-prediction from being performed on the prediction unit when the prediction unit is a prediction unit having a predetermined size or less.

An image decoding apparatus according to an aspect of the present invention includes a merge candidate derivation unit that, in a process of deriving a motion compensation parameter of a prediction unit as a merge candidate, derives the merge candidate based on a motion compensation parameter of an adjacent prediction unit. When the derived merge candidate is the bi-prediction, the bi-prediction restriction unit converts the bi-prediction into the single prediction.

The image decoding apparatus according to an aspect of the present invention uses at least two prediction list use flags, each indicating whether or not a reference picture list is used. When the at least two prediction list use flags both indicate that the reference picture list is used, the bi-prediction restriction unit changes one of the at least two prediction list use flags so as to indicate that the reference picture list is not used.

In the image decoding device according to an aspect of the present invention, the bi-prediction restriction unit changes the prediction list use flag indicating that the L1 list, which is one of the reference picture lists, is used so as to indicate that the L1 list is not used.
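The flag conversion described in the two preceding paragraphs can be sketched as follows. The class MergeCandidate and its field names are hypothetical and introduced only for illustration; the sketch assumes exactly two prediction list use flags, one for the L0 list and one for the L1 list, and leaves any associated motion vectors untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MergeCandidate:
    # Hypothetical field names for the two prediction list use flags.
    pred_flag_l0: bool
    pred_flag_l1: bool


def bi_to_single(cand):
    """Convert a bi-predictive candidate to single prediction by clearing the L1 flag."""
    if cand.pred_flag_l0 and cand.pred_flag_l1:
        return replace(cand, pred_flag_l1=False)
    return cand  # already single prediction; left unchanged
```

Clearing the L1 flag rather than the L0 flag follows the aspect stated above; a candidate that uses only the L1 list is not bi-predictive and is therefore not modified.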

In the image decoding device according to an aspect of the present invention, the size of the prediction unit is calculated using the width and height of the prediction unit.

In the image decoding device according to an aspect of the present invention, when the prediction unit is larger than a predetermined size, the bi-prediction restriction unit decodes information indicating whether the bi-prediction or the single prediction is performed, and when the prediction unit has the predetermined size or less, the bi-prediction restriction unit omits decoding of the information indicating whether the bi-prediction or the single prediction is performed, and the single prediction is performed.
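A minimal sketch of this conditional decoding is given below. The function name, the callback read_flag, the use of width + height as the size measure, and the default threshold of 12 are all assumptions for illustration; the text above states only that the size is calculated from the width and height of the prediction unit and that the information is not decoded at or below a predetermined size.

```python
def decode_is_bipred(width, height, read_flag, size_threshold=12):
    """Return True if the PU uses bi-prediction.

    read_flag is a hypothetical callable that reads one flag from the
    encoded data and returns True for bi-prediction.
    """
    size = width + height  # assumed size measure
    if size <= size_threshold:
        # Predetermined size or less: the flag is not decoded and
        # single prediction is inferred.
        return False
    return bool(read_flag())
```

Because the flag is absent from the encoded data for small prediction units, the encoder side must apply the same size test when deciding whether to encode it.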

An image decoding method according to an aspect of the present invention is an image decoding method for decoding an image in a prediction unit using, as a prediction method for inter-screen prediction, single prediction that refers to one reference image or bi-prediction that refers to two reference images, and includes at least a step of determining whether or not the prediction unit is a prediction unit having a predetermined size or less, and a step of restricting the bi-prediction from being performed on the prediction unit.

An image encoding apparatus according to an aspect of the present invention is an image encoding apparatus that encodes an image in a prediction unit using, as a prediction method for inter-screen prediction, uni-prediction that refers to one reference image or bi-prediction that refers to two reference images. The apparatus includes a bi-prediction restriction unit that restricts bi-prediction for the prediction unit when the prediction unit is of a predetermined size or less.

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that decodes encoded image data and generates a decoded image for each coding unit. The apparatus includes a CU information decoding unit that decodes information specifying the division type by which the coding unit is divided, and an arithmetic decoding unit that decodes binary values from the encoded image data by arithmetic decoding that uses a context or by arithmetic decoding that does not use a context. When the CU information decoding unit decodes information specifying an asymmetric partition (AMP) as the division type, the arithmetic decoding unit switches between arithmetic decoding that uses the context and arithmetic decoding that does not use the context according to the position of the binary value.

According to the above aspect of the present invention, it is possible to realize a reduction in the amount of code when using an asymmetric partition and an efficient encoding / decoding process utilizing the properties of the asymmetric partition.

The present invention can also be expressed as follows. That is, an image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by decoding, for each coding unit, information for restoring the image from encoded image data. The apparatus includes decoding means that, for the codes assigned to combinations of a coding unit size and a prediction method applied to the coding unit, decodes for the combination in which intra prediction is applied to a coding unit of a predetermined size or smaller a code shorter than the codes assigned to combinations other than that combination.

In the above configuration, a coding unit of the predetermined size or smaller is a coding unit of a size that is selected where inter prediction is unlikely to be accurate in a coding unit larger than the predetermined size.

In regions where inter prediction is unlikely to be accurate, small coding units of the predetermined size or smaller tend to be applied. Hereinafter, a coding unit larger than the predetermined size is referred to as a large coding unit.

The coding unit of the predetermined size or smaller is, for example, a coding unit of the minimum size, that is, an 8 × 8 pixel coding unit.

In such a small coding unit, spatial correlation is higher than that in a large coding unit, and therefore, an intra CU is often applied in order to improve prediction accuracy.

In the above configuration, the combination of a small coding unit size and the intra prediction method is assigned a shorter code than the other combinations.

For this reason, according to the above configuration, it is possible to assign a short code to a combination having a high probability of occurrence in a coding unit having a size equal to or smaller than a predetermined size, and the effect of improving coding efficiency is achieved.

In the image decoding device according to an aspect of the present invention, it is preferable that the decoding means decodes, for the above combination, a code shorter than the code assigned to the combination in which intra prediction is applied to a coding unit larger than the predetermined size.

According to the above configuration, a shorter code is decoded when intra prediction is applied in a small coding unit, where intra prediction tends to be accurate, than when intra prediction is applied in a large coding unit, where intra prediction is less likely to be accurate.

Thereby, it is possible to decode a short code for a combination having a high appearance frequency, and as a result, it is possible to improve the encoding efficiency.

In the image decoding device according to an aspect of the present invention, it is preferable that the decoding means decodes, for the above combination, a code shorter than the code assigned to a combination in which a prediction method other than intra prediction is applied to a coding unit of the same size as the predetermined size.

According to the above configuration, in a small coding unit a shorter code can be decoded for intra prediction, which tends to be accurate there, than for inter prediction, which is less likely to be accurate there.
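The size-dependent code assignment described above can be illustrated with a small prefix-code sketch. The two tables, the bit strings, and the 8-pixel boundary below are hypothetical values chosen only to show the idea: intra prediction in a small coding unit receives a one-bit code, whereas intra prediction in a large coding unit and inter prediction in a small coding unit each receive two bits.

```python
from typing import Dict, Iterator

# Hypothetical prefix codes; the actual code tables are not given in this section.
LARGE_CU_CODES: Dict[str, str] = {"0": "inter", "10": "intra"}
SMALL_CU_CODES: Dict[str, str] = {"0": "intra", "10": "inter"}


def decode_pred_method(bits: Iterator[str], cu_size: int,
                       small_size: int = 8) -> str:
    """Read bits until a codeword of the size-dependent table is matched."""
    table = SMALL_CU_CODES if cu_size <= small_size else LARGE_CU_CODES
    code = ""
    for bit in bits:
        code += bit
        if code in table:
            return table[code]
    raise ValueError("bitstream ended inside a codeword")
```

The same bit "0" is therefore interpreted as intra prediction for an 8 × 8 coding unit and as inter prediction for a 16 × 16 coding unit.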

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image in a prediction unit by an inter-screen prediction method of either uni-prediction, which refers to one reference image, or bi-prediction, which refers to two reference images. The apparatus includes bi-prediction restriction means for restricting bi-prediction for a target prediction unit, which is a prediction unit of a predetermined size or less to which the inter-screen prediction is applied.

Bi-prediction requires more processing than single prediction. Note that bi-prediction is a prediction method that uses two images referred to in inter-screen prediction. The image to be referred to may be temporally forward or backward with respect to the target frame.

Also, a small prediction unit having a size smaller than or equal to a predetermined size has a larger processing amount per unit area than a large prediction unit having a size larger than the predetermined size.

Therefore, when bi-prediction is performed in a small prediction unit, the amount of processing is large, so that it tends to be a bottleneck in decoding processing.
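To make the per-unit-area argument concrete, the following sketch counts the reference samples fetched per predicted pixel. It assumes an 8-tap separable interpolation filter, so that a W × H block needs a (W + 7) × (H + 7) reference window for each reference image; the filter length and the function name are assumptions made for illustration and are not stated in this section.

```python
def ref_samples_per_pixel(width: int, height: int, num_refs: int,
                          taps: int = 8) -> float:
    """Reference samples fetched per predicted pixel for one prediction unit.

    num_refs is 1 for uni-prediction and 2 for bi-prediction.
    """
    window = (width + taps - 1) * (height + taps - 1)
    return num_refs * window / (width * height)


for w, h in [(4, 4), (8, 8), (16, 16), (64, 64)]:
    uni = ref_samples_per_pixel(w, h, num_refs=1)
    bi = ref_samples_per_pixel(w, h, num_refs=2)
    print(f"{w}x{h}: uni {uni:.2f}, bi {bi:.2f}")
```

Under these assumptions a bi-predicted 4 × 4 block fetches about 15 reference samples per pixel, against about 2.5 for a bi-predicted 64 × 64 block, which is why the small bi-predicted case dominates the worst-case cost.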

According to the above configuration, bi-prediction is limited in a small prediction unit. The restriction includes omitting a part of processing in bi-prediction and not performing bi-prediction processing.

In the image decoding device according to an aspect of the present invention, it is preferable that the bi-prediction restriction means applies the above restriction to a target prediction unit for which decoding of at least part of the motion vector used to generate the predicted image is not omitted, and for which the prediction parameters are not estimated from the prediction parameters assigned to prediction units in the vicinity of the target prediction unit.

According to the above configuration, bi-prediction is restricted when so-called skip processing and merge processing are not performed and the prediction parameters are actually derived for the target prediction unit.

When skip processing and merge processing are not performed, it is necessary to decode all motion vectors, which increases the processing amount. Therefore, by performing the above restriction in such a case, it is possible to reduce the amount of processing that becomes a bottleneck of decoding processing.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction unit performs single prediction by omitting decoding of information indicating whether bi-prediction or single prediction is performed.

According to the above configuration, the decoding process in the target prediction unit for which bi-prediction is restricted can be simplified. Further, the overhead of decoding information indicating whether to perform bi-prediction or uni-prediction, even though it is determined in advance that uni-prediction is performed, can be avoided.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction unit omits processing of information related to weighted prediction in bi-prediction.

According to the above configuration, the amount of processing in bi-prediction can be reduced by omitting processing of information related to weighted prediction in bi-prediction. As a result, it is possible to reduce the amount of processing that becomes a bottleneck of decoding processing, such as processing information related to weighted prediction.

In the image decoding device according to an aspect of the present invention, the image decoding device may include merge candidate derivation means that derives a merge candidate from a motion compensation parameter used for decoding a decoded prediction unit, and the bi-prediction restriction means may include bi-prediction conversion means that converts bi-prediction into uni-prediction when the merge candidate derived by the merge candidate derivation means is bi-predictive.

According to the above configuration, even when the merge process is used for decoding the prediction image in the target prediction unit, bi-prediction can be restricted, and the decoding process in the target prediction unit can be simplified.

In the image decoding device according to an aspect of the present invention, in a case where two prediction list use flags, which are flags indicating whether or not a reference picture list is used, are used, the bi-prediction conversion means may, when both of the two prediction list use flags indicate that the reference picture list is used, convert one of the two prediction list use flags so that it indicates that the reference picture list is not used.

According to the above configuration, bi-prediction can be restricted in an image decoding apparatus that controls whether bi-prediction or uni-prediction is performed using the prediction list use flags, and the decoding process in the target prediction unit can be simplified.

In the image decoding device according to an aspect of the present invention, the bi-prediction conversion means may convert the prediction list use flag that indicates use of the L0 list so that it indicates that the L0 list is not used.

According to the above configuration, in a decoding device that controls whether to perform bi-prediction or uni-prediction using the prediction list use flags, bi-prediction can be restricted while maintaining coding efficiency, and the decoding process in the target prediction unit can be simplified.

Here, the L0 list is a list of pictures mainly used for forward prediction. In general, when motion compensation parameters are derived, reference pictures in the L0 list are often given priority. Conversely, using reference pictures in the L1 list instead of reference pictures in the L0 list makes the process differ from the derivation processes that prioritize the L0 list. When a plurality of derivation processes can be selected according to a certain encoding parameter, and the derivation process for one group gives priority to the L0 list while the derivation process for another group gives priority to the L1 list, the derivation processes complement each other and therefore work effectively for sequences and regions with a wider variety of motion characteristics. Therefore, in the bi-prediction conversion, high coding efficiency can be obtained by using the L1 list.

In the image decoding device according to an aspect of the present invention, when the bi-prediction conversion means converts the flag indicating whether or not the reference picture list is used so that it indicates that the reference picture list is not used, the reference index number and the motion vector need not be refreshed.

According to the above configuration, even when the use of the reference picture list is restricted, the reference index number and the motion vector value of the restricted one can be used in the subsequent processing. Therefore, bi-prediction can be restricted while maintaining encoding efficiency as compared with the case where refresh is performed, and decoding processing in the target prediction unit can be simplified.
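The bi-prediction conversion described above can be sketched as follows. The `MotionParams` record, its field names, and the `drop_list` parameter are hypothetical names introduced for this illustration; the sketch only shows that one prediction list use flag is cleared while the reference index and motion vector of the cleared list are left as they are (no refresh).

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class MotionParams:
    """Hypothetical motion compensation parameters of one merge candidate."""
    pred_flag_l0: bool        # prediction list use flag for the L0 list
    pred_flag_l1: bool        # prediction list use flag for the L1 list
    ref_idx_l0: int
    ref_idx_l1: int
    mv_l0: Tuple[int, int]
    mv_l1: Tuple[int, int]


def bi_to_uni(mp: MotionParams, drop_list: int = 0) -> MotionParams:
    """Convert a bi-predictive candidate into a uni-predictive one.

    Only the use flag of the dropped list is cleared; its reference index
    and motion vector keep their values for use in later processing.
    """
    if not (mp.pred_flag_l0 and mp.pred_flag_l1):
        return mp  # already uni-predictive, nothing to convert
    if drop_list == 0:
        return replace(mp, pred_flag_l0=False)
    return replace(mp, pred_flag_l1=False)
```

Calling `bi_to_uni(mp, drop_list=0)` corresponds to the variant that stops using the L0 list and keeps the L1 list, and `drop_list=1` to the variant that stops using the L1 list.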

In the image decoding device according to an aspect of the present invention, when decoding the code number corresponding to the combined inter prediction reference index, the bi-prediction restriction means may set the maximum value of the code number to the number of combined reference picture sets for uni-prediction when bi-prediction is restricted, and to the sum of the number of combined reference picture sets for uni-prediction and the number of combined reference picture sets for bi-prediction when bi-prediction is not restricted.

According to the above configuration, in an image decoding apparatus using a combined inter prediction reference index, the overhead of decoding code numbers corresponding to bi-prediction, even though it is determined in advance that uni-prediction is performed, can be avoided.
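The effect of the maximum code number can be sketched as follows. The function names are hypothetical, and the use of a truncated unary binarization for the code number is an assumption made here only to show why a smaller maximum shortens the code; this section does not state which binarization is used.

```python
def max_code_number(num_uni_sets: int, num_bi_sets: int,
                    restrict_bipred: bool) -> int:
    """Maximum code number for the combined inter prediction reference index."""
    if restrict_bipred:
        return num_uni_sets
    return num_uni_sets + num_bi_sets


def truncated_unary_bits(code_number: int, max_value: int) -> int:
    """Bits needed for code_number under truncated unary (assumed binarization).

    Values below max_value need a terminating bit; max_value itself does not.
    """
    if not 0 <= code_number <= max_value:
        raise ValueError("code number out of range")
    return code_number + 1 if code_number < max_value else max_value
```

For example, with four combined reference picture sets for uni-prediction and four for bi-prediction, code number 4 costs 4 bits when bi-prediction is restricted (maximum 4) and 5 bits when it is not (maximum 8).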

In the image decoding apparatus according to an aspect of the present invention, when decoding the combined inter prediction reference index, the bi-prediction restriction means may derive the combined inter prediction reference index from the code number using a variable table when bi-prediction is restricted, and may derive the combined inter prediction reference index from the code number without using the variable table when bi-prediction is not restricted.

According to the above configuration, in the image decoding apparatus that decodes a combined inter prediction reference index from a code number using a variable table, it is possible to simplify the decoding process for the variable table when bi-prediction is restricted.

In the image decoding apparatus according to an aspect of the present invention, when the decoded combined inter prediction reference index indicates something other than a combined reference picture set, the bi-prediction restriction means may perform uni-prediction while omitting decoding of the information indicating whether bi-prediction or uni-prediction is to be performed if bi-prediction is restricted, and may decode that information if bi-prediction is not restricted.

According to the above configuration, in an image decoding apparatus using the combined inter prediction reference index, the overhead of decoding information indicating whether bi-prediction or uni-prediction is to be performed, even though it is determined in advance that uni-prediction is performed, can be avoided.

In the image decoding apparatus according to an aspect of the present invention, the image decoding apparatus may include merge candidate derivation means for deriving merge candidates, which are candidates for motion compensation parameters, when merge processing is used for decoding a predicted image in a target prediction unit. The merge candidate derivation means may include adjacent merge candidate derivation means for deriving a merge candidate from a motion compensation parameter used for decoding an adjacent prediction unit adjacent to the target prediction unit, and bi-predictive merge candidate derivation means for deriving a bi-predictive merge candidate from a plurality of reference pictures. When the target prediction unit has a predetermined size, the merge candidate derivation means need not use the merge candidates derived by the bi-predictive merge candidate derivation means.

According to the above configuration, derivation of merge candidates can be simplified by omitting derivation of merge candidates for bi-prediction.

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts. The apparatus includes changing means for changing the plurality of codes associated with a plurality of sets of a prediction method and a division type, the division type being the type by which the target coding unit (the coding unit to be decoded) is divided into prediction units, according to a decoded parameter assigned to a decoded prediction unit in the vicinity of the target prediction unit, which is the prediction unit to be decoded.

The unit for generating the predicted image is determined based on the encoding unit that is the unit of the encoding process. Specifically, the same region as the coding unit or a region obtained by dividing the coding unit is set as the prediction unit.

In the above configuration, the division type into the prediction units may include division into a square and division into a non-square. The division into squares is a case where the prediction unit obtained by the division is a square.

For example, this is the case when a square coding unit is divided into four squares. This also applies to the case of non-division using a region having the same size as the square coding unit as a prediction unit. The division type in the case of non-division is generally expressed as 2N × 2N.

Division into a non-square is the case where the prediction unit obtained by the division is non-square. For example, this is the case when the coding unit area is divided into a large rectangle and a small rectangle.

Also, the code here means a binary string of an encoded parameter value. The binary string may be encoded directly or may be encoded by arithmetic coding. The prediction method is either inter-screen prediction or intra-screen prediction. Further, the set of the prediction method and the division type is, for example, (intra-screen prediction, non-division), and may be represented by a parameter value called pred_type.

Also, in the above configuration, a code is associated with a set of prediction scheme and division type on a one-to-one basis.

According to the above configuration, the association is changed according to the decoded parameter. In other words, even if the codes are the same, the interpretation of which prediction scheme and division type pair is shown is changed according to the decoded parameters.

Therefore, a shorter code can be assigned to a combination of a prediction method and a division type having a higher occurrence probability.

Specifically, when a coding unit adjacent to the target coding unit is a coding unit on which intra prediction is performed, intra prediction is likely to be used for the target coding unit as well.

Therefore, in such a case, it is desirable to assign a short code to the set including the intra prediction.

According to the above configuration, a shorter code is assigned to a set of a prediction method and a division type having a higher probability of occurrence, according to a decoded parameter assigned to a neighboring decoded prediction unit. As a result, coding efficiency can be improved.

In the image decoding device according to an aspect of the present invention, it is preferable that, when the intra prediction method is assigned to a decoded coding unit in the vicinity of the target coding unit, the changing means changes the code associated with a set that includes the intra prediction method to a short code.

According to the above configuration, when the intra prediction method is assigned to a decoded coding unit in the vicinity of the target coding unit, the code associated with a set that includes the intra prediction method is changed to a short code.

Note that there may be a plurality of neighboring coding units; examples include the upper adjacent coding unit and the left adjacent coding unit.

In this case, the prediction method for intra-screen prediction only needs to be assigned to one or both of the upper adjacent coding unit and the left adjacent coding unit.

When the prediction method for intra prediction is assigned to a decoded encoding unit in the vicinity of the target encoding unit, it is highly likely that the target encoding unit is also assigned intra prediction.

For this reason, it is possible to shorten a code associated with a group having a high occurrence frequency, and to improve encoding efficiency.
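The change of association described above can be sketched with a small code table that is rebuilt from the prediction methods of the upper and left neighbors. The default ordering, the truncated unary codewords, and the function names are hypothetical; the sketch shows only that the same codeword is interpreted as a different set of prediction method and division type depending on decoded neighbor parameters, while the association stays one-to-one.

```python
from typing import Dict, List, Tuple

PredType = Tuple[str, str]  # (prediction method, division type)

# Hypothetical default ordering: earlier entries receive shorter codes.
DEFAULT_ORDER: List[PredType] = [
    ("inter", "2Nx2N"), ("inter", "2NxN"), ("inter", "Nx2N"), ("intra", "2Nx2N"),
]


def unary_code(index: int, last: int) -> str:
    """Truncated unary codeword: index ones, then a zero unless index is last."""
    return "1" * index + ("" if index == last else "0")


def build_code_table(upper_is_intra: bool,
                     left_is_intra: bool) -> Dict[str, PredType]:
    """Map codewords to sets, moving intra sets first if a neighbor is intra."""
    order = list(DEFAULT_ORDER)
    if upper_is_intra or left_is_intra:
        # Stable sort: intra sets move to the front, other order is kept.
        order.sort(key=lambda pt: pt[0] != "intra")
    last = len(order) - 1
    return {unary_code(i, last): pt for i, pt in enumerate(order)}
```

With no intra neighbor the codeword "0" means (inter, 2Nx2N); with an intra neighbor above or to the left the same codeword means (intra, 2Nx2N).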

In the image decoding apparatus according to an aspect of the present invention, it is preferable that, when a decoded coding unit adjacent to the target coding unit is smaller than the target coding unit, the changing means changes the code associated with a set that includes a division type that divides in the adjacent direction to a short code.

According to the above configuration, when a decoded coding unit adjacent to the target coding unit is smaller than the target coding unit, the code associated with a set that includes a division type that divides in the adjacent direction is changed to a short code.

When a decoded coding unit adjacent to the target coding unit is smaller than the target coding unit, an edge is likely to exist in the direction perpendicular to the boundary between the target coding unit and the adjacent decoded coding unit. That is, an edge often appears in the direction in which the target coding unit adjoins the decoded coding unit.

In such a case, a division type that divides in the adjacent direction is likely to be selected.

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image, using a prediction method of inter-screen prediction, for each prediction unit obtained by dividing a coding unit into one or more parts. The target prediction unit, which is the prediction unit to be decoded, is a prediction unit whose prediction parameters are estimated from prediction parameters assigned to regions in the vicinity of the target prediction unit, and the apparatus includes candidate determining means for determining the candidate regions used for the estimation according to the size of the target prediction unit.

According to the above configuration, the candidates for the regions used for so-called skip or merge are determined according to the size of the target prediction unit. Alternatively, the candidates for the regions used to derive the estimated motion vector, which is used together with the decoded difference motion vector to restore the motion vector, are determined.

The correlation of motion vectors for inter-screen prediction varies depending on the size of the target prediction unit. For example, in a region where a small prediction unit of a predetermined size or less is selected, the motion of the object is often complicated, and the correlation between the motion vectors is small.

Therefore, according to the above configuration, for example, the number of candidates can be reduced according to the complexity of the movement. Thereby, side information can be reduced and, as a result, encoding efficiency can be improved.

In the image decoding device according to an aspect of the present invention, it is preferable that the candidate determining means makes the number of candidates for a small prediction unit of a predetermined size or less smaller than the number of candidates for a prediction unit larger than the small prediction unit.

According to the above configuration, the number of candidates for a small prediction unit of a predetermined size or less is made smaller than the number of candidates for a prediction unit larger than the small prediction unit.

As described above, in a region where a small prediction unit of a predetermined size or less is selected, the motion of the object is often complicated, and the correlation between the motion vectors is small.

Therefore, in the above area, it is preferable to reduce the number of candidates because side information can be reduced.

In the image decoding device according to an aspect of the present invention, it is preferable that the candidate determination unit does not include temporal prediction in the candidates in a small prediction unit of a predetermined size or less.

According to the above configuration, temporal prediction is not included in the candidates in a small prediction unit of a predetermined size or less.

In a region where the motion is complex such that a small prediction unit is selected, the correlation between the related prediction unit (collocated PU) used for temporal prediction and the target prediction unit is small, and therefore the possibility that temporal prediction is selected is small. Therefore, in such a region, it is preferable not to include temporal prediction as a merge candidate.
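The size-dependent candidate determination described above can be sketched as follows. The candidate names, the number of spatial candidates kept for a small prediction unit, and the 8-pixel boundary are hypothetical values chosen for illustration.

```python
from typing import List

# Hypothetical candidate regions; the names are not taken from this section.
SPATIAL_CANDIDATES = ["left", "upper", "upper_right", "lower_left"]
TEMPORAL_CANDIDATE = "collocated"


def candidate_regions(width: int, height: int,
                      small_size: int = 8) -> List[str]:
    """Return the regions used to estimate prediction parameters for a PU.

    A small PU gets fewer spatial candidates and no temporal candidate;
    a larger PU gets all spatial candidates plus the temporal one.
    """
    if width <= small_size and height <= small_size:
        return SPATIAL_CANDIDATES[:2]
    return SPATIAL_CANDIDATES + [TEMPORAL_CANDIDATE]
```

Because the candidate index is chosen among fewer entries for a small prediction unit, less side information is needed to signal it.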

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts. The division types into prediction units include division into rectangular prediction units, and the apparatus includes decoding means that, among the codes specifying the division into rectangular prediction units, which include a code indicating whether the rectangle is vertically long or horizontally long and a code indicating the type of the rectangle, decodes the code indicating the type of the rectangle without using a context.

According to the above configuration, when the division type into prediction units is division into rectangular prediction units, the code indicating the type of rectangle is decoded without using a context.

The types of rectangles are, for example, three types of 2N × N, 2N × nU, and 2N × nD when the division type is horizontal rectangular division.

The division of the prediction unit is often performed so as not to cross the edge existing in the coding unit area. When an edge having an inclination exists in a region, the same rectangular type may not be selected continuously. In such an area, even if decoding processing is performed using a context, the encoding efficiency may not be improved.

Conversely, in such a region, even if decoding processing is performed without using a context, the encoding efficiency may not be reduced.

According to the above configuration, it is possible to simplify the processing by not referring to the context while maintaining the encoding efficiency in the above-described region.
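Decoding part of the division type code without a context can be sketched as follows. `RecordingDecoder` is a stand-in for an arithmetic decoder that simply replays preset binary values and records how each one was decoded; the class, the binarization, and the context index are hypothetical. The vertical type names nLx2N and nRx2N follow common usage and do not appear in this section.

```python
from typing import List


class RecordingDecoder:
    """Stand-in for an arithmetic decoder: replays bins and logs the mode used."""

    def __init__(self, bins: List[int]) -> None:
        self._bins = list(bins)
        self.log: List[str] = []

    def decode_context(self, ctx_id: int) -> int:
        self.log.append(f"ctx{ctx_id}")
        return self._bins.pop(0)

    def decode_bypass(self) -> int:
        self.log.append("bypass")
        return self._bins.pop(0)


def decode_rect_partition(dec: RecordingDecoder) -> str:
    """Decode a rectangular division type (assumed binarization)."""
    # First bin: horizontally long (0) or vertically long (1); uses a context.
    vertical = dec.decode_context(0)
    # Remaining bins give the type of the rectangle; decoded without a context.
    if dec.decode_bypass():  # 1: symmetric division
        return "Nx2N" if vertical else "2NxN"
    second = dec.decode_bypass()  # selects one of the two asymmetric types
    if vertical:
        return "nRx2N" if second else "nLx2N"
    return "2NxnD" if second else "2NxnU"
```

For the bins 0, 0, 1 the sketch returns 2NxnD and records one context-coded bin followed by two bins decoded without a context.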

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that restores an image by generating a predicted image, using a prediction method of inter-screen prediction, for each prediction unit obtained by dividing a coding unit into one or more parts. The division types into prediction units include asymmetric division, which divides the coding unit into a plurality of prediction units of different sizes, and symmetric division, which divides the coding unit into a plurality of prediction units of the same size, and the apparatus includes estimation means that, when the division type is asymmetric division, estimates prediction parameters for inter-screen prediction by an estimation method different from the one used when the division type is symmetric division.

According to the above configuration, when the division type is asymmetric division, the prediction parameter for inter-screen prediction is estimated by an estimation method different from that when the division type is symmetric division.

The coding unit for which asymmetric division is selected is divided asymmetrically into a smaller prediction unit and a larger prediction unit in the division for obtaining the prediction unit.

Also, in the coding unit in which asymmetric division is selected, there is a high possibility that an edge crossing the smaller prediction unit in the long side direction exists.

Also, there is a high possibility that an accurate motion vector is derived in the region where the edge exists. That is, when the division type is asymmetric division, the region from which a highly accurate motion vector is derived differs from the case where the division type is symmetric division.

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that generates a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained by dividing a coding unit into one or more parts, and restores an image by adding the prediction residual to the predicted image. The division types into prediction units include asymmetric division, which divides the coding unit into a plurality of prediction units of different sizes, and symmetric division, which divides the coding unit into a plurality of prediction units of the same size, and the apparatus includes transform unit dividing means that, when the division type of the target coding unit (the coding unit to be decoded) is asymmetric division, determines the transform unit division method according to the size of the prediction unit included in the target coding unit.

According to the above configuration, when the division type of the target coding unit, which is the coding unit to be decoded, is asymmetric division, the transform unit division method is determined according to the size of the prediction unit included in the target coding unit.

When asymmetric partitioning is applied, the smaller prediction unit is more likely to contain an edge, whereas the larger prediction unit is less likely to contain an edge.

When there is no directionality in the prediction residual, the correlation can be removed more efficiently by applying a square transform rather than a rectangular transform as the transform unit division method.

According to the above configuration, when the division type is asymmetric division, a transform unit division method that can efficiently remove the correlation can be selected according to the size of the prediction unit included in the target coding unit. As a result, coding efficiency is improved.
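The size-dependent choice of transform unit division can be sketched as follows. The function name, the returned labels, and the rule itself (rectangular transforms along the long side for the smaller prediction unit, square transforms for the larger one) are assumptions that follow the reasoning above, in which the smaller prediction unit is more likely to contain an edge; the actual selection rule is not specified in this section.

```python
def transform_split_for_pu(pu_width: int, pu_height: int, cu_size: int) -> str:
    """Pick a transform unit division method for one PU of an asymmetric CU.

    cu_size is the side length of the square coding unit that contains the PU.
    """
    is_smaller_pu = pu_width * pu_height < (cu_size * cu_size) // 2
    if not is_smaller_pu:
        # Larger PU: an edge is less likely, so square transforms are used.
        return "square"
    # Smaller PU: an edge along the long side is likely, so use rectangles.
    return "horizontal_rect" if pu_width > pu_height else "vertical_rect"
```

For a 32 × 32 coding unit divided as 2N × nU, the 32 × 8 prediction unit is given horizontally long transforms and the 32 × 24 prediction unit is given square transforms under this assumed rule.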

An image decoding apparatus according to an aspect of the present invention is an image decoding apparatus that generates a predicted image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained by dividing a coding unit into one or more parts, and restores an image by adding the prediction residual to the predicted image. The division methods into transform units include square division and rectangular division, and the apparatus includes dividing means that divides the target transform unit by the rectangular division method when the target prediction unit, which is the prediction unit to be decoded, is square.

In some cases, a square prediction unit is selected even though an edge exists in the region and the image has directionality. For example, when an object having a large number of horizontal edges is moving, since the motion is uniform within the object, a square prediction unit is selected. However, in such a case, in the conversion process, it is desirable to apply a conversion unit having a shape that is long in the horizontal direction along the horizontal edge.

Thereby, a rectangular conversion unit can be selected even in a square coding unit, and therefore, the coding efficiency for the above-described region can be improved.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that, when the transform unit corresponding to division depth 2 is square in a coding unit of a predetermined size, the dividing means further makes the target transform unit corresponding to division depth 1 in the coding unit of the predetermined size a rectangle.

According to the above configuration, when the shape of the target prediction unit is a square, the target transform unit is divided by a rectangular division method.

When the transform unit is recursively divided twice by the square quadtree division method, that is, when the division depth is increased to 2, 16 square transform units are obtained. In this case, the scan order is a recursively applied z-scan. Conventionally, this division method has been applied when the division type of the target coding unit is a square division.

On the other hand, when the transform unit is divided by the horizontally long quadtree division method, each node is divided into square transform units at a division depth of 2. That is, with a division depth of 2, 16 square transform units are finally obtained. In this case, raster scan is applied to the 16 square transform units as the scan order. Conventionally, this division method has been applied when the division type of the target coding unit is non-square division.

Therefore, the scan order is different depending on whether the division type of the target coding unit is square division or non-square division.

On the other hand, according to the above configuration, when the division type of the coding unit is square division, that is, when the shape of the target prediction unit is square, the target transform unit is divided by the rectangular division method.

For this reason, there is an effect that the scanning order can be unified in the case of square division and in the case of non-square division.
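The difference between the two scan orders described above can be illustrated with a minimal sketch. The function names and the (row, column) indexing of the 16 square transform units are assumptions made for illustration only and do not appear in the specification.

```python
def z_scan_order(depth):
    """Positions (row, col) of the 2**depth x 2**depth square units,
    visited in recursively applied z-scan order."""
    if depth == 0:
        return [(0, 0)]
    half = 2 ** (depth - 1)
    sub = z_scan_order(depth - 1)
    order = []
    # Visit the four quadrants in z order: top-left, top-right,
    # bottom-left, bottom-right, recursing inside each quadrant.
    for dr, dc in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        order += [(dr * half + r, dc * half + c) for r, c in sub]
    return order


def raster_scan_order(rows, cols):
    """Positions (row, col) visited row by row, left to right. This is
    the order obtained when four horizontally long nodes are each split
    into four squares."""
    return [(r, c) for r in range(rows) for c in range(cols)]


z = z_scan_order(2)
raster = raster_scan_order(4, 4)
print(z[:6])       # first units visited by the z-scan
print(raster[:6])  # first units visited by the raster scan
```

Both orders cover the same 16 units; only the visiting order differs, which is the mismatch that the rectangular division of square prediction units is intended to remove.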

An image decoding apparatus according to an aspect of the present invention generates a prediction image for each prediction unit obtained by dividing a coding unit into one or more parts, decodes a prediction residual for each transform unit obtained as the coding unit or by dividing the coding unit, and restores an image by adding the prediction residual to the prediction image. The division types into prediction units include asymmetric division, which divides into prediction units of different sizes, and symmetric division, which divides into prediction units of the same size. The apparatus includes coefficient decoding means that, when the division type of the target prediction unit, which is the prediction unit to be decoded, is asymmetric division, decodes transform coefficients with reference to different contexts for the small prediction unit and the large prediction unit obtained by the division.

A small prediction unit obtained by asymmetric division is likely to contain an edge, so transform coefficients are likely to occur in it. In a large prediction unit, on the other hand, transform coefficients are less likely to occur. By using different contexts for the case where the target transform unit is included in the small prediction unit and the case where it is included in the large prediction unit, variable length decoding can be performed according to the occurrence probability of transform coefficients in each region.
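As an illustration of the context switching described above, the following is a minimal sketch. The partition names (2NxnU and so on), the quarter/three-quarter split, and the function names are assumptions introduced for this example; the passage above does not define them.

```python
def pu_rects(part_type, cu):
    """Rectangles (x, y, width, height) of the two PUs of an asymmetric
    partition inside a cu x cu coding unit. The names and the 1/4 : 3/4
    split are assumptions of this sketch."""
    q = cu // 4
    return {
        "2NxnU": [(0, 0, cu, q), (0, q, cu, cu - q)],
        "2NxnD": [(0, 0, cu, cu - q), (0, cu - q, cu, q)],
        "nLx2N": [(0, 0, q, cu), (q, 0, cu - q, cu)],
        "nRx2N": [(0, 0, cu - q, cu), (cu - q, 0, q, cu)],
    }[part_type]


def context_set_for_tu(part_type, cu, tu_x, tu_y):
    """Return which context set to use for the transform unit whose
    top-left corner is (tu_x, tu_y): 'small_pu' or 'large_pu'."""
    rects = pu_rects(part_type, cu)
    x, y, w, h = min(rects, key=lambda r: r[2] * r[3])  # the small PU
    inside_small = x <= tu_x < x + w and y <= tu_y < y + h
    return "small_pu" if inside_small else "large_pu"


print(context_set_for_tu("2NxnU", 32, 0, 0))
print(context_set_for_tu("2NxnU", 32, 0, 8))
```

A decoder following this idea would keep two sets of probability states and pick one per transform unit, so that the statistics of the edge-heavy small prediction unit do not dilute those of the large one.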

An image encoding apparatus according to an aspect of the present invention generates image encoded data by encoding, for each coding unit, information for restoring an image. Regarding the codes assigned to combinations of the size of a coding unit and the prediction method applied to the coding unit, the apparatus includes encoding means for encoding, for a combination in which intra prediction is applied to a coding unit having a size equal to or smaller than a predetermined size, a code shorter than the codes assigned to the other combinations.

Note that an image coding apparatus having a configuration corresponding to the image decoding apparatus also falls within the scope of the present invention. According to the image encoding device configured as described above, it is possible to achieve the same effects as those of the image decoding device according to one aspect of the present invention.

An image decoding apparatus according to an aspect of the present invention decodes encoded data and restores an image in a prediction unit using, as the prediction method of inter-screen prediction, either single prediction, which refers to one reference image, or bi-prediction, which refers to two reference images. The apparatus includes bi-prediction restriction means for restricting bi-prediction on the basis of bi-prediction restriction information, which is included in the encoded data and indicates the size of the prediction unit for which bi-prediction is restricted.

According to the above configuration, if the image encoding apparatus adaptively encodes the bi-prediction restriction information according to its intention in image encoding, the image decoding apparatus can restrict bi-prediction according to that intention. This makes it possible to perform fine adjustment according to the resolution of the image and the performance of the image encoding apparatus and image decoding apparatus.

In the image decoding device according to an aspect of the present invention, it is preferable to provide restriction information decoding means for decoding the bi-prediction restriction information when a bi-prediction restriction flag, which is included in the encoded data and indicates whether or not to restrict bi-prediction, indicates that bi-prediction is to be restricted.

According to the above configuration, it is possible to adaptively limit bi-prediction according to a flag explicitly set by the image encoding device. As a result, bi-prediction restriction processing can be performed in the image decoding apparatus as intended by the image encoding apparatus.

In the image decoding device according to an aspect of the present invention, it is preferable that the bi-prediction restriction flag is set according to at least one of the complexity of the encoded data stream and the performance of the image decoding device that decodes the encoded data.

The bi-prediction restriction flag is a flag indicating whether or not bi-prediction restriction is performed, as described above. The bi-prediction restriction flag is set in the image encoding device in accordance with at least one of the complexity of the encoded data stream and the performance of the image decoding device. An example of an index indicating the complexity of the encoded data stream and the performance of the image decoding device is the level limit in H.264/AVC. The level limits define the speed at which the decoder decodes the bit stream. A level consists of two parts, an integer level and a sub-level (the part after the decimal point). The integer level defines a rough range, and levels 1 to 5 are defined.

For example, level 4 corresponds to 1080p resolution of HDTV (High Definition Television), and level 5 corresponds to 4k resolution.

In addition, detailed specifications at each integer level are defined in the sub level.

Here, depending on the level, the PU size for which a constraint is to be set and the prediction unit (PU) for which bi-prediction is to be limited are different. For example, in level 4 (HD), it is preferable to set restrictions on 4 × 4 PU and to restrict bi-prediction in 8 × 4 PU and 4 × 8 PU. In level 5 (4k), it is preferable to restrict 8 × 4 PU and 4 × 8 PU, and to restrict bi-prediction in 8 × 8 PU.

Further, it is preferable that such a restriction of bi-prediction is explicitly specified in the image coding apparatus.

According to the above configuration, when the image encoding apparatus sets the bi-prediction restriction flag and the bi-prediction restriction information in the encoded data according to the level, the image decoding apparatus can restrict bi-prediction based on the bi-prediction restriction flag and the bi-prediction restriction information.

That is, bi-prediction restriction can be performed in the image decoding apparatus in accordance with the bi-prediction restriction designation explicitly set in the image encoding apparatus.

Thus, by restricting bi-prediction based on the bi-prediction restriction information according to the determination result of the bi-prediction restriction flag, adaptive bi-prediction restriction according to the complexity of the stream and the performance (level) of the image decoding apparatus can be performed.

An image decoding apparatus according to an aspect of the present invention decodes encoded data and restores an image in a prediction unit using, as the prediction method of inter-screen prediction, either single prediction, which refers to one reference image, or bi-prediction, which refers to two reference images. The sizes that the prediction unit can take are determined according to the size of the coding unit, which is the unit of encoding, and the apparatus includes bi-prediction restriction means for restricting bi-prediction according to the value of a flag, included in the encoded data, that indicates the minimum size of the coding unit.

For example, in Cited Document 1, the minimum coding unit (CU) size is defined by a parameter “log2_min_coding_block_size_minus3”. In addition, the shape (size) of the prediction unit (PU) is determined in consideration of the size of the coding unit (CU). When the minimum CU size is 8 × 8, in addition to 8 × 8 PU, 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU obtained by dividing 8 × 8 CU can be used. When the minimum CU size is 16 × 16, 8 × 8 PU can be used, while 8 × 4 PU, 4 × 8 PU, and 4 × 4 PU cannot be used.
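The relation between the minimum CU size parameter and the available PU sizes in the example above can be sketched as follows. Only the parameter name log2_min_coding_block_size_minus3 comes from the text; the function names are hypothetical, and only the symmetric splits (2N × 2N, 2N × N, N × 2N, N × N) are modeled.

```python
def min_cu_size(log2_min_coding_block_size_minus3):
    """Minimum CU side length in pixels derived from the parameter."""
    return 1 << (log2_min_coding_block_size_minus3 + 3)


def pu_sizes_from_cu(cu):
    """PU sizes (width, height) obtained from a cu x cu CU by the
    symmetric splits 2Nx2N, 2NxN, Nx2N and NxN."""
    n = cu // 2
    return [(cu, cu), (cu, n), (n, cu), (n, n)]


def pu_sizes_of_min_cu(log2_min_coding_block_size_minus3):
    """PU sizes that the smallest allowed CU can be split into."""
    return pu_sizes_from_cu(min_cu_size(log2_min_coding_block_size_minus3))


print(pu_sizes_of_min_cu(0))  # minimum CU 8x8
print(pu_sizes_of_min_cu(1))  # minimum CU 16x16
```

With a minimum CU of 16 × 16, the smallest PU in the list is 8 × 8, so the 8 × 4, 4 × 8, and 4 × 4 PUs never occur, which is why the minimum CU size can stand in for a dedicated size threshold.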

As described above, it is preferable to limit the bi-prediction according to a so-called level that is set according to at least one of the complexity of the encoded data stream and the performance of the image decoding apparatus.

If the restriction of bi-prediction is performed based on the minimum CU size, the PU size for which bi-prediction is restricted and the PU size for which single prediction itself is restricted can be balanced. That is, this eliminates the imbalance in which the processing amount and transfer amount associated with bi-prediction are limited while those of single prediction are not.

The minimum CU size is defined as an existing parameter as seen in cited document 1. Therefore, by using such an existing parameter, it is possible to easily limit the bi-prediction without causing an increase in the code amount as compared with a case where a flag dedicated to bi-prediction restriction is added.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction means restricts bi-prediction when a bi-prediction restriction flag, which is a flag indicating whether or not to restrict bi-prediction, indicates that bi-prediction is to be restricted.

According to the above configuration, bi-prediction can be restricted based on the bi-prediction restriction flag set by the image encoding device. Thereby, bi-prediction can be restricted adaptively based on the designation explicitly set by the image encoding device.

According to the above configuration, bi-prediction can be limited adaptively according to the level.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction flag can take at least three values and that the bi-prediction restriction means restricts bi-prediction according to the value of the bi-prediction restriction flag.

According to the above configuration, the restriction of bi-prediction can be finely adjusted according to the value of the bi-prediction restriction flag. For example, when the bi-prediction restriction flag is ternary, the following restrictions can be considered. That is, for a CU of a certain size, a bi-prediction restriction flag that can take three values can express three cases: bi-prediction is not restricted, bi-prediction is restricted for PUs other than the 2N × 2N PU, and bi-prediction is restricted for the N × N PU.

More specifically, according to the value of the bi-prediction restriction flag, it is possible to select, for a 16 × 16 CU, among not restricting bi-prediction, restricting bi-prediction only for the 8 × 8 PU, and restricting bi-prediction for the 8 × 8 PU, 16 × 8 PU, and 8 × 16 PU.
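The three cases above can be written down as a minimal sketch. The assignment of the numeric values 0, 1, and 2 to the three cases and the function name are assumptions made for illustration; the text only states that the flag takes three values.

```python
def bipred_restricted_pus(cu, flag):
    """Set of PU sizes (width, height) in a cu x cu CU for which
    bi-prediction is restricted. The mapping of flag values to cases
    is an assumption of this sketch."""
    n = cu // 2
    if flag == 0:
        return set()                       # no restriction
    if flag == 1:
        return {(n, n)}                    # only the N x N PU
    if flag == 2:
        return {(n, n), (cu, n), (n, cu)}  # everything except 2N x 2N
    raise ValueError("flag must be 0, 1 or 2")


for value in (0, 1, 2):
    print(value, sorted(bipred_restricted_pus(16, value)))
```

For a 16 × 16 CU this reproduces the example in the text: nothing restricted, the 8 × 8 PU restricted, or the 8 × 8, 16 × 8, and 8 × 16 PUs restricted.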

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction flag also serves as a flag that prohibits the inter prediction unit of the minimum size predetermined for the coding unit to be processed.

In the above configuration, when the inter prediction unit with the minimum PU (N × N) is prohibited, the bi-prediction restriction flag can be set to indicate that bi-prediction restriction is performed. The reverse is also true.

For example, as a flag indicating whether or not an inter prediction unit with a minimum PU (N × N) is allowed, in Cited Document 1, inter_4x4_enabled_flag is defined. Note that when the size of the CU is 8 × 8 or more, or when inter_4x4_enabled_flag is “1”, N × N inter prediction is allowed.

Here, taking !inter_4x4_enabled_flag (where "!" denotes the logical negation operator), the flag formed in this way can be regarded as a flag indicating whether or not inter 4 × 4 is prohibited.

When the bi-prediction is limited in the smallest PU (N × N) that can be taken with respect to the size of a certain CU, it can be realized by the flag configured as described above. Therefore, in such a case, the number of flags can be reduced, and the restriction of bi-prediction can be realized relatively easily.

An image decoding apparatus according to an aspect of the present invention decodes encoded data and restores an image in a prediction unit using, as the prediction method of inter-screen prediction, either single prediction, which refers to one reference image, or bi-prediction, which refers to two reference images. The apparatus includes merge candidate derivation means for deriving merge candidates from the motion compensation parameters used for decoding already decoded prediction units, and bi-prediction restriction means for restricting bi-prediction for at least a part of the derived merge candidates.

Bi-prediction requires more processing than single prediction. Single prediction is a prediction method that uses one image to be referred to in inter-screen prediction, and bi-prediction is a prediction method that uses two images to be referred to in inter-screen prediction. The image to be referred to may be temporally forward or backward with respect to the target frame.

According to the above configuration, even when the merge process is used for decoding the prediction image in the target prediction unit, bi-prediction can be restricted, and the decoding process in the target prediction unit can be simplified. Note that restricting here covers omitting part of the processing in bi-prediction, converting a bi-prediction motion vector into one that reduces the processing load, and not performing bi-prediction at all (prohibiting it).

By performing the above-described bi-prediction restriction on at least a part of the derived merge candidates, it is possible to reduce the amount of processing that becomes a bottleneck of decoding processing.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction unit converts the bi-prediction into single prediction in the restriction of the bi-prediction.

In the above configuration, “restrict bi-prediction” means that bi-prediction is converted into single prediction so that bi-prediction is not required. As already described, single prediction is less complicated and requires less processing than bi-prediction.

In addition, “converting bi-prediction to single prediction” refers to limiting the number of referenced reference images from two to one.

According to the above configuration, when the merge process is used for decoding the predicted image in the target prediction unit, the decoding process in the target prediction unit can be simplified by restricting the bi-prediction as described above.
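The conversion of a bi-prediction merge candidate into single prediction can be sketched as follows. The MergeCand structure and its field names are hypothetical, and keeping the L0 reference while discarding the L1 reference is one possible choice assumed here for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class MergeCand:
    """Hypothetical merge candidate: one use flag, motion vector and
    reference index per reference picture list (L0 and L1)."""
    pred_flag_l0: bool
    pred_flag_l1: bool
    mv_l0: Optional[Tuple[int, int]] = None
    mv_l1: Optional[Tuple[int, int]] = None
    ref_idx_l0: int = -1
    ref_idx_l1: int = -1


def is_bipred(cand):
    """A candidate is bi-predictive when both lists are used."""
    return cand.pred_flag_l0 and cand.pred_flag_l1


def bi_to_uni(cand):
    """Reduce the number of referenced images from two to one by
    marking the L1 list as unused (assumed choice: keep L0)."""
    if not is_bipred(cand):
        return cand
    return replace(cand, pred_flag_l1=False, mv_l1=None, ref_idx_l1=-1)


bi = MergeCand(True, True, (4, -8), (2, 2), 0, 1)
print(bi_to_uni(bi))
```

Candidates that are already single prediction pass through unchanged, so the conversion can be applied to every entry of a merge candidate list without a separate check by the caller.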

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction means applies the bi-prediction restriction to those of the derived merge candidates that are bi-prediction merge candidates in which at least one of the two motion vectors is a non-integer motion vector including a non-integer component.

According to the above configuration, bi-prediction merge candidates that include a non-integer motion vector are subject to bi-prediction restriction. On the other hand, when the motion vectors of the merge candidate are integer motion vectors, bi-prediction need not be restricted.

Here, the non-integer motion vector means that at least a part of the motion vector component is represented by a non-integer when the pixel position is expressed as an integer value.

For non-integer motion vectors, an interpolation filter for generating an interpolated image is applied, which tends to increase the processing load. On the other hand, in the case of an integer motion vector, such filter processing is not essential.

Also, in the case of an integer motion vector, since interpolation filter processing is not essential, the range referred to in motion compensation can be the same as the target block.

For this reason, in the case of an integer motion vector, even if bi-prediction is performed, the transfer amount and the processing amount do not increase excessively.

According to the above configuration, the conversion from bi-prediction to single prediction can be omitted for integer motion vectors, which impose little load even if bi-prediction is not restricted. The processing load of the bi-prediction restriction that converts merge candidates from bi-prediction to single prediction can therefore be reduced.
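The reasoning above can be made concrete with a small sketch. Quarter-pixel motion vector units and an 8-tap interpolation filter are assumptions made for this example, as are all function names; the text itself does not fix these values.

```python
MV_FRAC_BITS = 2   # assumption: motion vectors stored in quarter-pixel units
FILTER_TAPS = 8    # assumption: 8-tap interpolation filter
FRAC_MASK = (1 << MV_FRAC_BITS) - 1


def is_integer_mv(mv):
    """True when both components point at integer pixel positions."""
    return (mv[0] & FRAC_MASK) == 0 and (mv[1] & FRAC_MASK) == 0


def ref_block_size(width, height, mv):
    """Size of the reference region read for one width x height block.
    A fractional component needs FILTER_TAPS - 1 extra pixels in that
    direction; an integer component needs none."""
    extra_x = FILTER_TAPS - 1 if mv[0] & FRAC_MASK else 0
    extra_y = FILTER_TAPS - 1 if mv[1] & FRAC_MASK else 0
    return (width + extra_x, height + extra_y)


def needs_restriction(mv_l0, mv_l1):
    """Restrict a bi-prediction candidate only when at least one of its
    two motion vectors is a non-integer motion vector."""
    return not (is_integer_mv(mv_l0) and is_integer_mv(mv_l1))


print(ref_block_size(8, 8, (4, 8)))   # integer motion vector
print(ref_block_size(8, 8, (5, 2)))   # fractional in both directions
print(needs_restriction((4, 0), (1, 0)))
```

Under these assumptions an 8 × 8 block with a fractional vector reads a 15 × 15 region per reference image, compared with 8 × 8 for an integer vector, which is why leaving integer-vector candidates as bi-prediction costs comparatively little.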

In the image decoding apparatus according to an aspect of the present invention, it is preferable that, for those of the derived merge candidates that are bi-prediction merge candidates including a non-integer motion vector having a non-integer component, the bi-prediction restriction means converts at least part of the non-integer components of the non-integer motion vector into integer components.

In the above configuration, “restrict bi-prediction” means to restrict bi-prediction with non-integer motion vectors.

As described above, the interpolation filter need not be applied to the integer motion vector. Therefore, by converting the component of the non-integer motion vector of the bi-predictive merge candidate into an integer, it is possible to make the range referred to in motion compensation more closely match the target block range. If all the components are converted into integers, the range referred to in motion compensation matches the target block.

When motion vectors are expressed in (X, Y) two-dimensional coordinates, integerization may be performed only for the X coordinate, only for the Y coordinate, or for both the X coordinate and the Y coordinate. Only one of the L0 and L1 lists may be converted to an integer.

If bi-prediction is performed using integerized motion vectors obtained in this way, an increase in the transfer amount and the processing amount can be suppressed as compared with the case where bi-prediction is performed with non-integer motion vectors.
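The integerization options listed above can be sketched as follows. Quarter-pixel units and truncation of the fractional bits (rounding toward negative infinity) are assumptions of this sketch; the text does not specify the rounding rule, and the function names are hypothetical.

```python
MV_FRAC_BITS = 2  # assumption: motion vectors stored in quarter-pixel units


def integerize(component):
    """Clear the fractional bits of one motion vector component.
    Python's >> on negative integers floors, so this rounds toward
    negative infinity (an assumed rounding rule)."""
    return (component >> MV_FRAC_BITS) << MV_FRAC_BITS


def integerize_mv(mv, do_x=True, do_y=True):
    """Integerize only X, only Y, or both components of mv = (x, y)."""
    x, y = mv
    return (integerize(x) if do_x else x, integerize(y) if do_y else y)


print(integerize_mv((5, -3)))              # both components
print(integerize_mv((5, -3), do_y=False))  # X only
print(integerize_mv((5, -3), do_x=False))  # Y only
```

A caller could apply integerize_mv to the L0 vector, the L1 vector, or both, matching the option of converting only one of the two lists.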

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction unit restricts the bi-prediction to a predetermined number of merge candidates among the derived merge candidates.

According to the above configuration, bi-prediction restriction can be performed on some of the derived merge candidates. For example, when the order of derivation of merge candidates is determined, or when an order is determined in the list in which the derived merge candidates are stored, the top N candidates can be made subject to bi-prediction restriction. N is a positive integer, and N = 1 may be sufficient.

This makes it possible to reduce the processing load compared to the case where bi-prediction restriction is performed on all merge candidates.
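A minimal sketch of restricting only the top N merge candidates is shown below. Representing a candidate as a pair (mv_l0, mv_l1), with None for an unused list, and dropping the L1 vector during conversion are assumptions made for illustration.

```python
def is_bi(cand):
    """cand = (mv_l0, mv_l1); None marks an unused reference list."""
    mv_l0, mv_l1 = cand
    return mv_l0 is not None and mv_l1 is not None


def to_uni(cand):
    """Convert bi-prediction to single prediction by dropping L1
    (assumed choice); single-prediction candidates are unchanged."""
    return (cand[0], None) if is_bi(cand) else cand


def restrict_top_n(merge_list, n=1):
    """Apply the conversion only to the first n entries of the list."""
    return [to_uni(c) if i < n else c for i, c in enumerate(merge_list)]


merge_list = [((1, 2), (3, 4)), ((5, 6), (7, 8)), ((0, 0), None)]
print(restrict_top_n(merge_list, n=1))
```

Only the first n entries are examined and converted, so the work done by the restriction is bounded by n rather than by the length of the list.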

In the image decoding device according to an aspect of the present invention, it is preferable that the bi-prediction restriction means restricts bi-prediction when the derived merge candidates do not include a predetermined number or more of single-prediction candidates.

If the merge candidate list includes a predetermined number or more of single-prediction candidates, the overall processing load may not increase very much even if bi-prediction is allowed. According to the above configuration, the bi-prediction restriction process can be omitted in such a case, which suppresses the increase in processing load that the restriction process itself would cause.
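The conditional restriction described above can be sketched as follows. The candidate representation (mv_l0, mv_l1), the threshold name min_uni, and the choice of dropping L1 are assumptions made for this example.

```python
def is_bi(cand):
    """cand = (mv_l0, mv_l1); None marks an unused reference list."""
    return cand[0] is not None and cand[1] is not None


def to_uni(cand):
    """Convert bi-prediction to single prediction by dropping L1."""
    return (cand[0], None) if is_bi(cand) else cand


def restrict_unless_enough_uni(merge_list, min_uni):
    """Skip the restriction when the list already holds at least
    min_uni single-prediction candidates; otherwise convert every
    bi-prediction candidate. Every candidate is assumed to use at
    least one list, so 'not bi' means single prediction."""
    uni_count = sum(1 for c in merge_list if not is_bi(c))
    if uni_count >= min_uni:
        return list(merge_list)
    return [to_uni(c) for c in merge_list]


mostly_uni = [((1, 1), None), (None, (2, 2)), ((3, 3), (4, 4))]
print(restrict_unless_enough_uni(mostly_uni, min_uni=2))
print(restrict_unless_enough_uni(mostly_uni, min_uni=3))
```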

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction means restricts bi-prediction for the derived merge candidates after the merge candidate derivation means has derived all of the merge candidates.

According to the above configuration, first, a merge candidate list is generated. Also, such processing is common in derivation of merge candidates. According to the above configuration, since it is not necessary to change the general merge candidate list generation process in this way, it is possible to prevent the processing logic from becoming complicated.

In the image decoding apparatus according to an aspect of the present invention, it is preferable that the bi-prediction restriction means restricts bi-prediction for the derived merge candidates in parallel with the process in which the merge candidate derivation means derives the merge candidates.

According to the above configuration, bi-prediction is limited before the merge candidates are stored in the merge candidate list. This process is performed in parallel with the process of deriving merge candidates.

Such parallelization can improve processing efficiency. This is particularly useful when the allowable range of processing latency is narrow.

In addition, when bi-prediction is restricted by converting bi-prediction merge candidates to single prediction and the uniqueness of the merge candidates is checked afterward, a merge candidate list free of duplicate merge candidates can be created, unlike the case where the merge candidate list is generated and checked for uniqueness first and the conversion from bi-prediction to single prediction is performed afterward.
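The effect of the processing order on duplicates can be shown with a minimal sketch. The candidate representation (mv_l0, mv_l1) and the function names are assumptions made for illustration, and reference indices are omitted for brevity.

```python
def to_uni(cand):
    """cand = (mv_l0, mv_l1); drop L1 when both lists are used."""
    mv_l0, mv_l1 = cand
    return (mv_l0, None) if mv_l0 is not None and mv_l1 is not None else cand


def unique(cands):
    """Keep the first occurrence of each candidate, preserving order."""
    seen, out = set(), []
    for c in cands:
        if c not in seen:
            seen.add(c)
            out.append(c)
    return out


def convert_then_check(cands):
    """Convert while deriving, then check uniqueness."""
    return unique([to_uni(c) for c in cands])


def check_then_convert(cands):
    """Build the unique list first, then convert afterward."""
    return [to_uni(c) for c in unique(cands)]


derived = [((1, 0), (2, 0)), ((1, 0), None), ((1, 0), (3, 3))]
print(convert_then_check(derived))
print(check_then_convert(derived))
```

In this example the three derived candidates are distinct before conversion but identical after it, so checking uniqueness first leaves three copies of the same single-prediction candidate, while converting first leaves one.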

An image encoding device according to an aspect of the present invention generates image encoded data by encoding, for each coding unit, information for restoring an image. The device includes encoding means for encoding, in the encoded data, a bi-prediction restriction flag, which is a flag indicating whether or not to restrict bi-prediction among the inter-screen predictions consisting of single prediction referring to one reference image and bi-prediction referring to two reference images, according to at least one of the complexity of the stream of encoded data and the performance of the image decoding apparatus that decodes the encoded data.

The data structure of image encoded data according to one aspect of the present invention is a data structure of image encoded data in which information for restoring an image is encoded for each coding unit by an image encoding device. The data structure includes a bi-prediction restriction flag, which is a flag indicating whether or not to restrict bi-prediction among the inter-screen predictions consisting of single prediction referring to one reference image and bi-prediction referring to two reference images, set according to at least one of the complexity of the stream of encoded data and the performance of the image decoding apparatus that decodes the encoded data.

Note that an image encoding device having a configuration corresponding to the image decoding device and a data structure of image encoded data generated by the image encoding device also fall within the scope of the present invention. According to the image encoding device and the data structure of the image encoded data configured as described above, the same effects as those of the image decoding device according to an aspect of the present invention can be achieved.

(Hardware implementation and software implementation)
Each block of the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).

In the latter case, each device includes a CPU that executes instructions of a program that realizes each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the present invention can also be achieved by supplying each of the above devices with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program for each device, which is software that realizes the above-described functions, is recorded in a computer-readable manner, and by having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.

Examples of the recording medium include tapes such as magnetic tapes and cassette tapes; disks including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical discs such as CD-ROM (Compact Disc Read-Only Memory), MO disc (Magneto-Optical disc), MD (Mini Disc), DVD (Digital Versatile Disc), CD-R (CD Recordable), and Blu-ray Disc (registered trademark); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (registered trademark) (Electrically Erasable Programmable Read-Only Memory), and flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).

Further, each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN (Local Area Network), an ISDN (Integrated Services Digital Network), a VAN (Value-Added Network), a CATV (Community Antenna Television / Cable Television) communication network, a virtual private network (Virtual Private Network), a telephone line network, a mobile communication network, a satellite communication network, and the like can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (IrDA (Infrared Data Association) or remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite lines, and terrestrial digital networks. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

The present invention is not limited to the above-described embodiment, and various modifications can be made within the scope indicated in the claims. That is, embodiments obtained by combining technical means appropriately modified within the scope of the claims are also included in the technical scope of the present invention.

The present invention can be suitably applied to an image decoding apparatus that decodes encoded data obtained by encoding image data and an image encoding apparatus that generates encoded data obtained by encoding image data. Further, the present invention can be suitably applied to the data structure of encoded data generated by an image encoding device and referenced by the image decoding device.

DESCRIPTION OF SYMBOLS
1 Video decoding device
10 Decoding module
11 CU information decoding unit
12 PU information decoding unit
13 TU information decoding unit
16 Frame memory
111 CU prediction mode determination unit
112 PU size determination unit
121 Motion compensation parameter derivation unit (bi-prediction restriction means, candidate determination means, estimation means)
122 Merge candidate priority information storage unit
123 Reference frame setting information storage unit
131 TU partition setting unit
132 Transform coefficient restoration unit
1011 CU prediction mode decoding unit (decoding means, changing means)
1012 Binary information storage unit
1013 Context storage unit
1014 Probability setting storage unit
1021 Motion information decoding unit (restriction information decoding means)
1031 Region division flag decoding unit
1032 Determination information decoding unit (coefficient decoding means)
1033 Transform coefficient decoding unit (coefficient decoding means)
1311 Target region setting unit
1312 Division determination unit
1313 Division region setting unit (transform unit dividing means, dividing means)
1314 Transform size determination information storage unit
2 Video encoding device
21 Coding setting unit
23 Predictive image generation unit
25 Frame memory
27 Transform / quantization unit
29 Encoded data generation unit (encoding means)
1211 Skip motion compensation parameter derivation unit
1212 Merge motion compensation parameter derivation unit (merge candidate derivation means)
1213 Basic motion compensation parameter derivation unit
1218 Bi-prediction restricted PU determination unit
1219 Bi-single prediction conversion unit
1220 Motion vector integerization unit
1212A Adjacent merge candidate derivation unit
1212B Temporal merge candidate derivation unit
1212C Unique candidate derivation unit
1212D Combined bi-prediction merge candidate derivation unit
1212E Non-scale bi-prediction merge candidate derivation unit
1212F Zero vector merge candidate derivation unit
1212G Merge candidate derivation control unit
1212H Merge candidate storage unit
1212J Merge candidate selection unit
1213A Adjacent motion vector candidate derivation unit
1213B Temporal motion vector candidate derivation unit
1213F Zero vector merge candidate derivation unit
1213G Motion vector candidate derivation control unit
1213H Motion vector candidate storage unit
1213I Motion vector candidate selection unit
1213J Motion vector restoration unit
1218A Bi-prediction restricted PU determination unit
1219A Bi-single prediction conversion unit (bi-prediction restriction means)
1219B Bi-single prediction conversion unit (bi-prediction restriction means)
3012 Merge motion compensation parameter generation unit
3013 Basic motion compensation parameter generation unit
3013A Motion vector candidate selection unit
3013B Differential motion vector calculation unit
3014 Motion compensation parameter restriction unit

Claims

An image decoding apparatus that decodes an image in a prediction unit using, as a prediction method of inter-screen prediction, single prediction referring to one reference image or bi-prediction referring to two reference images, the image decoding apparatus comprising:
a bi-prediction restriction unit that restricts performing the bi-prediction on the prediction unit when the prediction unit is a prediction unit of a predetermined size or less.
2. The image decoding apparatus according to claim 1, further comprising a merge candidate derivation unit that, in a process of deriving a motion compensation parameter of a prediction unit as a merge candidate, derives the merge candidate based on a motion compensation parameter of an adjacent prediction unit,
wherein, when the derived merge candidate is the bi-prediction,
the bi-prediction restriction unit converts the bi-prediction into the uni-prediction.
3. The image decoding apparatus according to claim 1, wherein at least two prediction list use flags are used, each indicating whether or not a reference picture list is used, and
when the at least two prediction list use flags indicate that the reference picture lists are used, the bi-prediction restriction unit converts one of the at least two prediction list use flags so as to indicate that the reference picture list is not used.
4. The image decoding apparatus according to claim 3, wherein the bi-prediction restriction unit converts the prediction list use flag indicating that an L1 list, which is one of the reference picture lists, is used so as to indicate that the L1 list is not used.
5. The image decoding apparatus according to claim 1, wherein the size of the prediction unit is calculated using a width and a height of the prediction unit.
6. The image decoding apparatus according to claim 1, wherein the bi-prediction restriction unit decodes information indicating whether to perform the bi-prediction or the uni-prediction when the prediction unit is larger than the predetermined size, and, when the prediction unit is of the predetermined size or smaller, omits decoding of the information indicating whether to perform the bi-prediction or the uni-prediction and performs the uni-prediction.
7. An image decoding method for decoding an image in a prediction unit using, as a prediction method of inter-screen prediction, either uni-prediction referring to one reference image or bi-prediction referring to two reference images, the image decoding method comprising:
a step of determining whether the prediction unit is of a predetermined size or smaller; and
a step of restricting the bi-prediction from being performed on the prediction unit.
8. An image encoding apparatus that encodes an image in a prediction unit using, as a prediction method of inter-screen prediction, either uni-prediction referring to one reference image or bi-prediction referring to two reference images, the image encoding apparatus comprising:
a bi-prediction restriction unit that restricts performing the bi-prediction on the prediction unit when the prediction unit is of a predetermined size or smaller.
9. An image decoding apparatus that decodes image encoded data and generates a decoded image for each coding unit, the image decoding apparatus comprising:
a CU information decoding unit that decodes information designating a division type for dividing the coding unit; and
an arithmetic decoding unit that decodes a binary value from the image encoded data by arithmetic decoding using a context or arithmetic decoding not using a context,
wherein, when the CU information decoding unit decodes information designating an asymmetric motion partition (AMP) as the division type,
the arithmetic decoding unit switches between the arithmetic decoding using the context and the arithmetic decoding not using the context according to a position of the binary value.
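
As a non-limiting illustration, the bi-prediction restriction recited in the decoding-apparatus claims above may be sketched as follows. The sketch is written in Python for readability only. The names `MotionParams`, `is_restricted_pu`, `restrict_bi_pred`, and `MAX_RESTRICTED_PU_AREA` are hypothetical and do not appear in the specification. The threshold value (64) and the use of width times height as the size measure are assumptions made for the example; the claims state only that the size is "predetermined" and is calculated using the width and the height.

```python
from dataclasses import dataclass, replace

# Assumption for illustration: the claims only require "a predetermined size".
MAX_RESTRICTED_PU_AREA = 64


@dataclass(frozen=True)
class MotionParams:
    """Illustrative motion compensation parameters of one prediction unit."""
    pred_flag_l0: bool  # True if reference picture list L0 is used
    pred_flag_l1: bool  # True if reference picture list L1 is used

    @property
    def is_bi_pred(self) -> bool:
        # Bi-prediction: both prediction list use flags indicate "used".
        return self.pred_flag_l0 and self.pred_flag_l1


def is_restricted_pu(width: int, height: int,
                     limit: int = MAX_RESTRICTED_PU_AREA) -> bool:
    """Return True when the PU is of the predetermined size or smaller.

    The size is computed from width and height; the area is one possible
    measure and is an assumption of this sketch.
    """
    return width * height <= limit


def restrict_bi_pred(params: MotionParams, width: int, height: int,
                     limit: int = MAX_RESTRICTED_PU_AREA) -> MotionParams:
    """Convert bi-prediction to uni-prediction for small PUs.

    When both flags are set on a restricted PU, the L1 flag is changed to
    "not used", leaving L0 uni-prediction. Other cases pass through unchanged.
    """
    if is_restricted_pu(width, height, limit) and params.is_bi_pred:
        return replace(params, pred_flag_l1=False)
    return params


# Example: a bi-predictive merge candidate applied to an 8x4 and a 16x16 PU.
candidate = MotionParams(pred_flag_l0=True, pred_flag_l1=True)
small_pu = restrict_bi_pred(candidate, 8, 4)
large_pu = restrict_bi_pred(candidate, 16, 16)
```

In this sketch the 8x4 prediction unit falls at or below the assumed threshold, so the candidate is converted to uni-prediction using the L0 list only, whereas the 16x16 prediction unit keeps bi-prediction. A candidate that is already uni-predictive is returned unchanged regardless of the prediction unit size.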