WO2013157251A1

WO2013157251A1 - Video encoding device, video encoding method, video encoding program, transmission device, transmission method, transmission program, video decoding device, video decoding method, video decoding program, reception device, reception method, and reception program

Info

Publication number: WO2013157251A1
Application number: PCT/JP2013/002565
Authority: WO
Inventors: 上田　基晴; 福島　茂; 英樹竹原
Original assignee: 株式会社Ｊｖｃケンウッド
Priority date: 2012-04-16
Filing date: 2013-04-16
Publication date: 2013-10-24

Abstract

A motion compensation predicting unit (112) generates a predicted signal, for a predicted block to be encoded, by means of motion compensation using derived motion information. An encoding block control parameter generating unit (122) generates a first control parameter indicating whether to permit motion compensation prediction for a predicted block size of a first size, as well as a second control parameter indicating a second size that forbids bi-predictive motion compensation for a predicted block size less than or equal to the second size. A block structure/prediction mode information supplementary information encoding unit (118) encodes information used in motion compensation prediction, including the first and second control parameters. The motion compensation prediction unit (112) performs motion compensation prediction on the basis of the first and second control parameters.

Description

Video encoding device, video encoding method, video encoding program, transmission device, transmission method and transmission program, and video decoding device, video decoding method, video decoding program, reception device, reception method and reception program

The present invention relates to a video signal encoding and decoding technique, and more particularly to a video encoding and decoding technique used for motion compensation prediction.

MPEG-4 AVC / H. In moving picture coding represented by H.264 (hereinafter referred to as AVC) and the like, as information compression using correlation in the time direction, a picture to be coded which is a picture signal to be coded is already coded and decoded. The detected local decoded signal is used as a reference picture, and a motion amount (hereinafter referred to as a motion vector) between the target picture and the reference picture is detected and predicted in a predetermined encoding processing unit (hereinafter referred to as an encoding target block). Motion compensated prediction that generates a signal is used.

In AVC, single prediction for generating a prediction signal in a single direction using one motion vector from one reference picture in motion compensation prediction, and a prediction signal using two motion vectors from two reference pictures Bi-prediction is used to generate A method of changing the size (hereinafter referred to as prediction block size) of a block (hereinafter referred to as prediction target block) that is a prediction processing target within a 16 × 16 pixel two-dimensional block that is an encoding target block. In addition, it is applied to a method for selecting a reference picture used for prediction from a plurality of reference pictures, and the accuracy of a motion vector is expressed with 1/4 pixel accuracy, thereby improving the accuracy of a prediction signal and transmitting The information amount of the difference (hereinafter, prediction error) is reduced. On the encoding side, information specifying the prediction mode information and the reference image is selected and transmitted together with the motion vector information. On the decoding side, the information specifying the transmitted prediction mode information and the reference image and the decoded motion vector information are transmitted. In accordance with the motion compensation prediction process.

For motion vector transmission, a motion vector of an encoded block adjacent to the processing target block is set as a prediction motion vector (hereinafter referred to as a prediction vector), and a difference between the motion vector of the processing target block and the prediction vector is obtained. Is transmitted as an encoded vector to improve the compression efficiency.

However, in AVC, when the prediction block size is reduced, the number of motion vectors required for encoding increases with respect to the pixels of the encoding target block, which is required when encoding a prediction error. There is a problem that the amount of code required for coding a motion vector increases with respect to the amount of code, the prediction error cannot be coded with sufficient accuracy, and the quality of the coded image signal is lowered.

In order to solve the problem of increasing the amount of code required for coding a motion vector, AVC uses a motion vector used for coding a block of a reference picture at the same position as a prediction target block, It is possible to use direct motion compensated prediction that realizes motion compensated prediction without transmitting.

Another solution is to reduce the number of motion vectors to be encoded by prohibiting bi-prediction and using only uni-prediction when the prediction block size is small in the encoding device as in Patent Document 1. Thus, a technique for preventing an increase in the code amount of a motion vector is known.

WO2006 / 082690

The direct motion compensated prediction described above pays attention to the continuity of motion in the temporal direction in the block of the reference picture located at the same position as the prediction target block, and uses the motion information of other blocks as they are. Thus, the motion compensation prediction process is performed without encoding the difference vector as an encoded vector.

However, when the continuity of motion is not sufficiently maintained or when the motion vector in the motion information of other blocks does not indicate an accurate motion, etc. In this case, a predicted image using the motion information in which the shift has occurred is generated. In that case, there is a difficult aspect that a motion compensated prediction image with high accuracy cannot be generated and the encoding efficiency is not improved.

Furthermore, when generating a prediction signal from a motion vector expressed with a motion vector accuracy of 1/4 pixel, an interpolation filter using a plurality of adjacent pixels is used to generate a 1/4 pixel accuracy specified by the motion vector. In order to generate a prediction pixel for the position of the image, in order to generate a prediction signal at the time of motion compensation prediction, a reference picture of an area corresponding to the number of pixels corresponding to the number of taps of the interpolation filter horizontally and vertically with respect to the prediction block size It is necessary to acquire an image signal. In particular, when the prediction block size is reduced, there is a problem that the memory access amount of the reference picture increases, and the same problem remains when direct motion compensation prediction is used.

According to the method of Patent Literature 1, by limiting the prediction method to single prediction, the memory access amount of the reference picture in the encoding device can be reduced together with the number of motion vectors, but in the decoding device, it is encoded. Since the restriction on the number of motion vectors to be recognized cannot be recognized, a decoding processing capability assuming a case where bi-prediction is performed is necessary to realize a real-time decoding process. In addition, when a prediction method that does not transmit an encoded vector, such as direct motion compensation prediction, is used under conditions where implicit bi-prediction is used, bi-prediction prediction signal generation is required, which is required by the decoding device. The maximum memory access amount cannot be reduced, and the problem is not solved.

The present invention has been made in view of such circumstances, and an object of the present invention is to provide a technique for improving the coding efficiency while limiting the memory access amount of a reference picture to a predetermined amount or less when using motion compensated prediction. There is to do.

In order to solve the above-described problem, a moving picture encoding apparatus according to an aspect of the present invention specifies a prediction block from a block in which a picture is divided into a plurality of blocks step by step, and the specified prediction block unit includes: A moving image encoding device for generating an encoded stream, wherein motion information is derived from at least one of a block spatially close to a prediction block to be encoded and a block close in time, and the encoding A candidate list construction unit (1506) for registering predetermined motion information from the derived motion information as a motion information candidate of a target prediction block and constructing a motion information candidate list, and the encoding target An encoding unit (118) for encoding index information for designating motion information candidates in the motion information candidate list used for the prediction block; A motion information conversion unit (1507) that converts a motion information candidate, and based on the motion information candidate, performs motion compensation prediction by either uni-prediction or bi-prediction, and generates a prediction signal of the prediction block to be encoded A motion compensation prediction unit (112). The motion information conversion unit (1507) performs prediction conversion for converting the prediction type information indicating the bi-prediction among the motion information candidates into the prediction type information indicating the single prediction, and the motion compensation prediction unit (112) ) Is a case where the block size of the prediction block to be encoded is a predetermined first size, and when the prediction type information indicates the bi-prediction, based on the motion information converted by the prediction conversion The motion compensation prediction is performed.

Another aspect of the present invention is also a moving picture coding apparatus. This apparatus is a moving picture encoding apparatus that encodes the moving picture using motion compensation prediction in units of blocks obtained by dividing each picture of the moving picture, and is to be encoded by motion compensation using the derived motion information. A motion compensation prediction unit (112) that generates a prediction signal of a prediction block, and a first control parameter (inter_4x4_enable) that specifies whether or not motion compensation prediction is permitted in the prediction block size of the specified first size And a coding block control parameter generation unit that generates a second control parameter (inter_bipred_restriction_idc) that specifies the second size and prohibits bi-prediction motion compensation in a prediction block size that is equal to or smaller than the specified second size. (122) and an encoding unit (118) that encodes information used for motion compensation prediction, including the first and second control parameters. Obtain. The motion compensation prediction unit (112) performs motion compensation prediction based on the first and second control parameters.

Still another aspect of the present invention is a video encoding method. This method is a moving picture coding method in which a prediction block is identified from a block in which a picture is divided into a plurality of blocks and a coded stream is generated in units of the identified prediction block. Motion information is derived from at least one of a spatially close block and a temporally close block to the target prediction block, and the derived motion information is used as a motion information candidate for the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from among them to construct a motion information candidate list, and index information for designating motion information candidates in the motion information candidate list used for the prediction block to be encoded An encoding step for encoding the motion information, a motion information conversion step for converting the motion information candidate, and the motion information candidate Zui by, and a motion compensated prediction step of generating a prediction signal of the prediction block to be the encoding target performs motion compensation prediction by either single prediction or bi-prediction. The motion information conversion step performs prediction conversion for converting the prediction type information indicating the bi-prediction among the motion information candidates into prediction type information indicating the uni-prediction, and the motion compensation prediction step includes the encoding When the block size of the target prediction block is a predetermined first size and the prediction type information indicates the bi-prediction, the motion compensation prediction is performed based on the motion information converted by the prediction conversion. .

Still another aspect of the present invention is a transmission device. This apparatus identifies a prediction block from a block obtained by dividing a picture into a plurality of blocks in stages, and is encoded by the moving picture coding method for generating an encoded stream in the identified prediction block unit. A packet processing unit that packetizes the encoded stream to obtain encoded data, and a transmission unit that transmits the packetized encoded data. The video encoding method derives motion information from at least one of a block spatially adjacent to a prediction block to be encoded and a block adjacent to temporally, and the motion of the prediction block to be encoded A candidate list construction step of constructing a motion information candidate list by registering predetermined motion information from the derived motion information as information candidates, and the motion information candidate list used for the prediction block to be encoded An encoding step for encoding index information for specifying a motion information candidate, a motion information conversion step for converting the motion information candidate, and motion based on either the single prediction or the bi-prediction based on the motion information candidate. A motion compensation prediction step of performing compensation prediction and generating a prediction signal of the prediction block to be encoded. The motion information conversion step performs prediction conversion for converting the prediction type information indicating the bi-prediction among the motion information candidates into prediction type information indicating the uni-prediction, and the motion compensation prediction step includes the encoding When the block size of the target prediction block is a predetermined first size and the prediction type information indicates the bi-prediction, the motion compensation prediction is performed based on the motion information converted by the prediction conversion. .

Still another aspect of the present invention is a transmission method. In this method, a prediction block is identified from a block in which a picture is divided into a plurality of blocks in stages, and the coded image is encoded by the moving image coding method for generating an encoded stream in the identified prediction block unit. A packet processing step of packetizing the encoded stream to obtain encoded data; and a transmitting step of transmitting the packetized encoded data. The video encoding method derives motion information from at least one of a block spatially adjacent to a prediction block to be encoded and a block adjacent to temporally, and the motion of the prediction block to be encoded A candidate list construction step of constructing a motion information candidate list by registering predetermined motion information from the derived motion information as information candidates, and the motion information candidate list used for the prediction block to be encoded An encoding step for encoding index information for specifying a motion information candidate, a motion information conversion step for converting the motion information candidate, and motion based on either the single prediction or the bi-prediction based on the motion information candidate. A motion compensation prediction step of performing compensation prediction and generating a prediction signal of the prediction block to be encoded. The motion information conversion step performs prediction conversion for converting the prediction type information indicating the bi-prediction among the motion information candidates into prediction type information indicating the uni-prediction, and the motion compensation prediction step includes the encoding When the block size of the target prediction block is a predetermined first size and the prediction type information indicates the bi-prediction, the motion compensation prediction is performed based on the motion information converted by the prediction conversion. .

A video decoding device according to an aspect of the present invention specifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and decodes an encoded stream in units of the specified prediction block A decoding unit (1108) that decodes index information specifying motion information of the prediction block to be decoded from the encoded stream, and a block spatially adjacent to the prediction block to be decoded Motion information is derived from at least one of temporally adjacent blocks, and predetermined motion information is registered from among the derived motion information as motion information candidates of the prediction block to be decoded. A candidate list construction unit (3604) for constructing a candidate list, and a motion information conversion unit (3605) for transforming the motion information candidates Motion compensation for generating a prediction signal of a prediction block to be decoded by performing motion compensation prediction by either uni-prediction or bi-prediction based on the motion information specified by the index information of the motion information candidates A prediction unit (1114). The motion information conversion unit (3605) performs prediction conversion for converting the prediction type information indicating the bi-prediction among the motion information candidates into the prediction type information indicating the single prediction, and the motion compensation prediction unit (1114). ) Is a case where the block size of the prediction block to be decoded is a predetermined first size, and when the prediction type information of the designated motion information indicates the bi-prediction, is converted by the prediction conversion. The motion compensation prediction is performed based on the motion information.

Another aspect of the present invention is also a video decoding device. The apparatus is a moving picture decoding apparatus that decodes a coded stream obtained by coding the moving picture using motion compensated prediction in units of blocks obtained by dividing each picture of the moving picture, and the motion compensated prediction is performed from the coded stream. And a first control parameter for specifying whether or not motion compensated prediction is permitted in the prediction block size of the designated first size from the information used for the motion compensated prediction that has been decoded. inter_4x4_enable) and a second control parameter (inter_bipred_restriction_idc) that specifies the second size and prohibits bi-prediction motion compensation in a prediction block size equal to or smaller than the specified second size (1108) And a motion compensated prediction unit (1114) that generates a prediction signal of a decoding target prediction block using information used for the motion compensated prediction Equipped with a. The motion compensation prediction unit (1114) performs motion compensation prediction based on the first and second control parameters.

Still another aspect of the present invention is a moving picture decoding method. This method is a moving picture decoding method in which a prediction block is specified from a block in which a picture is divided into a plurality of blocks in stages, and an encoded stream is decoded in the specified prediction block unit. A decoding step of decoding index information specifying motion information of the prediction block to be decoded from the stream, and at least one of a block spatially adjacent to the prediction block to be decoded and a block adjacent in time A candidate list construction step of deriving motion information from the above, and as a motion information candidate of the prediction block to be decoded, registering predetermined motion information from the derived motion information and constructing a motion information candidate list; A motion information conversion step for converting the motion information candidates; and the index of the motion information candidates. Based on the specified motion information by broadcast, and a motion compensated prediction step of generating a prediction signal of a prediction block to be the decoding target subjected to the motion compensation prediction by either single prediction or bi-prediction. The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction among the motion information candidates into prediction type information indicating the single prediction, and the motion compensation prediction step includes the decoding target Based on the motion information converted by the prediction conversion, when the block size of the prediction block to be is a predetermined first size and the prediction type information of the specified motion information indicates the bi-prediction The motion compensation prediction is performed.

Still another aspect of the present invention is a receiving device. This apparatus specifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and receives and decodes an encoded stream in which a moving image is encoded in the specified prediction block unit. A receiving unit that receives encoded data in which the encoded stream is packetized; a recovery unit that processes the received encoded stream to recover the original encoded stream; A decoding unit that decodes index information specifying motion information of the prediction block to be decoded from the encoded stream, a block that is spatially close to the prediction block that is to be decoded, and a block that is temporally close to the prediction block Motion information is derived from at least one of the above, and the motion information candidate of the prediction block to be decoded is derived as the motion information candidate A candidate list construction unit for registering predetermined motion information from among the motion information and constructing a motion information candidate list, a motion information conversion unit for converting the motion information candidates, and the index information of the motion information candidates. A motion-compensated prediction unit that performs motion-compensated prediction based on specified motion information by either uni-prediction or bi-prediction and generates a prediction signal of a prediction block to be decoded. The motion information conversion unit converts prediction type information indicating that the motion compensation prediction is performed by the bi-prediction among the motion information candidates into prediction type information indicating that the motion compensation prediction is performed by the single prediction. The motion compensation prediction unit is a case where the block size of the prediction block to be decoded is a predetermined first size, and the prediction type information of the designated motion information is the bi-prediction Indicates that the motion compensation prediction is performed, the motion compensation prediction is performed based on the prediction type information converted by the prediction conversion.

Still another aspect of the present invention is a receiving method. This method specifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and receives and decodes an encoded stream in which a moving image is encoded in the specified prediction block unit. A receiving step for receiving encoded data in which the encoded stream is packetized; a restoring step for packetizing the received encoded stream to restore the original encoded stream; A decoding step for decoding the index information specifying the motion information of the prediction block to be decoded from the encoded stream, a block spatially adjacent to the prediction block to be decoded, and a block adjacent in time Motion information is derived from at least one of the prediction block motion information candidates of the prediction block to be decoded. A candidate list construction step for registering predetermined motion information from the derived motion information to construct a motion information candidate list, a motion information conversion step for converting the motion information candidates, and among the motion information candidates A motion compensation prediction step of performing motion compensation prediction by either uni-prediction or bi-prediction based on the motion information specified by the index information, and generating a prediction signal of the prediction block to be decoded. In the motion information conversion step, prediction type information indicating that the motion compensation prediction is performed by the bi-prediction among the motion information candidates is converted into prediction type information indicating that the motion compensation prediction is performed by the single prediction. The motion compensation prediction step is performed when the block size of the prediction block to be decoded is a predetermined first size, and the prediction type information of the designated motion information is the bi-prediction. Indicates that the motion compensation prediction is performed, the motion compensation prediction is performed based on the prediction type information converted by the prediction conversion.

It should be noted that any combination of the above-described constituent elements, and a conversion of the expression of the present invention between a method, an apparatus, a system, a recording medium, a computer program, etc. are also effective as an aspect of the present invention.

According to the present invention, it is possible to improve the encoding efficiency while limiting the memory access amount of the reference picture to a predetermined amount or less.

It is a figure which shows the structure of the moving image encoder which concerns on Embodiment 1 of this invention. It is a figure which shows an example of the division | segmentation structure of an encoding object image. It is a figure which shows the detailed definition of CU / prediction block size. 4A to 4D are diagrams for explaining the prediction type of motion compensation prediction. It is a flowchart which shows the flow of an operation | movement of the encoding process per encoding block in the moving image encoder which concerns on Embodiment 1 of this invention. 6 is a flowchart for explaining a detailed operation of a CU prediction mode / prediction signal generation process in step S503 of FIG. 7 is a flowchart for explaining detailed operation of motion compensated prediction block (PU) size selection / prediction signal generation processing in step S608 of FIG. 6. FIGS. 8A and 8B are diagrams for explaining two prediction modes for encoding motion information used in motion compensated prediction according to Embodiment 1 of the present invention. It is a figure which shows the rough value of the reference image memory amount required for motion compensation prediction at the time of using a horizontal and vertical 7 tap filter for motion compensation prediction. It is a figure for demonstrating the control parameter which controls the block size and prediction process of motion compensation prediction based on Embodiment 1 of this invention. It is a figure which shows the structure of the moving image decoding apparatus which concerns on Embodiment 1 of this invention. It is a flowchart which shows the flow of operation | movement of the decoding process of the encoding block unit in the moving image decoding apparatus which concerns on Embodiment 1 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the CU unit decoding process in step S1202 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the CU unit motion compensation prediction decoding process in step S1310 of FIG. It is a figure which shows the detailed structure of the motion compensation prediction block structure selection part in the moving image encoder of Embodiment 1 of this invention. It is a figure which shows the structure of a joint motion information calculation part. 16 is a flowchart for explaining an operation of motion compensation prediction mode / prediction signal generation, which is steps S701, S702, S703, and S705 of FIG. 7, which operates via the motion compensation prediction block structure selection unit of FIG. 15. It is a flowchart for demonstrating the detailed operation | movement of combined motion information candidate list production | generation in step S1701 of FIG. It is a figure which shows the space candidate block group used for a space joint motion information candidate list generation. It is a flowchart for demonstrating the detailed operation | movement of the space joint motion information candidate list production | generation process in step S1800 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the combined motion information candidate deletion process in step S1801 of FIG. It is a figure which shows the comparison relationship of the candidate in the list | wrist in the case of four combined motion information candidates. FIGS. 23A and 23B are diagrams illustrating an example of comparison contents of combined motion information candidates. It is a figure which shows the time candidate block group used for a time joint movement information candidate list generation. It is a flowchart for demonstrating the detailed operation | movement of the time combination motion information candidate list production | generation process in FIG.18 S1802. It is a figure for demonstrating the calculation method of the motion vector value mvL0t and mvL1t registered with respect to the reference | standard motion vector value ColMv with respect to time coupling | bonding motion information with respect to L0 prediction and L1 prediction. It is a flowchart for demonstrating the detailed operation | movement of the 1st joint motion information candidate list addition process in step S1803 of FIG. It is a figure for demonstrating the relationship between the frequency | count of combination inspection, the combined motion information candidate M, and the combined motion information candidate N in the 1st combined motion information candidate list addition process. It is a flowchart for demonstrating the detailed operation | movement of the 2nd joint motion information candidate list addition process in step S1804 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the joint motion information candidate single prediction conversion process in step S1703 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the joint prediction mode evaluation value production | generation process in step S1704 of FIG. It is a figure which shows the Truncated Unary code sequence when the number of combined motion information candidates is 5. It is a flowchart for demonstrating the detailed operation | movement of the prediction mode evaluation value production | generation process in step S1705 of FIG. It is the syntax regarding the motion information of a prediction block. It is the syntax regarding the parameter which controls the restriction | limiting of the bi-prediction by prediction block size, and a prediction process. It is a figure which shows the detailed structure of the motion information decoding part in the moving image decoding apparatus of Embodiment 1 of this invention. 15 is a flowchart for explaining detailed operations of prediction block unit decoding processing in steps S1402, S1406, S1408, and S1410 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the motion information decoding process in step S3702 of FIG. It is a flowchart for demonstrating the detailed operation | movement of the joint prediction motion information decoding process in FIG.38 S3801. FIG. 39 is a flowchart for explaining detailed operations of a predicted motion information decoding process in step S3805 of FIG. 38. FIG. It is an example which shows the restriction | limiting of the prediction block size control parameter linked | linked with the level. It is a figure which shows the division | segmentation structure of the prediction block size in another structure of Embodiment 1 of this invention. It is a figure for demonstrating the control parameter which controls the block size and prediction process of motion compensation prediction in another structure of Embodiment 1 of this invention. In Embodiment 1 of this invention, it is an example which integrated two control parameters which control the block size of a motion compensation prediction, and a prediction process as one encoding transmission parameter. It is a figure for demonstrating the control parameter which controls the block size and prediction process of motion compensation prediction based on Embodiment 2 of this invention. It is a figure which shows the relationship between the control parameter which controls bi-prediction, and prediction block size based on Embodiment 2 of this invention. It is an example of the syntax regarding the parameter which controls the restriction | limiting of the bi-prediction by prediction block size and prediction processing in Embodiment 2 of this invention. It is a figure which shows an example of the definition of a space periphery prediction block in combined motion information candidate generation in Embodiment 3 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the motion compensation prediction block (PU) size selection / prediction signal generation process in Embodiment 3 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the motion compensation prediction mode / prediction signal generation process in Embodiment 3 of this invention. It is a flowchart for demonstrating the detailed operation | movement of an example of a joint prediction motion information decoding process in Embodiment 3 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the motion compensation prediction block generation process in Embodiment 4 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the joint motion information candidate list production | generation process in Embodiment 5 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the joint motion information candidate single prediction conversion process in Embodiment 5 of this invention. It is a flowchart for demonstrating the detailed operation | movement of the joint motion information candidate single prediction conversion process in Embodiment 6 of this invention.

DESCRIPTION OF EXEMPLARY EMBODIMENTS Preferred embodiments of a moving image encoding apparatus, a moving image encoding method, a moving image encoding program, a moving image decoding apparatus, a moving image decoding method, and a moving image decoding program according to an embodiment of the present invention will be described with reference to the drawings. A form is demonstrated in detail. In the description of the drawings, the same elements are denoted by the same reference numerals, and redundant description is omitted.

(Embodiment 1)
[Overall configuration of video encoding apparatus]
FIG. 1 is a diagram showing a configuration of a moving picture coding apparatus according to Embodiment 1 of the present invention. Hereinafter, the operation of each unit will be described. The moving picture coding apparatus according to Embodiment 1 includes an input terminal 100, an input picture memory 101, a coding block acquisition unit 102, a subtraction unit 103, an orthogonal transform / quantization unit 104, a prediction error coding unit 105, an inverse quantum. / Inverse conversion unit 106, addition unit 107, intra-frame decoded image buffer 108, loop filter unit 109, decoded image memory 110, motion vector detection unit 111, motion compensation prediction unit 112, motion compensation prediction block structure selection unit 113, intra Prediction unit 114, intra prediction block structure selection unit 115, prediction mode selection unit 116, coding block structure selection unit 117, block structure / prediction mode information additional information coding unit 118, prediction mode information memory 119, multiplexing unit 120, An output terminal 121 and a coding block control parameter generation unit 122 are provided.

The image signal input from the input terminal 100 is stored in the input image memory 101, and the image signal to be processed for the encoding target picture is input from the input image memory 101 to the encoding block acquisition unit 102. The image signal of the encoding target block extracted based on the position information of the encoding target block by the encoding block acquisition unit 102 is a subtraction unit 103, a motion vector detection unit 111, a motion compensation prediction unit 112, and an intra prediction unit 114. To be supplied.

FIG. 2 is a diagram illustrating an example of an encoding target image. Regarding the prediction block size according to the first embodiment, as shown in FIG. 2, the encoding target image is encoded in units of 64 × 64 pixel encoding blocks, and the prediction block is configured based on the encoding block. . The maximum prediction block size is 64 × 64 pixels, which is the same as the encoded block, and the minimum prediction block size is 4 × 4 pixels. The division configuration of a CU into prediction blocks includes non-division (2N × 2N), horizontal / vertical division (N × N), horizontal division only (2N × N), and vertical division only (N × 2N) is possible. In the case of horizontal / vertical division, the prediction block further divided horizontally and vertically can be hierarchically divided into prediction blocks as coding blocks (CU), and the hierarchy is expressed by the number of CU divisions. Here, the divided areas viewed from the upper hierarchy CU of the four divided CUs are defined as division 1, division 2, division 3, and division 4.

FIG. 3 is a diagram illustrating an example of a detailed definition of the predicted block size. The block size (CU size) of the CU is defined from 64 pixels × 64 pixels in which the CU division number (CU_Depth) is 0 to 8 × 8 pixels in which the CU division number is 3, and the maximum prediction block size is CU_Depth = Predicted block size of 64 pixels x 64 pixels of 0, non-divided (2N x 2N), minimum prediction block size up to 4 pixels x 4 pixels of horizontal / vertical division (N x N) with CU_Depth = 3 Will do.

The prediction block size in the case of performing motion compensation prediction that performs prediction using correlation between screens is divided only in the horizontal direction (2N × N), only in the vertical direction, with respect to the division configuration of the CU into prediction blocks. Can be defined and a total of 13 types of prediction block sizes can be defined. However, the prediction block size in the case of intra prediction in which prediction is performed using correlation in the screen is only in the horizontal direction. Since the division into two (2N × N) and the division only into the vertical direction (N × 2N) are not possible, a total of five types of prediction block sizes are defined.

The partition configuration of the prediction block according to Embodiment 1 of the present invention is not limited to this combination. The encoding block size that can be defined can be changed by setting the maximum CU size and the minimum CU size using control parameters such as Maximum_cu_size and Minimum_cu_size shown in FIG. 3, and encoding and decoding these control parameters. Is possible.

Returning to FIG. 1, the subtraction unit 103 calculates a prediction error signal by subtracting the image signal supplied from the coding block acquisition unit 102 and the prediction signal supplied from the coding block structure selection unit 117, and outputs a prediction error signal. Is supplied to the orthogonal transform / quantization unit 104.

The orthogonal transform / quantization unit 104 performs orthogonal transform and quantization on the prediction error signal supplied from the subtraction unit 103, and the quantized prediction error signal is subjected to a prediction error encoding unit 105 and an inverse quantization / inverse conversion unit. 106.

The prediction error encoding unit 105 entropy-encodes the quantized prediction error signal supplied from the orthogonal transform / quantization unit 104, generates a code string for the prediction error signal, and supplies the code sequence to the multiplexing unit 120. .

The inverse quantization / inverse transform unit 106 performs a process such as inverse quantization or inverse orthogonal transform on the quantized prediction error signal supplied from the orthogonal transform / quantization unit 104 to generate a decoded prediction error signal. Generated and supplied to the adder 107.

The addition unit 107 adds the decoded prediction error signal supplied from the inverse quantization / inverse conversion unit 106 and the prediction signal supplied from the coding block structure selection unit 117 to generate a decoded image signal, and generates a decoded image signal. The signal is supplied to the intra-frame decoded image buffer 108 and the loop filter unit 109.

The intra-frame decoded image buffer 108 supplies the decoded image in the same frame in the region adjacent to the encoding target block to the intra prediction unit 114 and also stores the decoded image signal supplied from the addition unit 107.

The loop filter unit 109 performs a filtering process on the decoded image signal supplied from the adding unit 107 by applying a filter to remove distortion caused by encoding and to restore the image to a pre-encoded image. The resulting decoded image is supplied to the decoded image memory 110.

The decoded image memory 110 stores the decoded image signal subjected to the filtering process supplied from the loop filter unit 109. In addition, a decoded image for which decoding of the entire image has been completed is stored as a reference image by a predetermined number of images, and the reference image signal is supplied to the motion vector detection unit 111 and the motion compensation prediction unit 112.

The motion vector detection unit 111 receives the input of the image signal of the encoding target block supplied from the encoding block acquisition unit 102 and the reference image signal stored in the decoded image memory 110, and obtains a motion vector for each reference image. The motion vector value is detected and supplied to the motion compensation prediction unit 112 and the motion compensation prediction block structure selection unit 113.

A general motion vector detection method calculates an error evaluation value for an image signal corresponding to a reference image moved by a predetermined movement amount from the same position as the image signal, and moves the movement amount that minimizes the error evaluation value. Let it be a vector. As the error evaluation value, a sum of absolute differences SAD (Sum of Absolute Difference) for each pixel, a sum of squared error values SSE (Sum of Square Error) for each pixel, or the like is used. Furthermore, the code amount related to the coding of the motion vector can also be included in the error evaluation value.

The motion compensated prediction unit 112 is configured to decode the decoded image memory according to the information specifying the prediction block structure specified by the motion compensated prediction block structure selecting unit 113, the reference image specifying information, and the motion vector value input from the motion vector detecting unit 111. A prediction signal is generated by acquiring an image signal at a position obtained by moving the reference image indicated by the reference image designation information in 110 from the same position as the image signal of the prediction block by a motion vector value.

When the prediction mode specified by the motion compensated prediction block structure selection unit 113 is prediction from a single reference image, a prediction signal acquired from one reference image is used as a motion compensation prediction signal, and two prediction modes are referenced. In the case of prediction from an image, a weighted average of prediction signals acquired from two reference images is used as a motion compensation prediction signal, and the motion compensation prediction signal is supplied to the prediction mode selection unit 116. Here, the ratio of the weighted average of bi-prediction is set to 1: 1.

4 (a) to 4 (d) are diagrams for explaining the prediction type of motion compensation prediction. A process for performing prediction from a single reference image is defined as single prediction, and in the case of single prediction, prediction using either one of two reference images registered in the reference image management list, that is, L0 prediction or L1 prediction. I do.

FIG. 4A shows a case in which the prediction image is uni-prediction and the reference image (RefL0Pic) for L0 prediction is at a time before the encoding target image (CurPic). FIG. 4B shows a case in which the prediction image is a single prediction and the reference image of the L0 prediction is at a time after the encoding target image. Similarly, the L0 prediction reference image shown in FIGS. 4A and 4B can be replaced with the L1 prediction reference image (RefL1Pic) to perform single prediction.

The process of performing prediction from two reference images is defined as bi-prediction, and bi-prediction is expressed as BI prediction using both L0 prediction and L1 prediction. FIG. 4C illustrates a case where bi-prediction is performed, and the reference image for L0 prediction is at a time before the encoding target image and the reference image for L1 prediction is at a time after the encoding target image. . FIG. 4D shows a case of bi-prediction, where the reference image for L0 prediction and the reference image for L1 prediction are at a time before the encoding target image. As described above, the relationship between the prediction type of L0 / L1 and time can be used without being limited to L0 being the past direction and L1 being the future direction. In the case of bi-prediction, each of L0 prediction and L1 prediction may be performed using the same reference picture. Note that whether to perform motion compensation prediction by single prediction or bi-prediction is determined based on, for example, information (for example, a flag) indicating whether to use L0 prediction and whether to use L1 prediction. The

Bi-prediction requires image information access to two reference image memories, and therefore may require twice or more memory bandwidth compared to single prediction. When hardware is configured, bi-prediction when the prediction block size of motion compensation prediction is small becomes a bottleneck of the memory band, and the bottleneck of the memory band is suppressed in the embodiment of the present invention.

Returning to FIG. 1, the motion compensated prediction block structure selection unit 113 detects the motion vector value detected for each reference image input from the motion vector detection unit 111 and the motion information stored in the prediction mode information memory 119 ( Based on the prediction type, motion vector value, and reference image designation information), the control parameters related to the prediction block size and motion compensation prediction mode defined in Embodiment 1 generated by the coding block control parameter generation unit 122 are The reference image designation information and the motion vector value used for each of the prediction block size and the motion compensation prediction mode that are input and determined based on the control parameter are set in the motion compensation prediction unit 112. Depending on the set value, the motion compensation prediction signal supplied from the motion compensation prediction unit 112 and the image signal of the target block to be encoded supplied from the coding block acquisition unit 102 are used to optimize the prediction block size and motion compensation prediction. Determine the mode.

The motion compensated prediction block structure selection unit 113 uses the determined prediction block size, motion compensation prediction mode, prediction type corresponding to the prediction mode, motion vector, and information specifying the reference image designation information as a motion compensation prediction signal and a prediction error. Is supplied to the prediction mode selection unit 116 together with an error evaluation value for.

The intra prediction unit 114 is adjacent to the encoding target block supplied from the intra-frame decoded image buffer 108 according to the intra prediction mode defined as the information specifying the prediction block structure specified by the intra prediction block structure selection unit 115. An intra prediction signal is generated using the decoded image in the same frame, and is supplied to the intra prediction block structure selection unit 115.

Embodiment 1 Intra prediction block structure selection section 115 is generated by coding block control parameter generation section 122 according to intra prediction mode information stored in prediction mode information memory 119 and a plurality of defined intra prediction modes. The control parameter related to the prediction block size defined in the step is input, and the intra prediction mode used for each of the prediction block sizes determined based on the control parameter is set in the intra prediction unit 114. Based on the set value, the optimal prediction block size and intra prediction mode are determined using the intra prediction signal supplied from the intra prediction unit 114 and the image signal of the encoding target block supplied from the coding block acquisition unit 102. To do.

Also, the intra prediction block structure selection unit 115 supplies information specifying the determined prediction block size and intra prediction mode to the prediction mode selection unit 116 together with the intra prediction signal and the error evaluation value for the prediction error.

The prediction mode selection unit 116 is supplied from the motion compensated prediction block structure selection unit 113 and specifies the determined prediction block size, motion compensation prediction mode, prediction type according to the prediction mode, motion vector, and reference image designation information. CU size units that are hierarchically configured from the error evaluation value for the prediction error, and the error prediction value for the predicted prediction block size, intra prediction mode, and prediction error supplied from the intra prediction block structure selection unit 115 The optimum prediction mode is selected by comparing the error evaluation values.

When motion compensation prediction is selected as the optimal prediction mode information in units of CU size selected by the prediction mode selection unit 116 together with the sum of the prediction block size, the prediction signal, and the error evaluation value in units of CU size, When the intra prediction is selected as the motion compensation prediction mode, the prediction type corresponding to the prediction mode, the motion vector, the information specifying the reference image designation information, and the motion compensation prediction signal, the intra prediction mode and the intra prediction signal are selected. Is supplied to the coding block structure selection unit 117.

The coding block structure selection unit 117 is defined in Embodiment 1 generated by the coding block control parameter generation unit 122 based on the optimal prediction mode information in CU size units supplied from the prediction mode selection unit 116. The control parameter related to the encoded block size is input, the optimum CU_Depth configuration is selected in the encoded block size configuration determined based on the control parameter, the information for specifying the CU partition configuration, and the specified partition configuration The optimal prediction mode information in the CU size and additional information related to the prediction mode (motion information, intra prediction mode) are supplied to the block structure / prediction mode information additional information encoding unit 118 and the selected prediction signal is subtracted. 103 and the adder 107.

The block structure / prediction mode information additional information encoding unit 118 is supplied from the encoding block structure selection unit 117, and specifies the CU partition configuration, and the optimal prediction mode information for the specified CU size for each partition configuration. By coding the additional information related to the prediction mode and the control parameters related to the coding block and the prediction block structure supplied from the coding block control parameter generation unit 122 according to a predetermined syntax structure, the coding block unit The CU partition configuration and mode information used for prediction are encoded and supplied to the multiplexing unit 120, and the information is stored in the prediction mode information memory 119.

The prediction mode information memory 119 predetermines the CU partition configuration of the coding block unit supplied from the block structure / prediction mode information additional information encoding unit 118 and the mode information used for prediction based on the minimum prediction block size unit. Memorize images. Since the first embodiment focuses on motion compensation prediction that is prediction between screens, motion information (prediction type, motion vector, and reference image index) that is information related to motion compensation prediction in mode information is used. And add a description.

The motion information of the adjacent block of the prediction block that is the processing target of motion compensation prediction is set as a spatial candidate block group, and the motion information of the block on ColPic and the surrounding blocks that are at the same position as the processing target prediction block is the time candidate block group. To do.

ColPic is a decoded image different from the prediction block to be processed, and is stored in the decoded image memory 110 as a reference image. In Embodiment 1, ColPic is a reference image decoded immediately before. In the first embodiment, ColPic is the reference image decoded immediately before, but the reference image immediately before in display order or the reference image immediately after in display order may be used, and the reference image used for ColPic is included in the encoded stream. Direct specification is also possible.

The prediction mode information memory 119 supplies the motion information of the spatial candidate block group and the temporal candidate block group to the motion compensated prediction block structure selection unit 113 as motion information of the candidate block group, and intra prediction of adjacent blocks of the intra prediction block. The mode information is supplied to the intra prediction block structure selection unit 115.

The multiplexing unit 120 includes a prediction error encoding sequence supplied from the prediction error encoding unit 105, a CU partitioning configuration in units of encoded blocks supplied from the block structure / prediction mode information additional information encoding unit 118, and prediction. The encoded bit stream is generated by multiplexing the encoded sequence of the mode information and the additional information used in the above, and the encoded bit stream is output to the recording medium / transmission path via the output terminal 121.

The coding block control parameter generation unit 122 performs control parameters such as Maximum_cu_size and Minimum_cu_size shown in FIG. 3, which are parameters defining the coding block structure, and the block size and prediction processing of motion compensation prediction in the first embodiment. Generate parameters for defining a coding block structure or a prediction block structure, such as a control parameter to be limited, and a motion compensation prediction block structure selection unit 113, an intra prediction block structure selection unit 115, a coding block structure selection unit 117, And supplied to the block structure / prediction mode information additional information encoding unit 118. Details regarding the block size of motion compensation prediction and control parameters for limiting the prediction process will be described later.

The configuration of the moving picture encoding apparatus shown in FIG. 1 can also be realized by hardware such as an information processing apparatus including a CPU (Central Processing Unit), a frame memory, and a hard disk.

FIG. 5 is a flowchart showing the flow of the encoding process in the video encoding apparatus according to Embodiment 1 of the present invention. For each coding block unit, CU_Depth, which is a control parameter for CU partitioning, is initialized to 0 (S500), and a coding process target block image is obtained from the coding block acquisition unit 102 (S501). The motion vector detection unit 111 uses a motion vector value for each reference image according to CU partitioning from a block image to be predicted according to CU partitioning from the encoding target block image and a plurality of reference images stored in the decoded image memory 110. Is calculated (S502).

Subsequently, the motion compensation prediction block structure selection unit 113 uses the motion vector supplied from the motion vector detection unit 111, the motion information and the intra prediction mode information stored in the prediction mode information memory 119, and performs the first embodiment. The prediction signal for each of the prediction block size and the motion compensation prediction mode defined in (1) is acquired using the motion compensation prediction unit 112, and the result of selecting the optimal prediction block size and prediction mode in CU units is output. In addition, the intra prediction block structure selection unit 115 acquires prediction signals for each of the prediction block size and the intra prediction mode using the intra prediction unit 114, and selects the optimal prediction block size and prediction mode in CU units. Is output. The coding block structure selection unit 117 generates a prediction mode and a prediction signal in the optimum coding block structure using these results (S503). Details of the processing in step S503 will be described later.

Subsequently, the subtraction unit 103 calculates a difference between the encoded block image supplied from the encoded block acquisition unit 102 and the prediction signal supplied from the encoded block structure selection unit 117 as a prediction error signal (S504). ). The block structure / prediction mode information additional information encoding unit 118 includes a coding type, a prediction mode, a prediction type according to a prediction mode in the case of motion compensation prediction, a motion vector, and a coding structure supplied from the coding block structure selection unit 117. Information for specifying the reference image designation information and intra prediction mode information in the case of intra prediction are encoded according to a predetermined syntax structure, and encoded data of additional information related to the encoding structure and the prediction mode information is generated (S505). ).

Subsequently, the prediction error encoding unit 105 entropy encodes the quantized prediction error signal generated by the orthogonal transform / quantization unit 104 to generate encoded data of the prediction error (S506). The multiplexing unit 120 is supplied from the coding structure supplied from the block structure / prediction mode information additional information encoding unit 118, encoded data of additional information related to the prediction mode information, and the prediction error encoding unit 105. The encoded data of the prediction error is multiplexed to generate an encoded bit stream (S507).

The addition unit 107 adds the decoded prediction error signal supplied from the inverse quantization / inverse conversion unit 106 and the prediction signal supplied from the coding block structure selection unit 117 to generate a decoded image signal (S508). . The prediction mode information memory 119 includes motion information (prediction when motion compensation prediction is used as additional information related to the coding structure and prediction mode information supplied from the block structure / prediction mode information additional information encoding unit 118. Type, motion vector, and reference image designation information) and intra prediction mode information when intra prediction is used are stored in units of the smallest prediction block size (S509).

The decoded image signal generated by the addition unit 107 is stored in the intra-frame decoded image buffer 108, and the loop filter unit 109 performs a loop filter process for distortion removal (S510) and performs the filter. The decoded image signal is supplied to and stored in the decoded image memory 110 and used for motion compensation prediction processing of an encoded image to be encoded thereafter (S511).

[Details of prediction mode / prediction signal generation processing per CU]
Next, details of the prediction mode / prediction signal generation processing in units of CUs, which is step S503 in the flowchart of FIG. 5, will be described using the flowchart of FIG.

First, a value indicating the number of hierarchies between the set maximum CU size and minimum CU size is set as Max_CU_Depth, and it is determined whether or not the CU_Depth of the target CU is smaller than Max_CU_Depth (S600). In the first embodiment, the CU division configuration shown in FIG. 3 is assumed, and Max_CU_Depth = 3.

When CU_Depth is smaller than Max_CU_Depth (S600: YES), 1 is added to CU_Depth (S601), and the prediction mode / prediction signal generation processing in units of CUs is performed for the CU one layer below the current target CU divided into four. Perform (S602-S605). In the order of the division 1 area processing (S602), the division 2 area processing (S603), the division 3 area processing (S604), and the division 4 area processing (S605) with respect to the CU division area shown in FIG. Thus, the prediction mode / prediction signal generation processing for each CU described recursively in the flowchart of FIG. 6 is performed.

Among the prediction mode calculation results of each CU divided region, error evaluation values are integrated, and a total error evaluation value of four divided CUs is calculated (S606).

On the other hand, when CU_Depth is equal to or greater than Max_CU_Depth (S600: NO), intra prediction block structure selection unit 115 and intra prediction unit 114 in FIG. 1 calculate intra prediction modes and generate prediction signals (S607). Intra prediction mode information, a prediction signal, and an error evaluation value in the target CU are calculated.

Subsequently, the motion compensation prediction block structure selection unit 113 and the motion compensation prediction unit 112 select a motion compensation prediction block size, and generate a motion compensation prediction mode and a prediction signal for each selected prediction block (S608). A prediction block size, mode information, motion information, a prediction signal, and an error evaluation value for motion compensation prediction in the target CU are calculated. Details of step S608 will be described later.

Subsequently, the coding block structure selecting unit 117 compares the error evaluation value of the intra prediction in the target CU with the error evaluation value of the motion compensated prediction, selects a prediction method with a small error, and selects intra / inter (motion compensated prediction). ) Is determined (S609).

Next, the recursively performed processing of the flowchart of FIG. 6 (S602 to S605 of FIG. 6) and the error evaluation value summation calculation (S606), which is lower than the target CU (CU_Depth is greater) The error evaluation value for the CU is compared with the error evaluation value of the target CU, and CU_Depth applied to prediction is determined (S610).

Since the processing shown in the flowchart of FIG. 6 is recursively called, the lowest CU (Depth = Max_CU_Depth) CU is sequentially compared with the upper CU, and the optimal CU_Depth for each divided region of the CU Prediction mode can be selected.

Finally, additional information regarding the selected CU_Depth, prediction mode, and selected intra prediction or motion compensated prediction, the error evaluation value, and the prediction signal between the target CU and the CU lower than the target CU are stored. (S611) The prediction mode / prediction signal generation process in the target CU ends.

[Details of motion compensation prediction block size selection / prediction signal generation processing]
Next, details of the motion compensated prediction block size selection in step 608 in the flowchart of FIG. 6 and the motion compensation prediction mode / prediction signal generation processing for each prediction block will be described with reference to the flowchart of FIG.

First, an encoded block image to be predicted is acquired for the target CU (S700). Next, with the configuration shown in FIG. 3, motion compensation prediction mode / prediction signal generation processing is performed for each intra-CU division mode (S701 to S705).

First, the motion compensation prediction mode / prediction signal generation processing when the intra-CU division mode is 2N × 2N is performed by setting NumPart which is a value indicating the number of divisions to 1 (S701). Subsequently, NumPart is set to 2, and motion compensation prediction mode / prediction signal generation processing is performed in the case of 2N × N (S702) and N × 2N (S703).

Next, when CU_Depth is equal to Max_CU_Depth, the target CU size is 8 × 8, and an inter_4x4_enable flag (to be described later) is 1 (S704: YES), NumPart is set to 4 and motion compensation is performed when N × N Prediction mode / prediction signal generation processing is performed (S705). Details of the motion compensation prediction mode / prediction signal generation processing performed in steps S701, S702, S703, and S705 will be described later. If the condition of step S704 is not satisfied (S704: NO), step S705 is skipped and the subsequent step is performed.

In Embodiment 1, motion compensated prediction / prediction signal generation in intra-CU division in the order of 2N × 2N (S701), 2N × N (S702), N × 2N (S703), and N × N (S705) However, the processing order of the steps of each of the CU divisions may be changed, and when processing is performed by a CPU or the like that can perform parallel processing, S701, S702, S703, and S705 are performed. Can also be performed in parallel.

Subsequently, an error evaluation value for each intra-CU partition mode for which motion compensation prediction mode / prediction signal generation has been performed is compared, and an optimal prediction block size (PU) that is an optimal intra-CU partition mode is selected (S706). Prediction mode information / error evaluation value / prediction signal for the selected PU is stored (S707), and the process of step S608 in the flowchart of FIG. 6 ends.

[Definition of Motion Compensation Prediction Mode in Embodiment 1]
FIGS. 8A and 8B are diagrams for explaining two prediction modes for encoding motion information used in motion compensated prediction according to Embodiment 1 of the present invention.

In the first prediction mode, the prediction target block directly encodes its own motion information using the continuity of motion in the temporal direction and the spatial direction in the prediction target block and the encoded block adjacent to the prediction target block. In this method, the motion information of spatially and temporally adjacent blocks is used for encoding, which is called a joint prediction mode (merge mode).

Here, the spatially adjacent block refers to a block adjacent to the prediction target block among encoded blocks belonging to the same image as the prediction target block. Here, the temporally adjacent blocks indicate blocks in the same spatial position as the prediction target block and in the vicinity thereof among blocks belonging to an encoded image different from the prediction target block.

In the combined prediction mode, motion information that can be selectively combined from a plurality of adjacent block candidates can be defined, and the motion information is specified by encoding information (joined motion information index) that specifies the adjacent block to be used. The motion information acquired based on the information is used as it is for motion compensation prediction. Further, in the joint prediction mode, a Skip mode is defined in which the prediction signal predicted in the joint prediction mode is a decoded picture without encoding prediction transmission of the prediction difference information, and a decoded image is obtained with information having only the combined motion information. Can be reproduced. The Skip mode can be used when the intra-CU division mode is 2N × 2N, and the motion information transmitted in the Skip mode is the designation information that defines the adjacent block as in the combined prediction mode.

The second prediction mode is a technique for coding all the components of motion information individually and transmitting motion information with little prediction error to the prediction block, and is called a motion detection prediction mode. The motion detection prediction mode includes a prediction type indicating whether the prediction is bi-prediction or uni-prediction, information for identifying a reference image (reference image index), and encoding of motion information in the conventional motion compensation prediction. The information for specifying the motion vector is encoded separately.

In the motion detection prediction mode, the prediction mode indicates whether to use single prediction or bi-prediction. In the case of single prediction single prediction, information for specifying a reference image for one reference image, and a motion vector prediction vector The difference vector is encoded. In the case of bi-prediction, information for specifying reference images for two reference images and a motion vector are individually encoded. The prediction vector for the motion vector is generated from the motion information of the adjacent block similarly to the AVC. However, similarly to the combined prediction mode, the motion vector used for the prediction vector can be selected from a plurality of adjacent block candidates, and the motion vector is the prediction vector. Is transmitted by encoding two pieces of information (predicted vector index) for designating adjacent blocks to be used for and a difference vector.

[Explanation Regarding Block Size of Motion Compensated Prediction and Method for Limiting Prediction Processing in Embodiment 1]
Next, an approximate value related to the reference image memory amount required at the time of prediction in motion compensation prediction is shown in FIG. 9, and the prediction block size and the prediction processing restriction method in Embodiment 1 will be described. In motion compensation prediction, prediction accuracy is improved by making the accuracy of motion fine. Taking AVC as an example, motion vectors can be detected and transmitted with 1/4 pixel accuracy.

Also in the first embodiment, when a motion vector is detected and transmitted with ¼ pixel accuracy and a motion compensated prediction signal for a ¼ pixel accuracy motion is generated, an integer motion existing in the reference image is generated. Using a plurality of pixels at the position, a pixel of the reference image at the motion position with a 1/4 pixel accuracy is calculated by an interpolation filter. In the moving picture encoding apparatus and moving picture decoding apparatus according to Embodiment 1, a 7-tap FIR filter is used as an interpolation filter.

In order to apply a 7-tap filter, it is necessary to acquire 6 pixels in the horizontal and vertical directions with respect to the pixel at the horizontal / vertical integer motion position closest to the target position. When a predicted image at a motion position that is 3/4 pixels away from the right boundary portion of the prediction block is acquired, a pixel at an integer motion position that is closest to the target position is a pixel that is outside one pixel of the prediction block. Therefore, the number of pixels to be acquired further increases, and it is necessary to acquire a reference image necessary for filtering processing for the same 7 pixels as the number of taps horizontally and vertically with respect to the predicted block size.

FIG. 9 shows a case where a 7-tap filter is applied and a memory band is secured when performing uni-prediction and bi-prediction in each of the predictable block sizes definable for motion compensated prediction shown in FIG. 3 in the first embodiment. Indicates the required memory access amount of the reference image. Depending on the configuration of the reference image memory of the encoding device and the decoding device, various configurations such as a configuration in which memory access is possible in units of horizontal 4 pixels and a configuration in which units of horizontal and vertical 2 × 2 pixels are possible can be taken. However, the memory access amount indicates the maximum value of the memory access amount that needs to be obtained at the minimum regardless of the configuration of the reference image memory.

Regardless of the size of the predicted block size, the horizontal and vertical sizes that need to be additionally acquired for the filtering process do not change, so the 4 × 4 pixel size is the most encoded block size (LCU) unit. The memory access amount becomes larger, and access of nearly 6 times the size of 64 × 64 pixels is required. In addition, in the case of bi-predictive motion compensated prediction, two prediction signals are acquired from reference images at different positions, so that twice as many memory accesses as in single prediction are required.

The memory bandwidth that needs to be secured when the block size for motion compensation prediction is small or when motion compensation for bi-prediction is large, especially when the image size to be encoded becomes high definition images that are higher than HD. There is a problem that the feasibility of the encoding device and the decoding device becomes difficult. In the present invention, there are provided a motion compensation prediction limiting method and a control parameter definition and setting method for limiting, in which the memory access maximum amount of the reference image can be controlled step by step to limit the memory bandwidth, It is possible to achieve both the feasibility and encoding efficiency of a moving image encoding apparatus for fine images.

Next, FIG. 10 shows an example of the motion compensation prediction block size and control parameters for limiting the prediction processing, which are generated by the coding block control parameter generation unit 122 of FIG. 1 according to Embodiment 1 of the present invention. To do.

The control parameters are inter_4x4_enable, which is a parameter for controlling the validity / invalidity of motion compensated prediction of 4 × 4 pixels, which is the smallest motion compensated prediction block size, and only prediction processing for which bi-prediction is performed among motion compensated predictions It consists of two parameters, inter_bipred_restriction_idc, which defines the block size that prohibits

When the necessary reference image memory amounts in FIG. 9 are compared, 4 × 4 bi-prediction, 4 × 8/8 × 4 bi-prediction, 4 × 4 mono-prediction, 8 × 8 bi-prediction, 8 × 16/16 × 8 bi-prediction, 4 × 8/8 × 4 single prediction, and 16 × 16 bi-prediction are in this order. Relatively accessed except for the minimum prediction block size of 4 × 4 pixels. The amount is small.

Therefore, with regard to the minimum prediction block size, inter_4x4_enable, which is a control parameter for prohibiting the motion compensation prediction process itself, is prepared, and inter_bipred_restriction_idc that further restricts bi-prediction is prepared as a control parameter for each block size. Can control memory access amount explicitly.

By the way, for 4 × 8/8 × 4 single prediction, the amount of memory access is larger than that of 16 × 16 bi-prediction. However, it is also necessary to limit 4 × 4 and 4 × 8/8 × 4 bi-prediction with a large memory access amount. In this case, by setting the minimum CU size to 16 × 16, the intra-CU partitioning mode is Since the entire motion compensated prediction with a prediction block size smaller than an N × N 8 × 8 block can be prohibited, the motion compensated prediction process itself is prohibited in a configuration having a restriction on a fixed minimum prediction block size. It is possible to control the memory access amount.

When performing the above control, the memory access amount is controlled by combining the minimum CU size value in addition to inter_4x4_enable and inter_bipred_restriction_idc.

In Embodiment 1, inter_bipred_restriction_idc defines a value from 0 to 5 as shown in FIG. 10, and from a state where there is no restriction on bi-prediction to a state where bi-prediction with a size of 16 × 16 blocks or less is restricted. Although controllable, the range of definition is an example, and it is also possible to define a control value that is smaller or more than this value as another configuration of the embodiment of the present invention.

Control that disables the entire motion compensated prediction of a given size and a control parameter that restricts bi-prediction of motion compensated prediction of less than a given size, and controls the maximum memory access amount to be within the prescribed range This technique is the configuration in the first embodiment of the present invention.

[Overall configuration of video decoding apparatus]
FIG. 11 is a diagram showing a configuration of the moving picture decoding apparatus according to Embodiment 1 of the present invention. Hereinafter, the operation of each unit will be described. The video decoding apparatus according to Embodiment 1 includes an input terminal 1100, a demultiplexing unit 1101, a prediction difference information decoding unit 1102, an inverse quantization / inverse transform unit 1103, an addition unit 1104, an intra-frame decoded image buffer 1105, a loop filter. Unit 1106, decoded image memory 1107, prediction mode / block structure decoding unit 1108, prediction mode / block structure selection unit 1109, intra prediction information decoding unit 1110, motion information decoding unit 1111, prediction mode information memory 1112, intra prediction unit 1113, A motion compensation prediction unit 1114 and an output terminal 1115 are provided.

The encoded bit stream is supplied from the input terminal 1100 to the demultiplexing unit 1101. The demultiplexing unit 1101 is used for the code string of the supplied coded bitstream, the coded string of prediction error information, the control parameters related to the coding block and the prediction block structure, the CU partition configuration and coding block unit, and the prediction. Mode information, prediction mode according to the prediction mode in the case of motion compensated prediction, motion vector that is information specifying the motion vector, and reference image designation information, and intra prediction mode information in the case of intra prediction. Separate into coded sequences to be constructed. The coding sequence of the prediction error information is supplied to the prediction difference information decoding unit 1102, and the control parameter, the CU partition configuration of the coding block unit, and the coding sequence of the mode information used for prediction are predicted mode / block structure It supplies to the decoding part 1108.

The prediction difference information decoding unit 1102 decodes the encoded sequence of the prediction error information supplied from the demultiplexing unit 1101, and generates a quantized prediction error signal. The prediction difference information decoding unit 1102 supplies the generated quantized prediction error signal to the inverse quantization / inverse transform unit 1103.

The inverse quantization / inverse transform unit 1103 performs a process such as inverse quantization or inverse orthogonal transform on the quantized prediction error signal supplied from the prediction difference information decoding unit 1102 to generate a prediction error signal, The decoded prediction error signal is supplied to the adding unit 1104.

The adder 1104 adds the decoded prediction error signal supplied from the inverse quantization / inverse transform unit 1103 and the prediction signal supplied from the prediction mode / block structure selection unit 1109 to generate a decoded image signal, and generates a decoded image signal. The signal is supplied to the intra-frame decoded image buffer 1105 and the loop filter unit 1106.

The intra-frame decoded image buffer 1105 has the same function as the intra-frame decoded image buffer 108 in the moving picture encoding apparatus in FIG. 1, and supplies a decoded image signal in the same frame to the intra prediction unit 1113 as a reference image for intra prediction. At the same time, the decoded image signal supplied from the adding unit 1104 is stored.

The loop filter unit 1106 has the same function as the loop filter unit 109 in the moving picture coding apparatus in FIG. 1, performs a distortion removal filter on the decoded image signal supplied from the addition unit 1104, and performs filter processing. The decoded image obtained as a result of the execution is supplied to the decoded image memory 1107.

The decoded image memory 1107 has the same function as the decoded image memory 110 in the moving image encoding apparatus in FIG. 1, stores the decoded image signal supplied from the loop filter unit 1106, and uses the reference image signal as the motion compensation prediction unit 1114. To supply. The decoded image memory 1107 supplies the stored decoded image signal to the output terminal 1115 in accordance with the display order of images in accordance with the reproduction time.

The prediction mode / block structure decoding unit 1108 is a control parameter that defines the CU structure shown in FIG. 3 based on the control parameters related to the coding block and the prediction block structure supplied from the demultiplexing unit 1101, and the control parameter shown in FIG. Such a motion compensation prediction block configuration and control parameters for limiting the prediction process are generated.

Also, the prediction mode / block structure decoding unit 1108 uses the CU division configuration for each coding block unit supplied from the demultiplexing unit 1101 and the coding sequence of the mode information used for prediction to determine the CU for each coding block unit. The mode information used for the division configuration and prediction is decoded to generate a prediction block size and a prediction mode, and the prediction type, motion vector, and reference image designation information corresponding to the prediction mode in the case of motion compensated prediction are specified. The motion information, which is information, and the intra prediction mode information in the case of intra prediction are separated, and the CU partition configuration and the prediction mode information for each coding block are supplied to the prediction mode / block structure selection unit 1109.

When intra prediction is used for the prediction block, the prediction mode / block structure decoding unit 1108 supplies intra prediction mode information to the intra prediction information decoding unit 1110 together with the prediction block size, and motion compensated prediction is used. If so, the motion information decoding unit 1111 is supplied with information for specifying the motion compensation prediction mode, the prediction type corresponding to the prediction mode, the motion vector, and the reference image designation information together with the prediction block size.

The intra prediction information decoding unit 1110 decodes the prediction block size and intra prediction mode information supplied from the prediction mode / block structure decoding unit 1108, and reproduces the prediction block structure for the encoding target block and the intra prediction mode in each prediction block. To do. The intra prediction information decoding unit 1110 supplies the reproduced intra prediction mode to the intra prediction unit 1113 and also supplies it to the prediction mode information memory 1112.

The motion information decoding unit 1111 is supplied from the prediction mode / block structure decoding unit 1108 and specifies the prediction block size, the motion compensation prediction mode, and the prediction type, motion vector, and reference image designation information corresponding to the prediction mode. From the decoded motion information and the motion information of the candidate block group supplied from the prediction mode information memory 1112 to reproduce the prediction type, motion vector, and reference image designation information used for motion compensation prediction, This is supplied to the compensation prediction unit 1114. The motion information decoding unit 1111 also supplies the reproduced motion information to the prediction mode information memory 1112. A detailed configuration of the motion information decoding unit 1111 will be described later.

The prediction mode information memory 1112 has the same function as the prediction mode information memory 119 in the moving picture encoding apparatus in FIG. 1 and is supplied from the reproduced motion information supplied from the motion information decoding unit 1111 and the intra prediction information decoding unit 1110. The intra prediction mode to be performed is stored for a predetermined image on the basis of the minimum prediction block size unit. In addition, the prediction mode information memory 1112 supplies motion information of the spatial candidate block group and the temporal candidate block group to the motion information decoding unit 1111 as motion information of the candidate block group, and also intra of the decoded adjacent block in the same frame. The prediction mode information is supplied to the intra prediction information decoding unit 1110 as a prediction candidate of the mode information of the target prediction block.

The intra prediction unit 1113 has the same function as the intra prediction unit 114 in the moving picture coding apparatus in FIG. 1, and performs intra prediction from the intra-frame decoded image buffer 1105 according to the intra prediction mode supplied from the intra prediction information decoding unit 1110. A reference image is input, an intra prediction signal is generated, and supplied to the prediction mode / block structure selection unit 1109.

The motion compensation prediction unit 1114 has the same function as the motion compensation prediction unit 112 in the video encoding device of FIG. 1, and based on the motion information supplied from the motion information decoding unit 1111, the reference image in the decoded image memory 1107. A prediction signal is generated by acquiring an image signal at a position obtained by moving the reference image indicated by the designation information from the same position as the image signal of the prediction block by the motion vector value. If the prediction type of motion compensation prediction is bi-prediction, an average of the prediction signals of each prediction type is generated as a prediction signal, and the prediction signal is supplied to the prediction mode / block structure selection unit 1109.

The prediction mode / block structure selection unit 1109 performs CU partitioning based on the CU partitioning configuration for each coding block supplied from the prediction mode / block structure decoding unit 1108 and the prediction mode information, and reproduces the predicted Depending on the prediction mode of the block structure unit, in the case of motion compensation prediction, a motion compensation prediction signal is input from the motion compensation prediction unit 1114, and in the case of intra prediction, an intra prediction signal is input from the intra prediction unit 1113 and reproduced. The predicted signal is supplied to the adding unit 1104.

The output terminal 1115 outputs the decoded image signal supplied from the decoded image memory 1107 to a display medium such as a display, thereby reproducing the decoded image signal.

The configuration of the video decoding device shown in FIG. 11 can also be realized by hardware such as an information processing device including a CPU, a frame memory, a hard disk, and the like, similarly to the configuration of the video encoding device shown in FIG. is there.

FIG. 12 is a flowchart showing a flow of operation in units of coding blocks of decoding processing in the video decoding apparatus according to Embodiment 1 of the present invention. First, CU_Depth, which is a control parameter for CU partitioning, is initialized to 0 (S1200), and the demultiplexing unit 1101 converts the coded bitstream supplied from the input terminal 1100 into the coded sequence of prediction error information and the coded data. The block is divided into a CU partition configuration and a coded sequence of mode information used for prediction (S1201). The encoded sequence of prediction error information in units of encoded blocks, the CU partition configuration in units of the encoded blocks, and the encoded sequence of mode information used for prediction are the prediction difference information decoding unit 1102, the prediction mode / It is supplied to the block structure decoding unit 1108 and subjected to decoding processing in units of CUs based on the CU partition structure (S1202). The detailed operation of step S1202 will be described later.

Subsequently, the CU partitioning configuration for each coding block is decoded by the prediction mode / block structure decoding unit 1108 in step S1202, and the decoded coding structure information is stored in the prediction mode information memory 1112 (S1203).

The decoded image signal decoded by the decoding process in units of CU (S1202) is subjected to loop filter processing in the loop filter unit 1106 (S1204), stored in the decoded image memory 1107 (S1205), and decoded in units of coding blocks. Ends. In the first embodiment, the loop filter is applied in the process of the coding block unit, but the decoded image signal subjected to the loop filter is not referred to in the decoding process of the same frame, and in the motion compensation prediction of the subsequent frame Since it is referred to, it is possible to perform the process on the entire frame after the decoding process for the entire frame is completed without performing the process for each coding block.

[Details of decryption processing in CU units]
Next, details of the decoding process in units of CUs, which is step S1202 in the flowchart of FIG. 12, will be described using the flowchart of FIG.

First, it is determined whether or not the CU_Depth of the target CU is smaller than the value Max_CU_Depth indicating the number of layers between the set maximum CU size and the minimum CU size (S1300). Since the control parameters relating to the maximum CU size and the minimum CU size in FIG. 3 are encoded and transmitted, Max_CU_Depth at the time of encoding is decoded by decoding the control parameters in the decoding process. An example of the encoding information that defines Max_CU_Depth will be described later.

If CU_Depth is smaller than Max_CU_Depth (S1300: YES), CU partition information is acquired (S1301). As an example, 1-bit flag information (cu_split_flag) is encoded and transmitted according to the selection of whether or not to divide a CU, and whether or not the CU is divided by decoding this flag information. Recognize

When the CU is divided (S1302: YES), in order to divide and decode the CU, 1 is added to the CU division CU_Depth (S1303), and the CU unit decoding process is performed on the CU one layer below (S1304). -S1307), processing of division 1 area (S1304), division 2 area processing (S1305), division 3 area processing (S1306), division 4 area processing (S1307) with respect to the CU division area Specifically, the processing described in the flowchart of FIG. 13 is performed.

When CU_Depth is greater than or equal to Max_CU_Depth (S1300: NO), and when the CU is not divided (S1302: NO), the size of the CU to be decoded is determined, and it corresponds to the prediction mode in the determined CU. Decoding processing is performed.

First, information indicating whether intra prediction or motion compensation prediction is used for prediction within the CU is acquired. (S1308). In the first embodiment, skip flag information (skip_flag) indicating whether or not the skip mode is in CU units, and prediction indicating whether the prediction is intra prediction or motion compensation prediction when the CU is not in skip mode. Mode flag information (pred_mode_flag) is encoded as prediction mode information in units of CU at the time of encoding, and information indicating whether it is intra prediction or motion compensated prediction (including skip mode) by decoding these. Can be obtained.

Subsequently, when the CU is intra prediction (S1309: YES), intra prediction decoding processing for each CU is performed by the intra prediction information decoding unit 1110 and the intra prediction unit 1113 in FIG. 11 (S1311), and the target An intra prediction signal in the CU is generated and added to the decoding error signal to generate a decoded image signal (S1312), and the decoding process in units of CUs is completed.

When the CU is not intra prediction (S1309: NO), motion compensation prediction decoding processing for each CU is performed by the motion information decoding unit 1111 and the motion compensation prediction unit 1114 in FIG. 11 (S1310), and motion in the target CU. A compensated prediction signal is generated and added to the decoded error signal to generate a decoded image signal (S1312), and the decoding process for each CU is completed. Details of the operation in step S1310 will be described later.

Next, details of the motion compensated prediction decoding process in the target CU, which is step S1310 in the flowchart of FIG. 13, will be described using the flowchart of FIG. First, a decoded skip flag is acquired as information indicating the prediction mode in units of CUs (S1400). When the skip flag is 1, that is, in the skip mode (S1401: YES), prediction block partitioning within the CU is performed. The mode is 2N × 2N, NumPart is set to 1, and prediction block unit decoding of a 2N × 2N prediction block is performed (S1402).

When skip_flag is 0, that is, when the skip mode is not set (S1401: NO), the CU partition (PU) mode is set as the CU partition mode value that is the type of the motion compensated prediction block size selected by the CU at the time of encoding. When the PU mode is 2N × 2N (S1404: YES), NumPart is set to 1 and prediction block unit decoding of the 2N × 2N prediction block is performed (S1402). .

When the PU mode is not 2N × 2N (S1404: NO) and when the PU mode is 2N × N (S1405: YES), NumPart is set to 2 and prediction block unit decoding of 2N × N prediction blocks is performed. (S1406).

Subsequently, when CU_Depth is equal to Max_CU_Depth, the target CU size is 8 × 8, and an inter_4x4_enable flag described later is 1 (S1407: YES), it is further determined whether or not the PU mode is N × 2N. However, when the PU mode is N × 2N (S1409: YES), NumPart is set to 2 and prediction block unit decoding of the N × 2N prediction block is performed (S1408).

When the PU mode is not N × 2N (S1409: NO), the PU mode is N × N, NumPart is set to 4, and prediction block unit decoding of the N × N prediction block is performed (S1410).

When the condition of step S1407 is not satisfied (S1407: NO), since the N × N prediction block is not applied in the CU, NumPart is set to 2, and prediction block unit decoding of the N × 2N prediction block is performed. (S1408). Details of the prediction block unit decoding process for each PU mode performed in steps S1402, S1406, S1408, and S1410 will be described later.

In the first embodiment, regarding the condition judgment for selecting the prediction block unit decoding process for the decoded PU mode, the processes are performed in the order shown in steps S1404 to S1409 as shown in the flowchart of FIG. However, if the decoding process is performed in units of prediction blocks in accordance with the decoded PU mode, it is possible to implement a different configuration regarding the order of conditional branches.

After the prediction block unit decoding process for each PU mode is performed, mode information such as the PU mode and motion information for each prediction block is stored in the prediction mode information memory 1112 in FIG. 11 (S1411), and motion compensation for the CU is performed. The predictive decoding process ends.

[Detailed Function Description of Embodiment 1]
Next, the operation of the motion compensated prediction block structure selection unit 113 of the video encoding apparatus according to Embodiment 1 of the present invention, the detailed operation of the processes in steps S701, S702, S703, and S705 in the flowchart of FIG. explain.

[Detailed Operation Explanation of Motion Compensated Prediction Block Structure Selection Unit in Moving Picture Encoding Device in Embodiment 1]
FIG. 15 is a diagram illustrating a detailed configuration of the motion compensated prediction block structure selection unit 113 in the video encoding device according to the first embodiment. The motion compensation prediction block structure selection unit 113 has a function of determining an optimal motion compensation prediction mode and a prediction block structure.

The motion compensation prediction block structure selection unit 113 includes a motion compensation prediction generation unit 1500, a prediction error calculation unit 1501, a prediction vector calculation unit 1502, a difference vector calculation unit 1503, a motion information code amount calculation unit 1504, and a prediction mode / block structure evaluation unit. 1505, a combined motion information calculation unit 1506, a combined motion information single prediction conversion unit 1507, and a combined motion compensation prediction generation unit 1508 are included.

The motion vector value input from the motion vector detection unit 111 to the motion compensation prediction block structure selection unit 113 in FIG. 1 is supplied to the motion compensation prediction generation unit 1500 and input from the prediction mode information memory 119. Is supplied to the prediction vector calculation unit 1502 and the combined motion information calculation unit 1506.

In addition, reference image designation information and motion vectors used for motion compensation prediction are output from the motion compensation prediction generation unit 1500 and the combined motion compensation prediction generation unit 1508 to the motion compensation prediction unit 112. The generated motion compensated prediction image is supplied to the prediction error calculation unit 1501. The prediction error calculation unit 1501 is further supplied with an image signal of a prediction block to be encoded from the encoding block acquisition unit 102.

Also, the prediction mode / block structure evaluation unit 1505 supplies the prediction block structure, the motion information to be encoded and the determined prediction mode information, and the motion compensated prediction signal to the prediction mode selection unit 116.

The motion compensation prediction generation unit 1500 receives the motion vector value calculated for each reference image usable for prediction in each prediction block structure, and performs motion compensation prediction according to the bi-prediction restriction information shown in FIG. The reference image designation information is supplied to the prediction vector calculation unit 1502, and the reference image designation information and the motion vector are output.

The prediction error calculation unit 1501 calculates a prediction error evaluation value from the input motion compensated prediction image and the prediction block image to be processed. As the calculation for calculating the error evaluation value, the sum SAD of the absolute difference value for each pixel, the sum SSE of the square error value for each pixel, and the like can be used as in the error evaluation value in motion vector detection. Furthermore, a more accurate error evaluation value can be calculated by taking into account the amount of distortion components generated in the decoded image by performing orthogonal transform / quantization performed when encoding the prediction residual. . In this case, the prediction error calculation unit 1501 can be realized by having the functions of the subtraction unit 103, the orthogonal transformation / quantization unit 104, the inverse quantization / inverse transformation unit 106, and the addition unit 107 in FIG.

The prediction error calculation unit 1501 supplies the prediction error evaluation value calculated in each prediction mode and each prediction block structure and the motion compensation prediction signal to the prediction mode / block structure evaluation unit 1505.

The prediction vector calculation unit 1502 is supplied with the reference image designation information from the motion compensation prediction generation unit 1500, and the motion vector for the designated reference image from the candidate block group in the adjacent block motion information supplied from the prediction mode information memory 119. A value is input, a plurality of prediction vectors are generated together with a prediction vector candidate list, and supplied to the difference vector calculation unit 1503 together with reference image designation information. The prediction vector calculation unit 1502 creates prediction vector candidates and registers them as prediction vector candidates.

The difference vector calculation unit 1503 calculates the difference between each of the prediction vector candidates supplied from the prediction vector calculation unit 1502 and the motion vector value supplied from the motion compensated prediction generation unit 1500, and calculates the difference vector value. calculate. When the prediction vector index which is the designation information for the calculated difference vector value and the prediction vector candidate is encoded, the code amount is the smallest. The difference vector calculation unit 1503 supplies the prediction vector index and the difference vector value for the prediction vector having the smallest information amount, together with the reference image designation information, to the motion information code amount calculation unit 1504.

The motion information code amount calculation unit 1504 requires motion information in each prediction block structure and each prediction mode from the difference vector value, reference image designation information, prediction vector index, and prediction mode supplied from the difference vector calculation unit 1503. The code amount is calculated. Also, the motion information code amount calculation unit 1504 receives from the combined motion compensation prediction generation unit 1508 information indicating the combined motion information index and the prediction mode that needs to be transmitted in the combined prediction mode, and moves in the combined prediction mode. The amount of code required for information is calculated.

The motion information code amount calculation unit 1504 supplies the motion information calculated in each prediction block structure and each prediction mode and the code amount required for the motion information to the prediction mode / block structure evaluation unit 1505.

The prediction mode / block structure evaluation unit 1505 uses the prediction error evaluation value of each prediction mode supplied from the prediction error calculation unit 1501 and the motion information code amount of each prediction mode supplied from the motion information code amount calculation unit 1504. Calculating a total motion compensation prediction error evaluation value of each prediction mode, selecting a prediction mode and a prediction block size which are the smallest evaluation values, and selecting a prediction mode, a prediction block size and motion information for the selected prediction mode. To the prediction mode selection unit 116. Similarly, the prediction mode / block structure evaluation unit 1505 selects a prediction signal in the selected prediction mode and prediction block size with respect to the motion compensation prediction signal supplied from the prediction error calculation unit 1501, and selects a prediction mode selection unit. To 116.

The combined motion information calculation unit 1506 uses the candidate block group in the motion information of the adjacent blocks supplied from the prediction mode information memory 119, a prediction type indicating whether the prediction is uni-prediction or bi-prediction, reference image designation information, A plurality of pieces of motion information are generated together with a combined motion information candidate list as motion information composed of motion vector values, and supplied to the combined motion information single prediction conversion unit 1507.

FIG. 16 is a diagram illustrating a configuration of the combined motion information calculation unit 1506. The combined motion information calculation unit 1506 includes a spatial combined motion information candidate list generation unit 1600, a combined motion information candidate list deletion unit 1601, a temporal combined motion information candidate list generation unit 1602, a first combined motion information candidate list addition unit 1603, and a second. A combined motion information candidate list adding unit 1604 is included. The combined motion information calculation unit 1506 creates motion information candidates in a predetermined order from spatially adjacent candidate block groups, deletes candidates having the same motion information from the candidates, and then temporally adjacent. By adding motion information candidates created from the candidate block group, only valid motion information is registered as combined motion information candidates. The point that this temporally combined motion information candidate list generation unit is arranged after the combined motion information candidate list deletion unit is a characteristic configuration of the present embodiment, and deletes the same motion information from temporally combined motion information candidates. By eliminating the processing target, it is possible to reduce the amount of calculation without reducing the encoding efficiency. The detailed operation of the combined motion information calculation unit 1506 will be described later.

Returning to FIG. 15, the combined motion information single prediction conversion unit 1507 performs the bi-directional processing shown in FIG. 10 on the combined motion information candidate list supplied from the combined motion information calculation unit 1506 and the motion information registered in the candidate list. In accordance with the prediction restriction information, motion information whose prediction type is bi-prediction is converted into uni-prediction motion information and supplied to the combined motion compensation prediction generation unit 1508.

The combined motion compensated prediction generation unit 1508 corresponds to each registered combined motion information candidate from the combined motion information candidate list supplied from the combined motion information single prediction conversion unit 1507 according to the prediction type based on the motion information. The reference image designation information and motion vector value of one reference image (uni-prediction) or two different reference images (bi-prediction) are designated to the motion compensation prediction unit 112 to generate a motion compensated prediction image, and the respective combinations The motion information index is supplied to the motion information code amount calculation unit 1504.

In the configuration of FIG. 15, the prediction mode evaluation in each combined motion information index is performed by the prediction mode / block structure evaluation unit 1505. The prediction error evaluation value and the motion information code amount are used as the prediction error calculation unit 1501 and the motion information. A configuration in which the combined motion compensation prediction generation unit 1508 receives the code amount calculation unit 1504 and determines the combined motion index of the optimal combined motion compensation prediction, and then evaluates the optimal prediction mode including other prediction modes. It is also possible to take.

FIG. 17 is a flowchart for explaining detailed operations of the motion compensation prediction mode / prediction signal generation processing in steps S701, S702, S703, and S705 in the flowchart of FIG. This operation represents a detailed operation in the motion compensated prediction block structure selection unit 113 in FIG.

First, based on the NumPart set according to the predicted block size division mode (PU) in the defined CU, the steps from step S1701 to step S1708 are performed for each prediction block size obtained by PU division in the target CU (S1700). It is executed (S1709). First, a combined motion information candidate list is generated (S1701).

Subsequently, when the prediction block size is equal to or smaller than bipred_restriction_size that is a prediction block size that restricts bi-prediction set by the control parameter inter_bipred_restriction_idc that restricts bi-prediction illustrated in FIG. 10 (S1702: YES), the prediction block size is generated. The combined motion information candidate uni-prediction conversion is performed in which the bi-prediction motion information in each candidate in the combined motion information candidate list is replaced with the single-prediction motion information (S1703). If the predicted block size is not less than or equal to bipred_restriction_size (S1702: NO), the process proceeds to subsequent step S1704.

Next, a combined prediction mode evaluation value is generated based on the motion information in the combined motion information candidate list generated or replaced (S1704). Subsequently, a prediction mode evaluation value is generated (S1705), and an optimal prediction mode is selected by comparing the generated evaluation values (S1706). However, the order of evaluation value generation in steps S1704 and S1705 is not limited to this.

The prediction signal is output according to the selected prediction mode (S1707), and the motion information is output according to the selected prediction mode (S1708), thereby completing the motion compensation prediction mode / prediction signal generation processing for each prediction block. Detailed operations of steps S1701, S1703, S1704, and S1705 will be described later.

FIG. 18 is a flowchart for explaining the detailed operation of generating the combined motion information candidate list in step S1701 of FIG. This operation shows the detailed operation of the configuration in the combined motion information calculation unit 1506 in FIG.

The spatial combination motion information candidate list generation unit 1600 in FIG. 16 performs spatial combination from candidate blocks excluding candidate blocks outside the region or candidate blocks in the intra mode from the spatial candidate block group supplied from the prediction mode information memory 119. A motion information candidate list is generated (S1800). Detailed operations for generating the spatially coupled motion information candidate list will be described later.

Subsequently, the combined motion information candidate list deletion unit 1601 deletes the combined motion information candidates having the same motion information from the generated spatial combined motion information candidate list and updates the motion information candidate list (S1801). Detailed operation of the combined motion information candidate deletion will be described later.

The temporally combined motion information candidate list generation unit 1602 subsequently performs temporally combined motion from candidate blocks excluding candidate blocks outside the region or candidate blocks in the intra mode from the temporal candidate block group supplied from the prediction mode information memory 119. An information candidate list is generated (S1802) and combined with the temporally combined motion information candidate list to form a combined motion information candidate list. Detailed operation of the time combination motion information candidate list generation will be described later.

Next, the first combined motion information candidate list adding unit 1603 generates 0 to 2 first combined motion information candidates from the combined motion information candidate registered in the combined motion information candidate list generated by the temporal combined motion information candidate list generating unit 1602. A combined motion information candidate is generated and added to the combined motion information candidate list (S1803), and the combined motion information candidate list is supplied to the second combined motion information candidate list adding unit 1604. The detailed operation of adding the first combined motion information candidate list will be described later.

Next, the second combined motion information candidate list adding unit 1604 selects 0 to 4 second combined motion information candidates that do not depend on the combined motion information candidate list supplied from the first combined motion information candidate list adding unit 1603. Generated and added to the combined motion information candidate list supplied from the first combined motion information candidate list adding unit 1603 (S1804), and the process ends. Detailed operations for adding the second combined motion information candidate list will be described later.

The candidate block group of motion information supplied from the prediction mode information memory 119 to the combined motion information calculation unit 1506 includes a spatial candidate block group and a temporal candidate block group. First, generation of a spatially coupled motion information candidate list will be described.

FIG. 19 is a diagram showing a spatial candidate block group used for generating a spatially coupled motion information candidate list. The spatial candidate block group indicates a block of the same image adjacent to the prediction target block of the encoding target image. The block group is managed in units of the minimum prediction block size, and the position of the candidate block is managed in units of the minimum prediction block size, but when the prediction block size of the adjacent block is larger than the minimum prediction block size The same motion information is stored in all candidate blocks within the predicted block size. In the first embodiment, among the adjacent block groups, five blocks of block A0, block A1, block B0, block B1, and block B2 as shown in FIG.

FIG. 20 is a flowchart for explaining the detailed operation of generating the spatially coupled motion information candidate list. Of the five candidate blocks included in the spatial candidate block group, the following processing is repeated for block A0, block A1, block B0, block B1, and block B2 in the order of block A1, block B1, block B0, and block A0. (S2000 to S2003).

First, the validity of the candidate block is checked (S2001). If the candidate block is not out of the region and not in the intra mode, the candidate block is valid. If the candidate block is valid (S2001: YES), the motion information of the candidate block is added to the spatially combined motion information candidate list (S2002).

When the number of candidates added to the spatially coupled motion information candidate list is less than 4 following the repetitive processing from steps S2000 to S2003 (S2004: YES), the validity of the candidate block B2 is checked (S2005). When the block B2 is not out of the region and is not in the intra mode (S2005: YES), the motion information of the block B2 is added to the spatially coupled motion information candidate list (S2006).

Here, it is assumed that the spatial combination motion information candidate list includes motion information of four or less candidate blocks, but the spatial candidate block group is at least one or more processed blocks adjacent to the prediction block to be processed. The number of spatially coupled motion information candidate lists may be changed depending on the effectiveness of the candidate block, and the present invention is not limited to this.

FIG. 21 is a flowchart for explaining the detailed operation of combined motion information candidate deletion. Assuming that the maximum number of combined motion information candidates generated by the spatial combined motion information candidate list creation process is Max Spatial Cand, the following is performed for combined motion information candidates (candidate (i)) from i = Max Spatial Cand-1 to i> 0. This process is repeated (S2100 to S2106).

If the candidate (i) exists (YES in S2101), the following processing is repeated for the combined motion information candidates (candidate (ii)) from ii = i−1 to ii> = 0 (S2102˜). In S2105), when there is no candidate (i) (NO in S2101), the iterative process for candidate (ii) from step S2102 to S2105 is skipped.

First, whether the motion information (motion information (i)) of the candidate (i) and the motion information (motion information (ii)) of the candidate (ii) are the same is checked (S2103), and if they are the same (YES in S2103) ), Candidate (i) is deleted from the combined motion information candidate list (S2104), and the iterative process for candidate (ii) ends.

If the motion information (i) and the motion information (ii) are not the same (NO in S2103), 1 is subtracted from ii, and the process for the candidate (ii) is repeated (S2102 to S2105).

Subsequent to the repetitive processing from step S2100 to S2105, 1 is subtracted from i, and the processing for candidate (i) is repeated (S2100 to S2106).

FIG. 22 shows a comparison relationship of candidates in the list when there are four combined motion information candidates. That is, four spatially combined motion information candidates that do not include temporally combined motion information candidates are compared by brute force to determine identity, and duplicate candidates are deleted.

Here, the joint prediction mode uses temporal and spatial continuity of motion, and the prediction target block encodes motion information of spatially and temporally adjacent blocks without directly encoding its own motion information. Although the spatially coupled motion information candidate is based on continuity in the spatial direction, the temporally coupled motion information candidate is generated by the method described later based on the temporal direction continuity. These properties are different. Therefore, it is rare that the same motion information is included in the temporally combined motion information candidate and the spatially combined motion information candidate, and the temporally combined motion information candidate is removed from the target of the combined motion information candidate deletion process for deleting the same motion information. Even if they are excluded, it is rare that the same motion information is included in the finally obtained combined motion information candidate list.

Also, as will be described later, temporally combined motion information candidate blocks are managed in units of minimum time prediction blocks that are larger in size than the minimum prediction block, so that the size of prediction blocks that are temporally adjacent are larger than the minimum time prediction block If it is small, motion information at a position deviating from the original position is used, and as a result, the motion information often includes an error. Therefore, the motion information is often different from the motion information of the spatially coupled motion information candidate, and there is little influence even if the motion information is excluded from the target of the combined motion information candidate deletion process for deleting the same motion information.

FIG. 23 is an example of comparison contents of candidates in deletion of combined motion information candidates when the maximum number of spatially combined motion information candidates is four. FIG. 23A shows the comparison contents when only the spatially coupled motion information candidate is the target of the coupled motion information candidate deletion process, and FIG. 23B is the target of processing the spatially coupled motion information candidate and the temporally coupled motion information. It is a comparison content in the case of. By using only the spatially combined motion information candidates as the target of the combined motion information candidate deletion process, the number of comparisons of motion information is reduced from 10 times to 6 times.

Thus, by not deleting the temporally combined motion information candidate as the target of the combined motion information candidate deletion process, the number of motion information comparisons is reduced from 10 to 6 while appropriately deleting the same motion information. Is possible.

Also, it is possible to reduce the number of combined motion information candidate deletion processes by comparing only candidates that are close in spatial position without comparing the same of all spatial prediction candidates. Specifically, the combined motion information calculated from the B1 position in FIG. 19 is compared with the combined motion information at the A1 position, the combined motion information calculated from the B0 position is compared with only the combined motion information at the B1 position, and A0 By comparing the combined motion information calculated from the position with only A1 and the combined motion information calculated from the B2 position with only A1 and B1, the number of motion information comparisons can be limited to a maximum of five.

When the same comparison of combined motion information is performed only for a specific spatial prediction candidate as described above, it is better to perform the combined motion information candidate reduction process (S1801) during the generation of the spatial combined motion information candidate list (S1800). There is little influence of a decrease in coding efficiency due to motion information remaining. That is, by performing the same comparison of the combined motion information when generating the spatially combined motion information candidate list, it is possible to avoid adding unnecessary combined motion information, so the maximum number of spatial prediction candidates in step S2004 in FIG. 20 is limited to four. This is because the possibility that the combined motion information calculated from the B2 position can be added increases.

Subsequently, generation of a time-coupled motion information candidate list will be described. FIG. 24 is a diagram illustrating the definition of the temporal direction peripheral prediction block used for generating the temporally combined motion information candidate list. The temporal candidate block group indicates blocks in the same position as and around the prediction target block among the blocks belonging to the decoded image ColPic different from the image to which the prediction target block belongs. The block group is managed in units of minimum time prediction block size, and the positions of candidate blocks are managed in units of minimum time prediction block size. In Embodiment 1 of the present invention, the minimum temporal prediction block size is set to a size obtained by doubling the minimum prediction block size in the vertical and horizontal directions. When the size of the prediction block of temporally adjacent blocks is larger than the minimum temporal prediction block size, the same motion information is stored in all candidate blocks within the prediction block size. On the other hand, when the size of the prediction block is smaller than the minimum temporal prediction block size, the motion information of the prediction block located at the upper left of the temporal direction peripheral prediction block is used as the temporal direction peripheral prediction block information. FIG. 24B shows motion information of the temporal direction neighboring prediction block when the prediction block size is smaller than the minimum temporal prediction block size.

In FIG. 24A, blocks at positions A1 to A4, B1 to B4, C, D, E, F1 to F4, G1 to G4, H, and I1 to I16 are temporally adjacent block groups. In the first embodiment, among these temporally adjacent block groups, the temporal candidate block group is assumed to be two blocks, block H and block I6.

FIG. 25 is a flowchart for explaining the detailed operation of generating the time combination motion information candidate list. Regarding the block H and the block I11 that are two candidate blocks included in the time candidate block group (S2500, S2505), the validity of the candidate block is checked in the order of the block H and the block I11 (S2501). If the candidate block is valid (S2501: YES), the processing from step S2502 to step S2504 is performed, the generated motion information is registered in the time combination motion information candidate list, and the processing ends. When the candidate block indicates a position outside the screen area, or when the candidate block is an intra prediction block (S2501: NO), the candidate block is not valid, and valid / invalid determination of the next candidate block is performed.

If the candidate block is valid (S2501: YES), the reference image selection candidate to be registered in the combined motion information candidate is determined based on the motion information of the candidate block (S2502). In the first embodiment, the L0 prediction reference image is the reference image that is the closest to the processing target image among the L0 prediction reference images, and the L1 prediction reference image is the processing target image among the L1 prediction reference images. The reference image is the closest distance.

The method for determining the reference image selection candidate here is not limited to this as long as the reference image for L0 prediction and the reference image for L1 prediction can be determined. The reference image intended at the time of encoding can be determined by determining the reference image by the same method in the encoding process and the decoding process. As another determination method, for example, a method of selecting a reference image having a reference image index of 0 for a reference image for L0 prediction and a reference image for L1 prediction, or a L0 reference image and a L1 reference image used by spatially neighboring blocks. Can be used, and a method of specifying a reference image of each prediction type in the encoded stream can be used.

Next, the motion vector value to be registered in the combined motion information candidate is determined based on the motion information of the candidate block (S2503). In the first embodiment, the temporally coupled motion information calculates bi-prediction motion information based on motion vector values that are effective prediction types in motion information of candidate blocks. When the prediction type of the candidate block is L0 prediction or L1 prediction single prediction, motion information of the prediction type (L0 prediction or L1 prediction) used for prediction is selected, and its reference image designation information and motion vector value are selected. Is a reference value for generating bi-predictive motion information.

If the prediction type of the candidate block is bi-prediction, either L0 prediction or L1 prediction motion information is selected as a reference value. The reference value selection method selects, for example, motion information existing in the same prediction type as ColPic, and selects a reference image having a shorter inter-image distance from ColPic in each of L0 prediction and L1 prediction of a candidate block. For example, it is possible to select the transmission side and explicitly transmit the syntax.

When the motion vector value used as the reference for bi-predictive motion information generation is determined, the motion vector value to be registered in the combined motion information candidate is calculated.

FIG. 26 is a diagram for explaining a calculation method of motion vector values mvL0t and mvL1t registered for L0 prediction and L1 prediction with respect to the reference motion vector value ColMv for temporally coupled motion information.

The distance between images between ColPic for the reference motion vector value ColMv and the reference image that is the target of the motion vector used as a reference for the candidate block is referred to as ColDist. The inter-image distance between each reference image of L0 prediction and L1 prediction and the processing target image is set to CurrL0Dist and CurrL1Dist. A motion vector obtained by scaling ColMv with a distance ratio of ColDist to CurrL0Dist and CurrL1Dist is set as a motion vector to be registered. Specifically, the motion vector values mvL0t and mvL1t to be registered are calculated by the following

formulas

1 and 2.
mvL0t = mvCol × CurrL0Dist / ColDist (Formula 1)
mvL1t = mvCol × CurrL1Dist / ColDist (Formula 2)
It becomes.

Returning to FIG. 25, the bi-predicted reference image selection information (index) and the motion vector value generated in this way are added to the combined motion information candidates (S2504), and the temporal combined motion information candidate list creation process ends. To do.

Next, a detailed operation of the first combined motion information candidate list adding unit 1603 will be described. FIG. 27 is a flowchart for explaining the operation of the first combined motion information candidate list adding unit 1603. First, based on the number of combined motion information candidates (NumCandList) and the maximum number of combined motion information candidates (MaxNumMergeCand) registered in the combined motion information candidate list supplied from the temporally combined motion information candidate list generation unit 1602, the first additional combination is performed. MaxNumGenCand, which is the maximum number for generating motion information candidates, is calculated from Equation 3 (S2700).
MaxNumGenCand = MaxNumMergeCand-NumCandList; (NumCandList> 1)
MaxNumGenCand = 0; (NumCandList <= 1) (Formula 3)

Next, it is inspected whether MaxNumGenCand is larger than 0 (S2701). If MaxNumGenCand is not greater than 0 (S2701: NO), the process ends. If MaxNumGenCand is greater than 0 (S2701: YES), the following processing is performed. First, loopTimes that is the number of combination inspections is determined. loopTimes is set to NumCandList × NumCandList. However, if loopTimes exceeds 8, loopTimes is limited to 8 (S2702). Here, loopTimes is an integer from 0 to 7. The following processing is repeatedly performed for loopTimes (S2702 to S2708).

The combination of the combined motion information candidate M and the combined motion information candidate N is determined (S2703). Here, the relationship between the number of combination inspections, the combined motion information candidate M, and the combined motion information candidate N will be described.

FIG. 28 is a diagram for explaining the relationship between the number of combination inspections, the combined motion information candidate M, and the combined motion information candidate N. As shown in FIG. 28, M and N are different values. First, M is fixed to 0, the value of N is changed to 1 to 4 (the maximum value is NumCandList), and then the value of N is fixed to 0. The value of M is changed to 1 to 4 (the maximum value is NumCandList). Such a combination definition makes effective use of the first motion information in the combined motion information candidate list, which is the motion information with the highest probability of being selected, and actually calculates the combination pattern without having a combination table. There is an effect that can be calculated.

It is checked whether the L0 prediction of the combined motion information candidate M is valid and the L1 prediction of the combined motion information candidate N is valid (S2704). If the L0 prediction of the combined motion information candidate M is valid and the L1 prediction of the combined motion information candidate N is valid (S2704: YES), the combined motion information candidate uses the motion vector of the L0 prediction of the combined motion information candidate M and the reference image. A combined motion information candidate is generated by combining the motion vector of N L1 predictions and the reference image (S2705). If the L0 prediction of the combined motion information candidate M is not valid and the L1 prediction of the combined motion information candidate N is not valid (S2704: NO), the next combination is processed. Here, as the first additional combined motion information candidate, the motion information of the L0 prediction and the L1 prediction may be the same, and even if motion compensation is performed by bi-prediction, the same result as the single prediction of the L0 prediction or the L1 prediction is obtained. Therefore, the additional combined motion information candidate generation in which the motion information of the L0 prediction and the motion information of the L1 prediction are the same is a factor that increases the amount of calculation of the motion compensation prediction. For this reason, normally, whether or not the motion information of the L0 prediction and the motion information of the L1 prediction are the same is compared, and only when they are not the same, the first additional combined motion information candidate is set.

Subsequent to step S2705, a double-coupled motion information candidate is added to the combined motion information candidate list (S2706). Subsequent to step S2706, it is checked whether the number of generated double-coupled motion information is MaxNumGenCand (S2707). If the number of generated double coupled motion information is MaxNumGenCand (YES in S2707), the process ends. If the number of generated double coupled motion information is not MaxNumGenCand (NO in S2707), the next combination is processed.

Here, the first additional combined motion information candidate is a combined motion information candidate when there is a slight difference between the motion information of the combined motion information candidate registered in the combined motion information candidate list and the motion information candidate motion to be processed. Coding efficiency can be improved by correcting the motion information of the combined motion information candidates registered in the list to generate effective combined motion information candidates.

Next, the detailed operation of the second combined motion information candidate list adding unit 1604 will be described. FIG. 29 is a flowchart for explaining the operation of the second combined motion information candidate list adding unit 1604. First, the first addition is performed based on the number of combined motion information candidates (NumCandList) and the maximum number of combined motion information candidates (MaxNumMergeCand) registered in the combined motion information candidate list supplied from the first combined motion information candidate list adding unit 1603. MaxNumGenCand, which is the maximum number for generating combined motion information candidates, is calculated from Equation 4 (S2900).
MaxNumGenCand = MaxNumMergeCand-NumCandList; (Formula 4)

Next, the following process is repeated MaxNumCand times for i (S2901 to S2905). Here, i is an integer from 0 to MaxNumGenCand-1. The second additional combined motion in which the motion vector for L0 prediction is (0,0), the reference index is i, the motion vector for L1 prediction is (0,0), and the prediction type is i for the reference index is bi-prediction. Information candidates are generated (S2902). The second additional combined motion information candidate is added to the combined motion information candidate list (S2903). The next i is processed (S2904).

Here, the second additional combined motion information candidate has a motion vector for L0 prediction of (0, 0), a reference index of i, a motion vector of L1 prediction of (0, 0), and a reference index of i. The combined motion information candidate whose prediction type is bi-prediction was used. This is because, in a general moving image, the frequency of occurrence of combined motion information candidates in which the motion vector for L0 prediction and the motion vector for L1 prediction are (0, 0) is statistically high. The present invention is not limited to this as long as it is a combined motion information candidate that is statistically frequently used without depending on the motion information of the combined motion information candidate registered in the combined motion information candidate list. For example, the motion vectors of L0 prediction and L1 prediction may be vector values other than (0, 0), respectively, and may be set so that the reference indexes of L0 prediction and L1 prediction are different. Alternatively, the second additional combined motion information candidate can be set as an encoded image or motion information with a high occurrence frequency of a part of the encoded image, encoded in an encoded stream, and transmitted. Although the B picture (B slice) has been described here, in the case of the P picture (P slice), the second additional combined motion in which the motion vector for L0 prediction is (0, 0) and the prediction type is L0 prediction. Generate information candidates.

Here, when the reference image for L0 prediction and the reference image for L1 prediction are the same as the second additional combined motion information candidate, motion compensation is performed by bi-prediction in the same manner as the first additional combined motion information candidate list generation unit. However, since the same result as the single prediction of the L0 prediction or the L1 prediction is obtained, the additional combined motion information candidate generation in which the reference image of the L0 prediction and the reference image of the L1 prediction are the same increases the calculation amount of the motion compensation prediction. It becomes a factor. However, in the embodiment of the present invention, the motion compensation unit to be described later performs a process of collectively converting bi-prediction into single prediction, so that the motion information and L1 of the L0 prediction in the second additional combined motion information candidate list adding unit It is not necessary to determine the identity of predicted motion information, and the amount of calculation can be reduced.

Here, by setting a combined motion information candidate that does not depend on the combined motion information candidate registered in the combined motion information candidate list as the second additional combined motion information candidate, the combined motion information candidate registered in the combined motion information candidate list When the number is zero, it is possible to use the joint prediction mode and improve the encoding efficiency. In addition, when the motion information of the combined motion information candidate registered in the combined motion information candidate list and the motion information candidate motion to be processed are different, by generating a new combined motion information candidate and expanding the range of options, Encoding efficiency can be improved.

FIG. 30 is a flowchart for explaining the detailed operation of the combined motion information candidate single prediction conversion process in step S1703 of FIG. First, assuming that the number of combined motion information candidate lists generated by the combined motion information candidate list generation process is num_of_index, the following processing is repeated for combined motion information candidates from i = 0 to num_of_index-1 (from S3000) S3005).

First, the motion information stored in the index i is acquired from the combined motion information candidate list (S3001). Subsequently, when the prediction type of the motion information is single prediction (S3002: YES), the process for the motion information stored in the index i is terminated as it is, and the process proceeds to the next index (S3005).

When the motion information is not uni-prediction, that is, when the motion information is bi-prediction (S3002: NO), the L1 information of the motion information stored in the index i is used to convert the bi-prediction motion information into uni-prediction. Is invalidated (S3003). In Embodiment 1, bi-prediction motion information is converted into L0 prediction single prediction by invalidating L1 information in this way, but conversely L0 information is invalidated and bi-prediction motion information. Can be converted to single prediction of L1 prediction, and can be realized by defining a prediction type to be invalidated when implicitly converting to single prediction.

Next, the motion information of the index i converted to single prediction is stored (S3004), and the process proceeds to the next index (S3005). The combined motion information candidates from i = 0 to num_of_index-1 are processed, and the combined motion information candidate single prediction conversion process is completed.

In the first embodiment, the combined motion information bi-prediction restriction based on the prediction block size is performed after the combined motion information candidate list is generated once and then the combined motion information candidate single prediction conversion process shown in the flowchart of FIG. 30 is performed. With regard to the combined motion information single prediction conversion process, a determination is made for each candidate generation within the process shown in the flowchart of FIG. 18 which is a combined motion information candidate generation process, and a combined motion information candidate list for single prediction is generated. However, in this case, the condition judgment based on the predicted block size enters each process, which complicates the process and increases the load of the list construction process. The first embodiment has an effect of realizing a bi-prediction restriction process that prevents an increase in the load of the list building process by performing a process of converting motion information into a single prediction after building a list once.

FIG. 31 is a flowchart for explaining the detailed operation of the combined prediction mode evaluation value generation process in step S1702 of FIG. This operation shows a detailed operation of the configuration using the combined motion compensation prediction generation unit 1508 of FIG.

First, the prediction error evaluation value is set to the maximum value, and the combined motion information index that minimizes the prediction error is initialized (for example, a value outside the list such as −1) (S3100). If the number of combined motion information candidate lists generated by the combined motion information candidate list generation process is num_of_index, the following processing is repeated for combined motion information candidates from i = 0 to num_of_index-1 (S3101 to S3109).

First, the motion information stored in the index i is acquired from the combined motion information candidate list (S3102). Subsequently, a motion information code amount is calculated (S3103). In the joint prediction mode, since only the joint motion information index is encoded, only the joint motion information index becomes the motion information code amount.

In the first embodiment, a Truncated Unary code string is used as the code string of the combined motion information index. FIG. 32 is a diagram showing a Trunked Unary code string when the number of combined motion information candidates is five. When the value of the combined motion information index is encoded using the Truncated Unary code string, the smaller the combined motion information index, the smaller the code bits assigned to the combined motion information index. For example, when the number of combined motion information candidates is 5, if the combined motion information index is 1, it is represented by 2 bits of “10”, but if the combined motion information index is 3, 4 bits of “1110”. It is expressed by Here, as described above, the Truncated Unary code string is used to encode the combined motion information index, but other code string generation methods can be used, and the present invention is not limited to this.

Subsequently, when the motion information prediction type is single prediction (S3104: YES), the reference image designation information and the motion vector for one reference image are set in the motion compensation prediction unit 112 in FIG. A prediction block is generated (S3105). When the motion information is not uni-prediction, that is, when the motion information is bi-prediction (S3104: NO), reference image designation information and motion vectors for two reference images are set in the motion compensation prediction unit 112, and motion compensation is performed. A bi-prediction block is generated (S3105).

Subsequently, a prediction error evaluation value is calculated from the prediction error and the motion information code amount of the motion compensated prediction block and the prediction target block (S3107), and when the prediction error evaluation value is the minimum value, the evaluation value is updated. The prediction error minimum index is updated (S3108).

As a result of comparison of the prediction error evaluation values for all the combined motion information candidates, the selected prediction error minimum index is output together with the prediction error minimum value and the motion compensated prediction block as a combined motion information index used in the combined prediction mode. (S3109), the combined prediction mode evaluation value generation process is terminated.

FIG. 33 is a flowchart for explaining the detailed operation of the prediction mode evaluation value generation processing in step S1703 of FIG.

First, it is determined whether or not the prediction mode is single prediction (S3300). FIG. 34 shows a syntax regarding motion information of a prediction block. In FIG. 34, merge_flag indicates whether or not the mode is a joint prediction mode, and when merge_flag is 0, the motion detection prediction mode is indicated. In the case of a B slice in which bi-prediction can be used in the motion detection prediction mode, a flag inter_pred_flag indicating whether the prediction type is uni-prediction or bi-prediction is transmitted. Here, even when the size of the prediction block is equal to or smaller than the bi-prediction restricted block size, bi_pred_flag is transmitted without prohibiting bi-prediction. This is because the conditional branch is necessary for entropy coding / decoding when switching whether to transmit inter_pred_flag depending on whether the size of the prediction block is less than or equal to the bi-prediction restricted block size. This is to prevent this from happening.

33, if it is simple prediction (S3300: YES), the reference image list (LX) to be processed is set as the reference image list used for prediction (S3301). If it is not uni-prediction, it is bi-prediction, so LX is set to L0 in this case (S3302).

Next, reference image designation information (index) and motion vector values for LX prediction are acquired (S3303). Subsequently, a prediction vector candidate list is generated (S3304), an optimal prediction vector is selected from the prediction vectors, and a difference vector is generated (S3305). It is desirable to select the optimal prediction vector with the least amount of code when the difference vector between the prediction vector and the motion vector to be transmitted is actually encoded. However, the horizontal and vertical components of the difference vector are simply selected. The calculation may be simplified by a method such as selecting one having a small absolute sum.

Subsequently, it is determined again whether or not the prediction mode is single prediction (S3306). If the prediction mode is single prediction, the process proceeds to step S3311. If it is not uni-prediction, that is, if it is bi-prediction, it is determined whether or not the reference list LX to be processed is L1 (S3307). If the reference list LX is L1, the process proceeds to step S3311, and if it is not L1, that is, if it is L0, if the predicted block size is equal to or smaller than bipred_restriction_size (S3308: YES), information for L1 prediction is not calculated, The prediction mode is converted to single prediction (S3310), and the process proceeds to step S3311.

When the predicted block size is larger than bipred_restriction_size (S3308: NO), LX is set to L1 (S3309), and the same processing as the processing from step S3303 to step S3306 is performed.

In the first embodiment, when decoding is performed according to the syntax related to motion information of the prediction block shown in FIG. 34 in the decoding apparatus, bi-prediction is performed with the target prediction block size when bi-prediction restriction is performed on the prediction block size. In order to prevent the motion information from being decoded, the process of steps S3308 and S3310 is used to limit the bi-prediction in the prediction mode evaluation value generation process.

When motion vector detection assuming bi-prediction is performed at the time of motion vector detection, motion vector information used in single prediction and motion vector information of single prediction generated by restricting bi-prediction in the above step are Since there may be different cases, by registering a new motion information candidate for uni-prediction, it is possible to improve the encoding efficiency as compared with the case where the motion information for bi-prediction is simply not used.

Subsequently, a motion information code amount is calculated (S3311). In the case of the uni-prediction mode, the motion information to be encoded includes three elements of reference image designation information, a difference vector value, and a prediction vector index for one reference image. In the case of the bi-prediction mode, L0 and L1 The reference image designation information, the difference vector value, and the prediction vector index for the two reference images are a total of six elements, and the total amount of the encoded amount is calculated as the motion information code amount. As a prediction vector index code string generation method according to the present embodiment, a Truncated Unary code string is used in the same manner as the combined motion information index code string.

Subsequently, reference image designation information and a motion vector for the reference image are set in the motion compensated prediction unit 112 in FIG. 1 to generate a motion compensated prediction block (S3312).

Further, a prediction error evaluation value is calculated from the prediction error and the motion information code amount of the motion compensated prediction block and the prediction target block (S3313), the prediction error evaluation value, and reference image designation information that is motion information for the reference image; The difference vector value and the prediction vector index are output together with the motion compensated prediction block (S3314), and the prediction mode evaluation value generation process ends.

The above processing is the detailed operation of the motion compensated prediction block structure selection unit 113 in the video encoding apparatus in the first embodiment.

In Embodiment 1 of the present invention, an example of syntax that is transmitted in order to recognize the inter_4x4_enable and inter_bipred_restriction_idc shown in FIG. 10, which are control parameters for limiting the memory access amount in motion compensation prediction, in the decoding apparatus Is shown in FIG.

In FIG. 35, the control parameter values shown in FIG. 10 are transmitted as they are as part of the header information set for each sequence or image. In one example, it is transmitted inside seq_parameter_set_rbsp () that transmits parameters in sequence units, and the information for the minimum CU size shown in FIG. 3 is defined by a power of 2 based on 8 (indicating 8 × 8) in log2_min_coding_block_size_minus3 The maximum CU size (encoded block size in the first embodiment) is transmitted as log2_diff_max_min_coding_block_size having a value indicating the maximum number of CU divisions (Max_CU_Depth).

inter_4x4_enable is transmitted only when log2_min_coding_block_size_minus3 is 0, that is, when the minimum CU size is 8 × 8, as inter_4x4_enable_flag, and by sending control parameters only when the control by inter_4x4_enable is valid, transmission of invalid control information Can be prevented. On the other hand, since inter_bipred_restriction_idc is necessary for control even when the minimum CU size is 16 × 16, a configuration in which it is always transmitted is adopted.

In the first embodiment, a configuration is shown in which these control parameter values are encoded and transmitted using parameters in sequence units. However, it is also possible to change the setting at intervals of a predetermined encoded block unit or more such as a frame unit. The configuration of the first embodiment is not limited to the control parameter configuration in sequence units, and the decoding apparatus can acquire the control parameters in predetermined units.

[Detailed Operation Description of Motion Information Decoding Unit in Moving Picture Decoding Device in Embodiment 1]
FIG. 36 is a diagram showing a detailed configuration of the motion information decoding unit 1111 in the video decoding device according to Embodiment 1 shown in FIG. The motion information decoding unit 1111 includes a motion information bitstream decoding unit 3600, a prediction vector calculation unit 3601, a vector addition unit 3602, a motion compensation prediction decoding unit 3603, a combined motion information calculation unit 3604, a combined motion information single prediction conversion unit 3605, and A combined motion compensated prediction decoding unit 3606 is included.

A bit stream related to motion information input from the prediction mode / block structure decoding unit 1108 is supplied to the motion information bit stream decoding unit 3600 and input from the prediction mode information memory 1112 to the motion information decoding unit 1111 in FIG. The obtained motion information is supplied to the prediction vector calculation unit 3601 and the combined motion information calculation unit 3604.

Also, reference image designation information and motion vectors used for motion compensation prediction are output from the motion compensation prediction decoding unit 3603 and the joint motion compensation prediction decoding unit 3606 to the motion information decoding unit 1111 and information indicating the prediction type is output. The included decoded motion information is supplied to the motion compensation prediction unit 1114 and the prediction mode information memory 1112.

The motion information bitstream decoding unit 3600 decodes the input motion information bitstream according to the encoding syntax, thereby generating the transmitted prediction mode and motion information corresponding to the prediction mode. Among the generated motion information, the combined motion information index is supplied to the combined motion compensated prediction decoding unit 3606, the reference image designation information is supplied to the prediction vector calculation unit 3601, and the prediction vector index is supplied to the vector addition unit 3602. The difference vector value is supplied to the vector addition unit 3602.

The prediction vector calculation unit 3601 applies a motion compensation prediction target reference image based on the motion information of adjacent blocks supplied from the prediction mode information memory 1112 and the reference image designation information supplied from the motion information bitstream decoding unit 3600. A prediction vector candidate list is generated and supplied to the vector addition unit 3602 together with the reference image designation information. Regarding the operation of the prediction vector calculation unit 3601, the same operation as the prediction vector calculation unit 1502 of FIG. 15 in the moving image encoding apparatus is performed, and the same candidate list as the prediction vector candidate list at the time of encoding is generated.

The vector addition unit 3602 indicates a prediction vector index from the prediction vector candidate list and reference image designation information supplied from the prediction vector calculation unit 3601, and the prediction vector index and difference vector supplied from the motion information bitstream decoding unit 3600. By adding the prediction vector value and the difference vector value registered at the set position, the motion vector value for the reference image to be motion compensated prediction is reproduced. The reproduced motion vector value is supplied to the motion compensated prediction decoding unit 3603 together with the reference image designation information.

The motion compensation prediction decoding unit 3603 is supplied with the reproduced motion vector value and reference image designation information for the reference image from the vector addition unit 2602, and sets the motion vector value and the reference image designation information in the motion compensation prediction unit 1114. Thus, a motion compensated prediction signal is generated.

The combined motion information calculation unit 3604 generates a combined motion information candidate list from the motion information of adjacent blocks supplied from the prediction mode information memory 1112 and combines the combined motion information candidate list and the combined motion information candidate that is a component in the list. The reference image designation information and the motion vector value are supplied to the combined motion information single prediction conversion unit 3605.

Regarding the operation of the combined motion information calculation unit 3604, the same operation as the combined motion information calculation unit 1506 in FIG. 15 in the moving image encoding apparatus is performed, and the same candidate list as the combined motion information candidate list at the time of encoding is generated. Is done.

The combined motion information single prediction conversion unit 3605 performs the same operation as the combined motion information single prediction conversion unit 1507 of FIG. 15 in the moving image encoding device, and is supplied from the combined motion information calculation unit 3604. And, for the motion information registered in the candidate list, the motion information whose prediction type is bi-prediction is converted into motion information of uni-prediction according to the bi-prediction restriction information shown in FIG. 3606.

The combined motion compensated prediction decoding unit 3606 includes a combined motion information candidate list supplied from the combined motion information single prediction conversion unit 3605, reference image designation information of a combined motion information candidate that is a component in the list, a motion vector value, and motion. Based on the combined motion information index supplied from the information bitstream decoding unit 3600, the reference image designation information and the motion vector value in the combined motion information candidate list indicated by the combined motion information index are reproduced and set in the motion compensation prediction unit 1114. Thus, a motion compensated prediction signal is generated.

FIG. 37 is a flowchart for explaining the detailed operation of the prediction block unit decoding processing in steps S1402, S1405, S1408, and S1410 of FIG. First, an encoded stream of a CU unit is acquired (S3700), and for each prediction block size obtained by performing PU division on the target CU based on NumPart set according to the prediction block size division mode (PU) in the CU (S3701). Steps S3702 to S3706 are executed (S3707).

The encoded sequence of motion information separated from the encoded stream of the CU unit is supplied to the motion information decoding unit 1111 from the prediction mode / block structure decoding unit 1108 of FIG. 11 and is supplied from the prediction mode information memory 1112. The motion information of the decoding target block is decoded using the group motion information (S3702). Details of the processing in step S3702 will be described later.

The separated coded sequence of prediction error information is supplied to the prediction difference information decoding unit 1102 and decoded as a quantized prediction error signal, and the inverse quantization / inverse transformation unit 1103 performs inverse quantization, inverse orthogonal transform, etc. By performing the above process, a decoded prediction error signal is generated (S3703).

The motion information decoding unit 1111 supplies the motion information of the decoding target block to the motion compensation prediction unit 1114, and the motion compensation prediction unit 1114 performs motion compensation prediction according to the motion information and calculates a prediction signal (S3704). The adder 1104 supplies the decoded prediction error signal supplied from the inverse quantization / inverse transform unit 1103 and the motion compensation prediction unit 1114 to the prediction mode / block structure selection unit 1109, and further selects motion compensation prediction in the prediction mode. As a result, the prediction signal supplied to the addition unit 1104 is added to generate a decoded image signal (S3705).

The decoded image signal supplied from the adding unit 1104 is stored in the intra-frame decoded image buffer 1105 and also supplied to the loop filter unit 1106. Also, the motion information of the decoding target block supplied from the motion information decoding unit 1111 is stored in the prediction mode information memory 1112 (S3706). This is applied to all the prediction blocks in the target CU, thereby completing the decoding process for each prediction block.

FIG. 38 is a flowchart for explaining the detailed operation of the motion information decoding process in step S3702 of FIG. The motion information decoding process of step S3702 of FIG. 37 is performed by the motion information bitstream decoding unit 3600, the prediction vector calculation unit 3601, and the combined motion information calculation unit 3604.

The motion information decoding process is a process of decoding motion information from an encoded bit stream encoded with a specific syntax structure. When the Skip flag decoded first in the CU unit of the encoded block indicates the Skip mode (S3800: YES), joint prediction motion information decoding is performed (S3801). Detailed processing in step S3801 will be described later.

On the other hand, if it is not the Skip mode (S3800: NO), the merge flag is decoded (S3802). When the merge flag indicates 1 (S3803: YES), the process proceeds to joint prediction motion information decoding in step S3801.

If the merge flag is not 1 (S3803: NO), the motion prediction flag is decoded (S3804), the prediction motion information is decoded (S3805), and the process is terminated. Detailed operation of step S3805 will be described later.

FIG. 39 is a flowchart for explaining the detailed operation of the joint prediction motion information decoding process in step S3801 of FIG.

First, the combined prediction mode is set as the prediction mode (S3900), and a combined motion information candidate list is generated (S3901). The process of step S3901 is the same process as the combined motion information candidate list generation process of step S1701 of FIG. 17 in the video encoding device.

Subsequently, when the prediction block size is equal to or smaller than bipred_restriction_size that is a prediction block size that restricts bi-prediction set by the control parameter inter_bipred_restriction_idc that restricts bi-prediction shown in FIG. 10 (S3902: YES), it is stored. The combined motion information candidate single prediction conversion is performed in which the bi-prediction motion information in each candidate in the combined motion information candidate list is replaced with the single prediction motion information (S3903). In this process, the same process as the combined motion information single prediction conversion process in the encoding apparatus shown in the flowchart of FIG. 30 is performed. If the predicted block size is not less than or equal to bipred_restriction_size (S3902: NO), the process proceeds to step S3904.

Next, the combined motion information index is decoded (S3904), and then the motion information stored at the position indicated by the combined motion information index is acquired from the combined motion information candidate list (S3905). The motion information to be acquired includes a prediction type indicating single prediction / bi-prediction, reference image designation information, and a motion vector value.

In Embodiment 1, since the conversion process from the bi-prediction to the single prediction of the combined motion information does not change the index value of the combined motion information, the combined motion information of the index necessary for decoding in the decoding device In this case, step S3904 and step S3903 for restricting bi-prediction based on the prediction block size are performed after performing step S3904 and step S3905 in FIG.

The generated motion information is stored as motion information in the joint prediction mode (S3906), and is supplied to the joint motion compensation prediction decoding unit 3606.

FIG. 40 is a flowchart for explaining the detailed operation of the predicted motion information decoding process in step S3805 of FIG.

First, it is determined whether or not the prediction type is simple prediction (S4000). If it is single prediction, the reference image list (LX) to be processed is set as the reference image list used for prediction (S4001). If it is not uni-prediction, it is bi-prediction, so LX is set to L0 in this case (S4002).

Next, the reference image designation information is decoded (S4003), and the difference vector value is decoded (S4004). Next, a prediction vector candidate list is generated (S4005). When the prediction vector candidate list is larger than 1 (S4006: YES), the prediction vector index is decoded (S4007), and when the prediction vector candidate list is 1 (S4006: NO), 0 is set to the prediction vector index (S4008).

Here, in step S4005, processing similar to that in step S3304 in the flowchart of FIG. 33 in the video encoding device is performed.

Next, the motion vector value stored at the position indicated by the prediction vector index is acquired from the prediction vector candidate list (S4009). A motion vector is reproduced by adding the decoded difference vector value and motion vector value (S4010).

Subsequently, it is determined again whether or not the prediction type is single prediction (S4011). If the prediction type is single prediction, the process proceeds to step S4014. If it is not uni-prediction, that is, if it is bi-prediction, it is determined whether or not the reference list LX to be processed is L1 (S4012). If the reference list LX is L1, the process proceeds to step S4014. If it is not L1, that is, if it is L0, the predicted block size is equal to or smaller than bipred_restrcition_size (S4013: YES), the process proceeds to step S4016, and the predicted block size is bipred_restriction_size. If larger (S4013: NO), LX is set to L1 (S4015), and the same processing as the processing from step S4003 to step S4011 is performed.

When the prediction block size is less than or equal to bipred_restrcition_size, bi-prediction motion compensation is prohibited, so the transmitted motion information is converted to single prediction to ensure the memory access amount limit in the decoding device. (S4016), the process proceeds to step S4014.

Subsequently, as the generated motion information, in the case of single prediction, reference image designation information and motion vector values for one reference image, and in the case of bi-prediction, reference image designation information and motion for two reference images. The vector value is stored as motion information (S4014) and supplied to the motion compensated prediction decoding unit 3603.

In the prediction motion information decoding process according to the first embodiment, the motion information transmitted at the time of encoding is decoded according to the syntax. Therefore, in the prediction mode evaluation value generation process of FIG. Although it is possible to implement the conditional branching regarding the bi-prediction restriction to ensure the restriction of the memory access amount as in the case where the condition determination in step S4013 and the process in step S4016 are omitted, In the first embodiment, the prediction motion information decoding process according to the flowchart of FIG. 40 is adopted as a configuration that ensures the limitation of the memory band even in the decoding device.

FIG. 41 is a seq_parameter_set_rbsp () etc. that transmits parameters in sequence as shown in FIG. 35, and a level_idc that defines the maximum image size of encoding / decoding processing or the maximum number of pixels for a predetermined time unit is transmitted. In this case, since the load of the memory access amount of the reference image increases in proportion to the maximum number of processing pixels, the prediction block size and the bi-prediction of motion compensation prediction are linked to the maximum number of usable processing pixels. It is an example of the structure which adds a restriction | limiting. By limiting the possible values of inter_4x4_enable and inter_bipred_restriction_idc according to the level_idc defined and transmitted by the encoding device, the memory access is limited according to the assumed image size of the encoding device / decoding device. Therefore, according to the use of the encoding device and the decoding device, it is possible to realize an encoding device and a decoding device that can secure the necessary memory bandwidth and can maintain the encoding efficiency while reducing the processing load and the scale of the device.

In FIG. 41, as an example, when level_idc is set to 6 levels, inter_4x4_enable is not restricted (in the case of 0 and 1 can be set) under the condition that encoding with a small number of pixels is assumed, inter_bipred_restriction_idc All of the defined values can be set, but with the increase of level_idc, the prediction block size and bi-prediction restrictions are added step by step from the prediction process with a large memory access amount shown in FIG. , Inter_4x4_enable (always set to only 0) and inter_bipred_restriction_idc (increase the minimum value of possible values) can be controlled in conjunction with the maximum image size and the maximum number of processed pixels.

Further, as shown in FIG. 41, the value of inter_4x4_enable and inter_bipred_restriction_idc is implicitly set to a fixed value under restriction without being transmitted in conjunction with the maximum image size and the maximum number of processed pixels with reference to level_idc. In the decoding device, it is also possible to restrict motion compensation prediction and bi-prediction according to the set restriction.In that case, the level_idc is transmitted, so that the corresponding inter_4x4_enable and inter_bipred_restriction_idc values can be decoded. Become.

In Embodiment 1, the control parameter prohibiting motion compensated prediction of 4 × 4 prediction block size called inter_4x4_enable is used, but the prediction block restriction of motion compensated prediction is the same as the inter_bipred_restriction_idc specified prediction block. It is also possible to use a control parameter that prohibits motion compensation prediction of a block size equal to or smaller than the size, which makes it possible to control the memory access amount more finely.

In the first embodiment, bi-prediction restriction is performed on the same basis when the area of the prediction block size is the same and the number of horizontal and vertical pixels is different, such as 4 × 8 pixels and 8 × 4 pixels. Assuming that the access unit of the reference image memory is generally composed of a plurality of pixels such as 4 pixels or 8 pixels in the horizontal direction, 4 × 8 pixels having a small number of pixels in the horizontal direction are more It is also possible to define motion block prediction and bi-prediction by defining a prediction block size with a large access amount, and it is possible to control the memory access amount more suitable for the configuration of the decoding device.

Further, in the first embodiment, in order to improve the efficiency of motion compensation prediction, as shown in FIG. 42, even when a prediction block having a more detailed division within the CU and asymmetric left and right and up and down is defined. By adding a restriction on the predicted block size for these blocks, it is possible to control the memory access amount in stages.

In the other configuration of the first embodiment shown in FIG. 42, the division configuration of the CU into prediction blocks is non-division (2N × 2N), horizontal / vertical division (N × N), and division only in the horizontal direction. (2N × N), vertical division only (N × 2N), horizontal only upper 1/4, lower 3/4 asymmetric division (2N × nU), horizontal upper only 3/4, lower 1/4 asymmetric division (2N × nD), left 1/4 to vertical only, right 3/4 asymmetric division (nL × 2N), left 3/4 to vertical only, Asymmetrical division (nR × 2N) of the right 1/4, so that a prediction block size of less than 4 horizontal pixels and 4 vertical pixels is not applied, only a CU with a CU size of 16 × 16 or larger is used. A split configuration is applicable.

Subsequently, FIG. 43 shows an example of the block size of motion compensation prediction and control parameters for limiting the prediction processing in the prediction block configuration of FIG. The control parameter is a parameter for controlling validity / invalidity of motion compensated prediction of 4 × 4, 4 × 8, and 8 × 4 prediction blocks, which is a configuration that divides an 8 × 8 block that is the smallest CU size, and inter_pred_enable_idc And inter_bipred_restriction_idc that defines a block size that prohibits only the prediction processing in which bi-prediction is performed in motion compensation prediction.

In the configuration of the control parameters in FIG. 43, the order of the size of the prediction block size of 16 × 16 pixels or less, which takes into account the influence on the memory access of the horizontal and vertical number of pixels with respect to inter_bipred_restriction_idc, × 4, 4 × 8, 8 × 4, 8 × 8, 4 × 16/12 × 16 (nL × 2N / nR × 2N), 8 × 16, 16 × 12/16 × 4 (2N × nU / 2N × nD), 16 × 8, and 16 × 16, and sets a prediction block size value that restricts bi-prediction. As a result, the memory access amount can be controlled in a fine unit even for a prediction block having an asymmetric configuration in which the efficiency of motion compensation prediction is improved, as in the configuration using the control parameters shown in FIG. Thus, the efficiency of motion compensation prediction is improved, and the memory access amount can be controlled according to the allowable memory bandwidth.

In Embodiment 1, bi-prediction restriction is applied to a prediction block having a size equal to or smaller than the defined size on the basis of the prediction block size defined by inter_bipred_restriction_idc. Limiting bi-prediction to a block is also possible, and bi-prediction is limited to a prediction block of a defined size if motion compensated prediction is not performed at a prediction block size that is smaller than the prediction block size to which bi-prediction restriction is applied. It is also possible to add the above as a configuration for realizing the present invention. When the restriction of bi-prediction is added to a prediction block having a size smaller than the defined size, step S1702 shown in the flowchart of FIG. 17 and step S3308 shown in the flowchart of FIG. 33 in the encoding apparatus of the first embodiment. In the decoding apparatus of the first embodiment, whether or not the condition judgment in step S3902 shown in the flowchart of FIG. 39 and step S4013 shown in the flowchart of FIG. 40 is less than bipred_restriction_size, and the prediction block size defined by inter_bipred_restriction_idc This is realized by setting the value as a prediction block size one larger.

In the first embodiment, as shown in FIG. 35, as an example, a configuration in which inter_4x4_enable and inter_bipred_restriction_idc, which are control parameters for limiting the memory access amount in motion compensation prediction, are encoded and transmitted as individual parameters, respectively. As shown in FIG. 44, if the control parameter information can be transmitted as a parameter for controlling the memory access amount restriction of the video encoding device and the video decoding device, the combination of inter_4x4_enable and inter_bipred_restriction_idc as shown in FIG. A configuration in which the information to be defined (inter_mc_restrcution_idc) is encoded and transmitted is also possible. Information that is controlled so as not to perform motion compensation prediction processing with a prediction block size that is predetermined or less than a predetermined value and dual information with a prediction block size that is less than a predetermined value The motion compensation is further improved by the information that controls so as not to perform the prediction There is an effect that the processing for performing compensation prediction and single prediction restriction of combined motion information candidates can be integrated into one instruction information and encoded transmission and decoding can be performed.

Further, in the first embodiment, as a means for prohibiting bi-prediction used for joint motion compensation prediction to limit the memory access amount with respect to motion compensation prediction, after being stored in the joint motion information candidate index Motion information is converted from bi-prediction motion information to uni-prediction motion information according to conditions, stored, and used for prediction processing. As a result, the prediction accuracy of motion compensated prediction in the prediction block size under the condition prohibiting bi-prediction is improved, and the coding efficiency is improved.

(Embodiment 2)
Next, the video encoding device and video decoding device according to Embodiment 2 of the present invention will be described. In the second embodiment, as in the first embodiment, the configuration for limiting the maximum memory access amount is the same by combining the motion compensation prediction limitation based on the prediction block size and the bi-prediction limitation equal to or less than the prediction block size. There is a structure that adds a restriction of bi-prediction to a CU partition structure at the minimum CU size, not information indicating a prediction block size that restricts a parameter that defines a restriction of bi-prediction.

FIG. 45 shows an example of the motion compensated prediction block size and control parameters for limiting the prediction processing in Embodiment 2 of the present invention.

The control parameters are inter_4x4_enable, which is a parameter for controlling the validity / invalidity of motion compensated prediction of 4 × 4 pixels, which is the smallest motion compensated prediction block size, and only prediction processing for which bi-prediction is performed among motion compensated predictions It consists of two parameters of inter_bipred_restriction_for_mincb_idc that define the CU partition structure in the minimum CU size that prohibits.

Inter_bipred_restriction_for_mincb_idc defines four values, and controls four states: no limit, N × N limit, N × 2N / 2N × N limit or less, and all divisions (PUs) in the CU. The minimum CU size is defined as a power of 2 with log2_min_coding_block_size_minus3 as a reference (indicating 8 × 8) as shown in the syntax of FIG. 35 in Embodiment 1, and the value of inter_bipred_restriction_for_mincb_idc and the minimum CU size As a result, the block size bipred_restriction_size for limiting bi-prediction is set.

The configuration of the encoding device and the decoding device in the second embodiment can be the same as that in the first embodiment, and the point that bipred_restriction_size in the first embodiment is defined by a combination of the above log2_min_coding_block_size_minus3 and inter_bipred_restriction_for_mincb_idc. , Has a different configuration. A specific definition of bipred_restriction_size is shown in FIG.

As shown in an example of the syntax shown in FIG. 47, inter_bipred_restriction_for_mincb_idc is configured in the same way as the syntax of FIG. 35 in Embodiment 1, and is transmitted as a sequence unit parameter by seq_parameter_set_rbsp (), and instead of inter_bipred_restriction_idc Is the value to be transmitted.

Since the memory access amount becomes large and the state where the memory bandwidth needs to be limited occurs with respect to the minimum CU size at the time of encoding, the configuration for limiting the bi-prediction in conjunction with the minimum CU size is managed and transmitted. When there is little wasted parameter, and it is desired to limit the memory access amount in the encoding apparatus, the bi-prediction limitation at a larger size can be defined with a small control parameter value.

Furthermore, in the second embodiment, even when the prediction within the CU is further finely divided and the left and right and up and down asymmetric prediction blocks are defined to improve the motion compensation prediction efficiency, the size restriction for each block size is set for each CU. Even if it is not added in the hierarchy, it is sufficient to add only the definition in the minimum CU size. Therefore, the expandability is high, and the encoding block size of the high-definition image that finishes high-definition is large. It has an effect that can easily realize the restriction of sheath prediction.

(Embodiment 3)
Next, the video encoding device and video decoding device according to Embodiment 3 of the present invention will be described. In Embodiment 3, in addition to motion compensation prediction and bi-prediction limitations for limiting the memory access amount, by limiting the number of operations of combined motion prediction candidate generation processing when the prediction block size is reduced, The configuration is such that the processing load required for generating the combined motion prediction candidate is reduced.

Specifically, the same combined motion information candidate generation process is performed using the motion information of the same adjacent block in each prediction block in a prediction block size equal to or smaller than a predetermined CU size. In Embodiment 3, the configuration having the above configuration is adopted for the prediction block of 8 × 8 CU size that is the minimum CU size, and the spatial peripheral prediction block in the combined motion information candidate generation of 8 × 8 CU size of Embodiment 3 Will be described with reference to FIG.

In the 8 × 8 CU, the positions of the five blocks of the block A0, the block A1, the block B0, the block B1, and the block B2 of the spatial candidate block group for the prediction block (2N × 2N) of 8 × 8 pixels are shown in FIG. As shown, the same position as the definition of the space candidate block group in the first embodiment shown in FIG. 19 is shown.

On the other hand, the position of the spatial candidate block group with respect to the 4 × 8 pixel prediction block (N × 2N), the 8 × 4 pixel prediction block (2N × N), and the 4 × 4 pixel prediction block (N × N). 48 (b), (c), and (d), as shown in FIG. 19, the block at the adjacent position of the target prediction block shown in the definition of the spatial candidate block group in the first embodiment shown in FIG. Instead, the same position as the spatial candidate block group for 8 × 8 pixels is used for all prediction blocks. Similarly, regarding the position of the temporal candidate block group, the same position as the prediction block of 8 × 8 pixels is used for all prediction blocks of 4 × 8 pixels, 8 × 4 pixels, and 4 × 4 pixels.

That is, for the target 8 × 8 CU, the same combined motion information candidate is used in all configured prediction block structures, and the combined motion information generation process in the encoding device and the decoding device is performed once. It can be realized by the generation process.

Subsequently, an encoding process for each encoding block of the moving image encoding apparatus according to Embodiment 3 will be described. In contrast to the encoding process in units of encoding blocks in Embodiment 1, the motion compensated prediction block size selection / prediction signal generation process shown in the flowchart of FIG. 7 and the motion compensated prediction mode / prediction shown in the flowchart of FIG. Since only the signal generation process is different, these processes will be described.

FIG. 49 shows a flowchart of motion compensation prediction block size selection / prediction signal generation processing in the third embodiment. Regarding the same steps as those in the flowchart of FIG. 7 of the first embodiment, the same numbers are assigned and new step numbers are assigned only to different portions.

First, an encoded block image to be predicted is acquired for the target CU (S700). Next, it is determined whether or not the CU size of the target CU is 8 × 8 (S4908). When the CU size of the target CU is 8 × 8 (S4908: YES), combined motion information candidate list generation processing is performed (S4909). If the CU size of the target CU is not 8 × 8 (S4908: NO), the process proceeds to step S701. Regarding the details of step S4909, the same process as the combined motion information candidate list generation process of FIG. 18 in the first embodiment is performed.

After performing step S4909, when the minimum prediction block size in the target CU is equal to or smaller than bipred_restriction_size (S4910: YES), combined motion information candidate single prediction conversion processing is performed (S4911). If the minimum predicted block size in the target CU is not less than or equal to bipred_restriction_size (S4910: NO), the process proceeds to step S701. Regarding the details of step S4911, the same process as the combined motion information candidate single prediction conversion process of FIG. 30 in the first embodiment is performed.

In Embodiment 3, when the combined motion information candidate generation process in bipred_restriction_size, which is a prediction block size that restricts bi-prediction, is the prediction block size used in the target CU (when inter_4x4_enable is 1, 4 * 4/4 * 8/8 * 4/8 * 8 prediction block, and when inter_4x4_enable is 0, 4 * 8/8 * 4/8 * 8 prediction block) is the same for the target CU The bi-prediction motion information is converted into a single prediction for the combined motion information candidate list generated at the same time. That is, processing in which bipred_restriction_size is expanded to 3 (8 × 8 or less restriction) is performed.

49, after performing the combined motion information candidate single prediction conversion process of step S4911, the process proceeds to step S701. The processing from step S701 to step S707 is the same as the processing from step S701 to step S707 in the flowchart of FIG. 7 in the first embodiment.

In the third embodiment, the combined motion information candidate list generation process and the combined motion information candidate uni-prediction conversion process for the 8 × 8 CU size are performed by the same operation, and the encoding device performs 8 generations by one generation process. There is an effect that it becomes possible to generate all combined motion information candidates within the × 8 CU size. In Embodiment 3, in the configuration in which the process of step S4910 in the flowchart of FIG. 49 is not performed, the combined motion information candidate list generation process is performed with the same operation for the 8 × 8 CU size, and bipred_restriction_size is expanded. The combined motion information candidate uni-prediction conversion process in a state where the prediction is not performed can be performed, but the encoding apparatus needs the combined motion information candidate uni-prediction conversion process for each prediction block size within the 8 × 8 CU size.

Next, FIG. 50 shows a flowchart of the motion compensation prediction mode / prediction signal generation processing in the third embodiment. With respect to the same steps as those in the flowchart of FIG. 17 of the first embodiment, the same numbers are assigned and new step numbers are assigned only to different portions.

When the target CU size is not 8 × 8 for each predicted block size obtained by PU partitioning within the target CU based on the NumPart set according to the predicted block size partitioning mode (PU) within the defined CU (S5010: NO), the steps from Step S1701 to Step S1708 are executed (S1709). Regarding the processing from step S1701 to step S1708, the same processing as the flowchart of FIG. 17 in the first embodiment is performed.

If the target CU size is 8 × 8 (S5010: YES), the process proceeds from step S1701 to step S1703 and proceeds to step S1704. That is, in the case of the prediction block size where the target CU size is 8 × 8, the combined motion information candidate generated by the process in the flowchart of the motion compensation prediction block size selection / prediction signal generation process shown in FIG. 49 is selected. It is configured to perform motion compensation prediction in the combined prediction mode by using it as it is.

Subsequently, the decoding process in units of coding blocks of the video decoding device in the third embodiment performs the same process as in the first embodiment, and the candidate blocks used for generating the combined motion information candidate list in the combined motion prediction. In the case where the CU targeted for only the position is 8 × 8, candidate blocks at the same position are obtained in all the prediction blocks as shown in FIG. 48, and in the combined motion information decoding process shown in the flowchart of FIG. As a determination condition in step S3902, when the CU size is 8 × 8, it can be realized by replacing the minimum prediction block size definable in the CU with bipred_restriction_size.

In the third embodiment, when the configuration in which the determination condition in step S3902 in the combined motion information decoding process shown in the flowchart of FIG. 39 is not changed is realized, the combined motion information candidate list for the 8 × 8 CU size is provided. This has the effect of enabling the combined motion information candidate single prediction conversion processing in a state where the generation processing is performed with the same operation and bipred_restriction_size is not expanded. In the decoding apparatus, since the prediction block size for the decoding target block is specified by decoding the encoded stream, a single combined motion information candidate single prediction conversion process is performed for the specified prediction block size.

In addition, in the moving picture decoding apparatus according to the third embodiment, the combined motion information candidate generation process can be realized with fewer processes. It is possible to replace the motion information decoding process with the process of the flowchart shown in FIG. 51, and the operation will be described. With respect to the same steps as those in the flowchart of FIG. 39, the same numbers are assigned and new step numbers are assigned only to different portions.

After the combined prediction mode is set as the prediction mode (S3900), it is determined whether or not the CU size of the target prediction block is 8 × 8 (S5107). If the CU size is not 8 × 8 (S5107: NO), the process proceeds to step S3901, and the joint prediction motion information decoding process similar to that of the first embodiment is performed.

On the other hand, if the CU size is 8 × 8 (S5107: YES), it is determined whether the target prediction block is the first combined prediction mode in the target CU (S5108). If it is the first combined prediction mode (S5108: YES), combined motion information candidate list generation processing is performed (S5109). In step S5109, as shown in FIG. 48, the same processing as that in step S3901 is performed with a configuration in which candidate blocks at the same position are acquired for all prediction blocks in the CU.

If it is not the first combined prediction mode (S5108: NO), since the combined motion information candidate list generated in the same way in the target CU has already been generated, the combined motion information candidate list generation processing is not performed, and the process proceeds to step S3904. move on. Since decoding processing can be performed by generating a combined motion information candidate list once in the target CU, combined motion information candidate list generation processing when a plurality of combined prediction modes exist in the 8 × 8 CU is reduced.

After performing step S5109, it is determined whether or not the minimum predicted block size definable in the CU is equal to or smaller than bipred_restriction_size (S5110). If the minimum predicted block size is equal to or smaller than bipred_restriction_size (S5110: YES). Then, the combined motion information candidate single prediction conversion process is performed (S3903), and if the minimum prediction block size is larger than bipred_restriction_size (S5110: NO), the process proceeds to step S3904.

Regarding the processing from step S3904 to step S3906, the same processing as the processing in the flowchart of FIG. 39 in the first embodiment is performed, and the motion information in the joint prediction mode is decoded and stored.

According to the moving picture coding apparatus and the moving picture decoding apparatus in the third embodiment, combined motion prediction candidate generation when the motion compensation prediction for limiting the memory access amount, the restriction of bi-prediction, and the prediction block size are reduced It is possible to realize processing reduction with a configuration that is consistent with each restriction, and to improve the coding efficiency while simultaneously reducing the memory bandwidth and reducing the combined motion information candidate generation process.

The unit constituting the same combined motion information candidate list in Embodiment 3 has been described as an 8 × 8 size, but is not limited to the 8 × 8 size, and is a predetermined unit such as a picture unit or a sequence unit. The unit can be changed by transmitting parameter information defining the maximum predicted block size for generating the same list. As a parameter, for example, log2_parallel_merge_level_minus2 can be defined as a value corresponding to a power of 2 that serves as a reference for the horizontal / vertical size of the predicted block size for generating the same list.

(Embodiment 4)
Next, the video encoding device and video decoding device according to Embodiment 4 of the present invention will be described. In the fourth embodiment, as in the third embodiment, in addition to motion compensation prediction and bi-prediction restriction for restricting the memory access amount, combined motion prediction candidate generation processing when the prediction block size is reduced By limiting the number of operations, the processing load required for generating the combined motion prediction candidate is reduced.

In the moving picture coding apparatus according to the fourth embodiment, the combined motion information single prediction is performed in the motion compensated prediction block structure selection unit 113 shown in FIG. 15 in contrast to the moving picture coding apparatus shown in the first embodiment. The conversion unit 1507 is eliminated, and the motion vector, the reference image designation information, and the combined motion information candidate list output from the combined motion information calculation unit 1506 are directly supplied to the combined motion compensation prediction generation unit 1508.

In addition, in the video decoding device in the fourth embodiment, the combined motion information single prediction conversion unit 3605 in the motion information decoding unit 1111 shown in FIG. 36 is compared with the video decoding device shown in the first embodiment. The motion vector, the reference image designation information, and the combined motion information candidate list output from the combined motion information calculation unit 3604 are directly supplied to the combined motion compensation prediction decoding unit 3606.

In the fourth embodiment, instead of the motion information bi-prediction to uni-prediction conversion processing performed by the combined motion information uni-prediction conversion unit, bi-prediction is performed when the prediction block size is equal to or less than bipred_restriction_size during motion compensation prediction. By performing motion compensation for single prediction using only one of the motion information of the L0 prediction or the L1 prediction of the motion information, motion compensation prediction with a limited memory access amount is performed.

Specifically, in the encoding process in the fourth embodiment, the processes in step S1702 and step S1703 are eliminated in the motion compensation prediction mode / predicted signal generation process shown in the flowchart of FIG. 17 in the first embodiment. The prediction mode evaluation value generation shown in the flowchart of FIG. 33 and the inside of the motion compensation (single / bi) prediction block generation processing performed in step S3105 and step S3106 in the combined prediction mode evaluation value generation processing shown in the flowchart of 31. Within the motion compensated prediction block generation process performed in step S3312 of the process, a process for limiting to single prediction is performed.

In the fourth embodiment, the motion compensated prediction block generation operation performed in steps S3105 and S3106 of the flowchart of FIG. 31 and step S3312 of the flowchart of FIG. 33 is shown in the flowchart of FIG. The flowchart of FIG. 52 is the detailed operation of the motion compensation prediction unit 112 in the moving picture encoding apparatus shown in FIG. 1 in the fourth embodiment, and performs the following operation.

When the prediction type of the supplied motion information is single prediction (S5200: YES), a motion compensated single prediction block is generated using reference image designation information and a motion vector for one reference image (S5203).

If the supplied motion information is not uni-prediction, that is, if the motion information is bi-prediction (S5200: NO), whether the motion information for L0 prediction and the motion information for L1 prediction (reference image information and motion vector) are the same. If the motion information of the L0 prediction and the motion information of the L1 prediction are the same (S5201: YES), the L0 single prediction motion compensation prediction is performed using only the motion information of the L0 prediction (S5204). However, the motion information of bi-prediction is maintained and the motion information of L1 prediction is not changed.

When the supplied motion information of the L0 prediction and the motion information of the L1 prediction are not the same (S5201: NO), it is determined whether or not the prediction block size is equal to or smaller than bipred_restriction_size, and when the predicted block size is equal to or smaller than bipred_restriction_size (S5202: If the motion information for L0 prediction and the motion information for L1 prediction are the same (S5201: YES), L0 single prediction motion compensation prediction is performed using only the motion information for L0 prediction (S5204). However, the motion information of bi-prediction is maintained and the motion information of L1 prediction is not changed. The purpose of the bi-prediction restriction is to limit the memory band of motion compensation prediction by restricting the bi-prediction to the single prediction. Therefore, the prediction list (L0 / L1) restricted by the bi-prediction restriction is set to the L1 single prediction. May be.

When the supplied prediction block size is larger than bipred_restriction_size (S5202: NO), a motion-compensated bi-prediction block is generated using reference image designation information and motion vectors for two reference images (S5205).

Further, in the decoding process in the fourth embodiment, in the joint prediction motion information decoding process shown in the flowchart of FIG. 39 in the first embodiment, the processes in steps S3902 and S3903 are eliminated, and the prediction shown in the flowchart of FIG. Within the motion compensated prediction signal calculation process performed in step S3704 in the block unit decoding process, the process for limiting to single prediction is performed by the process shown in the flowchart of FIG. 52, similarly to the encoding process.

In the fourth embodiment, one of the L0 prediction and the L1 prediction is included in the motion information of the bi-prediction at the time of motion compensation prediction without using the configuration for converting the combined motion information candidate list into the single prediction for the restriction process of the bi-prediction. By using a configuration that performs motion compensation prediction of single prediction using only motion information, the memory access amount is limited.

Also, in the bi-prediction restriction process, the prediction information is the same as the single prediction, but the motion information can maintain a combined motion information candidate that is bi-predicted. As a result, even in a prediction block of bipred_restriction_size or less, motion information is stored for both L0 prediction and L1 prediction, so that information of bi-prediction is used as it is as adjacent reference motion information of a prediction block to be encoded and decoded thereafter. In addition, it is possible to improve the prediction efficiency of the motion prediction process of the prediction block that is encoded / decoded thereafter.

In addition, the same motion information is used as the combined motion information candidate list, and the prediction block size is small because the memory access amount can be limited by the bi-prediction restriction at the time of motion compensation prediction in the prediction block sizes of different sizes. When the same combined motion prediction candidate list is generated, even if the reference prediction block size and bipred_restriction_size constituting the same list are different, the combined motion information candidate is obtained by adopting the configuration of the fourth embodiment. In order to limit the memory access as well as having the effect of realizing the function only by the bi-prediction restriction at the time of motion compensation prediction, without having to add a condition judgment that takes into account both the same list configuration and the bi-prediction restriction at the time of list generation Add bi-prediction restriction to joint motion information at a prediction block size larger than the controlled bipred_restriction_size Since it becomes unnecessary, it has the effect of improving the encoding efficiency.

In addition, in the configuration in which bi-prediction restriction is performed at the time of motion compensation prediction, both the bi-prediction restrictions of two prediction modes (joint prediction mode and motion detection prediction mode) for encoding motion information can be handled in a lump. Bi-prediction restriction can be realized with a minimum configuration.

(Embodiment 5)
Next, the video encoding device and video decoding device according to Embodiment 5 of the present invention will be described. In the fifth embodiment, as in the first embodiment, the motion compensation prediction restriction based on the prediction block size and the bi-predictive motion compensation restriction for restricting the memory access amount are performed. For this purpose, the bi-prediction to uni-prediction conversion method for motion information in the combined motion information candidate list is different.

In the fifth embodiment, the same configuration and processing as in the first embodiment are performed, but the combined motion information candidate list generation processing shown in the flowchart of FIG. 18 and the flowchart of FIG. 30 in the first embodiment. The combined motion information candidate single prediction conversion process is different.

51, the combined motion information candidate list generation process in the fifth embodiment will be described. In the fifth embodiment, the process shown in FIG. 53 is performed in step S1701 in the flowchart of FIG. 17 for the encoding process and in step S3901 in the flowchart of FIG. 39 for the decoding process. With respect to the same steps as those in the flowchart of FIG. 18 of the first embodiment, the same numbers are assigned and new step numbers are assigned only to different portions.

Through the processing from step S1800 to step S1802, spatially combined motion information candidates and temporally combined motion information candidates obtained by deleting the same information from the spatial candidate block group, which are candidates for combined motion information, are calculated, and candidate blocks The combined motion information calculated from the motion information is generated. Next, num_list_before_combined_merge, which is the number of combined motion information generated up to step S1802, is stored (S5305). This value is used in the combined motion information candidate single prediction conversion process described later.

Subsequently, the first combined motion information candidate and the combined motion information candidate list generated by combining the motion information of a plurality of combined motion information candidates registered in the combined motion information candidate list by the processing from step S1803 to step S1804. The second combined motion information candidate generated without depending on the motion information registered in the above is added as necessary, and the combined motion information candidate list generation process is terminated.

The processing different from the first embodiment in the combined motion information candidate list generation processing in the fifth embodiment is a storage process of num_list_before_combined_merge, and the combined motion information in which the motion information of a candidate block group defined by adjacent blocks is registered. The boundary list number of the combined motion information in which the motion information combination of the candidate block group and the motion information not depending on the motion information of the candidate block is registered is stored.

Subsequently, the combined motion information candidate single prediction conversion process in Embodiment 5 will be described with reference to the flowchart of FIG. In the fifth embodiment, the processing shown in FIG. 54 is performed in step S1703 in the flowchart of FIG. 17 for the encoding process and in step S3903 in the flowchart of FIG. 39 for the decoding process. With respect to the same steps as those in the flowchart of FIG. 30 of the first embodiment, the same numbers are assigned and new step numbers are assigned only to different portions.

The combined motion information candidate single prediction conversion process shown in the flowchart of FIG. 52 differs from the flowchart of FIG. 30 in the case where the motion information is not a single prediction (S3002: NO), and the index i of the combined motion information candidate list is When smaller than num_list_before_combined_merge (S5407: YES), the L1 information of the motion information stored in the index i is invalidated in order to convert the motion information of bi-prediction into single prediction (S3003).

On the other hand, when the index i is greater than or equal to num_list_before_combined_merge (S5407: NO), the L0 information of the motion information stored in the index i is invalidated in order to convert the motion information of bi-prediction into single prediction (S5408). .

In step S3003 and step S5408, the motion information of index i converted to single prediction is stored in the combined motion information candidate list (S3004), and the process proceeds to the next index (S3005).

In combined motion information candidate single prediction conversion in Embodiment 5, candidate motion information in the combined motion information candidate list includes motion information calculated from motion information of adjacent candidate blocks and a plurality of registered motion information. For the motion information generated without depending on the motion information of the candidate block or the motion information of the candidate block, the motion information to be invalidated at the time of single prediction conversion is switched according to the prediction type (L0 prediction / L1 prediction). Thereby, especially with respect to the motion information added by the first combined motion information candidate list adding unit, the motion information of the prediction type disabled during the single prediction conversion is left, and the prediction type enabled during the single prediction conversion Motion information can be invalidated, and it is possible to leave a lot of valid motion information as combined motion information, thereby improving the coding efficiency.

In addition, in the prediction block size in which bi-prediction is limited, L0 prediction and L1 prediction are used as candidates without being biased. Therefore, even in motion information stored as motion information used at the time of encoding / decoding, L0 prediction and L1 Forecast bias is reduced. Therefore, the accuracy of the bi-prediction motion information that can be generated by the first combined motion information candidate list adding unit at the time of generating the combined motion information candidate of the subsequent prediction block can be improved, and the encoding efficiency can be improved.

In the fifth embodiment, the L1 information is invalidated when the index i is smaller than num_list_before_combined_merge, and the L0 information is invalidated when the index i is equal to or larger than num_list_before_combined_merge. A feature of this embodiment is that the L0 information is invalidated when the index i is smaller than num_list_before_combined_merge, and the L1 information is invalidated when the index i is greater than or equal to num_list_before_combined_merge.

(Embodiment 6)
Next, the video encoding device and video decoding device according to Embodiment 6 of the present invention will be described. The feature of the sixth embodiment is that it has the same configuration as that of the fifth embodiment, and switches the prediction type (L0 prediction / L1 prediction) to be invalidated in the combined motion information candidate single prediction conversion. It takes a configuration that switches based on the position.

In the sixth embodiment, the same configuration and processing as in the fifth embodiment are performed, but the combined motion information candidate list generation processing shown in the flowchart of FIG. 53 in the fifth embodiment is not performed, and the first embodiment is performed. The combined motion information candidate list generation process shown in the flowchart of FIG. 18 is performed.

In the sixth embodiment, the combined motion information candidate single prediction conversion process shown in the flowchart of FIG. 54 in the fifth embodiment is replaced with the process shown in the flowchart of FIG. In the sixth embodiment, the process shown in FIG. 55 is performed in step S1703 in the flowchart of FIG. 17 for the encoding process and in step S3903 in the flowchart of FIG. 39 for the decoding process.

Hereinafter, the flowchart of FIG. 55 will be described. Regarding the same steps as those in the flowchart of FIG. 54, the same numbers are assigned and new step numbers are assigned only to different portions.

The combined motion information candidate single prediction conversion process shown in the flowchart of FIG. 55 is different from the flowchart of FIG. 54 in the case where the motion information is not a single prediction (S3002: NO), and the index i of the combined motion information candidate list is If it is smaller than 2 (S5507: YES), the L1 information of the motion information stored in the index i is invalidated in order to convert the motion information of bi-prediction into single prediction (S3003).

On the other hand, when the index i is 2 or more (S5507: NO), the L0 information of the motion information stored in the index i is invalidated in order to convert the motion information of bi-prediction into single prediction (S5408). .

In combined motion information candidate single prediction conversion in Embodiment 6, the candidate motion information in the combined motion information candidate list is added to the first combined motion information candidate list by motion information calculated from the motion information of adjacent candidate blocks. Two motion information, which are the minimum motion information necessary for generating additional motion information for bi-prediction, and a first combined motion information candidate list adding unit and a second combined motion registered in the latter half of the list For the motion information added by the information candidate list adding unit, a configuration is adopted in which the prediction type (L0 prediction / L1 prediction) is switched for motion information that is fixedly invalidated at the time of single prediction conversion at the index position.

In the combined motion information candidate uni-predictive conversion in the sixth embodiment, combined motion information in which the motion information of the candidate block group defined by the adjacent block is registered and the motion information of the candidate block group are compared to the fifth embodiment. Since the processing for saving the boundary list number of the combined motion information in which the motion information that does not depend on the motion information of the candidate block and the candidate block is registered can be eliminated, the processing load can be reduced and the same as in the fifth embodiment In addition, for the motion information added by the first combined motion information candidate list adding unit, the motion information of the prediction type enabled at the time of the single prediction conversion is left, leaving the motion information of the prediction type disabled at the time of the single prediction conversion. It is possible to invalidate the information, and it is possible to leave a lot of effective motion information as the combined motion information, thereby improving the encoding efficiency.

Further, in Embodiment 6, since the prediction type that is invalidated not only for the first combined motion information candidate and the second combined motion information candidate but also for the spatial prediction candidate and the temporal prediction candidate can be switched, the prediction type However, when the same motion information is registered in bi-prediction, since the motion information of L0 uni-prediction and L1 uni-prediction can be used as combined motion information, the encoding efficiency can be improved.

In the combined motion information candidate single prediction conversion in Embodiment 6, the position of the index for switching the prediction type (L0 prediction / L1 prediction) to be invalidated is fixed to 2, but the prediction type is switched with a fixed index. Are the features in Embodiment 6, and the number of pieces of motion information that can be registered as a spatially combined motion information candidate, a temporally combined motion information candidate, a first combined motion information candidate, and a second combined motion information candidate, and can be registered at the maximum It is also possible to set the index value of the switching position to be fixed according to the number of combined motion information candidates.

The moving image encoded stream output from the moving image encoding apparatus of the embodiment described above has a specific data format so that it can be decoded according to the encoding method used in the embodiment. Therefore, the moving picture decoding apparatus corresponding to the moving picture encoding apparatus can decode the encoded stream of this specific data format.

When a wired or wireless network is used to exchange an encoded stream between a moving image encoding device and a moving image decoding device, the encoded stream is converted into a data format suitable for the transmission form of the communication path. It may be transmitted. In that case, a video transmission apparatus that converts the encoded stream output from the video encoding apparatus into encoded data in a data format suitable for the transmission form of the communication channel and transmits the encoded data to the network, and receives the encoded data from the network Then, a moving image receiving apparatus that restores the encoded stream and supplies the encoded stream to the moving image decoding apparatus is provided.

The moving image transmitting apparatus is a memory that buffers the encoded stream output from the moving image encoding apparatus, a packet processing unit that packetizes the encoded stream, and transmission that transmits the packetized encoded data via the network. Part. The moving image receiving apparatus generates a coded stream by packetizing the received data, a receiving unit that receives the packetized coded data via a network, a memory that buffers the received coded data, and packet processing. And a packet processing unit provided to the video decoding device.

In addition, the above-described processing related to encoding and decoding can be realized as a transmission, storage, and reception device using hardware, and is stored in a ROM (Read Only Memory), a flash memory, or the like. It can also be realized by firmware or software such as a computer. The firmware program and software program can be recorded on a computer-readable recording medium, provided from a server through a wired or wireless network, or provided as a data broadcast of terrestrial or satellite digital broadcasting Is also possible.

The present invention has been described based on the embodiments. The embodiments are exemplifications, and it will be understood by those skilled in the art that various modifications can be made to combinations of the respective constituent elements and processing processes, and such modifications are within the scope of the present invention. .

100 input terminal, 101 input image memory, 102 encoding block acquisition unit, 103 subtraction unit, 104 orthogonal transform / quantization unit, 105 prediction error encoding unit, 106 dequantization / inverse transform unit, 107 addition unit, 108 frame Inner decoded image buffer, 109 loop filter unit, 110 decoded image memory, 111 motion vector detection unit, 112 motion compensation prediction unit, 113 motion compensation prediction block structure selection unit, 114 intra prediction unit, 115 intra prediction block structure selection unit, 116 Prediction mode selection unit, 117 encoding block structure selection unit, 118 block structure / prediction mode information additional information encoding unit, 119 prediction mode information memory, 120 multiplexing unit, 121 output terminal, 122 code Block control parameter generation unit, 1100 input terminal, 1101 demultiplexing unit, 1102 prediction difference information decoding unit, 1103 inverse quantization / inverse transformation unit, 1104 addition unit, 1105 intra-frame decoded image buffer, 1106 loop filter unit, 1107 decoded image Memory, 1108 prediction mode / block structure decoding unit, 1109 prediction mode / block structure selection unit, 1110 intra prediction information decoding unit, 1111 motion information decoding unit, 1112 prediction mode information memory, 1113 intra prediction unit, 1114 motion compensation prediction unit, 1115 output terminal, 1500 motion compensation prediction generation unit, 1501 prediction error calculation unit, 1502 prediction vector calculation unit, 1503 difference vector calculation unit, 1504 motion information Code amount calculation unit, 1505 prediction mode / block structure evaluation unit, 1506 combined motion information calculation unit, 1507 combined motion information single prediction conversion unit, 1508 combined motion compensation prediction generation unit, 1600 spatial combined motion information candidate list generation unit, 1601 combination Motion information candidate list deletion unit, 1602 Time combined motion information candidate list generation unit, 1603 First combined motion information candidate list addition unit, 1604 Second combined motion information candidate list addition unit, 3600 motion information bitstream decoding unit, 3601 prediction vector Calculation unit, 3602 vector addition unit, 3603 motion compensation prediction decoding unit, 3604 combined motion information calculation unit, 3605 combined motion information single prediction conversion unit, 3606 combined motion compensation prediction decoding unit.

The present invention can be used for encoding and decoding techniques of moving image signals.

Claims

A video encoding device that identifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and generates an encoded stream in units of the identified prediction block,
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction unit for registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding unit that encodes index information that specifies motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion unit for converting the motion information candidates;
A motion-compensated prediction unit that performs motion-compensated prediction by either uni-prediction or bi-prediction based on the motion information candidates and generates a prediction signal of the prediction block to be encoded,
The motion information conversion unit performs prediction conversion to convert prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion compensated prediction unit is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A moving picture coding apparatus that performs the motion compensation prediction based on information.
The prediction type information includes first information indicating whether or not to use a first reference picture, and second information indicating whether or not to use a second reference picture,
The motion information conversion unit indicates the first type of prediction type information indicating that the first reference picture is used and the second information is that the second reference picture is used. The video encoding apparatus according to claim 1, wherein the prediction conversion is performed by invalidating the use of the reference picture or invalidating the use of the second reference picture.
The candidate list construction unit
Combining at least the motion information candidates derived from the block spatially adjacent to the prediction block to be encoded according to the prediction type information, deriving first new motion information, and deriving the first new motion information A first list adding unit that adds the first new motion information that has been added to the motion information candidate list;
Deriving at least second new motion information that does not depend on any of the derived motion information candidates from a block that is spatially adjacent to the prediction block to be encoded, and the second A second list adding unit for adding the new motion information to the motion information candidate list,
3. The moving picture encoding apparatus according to claim 1, wherein at least the first list adding unit adds the first new motion information to construct the motion information candidate list. 4.
4. The moving picture encoding apparatus according to claim 1, wherein the motion information conversion unit performs the prediction conversion after the candidate list construction unit constructs the motion information candidate list.
The candidate list construction unit is spatially close to the candidate block of the second size for the prediction block to be encoded that exists in the divided block having a predetermined second size. 5. The motion information derived from at least one of the divided blocks and the divided blocks that are temporally adjacent to each other is used as the motion information candidate. Video encoding device.
The motion compensated prediction unit further prohibits the motion compensated prediction when the block size of the prediction block to be encoded is a third size less than the first size. 6. The moving image encoding device according to any one of 5).
A moving image encoding apparatus that encodes the moving image using motion compensated prediction in units of blocks obtained by dividing each picture of the moving image,
A motion compensation prediction unit that generates a prediction signal of the prediction block to be encoded by motion compensation using the derived motion information;
A first control parameter that specifies whether or not motion compensated prediction is permitted in a prediction block size of a specified first size, and bi-prediction motion compensation in a prediction block size that is less than or equal to a specified second size An encoded block control parameter generation unit that generates a second control parameter that specifies the second size,
An encoding unit that encodes information used for motion compensated prediction, including the first and second control parameters;
The moving image encoding apparatus, wherein the motion compensation prediction unit performs motion compensation prediction based on the first and second control parameters.
A moving picture encoding method for specifying a prediction block from a block in which a picture is divided into a plurality of blocks and generating an encoded stream in the specified prediction block unit,
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding step for encoding index information for specifying motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion step for converting the motion information candidates;
A motion-compensated prediction step of performing motion-compensated prediction based on either the single prediction or the bi-prediction based on the motion information candidate and generating a prediction signal of the prediction block to be encoded,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion-compensated prediction step is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A moving picture coding method, wherein the motion compensation prediction is performed based on information.
A moving picture encoding program for specifying a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and generating an encoded stream in the specified prediction block unit,
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding step for encoding index information for specifying motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion step for converting the motion information candidates;
Based on the motion information candidates, a computer performs a motion compensation prediction step of performing motion compensation prediction by either uni-prediction or bi-prediction and generating a prediction signal of the prediction block to be encoded,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion-compensated prediction step is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A moving picture coding program which performs the motion compensation prediction based on information.
A predicted block is identified from a block obtained by dividing a picture into a plurality of blocks in stages, and the coded stream encoded by a moving picture coding method for generating a coded stream in the identified predicted block unit A packet processing unit that packetizes to obtain encoded data;
A transmission unit for transmitting the packetized encoded data,
The moving image encoding method includes:
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding step for encoding index information for specifying motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion step for converting the motion information candidates;
A motion-compensated prediction step of performing motion-compensated prediction based on either the single prediction or the bi-prediction based on the motion information candidate and generating a prediction signal of the prediction block to be encoded,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion-compensated prediction step is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A transmission apparatus that performs the motion compensation prediction based on information.
A predicted block is identified from a block obtained by dividing a picture into a plurality of blocks in stages, and the coded stream encoded by a moving picture coding method for generating a coded stream in the identified predicted block unit A packet processing step of packetizing to obtain encoded data;
Transmitting the packetized encoded data, and
The moving image encoding method includes:
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding step for encoding index information for specifying motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion step for converting the motion information candidates;
A motion-compensated prediction step of performing motion-compensated prediction based on either the single prediction or the bi-prediction based on the motion information candidate and generating a prediction signal of the prediction block to be encoded,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion-compensated prediction step is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A transmission method characterized in that the motion compensation prediction is performed based on information.
A predicted block is identified from a block obtained by dividing a picture into a plurality of blocks in stages, and the coded stream encoded by a moving picture coding method for generating a coded stream in the identified predicted block unit A packet processing step of packetizing to obtain encoded data;
Transmitting the packetized encoded data to the computer, and
The moving image encoding method includes:
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be encoded and a block temporally adjacent to the prediction block, and the motion information is derived as a motion information candidate of the prediction block to be encoded. A candidate list construction step of registering predetermined motion information from the motion information and constructing a motion information candidate list;
An encoding step for encoding index information for specifying motion information candidates in the motion information candidate list used for the prediction block to be encoded;
A motion information conversion step for converting the motion information candidates;
A motion-compensated prediction step of performing motion-compensated prediction based on either the single prediction or the bi-prediction based on the motion information candidate and generating a prediction signal of the prediction block to be encoded,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion-compensated prediction step is a motion converted by the prediction conversion when the block size of the prediction block to be encoded is a predetermined first size and the prediction type information indicates the bi-prediction. A transmission program that performs the motion compensation prediction based on information.
A video decoding device that identifies a prediction block from a block in which a picture is divided into a plurality of blocks and decodes an encoded stream in units of the identified prediction block,
A decoding unit that decodes index information that specifies motion information of the prediction block to be decoded from the encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction unit for registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion unit for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded With
The motion information conversion unit performs prediction conversion to convert prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion compensation prediction unit is configured when the block size of the prediction block to be decoded is a predetermined first size and the prediction type information of the designated motion information indicates the bi-prediction. A moving picture decoding apparatus that performs the motion compensation prediction based on motion information converted by conversion.
The prediction type information includes first information indicating whether or not to use a first reference picture, and second information indicating whether or not to use a second reference picture,
The motion information conversion unit indicates the first type of prediction type information indicating that the first reference picture is used and the second information is that the second reference picture is used. 14. The moving picture decoding apparatus according to claim 13, wherein the predictive conversion is performed by invalidating the use of the reference picture or invalidating the use of the second reference picture.
The candidate list construction unit
At least the motion information candidates derived from the spatially adjacent blocks to the decoding target block are combined according to the prediction type information to derive first new motion information, which is derived A first list adding unit for adding the first new motion information to the motion information candidate list;
Deriving at least second new motion information that does not depend on any of the derived motion information candidates from a block that is spatially close to the prediction block to be decoded; A second list adding unit for adding new motion information to the motion information candidate list,
The moving picture decoding apparatus according to claim 13 or 14, wherein at least the first list adding unit adds the first new motion information to construct the motion information candidate list.
16. The moving picture decoding apparatus according to claim 13, wherein the motion information conversion unit performs the prediction conversion after the candidate list construction unit constructs the motion information candidate list.
The candidate list construction unit is spatially close to the candidate block of the second size for the prediction block to be decoded that exists in the divided block of a predetermined second size. The moving image according to claim 13, wherein motion information derived from at least one of the divided blocks and the divided blocks that are temporally adjacent to each other is set as the motion information candidate. Image decoding device.
The motion compensated prediction unit further prohibits the motion compensated prediction when a block size of the prediction block to be decoded is a third size less than the first size. Any one of the moving image decoding apparatuses.
A moving picture decoding apparatus for decoding a coded stream obtained by coding the moving picture using motion compensated prediction in units of blocks obtained by dividing each picture of the moving picture,
Whether to decode motion-compensated prediction from the encoded stream and whether to permit motion-compensated prediction in the designated prediction block size of the first size from the decoded information used for motion-compensated prediction. A decoding unit that obtains a first control parameter to be designated, and a second control parameter to designate the second size, which prohibits bi-prediction motion compensation in a prediction block size equal to or smaller than the designated second size; ,
A motion compensation prediction unit that generates a prediction signal of a decoding target prediction block using information used for the motion compensation prediction,
The video decoding apparatus, wherein the motion compensation prediction unit performs motion compensation prediction based on the first and second control parameters.
A video decoding method for identifying a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and decoding an encoded stream in units of the identified prediction block,
A decoding step of decoding index information specifying motion information of the prediction block to be decoded from the encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction step of registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion step for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded With steps,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion compensation prediction step is performed when the block size of the prediction block to be decoded is a predetermined first size and the prediction type information of the designated motion information indicates the bi-prediction. A moving picture decoding method, wherein the motion compensation prediction is performed based on motion information converted by conversion.
A moving picture decoding program that identifies a prediction block from a block in which a picture is divided into a plurality of blocks and decodes an encoded stream in units of the identified prediction block,
A decoding step of decoding index information specifying motion information of the prediction block to be decoded from the encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction step of registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion step for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded Step to the computer,
The motion information conversion step performs prediction conversion for converting prediction type information indicating the bi-prediction into prediction type information indicating the single prediction among the motion information candidates,
The motion compensation prediction step is performed when the block size of the prediction block to be decoded is a predetermined first size and the prediction type information of the designated motion information indicates the bi-prediction. A moving picture decoding program that performs the motion compensation prediction based on motion information converted by conversion.
A receiving device that identifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and receives and decodes an encoded stream in which a moving image is encoded in the specified prediction block unit. ,
A receiving unit that receives encoded data in which the encoded stream is packetized;
A restoration unit that packet-processes the received encoded stream to restore the original encoded stream;
A decoding unit that decodes index information that specifies motion information of the prediction block to be decoded from the restored encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction unit for registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion unit for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded With
The motion information conversion unit converts prediction type information indicating that the motion compensation prediction is performed by the bi-prediction among the motion information candidates into prediction type information indicating that the motion compensation prediction is performed by the single prediction. Perform predictive transformations,
The motion compensation prediction unit is a case where the block size of the prediction block to be decoded is a predetermined first size, and the prediction type information of the designated motion information is the motion compensated prediction by the bi-prediction. A receiving apparatus that performs the motion-compensated prediction based on the prediction type information converted by the prediction conversion when indicating to perform.
A reception method for specifying a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and receiving and decoding an encoded stream in which a moving image is encoded in the specified prediction block unit. ,
A reception step of receiving encoded data in which the encoded stream is packetized;
A restoration step of packetizing the received encoded stream to restore the original encoded stream;
A decoding step of decoding index information specifying motion information of the prediction block to be decoded from the restored encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction step of registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion step for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded With steps,
In the motion information conversion step, prediction type information indicating that the motion compensation prediction is performed by the bi-prediction among the motion information candidates is converted into prediction type information indicating that the motion compensation prediction is performed by the single prediction. Perform predictive transformations,
The motion compensation prediction step is a case where the block size of the prediction block to be decoded is a predetermined first size, and the prediction type information of the designated motion information is the motion compensated prediction by the bi-prediction. A reception method, wherein the motion compensation prediction is performed based on the prediction type information converted by the prediction conversion.
A reception program that identifies a prediction block from a block in which a picture is divided into a plurality of blocks in stages, and receives and decodes an encoded stream in which a moving image is encoded in the specified prediction block unit. ,
A reception step of receiving encoded data in which the encoded stream is packetized;
A restoration step of packetizing the received encoded stream to restore the original encoded stream;
A decoding step of decoding index information specifying motion information of the prediction block to be decoded from the restored encoded stream;
Motion information is derived from at least one of a block spatially adjacent to the prediction block to be decoded and a block temporally adjacent to the prediction block, and the derived motion is used as a motion information candidate of the prediction block to be decoded. A candidate list construction step of registering predetermined motion information from the information and constructing a motion information candidate list;
A motion information conversion step for converting the motion information candidates;
Motion compensated prediction based on the motion information specified by the index information of the motion information candidates, and performing motion compensation prediction by either uni-prediction or bi-prediction to generate a prediction signal of the prediction block to be decoded Step to the computer,
In the motion information conversion step, prediction type information indicating that the motion compensation prediction is performed by the bi-prediction among the motion information candidates is converted into prediction type information indicating that the motion compensation prediction is performed by the single prediction. Perform predictive transformations,
The motion compensation prediction step is a case where the block size of the prediction block to be decoded is a predetermined first size, and the prediction type information of the designated motion information is the motion compensated prediction by the bi-prediction. A reception program that performs the motion-compensated prediction based on the prediction type information converted by the prediction conversion when indicating to perform.