WO2012147947A1

WO2012147947A1 - Image decoding apparatus and image encoding apparatus

Info

Publication number: WO2012147947A1
Application number: PCT/JP2012/061450
Authority: WO
Inventors: 山本　智幸; 知宏猪飼; 将伸八杉
Original assignee: シャープ株式会社
Priority date: 2011-04-28
Filing date: 2012-04-27
Publication date: 2012-11-01

Abstract

A motion video decoding device (1) decodes, from encoded data (#1), prediction parameters related to each of the prediction units. In order to predict, among prediction parameters related to each of the prediction units included in an encoding unit, prediction parameters related to each of the prediction units included in an encoding unit adjacent to the aforementioned encoding unit, the number of reference prediction parameters that can be referred to by a prediction mode decoding unit (22) is set to be less than the number of prediction units adjacent to the above-mentioned adjacent encoding unit, among the prediction units included in the encoding unit.

Description

Image decoding apparatus and image encoding apparatus

The present invention relates to an image decoding device that decodes encoded data, and an image encoding device that generates encoded data.

In order to efficiently transmit or record a moving image, a moving image encoding device (image encoding device) that generates encoded data by encoding the moving image, and decoding the encoded data A video decoding device (image decoding device) that generates a decoded image is used. As a specific moving picture encoding method, for example, H.264 is used. H.264 / MPEG-4. A method adopted in KTA software which is a codec for joint development in AVC (Non-patent Document 1) and VCEG (Video Coding Expert Group), a method adopted in TMuC (Test Model Under Consulation) software, and its successor codec And the method employed in WorkingWorkDraft 1 of High-Efficiency Video Coding (Non-Patent Document 2, hereinafter also referred to as HEVC WD1).

In such a coding system, an image (picture) constituting a moving image includes a slice obtained by dividing the image, a coding unit obtained by dividing the slice (a macroblock or a coding unit (CU: Coding)). It is also managed by a hierarchical structure composed of blocks and partitions obtained by dividing an encoding unit, and is normally encoded block by block.

In such an encoding method, a predicted image is usually generated based on a locally decoded image obtained by encoding / decoding an input image, and the predicted image is subtracted from the input image (original image). The prediction residual (which may be referred to as “difference image” or “residual image”) is encoded. In addition, examples of the method for generating a predicted image include inter-screen prediction (inter prediction) and intra-screen prediction (intra prediction).

In inter prediction, a predicted image in a frame being decoded is generated for each prediction unit by applying motion compensation using a motion vector with the decoded frame as a reference frame.

On the other hand, in intra prediction, a predicted image in a frame being decoded is generated for each prediction unit based on a decoded area of the frame being decoded. H. H.264 / MPEG-4. As an example of intra prediction used in AVC, for each prediction unit (for example, partition), (1) one of the prediction modes is selected from a predetermined prediction mode group, and (2) the decoded area is selected. There is a method of generating a pixel value on the prediction unit by extrapolating the pixel value in an extrapolation direction (prediction direction) corresponding to the selected prediction mode (sometimes referred to as “basic prediction”).

In Non-Patent Document 3, regarding the size of the prediction unit for generating the predicted image by intra prediction, in addition to the previous size (32 × 32, 16 × 16, 8 × 8, 4 × 4 pixels), SDIP It is described that a prediction unit having a size of 32 × 8, 16 × 4, 8 × 2, 16 × 1, 8 × 32, 4 × 16, 2 × 8, and 1 × 16 pixels called (Short Distance Intra Prediction) is used. Has been.

In the generation of a prediction image, there is a case where a prediction image is generated by estimating the prediction mode of the target prediction unit from the prediction mode of the prediction image processed earlier. Therefore, the prediction mode used for generating the predicted image needs to be recorded for generating a predicted image later. In particular, since the prediction mode of the target prediction unit may be estimated from the prediction mode of the prediction unit that touches the upper side of the target prediction unit, at least the prediction mode of the prediction unit that touches the upper side of the target prediction unit Should be recorded. That is, the prediction mode of the prediction unit for which the prediction image has been generated needs to be recorded until the prediction image of the prediction unit in contact with the lower side of the prediction unit is generated. In other words, it is necessary to record the prediction mode of the prediction unit for one line of at least one frame.

However, when the size of the prediction unit is reduced, the number of prediction units for one line increases and the number of prediction modes to be recorded increases. Therefore, the memory capacity necessary for recording the prediction mode increases.

Also, in order to improve the prediction accuracy, it is conceivable to increase the number of prediction directions that can be expressed in the prediction mode of intra prediction. However, when the number of prediction directions that can be expressed in the prediction mode is increased, the amount of data for expressing the prediction mode increases, so that the memory capacity required to record the prediction mode increases.

The present invention has been made in view of the above problems, and an object of the present invention is to realize an image decoding apparatus or the like that suppresses an increase in necessary data amount while improving prediction accuracy.

In order to solve the above problems, the video decoding device according to the present invention uses a region obtained by dividing a coding unit as a prediction unit, generates a prediction image for each prediction unit with reference to a prediction parameter, In a video decoding device that generates a decoded image by adding the prediction image to a prediction residual decoded from encoded data, a prediction parameter decoding unit that decodes a prediction parameter for each prediction unit from the encoded data, For at least a part of the prediction units, the prediction parameter related to the prediction unit is decoded when the upper adjacent prediction unit adjacent to the upper side of the prediction unit belongs to the tree block to which the prediction unit belongs. If the upper neighboring prediction unit is estimated from a prediction parameter and does not belong to the tree block to which the prediction unit belongs, the prediction unit Is characterized by a prediction parameter decoding means for estimating a prediction parameter of the decoded and a recording unit which is adjacent to the upper side of the unit.

According to the above configuration, when the upper adjacent prediction unit does not belong to the tree block to which the prediction unit to be predicted belongs, estimation is performed from the decoded prediction parameters related to the recording unit adjacent to the upper side of the prediction unit. Therefore, it is only necessary to hold the prediction parameter in the recording unit, and the amount of memory for holding the prediction parameter can be reduced.

In order to solve the above problems, an image decoding apparatus according to the present invention uses a region obtained by dividing a coding unit as a prediction unit, generates a prediction image for each prediction unit with reference to a prediction parameter, In the image decoding device for generating a decoded image by adding the prediction image to the prediction residual decoded from the encoded data, prediction parameter decoding means for decoding a prediction parameter for each prediction unit from the encoded data, comprising: A prediction parameter decoding unit that estimates a prediction parameter related to the prediction unit from a decoded prediction parameter related to a prediction unit included in an adjacent coding unit adjacent to the coding unit to which the prediction unit belongs, For at least some coding units, the prediction parameter of each prediction unit included in the coding unit That is, the number of reference prediction parameters that can be referred to by the prediction parameter decoding means in order to estimate a prediction parameter for each prediction unit included in an adjacent coding unit adjacent to the coding unit is included in the coding unit. The prediction unit is set to be smaller than the number of prediction units adjacent to the adjacent coding unit.

According to the above configuration, the number of reference prediction parameters referred to when the prediction parameter decoding unit estimates the prediction parameter of the prediction unit is included in the adjacent coding unit adjacent to the coding unit to which the prediction unit belongs. , Less than the number of prediction units adjacent to the coding unit. Therefore, the number of necessary reference parameters can be reduced as compared with the case where the prediction parameters are estimated by referring to the same number of reference parameters as the number of prediction units adjacent to the coding unit.

Thereby, the amount of data necessary for the prediction parameter decoding means to estimate the prediction parameter of the prediction unit can be reduced, and the efficiency of the process of estimating the prediction parameter can be improved.

In addition, since the number of necessary reference prediction parameters is smaller than when estimating prediction parameters by referring to the same number of reference parameters as the number of prediction units adjacent to the encoding unit, It is possible to reduce the memory capacity required for recording the prediction parameter.

In addition, even if the number of prediction units is reduced and the number of prediction units is increased in order to improve the prediction accuracy, the reference prediction parameter does not increase by the same amount as the number of prediction units. The amount of data necessary for estimating the prediction parameter can be reduced.

In order to solve the above problems, an image encoding apparatus according to the present invention generates a prediction image for each prediction unit by referring to a prediction parameter, using a region obtained by dividing the encoding unit as a prediction unit, In an image coding apparatus that encodes a prediction residual obtained by subtracting a predicted image that has been subtracted from an original image and outputs encoded data, the prediction unit belongs to a prediction parameter related to each prediction unit in at least some of the coding units. Estimating from the prediction parameters already decoded for the prediction unit included in the adjacent coding unit adjacent to the coding unit, and the prediction parameter for the prediction unit does not match the estimated prediction parameter obtained by the above estimation Prediction parameter encoding means for encoding only in some cases, and for at least some of the encoding units, Prediction prediction that can be referred to by the prediction parameter encoding means to estimate a prediction parameter related to each prediction unit included in an adjacent coding unit adjacent to the coding unit among prediction parameters related to each prediction unit included in the prediction unit The number of parameters is set to be smaller than the number of prediction units adjacent to the adjacent coding unit among the prediction units included in the coding unit.

According to the above configuration, the number of reference prediction parameters referred to by the prediction parameter encoding means is included in the adjacent encoding unit adjacent to the encoding unit to which the prediction unit whose prediction parameter is estimated by the estimated prediction parameter belongs. Less than the number of prediction units adjacent to the coding unit. Therefore, the number of necessary reference parameters can be reduced as compared to the case where the same number of reference parameters as the number of prediction units adjacent to the coding unit are referred to.

This makes it possible to reduce the amount of required reference prediction parameter data and improve the processing efficiency.

In addition, since the number of reference prediction parameters required is smaller than when referring to the same number of reference parameters as the number of prediction units adjacent to the encoding unit, the reference prediction parameters are recorded. Can reduce the amount of memory required.

In addition, even if the number of prediction units is reduced and the number of prediction units is increased in order to improve the prediction accuracy, the reference prediction parameter does not increase by the same amount as the number of prediction units. The amount of data required to derive the estimated prediction parameter can be reduced.

As described above, the image decoding apparatus according to the present invention is a prediction parameter decoding unit that decodes a prediction parameter related to each prediction unit from encoded data, and the prediction parameter related to the prediction unit is set for at least some of the prediction units. A prediction parameter decoding unit that estimates from a decoded prediction parameter related to a prediction unit included in an adjacent coding unit adjacent to the coding unit to which the prediction unit belongs, and at least a part of the coding unit includes the coding unit Prediction parameters that can be referred to by the prediction parameter decoding means in order to estimate a prediction parameter for each prediction unit included in an adjacent coding unit adjacent to the coding unit among the prediction parameters for each prediction unit included in the prediction unit Of the prediction unit included in the coding unit is the adjacent coding unit. Is a configuration that is set smaller than the number of prediction units adjacent to.

Thereby, it is possible to reduce the amount of data necessary for the prediction parameter decoding means to estimate the prediction parameter of the prediction unit, and to improve the efficiency of the process of estimating the prediction parameter.

In addition, since the number of necessary reference prediction parameters is smaller than when estimating prediction parameters by referring to the same number of reference parameters as the number of prediction units adjacent to the encoding unit, There is an effect that the memory capacity required for recording the prediction parameter can be reduced.

In addition, even if the number of prediction units is reduced and the number of prediction units is increased in order to improve the prediction accuracy, the reference prediction parameter does not increase by the same amount as the number of prediction units. There is an effect that the amount of data necessary to estimate the prediction parameter can be reduced.

In addition, the image coding apparatus according to the present invention estimates a prediction parameter related to each prediction unit from a decoded prediction parameter related to a prediction unit included in an adjacent coding unit adjacent to the coding unit to which the prediction unit belongs. , Comprising prediction parameter encoding means for encoding the prediction parameter relating to the prediction unit only when the prediction parameter does not match the estimated prediction parameter obtained by the estimation, and for at least some of the encoding units, Among the prediction parameters related to each prediction unit included in the coding unit, the prediction parameter encoding means may be referred to in order to estimate the prediction parameter related to each prediction unit included in the adjacent coding unit adjacent to the coding unit. Of the prediction units included in the coding unit, the number of reference prediction parameters is the above Is a configuration that is set smaller than the number of prediction units adjacent to the contact coding units.

Thereby, it is possible to reduce the data amount of the required reference prediction parameter and to improve the processing efficiency.

In addition, since the number of reference prediction parameters required is smaller than when referring to the same number of reference parameters as the number of prediction units adjacent to the encoding unit, the reference prediction parameters are recorded. It is possible to reduce the memory capacity required for the operation.

In addition, even if the number of prediction units is reduced and the number of prediction units is increased in order to improve the prediction accuracy, the reference prediction parameter does not increase by the same amount as the number of prediction units. There is an effect that the amount of data necessary for deriving the estimated prediction parameter can be reduced.

It is a block diagram which shows the principal part structure of the prediction information decoding part of the moving image decoding apparatus which concerns on embodiment of this invention. (A) is a figure which shows the data structure of the encoding data produced | generated by the moving image encoder, and is referred by the said moving image decoder, Comprising: (a) is a figure which shows the structure of the picture layer of encoded data, (b) is a diagram showing the configuration of the slice layer included in the picture layer, (c) is a diagram showing the configuration of the TB layer included in the slice layer, and (d) is a diagram showing the configuration of the CU included in the TB layer. (E)-(g) is a figure which shows the structure of the intra prediction information about CU. It is a figure which shows the structure of the image produced | generated by the moving image encoder, and is referred by the said moving image decoder, (a) is a figure which shows the state by which a slice and TB are divided | segmented from a picture, (b) (C) is a figure which shows the state by which CU is divided | segmented from TB. It is a figure which shows the structure of the image produced | generated by the moving image encoder, and is referred by the said moving image decoder, (a), (b) is a figure which shows the state by which an intra prediction unit is divided | segmented from CU. (C) is a figure which shows the state by which the conversion unit is divided | segmented from CU. It is a figure for demonstrating the relationship between the prediction unit PU and memory in intra prediction, (a) is a figure which shows the case where prediction unit PU is a 4x4 pixel, (b) is a prediction unit PU 2x. It is a figure which shows the case of 8 pixels, (c) is a figure which shows the relationship between an encoding unit and a line memory. It is a figure for demonstrating the case where the size of the prediction unit PU and the recording unit RU which records the prediction mode of this prediction unit PU differs, (a) is a figure which shows the size of the recording unit RU, (b). (A) is a figure which shows the size of prediction unit PU, (c) is a figure which shows the relationship between prediction unit PU and recording unit RU. It is a block diagram which shows the principal part structure of the said moving image decoding apparatus. It is a figure which shows the relationship between prediction unit PU and recording unit RU. It is a figure which shows the example of the prediction mode used with the said moving image decoding apparatus, (a) is a figure which shows the relationship between ID and direction of prediction mode, (b) is the prediction unit PU and recording in a certain CU. It is a figure which shows the relationship with unit RU, (c) is a figure which shows the bit stream in the case of (b). It is a flowchart which shows the flow of a process of the said prediction information decoding part. It is a block diagram which shows the principal part structure of the moving image encoder which concerns on this Embodiment. It is a figure which shows the relationship between the production | generation prediction mode and the recording prediction mode. It is a figure which shows the relationship between the production | generation prediction mode and the recording prediction mode. It is a block diagram which shows the principal part structure of the prediction information decoding part which concerns on another embodiment of this invention. It is a figure which shows the recording unit RU and prediction mode number for every size of prediction unit PU, (a) is a figure which shows the relationship between prediction unit PU and recording unit RU, (b) is a prediction unit PU and for prediction It is a figure which shows the relationship between the precision of prediction mode, and the precision of the reference prediction mode. It is a figure for demonstrating the structure of a bit stream, (a) is a figure which shows the relationship between object CU, prediction unit PU, and recording unit RU, (b) is a case where prediction unit PU is 1 * 16 pixel (C) is a figure which shows the example of a bit stream in case prediction unit PU is 4x16 pixel. It is a figure for demonstrating the process which derives | leads-out the prediction mode for prediction, (a) is a figure which shows the correspondence of the prediction mode for prediction, and the prediction mode for reference, (b) is the content of prediction mode update information FIG. It is a flowchart which shows the flow of a process of a prediction information decoding part. It is a figure for demonstrating that a moving image decoding apparatus and a moving image coding apparatus can be utilized for transmission / reception of a moving image, (a) is the block diagram which showed the structure of the transmission device carrying a moving image coding apparatus (B) is a block diagram showing a configuration of a receiving apparatus equipped with a moving picture decoding apparatus. It is a figure for demonstrating that a moving image decoding apparatus and a moving image encoding apparatus can be utilized for recording and reproduction | regeneration of a moving image, (a) shows the structure of the recording device carrying the moving image encoding apparatus 2. FIG. 8B is a block diagram illustrating a configuration of a playback device equipped with a video decoding device.

Embodiments of an image decoding apparatus and an image encoding apparatus according to the present invention will be described below with reference to the drawings. Note that the image decoding apparatus according to the present embodiment decodes a moving image from encoded data. Therefore, hereinafter, this is referred to as “moving image decoding apparatus”. In addition, the image encoding device according to the present embodiment generates encoded data by encoding a moving image. Therefore, in the following, this is referred to as a “video encoding device”.

In this embodiment, the reduction of the memory capacity for recording the prediction mode in the intra prediction is described, but the present invention is not limited to this. Any parameter can be applied as long as it is a parameter used for predictive image generation that can be recorded not only in the prediction mode of intra prediction but also in units of less than the coding unit CU. For example, the present invention can be applied to estimated intra prediction mode selection information, residual information of intra prediction modes, motion vectors, motion vector residuals, estimated motion vector selection information, reference image selection information, reference image list selection information, and the like.

[Embodiment 1]
Embodiment 1 according to the present invention will be described with reference to FIGS. First, prior to description of the video decoding device (image decoding device) 1 according to the present embodiment, the video decoding device 1 is generated by the video encoding device (image coding device) 2 according to the present embodiment. The configuration of the encoded data # 1 decoded by the above will be described.

(Configuration of encoded data # 1)
The configuration of encoded data # 1 will be described with reference to FIGS. The encoded data # 1 includes a sequence and a plurality of pictures constituting the sequence.

FIG. 2 shows the hierarchical structure below the picture layer in the encoded data # 1. FIG. 2A is a diagram illustrating a structure of a picture layer that defines a picture PICT. FIG. 2B is a diagram showing the structure of the slice layer that defines the slice S. FIG. 2C is a diagram illustrating a structure of a tree block layer that defines a tree block TB. FIG. 2D is a diagram illustrating the structure of a CU layer that defines a coding unit (CU: Coding Unit) included in the tree block TB.

Also, (e) to (g) in FIG. 2 are information about a prediction tree (PT: prediction tree), and an example of the structure of intra prediction information PTI_Intra, which is prediction information PTI about an intra prediction (intra-screen prediction) partition. FIG.

3 and 4 are diagrams showing a state in which the slice S, the tree block TB, the prediction unit PU, and the transform unit TU are divided from the picture PICT.

(Picture layer)
In the picture layer, a set of data referred to by the video decoding device 1 for decoding a picture PICT to be processed (hereinafter also referred to as a target picture) is defined. As shown in FIG. 2A, the picture PICT includes a picture header PH and slices S ₁ to S _NS (NS is the total number of slices included in the picture PICT).

In the following description, when it is not necessary to distinguish each of the slices S ₁ to S _NS , the reference numerals may be omitted. The same applies to other data with subscripts included in encoded data # 1 described below.

The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture. For example, the encoding mode information (entropy_coding_mode_flag) indicating the variable length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH.

When entropy_coding_mode_flag is 0, the picture PICT is encoded by CAVLC (Context-based Adaptive Variable Variable Length Coding). When entropy_coding_mode_flag is 1, the picture PICT is encoded by CABAC (Context-based Adaptive Binary Arithmetic Coding).

Note that the picture header PH is also referred to as a picture parameter set (PPS).

(Slice layer)
In the slice layer, a set of data referred to by the video decoding device 1 for decoding the slice S to be processed (also referred to as a target slice) is defined. As shown in FIG. 2B, the slice S includes a slice header SH and a sequence of tree blocks TB ₁ to TB _NC (NC is the total number of tree blocks included in the slice S).

The slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.

As slice types that can be specified by the slice type specification information, (1) I slice using only intra prediction at the time of encoding, (2) P slice using unidirectional prediction or intra prediction at the time of encoding, (3) B-slice using unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding may be used. Note that the slice header SH may include a filter parameter referred to by a loop filter (not shown) included in the video decoding device 1.

Further, as shown in FIG. 3A, the slice S is formed by dividing the picture PICT. In FIG. 3A, the picture PICT301 is divided to form a slice S302.

(Tree block layer)
In the tree block layer, a set of data referred to by the video decoding device 1 for decoding a processing target tree block TB (hereinafter also referred to as a target tree block) is defined.

The tree block TB includes a tree block header TBH and coding unit information CU ₁ to CU _NL (NL is the total number of coding unit information included in the tree block TB). First, the relationship between the tree block TB and the coding unit information CU will be described as follows.

The tree block TB is divided into units for specifying a block size for each process of intra prediction or inter prediction and conversion.

The above unit of the tree block TB is divided by recursive quadtree division. The tree structure obtained by this recursive quadtree partitioning is hereinafter referred to as a coding tree.

Hereinafter, a unit corresponding to a leaf that is a node at the end of the coding tree is referred to as a coding node. In addition, since the encoding node is a basic unit of the encoding process, hereinafter, the encoding node is also referred to as an encoding unit (CU).

That is, coding unit information (hereinafter referred to as CU information) CU ₁ to CU _NL is information corresponding to each coding node (coding unit) obtained by recursively dividing the tree block TB into quadtrees. is there.

Also, the root of the coding tree is associated with the tree block TB. In other words, the tree block TB is associated with the highest node of the tree structure of the quadtree partition that recursively includes a plurality of encoding nodes. With this definition, the tree block TB may be referred to as LCU (largest coding unit). In addition, the tree block TB may be called a coding tree block (CTB).

Note that the size of each coding node is half the size of the coding node to which the coding node directly belongs (that is, the unit of the node one layer higher than the coding node).

Also, the size that each coding node can take depends on the size designation information of the coding node and the maximum hierarchy depth (maximum hierarchical depth) included in the sequence parameter set SPS of the coded data # 1. For example, when the size of the tree block TB is 64 × 64 pixels and the maximum hierarchy depth is 3, the encoding nodes in the hierarchy below the tree block TB have four sizes, that is, 64 × 64. Any of pixel, 32 × 32 pixel, 16 × 16 pixel, and 8 × 8 pixel can be taken.

As a block structure, as shown in FIG. 3A, the slice S is divided to form a tree block TB303. As shown in FIG. 3B, the tree block TB303 is divided to form a CU 311.

FIG. 3C shows a state where the tree block TB303 is divided into quadtrees when the maximum hierarchical depth is “2”. As shown in FIG. 3C, when the maximum layer depth is “2”, the value of the CU split flag (split_coding_unit_flag) is “1” in layer 0, and layer 1 is also “1”, CU 311b Becomes an encoding node. When the maximum layer depth is “1” and the value of the CU partition flag is “1” in layer 0, the CU 311a is an encoding node.

(Tree block header)
The tree block header TBH includes a coding parameter referred to by the video decoding device 1 to determine a decoding method of the target tree block. Specifically, as shown in FIG. 2C, tree block division information SP_TB for designating a division pattern of the target tree block into each CU, and a quantization parameter difference for designating the size of the quantization step Δqp (qp_delta) is included.

The tree block division information SP_TB is information representing a coding tree for dividing the tree block. Specifically, the shape and size of each CU included in the target tree block, and the position in the target tree block Is information to specify.

Note that the tree block division information SP_TB may not explicitly include the shape or size of the CU. For example, the tree block division information SP_TB may be a set of flags (split_coding_unit_flag) indicating whether or not the entire target tree block or a partial area of the tree block is divided into four. In this case, the shape and size of each CU can be specified by using the shape and size of the tree block together.

Further, the quantization parameter difference Δqp is a difference qp−qp ′ between the quantization parameter qp in the target tree block and the quantization parameter qp ′ in the tree block encoded immediately before the target tree block.

(CU layer)
In the CU layer, a set of data referred to by the video decoding device 1 for decoding a CU to be processed (hereinafter also referred to as a target CU) is defined.

Here, before explaining the specific contents of the data included in the CU information CU, the tree structure of the data included in the CU will be described. The encoding node is the root of the prediction tree PT and the transformation tree TT. The prediction tree and the conversion tree are described as follows.

In the prediction tree, the encoding node is divided into one or a plurality of prediction blocks, and the position and size of each prediction block are defined. In other words, the prediction block is one or a plurality of non-overlapping areas constituting the encoding node. The prediction tree includes one or a plurality of prediction blocks obtained by the above division.

Prediction processing is performed for each prediction block. Hereinafter, a prediction block that is a unit of prediction is also referred to as a prediction unit (PU).

Also, in the transform tree, the encoding node is divided into one or a plurality of transform blocks, and the position and size of each transform block are defined. In other words, the transform block is one or a plurality of non-overlapping areas constituting the encoding node. The conversion tree includes one or a plurality of conversion blocks obtained by the above division.

Conversion processing is performed for each conversion block. Hereinafter, a transform block that is a unit of transform is also referred to as a transform unit (TU).

(Data structure of CU information CU)
Next, specific contents of data included in the CU information CU will be described with reference to FIG. As shown in FIG. 2 (d), the CU information CU includes a skip flag SKIP, PU partition information SP_PU that specifies a partition pattern for each prediction unit of the target CU, prediction type information PType, PT information PTI, and TT. Contains information TTI.

The skip flag SKIP is a flag indicating whether or not the skip mode is applied to the target CU. When the value of the skip flag SKIP is 1, that is, when the skip mode is applied to the target CU, skip is performed. Various types of information to be subjected to are omitted, and a default value or an estimated value is used when decoding. The skip flag SKIP is omitted for the I slice.

The PU partition information SP_PU is information for determining the shape and size of each PU included in the target CU and the position in the target CU. For example, the PU partition information SP_PU is realized from at least one of an intra partition flag (intra_split_flag) that specifies intra partition from the target CU and an inter partition flag (inter_partitining_idc) that specifies inter partition from the target CU. Can do.

The intra division flag is information that specifies the shape, size, and position in the target CU of each intra PU included in the target CU (PU in which intra prediction is used).

The inter division flag is information for designating the shape and size of each inter PU included in the target CU (PU in which inter prediction is used), and the position in the target CU.

Prediction type information PType is information that specifies whether intra prediction or inter prediction is used as a prediction image generation method for the target PU.

PT information PTI is information related to the PT included in the target CU. In other words, the PT information PTI is a set of information related to each of one or more PUs included in the PT, and is referred to when the moving image decoding apparatus 1 generates a predicted image. The PT information PTI includes inter prediction information (PTI_Inter) or intra prediction information (PTI_Intra) depending on which prediction method is specified by the prediction type information PType. Hereinafter, a PU to which intra prediction is applied is also referred to as an intra PU, and a PU to which inter prediction is applied is also referred to as an inter PU.

TT information TTI is information related to TT included in the target CU. In other words, the TT information TTI is a set of information regarding each of one or a plurality of TUs included in the TT, and is referred to when the moving image decoding apparatus 1 decodes residual data.

(Intra prediction information PTI_Intra)
The intra prediction information PTI_Intra includes a coding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction. (E) to (g) in FIG. 2 show coding parameters included in the intra prediction information PTI_Intra. FIG. 2 (e) shows an example of encoding parameters (P _P1 to P _PNP ) when the prediction unit and the recording unit (described later) are different, and NP is the total number of intra PUs included in the target CU. It is.

Further, (f) of FIG. 2 shows coding parameters (P _r1 , ΔP _P1 , and the like when the prediction unit and the recording unit are different and the accuracy of the prediction mode in the prediction unit is different from the accuracy of the prediction mode in the recording unit. ..., P _rQ ,..., ΔP _PX ), where Q is the total number of recording units included in the target CU, and PX is the total number of intra PUs included in the target CU.

FIG. _2G shows an example of encoding parameters (P _r1 to P _rQ ) when the accuracy of the prediction mode in the prediction unit is different from the accuracy of the prediction mode in the recording unit.

One of the intra-PU division methods is generated by dividing the target CU into four PUs of the same size if the intra-partition flag is 1, and if the intra-partition flag is 0, the target CU is divided. Without this, the target CU itself is handled as a PU. In this case, assuming that the size of the target CU is 2N × 2N pixels, the intra PU can take any size of 2N × 2N pixels (no division) and N × N pixels (four divisions) (here, N = 2 ⁿ , n is an arbitrary integer of 1 or more). For example, if the target CU is 128 × 128 pixels, it can be divided into 128 × 128 pixel and 64 × 64 pixel intra PUs.

Specifically, this will be described with reference to FIG. As shown in FIG. 4A, when the CU 311 is divided into 2N × 2N pixels, it becomes PU 411a, and when it is divided into N × N pixels, it becomes PU 412b to PU 412e.

Also, the intra PU is not necessarily divided into squares. This will be described with reference to FIG. The example shown in FIG. 4B shows a state in which a 32 × 32 pixel CU 311 is divided into a plurality of intra PUs. In FIG. 4B, the CU 311 includes an 1 × 4 pixel intra PU 412a, an 8 × 8 pixel intra PU 412b, a 2 × 8 pixel intra PU 412c, a 1 × 16 pixel intra PU 412d, and a 4 × 16 pixel intra PU 412e. Are divided into intra PUs including an intra PU 412f of 16 × 16 pixels.

(Inter prediction information PTI_Inter)
The inter prediction information PTI_Inter includes a coding parameter that is referred to when the video decoding device 1 generates an inter prediction image by inter prediction. The inter prediction information PTI_Inter includes inter prediction parameters PP_Inter1 to PP_InterNe (Ne is the total number of inter PUs included in the target CU) for each PU.

The inter PU divides the target CU by four symmetrical divisions of 2N × 2N pixels (the same size as the target CU), 2N × N pixels, N × 2N pixels, and N × N pixels. Created.

Also, the inter prediction parameters include an inter prediction type, a reference image index, an estimated motion vector index, and a motion vector residual.

(TT information TTI)
The TT information TTI includes a transform size, a transform type, a transform coefficient, the presence / absence of a transform coefficient in the spatial domain, the presence / absence of a transform coefficient in the frequency domain, and the quantization prediction residual for the total number of TUs included in the target CU. Contains.

The TU is formed by hierarchically dividing the target CU into a quadtree, and the size is determined by information (split_transform_flag) indicating whether or not the target CU or a partial region of the target CU is to be divided. split_transform_flag is basically encoded for each node of the quadtree, but is omitted and estimated according to the constraints on the transform size (maximum transform size, minimum transform size, maximum hierarchy depth of the quadtree). There is also a case.

FIG. 4 (c) shows a state where CU 311 is divided into quadtrees to form TUs. As shown in FIG. 4C, when it is indicated that node division is performed in the hierarchy 0 and the hierarchy 1, the PU 413b is a TU. In addition, when it is indicated that the node is divided at the hierarchy 0 and the node is not divided at the hierarchy 1, the PU 413a is a TU.

For example, when the maximum hierarchical depth is “2” and the size of the target CU is 32 × 32, the TU included in the target CU has a size of 32 × 32 pixels, 16 × 16 pixels, or 8 × 8 pixels. It can take.

The quantized prediction residual QD is encoded data generated by the moving image encoding apparatus 2 performing the following processes 1 to 3 on a target block that is a processing target block.

Process 1: DCT transform (Discrete Cosine Transform) of the prediction residual obtained by subtracting the prediction image from the encoding target image;
Process 2: Quantize the transform coefficient obtained in Process 1;
Process 3: Variable length coding is performed on the transform coefficient quantized in Process 2;
The quantization parameter qp described above represents the magnitude of the quantization step QP used when the moving image coding apparatus 2 quantizes the transform coefficient (QP = 2 ^{qp / 6} ).

(Outline of this embodiment)
First, the reason why the memory capacity can be reduced by this embodiment will be described with reference to FIGS.

FIG. 5 is a diagram for explaining a relationship between a prediction unit PU and a memory in intra prediction, where (a) is a diagram illustrating a case where the prediction unit PU is 4 × 4 pixels, and (b) is a prediction unit. It is a figure which shows the case where PU is 2x8 pixel, (c) is a figure which shows the relationship between an encoding unit and a line memory.

In decoding of prediction parameters, for some prediction units PU, the prediction parameters for the prediction unit PU are estimated from the decoded prediction parameters for the decoded prediction unit PU ′, rather than decoding only from the encoded data. The estimated value obtained in this way may be used in combination.

For example, as shown in FIG. 5A, when estimating the prediction mode related to the PU 513 of 4 × 4 pixels, the prediction mode of the PU 513 is estimated from the prediction mode of the PU 511 in contact with the upper side of the PU 513. Therefore, in order to decode the entire CU 501 to which the PU 513 belongs, it is necessary to record the prediction modes of the PUs 511 and 512 that are in contact with the upper side of the CU 501. In other words, it is necessary to record the prediction modes of the PUs 511 and 512 until the decoding of the CU 501 is completed.

Further, as shown in FIG. 5B, when estimating the prediction mode related to the PU 521 of 2 × 8 pixels, as in the case of the PU 513, the prediction mode of the PU 521 is changed from the prediction mode of the PU 531 in contact with the upper side of the PU 521. presume. Therefore, in order to decode the entire CU 502 to which the PU 521 belongs, it is necessary to record the prediction modes of the PUs 531 to 534 in contact with the upper side of the CU 502. In other words, it is necessary to record the prediction modes of the PUs 531 to 534 that are in contact with the upper side of the CU 502 until the decoding of the CU 502 is completed.

As described above, in order to decode the coding unit CU, it may be necessary to refer to the prediction mode of the prediction unit that is in contact with the upper side of the coding unit CU. Therefore, after the prediction image is generated, it is necessary to record the prediction mode until the decoding of the coding unit CU in contact with the lower side of the prediction unit PU is completed. That is, it is necessary to record the prediction mode for one line of at least one frame.

This point will be described with reference to FIG. As shown in FIG. 5C, in order to decode the coding unit CU505, the prediction of the prediction unit PU existing on the side of the coding unit CU506 that is in contact with the upper side of the coding unit CU505, that is in contact with the coding unit CU505. A mode is required.

Therefore, it is necessary to record the prediction mode for one line (area 507) including the prediction mode of the prediction unit PU existing on the side of the coding unit CU 506 in contact with the coding unit CU 505.

However, as shown in FIGS. 5A and 5B, the coding unit CU501 and the coding unit CU502 are included in their own units even though they have the same size (8 × 8 pixels). Since the sizes of prediction unit PUs to be different are different, the number of prediction modes required for decoding is different.

Specifically, in the case of the coding unit CU501, the prediction mode of the prediction unit PU included in the coding unit CU501 can be determined from the two prediction modes of the prediction units PU511 and 512, and decoding of the coding unit CU501 is possible. On the other hand, in the case of the coding unit CU502, the prediction mode of the prediction unit PU included in the coding unit CU502 cannot be determined unless the four prediction modes of the prediction units PU531 to 535 are used. The coding unit CU502 cannot be decoded.

That is, in order to determine the prediction mode of the prediction unit PU included in the coding unit CU501, the prediction mode in units of four pixels may be recorded in the line memory, whereas in the case of the coding unit CU502, It is necessary to record the prediction mode in units of two pixels in the line memory. Therefore, the required line memory capacity is twice different.

For example, in the case of HD (High-definition television: 1920 × 1080 pixels), in order to decode the coding unit CU501, it is necessary to record 1920 ÷ 4 = 480 prediction modes in the line memory. On the other hand, in order to decode the coding unit CU502, it is necessary to record in 1920 ÷ 2 = 960 prediction mode line memories.

Therefore, even in the case of the coding unit CU502 described above, in order not to increase the memory capacity, the prediction mode is recorded in units compared to the prediction unit PU that is a unit for deriving the prediction mode. It is conceivable to make a certain recording unit a large size.

This will be described with reference to FIG. 6A and 6B are diagrams for explaining a case where the recording unit RU has a larger size than the prediction unit PU. FIG. 6A is a diagram illustrating the size of the recording unit RU. FIG. 6B is a diagram illustrating the prediction unit PU. It is a figure which shows size, (c) is a figure which shows the relationship between prediction unit PU and recording unit RU.

As shown in FIG. 6B, the 8 × 8 pixel CU 602 includes 2 × 8 pixel prediction units PU 610a to PU 610d, and the prediction modes of the prediction units PU 610a to PU 610d are set to P _Pa and P _Pb , respectively. , P _Pc and P _Pd . Further, as shown in FIG. 6A, the recording unit RU is set to 4 × 8 pixel RUs 620a and b, and the recording prediction modes are set to P _ra and P _rb .

Then, as shown in FIG. 6C, in order to derive two prediction modes P _ra and prediction modes P _Pc and P _Pd necessary to derive two prediction modes P _Pa and P _Pb. If the required prediction mode is P _rb , the encoding unit CU 602 includes two prediction units, but there are two prediction modes necessary for decoding. The capacity of the line memory can be reduced. In the future, prediction modes for each prediction unit such as prediction modes P _Pa , P _Pb , P _Pc , and P _Pd are also referred to as prediction prediction modes. In addition, the prediction modes recorded as the prediction modes P _ra and P _rb are also referred to as reference prediction modes. A method for deriving the reference prediction mode from the prediction prediction mode and a method for deriving the prediction prediction mode from the reference prediction mode will be described later.

Although introduction of a prediction unit of 2 × 8 pixels (segmentation of prediction unit) can improve the accuracy of prediction, as described above, it increases the capacity of the memory for recording the prediction mode. On the other hand, if the unit of the reference prediction mode is larger than the unit of the prediction prediction mode, the memory capacity can be prevented from increasing while maintaining high prediction accuracy. In order to realize this effect, it is not necessary to make both the vertical and horizontal units of the reference prediction mode larger than the vertical and horizontal units of the prediction prediction mode. For example, when the purpose is to compress the line memory, at least the horizontal unit of the reference prediction mode may be larger than the horizontal unit of the prediction prediction mode. As a result, the memory capacity of the line memory can be prevented from increasing while maintaining high prediction accuracy.

Note that there are various variations in the decoding method and the derivation method between the reference prediction mode and the prediction prediction mode. For example, (1) a reference prediction mode is derived based on the decoded prediction prediction mode, and (2) a prediction prediction mode is derived by decoding additional information in addition to the decoded reference prediction mode. It is possible.

(Moving picture decoding apparatus 1)
The configuration of the video decoding device 1 will be described with reference to FIG.

FIG. 7 is a block diagram showing a main configuration of the moving picture decoding apparatus 1. As illustrated in FIG. 7, the moving image decoding apparatus 1 includes a CU decoding unit 10, a prediction mode recording unit 11, and a frame memory 12. The CU decoding unit 10 includes a prediction information decoding unit 15, a prediction residual, and the like. The configuration includes a decoding unit 16, a predicted image generation unit 17, and a decoded image generation unit 18.

Schematically speaking, the moving picture decoding apparatus 1 is an apparatus that generates and outputs a decoded image # 2 by decoding the encoded data # 1. In addition, the moving image decoding apparatus 1 includes, as part thereof, H.264. H.264 / MPEG-4 AVC standard technology, VCEG (Video Coding Expert Group) technology used in KTA software, which is a joint development codec, TMuC (Test Model Underside) software This is a video decoding apparatus using the technology and the method adopted in WorkingWorkDraft 1 of High-Efficiency Video Coding (HEVC WD1).

The video decoding device 1 generates a prediction image for each prediction unit, generates a decoded image # 2 by adding the generated prediction image and a prediction residual decoded from the encoded data # 1, Output.

The encoded data # 1 input to the video decoding device 1 is input to the CU decoding unit 10.

The CU decoding unit 10 decodes the encoded data # 1, and finally generates and outputs a decoded image # 2.

The prediction mode recording unit 11 records the prediction mode decoded by the prediction information decoding unit 15 and the position of the recording unit RU in association with each other.

The decoded image # 2 is recorded in the frame memory 12. In the frame memory 12, decoded images corresponding to all CUs decoded before the target CU (for example, all CUs preceding in the raster scan order) at the time of decoding the target CU are recorded. .

The prediction information decoding unit 15 decodes prediction information from the encoded data # 1. Details of the prediction information decoding unit 15 will be described later with reference to another drawing.

The prediction residual decoding unit 16 decodes the prediction residual from the encoded data # 1, and transmits the decoded prediction residual data # 16 to the decoded image generation unit 18.

The predicted image generation unit 17 generates a predicted image from the prediction mode information # 15 acquired from the prediction information decoding unit 15 and the decoded image P ′ acquired from the frame memory 12, and predicted image data # 17 indicating the generated predicted image Is transmitted to the decoded image generation unit 18.

The decoded image generation unit 18 generates and outputs a decoded image # 2 from the prediction residual data # 16 acquired from the prediction residual decoding unit 16 and the prediction image data # 17 acquired from the prediction image generation unit 17. is there.

(Details of the prediction information decoding unit 15)
Next, details of the prediction information decoding unit 15 included in the video decoding device 1 will be described with reference to FIGS.

FIG. 1 is a block diagram illustrating a main configuration of the prediction image decoding unit 15. As shown in FIG. 1, the prediction information decoding unit 15 includes a PU structure decoding unit 21, a prediction prediction mode decoding unit (prediction parameter decoding unit) 22, and a reference prediction mode deriving unit 23.

As described above, the prediction information decoding unit 15 decodes prediction information from the encoded data # 1, and includes the PU structure decoding unit 21, the prediction prediction mode decoding unit 22, and the reference prediction mode deriving unit 23. Is included.

The PU structure decoding unit 21 decodes the PU structure of the target CU from the encoded data # 1, and notifies the prediction prediction mode decoding unit 22 of the decoded PU structure information # 21.

The prediction mode decoding unit for prediction 22 uses the PU structure information # 21 indicating the PU structure of the target CU acquired from the PU structure decoding unit 21 and the encoded data # 1, and the prediction mode recording unit RU (hereinafter, recording unit) of the target CU. RU) is set, and the prediction mode (prediction parameter) of the prediction unit PU included in each recording unit RU is decoded. Then, prediction mode information # 15 indicating the decoded prediction mode is notified to the prediction image generation unit 17 and the reference prediction mode deriving unit 23.

The prediction mode decoding unit 22 for prediction sets the recording unit RU according to a table as shown in FIG. That is, each prediction unit PU constituting the coding unit CU is 4 × 4 pixels, 8 × 8 pixels, 16 × 16 pixels, 32 × 32 pixels, 64 × 64 pixels, 32 × 8 pixels, 8 × 32 pixels, In the case of 16 × 4 pixels and 4 × 16 pixels, the prediction mode decoder for prediction 22 sets the recording unit RU to the same size as the prediction unit PU.

On the other hand, when each prediction unit PU constituting the coding unit CU is 16 × 1 pixel, the prediction mode decoding unit 22 for prediction sets the recording unit RU to 16 × 4 pixels, and sets each coding unit CU. When the prediction unit PU is 1 × 16 pixels, the prediction mode decoding unit 22 for prediction sets the recording unit RU to 4 × 16 pixels. In addition, when each prediction unit PU constituting the coding unit CU is 8 × 2 pixels, the prediction mode decoding unit 22 for prediction sets the recording unit RU to 8 × 4 pixels, and sets each coding unit CU. When the prediction unit PU is 2 × 8 pixels, the prediction mode decoder for prediction 22 sets the recording unit RU to 4 × 8 pixels.

In the above example, the size of the recording unit RU is fixed for each size of the prediction unit PU. However, the size of the recording unit RU may be varied according to the necessity of reducing the memory size. For example, in SPS or PPS, information designating at least how many pixels the prediction mode is recorded may be sent, and the relationship between the size of each prediction unit PU and the size of the recording unit RU may be determined based on the information. . When it has been shown that the prediction mode is recorded in units of N pixels, for a prediction unit PU having a height or width less than N pixels, a recording unit having a size in which the height or width less than N is replaced with N What is necessary is just to associate RU.

Further, the prediction mode decoding unit for prediction 22 decodes the prediction parameter by the following method, for example.

Method 1 (using MPM)
In the encoded data, (1) for each prediction unit, the prediction mode of the prediction unit is a specific prediction mode (for example, the prediction mode of the upper adjacent recording unit adjacent to the upper side of the prediction unit to be decoded, and the decoding target A flag indicating whether or not it matches an estimated prediction mode estimated from a prediction mode having a small prediction mode ID among prediction modes of a left adjacent recording unit adjacent to the left side of the prediction unit; and (2) prediction When the prediction unit whose mode does not match the estimated prediction mode includes a code obtained by encoding the prediction mode related to the prediction unit, the prediction mode decoding unit 22 for prediction uses the encoding parameters as follows: Decode the prediction mode.

That is, the prediction prediction mode decoding unit 22 (1) decodes the flag from the encoded data, and (2) indicates that the flag matches the estimated prediction mode. A prediction mode related to the prediction unit is read from the prediction mode recording unit 11, and (2-2) an estimated prediction mode estimated from the read prediction mode is determined as a prediction mode related to the prediction unit, and (3) the flag is When it shows that it does not correspond with an estimated prediction mode, the prediction mode regarding the prediction unit made into object is determined by decoding the said code | symbol.

Method 2 (use multiple candidates)
(1) For each prediction unit, the prediction mode for the prediction unit includes a plurality of prediction modes (for example, the prediction mode of the upper adjacent recording unit adjacent to the upper side of the prediction unit to be decoded, and the decoding target A flag indicating whether or not the estimated prediction mode estimated from any one of the prediction modes of the left adjacent recording unit adjacent to the left side of the prediction unit) and (2) the prediction mode matches the estimated prediction mode For the prediction unit, information indicating which prediction mode the prediction mode related to the prediction unit matches, and (3) for the prediction unit whose prediction mode does not match the prediction mode, the prediction mode related to the prediction unit is encoded. When the obtained code is included, the prediction mode decoding unit for prediction 22 decodes the prediction mode from the encoding parameter as follows.

That is, the prediction mode decoder for prediction 22 (1) decodes the flag from the encoded data, and (2) indicates that the flag matches the estimated prediction mode (2-1) Decoding the information from the data, (2-2) reading the prediction mode for the prediction unit indicated by the information from the prediction mode recording unit 11, and (2-3) the estimated prediction mode estimated from the read prediction mode (3) If the flag indicates that the prediction mode does not match the estimated prediction mode, the prediction mode for the target prediction unit is determined by decoding the code.

In the processing of the above method 1 and method 2, the estimated prediction mode is used as a prediction value of the intra prediction mode of the target prediction unit. That is, if the flag indicates that the prediction value is correct, the estimated prediction mode is directly set to the intra prediction mode of the target prediction unit. If the flag does not indicate that the prediction value is correct, information for selecting any of the intra prediction modes excluding the prediction value (estimated prediction mode) is decoded, and the intra prediction mode of the target prediction unit is determined. Identified.

Note that the adjacent recording unit adjacent to the upper side or the left side of the prediction unit to be decoded can be defined by the following method. The adjacent recording unit adjacent to the upper side or the left side of a certain target prediction unit can be said to be a recording unit adjacent to the upper left pixel or the left of the recording unit including the target prediction unit.

Therefore, when the upper left pixel position of the target prediction unit is (x, y), the upper left pixel (x ′, y ′) of the recording unit including the target prediction unit can be derived by the following equation.

x ′ = (x >> log ₂ W) << log ₂ W
y ′ = (y >> log ₂ H) << log ₂ H
However, W = max (w, N), H = max (h, N),
w and h are the width and height of the prediction unit including (x, y), respectively. N is the minimum recording unit of the prediction mode. The adjacent recording unit adjacent to the upper side of the target prediction unit is the pixel (x ′, y′−). This is a recording unit including 1), and can be used when the pixel (x ′, y′−1) is included in the decoded area.

The adjacent recording unit adjacent to the left side of the target prediction unit is a recording unit including (x′−1, y ′), and the pixel (x′−1, y ′) is included in the decoded area. It is available when

The above is description of the process in the prediction mode decoding part 22 for prediction, Next, the prediction mode derivation | leading-out part 23 for reference is demonstrated.

The reference prediction mode deriving unit 23 derives a reference prediction mode (reference prediction parameter) for each recording unit RU from the prediction mode information # 15 acquired from the prediction prediction mode decoding unit 22, and the position of the recording unit RU. Are recorded in the prediction mode recording unit 11 in association with each other.

The reference prediction mode deriving unit 23 derives the reference prediction mode by the following method, for example.

Method 1A (simple decimation (decoding order A))
The prediction mode for prediction in the prediction unit decoded first among the prediction units included in the target recording unit is set as the reference prediction mode.

Method 1B (simple decimation (decoding order B))
The prediction prediction mode in the prediction unit decoded last among the prediction units included in the target recording unit is set as a reference prediction mode.

Method 2A (simple decimation (position A))
The prediction prediction mode in the prediction unit including the upper left pixel of the target recording unit is set as the reference prediction mode.

Method 2B (simple decimation (position B))
The prediction prediction mode in the prediction unit including the lower right pixel of the target recording unit is set as the reference prediction mode.

Method 3 (in order of priority)
Among the prediction prediction modes of each prediction unit included in the target recording unit, the prediction mode with the highest priority (the prediction mode ID is the smallest) is set as the reference prediction mode.

Method 4 (average or median direction)
When each prediction unit included in the target recording unit includes only direction prediction, each prediction mode is mapped to an angle, and a prediction mode corresponding to an average value or a median value of each angle is set as a reference prediction mode. In other cases (when DC prediction or Planar prediction is included), DC prediction is set as a reference prediction mode.

In the present embodiment, the number of prediction directions in the prediction mode is 33 for both the prediction prediction mode and the reference prediction mode regardless of the size of the prediction unit PU. This will be described with reference to FIG. FIG. 9 is a diagram showing an example of a prediction mode in the present embodiment, (a) is a diagram showing a relationship between the prediction mode ID and direction, and (b) is a prediction unit PU in a certain CU. It is a figure which shows the relationship with the recording unit RU, (c) is a figure which shows the bit stream in (b).

As shown in FIG. 9A, in this embodiment, IDs of 0, 1, 3 to 33 are assigned in any direction, 2 is assigned to DC prediction, and 34 is assigned to Planar prediction. It has been.

9B, the coding unit CU901 includes 1 × 16 pixel prediction units PU 910a to 910d and 911a to 913d, and a 4 × 16 pixel recording unit RU 920a corresponding to the prediction unit. This indicates a state in which .about.d is set. More specifically, the prediction units PU 910a to d 910d correspond to the recording unit RU 920a, the prediction units 911a to 911d correspond to the recording unit RU 920b, and the prediction units PU 913a to 913d correspond to the recording unit RU 920d.

Further, when the prediction mode of the prediction unit PU 910a is P _P0 , the prediction mode of the prediction unit PU 910b is P _P1 , the prediction mode of the prediction unit PU 910c is P _P2 ,..., And the prediction mode of the prediction unit PU 913d is P _P15. bitstream, as shown in (c) of FIG. 9, consisting of the head and _{_{_{_{P P0, P P1, P P2}}}} , P P3, ... P P15 took form.

Further, the prediction modes of the recording units RU 920a to RUd are P _r0 , P _r1 , P _r2 , and P _r3 , respectively.

P _rk (k = 0,..., K−1) is a reference prediction mode, and P _pl (l = 0,..., L−1) is a prediction prediction mode. The values of P _rk and P _pl are prediction mode IDs.

(Processing flow in the prediction information decoding unit 15)
Next, the flow of processing in the prediction information decoding unit 15 of the video decoding device 1 will be described with reference to FIG. FIG. 10 is a flowchart showing a process flow of the prediction information decoding unit 15.

When the prediction information decoding unit 15 acquires the encoded data # 1 (S1), the PU structure decoding unit 21 decodes the prediction unit PU structure of the target CU from the encoded data # 1 (S2). And the prediction mode decoding part 22 for prediction sets the recording unit RU of the prediction mode for reference in object CU from the prediction unit PU structure decoded by the PU structure decoding part 21 (S3).

Thereafter, the prediction mode decoding unit 22 for prediction decodes the prediction mode for prediction of the prediction unit PU included in the recording unit RU (S5 to S7) for each recording unit RU (S4). Next, the reference prediction mode deriving unit 23 derives a reference prediction mode (S8), and records it in the prediction mode recording unit 11 together with the position of the recording unit (S9). Then, steps S5 to S9 are performed for all the recording units (S10), and the prediction unit PU structure included in the target CU and the prediction mode for prediction of each prediction unit PU are output as prediction information (prediction mode information # 15) ( S11).

The above is the flow of processing in the prediction information decoding unit 15.

(Moving picture encoding device 2)
Next, the moving image encoding device (image decoding device) 2 will be described with reference to FIG. Generally speaking, the moving image encoding device 2 is a device that generates and outputs encoded data # 1 by encoding the input image # 100. In addition, the moving image encoding apparatus 2 includes, as part thereof, H.264. 264 / MPEG-4 AVC standard technology, VCEG (Video Coding Expert Group) technology used in joint development codec KTA software, TMuC (Test Model under Consideration) software This is a moving picture encoding apparatus using a technique and a method adopted in HEVC WD1 as a successor codec.

FIG. 11 is a block diagram showing a main part configuration of the moving picture encoding apparatus 2. As illustrated in FIG. 11, the moving image encoding device 2 includes a prediction information determination unit 31, a reference prediction mode derivation unit 32, a prediction mode recording unit 33, a prediction residual encoding unit 34, a prediction information encoding unit 35, This configuration includes a predicted image generation unit 36, a prediction residual decoding unit 37, a decoded image generation unit 38, a frame memory 39, and an encoded data generation unit (prediction parameter encoding means) 40.

The prediction information determination unit 31 sets a coding unit CU from the acquired input image # 100, sets a prediction unit PU in each coding unit CU, and determines a prediction type in the prediction unit PU. And a prediction parameter is determined according to the determined prediction type. For example, for a prediction unit PU whose prediction type is determined to be intra prediction, the prediction mode in the prediction unit PU is determined. For the prediction unit PU whose prediction type is determined to be inter prediction, the inter prediction type, reference image index, estimated motion vector index, and motion vector residual in the prediction unit PU are determined.

Then, the prediction mode information # 31 indicating the determined prediction unit PU and the prediction parameter is notified to the reference prediction mode deriving unit 32, the prediction information encoding unit 35, and the predicted image generating unit 36.

The reference prediction mode deriving unit 32 determines the recording unit RU corresponding to the prediction unit PU from the prediction mode information # 31 acquired from the prediction information determining unit 31. Then, the reference prediction mode of the recording unit RU is derived and recorded in the prediction mode recording unit 33 together with the position of the recording unit RU in the coding unit CU. Note that details of the processing in the reference prediction mode deriving unit 32 are the same as those in the reference prediction mode deriving unit 23 of the video decoding device 1, and thus description thereof is omitted.

The recorded prediction mode is used for variable-length coding of the prediction mode with high coding efficiency when the prediction mode of the generated prediction image is transmitted to the moving picture decoding apparatus 1. For example, by using the estimated prediction mode derived based on the recorded prediction mode, the prediction mode can be encoded with a smaller amount of code than when the prediction mode to be encoded is directly encoded. Therefore, it is necessary to record the prediction mode of the prediction unit that is in contact with the upper side or the left side of the prediction unit that is the generation target of the predicted image.

The prediction information encoding unit 35 encodes the prediction mode information # 31 acquired from the prediction information determination unit 31, and notifies the encoded prediction mode encoded data # 35 to the encoded data generation unit 40.

The prediction information encoding part 35 encodes prediction mode information # 31 as follows, for example.

(Method 1)
(1) The prediction mode of the target prediction unit is a specific prediction mode (for example, the prediction mode of the upper adjacent recording unit adjacent to the upper side of the target prediction unit and the left adjacent to the left side of the prediction unit of the decoding target) This is estimated from a prediction mode having a small prediction mode ID among prediction modes of adjacent recording units. At this time, the specific prediction mode is read from the prediction mode recording unit 33. (2) The estimated prediction mode (estimated prediction mode) is compared with the prediction mode related to the target prediction unit acquired from the prediction information determination unit 31. (3) If the prediction mode related to the target prediction unit matches the estimated prediction mode, a flag indicating that is encoded. (4) On the other hand, if the prediction mode related to the target prediction unit does not match the estimated prediction mode, a flag indicating that fact and a prediction mode related to the target prediction unit are encoded.

(Method 2)
(1) The prediction mode of the target prediction unit is a plurality of prediction modes (for example, the prediction mode of the upper adjacent recording unit adjacent to the upper side of the target prediction unit and the left adjacent to the left side of the prediction unit of the decoding target) Estimate from each adjacent recording unit prediction mode). At this time, the plurality of prediction modes are read from the prediction mode recording unit 33. (2) Each estimated prediction mode (estimated prediction mode) is compared with the prediction mode related to the target prediction unit acquired from the prediction information determination unit 31. (3) When the prediction mode related to the target prediction unit matches any of the estimated prediction modes, a flag indicating that fact and the prediction prediction mode estimated from the prediction mode related to which prediction unit the matched estimated prediction mode indicates Information is encoded. (4) On the other hand, when the prediction mode regarding the target prediction unit does not match any of the estimated prediction modes, a flag indicating that fact and the prediction mode regarding the target prediction unit are encoded.

The prediction image generation unit 36 generates a prediction image from the prediction mode information # 31 acquired from the prediction information determination unit 31 and the decoded image stored in the frame memory 39, and prediction image data # 36 indicating the generated prediction image Is notified to the prediction residual encoding unit 34 and the decoded image generation unit 38.

The prediction residual encoding unit 34 derives a prediction residual from the input image # 100 and the prediction image acquired from the prediction image generation unit 36, and encodes the derived prediction residual encoded prediction residual data # 34 to the encoded data generation unit 40 and the prediction residual decoding unit 37.

The prediction residual decoding unit 37 decodes the prediction residual encoded data # 34 acquired from the prediction residual encoding unit 34, and notifies the decoded image generation unit 38 of the decoded prediction residual data # 37. .

The decoded image generation unit 38 generates a decoded image from the prediction image acquired from the prediction image generation unit 36 and the prediction residual data # 37 acquired from the prediction residual decoding unit 37, and decoded image data indicating the generated decoded image # 38 is recorded in the frame memory 39.

The encoded data generation unit 40 encodes encoded data # 1 from the prediction mode encoded data # 35 acquired from the prediction information encoding unit 35 and the prediction residual encoded data # 34 acquired from the prediction residual encoding unit 34. Is generated and output.

(Appendix 1)
The recording unit RU may be set to a different value in the horizontal direction and the vertical direction. In particular, the horizontal recording unit may be set to a larger value than the vertical recording unit. For example, a recording unit RU of 4 × 16 pixels is associated with a prediction unit PU of 1 × 16 pixels, and a recording unit RU of 16 × 1 pixels is associated with a prediction unit PU of 16 × 1 pixels.

When referring to the prediction mode, it is conceivable to refer to the prediction mode of the prediction unit in contact with the upper side and the left side of the target prediction unit PU. It is necessary to record the prediction mode (for the screen width). On the other hand, in order to refer to the prediction mode of the prediction unit in contact with the left side, the prediction mode for 1 LCU (TB) may be recorded.

Therefore, by setting the horizontal unit of the recording unit RU to a larger value than the vertical unit, the accuracy of the estimated prediction mode can be reduced while reducing the capacity of the memory (line buffer) for recording the prediction mode. Can be increased.

As the estimated prediction mode, a prediction mode having a smaller prediction mode ID may be set between the prediction mode of the upper adjacent recording unit RU and the prediction mode of the left adjacent recording unit RU in the target prediction unit PU.

Note that the memory capacity required increases as the number of prediction units PU increases in a direction parallel to the scan direction. Therefore, if the recording unit RU is set so that the number of prediction units PU in the direction parallel to the scanning direction is reduced, the memory capacity can be reduced.

(Appendix 2)
The prediction mode (reference prediction mode) corresponding to the recording unit RU does not necessarily have to be recorded in association with the upper left pixel position of the recording unit RU. For example, the encoding unit CU is divided by a predetermined unit (for example, 4 × 4 pixels), and the reference to the reference prediction mode is set so that the same value is referenced in the recording unit RU in each region. Also good. In this case, the prediction mode of the recording unit RU adjacent to the target prediction unit PU is the reference prediction mode referred to by the unit adjacent to the target prediction unit PU.

When N × M pixels are used as the predetermined unit, the reference to the reference prediction mode at the position (x, y) can be defined by the following equation.
x ′ = (x >> log ₂ N) << log ₂ N
y ′ = (y >> log ₂ M) << log ₂ M
[Embodiment 2]
The following will describe another embodiment of the present invention with reference to FIGS. For convenience of explanation, members having the same functions as those shown in the first embodiment are given the same reference numerals, and explanation thereof is omitted.

The present embodiment differs from the first embodiment in the accuracy of the prediction mode (prediction prediction mode) used for generating a prediction image in the prediction unit PU, and the prediction mode for recording the prediction mode. The accuracy of (reference prediction mode) is different.

Before describing the configuration of the moving picture decoding apparatus 1 ′ according to the present embodiment, the accuracy of the prediction mode for prediction used to generate a prediction image in the prediction unit PU and the prediction mode for prediction are recorded. The reason why the memory capacity can be reduced by changing the accuracy of the reference prediction mode will be described with reference to FIGS. 12 and 13 are diagrams illustrating the relationship between the prediction prediction mode and the reference prediction mode.

When the number of prediction modes increases (for example, the number of prediction directions increases), the prediction accuracy improves, but the capacity of the memory for recording also increases. For example, when there are 32 types of prediction modes, a memory of 5 bits per prediction unit is required, and when the prediction modes are 256 types, a memory of 8 bits per prediction unit is required.

In HD (1920 × 1080 pixels), when recording a prediction mode in units of four pixels, if there are 32 types of prediction modes, it is necessary to record a prediction mode of 1920 ÷ 4 × 5 ÷ 8 = 300 (bytes). When there are 256 types of prediction modes, it is necessary to record a prediction mode of 1920 ÷ 4 × 8 ÷ 8 = 480 (bytes).

Therefore, if the prediction direction accuracy differs between the prediction prediction mode and the reference prediction mode, that is, if the prediction direction accuracy of the reference prediction mode is lower than the prediction direction accuracy of the prediction prediction mode, it is high. The memory capacity can be reduced while maintaining the prediction accuracy.

In the example shown in FIG. 12, the number of prediction prediction modes is 130 (0 to 129), the number of reference prediction modes is 34 (0 to 33), and the prediction prediction mode s1 and the reference prediction mode s2 The relationship is as shown in the following equation.

s1 = (s2-1) << 2 + 1
s2 = (s1-1) >> 2 + 1
However, since s1 = 0 and s2 = 0 correspond to a prediction mode that is not a direction prediction (for example, DC mode or Planar mode), the above relationship does not hold. Instead, the relationship of s1 = s2 holds. The above conversion process is a mapping between two prediction parameters expressing directional predictions with different accuracy. (1) Exclusion of prediction modes that are non-directional predictions (the term “−1” in the above equation is (2) Adjustment of direction prediction accuracy (the terms “>> 2” and “<< 2” in the above equation correspond)), (3) Addition of prediction mode that is non-directional prediction (above Generalized by three steps (corresponding to the term “+1” in the equation).

Then, as shown in FIG. 13, the prediction prediction mode P _Pe (FIG. 13B) decoded by the prediction unit PU 1301 and the reference prediction mode P _rf (FIG. 13A) have different accuracy. Then, as illustrated in FIG. 13C, when the prediction mode for prediction of the PU 1301 is derived, the prediction mode P _Pe for prediction is derived from the prediction mode P _rf for reference.

(Configuration of moving picture decoding apparatus 1 ')
Next, with reference to FIG. 14, the structure of moving image decoding apparatus 1 ′ according to the present embodiment will be described. Since the moving picture decoding apparatus 1 ′ differs from the moving picture decoding apparatus 1 only in the configuration of the prediction information decoding unit 15, the prediction information decoding unit 15 ′ of the moving picture decoding apparatus 1 ′ will be described.

FIG. 14 is a block diagram showing a configuration of the prediction information decoding unit 15 '. As shown in FIG. 14, the prediction information decoding unit 15 ′ includes a PU structure decoding unit 21, a reference prediction mode decoding unit 24, a prediction mode update information decoding unit 25, and a prediction prediction mode deriving unit 26. .

Since the PU structure decoding unit 21 is the same as the PU structure decoding unit 21 of the prediction information decoding unit 15, the description thereof is omitted.

In the video decoding device 1 ′, the sizes of the prediction unit PU and the recording unit RU are different, and the accuracy of the prediction mode for prediction and the prediction mode for reference is different. This point will be described with reference to FIG. FIG. 15 is a diagram showing the recording unit RU and the number of prediction modes for each size of the prediction unit PU, (a) is a diagram showing the relationship between the prediction unit PU and the recording unit RU, and (b) is the prediction unit. It is a figure which shows the relationship between the precision of PU, the prediction mode for prediction, and the precision of the prediction mode for reference.

By using a table 1501 as shown in FIG. 15A, the recording unit RU can be set. In the example shown in the table 1501, the prediction unit PU is 4 × 4 pixels, 8 × 8 pixels, 16 × 16 pixels, 32 × 32 pixels, 64 × 64 pixels, 32 × 8 pixels, 8 × 32 pixels, 16 × 4. In the case of pixels, 4 × 16 pixels, the recording unit RU has the same size.

Also, when the prediction unit PU is 16 × 1 pixels, the recording unit RU is 16 × 4 pixels, and when the prediction unit PU is 1 × 16 pixels, the recording unit RU is 4 × 16 pixels.

Further, when the prediction unit PU is 8 × 2 pixels, the recording unit RU is 8 × 4 pixels, and when the prediction unit PU is 2 × 8 pixels, the recording unit RU is 4 × 8 pixels.

Further, the number of prediction prediction modes and the number of reference prediction modes can be set by the table 1502 shown in FIG. In the example shown in the table 1502, the prediction unit PU is 4 × 4 pixels, 8 × 8 pixels, 16 × 16 pixels, 32 × 32 pixels, 64 × 64 pixels, 32 × 8 pixels, 8 × 32 pixels, and 16 × 4 pixels. In the case of 4 × 16 pixels, the number of prediction prediction modes and the number of reference prediction modes are 33 directions.

Further, when the prediction unit PU is 16 × 1 pixel, 1 × 16 pixel, 8 × 2 pixel, 2 × 8 pixel, the number of prediction prediction modes is 129 directions, and the number of reference prediction modes is 33 directions.

The reference prediction mode decoding unit 24 sets the prediction mode recording unit RU of the target CU based on the PU structure information # 21 acquired from the PU structure decoding unit 21. Then, for each recording unit RU included in the target CU, the reference prediction mode is decoded from the encoded data # 1, and is recorded in the prediction mode recording unit 11 together with the position of the recording unit RU in the target CU. Further, the prediction prediction mode data # 24 indicating the decoded reference prediction mode is notified to the prediction prediction mode deriving unit 26.

Further, the reference prediction mode decoding unit 24 decodes the prediction parameters by the following method, for example.

Method 1 (using MPM)
(1) For the recording unit to which each prediction unit belongs to the encoded data, the prediction mode of the recording unit is a prediction mode related to a specific recording unit (for example, the prediction mode of the upper adjacent recording unit adjacent to the upper side of the target recording unit) And a flag indicating whether or not it matches the estimated prediction mode estimated from the prediction mode of the left adjacent recording unit adjacent to the left side of the target recording unit (prediction mode with a small prediction mode ID) ( 2) For a recording unit whose prediction mode does not match the estimated prediction mode, when a code obtained by encoding the prediction mode related to the recording unit is included, the reference prediction mode decoding unit 24 encodes the code as follows: The prediction mode is decoded from the optimization parameters.

That is, the reference prediction mode decoding unit 24 (1) decodes the flag from the encoded data, and (2) indicates that the flag matches the estimated prediction mode. The prediction mode related to the recording unit is read from the prediction mode recording unit 11, (2-2) the prediction mode related to the target recording unit is estimated from the read prediction mode, and (3) the flag does not match the estimated prediction mode , The prediction mode for the target recording unit is determined by decoding the code.

Method 2 (use multiple candidates)
(1) With respect to a recording unit to which each prediction unit belongs to encoded data, a prediction mode (for example, an upper adjacent recording unit adjacent to the upper side of the target recording unit) which is related to any one of a plurality of recording units. And a flag indicating whether or not the prediction mode is estimated to match the prediction mode estimated from the prediction mode of the left recording unit adjacent to the left side of the target recording unit), and (2) the prediction mode is the estimated prediction mode Information indicating which recording unit (any of the plurality of recording units) the prediction mode for the recording unit matches, and (3) the prediction mode is the estimated prediction mode When a recording unit that does not match includes a code obtained by encoding a prediction mode related to the recording unit, the reference prediction mode decoding unit 2 Decodes the prediction mode from the encoding parameters as follows.

That is, the reference prediction mode decoding unit 24 (1) decodes the flag from the encoded data, and (2) indicates that the flag matches the estimated prediction mode. Decoding the information from the data, (2-2) reading the prediction mode related to the recording unit indicated by the information from the prediction mode recording unit 11, and (2-3) the prediction mode related to the target recording unit from the read prediction mode. (3) When the flag indicates that it does not match the estimated prediction mode, the prediction mode for the target recording unit is determined by decoding the code.

The bit stream configuration of the prediction mode when the accuracy of the prediction prediction mode and the accuracy of the reference prediction mode are different will be described with reference to FIG. 16A and 16B are diagrams for explaining the configuration of the bitstream. FIG. 16A is a diagram illustrating the relationship among the target CU, the prediction unit PU, and the recording unit RU, and FIG. It is a figure which shows the example of a bit stream in the case of 16 pixels, (c) is a figure which shows the example of a bit stream in case prediction unit PU is 4x16 pixels.

In the example shown in FIG. 16A, the 16 × 16 pixel target CU includes 1 × 16 pixel prediction units PU1610a, 1610b, 1610c,..., 1613d, and includes four prediction units PU. As described above, 4 × 16 pixel recording units RUs 1620a, 1620b,..., 1620d are included.

Here, the prediction mode of the prediction unit PU 1610a is P _P0 , the prediction mode of the prediction unit PU 1610b is P _P1 , the prediction mode of the prediction unit PU 1610c is P _P2 , the prediction mode of the prediction unit PU 1610d is P _P3 ,. mode is the _{P P15,} the prediction mode of the recording unit RU1620a is _{P r0,} the prediction mode of the recording unit RU1620b is _{P r1,} the prediction mode of the recording unit RU1620c is _{P r2,} the prediction mode of the recording unit RU1620d is assumed to be _{P r3} . P _rk (k = 0,..., K−1) is a reference prediction mode, and P _pl (1 = 0,..., L−1) is a prediction prediction mode. The values of P _rk and P _pl are prediction mode IDs.

In this case, as shown in FIG. 16B, the bit stream of the prediction mode of the target CU is P _r0 , ΔP _P0 , ΔP _P1 , ΔP _P2 , ΔP _P3 , P _r1 , ΔP _P4 ,. _{P15 and} so on. Here, ΔP _P0 is prediction mode update information, and indicates a difference between the prediction mode of the recording unit RU to which the prediction unit PU belongs and the prediction mode of the prediction unit PU. In the case of ΔP _P0 , the difference between P _r0 and P _P0 is shown. The prediction mode update information is decoded only when the corresponding reference prediction mode is direction prediction. Therefore, it is omitted in the case of DC prediction or Planar prediction.

That is, P _r0 , P _r1 ,... Indicates a prediction mode in intra prediction, and ΔP _P0 , ΔP _P1 , ΔP _P2 ,... Indicate information for selecting either DC prediction or Planar prediction. When P _r0 , P _r1 ,... Indicate directional prediction in intra prediction, ΔP _P0 , ΔP _P1 , ΔP _P2 ,... Are decoded, but P _r0 , P _r1 _,. When indicating Planar prediction, ΔP _P0 , ΔP _P1 , ΔP _P2 ,... Are not decoded.

If the prediction unit PU is the same 4 × 16 pixels as the recording unit RU, as shown in (c) of FIG. 16, a bit stream of P _r0 , P _r1 , P _r2 , P _r3 is formed from the top.

As described above, when the unit or accuracy differs between the recording unit RU and the prediction unit PU, the prediction mode update information is encoded and decoded.

The prediction mode update information decoding unit 25 decodes the prediction mode update information from the encoded data # 1 and notifies the prediction mode update information data # 25 indicating the decoded prediction mode update information to the prediction mode deriving unit 26 for prediction. It is.

The prediction mode deriving unit 26 for prediction uses prediction reference mode data # 24 acquired from the reference prediction mode decoding unit 24 and prediction mode update information data # 25 acquired from the prediction mode update information decoding unit 25. The mode is derived and notified to the predicted image generation unit 17.

This will be described in more detail with reference to FIG. FIG. 17 is a diagram for explaining the process of deriving the prediction prediction mode, (a) is a diagram showing the correspondence between the prediction prediction mode and the reference prediction mode, and (b) is the prediction mode. It is a figure which shows the content of update information.

When prediction mode update information ΔP _pl is decoded by the prediction mode update information decoding unit 25, the prediction prediction mode deriving unit 26 decodes the reference prediction mode P _rk of the recording unit RU including the target prediction unit PU. Then, using the decoded P _rk and ΔP _Pl , the parameter s3 of the prediction mode for prediction is derived by the following equation.

s4 = S (P _rk )
s3 = (s4 << 2) + u
However, S (P _rk ) is a function that maps the prediction mode P _rk to the prediction mode that indicates the same direction in a one-to-one correspondence. U is determined by the prediction mode update information ΔP _pl . FIG. 17B shows a table 1701 indicating the relationship between the prediction mode update information ΔP _pl and the corresponding code bit. In the example shown in the table 1701, the sign bit when the prediction mode update information ΔP _pl is “0” is “1”, the sign bit when the prediction mode update information ΔP _pl is “± 1” is “01x”, and the prediction mode The sign bit when the update information ΔP _pl is “± 2” corresponds to “00x”.

In the table 1701, the absolute value of the update information is truncated and unary encoded, and if the update information is non-zero, a sign code is added. Although encoding may be performed using different variable-length encoding schemes, a variable-length encoding scheme that assigns a short code to update information having a small absolute value is preferable.

Next, the flow of processing in the prediction information decoding unit 15 'will be described with reference to FIG. FIG. 18 is a flowchart showing a process flow of the prediction information decoding unit 15 ′.

When the prediction information decoding unit 157 acquires the encoded data # 1 (S21), the PU structure decoding unit 21 decodes the prediction unit PU structure of the target CU from the encoded data # 1 (S22). Then, the reference prediction mode decoding unit 24 sets the recording unit RU of the reference prediction mode in the target CU from the prediction unit PU structure decoded by the PU structure decoding unit 21 (S23).

Thereafter, the reference prediction mode decoding unit 24 decodes the reference prediction mode for each recording unit RU (S4) (S25). Next, the prediction mode update information decoding part 25 decodes prediction mode update information about each prediction unit (S26) (S27). Then, the prediction prediction mode deriving unit 26 derives a prediction prediction mode from the reference prediction mode decoded in step S25 and the prediction mode update information decoded in step S26 (S28).

When the processing of steps S27 and S28 is completed for each prediction unit PU (S29), the reference prediction mode decoding unit 24 records the decoded reference prediction mode in the prediction mode recording unit 11 (S30). Then, Steps S25 to S30 are performed for all the recording units (S31), and the prediction unit PU structure included in the target CU and the prediction mode for prediction of each prediction unit PU are output as prediction information (prediction mode information # 15) ( S32).

The above is the flow of processing in the prediction information decoding unit 15 '.

(Modification)
An edge-based prediction mode (DCIM mode) that determines a prediction direction based on an edge direction derived based on a pixel value of a decoded region adjacent to the target prediction unit is added to the prediction mode of the second embodiment described above. It may be a configuration.

Specifically, a flag for selecting either edge-based prediction or the prediction mode (UIP mode) described in the second embodiment is included in the encoded data for each recording unit RU. Then, the reference prediction mode decoding unit 24 decodes the flag to determine whether it is the UIP mode or the DCIM mode. In the case of the UIP mode, the prediction information is decoded by the method of the second embodiment. On the other hand, in the DCIM mode, the prediction information is decoded by the following method.

(1) The edge direction is derived based on the pixel value of the decoded area adjacent to the recording unit RU.

(2) Prediction mode (s _h ) that expresses the prediction direction closest to the edge direction derived in (1) above with high accuracy (for example, accuracy of prediction mode for prediction in Embodiment 2 (129 types)). Choose from. Let e1 be the value of the selected prediction mode.

(3) The following processing is applied to each prediction unit PU included in the recording unit RU.

(3.1) The prediction mode update information P _pl is decoded.

(3.2) As shown in the following equation, the prediction mode update information P _pl is added to the prediction mode e1 to derive the prediction mode for prediction of the target PU.

Prediction prediction mode s = e1 + u
(4) The prediction mode ID (Pr) corresponding to the prediction direction s _{h ′} approximating the prediction mode e1 is derived by the following equation and recorded in the prediction mode recording unit.

s _{h ′} = e1 >> 2
Pr = S ⁻¹ (s _{h ′} )
Note that S-1 (s _{h ′} ) is a function that maps the prediction direction sh _′ to the prediction mode ID in the same direction on a one-to-one basis.

(Appendix 3)
In the said Embodiment 2, although the method to update a prediction direction by prediction mode update information within prediction unit PU was demonstrated, you may update except a prediction direction.

For example, when the prediction mode decoded from the recording unit RU is the DC prediction mode or the Planar prediction mode, information indicating which prediction mode is applied to each prediction unit PU may be used as the prediction mode update information. Both the DC prediction mode and the Planar prediction mode are predictions suitable for a flat region. Therefore, in the recording unit RU corresponding to the flat area, the coding efficiency can be improved by selectively switching to the preferred prediction mode of both predictions.

Further, when the direction mode is decoded as the prediction mode of the recording unit RU, information indicating which of the prediction mode and the DC mode to be applied to each prediction unit PU may be used as the prediction mode update information. For example, in the case of a prediction unit PU of 16 × 1 pixels, there are many cases where an edge exists only in a partial region in a recording unit RU of 16 × 4 pixels. In such a case, encoding efficiency can be improved by selectively switching between the direction mode and the DC mode.

(Appendix 4)
In the second embodiment, the method for updating the prediction direction of the prediction mode for prediction by sending the difference with respect to the prediction direction of the reference prediction mode by the prediction mode update information in the prediction unit PU has been described. There is no. For example, information for selecting either a prediction mode or a reference prediction mode that is highly likely to be selected by the PU may be sent as update information. As a prediction mode with high possibility of being selected, DC mode is mentioned, for example. Another example is a prediction mode that is determined according to the shape of the prediction unit PU. Specifically, when the prediction unit PU has a vertically long shape (such as 1 × 16 or 2 × 8), the prediction mode corresponding to the prediction direction in the vertical direction is selected, and the prediction unit PU has a horizontally long shape (16 × 1 or In the case of 8 × 2, etc., the prediction mode corresponding to the horizontal prediction direction may be included in the selection candidates based on the update information as a prediction mode with a high possibility of being selected.

(Appendix 5)
In the second embodiment, it has been described that the capacity of the memory for recording the intra prediction mode can be reduced. However, the present invention is not limited to the intra prediction mode, and is used for generating a predicted image that can be recorded in units smaller than the coding unit CU. A method of making the size of the prediction unit PU and the recording unit RU different can also be applied to the parameter.

For example, the present invention can be applied to estimated intra prediction mode selection information, residual information of intra prediction modes, motion vectors, motion vector residuals, estimated motion vector selection information, reference image selection information, reference image list selection information, and the like.

(Appendix 6)
In the second embodiment, the accuracy of the reference prediction mode is higher than the accuracy of the prediction prediction mode in the configuration of “making the size of the recording unit larger than the size of the prediction unit” described in the first embodiment. Although the configuration of “lower” is combined, the latter configuration does not necessarily need to be combined with the former configuration, and the single configuration is effective.

(Appendix 7)
In the first embodiment and the second embodiment, it is described that the reference prediction mode is used when the estimated value of the intra prediction mode is derived. However, it is not always necessary to use only the reference prediction mode in all cases. For example, when referring to a prediction mode decoded relatively recently, the intra prediction mode may be estimated using the prediction prediction mode instead of the reference prediction mode. Specifically, the estimated value may be derived based on the prediction prediction mode of the prediction unit PU adjacent on the left side and the reference prediction mode of the recording unit RU adjacent on the upper side.

When decoding LCU (TB) in the raster scan order, the recording unit RU adjacent to the left side is decoded relatively recently compared to the recording unit RU adjacent to the upper side. Therefore, the capacity of the memory for recording the prediction prediction mode of the prediction unit PU adjacent on the left side is compared with the capacity of the memory for recording the prediction prediction mode of the prediction unit PU adjacent on the upper side. And few.

Therefore, the accuracy of the estimated value of the prediction mode can be increased without greatly increasing the memory capacity. As another example, when the prediction unit PU adjacent to the left side or the upper side is in the same coding unit CU or LCU (TB) as the target prediction unit PU, the prediction prediction mode is used, otherwise The intra prediction mode estimation value may be derived using the reference prediction mode of the recording unit RU adjacent on the left side or the upper side. Since the prediction units PU included in the same coding unit CU and LCU (TB) are decoded at relatively close timings, it is possible to improve the accuracy of the prediction mode estimation value without greatly increasing the memory capacity.

A specific example will be described in the case where the above example is applied to the process of deriving the estimated motion vector based on the motion vector of the adjacent region above the target prediction unit. In the following, it is assumed that the size of the recording unit RU is 8 × 8 pixels. The minimum width of the prediction unit is 4 pixels.
(1) The position (xP, yP) of the upper left pixel of the target prediction unit is acquired.
(2) The prediction unit including (xP, yP-1) is set as the upper adjacent prediction unit of the target prediction unit.
(3) It is determined whether the target prediction unit and the upper adjacent prediction unit belong to the same LCU.
(3-1) When belonging to the same LCU, the motion vector of the upper adjacent prediction unit is set as the estimated motion vector.
(3-2) When not belonging to the same LCU, the motion vector of the upper adjacent recording unit is set as the estimated motion vector. Here, the upper adjacent recording unit is a recording unit including (xP, yP-1).

When the estimated motion vector is derived by the above procedure, when the upper adjacent prediction unit belongs to an LCU different from the target prediction unit, the motion vector is referred to in the recording unit. That is, it is only necessary to store the motion vector in the memory in units of recording in the LCU on one line of the target LCU. When the motion vector of the upper adjacent prediction unit is always referred to, it is necessary to hold the motion vector in the prediction unit in the LCU on one line of the target LCU. Therefore, when the width of the recording unit is larger than the minimum width of the prediction unit, the amount of line memory for holding the motion vector in the LCU on one line of the target LCU is reduced by estimating the motion vector by the above procedure. it can.

Note that the above memory reduction effect can be obtained by referring to only the coordinates in the recording unit when referring to the motion vector. For example, when the recording unit is 8 × 8, only one motion vector needs to be referred to per 8 × 8 area. Hereinafter, an example will be described.

For example, the recording position (xB ′, yB ′) of the motion vector belonging to the upper adjacent recording unit may be determined as follows.
xB '= ((xP + 7) >> 3) << 3
yB '= yP-1
Here, the motion vector of the LCU on one line is recorded at a position of (N × 8, yP−1) in a recording unit. Here, N is an integer of 0 or more. In this case, one motion vector is recorded for every 8 pixels in the x-axis direction. According to the above equation, the recording position of the motion vector in the upper adjacent recording unit is the recording position of the motion vector at a position closest to the pixel one pixel above the upper left pixel of the target PU.

Further, the recording position (xB ′, yB ′) of the motion vector belonging to the upper recording unit may be determined as follows.
xB '= (xP >> 3) << 3
yB '= yP-1
Also in this case, the motion vector of the LCU on one line is recorded at the position of (N × 8, yP−1) in the recording unit. Here, N is an integer of 0 or more. In this case, one motion vector is recorded for every 8 pixels in the x-axis direction. According to the above formula, the recording position of the motion vector of the upper adjacent recording unit is the position of the quotient obtained by dividing the x coordinate of the pixel by the width of the recording unit with reference to the pixel one pixel above the upper left pixel of the target PU. , And the x coordinate position.

Further, the recording position (xB ′, yB ′) of the motion vector belonging to the upper recording unit may be determined as follows.
xB '= (xP >> 3) << 3 [0 <= (xB% 16) <8]
xB '= (((xP >> 3) + 1) << 3)-1 [8 <= (xB% 16) <16]
yB '= yP-1
Here, the motion vector of the LCU on one line is recorded at positions (N × 16, yP−1) and (N × 16−1, yP−1) in the recording unit. Here, N is an integer of 0 or more. In this case, one motion vector is recorded for every eight pixels in the x-axis direction. Here, D is a remainder obtained by dividing the x coordinate of the pixel by a number twice the width of the recording unit, and E is the width of the recording unit. According to the above formula, the recording position of the motion vector of the upper adjacent recording unit is (N × 0) if the value of D is 0 or more and less than E with reference to the pixel one pixel above the upper left pixel of the target prediction unit. Referring to the motion vector recorded at the position 16, yP-1), if the value of D is greater than or equal to E and less than 2E, it is recorded at the position (N × 16-1, yP-1) Refers to the motion vector.

Note that the estimated motion vector derived in the above example may be one of a plurality of estimated motion vector candidates. Further, the estimated motion vector may be directly used for motion compensation, or a motion vector obtained by adding a difference motion vector to the estimated motion vector may be used for motion compensation.

(Appendix 8)
In Embodiment 1 and Embodiment 2 described above, the case where the reference prediction mode is used for derivation of the estimated value of the intra prediction mode has been described, but the present embodiment also applies when the reference prediction mode is used for other purposes. It is valid. The present embodiment can be applied to any process that refers to an intra prediction mode that has been previously decoded in decoding order. For example, it can be applied to a case where a deblocking filter having an appropriate strength is applied after determining the continuity of the boundary between prediction units PU by referring to the intra prediction mode. In such a case, the memory capacity for recording the intra prediction mode can be reduced by using the reference prediction mode.

(Application examples)
The moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above can be used by being mounted on various apparatuses that perform moving picture transmission, reception, recording, and reproduction. The moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.

First, it will be described with reference to FIG. 19 that the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above can be used for transmission and reception of moving pictures.

FIG. 19A is a block diagram showing a configuration of a transmission apparatus A equipped with the moving picture encoding apparatus 2. As shown in FIG. 19 (a), the transmitting apparatus A encodes a moving image, obtains encoded data, and modulates a carrier wave with the encoded data obtained by the encoding unit A1. A modulation unit A2 that obtains a modulation signal by the transmission unit A2 and a transmission unit A3 that transmits the modulation signal obtained by the modulation unit A2. The moving image encoding device 2 described above is used as the encoding unit A1.

The transmission apparatus A has a camera A4 that captures a moving image, a recording medium A5 that records the moving image, an input terminal A6 for inputting the moving image from the outside, as a supply source of the moving image that is input to the encoding unit A1. You may further provide image processing part A7 which produces | generates or processes an image. FIG. 19A illustrates a configuration in which the transmission apparatus A includes all of these, but a part of the configuration may be omitted.

The recording medium A5 may be a recording of a non-encoded moving image, or a recording of a moving image encoded using a recording encoding scheme different from the transmission encoding scheme. It may be a thing. In the latter case, a decoding unit (not shown) for decoding the encoded data read from the recording medium A5 according to the recording encoding method may be interposed between the recording medium A5 and the encoding unit A1.

FIG. 19B is a block diagram illustrating a configuration of the receiving device B on which the moving image decoding device 1 is mounted. As shown in FIG. 19B, the receiving device B includes a receiving unit B1 that receives a modulated signal, a demodulating unit B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit B1, and a demodulating unit. A decoding unit B3 that obtains a moving image by decoding the encoded data obtained by B2. The moving picture decoding apparatus 1 described above is used as the decoding unit B3.

The receiving apparatus B has a display B4 for displaying a moving image, a recording medium B5 for recording the moving image, and an output terminal for outputting the moving image as a supply destination of the moving image output from the decoding unit B3. B6 may be further provided. FIG. 19B illustrates a configuration in which the receiving apparatus B includes all of these, but some of them may be omitted.

Note that the recording medium B5 may be for recording an unencoded moving image, or is encoded by a recording encoding method different from the transmission encoding method. May be. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit B3 in accordance with the recording encoding method may be interposed between the decoding unit B3 and the recording medium B5.

Note that the transmission medium for transmitting the modulation signal may be wireless or wired. Further, the transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, transmission in which the transmission destination is specified in advance). Refers to the embodiment). That is, the transmission of the modulation signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a terrestrial digital broadcast broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) is an example of a transmitting apparatus A / receiving apparatus B that transmits and receives modulated signals by wireless broadcasting. A broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) for cable television broadcasting is an example of a transmitting device A / receiving device B that transmits and receives a modulated signal by cable broadcasting.

Also, a server (workstation etc.) / Client (television receiver, personal computer, smart phone etc.) such as VOD (Video On Demand) service and video sharing service using the Internet is a transmitting device for transmitting and receiving modulated signals by communication. This is an example of A / reception device B (usually, either wireless or wired is used as a transmission medium in a LAN, and wired is used as a transmission medium in a WAN). Here, the personal computer includes a desktop PC, a laptop PC, and a tablet PC. The smartphone also includes a multi-function mobile phone terminal.

In addition to the function of decoding the encoded data downloaded from the server and displaying it on the display, the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device A and the reception device B.

Next, the fact that the above-described moving picture decoding apparatus 1 and moving picture encoding apparatus 2 can be used for recording and reproduction of moving pictures will be described with reference to FIG.

FIG. 20A is a block diagram showing a configuration of a recording apparatus C equipped with the moving picture decoding apparatus 1 described above. As shown in FIG. 20 (a), the recording device C encodes a moving image to obtain encoded data, and writes the encoded data obtained by the encoding unit C1 to the recording medium M. And a writing unit C2. The moving image encoding device 2 described above is used as the encoding unit C1.

The recording medium M may be of a type built in the recording device C, such as (1) HDD (Hard Disk Drive) or SSD (Solid State Drive), or (2) SD memory. It may be of the type connected to the recording device C, such as a card or USB (Universal Serial Bus) flash memory, or (3) DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registration) (Trademark) or the like may be mounted on a drive device (not shown) built in the recording apparatus C.

The recording apparatus C also serves as a moving image supply source to be input to the encoding unit C1, a camera C3 that captures moving images, an input terminal C4 for inputting moving images from the outside, and reception for receiving moving images. A unit C5 and an image processing unit C6 that generates or processes an image may be further provided. FIG. 20A illustrates a configuration in which the recording apparatus C includes all of these, but some of them may be omitted.

The receiving unit C5 may receive an unencoded moving image, or receives encoded data encoded by a transmission encoding method different from the recording encoding method. You may do. In the latter case, a transmission decoding unit (not shown) that decodes encoded data encoded by the transmission encoding method may be interposed between the reception unit C5 and the encoding unit C1.

Examples of such a recording device C include a DVD recorder, a BD recorder, and an HD (Hard Disk) recorder (in this case, the input terminal C4 or the receiving unit C5 is a main source of moving images). In addition, a camcorder (in this case, the camera C3 is a main source of moving images), a personal computer (in this case, the receiving unit C5 or the image processing unit C6 is a main source of moving images), a smartphone (this In this case, the camera C3 or the receiving unit C5 is a main source of moving images).

FIG. 20B is a block showing the configuration of the playback device D on which the above-described video decoding device 1 is mounted. As shown in FIG. 20 (b), the playback device D obtains a moving image by decoding the read data D1 that reads the encoded data written on the recording medium M and the read data read by the read unit D1. A decoding unit D2. The moving picture decoding apparatus 1 described above is used as the decoding unit D2.

The recording medium M may be of a type built in the playback device D such as (1) HDD or SSD, or (2) such as an SD memory card or USB flash memory. It may be of a type connected to the playback device D, or (3) may be loaded into a drive device (not shown) built in the playback device D, such as DVD or BD. Good.

Further, the playback device D has a display D3 for displaying a moving image, an output terminal D4 for outputting the moving image to the outside, and a transmitting unit for transmitting the moving image as a supply destination of the moving image output by the decoding unit D2. D5 may be further provided. FIG. 20B illustrates a configuration in which the playback apparatus D includes all of these, but some of the configurations may be omitted.

The transmission unit D5 may transmit a non-encoded moving image, or transmits encoded data encoded by a transmission encoding method different from the recording encoding method. You may do. In the latter case, an encoding unit (not shown) that encodes a moving image with a transmission encoding method may be interposed between the decoding unit D2 and the transmission unit D5.

Examples of such a playback device D include a DVD player, a BD player, and an HDD player (in this case, an output terminal D4 to which a television receiver or the like is connected is a main moving image supply destination). . In addition, a television receiver (in this case, the display D3 is a main destination of moving images), a desktop PC (in this case, the output terminal D4 or the transmission unit D5 is a main destination of moving images), Laptop type or tablet type PC (in this case, display D3 or transmission unit D5 is the main video image supply destination), smartphone (in this case, display D3 or transmission unit D5 is the main video image supply destination) ), Digital signage (also referred to as an electronic signboard or an electronic bulletin board, and the display D3 or the transmission unit D5 is the main supply destination of moving images) is an example of such a playback device D.

(Configuration by software)
Finally, each block of the moving

image decoding apparatuses

1 and 1 ′ and the moving image encoding apparatus 2, particularly the CU decoding unit 10 (prediction information decoding unit 15 (PU structure decoding unit 21, prediction prediction mode decoding unit 22, reference) Prediction mode deriving unit 23), prediction residual decoding unit 16, prediction image generation unit 17, decoded image generation unit 18), prediction information decoding unit 15 ′ (reference prediction mode decoding unit 24, prediction mode update information decoding unit 25, Prediction prediction mode derivation unit 26), prediction information determination unit 31, reference prediction mode derivation unit 32, prediction residual encoding unit 34, prediction information encoding unit 35, prediction image generation unit 36, prediction residual decoding unit 37 The decoded image generation unit 38 and the encoded data generation unit 40 may be realized by hardware by a logic circuit formed on an integrated circuit (IC chip), or use a CPU (central processing unit). Sof It may be realized as a software.

In the latter case, the moving

picture decoding apparatuses

1 and 1 ′ and the moving picture coding apparatus 2 include a CPU that executes instructions of a control program that realizes each function, a ROM (read only memory) that stores the program, and the program. A RAM (random access memory) to be developed, a storage device (recording medium) such as a memory for storing the program and various data, and the like are provided. An object of the present invention is to provide program codes (execution format program, intermediate code program, source program) of control programs for the

video decoding devices

1, 1 ′ and the video encoding device 2 that are software for realizing the functions described above. Is recorded on the recording medium by the computer (or CPU or MPU (microprocessor unit)) supplied to the above-described moving

picture decoding apparatus

1, 1 ′ and moving picture encoding apparatus 2. This can also be achieved by reading and executing the program code.

Examples of the recording medium include tapes such as a magnetic tape and a cassette tape, a magnetic disk such as a floppy (registered trademark) disk / hard disk, a CD-ROM (compact disk-read-only memory) / MO (magneto-optical) / Discs including optical discs such as MD (Mini Disc) / DVD (digital versatile disc) / CD-R (CD Recordable), IC cards (including memory cards) / optical cards, mask ROM / EPROM (erasable) Programmable read-only memory) / EEPROM (electrically erasable and programmable programmable read-only memory) / semiconductor memory such as flash ROM, or logic circuits such as PLD (Programmable logic device) and FPGA (Field Programmable Gate Array) be able to.

Further, the moving picture decoding apparatuses 1, 1 'and the moving picture encoding apparatus 2 may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, Internet, intranet, extranet, LAN (local area network), ISDN (integrated area services digital area), VAN (value-added area network), CATV (community area antenna television) communication network, virtual area private network (virtual area private network), A telephone line network, a mobile communication network, a satellite communication network, etc. can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, IEEE (institute of electrical and electronic engineers) 1394, USB, power line carrier, cable TV line, telephone line, ADSL (asynchronous digital subscriber loop) line, etc. wired such as IrDA (infrared data association) or remote control , Bluetooth (registered trademark), IEEE802.11 wireless, HDR (high data rate), NFC (Near field communication), DLNA (Digital Living Network Alliance), mobile phone network, satellite line, terrestrial digital network, etc. Is possible. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave in which the program code is embodied by electronic transmission.

(Other)
The above-described embodiment can also be expressed as follows.

An image decoding apparatus according to the present invention uses a region obtained by dividing an encoding unit as a prediction unit, generates a prediction image for each prediction unit with reference to a prediction parameter, and decodes a prediction residual decoded from encoded data In the image decoding device for generating a decoded image by adding the predicted image to the prediction parameter decoding means for decoding a prediction parameter for each prediction unit from the encoded data, the prediction parameter decoding means for at least a part of the prediction units A prediction parameter decoding unit configured to estimate a prediction parameter related to a unit from a decoded prediction parameter related to a prediction unit included in an adjacent coding unit adjacent to the coding unit to which the prediction unit belongs, and at least a part of the coding unit Out of the prediction parameters for each prediction unit included in the coding unit, which are adjacent to the coding unit. The number of reference prediction parameters that can be referred to by the prediction parameter decoding means to estimate the prediction parameters for each prediction unit included in the adjacent coding unit is the adjacent code among the prediction units included in the coding unit. It is characterized in that it is set smaller than the number of prediction units adjacent to the conversion unit.

In the image decoding apparatus according to the present invention, the prediction parameter decoding unit records, for the partial coding unit, only the reference prediction parameter among the prediction parameters for each prediction unit included in the coding unit in the memory. You may do.

According to the above configuration, since only the reference prediction parameter is recorded in the memory, it is adjacent to the encoding unit included in the adjacent encoding unit adjacent to the encoding unit to which the prediction unit for estimating the prediction parameter belongs. Compared with the case where all the prediction parameters of a prediction unit are recorded, the required memory capacity can be reduced.

In the image decoding device according to the present invention, the encoded data includes a prediction parameter code obtained by encoding a reference prediction parameter for each of the partial encoding units, and each prediction included in the encoding unit. A difference code obtained by encoding a difference between a prediction parameter related to a unit and the reference prediction parameter is included, and the prediction parameter decoding unit is configured to decode the prediction parameter code. A prediction parameter related to each prediction unit included in the partial coding unit may be derived by adding a difference obtained by decoding the difference code to the prediction parameter.

According to the above configuration, the prediction parameter decoding means derives the prediction parameter related to the prediction unit by adding the reference prediction parameter and the difference between the reference prediction parameter and the prediction parameter related to the prediction unit. Thereby, since the prediction parameter can be derived by deriving a prediction parameter with higher accuracy than the reference prediction parameter, the accuracy of the prediction image can be improved.

In the image decoding apparatus according to the present invention, the prediction image is generated by intra prediction, the prediction parameter indicates a prediction mode in intra prediction, and the prediction parameter decoding means is for the reference When the prediction parameter indicates direction prediction in intra prediction, the difference code may be decoded.

According to the above configuration, the difference is decoded when the reference prediction parameter indicates the direction prediction in the intra prediction. Thereby, since the difference is decoded in the case of direction prediction in which the accuracy of prediction is increased by the difference, the difference code may be included in the encoded data only when necessary, and the encoding efficiency can be improved.

In the image decoding apparatus according to the present invention, the predicted image is generated by intra prediction, the prediction parameter indicates a prediction mode in intra prediction, and the parameter decoding means includes the reference prediction When the parameter indicates edge-based prediction in intra prediction, the difference code may be decoded.

According to the above configuration, when the reference prediction mode indicates edge-based prediction, the difference is decoded. Thereby, since the difference is decoded in the case of edge-based prediction in which the accuracy of prediction is increased by the difference, the difference code may be included in the encoded data only when necessary, and the encoding efficiency can be improved.

In the image decoding apparatus according to the present invention, the predicted image is generated by intra prediction, the prediction parameter indicates a prediction mode in intra prediction, and the encoded data includes the part of the encoded data. The prediction mode code obtained by encoding the reference prediction mode that is the reference prediction parameter and the prediction mode that is the prediction parameter for each prediction unit included in the encoding unit are the surroundings of the prediction image. And a selection code obtained by encoding selection information, which is information for selecting either DC prediction or Planar prediction for generating a prediction image from the average of the pixel values of the pixels, and the prediction parameter decoding means Includes at least some of the encoding units from the reference prediction parameters obtained by decoding the encoded data. When the prediction mode for each prediction unit is derived and the reference prediction mode indicates either DC prediction or Planar prediction, the selection code is further decoded to The prediction mode for each prediction unit included in the coding unit may be derived.

According to the above configuration, only when the reference prediction mode indicates DC prediction or Planar prediction, the selection code for selecting one of the prediction methods suitable for the flat region is decoded. As a result, it is only necessary to include a selection code for selecting a prediction method suitable for a flat region in the encoded data, so that the encoding efficiency can be improved.

In the image decoding apparatus according to the present invention, the prediction parameter decoding means includes, for the part of the coding units, a reference prediction parameter to be referred to in order to estimate a prediction parameter related to the prediction unit, the part of the coding units. May be derived from the prediction parameters for each prediction unit included in the adjacent coding unit adjacent to.

According to the above configuration, the prediction parameter for each prediction unit belonging to the adjacent coding unit adjacent to the coding unit to which the prediction unit for estimating the prediction parameter by referring to the reference prediction parameter belongs. Derived from Thereby, the prediction parameter for reference can be derived appropriately.

In the image decoding apparatus according to the present invention, the predicted image is generated by intra prediction, the prediction parameter indicates a prediction mode in intra prediction, and the prediction parameter decoding means For a coding unit, among the decoded prediction modes for each prediction unit included in the adjacent coding unit adjacent to the coding unit, a prediction mode with the smallest prediction mode ID may be used as a reference prediction parameter. Good.

According to the above configuration, the prediction mode for each prediction unit belonging to the adjacent coding unit adjacent to the coding unit to which the prediction unit for estimating the prediction mode by referring to the reference prediction mode belongs. The prediction mode with the smallest prediction mode ID is selected. Since the prediction mode with the smallest prediction mode ID is the prediction mode most likely to be selected, this makes it possible to set a more appropriate prediction mode as the reference prediction mode.

In the image decoding apparatus according to the present invention, the accuracy of the reference prediction parameter recorded in the memory may be set lower than the accuracy of the prediction parameter decoded by the parameter decoding means.

According to the above configuration, the reference prediction parameter with lower accuracy than the prediction parameter decoded by the prediction parameter decoding means is recorded. A prediction parameter with low accuracy has a smaller amount of data than a prediction parameter with high accuracy. Therefore, the capacity of the recording memory can be reduced while taking into account the generation of a highly accurate predicted image.

In the image decoding apparatus according to the present invention, the predicted image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
The accuracy of the prediction mode recorded in the memory may be set lower than the accuracy of the prediction mode decoded by the prediction parameter decoding unit.

According to the above configuration, it is possible to generate a prediction image with higher accuracy and reduce the capacity of the memory used for recording the prediction mode. In order to improve accuracy, it is conceivable to derive a prediction mode by designating a direction in units of an angle smaller than a predetermined value.

The image encoding apparatus according to the present invention uses a region obtained by dividing an encoding unit as a prediction unit, generates a prediction image for each prediction unit with reference to a prediction parameter, and subtracts the generated prediction image from an original image In an image encoding device that encodes the predicted residual and outputs encoded data, at least in some encoding units, a prediction parameter for each prediction unit is set to an adjacent code adjacent to the encoding unit to which the prediction unit belongs. A prediction that estimates from a decoded prediction parameter related to a prediction unit included in a coding unit and encodes the prediction parameter related to the prediction unit only when the prediction parameter does not match the estimated prediction parameter obtained by the above estimation Parameter encoding means, and at least a part of the encoding units is related to each prediction unit included in the encoding unit. Among the prediction parameters, the number of reference prediction parameters that can be referred to by the prediction parameter encoding means to estimate a prediction parameter for each prediction unit included in an adjacent coding unit adjacent to the coding unit is Of the prediction units included in the coding unit, the prediction unit is set to be smaller than the number of prediction units adjacent to the adjacent coding unit.

The present invention is not limited to the above-described embodiments, and various modifications can be made within the scope of the claims, and the embodiments can be obtained by appropriately combining technical means disclosed in different embodiments. The form is also included in the technical scope of the present invention.

The present invention can be suitably applied to a decoding device that decodes encoded data and an encoding device that generates encoded data. Further, the present invention can be suitably applied to the data structure of encoded data generated by the encoding device and referenced by the decoding device.

1 video decoding device (image decoding device)
2 Video encoding device (image encoding device)
21 PU structure decoding unit 22 Prediction mode decoding unit for prediction (prediction parameter decoding means)
23 reference prediction mode deriving unit 24 reference prediction mode decoding unit 25 prediction mode update information decoding unit 26 prediction prediction mode deriving unit 32 reference prediction mode deriving unit 40 encoded data generation unit (prediction parameter encoding means)

Claims

An area obtained by dividing the coding unit is used as a prediction unit, a prediction image is generated for each prediction unit with reference to the prediction parameter, and the prediction image is added to the prediction residual decoded from the encoded data and decoded. In a video decoding device that generates an image,
Prediction parameter decoding means for decoding a prediction parameter for each prediction unit from the encoded data, and for at least some of the prediction units,
Prediction parameters for the prediction unit
When the upper adjacent prediction unit adjacent to the upper side of the prediction unit belongs to the tree block to which the prediction unit belongs, an estimation is performed from the decoded prediction parameters related to the upper adjacent prediction unit,
When the upper adjacent prediction unit does not belong to the tree block to which the prediction unit belongs, the prediction unit includes a prediction parameter decoding unit that estimates from a decoded prediction parameter related to a recording unit adjacent to the upper side of the prediction unit. A moving picture decoding apparatus.
The predicted image is a predicted image generated by inter prediction,
The moving picture decoding apparatus according to claim 1, wherein the prediction parameter is a motion vector.
An area obtained by dividing the coding unit is used as a prediction unit, a prediction image is generated for each prediction unit with reference to the prediction parameter, and the prediction image is added to the prediction residual decoded from the encoded data and decoded. In an image decoding device that generates an image,
Prediction parameter decoding means for decoding a prediction parameter for each prediction unit from the encoded data, and for at least a part of the prediction units, the prediction parameter for the prediction unit is adjacent to the coding unit to which the prediction unit belongs. A prediction parameter decoding unit that estimates from a prediction parameter that has been decoded with respect to a prediction unit included in a coding unit;
In order to estimate a prediction parameter for each prediction unit included in an adjacent coding unit adjacent to the coding unit among prediction parameters for each prediction unit included in the coding unit for at least a part of the coding units. The number of reference prediction parameters that can be referred to by the prediction parameter decoding means is set to be smaller than the number of prediction units adjacent to the adjacent coding unit among the prediction units included in the coding unit. An image decoding apparatus characterized by the above.
The said prediction parameter decoding means records only the prediction parameter for a reference among the prediction parameters regarding each prediction unit contained in this encoding unit about the said one part encoding unit, It is characterized by the above-mentioned. 4. The image decoding device according to 3.
The encoded data includes a prediction parameter code obtained by encoding a reference prediction parameter, a prediction parameter for each prediction unit included in the encoding unit, and the reference prediction for the partial encoding unit. And the difference code obtained by encoding the difference with the parameter,
The prediction parameter decoding means includes the difference obtained by decoding the difference code to the reference prediction parameter obtained by decoding the prediction parameter code, thereby including in the partial coding unit. The image decoding apparatus according to claim 3, wherein a prediction parameter for each prediction unit is derived.
The prediction image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
6. The image decoding apparatus according to claim 5, wherein the prediction parameter decoding means decodes the difference code when the reference prediction parameter indicates direction prediction in intra prediction.
The prediction image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
6. The image decoding apparatus according to claim 5, wherein the parameter decoding unit decodes the difference code when the reference prediction parameter indicates edge-based prediction in intra prediction.
The prediction image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
The encoded data includes a prediction mode code obtained by encoding a reference prediction mode, which is a reference prediction parameter, for each of the some encoding units, and a prediction regarding each prediction unit included in the encoding unit. A selection code obtained by encoding selection information, which is information for selecting either DC prediction or Planar prediction, in which a prediction mode as a parameter generates a prediction image from an average of pixel values of pixels around the prediction image. Included,
The prediction parameter decoding means derives a prediction mode for each prediction unit included in the partial coding unit from at least the reference prediction parameter obtained by decoding the encoded data. ,
When the reference prediction mode indicates either DC prediction or Planar prediction, the selection code is further decoded to derive a prediction mode for each prediction unit included in the partial coding unit. The image decoding device according to claim 3 or 4, wherein
The prediction parameter decoding means includes, for the part of the coding units, a reference prediction parameter that is referred to in order to estimate a prediction parameter related to the prediction unit in an adjacent coding unit adjacent to the part of the coding units. The image decoding device according to claim 3, wherein the image decoding device is derived from a prediction parameter related to each prediction unit.
The prediction image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
The prediction parameter decoding means selects a prediction mode having a minimum prediction mode ID from among the decoded prediction modes for each prediction unit included in an adjacent coding unit adjacent to the coding unit for the some coding units. The image decoding device according to claim 9, wherein the image decoding device is a reference prediction parameter.
The image decoding apparatus according to claim 4, wherein the accuracy of the reference prediction parameter recorded in the memory is set lower than the accuracy of the prediction parameter decoded by the prediction parameter decoding means.
The prediction image is generated by intra prediction, and the prediction parameter indicates a prediction mode in intra prediction,
12. The image decoding apparatus according to claim 11, wherein the accuracy of the prediction mode recorded in the memory is set lower than the accuracy of the prediction mode decoded by the prediction parameter decoding means.
Using the region obtained by dividing the coding unit as the prediction unit, refer to the prediction parameter, generate a prediction image for each prediction unit, and encode and encode the prediction residual obtained by subtracting the generated prediction image from the original image In an image encoding device that outputs data,
The prediction parameter for each prediction unit is estimated from the decoded prediction parameters for the prediction unit included in the adjacent coding unit adjacent to the coding unit to which the prediction unit belongs, and the prediction parameter for the prediction unit is determined as the prediction parameter. Includes a prediction parameter encoding means for encoding only when the estimated prediction parameter obtained by the above estimation does not match,
In order to estimate a prediction parameter for each prediction unit included in an adjacent coding unit adjacent to the coding unit among prediction parameters for each prediction unit included in the coding unit for at least a part of the coding units. The number of reference prediction parameters that can be referred to by the prediction parameter encoding means is set to be smaller than the number of prediction units adjacent to the adjacent encoding unit among the prediction units included in the encoding unit. An image encoding apparatus characterized by that.