WO2012077719A1

WO2012077719A1 - Image decoding device and image coding device

Info

Publication number: WO2012077719A1
Application number: PCT/JP2011/078332
Authority: WO
Inventors: 将伸八杉
Original assignee: シャープ株式会社
Priority date: 2010-12-09
Filing date: 2011-12-07
Publication date: 2012-06-14

Abstract

This moving image decoding device (1) generates a prediction image by a prediction from the pixel value of a reference pixel present in a different position within the same screen, and is provided with an intra prediction image generation unit (12c) which determines, with respect to each of the reference pixels, whether or not smoothing is performed between the reference pixel and another pixel adjacent thereto, performs the smoothing between the determined reference pixel and the other pixel, and regarding the smoothed reference pixel, generates a prediction image using the smoothed reference pixel.

Description

Image decoding apparatus and image encoding apparatus

The present invention relates to an image decoding apparatus that decodes encoded data, and an image encoding apparatus that generates encoded data.

In order to efficiently transmit or record a moving image, a moving image encoding device (image encoding device) that generates encoded data by encoding the moving image, and decoding the encoded data A video decoding device (image decoding device) that generates a decoded image is used. As a specific moving picture encoding method, for example, H.264 is used. H.264 / MPEG-4. AVC (Non-Patent Document 1), VCEG (Video Coding Expert Group) adopted by KTA software, which is a codec for joint development, and TMuC (Test Model Under Consideration) software, a successor codec. There are some methods.

In such an encoding method, an image (picture) constituting a moving image is a slice obtained by dividing an image, and a coding unit (Coding スライス Unit) obtained by dividing the slice. And a hierarchical structure consisting of blocks and partitions obtained by dividing an encoding unit, and is usually encoded block by block.

In such an encoding method, a predicted image is usually generated based on a local decoded image obtained by encoding / decoding an input image, and a difference image (“residual” between the predicted image and the input image is generated. (Sometimes referred to as “difference image” or “prediction residual”). As methods for generating a predicted image, methods called inter-screen prediction (inter prediction) and intra-screen prediction (intra prediction) are known.

In inter prediction, a prediction image in a prediction target frame is generated for each prediction unit by applying motion compensation using a motion vector to a reference image in a reference frame (decoded image) obtained by decoding the entire frame. Is done.

On the other hand, in intra prediction, predicted images in the frame are sequentially generated based on locally decoded images in the same frame. H. H.264 / MPEG-4. As an example of intra prediction used in AVC, any prediction direction is selected from prediction directions included in a predetermined prediction direction group for each prediction unit (for example, partition), and a locally decoded image is used. And a method of generating a pixel value on the prediction unit by extrapolating the pixel value of the reference pixel in the selected prediction direction (sometimes referred to as “basic prediction”).

Further, in Non-Patent Document 2, for each prediction unit, the edge direction is calculated based on the pixel values of pixels around the prediction unit, and the pixel value of the reference pixel in the local decoded image is out of the calculated edge direction. A method for generating a pixel value on the prediction unit by inserting (a method called Differential Coding of Intra Modes (DCIM), sometimes called “edge prediction” or “edge-based prediction”) is disclosed. .

Also, as described in Non-Patent Document 3, H. H.264 / MPEG-4. In AVC 8 × 8 intra-frame (intra) predictive coding, before generating pixel values on a prediction unit, pixel values of pixels that are in contact with the upper and left sides of the prediction unit, which are reference pixels, are smoothed ( smoothing). This eliminates block distortion that may occur at the boundary between 4 × 4 pixel prediction units when two 4 × 4 pixel prediction units are adjacent to the upper or left side of the 8 × 8 pixel target prediction unit. can do. Therefore, according to the technique described in Non-Patent Document 3, the pixel value including the block distortion is used as it is for the prediction, thereby preventing the pixel value of the pixel on the prediction unit from containing a large error. Can do.

Specific description will be given with reference to FIG. As shown in FIG. 10, a 3-tap filter whose filter coefficients are (1/4, 1/2, 1/4) in order to remove high-frequency components from neighboring pixels in contact with the upper and left sides of the prediction unit 800. Is applied to perform smoothing. For example, when the pixel 800b in FIG. 10 is smoothed, if the pixel values after the pixels 800a, b, and c are encoded and decoded are Q1, Q2, and Q3, the pixel value after the smoothing of the pixel 800b is (Q1 / 4 + Q2 / 2 + Q3 / 4). This is performed for all peripheral pixels in contact with the upper and left sides of the prediction unit. Note that the same smoothing process is performed for the upper left, upper right, and lower right pixels.

Also, Non-Patent Document 4 describes a technique for switching ON / OFF of smoothing by adaptively specifying whether smoothing is performed for each prediction unit in TMuC. For example, as shown in FIG. 11, whether to perform smoothing for each of the prediction units 401 to 407 is designated by a flag. In the case shown in FIG. 11, the

prediction units

401, 402, and 407 are smoothed, and the prediction units 403 to 406 are not smoothed. In the encoder, whether to perform smoothing is compared with the case where smoothing is performed for each prediction unit and the case where smoothing is not performed, and the one with lower RD (Rate-Distortion) cost is selected. Then, the presence or absence of smoothing is designated by a flag. The decoder switches the smoothing ON / OFF based on the flag.

Non-Patent Documents

5 and 6 describe a technique for switching ON / OFF of smoothing according to the size of a prediction unit and a prediction direction (prediction mode) without using a flag in TMuC. . Specifically, the prediction direction for smoothing is determined in advance for each size of the prediction unit, and smoothing is performed when this is true, and smoothing is not performed when this is not the case.

As described above, a plurality of techniques for smoothing reference pixels referred to in intra prediction have been disclosed, but all have problems. Since TMuC can take various sizes as a prediction unit, even if the technique disclosed in Non-Patent Document 3 is used, block distortion that may occur at the boundary between prediction units adjacent to the target prediction unit. There is a problem that it cannot be removed properly. Even if the techniques disclosed in Non-Patent Documents 4 to 6 are used, there is a possibility that the edges inherent in the encoding target image may be smoothed, so that the encoding efficiency is not improved. Alternatively, there is a problem that the encoding efficiency is not improved as expected.

The present invention has been made in view of the above-described problems, and an object of the present invention is to realize an image decoding apparatus or the like that performs smoothing more appropriately than the conventional technique on reference pixels used for intra prediction. It is in.

In order to solve the above-described problem, the image decoding apparatus according to the present invention, for each prediction unit, calculates a pixel value of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen. An image decoding apparatus that generates a decoded image by adding the generated predicted image to a prediction residual decoded from encoded data, and for each reference pixel adjacent to the reference pixel. A reference pixel smoothing determination unit that determines whether or not to perform smoothing with another pixel, and a reference pixel that the reference pixel smoothing determination unit determines to perform smoothing as the other pixel. Smoothing means that performs smoothing between the reference pixel and the reference pixel that is determined to be smoothed by the reference pixel smoothing determination means among the reference pixels, after the smoothing means smoothes the reference pixel To generate the predicted image Is characterized in that it comprises a, the predicted image generating means for.

According to said structure, whether the reference pixel used for the intra prediction which predicts the pixel value of the pixel in the said prediction image from the pixel value of the reference pixel in another position in the same screen for every prediction unit is smoothed Is determined, and only the reference pixels determined to be smoothed are smoothed to generate a predicted image.

Therefore, it is possible to determine whether to perform smoothing for each reference pixel. Therefore, as in the prior art, whether or not smoothing is determined for each prediction unit results in smoothing for all reference pixels of the prediction unit, or smoothing for all reference pixels. It is possible to prevent the absence.

In addition, since it is possible to determine whether or not smoothing is performed for each reference pixel, it is possible to generate a predicted image with higher accuracy without performing smoothing to reference pixels that do not require smoothing.

Therefore, it is possible to perform smoothing more appropriately for the reference pixels used for intra prediction than before.

Note that the prediction unit may be a PU described in the embodiment or a partition obtained by dividing the PU.

In order to solve the above-described problem, the image encoding device according to the present invention, for each prediction unit, calculates a pixel of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen. An image encoding apparatus that generates encoded data by encoding a prediction residual between a generated predicted image and an original image, by predicting a value, and for each of the reference pixels A reference pixel smoothing determining unit that determines whether or not to perform smoothing between the pixel and another adjacent pixel, and the reference pixel determined by the reference pixel smoothing determining unit to perform the smoothing Smoothing means that performs smoothing between the reference pixel and reference pixels that are determined to be smoothed by the reference pixel smoothing determination means among the reference pixels after the smoothing means performs smoothing. Prediction using the reference pixels It is characterized in that it comprises the predicted image generating means for generating an image, a.

As described above, the image decoding apparatus according to the present invention is a reference pixel smoothing determination unit that determines whether or not to perform smoothing between each reference pixel and another pixel adjacent to the reference pixel. A smoothing means for smoothing the reference pixel determined to be smoothed by the reference pixel smoothing determining means with the other pixels, and among the reference pixels, the reference pixel smoothing determining means The reference pixel determined to be smoothed is configured to include a predicted image generation unit that generates the predicted image using the reference pixel after smoothing by the smoothing unit.

Therefore, it is possible to determine whether to perform smoothing for each reference pixel. Therefore, as in the prior art, whether or not smoothing is determined for each prediction unit results in smoothing for all reference pixels of the prediction unit, or smoothing for all reference pixels. There is an effect that it can be prevented.

Further, since the presence or absence of smoothing can be determined for each reference pixel, it is possible to generate a predicted image with higher accuracy without smoothing to reference pixels that do not require smoothing. There is an effect.

Therefore, there is an effect that smoothing can be performed more appropriately than in the past with respect to reference pixels used for intra prediction.

In addition, since it is not necessary to encode a flag indicating whether smoothing is performed for each prediction unit as in the prior art, there is an effect that the encoding efficiency can be improved.

1, showing an embodiment of the present invention, is a block diagram illustrating a main configuration of a moving picture decoding apparatus. FIG. FIG. 2A shows a data structure of encoded data referred to by the moving picture decoding apparatus, FIG. 2A shows a configuration of a picture layer of the encoded data, and FIG. FIG. 2C is a diagram illustrating a configuration of a slice layer included in the picture layer, FIG. 2C is a diagram illustrating a configuration of each CU configuring the LCU layer included in the slice layer, and FIG. FIG. 2E is a diagram illustrating a configuration of a leaf CU included in a layer, FIG. 2E is a diagram illustrating a configuration of inter prediction information for the leaf CU, and FIG. 2F is intra prediction information for a leaf CU. FIG. It is a figure for demonstrating the reference image which performs smoothing. It is a flowchart which shows the flow which determines whether smoothing is performed about a reference pixel. It is a flowchart which shows the flow which determines whether smoothing is performed about a reference pixel. It is a flowchart which shows the flow which determines whether smoothing is performed about a reference pixel. FIGS. 7A and 7B are diagrams for explaining the operation of the video decoding device, in which FIG. 7A illustrates a prediction mode referred to by the video decoding device together with a prediction mode index, and FIG. FIG. 4 is a diagram illustrating pixels belonging to a target partition and decoded pixels in the vicinity thereof. FIG. 8A is a diagram for describing intra-prediction image generation processing when the edge-based prediction mode is selected in the video decoding device, and FIG. 8A illustrates a target partition as a partition around the target partition. FIG. 8B is a diagram illustrating parameters for specifying the correction angle together with the predicted direction after correction. It is a block diagram which shows the principal part structure of the moving image encoder which concerns on this invention. It is a figure which shows a prior art and is an explanatory view of the method of smoothing a reference pixel. It is a figure which shows a prior art and shows whether smoothing is performed for every block. It is a figure for demonstrating that a moving image decoding apparatus and a moving image coding apparatus can be utilized for transmission / reception of a moving image, (a) is the block diagram which showed the structure of the transmission device carrying a moving image coding apparatus (B) is a block diagram showing a configuration of a receiving apparatus equipped with a moving picture decoding apparatus. It is a figure for demonstrating that a moving image decoding apparatus and a moving image encoding apparatus can be utilized for recording and reproduction | regeneration of a moving image, (a) shows the structure of the recording device carrying the moving image encoding apparatus 2. FIG. 8B is a block diagram illustrating a configuration of a playback device equipped with a video decoding device.

Referring to FIGS. 1 to 9, an embodiment of the present invention will be described as follows. A moving image decoding apparatus (image decoding apparatus) 1 according to the present embodiment decodes a moving image from encoded data. In addition, the moving image encoding apparatus according to the present embodiment generates encoded data by encoding a moving image. However, the scope of application of the present invention is not limited to this. That is, as will be apparent from the following description, the feature of the present invention lies in intra prediction, and is established without assuming a plurality of frames. That is, the present invention can be applied to a general decoding apparatus and a general encoding apparatus regardless of whether the target is a moving image or a still image.

(Configuration of encoded data # 1)
Prior to the description of the moving picture decoding apparatus 1 according to the present embodiment, the configuration of the encoded data # 1 generated by the moving picture encoding apparatus 2 according to the present embodiment and decoded by the moving picture decoding apparatus 1 will be described with reference to FIG. Will be described with reference to FIG. The encoded data # 1 has a hierarchical structure including a sequence layer, a GOP (Group Of Pictures) layer, a picture layer, a slice layer, and a maximum coding unit (LCU) layer.

FIG. 2 shows the hierarchical structure below the picture layer in the encoded data # 1. FIGS. 2A to 2F are a picture layer P, a slice layer S, an LCU layer LCU, a leaf CU included in the LCU (denoted as CUL in FIG. 2D), and inter prediction (inter-screen prediction), respectively. It is a figure which shows the structure of inter prediction information PI_Inter which is the prediction information PI about a partition, and intra prediction information PI_Intra which is the prediction information PI about an intra prediction (prediction in a screen) partition.

(Picture layer)
The picture layer P is a set of data that is referenced by the video decoding device 1 in order to decode a target picture that is a processing target picture. As shown in FIG. 2A, the picture layer P includes a picture header PH and slice layers S1 to SNs (Ns is the total number of slice layers included in the picture layer P).

The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture. For example, the encoding mode information (entropy_coding_mode_flag) indicating the variable length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH. When entropy_coding_mode_flag is 0, the picture is encoded by CAVLC (Context-based Adaptive Variable Length Coding). When entropy_coding_mode_flag is 1, the picture is encoded by CABAC (Context-based Adaptive Binary Arithmetic Coding). It has become.

(Slice layer)
Each slice layer S included in the picture layer P is a set of data referred to by the video decoding device 1 in order to decode a target slice that is a slice to be processed. As shown in FIG. 2B, the slice layer S includes a slice header SH and LCU layers LCU1 to LCUn (Nc is the total number of LCUs included in the slice S).

The slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH. Further, the slice header SH includes a filter parameter FP that is referred to by a loop filter included in the video decoding device 1.

As slice types that can be specified by the slice type specification information, (1) I slice using only intra prediction at the time of encoding, and (2) P using unidirectional prediction or intra prediction at the time of encoding. Slice, (3) B-slice using unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.

(LCU layer)
Each LCU layer LCU included in the slice layer S is a set of data that the video decoding device 1 refers to in order to decode the target LCU that is the processing target LCU.

The LCU layer LCU is composed of a plurality of coding units (CU: Coding Units) obtained by hierarchically dividing the LCU into a quadtree. In other words, the LCU layer LCU is a coding unit corresponding to the highest level in a hierarchical structure that recursively includes a plurality of CUs. As shown in FIG. 2C, each CU included in the LCU layer LCU has a hierarchical structure that recursively includes a CU header CUH and a plurality of CUs obtained by dividing the CU into quadtrees. is doing.

The size of each CU excluding the LCU is half the size of the CU to which the CU directly belongs (that is, the CU one layer higher than the CU), and the size that each CU can take is encoded data # 1. Dependent on the size and hierarchical depth of the LCU included in the sequence parameter set SPS. For example, when the size of the LCU is 128 × 128 pixels and the maximum hierarchical depth is 5, the CUs in the hierarchical level below the LCU have five sizes, that is, 128 × 128 pixels and 64 × 64 pixels. , 32 × 32 pixels, 16 × 16 pixels, and 8 × 8 pixels. A CU that is not further divided is called a leaf CU.

(CU header)
The CU header CUH includes a coding parameter referred to by the video decoding device 1 in order to determine a decoding method of the target CU. Specifically, as shown in FIG. 2C, a CU division flag SP_CU that specifies whether or not the target CU is further divided into four subordinate CUs is included. When the CU division flag SP_CU is 0, that is, when the CU is not further divided, the CU is a leaf CU.

(Leaf CU)
A CU (CU leaf) that is not further divided is handled as a prediction unit PU (Prediction Unit) and a transform unit TU (Transform Unit).

As shown in FIG. 2 (d), the leaf CU (denoted as CUL in FIG. 2 (d)) includes (1) PU information PUI that is referred to when the moving image decoding apparatus 1 generates a predicted image, and (2) The TU information TUI that is referred to when the residual data is decoded by the moving picture decoding apparatus 1 is included.

The skip flag SKIP is a flag indicating whether or not the skip mode is applied to the target PU. When the value of the skip flag SKIP is 1, that is, when the skip mode is applied to the target leaf, PU information PUI and TU information TUI in the leaf CU are omitted. Note that the skip flag SKIP is omitted for the I slice.

The PU information PUI includes a skip flag SKIP, prediction type information PT, and prediction information PI as shown in FIG. The prediction type information PT is information that specifies whether intra prediction or inter prediction is used as a predicted image generation method for the target leaf CU (target PU). The prediction information PI includes intra prediction information PI_Intra or inter prediction information PI_Inter depending on which prediction method is specified by the prediction type information PT. Hereinafter, a PU to which intra prediction is applied is also referred to as an intra PU, and a PU to which inter prediction is applied is also referred to as an inter PU.

The PU information PUI includes information specifying the shape and size of each partition included in the target PU and the position in the target PU. Here, the partition is one or a plurality of non-overlapping areas constituting the target leaf CU, and the generation of the predicted image is performed in units of partitions.

As shown in FIG. 2D, the TU information TUI specifies a quantization parameter difference Δqp (tu_qp_delta) that specifies the magnitude of the quantization step, and a division pattern for each block of the target leaf CU (target TU). TU partition information SP_TU and quantized prediction residuals QD1 to QDNT (NT is the total number of blocks included in the target TU) are included.

The quantization parameter difference Δqp is a difference qp−qp ′ between the quantization parameter qp in the target TU and the quantization parameter qp ′ in the TU encoded immediately before the TU.

TU partition information SP_TU is information that specifies the shape and size of each block included in the target TU and the position in the target TU. Each TU can be, for example, a size from 64 × 64 pixels to 2 × 2 pixels. Here, the block is one or a plurality of non-overlapping areas constituting the target leaf CU, and encoding / decoding of the prediction residual is performed in units of blocks.

Each quantized prediction residual QD is encoded data generated by the moving image encoding apparatus 2 performing the following processes 1 to 3 on a target block that is a processing target block. Process 1: DCT transform (Discrete Cosine Transform) is performed on the prediction residual obtained by subtracting the prediction image from the encoding target image. Process 2: The DCT coefficient obtained in Process 1 is quantized. Process 3: The DCT coefficient quantized in Process 2 is variable length encoded. The quantization parameter qp described above represents the magnitude of the quantization step QP used when the moving picture coding apparatus 2 quantizes the DCT coefficient (QP = 2 ^{qp / 6} ).

(Inter prediction information PI_Inter)
The inter prediction information PI_Inter includes a coding parameter that is referred to when the video decoding device 1 generates an inter prediction image by inter prediction. As shown in FIG. 2 (e), the inter prediction information PI_Inter includes inter PU partition information SP_Inter that specifies a partition pattern for each partition of the target PU, and inter prediction parameters PP_Inter1 to PP_InterNe (Ne for each partition). The total number of inter prediction partitions included in the target PU).

Specifically, the inter-PU partition information SP_Inter is information for designating the shape and size of each inter prediction partition included in the target PU (inter PU) and the position in the target PU.

The inter PU is composed of four symmetric splittings of 2N × 2N pixels, 2N × N pixels, N × 2N pixels, and N × N pixels, and 2N × nU pixels, 2N × nD pixels, and nL × 2N. It is possible to divide into 8 types of partitions in total by four asymmetric splits of pixels and nR × 2N pixels. Here, the specific value of N is defined by the size of the CU to which the PU belongs, and the specific values of nU, nD, nL, and nR are determined according to the value of N. For example, an inter PU of 128 × 128 pixels is 128 × 128 pixels, 128 × 64 pixels, 64 × 128 pixels, 64 × 64 pixels, 128 × 32 pixels, 128 × 96 pixels, 32 × 128 pixels, and 96 × It is possible to divide into 128-pixel inter prediction partitions.

(Inter prediction parameter)
As illustrated in FIG. 2E, the inter prediction parameter PP_Inter includes a reference image index RI, an estimated motion vector index PMVI, and a motion vector residual MVD.

(Intra prediction information PI_Intra)
The intra prediction information PI_Intra includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction. As shown in FIG. 2 (f), the intra prediction information PI_Intra includes intra PU partition information SP_Intra that specifies a partition pattern of the target PU (intra PU) into each partition, and intra prediction parameters PP_Intra1 to PP_IntraNa for each partition. (Na is the total number of intra prediction partitions included in the target PU).

Specifically, the intra-PU partition information SP_Intra is information that specifies the shape and size of each intra-predicted partition included in the target PU, and the position in the target PU. The intra PU split information SP_Intra includes an intra split flag (intra_split_flag) that specifies whether or not the target PU is split into partitions. If the intra partition flag is 1, the target PU is divided symmetrically into four partitions. If the intra partition flag is 0, the target PU is not divided and the target PU itself is one partition. Are treated as Therefore, if the size of the target PU is 2N × 2N pixels, the intra prediction partition can take any of 2N × 2N pixels (no division) and N × N pixels (four divisions) (where, N = 2 ⁿ , n is an arbitrary integer of 1 or more). For example, a 128 × 128 pixel intra PU can be divided into 128 × 128 pixel and 64 × 64 pixel intra prediction partitions.

(Intra prediction parameter PP_Intra)
As shown in FIG. 2F, the intra prediction parameter PP_Intra includes an estimation flag MPM, a residual prediction mode index RIPM, and an additional index AI. The intra prediction parameter PP_Intra is a parameter for designating an intra prediction method (prediction mode) for each partition.

The estimation flag MPM is a flag indicating whether or not the prediction mode estimated based on the prediction mode allocated to the peripheral partition of the target partition that is the processing target is the same as the prediction mode for the target partition. . Here, examples of partitions around the target partition include a partition adjacent to the upper side of the target partition and a partition adjacent to the left side of the target partition.

The residual prediction mode index RIPM is an index included in the intra prediction parameter PP_Intra when the estimated prediction mode and the prediction mode for the target partition are different, and is an index for designating a prediction mode assigned to the target partition. It is.

The additional index AI is an index for specifying the intra prediction method for the target partition in more detail when the prediction mode assigned to the target partition is a predetermined prediction mode.

(Moving picture decoding apparatus 1)
Hereinafter, the moving picture decoding apparatus 1 according to the present embodiment will be described with reference to FIGS. The moving picture decoding apparatus 1 includes H.264 as a part thereof. H.264 / MPEG-4. Decoding device including technology adopted in KTA software which is a codec for joint development in AVC and VCEG (Video Coding Expert Group), and technology adopted in TMuC (Test Model under Consideration) software which is a successor codec It is.

FIG. 1 is a block diagram showing a configuration of the moving picture decoding apparatus 1. As shown in FIG. 1, the moving image decoding apparatus 1 includes a variable length code decoding unit 11, a predicted image generation unit 12, an inverse quantization / inverse conversion unit 13, an adder 14, a frame memory 15, and a loop filter 16. I have. As shown in FIG. 1, the predicted image generation unit 12 includes a motion vector restoration unit 12a, an inter predicted image generation unit 12b, an intra predicted image generation unit (reference pixel smoothing determination unit, smoothing unit, predicted image generation unit). ) 12c and a prediction method determination unit 12d. The moving picture decoding apparatus 1 is an apparatus for generating moving picture # 2 by decoding encoded data # 1.

(Variable-length code decoding unit 11)
The variable length code decoding unit 11 decodes the prediction parameter PP related to each partition from the encoded data # 1, and supplies the decoded prediction parameter PP to the predicted image generation unit 12. Specifically, for the inter prediction partition, the variable-length code decoding unit 11 receives the inter prediction parameter PP_Inter including the reference image index RI, the estimated motion vector index PMVI, and the motion vector residual MVD from the encoded data # 1. These are decoded and supplied to the motion vector restoration unit 12a. On the other hand, for the intra prediction partition, the intra prediction parameter PP_Intra including the estimation flag MPM, the residual index RIPM, and the additional index AI is decoded from the encoded data # 1, and these are supplied to the intra prediction image generation unit 12c. In addition, the variable length code decoding unit 11 supplies size designation information for designating the size of the partition to the intra predicted image generation unit 12c (not shown).

Also, the variable length code decoding unit 11 decodes the prediction type information PT for each partition from the encoded data # 1, and supplies this to the prediction method determination unit 12d. Furthermore, the variable-length code decoding unit 11 decodes the quantization prediction residual QD for each block and the quantization parameter difference Δqp for the TU including the block from the encoded data # 1, and dequantizes and reverses them. This is supplied to the conversion unit 13. Further, the variable length code decoding unit 11 decodes the filter parameter FP from the encoded data # 1 and supplies this to the loop filter 16.

As a specific decoding method by the variable-length code decoding unit 11, CABAC (Context-based Adaptive Binary Arithmetic Coding) which is one arithmetic coding / decoding method, or one non-arithmetic encoding / decoding method is used. A certain CAVLC (Context-based Adaptive Variable Variable Length Coding) is used. Here, CABAC is an encoding / decoding scheme that performs adaptive binary arithmetic coding based on context, and CALVC is an encoding / decoding scheme that uses a set of variable length codes that adaptively switch contexts. It is. CABAC has a larger code amount reduction effect than CAVLC, but also has an aspect of increasing the processing amount.

The variable length code decoding unit 11 refers to the encoding mode information (entropy_coding_mode_flag) included in the picture header PH of the encoded data # 1 to determine whether the target picture has been encoded by CABAC or by CAVLC. Can be identified. In addition, the variable length code decoding unit 11 decodes the target picture using a decoding method corresponding to the identified encoding method.

(Predicted image generation unit 12)
The predicted image generation unit 12 identifies whether each partition is an inter prediction partition for performing inter prediction or an intra prediction partition for performing intra prediction based on the prediction type information PT for each partition. In the former case, the inter prediction image Pred_Inter is generated, and the generated inter prediction image Pred_Inter is supplied to the adder 14 as the prediction image Pred. In the latter case, the intra prediction image Pred_Intra is generated, The generated intra predicted image Pred_Intra is supplied to the adder 14. Note that, when the skip mode is applied to the processing target PU, the predicted image generation unit 12 omits decoding of other parameters belonging to the PU.

(Motion vector restoration unit 12a)
The motion vector restoration unit 12a restores the motion vector mv related to each inter prediction partition from the motion vector residual MVD related to that partition and the restored motion vector mv ′ related to another partition. Specifically, (1) the estimated motion vector pmv is derived from the restored motion vector mv ′ according to the estimation method specified by the estimated motion vector index PMVI, and (2) the derived estimated motion vector pmv and the motion vector remaining are derived. The motion vector mv is obtained by adding the difference MVD. It should be noted that the restored motion vector mv ′ relating to other partitions can be read from the frame memory 15. The motion vector restoration unit 12a supplies the restored motion vector mv to the inter predicted image generation unit 12b together with the corresponding reference image index RI.

(Inter prediction image generation unit 12b)
The inter prediction image generation unit 12b generates a motion compensated image mc related to each inter prediction partition by inter-screen prediction. Specifically, using the motion vector mv supplied from the motion vector restoration unit 12a, the motion compensation image mc from the filtered decoded image P_ALF ′ designated by the reference image index RI supplied from the motion vector restoration unit 12a. Is generated. Here, the filtered decoded image P_ALF ′ is an image obtained by performing the filtering process by the loop filter 16 on the decoded image that has already been decoded for the entire frame, and the inter predicted image generation unit 12b. Can read out the pixel value of each pixel constituting the filtered decoded image P_ALF ′ from the frame memory 15. The motion compensated image mc generated by the inter predicted image generation unit 12b is supplied to the prediction method determination unit 12d as an inter predicted image Pred_Inter.

(Intra predicted image generation unit 12c)
The intra predicted image generation unit 12c generates a predicted image Pred_Intra related to each intra prediction partition. Specifically, first, a prediction mode is specified based on the intra prediction parameter PP_Intra supplied from the variable length code decoding unit 11, and the specified prediction mode is assigned to the target partition in, for example, raster scan order. Subsequently, a predicted image Pred_Intra is generated from the (local) decoded image P by intra prediction according to the prediction method indicated by the prediction mode. When performing intra prediction, whether or not smoothing is necessary is determined for the neighboring pixels of the target partition. And about the pixel determined to perform smoothing, after performing smoothing, intra prediction is performed.

Specifically, this will be described with reference to FIG. As shown in FIG. 3, it is determined whether or not to smooth the reference pixels 301a to 301q used for generating a prediction image of the prediction unit (target partition) 301. This determination is made, for example, depending on whether the reference pixel is a pixel near the boundary of a block such as PU or CU. Then, smoothing is performed only for pixels near the boundary. Here, the reference pixels 301c, d, e, and f which are pixels within two pixels from the boundary between the

adjacent blocks

302 and 303 are smoothed. Since high-frequency distortion tends to occur at the block boundary, if only pixels near the boundary are smoothed, distortion can be reduced and distortion can be prevented from propagating to the predicted image. Therefore, the image quality can be improved. In addition, since it is not necessary to encode a flag indicating whether smoothing is performed in units of blocks, it is possible to improve encoding efficiency.

In FIG. 3, for example, when the boundary of the adjacent block is between the

reference pixels

301m and 301n, the reference pixels 301l, m, n, and o which are pixels within two pixels from the boundary of the adjacent block are smoothed. .

Then, the intra predicted image Pred_Intra generated by the intra predicted image generating unit 12c is supplied to the prediction method determining unit 12d. Note that the intra predicted image generation unit 12c may be configured to generate the predicted image Pred_Intra from the filtered decoded image P_ALF by intra prediction. The smoothing necessity determination process will be described later.

(Prediction method determination unit 12d)
The prediction method determination unit 12d determines whether each partition is an inter prediction partition that should perform inter prediction or an intra prediction partition that should perform intra prediction based on the prediction type information PT about the PU to which each partition belongs. To do. In the former case, the inter prediction image Pred_Inter generated by the inter prediction image generation unit 12b is supplied to the adder 14 as the prediction image Pred. In the latter case, the inter prediction image generation unit 12c generates the inter prediction image Pred_Inter. The intra predicted image Pred_Intra that has been processed is supplied to the adder 14 as the predicted image Pred.

(Inverse quantization / inverse transform unit 13)
The inverse quantization / inverse transform unit 13 (1) inversely quantizes the quantized prediction residual QD, (2) performs inverse DCT (Discrete Cosine Transform) transform on the DCT coefficient obtained by the inverse quantization, and (3) The prediction residual D obtained by the inverse DCT transform is supplied to the adder 14. When the quantization prediction residual QD is inversely quantized, the inverse quantization / inverse transform unit 13 derives the quantization step QP from the quantization parameter difference Δqp supplied from the variable length code decoding unit 11. The quantization parameter qp can be derived by adding the quantization parameter difference Δqp to the quantization parameter qp ′ relating to the TU that has been inversely quantized / inversely DCT transformed immediately before, and the quantization step QP is derived from the quantization step qp, for example, QP = 2 ^{pq / 6} . The generation of the prediction residual D by the inverse quantization / inverse transform unit 13 is performed in units of blocks obtained by dividing TUs or TUs.

(Adder 14)
The adder 14 generates the decoded image P by adding the prediction image Pred supplied from the prediction image generation unit 12 and the prediction residual D supplied from the inverse quantization / inverse conversion unit 13. The generated decoded image P is stored in the frame memory 15.

(Loop filter 16)
The loop filter 16 reads the decoded image P from the frame memory 15 and performs block noise reduction processing (deblocking processing) at one or both of the partition boundary and the block boundary of the decoded image P. In addition, the loop filter 16 performs adaptive filter processing using the filter parameter FP decoded from the encoded data # 1 on the decoded image subjected to the block noise reduction processing, and the adaptive filter processing Is output to the frame memory 15 as a filtered decoded image P_ALF.

(Smoothing necessity determination processing by the intra predicted image generation unit 12c)
Next, smoothing necessity determination processing in the intra predicted image generation unit 12c will be described with reference to FIG. The smoothing necessity determination process is performed on all reference pixels used for generating a predicted image, that is, a pixel in contact with either the upper side or the left side of the target partition, and an upper left pixel of the target partition (the upper left vertex of the target partition This is performed for the pixels to be shared (S31).

First, with respect to a pixel to be determined, it is determined whether or not the pixel exists in the vicinity of the boundary between adjacent blocks in contact with the target partition (S32). Whether or not the boundary is near is determined by whether or not it is within a predetermined distance d _th pixels from the boundary. The adjacent block (unit area) may be any of PU (prediction unit), CU (processing unit), TU (transform unit), and the partition obtained by dividing PU and TU Any of the obtained blocks may be used. Further, these may be used in combination. Further, only the boundary where the boundary between the PU and the TU overlaps may be a boundary used for the determination process.

If it is determined that the pixel is near the boundary (YES in S32), the intra predicted image generation unit 12c smoothes the reference pixel (S33). Smoothing is performed, for example, by applying a smoothing filter of 1: 2: 1. Specifically, for the pixel value p (x, y) of the reference pixel existing at (x, y), the pixel value after smoothing is p (x, y) = (p (x−1, y)) + P (x, y) × 2 + p (x + 1, y)) / 4.

Thereafter, when the above process is completed for all the reference pixels (S34), the intra predicted image generation unit 12c performs a predicted image generation process.

(Smoothing necessity determination process 2)
Next, smoothing necessity determination processing of the intra predicted image generation unit 12c by a method different from the above will be described with reference to FIG.

As shown in FIG. 5, the smoothing necessity determination process is performed for all reference pixels used for generating a predicted image, that is, a pixel in contact with either the upper side or the left side of the target partition, and the upper left pixel ( This is performed for pixels sharing the top left vertex of the target partition (S41). And the intra estimated image generation part 12c determines whether the pixel concerned exists in the boundary vicinity of the adjacent blocks which are in contact with the object partition about the pixel used as a decision object (S42). This determination process is performed in the same manner as in step S32 described above.

Next, the intra predicted image generation unit 12c determines whether or not the block boundary strength (Bs value: Boundary Strength) of the deblocking process to be performed later is greater than or equal to the threshold for the pixel to be determined (S43). As the threshold value, for example, Bs value = 4 can be set.

Since the deblocking process is a well-known technique, the detailed description is omitted, but the Bs value is set in five stages from 0 to 4. This is set as follows. When the pixels sandwiching the block boundary are the pixel p and the pixel q, respectively, when at least one of the pixel p or the pixel q belongs to the intra (in-screen) CU and is located at the boundary of the CU , Bs = 4. Here, the intra CU indicates a leaf CU encoded using intra prediction.

Further, when either one of the pixel p and the pixel q belongs to the intra CU but is not located at the boundary of the CU, Bs = 3.

Further, when neither the pixel p nor the pixel q belongs to the intra CU, and one of the transform blocks to which the pixel p or the pixel q belongs has an orthogonal transform coefficient, Bs = 2.

Also, neither the pixel p nor the pixel q belongs to the intra CU, and neither of the transform blocks to which the pixel p and the pixel q belong has an orthogonal transform coefficient, but the reference picture is different or the number of reference pictures Or when the motion vector values differ by a predetermined threshold value or more, Bs = 1.

Further, neither the pixel p nor the pixel q belongs to the CU, and neither the pixel p nor the pixel q has an orthogonal transformation coefficient, the reference picture is the same, and the value of the motion vector is less than a predetermined threshold value. In this case, Bs = 0.

If the Bs value is equal to or greater than the threshold (YES in S43), the intra predicted image generation unit 12c smoothes the reference pixel (S44). The smoothing process is performed by the same method as in step S33 described above.

After that, when the above process is completed for all the reference pixels (S45), the intra predicted image generation unit 12c performs a predicted image generation process.

In addition to this, for example, the intra prediction information PI_Intra includes a flag (smoothing information) indicating whether or not smoothing is performed on the pixels near the boundary, thereby determining whether or not smoothing is performed. You may decide.

(Smoothing necessity determination process 3)
Next, smoothing necessity determination processing of the intra predicted image generation unit 12c by a method different from the above will be described with reference to FIG.

As shown in FIG. 6, smoothing necessity determination processing is performed for all reference pixels used for generating a predicted image, that is, a pixel in contact with either the upper side or the left side of the target partition, and the upper left pixel ( This is performed for pixels that share the top left vertex of the target partition (S51). Then, the intra predicted image generation unit 12c determines whether the pixel to be determined is an edge pixel (S52).

For example, when the pixel (x, y) to be determined exists on the upper side of the target partition, the determination as to whether or not the pixel is an edge pixel is as follows: 3 × 3 pixels centered on the pixel (x, y−1) It is determined by whether or not the value to which the Sobel filter as shown is applied is greater than or equal to the threshold value.

When the intra predicted image generation unit 12c determines that the pixel to be determined is an edge pixel (YES in S52), the intra predicted image generation unit 12c smoothes the pixel (S53). The smoothing process is performed by the same method as in step S33 described above.

After that, when the above process is completed for all the reference pixels (S54), the intra predicted image generation unit 12c performs a predicted image generation process.

Note that if the pixel to be determined is not an edge caused by block distortion but a pixel corresponding to an edge inherent in the encoding target image, it is not necessary to perform smoothing. Therefore, smoothing may be performed when the value to which the Sobel filter is applied is equal to or greater than the first threshold and equal to or less than the second threshold. Here, the second threshold value can be determined in advance according to the strength of the edge of the encoding target image.

Also, instead of using a Sobel filter for edge pixel determination, the right and left (when touching the upper side of the prediction unit) or the upper and lower (when touching the left side of the prediction unit) of the pixel to be determined You may determine by the difference of a pixel value being more than a threshold value. For example, when determining the pixel (x, y) in contact with the upper side of the prediction unit, the pixel (x, y) that satisfies | p (x−1, y) −p (x + 1, y) | ≧ threshold Th is determined. You may determine with an edge pixel.

(Generation process of intra prediction image Pred_Intra by intra prediction image generation unit 12c)
Next, generation processing of the intra predicted image Pred_Intra by the intra predicted image generation unit 12c will be described with reference to FIG. FIG. 7 is a diagram for explaining the operation of the video decoding device 1, and FIG. 7A is a prediction mode referred to by the video decoding device 1, and includes a plurality of basic prediction modes, and FIG. 7B is a diagram illustrating a prediction mode included in an extended set including one edge-based prediction mode together with a prediction mode index, and FIG. 7B is a diagram illustrating pixels belonging to the target partition and decoded pixels in the vicinity thereof. It is.

The intra-predicted image generation unit 12c (1) a basic prediction mode that designates one or more predetermined prediction directions and DC prediction, and (2) calculation using pixel values around the target partition Prediction mode that determines the prediction direction by, for example, an edge whose prediction direction is the edge direction calculated from pixel values around the target partition (or the direction represented by the sum of the angle indicated by the edge direction and the correction angle) Based on the prediction mode specified by the intra prediction parameter PP_Intra among the base prediction modes, the intra prediction image Pred_Intra in the target partition is generated.

In other words, the intra-predicted image generation unit 12c is designated by an intra-prediction parameter PP_Intra from a set of prediction modes (hereinafter also referred to as “extended set”) including one or a plurality of basic prediction modes and an edge-based prediction mode. A prediction mode to be selected is selected, and an intra prediction image Pred_Intra in the target partition is generated based on the selected prediction mode.

Hereinafter, a set of basic prediction modes that specify one or more predetermined prediction directions and DC prediction is also referred to as a basic prediction mode set. That is, the extended set includes a prediction mode included in the basic prediction mode set and an edge-based prediction mode.

FIG. 7A is a diagram showing each prediction mode included in the extended set together with a prediction mode index assigned to each prediction mode. FIG. 7A shows each direction prediction mode belonging to the basic prediction mode set and the prediction direction indicated by each direction prediction mode. As shown in FIG. 7A, the edge-based prediction mode is specified by index 1, and the DC prediction mode included in the basic prediction mode is specified by index 0, and each directional prediction mode included in the basic prediction mode set. Are specified by indices 2-9.

Note that the information indicating the correspondence between each index and each prediction mode, and the information indicating the correspondence between each direction prediction mode belonging to the basic prediction mode set and each prediction direction are moving images that generate encoded data # 1. A common configuration can be used for both the image encoding device and the moving image decoding device 1 that decodes the encoded data # 1. The moving picture decoding apparatus 1 stores such information in its own memory, and whether the prediction mode specified by the decoded index is the edge-based prediction mode, the DC prediction mode, or It is possible to identify the direction prediction mode, and when the prediction mode specified by the decoded index is the direction prediction mode, which prediction direction the direction prediction mode specifies. Can be identified.

The information indicating the correspondence between each index and each prediction mode, and the information indicating the correspondence between each direction prediction mode belonging to the basic prediction mode set and each prediction direction are, for example, for each sequence, for each picture, or Further, a configuration may be adopted in which, for each slice, transmission is performed from the moving image encoding device to the moving image decoding device 1.

In FIG. 7A, the edge-based prediction mode is assigned to the index 1, but the present embodiment is not limited to this, and the characteristics of the decoding target image and the edge-based prediction mode are selected. It is possible to adopt a configuration in which an optimal index is assigned in accordance with the frequency of execution. For example, in the configuration in which the prediction mode specified by a smaller index among the prediction modes assigned to the partitions around the target partition is set as the estimated prediction mode for the target partition, the prediction mode with the smaller index is selected. Will be more frequent. In the case of such a configuration, when the decoding target image includes many edges, it is preferable to assign a smaller index to the edge-based prediction mode. On the other hand, when the prediction mode having a smaller index is selected at a higher frequency and the image to be decoded includes many edges, the edge-based prediction mode is selected. It is preferable to assign a larger index to this.

Further, in FIG. 7A, the basic prediction mode set is exemplified as the case where the basic prediction mode set includes a prediction mode that specifies any one of eight different direction predictions. However, the present embodiment is limited to this. It is not something. For example, as the basic prediction mode set, a set including a prediction mode that specifies any of nine or more different directions may be used. As such an example, for example, a set including a prediction mode for designating any of 16 different directions and a prediction mode for designating any of 32 different directions can be given.

In addition, as a prediction mode included in the basic prediction mode set, one of one or more predetermined directions or one or more non-directional prediction modes (for example, DC prediction) is designated. The present embodiment is not limited by the number of prediction modes included in the basic prediction mode set.

(Prediction image calculation processing in edge-based prediction mode)
Next, prediction image calculation processing by the intra-prediction image generation unit 12c in the edge-based prediction mode will be specifically described with reference to (a) to (b) of FIG. FIG. 8 is a diagram for explaining the intra-prediction image generation process when the edge-based prediction mode is selected. FIG. 8A illustrates the target partition OP as a partition adjacent to the target partition OP. It is a figure shown with partition NP1 which shares NP2 and NP3, and the upper left vertex of a target partition, and FIG.8 (b) is a figure which shows the parameter which designates a correction angle with the prediction direction after correction | amendment.

FIG. 8A shows a case where the target partition OP and the partitions NP1 to NP3 are all 4 × 4 pixels, but the present embodiment is not limited to this, and the target partition OP is four. The present invention can also be applied when the size is other than × 4 pixels or when the partitions NP1 to NP3 are other than 4 × 4 pixels. It is assumed that the pixel values of the pixels included in the partitions NP1 to NP3 shown in FIG. 8A have been decoded.

First, the intra predicted image generation unit 12c calculates edge vectors b _i (i = 1 to M, M is the total number of pixels included in the partitions NP1 to 3) for each pixel included in the partitions NP1 to NP3. Here, the calculation of the edge vectors b _i may be used Sobel filter Gx shown below, and Gy.

The Sobel filters Gx and Gy are filter matrices used for calculating an image gradient along the x direction and an image gradient along the y direction, respectively.

The intra predicted image generation unit 12c calculates, as the edge direction, a direction orthogonal to the calculated image gradient in the x direction and the image gradient represented by the image gradient along the y direction.

Subsequently, the intra predicted image generation unit 12c has a function T (α) shown below.
T (α) = Σ <e, b _i > ²
Define Here, e represents a unit vector whose angle between its own direction and the horizontal direction (x direction) is α, and the symbol <,> represents the inner product of both vectors. The symbol Σ indicates that the subscript i is to be summed from 1 to M.

Subsequently, the intra predicted image generation unit 12c sets an argument α ^* that maximizes the function T (α) ^.
α ^* = argmaxS (α)
And the direction represented by α ^* is set to the edge direction for the target partition. In the above description, it is assumed that the angle α and the angle α ^* are represented with the horizontal right direction being 0 degrees and the clockwise direction being positive (the same applies to the expression of the following angles).

FIG. 8B shows an example of the prediction direction specified by the value t (t = −2, −1, 0, 1, 2) indicated by the additional index AI.

In the edge-based prediction mode, the intra-predicted image generation unit 12c extrapolates the decoded pixel values for pixels around the target partition in the prediction direction determined as described above, so that An intra prediction image Pred_Intra is generated. Note that if there are decoded pixels on both sides along the prediction direction, the intra predicted image Pred_Intra may be generated by interpolating the pixel values of these pixels. The pixel of the decoded image here is a pixel that has been smoothed if necessary according to the result of the necessity determination process for smoothing.

For example, the intra-predicted image generation unit 12c sets the pixel among the decoded pixels that are located on the virtual line segment that faces the reverse direction of the prediction direction, starting from the pixel position of the prediction target pixel in the target partition. The intra predicted image Pred_Intra in the target partition is generated by setting the pixel value of the closest pixel (hereinafter also referred to as the closest pixel) to the pixel value of the prediction target pixel. Further, the pixel value of the prediction target pixel may be a value calculated using the pixel value of the nearest pixel and the pixel values of the pixels around the nearest pixel.

In the above description, when the intra predicted image generation unit 12c calculates the edge direction, the partition adjacent to the upper side of the target partition, the partition adjacent to the left side of the target partition, and the upper left vertex of the target partition are shared. Although the case where the pixel values of the pixels belonging to the partition to be referred to is taken as an example, the present embodiment is not limited to this, and the intra-predicted image generation unit 12c is more generally set around the target partition. The edge direction can be calculated with reference to the decoded pixel values belonging to the reference region.

(Prediction image calculation processing in basic prediction mode)
Next, the prediction image generation processing by the intra prediction image generation unit 12c in the basic prediction mode will be specifically described.

When the DC prediction mode is selected for the target partition, the intra predicted image generation unit 12c generates an intra predicted image Pred_Intra for the target partition by taking an average value of decoded pixel values around the target partition. .

Further, when the direction prediction mode designated by any of the indexes 2 to 9 is selected, the intra predicted image generation unit 12c decodes the periphery of the target partition along the prediction direction indicated by the selected direction prediction mode. The intra predicted image Pred_Intra for the target partition is generated by extrapolating the completed pixel values. Note that if there are decoded pixels on both sides along the prediction direction, the intra predicted image Pred_Intra may be generated by interpolating the pixel values of these pixels.

Hereinafter, with reference to FIG. 7B, an example of a predicted image calculation process performed by the intra predicted image generation unit 12c will be described. In the following example, the description will be made on the assumption that the size of the target partition is 4 × 4 pixels, but this does not limit the present embodiment.

FIG. 7B is a diagram illustrating each pixel (prediction target pixel) of the target partition, which is 4 × 4 pixels, and pixels (reference pixels) around the target partition. As shown in FIG. 7B, the prediction target pixels are denoted by reference signs a to p, the reference pixels are denoted by reference signs A to M, and a pixel X (X is any one of a to p and A to M). Let the value be represented as X. Further, it is assumed that the reference pixels A to M have all been decoded.

(Prediction mode 0)
When the index of the allocated prediction mode is 0 (DC prediction), the intra predicted image generation unit 12c converts the pixel values a to p into the following formulas a to p = ave (A, B, C, D, I, J, K, L)
Generate by. Here, ave (...) Indicates that an element included in parentheses is averaged.

(Prediction mode 2)
When the allocated prediction mode index is 2, the intra-predicted image generation unit 12c converts the pixel values a to p into the following formulas a, e, i, m = A,
b, f, j, n = B,
c, g, k, o = C,
d, h, l, p = D
Generate by.

(Prediction mode 5)
When the allocated prediction mode index is 5, the intra-predicted image generation unit 12c converts the pixel values a to p into the following expression d = (B + (C × 2) + D + 2) >> 2,
c, h = (A + (B × 2) + C + 2) >> 2,
b, g, l = (M + (A × 2) + B + 2) >> 2,
a, f, k, p = (I + (M × 2) + A + 2) >> 2,
e, j, o = (J + (I × 2) + M + 2) >> 2,
i, n = (K + (J × 2) + I + 2) >> 2,
m = (L + (K × 2) + J + 2) >> 2
Generate by. Here, “>>” represents a right shift operation, and for any positive integer x, s, the value of x >> s is equal to the value obtained by rounding down the fractional part of x ÷ (2２s).

Also, the intra predicted image generation unit 12c can calculate the pixel values a to p by the same method for the basic prediction modes other than the above prediction modes.

In addition, the intra predicted image generation unit 12c generates an intra predicted image Pred_Intra in the edge-based prediction mode by performing substantially the same process as described above using the prediction direction calculated in the edge-based prediction mode. Can do.

(Appendix 1)
In the above-described embodiment, it is determined whether or not to smooth the target pixel using the Bs value. However, the Bs value is used without using the Bs value. You can also.

That is, when both the target pixel and the reference pixel adjacent to the target pixel are included in the intra CU and the CU boundary is between the two pixels, smoothing is performed. Thereby, it is possible to achieve the same effect as when switching whether or not smoothing is performed according to whether or not the Bs value is 4 or more without calculating the Bs value.

Further, the smoothing may be performed when either the target pixel or a reference pixel adjacent to the target pixel is included in the intra CU and the boundary between the two pixels is a processing unit that is not a CU. Thereby, the same effect as the case where it switches whether it smoothes according to whether Bs value is 3 or more can be show | played.

(Appendix 2)
In addition, when the angle formed by the motion vector mv assigned to the partition including the target pixel and the motion vector mv allocated to the partition including the reference pixel opposite to the target pixel is greater than or equal to the threshold value Alternatively, smoothing may be performed when the difference between the components of the motion vector mv is equal to or greater than a threshold value.

(Appendix 3)
Also, the number of reference images that are referenced when inter-predicting a partition that includes the target pixel, and a reference that is referenced when inter-predicting a partition that includes the reference pixel on the opposite side across the boundary with the target pixel Smoothing may be performed when the number of images is different. Further, a reference image that is referred to when inter-predicting a partition including the target pixel, and a reference image that is referred to when inter-predicting a partition including the reference pixel on the opposite side across the boundary with the target pixel If they are different, smoothing may be performed.

(Appendix 4)
Also, the difference in value between the quantization parameter QP (QuantizationParameter) for the TU including the target pixel and the quantization parameter QP (QuantizationParameter) for the TU including the reference pixel opposite to the target pixel across the boundary. When is equal to or greater than the threshold, smoothing may be performed.

(Appendix 5)
Further, smoothing is performed when the transform coefficient is not encoded in at least one of the TU including the target pixel and the TU including the reference pixel on the opposite side across the boundary with the target pixel. It is good.

(Appendix 6)
In the above-described embodiment, each pixel that is referred to when generating an intra-predicted image is smoothed by applying a 1: 2: 1 3-tap smoothing filter. A filter having the above tap number may be applied. Thereby, the block distortion can be reduced with high accuracy. In addition, as an example of such a filter, the filter used for the conventional deblocking process is mentioned, for example.

(Appendix 7)
The smoothing described in the present embodiment is effective not only in the above-described intra prediction modes but also in the case where an intra predicted image is generated using pixel values of reference pixels. In addition, when the intra prediction mode is DC prediction, the reference pixels are used only as an average value for prediction image generation, and therefore smoothing may be omitted.

(Moving picture encoding device 2)
The configuration of the moving image encoding apparatus (image encoding apparatus) 2 according to the present embodiment will be described with reference to FIG. The moving image encoding apparatus 2 includes H.264 as a part thereof. H.264 / MPEG-4. Coding including technology adopted in KTA software, which is a codec for joint development in AVC and VCEG (Video Coding Expert Group), and technology adopted in TMuC (Test Model under Consideration) software, which is the successor codec Device.

FIG. 9 is a block diagram showing a configuration of the moving picture encoding apparatus 2. As illustrated in FIG. 9, the moving image encoding apparatus 2 includes a predicted image generation unit 21, a transform / quantization unit 22, an inverse quantization / inverse transform unit 23, an adder 24, a frame memory 25, a loop filter 26, a variable A long code encoding unit 27 and a subtracter 28 are provided. As shown in FIG. 9, the predicted image generation unit 21 includes an intra predicted image generation unit (reference pixel smoothing determination unit, smoothing unit, predicted image generation unit) 21a, a motion vector detection unit 21b, and an inter predicted image generation. A unit 21c, a prediction scheme control unit 21d, and a motion vector redundancy deletion unit 21e. The moving image encoding device 2 is a device that generates encoded data # 1 by encoding moving image # 10 (encoding target image).

(Predicted image generation unit 21)
The predicted image generation unit 21 recursively divides the processing target LCU into one or a plurality of lower-order CUs, further divides each leaf CU into one or a plurality of partitions, and uses an inter-screen prediction for each partition. A predicted image Pred_Inter or an intra predicted image Pred_Intra using intra prediction is generated. The generated inter prediction image Pred_Inter and intra prediction image Pred_Intra are supplied to the adder 24 and the subtracter 28 as the prediction image Pred.

Note that the prediction image generation unit 21 omits encoding of other parameters belonging to the PU for the PU to which the skip mode is applied. Also, (1) the mode of division into lower CUs and partitions in the target LCU, (2) whether to apply the skip mode, and (3) which of the inter predicted image Pred_Inter and the intra predicted image Pred_Intra for each partition Whether to generate is determined so as to optimize the encoding efficiency.

(Intra predicted image generation unit 21a)
The intra predicted image generation unit 21a generates a predicted image Pred_Intra for each partition by intra prediction. Specifically, (1) a prediction mode used for intra prediction is selected for each partition, and (2) a prediction image Pred_Intra is generated from the decoded image P using the selected prediction mode. The intra predicted image generation unit 21a supplies the generated intra predicted image Pred_Intra to the prediction method control unit 21d.

More specifically, the intra-prediction image generation unit 21a selects any one of the prediction modes included in the above-described basic prediction mode set and the prediction mode included in the extended set including the edge-based prediction mode. The intra prediction image Pred_Intra is generated according to the method indicated by the selected prediction mode. Then, before generating the intra-predicted image Pred_Intra, whether or not smoothing is necessary for the neighboring pixels of the target partition is determined. Then, after smoothing the pixels determined to be smoothed, an intra predicted image Pred_Intra is generated. The smoothing necessity determination process is the same as the process in the intra predicted image generation unit 12c included in the video decoding device 1.

(Motion vector detection unit 21b)
The motion vector detection unit 21b detects a motion vector mv regarding each partition. Specifically, (1) the filtered decoded image P_ALF ′ used as the reference image is selected, and (2) the target partition is searched by searching for the region that best approximates the target partition in the selected filtered decoded image P_ALF ′. Detects a motion vector mv. Here, the filtered decoded image P_ALF ′ is an image obtained by performing adaptive filter processing by the loop filter 26 on the decoded image that has already been decoded for the entire frame, and is a motion vector detection unit. 21b can read out the pixel value of each pixel constituting the filtered decoded image P_ALF ′ from the frame memory 25. The motion vector detection unit 21b supplies the detected motion vector mv to the inter prediction image generation unit 21c and the motion vector redundancy deletion unit 21e together with the reference image index RI that specifies the filtered decoded image P_ALF ′ used as the reference image. To do.

(Inter prediction image generation unit 21c)
The inter prediction image generation unit 21c generates a motion compensated image mc related to each inter prediction partition by inter-screen prediction. Specifically, the motion compensation image mc is obtained from the filtered decoded image P_ALF ′ designated by the reference image index RI supplied from the motion vector detection unit 21b, using the motion vector mv supplied from the motion vector detection unit 21b. Generate. Similar to the motion vector detection unit 21b, the inter prediction image generation unit 21c can read out the pixel value of each pixel constituting the filtered decoded image P_ALF ′ from the frame memory 25. The inter prediction image generation unit 21c supplies the generated motion compensated image mc (inter prediction image Pred_Inter) together with the reference image index RI supplied from the motion vector detection unit 21b to the prediction method control unit 21d.

(Prediction method controller 21d)
The prediction scheme control unit 21d compares the intra predicted image Pred_Intra and the inter predicted image Pred_Inter with the encoding target image and selects whether to perform intra prediction or inter prediction. When intra prediction is selected, the prediction scheme control unit 21d supplies the intra prediction image Pred_Intra as the prediction image Pred to the adder 24 and the subtracter 28, and also uses the intra prediction parameter PP_Intra supplied from the intra prediction image generation unit 21a. This is supplied to the variable length code encoding unit 27. On the other hand, when the inter prediction is selected, the prediction scheme control unit 21d supplies the inter prediction image Pred_Inter as the prediction image Pred to the adder 24 and the subtractor 28, and the reference image index RI and motion vector redundancy described later. The estimated motion vector index PMVI and the motion vector residual MVD supplied from the deletion unit 21e are supplied to the variable length code encoding unit 27 as an inter prediction parameter PP_Inter.

(Motion vector redundancy deleting unit 21e)
The motion vector redundancy deletion unit 21e deletes redundancy in the motion vector mv detected by the motion vector detection unit 21b. Specifically, (1) an estimation method used for estimating the motion vector mv is selected, (2) an estimated motion vector pmv is derived according to the selected estimation method, and (3) the estimated motion vector pmv is subtracted from the motion vector mv. As a result, a motion vector residual MVD is generated. The motion vector redundancy deleting unit 21e supplies the generated motion vector residual MVD to the prediction method control unit 21d together with the estimated motion vector index PMVI indicating the selected estimation method.

(Transformation / quantization unit 22)
The transform / quantization unit 22 (1) performs DCT transform (Discrete Cosine Transform) on the prediction residual D obtained by subtracting the predicted image Pred from the encoding target image (original image) for each block (transform unit), and (2) The DCT coefficient obtained by the DCT transform is quantized, and (3) the quantized prediction residual QD obtained by the quantization is supplied to the variable length code encoder 27 and the inverse quantization / inverse transform unit 23. The transform / quantization unit 22 (1) selects a quantization step QP to be used for quantization for each TU, and (2) sets a quantization parameter difference Δqp indicating the size of the selected quantization step QP. This is supplied to the variable-length code encoding unit 27, and (3) the selected quantization step QP is supplied to the inverse quantization / inverse transform unit 23. Here, the quantization parameter difference Δqp is the quantization parameter qp for the TU that has been DCT transformed / quantized immediately before, based on the value of the quantization parameter qp (for example, QP = 2 ^{pq / 6} ) for the TU to be DCT transformed / quantized. The difference value obtained by subtracting the value of '.

(Inverse quantization / inverse transform unit 23)
The inverse quantization / inverse transform unit 23 (1) inversely quantizes the quantized prediction residual QD, (2) performs inverse DCT (Discrete Cosine Transform) transformation on the DCT coefficient obtained by the inverse quantization, and (3) The prediction residual D obtained by the inverse DCT transform is supplied to the adder 24. When the quantization prediction residual QD is inversely quantized, the quantization step QP supplied from the transform / quantization unit 22 is used. Note that the prediction residual D output from the inverse quantization / inverse transform unit 23 is obtained by adding a quantization error to the prediction residual D input to the transform / quantization unit 22. Common names are used for this purpose.

(Adder 24)
The adder 24 adds the predicted image Pred selected by the prediction scheme control unit 21d to the prediction residual D generated by the inverse quantization / inverse transform unit 23, thereby obtaining the (local) decoded image P. Generate. The (local) decoded image P generated by the adder 24 is supplied to the loop filter 26 and stored in the frame memory 25, and is used as a reference image in intra prediction.

(Variable-length code encoding unit 27)
The variable length code encoding unit 27 (1) the quantization prediction residual QD and Δqp supplied from the transform / quantization unit 22, and (2) the quantization parameter PP (inter prediction) supplied from the prediction scheme control unit 21d. The parameter PP_Inter and the intra prediction parameter PP_Intra) and (3) the filter parameter FP supplied from the loop filter 26 are variable-length-encoded to generate encoded data # 1.

As a specific encoding method by the variable length code encoding unit 27, CABAC (Context-based Adaptive Binary Arithmetic Coding) which is one arithmetic coding / decoding method, or one non-arithmetic encoding / decoding method is used. CAVLC (Context-based Adaptive VLC) which is a method is used.

The variable length code encoding unit 27 determines which encoding method of CABAC or CAVLC is to be used for each picture, performs encoding using the determined encoding method, and specifies the determined encoding method The mode information (entropy_coding_mode_flag) is included in the picture header PH of the encoded data # 1.

(Subtractor 28)
The subtracter 28 generates the prediction residual D by subtracting the prediction image Pred selected by the prediction method control unit 21d from the encoding target image. The prediction residual D generated by the subtracter 28 is DCT transformed / quantized by the transform / quantization unit 22.

(Loop filter 26)
The loop filter 26 reads the decoded image P from the frame memory 25 and performs block noise reduction processing (deblocking processing) at one or both of the partition boundary and the block boundary of the decoded image P. The loop filter 26 performs adaptive filter processing using the adaptively calculated filter parameter FP on the decoded image subjected to block noise reduction processing, and the adaptive filter processing is performed. The decoded image P is output to the frame memory 25 as a filtered decoded image P_ALF. The filtered decoded image P_ALF is mainly used as a reference image in the inter predicted image generation unit 21c.

(Application examples)
The moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above can be used by being mounted on various apparatuses that perform moving picture transmission, reception, recording, and reproduction. The moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.

First, it will be described with reference to FIG. 12 that the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above can be used for transmission and reception of moving pictures.

FIG. 12A is a block diagram illustrating a configuration of the transmission apparatus A in which the moving picture encoding apparatus 2 is mounted. As shown in FIG. 12A, the transmitting apparatus A encodes a moving image, obtains encoded data, and modulates a carrier wave with the encoded data obtained by the encoding unit A1. A modulation unit A2 that obtains a modulation signal by the transmission unit A2 and a transmission unit A3 that transmits the modulation signal obtained by the modulation unit A2. The moving image encoding device 2 described above is used as the encoding unit A1.

The transmission apparatus A has a camera A4 that captures a moving image, a recording medium A5 that records the moving image, an input terminal A6 for inputting the moving image from the outside, as a supply source of the moving image that is input to the encoding unit A1. You may further provide image processing part A7 which produces | generates or processes an image. FIG. 12A illustrates a configuration in which the transmission apparatus A includes all of these, but some of them may be omitted.

The recording medium A5 may be a recording of a non-encoded moving image, or a recording of a moving image encoded using a recording encoding scheme different from the transmission encoding scheme. It may be a thing. In the latter case, a decoding unit (not shown) for decoding the encoded data read from the recording medium A5 according to the recording encoding method may be interposed between the recording medium A5 and the encoding unit A1.

FIG. 12B is a block diagram illustrating a configuration of the receiving device B on which the moving image decoding device 1 is mounted. As illustrated in FIG. 12B, the receiving device B includes a receiving unit B1 that receives a modulated signal, a demodulating unit B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit B1, and a demodulating unit. And a decoding unit B3 that obtains a moving image by decoding the encoded data obtained by B2. The moving picture decoding apparatus 1 described above is used as the decoding unit B3.

The receiving apparatus B has a display B4 for displaying a moving image, a recording medium B5 for recording the moving image, and an output terminal for outputting the moving image as a supply destination of the moving image output from the decoding unit B3. B6 may be further provided. FIG. 12B illustrates a configuration in which the receiving apparatus B includes all of these, but a part of the configuration may be omitted.

Note that the recording medium B5 may be for recording an unencoded moving image, or is encoded by a recording encoding method different from the transmission encoding method. May be. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit B3 in accordance with the recording encoding method may be interposed between the decoding unit B3 and the recording medium B5.

Note that the transmission medium for transmitting the modulation signal may be wireless or wired. Further, the transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, transmission in which the transmission destination is specified in advance). Refers to the embodiment). That is, the transmission of the modulation signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a terrestrial digital broadcast broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) is an example of a transmitting apparatus A / receiving apparatus B that transmits and receives modulated signals by wireless broadcasting. A broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) for cable television broadcasting is an example of a transmitting device A / receiving device B that transmits and receives a modulated signal by cable broadcasting.

Also, a server (workstation etc.) / Client (television receiver, personal computer, smart phone etc.) such as VOD (Video On Demand) service and video sharing service using the Internet is a transmitting device for transmitting and receiving modulated signals by communication. This is an example of A / reception device B (usually, either wireless or wired is used as a transmission medium in a LAN, and wired is used as a transmission medium in a WAN). Here, the personal computer includes a desktop PC, a laptop PC, and a tablet PC. The smartphone also includes a multi-function mobile phone terminal.

In addition to the function of decoding the encoded data downloaded from the server and displaying it on the display, the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device A and the reception device B.

Next, the fact that the above-described moving picture decoding apparatus 1 and moving picture encoding apparatus 2 can be used for recording and reproduction of moving pictures will be described with reference to FIG.

FIG. 13A is a block diagram showing a configuration of a recording apparatus C on which the above-described moving picture decoding apparatus 1 is mounted. As shown in FIG. 13 (a), the recording device C encodes a moving image to obtain encoded data, and writes the encoded data obtained by the encoding unit C1 to the recording medium M. And a writing unit C2. The moving image encoding device 2 described above is used as the encoding unit C1.

The recording medium M may be of a type built in the recording device C, such as (1) HDD (Hard Disk Drive) or SSD (Solid State Drive), or (2) SD memory. It may be of the type connected to the recording device C, such as a card or USB (Universal Serial Bus) flash memory, or (3) DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registration) (Trademark) or the like may be mounted on a drive device (not shown) built in the recording apparatus C.

The recording apparatus C also serves as a moving image supply source to be input to the encoding unit C1, a camera C3 that captures moving images, an input terminal C4 for inputting moving images from the outside, and reception for receiving moving images. A unit C5 and an image processing unit C6 that generates or processes an image may be further provided. FIG. 13A illustrates a configuration in which the recording apparatus C includes all of these, but a part of the configuration may be omitted.

The receiving unit C5 may receive an unencoded moving image, or receives encoded data encoded by a transmission encoding method different from the recording encoding method. You may do. In the latter case, a transmission decoding unit (not shown) that decodes encoded data encoded by the transmission encoding method may be interposed between the reception unit C5 and the encoding unit C1.

Examples of such a recording device C include a DVD recorder, a BD recorder, and an HD (Hard Disk) recorder (in this case, the input terminal C4 or the receiving unit C5 is a main source of moving images). In addition, a camcorder (in this case, the camera C3 is a main source of moving images), a personal computer (in this case, the receiving unit C5 or the image processing unit C6 is a main source of moving images), a smartphone (this In this case, the camera C3 or the receiving unit C5 is a main source of moving images).

FIG. 13B is a block diagram showing the configuration of the playback device D equipped with the above-described video decoding device 1. As shown in FIG. 13 (b), the playback device D obtains a moving image by decoding the read data D1 that reads the encoded data written on the recording medium M and the read data read by the read unit D1. And a decoding unit D2. The moving picture decoding apparatus 1 described above is used as the decoding unit D2.

The recording medium M may be of a type built in the playback device D such as (1) HDD or SSD, or (2) such as an SD memory card or USB flash memory. It may be of a type connected to the playback device D, or (3) may be loaded into a drive device (not shown) built in the playback device D, such as DVD or BD. Good.

Further, the playback device D has a display D3 for displaying a moving image, an output terminal D4 for outputting the moving image to the outside, and a transmitting unit for transmitting the moving image as a supply destination of the moving image output by the decoding unit D2. D5 may be further provided. FIG. 13B illustrates a configuration in which the playback apparatus D includes all of these, but some of them may be omitted.

The transmission unit D5 may transmit a non-encoded moving image, or transmits encoded data encoded by a transmission encoding method different from the recording encoding method. You may do. In the latter case, an encoding unit (not shown) that encodes a moving image with a transmission encoding method may be interposed between the decoding unit D2 and the transmission unit D5.

Examples of such a playback device D include a DVD player, a BD player, and an HDD player (in this case, an output terminal D4 to which a television receiver or the like is connected is a main moving image supply destination). . In addition, a television receiver (in this case, the display D3 is a main destination of moving images), a desktop PC (in this case, the output terminal D4 or the transmission unit D5 is a main destination of moving images), Laptop type or tablet type PC (in this case, display D3 or transmission unit D5 is the main video image supply destination), smartphone (in this case, display D3 or transmission unit D5 is the main video image supply destination) ), Digital signage (also referred to as an electronic signboard or an electronic bulletin board, and the display D3 or the transmission unit D5 is the main supply destination of moving images) is an example of such a playback device D.

(Configuration by software)
Finally, each block of the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2, in particular, a variable length code decoding unit 11, a prediction image generation unit (motion vector restoration unit 12 a, inter prediction image generation unit 12 b, intra prediction image generation Unit 12c, prediction method determination unit 12d) 12, inverse quantization / inverse conversion unit 13, adder 14, frame memory 15, loop filter 16, prediction image generation unit (intra prediction image generation unit 21a, motion vector detection unit 21b, Inter prediction image generation unit 21c, prediction method control unit 21d, motion vector redundancy deletion unit 21e) 21, transform / quantization unit 22, inverse quantization / inverse transform unit 23, adder 24, frame memory 25, loop filter 26 The variable-length code encoding unit 27 and the subtracter 28 are implemented in hardware by a logic circuit formed on an integrated circuit (IC chip). It may be realized, or may be implemented in software using a CPU (central processing unit).

In the latter case, the video decoding device 1 and the video encoding device 2 develop a CPU that executes instructions of a control program that realizes each function, a ROM (read （only memory) that stores the program, and the program. A RAM (random access memory), a storage device (recording medium) such as a memory for storing the program and various data, and the like are provided. The object of the present invention is to provide a computer program code (execution format program, intermediate code program, source program) of a control program for the video decoding device 1 and the video encoding device 2 that is software that implements the functions described above. Is supplied to the above-described moving picture decoding apparatus 1 and moving picture encoding apparatus 2 and the computer (or CPU or MPU (microprocessor unit)) is recorded on the recording medium. It can also be achieved by reading and executing the code.

Examples of the recording medium include tapes such as a magnetic tape and a cassette tape, a magnetic disk such as a floppy (registered trademark) disk / hard disk, a CD-ROM (compact disk-read-only memory) / MO (magneto-optical) / Discs including optical discs such as MD (Mini Disc) / DVD (digital versatile disc) / CD-R (CD Recordable), IC cards (including memory cards) / optical cards, mask ROM / EPROM (erasable) Programmable read-only memory) / EEPROM (electrically erasable and programmable programmable read-only memory) / semiconductor memory such as flash ROM, or logic circuits such as PLD (Programmable logic device) and FPGA (Field Programmable Gate Array) be able to.

Further, the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, Internet, intranet, extranet, LAN (local area network), ISDN (integrated area services digital area), VAN (value-added area network), CATV (community area antenna television) communication network, virtual area private network (virtual area private network), A telephone line network, a mobile communication network, a satellite communication network, etc. can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, IEEE (institute of electrical and electronic engineers) 1394, USB, power line carrier, cable TV line, telephone line, ADSL (asynchronous digital subscriber loop) line, etc. wired such as IrDA (infrared data association) or remote control , Bluetooth (registered trademark), IEEE802.11 wireless, HDR (high data rate), NFC (Near field communication), DLNA (Digital Living Network Alliance), mobile phone network, satellite line, terrestrial digital network, etc. Is possible. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave in which the program code is embodied by electronic transmission.

(Other)
The above-described embodiment can also be expressed as follows.

The image decoding apparatus according to the present invention generates a prediction image by predicting a pixel value of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen for each prediction unit, An image decoding apparatus that generates a decoded image by adding a generated predicted image to a prediction residual decoded from encoded data, and for each reference pixel, between the reference pixel and another adjacent pixel The reference pixel smoothing determining means for determining whether or not to perform smoothing, and the smoothing for smoothing the reference pixel determined to be smoothed by the reference pixel smoothing determining means with the other pixels. Among the reference pixels and the reference pixels determined to be smoothed by the reference pixel smoothing determination unit, the predicted image is generated using the reference pixels smoothed by the smoothing unit. Predicted image generation means for It is characterized in that it comprises.

In the image decoding device according to the present invention, the reference pixel smoothing determining means is configured such that, when the reference pixel to be determined exists in the vicinity of a boundary between a plurality of unit regions in contact with the prediction unit to be processed, the reference pixel It is preferable to determine that smoothing is performed.

Here, the unit area may be any of a partition obtained by dividing a CU, a TU, a PU, and a PU described in the embodiment, and a block obtained by dividing a TU. Further, the vicinity of the boundary means a range of a predetermined distance (for example, two pixels) from the boundary.

And, the boundary between unit areas is likely to cause high-frequency distortion. Therefore, according to the above configuration, since it is determined that the reference pixels near the boundary are to be smoothed, this high-frequency distortion can be appropriately reduced.

In the image decoding device according to the present invention, the reference pixel smoothing determining unit, when the block boundary strength when a deblocking filter is applied in a later process for a reference pixel to be determined exceeds a threshold, It may be determined that the pixel is smoothed.

According to the above configuration, when the block boundary strength when the deblocking filter is applied in the subsequent processing exceeds the threshold for the reference pixel, it is determined that smoothing is performed.

A pixel to which a deblocking filter having a high block boundary strength is applied is a pixel having a large block distortion. Therefore, it is possible to perform smoothing for a reference pixel having a large block distortion.

In the image decoding device according to the present invention, the reference pixel smoothing determining means includes at least one of a unit region including a reference pixel to be determined and an adjacent unit region in contact with the unit region at the boundary. May be determined to perform the smoothing of the reference pixel when the predicted region is generated by intra prediction.

When at least one of a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary is a unit region in which a prediction image is generated by intra prediction, these Block distortion tends to occur at the boundary between unit areas. Therefore, according to the above configuration, it is possible to smooth the reference pixels that are likely to cause block distortion.

In addition, at least one of the unit region including the reference pixel to be determined and the adjacent unit region that is in contact with the unit region at the boundary is near the boundary of the unit region where the prediction image is generated by intra prediction. When a deblocking filter is applied to the reference pixel, a deblocking filter having a high block boundary strength is applied. Therefore, according to said structure, the reference pixel to which a deblocking filter with high block boundary intensity | strength is applied can be smoothed, without calculating block boundary intensity | strength.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining unit performs inter prediction on both the unit region including the reference pixel to be determined and the adjacent unit region in contact with the unit region at the boundary. If the angle formed by the motion vector assigned to each unit area exceeds a threshold when the predicted image is generated by the unit area, the reference pixel may be determined to be smoothed. .

When the angle formed by the motion vectors between the prediction units is large, the pixel at the boundary between the prediction units is likely to be distorted. Therefore, according to said structure, the reference pixel considered that distortion is large can be smoothed.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining unit performs inter prediction on both the unit region including the reference pixel to be determined and the adjacent unit region in contact with the unit region at the boundary. If the difference between the motion vector components assigned to each unit area exceeds a threshold when the predicted image is generated by the unit area, the reference pixel is determined to be smoothed. Good.

When there is a large difference in motion vector components between unit areas, the pixel at the boundary between the unit areas is likely to be distorted. Therefore, according to said structure, the reference pixel considered that distortion is large can be smoothed.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining unit performs inter prediction on both the unit region including the reference pixel to be determined and the adjacent unit region in contact with the unit region at the boundary. In the case of the unit region in which the predicted image is generated by the above, when the number of reference images used for inter prediction in each unit region is different, it may be determined that the reference pixel is smoothed.

画素 Pixels at the boundary between unit areas with different numbers of reference pixels used for inter prediction are highly likely to be distorted. Therefore, according to said structure, the reference pixel considered that distortion is large can be smoothed.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining unit performs inter prediction on both the unit region including the reference pixel to be determined and the adjacent unit region in contact with the unit region at the boundary. In the case where the prediction unit is the prediction unit in which the prediction image is generated, the reference pixel may be determined to be smoothed when the reference image used for the inter prediction in each unit region is different.

画素 Pixels at the boundary between unit areas with different reference images used for inter prediction are likely to be distorted. Therefore, according to said structure, the reference pixel considered that distortion is large can be smoothed.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining means includes a unit area including the reference pixel to be determined and quantization values of adjacent unit areas that are in contact with the unit area at the boundary. When the difference exceeds the threshold value, it may be determined that the reference pixel is smoothed.

When there is a large difference in quantization value between unit areas, the pixel at the boundary between the unit areas is likely to be distorted. Therefore, according to said structure, the reference pixel considered that distortion is large can be smoothed.

In the image decoding apparatus according to the present invention, the reference pixel smoothing determining means may determine that the reference pixel is smoothed when the reference pixel to be determined is an edge pixel.

According to the above configuration, when the reference pixel to be determined is an edge pixel, the reference pixel is smoothed. Edge pixels are very likely to have block distortion. Therefore, it is possible to smooth the reference pixels that are likely to have block distortion.

In the image decoding apparatus according to the present invention, the encoded data includes smoothing information indicating that a reference pixel used for intra prediction is smoothed, and the smoothing unit detects the smoothing information. Alternatively, the reference pixel existing in the vicinity of the boundary between the plurality of unit regions that are in contact with the prediction unit to be processed may be smoothed.

According to the above configuration, the reference pixel is smoothed by the smoothing information. Therefore, it is not necessary to determine whether or not a specific condition is satisfied, and the processing efficiency can be improved.

The image encoding device according to the present invention generates a prediction image by predicting a pixel value of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen for each prediction unit. An image encoding device that generates encoded data by encoding a prediction residual between a generated predicted image and an original image, and for each of the reference pixels, another pixel adjacent to the reference pixel The reference pixel smoothing determining means for determining whether to perform smoothing between the reference pixel and the reference pixel determined to be smoothed by the reference pixel smoothing determining means between the other pixels. Among the reference pixels, the reference pixel determined to be smoothed by the reference pixel smoothing determining means is used for the prediction using the reference pixel after the smoothing means smoothes. Predictive image generation to generate images It is characterized in that it comprises a stage, a.

The present invention is not limited to the above-described embodiment, and various modifications can be made within the scope of the claims. That is, embodiments obtained by combining technical means appropriately modified within the scope of the claims are also included in the technical scope of the present invention.

The present invention can be suitably applied to a decoding device that decodes encoded data and an encoding device that generates encoded data. Further, the present invention can be suitably applied to the data structure of encoded data generated by the encoding device and referenced by the decoding device.

1 video decoding device (image decoding device)
2 Video encoding device (image encoding device)
12 prediction image generation unit 12b inter prediction image generation unit 12c intra prediction image generation unit (reference pixel smoothing determination unit, smoothing unit, prediction image generation unit)
21 prediction image generation unit 21a intra prediction image generation unit (reference pixel smoothing determination unit, smoothing unit, prediction image generation unit)
21c Inter prediction image generation unit

Claims

A prediction image is generated for each prediction unit by predicting a pixel value of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen, and the generated prediction image is generated from encoded data. An image decoding device that generates a decoded image by adding to the decoded prediction residual,
For each of the reference pixels, reference pixel smoothing determining means for determining whether or not to perform smoothing between other pixels adjacent to the reference pixel;
Smoothing means for smoothing the reference pixel determined to be smoothed by the reference pixel smoothing determining means with the other pixels;
Of the reference pixels, for the reference pixels determined to be smoothed by the reference pixel smoothing determination means, predicted image generation for generating the predicted image using the reference pixels smoothed by the smoothing means And an image decoding device.
The reference pixel smoothing determining unit determines that the reference pixel is to be smoothed when the reference pixel to be determined exists in the vicinity of a boundary between a plurality of unit regions that are in contact with the prediction unit to be processed. The image decoding apparatus according to claim 1.
The reference pixel smoothing determining unit determines that the reference pixel is to be smoothed when the block boundary strength when the deblocking filter is applied in a later process exceeds a threshold for the reference pixel to be determined. The image decoding apparatus according to claim 2.
The reference pixel smoothing determining unit generates a predicted image by intra prediction of at least one of a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary. The image decoding apparatus according to claim 2, wherein when it is a unit area, the reference pixel is determined to be smoothed.
The reference pixel smoothing determining means includes a unit region in which a prediction image is generated by inter prediction of both a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary. 4. The image decoding according to claim 2, wherein when the angle formed by the motion vector assigned to each unit region exceeds a threshold, it is determined that the reference pixel is smoothed. 5. apparatus.
The reference pixel smoothing determining means includes a unit region in which a prediction image is generated by inter prediction of both a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary. 4. The image according to claim 2, wherein when the difference between the motion vector components allocated to each of the unit regions exceeds a threshold, the reference pixel is determined to be smoothed. 5. Decoding device.
The reference pixel smoothing determining means includes a unit region in which a prediction image is generated by inter prediction of both a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary. The image decoding apparatus according to claim 2, wherein when the number of reference images used for inter prediction in each of the unit regions is different, the reference pixel is determined to be smoothed.
The reference pixel smoothing determining means includes a unit region in which a prediction image is generated by inter prediction of both a unit region including a reference pixel to be determined and an adjacent unit region that is in contact with the unit region at the boundary. 4, when the reference images used for inter prediction in the respective unit regions are different, it is determined that the reference pixels are smoothed.
The reference pixel smoothing determining means, when the difference between the quantization values of the unit area including the reference pixel to be determined and the adjacent unit area in contact with the unit area at the boundary exceeds a threshold, The image decoding apparatus according to claim 2, wherein it is determined that smoothing is performed.
The image decoding apparatus according to claim 1, wherein the reference pixel smoothing determining means determines that the reference pixel is smoothed when the reference pixel to be determined is an edge pixel.
The encoded data includes smoothing information indicating that the reference pixels used for intra prediction are smoothed,
The said smoothing means smoothes the reference pixel which exists in the boundary vicinity between several unit area | regions which contact | connects the prediction unit of a process target, if the said smoothing information is detected. Image decoding apparatus.
A prediction image is generated for each prediction unit by predicting a pixel value of a pixel in the prediction image from a pixel value of a reference pixel at another position in the same screen, and the generated prediction image and the original image An image encoding device that generates encoded data by encoding a prediction residual,
For each of the reference pixels, reference pixel smoothing determining means for determining whether or not to perform smoothing between other pixels adjacent to the reference pixel;
Smoothing means for smoothing the reference pixel determined to be smoothed by the reference pixel smoothing determining means with the other pixels;
Of the reference pixels, for the reference pixels determined to be smoothed by the reference pixel smoothing determination means, predicted image generation for generating the predicted image using the reference pixels smoothed by the smoothing means And an image encoding device.