CN114830643A - Image encoding method and image decoding method - Google Patents


Info

Publication number
CN114830643A
CN114830643A (application CN202080087366.0A)
Authority
CN
China
Prior art keywords
prediction
image
intra
predicted image
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080087366.0A
Other languages
Chinese (zh)
Inventor
清水拓也
Current Assignee
Maxell Ltd
Original Assignee
Maxell Ltd
Priority date
Filing date
Publication date
Application filed by Maxell Ltd filed Critical Maxell Ltd
Publication of CN114830643A publication Critical patent/CN114830643A/en
Pending legal-status Critical Current

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 — … using adaptive coding
    • H04N19/102 — … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 — Selection of coding mode or of prediction mode
    • H04N19/105 — Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/134 — … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 — Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 — Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169 — … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — … the unit being an image region, e.g. an object
    • H04N19/176 — … the region being a block, e.g. a macroblock
    • H04N19/50 — … using predictive coding
    • H04N19/503 — … involving temporal prediction
    • H04N19/51 — Motion estimation or motion compensation
    • H04N19/593 — … involving spatial prediction techniques

Abstract

The image encoding method according to the present invention includes: a predicted-image generation step of generating a synthesized predicted image for an encoding-target block by synthesizing a predicted image obtained by inter prediction with a predicted image obtained by intra prediction; and an encoding step of encoding the difference between the pixel values of the predicted image generated in the predicted-image generation step and those of the image of the encoding-target block. The intra prediction used in the synthesis process may be any of a plurality of types, including matrix-weighted intra prediction. The synthesis process applies weighting to the inter-predicted image and the intra-predicted image, and the weighting parameters for this processing are determined according to the type of the intra prediction.

Description

Image encoding method and image decoding method
Technical Field
The present invention relates to an image encoding technique for encoding an image or an image decoding technique for decoding an image.
Background
As methods for digitizing, recording, and transmitting image and audio information, the H.264/AVC (Advanced Video Coding) and H.265/HEVC (High Efficiency Video Coding) standards have been established. ISO/IEC MPEG and ITU-T VCEG are now studying a next-generation codec called VVC (Versatile Video Coding) intended to achieve an even higher compression ratio (see Non-Patent Document 1).
Documents of the prior art
Non-patent document
Non-patent document 1: xiaozhong Xu and Shan Liu, "Recent advances in video coding beyond the HEVC standard" SIP (2019), vol.8
Disclosure of Invention
Problems to be solved by the invention
As one candidate technique for VVC, the use of a predicted image obtained by synthesizing an inter-predicted image and an intra-predicted image for the same encoding/decoding target block has been studied.
However, when only a predicted image obtained by simply averaging the inter-predicted image and the intra-predicted image is used, the reduction in code amount is insufficient.
The present invention has been made in view of the above problems, and an object of the present invention is to provide a more preferable image encoding technique and image decoding technique.
Means for solving the problems
In order to achieve the above object, one embodiment of the present invention may be configured as follows. The method includes: a predicted-image generation step of generating a synthesized predicted image for an encoding-target block by synthesizing a predicted image obtained by inter prediction with a predicted image obtained by intra prediction; and an encoding step of encoding the difference between the pixel values of the predicted image generated in the predicted-image generation step and those of the image of the encoding-target block. The intra prediction used in the synthesis process may be any of a plurality of types, including matrix-weighted intra prediction. The synthesis process applies weighting to the inter-predicted image and the intra-predicted image, and the weighting parameters for this processing are determined according to the type of the intra prediction.
Effects of the invention
According to the present invention, a more preferable image encoding technique and image decoding technique can be provided.
Drawings
Fig. 1 is an explanatory diagram of an example of an image coding apparatus according to embodiment 1 of the present invention.
Fig. 2 is an explanatory diagram of an example of the image decoding apparatus according to embodiment 2 of the present invention.
Fig. 3 is an explanatory diagram of an example of the image encoding method according to embodiment 1 of the present invention.
Fig. 4 is an explanatory diagram of an example of the image decoding method according to embodiment 2 of the present invention.
FIG. 5 is an explanatory view of an example of the data recording medium according to embodiment 3 of the present invention.
Fig. 6 is an explanatory diagram of an example of the type of intra prediction according to an embodiment of the present invention.
Fig. 7 is an explanatory diagram of an example of plane prediction according to an embodiment of the present invention.
Fig. 8 is an explanatory diagram of an example of the process of the synthesis prediction according to the embodiment of the present invention.
Fig. 9 is an explanatory diagram of an example of the process of the synthesis prediction according to the embodiment of the present invention.
Fig. 10 is an explanatory diagram of an example of the process of the synthesis prediction according to the embodiment of the present invention.
Fig. 11 is an explanatory diagram of an example of the process of the synthesis prediction according to the embodiment of the present invention.
Fig. 12 is a diagram illustrating an example of matrix weighted intra prediction according to an embodiment of the present invention.
Fig. 13 is an explanatory diagram of an example of the process of the synthesis prediction according to the embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
In addition, in each drawing, components denoted by the same reference numerals have the same functions.
In the description and drawings of this specification, "0 vec" or "0 vector" denotes a vector whose components are all 0, or a vector that has been converted to and set as such.
"Unreferenceable" in the description and drawings means that a block's information cannot be acquired because the block lies outside the picture. "Referenceable" means that the block's information can be acquired; that information includes pixel values, vectors, reference frame numbers, and/or prediction modes.
The term "residual component" in the description and drawings is used with the same meaning as "prediction error".
The term "region" in the description and drawings is used with the same meaning as "image".
The expression "transmitted together with a flag" in the description and drawings also includes the meaning of "transmitted including the flag".
(example 1)
First, embodiment 1 of the present invention is explained with reference to the drawings.
Fig. 1 shows an example of a block diagram of an image coding apparatus according to an embodiment of the present invention.
The image coding apparatus includes, for example, an image input unit 101, a block dividing unit 102, a mode management unit 103, an intra prediction unit 104, an inter prediction unit 105, a synthesis prediction unit 120, a block processing unit 106, a transform/quantization unit 107, an inverse quantization/inverse transform unit 108, an image synthesis/filtering unit 109, a decoded image management unit 110, an entropy encoding unit 111, and a data output unit 112.
The operation of each component of the image coding apparatus will be described in detail below.
Each component of the image coding apparatus may, for example, operate autonomously as described below. Alternatively, the operation may be realized by a control unit cooperating with software stored in a storage unit.
First, the image input unit 101 acquires an original image to be encoded. Next, the block dividing unit 102 divides the input original image into blocks of a fixed size called CTUs (Coding Tree Units), and further analyzes the input image to divide each CTU into finer blocks according to the characteristics of its content. These blocks, which are the units of coding, are called CUs (Coding Units). The CU partitioning of a CTU is managed by a tree structure such as a quadtree, ternary tree, or binary tree. A CU may be further divided into sub-blocks for prediction and into TUs (Transform Units) for frequency transform, quantization, and the like.
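The recursive CTU-to-CU partitioning described above can be sketched as follows. This is an illustrative quadtree-only example, not the patent's method: real encoders choose among quadtree, binary-tree, and ternary-tree splits by rate-distortion cost, and the variance threshold here is an invented stand-in for "the characteristics of the image".

```python
def split_ctu(x, y, size, variance_of, min_cu=8, threshold=100.0):
    """Return a list of (x, y, size) CUs covering the CTU at (x, y).

    A block is recursively quad-split while it is larger than `min_cu`
    and its content (measured by the caller-supplied `variance_of`) is busy.
    """
    if size <= min_cu or variance_of(x, y, size) < threshold:
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += split_ctu(x + dx, y + dy, half, variance_of, min_cu, threshold)
    return cus

# A flat CTU stays a single CU; a "busy" CTU splits down to the minimum size.
flat = split_ctu(0, 0, 64, lambda x, y, s: 0.0)
busy = split_ctu(0, 0, 64, lambda x, y, s: 1e9)
```

With a 64x64 CTU and an 8x8 minimum, three levels of quad-splitting yield 64 leaf CUs in the busy case.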
The mode management unit 103 manages the mode that determines the coding method of each CU. Encoding is attempted with a plurality of intra prediction, inter prediction, and synthesis prediction methods, and the mode with the highest coding efficiency is determined for the CU. The most efficient mode is one that minimizes the coding error for a given amount of code; since more than one such criterion is possible, an appropriate one may be chosen according to the situation. Which mode is efficient is determined by combining the prediction processes of the intra prediction unit 104, the inter prediction unit 105, and the synthesis prediction unit 120 in the respective modes with code-amount measurement of the residual components and various flags by the other processing units, and with estimation of the reproduced-image error at decoding time. In general, the mode is determined per CU, but a CU may also be divided into sub-blocks with a mode determined for each sub-block.
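The criterion "minimize the coding error for a given amount of code" is commonly formalized as a Lagrangian rate-distortion cost J = D + λR. The sketch below is illustrative only; the mode names and the distortion/bit figures are invented for the example and are not taken from the patent.

```python
def select_mode(candidates, lam=0.5):
    """Pick the mode minimizing J = distortion + lam * bits.

    candidates: list of (mode_name, distortion, bits).
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

# Hypothetical measurements for one CU.
modes = [
    ("intra_angular", 120.0, 40),  # low distortion, many bits
    ("inter_merge",   150.0, 10),  # higher distortion, few bits
    ("combined",      125.0, 20),  # in between
]
best = select_mode(modes)
```

Varying λ shifts the trade-off: λ = 0 selects purely on distortion, while a large λ favors cheap-to-signal modes.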
The prediction methods for an encoding-target block (CU or sub-block) are intra prediction, inter prediction, and synthesis prediction, which combines the two; these are performed by the intra prediction unit 104, the inter prediction unit 105, and the synthesis prediction unit 120, respectively. Intra prediction uses information of the same frame encoded before the encoding-target block, while inter prediction uses information of frames, earlier or later in playback time, that were encoded before the encoding-target frame. Although the intra prediction unit 104, the inter prediction unit 105, and the synthesis prediction unit 120 are each described here as a single unit for convenience, one may be provided per coding mode or per frame.
The intra prediction unit 104 performs intra prediction processing; this "prediction processing" generates a predicted image. Intra prediction predicts the pixels of the encoding-target block using information of the same frame encoded before that block. Intra prediction includes DC (Direct Current) prediction, angular prediction, planar prediction, matrix prediction, cross-component prediction, multi-line prediction, intra block copy, and the like. When signaling the intra prediction mode, the most probable mode is estimated from the intra prediction modes of already-encoded blocks.
The inter prediction unit 105 performs inter prediction processing. In addition, a prediction image is generated in the "prediction process". The inter-picture prediction process predicts the pixels of the encoding target block using information of a frame before or after the reproduction time, which is encoded before the encoding target frame. The inter prediction includes motion compensation prediction, merge mode prediction, prediction based on affine transformation, prediction based on triangular block segmentation, optical flow prediction, prediction based on decoder-side motion prediction, and the like.
The synthesis prediction unit 120 performs synthesis prediction, generating a synthesized predicted image in which a predicted image obtained by intra prediction and a predicted image obtained by inter prediction are combined. The intra-predicted image used by the synthesis prediction unit 120 may be one obtained by any of the intra prediction processes that the intra prediction unit 104 can perform, and the inter-predicted image may be one obtained by any of the inter prediction processes that the inter prediction unit 105 can perform. Details of the synthesis method are described later.
The block processing unit 106 obtains, for each block to be encoded, a difference (difference) between a predicted image generated by the intra prediction unit 104 through intra prediction, a predicted image generated by the inter prediction unit 105 through inter prediction, or a predicted image generated by the synthesis prediction unit 120 through synthesis prediction and an original image of the block to be encoded obtained from the block division unit 102, calculates a residual component, and outputs the residual component.
The transform/quantization unit 107 applies a frequency transform and quantization to the residual component input from the block processing unit 106 and outputs a group of coefficients. The frequency transform may use the DCT (Discrete Cosine Transform), the DST (Discrete Sine Transform), or integer-arithmetic approximations of them. The coefficient group is sent both to the process that restores the image to generate a decoded image for prediction and to the data output process. The transform and quantization may be skipped depending on the mode designation.
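The transform/quantization step can be sketched as a 2-D DCT-II followed by uniform scalar quantization. This is a floating-point illustration of the data flow (residual → coefficients → quantized levels) only; actual codecs use integer-approximated transforms and standard-specific quantization scaling.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    return [[math.sqrt((1 if k == 0 else 2) / n) *
             math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def transform_quantize(residual, qstep):
    """2-D DCT (C = D R D^T) then uniform quantization by qstep."""
    d = dct_matrix(len(residual))
    coeffs = matmul(matmul(d, residual), transpose(d))
    return [[round(c / qstep) for c in row] for row in coeffs]

# A constant residual block puts all its energy in the DC coefficient.
levels = transform_quantize([[4.0] * 4 for _ in range(4)], qstep=1.0)
```

For the constant 4x4 block of value 4, only the top-left (DC) level is nonzero.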
The inverse quantization/inverse transform unit 108 applies inverse quantization and an inverse transform to the coefficient group obtained from the transform/quantization unit 107 and outputs the restored residual component for generating a decoded image used in prediction. The inverse quantization and inverse transform are the inverse operations of the quantization and transform performed by the transform/quantization unit, and may likewise be skipped depending on the mode designation.
The image synthesis/filtering unit 109 synthesizes the predicted image generated by the intra prediction unit 104, the predicted image generated by the inter prediction unit 105, or the predicted image generated by the synthesis prediction unit 120 with the residual component restored by the inverse quantization/inverse transformation unit 108, and performs processing such as loop filtering to generate a decoded image.
The decoded image management unit 110 holds a decoded image, and manages images, mode information, and the like that are referred to for intra prediction, inter prediction, or composite prediction.
The entropy encoding unit 111 entropy-encodes the mode information and the coefficient-group information and outputs the result as a bit string. CABAC (Context-Adaptive Binary Arithmetic Coding) or the like may be used as the entropy coding method, and variable-length and fixed-length codes may be used in combination. Context determination may refer to a predetermined table.
The data output unit 112 outputs the encoded data to the recording medium and the transmission path.
Next, a flow of an encoding method in the image encoding device according to embodiment 1 of the present invention will be described with reference to fig. 3.
First, in step 301, an original image to be encoded is input, its content is analyzed to determine a division method, and the image is divided into blocks. The content analysis may be performed on the entire image, on several frames together, or on each unit obtained by dividing the image, such as a slice, tile, brick, or CTU. The image is generally divided into CTUs of a fixed size, and each CTU is then divided into CUs in a tree structure.
Next, in step 302, intra prediction is performed on the encoding target block of the original image acquired in step 301. The intra prediction mode is as described above. Prediction is performed for a plurality of modes for each intra prediction mode.
Next, in step 303, inter prediction is performed on the encoding target block of the original image acquired in step 301. The inter prediction mode is as described above. Prediction is performed for a plurality of modes for each inter prediction mode.
Next, in step 320, synthesis prediction is performed on the encoding target block of the original image acquired in step 301. Details of the synthesized prediction mode are described later.
Next, in step 304, for each mode, the residual component is obtained as the difference between the intra-predicted, inter-predicted, or synthesis-predicted image and the image of the encoding-target block, and the transform, quantization, and entropy encoding of the residual component are performed to produce encoded data.
Next, in step 305, inverse quantization and inverse transform processing are performed in accordance with each mode, and a residual component is synthesized with the predicted image, thereby generating a decoded image. The decoded image is managed together with prediction data and various kinds of encoded data in intra prediction and inter prediction, and used for prediction of other blocks to be encoded.
Next, in step 306, the patterns are compared to determine the pattern that can be encoded with the highest efficiency. The modes include an intra prediction mode, an inter prediction mode, a synthesis prediction mode, and the like, and are collectively referred to as an encoding mode. The mode selection method is as described above.
In step 307, the encoded data of the block to be encoded is output in accordance with the determined encoding mode. The above-described encoding process for each block to be encoded is repeated for the entire image, and the image is encoded.
Next, the synthetic prediction mode of the present embodiment will be described. Here, the same processing is performed on the encoding side and the decoding side in the predicted image generation processing based on the combined prediction mode. Therefore, in the following description of the synthetic prediction mode of the present embodiment, the same processing is performed on the encoding side and the decoding side unless otherwise specified.
The synthesized prediction mode of the present embodiment generates a predicted image by taking a weighted average of an intra-predicted image obtained by intra prediction and an inter-predicted image obtained by inter prediction. Specifically, letting Pinter be an inter-predicted pixel and Pintra be an intra-predicted pixel, each pixel Pcom of the synthesized predicted image (hereinafter, a synthesized prediction pixel) is calculated by Formula 1 below.
Pcom = (w × Pintra + (4 − w) × Pinter) / 4 … (Formula 1)
That is, Formula 1 is a weighted average of the inter-prediction pixel and the intra-prediction pixel using a weighting parameter w. This process may also be described as weighted addition.
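Formula 1 can be written directly as code. One assumption is made beyond the text: the `+ 2` rounding offset before the integer division (the fixed-point idiom for rounding to nearest) is not specified by the formula, so it is added here only as an illustration.

```python
def combine_prediction(p_intra, p_inter, w):
    """Per-pixel weighted average per Formula 1: (w*Pintra + (4-w)*Pinter)/4.

    p_intra, p_inter: equally sized 2-D lists of pixel values.
    w: integer weight in 0..4 (w=2 gives a plain average).
    The `+ 2` is an assumed round-to-nearest offset for integer division.
    """
    assert 0 <= w <= 4
    return [[(w * a + (4 - w) * b + 2) // 4
             for a, b in zip(row_i, row_p)]
            for row_i, row_p in zip(p_intra, p_inter)]

intra = [[100, 100], [100, 100]]
inter = [[60, 60], [60, 60]]
equal = combine_prediction(intra, inter, w=2)        # plain average
intra_heavy = combine_prediction(intra, inter, w=3)  # favors intra
```

With w = 2 the result is the midpoint (80); raising w to 3 pulls each pixel toward the intra prediction (90).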
Any predicted image produced by the inter predictions usable in the inter prediction mode described above can be a candidate for the inter-predicted image used in the synthesized prediction mode of the present embodiment. Alternatively, only predicted images obtained in merge mode may be eligible.
Likewise, any predicted image produced by the intra predictions usable in the intra prediction mode described above can be a candidate for the intra-predicted image used in the synthesized prediction mode. Alternatively, only predicted images obtained by angular prediction, DC prediction, or planar prediction may be eligible.
Here, in the synthesized prediction mode of the present embodiment, the weighting parameter w used to calculate the synthesized prediction pixels is determined according to the type of intra prediction used to generate the intra-predicted image. That is, by changing the weight of the intra-predicted image in the weighted average of the intra-predicted and inter-predicted images according to the characteristics of each kind of intra prediction, a more suitable predicted image can be generated.
A specific example of the process of determining the weighting parameter w in the synthesized prediction mode according to the present embodiment will be described below.
First, the intra predictions included among the candidates for generating the intra-predicted image used in the synthesized prediction mode of the present embodiment, namely angular prediction, DC prediction, and planar prediction, are described with reference to Figs. 6 and 7.
The prediction directions of angular prediction are shown in Fig. 6. Angular prediction predicts each target pixel from the neighboring pixels indicated by the prediction directions in the figure. The prediction is unidirectional: the neighboring pixel value is copied along the direction opposite to each prediction direction. Each direction is identified by a number; here the directions are numbered 2 to 66. The figure shows representative directions: 2 (lower left), 18 (horizontal left), 34 (upper left), 50 (upward, i.e. vertical), and 66 (upper right). The remaining directions are omitted for simplicity, but the prediction direction rotates clockwise from direction 2 to direction 66 as the number increases. In short, angular prediction is a unidirectional prediction based on neighboring pixel values.
Therefore, the prediction residual tends to be small for images with a pattern, such as stripes, extending in a fixed direction, and a smaller prediction residual means a higher compression ratio. When the encoder selects angular prediction, the image being encoded is likely one whose pattern extends in a fixed direction. In that case, neighboring blocks are also likely to use angular prediction with the same prediction direction as the encoding-target block, or a direction close to it: as long as the directional pattern continues, angular prediction along that direction is likely to keep the residual small. Equivalently, the prediction accuracy is likely to be high.
Consequently, angular prediction with the same or a close direction is likely to be selected repeatedly across neighboring blocks. That is, angular prediction can be regarded as an intra prediction with relatively high similarity to the prediction methods of neighboring blocks. Here, a "prediction direction close to that of the encoding-target block" may be defined concretely as one whose direction number is within ±1 of the encoding-target block's direction number, or alternatively within ±2.
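The "close prediction direction" criterion above is simple enough to state as a helper function. This is a sketch of the definition in the text, with the tolerance (±1 or ±2) exposed as a parameter.

```python
def is_close_direction(mode_a, mode_b, tolerance=1):
    """True if two angular prediction directions are 'close'.

    Angular modes are numbered 2..66, rotating clockwise from lower-left
    to upper-right; the text treats directions as close when their mode
    numbers differ by at most 1 (or, alternatively, 2).
    """
    for m in (mode_a, mode_b):
        assert 2 <= m <= 66, "angular prediction directions are numbered 2..66"
    return abs(mode_a - mode_b) <= tolerance

same = is_close_direction(50, 50)              # identical vertical modes
near = is_close_direction(50, 51)              # within +/-1
far = is_close_direction(50, 53)               # outside +/-1
far_tol2 = is_close_direction(50, 52, tolerance=2)
```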
Fig. 6 also shows DC prediction, numbered 1. DC prediction generates a predicted image corresponding only to the DC component in the frequency domain; that is, every pixel value within the DC-predicted image is the same constant. In this sense, DC prediction is not a prediction based on neighboring pixel values. The images for which DC prediction tends to reduce the residual are wide, flat regions of roughly constant luminance. Thus, when the encoder selects DC prediction for the encoding-target block, neighboring blocks are also likely to use DC prediction, because as long as the flat region of constant luminance continues into the surroundings, DC prediction is likely to keep the residual small.
Equivalently, the prediction accuracy is likely to be high. In this case, the same DC prediction is likely to be selected repeatedly across neighboring blocks. That is, although DC prediction is not based on neighboring pixel values, it can still be regarded as an intra prediction with relatively high similarity to the prediction methods of neighboring blocks.
Fig. 6 further shows planar prediction, numbered 0, and Fig. 7 shows the concrete prediction procedure. In planar prediction, a reference pixel column is generated at the right boundary of the encoding-target block from the block's upper-right neighboring pixel (processes 710, 711). Each prediction pixel is then generated by linear interpolation between this right-side reference pixel and the opposing left-side neighboring pixel, yielding a first predicted image (Fig. 7(A)) (process 712).
Similarly, a reference pixel row is generated at the lower boundary of the encoding-target block from the block's lower-left neighboring pixel (processes 720, 721), and each prediction pixel is generated by linear interpolation between this lower reference pixel and the opposing upper neighboring pixel, yielding a second predicted image (Fig. 7(B)) (process 722). The first predicted image (Fig. 7(A)) and the second predicted image (Fig. 7(B)) are then averaged to produce the final predicted image. Planar prediction is thus characterized by averaging two linearly interpolated images to generate a predicted image that forms a nonlinear curved surface.
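The planar procedure above (processes 710-722) can be sketched as follows. The weighting mirrors the HEVC/VVC-style unscaled planar formulation, but rounding and fixed-point scaling are simplified, so this is an illustration of the two-interpolation-plus-average structure rather than a bit-exact implementation.

```python
def planar_predict(top, left, top_right, bottom_left):
    """Planar prediction for an N x N block.

    top, left: the N neighboring pixels above and to the left of the block.
    top_right: the upper-right neighbor, replicated as the right reference.
    bottom_left: the lower-left neighbor, replicated as the bottom reference.
    """
    n = len(top)
    pred = []
    for y in range(n):
        row = []
        for x in range(n):
            # First image: interpolate between left neighbor and right reference.
            horiz = ((n - 1 - x) * left[y] + (x + 1) * top_right) / n
            # Second image: interpolate between top neighbor and bottom reference.
            vert = ((n - 1 - y) * top[x] + (y + 1) * bottom_left) / n
            # Final image: average of the two interpolated images.
            row.append((horiz + vert) / 2)
        pred.append(row)
    return pred

# With all references equal, the predicted surface is flat at that value.
flat = planar_predict([80] * 4, [80] * 4, 80, 80)
```

With differing references the averaged surface bends smoothly between them, which is the "nonlinear curved surface" character noted in the text.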
Planar prediction is a prediction based on neighboring pixel values. However, since the predicted image generated by planar prediction is a characteristic nonlinear curved surface, its similarity to the prediction methods of the adjacent blocks is not high. As described above, when angular prediction or DC prediction reduces the prediction residual, the same prediction method as that of the adjacent blocks is likely to be repeated. In contrast, there are not many images whose patterns call for repeating the characteristic nonlinear curved surface of the planar predicted image across blocks.
Therefore, planar prediction can be regarded as an intra prediction whose similarity to the prediction methods of the adjacent blocks is relatively low. When the encoding side selects planar prediction for the encoding target block, the image is likely to be one for which the prediction residual was not sufficiently reduced by angular prediction or DC prediction. Since such an image is not a simple pattern, the prediction residual obtained by applying planar prediction to it is likely to be larger than the prediction residual obtained when angular prediction or DC prediction succeeds. This can also be expressed as a high possibility that the prediction accuracy is low.
As described using fig. 6 and fig. 7, the angular prediction, DC prediction, and planar prediction usable in the synthetic prediction mode of the present embodiment are intra predictions with different characteristics. In the synthetic prediction mode of the present embodiment, the value of the weighting parameter w in equation 1 is controlled according to the type of intra prediction, focusing on these characteristics. The process of determining w in the synthetic prediction mode of the present embodiment will be described with reference to fig. 8.
Fig. 8 shows examples of determining w in the synthetic prediction mode of the present embodiment: w determination examples 1, 2, and 3, each a different determination method. In each method, w takes the value 1, 2, or 3 depending on the condition. In equation 1, the larger the value of w, the greater the weight of intra prediction in the synthetic prediction mode; conversely, the smaller the value of w, the smaller the weight of intra prediction. These determination examples are explained below.
In w determination example 1, when the intra prediction used in the synthetic prediction mode is planar prediction, w is determined to be 1. When it is angular prediction with the horizontal (18) or vertical (50) prediction direction, w is determined to be 3. When it is angular prediction with prediction directions 2 to 17 or 51 to 66, w is determined to be 3. When it is angular prediction with prediction directions 19 to 49, w is determined to be 3. When it is DC prediction, w is determined to be 3.
That is, w determination example 1 gives planar prediction, the intra prediction whose similarity to the prediction methods of the adjacent blocks is relatively low, a smaller value of w than angular prediction and DC prediction. Put differently, since the planar predicted image is a characteristic nonlinear curved surface and its prediction accuracy is likely to be low, it is preferable to apply the smallest weight to it in the synthetic prediction mode.
In contrast, in w determination example 1, angular prediction, an intra prediction whose similarity to the prediction methods of the adjacent blocks is relatively high, is assigned w = 3, larger than the value for planar prediction, in every prediction direction. Put differently, since the angular predicted image results from simple unidirectional prediction and its prediction accuracy is likely to be high, it is preferable to apply a relatively large weight to it in the synthetic prediction mode.
Similarly, in w determination example 1, DC prediction, an intra prediction whose similarity to the prediction methods of the adjacent blocks is relatively high, is assigned w = 3, larger than the value for planar prediction. Put differently, since the DC predicted image is a simple plane of constant value and its prediction accuracy is likely to be high, it is preferable to apply a relatively large weight to it in the synthetic prediction mode.
In this way, w determination example 1 determines the value of w according to the type of intra prediction, giving planar prediction the smallest value. For angular prediction, w is constant regardless of the prediction direction; that is, in w determination example 1 the value of w does not change with the prediction direction of angular prediction.
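For illustration, w determination example 1 can be written as a small lookup function. This is a sketch; the mode numbering (0 for planar, 1 for DC, 2 to 66 for angular directions) follows fig. 6 as described in the text, and the function names are hypothetical.

```python
PLANAR, DC = 0, 1  # mode numbers per fig. 6; angular modes are 2-66

def decide_w_example1(intra_mode):
    """w determination example 1: planar prediction gets the smallest
    weight; DC and every angular direction share the largest."""
    if intra_mode == PLANAR:
        return 1
    # DC prediction and all angular directions (2-66) use w = 3.
    return 3
```

The function makes the key property of example 1 explicit: w depends only on the mode type, never on the angular direction.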
Next, in w determination example 2, when the intra prediction used in the synthetic prediction mode is planar prediction, w is determined to be 1. When it is angular prediction with the horizontal (18) or vertical (50) prediction direction, w is determined to be 2. When it is angular prediction with prediction directions 2 to 17 or 51 to 66, w is determined to be 3. When it is angular prediction with prediction directions 19 to 49, w is determined to be 3. When it is DC prediction, w is determined to be 3.
That is, like w determination example 1, w determination example 2 gives planar prediction, the intra prediction with relatively low similarity to the prediction methods of the adjacent blocks, a smaller value of w than angular prediction and DC prediction, for the same reason as in determination example 1. In w determination example 2, angular prediction, with its relatively high similarity to the prediction methods of the adjacent blocks, is assigned w = 2 or 3, larger than the value for planar prediction, in every prediction direction.
Put differently, since the angular predicted image results from simple unidirectional prediction and its prediction accuracy is likely to be high, it is preferable to apply a relatively large weight to it in the synthetic prediction mode. However, unlike w determination example 1, w determination example 2 varies the value of w according to the prediction direction even within angular prediction.
Specifically, when the prediction direction is horizontal (18) or vertical (50), w is set to 2, a lower weight than the value 3 used for the other angular prediction directions. This is because the reference pixels of the horizontal direction (18) are only pixels of the left adjacent block, so any similarity with the prediction methods of the adjacent blocks is likely limited to the left adjacent block, and the overall similarity with the prediction methods of the adjacent blocks is considered lower than for other prediction directions.
Similarly, since the reference pixels of the vertical direction (50) are only pixels of the upper adjacent block, any similarity with the prediction methods of the adjacent blocks is likely limited to the upper adjacent block, and the overall similarity is considered lower than for other prediction directions. In addition, as in w determination example 1, w determination example 2 assigns DC prediction, with its relatively high similarity to the prediction methods of the adjacent blocks, w = 3, larger than the value for planar prediction, for the same reason as in determination example 1.
In this way, w determination example 2 determines the value of w according to the type of intra prediction, and, when the intra prediction is angular prediction, also takes the prediction direction into account. Planar prediction is again given the smallest value.
w determination example 3 differs from w determination example 2 only in that, when the intra prediction used in the synthetic prediction mode is angular prediction with prediction directions 2 to 17 or 51 to 66, w is determined to be 2; the other determinations are the same. For prediction directions 2 to 17 or 51 to 66, the prediction direction is neither purely horizontal nor purely vertical, so the prediction method retains some similarity with one of the upper and left adjacent blocks.
However, the reference pixels then come only from either the upper adjacent block or the left adjacent block. For prediction directions 2 to 17 or 51 to 66, the overall similarity to the prediction methods of the adjacent blocks is therefore likely to be lower than for prediction directions 19 to 49, whose reference pixels include pixels of both the upper and the left adjacent blocks. In view of this, w determination example 3 sets w to 2 for prediction directions 2 to 17 or 51 to 66, a smaller weight than for prediction directions 19 to 49.
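For comparison, w determination examples 2 and 3 can be sketched alongside each other. These are hypothetical helper functions; the mode numbering follows fig. 6 as described in the text, and only the w values stated above are taken from the patent.

```python
PLANAR, DC = 0, 1  # mode numbers per fig. 6; angular modes are 2-66

def decide_w_example2(intra_mode):
    """Example 2: purely horizontal/vertical angular directions get a
    reduced weight; all other angular directions and DC get w = 3."""
    if intra_mode == PLANAR:
        return 1
    if intra_mode == DC:
        return 3
    if intra_mode in (18, 50):       # horizontal / vertical direction
        return 2
    return 3                         # directions 2-17, 19-49, 51-66

def decide_w_example3(intra_mode):
    """Example 3: additionally reduce the weight for directions whose
    reference pixels come from only one neighboring side."""
    if intra_mode not in (PLANAR, DC) and (
            2 <= intra_mode <= 17 or 51 <= intra_mode <= 66):
        return 2
    return decide_w_example2(intra_mode)
```

Only prediction directions 19 to 49, which reference both the upper and left neighbors, keep the full weight of 3 in example 3.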
According to the examples of determining w in the synthetic prediction mode of fig. 8 described above, the weighting of the intra predicted image when combining the intra predicted image and the inter predicted image can be determined more appropriately according to the characteristics of each type of intra prediction used in the synthetic prediction mode.
As described above, the encoding process in one embodiment of the present application is performed.
According to the image encoding device and the image encoding method of embodiment 1 described above, a more preferable synthesis prediction mode can be realized.
In addition, the image encoding device and the image encoding method of embodiment 1 can be applied to a recording device, a mobile phone, a digital camera, and the like using them.
According to the image encoding device and the image encoding method of embodiment 1 of the present invention described above, the amount of encoded data can be reduced, and the quality of a decoded image when the encoded data is decoded can be prevented from deteriorating. That is, a high compression ratio and a better image quality can be achieved.
That is, according to the image encoding device and the image encoding method of embodiment 1 of the present invention, a more preferable image encoding technique can be provided.
(Embodiment 2)
Next, fig. 2 shows an example of a block diagram of an image decoding apparatus according to embodiment 2 of the present invention.
The image decoding apparatus includes, for example, a stream analysis unit 201, a block management unit 202, a mode determination unit 203, an intra prediction unit 204, an inter prediction unit 205, a synthesis prediction unit 220, a coefficient analysis unit 206, an inverse quantization/inverse transformation unit 207, an image synthesis/filter unit 208, a decoded image management unit 209, and an image output unit 210.
The operation of each component of the image decoding apparatus will be described in detail below.
Each component of the image decoding apparatus may, for example, operate autonomously as described below. Alternatively, the operations may be realized by, for example, cooperation between the control unit and software stored in the storage unit.
First, the stream analysis unit 201 analyzes the input encoded stream. Here, the stream analysis unit 201 also extracts data from packets and acquires the information of various headers and flags.
The encoded stream input to the stream analysis unit 201 is, for example, an encoded stream generated by the image encoding device and the image encoding method of embodiment 1. The generation method is as described in embodiment 1, so its description is omitted. It may instead be an encoded stream read from the data recording medium described in embodiment 3; the recording method is described later.
Next, the block management unit 202 manages the processing of blocks according to the block division information analyzed by the stream analysis unit 201. Generally, an encoded image is divided into blocks, and each encoding target block is managed by a tree structure or the like. Blocks are usually processed in raster scan order, but they may be processed in any predetermined order, such as zigzag scan. The block division method is as described in embodiment 1.
Next, the mode determination unit 203 determines, for each encoding target block, the coding mode specified by a flag or the like. The decoding process then performs processing corresponding to the determined coding mode. The processing for each coding mode is described below.
First, when the coding mode is intra coding, the intra prediction unit 204 generates a predicted image based on intra prediction. The intra prediction mode is as described in embodiment 1. The generation of a predicted image by intra prediction is basically the same processing on the encoding side and the decoding side.
When the coding mode is coding based on inter prediction, the inter prediction unit 205 generates a predicted image based on inter prediction. The inter prediction mode is as described in embodiment 1. The generation of a predicted image by inter prediction is basically the same processing on the encoding side and the decoding side.
When the coding mode is coding based on synthetic prediction, the synthesis prediction unit 220 generates a predicted image based on synthetic prediction. The synthetic prediction mode is as described in embodiment 1. The generation of a predicted image by synthetic prediction is basically the same processing on the encoding side and the decoding side.
On the other hand, the coefficient analysis unit 206 analyzes the encoded data of each encoding target block included in the input encoded stream, decodes the entropy-coded data, and outputs the encoded data of the coefficient group containing the residual component. At this time, processing corresponding to the coding mode determined by the mode determination unit 203 is performed.
The inverse quantization/inverse transformation unit 207 performs inverse quantization processing and inverse transformation on the encoded data of the coefficient group including the residual component, and restores the residual component. The inverse quantization and inverse transformation methods are as described above. Inverse quantization and inverse transformation may also be skipped according to the mode designation.
The residual components restored as described above are synthesized by the image synthesis/filter unit 208 with the prediction image output from the intra prediction unit 204, the inter prediction unit 205, or the synthesis prediction unit 220, and are output as a decoded image by performing processing such as loop filtering.
The decoded image management unit 209 holds the decoded image, and manages images, mode information, and the like to be referred to for intra prediction, inter prediction, or synthetic prediction.
Finally, the decoded image is output by the image output unit 210, completing the decoding of the image.
Next, a flow of an image decoding method in the image decoding device according to embodiment 2 of the present invention will be described with reference to fig. 4.
First, in step 401, a coded stream to be decoded is acquired, and data is analyzed. In addition, the processing of the block is managed according to the block division information obtained by the analysis. The block division method is as described in embodiment 1.
Next, in step 402, the coding mode information analyzed in step 401 is used to determine the coding mode of one coding unit (block unit, pixel unit, or the like) included in the encoded data. The process proceeds to step 403 for the intra coding mode, to step 404 for the inter coding mode, and to step 420 for the synthetic coding mode.
In step 403, a predicted image is generated by intra prediction according to the method specified by the encoding mode. The intra prediction mode is as described in embodiment 1.
In step 404, a predicted image is generated by inter prediction according to the method specified by the encoding mode. The inter prediction mode is as described in embodiment 1.
In step 420, a predicted image is generated by the synthetic prediction according to the method specified by the encoding mode. The synthetic prediction mode is as described in example 1.
In step 405, the encoded data of each block to be encoded is analyzed by a method specified by the encoding mode, the entropy-encoded data is decoded, and the encoded data of the coefficient group including the residual component is output. Further, the encoded data of the coefficient group including the residual component is subjected to inverse quantization and inverse transformation, and the residual component is restored. The inverse quantization and inverse transformation methods are as described above. Inverse quantization and inverse transformation may also be skipped according to the mode designation.
In step 406, a decoded image is generated by synthesizing a predicted image generated by intra prediction, inter prediction, or synthetic prediction, etc., with the residual component obtained by the restoration, and further performing processing such as loop filtering, etc., for each block to be encoded. The above-described decoding process is performed on the entire image in units of blocks to be encoded, thereby generating a decoded image.
In step 407, the generated decoded image is output and displayed.
The processing in the decoding-side synthetic prediction mode in this embodiment is substantially the same as the processing in the synthetic prediction mode described in embodiment 1 using expression 1, fig. 6, fig. 7, and fig. 8, and therefore, the repetitive description thereof will be omitted. In addition, the expression "encoding target block" in the description of the synthetic prediction mode in embodiment 1 may be replaced with "decoding target block" in the processing of the synthetic prediction mode on the decoding side.
In the decoding-side synthesized prediction mode of the present embodiment, as described above, weighted average processing using the inter prediction pixels and the intra prediction pixels in expression 1 is performed. In the decoding-side synthetic prediction mode of the present embodiment, the weighting parameter w is determined by the determination process shown in fig. 8 as described above. This makes it possible to more appropriately determine the weighting of the intra-predicted image in the combination of the intra-predicted image and the inter-predicted image, according to the characteristics of each type of intra-prediction used in the combined prediction mode.
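Equation 1 itself is not reproduced in this excerpt. As a hedged sketch, the weighted average it describes can be modeled as an integer blend in which the intra and inter weights sum to 4, in the style of CIIP. This normalization is an assumption consistent with the statement that a larger w gives the intra predicted image a larger weight, but it is not confirmed by the text.

```python
def combine_prediction(intra_pred, inter_pred, w):
    """Hedged sketch of the equation 1 weighted average.

    Assumes pred = ((4 - w) * inter + w * intra + 2) >> 2 per pixel
    (weights summing to 4 is an assumption of this sketch; the text
    only states that larger w means a larger intra weight).
    """
    h, w_px = len(inter_pred), len(inter_pred[0])
    return [
        [((4 - w) * inter_pred[y][x] + w * intra_pred[y][x] + 2) >> 2
         for x in range(w_px)]
        for y in range(h)
    ]
```

Under this model, w = 3 places three quarters of the weight on the intra predicted pixel and w = 1 places three quarters on the inter predicted pixel.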
In addition to the example shown in the present embodiment, the size of the block used in a coding mode may be used as a parameter to subdivide each coding mode further, and an encoded stream predetermined in this way may be used as the stream to be decoded.
As described above, the decoding process of one embodiment of the present application is performed.
The image decoding apparatus and the image decoding method of embodiment 2 can be applied to a playback apparatus, a mobile phone, a digital camera, and the like using them.
According to the image decoding device and the image decoding method of embodiment 2 of the present invention described above, encoded data with a small code amount can be decoded with higher picture quality.
That is, according to the image decoding apparatus and the image decoding method of embodiment 2 of the present invention, a more preferable image decoding technique can be provided.
(Embodiment 3)
Next, fig. 5 shows an example of a data recording medium according to embodiment 3 of the present invention.
The encoded stream of the present embodiment of the present invention is an encoded stream generated by the image encoding apparatus or the image encoding method of embodiment 1. The generation method is as described in embodiment 1, so its description is omitted.
Here, the encoded stream of the present embodiment is recorded as a data string 502 in the data recording medium 501, for example. The data string 502 is recorded as a coded stream conforming to a prescribed syntax, for example.
First, the encoded stream is divided into units of a certain size called NAL (Network Abstraction Layer) units 503. The bit string of each NAL unit is read according to a predetermined rule, such as variable-length coding, and converted into an RBSP (Raw Byte Sequence Payload). The RBSP data consists of information such as the sequence parameter set 504, the picture parameter set 505, the decoding parameter set, the video parameter set, and slice data 506.
Each slice contains, for example, information 507 about each block. Within the information about a block there is, for example, an area recording the coding mode of that block; this area is the coding mode flag 508.
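As an illustration of the NAL-to-RBSP conversion mentioned above, one conventional "predetermined rule" strips the emulation-prevention byte 0x03 that follows two zero bytes. This particular rule is borrowed from H.264/HEVC-style syntax as an assumption; the text does not spell out which rule this embodiment uses.

```python
def nal_to_rbsp(nal_payload: bytes) -> bytes:
    """Recover the RBSP from a NAL unit payload by removing the
    emulation-prevention byte (0x000003 -> 0x0000). Illustrative only;
    the patent text says merely that a 'predetermined rule' is used."""
    rbsp = bytearray()
    zeros = 0
    for b in nal_payload:
        if zeros >= 2 and b == 0x03:
            zeros = 0            # drop the emulation-prevention byte
            continue
        rbsp.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(rbsp)
```

The resulting RBSP bytes are what the parameter-set and slice syntax above would be parsed from.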
According to the data recording medium of embodiment 3 of the present invention described above, the amount of code can be reduced, and image quality degradation can be prevented. That is, a data recording medium on which a coded stream having a high compression rate and a high image quality is recorded can be realized.
Further, the coded stream generated by the image coding apparatus or the image coding method in each embodiment described later can also be recorded in the data recording medium of the present embodiment with the configuration described above in the present embodiment.
(Embodiment 4)
Next, embodiment 4 of the present invention will be described with reference to the drawings.
In embodiment 4 of the present invention, the process of determining the weighting parameter w in the synthetic prediction process of the image encoding device and the image encoding method of embodiment 1 is changed.
Specifically, the process of determining the weighting parameter w is changed from the process of determining the weighting parameter w shown in fig. 8 to the process of determining the weighting parameter w shown in fig. 9. In the image encoding device and the image encoding method according to embodiment 4, the configuration, operation, and processing other than the determination processing of the weighting parameter w shown in fig. 9 are the same as those of the image encoding device and the image encoding method according to embodiment 1. Therefore, the same points as those of the image encoding device and the image encoding method of embodiment 1 will not be described repeatedly.
Fig. 9 shows an example of the process of determining the weighting parameter w in the synthetic prediction process of the image encoding apparatus and the image encoding method of embodiment 4 of the present invention. Since the same determination process is performed on the decoding side, the prediction target block is referred to below as the "encoding (decoding) target block".
In the example of fig. 9, unlike the example of fig. 8, the process of determining the weighting parameter w considers not only the intra prediction mode of the encoding (decoding) target block but also the combination of the prediction modes of a plurality of adjacent blocks. In the present embodiment, as an example of the plurality of adjacent blocks, the combination of the prediction modes of two adjacent blocks is considered: the block immediately above and the block immediately to the left of the encoding (decoding) target block.
That is, whereas the example of fig. 8 determines the weighting parameter w only from the intra prediction mode of the encoding (decoding) target block, in the example of fig. 9 the weighting parameter w changes with the combination of the prediction modes of the plural adjacent blocks even when the intra prediction mode of the encoding (decoding) target block is the same.
Embodiment 1 explained that the similarity to the prediction methods of the adjacent blocks differs depending on the type of the intra prediction mode of the encoding (decoding) target block. Specifically, this similarity was characterized for planar prediction, for angular prediction with prediction direction 18 or 50, for angular prediction with prediction directions 2 to 17 or 51 to 66, for angular prediction with prediction directions 19 to 49, and for DC prediction.
In fig. 9, the weighting parameter w is determined, for each of these types of intra prediction mode, according to the combination of the prediction modes of the plural adjacent blocks.
The specific conditions on the combination of the prediction modes of the two adjacent blocks (the block immediately above and the block immediately to the left of the encoding (decoding) target block) are as follows:
(1) neither adjacent block is available;
(2) one adjacent block is unavailable and the other is inter-predicted;
(3) both adjacent blocks are inter-predicted;
(4) one adjacent block is unavailable and the other is intra-predicted;
(5) one adjacent block is inter-predicted and the other is intra-predicted;
(6) both adjacent blocks are intra-predicted.
Under the conditions (1) to (6), w is determined as follows.
First, when the intra prediction mode of the encoding (decoding) target block is planar prediction, w may be determined as shown in fig. 9. Specifically, w is set to 1 under conditions (1) to (3) and to 2 under conditions (4) to (6). As explained in embodiment 1, planar prediction is an intra prediction whose similarity to the prediction methods of the adjacent blocks is relatively low.
However, when an available intra-predicted block exists among the adjacent blocks, that block may use the same planar prediction as the encoding (decoding) target block, so the similarity to the prediction methods of the adjacent blocks improves somewhat. It is therefore preferable to use a larger weight under conditions (4) to (6) than under conditions (1) to (3). As a modification, w may be set to 1 under conditions (1) to (5) and to 2 only under condition (6). Since planar prediction uses both the upper and left neighboring pixels of the target block in its processing, the similarity of the prediction method improves only when both adjacent blocks are intra-predicted, and the modification relies on this reasoning.
Next, when the intra prediction mode of the encoding (decoding) target block is angular prediction with prediction direction 18 or 50, w may be determined as shown in fig. 9. Specifically, w is set to 1 under conditions (1) to (3) and to 2 under conditions (4) to (6). For angular prediction with prediction direction 18 or 50, the intra prediction has similarity to the prediction method of an adjacent block in only the horizontal or only the vertical direction.
Therefore, it is preferable to use a larger weight under conditions (4) to (6), where an available intra-predicted block exists, than under conditions (1) to (3). However, the effect of the similarity to the prediction methods of the adjacent blocks is limited to either the horizontal or the vertical direction. Hence, even under condition (6), where both adjacent blocks are available, this effect does not change compared with conditions (4) and (5), where only one is available. Accordingly, in the example of this figure, w is also set to 2 under condition (6).
Next, when the intra prediction mode of the encoding (decoding) target block is angular prediction with prediction directions 2 to 17 or 51 to 66, and when it is angular prediction with prediction directions 19 to 49, w may be determined as shown in fig. 9. Specifically, w is set to 1 under conditions (1) to (3), to 2 under conditions (4) and (5), and to 3 under condition (6). In these cases, since the prediction direction is oblique, the intra prediction can have similarity of the prediction method with both the upper and the left adjacent blocks. Thus, the more available intra-predicted blocks there are among the adjacent blocks, the higher the similarity to the prediction methods of the adjacent blocks tends to be. Therefore, it is preferable to set the weighting parameter w larger as the number of available intra-predicted blocks among the adjacent blocks increases.
Next, when the intra prediction mode of the block to be encoded (decoded) is DC prediction, w may be determined as shown in fig. 9. Specifically, w is set to 2 under conditions (1) to (3) and to 3 under conditions (4) to (6). As described in embodiment 1, when DC prediction is selected in the encoding-side process, the encoding target image is highly likely to be a widely spreading flat pattern of constant luminance. Since this probability is particularly high when the neighboring blocks include an available intra prediction block, it is preferable to set w to 3 in that case. However, in the example of this figure, when no available intra prediction block is included in the neighboring blocks, the possibility that the encoding target image is such a widely spreading flat pattern of constant luminance is somewhat lower, so w is set to 2.
According to the above-described determination process of the weighting parameter w in fig. 9, the weighting parameter w in expression 1 of the synthesis prediction process can be determined more preferably by considering not only the intra prediction mode of the block to be encoded (decoded) but also the combination of the prediction modes of a plurality of adjacent blocks.
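For illustration, the weighted synthesis referred to as expression 1 (not reproduced in this excerpt) can be sketched as a fixed-point blend of co-located inter-predicted and intra-predicted samples. The function name and the 2-bit precision (w on a 0 to 4 scale) are assumptions modeled on a common combined-prediction form, not taken from the patent text:

```python
def synthesize_prediction(inter_pred, intra_pred, w):
    """Blend co-located inter and intra prediction samples.

    w is the intra weight on an assumed 0..4 scale (2-bit fixed point),
    so the inter weight is (4 - w); the '+ 2' rounds before the shift.
    This form is an illustrative assumption, not expression 1 itself.
    """
    return [((4 - w) * p_inter + w * p_intra + 2) >> 2
            for p_inter, p_intra in zip(inter_pred, intra_pred)]
```

With w = 2 both predicted images contribute equally; the w values of 1 to 3 discussed for fig. 9 then shift the blend toward the inter- or intra-predicted image.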
Further, by using a combination of prediction modes of a plurality of adjacent blocks as a condition, it is possible to perform preferable condition setting without transmitting a new flag.
In the example of fig. 9, the weight w is larger under condition (6) than under condition (1) regardless of the prediction mode of the encoding (decoding) target block. This is because, whatever the prediction mode of the encoding (decoding) target block, the approximation to the prediction method of the neighboring blocks tends to improve as the number of available intra prediction blocks among the neighboring blocks increases.
In addition, in the example of fig. 9, in the case where neither plane prediction nor DC prediction is specified for intra prediction used in synthetic prediction, the weight w is increased in the case where there is an available intra prediction block in a plurality of adjacent blocks, as compared with the case where there is no available intra prediction block in a plurality of adjacent blocks.
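The fig. 9 decision described above can be sketched as follows. The mapping of conditions (1) to (3), (4) and (5), and (6) onto counts of available intra-predicted neighbors (0, 1, and 2, respectively) and all identifiers are illustrative assumptions; only the w values stated in this excerpt are covered:

```python
def count_available_intra_neighbors(left_is_intra, above_is_intra):
    # Conditions (1)-(3): no available intra neighbor -> count 0
    # Conditions (4), (5): exactly one available      -> count 1
    # Condition (6):       both available             -> count 2
    return int(left_is_intra) + int(above_is_intra)

def weight_w_fig9(intra_mode, direction, left_is_intra, above_is_intra):
    """Weighting parameter w per the fig. 9 example (illustrative sketch)."""
    n = count_available_intra_neighbors(left_is_intra, above_is_intra)
    if intra_mode == "DC":
        # w = 2 under conditions (1)-(3), w = 3 under conditions (4)-(6)
        return 2 if n == 0 else 3
    if intra_mode == "angular" and (2 <= direction <= 17 or 51 <= direction <= 66):
        # Diagonal directions: w = 1, 2, 3 for 0, 1, 2 available neighbors
        return 1 + n
    raise ValueError("mode/direction combination not described in this excerpt")
```

The fig. 10 modification described next differs only in returning 3 for DC prediction when both neighbors are available (n == 2) and 2 otherwise.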
As a modification of fig. 9, w may be set to 2 under conditions (4) and (5) for DC prediction, as shown in fig. 10. In the example of fig. 10, w is set to 3 for DC prediction only under condition (6). This is an example that regards the encoding target image as highly likely to be a widely spreading flat pattern of constant luminance only under condition (6), in which intra prediction is available for both of the plurality of adjacent blocks. The example of fig. 10 is the same as that of fig. 9 except for conditions (4) and (5) of DC prediction, and therefore repetitive description thereof will be omitted.
According to the image encoding device and the image encoding method of embodiment 4 of the present invention described above, an image can be encoded more preferably.
(example 5)
Next, example 5 of the present invention will be described.
In embodiment 5 of the present invention, the process of determining the weighting parameter w in the synthesis prediction process is changed with respect to the image decoding apparatus and the image decoding method of embodiment 2.
Specifically, the process of determining the weighting parameter w is changed from the process of determining the weighting parameter w shown in fig. 8 to the process of determining the weighting parameter w shown in fig. 9 or 10. The image decoding apparatus and the image decoding method according to embodiment 5 use the determination process of the weighting parameter w shown in fig. 9 or 10, and the configuration, operation, and process other than the determination process of the weighting parameter w shown in fig. 9 or 10 are the same as those of the image decoding apparatus and the image decoding method according to embodiment 2. Therefore, the same points as those of the image decoding apparatus and the image decoding method of embodiment 2 will not be described repeatedly.
The details of the determination process of the weighting parameter w in fig. 9 or 10 are as described in example 4, and therefore, the repetitive description thereof will be omitted.
As shown in fig. 9 or 10, in the image decoding apparatus and the image decoding method according to embodiment 5 of the present invention described above, in determining the weighting parameter w in expression 1 in the process of synthesizing prediction, the weighting parameter w can be determined more preferably by considering not only the intra prediction mode of the block to be encoded (decoded) but also the combination of the prediction modes of a plurality of adjacent blocks.
According to the image decoding apparatus and the image decoding method of embodiment 5 of the present invention described above, it is possible to decode an image more preferably.
(example 6)
Next, embodiment 6 of the present invention will be described with reference to the drawings.
Embodiment 6 of the present invention is the image encoding device and the image encoding method according to embodiment 1 or embodiment 4, wherein a flag for specifying a prediction mode of a predicted image to be synthesized with an intra-predicted image in a process of synthesizing prediction is newly set, and setting control of the flag and determination control of the prediction mode using the flag are performed.
In the image encoding apparatus and the image encoding method according to embodiment 1 or embodiment 4, the merge mode is taken as an example of the prediction mode of the predicted image to be synthesized with the intra-predicted image in the process of synthesizing prediction. In addition, as an example of a prediction mode of a prediction image synthesized with an intra prediction image in the process of synthesizing prediction, inter prediction using a prediction mode other than the merge mode may be used.
In contrast, embodiment 6 of the present invention proposes a more specific process of the prediction mode determination method.
Specifically, the inter prediction specifying flag in the synthesized prediction mode shown in fig. 11 is set, and the flag is set on the encoding side. On the decoding side, the decision control of the prediction mode of the predicted image to be synthesized with the intra-predicted image in the process of synthesizing prediction is performed using the flag.
The image encoding apparatus and the image encoding method according to embodiment 6 are the same as those according to embodiment 1 or embodiment 4 except for the processing related to the inter prediction designation flag in the synthesized prediction mode shown in fig. 11. Therefore, the same points as those of the image encoding device and the image encoding method of embodiment 1 or embodiment 4 will not be described repeatedly.
Fig. 11 shows an example of control of setting an inter prediction specifying flag of a synthesis prediction mode and control of determining a prediction mode of a prediction image to be synthesized with an intra prediction image in the process of synthesis prediction in the image encoding apparatus and the image encoding method according to embodiment 6 of the present invention.
In the example of fig. 11, 1-bit information is set as the inter prediction specifying flag of the synthesis prediction mode. On the encoding side, three kinds of processing are possible: (1) not transmitting the 1 bit of the inter prediction specifying flag of the synthesis prediction mode, (2) setting the 1 bit to 0 and transmitting it, and (3) setting the 1 bit to 1 and transmitting it.
On the other hand, on the decoding side, (1) when the encoding side does not transmit the 1 bit of the inter prediction specifying flag of the synthesis prediction mode, the prediction mode of the predicted image synthesized with the intra predicted image in the process of synthesis prediction is determined to be the merge mode. Thus, the merge mode can be specified without adding the 1-bit information. The merge mode itself is a prediction mode with a very high approximation to the prediction method of the adjacent block, so it is likely to be the preferable option in many cases.
In contrast, (2) when the encoding side sets the 1 bit of the inter prediction specifying flag of the synthesis prediction mode to 0 and transmits it, the decoding side selects, as the prediction mode of the predicted image synthesized with the intra predicted image in the process of synthesis prediction, the same inter prediction mode as that of the immediately-left adjacent block if the immediately-left adjacent block of the decoding target block is available and inter-predicted. If the immediately-left adjacent block of the decoding target block does not satisfy this condition, the merge mode is selected as the prediction mode of the predicted image synthesized with the intra predicted image in the process of synthesis prediction.
In this way, the prediction mode of the immediately-left adjacent block of the decoding target block can be selected, as a prediction mode other than the merge mode, for the predicted image to be synthesized with the intra predicted image in the process of synthesis prediction. This is preferable when the prediction mode of the immediately-left adjacent block of the decoding target block achieves higher coding efficiency than the merge mode.
In addition, (3) when the encoding side sets the 1 bit of the inter prediction specifying flag of the synthesis prediction mode to 1 and transmits it, the decoding side selects, as the prediction mode of the predicted image synthesized with the intra predicted image in the process of synthesis prediction, the same inter prediction mode as that of the immediately-above adjacent block if the adjacent block immediately above the decoding target block is available and inter-predicted.
If the adjacent block immediately above the decoding target block does not satisfy this condition, the merge mode is selected as the prediction mode of the predicted image synthesized with the intra predicted image in the process of synthesis prediction. In this way, the prediction mode of the adjacent block immediately above the decoding target block can be selected, as a prediction mode other than the merge mode, for the predicted image to be synthesized with the intra predicted image in the process of synthesis prediction. This is preferable when the prediction mode of the immediately-above adjacent block of the decoding target block achieves higher coding efficiency than the merge mode.
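The decoder-side decision of fig. 11 can be sketched minimally as follows, assuming the neighbor information is available as a simple record with "available" and "inter_mode" fields; both the record shape and the mode labels are illustrative assumptions, not part of the patent disclosure:

```python
def decide_synthesis_inter_mode(flag, left, above):
    """Choose the inter prediction mode blended with the intra predicted image.

    flag: None when the 1-bit flag was not transmitted, else 0 or 1.
    left/above: dicts for the immediately-left / immediately-above
    neighbors, with 'available' (bool) and 'inter_mode' (str or None).
    """
    # (1) Flag not transmitted: the merge mode is determined without
    # spending any bits.
    if flag is None:
        return "merge"
    # (2) flag == 0 -> try the immediately-left neighbor;
    # (3) flag == 1 -> try the immediately-above neighbor.
    neighbor = left if flag == 0 else above
    if neighbor and neighbor.get("available") and neighbor.get("inter_mode"):
        return neighbor["inter_mode"]  # reuse the neighbor's inter mode
    return "merge"  # fallback when the neighbor is unavailable or not inter
```

Note that the merge mode is reachable on every path, so the flag never needs to be sent when the merge mode is the intended choice.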
With the above-described control of setting the inter prediction specifying flag of the synthesis prediction mode in fig. 11 and the control of determining the prediction mode of the predicted image to be synthesized with the intra predicted image in the process of synthesis prediction, the merge mode and the other prediction modes can be more preferably set and determined as the prediction mode of that predicted image. In particular, it is preferable that a prediction mode other than the merge mode can be added as a prediction mode of the predicted image to be synthesized with the intra predicted image by setting only the 1-bit flag. In addition, the merge mode, which has a very high approximation to the prediction method of the adjacent block, can be set and determined without transmitting a special (or dedicated) flag specifying inter prediction in the synthesis prediction mode, which preferably suppresses an increase in transmitted information.
According to the image encoding device and the image encoding method of embodiment 6 of the present invention described above, an image can be encoded more preferably.
(example 7)
Next, example 7 of the present invention will be described.
Embodiment 7 of the present invention is the image decoding apparatus and the image decoding method according to embodiment 2 or embodiment 5, in which a flag for specifying a prediction mode of a predicted image to be synthesized with an intra-predicted image in a process of synthesizing prediction is newly set, and setting control of the flag and determination control of the prediction mode using the flag are performed.
Specifically, the prediction mode of the predicted image to be synthesized with the intra-predicted image in the process of synthesizing prediction is determined by the method shown in fig. 11. The image decoding apparatus and the image decoding method according to embodiment 2 or embodiment 5 are the same in configuration, operation, and processing except for the determination processing of the prediction mode using the flag for specifying the prediction mode of the predicted image to be synthesized with the intra-predicted image in the synthesis prediction processing shown in fig. 11. Therefore, the same points as those of the image decoding apparatus and the image decoding method of embodiment 2 or embodiment 5 will not be described repeatedly.
The details of the prediction mode determination processing using the flag for specifying the prediction mode of the predicted image to be synthesized with the intra-predicted image in the synthesis prediction processing shown in fig. 11 are as described in embodiment 6, and therefore, the repetitive description thereof will be omitted.
In the image decoding apparatus and the image decoding method according to embodiment 7 of the present invention described above, the merge mode and the other prediction modes can be more preferably determined as the prediction mode of the predicted image to be synthesized with the intra predicted image in the process of synthesis prediction. In particular, it is preferable that a prediction mode other than the merge mode can be added as a prediction mode of the predicted image to be synthesized with the intra predicted image by setting only the 1-bit flag. In addition, the merge mode, which has a very high approximation to the prediction method of the adjacent block, can be determined without the encoding side transmitting a special (or dedicated) flag specifying inter prediction in the synthesis prediction mode, which preferably suppresses an increase in transmitted information.
According to the image decoding apparatus and the image decoding method of embodiment 7 of the present invention described above, it is possible to decode an image more preferably.
(example 8)
Next, embodiment 8 of the present invention will be described with reference to the drawings.
Embodiment 8 of the present invention additionally supports, in the image encoding apparatus and the image encoding method of embodiment 1, a predicted image obtained by matrix weighted intra prediction as the intra predicted image used in the synthesis prediction process. The matrix weighted intra prediction in the present embodiment is an example of the prediction processing described as "matrix prediction" in embodiment 1.
Specifically, the matrix weighted intra prediction described with reference to fig. 12 is used as the intra prediction image of the process of the synthesis prediction. Further, the process of determining the weighting parameter w is changed from the process of determining the weighting parameter w shown in fig. 8 to the process of determining the weighting parameter w shown in fig. 13. The image encoding apparatus and the image encoding method according to embodiment 8 are the same as those according to embodiment 1 except for the matrix weighted intra prediction described with reference to fig. 12 and the determination process of the weighting parameter w shown in fig. 13. Therefore, the same points as those of the image encoding device and the image encoding method of embodiment 1 will not be described repeatedly.
First, matrix weighted intra prediction will be described with reference to fig. 12. Since the same matrix weighted intra prediction is performed on the decoding side as well, the prediction target block is referred to below as the "encoding (decoding) target block". Matrix weighted intra prediction predicts pixels by a vector operation using boundary pixels of blocks adjacent to the encoding (decoding) target block and a matrix specified by the mode. The matrix weighted intra prediction process generates a predicted image through, after a boundary pixel preparation process, (1) a downsampling process, (2) a vector operation (weighting matrix operation) process, and (3) an upsampling process.
First, as the boundary pixel preparation process, for the encoding (decoding) target block, one line of boundary pixels is acquired from the immediately-left adjacent block as the immediately-left boundary pixel column. Further, one line of boundary pixels is acquired from the immediately-above adjacent block as the immediately-above boundary pixel row. Here, when a boundary pixel is unavailable, it is generated in the same manner as in existing intra prediction. If none of the immediately-left and immediately-above boundary pixels has a decoded value, they are all set to the median value 1 << (BitDepth - 1). Otherwise, scanning proceeds from the lower-left pixel upward and then to the right, and the first pixel actually present is copied to the lower-left pixel. Scanning again from the lower left upward, any missing pixel is filled by copying the pixel below it; likewise, scanning from the upper left to the right, any missing pixel is filled by copying the pixel to its left.
Next, as the (1) downsampling process, the immediately-left boundary pixel column and the immediately-above boundary pixel row are each averaged every 2 pixels to obtain the reduced immediately-left boundary pixel column and the reduced immediately-above boundary pixel row, respectively.
Next, as the (2) vector operation (weighting matrix operation), the reduced immediately-left boundary pixel column and the reduced immediately-above boundary pixel row are combined, by a method determined according to the size of the encoding (decoding) target block and a predetermined mode, to generate a reduced boundary vector. Then, a vector operation using the matrix specified by the matrix weighted intra prediction mode is applied to the reduced boundary vector to generate a reduced prediction vector. Prediction pixels at the sub-sampling positions of the encoding (decoding) target block are generated from the reduced prediction vector.
Next, as the (3) upsampling process, linear interpolation is performed from the boundary pixel lines, the reduced boundary pixel lines, and the prediction pixels at the sub-sampling positions, thereby generating all the prediction pixels of the encoding (decoding) target block.
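The three steps above can be sketched as follows for an 8x8 block. The matrix shape, the fixed pairwise-averaging factor, and the nearest-neighbor upsampling (a stand-in for the linear interpolation described in the text) are simplifying assumptions relative to the full matrix weighted intra prediction design:

```python
import numpy as np

def mip_predict(left_col, top_row, weight_matrix):
    """Simplified sketch of matrix weighted intra prediction for an 8x8 block."""
    # (1) Downsampling: average every 2 boundary pixels of each boundary line.
    red_left = left_col.reshape(-1, 2).mean(axis=1)
    red_top = top_row.reshape(-1, 2).mean(axis=1)
    # Combine the reduced lines into the reduced boundary vector.
    boundary = np.concatenate([red_top, red_left])
    # (2) Vector operation with the mode-specified weighting matrix, yielding
    # the reduced prediction vector (pixels at the sub-sampling positions).
    reduced = weight_matrix @ boundary
    n = int(round(np.sqrt(reduced.size)))
    sub_sampled = reduced.reshape(n, n)
    # (3) Upsampling to the full block size (nearest-neighbor here; the text
    # describes linear interpolation from boundary and sub-sampled pixels).
    return np.kron(sub_sampled, np.ones((2, 2)))
```

For example, an 8-pixel left column and 8-pixel top row reduce to a length-8 boundary vector; a 16x8 weighting matrix then yields a 4x4 sub-sampled block, which is upsampled to 8x8.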
Next, fig. 13 shows an example of the process of determining the weighting parameter w in the synthesis prediction process in the image encoding apparatus and the image encoding method according to embodiment 8 of the present invention. Since the same process of determining the weighting parameter w is performed on the decoding side as well, the prediction target block is referred to below as the "encoding (decoding) target block".
In the determination process of the weighting parameter w in the process of the combined prediction in fig. 13, the determination process of the weighting parameter w in the case where the intra prediction used in the process of the combined prediction is the matrix weighted intra prediction is added to the determination process of the weighting parameter w in fig. 8. Specifically, the rightmost column of the table of fig. 13 is added. Here, in the matrix weighted intra prediction, the prediction image changes depending on the weighting matrix specified by the flag for the block to be encoded (decoded) in the matrix weighted intra prediction mode. The flag is specified at the encoding side and transmitted to the decoding side. This makes it possible for the decoding side to determine the weighting matrix specified by the encoding side. Therefore, the matrix weighted intra prediction is not highly similar to the prediction method of the adjacent block. However, since both the boundary pixel row of the immediately above adjacent block and the boundary pixel row of the immediately left adjacent block are used, the similarity itself between the predicted image of the matrix weighted intra prediction and the predicted image of the adjacent block is high. Further, since matrix weighted intra prediction can improve prediction accuracy according to an image depending on a weighting matrix specified by a flag, it tends to be different from plane prediction in which it is difficult to improve prediction accuracy. Therefore, when using matrix weighted intra prediction as the intra prediction used for the synthesis prediction, the weighting parameter w may be determined such that w is 3 as shown in fig. 13. That is, the value of w may be selected to be larger than the value of w in the plane prediction.
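The fig. 13 extension can be sketched as a lookup that fixes w = 3 for matrix weighted intra prediction and defers the remaining columns, which follow fig. 8 and are not reproduced in this excerpt, to a caller-supplied table; the table argument and the type labels are illustrative assumptions:

```python
def weight_w_fig13(intra_type, base_table=None):
    # Fig. 13 adds a rightmost column to the fig. 8 table: when the intra
    # prediction used in synthesis prediction is matrix weighted intra
    # prediction, w is fixed at 3 (larger than the planar-prediction value).
    if intra_type == "matrix_weighted":
        return 3
    # The other columns follow fig. 8, which is not reproduced in this
    # excerpt; a caller-supplied table stands in for it (an assumption).
    if base_table and intra_type in base_table:
        return base_table[intra_type]
    raise ValueError("w for this intra type is defined in fig. 8 (not shown)")
```

A planar entry smaller than 3 in the supplied table reproduces the relation stated in the claims, that the planar-prediction weight is smaller than the matrix weighted intra prediction weight.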
According to the example of determining w in the synthesized prediction mode of fig. 13 described above, even when matrix weighted intra prediction is used as an option for the type of intra prediction used in the synthesized prediction mode, the weighting of the intra prediction image in the synthesis of the intra prediction image and the inter prediction image can be more preferably determined.
According to the image encoding device and the image encoding method of embodiment 8 described above, a more preferable synthesis prediction mode can be realized, and an image can be encoded more preferably.
(example 9)
Next, example 9 of the present invention will be described.
Embodiment 9 of the present invention additionally supports, in the image decoding apparatus and the image decoding method of embodiment 2, a predicted image obtained by matrix weighted intra prediction as the intra predicted image used in the synthesis prediction process. The matrix weighted intra prediction in the present embodiment is an example of the prediction processing described as "matrix prediction" in embodiment 2. The details of the matrix weighted intra prediction process are as described in embodiment 8, and therefore repetitive description thereof is omitted.
Specifically, the process of determining the weighting parameter w is changed from the process of determining the weighting parameter w shown in fig. 8 to the process of determining the weighting parameter w shown in fig. 13. The image decoding apparatus and the image decoding method according to embodiment 9 use the determination process of the weighting parameter w shown in fig. 13, and the configuration, operation, and process other than the determination process of the weighting parameter w shown in fig. 13 are the same as those of the image decoding apparatus and the image decoding method according to embodiment 2. Therefore, the same points as those of the image decoding apparatus and the image decoding method of embodiment 2 will not be described repeatedly.
The details of the determination process of the weighting parameter w in fig. 13 are as described in example 8, and therefore, the repetitive description thereof will be omitted.
In the image decoding apparatus and the image decoding method according to embodiment 9 of the present invention described above, as shown in fig. 13, even when matrix weighted intra prediction is used as an option for the type of intra prediction used in the combined prediction mode, the weighting of the intra prediction image in the combination of the intra prediction image and the inter prediction image can be determined more preferably.
According to the image decoding device and the image decoding method of embodiment 9 of the present invention described above, a more preferable synthesis prediction mode can be realized, and an image can be decoded more preferably.
In addition, any combination of the embodiments of the drawings, the methods, and the like described above may be an embodiment of the present invention.
According to the embodiments of the present invention described above, the amount of code can be reduced, and image quality degradation can be prevented. That is, a high compression ratio and a better image quality can be achieved.
Description of the reference numerals
101 … image input section, 102 … block division section, 103 … mode management section, 104 … intra prediction section, 105 … inter prediction section, 106 … block processing section, 107 … transformation/quantization section, 108 … inverse quantization/inverse transformation section, 109 … image synthesis/filter section, 110 … decoded image management section, 111 … entropy coding section, 112 … data output section, 201 … stream analysis section, 202 … block management section, 203 … mode determination section, 204 … intra prediction section, 205 … inter prediction section, 206 … coefficient analysis section, 207 … inverse quantization/inverse transformation section, 208 … image synthesis/filter section, 209 … decoded image management section, 210 … image output section, 120 … synthesis prediction section, 220 … synthesis prediction section.

Claims (6)

1. An image encoding method for encoding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be encoded to generate a synthesized predicted image; and
an encoding step of encoding a difference between pixel values of the prediction image generated in the prediction image generation step and the image of the block to be encoded,
the intra-prediction used for the synthesis process can use a variety of intra-predictions, including matrix weighted intra-prediction,
in the combining process, the inter-predicted image and the intra-predicted image are weighted, and weighting parameters of the intra-predicted image in the weighting process are determined based on the type of the intra-prediction.
2. An image encoding method for encoding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be encoded to generate a synthesized predicted image; and
an encoding step of encoding a difference between pixel values of the prediction image generated in the prediction image generation step and the image of the block to be encoded,
the types of intra prediction that can be used in the predicted image generation step include plane prediction and matrix weighted intra prediction,
in the combining process, a weighting process is performed on the prediction image for the inter prediction and the prediction image for the intra prediction,
the processing state of the weighting processing includes: for the weighting of the prediction image for intra prediction, the weighting of plane prediction in the kind of intra prediction is set to a state smaller than the weighting of matrix weighted intra prediction.
3. An image encoding method for encoding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be encoded to generate a synthesized predicted image; and
an encoding step of encoding a difference between pixel values of the prediction image generated in the prediction image generation step and the image of the block to be encoded,
the types of intra prediction that can be used in the predicted image generation step include plane prediction and matrix weighted intra prediction,
in the combining process, the weighting process is performed on the inter-predicted image and the intra-predicted image, and the processing state of the weighting process includes a state in which the type of the intra-prediction is plane prediction and the weighting of the intra-predicted image is set to be minimum.
4. An image decoding method for decoding a coded stream obtained by coding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be decoded to generate a synthesized predicted image; and
a decoded image generation step of generating a decoded image based on the predicted image generated in the predicted image generation step and a difference image of the block to be decoded,
the intra-prediction used for the synthesis process can use a variety of intra-predictions, including matrix weighted intra-prediction,
in the combining process, the inter-predicted image and the intra-predicted image are weighted, and weighting parameters of the intra-predicted image in the weighting process are determined based on the type of the intra-prediction.
5. An image decoding method for decoding a coded stream obtained by coding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be decoded to generate a synthesized predicted image; and
a decoded image generation step of generating a decoded image based on the predicted image generated in the predicted image generation step and a difference image of the block to be decoded,
the types of intra prediction that can be used in the predicted image generation step include plane prediction and matrix weighted intra prediction,
in the combining process, a weighting process is performed on the prediction image for the inter prediction and the prediction image for the intra prediction,
the processing state of the weighting processing includes: for the weighting of the prediction image for intra prediction, the weighting of plane prediction in the kind of intra prediction is set to a state smaller than the weighting of matrix weighted intra prediction.
6. An image decoding method for decoding a coded stream obtained by coding an image, comprising:
a predicted image generation step of performing a synthesis process of synthesizing the predicted image for the inter prediction and the predicted image for the intra prediction with respect to the block to be decoded to generate a synthesized predicted image; and
a decoded image generation step of generating a decoded image based on the predicted image generated in the predicted image generation step and a difference image of the block to be decoded,
the types of intra prediction that can be used in the predicted image generation step include plane prediction and matrix weighted intra prediction,
in the combining process, the weighting process is performed on the inter-predicted image and the intra-predicted image, and the processing state of the weighting process includes a state in which the type of the intra-prediction is plane prediction and the weighting of the intra-predicted image is set to be minimum.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2019-228713 2019-12-19
JP2019228713A JP2021097368A (en) 2019-12-19 2019-12-19 Image coding method and image decoding method
PCT/JP2020/046290 WO2021125085A1 (en) 2019-12-19 2020-12-11 Image coding method and image decoding method

Publications (1)

Publication Number Publication Date
CN114830643A (en)

Family

ID=76431924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080087366.0A Pending CN114830643A (en) 2019-12-19 2020-12-11 Image encoding method and image decoding method

Country Status (3)

Country Link
JP (1) JP2021097368A (en)
CN (1) CN114830643A (en)
WO (1) WO2021125085A1 (en)

Also Published As

Publication number Publication date
JP2021097368A (en) 2021-06-24
WO2021125085A1 (en) 2021-06-24

Similar Documents

Publication Publication Date Title
US11936858B1 (en) Constrained position dependent intra prediction combination (PDPC)
US11107253B2 (en) Image processing method, and image decoding and encoding method using same
KR102628889B1 (en) Intra-mode JVET coding
KR20200057082A (en) Adaptive non-equal weighted plane prediction
US20230336785A1 (en) Coding enhancement in cross-component sample adaptive offset
US20230319315A1 (en) Coding enhancement in cross-component sample adaptive offset
US20230239462A1 (en) Inter prediction method based on variable coefficient deep learning
CN114830646A (en) Image encoding method and image decoding method
CN114830642A (en) Image encoding method and image decoding method
JP2021129148A (en) Prediction device, encoding device, decoding device, and program
CN114830643A (en) Image encoding method and image decoding method
CN114830645A (en) Image encoding method and image decoding method
CN114830641A (en) Image encoding method and image decoding method
CN114830650A (en) Image encoding method and image decoding method
CN114830644A (en) Image encoding method and image decoding method
CN114788270A (en) Image encoding method and image decoding method
CN114830647A (en) Image encoding method and image decoding method
CN114521325A (en) Image encoding method and image decoding method
US20240137546A1 (en) Coding enhancement in cross-component sample adaptive offset
CN114450949A (en) Image encoding method, image encoding device, image decoding method, and image decoding device
GB2509706A (en) Encoding or decoding a scalable video sequence using inferred SAO parameters
CN117413516A (en) Codec enhancement in cross-component sample adaptive offset
JP2021093634A (en) Coding device, decoding device and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination