WO2012035640A1

WO2012035640A1 - Moving picture encoding method and moving picture decoding method

Info

Publication number: WO2012035640A1
Application number: PCT/JP2010/066102
Authority: WO
Inventors: 昭行谷沢; 太一郎塩寺
Original assignee: 株式会社東芝
Priority date: 2010-09-16
Filing date: 2010-09-16
Publication date: 2012-03-22

Abstract

A moving picture encoding method for dividing an input image signal into pixel blocks represented by the depth of hierarchies according to a quadtree segmentation, generating a prediction error signal for the pixel blocks obtained by the division, and encoding transform coefficients, the method including: a step of setting a first prediction direction from a set of a plurality of prediction directions and generating a first predictive picture signal; a step of setting a second prediction direction different from the first prediction direction from the set of prediction directions, and generating a second predictive picture signal; a step of deriving a relative distance between a pixel to be predicted and a reference pixel in each of the first and second prediction directions, and deriving a difference value between the relative distances; a step of deriving a predetermined weight component according to the difference value; a step of weighted averaging a first unidirectional intra predictive picture and a second unidirectional intra predictive picture according to the weight component to generate a third predictive picture signal; a step of generating a prediction error signal from the third predictive picture signal; and a step of encoding the prediction error signal.

Description

Video encoding method and video decoding method

Embodiments of the present invention relate to an intra-screen prediction method, a video encoding method, and a video decoding method in video encoding and decoding.

In recent years, an image coding method with greatly improved coding efficiency has been jointly developed by ITU-T and ISO / IEC. H. 26 and ISO / IEC 14496-10 (hereinafter referred to as “H.264”). H. H.264 achieves higher prediction efficiency than in-screen prediction in ISO / IEC MPEG-1, 2 and 4 (hereinafter referred to as intra prediction) by incorporating direction prediction in the spatial region (pixel region). Yes. H. As an extension of H.264, a method for further improving the coding efficiency by introducing a maximum of 34 types of prediction angles and prediction methods and performing intra prediction has been proposed.

However, in Non-Patent Document 1, since a prediction value is generated at an individual prediction angle for each of a plurality of types of prediction modes and copied in the prediction direction, a texture having a luminance gradient that smoothly changes within a pixel block. Such a video or a video with gradation cannot be predicted efficiently, and the prediction error may increase.

Therefore, an object of the present embodiment is to provide a moving image encoding device and a moving image decoding device including a prediction image generating device capable of improving encoding efficiency.

According to the embodiment, the moving image encoding method divides an input image signal into pixel blocks represented by hierarchical depths according to quadtree division, and generates prediction error signals for these divided pixel blocks. Then, in the moving picture encoding method for encoding the transform coefficient, a step of setting a first prediction direction from a plurality of prediction direction sets to create a first prediction image signal, and the plurality of prediction direction sets A step of creating a second prediction image signal by setting a second prediction direction different from the first prediction direction, and a pixel to be predicted and reference for each of the first and second prediction directions Deriving a relative distance from the pixel, deriving a difference value of the relative distance, deriving a predetermined weight component according to the difference value, and according to the weight component, a first unit Direction A step of weighted averaging the prediction image and the second unidirectional intra prediction image to generate a third prediction image signal, a step of generating a prediction error signal from the third prediction image signal, and encoding the prediction error signal And the step of performing.

According to the embodiment, the moving picture decoding method divides an input image signal into pixel blocks expressed by hierarchical depth according to quadtree division, and performs a decoding process on the divided pixel blocks. In the image decoding method, a step of creating a first predicted image signal by setting a decoded first prediction direction from among a plurality of prediction direction sets; and Creating a second predicted image signal by setting a decoded second prediction direction different from the one prediction direction; and for each of the first and second prediction directions, a prediction target pixel and a reference pixel, Deriving a relative distance between the two, deriving a difference value of the relative distance, deriving a predetermined weight component according to the difference value, and according to the weight component, a first unidirectional intra The measurement image and the second unidirectional intra-prediction image and the weighted average comprises the steps of generating a third predictive image signal, and generating a decoded image signal from said third prediction image signal.

1 is a block diagram illustrating a moving image encoding apparatus according to a first embodiment. Explanatory drawing of the prediction encoding order of a pixel block. Explanatory drawing of an example of pixel block size. Explanatory drawing of another example of pixel block size. Explanatory drawing of another example of pixel block size. Explanatory drawing of an example of the pixel block in a coding tree unit. Explanatory drawing of another example of the pixel block in a coding tree unit. Explanatory drawing of another example of the pixel block in a coding tree unit. Explanatory drawing of another example of the pixel block in a coding tree unit. Explanatory drawing of the prediction mode candidate number at the time of the high speed mode determination which concerns on 1st Embodiment. (A) is explanatory drawing of intra prediction mode, (b) is explanatory drawing of the reference pixel and prediction pixel of intra prediction mode, (c) is explanatory drawing of the horizontal prediction mode of intra prediction mode, ( d) Explanatory drawing of the orthogonal lower right prediction mode of intra prediction mode. The block diagram which illustrates the intra prediction part concerning a 1st embodiment. Explanatory drawing of the number of unidirectional intra predictions and the number of bidirectional intra predictions which concern on 1st Embodiment. Explanatory drawing which illustrates the prediction direction which concerns on 1st Embodiment. The table figure which illustrates the relationship of prediction mode, prediction type, bidirectional | two-way intra prediction, and unidirectional intra prediction based on 1st Embodiment. The table figure which illustrates the relationship of prediction mode, prediction type, bidirectional | two-way intra prediction, and unidirectional intra prediction based on 1st Embodiment. The table figure which illustrates the relationship of prediction mode, prediction type, bidirectional | two-way intra prediction, and unidirectional intra prediction based on 1st Embodiment. The table figure which illustrates a response | compatibility with the parameter | index of the prediction angle based on 1st Embodiment, and the prediction angle in the case of prediction image generation. The block diagram of the bidirectional | two-way intra estimated image generation part based on 1st Embodiment. The table figure which illustrates a response | compatibility with bidirectional | two-way intra prediction and two unidirectional intra prediction based on 1st Embodiment. The block diagram which shows an example of the calculation method of a city area distance based on 1st Embodiment. The block diagram which shows another example of the calculation method of the city area distance based on 1st Embodiment. The block diagram which shows another example of the calculation method of the city area distance based on 1st Embodiment. The table which illustrates the relationship between the prediction mode and the distance of a prediction pixel position based on 1st Embodiment. The table which illustrates the mapping of prediction mode and a distance table based on 1st Embodiment. The table which illustrates the relationship between relative distance and a weight component based on 1st Embodiment. 6 is another table illustrating the relationship between the relative distance and the weight component according to the first embodiment. The block diagram which shows another embodiment of the intra estimation part based on 1st Embodiment. Explanatory drawing of a syntax structure. Explanatory drawing of a slice header syntax. Explanatory drawing which shows an example of a prediction unit syntax. Explanatory drawing which shows another example of a prediction unit syntax. Explanatory drawing which shows another example of a prediction unit syntax. Explanatory drawing which shows another example of a prediction unit syntax. Explanatory drawing which shows another example of a prediction unit syntax. The flowchart which shows an example of the calculation method of intraPredModeN. The table which shows the relationship at the time of predicting prediction mode. The table which shows another example of the relationship at the time of predicting prediction mode. The block diagram which shows the 1st modification of the intra estimation part based on 1st Embodiment. Explanatory drawing which shows an example of the prediction unit syntax in the 1st modification based on 1st Embodiment. Explanatory drawing which shows another example of the prediction unit syntax in the 1st modification based on 1st Embodiment. Explanatory drawing which shows an example of the predicted value generation method of a pixel level. The block diagram which shows another example of the intra estimation part based on 1st Embodiment. The block diagram which shows another example of the intra estimation part based on 1st Embodiment. The block diagram which shows an example of the composite intra estimated image generation part based on 1st Embodiment. Explanatory drawing which shows an example according to Prediction unit syntax based on 1st Embodiment. Explanatory drawing which shows another example of the prediction unit syntax based on 1st Embodiment. The block diagram which illustrates the moving picture coding device concerning a 2nd embodiment. The block diagram which illustrates the orthogonal transformation part based on 2nd Embodiment. The block diagram which illustrates the inverse orthogonal transformation part based on 2nd Embodiment. The table which shows the relationship between prediction mode and a conversion index based on 2nd Embodiment. The block diagram which illustrates the coefficient order control part concerning a 2nd embodiment. The block diagram which illustrates another coefficient order control part concerning a 2nd embodiment. Explanatory drawing which shows an example of the transform unit syntax based on 2nd Embodiment. The block diagram which shows another example of the orthogonal transformation part based on 3rd Embodiment. The block diagram which shows an example of the inverse orthogonal transformation part based on 3rd Embodiment. Explanatory drawing which shows an example of the transform unit syntax based on 3rd Embodiment. Explanatory drawing which shows an example of the unidirectional intra prediction mode, prediction type, and prediction angle parameter | index based on 1st Embodiment. The block diagram which shows an example of the moving image decoding apparatus based on 4th Embodiment. The block diagram which shows an example of the moving image decoding apparatus based on 5th Embodiment. The block diagram which illustrates the coefficient order restoration part concerning a 5th embodiment. The block diagram which shows another example of the coefficient order decompression | restoration part based on 5th Embodiment.

Hereinafter, with reference to the drawings, a video encoding device and a video decoding device according to each embodiment will be described in detail. In the following description, the term “image” can be appropriately read as terms such as “video”, “pixel”, “image signal”, and “image data”. Further, in the following embodiments, the same numbered portions are assumed to perform the same operation, and repeated description is omitted.
(First embodiment)
The first embodiment relates to an image encoding device. A moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a fourth embodiment. This image encoding device can be realized by hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array). The image encoding apparatus can also be realized by causing a computer to execute an image encoding program.

As shown in FIG. 1, the image coding apparatus according to the present embodiment includes a subtraction unit 101, an orthogonal transformation unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transformation unit 105, an addition unit 106, and a loop filter. 107, a reference image memory 108, an intra prediction unit 109, an inter prediction unit 110, a prediction selection switch 111, a prediction selection unit 112, an entropy encoding unit 113, an output buffer 114, and an encoding control unit 115.

The image coding apparatus in FIG. 1 divides each frame or each field constituting the input image signal 116 into a plurality of pixel blocks, performs predictive coding on the divided pixel blocks, and generates coded data 127. Output. In the following description, for the sake of simplicity, it is assumed that pixel blocks are predictively encoded from the upper left to the lower right as shown in FIG. 2A. In FIG. 2A, the encoded pixel block p is located on the left side and the upper side of the encoding target pixel block c in the encoding processing target frame f.

Here, the pixel block refers to a unit for processing an image such as an M × N size block (N and M are natural numbers), a coding tree unit, a macro block, a sub block, and one pixel. In the following description, the pixel block is basically used in the meaning of the coding tree unit. However, the pixel block can be interpreted in the above-described meaning by appropriately replacing the description. The coding tree unit is typically a 16 × 16 pixel block shown in FIG. 2B, for example, but may be a 32 × 32 pixel block shown in FIG. 2C or a 64 × 64 pixel block shown in FIG. 3D. It may be an 8 × 8 pixel block (not shown) or a 4 × 4 pixel block. Also, the coding tree unit need not necessarily be square. Hereinafter, the encoding target block or coding tree unit of the input image signal 116 may be referred to as a “prediction target block”. The coding unit is not limited to a pixel block such as a coding tree unit, and a frame, a field, a slice, or a combination thereof can be used.

3A to 3D are diagrams showing specific examples of coding tree units. FIG. 3A shows an example where the size of the coding tree unit is 64 × 64 (N = 32). Here, N represents the size of the reference coding tree unit. The size when divided is defined as N, and the case where it is not divided is defined as 2N. The coding tree unit has a quadtree structure, and when divided, the four pixel blocks are indexed in the Z-scan order. FIG. 3B shows an example in which the 64 × 64 pixel block in FIG. 3A is divided into quadtrees. The numbers shown in the figure represent the Z scan order. Further, it is possible to further perform quadtree division within the index of one quadtree of the coding tree unit. The depth of division is defined by Depth. That is, FIG. 3A shows an example in which Depth = 0. FIG. 3C shows an example of a 32 × 32 (N = 16) size coding tree unit in the case of Depth = 1. A unit having the largest coding tree unit is called a large coding tree unit, and an input image signal is encoded in the order of raster scanning in this unit.

The image encoding apparatus in FIG. 1 performs intra prediction (also referred to as intra-frame prediction, intra-frame prediction, etc.) or inter prediction (inter-screen prediction) for a pixel block based on the encoding parameter input from the encoding control unit 115. Prediction, inter-frame prediction, motion compensation prediction, etc.) is performed to generate the predicted image signal 126. This image coding apparatus orthogonally transforms and quantizes the prediction error signal 117 between the pixel block (input image signal 116) and the predicted image signal 126, performs entropy coding, generates coded data 127, and outputs it. To do.

The image encoding device in FIG. 1 performs encoding by selectively applying a plurality of prediction modes having different block sizes and generation methods of the predicted image signal 126. The generation method of the predicted image signal 126 can be broadly divided into two types: intra prediction in which prediction is performed within the encoding target frame and inter prediction in which prediction is performed using one or a plurality of reference frames that are temporally different. is there.

Hereinafter, each element included in the image encoding device in FIG. 1 will be described.
The subtraction unit 101 subtracts the corresponding prediction image signal 126 from the encoding target block of the input image signal 116 to obtain a prediction error signal 117. The subtraction unit 101 inputs the prediction error signal 117 to the orthogonal transformation unit 102.

The orthogonal transform unit 102 performs an orthogonal transform such as discrete cosine transform (DCT) on the prediction error signal 117 from the subtraction unit 101 to obtain a transform coefficient 118. The orthogonal transform unit 102 inputs the transform coefficient 118 to the quantization unit 103.

The quantization unit 103 performs quantization on the transform coefficient from the orthogonal transform unit 102 to obtain a quantized transform coefficient 119. Specifically, the quantization unit 103 performs quantization according to quantization information such as a quantization parameter and a quantization matrix specified by the encoding control unit 115. The quantization parameter indicates the fineness of quantization. The quantization matrix is used for weighting the fineness of quantization for each component of the transform coefficient. The quantization unit 103 inputs the quantized transform coefficient 119 to the entropy encoding unit 113 and the inverse quantization unit 104.

The entropy encoding unit 113 performs various encoding parameters such as the quantized transform coefficient 119 from the quantization unit 103, the prediction information 125 from the prediction selection unit 112, and the quantization information specified by the encoding control unit 115. Entropy encoding (for example, Huffman encoding, arithmetic encoding, etc.) is performed to generate encoded data. The encoding parameter is a parameter necessary for decoding, such as prediction information 125, information on transform coefficients, information on quantization, and the like. For example, the encoding control unit 115 has an internal memory (not shown), the encoding parameter is held in this memory, and the encoding parameter of an already encoded pixel block adjacent when encoding the prediction target block is stored. It is good also as a structure to use. For example, H.M. In the H.264 intra prediction, the prediction value of the prediction mode of the prediction target block can be derived from the prediction mode information of the encoded adjacent block.

The encoded data generated by the entropy encoding unit 113 is temporarily accumulated, for example, in the output buffer 114 through multiplexing, and is output as encoded data 127 according to an appropriate output timing managed by the encoding control unit 115. . The encoded data 127 is output to, for example, a storage system (storage medium) or a transmission system (communication line) not shown.

The inverse quantization unit 104 performs inverse quantization on the quantized transform coefficient 119 from the quantization unit 103 to obtain a restored transform coefficient 120. Specifically, the inverse quantization unit 104 performs inverse quantization according to the quantization information used in the quantization unit 103. The quantization information used in the quantization unit 103 is loaded from the internal memory of the encoding control unit 115. The inverse quantization unit 104 inputs the restored transform coefficient 120 to the inverse orthogonal transform unit 105.

The inverse orthogonal transform unit 105 performs an inverse orthogonal transform corresponding to the orthogonal transform performed in the orthogonal transform unit 102 such as an inverse discrete cosine transform on the restored transform coefficient 120 from the inverse quantization unit 104, A restoration prediction error signal 121 is obtained. The inverse orthogonal transform unit 105 inputs the restored prediction error signal 121 to the addition unit 106.

The addition unit 106 adds the restored prediction error signal 121 and the corresponding prediction image signal 126 to generate a local decoded image signal 122. The decoded image signal 122 is input to the loop filter 107. The loop filter 107 performs a deblocking filter, a Wiener filter, or the like on the input decoded image signal 122 to generate a filtered image signal 123. The generated filtered image signal 123 is input to the reference image memory 108.

The reference image memory 108 stores the filtered image signal 123 after local decoding in the memory, and when the predicted image is generated as necessary by the intra prediction unit 109 and the inter prediction unit 110, the reference image signal 124 is used. Referenced each time.

The intra prediction unit 109 performs intra prediction using the reference image signal 124 stored in the reference image memory 108. For example, H.M. In H.264, an intra prediction image is obtained by performing pixel interpolation (copying or copying after interpolation) along a prediction direction such as a vertical direction or a horizontal direction using an encoded reference pixel value adjacent to a prediction target block. Generate. FIG. The prediction direction of intra prediction in H.264 is shown. Further, in FIG. 2 shows an arrangement relationship between reference pixels and encoding target pixels in H.264. FIG. 5C illustrates a predicted image generation method in mode 1 (horizontal prediction), and FIG. 5D illustrates a predicted image generation method in mode 4 (diagonal lower right prediction).

In non-patent literature, H. The prediction direction of H.264 is further expanded to 34 directions to increase the number of prediction modes. A predicted pixel value is created by performing linear interpolation with 32-pixel accuracy in accordance with the predicted angle, and is copied in the predicted direction. Details of the intra prediction unit 109 used in the present embodiment of the present invention will be described later.

The inter prediction unit 110 performs inter prediction using the reference image signal 124 stored in the reference image memory 108. Specifically, the inter prediction unit 110 performs block matching processing between the prediction target block and the reference image signal 124 to derive a motion shift amount (motion vector). The inter prediction unit 110 performs an interpolation process (motion compensation) based on the motion vector to generate an inter prediction image. H. With H.264, interpolation processing up to 1/4 pixel accuracy is possible. The derived motion vector is entropy encoded as part of the prediction information 125.

The prediction selection switch 111 selects the output terminal of the intra prediction unit 109 or the output terminal of the inter prediction unit 110 according to the prediction information 125 from the prediction selection unit 112, and subtracts the intra prediction image or the inter prediction image as the prediction image signal 126. 101 and the adder 106. When the prediction information 125 suggests intra prediction, the prediction selection switch 111 connects a switch to the output terminal from the intra prediction unit 109. On the other hand, when the prediction information 125 suggests inter prediction, the prediction selection switch 111 connects a switch to the output terminal from the inter prediction unit 110.

The prediction selection unit 112 has a function of setting the prediction information 125 according to the prediction mode controlled by the encoding control unit 115. As described above, intra prediction or inter prediction can be selected for generating the predicted image signal 126, but a plurality of modes can be further selected for each of intra prediction and inter prediction. The encoding control unit 115 determines one of a plurality of prediction modes of intra prediction and inter prediction as the optimal prediction mode, and the prediction selection unit 112 sets the prediction information 125 according to the determined optimal prediction mode. .

For example, for intra prediction, prediction mode information is designated by the intra prediction unit 109 from the encoding control unit 115, and the intra prediction unit 109 generates a predicted image signal 126 according to the prediction mode information. The encoding control unit 115 may specify a plurality of prediction mode information in order from the smallest prediction mode number, or may specify a plurality of prediction mode information in order from the largest. The encoding control unit 115 may limit the prediction mode according to the characteristics of the input image. The encoding control unit 115 does not necessarily specify all prediction modes, and may specify at least one prediction mode information for the encoding target block.

For example, the encoding control unit 115 determines an optimal prediction mode using a cost function represented by the following mathematical formula (1).

In Equation (1) (hereinafter referred to as simple encoding cost), OH indicates a code amount relating to prediction information 125 (for example, motion vector information and prediction block size information), and SAD is a prediction target block, a prediction image signal 126, and The difference absolute value sum (ie, the cumulative sum of the absolute values of the prediction error signal 117) is shown. Further, λ represents a Lagrange undetermined multiplier determined based on the value of quantization information (quantization parameter), and K represents an encoding cost. When Expression (1) is used, the prediction mode that minimizes the coding cost K is determined as the optimum prediction mode from the viewpoint of the generated code amount and the prediction error. As a modification of Equation (1), the encoding cost may be estimated from OH alone or SAD alone, or the encoding cost may be estimated using a value obtained by subjecting SAD to Hadamard transform or an approximation thereof.

It is also possible to determine an optimal prediction mode by using a temporary encoding unit (not shown). For example, the encoding control unit 115 determines an optimal prediction mode using a cost function expressed by the following formula (2).

In Equation (2), D represents a square error sum (ie, encoding distortion) between the prediction target block and the local decoded image, and R represents a prediction between the prediction target block and the prediction image signal 126 in the prediction mode. An error amount indicates a code amount estimated by provisional encoding, and J indicates an encoding cost. In order to derive the encoding cost J (hereinafter referred to as the detailed encoding cost) of Equation (2), provisional encoding processing and local decoding processing are required for each prediction mode, so that the circuit scale or the amount of calculation increases. . On the other hand, since the encoding cost J is derived based on more accurate encoding distortion and code amount, it is easy to determine the optimal prediction mode with high accuracy and maintain high encoding efficiency. As a modification of Equation (2), the encoding cost may be estimated from only R or D, or the encoding cost may be estimated using an approximate value of R or D. These costs may be used hierarchically. The encoding control unit 115 performs determination using Expression (1) or Expression (2) based on information obtained in advance regarding the prediction target block (prediction mode of surrounding pixel blocks, image analysis result, and the like). The number of prediction mode candidates may be narrowed down in advance.

As a modification of the present embodiment, the number of prediction mode candidates can be further reduced while maintaining coding performance by performing two-stage mode determination combining Formula (1) and Formula (2). It becomes possible. Here, unlike the formula (2), the simple encoding cost represented by the formula (1) does not require a local decoding process, and can be calculated at high speed. In the moving picture coding apparatus according to the present embodiment, H.264 is used. Since the number of prediction modes is large even when compared with H.264, mode determination using the detailed coding cost is not realistic. Therefore, as a first step, mode determination using the simple coding cost is performed on the prediction modes available in the pixel block, and prediction mode candidates are derived.

Here, the number of prediction mode candidates is changed using the property that the correlation between the simple coding cost and the detailed coding cost increases as the value of the quantization parameter that determines the roughness of quantization increases.

FIG. 4 shows the number of prediction mode candidates selected in the first step. PuSize is an index indicating the size of a pixel block (sometimes referred to as a prediction unit) that performs prediction described later. QP represents a quantization parameter, and the number of prediction mode candidates changes depending on the quotient divided by 5. Since the detailed coding cost only has to be derived for the number of prediction mode candidates narrowed down in this way, the number of local decoding processes can be greatly reduced.

The encoding control unit 115 controls each element of the image encoding device in FIG. Specifically, the encoding control unit 115 performs various controls for the encoding process including the above-described operation.

Hereinafter, the details of the intra prediction unit 109 according to the present embodiment will be described with reference to FIG.
<Intra Prediction Unit 109>
The intra prediction unit 109 illustrated in FIG. 6 includes a unidirectional intra predicted image generation unit 601, a bidirectional intra predicted image generation unit 602, a prediction mode information setting unit 603, and a selection switch 604. First, the reference image signal 124 is input from the reference image memory 108 to the unidirectional intra predicted image generation unit 601 and the bidirectional intra predicted image generation unit 602. Here, according to the prediction mode information controlled by the encoding control unit 115, the prediction mode information setting unit 603 determines the prediction mode generated by the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit 602. Set and output prediction mode 605. The selection switch 604 has a function of switching the output ends of the respective intra predicted image generation units according to the prediction mode 605. If the input prediction mode 605 is the unidirectional intra prediction mode, the output terminal of the unidirectional intra prediction image generation unit 601 is connected to the switch, and if the prediction mode 605 is the bidirectional intra prediction mode, the bidirectional intra prediction is performed. The output terminal of the image generation unit 602 is connected. On the other hand, each of the intra predicted

image generation units

601 and 602 generates the predicted image signal 126 according to the prediction mode 605. The generated prediction image signal 126 (also referred to as a fifth prediction image signal) is output from the intra prediction unit 109. The output signal of the unidirectional intra predicted image generation unit 601 is also called a fourth predicted image signal, and the output signal of the bidirectional intra predicted image generation unit 602 is also called a third predicted image signal.

First, the prediction mode information setting unit 603 will be described in detail. FIG. 7 shows the number of prediction modes according to the block size according to the present embodiment of the present invention. PuSize indicates the pixel block (prediction unit) size to be predicted, and seven types of sizes from PU_2x2 to PU_128x128 are defined. IntraUniModeNum represents the number of prediction modes for unidirectional intra prediction, and IntraBiModeNum represents the number of prediction modes for bidirectional intra prediction. Also, Number of modes is the total number of prediction modes for each pixel block (prediction unit) size.

On the other hand, FIG. 9 shows the relationship between the prediction mode and the prediction method when PuSize is PU_8x8, PU_16x16, and PU_32x32. FIG. 10 shows a case where PuSize is PU_4 × 4, and FIG. 11 shows a case where PU_64 × 64 or PU_128 × 128. Here, IntraPredMode indicates a prediction mode number, and IntraBipredFlag is a flag indicating whether or not bidirectional intra prediction. When the flag is 1, it indicates that the prediction mode is the bidirectional intra prediction mode. When the flag is 0, it indicates that the prediction mode is a unidirectional intra prediction mode. IntraPredTypeLX indicates the prediction type of intra prediction. Intra_Vertical means that the vertical direction is the reference for prediction, and Intra_Horizontal means that the horizontal direction is the reference for prediction. Note that 0 or 1 is applied to X in IntraPredTypeLX. IntraPredTypeL0 indicates the first prediction mode of unidirectional intra prediction or bidirectional intra prediction. IntraPredTypeL1 indicates the second prediction mode of bidirectional intra prediction. IntraPred AngleID is an index indicating an index of a prediction angle. The prediction angle actually used in the generation of the predicted value is shown in FIG. Here, puPartIdx represents an index of the prediction unit that is divided in the quadtree division described with reference to FIG. 3B.

For example, when IntraPredMode is 4, since IntraPredTypeL0 is Intra_Vertical, it can be seen that the vertical direction is used as a reference for prediction. As can be seen from the figure, a total of 33 from IntraPredMode = 0 to 32 indicate the unidirectional intra prediction mode, and a total of 16 from IntraPredMode = 32 to 48 indicate the bidirectional intra prediction mode.

The prediction mode information setting unit 603 converts the above-described prediction information corresponding to the designated prediction mode 605 to the unidirectional intra prediction image generation unit 601 and the bidirectional intra prediction image generation unit 602 under the control of the encoding control unit 115. And the prediction mode 605 is output to the selection switch 604.

Next, the unidirectional intra predicted image generation unit 601 will be described in detail. The unidirectional intra predicted image generation unit 601 has a function of generating the predicted image signal 126 for a plurality of prediction directions shown in FIG. In FIG. 8, there are 33 different prediction directions for the vertical and horizontal coordinates indicated by the bold lines. H. The direction of a typical prediction angle indicated by H.264 is indicated by an arrow. In the present embodiment of the present invention, 33 kinds of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a diamond. H. Similar to H.264, DC prediction for predicting with an average value of available reference pixels is added, and there are 34 prediction modes in total.

When IntraPredMode = 4, since IntraPredAngleIDL0 is −4, the prediction image signal 126 is generated in the prediction direction indicated by IntraPredMode = 4 in FIG. An arrow indicated by a dotted line in FIG. 8 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.

Next, a prediction image generation method of the unidirectional intra prediction image generation unit 601 will be described. Here, a predicted image value is generated based on the input reference image signal 124, and pixels are copied in the above-described prediction direction. The predicted image value is generated by performing interpolation with 1/32 pixel accuracy. FIG. 12 shows the relationship between IntraPredAngleIDLX and intraPredAngle used for predictive image value generation. intraPredAngle indicates a prediction angle that is actually used when a predicted value is generated. For example, when the prediction type is Intra_Vertical and intraPredAngle shown in FIG. 12 is a positive value, a prediction value generation method is expressed by Expression (3). Here, BLK_SIZE indicates the size of the pixel block (prediction unit), and ref [] indicates an array in which reference image signals are stored. Pred (k, m) indicates the generated predicted image signal 126.

Even for conditions other than the above, a predicted value can be generated by a similar method according to the table of FIG. For example, the prediction value of the prediction mode indicated by IntraPredMode = 1 is H.264 shown in FIG. This is the same as H.264 horizontal prediction. The above is description of the unidirectional intra estimated image generation part 601 in this Embodiment of this invention.

Next, the bidirectional intra predicted image generation unit 602 will be described in detail. FIG. 13 shows a block diagram of the bidirectional intra-predicted image generation unit 602. The bidirectional intra predicted image generation unit 602 includes a first unidirectional intra predicted image generation unit 1301, a second unidirectional intra predicted image generation unit 1302, and a weighted average unit 1303, and is based on the input reference image signal 124. Two unidirectional intra-predicted images are generated, and a function of generating a predicted image signal 126 by weighting and averaging them is provided.

The functions of the first unidirectional intra predicted image generation unit 1301 and the second unidirectional intra predicted image generation unit 1302 are the same. In either case, a prediction image signal corresponding to a prediction mode given according to prediction mode information controlled by the encoding control unit 115 is generated. A first predicted image signal 1304 is output from the first unidirectional intra predicted image generation unit 1301, and a second predicted image signal 1305 is output from the second unidirectional intra predicted image generation unit 1302. Each predicted image signal is input to the weighted average unit 1303, and weighted average processing is performed. The output signal of the weighted average unit 1303 is also called a third predicted image signal.

The table in FIG. 14 is a table for deriving two unidirectional intra prediction modes from the bidirectional intra prediction mode. Here, BiPredIdx is derived using Equation (4).

For example, when PuSize = PU — 8 × 8 and IntraPredMode = 33, it can be seen from FIG. 7 that IntraUniModeNum = 33, and thus BiPredIdx = 0. As a result, it is derived from FIG. 14 that the first unidirectional intra prediction mode (MappedBi2Uni (0, idx)) is 1 and the second unidirectional intra prediction mode (MappedBi2Uni (1, idx)) is 0. In other PuSize and IntraPredMode, it is possible to derive two prediction modes by the same method. Hereinafter, the first unidirectional intra prediction mode is expressed as IntraPredModeL0, and the second unidirectional intra prediction mode is expressed as IntraPredModeL1.

Thus, the first predicted image signal 1304 and the second predicted image signal 1305 generated by the first unidirectional intra predicted image generation unit 1301 and the second unidirectional intra predicted image generation unit 1302 are sent to the weighted average unit 1303. Entered.

The weighted average unit 1303 calculates the Euclidean distance or the city area distance (Manhattan distance) based on the prediction directions of IntraPredModeL0 and IntraPredModeL1, and derives a weight component used in the weighted average process. The weight component of each pixel is represented by the Euclidean distance from the reference pixel used for prediction or the reciprocal of the urban distance, and is generalized by the following equation.

Here, when the Euclidean distance is used, ΔL is expressed by the following equation.

On the other hand, when using the city distance, ΔL is expressed by the following equation.

The weight table for each prediction mode is generalized by the following equation.

Here, ρ _L0 (n) represents the weight component of the pixel position n in IntraPredModeL0, and ρ _L1 (n) represents the weight component of the pixel position n in IntraPredModeL1. Therefore, the final prediction signal at the pixel position n is expressed by the following equation.

Here, BiPred (n) represents a predicted image signal at the pixel position n, and PredL0 (n) and PredL1 (n) are predicted image signals of IntraPredModeL0 and IntraPredModeL1, respectively.

In this embodiment, the prediction signal is generated by selecting two prediction modes for generating the prediction pixel. However, as another embodiment, a prediction value may be generated by selecting three or more prediction modes. . In this case, the ratio of the reciprocal of the spatial distance from the reference pixel to the prediction pixel may be set as the weighting factor.

In this embodiment, the Euclidean distance from the reference pixel used in the prediction mode or the reciprocal of the urban area distance is directly used as a weight component. However, as a modification of the present embodiment, the Euclidean distance and the urban area distance from the reference pixel are variables. The weight component may be set using the distribution model. The distribution model uses at least one of a linear model, an M-order function (M ≧ 1), a nonlinear function such as a one-sided Laplace distribution or a one-sided Gaussian distribution, and a fixed value that is a fixed value regardless of the distance from the reference pixel. When a one-sided Gaussian distribution is used as a model, the weight component is expressed by the following equation.

Here, ρ (n) is a weight component at the position n of the predicted pixel, σ ² is variance, and A is a constant (A> 0).

When the one-sided Laplace distribution is used as a model, the weight component is expressed by the following equation.

Here, σ is a standard deviation, and B is a constant (B> 0).

Further, an isotropic correlation model obtained by modeling an autocorrelation function, an elliptic correlation model, a generalized Gaussian model obtained by generalizing a Laplace function or a Gaussian function may be used as the weight component model.

When the weight components represented by Equation (5), Equation (8), Equation (10), and Equation (11) are calculated each time the predicted image is generated, a plurality of multipliers are required, and the hardware scale increases. . For this reason, the circuit scale required for the said calculation can be reduced by calculating a weight component beforehand according to the relative distance for every prediction mode, and hold | maintaining in a memory. Here, a method for deriving the weight component when the city distance is used will be described. Here, the relative distance is a distance between a prediction target pixel and a reference pixel with respect to a certain prediction direction.

The city area distance ΔL _L0 of IntraPredMode _L0 and the city area distance ΔL _{L1 of} IntraPredMode _L1 are calculated from Equation (7). Here, the relative distance varies depending on the prediction direction of the two prediction modes. As an example, typical distances in the case of PuSize = PU — 4 × 4 are shown in FIGS. FIG. 15A shows the city distance when IntraPredModeLX = 0. FIG. 15B shows the city distance in the case of IntraPredModeLX = 1. FIG. 15C shows the city distance in the case of IntraPredModeLX = 3. Similarly, the distance can be derived using Expression (6) or Expression (7) according to each prediction mode. However, in the case of DC prediction with IntraPredModeLX = 2, the distance is 2 at all pixel positions. FIG. 16 shows a table of distances in five typical prediction modes when PuSize = PU — 4 × 4. When the number of IntraPredModeLX is large, the table sizes of these distance tables may increase.

In the present embodiment, the required memory amount is reduced by sharing a distance table in a prediction mode with several prediction angles close to each other. FIG. 17 shows the mapping of IntraPredModeLX used for distance table derivation. Here, an example is shown in which a table of only the prediction mode corresponding to the prediction mode corresponding to the prediction mode and the DC prediction in 45 degrees is prepared, and other prediction angles are mapped closer to the prepared reference prediction mode. ing. When the distance from the reference prediction mode is the same, the index is mapped to the smaller one. The prediction mode shown in “MappedIntraPredMode” is referred to from FIG. 17, and a distance table can be derived.

By using the distance table, the relative distance for each pixel in the two prediction modes is calculated using the following equation.

Here, BLK_WIDTH and BLK_HEIGHT indicate the width and height of the pixel block (prediction unit), respectively, and DistDiff (n) indicates the relative distance between the two prediction modes at the pixel position n. Using Equation (12), the final prediction signal at the pixel position n is expressed by the following equation.

Here, in order to avoid an increase in hardware scale due to the use of decimal point arithmetic, when weight components are scaled in advance and converted to integer arithmetic, the following expression is obtained.

Here, for example, when the decimal part is expressed with 10-bit precision, WM = 1024, Offset = 512, and SHIFT = 10. These satisfy the following relationship.

SHIFT indicates the calculation accuracy of the decimal point calculation of the weight component, and an optimal combination may be selected by balancing the coding performance and the circuit scale at the time of hardware implementation.

FIGS. 18A and 18B show examples in which weight components using the one-sided Laplace distribution model in the present embodiment of the present invention are tabulated. FIG. 18A shows a weight component table in the case of PuSize = PU — 4 × 4. FIG. 18B shows a weight component table in the case of PuSize = PU_8 × 8. Other PuSizes can also be derived using Equation (5), Equation (8), Equation (10), and Equation (11).
The above is the details of the intra prediction unit 109 according to the present embodiment of the present invention.

<Modification of processing amount reduction on the encoding unit side of the intra prediction unit 109>
As a modification of the present embodiment, the internal configuration of the intra prediction unit 109 may be the configuration shown in FIG. In this case, compared with the configuration of the intra prediction unit 109 shown in FIG. 6, an image buffer 1901 is added, and the bidirectional intra predicted image generation unit 602 is replaced with a weighted average unit 1303. The primary image buffer 1901 has a function of temporarily storing the prediction image signal 126 for each prediction mode generated by the unidirectional intra prediction image generation unit 601 in the buffer, and the prediction controlled by the encoding control unit 115. Depending on the mode, the prediction image signal 126 corresponding to the necessary prediction mode is output to the weighted average unit 1303. This eliminates the need for the bidirectional intra predicted image generation unit 602 to hold the first unidirectional intra predicted image generation unit 1301 and the second unidirectional intra predicted image generation unit 1302, thereby reducing the hardware scale. It becomes possible.

<Syntax structure 1>
Hereinafter, the syntax used by the image coding apparatus 100 in FIG. 1 will be described.
The syntax indicates the structure of encoded data (for example, encoded data 127 in FIG. 1) when the image encoding device encodes moving image data. When decoding the encoded data, the moving picture decoding apparatus interprets the syntax with reference to the same syntax structure. FIG. 20 shows an example of syntax 2000 used by the moving picture encoding apparatus of FIG.

The syntax 2000 includes three parts: a high level syntax 2001, a slice level syntax 2002, and a coding tree level syntax 2003. The high level syntax 2001 includes syntax information of a layer higher than the slice. A slice refers to a rectangular area or a continuous area included in a frame or a field. The slice level syntax 2002 includes information necessary for decoding each slice. The coding tree level syntax 2003 includes information necessary for decoding each coding tree (ie, each coding tree unit). Each of these parts includes more detailed syntax.

The high level syntax 2001 includes sequence and picture level syntaxes such as a sequence parameter set syntax 2004 and a picture parameter set syntax 2005. The slice level syntax 2002 includes a slice header syntax 2006, a slice data syntax 2007, and the like. The coding tree level syntax 2003 includes a coding tree unit syntax 2008, a prediction unit syntax 2009, and the like.

The coding tree unit syntax 2008 can have a quadtree structure. Specifically, the coding tree unit syntax 2008 can be recursively called as a syntax element of the coding tree unit syntax 2008. That is, one coding tree unit can be subdivided with a quadtree. The coding tree unit syntax 2008 includes a transform unit syntax 2010. The transform unit syntax 2010 is called at each coding tree unit syntax 2008 at the extreme end of the quadtree. The transform unit syntax 2010 describes information related to inverse orthogonal transformation and quantization.

FIG. 21 illustrates the slice header syntax 2006 according to the present embodiment. The slice_bipred_intra_flag shown in FIG. 21 is a syntax element indicating, for example, validity / invalidity of bidirectional intra prediction according to the present embodiment for the slice.

When slice_bipred_intra_flag is 0, the bidirectional intra according to this embodiment in the slice is invalid. Therefore, the orthogonal transform unit 102 and the inverse orthogonal transform unit 105 perform only unidirectional intra prediction. As an example of unidirectional intra prediction, prediction in which IntraBipredFlag [] in FIGS. Intra prediction specified in H.264 may be performed.

As an example, when slice_bipred_intra_flag is 1, the bidirectional intra prediction according to the present embodiment is effective in the entire area in the slice.

As another example, when slice_bipred_intra_flag is 1, in the syntax of a lower layer (coding tree unit, transform unit, etc.), the prediction validity / efficiency according to the present embodiment is determined for each local region in the slice. Invalidity may be specified.

FIG. 22A shows an example of the prediction unit syntax. Pred_mode in the figure indicates the prediction type of the prediction unit. MODE_INTRA indicates that the prediction type is intra prediction. intra_split_flag is a flag indicating whether or not the prediction unit is further divided into four prediction units. When intra_split_flag is 1, a prediction unit is obtained by dividing a prediction unit into four in half in the vertical and horizontal sizes. When intra_split_flag is 0, the prediction unit is not divided.

Intra_luma_bipred_flag [i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is a unidirectional intra prediction mode or a bidirectional intra prediction mode. i indicates the position of the divided prediction unit. When the intra_split_flag is 0, 0 is set, and when the intra_split_flag is 1, 0 to 3 are set. The flag is set with the value of IntraBiprededFlag of the prediction unit shown in FIGS.

When intra_luma_bipred_flag [i] is 1, this indicates that the prediction unit is bi-directional intra prediction, and is information that identifies the used bi-directional intra prediction mode among a plurality of prepared bi-directional intra prediction modes. Intra_luma_bipred_mode [i] is encoded. intra_luma_bipred_mode [i] may be encoded with the isometric length according to the number of bidirectional intra prediction modes IntraBiModeNum shown in FIG. 7, or may be encoded using a predetermined code table. When intra_luma_bipred_flag [i] is 0, it indicates that the prediction unit is unidirectional intra prediction, and predictive encoding is performed from adjacent blocks.

prev_intra_luma_unipred_flag [i] is a flag indicating whether or not the prediction value MostProbable of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. Details of the MostProbable calculation method will be described later. When prev_intra_luma_unipred_flag [i] is 1, it indicates that the MostProbable and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_flag [i] is 0, it indicates that the MostProbable and the intra prediction mode IntraPredMode are different, and the information rem_intraiprecoded_code that specifies whether the intra prediction mode IntraPredMode is a mode other than MostProbable. . rem_intra_luma_unipred_mode [i] may be encoded with the isometric length according to the bidirectional intra prediction mode number IntraUniModeNum shown in FIG. 7, or may be encoded using a predetermined code table. From the intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using the following equation.

Next, a method for calculating MostProbable that is a predicted value in the prediction mode will be described. MostProbable is calculated according to the following equation.

Min (x, y) is a parameter for outputting the smaller one of the inputs x and y.

Also, intraPredModeA and intraPredModeB indicate intra prediction modes of prediction units adjacent to the left and above the encoded prediction unit. Hereinafter, intraPredModeA and intraPredModeB are collectively expressed as intraPredModeN. N is set to A or B. A method of calculating intraPredModeN will be described using the flowchart shown in FIG. First, it is determined whether a coding tree unit to which an adjacent prediction unit belongs can be used (step S2301). If the coding tree unit cannot be used (NO in S2301), reference to intraPredModeN is not possible. “−1” indicating “” is set. On the other hand, if the coding tree unit is available (YES in S2301), it is next determined whether or not intra prediction is applied to the adjacent prediction unit (step S2302). When the adjacent prediction unit is not intra prediction (NO in S2302), “2” meaning “Intra_DC” is set in intraPredModeN. On the other hand, when the adjacent prediction unit is intra prediction (YES in S2302), it is next determined whether or not the adjacent prediction unit is bidirectional intra prediction (step S2303). When the adjacent prediction unit is not bidirectional intra prediction, that is, in the case of unidirectional intra prediction (NO in S2303), the prediction mode IntraPredMode of the adjacent prediction unit is set in intraPredModeN. On the other hand, when the adjacent PU is bidirectional intra prediction (S2303 is YES), the prediction mode of the adjacent block is converted to the unidirectional intra prediction mode. Specifically, intraPredModeN is calculated using the following equation.

Here, IntraUniModeNum is the number of unidirectional intra prediction modes determined by the size of the adjacent prediction unit, and an example thereof is shown in FIG. Also, “MappedBi2Uni (List, idx)” is a table for converting the bidirectional intra prediction mode into the unidirectional intra prediction mode. List is the unidirectional intra prediction mode of List0 (corresponding to IntraPredTypeL0 [] shown in FIGS. 9, 10, and 11) of the two unidirectional intra prediction modes constituting the bidirectional intra prediction mode. Is a flag indicating whether to use the unidirectional intra prediction mode of List1 or the IntraPredTypeL1 [] shown in FIGS. 9, 10, and 11). List1 is used for conversion to the unidirectional intra prediction mode. FIG. 14 shows an example of the conversion table. The numerical values in the figure correspond to IntraPredMode shown in FIGS.

When MostProbable calculated using intraPredModeN calculated as described above and Expression (18) is −1, MostProbable is replaced with 2 (Intra_DC). Further, when MostProbable is larger than the number of unidirectional intra prediction modes IntraUniPredModeNum of the encoded prediction unit, MostProbable is recalculated using the following equation.

“MappedMostProble ()” is a table for converting MostProbable, and an example is shown in FIG. 24.

<Syntax structure 2>
Next, another example of the prediction unit syntax is shown in FIG. 22C. Since pred_mode and intra_split_flag are the same as the syntax example described above, description thereof is omitted. luma_pred_mode_code_type [i] indicates the type of the prediction mode IntraPredMode applied to the prediction unit, where 0 (IntraUnifiedMostProb) is unidirectional intra prediction and the intra prediction mode is the same as MostProbable, 1 (IntraUnipre intrareprediction) The intra prediction mode is different from MostProbable, and 2 (IntraBipred) indicates a bidirectional intra prediction mode. FIG. 24 shows bidirectional intra prediction modes, 1 and unidirectional intra prediction (whether or not the same prediction mode as MostProbable). FIG. 25 shows an example of assignment of the number of modes according to the meaning corresponding to luma_pred_mode_code_type, bin, and the mode configuration shown in FIG. When luma_pred_mode_code_type [i] is 0, the intra prediction mode is the MostProbable mode, so no further information encoding is necessary. When luma_pred_mode_code_type [i] is 1, information rem_intra_luma_unipred_mode [i] that specifies which mode other than MostProbable is the intra prediction mode IntraPredMode is encoded. rem_intra_luma_unipred_mode [i] may be encoded with the isometric length according to the bidirectional intra prediction mode number IntraUniModeNum shown in FIG. 7, or may be encoded using a predetermined code table. From the intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using Equation (16). Further, when luma_pred_mode_code_type [i] is 2, it indicates that the prediction unit is bidirectional intra prediction, and information that identifies the used bidirectional intra prediction mode among the prepared bidirectional intra prediction modes. Intra_luma_bipred_mode [i] is encoded. intra_luma_bipred_mode [i] may be encoded with the isometric length according to the number of bidirectional intra prediction modes IntraBiModeNum shown in FIG. 7, or may be encoded using a predetermined code table.
The above is the syntax configuration according to the present embodiment.

<Syntax structure 3>
Yet another example of the prediction unit syntax is shown in FIG. 22D. In this example, based on the prediction unit syntax shown in FIG. 22A, whether bidirectional intra prediction can be used or whether conventional intra unidirectional prediction can be used with bidirectional intra prediction disabled. Shows the syntax for switching within the encoding prediction unit. In the case where the bidirectional intra prediction cannot be used and only the conventional unidirectional intra prediction can be used, the table shown in FIG. 43 may be used instead of FIG. 9, and the IntraPredMode in FIG. The table may be ignored. FIG. 43 is a table in which IntraPredTypeL1 and IntraPredAngleIdL1 indicating information related to the second prediction mode at the time of bidirectional intra prediction are deleted from FIG. 9, and unnecessary IntraPredMode 33 or more tables are deleted. The same structure regarding FIG. 42 and FIG. 9 is applicable also regarding FIG. 10, FIG.
Note that pred_mode and intra_split_flag are the same as the syntax example described above, and thus description thereof is omitted.

Intra_bipred_flag is a flag indicating whether or not bi-directional intra prediction can be used in the encoded prediction unit. When intra_bipred_flag is 0, it indicates that bi-directional intra prediction is not used in the encoded prediction unit. Even when intra_split_flag is 1, that is, when the encoded prediction unit is further divided into four, bi-directional intra prediction is not used in all prediction units, and only uni-directional intra prediction is effective.

When intra_bipred_flag is 1, it indicates that bi-directional intra prediction can be used in the encoded prediction unit. Even when intra_split_flag is 1, that is, when the encoded prediction unit is further divided into four, in all prediction units, bidirectional intra prediction can be selected in addition to unidirectional intra prediction.

In a region where bi-directional intra prediction is not necessary (for example, a flat region), intra_bipred_flag is encoded as 0 to disable bi-directional intra prediction. Since the amount of codes necessary for encoding can be reduced, encoding efficiency is improved.

<Syntax structure 4>
Still another example relating to the prediction unit syntax is shown in FIG. 22E. In this example, based on the prediction unit syntax shown in FIG. 22C, whether bidirectional intra prediction can be used or whether only conventional unidirectional intra prediction can be used with bidirectional intra prediction disabled. Shows the syntax for switching within the encoding prediction unit. intra_bipred_flag is a flag indicating whether or not bi-directional intra prediction can be used in the encoding prediction unit, and is the same as the above-described intra_bipred_flag, and thus the description thereof is omitted.

(First modification)
<First Modification of Intra Prediction Unit>
As a first modification related to the intra prediction unit 109, in combination with adaptive reference pixel filtering shown in JCTVC-B205_draft002, section 5.2.1 “Intra prediction process for luma samples”, JCT-VC 2nd Meeting Geneva, July, 2010 It doesn't matter. FIG. 26 shows the intra prediction unit 109 when adaptive reference pixel filtering is used. 6 is different from the intra prediction unit 109 shown in FIG. 6 in that a reference pixel filter unit 2601 is added. The reference pixel filter unit 2601 receives the reference image signal 124 and the prediction mode 605, performs an adaptive filtering process described later, and outputs a filtered reference image signal 2602. The filtered reference image signal 2602 is input to the unidirectional intra predicted image generation unit 601 and the bidirectional intra predicted image generation unit 602. The configuration and processing other than the reference pixel filter unit 2601 are the same as those of the intra prediction unit 109 shown in FIG.

Next, the reference pixel filter unit 2601 will be described. The reference pixel filter unit 2601 determines whether or not to filter reference pixels used for intra prediction according to the reference pixel filter flag and the intra prediction mode included in the prediction mode 605. The reference pixel filter flag is a flag indicating whether or not reference pixels are filtered when the intra prediction mode IntraPredMode is a value other than “Intra_DC”. When the reference pixel filter flag is 1, the reference pixel is filtered. In the case of the reference pixel filter flag 0, the reference pixel is not filtered. When IntraPredMode is “Intra_DC”, the reference pixel is not filtered and the reference pixel filter flag is set to 0. When the reference pixel filter flag is 1, a filtered reference image signal 2602 is calculated by the following filtering. Note that p [x, y] indicates a reference pixel before filtering, and pf [x, y] indicates a reference pixel in filter terms. Further, x and y indicate relative positions of the reference pixels when the upper left pixel position in the prediction unit is x = 0 and y = 0. PuPartSize indicates the size (pixel) of the prediction unit.

<Syntax structure 5>
27A and 27B show a prediction unit syntax structure when performing adaptive reference pixel filtering. FIG. 27A adds the syntax intra_luma_filter_flag [i] related to the adaptive reference pixel filter to FIG. 22A. In addition, FIG. 27B adds syntax intra_luma_filter_flag [i] related to the adaptive reference pixel filter to FIG. 22C. intra_luma_filter_flag [i] is further encoded when the intra prediction mode IntraPredMode [i] is other than Intra_DC. When the flag is 0, it indicates that the reference pixel is not filtered. Further, when intra_luma_filter_flag [i] is 1, it indicates that the reference pixel filtering is applied.

In the above example, intra_luma_filter_flag [i] is encoded when the intra prediction mode IntraPredMode [i] is other than Intra_DC. As another example, when IntraPredMode [i] is 0 to 2, intra_luma_filter_flag [i ] May not be encoded. In this case, intra_luma_filter_flag [i] is set to 0.

In addition, the intra_luma_filter_flag [i] described above may be added in the same meaning for the other syntax structures shown in FIGS. 22B, 22D, and 22E.

(Second modification)
<Second Modification of Intra Prediction Unit>
As a second modification of the intra prediction unit 109, it may be used in combination with the composite intra prediction shown in JCTVC-B205_draft002, section 9.6 “Combined Intra Prediction”, JCT-VC 2nd Meeting Geneva, July, 2010 . In the decoded intra prediction in this document, a prediction value is obtained by performing weighted averaging of the result of the above-described unidirectional intra prediction and the average value of pixels adjacent to the left, top, and top left with respect to the prediction pixel. When the decoded image signal 122 is calculated in the moving image decoding device 4400 or the image encoding device 100, it is possible to use decoded pixels as pixels adjacent to the left, upper, and upper left. On the other hand, since it is impossible to use decoded pixels before the decoded image signal 122 is calculated in the image encoding device 100, the input image signal 116 is used as a pixel adjacent to the left, upper, and upper left. FIG. 28 shows positions of adjacent decoded pixels A (left), B (upper), and C (upper left) used for prediction of the prediction target pixel X. Therefore, composite intra prediction is a so-called open-loop prediction method in which prediction values differ between the image encoding device 100 and the moving image decoding device 4400.

FIG. 30 shows a block diagram of the intra prediction unit 4408 (109) when combined with composite intra prediction. A difference is that a composite intra predicted image generation unit 2901, a selection switch 2902, and a decoded image buffer 3001 are added to the intra prediction unit 109 shown in FIG.

When the bidirectional intra prediction and the composite intra prediction are combined, first, in the selection switch 604, the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit according to the prediction mode information controlled by the encoding control unit 115. The output terminal of 602 is switched. Hereinafter, the output predicted image signal 126 is referred to as a direction predicted image signal 126.

Then, the direction prediction image signal is input to the composite intra prediction image generation unit 2901, and a prediction image signal 4420 (126) in the composite intra prediction is generated. The description of the composite intra predicted image generation unit 2901 will be described later. Thereafter, in the selection switch 2902, which one of the prediction image signal 4420 (126) and the direction prediction image signal in the composite intra prediction is used according to the composite intra prediction application flag in the prediction mode information controlled by the encoding control unit 115. And the final predicted image signal 4420 (126) in the intra prediction unit 4408 (109) is output. When the composite intra prediction application flag is 1, the predicted image signal 4420 (126) output from the composite intra predicted image generation unit 2901 is the final predicted image signal 4420 (126). On the other hand, when the composite intra prediction application flag is 0, the direction prediction image signal 4420 (126) is the prediction image signal 4420 (126) that is finally output. The predicted image signal output from the composite intra predicted image generation unit 2901 is also called a sixth predicted image signal.

When the prediction image signal 4420 (126) is generated from the composite intra prediction image generation unit 2901, the decoded prediction error signal 4416 separately decoded by the addition unit 4405 is added to the pixel unit to generate a decoded image signal 4417 for each pixel. And stored in the decoded pixel buffer 3001. The stored decoded image signal 4417 in units of pixels is input to the composite intra predicted image generation unit 2901 as the reference pixel 3002, and is used for pixel level prediction described later as the adjacent pixel 3104 shown in FIG.

Next, the composite intra prediction image generation unit 2901 will be described with reference to FIG. The composite intra prediction image generation unit 2901 includes a pixel level prediction signal generation unit 3101 and a composite intra prediction calculation unit 3102. The pixel level prediction signal generation unit 3101 receives the reference pixel 3002 as the adjacent pixel 3104, and outputs the pixel level prediction signal 3103 by predicting the prediction target pixel X from the adjacent pixel. Specifically, the pixel level prediction signal 3103 (X) of the prediction target pixel is calculated from A, B, and C indicating the adjacent pixel 3104 using Expression (21).

Note that the coefficients related to A, B, and C may be other values.

The composite intra prediction calculation unit 3102 performs a weighted average of the direction prediction image signal 126 (X ′) and the pixel level prediction signal 3103 (X), and outputs a final prediction image signal 126 (P). Specifically, the following formula is used.

Note that W is a weighted average weighting factor (an integer value between W = 0 and 32) of the direction prediction image signal 126 (X ′) and the pixel level prediction signal 3103 (X).

When the prediction image signal 126 is generated using the composite intra prediction, and the prediction error signal 117 and the decoded image signal 122 are further generated, the decoded image signal 122 may have different values in encoding and decoding. . Therefore, after all the decoded image signals 122 in the encoded prediction syntax are generated, the above-described combined intra prediction is performed again using the decoded image signal 122 as an adjacent pixel, so that the same predicted image signal 126 as that in the decoding is obtained. Is further added to the prediction error signal 117, and the decoded image signal 122 identical to the decoding can be generated.

The above is an embodiment when combined with composite intra prediction.

(Third Modification)
<Third Modification of Intra Prediction Unit>
The weighting factor W may be switched according to the position of the prediction pixel in the prediction unit. In general, a prediction image signal generated using unidirectional intra prediction and bidirectional intra prediction generates a prediction value from spatially adjacent reference pixels positioned on the left or above already encoded. The absolute value of the prediction error tends to increase as the distance from the reference pixel increases. Therefore, the weighting coefficient of the direction prediction image signal 126 and the pixel level prediction signal 3103 is increased when the weight coefficient of the direction prediction image signal 126 is close to the reference pixel, and is decreased when the distance is far away, thereby improving the prediction accuracy. It becomes possible.

On the other hand, in the complex intra prediction, a prediction error signal is generated using an input image signal at the time of encoding. At this time, since the pixel level prediction signal 3103 becomes an input image signal, even if the spatial distance between the reference pixel position and the prediction pixel position is increased, the prediction of the pixel level prediction signal 3103 is compared with the direction prediction image signal 126. High accuracy. However, the weighting coefficient of the direction prediction image signal 126 and the pixel level prediction signal 3103 is simply increased when the weight coefficient of the direction prediction image signal 126 is close to the reference pixel, and is decreased when the distance is small. Although the prediction error is reduced, there is a problem that the prediction accuracy at the time of encoding and the prediction value at the time of local decoding are different and the prediction accuracy is lowered. Therefore, especially when the value of the quantization parameter is large, as the spatial distance between the reference pixel position and the predicted pixel position becomes large, the difference generated in the case of such an open loop is set by setting the value of W small. A decrease in coding efficiency due to the phenomenon can be suppressed.

<Syntax structure 6>
32A and 32B show a prediction unit syntax structure when performing composite intra prediction. FIG. 32A is different from FIG. 22A in that a syntax combined_intra_pred_flag for switching presence / absence of composite intra prediction is added. This is equivalent to the above-described composite intra prediction application flag. In addition, FIG. 32B adds a syntax combined_intra_pred_flag for switching presence / absence of composite intra prediction to FIG. 22C. When combined_intra_pred_flag is 1, the selection switch 2902 shown in FIG. 30 is connected to the output terminal of the composite intra prediction image generation unit 2901. When combined_intra_pred_flag is 0, the selection switch 2902 shown in FIG. 30 is connected to the output terminal of either the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit 602 to which the selection switch 604 is connected. .

Furthermore, it may be combined with the second embodiment of the intra prediction unit. The above is the description regarding another embodiment of the intra prediction unit 109.

According to the first embodiment described above, highly efficient intra prediction can be realized. Therefore, the coding efficiency is improved, and the subjective image quality is also improved.

(Second Embodiment)
<Moving Image Encoding Device—Second Embodiment>
The video encoding apparatus according to the second embodiment differs from the above-described image encoding apparatus according to the first embodiment in the details of orthogonal transform and inverse orthogonal transform. In the following description, in the present embodiment, the same parts as those in the first embodiment are denoted by the same indexes, and different parts will be mainly described. A moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a fifth embodiment.

FIG. 33 is a block diagram showing a video encoding apparatus according to the second embodiment. The change from the moving picture encoding apparatus according to the first embodiment is that a transformation selection unit 3301 and a coefficient order control unit 3302 are added. Also, the internal structures of the orthogonal transform unit 102 and the inverse orthogonal transform unit 105 are different. Hereinafter, the process of FIG. 33 will be described.

First, the orthogonal transform unit 102 and the inverse orthogonal transform unit 105 will be described with reference to FIGS.

<Orthogonal transform unit 102>
34 includes a first orthogonal transform unit 3401, a second orthogonal transform unit 3402, an Nth orthogonal transform unit 3403, and a transform selection switch 3404. Although an example having N types of orthogonal transform units is shown here, there may be a plurality of transform sizes using the same orthogonal transform method, or there may be a plurality of orthogonal transform units performing different orthogonal transform methods. . Moreover, each may be mixed. For example, the first orthogonal transform unit 3401 can be set to 4 × 4 size DCT, the second orthogonal transform unit 3402 can be set to 8 × 8 size DCT, and the Nth orthogonal transform unit 3403 can be set to 16 × 16 size DCT. The first orthogonal transform unit 3401 is 4 × 4 size DCT, the second orthogonal transform unit 3402 is 4 × 4 size DST (discrete sine transform), and the Nth orthogonal transform unit 3403 is 8 × 8 size KLT (Karunen-Labe transform). ) Can also be set. In addition, it is possible to select a transform that is not orthogonal transform, or a single transform. In this case, N = 1 is considered.

First, the conversion selection switch 3404 will be described. The conversion selection switch 3404 has a function of selecting the output terminal of the subtraction unit 101 according to the conversion selection information 3303. The transformation selection information 3303 is one piece of information controlled by the encoding control unit 115 and is set by the transformation selection unit 3301 according to the prediction information 125. For example, H.M. In H.264, 4 × 4 DCT is set for intra prediction of a 4 × 4 pixel block (prediction unit), and 8 × 8 DCT is set for intra prediction of an 8 × 8 pixel block (prediction unit). In the present embodiment, when the transformation selection information 3303 indicates the first orthogonal transformation, the output terminal of the switch is connected to the first orthogonal transformation unit 3401. On the other hand, when the transformation selection information 3303 is the second orthogonal transformation, the output end is connected to the second orthogonal transformation unit 3402.

Next, processing of the first orthogonal transform unit 3401 to the Nth orthogonal transform unit 3403 will be described. In the present embodiment, an example in which one of N orthogonal transform units is DCT and the other is KLT (Karunen-Loeve transform) will be described. Here, the first orthogonal transform unit 3401 performs DCT, and the other

orthogonal transform units

3402 and 3403 perform KLT (Carhunen-Labe transform).

<Inverse orthogonal transform unit 105>
The inverse orthogonal transform unit 105 in FIG. 35 includes a first inverse orthogonal transform unit 3501, a second inverse orthogonal transform unit 3502, an Nth inverse orthogonal transform unit 3503, and a transform selection switch 3504. First, the conversion selection switch 3504 will be described. The conversion selection switch 3504 has a function of selecting the output terminal of the inverse quantization unit 104 according to the input conversion selection information 3303. The transformation selection information 3303 is one piece of information controlled by the encoding control unit 115 and is set by the transformation selection unit 3301 according to the prediction information 125.

When the transform selection information 3303 is the first orthogonal transform, the output terminal of the switch is connected to the first inverse orthogonal transform unit 3501. On the other hand, when the transformation selection information 3303 is the second orthogonal transformation, the output end is connected to the second inverse orthogonal transformation unit 3502. Similarly, when the transform selection information 3303 is the Nth orthogonal transform, the output terminal is connected to the Nth inverse orthogonal transform unit 3503. Here, the transform selection information 3303 set in the orthogonal transform unit 102 and the transform selection information 3303 set in the inverse orthogonal transform unit 105 are the same, and the inverse orthogonal transform corresponding to the transform performed in the orthogonal transform unit 102 is performed. This is performed synchronously by the inverse orthogonal transform unit 105. That is, the first inverse orthogonal transform unit 3501 performs inverse discrete cosine transform (hereinafter referred to as IDCT), and the second inverse orthogonal transform unit 3502 and the Nth inverse orthogonal transform unit 3503 are based on KLT (Karunen-Labe transform). Inverse transformation is performed. Although an example using IDCT or the like is shown here as an example, orthogonal transformation such as Hadamard transformation or discrete sine transformation may be used, or non-orthogonal transformation may be used. In any case, the corresponding inverse conversion is performed in conjunction with the conversion unit 102.

<Conversion selection unit 3301>
Next, the conversion selection unit 3301 shown in FIG. 33 will be described. Prediction information 125 controlled by the encoding control unit 115 and including the prediction mode set by the prediction selection unit 112 is input to the transform selection unit 3301. Based on the prediction information 125, the transform selection unit 3301 has a function of setting MapdTransformIdx information indicating which orthogonal transform is used for which prediction mode. FIG. 36 shows conversion selection information 3303 (MappedTransformIdx) in intra prediction. Here, an example of N = 9 is shown. Note that the first orthogonal transform unit 3401 and the corresponding first inverse orthogonal transform unit 3501 are selected at the time of DC prediction corresponding to IntraPredModeLX = 2. By mapping to the reference prediction mode with a close prediction angle in this way, compared to the case of preparing an orthogonal transformer and an inverse orthogonal transformer for all prediction modes, orthogonal transformation and inverse orthogonal transformation at the time of hardware implementation It is possible to reduce the circuit scale. When bi-directional intra prediction is selected, two IntraPredModeL0 and IntraPredModeL1 are derived according to FIG. 14, respectively, and then the mapped transformIdx is derived from FIG. 36 using a prediction mode corresponding to IntraPredModeL1. In the present embodiment of the present invention, an example of N = 9 is shown, but the value of N may be selected in an optimal combination by balancing the coding performance and the circuit scale at the time of hardware implementation.

<Coefficient order control unit 3302>
Next, the coefficient order control unit 3302 will be described. FIG. 37 shows a block diagram of the coefficient order control unit 3302. The coefficient order control unit 3302 includes a coefficient order selection switch 3704, a first coefficient order conversion unit 3701, a second coefficient order conversion unit 3702, and an Nth coefficient order conversion unit 3703. The coefficient order selection switch 3704 has a function of switching the output terminal of the switch and the coefficient order conversion units 3701 to 3703 in accordance with, for example, the mapped transform IDx shown in FIG. The N types of coefficient forward conversion units 3701 to 3703 have a function of converting the two-dimensional data of the quantized conversion coefficient 119 quantized by the quantization unit 103 into one-dimensional data. For example, H.M. In H.264, two-dimensional data is converted into one-dimensional data using a zigzag scan.

When orthogonal transform in consideration of the intra prediction direction is used, the quantized transform coefficient 119 obtained by performing the quantization process on the transform coefficient 118 subjected to the orthogonal transform has a characteristic that the tendency of generating non-zero transform coefficients in the block is biased. have. The tendency of occurrence of this non-zero transform coefficient has different properties for each prediction direction of intra prediction. However, when different videos are encoded, the generation tendency of non-zero transform coefficients in the same prediction direction has a similar property. Therefore, when transforming two-dimensional data into one-dimensional data (2D-1D conversion), entropy coding is performed preferentially from transform coefficients at positions where the occurrence probability of non-zero transform coefficients is high, thereby encoding transform coefficients. It is possible to reduce information. Therefore, by learning the occurrence probability of a non-zero conversion coefficient in advance based on information representing a prediction direction such as a prediction mode included in the prediction information 125, for example, H.264. Compared with H.264, it is possible to reduce the code amount of the transform coefficient without causing an increase in the calculation amount.

As yet another example, the coefficient order control unit 3302 may dynamically update the scan order in 2D-1D conversion. The coefficient order control unit 3302 that performs such an operation is illustrated in FIG. The coefficient order control unit 3302 includes an occurrence frequency counting unit 3801 and an updating unit 3802 in addition to the configuration of FIG. The coefficient order conversion units 3701,..., 3703 are the same except that the scan order is updated by the coefficient order update unit 3802.

The occurrence frequency counting unit 3801 creates a histogram 3804 of the number of occurrences of non-zero coefficients in each element of the quantized transform coefficient sequence 3304 for each prediction mode. The occurrence frequency counting unit 3801 inputs the created histogram 3804 to the update unit 3802.

The update unit 3802 updates the coefficient order based on the histogram 3804 at a predetermined timing. The timing is, for example, the timing when the coding process of the coding tree unit is finished, the timing when the coding process for one line in the coding tree unit is finished, or the like.

Specifically, the update unit 3802 refers to the histogram 3804 and updates the coefficient order with respect to a prediction mode having an element in which the number of occurrences of non-zero coefficients is counted more than a threshold. For example, the update unit 3802 updates the prediction mode having an element in which the occurrence of a non-zero coefficient is counted 16 times or more. By providing a threshold value for the number of occurrences, the coefficient order is updated globally, so that it is difficult to converge to a local optimum solution.

The update unit 3802 sorts the elements in descending order of the occurrence frequency of the non-zero coefficient with respect to the prediction mode to be updated. Sorting can be realized by existing algorithms such as bubble sort and quick sort. Then, the update unit 3802 inputs the update coefficient order 3803 indicating the order of the sorted elements to the coefficient order conversion units 3701 to 3703 corresponding to the prediction mode to be updated.

When the update coefficient order 3803 is input, each conversion unit performs 2D-1D conversion according to the updated scan order. When the scan order is dynamically updated, the initial scan order of each 2D-1D conversion unit needs to be determined in advance. In this way, by dynamically updating the scan order, the tendency of occurrence of non-zero coefficients in the quantized transform coefficients 119 changes according to the influence of the properties of the predicted image, quantization information (quantization parameters), and the like. Even in this case, high encoding efficiency can be expected stably. Specifically, the generated code amount of run-length encoding in the entropy encoding unit 113 can be suppressed.
The syntax configuration in this embodiment is the same as that in the first embodiment.

As a modification of the present embodiment, the conversion selection unit 3301 can select the mapped transform IDx separately from the prediction information 125. In this case, information indicating which nine types of orthogonal transforms or inverse orthogonal transforms are used is set in the entropy encoding unit 113 and encoded together with the quantized transform coefficient sequence 3304. FIG. 39 shows an example of syntax in this modification. Directional_transform_idx indicated in the syntax indicates information indicating which of N orthogonal transforms has been selected.

According to the second embodiment described above, highly efficient orthogonal transformation and inverse orthogonal transformation can be realized while alleviating the difficulty in hardware implementation and software implementation. Therefore, the coding efficiency is improved, and the subjective image quality is also improved.

(Third embodiment)
<Video Encoding Device—Third Embodiment>
As an embodiment related to the orthogonal transformation unit 102, it may be combined with the rotation transformation shown in JCTVC-B205_draft002, section 5.3.5.2 “Rotational transformation process”, JCT-VC 2nd Meeting Geneva, July, 2010. Rotational transformation is a technique for further increasing the coefficient density of transformation coefficients by further performing rotational transformation after orthogonal transformation using DCT.

<Orthogonal transform unit 102>
FIG. 40 shows a block diagram of the orthogonal transform unit 102 according to the present embodiment. The orthogonal transform unit 102 includes new processing units such as a first rotation transform unit 4001, a second rotation transform unit 4002, an Nth rotation transform unit 4003, and a discrete cosine transform unit 4004, and has an existing transform selection switch 3404. The discrete cosine transform unit 4004 performs DCT, for example. The conversion coefficient after DCT is input to the conversion selection switch 3404. Here, the conversion selection switch 3404 connects the output end of the switch to one of the first rotation conversion unit 4001, the second rotation conversion unit 4002, and the Nth rotation conversion unit 4003 according to the conversion selection information 3303. For example, the switches are sequentially switched according to the control of the encoding control unit 115. The rotation conversion units 4001 to 4003 perform rotation conversion for each conversion coefficient using a predetermined rotation matrix. The conversion coefficient 118 after the rotation conversion is output. This conversion is a reversible conversion.

Here, it may be determined which rotation matrix is to be used by using the encoding cost as shown in Equation (1) and Equation (2). In addition, a table in which a prediction mode and a conversion number are associated with each other as shown in FIG. Here, an example in which the rotation conversion unit is applied before the quantization unit 103 is shown, but the rotation conversion unit may be applied to the quantization conversion coefficient 119 after the quantization process. In this case, the orthogonal transform unit 102 performs only DCT.

<Inverse orthogonal transform unit 105>
FIG. 41 is a block diagram of the inverse orthogonal transform unit 105 according to the present embodiment. The inverse orthogonal transform unit 105 includes new processing units such as a first inverse rotation transform unit 4101, a second inverse rotation transform unit 4102, an Nth inverse rotation transform unit 4103, and an inverse discrete cosine transform unit 4104, and an existing transform selection switch 3504. Have The restored transform coefficient 120 input after the inverse quantization process is input to the transform selection switch 3504. Here, the conversion selection switch 3504 connects the output terminal of the switch to one of the first reverse rotation conversion unit 4101, the second reverse rotation conversion unit 4102, and the Nth reverse rotation conversion unit 4103 according to the conversion selection information 3303. Thereafter, the same inverse rotation transform used in the orthogonal transform unit 102 is subjected to inverse rotation transform processing by any one of the inverse rotation transform units 4101 to 4103, and is output to the inverse discrete cosine transform unit 4104. The inverse discrete cosine transform unit 4104 performs, for example, IDCT on the input signal to restore the restored prediction error signal 121. Although an example using IDCT is shown here as an example, orthogonal transform such as Hadamard transform or discrete sine transform may be used, or non-orthogonal transform may be used. In any case, the corresponding inverse conversion is performed in conjunction with the conversion unit 102.

FIG. 42 shows the syntax in the present embodiment. The rotation_transform_idx shown in the syntax means the number of the rotation matrix to be used.

According to the third embodiment described above, highly efficient orthogonal transformation and inverse orthogonal transformation can be realized while alleviating the difficulty in hardware implementation and software implementation. Therefore, the coding efficiency is improved, and the subjective image quality is also improved.

(Fourth embodiment)
The fourth embodiment relates to a moving picture decoding apparatus. The video encoding device corresponding to the video decoding device according to the present embodiment is as described in the first embodiment. That is, the moving picture decoding apparatus according to the present embodiment decodes encoded data generated by, for example, the moving picture encoding apparatus according to the first embodiment.

As shown in FIG. 44, the moving picture decoding apparatus according to this embodiment includes an input buffer 4401, an entropy decoding unit 4402, an inverse quantization unit 4403, an inverse orthogonal transform unit 4404, an addition unit 4405, and a loop filter 4406. An image memory 4407, an intra prediction unit 4408, an inter prediction unit 4409, a prediction selection switch 4410, and an output buffer 4411 are included.

44 decodes the encoded data 4413 stored in the input buffer 4401, stores the decoded image 4422 in the output buffer 4411, and outputs it as an output image. The encoded data 4413 is output from, for example, the moving image encoding apparatus shown in FIG. 1, and is temporarily stored in the input buffer 4401 via a storage system or transmission system (not shown).

The entropy decoding unit 4402 performs decoding based on the syntax for each frame or field for decoding the encoded data 4413. The entropy decoding unit 4402 sequentially entropy-decodes the code string of each syntax, and reproduces the encoding parameters of the encoding target block such as the prediction information 4421 including the prediction mode information and the quantization transform coefficient 4414. The encoding parameter is a parameter necessary for decoding such as prediction information 4421, information on transform coefficients, information on quantization, and the like.

The inverse quantization unit 4403 performs inverse quantization on the quantized transform coefficient 4414 from the entropy decoding unit 4402 to obtain a restored transform coefficient 4415. Specifically, the inverse quantization unit 4403 performs inverse quantization according to the information regarding the quantization decoded by the entropy decoding unit 4402. The inverse quantization unit 4403 inputs the restored transform coefficient 4415 to the inverse orthogonal transform unit 4404.

The inverse orthogonal transform unit 4404 performs inverse orthogonal transform corresponding to the orthogonal transform performed on the encoding side, on the reconstruction transform coefficient 4415 from the inverse quantization unit 4403, and obtains a reconstruction prediction error signal 4416. The inverse orthogonal transform unit 4404 inputs the restored prediction error signal 4416 to the adder 4405.

The addition unit 4405 adds the restored prediction error signal 4416 and the corresponding predicted image signal 4420 to generate a decoded image signal 4417. The decoded image signal 4417 is input to the loop filter 4406. The loop filter 4406 performs a deblocking filter, a Wiener filter, or the like on the input decoded image signal 4417 to generate a filtered image signal 4418. The generated filtered image signal 4418 is temporarily stored in the output buffer 4411 for the output image, and is also stored in the reference image memory 4407 for the reference image signal 4419. The filtered image signal 4418 stored in the reference image memory 4407 is referenced as a reference image signal 4419 by the intra prediction unit 4408 and the inter prediction unit 4409 as necessary in units of frames or fields. The filtered image signal 4418 temporarily accumulated in the output buffer 4411 is output according to the output timing managed by the decoding control unit 4412.

The intra prediction unit 4408, the inter prediction unit 4409, and the selection switch 4410 are substantially the same or similar elements as the intra prediction unit 109, the inter prediction unit 110, and the selection switch 111 in FIG. The intra prediction unit 4408 (109) performs intra prediction using the reference image signal 4419 stored in the reference image memory 4407. For example, H.M. In H.264, an intra prediction image is obtained by performing pixel interpolation (copying or copying after interpolation) along a prediction direction such as a vertical direction or a horizontal direction using an encoded reference pixel value adjacent to a prediction target block. Generate. FIG. The prediction direction of intra prediction in H.264 is shown. Further, in FIG. 2 shows an arrangement relationship between reference pixels and encoding target pixels in H.264. 5C shows a predicted image generation method in mode 1 (horizontal prediction), and FIG. 5D shows a predicted image generation method in mode 4 (diagonal lower right prediction; Intra_NxN_Diagonal_Down_Right in FIG. 4A). Yes.

Also, Jung-Hye Min, “Unification of the Directional Intra Prediction Methods in TMuC”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO / IEC JTC1 / SC29 / WG11-VCB , July 2010, H. The prediction direction of H.264 is further expanded to 34 directions to increase the number of prediction modes. A predicted pixel value is created by performing linear interpolation with 32-pixel accuracy in accordance with the predicted angle, and is copied in the predicted direction. Details of the intra prediction unit 109 used in the present embodiment of the present invention will be described later.

The inter prediction unit 4409 (110) performs inter prediction using the reference image signal 4419 stored in the reference image memory 4407. Specifically, the inter prediction unit 4409 (110) obtains a motion shift amount (motion vector) between the prediction target block and the reference image signal 124 from the entropy decoding unit 4402, and based on this motion vector. Inter prediction processing (motion compensation) is performed to generate an inter predicted image. H. With H.264, interpolation processing up to 1/4 pixel accuracy is possible.

The prediction selection switch 4410 selects the output terminal of the intra prediction unit 4408 or the output terminal of the inter prediction unit 4409 according to the decoded prediction information 4421, and inputs the intra predicted image or the inter predicted image as the predicted image signal 4420 to the adding unit 4405. . When the prediction information 4421 indicates intra prediction, the prediction selection switch 4410 connects a switch to the output terminal from the intra prediction unit 4408. On the other hand, when the prediction information 4421 indicates inter prediction, the prediction selection switch 4410 connects a switch to the output terminal from the inter prediction unit 4409.

The decoding control unit 4412 controls each element of the moving picture decoding apparatus in FIG. Specifically, the decoding control unit 4412 performs various controls for decoding processing including the above-described operation.

44 uses the same or similar syntax as the syntax described with reference to FIGS. 20, 21, 22A to E, 27A to 27B, and 32A to 32B. Description is omitted.

Details of the intra prediction unit 4408 (109) will be described below with reference to FIG.

In this embodiment, the intra prediction unit 4408 has the same configuration and processing content as the intra prediction unit 109 described in the first embodiment.

The intra prediction unit 4408 (109) shown in FIG. 6 includes a unidirectional intra predicted image generation unit 601, a bidirectional intra predicted image generation unit 602, a prediction mode information setting unit 603, and a selection switch 604. First, a reference image signal 4419 (124) is input from the reference image memory 4407 to the unidirectional intra predicted image generation unit 601 and the bidirectional intra predicted image generation unit 602. Here, according to the prediction mode information controlled by the decoding control unit 4412, the prediction mode information setting unit 603 selects the prediction mode generated by the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit 602. Set and output prediction mode 605. The selection switch 604 has a function of switching the output ends of the respective intra predicted image generation units according to the prediction mode 605. If the input prediction mode 605 is the unidirectional intra prediction mode, the output terminal of the unidirectional intra prediction image generation unit 601 is connected to the switch, and if the prediction mode 605 is the bidirectional intra prediction mode, the bidirectional intra prediction is performed. The output terminal of the image generation unit 602 is connected. On the other hand, each intra prediction image production |

generation part

601 and 602 produces | generates the prediction image signal 4420 (126) according to the prediction mode 605. FIG. The generated predicted image signal 4420 (126) is output from the intra prediction unit 109.

On the other hand, FIG. 9 shows the relationship between the prediction mode and the prediction method when PuSize is PU_8x8, PU_16x16, and PU_32x32. FIG. 10 shows a case where PuSize is PU_4 × 4, and FIG. 11 shows a case where PU_64 × 64 or PU_128 × 128. Here, IntraPredMode indicates a prediction mode number, and IntraBipredFlag is a flag indicating whether or not bidirectional intra prediction. When the flag is 1, it indicates that the prediction mode is the bidirectional intra prediction mode. When the flag is 0, it indicates that the prediction mode is a unidirectional intra prediction mode. IntraPredTypeLX indicates the prediction type of intra prediction. Intra_Vertical means that the vertical direction is the reference for prediction, and Intra_Horizontal means that the horizontal direction is the reference for prediction. Note that 0 or 1 is applied to X in IntraPredTypeLX. IntraPredTypeL0 indicates the first prediction mode of unidirectional intra prediction or bidirectional intra prediction. IntraPredTypeL1 indicates the second prediction mode of bidirectional intra prediction. IntraPred AngleID is an index indicating an index of a prediction angle. The prediction angle actually used in the generation of the predicted value is shown in FIG. Here, puPartIdx represents the index of the divided block in the quadtree division described with reference to FIG. 3B.

The prediction mode information setting unit 603 converts the above-described prediction information corresponding to the designated prediction mode 605 to the unidirectional intra prediction image generation unit 601 and the bidirectional intra prediction image generation unit 602 under the control of the decoding control unit 4412. And the prediction mode 605 is output to the selection switch.

Next, the unidirectional intra predicted image generation unit 601 will be described in detail. The unidirectional intra predicted image generation unit 601 has a function of generating a predicted image signal 4420 (126) for a plurality of prediction directions shown in FIG. In FIG. 8, there are 33 different prediction directions for the vertical and horizontal coordinates indicated by the bold lines. H. The direction of a typical prediction angle indicated by H.264 is indicated by an arrow. In the present embodiment of the present invention, 33 kinds of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a diamond. H. Similar to H.264, DC prediction for predicting with an average value of available reference pixels is added, and there are 34 prediction modes in total.

In the case of IntraPredMode = 4, since IntraPredAngleIDL0 is −4, the prediction image signal 4420 (126) is generated in the prediction direction indicated by IntraPredMode = 4 in FIG. An arrow indicated by a dotted line in FIG. 8 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.

<Intra Prediction Unit 4408 (109)>
Next, a prediction image generation method of the unidirectional intra prediction image generation unit 601 will be described. Here, a predicted image value is generated based on the input reference image signal 4419 (124), and a pixel is copied in the above-described prediction direction. The predicted image value is generated by performing interpolation with 1/32 pixel accuracy. FIG. 12 shows the relationship between IntraPredAngleIDLX and intraPredAngle used for predictive image value generation. intraPredAngle indicates a prediction angle that is actually used when a predicted value is generated. For example, when the prediction type is Intra_Vertical and intraPredAngle shown in FIG. 12 is a positive value, a prediction value generation method is expressed by Expression (3). Here, BLK_SIZE indicates the size of the pixel block (prediction unit), and ref [] indicates an array in which reference image signals are stored. Also, pred (k, m) indicates the generated predicted image signal 4420 (126).

Next, the bidirectional intra predicted image generation unit 602 will be described in detail. FIG. 13 shows a block diagram of the bidirectional intra-predicted image generation unit 602. The bidirectional intra predicted image generation unit 602 includes a first unidirectional intra predicted image generation unit 1301, a second unidirectional intra predicted image generation unit 1302, and a weighted average unit 1303. The input reference image signal 4419 (124 ), Two unidirectional intra prediction images are generated, and a weighted average of these is generated to generate a prediction image signal 4420 (126).

The functions of the first unidirectional intra predicted image generation unit 1301 and the second unidirectional intra predicted image generation unit 1302 are the same. In either case, a prediction image signal corresponding to a prediction mode given according to prediction mode information controlled by the encoding control unit 115 is generated. A first predicted image signal 1304 is output from the first unidirectional intra predicted image generation unit 1301, and a second predicted image signal 1305 is output from the second unidirectional intra predicted image generation unit 1302. Each predicted image signal is input to the weighted average unit 1303, and weighted average processing is performed.

For example, in the case of PuSize = PU_8 × 8 and IntraPredMode = 33, it can be seen from FIG. 7 that IntraUniModeNum = 33, and BiPredIdx = 0. As a result, it is derived from FIG. 14 that the first unidirectional intra prediction mode (MappedBi2Uni (0, idx)) is 1 and the second unidirectional intra prediction mode (MappedBi2Uni (1, idx)) is 0. In other PuSize and IntraPredMode, it is possible to derive two prediction modes by the same method. Hereinafter, the first unidirectional intra prediction mode is expressed as IntraPredModeL0, and the second unidirectional intra prediction mode is expressed as IntraPredModeL1.

The weighted average unit 1303 calculates a Euclidean distance or a city area distance (Manhattan distance) based on the prediction directions of IntraPredModeL0 and IntraPredModeL1, and derives a weight component used in the weighted average process. The weight component of each pixel is represented by the reciprocal of the Euclidean distance or the city distance from the reference pixel used for prediction, and is generalized by Expression (5). Here, when using the Euclidean distance, ΔL is expressed by Equation (6). On the other hand, when using the city distance, ΔL is expressed by Equation (7). The weight table for each prediction mode is generalized to Equation (8). Therefore, the final prediction signal at the pixel position n is expressed by Equation (9).

In this embodiment, the Euclidean distance from the reference pixel used in the prediction mode or the reciprocal of the urban area distance is used as a weight component as it is, but as another embodiment, the Euclidean distance from the reference pixel and the urban area distance are variables. The weight component may be set using the distributed model. The distribution model uses at least one of a linear model, an M-order function (M ≧ 1), a nonlinear function such as a one-sided Laplace distribution or a one-sided Gaussian distribution, or a fixed value that is a fixed value regardless of the distance from the reference pixel. When the one-sided Gaussian distribution is used as a model, the weight component is expressed by Equation (10). Further, when the one-sided Laplace distribution is used as a model, the weight component is expressed by Expression (11).

When the weight components represented by Equation (5), Equation (8), Equation (10), and Equation (11) are calculated each time the predicted image is generated, a plurality of multipliers are required, and the hardware scale increases. . For this reason, the circuit scale required for the said calculation can be reduced by calculating a weight component beforehand according to the relative distance for every prediction mode, and hold | maintaining in a memory. Here, a method for deriving the weight component when the city distance is used will be described.

In the present embodiment of the present invention, the required memory amount is reduced by sharing the distance table of several prediction modes with close prediction angles. FIG. 17 shows the mapping of IntraPredModeLX used for distance table derivation. Here, an example is shown in which a table of only the prediction mode corresponding to the prediction mode corresponding to the prediction mode and the DC prediction in 45 degrees is prepared, and other prediction angles are mapped closer to the prepared reference prediction mode. ing. When the distance from the reference prediction mode is the same, the index is mapped to the smaller one. The prediction mode shown in “MappedIntraPredMode” is referred to from FIG. 17, and a distance table can be derived.

By using the distance table, the relative distance for each pixel in the two prediction modes is calculated using Equation (12). Using Equation (12), the final prediction signal at the pixel position n is represented by Equation (13). Here, in order to avoid an increase in hardware scale due to the use of decimal point arithmetic, the weight component is scaled in advance and converted to integer arithmetic, it can be expressed by Equation (14). Here, for example, when the decimal part is expressed with 10-bit precision, WM = 1024, Offset = 512, and SHIFT = 10. These satisfy the relationship of Expression (15).

<Syntax structure 1>
Hereinafter, the syntax used by the moving picture decoding apparatus 4400 in FIG. 44 will be described.
The syntax indicates the structure of encoded data (for example, encoded data 127 in FIG. 1) when the moving image decoding apparatus 4400 decodes moving image data. The image encoding apparatus represented by the first embodiment encodes this encoded data using the same syntax structure. FIG. 20 shows an example of syntax 2000 used by the image coding apparatus in FIG. Since the syntax 2000 is the same as that of the first embodiment, detailed description thereof is omitted.

Next, an example of the reflection unit syntax according to the present embodiment will be described.

When intra_luma_bipred_flag [i] is 1, this indicates that the prediction unit is bi-directional intra prediction, and is information that identifies the used bi-directional intra prediction mode among a plurality of prepared bi-directional intra prediction modes. Intra_luma_bipred_mode [i] is decoded. intra_luma_bipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraBiModeNum shown in FIG. 7, or may be decoded using a predetermined code table. When intra_luma_bipred_flag [i] is 0, it indicates that the prediction unit is unidirectional intra prediction, and predictive decoding is performed from adjacent blocks.

Prev_intra_luma_unipred_flag [i] is a flag indicating whether or not the prediction value MostProbable of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. Details of the MostProbable calculation method will be described later. When prev_intra_luma_unipred_flag [i] is 1, it indicates that the MostProbable and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_flag [i] is 0, it indicates that the MostProbable and the intra prediction mode IntraPredMode are different, and the information rem_intraprelum decoding that further specifies the intra prediction mode IntraPredMode other than MostProbable. . rem_intra_luma_unipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraUniModeNum shown in FIG. 7, or may be decoded using a predetermined code table. From the intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using Equation (16).

Next, a method for calculating MostProbable, which is a predicted value in the prediction mode, will be described. MostProbable is calculated according to Equation (17). Min (x, y) is a parameter for outputting the smaller one of the inputs x and y.

In addition, intraPredModeA and intraPredModeB indicate intra prediction modes of prediction units adjacent to the left and above the decoded prediction unit. Hereinafter, intraPredModeA and intraPredModeB are collectively expressed as intraPredModeN. N is set to A or B. A method of calculating intraPredModeN will be described using the flowchart shown in FIG. First, it is determined whether a coding tree unit to which an adjacent prediction unit belongs can be used (step S2301). If the coding tree unit cannot be used (NO in S2301), reference to intraPredModeN is not possible. “−1” indicating “” is set. On the other hand, if the coding tree unit is available (YES in S2301), it is next determined whether or not intra prediction is applied to the adjacent prediction unit (step S2302). When the adjacent prediction unit is not intra prediction (NO in S2302), “2” meaning “Intra_DC” is set in intraPredModeN. On the other hand, when the adjacent prediction unit is intra prediction (YES in S2302), it is next determined whether or not the adjacent prediction unit is bidirectional intra prediction (step S2303). When the adjacent prediction unit is not bidirectional intra prediction, that is, in the case of unidirectional intra prediction (NO in S2303), the prediction mode IntraPredMode of the adjacent prediction unit is set in intraPredModeN. On the other hand, when the adjacent PU is bidirectional intra prediction (S2303 is YES), the prediction mode of the adjacent block is converted to the unidirectional intra prediction mode. Specifically, intraPredModeN is calculated using Equation (18). Here, IntraUniModeNum is the number of unidirectional intra prediction modes determined by the size of the adjacent prediction unit, and an example thereof is shown in FIG. Also, “MappedBi2Uni (List, idx)” is a table for converting the bidirectional intra prediction mode into the unidirectional intra prediction mode. List is the unidirectional intra prediction mode of List0 (corresponding to IntraPredTypeL0 [] shown in FIGS. 9, 10, and 11) of the two unidirectional intra prediction modes constituting the bidirectional intra prediction mode. Is a flag indicating whether to use the unidirectional intra prediction mode of List1 or equivalent to IntraPredTypeL1 [] shown in FIGS. 9, 10, and 11; For example, List1 is used for conversion to the unidirectional intra prediction mode. FIG. 14 shows an example of the conversion table. The numerical values in the figure correspond to IntraPredMode shown in FIGS.

When MostProbable calculated using intraPredModeN calculated as described above and Equation (18) is −1, MostProbable is replaced with 2 (Intra_DC). When MostProbable is larger than the number of unidirectional intra prediction modes IntraUniPredModeNum of the decoding prediction unit, MostProbable is recalculated using the following equation (19). “MappedMostProble ()” is a table for converting MostProbable, and an example is shown in FIG. 24.

<Syntax structure 2>
Next, another example of the prediction unit syntax is shown in FIG. 22C. Since pred_mode and intra_split_flag are the same as the syntax example described above, description thereof is omitted. luma_pred_mode_code_type [i] indicates the type of the prediction mode IntraPredMode applied to the prediction unit, where 0 (IntraUnifiedMostProb) is unidirectional intra prediction and the intra prediction mode is the same as MostProbable, 1 (IntraUnipre intrareprediction) The intra prediction mode is different from MostProbable, and 2 (IntraBipred) indicates a bidirectional intra prediction mode. FIG. 24 shows bidirectional intra prediction modes, 1 and unidirectional intra prediction (whether or not the same prediction mode as MostProbable). FIG. 25 shows an example of assignment of the number of modes according to the meaning corresponding to luma_pred_mode_code_type, bin, and the mode configuration shown in FIG. When luma_pred_mode_code_type [i] is 0, the intra prediction mode is the MostProbable mode, so no further information decoding is necessary. When luma_pred_mode_code_type [i] is 1, information rem_intra_luma_unipred_mode [i] that specifies which mode other than MostProbable is the intra prediction mode IntraPredMode is decoded. rem_intra_luma_unipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraUniModeNum shown in FIG. 7, or may be decoded using a predetermined code table. From the intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using Equation (16). Further, when luma_pred_mode_code_type [i] is 2, it indicates that the prediction unit is bidirectional intra prediction, and information that identifies the used bidirectional intra prediction mode among the prepared bidirectional intra prediction modes. Intra_luma_bipred_mode [i] is decoded. intra_luma_bipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraBiModeNum shown in FIG. 7, or may be decoded using a predetermined code table.
The above is the syntax configuration according to the present embodiment.

<Syntax structure 3>
Yet another example of the prediction unit syntax is shown in FIG. 22D. In this example, based on the prediction unit syntax shown in FIG. 22A, whether bidirectional intra prediction can be used or whether conventional intra unidirectional prediction can be used with bidirectional intra prediction disabled. Shows the syntax for switching within the prediction unit to be decoded.
Note that pred_mode and intra_split_flag are the same as the syntax example described above, and thus description thereof is omitted.

Intra_bipred_flag is a flag indicating whether or not bidirectional intra prediction can be used in the decoding prediction unit. When intra_bipred_flag is 0, it indicates that bi-directional intra prediction is not used in the decoding prediction unit. Even when intra_split_flag is 1, that is, when the decoded prediction unit is further divided into four, bi-directional intra prediction is not used in all prediction units, and only uni-directional intra prediction is effective.

When intra_bipred_flag is 1, it indicates that bidirectional intra prediction can be used in the decoding prediction unit. Even when intra_split_flag is 1, that is, when the decoded prediction unit is further divided into four, in all prediction units, bidirectional intra prediction can be selected in addition to unidirectional intra prediction.

In a region where bi-directional intra prediction is unnecessary (for example, a flat region), the intra-bipred_flag is decoded as 0 to disable bi-directional intra prediction. Since the amount of code required for decoding can be reduced, the coding efficiency is improved.

<Syntax structure 4>
Still another example relating to the prediction unit syntax is shown in FIG. 22E. In this example, based on the prediction unit syntax shown in FIG. 22C, whether bidirectional intra prediction can be used or whether only conventional unidirectional intra prediction can be used with bidirectional intra prediction disabled. Shows the syntax for switching in the decoding prediction unit. intra_bipred_flag is a flag indicating whether or not bi-directional intra prediction can be used in the decoding prediction unit, and is the same as the above-described intra_bipred_flag, and thus the description thereof is omitted.

(First modification)
<Intra prediction unit first modification>
As a first modification related to the intra prediction unit 4408, in combination with adaptive reference pixel filtering shown in JCTVC-B205_draft002, section 5.2.1 “Intra prediction process for luma samples”, JCT-VC 2nd Meeting Geneva, July, 2010 It doesn't matter. FIG. 26 shows an intra prediction unit 4408 (109) when adaptive reference pixel filtering is used. It differs from the intra prediction unit 4408 (109) shown in FIG. 6 in that a reference pixel filter unit 2601 is added. The reference pixel filter unit 2601 receives the reference image signal 4419 (124) and the prediction mode 605, performs adaptive filter processing described later, and outputs a filtered reference image signal 2602. The filtered reference image signal 2602 is input to the unidirectional intra predicted image generation unit 601 and the bidirectional intra predicted image generation unit 602. The configuration and processing other than the reference pixel filter unit 2601 are the same as those of the intra prediction unit 4408 (109) shown in FIG.

Next, the reference pixel filter unit 2601 will be described. The reference pixel filter unit 2601 determines whether or not to filter reference pixels used for intra prediction according to the reference pixel filter flag and the intra prediction mode included in the prediction mode 605. The reference pixel filter flag is a flag indicating whether or not reference pixels are filtered when the intra prediction mode IntraPredMode is a value other than “Intra_DC”. When the reference pixel filter flag is 1, the reference pixel is filtered. In the case of the reference pixel filter flag 0, the reference pixel is not filtered. When IntraPredMode is “Intra_DC”, the reference pixel is not filtered and the reference pixel filter flag is set to 0. When the reference pixel filter flag is 1, a filtered reference image signal 2602 is calculated by filtering shown in Expression (20). Note that p [x, y] indicates a reference pixel before filtering, and pf [x, y] indicates a reference pixel in filter terms. X and y indicate the relative positions of the reference pixels when the upper left pixel position in the prediction unit is x = 0 and y = 0. PuPartSize indicates the size (pixel) of the prediction unit.

<Syntax structure 5>
27A and 27B show a prediction unit syntax structure when performing adaptive reference pixel filtering. FIG. 27A adds the syntax intra_luma_filter_flag [i] related to the adaptive reference pixel filter to FIG. 22A. In addition, FIG. 27B adds syntax intra_luma_filter_flag [i] related to the adaptive reference pixel filter to FIG. 22C. intra_luma_filter_flag [i] is further decoded when the intra prediction mode IntraPredMode [i] is other than Intra_DC. When the flag is 0, it indicates that the reference pixel is not filtered. Further, when intra_luma_filter_flag [i] is 1, it indicates that the reference pixel filtering is applied.

In the above example, intra_luma_filter_flag [i] is decoded when the intra prediction mode IntraPredMode [i] is other than Intra_DC. However, as another example, when IntraPredMode [i] is 0 to 2, intra_luma_filter_flag [i ] Need not be decrypted. In this case, intra_luma_filter_flag [i] is set to 0.

(Second modification)
<Intra prediction unit second modification>
As a second modification related to the intra prediction unit 4408 (109), it is used in combination with the composite intra prediction shown in JCTVC-B205_draft002, section 9.6 “Combined Intra Prediction”, JCT-VC 2nd Meeting Geneva, July, 2010. It doesn't matter. In the decoded intra prediction in this document, a prediction value is obtained by performing weighted averaging of the result of the above-described unidirectional intra prediction and the average value of pixels adjacent to the left, top, and top left with respect to the prediction pixel. When the decoded image signal 4417 is calculated in the moving image decoding device 4400 or the image encoding device 100, it is possible to use decoded pixels as pixels adjacent to the left, upper, and upper left.

FIG. 30 shows a block diagram of the intra prediction unit 4408 (109) when combined with composite intra prediction. The difference is that a composite intra predicted image generation unit 2901, a selection switch 2902, and a decoded pixel buffer 3001 are added to the intra prediction unit 4408 (109) shown in FIG.

When the bidirectional intra prediction and the composite intra prediction are combined, first, in the selection switch 604, the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit according to the prediction mode information controlled by the decoding control unit 4412. The output terminal of 602 is switched. Hereinafter, the output predicted image signal 4420 (126) is referred to as a direction predicted image signal 4420 (126).

Then, the direction prediction image signal is input to the composite intra prediction image generation unit 2901, and a prediction image signal 4420 (126) in the composite intra prediction is generated. The description of the composite intra predicted image generation unit 2901 will be described later. Thereafter, in the selection switch 2902, which one of the prediction image signal 4420 (126) and the direction prediction image signal in the composite intra prediction is used according to the composite intra prediction application flag in the prediction mode information controlled by the decoding control unit 4412. And the final predicted image signal 4420 (126) in the intra prediction unit 4408 (109) is output. When the composite intra prediction application flag is 1, the predicted image signal 4420 (126) output from the composite intra predicted image generation unit 2901 is the final predicted image signal 4420 (126). On the other hand, when the composite intra prediction application flag is 0, the direction prediction image signal 4420 (126) is the prediction image signal 126 that is finally output.

Next, the composite intra prediction image generation unit 2901 will be described with reference to FIG. The composite intra prediction image generation unit 2901 includes a pixel level prediction signal generation unit 3101 and a composite intra prediction calculation unit 3102. The pixel level prediction signal generation unit 3101 predicts the prediction target pixel X from adjacent pixels and outputs a pixel level prediction signal 3103. As described above, the adjacent pixel indicates the decoded image signal 4417. Specifically, the pixel level prediction signal 3103 (X) of the prediction target pixel is calculated using the number (21). The coefficients related to A, B, and C may be other values.

The composite intra prediction calculation unit 3102 performs a weighted average of the direction prediction image signal 4420 (126) (X ′) and the pixel level prediction signal 3103 (X), and outputs a final prediction image signal 4420 (126) (P). To do. Specifically, Formula (22) is used.

Note that W is a weighted average weight coefficient (an integer value between W = 0 and 32) of the direction prediction image signal 4420 (126) (X ′) and the pixel level prediction signal 3103 (X). The above is an embodiment when combined with composite intra prediction.

<Syntax structure 6>
32A and 32B show a prediction unit syntax structure when performing composite intra prediction. FIG. 32A is different from FIG. 22A in that a syntax combined_intra_pred_flag for switching presence / absence of composite intra prediction is added. This is equivalent to the above-described composite intra prediction application flag. In addition, FIG. 32B adds a syntax combined_intra_pred_flag for switching presence / absence of composite intra prediction to FIG. 22C. When combined_intra_pred_flag is 1, the selection switch 2902 shown in FIG. 30 is connected to the output terminal of the composite intra prediction image generation unit 2901. When combined_intra_pred_flag is 0, the selection switch 2902 shown in FIG. 29 is connected to the output terminal of either the unidirectional intra prediction image generation unit 601 or the bidirectional intra prediction image generation unit 602 to which the selection switch 604 is connected. .

Furthermore, it may be combined with the second embodiment of the intra prediction unit.

This completes the description of another embodiment of the intra prediction unit 4408.

According to the fourth embodiment described above, since the same or similar intra prediction unit as that of the video encoding device according to the first embodiment is included, the same or the same as the video encoding device according to the first embodiment or Similar effects can be obtained.

(Fifth embodiment)
<Video Decoding Device—Fifth Embodiment>
The video decoding device according to the fifth embodiment differs from the video decoding device according to the above-described fourth embodiment in the details of inverse orthogonal transform. In the following description, in this embodiment, the same parts as those in the fourth embodiment are denoted by the same reference numerals, and different parts will be mainly described. The moving picture coding apparatus corresponding to the moving picture decoding apparatus according to the present embodiment is as described in the second embodiment.

FIG. 45 is a block diagram showing a moving picture decoding apparatus according to the fifth embodiment. A change from the moving picture decoding apparatus according to the fifth embodiment is that a conversion selection unit 4502 and a coefficient order restoration unit 4501 are added. Also, the internal structure of the inverse orthogonal transform unit 4404 is different.

<Inverse orthogonal transform unit 4404>
First, the inverse orthogonal transform unit 4404 will be described with reference to FIG. Note that the inverse orthogonal transform unit 4404 has the same configuration as the inverse orthogonal transform unit 105 according to the second embodiment. Therefore, in the present embodiment, the conversion selection information 3303 in FIG. 35 is replaced with the conversion selection information 4504, the restored transform coefficient 120 is replaced with the restored transform coefficient 4415, and the restored prediction error signal 121 is replaced with the restored prediction error signal 4416. .

35 includes a first inverse orthogonal transform unit 3501, a second inverse orthogonal transform unit 3502, an Nth inverse orthogonal transform unit 3503, and a transform selection switch 3504. First, the conversion selection switch 3504 will be described. The conversion selection switch 3504 has a function of selecting the output terminal of the inverse quantization unit 4403 according to the input conversion selection information 4504. The conversion selection information 4504 is one of information controlled by the decoding control unit 4412, and is set by the conversion selection unit 4502 in accordance with the prediction information 4421 (125).

When the conversion selection information 4504 is the first orthogonal transform, the output terminal of the switch is connected to the first inverse orthogonal transform unit 3501. On the other hand, when the transformation selection information 4504 is the second orthogonal transformation, the output end is connected to the second inverse orthogonal transformation unit 3502. Similarly, when the transform selection information 4504 is the Nth orthogonal transform, the output terminal is connected to the Nth inverse orthogonal transform unit 3503.

<Conversion selection unit 4502>
Next, the conversion selection unit 4502 shown in FIG. 45 will be described. Prediction information 4421 (125), which is controlled by the decoding control unit 4412 and decoded by the entropy decoding unit 4402, is input to the transformation selection unit 4502. Based on the prediction information 4421 (125), the transform selection unit 4502 has a function of setting MapdTransformIdx information indicating which inverse orthogonal transform is to be used for which prediction mode. FIG. 36 shows conversion selection information 4504 (MappedTransformIdx) in intra prediction. Here, an example of N = 9 is shown. Note that the first inverse orthogonal transform unit 3501 is selected at the time of DC prediction corresponding to IntraPredModeLX = 2. By mapping to the reference prediction mode with a close prediction angle in this way, compared to the case of preparing an orthogonal transformer and an inverse orthogonal transformer for all prediction modes, orthogonal transformation and inverse orthogonal transformation at the time of hardware implementation It is possible to reduce the circuit scale. When bi-directional intra prediction is selected, two IntraPredModeL0 and IntraPredModeL1 are derived according to FIG. 14, respectively, and then the mapped transformIdx is derived from FIG. 36 using a prediction mode corresponding to IntraPredModeL1. In the present embodiment of the present invention, an example of N = 9 has been shown, but the value of N may be selected in an optimal combination by balancing the coding performance and the circuit scale at the time of hardware implementation.

<Coefficient order restoration unit 4501>
Next, the coefficient order restoration unit 4501 will be described. FIG. 46 shows a block diagram of the coefficient order restoration unit 4501. The coefficient order restoration unit 4501 has a function of performing reverse scan order conversion with the coefficient order control unit 3302 according to the second embodiment.

The coefficient order restoration unit 4501 includes a coefficient order selection switch 4604, a first coefficient forward / reverse transform unit 4601, a second coefficient forward / reverse transform unit 4602, and an Nth coefficient forward / reverse transform unit 4603. The coefficient order selection switch 4604 has a function of switching the output terminal of the switch and the coefficient order inverse conversion units 4601 to 4603 in accordance with, for example, the mapped transform IDx shown in FIG. The N types of coefficient forward / inverse transform units 4601 to 4603 have a function of inversely transforming one-dimensional data into two-dimensional data with respect to the quantized transform coefficient sequence 4503 decoded by the entropy decoding unit 4402. For example, H.M. In H.264, two-dimensional data is converted into one-dimensional data using a zigzag scan. Here, for example, it means that conversion from a zigzag scan to a raster scan is performed.

When using orthogonal transform in consideration of the prediction direction of intra prediction, the quantized transform coefficient obtained by performing quantization processing on the transform coefficient that has been subjected to orthogonal transform has the property that the tendency of generating non-zero transform coefficients in the block is biased. Have. The tendency of occurrence of this non-zero transform coefficient has different properties for each prediction direction of intra prediction. However, when different videos are encoded, the generation tendency of non-zero transform coefficients in the same prediction direction has a similar property. Therefore, when transforming two-dimensional data into one-dimensional data (2D-1D conversion), entropy coding is performed preferentially from transform coefficients at positions where the occurrence probability of non-zero transform coefficients is high, thereby encoding transform coefficients. It is possible to reduce information. Conversely, on the decoding side, it is necessary to restore the one-dimensional data to the two-dimensional data. Here, the raster scan is restored as a one-dimensional reference scan.

As yet another example, the coefficient order restoration unit 4501 may dynamically update the scan order in the 1D-2D conversion. The configuration of the coefficient order restoration unit 4501 that performs such an operation is illustrated in FIG. The coefficient order restoration unit 4501 includes an occurrence frequency counting unit 4701 and an updating unit 4702 in addition to the configuration of FIG. .., 4603 are the same except that the 1D-2D scan order is updated by the updating unit 4702.

The occurrence frequency counting unit 4701 creates a histogram 4704 of the number of occurrences of non-zero coefficients in each element of the quantized transform coefficient sequence 4503 for each prediction mode. The occurrence frequency counting unit 4701 inputs the created histogram 4704 to the update unit 4702.

The update unit 4702 updates the coefficient order based on the histogram 4704 at a predetermined timing. The timing is, for example, the timing when the coding process of the coding tree unit is finished, the timing when the coding process for one line in the coding tree unit is finished, or the like.

Specifically, the update unit 4702 refers to the histogram 4704 and updates the coefficient order with respect to a prediction mode having an element in which the number of occurrences of non-zero coefficients is counted more than a threshold. For example, the update unit 4702 updates the prediction mode having an element in which the occurrence of a non-zero coefficient is counted 16 times or more. By providing a threshold value for the number of occurrences, the coefficient order is updated globally, so that it is difficult to converge to a local optimum solution.

The update unit 4702 sorts the elements in descending order of the occurrence frequency of the non-zero coefficient regarding the prediction mode to be updated. Sorting can be realized by existing algorithms such as bubble sort and quick sort. Then, the update unit 4702 inputs the update coefficient order 4703 indicating the order of the sorted elements to the coefficient order inverse transform units 4601 to 4603 corresponding to the prediction mode to be updated.

When the update coefficient order 4703 is input, each inverse conversion unit performs 1D-2D conversion in accordance with the updated scan order. When the scan order is dynamically updated, the initial scan order of each 1D-2D conversion unit needs to be determined in advance. The initial scanning order is the same as that of the coefficient order control unit 3302 of the moving picture coding apparatus shown in FIG. In this way, when the scan order is dynamically updated, the tendency of occurrence of non-zero coefficients in the quantized transform coefficients changes according to the effect of the predicted image properties, quantization information (quantization parameters), etc. In addition, stable and high encoding efficiency can be expected. Specifically, the generated code amount of run-length encoding in the entropy encoding unit 113 can be suppressed.

The syntax configuration in the present embodiment is the same as that in the fourth embodiment.

As a modification of the present embodiment, the conversion selection unit 4502 can select the mapped transform idx separately from the prediction information 4421. In this case, information indicating which nine types of orthogonal transforms or inverse orthogonal transforms are used is set in the decoding control unit 4412 and used by the inverse orthogonal transform unit 4404. FIG. 39 shows an example of syntax in the present embodiment. Directional_transform_idx indicated in the syntax indicates information indicating which of N orthogonal transforms has been selected.

According to the fifth embodiment described above, the same or similar inverse orthogonal transform unit as that of the video encoding device according to the second embodiment is included, and thus the same as the video encoding device according to the second embodiment. Or a similar effect can be obtained.

(Sixth embodiment)
<Video Decoding Device—Sixth Embodiment>
The video decoding device according to the sixth embodiment differs from the video decoding device according to the above-described fourth embodiment in the details of inverse orthogonal transform. In the following description, in this embodiment, the same parts as those in the fourth embodiment are denoted by the same reference numerals, and different parts will be mainly described. The moving picture encoding apparatus corresponding to the moving picture decoding apparatus according to the present embodiment is as described in the third embodiment.

As an embodiment related to the inverse orthogonal transform unit 105, JCTVC-B205_draft002, 5.3.5.2 section “Rotational transformation process”, JCT-VC 2nd Meeting Geneva, July, 2010, may be combined.

<Inverse orthogonal transform unit 4404 (105)>
FIG. 41 is a block diagram of inverse orthogonal transform section 4404 (105) according to the present embodiment. The inverse orthogonal transform unit 4404 (105) has new processing units such as a first inverse rotation transform unit 4101, a second inverse rotation transform unit 4102, an Nth inverse rotation transform unit 4103, and an inverse discrete cosine transform unit 4104. A selection switch 3504 is included. The restored transform coefficient 4415 (120) input after the inverse quantization processing is input to the transform selection switch 3504. Here, the conversion selection switch 3504 sets the output end of the switch to one of the first reverse rotation conversion unit 4101, the second reverse rotation conversion unit 4102, and the Nth reverse rotation conversion unit 4103 according to the conversion selection information 4504 (3303). Connecting. Thereafter, the same inverse rotation transform unit 4101 to 4103 as that used in the orthogonal transform unit 102 shown in FIG. 40 is subjected to the inverse rotation transform process, and the result is output to the inverse discrete cosine transform unit 4104. The inverse discrete cosine transform unit 4104 performs, for example, IDCT on the input signal to restore the restored prediction error signal 4416 (121). Although an example using IDCT is shown here as an example, orthogonal transform such as Hadamard transform or discrete sine transform may be used, or non-orthogonal transform may be used. In any case, corresponding inverse transformation is performed in conjunction with the orthogonal transformation unit 102 shown in FIG.

According to the sixth embodiment described above, the same or similar inverse orthogonal transform unit as that of the image encoding device according to the third embodiment is included, and therefore the same or similar as that of the image encoding device according to the third embodiment. The effect of can be obtained.

Hereinafter, modifications of each embodiment will be listed and introduced.
In the first to sixth embodiments, an example is described in which a frame is divided into rectangular blocks of 16 × 16 pixel size and the like, and encoding / decoding is performed in order from the upper left block to the lower right side of the screen ( (See FIG. 2A). However, the encoding order and the decoding order are not limited to this example. For example, encoding and decoding may be performed sequentially from the lower right to the upper left, or encoding and decoding may be performed so as to draw a spiral from the center of the screen toward the screen end. Furthermore, encoding and decoding may be performed in order from the upper right to the lower left, or encoding and decoding may be performed so as to draw a spiral from the screen edge toward the center of the screen.

In the first to sixth embodiments, the description has been given by exemplifying the prediction target block size such as the 4 × 4 pixel block, the 8 × 8 pixel block, and the 16 × 16 pixel block. However, the prediction target block is a uniform block. It does not have to be a shape. For example, the prediction target block (prediction unit) size may be a 16 × 8 pixel block, an 8 × 16 pixel block, an 8 × 4 pixel block, a 4 × 8 pixel block, or the like. Also, it is not necessary to unify all the block sizes within one coding tree unit, and a plurality of different block sizes may be mixed. When a plurality of different block sizes are mixed in one coding tree unit, the amount of codes for encoding or decoding the division information increases as the number of divisions increases. Therefore, it is desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.

In the first to sixth embodiments, for the sake of simplicity, a comprehensive description of the color signal component has been described without distinguishing between the luminance signal and the color difference signal. However, when the prediction process is different between the luminance signal and the color difference signal, the same or different prediction methods may be used. If different prediction methods are used between the luminance signal and the chrominance signal, the prediction method selected for the chrominance signal can be encoded or decoded in the same manner as the luminance signal.

In the first to sixth embodiments, for the sake of simplicity, a comprehensive description of the color signal component has been described without distinguishing between the luminance signal and the color difference signal. However, when the orthogonal transformation process is different between the luminance signal and the color difference signal, the same or different orthogonal transformation methods may be used. If different orthogonal transformation methods are used between the luminance signal and the color difference signal, the orthogonal transformation method selected for the color difference signal can be encoded or decoded in the same manner as the luminance signal.

In the first to sixth embodiments, syntax elements that are not defined in the present invention can be inserted between the rows of the table shown in the syntax configuration, and other conditional branch descriptions are included. It does not matter. Alternatively, the syntax table can be divided and integrated into a plurality of tables. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.

As described above, each embodiment can realize highly efficient orthogonal transformation and inverse orthogonal transformation while alleviating the difficulty in hardware implementation and software implementation. Therefore, according to each embodiment, the encoding efficiency is improved, and the subjective image quality is also improved.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the scope of the invention. These embodiments and modifications thereof are included in the scope and gist of the invention, and are included in the invention described in the claims and the equivalents thereof.

The instructions shown in the processing procedure shown in the above embodiment can be executed based on a program that is software. A general-purpose computer system stores this program in advance, and by reading this program, it is also possible to obtain the same effects as those obtained by the video encoding device and video decoding device of the above-described embodiment. is there. The instructions described in the above-described embodiments are, as programs that can be executed by a computer, magnetic disks (flexible disks, hard disks, etc.), optical disks (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD). ± R, DVD ± RW, etc.), semiconductor memory, or a similar recording medium. As long as the recording medium is readable by the computer or the embedded system, the storage format may be any form. If the computer reads the program from the recording medium and causes the CPU to execute instructions described in the program based on the program, the computer is similar to the video encoding device and video decoding device of the above-described embodiment. Operation can be realized. Of course, when the computer acquires or reads the program, it may be acquired or read through a network.
In addition, the OS (operating system), database management software, MW (middleware) such as a network, etc. running on the computer based on the instructions of the program installed in the computer or embedded system from the recording medium implement this embodiment. A part of each process for performing may be executed.
Furthermore, the recording medium in the present invention is not limited to a medium independent of a computer or an embedded system, but also includes a recording medium in which a program transmitted via a LAN or the Internet is downloaded and stored or temporarily stored. Further, the program for realizing the processing of each of the above embodiments may be stored on a computer (server) connected to a network such as the Internet and downloaded to the computer (client) via the network.
Further, the number of recording media is not limited to one, and when the processing in the present embodiment is executed from a plurality of media, it is included in the recording media in the present invention, and the configuration of the media may be any configuration.

The computer or the embedded system in the present invention is for executing each process in the present embodiment based on a program stored in a recording medium, and includes a single device such as a personal computer or a microcomputer, Any configuration such as a system in which apparatuses are connected to a network may be used.
Further, the computer in the embodiment of the present invention is not limited to a personal computer, but includes an arithmetic processing device, a microcomputer, and the like included in an information processing device, and a device capable of realizing the functions in the embodiment of the present invention by a program, The device is a general term.

DESCRIPTION OF SYMBOLS 101 ... Subtraction part, 102 ... Orthogonal transformation part, 103 ... Quantization part, 104 ... Dequantization part, 105 ... Inverse orthogonal transformation part, 106 ... Addition part, 107 ... Loop filter, 108 ... Reference image memory, 109 ... Intra Prediction unit 110 ... Inter prediction unit 111 ... Prediction selection switch 112 ... Prediction selection unit 113 ... Entropy encoding unit 114 ... Output buffer 115 ... Encoding control unit 116 ... Input image signal 117 ... Prediction error Signal 118, transform coefficient, 119 quantized transform coefficient, 120 restored transform coefficient, 121 restored restoration error signal, 122 decoded image signal, 123 filtered image signal, 124 reference image signal, 125 prediction information , 126 ... predicted image signal, 127 ... encoded data, 601 ... unidirectional intra predicted image generation unit, 602 ... bidirectional intra predicted image raw 603 ... Prediction mode information setting unit, 604 ... Selection switch, 605 ... Prediction mode, 1301 ... First unidirectional intra prediction image generation unit, 1302 ... Second unidirectional intra prediction image generation unit, 1303 ... Weighted average unit DESCRIPTION OF SYMBOLS 1304 ... 1st prediction image signal, 1305 ... 2nd prediction image signal, 1901 ... Image buffer, 2000 ... Syntax, 2001 ... High level syntax, 2002 ... Slice level syntax, 2003 ... Coding tree level syntax, 2004 ... Sequence parameter set Syntax: 2005 ... Picture parameter set syntax, 2006: Slice header syntax, 2007 ... Slice data syntax, 2008 ... Coding tree unit syntax, 2009 ... Prediction unit syntax , 2010 ... Transform unit syntax, 2601 ... Reference pixel filter unit, 2602 ... Filtered reference image signal, 2901 ... Composite intra prediction image generation unit, 2902 ... Selection switch, 3001 ... Decoded pixel buffer, 3001 ... Decoded image buffer, 3002 Reference pixels, 3101 ... Pixel level prediction signal generation unit, 3102 ... Composite intra prediction calculation unit, 3103 ... Pixel level prediction signal, 3104 ... Adjacent pixel, 3301 ... Conversion selection unit, 3302 ... Coefficient order control unit, 3303 ... Conversion selection Information, 3304 ... Quantized transform coefficient sequence, 3401 ... First orthogonal transform unit, 3402 ... Second orthogonal transform unit, 3403 ... Nth orthogonal transform unit, 3404 ... Transform selection switch, 3501 ... First inverse orthogonal transform unit, 3502 ... Second inverse orthogonal transform unit, 3503 ... Nth inverse orthogonal transform unit, 3504 ... Conversion selection switch, 3701 ... First coefficient order conversion section, 3702 ... Second coefficient order conversion section, 3703 ... Nth coefficient order conversion section, 3704 ... Coefficient order selection switch, 3801 ... Occurrence frequency counting section, 3802 ... Coefficient order Update unit, 3803 ... Update coefficient order, 3804 ... Histogram, 4001 ... Rotation conversion unit, 4001 ... First rotation conversion unit, 4002 ... Second rotation conversion unit, 4003 ... Nth rotation conversion unit, 4004 ... Discrete cosine conversion unit, 4101: First reverse rotation conversion unit, 4102 ... Second reverse rotation conversion unit, 4103 ... Nth reverse rotation conversion unit, 4104 ... Inverse discrete cosine conversion unit, 4401 ... Input buffer, 4402 ... Entropy decoding unit, 4403 ... Reverse Quantization unit, 4404 ... inverse orthogonal transform unit, 4405 ... addition unit, 4406 ... loop filter, 4407 ... reference image memory, 4408 ... in La prediction unit, 4409 ... inter prediction unit, 4410 ... prediction selection switch, 4411 ... output buffer, 4412 ... decoding control unit, 4413 ... encoded data, 4414 ... quantized transform coefficient, 4415 ... restoration transform coefficient, 4416 ... restoration Prediction error signal, 4417 ... decoded image signal, 4418 ... filtered image signal, 4419 ... reference image signal, 4420 ... predicted image signal, 4421 ... prediction information, 4422 ... decoded image, 4501 ... coefficient order restoration unit, 4502 ... conversion selection. , 4503... Quantized transform coefficient sequence, 4504... Transform selection information, 4601, 4602, 4603... Coefficient order inverse transform section, 4701... Occurrence frequency count section, 4702.

Claims

A moving picture coding method that divides an input image signal into pixel blocks expressed by hierarchical depth according to quadtree division, generates a prediction error signal for these divided pixel blocks, and encodes transform coefficients In
Creating a first predicted image signal by setting a first predicted direction from among a plurality of predicted direction sets;
Creating a second prediction image signal by setting a second prediction direction different from the first prediction direction from the plurality of prediction direction sets;
Deriving a relative distance between a pixel to be predicted and a reference pixel for each of the first and second prediction directions, and deriving a difference value of the relative distance;
Deriving a predetermined weight component according to the difference value;
In accordance with the weight component, a weighted average of the first unidirectional intra-predicted image and the second unidirectional intra-predicted image to generate a third predicted image signal;
Generating a prediction error signal from the third predicted image signal;
Encoding the prediction error signal;
A video encoding method comprising:
Creating a fourth predicted image signal by setting a fourth prediction direction from the plurality of prediction direction sets;
Selecting one of the third predicted image signal and the fourth predicted image signal as a fifth predicted image signal;
Generating a prediction error signal from the fifth predicted image signal;
Encoding a prediction error signal;
The moving picture encoding method according to claim 1, further comprising:
In order to encode a prediction mode that identifies the fifth predicted image signal,
Generating a reference prediction mode from the prediction mode corresponding to at least one encoded pixel block;
Predictively encoding the prediction mode using the reference prediction mode;
The video encoding method according to claim 2, further comprising:
When the prediction mode specifies the fourth prediction image signal as the fifth prediction image signal, and the reference prediction mode specifies the third prediction image signal as the fifth prediction image signal,
The method further comprising: replacing the prediction mode for specifying one of the first prediction image signal and the second prediction image signal included in the third prediction image signal as the reference prediction mode. The moving image encoding method described in 1.
The relative distance is derived based on the Euclidean distance or the urban area distance, based on the reference pixel corresponding to the start point of the prediction direction,
The difference value of the relative distance is derived based on the difference value derived in the first prediction direction and the second prediction direction, respectively, and the width or height of the pixel block,
The moving picture coding method according to claim 4, wherein the weight component in the first prediction direction is designed to be larger as the relative distance is closer and smaller as it is farther.
Further comprising orthogonally transforming the prediction error signal;
The orthogonal transform is based on the tendency that the absolute value of the prediction error increases according to the distance from the reference pixel with respect to the selected first or second predetermined prediction direction. 6. The moving picture coding method according to claim 5, wherein orthogonal transformation is performed using a plurality of transformation matrix sets designed in advance for each prediction direction so that the coefficient density after transformation is high.
Performing a weighted average of the fifth predicted image signal and the input image signal to generate a local predicted image signal;
Generating a locally decoded image signal based on the locally predicted image signal;
Performing the same weighted average from the fifth predicted image signal and the locally decoded image signal to generate a sixth predicted image signal;
The video encoding method according to claim 6, further comprising:
The weighted average derives a relative distance from a reference image with respect to the first or second predetermined prediction direction when the sixth predicted image signal is generated, and the relative distance is The moving image encoding method according to claim 7, wherein a weight component of a close pixel is increased and a weight component of a pixel having a long relative distance is decreased.
For the already-encoded reference pixels used when creating the first predicted image signal, the second predicted image signal, and the fourth predicted image signal,
Replacing the reference pixel by filtering the reference pixel and a reference pixel adjacent to the reference pixel;
Selecting whether to apply the filtering for each pixel block;
The moving picture coding method according to claim 7, further comprising:
Further comprising orthogonally transforming the prediction error signal;
The orthogonal transform includes a step of obtaining a transform coefficient by performing a discrete cosine transform, a step of performing a rotational transform on the transform coefficient, and further transforming the transform coefficient,
In the rotation transformation, one selected from a plurality of rotation transformation matrix sets designed in advance so as to have higher coefficient density after the rotation transformation than the discrete cosine transformation, and the rotation transformation matrix is selected. The moving image encoding method according to claim 9, wherein an index indicating a set is encoded.
In the moving picture decoding method in which the input image signal is divided into pixel blocks represented by the depth of the hierarchy according to quadtree division, and decoding processing is performed on these divided pixel blocks.
Creating a first predicted image signal by setting a decoded first prediction direction from among a plurality of prediction direction sets;
Creating a second predicted image signal by setting a decoded second prediction direction different from the first prediction direction from the plurality of prediction direction sets;
Deriving a relative distance between a pixel to be predicted and a reference pixel for each of the first and second prediction directions, and deriving a difference value of the relative distance;
Deriving a predetermined weight component according to the difference value;
In accordance with the weight component, a weighted average of the first unidirectional intra-predicted image and the second unidirectional intra-predicted image to generate a third predicted image signal;
Generating a decoded image signal from the third predicted image signal;
A video decoding method comprising:
Creating a fourth predicted image signal by setting a fourth predicted direction from among a plurality of predicted direction sets;
Selecting one of the third predicted image signal and the fourth predicted image signal as a fifth predicted image signal;
Generating a prediction error signal from the fifth predicted image signal;
Decoding a prediction error signal;
The moving picture decoding method according to claim 11, further comprising:
In order to decode a prediction mode that identifies the fifth predicted image signal,
Generating a reference prediction mode from the prediction mode corresponding to at least one decoded pixel block;
Predictively decoding the prediction mode using the reference prediction mode;
The moving picture decoding method according to claim 12, further comprising:
When the prediction mode specifies the fourth prediction image signal as the fifth prediction image signal, and the reference prediction mode specifies the third prediction image signal as the fifth prediction image signal,
The method of claim 13, further comprising: replacing the prediction mode for specifying one of the first prediction image signal and the second prediction image signal included in the third prediction image signal as the reference prediction mode. The moving picture decoding method described in 1.
The relative distance is derived based on the Euclidean distance or the urban area distance, based on the reference pixel corresponding to the start point of the prediction direction,
The difference value of the relative distance is derived based on the difference value derived in the first prediction direction and the second prediction direction, respectively, and the width or height of the pixel block,
The moving picture decoding method according to claim 14, wherein the weight component in the first prediction direction is designed to be larger as the relative distance is closer and smaller as it is farther.
Based on the tendency that the absolute value of the prediction error increases according to the distance from the reference pixel with respect to the decoded first or second determined prediction direction, the coefficient after the conversion is more than the discrete cosine conversion The moving picture decoding method according to claim 15, further comprising a step of performing an inverse orthogonal transformation using a plurality of transformation matrix sets designed in advance for each prediction direction so as to increase the density.
Decoding the encoded data to create a locally decoded image signal;
Performing a weighted average of the third predicted image signal and the locally decoded image signal to generate a fourth predicted image signal;
The video decoding method according to claim 16, further comprising:
The weighted average derives a relative distance from a reference image with respect to the first or second predetermined prediction direction when the third predicted image signal is generated, and the relative distance is The moving picture decoding method according to claim 17, wherein a weight component of a close pixel is increased and a weight component of a pixel having a relative distance is decreased.
For the already decoded reference pixels used when creating the first predicted image signal, the second predicted image signal, and the fourth predicted image signal,
Replacing the reference pixel by filtering the reference pixel and a reference pixel adjacent to the reference pixel;
Selecting whether to apply the filtering for each pixel block;
The video decoding method according to claim 17, further comprising:
Decoding an index indicating the rotation transformation matrix set;
In order to increase the coefficient density after the rotation transformation than the discrete cosine transformation, one is selected according to an index decoded from a plurality of rotation transformation matrix sets designed in advance, and the inverse rotation transformation is performed on the decoded transformation coefficient. Reversely transforming the transform coefficient;
Performing inverse discrete cosine transform on the transform coefficient after the inverse rotation transform to obtain a prediction error;
The video decoding method according to claim 18, further comprising: