WO2012172667A1

WO2012172667A1 - Video encoding method, video decoding method, and device

Info

Publication number: WO2012172667A1
Application number: PCT/JP2011/063737
Authority: WO
Inventors: 昭行谷沢; 山口　潤; 太一郎塩寺; 山影　朋夫
Original assignee: 株式会社東芝
Priority date: 2011-06-15
Filing date: 2011-06-15
Publication date: 2012-12-20

Abstract

In a video encoding method according to an embodiment, a combination of one-dimensional transformations, comprising first orthogonal transformations only, is selected when intra prediction is being performed in which two or more prediction modes use one or more reference pixel lines; and a combination of first orthogonal transformations and second orthogonal transformations is selected when intra prediction is being performed in which each of the two or more prediction modes uses one and the same reference pixel line. A prediction image signal is generated using the two or more prediction modes. Two-dimensional transformation is performed on a prediction differential signal derived from the prediction image signal, using the selected combination of the one-dimensional transformations to generate a transformation coefficient. The transformation coefficient and the prediction information representing the combination of two or more prediction modes are encoded.

Description

Video encoding method, video decoding method and apparatus

Embodiments described herein relate generally to an intra-screen prediction method, a moving image encoding method, a moving image decoding method, and an apparatus for encoding and decoding moving images.

In recent years, an image coding method with greatly improved coding efficiency has been jointly developed by ITU-T and ISO / IEC. H. H.264 and ISO / IEC 14496-10 (hereinafter referred to as “H.264”). H. H.264 adopts direction prediction in the spatial domain (pixel domain) and performs orthogonal transform based on discrete cosine transform (DCT) on the prediction error signal generated by the difference between the input image signal and the predicted image signal, thereby obtaining ISO / IEC. Compared with the intra-picture encoding in MPEG-1, 2, 4, high encoding efficiency is realized.
H. In order to obtain an encoding efficiency exceeding H.264, a separation type defined by a combination of two types of one-dimensional transformations with respect to a prediction error generated by weighted averaging of predicted images using two types of direction prediction A method of performing two-dimensional orthogonal transformation is disclosed as Non-Patent Document 1.

However, Non-Patent Document 1 possesses a prediction mode corresponding to one direction prediction and a look-up table (LUT) for one-dimensional conversion, and uses the property that the prediction error tends to be different for each prediction direction. The prediction direction is mapped to four types of separate two-dimensional transformations including discrete cosine transformation (DCT) and predetermined orthogonal transformation (for example, discrete sine transformation (DST) and Karhunen-Loeve transformation (KLT)). When bi-directional prediction is selected, a separation type two-dimensional transform corresponding to a prediction mode with a small prediction mode number is selected from the two prediction modes, but the two prediction modes of bi-directional prediction are different reference pixel lines. Is used, the prediction residual of bi-directional prediction is different from the tendency of any prediction error of these two unidirectional predictions, so that the coding efficiency may be reduced.

Therefore, an object of the present embodiment is to provide a moving picture coding method, a moving picture decoding method, and an apparatus that can improve coding efficiency.

The moving image encoding method of the embodiment selects a combination of one-dimensional transformation only for the first orthogonal transformation when two or more prediction modes are intra-frame prediction processing using one or more reference pixel lines, A combination of the first orthogonal transform and the second orthogonal transform is selected when each of the two or more prediction modes is an intra-screen prediction process using one reference pixel line and the same reference pixel line. A predicted image signal is generated using the two or more prediction modes. A prediction coefficient signal derived from the prediction image signal is subjected to a two-dimensional conversion process using the selected combination of one-dimensional conversions to generate a conversion coefficient. The prediction information indicating a combination of the two or more prediction modes and the transform coefficient are encoded.

1 is a block diagram illustrating an image encoding device according to a first embodiment. Explanatory drawing of the prediction encoding order of a pixel block. Explanatory drawing of an example of pixel block size. Explanatory drawing of another example of pixel block size. Explanatory drawing of another example of pixel block size. FIG. 3 is an explanatory diagram illustrating an example of a pixel block in a coding tree block. Explanatory drawing of another example of the pixel block in a coding tree block. Explanatory drawing of another example of the pixel block in a coding tree block. Explanatory drawing of another example of the pixel block in a coding tree block. Explanatory drawing of intra prediction mode. Explanatory drawing of the reference pixel and prediction pixel of intra prediction mode. Explanatory drawing of the horizontal prediction mode of intra prediction mode. Explanatory drawing of the orthogonal lower right prediction mode of intra prediction mode. Explanatory drawing which illustrates the prediction direction which concerns on 1st Embodiment. The table figure which illustrates a response | compatibility with the parameter | index of the prediction angle based on 1st Embodiment, and the prediction angle in the case of prediction image generation. The table figure which illustrates the relationship of prediction mode, prediction type, bidirectional | two-way intra prediction, and unidirectional intra prediction based on 1st Embodiment. The table figure following FIG. 7A. FIG. 3 is a block diagram illustrating an intra bi-predictive image generation unit 109 according to the first embodiment. FIG. 3 is a block diagram illustrating an orthogonal transform unit 102 according to the first embodiment. Explanatory drawing which illustrates the relationship of unidirectional intra prediction and error distribution, and vertical conversion and horizontal conversion based on 1st Embodiment. Explanatory drawing which illustrates the relationship of unidirectional intra prediction and error distribution, and vertical conversion and horizontal conversion based on 1st Embodiment. Explanatory drawing which illustrates the relationship of unidirectional intra prediction and error distribution, and vertical conversion and horizontal conversion based on 1st Embodiment. Explanatory drawing which illustrates the relationship between bidirectional | two-way intra prediction and error distribution, and vertical conversion and horizontal conversion based on 1st Embodiment. Explanatory drawing which illustrates the relationship between bidirectional | two-way intra prediction and error distribution, and vertical conversion and horizontal conversion based on 1st Embodiment. The table figure which illustrates the relationship between prediction mode, a prediction type, bidirectional | two-way intra prediction, unidirectional intra prediction, and a conversion index based on 1st Embodiment. The table figure following FIG. 11A. FIG. 3 is a block diagram illustrating an inverse orthogonal transform unit 105 according to the first embodiment. The table figure which illustrates the relationship between the conversion index which concerns on 1st Embodiment, a vertical conversion index, and a horizontal conversion index. The table figure which illustrates the conversion matrix name of the vertical conversion index which concerns on 1st Embodiment, and a horizontal conversion index. The flowchart figure to the prediction image generation process and orthogonal transformation based on 1st Embodiment. Explanatory drawing of a syntax structure. Explanatory drawing of the slice header syntax based on 1st Embodiment. Explanatory drawing which shows an example of the coding tree block syntax based on 1st Embodiment. Explanatory drawing which shows an example of the transform unit syntax based on 1st Embodiment. Explanatory drawing which shows an example of the prediction unit syntax based on 1st Embodiment. The block diagram which illustrates the picture coding device concerning a 2nd embodiment. The block diagram which illustrates the prediction direction derivation | leading-out part 2001 based on 2nd Embodiment. Explanatory drawing which illustrates a reference pixel line and prediction direction derivation based on a 2nd embodiment. The table figure which illustrates the relationship between prediction mode, bidirectional | two-way intra prediction, unidirectional intra prediction, and a conversion index based on 2nd Embodiment. Explanatory drawing of the slice header syntax based on 2nd Embodiment. Explanatory drawing which shows an example of the prediction unit syntax based on 2nd Embodiment. Explanatory drawing which shows an example of the prediction unit syntax based on 2nd Embodiment. Explanatory drawing which shows an example of the prediction unit syntax based on 2nd Embodiment. The table figure which illustrates the relationship between prediction mode, a prediction type, bidirectional | two-way intra prediction, unidirectional intra prediction, and a conversion index based on 2nd Embodiment. FIG. 26B is a table diagram following FIG. FIG. 26B is a table diagram following FIG. The block diagram which illustrates the moving picture decoding device concerning a 3rd embodiment. The block diagram which illustrates the moving picture decoding device concerning a 4th embodiment.

Hereinafter, each embodiment will be described with reference to the drawings. In the following description, the term “image” can be appropriately read as terms such as “video”, “pixel”, “image signal”, “picture”, and “image data”. Moreover, in the following embodiment, the same number is attached | subjected about what performs the same operation | movement, and repeated description is abbreviate | omitted.
(First embodiment)
The first embodiment relates to an image encoding device. A moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a third embodiment. This image encoding device can be realized by hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array). The image encoding apparatus can also be realized by causing a computer to execute an image encoding program.

As illustrated in FIG. 1, the image encoding device 100 according to the present embodiment includes a subtraction unit 101, an orthogonal transformation unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transformation unit 105, an addition unit 106, and a reference. Image memory 107, intra unidirectional prediction image generation unit 108, intra bidirectional prediction image generation unit 109, inter prediction image generation unit 110, prediction selection switch 111, conversion information setting unit 112, prediction selection unit 113, entropy encoding unit 114 And an output buffer 115.

The image coding apparatus in FIG. 1 divides each frame or each field constituting the input image 117 into a plurality of pixel blocks, performs predictive coding on the divided pixel blocks, and outputs coded data 128. To do. In the following description, for the sake of simplicity, it is assumed that pixel blocks are predictively encoded from the upper left to the lower right as shown in FIG. 2A. In FIG. 2A, the encoded pixel block p is located on the left side and the upper side of the encoding target pixel block c in the encoding processing target frame f.

Here, the pixel block refers to a unit for processing an image such as an M × N size block (N and M are natural numbers), a coding tree block, a macro block, a sub block, and one pixel. In the following description, the pixel block is basically used in the sense of a coding tree block. However, the pixel block can be interpreted in the above-described meaning by appropriately replacing the description. For example, the pixel block in the description of the prediction unit is interpreted as the pixel block of the prediction unit. Hereinafter, the coding tree block is typically a 16 × 16 pixel block shown in FIG. 2B, for example, but may be a 32 × 32 pixel block shown in FIG. 2C or a 64 × 64 pixel block shown in FIG. 2D. However, an 8 × 8 pixel block (not shown) and a 4 × 4 pixel block may be used. Also, the coding tree block need not necessarily be square. Hereinafter, the encoding target block or coding tree block of the input image 117 may be referred to as a “prediction target block” or a “prediction pixel block”. The coding unit is not limited to a pixel block such as a coding tree block, and a frame, a field, a slice, or a combination thereof can be used.

3A to 3D are diagrams showing specific examples of coding tree blocks. FIG. 3A shows an example in which the size of the coding tree block is 64 × 64 (N = 32). Here, N represents the size of the reference coding tree block. The size when divided is defined as N, and the size when not divided is defined as 2N. The coding tree block has a quadtree structure, and when divided, the four pixel blocks are indexed in the Z-scan order as shown in FIG. 3B. An example in which the 64 × 64 pixel block of FIG. 3A is divided into quadtrees is shown in FIG. 3B. Further, it is possible to further divide the quadtree within the index of one quadtree of the coding tree block. In this way, if the depth of division is defined by Depth, it can be divided hierarchically. For example, FIG. 3A shows an example in which Depth = 0. FIG. 3C shows an example of a 32 × 32 (N = 16) size coding tree block in the case of Depth = 1. The unit having the largest coding tree block is called a large coding tree block, and the input image signal is encoded in this unit in raster scan order. FIG. 3D shows an example in which the 32 × 32 pixel block of FIG. 3C is divided into quadtrees.

The image encoding apparatus in FIG. 1 performs intra prediction (also referred to as intra-frame prediction, intra-frame prediction, etc.) or inter prediction (inter-screen prediction) on a pixel block based on the encoding parameter input from the encoding control unit 116. Prediction, inter-frame prediction, motion compensation prediction, etc.) is performed to generate a predicted image 125. This image coding apparatus orthogonally transforms and quantizes a prediction error (also called a prediction difference signal) 118 generated by subtracting an input image 117 divided into pixel blocks and a predicted image 125, performs entropy coding, and performs coding. Generated data 128 is output.

The image encoding device in FIG. 1 performs encoding by selectively applying a plurality of prediction modes having different block sizes and generation methods of the predicted image 125. The generation method of the prediction image 125 is roughly classified into two types, that is, intra prediction in which prediction is performed in the encoding target frame and inter prediction in which prediction is performed using one or a plurality of reference frames that are temporally different. .

Hereinafter, each element included in the image encoding device in FIG. 1 will be described.
The subtractor 101 subtracts the corresponding prediction image 125 from the input image 117 divided into pixel blocks to obtain a prediction error 118. The prediction error 118 output from the subtractor 101 is input to the orthogonal transform unit 102.

The orthogonal transform unit 102 performs, for example, discrete cosine transform (DCT) or discrete sine transform (DCT) on the prediction error 118 output from the subtractor 101 based on the transform information 127 output from the transform information setting unit 112 described later. DST) is performed to obtain a transform coefficient 119. The transform coefficient 119 output from the orthogonal transform unit 102 is input to the quantization unit 103.

The quantization unit 103 performs a quantization process on the transform coefficient 119 output from the orthogonal transform unit 102 to obtain a quantized transform coefficient 120. Specifically, the quantization unit 103 performs quantization according to quantization information such as a quantization parameter and a quantization matrix specified by the encoding control unit 116 (quantization step size derived from the quantization coefficient by the transform information). Divide by). The quantization parameter indicates the fineness of quantization. The quantization matrix is used for weighting the fineness of quantization for each component of the transform coefficient. The quantization unit 103 inputs the quantized transform coefficient 120 to the entropy encoding unit 114 and the inverse quantization unit 104.

The entropy encoding unit 114 performs various encoding parameters such as the quantized transform coefficient 120 from the quantization unit 103, the prediction information 126 from the prediction selection unit 113, and the quantization information specified by the encoding control unit 116. Entropy encoding (for example, Huffman encoding, arithmetic encoding, etc.) is performed to generate encoded data. The encoding parameter is a parameter necessary for decoding such as prediction information 126, information on the transform coefficient 120, information on quantization, and the like. For example, the encoding control unit 116 has an internal memory (not shown), the encoding parameter is held in this memory, and the encoding parameter of the adjacent already encoded pixel block is used when encoding the pixel block. It is good also as a structure. For example, H.M. In the H.264 intra prediction, the prediction value of the prediction mode of the pixel block can be derived from the prediction mode information of the encoded adjacent block.

The encoded data generated by the entropy encoding unit 114 is temporarily accumulated in the output buffer 115 through multiplexing, for example, and output as encoded data 128 according to an appropriate output timing managed by the encoding control unit 116. . The encoded data 128 is output to, for example, a storage system (storage medium) or a transmission system (communication line) not shown.

The inverse quantization unit 104 performs an inverse quantization process on the quantized transform coefficient 120 output from the quantization unit 103 to obtain a restored transform coefficient 121. Specifically, the inverse quantization unit 104 performs inverse quantization according to the quantization information used in the quantization unit 103 (multiplies the quantization transform coefficient 120 by the quantization step size derived from the quantization information). . At this time, the quantization information used in the quantization unit 103 is loaded from an internal memory (not shown) of the encoding control unit 116 and used. The inverse quantization unit 104 inputs the restored transform coefficient 120 to the inverse orthogonal transform unit 105.

The inverse orthogonal transform unit 105 performs, for example, inverse discrete cosine transform (IDCT) or inverse discrete on the restored transform coefficient 121 from the inverse quantization unit 104 based on transform information 127 output from the transform information setting unit 112 described later. An inverse orthogonal transformation corresponding to the orthogonal transformation performed in the orthogonal transformation unit 102 such as sine transformation (IDST) is performed to obtain a restored prediction error 122. The reconstruction prediction error 122 output from the inverse orthogonal transform unit 105 is input to the addition unit 106.

The addition unit 106 adds the restored prediction error 122 and the corresponding prediction image 125 to generate a local decoded image 123. The locally decoded image 123 is input to the reference image memory 107.

The reference image memory 107 stores the local decoded image 123 in the memory, and a prediction image is generated as necessary by the intra unidirectional prediction image generation unit 108, the intra bidirectional prediction image generation unit 109, and the inter prediction image generation unit 110. Is referred to as the reference image 124 each time.

<Intra Unidirectional Prediction Image Generation Unit 108>
The intra unidirectional predicted image generation unit 108 performs unidirectional intra prediction using the reference image 124 stored in the reference image memory 107. For example, in H.264 / MPEG-4 AVC, pixel supplementation (copying or copying) is performed along a prediction direction such as a vertical direction or a horizontal direction using an encoded reference pixel line spatially adjacent to a prediction target block. An intra prediction image is generated by performing an interpolation process (referring to a filter process or the like). FIG. 4A shows a prediction direction of intra prediction in H.264 / MPEG-4 AVC. FIG. 4B shows an arrangement relationship between reference pixel lines and encoding target pixels in H.264 / MPEG-4 AVC. M and A to H shown in the figure indicate reference pixel lines located above the prediction target block. M and I to L indicate reference pixel lines located on the left of the prediction target block. FIG. 4C shows a predicted image generation method in mode 1 (horizontal prediction), in which pixels I to L are copied in the prediction direction from the left reference pixel line. FIG. 4D shows a predicted image generation method in mode 4 (diagonal lower right prediction). In this case, for example, the predicted value of the pixel position below the reference pixel B is derived by performing (1, 2, 1) 3-tap linear filter processing on the three reference pixels A, B, and C. The derived pixel values are also copied in the prediction direction shown in the figure.

Also, it is possible to further expand the prediction direction of H.264 / MPEG-4 AVC and increase the number of prediction modes. For example, the prediction mode is expanded to 34, the pixel position accuracy corresponding to the fractional position is determined as 32 pixel accuracy, and the prediction pixel value is created by performing linear interpolation (such as 3-tap filter processing). It is also possible. FIG. 5 shows an example of the prediction angle and the prediction mode when the prediction mode is expanded up to 34 prediction modes. In FIG. 5, there are 33 different prediction directions for the vertical and horizontal coordinates indicated by the bold lines. In addition, the direction of a typical prediction angle indicated by H.264 / MPEG-4 AVC is indicated by an arrow. In the present embodiment of the present invention, 33 types of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a circle. Similarly to H.264 / MPEG-4 AVC, DC prediction for prediction based on the average value of available reference pixels is added, and there are a total of 34 prediction modes.

When IntraPredMode = 4, since IntraPredAngleIdL0 in FIG. 7A described later is −4, the prediction image 125 is generated in the prediction direction indicated by IntraPredMode = 4 in FIG. An arrow indicated by a dotted line in FIG. 5 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.

FIG. 6 shows the relationship between IntraPredAngleIdLX and intraPredAngle used for predictive image value generation. 7A and 7B show the relationship between the prediction mode (PredMode), a bidirectional prediction flag (BipredFlag) described later, a prediction mode type (PredTypeL0, PredTypeL1), and a prediction angle (PredAngleIdL0, PredAngleIdL1).

intraPredAngle indicates the prediction angle that is actually used when the predicted value is generated. For example, a prediction value generation method in the case where the prediction type is Intra_Vertical and the intraPredAngle shown in FIGS. 7A and 7B is a positive value is expressed by the following equation (1). Here, BLK_SIZE indicates the size of the pixel block, and ref [] indicates an array in which a reference image (also referred to as a reference pixel line) is stored. Further, pred (k, m) indicates the generated predicted image 125.

Even for conditions other than the above, a predicted value can be generated by a similar method according to the tables of FIGS. 7A and 7B. For example, the prediction value of the prediction mode indicated by IntraPredMode = 1 is the same as the H.264 / MPEG-4MAVC horizontal prediction shown in FIG. 4C.

<Intra bidirectional prediction image generation unit 109>
The intra bidirectional prediction image generation unit 109 performs bidirectional intra prediction using the reference image 124 stored in the reference image memory 107. For example, in Non-Patent Document 1 described above, after selecting two types of prediction modes from nine types of prediction modes defined in H.264 / MPEG-4 AVC and generating respective prediction image signals, The predicted image signal is generated by performing the filtering process every time.

Bidirectional prediction when the number of unidirectional prediction modes is expanded to 34 types will be described more specifically with reference to FIG. In the present embodiment, the maximum number of modes is not limited, and bi-directional prediction can be easily expanded in any number of uni-directional predictions.

8 includes a weighted average unit 801, a first unidirectional intra predicted image generation unit 802, and a second unidirectional intra predicted image generation unit 803. The functions of the first unidirectional intra predicted image generation unit 802 and the second unidirectional intra predicted image generation unit 803 are the same. These may be the same as the intra unidirectional predicted image generation unit 108. In this case, since the three processing units can have the same hardware configuration, the circuit scale can be reduced. Each of these generates a prediction image corresponding to the prediction mode given according to the prediction mode information controlled by the encoding control unit 116.

A first predicted image 851 is output from the first unidirectional intra predicted image generation unit 802, and a second predicted image 852 is output from the second unidirectional intra predicted image generation unit 803. Each predicted image is input to the weighted average unit 801, and a weighted average process is performed. In the weighted average unit 801, calculation based on the following formula (2) is performed. When the first predicted image 851 is P1 [x, y] and the second predicted image 852 is P2 [x, y], the bidirectionally predicted image P [x, y] is expressed by the following equation.

Here, W [x, y] represents a weighted table, and is set by a combination of two prediction modes used in bidirectional intra prediction. Norm is a fixed number for normalization introduced in order to make Equation (2) an integer arithmetic process, Offset indicates an offset in rounding, and Shift indicates a shift amount for division. For example, in order to perform the above calculation with W [x, y] as 10-bit precision, the Norm value is 1024, the Offset value is 512, and the Shift value is 10.

Bidirectional intra prediction indicates that the BipedFlag shown in FIGS. 7A and 7B is 1. In this case, two prediction mode types are defined, the prediction mode type corresponding to the first unidirectional intra prediction image generation unit 802 is PredTypeL0, and the prediction mode corresponding to the second unidirectional intra prediction image generation unit 803 is PredTypeL1. It is shown in The same applies to PredAngleIdLX, and the type and angle of each prediction mode are expressed by whether the X portion is 0 or 1.

In FIG. 7A and FIG. 7B, the combination of two prediction modes corresponding to bidirectional intra prediction is illustrated as a fixed combination. However, a special mode that uses a prediction mode held by an encoded adjacent pixel block is illustrated. A prediction mode may be added. In this case, since a combination of two prediction modes having a close spatial distance can be selected, a combination of bidirectional intra prediction that matches the characteristics of the image can be realized without depending on the number of fixed prediction modes. It is also possible to reduce the number of fixed prediction modes. Further, in the present embodiment of the present invention, an example in which the number of bidirectional intra prediction modes is 16 is shown, but the increase / decrease in the number of prediction modes can be easily changed. Increasing the number of prediction modes increases prediction accuracy, while increasing the overhead for encoding the prediction mode. The optimum number of prediction modes may be set in consideration of the balance between the number of prediction modes and coding efficiency.
The above is the description of the intra bidirectional prediction image generation unit 109.

1 uses the reference image 124 stored in the reference image memory 107 to perform inter prediction. Specifically, the inter prediction unit 110 performs an interpolation process (motion compensation) based on a motion shift amount (motion vector) between the prediction target block and the reference image 124 to generate an inter prediction image. In H.264 / MPEG-4 AVC, interpolation processing up to ¼ pixel accuracy is possible. The derived motion vector is entropy encoded as part of the prediction information 126.

The prediction selection switch 111 outputs the output terminal of the intra unidirectional prediction image generation unit 108, the output terminal of the intra bidirectional prediction image generation unit 109, or the output terminal of the inter prediction image generation unit 110 from the prediction selection unit 113. The intra-predicted image or the inter-predicted image is selected as the predicted image 125 and input to the subtracting unit 101 and the adding unit 106. When the prediction information 126 suggests intra prediction, the prediction selection switch 111 outputs from the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 according to the prediction mode shown in FIGS. 7A and 7B. Connect the switch to the end. On the other hand, when the prediction information 126 suggests inter prediction, the prediction selection switch 111 connects a switch to the output terminal from the inter prediction unit 110.

<Prediction selection unit-coding control unit mode determination>
The prediction selection unit 113 has a function of setting the prediction information 126 according to the prediction mode controlled by the encoding control unit 116. As described above, intra prediction or inter prediction can be selected to generate the predicted image 125, but a plurality of modes can be further selected for each of intra prediction and inter prediction. The encoding control unit 116 selects one of a plurality of prediction modes of intra prediction and inter prediction as the optimal prediction mode, and the prediction selection unit 113 sets the prediction information 126 according to the determined optimal prediction mode. .

For example, regarding the intra prediction, when the prediction mode information is selected from the table shown in FIGS. 7A and 7B from the encoding control unit 116, the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 is selected. Generates a prediction image 125 according to the prediction mode information. The encoding control unit 116 may specify the prediction mode information in order from the smallest prediction mode number, or may specify the prediction mode information in order from the largest. Further, the prediction mode may be limited according to the characteristics of the input image, or a predetermined prediction mode may be selected. It is not always necessary to specify all the prediction modes, and at least one prediction mode information may be specified for the encoding target block.

For example, the encoding control unit 116 determines an optimal prediction mode using a cost function shown in the following mathematical formula (3).

In Equation (3), OH represents an estimated amount of code amount (for example, the number of bits of a value representing a symbol in binary) related to the prediction information 126 (for example, prediction mode information, motion vector information, and prediction block size information), and SAD Indicates the sum of absolute differences between the prediction target block and the predicted image 125 (that is, the cumulative sum of the absolute values of the prediction errors 118). Further, λ represents a Lagrange undetermined multiplier determined based on the value of quantization information (quantization parameter), and K represents an encoding cost. When Expression (3) is used, the prediction mode that minimizes the coding cost K is determined as the optimum prediction mode from the viewpoint of the generated code amount and the prediction error. As a modification of Equation (3), the encoding cost may be estimated from OH alone or SAD alone, or the encoding cost may be estimated using a value obtained by subjecting SAD to Hadamard transform or an approximate value thereof.

It is also possible to determine an optimal prediction mode by using a temporary encoding unit (not shown). For example, the encoding control unit 116 determines an optimal prediction mode using a cost function represented by the following mathematical formula (4).

In Equation (4), D indicates a square error sum (ie, encoding distortion) between the prediction target block and the local decoded image, and R indicates a prediction error between the prediction target block and the prediction image 125 in the prediction mode. Indicates the amount of code estimated by provisional encoding, and J indicates the encoding cost. In order to derive the encoding cost J of Equation (4), provisional encoding processing and local decoding processing are required for each prediction mode, so that the circuit scale or the amount of calculation increases. On the other hand, since the encoding cost J is derived based on more accurate encoding distortion and code amount, it is easy to determine the optimal prediction mode with high accuracy and maintain high encoding efficiency. As a modification of Equation (4), the encoding cost may be estimated from only R or D, or the encoding cost may be estimated using an approximate value of R or D. These costs may be used hierarchically. The encoding control unit 116 performs determination using Expression (3) or Expression (4) based on information obtained in advance regarding the prediction target block (prediction mode of surrounding pixel blocks, results of image analysis, and the like). The number of prediction mode candidates may be narrowed down in advance.

The conversion information setting unit 112 has a function of generating conversion information 127 used in orthogonal transformation based on the prediction information 126 output from the prediction selection unit 113 and input via the prediction selection switch 111.

The above is the outline of the image coding apparatus 100 in the present embodiment of the present invention. Subsequently, the orthogonal transform unit 102 and the inverse orthogonal transform unit 105 will be described in detail with reference to FIGS. 9, 12, 13A, and 13B, and the transform information setting unit 112 will be described in detail with reference to FIGS. 11A and 11B. Do.

<Orthogonal transform unit 102>
The orthogonal transform unit 102 includes a selection switch A 901, a vertical conversion unit 906, a transposition unit 904, a selection switch B 905, and a horizontal conversion unit 907. The vertical transform unit 906 includes a 1D discrete cosine transform unit 902 and a 1D discrete sine transform unit 903. The horizontal conversion unit 907 includes a 1D discrete cosine conversion unit 902 and a 1D discrete cosine conversion unit 903. The order of the vertical conversion unit 906 and the horizontal conversion unit 907 is an example, and these may be reversed.

The 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 have a common function in that the input matrix is multiplied by a discrete cosine transform or a 1D transform matrix for discrete sine transform, respectively.

The selection switch A 901 guides the prediction error 118 to one of the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 952. The 1D discrete cosine transform unit 902 performs 1D discrete cosine transform on the input prediction error (matrix) 118 and outputs a temporary transform coefficient 951. The 1D discrete sine transform unit 903 performs 1D discrete sine transform on the input prediction error (matrix) 118 and outputs a temporary transform coefficient 951. Specifically, the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 perform a one-dimensional orthogonal transform represented by the following formula (5) to remove the vertical correlation of the prediction error (matrix) 118. To do.

Here, the 1D discrete cosine transform unit 902 has a transformation matrix (i, j = 0, 1, 2,..., N−1) represented by the following formula (6), and the 1D discrete cosine transform unit 903 has the following It has a transformation matrix (i, j = 1, 2, 3,..., N) expressed by Equation (7).

In Equation (5), X represents a matrix (N × N) of a prediction error (matrix) 118, and V comprehensively includes a 1D discrete cosine transform unit 902 and a 1D discrete sine transform unit 903 (both are N × N). Y indicates an output matrix (N × N) of the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903. As shown in the equations (6) and (7), the transformation matrix V is, for example, N × N in which discrete cosine transform bases or discrete sine transform bases prepared for removing the vertical correlation of the matrix X are arranged horizontally. Is the transformation matrix.

The transposition unit 904 transposes the output matrix (Y) of the vertical conversion unit 906, and gives the result to the selection switch B905. However, the transposing unit 904 is an example, and corresponding hardware may not necessarily be prepared. For example, when 1D orthogonal transformation (one-dimensional orthogonal transformation) by the vertical transformation unit 906 is executed (each element of the output matrix of the vertical transformation unit 906) is held, and 1D orthogonal transformation is performed by the horizontal transformation unit 907 If the data is read out in an appropriate order, the transposition of the output matrix (Y) can be executed without preparing the hardware corresponding to the transposition unit 904.

The selection switch B 905 guides the input matrix from the transposing unit 904 to either the 1D discrete cosine transform unit 902 or the 1D discrete sine transform unit 903 according to the horizontal transform index (Horizontal_transform_idx) included in the 1D transform index 952. The 1D discrete cosine transform unit 902 performs a discrete cosine transform on the input matrix and outputs a transform coefficient 119. The 1D discrete sine transform unit 903 performs discrete sine transform on the input matrix and outputs a transform coefficient 119. Specifically, the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 (that is, the horizontal transform unit 907) perform a one-dimensional orthogonal transform represented by the following formula (8) to generate a horizontal direction of the prediction error. Remove the correlation.

In Equation (8), H comprehensively represents a 1D discrete cosine transform matrix and a 1D discrete sine transform matrix (both N × N), and Z represents a 1D discrete cosine transform unit 902 or a 1D discrete sine transform unit 903. An output matrix (N × N) is shown, which refers to the transform coefficient 119. Specifically, the transformation matrix H is a transform matrix of N × N which the discrete cosine transform matrix or discrete sine transform matrix to remove the correlation in the vertical direction arranged horizontally in matrix Y ^T. That is, the discrete cosine transform matrix of Equation (6) or the discrete sine transform matrix of Equation (7) corresponds to each.

As described above, the orthogonal transform unit 102 performs the separation-type 2D orthogonal transform (two-dimensional orthogonal transform) on the prediction error (matrix) 118 according to the 1D transform index 952 output from the 1D transform setting unit 908. A conversion coefficient (matrix) 119 is generated. Note that the 1D discrete cosine transform unit 902 may be replaced with a discrete cosine transform of H.264 / MPEG-4 変換 AVC in a form that reuses an existing orthogonal transform. Further, the orthogonal transform unit 102 may implement various orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform in addition to DCT.

<Effect of orthogonal transform-relationship with intra unidirectional prediction>
Here, the difference in properties between the 1D discrete cosine transform unit 902 and the 1D discrete cosine transform unit 903 will be described. Intra prediction modes supported by H.264 / MPEG-4 AVC, etc. are interpolated using a reference pixel group (reference pixel line) on a line adjacent to one or both of the left and upper sides of the prediction target block. Some of them generate a predicted image by copying processed pixel values (for example, filter processing) along the prediction direction. Since the prediction mode of the intra prediction uses the spatial correlation of the image, the prediction accuracy tends to decrease as the distance from the reference pixel to the prediction target pixel increases. That is, the absolute value of the prediction error tends to increase according to the distance from the reference pixel to the prediction target pixel. This tendency is the same regardless of the prediction direction. More specifically, an intra prediction mode (for example, mode 1 and mode 8 in FIG. 4A) for copying a pixel value subjected to interpolation processing (for example, filter processing) using a reference pixel group on a line adjacent to the left of the prediction target block. For IntraPredMode of FIG. 5, 1, 8, 9, etc.), the prediction error shows a tendency in the horizontal direction. Intra prediction mode (for example, mode 0, mode 3 and mode 7 in FIG. 4A, FIG. 5) that copies pixel values that have undergone interpolation processing (for example, filter processing) using a reference pixel group on a line adjacent to the prediction target block. IntraPredMode of 0, 5, 6, etc.), the prediction error shows a tendency in the vertical direction. Further, a prediction mode (for example, mode 4, mode 4 in FIG. 4A) that copies pixel values interpolated (for example, filtered) using the reference pixel group on the line adjacent to the left of the prediction target block and the line adjacent above. 5 and mode 6 and IntraPredMode in FIG. 5 is 3, 4, 7, etc.), the prediction error shows a tendency related to the horizontal direction and the vertical direction. In summary, it can be said that the tendency is related to the direction orthogonal to the line of the reference pixel group used for generating the predicted image.

When showing such a tendency, the 1D discrete sine transform unit 903 has a higher coefficient density when performing 1D orthogonal transform in the orthogonal direction (vertical direction or horizontal direction) than the 1D discrete cosine transform unit 902 ( That is, the non-zero coefficient ratio in the quantized transform coefficient 121 is reduced). On the other hand, the 1D discrete cosine transform unit 902 is a general-purpose transform matrix that does not have such properties. If 1D orthogonal transformation is performed in the orthogonal direction using 1D discrete sine transformation, the prediction error conversion efficiency of intra prediction is improved, and consequently the coding efficiency is improved. For example, a prediction error signal in mode 0 (vertical prediction) shows the above tendency in the vertical direction but does not show the above tendency in the horizontal direction. Therefore, by performing 1D orthogonal transform using the 1D discrete sine transform unit 903 in the vertical transform unit 906 and performing 1D orthogonal transform using the 1D discrete cosine transform unit 902 in the horizontal transform unit 907, efficient orthogonal transform is performed. realizable.

<Difference between orthogonal transformation for unidirectional prediction and bidirectional prediction>
Next, differences in characteristics of prediction images 125 generated by the intra unidirectional prediction image generation unit 108 and the bidirectional prediction image generation unit 109 and the tendency of prediction errors will be described with reference to FIGS. 10A to 10E. The predicted image 125 generated by the intra unidirectional predicted image generation unit 108 has an effect of removing the spatial direction correlation of the input image 117. In general, the spatial correlation of an image is higher as the distance is shorter, and decreases as the distance is longer. For this reason, the prediction error is smaller as the distance between the reference pixel and the prediction coping pixel is closer to the prediction direction, and the prediction error is larger as the distance is longer. FIG. 10A shows an example of a general prediction error tendency statistically seen as a prediction direction corresponding to mode 0 in FIG. 4A. As described above, the prediction error in the vertical direction increases as the distance from the reference pixel increases, while the prediction error in the horizontal direction has the same magnitude. In this case, the spatial correlation can be efficiently removed by selecting DST as 1D vertical conversion and selecting DCT as 1D horizontal conversion. FIG. 10B shows an example of a general prediction pixel tendency statistically seen as a prediction direction corresponding to mode 1 in FIG. 4A. It can be seen that while the vertical prediction error is the same size, the horizontal prediction error increases as the distance from the reference pixel increases. In this case, spatial correlation can be efficiently removed by selecting DCT as 1D vertical conversion and selecting DST as 1D horizontal conversion. FIG. 10C shows an example of the prediction direction corresponding to mode 4 in FIG. It can be seen that the prediction error in the vertical direction and the prediction error in the horizontal direction both increase as the distance from the reference pixel that is the starting point in the prediction direction increases. In this case, spatial correlation can be efficiently removed by selecting DST as 1D vertical conversion and selecting DST as 1D horizontal conversion.

The bidirectional prediction image generation unit 109 is a prediction image generation method that averages two prediction directions with weights. For this reason, there is an effect of removing the correlation that changes smoothly in space while maintaining the characteristics of the two prediction directions. FIG. 10D shows an example of a general tendency of prediction error seen statistically when mode 1 and mode 8 in FIG. 4A are used. In any mode, the prediction is performed only from the line positioned to the left of the prediction target block as the reference pixel line. Therefore, it can be seen that the prediction error in the vertical direction is the same size, but the prediction error in the horizontal direction increases as the distance from the reference pixel increases. In this case, spatial correlation can be efficiently removed by selecting DCT as 1D vertical conversion and selecting DST as 1D horizontal conversion. FIG. 10E shows an example of a general tendency of prediction error seen statistically when mode 0 and mode 1 in FIG. 4A are used. Mode 0 uses a reference line located above the prediction target block, and mode 1 uses a reference line located to the left of the prediction target block. The prediction error in this case has a tendency similar to the prediction error of the intra unidirectional prediction described with reference to FIG. 10C, and the prediction error in the vertical direction and the prediction error in the horizontal direction are the left and upper references that are the starting points of the prediction direction. It can be seen that both increase as the distance from the pixel increases. However, as shown in Non-Patent Document 1, when 1D vertical conversion and 1D horizontal conversion are selected according to the two selected prediction modes, either the conversion of FIG. 10A or FIG. 10B is specified, and the encoding efficiency is increased. The problem of deteriorating arises. Therefore, in the present embodiment of the present invention, as shown in the lower right of FIG. 10E, spatial correlation can be efficiently removed by selecting DST as 1D vertical conversion and selecting DST as 1D horizontal conversion. .

<Effect of orthogonal transformation on plane prediction>
As shown in FIG. 11A, when PredMode indicates 2, Intra_DC is designated. Intra_DC indicates DC prediction that is predicted by the average value of available reference pixels. As the prediction error of DC prediction, the directional error as described above does not occur statistically, so the DCT used in H.264 / MPEG-4 AVC is selected. On the other hand, in the case of bidirectional intra prediction, if one of the two prediction modes is Intra_DC, the tendency of the prediction error changes according to the prediction direction of the other prediction modes. Therefore, in such a case, it is possible to efficiently reduce the redundancy of the prediction error by setting TransformIdx determined in the other prediction mode. For example, as illustrated in FIGS. 11A and 11B, when PredMode is 34, bidirectional intra prediction is performed in two prediction modes with PredMode of 2 and PredMode of 0. Since the first is Intra_DC, the same TransformIdx as TransformIdx = 2 held by the other PredMode = 0 is selected with PredMode = 34. Although only DC prediction has been described here, TransformIdx may be set in a similar framework even when a prediction mode that is not such direction prediction is selected. Note that DCT is selected when the two prediction modes of bidirectional intra prediction are not directional prediction.

<Inverse orthogonal transform unit 105>
Hereinafter, the details of the inverse orthogonal transform unit 105 according to the present embodiment will be described with reference to FIG.
The inverse orthogonal transform unit 105 includes a selection switch A 1201, a vertical inverse transform unit 1206, a transposition unit 1204, a selection switch B 1205, and a horizontal inverse transform unit 1207. The vertical inverse transform unit 1206 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. The horizontal inverse transform unit 1207 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. Note that the order of the vertical inverse transform unit 1206 and the horizontal inverse transform unit 1207 is an example, and these may be reversed.

The two 1D inverse discrete cosine transform units 1202 shown in the figure can also be realized by using physically identical hardware in a time division manner. The same applies to the 1D inverse discrete sine transform unit 1203.

The selection switch A 1201 converts the restored transform coefficient 121 of the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 1252 output from the 1D transform setting unit 1208. Lead to one of them. The 1D inverse discrete cosine transform unit 1202 multiplies the input restoration transform coefficient 121 (matrix format) by a transposed matrix of the discrete cosine transform matrix and outputs the result. The 1D inverse discrete sine transform unit 1203 multiplies the input restoration transform coefficient 121 by the transposed matrix of the discrete sine transform matrix and outputs the result. Specifically, the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 (that is, the vertical inverse transform unit 1206) perform a one-dimensional inverse orthogonal transform represented by the following equation (9).

In Equation (9), a Z 'represents a matrix of restoring conversion coefficient 121 (N × N), V T is the transpose matrix of 1D inverse discrete cosine transform matrix and 1D inverse discrete sine transform matrix (both N × N) Y ′ indicates an output matrix (N × N) of the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203. That, V ^T is the discrete cosine transform matrix in Equation (6), or a discrete sine transform matrix equation (7) corresponds, respectively.

The transposition unit 1204 transposes the output matrix (Y ′) of the vertical inverse transform unit 1206 and supplies the transposition to the selection switch B 1205. However, the transposition unit 1204 is an example, and corresponding hardware may not necessarily be prepared. For example, the result (one element of the output matrix of the vertical inverse transform unit 1206) obtained by executing the 1D inverse orthogonal transform (one-dimensional inverse orthogonal transform) by the vertical inverse transform unit 1206 is held, and the 1D inverse by the horizontal inverse transform unit 1207 is held. If reading is performed in an appropriate order when performing orthogonal transformation, transposition of the output matrix (Y ′) can be performed without preparing hardware corresponding to the transposition unit 1204.

The selection switch B 1205 converts the input matrix from the transposition unit 1204 into the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform according to the horizontal transformation index (Horizontal_transform_idx) included in the 1D transformation index 1252 output from the 1D transformation setting unit 1208. Lead to one of the sections 1203. The 1D inverse discrete cosine transform unit 1202 performs 1D inverse discrete cosine transform on the input matrix and outputs the result. The 1D inverse discrete sine transform unit 1203 performs 1D inverse discrete sine transform on the input matrix and outputs the result. Specifically, the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 (that is, the horizontal inverse transform unit 1207) perform a one-dimensional inverse orthogonal transform represented by the following formula (10).

In Equation (10), ^{H T} is 1D are generically indicates a transposed matrix of the discrete cosine transform matrix and 1D discrete sine transform matrix (both N × N), X 'is 1D inverse discrete cosine transform unit 1202 and 1D An output matrix (N × N) of the inverse discrete sine transform unit 1203 is shown, which indicates the restored prediction error 122. That, H ^T is the discrete cosine transform matrix in Equation (6), or a discrete sine transform matrix equation (7) corresponds, respectively.

As described above, the inverse orthogonal transform unit 105 performs inverse orthogonal transform on the reconstructed transform coefficient (matrix) 121 according to the input orthogonal transform information 127 to generate a reconstructed prediction error (matrix) 122. Note that the 1D inverse discrete cosine transform unit 1202 may be replaced with an inverse discrete cosine transform of H.264 / MPEG-4 AVC by reusing the existing inverse orthogonal transform. Further, the inverse orthogonal transform unit 105 may realize various inverse orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform. In any case, an inverse orthogonal transform corresponding to the orthogonal transform unit 102 may be selected.

Hereinafter, details of the 1D conversion setting unit 908 in FIG. 9 and the 1D conversion setting unit 1208 in FIG. 12 that have not been described will be described. Since these two have the same function, they will be described as the 1D conversion setting unit 1208 in FIG. Based on the orthogonal transformation information 127, the 1D transformation setting unit 1208 selects a transformation matrix used for vertical orthogonal transformation and vertical inverse orthogonal transformation, and performs horizontal orthogonal transformation and horizontal inverse orthogonal transformation. And a function of setting a 1D conversion index for selecting a conversion matrix to be used. The 1D transform index 1252 directly or indirectly indicates the orthogonal transform selected by the vertical orthogonal transform and the horizontal orthogonal transform, respectively.

For example, the 1D transform index 1252 can be expressed by a transform index (TransformIdx) shown in FIG. 13A and a 1D orthogonal transform in the vertical or horizontal direction (Vertical_transform_idx and Horizontal_transform_idx, respectively). With reference to the table in FIG. 13A, a 1D transformation index (Vertical_transform_idx) for a vertical transformation unit and a 1D transformation index (Horizontal_transform_idx) for a horizontal transformation unit can be derived from the transformation index. Further, FIG. 13B shows whether each idx indicates discrete cosine transform or discrete sine transform. When idx is “1”, it indicates a discrete sine transform matrix (DST), and when it is “0”, it indicates a discrete cosine transform matrix (DCT). In this figure, for example, the corresponding 1D transform index 1252 is referenced based on TransformIdx included in the orthogonal transform information 127, and Vertical_transform_idx is output to the selection switch A and Horizontal_transform_idx is output to the selection switch B. For example, when Vertical_transform_idx or Horizontal_transform_idx indicates DCT with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete cosine transform unit 1202 (or 1D discrete cosine transform unit 902). When Vertical_transform_idx or Horizontal_transform_idx indicates DST with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete sine transform unit 1203 (or 1D discrete sine transform unit 903).

Next, orthogonal transformation information will be described. The orthogonal transform information indirectly and directly indicates a transform index corresponding to the selected prediction mode with reference to a predetermined transform index and prediction mode map. 11A and 11B show the relationship between the prediction mode and the conversion index. 11A and 11B are obtained by adding TransformIdx to the intra prediction mode shown in FIGS. 7A and 7B. From this table, TransformIdx can be derived according to the selected prediction mode. By classifying the vertical direction and horizontal direction of each prediction mode into two classes according to the presence or absence of the above-mentioned tendency, and adaptively applying a 1D discrete cosine transform matrix or 1D discrete sine transform matrix for each of the vertical direction and the horizontal direction, High conversion efficiency is achieved stably.

<Prediction image generation to orthogonal transform processing flow>
Next, processing from predicted image generation processing to orthogonal transformation processing will be described with reference to FIG. First (S1401), the prediction selection unit 113 determines whether the prediction mode is intra prediction based on the selection of the encoding control unit 116 (S1402). When the determination is No, the inter prediction process is performed as usual. If the determination is Yes, the prediction selection unit 113 next determines whether the prediction mode is bidirectional intra prediction (S1403). When the determination is No, the unidirectional prediction mode is output to the prediction selection unit 113 after being set (S1404) and input to the prediction selection switch 111. The prediction selection switch 111 connects the input end of the switch to the intra unidirectional prediction image generation unit 108, and the prediction image 125 generated by the intra unidirectional prediction image generation unit 108 is output (S1405). Prediction information 126 indicating the prediction mode of unidirectional prediction set by the prediction selection unit 113 is input to the conversion information setting unit 112. The transformation information setting unit 112 refers to the LUT shown in FIGS. 11A and 11B and outputs the transformation information 127 (TransformIdx) of the selected prediction mode (S1406).

If the determination in S1403 is Yes, the prediction selection unit 113 sets two prediction modes (S1407) and outputs the prediction mode to the prediction selection switch 111. Referring to FIGS. 11A and 11B, in the case of bi-directional prediction, two prediction modes are included in one PreMode indicating the prediction mode. For example, when PredMode is 34, bidirectional intra prediction is performed in two prediction modes in which PredMode is 2 and PredMode is 0. That is, when calling the prediction mode, there are cases where two prediction modes are included, but here when referring to the prediction mode, there are two “PredTypeLX” (prediction type) and “PredAngleIdLX” (prediction angle). Indicates the specified mode. Therefore, in the case of unidirectional prediction, one prediction mode is designated for one PredMode, and in the case of bidirectional prediction, two prediction modes are designated for one PredMode. The prediction selection switch 111 connects the input end of the switch to the intra bidirectional prediction image generation unit 109, and the prediction selection switch 111 connects the input end of the switch to the intra bidirectional prediction image generation unit 109. The predicted image 125 generated by the directional predicted image generation unit 109 is output (S1408). Prediction information 126 indicating the prediction mode of bidirectional prediction set by the prediction selection unit 113 is input to the conversion information setting unit 112. The conversion information setting unit 112 determines whether the reference pixel lines in the two set prediction modes are different (S1409). If the determination is Yes, the prediction mode transformation information 127 (TransformIdx) is output so that Vertical_transform_idx and Horizontal_transform_idx select DST, respectively (S1410). When the determination is No, the conversion information setting unit 112 outputs the conversion information 127 (TransformIdx) possessed by the smaller of the two set prediction mode numbers (S1411).

The 1D transformation setting unit 908 in the orthogonal transformation unit 102 and the 1D transformation setting unit 1208 in the inverse orthogonal transformation unit 105 derive Vertical_transform_idx and Horizontal_transform_idx by referring to FIGS. 13A and 13B according to the inputted transformation information 127. . Vertical_transform_idx is input to the selection switches A 901 and 1201, and the output terminal of the switch is input to the 1D discrete cosine transform unit 902 (or 1D inverse discrete cosine transform unit 1202) or 1D discrete sine transform unit 903 (or 1D inverse discrete sine transform unit 1203). Connecting. Horizontal_transform_idx is input to the selection switches B905 and 605, and the output end of the switch is input to the 1D discrete cosine transform unit 902 (or 1D inverse discrete cosine transform unit 1202) or 1D discrete sine transform unit 903 (or 1D inverse discrete sine transform unit 1203). Connect (S1412).

The prediction error 118 is input to the orthogonal transformation unit 102, and orthogonal transformation processing is performed using the set transformation matrix (S1413). The transform coefficient 119 after the orthogonal transform is output to the quantization unit 103 (S1414).

Here, it has been described in S1409 that it is determined whether or not the reference pixel lines of the two prediction modes are different from each other. However, the combination of the prediction modes may be determined in advance, as shown in FIGS. 11A and 11B. It is possible to omit the processing necessary for the determination by holding the LUT corresponding to.

Here, the processing flow of the orthogonal transformation processing has been described, but the 1D transformation setting unit 1208 also sets Vertical_transform_idx and Horizontal_transform_idx in the same processing procedure for the inverse orthogonal transformation unit 105. The restored transform coefficient 121 is input to the inverse orthogonal transform unit 105, and an inverse orthogonal transform process is performed using the set transform matrix. Thereafter, the restoration prediction error 122 is output to the adding unit 106.
The above is the description of the process flow of the predicted image generation process, the orthogonal transform process, and the inverse orthogonal transform process according to the present embodiment of the present invention.

<Syntax configuration>
Hereinafter, the syntax used by the image coding apparatus 100 in FIG. 1 will be described.
The syntax indicates the structure of encoded data (for example, encoded data 128 in FIG. 1) when the image encoding apparatus encodes moving image data. When decoding the encoded data, the moving picture decoding apparatus interprets the syntax with reference to the same syntax structure. FIG. 15 illustrates a syntax 1500 used by the image encoding device in FIG.

The syntax 1500 includes three parts: a high level syntax 1501, a slice level syntax 1502, and a coding tree level syntax 1503. The high level syntax 1501 includes syntax information of a layer higher than the slice. A slice refers to a rectangular area or a continuous area included in a frame or a field. The slice level syntax 1502 includes information necessary for decoding each slice. Coding tree level syntax 1503 includes information necessary to decode each coding tree (ie, each coding tree block). Each of these parts includes more detailed syntax.

The high level syntax 1501 includes sequence and picture level syntaxes such as a sequence parameter set syntax 1504 and a picture parameter set syntax 1505. The slice level syntax 1502 includes a slice header syntax 1506, a slice data syntax 1507, and the like. The coding tree level syntax 1503 includes a coding tree block syntax 1508, a prediction unit syntax 1509, and the like.

The coding tree block syntax 1508 can have a quadtree structure. Specifically, the coding tree block syntax 1508 can be recursively called as a syntax element of the coding tree block syntax 1508. That is, one coding tree block can be subdivided with a quadtree. Also, the coding tree block syntax 1508 includes a transform unit syntax 1510. The transform unit syntax 1510 is invoked at each coding tree block syntax 1508 at the extreme end of the quadtree. The transform unit syntax 1510 describes information related to inverse orthogonal transformation and quantization.

The transform unit syntax 1510 can have a quadtree structure. Specifically, the transform unit syntax 1510 can be further recursively called as a syntax element of the transform unit syntax 1510. That is, one transform unit can be subdivided with a quadtree.

FIG. 16 illustrates a slice header syntax 1506 according to the present embodiment. The slice_bipred_intra_flag illustrated in FIG. 16 is a syntax element indicating, for example, the validity / invalidity of bidirectional intra prediction according to the present embodiment for the slice.

When slice_bipred_intra_flag is 0, the bidirectional intra according to the present embodiment in the slice is invalid. Therefore, the prediction selection unit 113 does not set a prediction mode including bidirectional intra prediction, and the prediction selection switch 111 does not connect the output terminal of the switch to the intra bidirectional prediction image generation unit 109. As an example of unidirectional intra prediction, prediction in which BipedFlag [] in FIGS. 7A and 7B, FIG. 11A and FIG. 11B is 0, or intra prediction defined in H.264 / MPEG-4 AVC may be performed. Absent.

As an example, when slice_bipred_intra_flag is 1, the bidirectional intra prediction according to the present embodiment is valid in the entire area of the slice.
Further, as another example, when slice_bipred_intra_flag is 1, in the syntax of a lower layer (coding tree block, prediction unit, etc.), the prediction validity / Invalidity may be specified.

The slice_directional_transform_intra_flag shown in FIG. 16 is a syntax element indicating, for example, the validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to this embodiment with respect to the slice.

When slice_directional_transform_intra_flag is 0, the discrete sine transform and the inverse discrete sine transform according to the present embodiment in the slice are invalid. Therefore, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx. As an example, when slice_directional_transform_intra_flag is 1, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are effective over the entire area in the slice.

As another example, when slice_directional_transform_intra_flag is 1, the discrete sine transform according to the present embodiment for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.) The validity / invalidity of the inverse discrete sine transform may be defined.

FIG. 17 illustrates the coding tree block syntax 1508 according to the present embodiment. Ctb_directional_transform_flag shown in FIG. 17 is a syntax element indicating whether the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid / invalid for the coding tree block. Also, pred_mode shown in FIG. 17 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction. ctb_directional_transform_flag is encoded only when the above-described slice_directional_transform_flag is 1 and the encoding type of the coding tree block is intra prediction. CBP indicates Coded_Block_Pattern information, and is information indicating whether or not there is a transform coefficient in the coding tree block. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so ctb_directional_transform_flag is not encoded.

When ctb_directional_transform_flag is 0, the discrete sine transform and the inverse discrete sine transform according to the present embodiment in the coding tree block are invalid. Therefore, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
On the other hand, when ctb_directional_transform_flag is 1, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid in the coding tree block.

As in the example of FIG. 17, in the coding tree block syntax 1508, when the flag that defines the validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment is encoded, compared to the case where this flag is not encoded. Thus, the information amount (code amount) increases. However, by encoding this flag, it is possible to perform optimal orthogonal transform for each local region (ie, coding tree block).

FIG. 18 illustrates a transform unit syntax 1510 according to this embodiment. A tu_directional_transform_flag shown in FIG. 18 is a syntax element indicating validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to this embodiment with respect to the transform unit. Further, pred_mode shown in FIG. 18 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction. tu_directional_transform_flag is encoded only when slice_directional_transform_flag is 1 and the encoding type of the coding tree block is intra prediction. coded_block_flag is 1-bit information indicating whether or not there is a transform coefficient in the transform unit. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so tu_directional_transform_flag is not encoded.

When tu_directional_transform_flag is 0, the discrete sine transform and the inverse discrete sine transform according to this embodiment in the transform unit are invalid. Therefore, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.

On the other hand, when tu_directional_transform_flag is 1, the discrete sine transform and inverse discrete sine transform according to the present embodiment are valid in the coding tree block.

As in the example of FIG. 18, in the transform unit syntax 1510, when the flag that defines the validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment is encoded, compared to the case where this flag is not encoded. Thus, the information amount (code amount) increases. However, by encoding this flag, it is possible to perform optimal orthogonal transform for each local region (that is, transform unit).

FIG. 19 shows an example of the prediction unit syntax. Pred_mode in FIG. 19 indicates the prediction type of the prediction unit. MODE_INTRA indicates that the prediction type is intra prediction. intra_split_flag is a flag indicating whether or not the prediction unit is further divided into four prediction units. When intra_split_flag is 1, a prediction unit is a prediction unit obtained by dividing the prediction unit into four in half in the vertical and horizontal sizes. When intra_split_flag is 0, the prediction unit is not divided.

Intra_luma_bipred_flag [i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is a unidirectional intra prediction mode or a bidirectional intra prediction mode. i indicates the position of the divided prediction unit. When the intra_split_flag is 0, 0 is set, and when the intra_split_flag is 1, 0 to 3 are set. In the flag, the value of IntraBipredFlag of the prediction unit shown in FIGS. 9, 12, 13A, and 13B is set.

When intra_luma_bipred_flag [i] is 1, this indicates that the prediction unit is bi-directional intra prediction, and is information that identifies the bi-directional intra prediction mode used from among the prepared bi-directional intra prediction modes. Intra_luma_bipred_mode [i] is encoded. intra_luma_bipred_mode [i] may be encoded with the isometric length according to the bidirectional intra prediction mode number IntraBipredNum shown in FIGS. 7A and 7B, FIG. 11A and FIG. 11B, or encoded using a predetermined code table. May be. When intra_luma_bipred_flag [i] is 0, it indicates that the prediction unit is unidirectional intra prediction, and predictive encoding is performed from adjacent blocks.

prev_intra_luma_unipred_idx [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. Details of the method of calculating MostProbableMode will be described later. When prev_intra_luma_unipred_idx [i] is not 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_luma_unipred_mode [i] that specifies which mode other than the MostProbableMode is the intra prediction mode IntraPredMode is encoded. . rem_intra_luma_unipred_mode [i] may be encoded with the same length according to the bidirectional intra prediction mode number IntraPredModeNum shown in FIGS. 7A and 7B, FIG. 11A and FIG. 11B, or may be encoded using a predetermined code table. May be. From intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using the following equation.

Here, numCand indicates the number of candidates for MostProbableMode, and candModeList [cIdx] indicates the MostProbableMode that is actually a candidate. Here, numCand is set to 2, and the candidate MostProbableMode is set to IntraPredMode of pixel blocks positioned on the upper and left sides that have already been predicted and are adjacent to the prediction target block. At this time, candModeList [0] is indicated as MPM_L0, and candModeList [1] is indicated as MPM_L1. When prev_intra_luma_unipred_idx [i] is 1, a prediction mode of MPM_L0 is derived, and when prev_intra_luma_unipred_idx [i] is 2, a prediction mode of MPM_L1 is derived. When the number of numCand is plural, candModeList [cIdx] may be in the same prediction mode. In this case, since the redundant information is included in the information to be encoded, the redundant prediction mode is omitted and expressed as shown in Expression (11). For the code table for actual encoding, an optimum code table may be created in consideration of the maximum number of these prediction modes.

When numCand is 1, MostProbableMode is calculated according to the following equation.

Min (x, y) is a parameter for outputting the smaller one of the inputs x and y. IntraPredModeA and IntraPredModeB indicate the intra prediction modes of prediction units adjacent to the left and above the encoded prediction unit.

Here, the case where numCand is 1 and 2 has been described, but it is also possible to easily increase the number of candidates by adding a predicted pixel block adjacent to the prediction target block. Also, the prediction unit syntax of FIG. 19 can be easily changed according to the number of candidates.

It should be noted that syntax elements not defined in this embodiment may be inserted between the lines of the syntax tables illustrated in FIGS. 16, 17, 18 and 19, and other conditional branch descriptions are included. It may be. Further, the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, the term of each illustrated syntax element can be changed arbitrarily.

As described above, the image coding apparatus according to the present embodiment, when using two reference pixel lines having different prediction directions, has a different prediction error tendency from either of the two prediction directions, and follows one prediction mode. When orthogonal transform is selected, the problem that the coding efficiency is lowered without taking advantage of the tendency of intra prediction in which the prediction accuracy is lowered as the distance between the reference pixel and the prediction pixel is increased is solved. This image encoding apparatus classifies the vertical direction and horizontal direction of each prediction mode into two classes according to the presence or absence of the above-described tendency, and adaptively performs 1D discrete cosine transform or 1D discrete sine transform for each of the vertical direction and horizontal direction. Apply. In the 1D discrete sine transform, when the 1D orthogonal transform is performed in the direction orthogonal to the line of the reference pixel group (vertical direction or horizontal direction), the coefficient density is higher than the 1D discrete cosine transform (that is, quantization) The ratio of the non-zero coefficient in the conversion coefficient 121 is reduced). Therefore, according to the image coding apparatus according to the present embodiment, high conversion efficiency is stably achieved as compared with a case where fixed orthogonal transformation such as DCT is uniformly applied to each prediction mode.

Further, the orthogonal transform unit 102 and the inverse orthogonal transform unit 105 according to the present embodiment are suitable for both hardware implementation and software implementation.
The above is the description of the image encoding device according to the first embodiment.

(Second Embodiment)
The image encoding device according to the second embodiment differs from the image encoding device according to the first embodiment described above in the details of the intra unidirectional prediction image generation unit 108 and the intra bidirectional prediction image generation unit 109. In the following description, the same parts as those in the first embodiment are denoted by the same reference numerals in the present embodiment, and different parts will be mainly described. A moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a fourth embodiment.

FIG. 20 is a block diagram of an image encoding apparatus 2000 according to the second embodiment of the present invention. The difference from FIG. 1 is that a prediction direction deriving unit 2001 is newly added and that prediction direction deriving information 2051 is output from the prediction direction deriving unit 2001 to the prediction selecting unit 113. Further, it is assumed that the intra unidirectional prediction image generation unit 108 and the intra bidirectional prediction image generation unit 109 are extended to 128 directions. Specifically, it means that the angle gradient of 180 degrees used for direction prediction is divided into 128 pieces, and a prediction direction is assigned every 1.4 degrees. However, the prediction modes of the unidirectional prediction described in the first embodiment are the same as those in FIGS. 7A and 7B and FIGS. 11A and 11B. Since the other configuration is the same as that of the first embodiment, description thereof is omitted. Here, intra prediction performed using the prediction direction deriving unit 2001 is referred to as a prediction direction derivation mode.

The reference image 124 output from the reference image memory 107 is input to the prediction direction deriving unit 2001. The prediction direction deriving unit 2001 has a function of analyzing the input reference image 124 and generating prediction direction derivation information 2051. The prediction direction deriving unit 2001 will be described with reference to FIG. As illustrated in FIG. 21, the prediction direction deriving unit 2001 includes a left reference pixel line edge deriving unit 2101, an upper reference pixel line edge deriving unit 2102, and a prediction direction deriving information generating unit 2103. The left reference pixel line edge deriving unit 2101 has a function of performing edge detection processing on a reference pixel line located to the left of the prediction target pixel block and deriving an edge direction. On the other hand, the upper reference pixel line edge deriving unit 2102 has a function of performing edge detection processing on a reference pixel line located above the prediction target pixel block and deriving an edge direction.

FIG. 22 shows an example of pixels used for the derivation of the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102. The left reference pixel line edge deriving unit 2101 uses two lines indicated by diagonal lines from the upper right to the lower left located on the left side of the prediction target pixel block. The upper reference pixel line edge deriving unit 2102 uses two lines indicated by diagonal lines from the upper left to the lower right located above the prediction target pixel block. In the present embodiment of the present invention, two lines have been described, but one line or three lines may be used, and more lines may be used. In the drawing, an example in which the edge direction is derived using the left reference pixel line is shown as A, and an example in which the edge direction is derived using the upper reference pixel line is shown as B.

As a specific example of edge detection, both deriving

units

2101 and 2102 perform edge intensity detection using an operator as shown in the following equation (13).

Here, Gx represents the edge strength in the horizontal direction (x coordinate system), and Gy represents the edge strength in the vertical direction (y coordinate system). Further, any operator such as a Sobel operator, a Prewitt operator, or a Kirsch operator may be used as the edge detection operator.

When the operator of Equation (13) is applied to the reference pixel line, an edge direction vector is derived for each pixel. The following formula (14) is used to derive the optimum edge direction from these edge vectors.

Here, <a, b> represents the inner product of two vectors. S (θ) represents the sum of squares of edge strength (direction vector) (i = 1, 2,..., N) on the unit vector. The unit vector and edge strength (direction vector) are respectively expressed by the following equations.

Therefore, by optimizing the equation (13), the representative edge angle represented by the following equation can be calculated.

Next, the prediction direction derivation information generation unit 2103 in this embodiment of the present invention will be described. The edge strength of the following expression of each pixel line derived by the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102 is input to the prediction direction deriving information generating unit 2103.

Here, the prediction direction derivation information generation unit 2103 calculates Equations (14) and (15) using all the input edge strengths, and derives a unidirectional representative edge angle. On the other hand, both the expression edge (14) and the expression (15) are separately calculated with respect to the two representative edge angles of the left reference pixel line and the upper reference pixel line, and both have two representative edge angles. The direction representative edge angle is derived. Furthermore, the prediction direction derivation information generation unit 2103 derives a plurality of peripheral unidirectional representative edge angles that are angularly adjacent, starting from the unidirectional representative edge angle. For example, it is assumed that there are 128 types of edge angles and they are arranged in the order of angles. At this time, when the representative edge angle is RDM, the peripheral unidirectional representative edge angles are expressed as RDM-1, RDM + 1, RDM-2, RDM + 2,. In this embodiment, the peripheral unidirectional representative edge angle is 10 (for example, ± 5).

Next, the prediction direction derivation information 2051 will be described. In the prediction direction derivation information 2051, the unidirectional representative edge angle, the bidirectional representative edge angle, and the peripheral unidirectional representative edge angle are indirectly or directly related as the prediction mode. FIG. 23 shows an example of the prediction direction derivation information 2051. In FIG. 23, the unidirectional representative edge angle is indicated by RDM, the representative edge angle of the left reference pixel line of the bidirectional representative edge angle is indicated by RDM_L0, and the representative edge angle of the upper reference pixel line of the bidirectional representative edge angle is It is indicated by RDM_L1. RDMPredMode indicates the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention. RDMBipredFlag indicates whether the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention is bidirectional prediction. RDMPredAngleIdL0 and RDMPredAngleIdL1 indicate which prediction angle the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention indicates. Note that the prediction direction derivation information 2051 includes two representative edge angles included in the unidirectional representative edge angle and the bidirectional representative edge angle, in addition to FIG. The relationship between these prediction modes and TransformIdx will be described later.

The prediction direction derivation information 2051 generated by the prediction direction derivation unit 2001 is input to the prediction selection unit 113. A prediction image 125 is generated by the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 according to the prediction mode selected here. These predicted image generation units are the same as those in the first embodiment except that the number of predicted angles is expanded to 128. For example, when RDMPredMode is 1, the first predicted image signal 851 and the second predicted image signal 852 in FIG. 8 are generated at two unidirectional representative edge angles included in the bidirectional representative edge angle, and the weighted average An average process is performed in the unit 801, and the predicted image 125 is output.

<Effect of prediction direction deriving unit>
In addition to the first embodiment of the present invention, the prediction direction deriving unit 2001 is added to change the configuration, so that the prediction direction, which is the conventional 33 directions, is expanded to 128 directions, and further based on the reference pixel line. By performing edge detection, it is not necessary to set all 128 prediction modes, and a highly accurate prediction angle can be selected. Here, since the angle that can be derived in advance is determined to be 128 directions, the prediction direction can be selected from the representative angle calculated by Expression (15). For example, by dividing 180 degrees into 128, quantizing the representative angle calculated by Equation (15), and mapping to a maximum 128 prediction directions, a more accurate prediction direction can be used. The additional mode information required in the present embodiment is the 12 types shown in FIG. 23, and the overhead when encoding the prediction mode is significantly larger than when 128 types of prediction modes are simply added. Can be reduced.

<Orthogonal transformation when a prediction direction deriving unit is added>
In the second embodiment in which the prediction direction deriving unit 2001 is added, there is a problem that the reference pixel line that is actually used in the prediction cannot be specified unless the edge of the reference pixel line is detected and the unidirectional representative edge angle is derived. . Therefore, in the second embodiment, TransformIdx is selected in units of coding tree blocks (or the first prediction unit included in the coding tree block). For example, assume that the N × N pixel block 0 shown in FIG. 2C is a prediction unit, and that a 2N × 2N pixel block is a coding tree block. Since the syntax encoding is performed in units of coding tree blocks, the reference pixel lines used in the prediction unit other than the prediction unit 0 are uncoded, so which reference pixel line is used. It cannot be specified. Since the 0th prediction unit corresponding to the first pixel block of the coding tree block has already been encoded for the upper and left pixels, TransformIdx is selected using the information of the reference pixel line derived here. The In FIG. 23, TransformIdx [0] for following the TransformIdx of the head prediction unit is specified regardless of each prediction mode.

Note that TransformIdx when the prediction mode described in FIG. 23 is selected in advance may be determined. For example, it is possible to easily set TransformIdx to 0 in order to reduce the process of deriving the predicted angle.

<Syntax>
FIG. 24 illustrates a slice header syntax 1506 according to this embodiment. The slice_derived_direction_intra_flag illustrated in FIG. 24 is a syntax element indicating, for example, the validity / invalidity of the prediction direction deriving method according to the present embodiment for the slice.

When slice_derived_direction_intra_flag is 0, the prediction direction derivation mode according to the present embodiment in the slice is invalid. Therefore, the prediction selection unit 113 does not set the prediction mode including the prediction direction derivation information 2051, and the prediction selection switch 111 connects the output terminals of the switches according to the first embodiment of the present invention. As an example, when slice_derived_direction_intra_flag is 1, the prediction direction derivation mode according to the present embodiment is valid over the entire area in the slice.

As another example, when the prediction direction derivation method is 1, the prediction according to the present embodiment is performed for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.). The validity / invalidity of the direction derivation mode may be defined.

FIG. 25A shows an example of the prediction unit syntax. intra_derived_direction_flag [i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is the first embodiment shown in FIGS. 11A and 11B or the second embodiment shown in FIG. It is. i indicates the position of the divided prediction unit. When the intra_split_flag is 0, 0 is set, and when the intra_split_flag is 1, 0 to 3 are set.

When intra_derived_direction_flag [i] is 1, it indicates that the prediction unit uses the prediction direction derivation mode shown in FIG. In this case, intra_direction_mode [i], which is information identifying the used intra prediction mode, is encoded from among the prepared prediction modes. As shown in FIG. 23, in this mode, a bidirectional intra prediction mode and a unidirectional intra prediction are mixedly expressed. These prediction modes can be expressed as separate syntax elements and encoded. In the second embodiment of the present invention, intra_direction_mode [i] is described as an example corresponding to RDMPredMode, but the prediction mode expression method may be changed based on RDMPredAngleIdL0. For example, when bi-directional intra prediction corresponding to 1 of RDMPredMode is omitted, the prediction mode can also be expressed by dividing it into two types of syntax elements: a 1-bit flag representing a code and an index representing a change. It becomes. In this case, a new flag indicating whether or not bidirectional intra prediction is used may be prepared. Also, intra_direction_mode [i] may be encoded with the same length according to the number of prediction modes, or may be encoded using a predetermined code table. When intra_direction_mode [i] is 0, it indicates that the prediction unit does not use the prediction direction derivation method according to the present embodiment of the present invention, and is encoded according to the method described in the first embodiment.

FIG. 25B shows an example of the prediction unit syntax as another embodiment of the present invention. In FIG. 25B, intra_direction_mode [i] is expressed by being divided into prev_intra_direction_mode [i] and rem_intra_direction_mode [i]. These syntax elements introduce the prediction between prediction modes similarly to Formula (11) or (12). prev_intra_direction_mode [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. When prev_intra_direction_mode [i] is 1, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_direction_mode [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_direction_mode [i] that specifies which mode other than the MostProbableMode is the intra prediction mode IntraPredMode is encoded. . The rem_intra_direction_mode [i] may be encoded with an equal length according to the number of prediction modes, or may be encoded using a predetermined code table.

FIG. 25C shows an example of the prediction unit syntax as another embodiment of the present invention. In this figure, PredMode described in the first embodiment and PredMode described in the second embodiment are integrated and expressed as one PredMode table. PredMode in this case is shown in FIGS. 26A, 26B, and 26C. These syntax elements introduce the prediction between prediction modes similarly to Formula (11) or (12). prev_intra_luma_unipred_idx [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. When prev_intra_luma_unipred_idx [i] is 1, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_luma_unipred_mode [i] that specifies which mode other than the MostProbableMode is the intra prediction mode IntraPredMode is encoded. . The rem_intra_luma_unipred_mode [i] may be encoded with the same length according to the number of prediction modes, or may be encoded using a predetermined code table.
The above is the detailed description of the image coding apparatus 2000 according to the second embodiment of the present invention.

(Third embodiment)
The third embodiment relates to a moving picture decoding apparatus for decoding encoded data encoded by the first moving picture decoding apparatus. That is, the decoding apparatus according to the present embodiment decodes encoded data generated by, for example, the image encoding apparatus according to the first embodiment.

As shown in FIG. 27, the moving picture decoding apparatus 2700 according to this embodiment includes an input buffer 2701, an entropy decoding unit 2702, an inverse quantization unit 2703, an inverse orthogonal transform unit 2704, an addition unit 2705, and a reference image memory 2706. , An intra unidirectional prediction image generation unit 2707, an intra bidirectional prediction image generation unit 2708, an inter prediction image generation unit 2709, a prediction selection switch 2710, a conversion information setting unit 2711, an output buffer 2712, and a prediction selection unit 2714.

27 decodes the encoded data 2725 stored in the input buffer 2701, stores the decoded image 2724 in the output buffer 2712, and outputs it as an output image. The encoded data 2725 is output from, for example, the image encoding device of FIG. 1 and the like, and is temporarily stored in the input buffer 2701 through a storage system or a transmission system (not shown).

The entropy decoding unit 2702 performs decoding based on the syntax for each frame or field for decoding the encoded data 2725. The entropy decoding unit 2702 sequentially entropy-decodes the code string of each syntax, and reproduces the encoding parameters of the encoding target block such as prediction information 2721 including the prediction mode information and the quantized transform coefficient (sequence) 2715. The coding parameters are all parameters necessary for decoding such as prediction information 2721, information on transform coefficients, information on quantization, and the like.

The inverse quantization unit 2703 performs inverse quantization on the quantized transform coefficient 2715 from the entropy decoding unit 2702 to obtain a restored transform coefficient 2716. Specifically, the inverse quantization unit 2703 performs inverse quantization according to the information regarding the quantization decoded by the entropy decoding unit 2702 (the quantization step width derived from the quantization information into the quantization transform coefficient 2715). Multiply). The inverse quantization unit 2703 inputs the restored transform coefficient 2716 to the inverse orthogonal transform unit 2704.

The inverse orthogonal transform unit 2704 performs inverse orthogonal transform corresponding to the orthogonal transform performed on the encoding side on the reconstructed transform coefficient 2716 from the inverse quantization unit 2703, and reconstructed prediction error (also referred to as a prediction difference signal). 2717 is obtained. The inverse orthogonal transform unit 2704 inputs the restoration prediction error 2717 to the addition unit 2705.

The addition unit 2705 adds the restored prediction error 2717 and the corresponding predicted image 2722 to generate a decoded image 2718. The decoded image 2718 is input to the reference image memory 2706. The decoded image 2718 is then temporarily stored in the output buffer 2712 for the output image. The decoded image 2718 temporarily stored in the output buffer 2712 is output to a display device system such as a display or a monitor (not shown) or a video device system according to the output timing managed by the decoding control unit 2713. The decoded image signal 2718 stored in the reference image memory 2706 is used as a reference image 2719 by the intra unidirectional prediction image generation unit 2707, the intra bidirectional prediction image generation unit 2708, and the inter prediction image generation unit 2709 as a frame unit or as necessary. Referenced by field.

Inverse quantization unit 2703, inverse orthogonal transform unit 2704, addition unit 2705, reference image memory 2706, intra unidirectional prediction image generation unit 2707, intra bidirectional prediction image generation unit 2708, inter prediction image generation unit 2709, conversion information setting unit 2711 and the selection switch 2710 are the inverse quantization unit 104, the inverse orthogonal transform unit 105, the addition unit 106, the reference image memory 107, the intra unidirectional prediction image generation unit 108, the intra bidirectional prediction image generation unit 109, the inter The predicted image generation unit 110, the conversion information setting unit 112, and the selection switch 111 are substantially the same or similar elements.

<Intra Unidirectional Prediction Image Generation Unit 2707 (108)>
The intra unidirectional predicted image generation unit 2707 (108) performs unidirectional intra prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). For example, in H.264 / MPEG-4 AVC, pixel interpolation (copying or interpolation) is performed along a prediction direction such as a vertical direction or a horizontal direction using a decoded reference pixel line spatially adjacent to a prediction target block. The intra prediction image is generated by performing processing (referring to filter processing or the like). FIG. 4A shows a prediction direction of intra prediction in H.264 / MPEG-4 AVC. FIG. 4B shows an arrangement relationship between reference pixel lines and encoding target pixels in H.264 / MPEG-4 AVC. FIG. 4C shows a predicted image generation method in mode 1 (horizontal prediction), in which pixels I to L are copied in the prediction direction from the left reference pixel line. FIG. 4D shows a predicted image generation method in mode 4 (diagonal lower right prediction). It is also possible to easily expand the prediction direction of H.264 / MPEG-4 AVC and increase the number of prediction modes. For example, the prediction mode is expanded to 34, the pixel position accuracy corresponding to the fractional position is determined as 32 pixel accuracy, and the prediction pixel value is created by performing linear interpolation (such as 3-tap filter processing). It is also possible. FIG. 5 shows an example of the prediction angle and the prediction mode when the prediction mode is expanded up to 34 prediction modes. In FIG. 5, there are 33 different prediction directions for the vertical and horizontal coordinates indicated by the bold lines. In addition, the direction of a typical prediction angle indicated by H.264 / MPEG-4 AVC is indicated by an arrow. In the present embodiment of the present invention, 33 types of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a circle. Similarly to H.264 / MPEG-4 AVC, DC prediction for prediction based on the average value of available reference pixels is added, and there are a total of 34 prediction modes.

When IntraPredMode = 4, since IntraPredAngleIdL0 in FIGS. 7A and 7B is −4, a prediction image 2722 (125) is generated in the prediction direction indicated by IntraPredMode = 4 in FIG. An arrow indicated by a dotted line in FIG. 5 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.

FIG. 6 shows the relationship between IntraPredAngleIdLX and intraPredAngle used for predictive image value generation. 7A and 7B show the relationship between the prediction mode (PredMode), the bidirectional prediction flag (BipredFlag), the prediction mode type (PredTypeL0, PredTypeL1), and the prediction angle (PredAngleIdL0, PredAngleIdL1).

IntraPredAngle indicates the prediction angle that is actually used when the predicted value is generated. For example, when a prediction value generation method in the case where the prediction type is Intra_Vertical and the intraPredAngle shown in FIGS. 7A and 7B is a positive value is expressed by a mathematical expression, it is expressed by the mathematical expression (1). Here, BLK_SIZE indicates the size of the pixel block, and ref [] indicates an array in which a reference image (also referred to as a reference pixel line) is stored. Moreover, pred (k, m) indicates the generated predicted image 125.

<Intra bidirectional prediction image generation unit 109>
The intra bidirectional prediction image generation unit 2708 (109) performs bidirectional intra prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). For example, in Non-Patent Document 1 described above, after selecting two types of prediction modes from nine types of prediction modes defined in H.264 / MPEG-4 AVC and generating respective prediction image signals, The predicted image signal is generated by performing the filtering process every time.

The intra bidirectional prediction image generation unit 109 (2708) illustrated in FIG. 8 holds a weighted average unit 801, a first unidirectional intra prediction image generation unit 802, and a second unidirectional intra prediction image generation unit 803. . The functions of the first unidirectional intra predicted image generation unit 802 and the second unidirectional intra predicted image generation unit 803 are the same. These may be the same as the intra unidirectional predicted image generation unit 108. In this case, since the three processing units can have the same hardware configuration, the circuit scale can be reduced.

The first prediction image 851 is output from the first unidirectional intra prediction image generation unit 802, and the second prediction image 852 is output from the second unidirectional intra prediction image generation unit 803. Each predicted image is input to the weighted average unit 801, and a weighted average process is performed. In the weighted average unit 801, calculation based on Expression (2) is performed. When the first predicted image 851 is P1 [x, y] and the second predicted image is P2 [x, y], the bidirectionally predicted image P [x, y] is expressed by Equation (2).

7A and 7B exemplify the combination of two prediction modes corresponding to bidirectional intra prediction as a fixed combination, but a special prediction using a prediction mode held by a decoded adjacent pixel block Modes may be added. In this case, since a combination of two prediction modes having a close spatial distance can be selected, a combination of bidirectional intra prediction that matches the characteristics of the image can be realized without depending on the number of fixed prediction modes. It is also possible to reduce the number of fixed prediction modes. Further, in the present embodiment of the present invention, an example in which the number of bidirectional intra prediction modes is 16 is shown, but the increase / decrease in the number of prediction modes can be easily changed. Increasing the number of prediction modes increases prediction accuracy, while increasing the overhead for decoding the prediction mode. The optimum number of prediction modes may be set in consideration of the balance between the number of prediction modes and coding efficiency.
The above is the description of the intra bidirectional predicted image generation unit 2708 (109).

27. The inter prediction image generation unit 2709 (110) in FIG. 27 performs inter prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). Specifically, the inter prediction unit 2709 (110) performs interpolating processing (motion compensation) based on the amount of motion shift (motion vector) between the prediction target block and the reference image 2719 (124) to perform inter prediction. Generate an image. In H.264 / MPEG-4 AVC, interpolation processing up to ¼ pixel accuracy is possible. The derived motion vector is decoded by the entropy decoding unit 2702 as a part of the prediction information 2721 (126).

The prediction selection switch 2710 (111) is an output terminal of the intra unidirectional prediction image generation unit 2707 (108), an output terminal of the intra bidirectional prediction image generation unit 2708 (109), or the inter prediction image generation unit 2709 (110). The output end is selected according to the prediction information 2721 (126) output from the entropy decoding unit 2702, and the intra prediction image or the inter prediction image is input to the addition unit 2705 (106) as the prediction image 2722 (125). When the prediction information 2721 (126) suggests intra prediction, the prediction selection switch 2710 (111) causes the intra unidirectional prediction image generation unit 2707 (108) or both intra according to the prediction mode shown in FIGS. 7A and 7B. A switch is connected to the output terminal from the direction prediction image generation unit 2708 (109). On the other hand, when the prediction information 2721 (126) suggests inter prediction, the prediction selection switch 2710 (111) connects the switch to the output terminal from the inter prediction unit 2709 (110).

The prediction selection unit 2714 controls the output terminal of the switch to the prediction selection switch 2710 based on the prediction information 2721 sent from the entropy decoding unit 2702. As described above, intra prediction or inter prediction can be selected to generate the predicted image 2722, but a plurality of modes are defined for each of intra prediction and inter prediction. One prediction mode is input as prediction information 2721 from these. In the case of intra bidirectional prediction, substantially two prediction modes are selected as shown in FIGS. 11A and 11B.

<Inverse orthogonal transform unit 2704 (105)>
Hereinafter, the details of the inverse orthogonal transform unit 2704 (105) according to the present embodiment will be described with reference to FIG.
The inverse orthogonal transform unit 2704 (105) includes a selection switch A 1201, a vertical inverse transform unit 1206, a transposition unit 1204, a selection switch B 1205, and a horizontal inverse transform unit 1207. The vertical inverse transform unit 1206 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. The horizontal inverse transform unit 1207 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. Note that the order of the vertical inverse transform unit 1206 and the horizontal inverse transform unit 1207 is an example, and these may be reversed.

The two 1D inverse discrete cosine transform units 1202 shown in FIG. 12 can also be realized by using physically identical hardware in a time division manner. The same applies to the 1D inverse discrete sine transform unit 1203.

The selection switch A 1201 converts the restored transform coefficient 121 of the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 1252 output from the 1D transform setting unit 1208 Lead to one of them. The 1D inverse discrete cosine transform unit 1202 multiplies the input restoration transform coefficient 121 (matrix format) by a transposed matrix of the discrete cosine transform matrix and outputs the result. The 1D inverse discrete sine transform unit 1203 multiplies the input restoration transform coefficient 121 by the transposed matrix of the discrete sine transform matrix and outputs the result. Specifically, the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 (that is, the vertical inverse transform unit 1206) perform one-dimensional inverse orthogonal transform represented by Expression (9).

The transposition unit 1204 transposes the output matrix (Y ′) of the vertical inverse transform unit 1206 and supplies the transposition to the selection switch B 1205. However, the transposition unit 1204 is an example, and corresponding hardware may not necessarily be prepared. For example, when 1D inverse orthogonal transform by the vertical inverse transform unit 1206 is executed (each element of the output matrix of the vertical inverse transform unit 1206) is held, and 1D inverse orthogonal transform is performed by the horizontal inverse transform unit 1207 If read in an appropriate order, transposition of the output matrix (Y ′) can be executed without preparing hardware corresponding to the transposition unit 1204.

The selection switch B 1205 converts the input matrix from the transposition unit 1204 into a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform according to a horizontal transformation index (Horizontal_transform_idx) included in the 1D transformation index 1252 output from the 1D transformation setting unit 1208. Lead to one of the sections 1203. The 1D inverse discrete cosine transform unit 1202 performs 1D inverse discrete cosine transform on the input matrix and outputs the result. The 1D inverse discrete sine transform unit 1203 performs 1D inverse discrete sine transform on the input matrix and outputs the result. Specifically, the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 (that is, the horizontal inverse transform unit 1207) perform one-dimensional inverse orthogonal transform represented by Expression (10).

As described above, the inverse orthogonal transform unit 105 performs inverse orthogonal transform on the reconstructed transform coefficient (matrix) 2716 (121) according to the input orthogonal transform information 2723 (127), and reconstructed prediction error (matrix) 2717. (122) is generated. Note that the 1D inverse discrete cosine transform unit 1202 may be replaced with an inverse discrete cosine transform of H.264 / MPEG-4 AVC by reusing the existing inverse orthogonal transform. Further, the inverse orthogonal transform unit 2704 (105) may realize various inverse orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform. In any case, an inverse orthogonal transform corresponding to the orthogonal transform unit 102 of the image encoding device 100 shown in FIG. 1 may be selected.

Details of the 1D conversion setting unit 1208 in FIG. 12 will be described below. The 1D conversion setting unit 1208, based on the orthogonal conversion information 2723 (127), a 1D conversion index for selecting a conversion matrix used for vertical orthogonal conversion and vertical inverse orthogonal conversion, horizontal orthogonal conversion and horizontal reverse It has a function of setting a 1D transform index for selecting a transform matrix used for orthogonal transform. The 1D transform index 1252 directly or indirectly indicates the orthogonal transform selected by the vertical orthogonal transform and the horizontal orthogonal transform, respectively.

For example, the 1D transform index 1252 can be expressed by a transform index (TransformIdx) shown in FIG. 13A and a 1D orthogonal transform in the vertical or horizontal direction (Vertical_transform_idx and Horizontal_transform_idx, respectively). With reference to the table in FIG. 13A, a 1D transformation index (Vertical_transform_idx) for a vertical transformation unit and a 1D transformation index (Horizontal_transform_idx) for a horizontal transformation unit can be derived from the transformation index. Further, FIG. 13B shows whether each idx indicates discrete cosine transform or discrete sine transform. When idx is “1”, it indicates a discrete sine transform matrix (DST), and when it is “0”, it indicates a discrete cosine transform matrix (DCT). In FIG. 13B, for example, the corresponding 1D transform index 1252 is referenced based on TransformIdx included in the orthogonal transform information 127, and Vertical_transform_idx is output to the selection switch A and Horizontal_transform_idx is output to the selection switch B. For example, when Vertical_transform_idx or H Horizontal_transform_idx indicates DCT with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete cosine transform unit 1202 (or 1D discrete cosine transform unit 902). When Vertical_transform_idx or Horizontal_transform_idx indicates DST with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete sine transform unit 1203 (or 1D discrete sine transform unit 903).

<Syntax configuration>
Hereinafter, the syntax used by the video decoding device 2700 in FIG. 27 will be described.
The syntax indicates the structure of encoded data (for example, encoded data 2725 in FIG. 27) when the image encoding apparatus encodes moving image data. When decoding the encoded data, the moving picture decoding apparatus interprets the syntax with reference to the same syntax structure. FIG. 15 illustrates a syntax 1500 used by the video decoding apparatus in FIG.

When slice_bipred_intra_flag is 0, the bidirectional intra according to the present embodiment in the slice is invalid. Therefore, the prediction selection switch 2710 (111) does not connect the output terminal of the switch to the intra bidirectional prediction image generation unit 2708 (109). As an example of unidirectional intra prediction, prediction in which BipedFlag [] in FIGS. 7A and 7B, FIG. 11A and FIG. 11B is 0, or intra prediction defined in H.264 / MPEG-4 AVC may be performed. Absent.

As an example, when slice_bipred_intra_flag is 1, the bidirectional intra prediction according to the present embodiment is valid in the entire area of the slice.
Further, as another example, when slice_bipred_intra_flag is 1, in the syntax of a lower layer (coding tree block, transform unit, etc.), the prediction validity / Invalidity may be specified.

The slice_directional_transform_intra_flag shown in FIG. 16 is a syntax element indicating the validity / invalidity of the inverse discrete sine transform according to the present embodiment with respect to the slice, for example.

When slice_directional_transform_intra_flag is 0, the inverse discrete sine transform according to this embodiment in the slice is invalid. Therefore, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx. As an example, when slice_directional_transform_intra_flag is 1, the inverse discrete sine transform according to the present embodiment is valid over the entire area in the slice.

As another example, when slice_directional_transform_intra_flag is 1, the inverse discrete sine transform according to the present embodiment for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.) The validity / invalidity may be defined.

FIG. 17 illustrates the coding tree block syntax 1508 according to the present embodiment. Ctb_directional_transform_flag shown in FIG. 17 is a syntax element indicating validity / invalidity of the inverse discrete sine transform according to this embodiment with respect to the coding tree block. Also, pred_mode shown in FIG. 17 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction. ctb_directional_transform_flag is encoded only when the above-described slice_directional_transform_flag is 1 and the encoding type of the coding tree block is intra prediction. CBP indicates Coded_Block_Pattern information, and is information indicating whether or not there is a transform coefficient in the coding tree block. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so ctb_directional_transform_flag is not decoded.

When ctb_directional_transform_flag is 0, the inverse discrete sine transform according to the present embodiment in the coding tree block is invalid. Therefore, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
On the other hand, when ctb_directional_transform_flag is 1, the inverse discrete sine transform according to the present embodiment is valid in the coding tree block.

As in the example of FIG. 17, the coding tree block syntax 1508 encodes a flag that defines the validity / invalidity of the inverse discrete sine transform according to the present embodiment, so that each local region (ie, coding tree block) is encoded. Optimal inverse orthogonal transform can be performed.

FIG. 18 illustrates a transform unit syntax 1510 according to this embodiment. The tu_directional_transform_flag shown in FIG. 18 is a syntax element indicating validity / invalidity of the inverse discrete sine transform according to this embodiment with respect to the transform unit. Further, pred_mode shown in FIG. 18 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction. tu_directional_transform_flag is encoded only when slice_directional_transform_flag is 1 and the encoding type of the coding tree block is intra prediction. coded_block_flag is 1-bit information indicating whether or not there is a transform coefficient in the transform unit. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so tu_directional_transform_flag is not decoded.

When tu_directional_transform_flag is 0, the inverse discrete sine transform according to the present embodiment in the transform unit is invalid. Therefore, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D

conversion setting units

908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
On the other hand, when tu_directional_transform_flag is 1, the inverse discrete sine transform according to the present embodiment is valid in the coding tree block.

As in the example of FIG. 18, in the transform unit syntax 1510, when the flag that defines the validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment is decoded, each local region (ie, transform unit) is decoded. It is possible to perform inverse orthogonal transform optimal for the above.

When intra_luma_bipred_flag [i] is 1, this indicates that the prediction unit is bi-directional intra prediction, and is information that identifies the bi-directional intra prediction mode used from among the prepared bi-directional intra prediction modes. Decrypt intra_luma_bipred_mode [i]. intra_luma_bipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraBipredNum shown in FIGS. 7A and 7B, FIG. 11A and FIG. 11B, or may be decoded using a predetermined code table. good. When intra_luma_bipred_flag [i] is 0, it indicates that the prediction unit is unidirectional intra prediction, and predictive decoding is performed from adjacent blocks.

Prev_intra_luma_unipred_idx [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. Details of the calculation method of MostProbableMode will be described later. When prev_intra_luma_unipred_idx [i] is not 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_luma_unipred_mode [i] that specifies which intra prediction mode IntraPredMode is other than MostProbableMode is decoded. rem_intra_luma_unipred_mode [i] may be decoded in equal length according to the bidirectional intra prediction mode number IntraPredModeNum shown in FIGS. 7A and 7B, FIG. 11A and FIG. 11B, or may be decoded using a predetermined code table. good. From intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode [i] is calculated using Equation (11).

Here, numCand indicates the number of candidates for MostProbableMode, and candModeList [cIdx] indicates the MostProbableMode that is actually a candidate. Here, numCand is set to 2, and the candidate MostProbableMode is set to IntraPredMode of pixel blocks positioned on the upper and left sides that have already been predicted and are adjacent to the prediction target block. At this time, candModeList [0] is indicated as MPM_L0, and candModeList [1] is indicated as MPM_L1. When prev_intra_luma_unipred_idx [i] is 1, a prediction mode of MPM_L0 is derived, and when prev_intra_luma_unipred_idx [i] is 2, a prediction mode of MPM_L1 is derived. When the number of numCand is plural, candModeList [cIdx] may be in the same prediction mode. In this case, since the redundant information is included in the information to be encoded, the redundant prediction mode is omitted and expressed as shown in Expression (11). For the code table for actual encoding, an optimum code table may be created in consideration of the maximum number of these prediction modes.
When numCand is 1, MostProbableMode is calculated according to Equation (12).

Min (x, y) is a parameter that outputs the smaller of input x and y. IntraPredModeA and IntraPredModeB indicate the intra prediction modes of prediction units adjacent to the left and above the encoded prediction unit.

As described above, the image coding apparatus according to the present embodiment, when using two reference pixel lines having different prediction directions, has a different prediction error tendency from either of the two prediction directions, and follows one prediction mode. When orthogonal transform is selected, the problem that the coding efficiency is lowered without taking advantage of the tendency of intra prediction in which the prediction accuracy is lowered as the distance between the reference pixel and the prediction pixel is increased is solved. This image encoding apparatus classifies the vertical direction and horizontal direction of each prediction mode into two classes according to the presence or absence of the above-described tendency, and adaptively performs 1D discrete cosine transform or 1D discrete sine transform for each of the vertical direction and horizontal direction. Apply. In the 1D discrete sine transform, when the 1D orthogonal transform is performed in a direction orthogonal to the line of the reference pixel group (vertical direction or horizontal direction), the coefficient density is higher than that of the 1D discrete cosine transform. Therefore, according to the image decoding apparatus according to the present embodiment, high conversion efficiency is stably achieved as compared with a case where fixed orthogonal transformation such as DCT is uniformly applied to each prediction mode.

Further, the inverse orthogonal transform unit 105 according to the present embodiment is suitable for both hardware implementation and software implementation.
The above is the description of the image encoding device according to the third embodiment.

(Fourth embodiment)
The fourth embodiment relates to a moving picture decoding apparatus for decoding encoded data encoded by a second moving picture decoding apparatus. That is, the moving picture decoding apparatus according to the present embodiment decodes encoded data generated by, for example, the image encoding apparatus according to the second embodiment.

The moving picture decoding apparatus according to the fourth embodiment includes the moving picture decoding apparatus according to the third embodiment described above, an intra unidirectional prediction image generation unit 2707 (108), and an intra bidirectional prediction image generation unit 2708 ( 109) in detail. In the following description, in this embodiment, the same parts as those in the third embodiment are denoted by the same reference numerals, and different parts will be mainly described.

FIG. 28 is a block diagram of a moving image decoding apparatus 2800 according to the fourth embodiment of the present invention. The difference from FIG. 27 is that a prediction direction deriving unit 2801 (2001) is newly added and prediction direction deriving information 2851 (2051) from the prediction direction deriving unit 2801 (2001) to the prediction selection switch 2710 (111). This is the output point. In addition, it is assumed that the intra unidirectional prediction image generation unit 2707 (108) and the intra bidirectional prediction image generation unit 2708 (109) are extended to 128 directions. Specifically, it means that the angle gradient of 180 degrees used for direction prediction is divided into 128 pieces, and a prediction direction is assigned every 1.4 degrees. However, the prediction modes of the unidirectional prediction described in the third embodiment are the same as those in FIGS. 7A and 7B and FIGS. 11A and 11B. Since the other configuration is the same as that of the first embodiment, description thereof is omitted. Here, intra prediction performed using the prediction direction deriving unit 2801 (2001) is referred to as a prediction direction deriving mode.

The reference image 2719 (124) output from the reference image memory 2706 (107) is input to the prediction direction deriving unit 2801 (2001). The prediction direction deriving unit 2801 (2001) has a function of analyzing the input reference image 2719 (124) and generating prediction direction deriving information 2851 (2051). The prediction direction deriving unit 2801 (2001) will be described with reference to FIG. As shown in FIG. 21, the prediction direction deriving unit 2801 (2001) includes a left reference pixel line edge deriving unit 2101, an upper reference pixel line edge deriving unit 2102, and a prediction direction deriving information generating unit 2103. The left reference pixel line edge deriving unit 2101 has a function of performing edge detection processing on a reference pixel line located to the left of the prediction target pixel block and deriving an edge direction. On the other hand, the upper reference pixel line edge deriving unit 2102 has a function of performing edge detection processing on a reference pixel line located above the prediction target pixel block and deriving an edge direction.

As a specific example of edge detection, both processing units perform edge intensity detection using an operator as shown in Equation (13). Here, Gx indicates the edge strength in the horizontal direction (x coordinate system), and indicates the edge strength in Gy (y coordinate system). Further, any operator such as a Sobel operator, a Prewitt operator, or a Kirsch operator may be used as the edge detection operator.

When the operator of Equation (13) is applied to the reference pixel line, an edge direction vector is derived for each pixel. Equation (14) is used to derive the optimum edge direction from these edge vectors. Here, <a, b> represents the inner product of two vectors. S (θ) represents the sum of squares of edge strength (direction vector) (i = 1, 2,..., N) on the unit vector. The unit vector and edge strength (direction vector) are respectively expressed by the following equations.

Therefore, the representative edge angle can be calculated by optimizing Equation (13) using Equation (15).

Next, the prediction direction derivation information 2851 (2051) will be described. In the prediction direction derivation information 2851 (2051), the unidirectional representative edge angle, the bidirectional representative edge angle, and the peripheral unidirectional representative edge angle are related indirectly or directly as the prediction mode. FIG. 23 shows an example of the prediction direction derivation information 2851 (2051). In this figure, the unidirectional representative edge angle is indicated by RDM, the representative edge angle of the left reference pixel line of the bidirectional representative edge angle is indicated by RDM_L0, and the representative edge angle of the upper reference pixel line of the bidirectional representative edge angle is It is indicated by RDM_L1. RDMPredMode indicates the prediction mode derived by the prediction direction deriving unit 2801 (2001) in the present embodiment of the present invention. RDMBipredFlag indicates whether the prediction mode derived by the prediction direction deriving unit 2801 (2001) in the present embodiment of the present invention is bidirectional prediction. RDMPredAngleIdL0 and RDMPredAngleIdL1 indicate which prediction angle the prediction mode derived by the prediction direction deriving unit 2801 (2001) in the present embodiment of the present invention indicates. Note that the prediction direction derivation information 2851 (2051) includes two representative edge angles included in the unidirectional representative edge angle and the bidirectional representative edge angle, in addition to FIG. The relationship between these prediction modes and TransformIdx will be described later.

The prediction direction derivation information 2851 (2051) generated by the prediction direction derivation unit 2801 (2001) is input to the prediction selection switch 2710 (111). A prediction image 125 is generated by the intra unidirectional prediction image generation unit 2707 (108) or the intra bidirectional prediction image generation unit 2708 (109) according to the prediction mode selected here. These predicted image generation units are the same as those in the first embodiment except that the number of predicted angles is expanded to 128. For example, when RDMPredMode is 1, the first predicted image signal 851 and the second predicted image signal 852 in FIG. 8 are generated at two unidirectional representative edge angles included in the bidirectional representative edge angle, and the weighted average An average process is performed in the unit 801, and a predicted image 2722 (125) is output.

The prediction selection unit 2714 controls the output terminal of the prediction selection switch 2710 based on the prediction information 2721 sent from the entropy decoding unit 2702 and the prediction direction derivation information 2851 input from the prediction direction derivation unit 2801. As described above, intra prediction or inter prediction can be selected to generate the predicted image 2722, but a plurality of modes are defined for each of intra prediction and inter prediction. One prediction mode is input as prediction information 2721 from these. In the case of intra bidirectional prediction, substantially two prediction modes are selected as shown in FIGS. 11A and 11B.

<Orthogonal transformation when a prediction direction deriving unit is added>
In the fourth embodiment in which a prediction direction deriving unit 2801 (2001) is added, it is possible to detect a reference pixel line that is actually used in prediction unless edge detection of the reference pixel line is performed and a unidirectional representative edge angle is derived. There's a problem. Therefore, in the fourth embodiment, TransformIdx is selected in units of coding tree blocks (or the first prediction unit included in the coding tree block). For example, assume that the N × N pixel block 0 shown in FIG. 2C is a prediction unit, and that a 2N × 2N pixel block is a coding tree block. Since the syntax encoding is performed in units of coding tree blocks, the reference pixel lines used in the prediction unit other than the prediction unit 0 are uncoded, so which reference pixel line is used. It cannot be specified. Since the 0th prediction unit corresponding to the first pixel block of the coding tree block has been decoded for the adjacent upper and left pixels, TransformIdx is selected using the information of the reference pixel line derived here. . In FIG. 23, TransformIdx [0] for following the TransformIdx of the head prediction unit is specified regardless of each prediction mode.

When slice_derived_direction_intra_flag is 0, the prediction direction derivation mode according to the present embodiment in the slice is invalid. Therefore, the prediction selection switch 2710 (111) connects the output terminal of the switch according to the first embodiment of the present invention. As an example, when slice_derived_direction_intra_flag is 1, the prediction direction derivation mode according to the present embodiment is valid over the entire area in the slice.

FIG. 25A shows an example of the prediction unit syntax.
intra_derived_direction_flag [i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is the first embodiment shown in FIGS. 11A and 11B or the fourth embodiment shown in FIG. is there. i indicates the position of the divided prediction unit. When the intra_split_flag is 0, 0 is set, and when the intra_split_flag is 1, 0 to 3 are set.

When intra_derived_direction_flag [i] is 1, it indicates that the prediction unit uses the prediction direction derivation mode shown in FIG. In this case, intra_direction_mode [i], which is information identifying the used intra prediction mode, is encoded from among the prepared prediction modes. As shown in FIG. 23, in this mode, a bidirectional intra prediction mode and a unidirectional intra prediction are mixedly expressed. These prediction modes can be expressed as separate syntax elements and encoded. In the second embodiment of the present invention, intra_direction_mode [i] is described as an example corresponding to RDMPredMode, but the prediction mode expression method may be changed based on RDMPredAngleIdL0. For example, when bi-directional intra prediction corresponding to 1 of RDMPredMode is omitted, the prediction mode is expressed by dividing it into two types of syntax elements: a 1-bit flag indicating a sign (+/−) and an index indicating a change. It is also possible. In this case, a new flag indicating whether or not bidirectional intra prediction is used may be prepared. Also, intra_direction_mode [i] may be decoded with an equal length according to the number of prediction modes, or may be decoded using a predetermined code table. When intra_direction_mode [i] is 0, it indicates that the prediction unit does not use the prediction direction deriving method according to the present embodiment of the present invention, and decoding is performed according to the method described in the first embodiment.

FIG. 25B shows an example of the prediction unit syntax as another embodiment of the present invention. In FIG. 25B, intra_direction_mode [i] is expressed by being divided into prev_intra_direction_mode [i] and rem_intra_direction_mode [i]. These syntax elements introduce the prediction between prediction modes similarly to Formula (11) or (12). prev_intra_direction_mode [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. When prev_intra_direction_mode [i] is 1, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_direction_mode [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_direction_mode [i] that specifies which mode other than the MostProbableMode is the intra prediction mode IntraPredMode is decoded. The rem_intra_direction_mode [i] may be decoded in equal length according to the number of prediction modes, or may be decoded using a predetermined code table.

An example of the prediction unit syntax is shown in FIG. 25C as another embodiment of the present invention. In FIG. 25C, PredMode described in the first embodiment and PredMode described in the second embodiment are integrated and expressed as one PredMode table. PredMode in this case is shown in FIGS. 26A, 26B, and 26C. These syntax elements introduce the prediction between prediction modes similarly to Formula (11) or (12). prev_intra_luma_unipred_idx [i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent block and the intra prediction mode of the prediction unit are the same. When prev_intra_luma_unipred_idx [i] is 1, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx [i] is 0, it indicates that the MostProbableMode and the intra prediction mode IntraPredMode are different, and information rem_intra_luma_unipred_mode [i] that specifies which mode other than the MostProbableMode is the intra prediction mode IntraPredMode is decoded. The rem_intra_luma_unipred_mode [i] may be decoded in the same length according to the number of prediction modes, or may be decoded using a predetermined code table.

The above is the detailed description of the image coding apparatus 2000 according to the second embodiment of the present invention.

Hereinafter, modifications of each embodiment will be listed and introduced.
In the first to fourth embodiments, an example is described in which a frame is divided into rectangular blocks of 16 × 16 pixel size and the like, and encoding / decoding is sequentially performed from the upper left block to the lower right side of the screen ( (See FIG. 2A). However, the encoding order and the decoding order are not limited to this example. For example, encoding and decoding may be performed sequentially from the lower right to the upper left, or encoding and decoding may be performed so as to draw a spiral from the center of the screen toward the screen end. Furthermore, encoding and decoding may be performed sequentially from the upper right to the lower left, or encoding and decoding may be performed so as to draw a spiral from the screen end toward the center of the screen. In this case, since the position of the adjacent pixel block that can be referred to changes depending on the encoding order, the position may be changed to a usable position as appropriate.

In the first to fourth embodiments, the description has been given by exemplifying the prediction target block size such as the 4 × 4 pixel block, the 8 × 8 pixel block, and the 16 × 16 pixel block. However, the prediction target block is a uniform block. It does not have to be a shape. For example, the prediction target block size may be a 16 × 8 pixel block, an 8 × 16 pixel block, an 8 × 4 pixel block, a 4 × 8 pixel block, or the like. Moreover, it is not necessary to unify all the block sizes within one coding tree block, and a plurality of different block sizes may be mixed. When a plurality of different block sizes are mixed in one coding tree block, the code amount for encoding or decoding the division information increases with the increase in the number of divisions. Therefore, it is desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.

In the first to fourth embodiments, for the sake of simplification, a comprehensive description of the color signal component has been described without distinguishing between the prediction process for the luminance signal and the color difference signal. However, when the prediction process is different between the luminance signal and the color difference signal, the same or different prediction methods may be used. If different prediction methods are used between the luminance signal and the chrominance signal, the prediction method selected for the chrominance signal can be encoded or decoded in the same manner as the luminance signal.

In the first to fourth embodiments, for the sake of simplification, a comprehensive description of the color signal component is described without distinguishing between the orthogonal transformation process and the inverse orthogonal transformation process for the luminance signal and the color difference signal. However, when the orthogonal transformation process is different between the luminance signal and the color difference signal, the same or different orthogonal transformation methods may be used. If different orthogonal transformation methods are used between the luminance signal and the color difference signal, the orthogonal transformation method selected for the color difference signal can be encoded or decoded in the same manner as the luminance signal.

In the first to fourth embodiments, syntax elements that are not defined in the present invention can be inserted between the rows of the table shown in the syntax configuration, and descriptions relating to other conditional branches are included. It does not matter. Alternatively, the syntax table can be divided and integrated into a plurality of tables. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.

As described above, each embodiment realizes high-efficiency intra prediction and corresponding high-efficiency orthogonal transform and inverse orthogonal transform while alleviating the difficulty in hardware implementation and software implementation. Therefore, according to each embodiment, the encoding efficiency is improved, and the subjective image quality is also improved.

For example, it is possible to provide a program that realizes the processing of each of the above embodiments by storing it in a computer-readable storage medium. The storage medium can be a computer-readable storage medium such as a magnetic disk, optical disk (CD-ROM, CD-R, DVD, etc.), magneto-optical disk (MO, etc.), semiconductor memory, etc. For example, the storage format may be any form.

Further, the program for realizing the processing of each of the above embodiments may be stored on a computer (server) connected to a network such as the Internet and downloaded to the computer (client) via the network.

The instructions shown in the processing procedure shown in the above embodiment can be executed based on a program that is software. A general-purpose computer system stores this program in advance, and by reading this program, it is also possible to obtain the same effects as those obtained by the video encoding device and video decoding device of the above-described embodiment. is there. The instructions described in the above-described embodiments are, as programs that can be executed by a computer, magnetic disks (flexible disks, hard disks, etc.), optical disks (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD). ± R, DVD ± RW, etc.), semiconductor memory, or a similar recording medium. As long as the recording medium is readable by the computer or the embedded system, the storage format may be any form. If the computer reads the program from the recording medium and causes the CPU to execute instructions described in the program based on the program, the computer is similar to the video encoding device and video decoding device of the above-described embodiment. Operation can be realized. Of course, when the computer acquires or reads the program, it may be acquired or read through a network.
In addition, the OS (operating system), database management software, MW (middleware) such as a network, etc. running on the computer based on the instructions of the program installed in the computer or embedded system from the recording medium implement this embodiment. A part of each process for performing may be executed.
Furthermore, the recording medium in the present invention is not limited to a medium independent of a computer or an embedded system, but also includes a recording medium in which a program transmitted via a LAN or the Internet is downloaded and stored or temporarily stored. Further, the program for realizing the processing of each of the above embodiments may be stored on a computer (server) connected to a network such as the Internet and downloaded to the computer (client) via the network.
Further, the number of recording media is not limited to one, and when the processing in the present embodiment is executed from a plurality of media, it is included in the recording media in the present invention, and the configuration of the media may be any configuration.

The computer or the embedded system in the present invention is for executing each process in the present embodiment based on a program stored in a recording medium, and includes a single device such as a personal computer or a microcomputer, Any configuration such as a system in which apparatuses are connected to a network may be used.
Further, the computer in the embodiment of the present invention is not limited to a personal computer, but includes an arithmetic processing device, a microcomputer, and the like included in an information processing device, and a device capable of realizing the functions in the embodiment of the present invention by a program, The device is a general term.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the scope of the invention. These embodiments and modifications thereof are included in the scope and gist of the invention, and are included in the invention described in the claims and the equivalents thereof.

DESCRIPTION OF SYMBOLS 101 ... Subtraction part 102 ... Orthogonal transformation part 103 ... Quantization part 104 ... Inverse quantization part 105 ... Inverse orthogonal transformation part 106 ... Adder part 107 ... Reference image memory 108 ... Intra unidirectional prediction image generation 109: Intra bi-predictive image generation unit, 110: Inter prediction image generation unit, 111 ... Prediction selection switch, 112 ... Conversion information setting unit, 113 ... Prediction selection switch, 114 ... Entropy encoding unit, 115 ... Output buffer 116: Encoding control unit, 117: Input image, 118: Prediction error, 119 ... Transformation coefficient, 120 ... Quantization transformation coefficient, 121 ... Restored transformation coefficient, 122 ... Restored prediction error, 123 ... Restored image, 124 ... see Image, 125 ... Predictive image, 126 ... Prediction information, 127 ... Conversion information, 128 ... Encoded data, 801 ... Weighted average part, 802 ... First Direction intra predicted image generation unit, 803 ... second unidirectional intra predicted image generation unit, 851 ... first predicted image, 852 ... second predicted image, 901, 1201 ... selection switch A, 902 ... 1D discrete cosine transform unit, 903 ... 1D discrete sine transform unit, 904 ... transpose unit, 905, 1205 ... selection switch B, 906 ... vertical transform unit, 907 ... horizontal transform unit, 908 ... 1D transform setting unit, 951 ... temporary transform coefficient, 952 ... 1D transform index 1202... 1D inverse discrete cosine transform unit, 1203... 1D inverse discrete sine transform unit, 1204... Transpose unit, 1206... Vertical inverse transform unit, 1207 ... horizontal inverse transform unit, 1252. High level syntax, 1502 ... Slice level syntax, 1503 ... Coding tree 1504: Sequence parameter set syntax, 1505 ... Picture parameter set syntax, 1506 ... Slice header syntax, 1507 ... Slice data syntax, 1508 ... Coding tree block syntax, 1509 ... Prediction unit syntax, 2001 ... Prediction direction derivation unit, 2051 ... prediction direction derivation information, 2101 ... left reference pixel line edge derivation unit, 2102 ... upper reference pixel line edge derivation unit, 2103 ... prediction direction derivation information generation unit, 2701 ... input buffer, 2702 ... entropy decoding unit, 2703 ... inverse Quantization unit, 2704 ... inverse orthogonal transform unit, 2705 ... addition unit, 2706 ... reference image memory, 2707 ... intra unidirectional prediction image generation unit, 2708 ... intra bidirectional prediction image generation , 2709 ... inter prediction image generation part, 2710 ... prediction selection switch, 2711 ... conversion information setting part, 2712 ... output buffer, 2713 ... decoding control part, 2714 ... prediction selection part, 2801 ... prediction direction deriving part, 2851 ... Prediction direction derivation information, 2715: Quantization transform coefficient (sequence), 2716: Restoration transform coefficient, 2717 ... Restoration prediction error, 2718 ... Decoded image, 2719 ... Reference image, 2721 ... Prediction information, 2722 ... Predicted image, 2724 ... Decoding Image, 2725 ... encoded data.

Claims

When two or more prediction modes are intra-screen prediction processes using one or more reference pixel lines, a combination of one-dimensional transformation only for the first orthogonal transformation is selected, and each of the two or more prediction modes is 1 In the case of the intra prediction process using two reference pixel lines and the same reference pixel line, a combination of the first orthogonal transformation and the second orthogonal transformation is selected,
Generating a predicted image signal using the two or more prediction modes;
The prediction difference signal derived from the prediction image signal is subjected to a two-dimensional conversion process using the selected combination of one-dimensional conversion, and a conversion coefficient is generated,
Encoding prediction information indicating a combination of the two or more prediction modes and the transform coefficient;
A moving picture encoding method comprising:
The moving image encoding method according to claim 1, wherein the first orthogonal transform is a discrete sine transform, and the second orthogonal transform is a discrete cosine transform.
When the two or more prediction modes are in-screen prediction processing using one reference pixel line and the same reference pixel line, respectively, the reference pixel line refers only to a line above the pixel block. When a combination of transforms in the order of discrete sine transform and discrete cosine transform is selected and the reference pixel line refers only to the left line of the pixel block, the combination of transforms in the order of discrete cosine transform and discrete sine transform The video encoding method according to claim 2, further comprising: selecting.
The said 2 or more prediction mode WHEREIN: When the prediction process which does not have a prediction direction is contained, it further comprises selecting the combination only of discrete cosine transform, The said claim 2 characterized by the above-mentioned. A video encoding method.
The prediction process having no prediction direction includes DC prediction for prediction based on an average value of reference pixel lines, and planar prediction for performing extrapolation or interpolation processing for each pixel position of the reference pixel line. Item 5. A video encoding method according to Item 4.
Perform edge detection processing on the reference pixel line, derive a prediction direction based on the direction of the edge,
Encoding a prediction mode using the derived prediction direction;
The moving picture coding method according to claim 2, further comprising:
The first orthogonal transform is based on a tendency that an absolute value of a prediction difference of a prediction mode for generating an intra-screen prediction image using at least one reference pixel line increases according to a distance from a reference pixel. A conversion base that is predetermined so that coefficient density is higher than that of the second orthogonal transformation matrix when one-dimensional transformation in a direction orthogonal to a reference pixel line is performed is included. The moving image encoding method according to 1.
In the first orthogonal transform, the block size is N (N is an integer of 2 or more), and N × N matrix components (i, j) are

In the second orthogonal transformation, an N × N matrix component (i, j) is

Using the transformation matrix
8. The moving picture encoding method according to claim 7, wherein:
When two or more prediction modes are intra-screen prediction processes using one or more reference pixel lines, a combination of one-dimensional transformation only for the first orthogonal transformation is selected, and each of the two or more prediction modes is 1 In the case of the intra prediction process using two reference pixel lines and the same reference pixel line, a combination of the first orthogonal transformation and the second orthogonal transformation is selected,
Generating a predicted image signal using the two or more prediction modes;
Using the combination of the selected one-dimensional transformation, a two-dimensional inverse transformation process is performed to generate a prediction difference signal,
A moving picture decoding method comprising: adding the generated prediction difference signal and the prediction picture signal to generate a decoded picture signal.
10. The moving picture decoding method according to claim 9, wherein the first orthogonal transform is a discrete sine transform, and the second orthogonal transform is a discrete cosine transform.
When the two or more prediction modes are in-screen prediction processing using one reference pixel line and the same reference pixel line, respectively, the reference pixel line refers only to a line above the pixel block. When a combination of transforms in the order of discrete sine transform and discrete cosine transform is selected and the reference pixel line refers only to the left line of the pixel block, the combination of transforms in the order of discrete cosine transform and discrete sine transform The video decoding method according to claim 10, further comprising: selecting.
11. The method according to claim 10, further comprising selecting a combination of only discrete cosine transform when a prediction process having no prediction direction is included in the two or more prediction modes. Video decoding method.
The prediction process having no prediction direction includes DC prediction for prediction based on an average value of reference pixel lines, and planar prediction for performing extrapolation or interpolation processing for each pixel position of the reference pixel line. Item 13. The moving picture decoding method according to Item 12.
Perform edge detection processing on the reference pixel line, derive a prediction direction based on the direction of the edge,
Decoding a prediction mode using the derived prediction direction;
The moving picture decoding method according to claim 10, further comprising:
The first orthogonal transform is based on a tendency that an absolute value of a prediction difference of a prediction mode for generating an intra-screen prediction image using at least one reference pixel line increases according to a distance from a reference pixel. A conversion base that is predetermined so that coefficient density is higher than that of the second orthogonal transformation matrix when one-dimensional transformation in a direction orthogonal to a reference pixel line is performed is included. 9. The moving picture decoding method according to 9.
In the first orthogonal transform, the block size is N (N is an integer of 2 or more), and N × N matrix components (i, j) are

In the second orthogonal transformation, an N × N matrix component (i, j) is

Using the transformation matrix
The moving picture decoding method according to claim 15, wherein:
When two or more prediction modes are intra-screen prediction processes using one or more reference pixel lines, a combination of one-dimensional transformation only for the first orthogonal transformation is selected, and each of the two or more prediction modes is 1 A selection unit that selects a combination of the first orthogonal transformation and the second orthogonal transformation in the case of intra prediction processing using two reference pixel lines and the same reference pixel line;
A predicted image generation unit that generates a predicted image signal using the two or more prediction modes;
An orthogonal transformation unit that performs a two-dimensional transformation process on the prediction difference signal derived from the prediction image signal using the selected one-dimensional transformation combination, and generates a transformation coefficient;
An encoding unit that encodes prediction information indicating a combination of the two or more prediction modes and the transform coefficient;
A video encoding device comprising:
When two or more prediction modes are intra-screen prediction processes using one or more reference pixel lines, a combination of one-dimensional transformation only for the first orthogonal transformation is selected, and each of the two or more prediction modes is 1 A selection unit that selects a combination of the first orthogonal transformation and the second orthogonal transformation in the case of intra prediction processing using two reference pixel lines and the same reference pixel line;
A predicted image generation unit that generates a predicted image signal using the two or more prediction modes;
An inverse orthogonal transform unit that performs a two-dimensional inverse transform process and generates a prediction difference signal using the selected one-dimensional transform combination;
A moving picture decoding apparatus comprising: an adding unit that adds the generated prediction difference signal and the predicted image signal to generate a decoded image signal.