JP2009272727A

JP2009272727A - Transformation method based on directivity of prediction error, image-encoding method and image-decoding method

Info

Publication number: JP2009272727A
Application number: JP2008119298A
Authority: JP
Inventors: Yuma Sano; 雄磨佐野; Akiyuki Tanizawa; 昭行谷沢; Takeshi Nakajo; 健中條
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2008-04-30
Filing date: 2008-04-30
Publication date: 2009-11-19

Abstract

PROBLEM TO BE SOLVED: To provide a moving image encoding method in which the energy compaction of the transform coefficients is improved by an orthogonal transform matched to the direction of intra prediction.

SOLUTION: The moving image encoding method for encoding an input image signal includes: a step of generating a prediction error signal (116) indicating the difference between a prediction signal (121), generated according to a predetermined prediction mode, and the input image signal (115); an orthogonal transform step of orthogonally transforming the prediction error signal by either a first transform method that depends on the prediction direction of the prediction mode or a second transform method that is independent of the prediction direction, thereby generating transform coefficients; a quantization step of quantizing the transform coefficients to generate quantized transform coefficients; and an encoding step of entropy-encoding the quantized transform coefficients (117) and transform information (118), which indicates whether the first or the second transform method was used, to generate encoded data (124).

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a moving image encoding device, a moving image encoding method, a moving image decoding device, and a moving image decoding method, and in particular to ones that improve the energy compaction of the transform coefficients by applying an orthogonal transform matched to the direction of intra prediction, and thereby improve encoding efficiency.

In moving image coding schemes such as H.264/MPEG-4 AVC (hereinafter H.264), the discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT) are used as the orthogonal transform. At present, the DCT/IDCT is implemented by applying a one-dimensional DCT/IDCT twice, once horizontally and once vertically [Non-Patent Document 1]. For a model in which the correlation coefficient of the prediction error decreases regularly with inter-pixel distance, the DCT achieves a transform efficiency equivalent to that of the Karhunen-Loève transform (hereinafter KLT). Actual prediction error signals, however, contain edge components and the like, so a horizontal/vertical DCT is not always optimal. Research has therefore been carried out on changing the direction in which the DCT is applied according to the features of the image [Non-Patent Document 2], and on switching between the DCT and the discrete sine transform (DST) [Patent Document 1].

[Non-Patent Document 1] A. Hallapuro, M. Karczewicz, and H. Malvar, “Low complexity transform and quantization,” Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, Jan. 2002, JVT-B038 and JVT-B039.
[Non-Patent Document 2] J. Fu, B. Zeng, “Diagonal Discrete Cosine Transforms for Image Coding”, PCM 2006.
[Patent Document 1] JP-A-11-55678

The horizontal/vertical DCT of Non-Patent Document 1, which is used in H.264, achieves a transform efficiency equivalent to that of the KLT for a model in which the correlation coefficient of the prediction error decreases regularly with inter-pixel distance. When an edge component appears in the prediction error signal, however, the horizontal/vertical DCT is not necessarily optimal, and sufficient transform efficiency may not be obtained.

In Patent Document 1, the transform is switched between the DCT and the DST according to the features of the image. The DST is efficient for pixels predicted by interpolation, but intra prediction in H.264 is extrapolative, so applying the DST to the prediction error produced by intra prediction cannot be expected to improve transform efficiency.

In Non-Patent Document 2, in order to reduce side information, the directions in which the one-dimensional DCT is applied are limited to two, diagonal down-right and diagonal down-left, in addition to the conventional H.264 transform, and the direction that is optimal in the rate-distortion sense is selected from among them. However, the direction of intra prediction and the directionality of its prediction error are similar, and applying the one-dimensional DCT in a direction different from the prediction direction does not concentrate the coefficient energy. Transforming in both the diagonal down-right and the diagonal down-left directions is therefore redundant, and the side information needed to identify the transform direction lowers encoding efficiency. In addition, when the one-dimensional DCT is applied diagonally, the basis length becomes very short at the ends of the coding block, so the transform is not fully effective there.

One aspect of the present invention provides a moving image encoding method for encoding an input image signal, the method including: a prediction error signal generation step of generating a prediction error signal indicating the difference between a prediction signal, generated according to a predetermined prediction mode, and the input image signal; an orthogonal transform step of orthogonally transforming the prediction error signal by either a first transform method that depends on the prediction direction of the prediction mode or a second transform method that does not depend on the prediction direction, thereby generating transform coefficients; a quantization step of quantizing the transform coefficients to generate quantized transform coefficients; and an encoding step of entropy-encoding the quantized transform coefficients and transform information indicating whether the first or the second transform method was used, thereby generating encoded data.

When intra prediction that exploits spatial correlation is performed, as in H.264, the prediction direction and the directionality of the prediction error are very likely to coincide, so energy compaction does not suffer even if one-dimensional DCTs in directions other than the prediction direction are not considered. Moreover, in the scheme of this aspect of the invention there is only one candidate direction for the one-dimensional DCT in addition to the conventional H.264 transform, so side information can be reduced compared with Non-Patent Document 2. Furthermore, for pixels at the edge of the image block, the pixel sequences are folded back and joined so that a one-dimensional DCT with a long basis length can be applied, which raises energy compaction. For these three reasons, quantizing and entropy-encoding the transform coefficients generated by this transform scheme ultimately reduces the code amount. In addition, whereas the one-dimensional DCT has conventionally been applied in two directions, horizontal and vertical, in this scheme a single one-dimensional DCT along the prediction direction suffices, which reduces the encoding and decoding workload.

Preferred embodiments of the moving image encoding method, moving image encoding device, moving image decoding method, and moving image decoding device according to the present invention are described in detail below with reference to the accompanying drawings.

[First embodiment]
(Encoder)
The moving image encoding device 100 according to the first embodiment, shown in FIG. 1, includes a control unit 107 that derives the transform information from the prediction mode information, a transform method selection switch 101 that selects a transform method according to the transform information, transform and quantization units 102 to 105 that perform the transform and quantization corresponding to the prediction direction, and an entropy encoding unit 106 that encodes the transform coefficients and the transform information.

In FIG. 1, the input moving image signal 115 is divided into macroblocks or macroblock pairs and input to the moving image encoding device 100. In the moving image encoding device 100, the prediction unit 114 performs intra-frame prediction on the input moving image signal 115. The prediction unit 114 is described in detail below.

The prediction unit 114 performs prediction in units of sub-blocks obtained by further dividing a macroblock. For example, for the luminance signal in H.264 intra prediction, there are macroblocks with sixteen 4x4-pixel sub-blocks, macroblocks with four 8x8-pixel sub-blocks, and macroblocks with one 16x16-pixel sub-block. FIG. 2(a) shows the relationship between the frame to be encoded and the macroblock to be encoded, FIG. 2(b) a macroblock with 4x4-pixel sub-blocks, FIG. 2(c) a macroblock with 8x8-pixel sub-blocks, and FIG. 2(d) a macroblock with a 16x16-pixel sub-block. There are nine prediction modes for 4x4 and 8x8 sub-blocks and four for 16x16 sub-blocks, and intra prediction is performed in each sub-block.

FIG. 3 shows all the prediction modes for 4x4 and 8x8 sub-blocks in H.264 intra prediction: vertical, horizontal, DC, diagonal down-left, diagonal down-right, vertical-right, horizontal-down, vertical-left, and horizontal-up. FIG. 4(a) shows the prediction direction of each mode. The prediction method for the directions in FIG. 4(a) is explained concretely with reference to FIGS. 4(b) to 4(d).

When DC prediction (mode 2 of the H.264 intra prediction modes) is selected for the sub-block to be encoded in FIG. 4(b), the predicted pixels are calculated by Equation (1).
H = (A + B + C + D), V = (I + J + K + L)
a ~ p = (H + V + 4) >> 3
... (1)
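As a rough illustration of Equation (1), the following Python sketch computes the DC predicted value for a 4x4 sub-block. The function name `dc_predict_4x4` and its arguments are hypothetical. The fallback branches follow the rule for unavailable reference pixels described in the next paragraph, and their rounding (+2, then >> 2) is an assumption borrowed from H.264 rather than something stated in this text.

```python
def dc_predict_4x4(top, left, bit_depth=8):
    """Return the single DC value copied into pixels a..p.

    top  -- reference pixels [A, B, C, D] above the block, or None if unavailable
    left -- reference pixels [I, J, K, L] left of the block, or None if unavailable
    """
    if top is not None and left is not None:
        # Equation (1): H = A+B+C+D, V = I+J+K+L, value = (H + V + 4) >> 3
        return (sum(top) + sum(left) + 4) >> 3
    if top is not None:
        # Assumed rounding when only the upper references exist
        return (sum(top) + 2) >> 2
    if left is not None:
        # Assumed rounding when only the left references exist
        return (sum(left) + 2) >> 2
    # No reference pixel at all: half the maximum luminance (128 for 8 bits)
    return 1 << (bit_depth - 1)
```

For example, `dc_predict_4x4([10, 10, 10, 10], [20, 20, 20, 20])` evaluates (40 + 80 + 4) >> 3.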

When some reference pixels are unavailable, the predicted value is generated from the mean of the reference pixels that are available. When no reference pixel is available at all, the predicted value is half the maximum luminance value of the encoding device (128 for 8 bits). When any other mode is selected, the prediction unit 114 copies predicted values interpolated from the reference pixels along the prediction direction shown in FIG. 4(a). As a concrete example, the generation of predicted values when mode 0 (vertical prediction) is selected is described next.

This mode can be selected only when reference pixels A to D are available.
a, e, i, m = A
b, f, j, n = B
c, g, k, o = C
d, h, l, p = D
The prediction method is shown in detail in FIG. 4(c): the luminance values of reference pixels A to D are copied unchanged in the vertical direction and used as the predicted values.

The prediction method when mode 4 (diagonal down-right prediction) is selected is given by Equation (2).
d = (B + (C << 1) + D + 2) >> 2
c, h = (A + (B << 1) + C + 2) >> 2
b, g, l = (M + (A << 1) + B + 2) >> 2
a, f, k, p = (I + (M << 1) + A + 2) >> 2
e, j, o = (J + (I << 1) + M + 2) >> 2
i, n = (K + (J << 1) + I + 2) >> 2
m = (L + (K << 1) + J + 2) >> 2
... (2)
This prediction mode can be selected only when reference pixels A to D and I to M are available. Its details are shown in FIG. 4(d): values generated by a 3-tap filter are copied toward the lower right at 45 degrees and used as the predicted values.
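The seven cases of Equation (2) share one pattern: a (1, 2, 1) filter slides along the reference pixels, and each filtered value fills one down-right diagonal of the block. The Python sketch below is one way to express this, assuming the pixels a to p are in raster order; the function name and the flattened reference layout `L K J I M A B C D` are illustrative choices, not part of the original description.

```python
def predict_diag_down_right(top, left, corner):
    """Mode 4 prediction for a 4x4 block.

    top    -- [A, B, C, D], pixels above the block
    left   -- [I, J, K, L], pixels left of the block
    corner -- M, the pixel above-left of the block
    Returns a 4x4 list of rows; pred[y][x] with (x, y) = (0, 0) at pixel a.
    """
    # Flatten the references along one line: L K J I M A B C D
    ref = left[::-1] + [corner] + top
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            c = 4 + (x - y)  # centre tap; index 4 is M, used on the main diagonal
            pred[y][x] = (ref[c - 1] + (ref[c] << 1) + ref[c + 1] + 2) >> 2
    return pred
```

With `x - y = 3` the centre tap is C, which reproduces the formula for d; with `x - y = -3` it is K, which reproduces the formula for m.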

Much the same framework is used for the prediction modes other than 0, 2, and 4: interpolated values are generated from the reference pixels available for the prediction direction, and those values are copied along the prediction direction. Although the prediction method has been described for 4x4-pixel blocks, 8x8-pixel blocks are predicted in the same way, by generating interpolated values from the reference pixels and copying them along the prediction direction.

The prediction error signal 116 is generated by taking the difference between the predicted image signal 121, generated by the prediction unit 114 as described above, and the input image signal 115. At this time, the encoding control unit 107 generates the transform information 118 according to the prediction mode information 120 selected by the prediction unit 114. The transform information 118 indicates whether the orthogonal transform is a horizontal/vertical DCT or a one-dimensional DCT along the prediction direction given by the prediction mode information, and it is used to switch the transform method selection switch 101 and the inverse transform method selection switch 108. In this embodiment, the encoding control unit 107 outputs only horizontal/vertical transform information when the prediction direction indicated by the prediction mode information is horizontal or vertical, and outputs both horizontal/vertical transform information and diagonal transform information when the prediction direction is diagonal.

The prediction error signal 116 is input to the transform method selection switch 101, which is switched according to the transform information 118 to route the signal to either the first transform unit 102 or the second transform unit 103. In the first transform unit 102, as shown in FIG. 5(a), the prediction error signal 116 is transformed by the horizontal/vertical DCT unit 200. The transform coefficients 232 from the horizontal/vertical DCT unit 200 are then quantized, as shown in FIG. 5(b), by the quantizer 201 in the first quantization unit 104 that corresponds to the horizontal/vertical DCT unit 200.

In the second transform unit 103, as shown in FIG. 6(a), the prediction error signal 116 is transformed by the DCT unit, selected from DCT units 203 to 208, that corresponds to the prediction direction in the prediction mode information 120. Then, in the second quantization unit 105, as shown in FIG. 6(b), the quantizer corresponding to the prediction mode information 120 is selected from quantizers 210 to 215, the transform coefficients 233 from the second transform unit 103 are quantized, and the quantized transform coefficients 117 are generated.

The mode decision used to determine the optimal mode from the standpoint of encoding efficiency is now described in detail. For every prediction mode that a sub-block can take, the encoding process from the orthogonal transform indicated by the transform information 118 onward is performed, and the mode with the smallest resulting encoding cost is judged to be the optimal mode for the sub-block. Specifically, the mode is determined using Equation (3) from the difference error SAD, which is the sum of the absolute differences between the prediction signal generated by the prediction method of the prediction mode and the original image signal, and the amount of side information OH required when that prediction mode is used.
K = SAD + λ × OH (3)
λ is a constant determined from the value of the quantization scale. The mode is decided on the basis of the cost K obtained in this way; in general, the mode giving the smallest cost K is selected as the optimal mode.
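The selection rule of Equation (3) can be sketched as follows. This is a minimal Python illustration only: the helper names, the dictionary of candidates, and the overhead values in bits are all hypothetical, and λ is simply passed in rather than derived from a quantization scale.

```python
def sad(orig, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(o - p)
               for row_o, row_p in zip(orig, pred)
               for o, p in zip(row_o, row_p))

def best_mode(orig, candidates, lam):
    """Pick the mode minimising K = SAD + lam * OH (Equation (3)).

    candidates -- dict mapping mode number -> (predicted block, overhead OH)
    Returns (best mode, dict of all costs).
    """
    costs = {mode: sad(orig, pred) + lam * oh
             for mode, (pred, oh) in candidates.items()}
    return min(costs, key=costs.get), costs
```

A larger λ favours modes that are cheap to signal, while λ = 0 reduces the decision to the smallest SAD.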

In this embodiment the side information and the absolute differences are used, but in other embodiments the mode may be determined using only the mode information or only the sum of absolute values of the prediction error signal, and these may be Hadamard-transformed or approximated. The cost may also be formed using the activity of the input image, or the cost function may be formed using the quantization scale.

As another embodiment for computing the cost, provisional encoding may be performed, and the mode may be determined using the code amount obtained when the difference between the prediction signal generated in the provisionally encoded prediction mode and the original image is actually encoded, together with the squared error between the input image signal and the locally decoded image obtained by locally decoding the transform coefficients. The mode decision formula in this case is Equation (4).
J = D + λ × R (4)
Here D is the coding distortion, that is, the squared error between the input image and the locally decoded image, and R is the code amount estimated by provisional encoding. Using this cost requires entropy encoding and local decoding (including inverse quantization and inverse transform) for every encoding mode, so the circuit scale grows, but an accurate code amount and coding distortion can be used and encoding efficiency can be kept high. With this cost too, the cost may be computed from the code amount alone or the coding distortion alone, or the cost function may be formed from approximations of these.
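Equation (4) differs from Equation (3) only in what is measured: squared error against the locally decoded block instead of SAD against the prediction, and an estimated code amount instead of side information alone. A small Python sketch, with hypothetical function names and with the rate supplied by the caller (provisional encoding itself is not modelled here):

```python
def ssd(orig, recon):
    """Sum of squared differences, used as the distortion D."""
    return sum((o - r) ** 2
               for row_o, row_r in zip(orig, recon)
               for o, r in zip(row_o, row_r))

def rd_cost(orig, recon, rate_bits, lam):
    """Rate-distortion cost J = D + lam * R of Equation (4)."""
    return ssd(orig, recon) + lam * rate_bits
```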

Next, the one-dimensional DCT along the prediction direction is described. First, the DCT used in H.264 is explained. For a 4x4 sub-block, with input signal X, transform matrix T, and transform coefficients C, the transform coefficients C are obtained according to Equation (5).

That is, a one-dimensional DCT is applied to the input image signal in the horizontal direction using the transform matrix T, and then in the vertical direction using its transpose T^t.
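Equation (5) itself is not reproduced in this text. Assuming it has the usual separable form C = T X T^t, the Python sketch below shows how two one-dimensional passes make up the two-dimensional transform. It uses a real-valued orthonormal DCT-II matrix, which is an assumption made for illustration; H.264 itself uses an integer approximation, and all function names here are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix T; row u holds the u-th basis vector."""
    return [[math.sqrt((1.0 if u == 0 else 2.0) / n)
             * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             for x in range(n)]
            for u in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2d(x):
    """Separable 2-D DCT, assumed form C = T X T^t."""
    t = dct_matrix(len(x))
    # T X transforms every column of X; multiplying by T^t on the right
    # then transforms every row. The two passes can be done in either order.
    return matmul(matmul(t, x), transpose(t))
```

For a flat 4x4 block of ones, all the energy ends up in the single coefficient C[0][0], which is the energy compaction that the directional transforms described next aim to preserve for diagonal structures.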

The same computation applies to 8x8 sub-blocks. In contrast, in this embodiment, for 4x4 and 8x8 sub-blocks, a directional DCT along the prediction direction is performed for the six directions that remain when vertical, horizontal, and DC prediction are excluded from the nine intra prediction types defined in H.264.

First, the transform method for each of the H.264 intra prediction modes 3 to 8 is explained using the 4x4 sub-block shown in FIG. 7. In prediction mode 3, the pixels in the sub-block are grouped along the prediction direction into seven sets: (a), (b,e), (c,f,i), (d,g,j,m), (h,k,n), (l,o), and (p). FIG. 8(a) shows the combinations of pixel sequences to which the one-dimensional DCT is applied. With the transform coefficients denoted F(u), a one-dimensional DCT of basis length 4 is applied, as in Equation (6), to the pixel sequence (d,g,j,m) on the diagonal of the 4x4-pixel block.

(where f(0)=d, f(1)=g, f(2)=j, f(3)=m).
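The grouping for prediction mode 3 and a variable-length one-dimensional DCT can be sketched in Python as below. Equation (6) is not reproduced in this text, so `dct1d` assumes the standard orthonormal DCT-II; the raster-order labelling a to p and both function names are likewise assumptions made for illustration.

```python
import math

LABELS = "abcdefghijklmnop"  # 4x4 pixels in raster order (assumed layout of FIG. 7)

def mode3_groups():
    """Group pixels along the diagonal down-left direction (x + y constant)."""
    groups = []
    for s in range(7):
        # Walk each anti-diagonal from its top-right end to its bottom-left end
        cells = [(x, s - x) for x in range(3, -1, -1) if 0 <= s - x <= 3]
        groups.append("".join(LABELS[4 * y + x] for x, y in cells))
    return groups

def dct1d(f):
    """Orthonormal DCT-II of a sequence of any length N (assumed form)."""
    n = len(f)
    out = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        out.append(scale * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                               for x in range(n)))
    return out
```

`mode3_groups()` lists the seven sets in the order given above, and `dct1d` can be applied to each set with the basis length equal to the set size.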

Next, a one-dimensional DCT of basis length 3 is applied to the pixel sequence (c,f,i), as in Equation (7).

(where f(0)=c, f(1)=f, f(2)=i).

Further, the pixel sequences (a) and (b,e) are folded back and joined, and a one-dimensional DCT of basis length 3 is applied as in Equation (8).

(where f(0)=b, f(1)=e, f(2)=a, or f(0)=e, f(1)=b, f(2)=a).
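The benefit of folding can be seen on a toy case. The Python sketch below compares transforming (b,e) and (a) separately with transforming the joined sequence (b,e,a), for a flat residual. The orthonormal DCT-II in `dct1d` is an assumed stand-in for the equations not reproduced here, and the sample values are made up.

```python
import math

def dct1d(f):
    """Orthonormal DCT-II of a sequence of any length N (assumed form)."""
    n = len(f)
    return [math.sqrt((1.0 if u == 0 else 2.0) / n)
            * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n))
            for u in range(n)]

def count_nonzero(coeffs, eps=1e-9):
    """Number of coefficients that are not numerically zero."""
    return sum(1 for c in coeffs if abs(c) > eps)

a, b, e = 5.0, 5.0, 5.0                   # hypothetical flat prediction error
separate = dct1d([b, e]) + dct1d([a])     # two short transforms
folded = dct1d([b, e, a])                 # one transform of basis length 3

print(count_nonzero(separate), count_nonzero(folded))
```

Each short transform leaves its own DC coefficient, whereas the folded sequence concentrates the same energy into one coefficient.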

As another example of the 4x4 transform, the one-dimensional DCT may be applied without folding the pixels together. FIG. 8(b) shows the combinations of pixel sequences to which the one-dimensional DCT is applied in this case. The pixel sequences (d,g,j,m) and (c,f,i) are handled by the same one-dimensional DCTs as above, but (b,e) and (a) are not folded and joined; a one-dimensional DCT is applied to each separately, as in Equations (9) and (10).

(where f(0)=b, f(1)=e)

(where f(0)=a).

As yet another example, the pixel sequences (c,f,i), (e,b), and (a) may all be folded and joined before the one-dimensional DCT is applied. FIG. 8(c) shows the combinations of pixel sequences to which the one-dimensional DCT is applied in this case. The pixel sequence (d,g,j,m) is handled as above, but for (c,f,i), (e,b), and (a) the one-dimensional DCT is applied as in Equation (11).

(where f(0)=c, f(1)=f, f(2)=i, f(3)=e, f(4)=b, f(5)=a,
or f(0)=i, f(1)=f, f(2)=c, f(3)=b, f(4)=e, f(5)=a)
The remaining pixel sequences (h,k,n), (l,o), and (p) are line-symmetric to (c,f,i), (b,e), and (a) about the diagonal pixel sequence (d,g,j,m) of the 4x4 block, so the one-dimensional DCT is computed for them in the same way as for (c,f,i), (b,e), and (a).

Prediction mode 4 can be regarded as prediction mode 3 with the prediction direction mirrored left to right, so the DCT can be computed in the same way as for prediction mode 3.
Although a real-valued DCT is used in the description above, the DCT can also be computed in integer arithmetic, as specified in H.264.

Next, the DCT computation for prediction modes 5 to 8 is described.
In prediction mode 5, the pixels in the sub-block are grouped along the prediction direction into five sets: (a,e,j,n), (b,f,k,o), (c,g,l,p), (i,m), and (d,h). FIG. 9(a) shows the combinations of pixel sequences to which the one-dimensional DCT is applied. With the transform coefficients denoted F(u), a one-dimensional DCT of basis length 4 is applied to the pixel sequence (a,e,j,n), as in Equation (12).

(where f(0)=a, f(1)=e, f(2)=j, f(3)=n)
The same computation is applied to the pixel sequences (b,f,k,o) and (c,g,l,p).

Next, a one-dimensional DCT of basis length 2 is applied to the pixel sequence (i,m), as in Equation (13).

(where f(0)=i, f(1)=m)
The same computation as Equation (13) is applied to the pixel sequence (d,h).
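The five sets listed for prediction mode 5 can be reproduced with a simple index rule. In the Python sketch below, pixels are assumed to be labelled a to p in raster order, and the grouping key `x - y // 2` (one column step for every two rows) is an observation that matches the listed sets, not a formula taken from the original description.

```python
from collections import defaultdict

LABELS = "abcdefghijklmnop"  # 4x4 pixels in raster order (assumed layout of FIG. 7)

def mode5_groups():
    """Group the 4x4 pixels along the vertical-right direction."""
    groups = defaultdict(list)
    for y in range(4):
        for x in range(4):
            # Pixels sharing x - y // 2 lie on the same slanted line
            groups[x - y // 2].append(LABELS[4 * y + x])
    return ["".join(groups[key]) for key in sorted(groups)]

print(mode5_groups())
```

The three full-length sets get a basis length of 4 and the two corner sets a basis length of 2, unless they are folded onto a neighbouring set.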

As another example of the DCT computation for prediction mode 5, the pixels may be folded together before the one-dimensional DCT is applied. FIG. 9(b) shows the combinations of pixel sequences to which the one-dimensional DCT is applied in this case. The pixel sequence (b,f,k,o) is handled as above, but (a,e,j,n) and (m,i), and likewise (p,l,g,c) and (d,h), are folded and joined, and the one-dimensional DCT is applied as in Equation (14).

(where f(0)=a, f(1)=e, f(2)=j, f(3)=n, f(4)=m, f(5)=i)
The same computation as Equation (14) is applied to the pixel sequences (p,l,g,c) and (d,h).

Prediction modes 6 to 8 can be regarded as prediction mode 5 with the prediction direction rotated or mirrored, so the DCT can be computed in the same way as for prediction mode 5.
Although a real-valued DCT is used in the description above, the DCT can also be computed in integer arithmetic, as specified in H.264.

Next, for 8x8 sub-blocks, each of the H.264 intra prediction modes 3 to 8 will be described using the 8x8 block shown in FIG. 10.
In prediction mode 3, the pixels in the sub-block are grouped along the prediction direction into the following 15 sets: (a0), (a1, b0), (a2, b1, c0), (a3, b2, c1, d0), (a4, b3, c2, d1, e0), (a5, b4, c3, d2, e1, f0), (a6, b5, c4, d3, e2, f1, g0), (a7, b6, c5, d4, e3, f2, g1, h0), (b7, c6, d5, e4, f3, g2, h1), (c7, d6, e5, f4, g3, h2), (d7, e6, f5, g4, h3), (e7, f6, g5, h4), (f7, g6, h5), (g7, h6), and (h7). FIG. 11(a) shows the combination of pixel columns to which the one-dimensional DCT is applied. Letting the transform coefficients be F(u), a one-dimensional DCT with a base length of 8 is performed on the pixel column (a7, b6, c5, d4, e3, f2, g1, h0), which lies on the diagonal of the 8x8 pixel block, as shown in Equation (15).

(where f(0) = a7, f(1) = b6, f(2) = c5, f(3) = d4, f(4) = e3, f(5) = f2, f(6) = g1, f(7) = h0).
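The 15 pixel sets for prediction mode 3 follow a simple rule in the labels used here: within each set, the letter position plus the digit is constant. The sketch below regenerates the sets from that rule. The function name is hypothetical, and the mapping of letters and digits to rows and columns is left open because FIG. 10 is not reproduced in this text.

```python
LETTERS = "abcdefgh"

def mode3_groups(size=8):
    """Group pixel labels so that (letter index + digit) is constant in each group."""
    groups = []
    for k in range(2 * size - 1):
        group = [f"{LETTERS[i]}{k - i}"
                 for i in range(size)
                 if 0 <= k - i < size]
        groups.append(group)
    return groups

groups = mode3_groups()
diagonal = groups[7]  # the base-length-8 column used for Equation (15)
```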

Next, the pixel columns (a5, b4, c3, d2, e1, f0) and (a6, b5, c4, d3, e2, f1, g0) are folded and joined, and a one-dimensional DCT with a base length of 13 is performed as shown in Equation (16).

(where f(0) = a6, f(1) = b5, f(2) = c4, f(3) = d3, f(4) = e2, f(5) = f1, f(6) = g0, f(7) = f0, f(8) = e1, f(9) = d2, f(10) = c3, f(11) = b4, f(12) = a5;
or f(0) = g0, f(1) = f1, f(2) = e2, f(3) = d3, f(4) = c4, f(5) = b5, f(6) = a6, f(7) = a5, f(8) = b4, f(9) = c3, f(10) = d2, f(11) = e1, f(12) = f0).

Further, the pixel columns (a3, b2, c1, d0) and (a4, b3, c2, d1, e0) are folded and joined, and a one-dimensional DCT with a base length of 9 is performed as shown in Equation (17).

(where f(0) = a4, f(1) = b3, f(2) = c2, f(3) = d1, f(4) = e0, f(5) = d0, f(6) = c1, f(7) = b2, f(8) = a3;
or f(0) = e0, f(1) = d1, f(2) = c2, f(3) = b3, f(4) = a4, f(5) = a3, f(6) = b2, f(7) = c1, f(8) = d0).

Further, the pixel columns (a0), (a1, b0), and (a2, b1, c0) are folded and joined, and a one-dimensional DCT with a base length of 6 is performed as shown in Equation (18).

(where f(0) = a2, f(1) = b1, f(2) = c0, f(3) = b0, f(4) = a1, f(5) = a0;
or f(0) = c0, f(1) = b1, f(2) = a2, f(3) = a1, f(4) = b0, f(5) = a0).
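The first ordering given for each of Equations (16) to (18) can be reproduced by joining neighbouring pixel columns while reversing every second column, so that the sequence turns back at the block edge. The following is a minimal sketch of that folding; the helper name `fold` is hypothetical.

```python
def fold(columns):
    """Join pixel columns into one sequence, reversing every second column."""
    seq = []
    for idx, col in enumerate(columns):
        seq.extend(col if idx % 2 == 0 else list(reversed(col)))
    return seq

# Base length 13, as in Equation (16).
seq13 = fold([["a6", "b5", "c4", "d3", "e2", "f1", "g0"],
              ["a5", "b4", "c3", "d2", "e1", "f0"]])
# Base length 9, as in Equation (17).
seq9 = fold([["a4", "b3", "c2", "d1", "e0"],
             ["a3", "b2", "c1", "d0"]])
# Base length 6, as in Equation (18): three columns joined.
seq6 = fold([["a2", "b1", "c0"], ["a1", "b0"], ["a0"]])
```

The resulting sequences are the f(0), f(1), ... assignments listed first for each equation; the alternative assignments start from the opposite end of the longest column.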

In the above, adjacent pixel columns among (a0), (a1, b0), (a2, b1, c0), (a3, b2, c1, d0), (a4, b3, c2, d1, e0), (a5, b4, c3, d2, e1, f0), and (a6, b5, c4, d3, e2, f1, g0) were joined before the one-dimensional DCT was performed. It is also possible to perform a one-dimensional DCT on each pixel column separately, without joining, in the same way as for (a7, b6, c5, d4, e3, f2, g1, h0). FIG. 11(b) shows the combination of pixel columns to which the one-dimensional DCT is applied in this case.

The remaining pixel columns (b7, c6, d5, e4, f3, g2, h1), (c7, d6, e5, f4, g3, h2), (d7, e6, f5, g4, h3), (e7, f6, g5, h4), (f7, g6, h5), and (g7, h6) are line-symmetric to (a0), (a1, b0), (a2, b1, c0), (a3, b2, c1, d0), (a4, b3, c2, d1, e0), (a5, b4, c3, d2, e1, f0), and (a6, b5, c4, d3, e2, f1, g0) about the pixel column (a7, b6, c5, d4, e3, f2, g1, h0) on the diagonal of the 8x8 block. The one-dimensional DCT is therefore performed on them with the same calculations as for those columns.

Prediction mode 4 can be regarded as a rotation or reflection of the prediction direction of prediction mode 3, so the DCT calculation can be performed in the same manner as for prediction mode 3.
Although the above description uses a real-valued DCT, the DCT can also be performed with integer arithmetic, as specified in H.264.

Next, the DCT calculation methods corresponding to prediction modes 5 to 8 will be described.

In prediction mode 5, the pixels in the sub-block are grouped along the prediction direction into the following 11 sets: (a0, b0, c1, d1, e2, f2, g3, h3), (a1, b1, c2, d2, e3, f3, g4, h4), (a2, b2, c3, d3, e4, f4, g5, h5), (a3, b3, c4, d4, e5, f5, g6, h6), (a4, b4, c5, d5, e6, f6, g7, h7), (c0, d0, e1, f1, g2, h2), (a5, b5, c6, d6, e7, f7), (e0, f0, g1, h1), (h0, g0), (a6, b6, c7, d7), and (a7, b7). FIG. 12(a) shows the combination of pixel columns to which the one-dimensional DCT is applied in this case. Letting the transform coefficients be F(u), a one-dimensional DCT with a base length of 8 is performed on the pixel column (a0, b0, c1, d1, e2, f2, g3, h3) as shown in Equation (19).

(where f(0) = a0, f(1) = b0, f(2) = c1, f(3) = d1, f(4) = e2, f(5) = f2, f(6) = g3, f(7) = h3)
The same calculation is performed on the pixel columns (a1, b1, c2, d2, e3, f3, g4, h4), (a2, b2, c3, d3, e4, f4, g5, h5), (a3, b3, c4, d4, e5, f5, g6, h6), and (a4, b4, c5, d5, e6, f6, g7, h7).
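The 11 pixel sets for prediction mode 5 can likewise be generated from the labels: within each set, the digit minus half the letter position (rounded down) is constant. The sketch below is an illustration derived from the listed sets, not from FIG. 12, and the function name is hypothetical.

```python
LETTERS = "abcdefgh"

def mode5_groups(size=8):
    """Group pixel labels by k = digit - (letter index // 2)."""
    groups = {}
    for i in range(size):          # letter position: a = 0, ..., h = 7
        for d in range(size):      # digit part of the label
            groups.setdefault(d - i // 2, []).append(f"{LETTERS[i]}{d}")
    return groups

groups = mode5_groups()
full_length = [g for g in groups.values() if len(g) == 8]
```

Five sets have base length 8, two have base length 6, two have base length 4, and two have base length 2, which matches the columns handled by Equations (19) to (21). The set written as (h0, g0) in the description appears here in the order g0, h0.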

Next, for the pixel column (c0, d0, e1, f1, g2, h2), a one-dimensional DCT with a base length of 6 is performed as in Equation (20).

(where f(0) = c0, f(1) = d0, f(2) = e1, f(3) = f1, f(4) = g2, f(5) = h2)
The same calculation is performed on the pixel column (a5, b5, c6, d6, e7, f7).

Next, the pixel columns (e0, f0, g1, h1) and (h0, g0) are folded and joined, and a one-dimensional DCT with a base length of 6 is performed as shown in Equation (21).

(where f(0) = e0, f(1) = f0, f(2) = g1, f(3) = h1, f(4) = h0, f(5) = g0)
The same calculation is performed on the pixel columns (a6, b6, c7, d7) and (a7, b7).

In the above, adjacent pixel columns among (e0, f0, g1, h1), (h0, g0), (d7, c7, b6, a6), and (a7, b7) were joined before the one-dimensional DCT was performed. It is also possible to perform a one-dimensional DCT on each pixel column separately, without joining, in the same way as for (a0, b0, c1, d1, e2, f2, g3, h3). FIG. 12(b) shows the combination of pixel columns to which the one-dimensional DCT is applied in this case.

Prediction modes 6 to 8 can be regarded as rotations or reflections of the prediction direction of prediction mode 5, so the DCT calculation can be performed in the same manner as for prediction mode 5.
Although the above description uses a real-valued DCT, the DCT can also be performed with integer arithmetic, as specified in H.264.

In this embodiment, 4x4 and 8x8 are used as the block sizes of the sub-block to be encoded. As another embodiment, encoding may be performed with sub-blocks smaller than 4x4 or larger than 8x8, for example 2x2 or 16x16 blocks.

In this embodiment, the DCT is used as the orthogonal transform. As another embodiment, the KLT, the DST, or the discrete wavelet transform (hereinafter DWT) may be used.

The entropy encoding unit 106 receives the quantized transform coefficients 117 generated by the first transform unit 102 and the first quantization unit 104, or by the second transform unit 103 and the second quantization unit 105, together with the transform information 118 generated by the encoding control unit 107, and performs entropy encoding. The transform information 118 and the quantized transform coefficients 117 of the mode judged optimal by the mode decision are entropy encoded to generate the encoded data 124.
When the prediction mode judged optimal by the mode decision is average prediction, vertical prediction, or horizontal prediction, the entropy encoding unit 106 does not encode the transform information 118 and entropy encodes only the quantized transform coefficients 117.

The quantized transform coefficients 117 are input to the inverse transform method selection switch 108. The switch 108 is set according to the value of the transform information 118, and the quantized transform coefficients 117 are routed to the second inverse quantization unit 109 or the first inverse quantization unit 110.

In the first inverse quantization unit 110, as shown in FIG. 13(a), the inverse quantizer 216 applies to the quantized transform coefficients 117 the inverse quantization corresponding to the horizontal/vertical inverse DCT. Then, as shown in FIG. 13(b), the inverse DCT unit 217 in the first inverse transform unit 112 applies the horizontal/vertical inverse DCT to the transform coefficients, generating the decoded prediction error signal 119. In the second inverse quantization unit 109, as shown in FIG. 14(a), one of the inverse quantizers 219 to 224 applies to the quantized transform coefficients 117 the inverse quantization corresponding to the inverse DCT along the prediction direction given by the prediction mode information 120. Then, as shown in FIG. 14(b), the second inverse transform unit 111 applies the inverse DCT using the inverse DCT unit, selected from the inverse DCT units 226 to 231, that corresponds to the prediction direction of the prediction mode information 120, generating the decoded prediction error signal 119.
The decoded prediction error signal 119 is added to the predicted image signal 121 generated by the prediction unit 114, stored in the reference memory 113, and then input to the prediction unit 114 as the reference image signal 122.

The above is the configuration of the video encoding apparatus 100 according to this embodiment. The video encoding method of this embodiment is described below with reference to the flowchart of FIG. 15, taking as an example the case where it is carried out by the video encoding apparatus 100.

When a video signal is input to the video encoding apparatus 100 frame by frame, encoding of the input image starts for each macroblock or macroblock pair (step S301). To encode a macroblock, the sub-block index is initialized to sub_blk = 0 and the number of sub-blocks, MAX_SUB_BLK, is set (step S302).

It is determined whether the sub-block index sub_blk is smaller than MAX_SUB_BLK, that is, whether the macroblock still contains a sub-block that has not been encoded (step S303). If the result of step S303 is "Yes", the prediction mode index is initialized to mode_idx = 0 and the number of selectable modes, MAX_MODE, is set (step S304).

Next, it is determined whether mode_idx is smaller than MAX_MODE (step S305). If the result of step S305 is "Yes", trans_idx is set according to the value of mode_idx: when mode_idx is the mode number of a diagonal prediction mode, trans_idx is set to 0; when mode_idx is the mode number of a non-diagonal prediction mode, trans_idx is not generated (step S306). The prediction unit 114 then generates the predicted image signal 121 for prediction mode mode_idx (step S307). The prediction error signal 116 for prediction mode mode_idx is obtained as the difference between the input image signal 115 and the predicted image signal 121 generated by the prediction unit 114 (step S308).

Next, the encoding control unit 107 checks whether the prediction mode mode_idx selected by the prediction unit 114 is smaller than mode_direction, the smallest mode number among the diagonal prediction modes, or whether trans_idx == 1 (step S309). If either condition holds, the transform method selection switch 101 is set to the first transform unit 102 and the horizontal/vertical DCT is applied to the prediction error signal 116 (step S310). The transform coefficients are then quantized with the quantization corresponding to the horizontal/vertical DCT, generating the quantized transform coefficients 117 (step S312).

If neither condition of step S309 holds, the selection switch 101 is set to the second transform unit 103, and the switch 202 of the second transform unit 103 selects one of the transformers 203 to 208 according to the prediction mode mode_idx. The selected transformer applies the DCT corresponding to the prediction direction to the prediction error signal 116 (step S311). The transform coefficients are then quantized in the second quantization unit 105 by whichever of the quantizers 210 to 215 corresponds to that DCT, generating the quantized transform coefficients 117 (step S313).

That is, only when the prediction direction is neither vertical nor horizontal does the encoder switch between the horizontal/vertical transform and the transform along the prediction direction given by the prediction mode; when the prediction direction is vertical or horizontal, the horizontal/vertical transform is always selected. In short, when the prediction direction is diagonal, both the horizontal/vertical transform and the diagonal transform are performed; when the prediction direction is vertical or horizontal, only the horizontal/vertical transform is performed.
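The branch of step S309 can be summarized in a few lines. This is a minimal sketch: the function name and return labels are hypothetical, and the value mode_direction = 3 assumes the H.264 intra mode numbering, in which modes 0 to 2 are vertical, horizontal, and DC and modes 3 to 8 are diagonal.

```python
MODE_DIRECTION = 3  # assumption: smallest diagonal mode number in H.264 numbering

def choose_transform(mode_idx, trans_idx, mode_direction=MODE_DIRECTION):
    """Step S309: return which transform path handles this candidate."""
    if mode_idx < mode_direction or trans_idx == 1:
        return "hv_dct"           # first transform unit 102 (step S310)
    return "directional_dct"      # second transform unit 103 (step S311)
```

For a non-diagonal mode trans_idx is never generated, which the sketch represents by passing None.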

Next, the quantized transform coefficients 117 and the transform information 118 are encoded by the entropy encoding unit 106 (step S314). When mode_idx < mode_direction, the transform information trans_idx is not generated, so the transform information 118 is not encoded and only the quantized transform coefficients 117 are encoded. That is, the entropy encoding unit 106 encodes the transform information in addition to the transform coefficients only when the prediction direction indicated by the prediction mode information is neither vertical nor horizontal; when the prediction direction is vertical or horizontal, it encodes only the transform coefficients.
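The conditional signalling in step S314 might be sketched as follows. The function name, the symbol tuples, and the order in which the symbols are emitted are assumptions made for illustration; the description only states which items are encoded.

```python
def symbols_to_encode(mode_idx, trans_idx, coeffs, mode_direction=3):
    """List the items passed to the entropy coder for one sub-block (step S314)."""
    symbols = []
    if mode_idx >= mode_direction:
        # Diagonal mode: the transform information is signalled.
        symbols.append(("trans_idx", trans_idx))
    # The quantized transform coefficients are always encoded.
    symbols.append(("coeffs", list(coeffs)))
    return symbols
```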

After the entropy encoding of step S314, the RD cost of the prediction mode is stored (step S315), and it is then determined whether trans_idx is 1 (step S316). If trans_idx == 0 in step S316, trans_idx is incremented by 1 (step S317) and processing returns to step S309, where the branch decision is made again based on the prediction mode mode_idx and the transform information trans_idx.

After steps S309 to S315 have been executed again, step S316 again checks whether trans_idx is 1. If trans_idx == 1, mode_idx is incremented by 1 to set the next prediction mode (step S318). It is then checked whether the new mode_idx is smaller than MAX_MODE (step S305); if so, steps S306 to S318 are performed. Once steps S306 to S318 have been performed for all possible modes and the test mode_idx < MAX_MODE in step S305 yields "No", the sub-block index sub_blk is incremented by 1 and processing moves on to the next sub-block (step S319). It is determined whether the new sub_blk is smaller than MAX_SUB_BLK (step S303); if the result is "Yes", steps S304 to S319 are performed. When steps S304 to S319 have been completed for all sub-blocks and the test sub_blk < MAX_SUB_BLK yields "No", the cost functions of the modes obtained for each sub-block are compared and the optimal mode of each block is loaded (step S320). The data encoded in the optimal modes is then multiplexed per macroblock and output (step S321).
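The loop structure of steps S302 to S320 can be sketched as below. The function name and the `rd_cost` callback are hypothetical stand-ins for prediction, transform, quantization, and entropy coding (steps S307 to S315). The handling of non-diagonal modes at step S316, where trans_idx does not exist, is an assumption: the sketch simply proceeds to the next mode.

```python
def decide_modes(max_sub_blk, max_mode, mode_direction, rd_cost):
    """Return the lowest-cost (mode_idx, trans_idx) pair for each sub-block."""
    all_costs = []
    sub_blk = 0                                    # S302
    while sub_blk < max_sub_blk:                   # S303
        costs = {}
        mode_idx = 0                               # S304
        while mode_idx < max_mode:                 # S305
            # S306: trans_idx is generated only for diagonal modes.
            trans_idx = 0 if mode_idx >= mode_direction else None
            while True:
                # S309: horizontal/vertical DCT or directional DCT.
                use_hv = mode_idx < mode_direction or trans_idx == 1
                # S310 to S315, represented by the hypothetical cost callback.
                costs[(mode_idx, trans_idx)] = rd_cost(sub_blk, mode_idx, use_hv)
                if trans_idx == 0:                 # S316
                    trans_idx = 1                  # S317
                    continue
                break
            mode_idx += 1                          # S318
        all_costs.append(costs)
        sub_blk += 1                               # S319
    # S320: compare the stored costs and pick the best candidate per sub-block.
    return [min(c, key=c.get) for c in all_costs]
```

With nine modes and mode_direction = 3, each sub-block is evaluated 15 times: once for each of the three non-diagonal modes and twice for each of the six diagonal modes.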

Next, the syntax used in this embodiment will be described. FIG. 18 shows the overall structure of the syntax. The syntax consists mainly of three parts. The High Level Syntax carries syntax information for layers at or above the slice. The Slice Level Syntax specifies the information required for each slice, and the Macroblock Level Syntax specifies the variable-length-coded error signals and mode information required for each macroblock.
Each part consists of more detailed syntax. The High Level Syntax consists of sequence-level and picture-level syntax such as the Sequence parameter set syntax and the Picture parameter set syntax. The Slice Level Syntax consists of the Slice header syntax, the Slice data syntax, and so on. The Macroblock Level Syntax consists of the Macroblock layer syntax, the Macroblock prediction syntax, and so on.

The syntax information required in this embodiment is carried in the sequence header, the slice header, and the macroblock header; each is described below.

The directional_dct_seq_flag shown in the sequence header of FIG. 19 indicates whether the orthogonal transform scheme of this embodiment is used in the sequence to be encoded; when this flag is 1, the scheme can be used in that sequence. Next, the directional_dct_pic_flag shown in the picture header of FIG. 20 indicates whether the scheme is used in the picture; when this flag is 1, the scheme can be used in that picture. Next, the directional_dct_slice_flag shown in the slice header of FIG. 21 indicates whether the scheme is used in the slice; when this flag is 1, the scheme can be used in that slice.

As shown in FIG. 22, the macroblock header carries two flags as macroblock-layer syntax: directional_dct4x4_mode_flag and directional_dct8x8_mode_flag. In this embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per macroblock: directional_dct4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. These two flags are the transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the DCT dependent on the prediction direction. A value of 1 indicates that the horizontal/vertical DCT is used, and a value of 0 indicates that the prediction-direction-dependent DCT is used.
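The condition under which the macroblock-level flag is generated can be written as a small predicate. The flag names come from the description; the function name and its argument layout are hypothetical.

```python
def macroblock_flag_name(is_intra, seq_flag, pic_flag, slice_flag, sub_block_size):
    """Return the name of the flag generated for this macroblock, or None if no flag is generated."""
    if not is_intra:
        return None
    if not (seq_flag or pic_flag or slice_flag):
        return None
    if sub_block_size == 4:
        return "directional_dct4x4_mode_flag"
    if sub_block_size == 8:
        return "directional_dct8x8_mode_flag"
    return None
```

A generated flag value of 1 then selects the horizontal/vertical DCT and 0 selects the prediction-direction-dependent DCT.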

In this embodiment, the transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the prediction-direction-dependent DCT is determined for each macroblock. As other embodiments, the transform information may be determined for each sub-block, or determined for each sub-block according to auxiliary information.

As described above, according to the first embodiment, in addition to the conventional H.264 transform, a one-dimensional DCT is applied to each block with the transform direction restricted to the direction in which intra prediction is performed. When the one-dimensional DCT is applied in a diagonal direction, the pixels are folded with the intra prediction direction taken into account, so that a one-dimensional DCT with a long base length is applied. When the one-dimensional DCT is applied along the prediction direction, it is applied only once, in that direction. That is, when the prediction direction is diagonal, two transforms are performed: the horizontal/vertical transform and the diagonal transform along the prediction direction. The diagonal transform is restricted to the single type corresponding to the prediction direction, and the result of the horizontal/vertical transform is compared with the result of the diagonal transform to adopt the better transform method.

[Second Embodiment]
Next, a second embodiment will be described. The overall configuration of the video encoding apparatus in this embodiment is substantially the same as in the first embodiment, so only the differences from the first embodiment are described. In the first embodiment, the transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the prediction-direction-dependent DCT is determined for each macroblock. In this embodiment, the transform information is switched in units of sub-macroblocks. FIG. 23 shows the structure of the syntax in the macroblock header in this embodiment.

In this embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per sub-macroblock: directional_dct4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. That is, a macroblock that uses the 4x4 transform contains 16 instances of directional_dct4x4_mode_flag, and a macroblock that uses the 8x8 transform contains 4 instances of directional_dct8x8_mode_flag.
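The flag counts follow directly from the block sizes, assuming the 16x16 macroblock of H.264. A minimal sketch with a hypothetical helper name:

```python
def flags_per_macroblock(sub_block_size, macroblock_size=16):
    """Number of per-sub-block transform flags in one macroblock."""
    per_side = macroblock_size // sub_block_size
    return per_side * per_side
```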

[Third embodiment]
Next, a third embodiment of the present invention will be described. Since the overall configuration of the moving picture encoding apparatus in the present embodiment is substantially the same as that of the second embodiment, only the differences from the second embodiment will be described. In this embodiment, as in the second embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per sub-block: directional_dct_4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. Unlike the second embodiment, however, a flag is not created for every sub-block. Instead, auxiliary information, namely the prediction mode information and coded_block_pattern, which indicates whether a sub-block contains non-zero coefficients, is used to decide whether to generate a flag for each sub-block. FIG. 24 shows the syntax structure in the macroblock header in the present embodiment. intra8x8_pred_mode and intra4x4_pred_mode are values representing the prediction modes of the 8x8 block and the 4x4 block, respectively, and the relationship between the prediction direction and the prediction mode value is the same as in FIG. 3. During encoding, when the prediction direction indicated by the prediction mode is the vertical direction, the horizontal direction, or the DC direction, or, further, when the 8x8 pixel block containing the sub-block is determined from Coded_Block_Pattern to have no non-zero coefficients, there is no need to perform a transform depending on the prediction direction, so neither directional_dct4x4_mode_flag nor directional_dct8x8_mode_flag is generated.
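The flag-generation rule of this embodiment can be sketched as a single predicate. This is an illustration under two assumptions: prediction modes follow the H.264 numbering (0 vertical, 1 horizontal, 2 DC, 3 and above diagonal), and the two conditions in the text are read as alternatives, so that either one suppresses the flag. The function and constant names are hypothetical.

```python
MODE_DIRECTION = 3  # assumption: smallest diagonal mode number in H.264 numbering


def should_emit_mode_flag(pred_mode: int, block8x8_has_nonzero: bool) -> bool:
    """Return True if the encoder would write a mode flag for this sub-block."""
    # Vertical, horizontal and DC modes never use the direction-dependent transform.
    is_diagonal = pred_mode >= MODE_DIRECTION
    # No non-zero coefficients in the enclosing 8x8 block: nothing to transform.
    return is_diagonal and block8x8_has_nonzero
```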

[Fourth embodiment]
FIG. 16 is a block diagram illustrating the configuration of a moving picture decoding apparatus 400 according to the fourth embodiment. The moving picture decoding apparatus 400 according to the present embodiment includes an entropy decoding unit 401 that parses transform coefficients, prediction mode information, and transform information from encoded data; an inverse transform method selection switch 402 that selects the inverse quantization and inverse transform method according to the parsed transform information; and inverse quantization and inverse transform units 403 to 406 that perform inverse quantization and inverse transform corresponding to the prediction direction obtained from the prediction mode information.

In FIG. 16, encoded data 409, sent from the moving picture encoding apparatus 100 via a transmission system or a storage system, is separated frame by frame on the basis of the syntax by a demultiplexer (not shown) and is then input to the entropy decoding unit 401. The entropy decoding unit 401 decodes the variable-length code of each syntax element of the encoded data and reproduces the transform information 410, the prediction mode information 411, the quantized transform coefficients 412, and so on.

The quantized transform coefficients are input to the inverse transform method selection switch 402 and, based on the transform information 410 decoded by the entropy decoding unit 401, are routed to either the first inverse quantization unit 403 or the second inverse quantization unit 404. When routed to the first inverse quantization unit 403, the quantized transform coefficients are inversely quantized by the inverse quantization method corresponding to the horizontal/vertical DCT. When routed to the second inverse quantization unit 404, the inverse quantization method is selected from the inverse quantizers 219 to 224 according to the prediction mode information 411 reproduced by the entropy decoding unit 401, and inverse quantization corresponding to the DCT in the prediction direction indicated by the prediction mode information is performed. The transform coefficients inversely quantized by the first inverse quantization unit 403 or the second inverse quantization unit 404 are input to the corresponding first inverse transform unit 405 or second inverse transform unit 406. The first inverse transform unit 405 performs the horizontal/vertical inverse DCT. In the second inverse transform unit 406, the inverse DCT method is selected from the inverse DCT units 226 to 231 according to the prediction mode information 411, and the inverse DCT depending on the prediction direction indicated by the prediction mode is performed. A decoded prediction error signal 413 is then output.
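The routing performed by switch 402 and the per-mode selection inside the second path can be sketched as a lookup. The reference numerals 216 and 217 (horizontal/vertical path) and 219 to 224 and 226 to 231 (direction-dependent path) come from the list of reference numerals. The mapping of prediction modes 3 to 8 onto those units in ascending order is an assumption made only for illustration, as are the H.264 mode numbering and the function name.

```python
MODE_DIRECTION = 3  # assumption: smallest diagonal mode number (H.264 numbering)
HORIZONTAL_VERTICAL_UNITS = (216, 217)  # inverse quantizer, inverse DCT unit


def select_inverse_units(trans_idx: int, mode_idx: int) -> tuple:
    """Return (inverse quantizer numeral, inverse DCT numeral) for a sub-block."""
    if trans_idx == 1 or mode_idx < MODE_DIRECTION:
        return HORIZONTAL_VERTICAL_UNITS
    offset = mode_idx - MODE_DIRECTION
    if offset > 5:
        raise ValueError("only six direction-dependent units are described")
    # Assumed ordering: mode 3 -> (219, 226), ..., mode 8 -> (224, 231).
    return (219 + offset, 226 + offset)
```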

The prediction unit 408 generates a predicted image 415 using the reference image 414 stored in the reference memory 407 and the prediction mode information 411. The decoded prediction error signal 413 and the predicted image 415 are added to generate a decoded image signal, which is stored in the reference memory 407 and then output as a reproduced image.
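The addition of the decoded prediction error signal and the predicted image can be sketched per block as follows. Clipping the sum to the valid sample range is an assumption borrowed from common codec practice; the text itself only states that the two signals are added. The function name and the 8-bit default are hypothetical.

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Add a decoded residual block to a predicted block, sample by sample."""
    max_val = (1 << bit_depth) - 1
    return [
        # Clipping to [0, max_val] is an assumption, not stated in the text.
        [min(max(r + p, 0), max_val) for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]
```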

In this embodiment, the IDCT is used as the inverse orthogonal transform method; as another embodiment, the inverse transform of the KLT, DST, or DWT can be used.
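For readers unfamiliar with the transform pair, the following is a minimal floating-point sketch of a one-dimensional orthonormal DCT-II and its inverse. It is a textbook formulation chosen for illustration, not the integer transform of any particular codec and not necessarily the exact transform used in the embodiments.

```python
import math


def _basis(k: int, i: int, n: int) -> float:
    """Entry (k, i) of the n-point orthonormal DCT-II matrix."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))


def dct_1d(x):
    """Forward transform: samples to coefficients."""
    n = len(x)
    return [sum(_basis(k, i, n) * x[i] for i in range(n)) for k in range(n)]


def idct_1d(c):
    """Inverse transform: the matrix is orthonormal, so the transpose inverts it."""
    n = len(c)
    return [sum(_basis(k, i, n) * c[k] for k in range(n)) for i in range(n)]
```

A constant input concentrates all of its energy in the first (DC) coefficient, which is the power concentration property that the transform is chosen for.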

The above is the configuration of the moving picture decoding apparatus 400 according to the present embodiment. The moving picture decoding method of the present embodiment is described below with reference to the flowchart of FIG. 17, taking as an example the case where it is carried out by the moving picture decoding apparatus 400.

When the moving picture decoding apparatus 400 acquires encoded moving picture data for one frame (step S501), decoding of the input data starts for each macroblock or each macroblock pair. The encoded data is first input to the entropy decoding unit 401, where the variable-length code of each syntax element is decoded and the transform information 410, the prediction mode information 411, the quantized transform coefficients 412, and so on are reproduced (step S502). The syntax is then analyzed from the data decoded in step S502 (step S503).

Next, the sub-block index is initialized to sub_blk = 0 (step S504). It is determined whether sub_blk is smaller than MAX_SUB_BLK, the number of sub-blocks in the macroblock (step S505). If it is smaller, the prediction mode information mode_idx and the transform information trans_idx are loaded from the syntax analyzed in step S503 (step S506). The prediction unit 408 then generates a predicted image for the prediction mode mode_idx loaded in step S506 (step S507).

Next, it is determined whether the prediction mode mode_idx is smaller than mode_direction, the smallest prediction mode number indicating diagonal prediction, or whether trans_idx == 1 (step S508). If the determination in step S508 is "YES", inverse quantization corresponding to the horizontal/vertical DCT is performed (step S509), followed by the horizontal/vertical inverse DCT (step S511). If the determination in step S508 is "NO", inverse quantization corresponding to the DCT in the intra prediction direction is performed (step S510), followed by the inverse DCT depending on the intra prediction direction (step S512). The decoded error signal and the predicted image are then combined (step S513) and stored in the reference memory 407 (step S514). After that, 1 is added to the sub-block index sub_blk, and the process returns to step S505. If the determination in step S505 is "NO", decoding of all sub-blocks of the macroblock is complete, and the decoded image is output (step S516).
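The control flow of steps S504 to S516 can be sketched as a loop that records which inverse path each sub-block takes. The arithmetic of inverse quantization, inverse DCT and reconstruction is omitted. The value 3 for mode_direction assumes H.264 mode numbering, and the function names and the string labels are hypothetical.

```python
MODE_DIRECTION = 3  # assumption: smallest diagonal mode number (H.264 numbering)


def uses_horizontal_vertical(mode_idx: int, trans_idx: int) -> bool:
    """Decision of step S508."""
    return mode_idx < MODE_DIRECTION or trans_idx == 1


def decode_macroblock_paths(sub_blocks):
    """sub_blocks: list of (mode_idx, trans_idx) pairs, one per sub-block."""
    paths = []
    sub_blk = 0                                         # S504
    while sub_blk < len(sub_blocks):                    # S505
        mode_idx, trans_idx = sub_blocks[sub_blk]       # S506
        if uses_horizontal_vertical(mode_idx, trans_idx):   # S508
            paths.append("horizontal_vertical")         # S509, S511
        else:
            paths.append("direction_dependent")         # S510, S512
        sub_blk += 1                                    # increment, back to S505
    return paths                                        # S516: all sub-blocks done
```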

That is, when the prediction direction is diagonal, either the vertical/horizontal inverse transform or the diagonal inverse transform along the prediction direction is performed, as indicated by the decoded transform information; when the prediction direction is not diagonal, the vertical/horizontal inverse transform is always performed.

Next, the syntax used in this embodiment will be described. The overall structure of the syntax is shown in FIG. 18. The syntax used in the present embodiment consists mainly of three parts. High Level Syntax holds the syntax information of the layers at or above the slice level. Slice Level Syntax specifies the information required for each slice, and Macroblock Level Syntax specifies the variable-length-coded error signal and mode information required for each macroblock.
Each of these is composed of more detailed syntax. High Level Syntax consists of sequence-level and picture-level syntax such as Sequence parameter set syntax and Picture parameter set syntax. Slice Level Syntax consists of Slice header syntax, Slice data syntax, and the like. Macroblock Level Syntax consists of Macroblock layer syntax, Macroblock prediction syntax, and the like.
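The three-part hierarchy can be written down as a small lookup table. The entries below are exactly the syntax names listed in the text; the table itself and the helper function are only an illustrative representation, and each level may contain further syntax not listed here.

```python
SYNTAX_LEVELS = {
    "High Level Syntax": [
        "Sequence parameter set syntax",
        "Picture parameter set syntax",
    ],
    "Slice Level Syntax": ["Slice header syntax", "Slice data syntax"],
    "Macroblock Level Syntax": [
        "Macroblock layer syntax",
        "Macroblock prediction syntax",
    ],
}


def level_of(syntax_name: str) -> str:
    """Return the level that a named syntax structure belongs to."""
    for level, members in SYNTAX_LEVELS.items():
        if syntax_name in members:
            return level
    raise KeyError(syntax_name)
```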

The syntax information required in the present embodiment is the sequence header, the slice header, and the macroblock header; each is described below.
The directional_dct_seq_flag shown in the sequence header of FIG. 19 is a flag indicating whether the orthogonal transform scheme of the present embodiment is used in the sequence to be encoded. When this flag is 1, the orthogonal transform scheme of the present embodiment can be used in that sequence. Next, the directional_dct_pic_flag shown in the picture header of FIG. 20 is a flag indicating whether the orthogonal transform scheme of the present embodiment is used in the picture. When this flag is 1, the scheme can be used in that picture. Next, the directional_dct_slice_flag shown in the slice header of FIG. 21 is a flag indicating whether the orthogonal transform scheme of the present embodiment is used in the slice. When this flag is 1, the scheme can be used in that slice.

As shown in FIG. 22, the macroblock header has two flags, directional_dct4x4_mode_flag and directional_dct8x8_mode_flag, as syntax in the macroblock layer. In this embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per macroblock: directional_dct_4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. These two flags are transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the DCT depending on the prediction direction. A value of 1 indicates that the horizontal/vertical DCT is performed, and a value of 0 indicates that the DCT depending on the prediction direction is performed.
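The condition for generating the macroblock-level flag and the meaning of its two values can be sketched as follows. The flag names are those of the text; the function names and the returned labels are hypothetical and used only for illustration.

```python
def macroblock_carries_mode_flag(is_intra: bool, directional_dct_seq_flag: int,
                                 directional_dct_pic_flag: int,
                                 directional_dct_slice_flag: int) -> bool:
    """True if a mode flag is generated for this macroblock."""
    higher_level_flags = (directional_dct_seq_flag,
                          directional_dct_pic_flag,
                          directional_dct_slice_flag)
    # The text requires an intra macroblock and any one of the three flags set to 1.
    return is_intra and 1 in higher_level_flags


def transform_for_flag(mode_flag: int) -> str:
    """Interpret a decoded mode flag value."""
    if mode_flag == 1:
        return "horizontal_vertical_dct"
    if mode_flag == 0:
        return "direction_dependent_dct"
    raise ValueError("mode flag is a single bit")
```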

In this embodiment, the transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the DCT depending on the prediction direction is determined for each macroblock. As other embodiments, the transform information may be determined for each sub-block, or determined in units of sub-blocks according to auxiliary information.

[Fifth embodiment]
Next, a fifth embodiment of the present invention will be described. Since the overall configuration of the moving picture decoding apparatus in the present embodiment is substantially the same as that of the fourth embodiment, only the differences from the fourth embodiment will be described.

In the fourth embodiment, the transform information indicating whether the orthogonal transform is the horizontal/vertical DCT or the DCT depending on the prediction direction is determined for each macroblock. In the present embodiment, the transform information is switched in units of sub-macroblocks.

FIG. 23 shows the syntax structure in the macroblock header in the present embodiment. In this embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per sub-macroblock: directional_dct_4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. That is, when the macroblock uses the 4x4 transform, one macroblock contains 16 directional_dct4x4_mode_flag values, and when it uses the 8x8 transform, one macroblock contains 4 directional_dct8x8_mode_flag values.
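On the decoding side, this means reading 16 or 4 one-bit flags per macroblock. The sketch below assumes, purely for illustration, that the flags arrive as consecutive already entropy-decoded bits; the actual ordering within the macroblock header is defined by FIG. 23 and may interleave them with other syntax elements. The function name is hypothetical.

```python
FLAGS_PER_MACROBLOCK = {4: 16, 8: 4}  # transform size -> flags per macroblock


def read_subblock_flags(bits, transform_size: int):
    """Consume one mode flag per sub-block from an iterator of decoded bits."""
    count = FLAGS_PER_MACROBLOCK[transform_size]
    # Assumption: the flags are consecutive in the decoded bit sequence.
    return [next(bits) for _ in range(count)]
```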

[Sixth embodiment]
Next, a sixth embodiment of the present invention will be described. Since the overall configuration of the moving picture decoding apparatus in the present embodiment is substantially the same as that of the fifth embodiment, only the differences from the fifth embodiment will be described.

In this embodiment, as in the fifth embodiment, when the macroblock is an intra macroblock and any of directional_dct_seq_flag, directional_dct_pic_flag, and directional_dct_slice_flag is 1, one bit is generated per sub-block: directional_dct_4x4_mode_flag if the sub-blocks in the macroblock are 4x4 blocks, or directional_dct8x8_mode_flag if they are 8x8 blocks. Unlike the fifth embodiment, however, a flag is not created for every sub-block. Instead, auxiliary information, namely the prediction mode information and coded_block_pattern, which indicates whether a sub-block contains non-zero coefficients, is used to decide whether to generate a flag for each sub-block.

FIG. 24 shows the syntax structure in the macroblock header in the present embodiment. intra8x8_pred_mode and intra4x4_pred_mode are values representing the prediction modes of the 8x8 block and the 4x4 block, respectively, and the relationship between the prediction direction and the prediction mode value is the same as in FIG. 3. During encoding, when the prediction direction indicated by the prediction mode is the vertical direction, the horizontal direction, or the DC direction, or, further, when the 8x8 pixel block containing the sub-block is determined from Coded_Block_Pattern to have no non-zero coefficients, there is no need to perform a transform depending on the prediction direction, so neither directional_dct4x4_mode_flag nor directional_dct8x8_mode_flag is generated. When directional_dct4x4_mode_flag or directional_dct8x8_mode_flag is absent from the decoded data, the decoder treats its value as 1 and proceeds with the subsequent decoding process.
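The inference rule for an absent flag can be sketched with a dictionary of parsed syntax elements. The dictionary representation and the function name are assumptions for illustration; only the default value of 1 comes from the text.

```python
def effective_mode_flag(parsed_fields: dict, flag_name: str) -> int:
    """Return the flag value, inferring 1 (horizontal/vertical DCT) when absent."""
    return parsed_fields.get(flag_name, 1)
```

For example, a sub-block predicted in the vertical direction carries no flag, so the lookup falls back to 1 and the horizontal/vertical inverse DCT is used.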

FIG. 1: Block diagram showing the configuration of the image encoding apparatus according to the first embodiment.
FIG. 2: Diagram showing the relationship between a macroblock and sub-blocks of each size in H.264.
FIG. 3: Diagram showing all prediction modes of H.264 intra prediction.
FIG. 4: Diagram showing the prediction directions of H.264 intra prediction.
FIG. 5: Diagram showing the structure of the first transform unit and the first quantization unit according to the first embodiment.
FIG. 6: Diagram showing the structure of the second transform unit and the second quantization unit according to the embodiment.
FIG. 7: Diagram showing a 4x4 pixel sub-block according to the embodiment.
FIG. 8: Diagram showing the one-dimensional DCT method (4x4 block, prediction mode 3) according to the embodiment.
FIG. 9: Diagram showing the one-dimensional DCT method (4x4 block, prediction mode 5) according to the embodiment.
FIG. 10: Diagram showing an 8x8 pixel sub-block according to the embodiment.
FIG. 11: Diagram showing the one-dimensional DCT method (8x8 block, prediction mode 3) according to the embodiment.
FIG. 12: Diagram showing the one-dimensional DCT method (8x8 block, prediction mode 5) according to the embodiment.
FIG. 13: Diagram showing the structure of the first inverse transform unit and the first inverse quantization unit according to the embodiment.
FIG. 14: Diagram showing the structure of the second inverse transform unit and the second inverse quantization unit according to the embodiment.
FIG. 15: Processing flowchart of the moving picture encoding method performed by the image encoding apparatus.
FIG. 16: Block diagram showing the configuration of the image decoding apparatus according to the first embodiment of the present invention.
FIG. 17: Processing flowchart of the moving picture decoding method performed by the image decoding apparatus.
FIG. 18: Diagram showing an outline of the overall syntax structure according to the embodiment.
FIG. 19: Diagram showing an outline of the data structure of the sequence header according to the first and fourth embodiments.
FIG. 20: Diagram showing an outline of the data structure of the picture header according to the first and fourth embodiments.
FIG. 21: Diagram showing an outline of the data structure of the slice header according to the first and fourth embodiments.
FIG. 22: Diagram showing an outline of the data structure of the macroblock header according to the first and fourth embodiments.
FIG. 23: Diagram showing an outline of the data structure of the macroblock header according to the second and fifth embodiments.
FIG. 24: Diagram showing an outline of the data structure of the macroblock header according to the third and sixth embodiments.

Explanation of symbols

100 ... moving picture encoding apparatus, 101 ... transform method selection switch, 102 ... first transform unit, 103 ... second transform unit, 104 ... first quantization unit, 105 ... second quantization unit, 106 ... entropy encoding unit, 107 ... encoding control unit, 108 ... inverse transform method selection switch, 109 ... second inverse quantization unit, 110 ... first inverse quantization unit, 111 ... second inverse transform unit, 112 ... first inverse transform unit, 113 ... reference memory, 114 ... prediction unit, 115 ... input image signal, 116 ... prediction error signal, 117 ... quantized transform coefficients, 118 ... transform information, 119 ... decoded prediction error signal, 120 ... prediction mode information, 121 ... predicted image signal, 122 ... reference image signal, 124 ... encoded data,
200 ... horizontal/vertical DCT unit, 201 ... quantizer for horizontal/vertical DCT, 202 ... switch, 203 to 208 ... DCT units, 210 to 215 ... quantizers, 216 ... inverse quantizer for horizontal/vertical DCT, 217 ... horizontal/vertical inverse DCT unit, 218 ... switch, 219 to 224 ... inverse quantizers, 225 ... switch, 226 to 231 ... inverse DCT units,
401 ... entropy decoding unit, 402 ... inverse transform method selection switch, 403 ... first inverse quantization unit, 404 ... second inverse quantization unit, 405 ... first inverse transform unit, 406 ... second inverse transform unit, 407 ... reference memory, 408 ... prediction unit, 409 ... encoded data, 410 ... transform information, 411 ... prediction mode information, 412 ... quantized transform coefficients, 413 ... decoded prediction error signal, 414 ... reference image, 415 ... predicted image, 416 ... decoded image signal, 417 ... decoding control unit

Claims

A moving image encoding method for encoding an input image signal, the method including:
a prediction error signal generation step of generating a prediction error signal indicating a difference value between a prediction signal generated according to a predetermined prediction mode and the input image signal;
an orthogonal transform step of orthogonally transforming the prediction error signal to generate transform coefficients by a first transform method that depends on a prediction direction of the prediction mode or a second transform method that does not depend on the prediction direction;
a quantization step of performing a quantization process on the transform coefficients to generate quantized transform coefficients; and
an encoding process step of performing entropy encoding processing on the quantized transform coefficients and transform information indicating the first transform method or the second transform method to generate encoded data.

The moving image encoding method according to claim 1, further including a transform method selection step of selecting the first transform method or the second transform method when the prediction direction is neither the vertical direction nor the horizontal direction, and selecting the first transform method when the prediction direction is the vertical direction or the horizontal direction,
wherein the orthogonal transform step performs the orthogonal transform according to the selected transform method.

The moving image encoding method according to claim 2, wherein, when the first transform method is selected in the transform method selection step, the orthogonal transform step orthogonally transforms the prediction error signal along the prediction direction indicated by the prediction mode.

The moving image encoding method according to any one of claims 1 to 3, wherein, when performing the orthogonal transform along the prediction direction, the orthogonal transform step performs a one-dimensional orthogonal transform on a pixel sequence obtained by folding back adjacent spatially linear pixel sequences at a block end and joining them.

The moving image encoding method according to any one of claims 1 to 4, wherein, when performing the orthogonal transform along the prediction direction, the orthogonal transform step uses any of DCT, KLT, DST, and DWT as the orthogonal transform method.

The moving image encoding method according to claim 1, wherein, when the first transform method depending on the prediction direction is performed in the orthogonal transform step, the quantization step performs quantization using a different quantization method for each orthogonal transform of the first transform method.

The moving image encoding method, wherein, in encoding the transform information and the transform coefficients, the entropy encoding step encodes the transform information in addition to the transform coefficients only when the prediction direction indicated by the prediction mode information is neither the vertical direction nor the horizontal direction, and encodes only the transform coefficients, without encoding the transform information, when the prediction direction indicated by the prediction mode is the vertical direction or the horizontal direction.

The moving image encoding method according to claim 1, wherein, in the transform method selection step, the orthogonal transform method of the encoding target block is switched between the first transform method and the second transform method in units of at least one of a sequence, a picture, a slice, or a macroblock.

A moving image encoding device for encoding an input image signal, the device including:
a prediction error signal generation unit that generates a prediction error signal indicating a difference value between a prediction signal generated according to a predetermined prediction mode and the input image signal;
an orthogonal transform unit that orthogonally transforms the prediction error signal and generates transform coefficients by a first transform method that depends on a prediction direction of the prediction mode or a second transform method that does not depend on the prediction direction;
a quantization unit that performs a quantization process on the transform coefficients and generates quantized transform coefficients; and
an encoding unit that performs entropy encoding processing on the quantized transform coefficients and transform information indicating the first transform method or the second transform method, and generates encoded data.

A moving picture decoding method comprising:
a decoding step of decoding input encoded data according to a predetermined method and deriving orthogonal transform coefficients, prediction mode information, and transform information necessary for the decoding process;
an inverse transform method selection step of selecting, according to the transform information, a first inverse transform method that depends on the prediction direction or a second inverse transform method that does not depend on the prediction direction as the inverse transform method of the decoding target block of the encoded data;
an inverse quantization step of performing an inverse quantization process on the orthogonal transform coefficients obtained in the decoding step to generate inversely quantized orthogonal transform coefficients;
an inverse orthogonal transform step of inversely orthogonally transforming the orthogonal transform coefficients according to the inverse transform method selected in the inverse transform method selection step to generate a decoded prediction error signal; and
a step of generating a prediction signal from an already decoded image signal based on the encoding mode information, and adding the prediction signal to the decoded prediction error signal to generate a decoded image.

The moving picture decoding method according to claim 11, wherein, when the first transform method is selected in the inverse transform selection step, the inverse quantization step performs inverse quantization with a different inverse quantization method for each orthogonal transform depending on the prediction direction.

The moving picture decoding method according to claim 11, wherein, when the first transform method is selected in the inverse transform selection step, the inverse orthogonal transform step performs an inverse orthogonal transform along the prediction direction indicated by the prediction mode.

The moving picture decoding method according to claim 11, wherein, when performing the inverse orthogonal transform along the prediction direction, the inverse orthogonal transform step does not perform the one-dimensional inverse orthogonal transform only on transform coefficients lying on a single spatial straight line, but performs the one-dimensional inverse orthogonal transform on a transform coefficient sequence obtained by folding back at the block end and joining the transform coefficients on adjacent spatial straight lines.
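One possible reading of the folded coefficient sequence in the preceding claim is sketched below. The serpentine order (one line along the prediction direction, then the adjacent line in reverse) is an assumption about how the fold at the block end is made, and `folded_scan` and `inverse_folded_transform` are hypothetical names; the inverse DCT is used only as an example of a one-dimensional inverse orthogonal transform.

```python
import math

def folded_scan(n, direction="vertical"):
    """(row, col) positions visited line by line along the prediction
    direction, reversing every other line so adjacent lines join at the
    block end (assumed folding pattern)."""
    order = []
    for line in range(n):
        idx = range(n) if line % 2 == 0 else range(n - 1, -1, -1)
        for i in idx:
            order.append((i, line) if direction == "vertical" else (line, i))
    return order

def idct_1d(coeffs):
    """Orthonormal 1-D inverse DCT (DCT-III) of a coefficient list."""
    n = len(coeffs)
    out = []
    for x in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(s)
    return out

def inverse_folded_transform(coeff_block, n, direction="vertical"):
    """Gather coefficients along the folded path, apply one long 1-D
    inverse transform, and scatter the samples back along the same path."""
    path = folded_scan(n, direction)
    sequence = [coeff_block[r][c] for r, c in path]
    samples = idct_1d(sequence)
    block = [[0.0] * n for _ in range(n)]
    for (r, c), value in zip(path, samples):
        block[r][c] = value
    return block
```

For an n x n block this replaces n separate length-n transforms with a single transform of length n*n, so a smooth residual that continues from one line into the next can be represented by fewer significant coefficients.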

The moving picture decoding method according to claim 11, wherein, when performing the inverse orthogonal transform along the prediction direction, the inverse orthogonal transform step uses any one of an inverse DCT, an inverse KLT, an inverse DST, and an inverse DWT as the inverse orthogonal transform method.

The moving picture decoding method according to claim 11, wherein, in the inverse transform method selection step, the inverse orthogonal transform method for the decoding target block of the encoded data is switched between the first inverse transform method and the second inverse transform method in units of at least one of a sequence, a picture, a slice, or a macroblock.

A decoding unit that decodes input encoded data according to a predetermined method and derives orthogonal transform coefficients necessary for decoding processing, prediction mode information, and transform information;
An inverse transform method selection unit that selects, according to the transform information, either a first inverse transform method that depends on the prediction direction or a second inverse transform method that does not depend on the prediction direction as the inverse transform method for the decoding target block of the encoded data;
An inverse quantization unit that performs inverse quantization on the orthogonal transform coefficients obtained by the decoding unit and generates inverse-quantized orthogonal transform coefficients;
An inverse orthogonal transform unit that performs an inverse orthogonal transform on the orthogonal transform coefficients according to the inverse transform method selected by the inverse transform method selection unit and generates a decoded prediction error signal;
An image restoration unit that generates a prediction signal from an already decoded image signal based on the coding mode information, and adds the prediction signal to the decoded prediction error signal to generate a decoded image;
A moving picture decoding apparatus comprising: