JP2010135864A - Image encoding method, device, image decoding method, and device

Info

Publication number: JP2010135864A
Application number: JP2007087202A
Authority: JP
Inventors: Akiyuki Tanizawa; 昭行谷沢; Taiichiro Shiodera; 太一郎塩寺; Takeshi Nakajo; 健中條
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2007-03-29
Filing date: 2007-03-29
Publication date: 2010-06-17
Also published as: TW200850012A; WO2008123254A1

Abstract

PROBLEM TO BE SOLVED: To provide a moving-image encoding device that reduces computational cost and improves encoding efficiency.

SOLUTION: An image encoding method includes the steps of: generating a predicted image for an encoding target pixel block on the basis of a selected prediction mode; determining an optimum prediction mode on the basis of the prediction error between an input image and the predicted image and the code amount of the prediction mode; rearranging, according to the determined prediction mode, the selection-frequency order of the prediction modes in a frequency information table; generating an index of the rearranged frequency information table; extracting prediction mode information from the index for the encoding target pixel block; generating a predicted image signal corresponding to the extracted prediction mode information; calculating the cost of the prediction mode; selecting one encoding mode on the basis of the cost; and encoding, according to the selected encoding mode, the prediction error signal, the table length of the frequency information table, and an index number indicating the selected encoding mode.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a method and apparatus for prediction mode estimation, image encoding, and image decoding for moving images or still images.

A moving-picture coding method with significantly improved coding efficiency over earlier methods has been recommended jointly by ITU-T and ISO/IEC as ITU-T Rec. H.264 and ISO/IEC 14496-10 (hereinafter "H.264"). Conventional intra-picture coding schemes such as ISO/IEC MPEG-1, 2, 4 and ITU-T H.261 and H.263 perform intra-frame prediction in the frequency domain (on DCT coefficients) after the orthogonal transform in order to reduce the code amount of the transform coefficients. H.264, by contrast, adopts directional prediction in the spatial (pixel) domain (Non-Patent Document 1) and thereby achieves higher prediction efficiency than the intra-frame prediction of the conventional (ISO/IEC MPEG-1, 2, 4) video coding schemes.

In the H.264 High Profile and similar profiles, three intra-frame prediction methods are defined for the luminance signal, and one of them can be selected for each macroblock (16×16 pixel block). The three methods are called 4×4 pixel prediction, 8×8 pixel prediction, and 16×16 pixel prediction.

In 16×16 pixel prediction, four encoding modes are defined: vertical prediction, horizontal prediction, DC prediction, and plane prediction. The pixel values of the surrounding macroblocks that have been decoded but not yet deblocking-filtered are used as reference pixel values for the prediction process. The prediction mode information of 16×16 prediction is contained in the macroblock type, so the code amount needed to transmit the mode is much smaller than for the other prediction methods.

By contrast, 4×4 and 8×8 pixel prediction divide the luminance signal of a macroblock into sixteen 4×4 pixel blocks or four 8×8 pixel blocks, respectively, and select one of nine modes for each pixel block. Except for DC prediction (mode 2), which predicts from the average of the available reference pixels, the nine modes have prediction directions spaced 22.5 degrees apart, and the predicted values are generated by extrapolating the reference pixels along the prediction direction. Because the unit of prediction is smaller than in 16×16 pixel prediction, 4×4/8×8 pixel prediction can predict relatively efficiently even for images with complex textures. However, the prediction simply copies interpolated values along the prediction direction, so the prediction error grows as the distance from the reference pixels increases.

Thus, in recent video coding schemes the number of selectable prediction modes tends to grow as hardware performance improves, and the increase in code amount caused by encoding the prediction mode information has become a major problem. The H.264 High Profile also has many prediction modes, with four modes for 16×16 pixel prediction and nine modes each for 4×4 and 8×8 pixel prediction, and the prediction modes for small pixel blocks tend to be selected less often when encoding at low bit rates. Moreover, the larger number of prediction modes increases computational cost, so portable and power-saving devices may be unable to use the small-block prediction modes when encoding.

To address this problem, Non-Patent Document 2 introduces a direct prediction mode that replaces 16×16 pixel prediction, which has a low selection rate as a prediction mode. In particular, the plane prediction of 16×16 pixel prediction generates predicted pixel values for all 256 pixels, so its computational cost is higher than that of the other prediction modes. The direct prediction mode uses 4×4 pixel prediction as it is and does not transmit prediction mode information to the decoder. To do so, it uses the mode derivation method defined in H.264 to predict the prediction mode from the prediction modes of the upper and lower pixel blocks adjacent to the pixel block to be encoded. Because 4×4 pixel prediction is used, the prediction mode information can be reduced while maintaining encoding efficiency as long as the mode prediction is correct. When the mode prediction is wrong, however, encoding efficiency drops.

Greg Conklin, "New Intra Prediction Modes", ITU-T Q.6/SG16 VCEG, VCEG-N54, Sep. 2001.
Lu Yu, Feng Yi, "Low complexity intra prediction", ITU-T SG16/Q.6 VCEG-Z14, April 2005.

As explained above, when the encoding mode is transmitted by the method defined in the H.264 High Profile, the code amount of the mode information cannot be ignored at low bit rates, so prediction modes with good prediction performance become less likely to be selected and encoding efficiency falls. In the direct prediction mode, encoding efficiency falls when the mode prediction is wrong.

According to an embodiment of the present invention, there is provided an image encoding method comprising the steps of: preparing a frequency information table indicating the selection frequency of auxiliary information relating to prediction modes; dividing an input image into a plurality of pixel blocks; selecting auxiliary information relating to a prediction mode according to an encoding target pixel block among the pixel blocks; generating a predicted image for the encoding target pixel block from a reference image on the basis of the selected auxiliary information; determining an optimum prediction mode on the basis of the prediction error between the input image and the predicted image and the code amount of the prediction mode, and rearranging the selection-frequency order of the prediction modes in the frequency information table according to the determined prediction mode; generating an index of the rearranged frequency information table; extracting one or more pieces of auxiliary information from the index for the encoding target pixel block; generating a prediction signal corresponding to the extracted auxiliary information; calculating the cost of the prediction mode and selecting one encoding mode on the basis of the cost; and encoding, according to the selected encoding mode, the prediction error signal, the table length of the frequency information table, and an index number in the frequency information table indicating the selected encoding mode.

According to the present invention, an image encoding/decoding method and apparatus can be realized that improve encoding efficiency while reducing hardware cost.

Preferred embodiments of the moving image encoding method, moving image encoding apparatus, moving image decoding method, and moving image decoding apparatus according to the present invention are described in detail below with reference to the accompanying drawings.

The configuration of a moving image encoding apparatus according to an embodiment of the present invention is described with reference to FIG. 1.

Configuration of the moving image encoding apparatus
(Encoding: first embodiment)
In the moving image encoding apparatus shown in FIG. 1, a moving image signal is divided into small pixel blocks and input to the encoding unit 100. In the encoding unit 100, a plurality of prediction modes that differ in block size and in the method of generating the predicted image signal are available to the internal prediction/mode determination unit 102. In the present embodiment, encoding is assumed to proceed from the upper left to the lower right, as shown in FIG. 5(a).

The input image signal 110 input to the encoding unit 100 is divided by the screen dividing unit 101 into blocks of 16×16 pixels as shown in FIG. 5(b). Each pixel block of the input image signal 110 is input to the internal prediction/mode determination unit 102. After passing through the internal prediction/mode determination unit 102, the pixel block passes through the mode determination unit 103 and the transform quantization unit 104, both described later, and is finally encoded by the encoding processing unit 105. The encoded pixel block is accumulated in an output buffer and then output as encoded data 115 at an output timing managed by the encoding control unit 108.

The 16×16 pixel block is called a macroblock and is the basic processing block size for the encoding process described below. The encoding unit 100 reads the input image signal 110 in units of macroblocks and encodes it. The macroblock may instead be a 32×32 pixel block or an 8×8 pixel block. An example of a macroblock is shown in FIG. 5(b).

The internal prediction/mode determination unit 102 uses the already encoded reference pixels temporarily stored in the reference image memory 107 to generate the predicted image signal 111 in every prediction mode selectable for the macroblock. That is, the internal prediction/mode determination unit 102 generates the predicted image signals for all encoding modes that the encoding target pixel block can take. However, where the next prediction cannot be performed until a locally decoded image has been created within the macroblock, as in H.264 intra-frame prediction (4×4 pixel prediction (see FIG. 5(c)) or 8×8 pixel prediction (see FIG. 5(d))), the internal prediction/mode determination unit 102 may internally perform the transform and quantization and the inverse quantization and inverse transform.

The predicted image signal 111 generated by the internal prediction/mode determination unit 102 is input to the mode determination unit 103 together with the input image signal 110. The mode determination unit 103 inputs the predicted image signal 111 to the inverse quantization inverse transform unit 106, generates a prediction error signal 112 by subtracting the predicted image signal 111 from the input image signal 110, and inputs it to the transform quantization unit 104. At the same time, the mode determination unit 103 performs mode determination on the basis of the mode information predicted by the internal prediction/mode determination unit 102 and the generated prediction error signal 112. More specifically, in the present embodiment the mode determination unit 103 performs mode determination using the cost given by the following equation.

K = SAD + λ × OH    (1)
Here, OH is the mode information and SAD is the sum of absolute values of the prediction error signal. λ is a constant determined on the basis of the quantization width or the value of the quantization parameter. The mode is determined on the basis of the cost obtained in this way: the mode that gives the smallest cost K is selected as the optimum mode.
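As a rough illustration of equation (1), the following Python sketch picks the candidate mode with the smallest K. It assumes that OH can be represented as a number of bits needed to signal the mode and that each candidate already has a predicted block; the function names `sad` and `select_mode` and the data layout are hypothetical and do not come from the patent.

```python
def sad(block, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, pred)
               for a, b in zip(row_a, row_b))


def select_mode(block, candidates, lam):
    """Return (best_mode, best_cost) minimising K = SAD + lam * OH.

    candidates maps a mode id to (predicted_block, oh), where oh is the
    ASSUMED overhead in bits for signalling that mode.
    """
    best_mode, best_cost = None, float("inf")
    for mode, (pred, oh) in candidates.items():
        k = sad(block, pred) + lam * oh
        if k < best_cost:
            best_mode, best_cost = mode, k
    return best_mode, best_cost


# Toy 2x2 example: mode 0 predicts slightly worse but is cheap to signal.
block = [[10, 10], [10, 10]]
candidates = {
    0: ([[10, 10], [10, 12]], 1),  # SAD = 2, 1 bit of overhead
    1: ([[10, 10], [10, 10]], 4),  # SAD = 0, 4 bits of overhead
}
```

With a large λ the cheaply signalled mode 0 wins even though its prediction is worse; with a small λ the exact prediction of mode 1 wins. This is the trade-off the text attributes to low-bit-rate encoding.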

The present embodiment uses the mode information and the sum of absolute values of the prediction error signal. In other embodiments, the mode may be determined using only the mode information or only the sum of absolute values of the prediction error signal, and these quantities may be Hadamard transformed or replaced with approximations. The cost may also be formed using the activity of the input image signal, or the cost function may be formed using the quantization width or the quantization parameter.

In another embodiment for calculating the cost, a provisional encoding unit is provided, and the mode may be determined using the code amount obtained when the prediction error signal generated in each encoding mode is actually encoded by the provisional encoding unit, together with the squared error between the input image signal 110 and the locally decoded image 114 obtained by locally decoding the encoded data. The mode determination equation in this case is as follows.

J = D + λ × R    (2)
Here, D is the coding distortion, that is, the squared error between the input image signal 110 and the locally decoded image 114, and R is the code amount estimated by provisional encoding. Using this cost requires provisional encoding and local decoding (inverse quantization and inverse transform) for every encoding mode, which increases the circuit scale, but it allows the exact code amount and coding distortion to be used and so keeps encoding efficiency high. This cost may also be calculated using only the code amount or only the coding distortion, or the cost function may be formed using approximations of these values.
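Equation (2) can be sketched in the same way. The Python example below assumes that the locally decoded block and the bit count from provisional encoding are already available; it does not model the provisional encoder itself, and the names `ssd` and `rd_cost` are hypothetical.

```python
def ssd(block, recon):
    """Sum of squared differences between the input block and its reconstruction."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block, recon)
               for a, b in zip(row_a, row_b))


def rd_cost(block, recon, rate_bits, lam):
    """J = D + lam * R, with D as squared error and R as an ASSUMED bit count."""
    return ssd(block, recon) + lam * rate_bits


block = [[10, 12], [8, 9]]
recon = [[10, 10], [8, 10]]             # locally decoded block (assumed given)
j = rd_cost(block, recon, rate_bits=6, lam=0.5)
```

In an encoder following this embodiment, `rd_cost` would be evaluated once per candidate mode and the mode with the smallest J would be kept, which is why the text notes that this variant needs a full encode/decode pass per mode.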

The mode determination unit 103 is connected to the transform quantization unit 104 and the inverse quantization inverse transform unit 106, and the mode information selected by the mode determination unit 103 and the prediction error signal 112 are input to the transform quantization unit 104. The transform quantization unit 104 transforms the input prediction error signal 112 into transform coefficients and generates transform coefficient data. Here the prediction error signal 112 is orthogonally transformed using, for example, the discrete cosine transform. In other embodiments, the transform coefficients may be produced using techniques such as the wavelet transform or independent component analysis. The transform coefficient data is quantized in the transform quantization unit 104 to produce the quantized transform coefficients 113. The quantization parameter needed for quantization is set in the encoding control unit 108.

The quantized transform coefficients 113 are input to the encoding processing unit 105 together with information on the prediction method, such as the mode information and the quantization parameter. The encoding processing unit 105 entropy-encodes the quantized transform coefficients 113 together with the input mode information and other data (for example by Huffman coding or arithmetic coding). The encoded data 115 entropy-encoded by the encoding processing unit 105 is output from the encoding unit 100, multiplexed by a multiplexer or the like (not shown), and transmitted through an output buffer (not shown). In this case, the index length of the frequency table can be sent for each coded sequence, each picture, or each coded slice. The table length can be sent per sequence, per picture, or per slice, and the index can be sent per macroblock or per block. The table length can be sent in the header data per sequence or per slice, and/or the index can be sent in the header data per macroblock.

The inverse quantization inverse transform unit 106 inversely quantizes the transform coefficients 113 quantized by the transform quantization unit 104, in accordance with the quantization parameter, the quantization matrix, and other settings held in the encoding control unit 108. The inversely quantized transform coefficients are inversely transformed (for example by the inverse discrete cosine transform) and restored to the prediction error signal (112). The restored prediction error signal (112) obtained by the inverse transform is added to the predicted image signal 111, supplied from the mode determination unit 103, of the prediction mode corresponding to that prediction error signal. The sum becomes the locally decoded signal 114 and is input to the reference image memory 107. The reference image memory 107 stores the reconstructed image. The reconstructed image stored in the reference image memory 107 in this way is referred to when the internal prediction/mode determination unit 102 generates predicted image signals and the like.
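The final addition step of this local decoding loop can be sketched as follows. This Python example covers only the step in which the restored prediction error is added to the predicted image; inverse quantization and the inverse transform are assumed to have been done already. Clipping the result to the sample range is an assumption of this sketch (it is common practice but is not stated in the text), and the name `reconstruct` is hypothetical.

```python
def reconstruct(pred, restored_residual, bit_depth=8):
    """Add the restored prediction error to the predicted block.

    The result is clipped to [0, 2**bit_depth - 1]; the clipping is an
    assumption of this sketch, not something the text specifies.
    """
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred, restored_residual)]


pred = [[100, 100], [250, 3]]
residual = [[-4, 7], [10, -5]]          # restored prediction error (assumed given)
local_decoded = reconstruct(pred, residual)
```

The block produced here plays the role of the locally decoded signal 114 that is written to the reference image memory for use by later predictions.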

The encoding loop (the processing that flows in FIG. 1 in the order internal prediction/mode determination unit 102 → mode determination unit 103 → transform quantization unit 104 → inverse quantization inverse transform unit 106 → reference image memory 107) completes one pass when all modes selectable for the encoding target macroblock have been processed. When the encoding loop for this macroblock ends, the input image signal 110 of the next macroblock is input and encoded.

The encoding control unit 108 performs feedback control of the generated code amount, quantization characteristic control, mode determination control, and so on. It performs rate control that regulates the generated code amount, control of the internal prediction/mode determination unit 102, control of externally input parameters, and control of encoding as a whole. It also controls the output buffer (not shown) and outputs the encoded data at an appropriate timing. The functions of these units can be realized by a program stored in a computer.

The above is the configuration of the moving image encoding apparatus according to the present embodiment. The moving image encoding method according to the present invention is described below with reference to FIGS. 2, 3, and 4, taking as an example the case where it is carried out by the moving image encoding apparatus.

FIG. 2 is a block diagram showing the configuration of the internal prediction/mode determination unit 102 in the encoding unit 100 of FIG. 1. In FIG. 2, components common to FIG. 1 are given the same reference numerals and their description is omitted.

The internal prediction/mode determination unit 102 has a prediction control unit 501, which receives the index length 116 from the encoding control unit 108, and a mode control unit 502, which receives the input image signal 110 and the locally decoded signal (reference image) 114 from the reference image memory 107. The prediction control unit 501 and the mode control unit 502 are connected as shown in FIG. 2. That is, the predicted image signal 111 output from the prediction control unit 501 is input to the mode control unit 502 and is also input to the inverse quantization inverse transform unit 106 via the mode determination unit 103 of FIG. 1. The decoded signal 504 output from the mode control unit 502 is input to the prediction control unit 501. In addition, the input image signal 110 is input to the subtractor 506, where the predicted image signal 111 output from the prediction control unit 501 is subtracted from it to generate the prediction error signal 112.

The prediction control unit 501 is described in detail with reference to FIG. 3. The prediction control unit 501 has a frequency information table extraction unit 201, which receives the index length information 116 from the encoding control unit 108 shown in FIG. 2, and a frequency information table generation unit 202. The frequency information table generation unit 202 tabulates the frequency of the prediction information 209 of the pixel blocks encoded so far. When a pixel block is encoded, the frequency information table of the frequency information table generation unit 202 is updated in accordance with the prediction information 209 supplied from the control unit 210. The updated frequency information table is sent to the frequency information table extraction unit 201.

The frequency information table is now described concretely. FIG. 7 shows how the frequency information table is updated. The numbers shown in FIG. 7 are prediction mode numbers. The frequency information table is updated, according to the number of the selected prediction mode, each time the mode determination for one pixel block is completed. First, the table is reordered (sorted) with respect to the prediction modes of the pixel blocks adjacent above and to the left of the encoding target pixel block. Take the rightmost pixel block in the figure as an example. The prediction mode of the block above it is 1 and that of the block to its left is 7. In this case, prediction mode 7 of the left neighbor is located in the previous frequency information table and moved to first place (table index 0). Next, prediction mode 1 of the upper neighbor is located in the table and moved to second place (table index 1). By moving the prediction modes of the upper and left neighbors of each pixel block to the top of the frequency information table in this way, frequency information on the prediction modes can be obtained.
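The update described for FIG. 7 behaves like a move-to-front list. The Python sketch below reproduces the worked example (left mode 7 to index 0, then upper mode 1 to index 1). How to handle an unavailable neighbor, or two neighbors with the same mode, is not spelled out in this passage, so the behavior chosen here (skip a missing neighbor, move a repeated mode only once) is an assumption, and `update_table` is a hypothetical name.

```python
def update_table(table, left_mode, up_mode):
    """Move the left neighbour's mode to index 0 and the upper one to index 1.

    A neighbour given as None is skipped, and a mode already moved is not
    moved again; both choices are assumptions of this sketch.
    """
    new = list(table)           # keep the previous table unchanged
    pos = 0
    for mode in (left_mode, up_mode):
        if mode is None or mode in new[:pos]:
            continue
        new.remove(mode)        # find the mode in the previous ordering
        new.insert(pos, mode)   # promote it to the next top slot
        pos += 1
    return new


previous = [0, 1, 2, 3, 4, 5, 6, 7, 8]
updated = update_table(previous, left_mode=7, up_mode=1)
```

For the example in the text, `updated` starts with 7 and 1, and the remaining modes keep their previous relative order.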

Captured images generally tend to have similar properties owing to the optical characteristics of the camera and the transform/quantization process, so similar prediction methods tend to be selected over contiguous regions during encoding. With this frequency information table, the prediction modes selected before the encoding target pixel block sit near the top of the table, and unused prediction modes sit near the bottom.

The index length set in the control unit 210 defines the length of the table index shown in FIG. 7. For example, an index length of 0 means that only the prediction mode at index 0 of the frequency information table is predicted and encoded. Likewise, when the index length is 1, the prediction modes at indices 0 to 3 of the table are predicted and encoded; when the index length is 2, those at indices 0 to 7; and when the index length is 3, those at indices 0 to 15. When the index length is N, the number of prediction modes available from the table is determined by the following equation.

L = 1 << N    (3)
Because the prediction modes used most frequently for encoding sit near the top of the frequency information table, predicted images are generated only for those prediction modes whose mode prediction is most likely to be correct. Prediction using this scheme is hereinafter called flexible mode prediction.
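A minimal Python sketch of the candidate extraction follows. It applies equation (3) literally, so N = 0, 1, 2, 3 yields 1, 2, 4, 8 candidate modes. The ranges listed in the preceding paragraph for N = 1 to 3 (indices 0 to 3, 0 to 7, 0 to 15) are larger than that, so the intended mapping may differ from this sketch. The function names are hypothetical.

```python
def num_modes(index_length):
    """Number of candidate modes L for index length N, following L = 1 << N."""
    return 1 << index_length


def extract_modes(table, index_length):
    """Return the top-L entries of the frequency information table.

    Python slicing clamps at the end of the list, so an L larger than the
    table simply yields the whole table.
    """
    return table[:num_modes(index_length)]


table = [7, 1, 0, 2, 3, 4, 5, 6, 8]     # most likely modes first
top = extract_modes(table, 2)           # candidates tried by the predictors
```

Only the modes in `top` would then be passed to the predictors and cost calculation, which is how flexible mode prediction limits the number of predicted images generated per block.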

The frequency information table generated by the frequency information table generation unit 202 is output to the frequency information table extraction unit 201. The frequency information table extraction unit 201 extracts from the input frequency information table the L prediction modes corresponding to the index length information 116. The frequency information table extraction unit 201 can extract prediction modes according to whether the quantization scale of the encoding target macroblock is large or small. It can also extract prediction modes according to whether the resolution of the input image signal is high or low. The extracted prediction modes are output to the prediction mode setting unit 203. The prediction mode setting unit 203 selects one of the extracted prediction modes and sets it as the selected prediction mode. This information is set in the control unit 210 as table information 211, and the prediction changeover switch 207 is switched according to the selected prediction mode. The output terminal of the switch 207 is connected to the corresponding one of the predictors (1, 2, ..., N) 204.

The predictors (1, 2, ..., N) 204 represent a plurality of prediction methods. Each prediction mode set by the prediction mode setting unit 203 corresponds to one of the predictors 204 numbered 1 to N, and prediction is performed by the prediction method defined in advance for that predictor. Here, as an example, the 4×4 pixel (directional) prediction defined in H.264 is performed.

H.264 defines nine prediction modes, mode 0 to mode 8. As shown in FIG. 8(a), every mode except mode 2 has its own prediction direction, and the directions differ from one another in steps of 22.5 degrees; mode 2 is DC prediction. FIG. 8(b) shows the relationship between the prediction block and the reference pixels in 4×4 pixel prediction. The pixels labeled with uppercase letters A to M are reference pixels, and the pixels labeled with lowercase letters a to p are the pixels to be predicted.

予測器２０４に関して、予測方法を説明する。予測器２０４では、モード２のＤＣ予測が選択された場合、次式で予測画素が計算される。 A prediction method for the predictor 204 will be described. In the predictor 204, when mode 2 DC prediction is selected, a prediction pixel is calculated by the following equation.

$$H = A + B + C + D, \quad V = I + J + K + L \tag{4}$$

$$a \text{ to } p = (H + V + 4) \gg 3$$

When some reference pixels cannot be used, the prediction uses the average value of the reference pixels that are available. When no reference pixel is available, the predicted value is half the maximum luminance value of the encoding device (128 for 8 bits). When another mode is selected, the predictor 204 uses a prediction method that copies a predicted value interpolated from the reference pixels along the prediction direction shown in FIG. 8(a). Specifically, the predicted values for mode 0 (vertical prediction) are generated as follows.

$$\begin{aligned} a, e, i, m &= A \\ b, f, j, n &= B \\ c, g, k, o &= C \\ d, h, l, p &= D \end{aligned} \tag{5}$$

This mode can be selected only when reference pixels A to D are available. Details of the prediction method are shown in FIG. 8(c). The luminance values of reference pixels A to D are copied unchanged in the vertical direction and filled in as predicted values.
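Equations (4) and (5) can be illustrated with a short sketch. It assumes the 4×4 block is stored row by row (a b c d / e f g h / i j k l / m n o p), that `top` holds the reference pixels [A, B, C, D] and `left` holds [I, J, K, L], and that an unavailable side is passed as `None`. The function names and the rounding used when only one side is available are assumptions, not taken from the original.

```python
def predict_dc(top, left, bit_depth=8):
    """Mode 2 (DC prediction): fill the 4x4 block with one value."""
    if top is not None and left is not None:
        h, v = sum(top), sum(left)
        dc = (h + v + 4) >> 3              # Equation (4)
    elif top is not None or left is not None:
        available = top if top is not None else left
        dc = (sum(available) + 2) >> 2     # mean of available pixels (rounding assumed)
    else:
        dc = 1 << (bit_depth - 1)          # half the maximum value: 128 for 8 bits
    return [[dc] * 4 for _ in range(4)]


def predict_vertical(top):
    """Mode 0 (vertical prediction): copy A..D down each column, Equation (5)."""
    if top is None:
        raise ValueError("mode 0 requires reference pixels A to D")
    return [list(top) for _ in range(4)]
```

In `predict_vertical`, every row equals [A, B, C, D], so the first column (a, e, i, m) is A, the second (b, f, j, n) is B, and so on.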

Prediction modes other than modes 0 and 2 use almost the same framework: an interpolated value is generated from the reference pixels available for the prediction direction, and that value is copied along the prediction direction. The correspondence between pixel blocks and prediction modes is shown in FIG. 9. N/A in the figure indicates that the corresponding prediction method is not defined.

予測器２０４から出力される予測画像信号１１１は内部予測／モード判定部１０２のモード制御部５０２（図２）へと出力されるとともに符号化部１００のモード判定部１０３に入力される。 The predicted image signal 111 output from the predictor 204 is output to the mode control unit 502 (FIG. 2) of the internal prediction / mode determination unit 102 and also input to the mode determination unit 103 of the encoding unit 100.

Here, the predicted image signal 111 is added to the prediction-mode residual signal 305 generated by the local decoding processing in the mode control unit 502 shown in FIG. 4, and the sum becomes the decoded signal 306. The decoded signal 306 is input to the internal reference image memory 205, which stores it. The stored decoded image is read out as needed during subsequent predicted image generation, output to the predictor 204, and used as a reference image.

以上が本実施の形態にかかる予測制御部５０１の構成である。次に、図４を参照してモード制御部５０２の構成を説明する。ここでは、図１及び図２と共通する構成要素には同一の符号を付けてその説明を省略する。 The above is the configuration of the prediction control unit 501 according to the present embodiment. Next, the configuration of the mode control unit 502 will be described with reference to FIG. Here, the same reference numerals are given to the same components as those in FIGS. 1 and 2, and the description thereof is omitted.

モード制御部５０２は、マクロブロックサイズよりも小さいブロックサイズの予測も行う。このモード制御部５０２は、内部モード判定部３０１、内部変換量子化部３０２、仮符号化処理部３０３、内部逆量子化逆変換部３０４、加算器３０５により構成される。 The mode control unit 502 also predicts a block size smaller than the macroblock size. The mode control unit 502 includes an internal mode determination unit 301, an internal transform quantization unit 302, a provisional encoding processing unit 303, an internal inverse quantization inverse transform unit 304, and an adder 305.

The predicted image signal 111 output from the prediction control unit 501 is input, together with the input image signal 110 and the local decoded signal 114, to the internal mode determination unit 301 in the mode control unit 502. The internal mode determination unit 301 determines the prediction mode: it calculates the encoding cost of each prediction mode using, for example, Equation (1) or Equation (2), and selects the optimal prediction mode. The predicted image signal 111 that has passed through the internal mode determination unit 301 is input to the internal transform quantization unit 302 and orthogonally transformed, for example by a discrete cosine transform. As another embodiment, the transform coefficients may be generated using a technique such as a wavelet transform or independent component analysis. The transform coefficients 308 are then quantized. The quantization parameter required for quantization is set in the encoding control unit 108. The transform coefficients 308 are output both to the provisional encoding processing unit 303 and to the internal inverse quantization inverse transform unit 304. The provisional encoding processing unit 303 performs provisional encoding on the transform coefficients 308 in order to calculate the code amount 309. The code amount 309 may be fed back to the internal mode determination unit 301 and used to calculate the encoding cost. The transform coefficients 308 encoded by the provisional encoding processing unit 303 correspond to the encoded data 115 of the encoding unit 100.

一方、内部逆量子化逆変換部３０４では、得られた変換係数３０８を逆量子化する。ここでは変換量子化部３０２で利用された量子化に関するパラメータを用いて処理が行われる。さらに逆量子化された変換係数は逆変換（例えば逆離散コサイン変換など）を行い、量子化された予測残差信号を生成する。この予測残差信号は加算器３０５へと入力され、内部モード判定部３０１から供給される予測画像信号１１１と加算される。予測残差信号と予測画像信号１１１の加算信号は復号信号３０６となる。モード制御部５０２は、復号信号３０６を予測制御部５０１へと出力する。前述した局部復号化処理とは、モード制御部５０２内の内部モード判定部３０１⇒内部変換量子化部３０２⇒内部逆量子化逆変換部３０４⇒加算器３０５に対応する処理のことを指している。 On the other hand, the internal inverse quantization inverse transform unit 304 inversely quantizes the obtained transform coefficient 308. Here, processing is performed using the parameters relating to quantization used in the transform quantization unit 302. Further, the inversely quantized transform coefficient is subjected to inverse transform (for example, inverse discrete cosine transform) to generate a quantized prediction residual signal. This prediction residual signal is input to the adder 305 and added to the prediction image signal 111 supplied from the internal mode determination unit 301. An addition signal of the prediction residual signal and the prediction image signal 111 becomes a decoded signal 306. The mode control unit 502 outputs the decoded signal 306 to the prediction control unit 501. The above-described local decoding processing refers to processing corresponding to the internal mode determination unit 301 in the mode control unit 502 ⇒ the internal transform quantization unit 302 ⇒ the internal inverse quantization inverse transform unit 304 ⇒ the adder 305. .
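The local decoding path can be sketched in a simplified form. To keep the example short, the orthogonal transform and its inverse are omitted and a uniform scalar quantizer with a hypothetical step size `qstep` is applied directly to the difference between the input block and the predicted block; blocks are flat lists of pixel values. None of these simplifications or names come from the original.

```python
def local_decode(input_block, predicted_block, qstep):
    """Quantize the prediction residual, reconstruct it, and add the prediction."""
    residual = [x - p for x, p in zip(input_block, predicted_block)]
    # Quantization (stand-in for internal transform quantization unit 302).
    levels = [int(round(r / qstep)) for r in residual]
    # Inverse quantization (stand-in for unit 304).
    reconstructed_residual = [level * qstep for level in levels]
    # Adder 305: prediction plus reconstructed residual gives the decoded signal.
    decoded = [p + r for p, r in zip(predicted_block, reconstructed_residual)]
    return levels, decoded


levels, decoded = local_decode([99, 105, 97, 90], [98, 98, 98, 98], qstep=4)
```

The decoded block differs from the input because of quantization; it is this lossy reconstruction, not the original input, that is stored as the reference for later predictions, so that encoder and decoder predict from identical data.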

The internal prediction loop (the processing in FIGS. 3 and 4 that flows in the order prediction mode setting unit 203 ⇒ prediction changeover switch 207 ⇒ predictor 204 ⇒ internal mode determination unit 301 ⇒ internal transform quantization unit 302 ⇒ internal inverse quantization inverse transform unit 304 ⇒ adder 305 ⇒ internal reference image memory 205) counts as one loop when every prediction mode selectable for a small pixel block in the macroblock has been processed.

For example, 4×4 pixel prediction requires a total of 16 internal prediction loops. In this case, the control unit 210 has the prediction mode setting unit 203 set each prediction mode extracted by the frequency information table extraction unit 201, operates the prediction changeover switch 207, and runs the 16 internal prediction loops to determine the optimal combination of modes. The prediction modes obtained here are sequentially input, together with the predicted image signal 111, to the internal mode determination unit 301 of the mode control unit 502, which determines the optimal mode for the encoding target pixel block.

マクロブロックに対して内部予測ループが終了すると、次のマクロブロックの入力画像信号１１０が入力され、符号化が行われる。 When the inner prediction loop is completed for the macroblock, the input image signal 110 of the next macroblock is input and encoding is performed.

以上が本実施の形態における、動画像符号化装置１００の概要である。 The above is the outline of the moving picture coding apparatus 100 in the present embodiment.

In the present embodiment, H.264 intra-frame prediction is used as an example of the prediction method of the predictor 204. However, the scheme does not depend on the prediction method, so a different prediction method can be applied. For example, during inter-frame prediction, a frequency information table may be used to predict the motion compensation block size or the motion vector. A frequency information table may also be created for the prediction modes of unidirectional and bidirectional prediction.

In the present embodiment, the prediction modes of the left and upper pixel blocks are referenced when the frequency information table of prediction modes is updated, but the table may be updated from a wider neighborhood of the encoding target pixel block. Specifically, the prediction mode of the co-located pixel block in a temporally preceding or following frame may be used, or the frequency information table may be updated using the prediction modes selected in any available upper-right pixel block, upper-left pixel block, the pixel block above the upper pixel block, the pixel block to the left of the left pixel block, and so on.

また、本実施の形態においては、予測モードの頻度情報テーブルの更新ルールとして、左の画素ブロックの予測モードをインデックス０、上の画素ブロックの予測モードをインデックス１に挿入し、ソーティングを行っていたが、上の画素ブロックの予測モードをインデックス０、左の画素ブロックの予測モードをインデックス１に挿入し、ソーティングを行っても良いし、上述したように隣接画素ブロックを拡張して、頻度情報テーブルのソーティングを行っても良い。また、頻度情報テーブルを予測モード数に併せて複数所持しても良いし、テーブルごとに異なる更新ルールを適用しても良い。いずれにせよ、符号化器と復号化器で同じ頻度情報テーブルを持っている必要がある。 Further, in this embodiment, as the update rule of the prediction mode frequency information table, the prediction mode of the left pixel block is inserted into the index 0, and the prediction mode of the upper pixel block is inserted into the index 1, and sorting is performed. However, the prediction mode of the upper pixel block may be inserted into the index 0, the prediction mode of the left pixel block may be inserted into the index 1, and sorting may be performed, or the adjacent pixel block may be expanded as described above, and the frequency information table You may sort. Further, a plurality of frequency information tables may be provided in accordance with the number of prediction modes, and different update rules may be applied for each table. In any case, the encoder and decoder need to have the same frequency information table.
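One possible reading of the update rule can be sketched as follows. The sketch assumes that "inserting and sorting" means moving the left block's mode to index 0 and the upper block's mode to index 1 while the remaining modes keep their relative order; the exact sorting rule is not spelled out in this section, and the function name is hypothetical.

```python
def update_frequency_table(table, left_mode, upper_mode):
    """Move the neighbouring blocks' modes to the top of the table."""
    head = [left_mode]                 # left block's mode goes to index 0
    if upper_mode != left_mode:
        head.append(upper_mode)        # upper block's mode goes to index 1
    # Remaining modes keep their previous relative order (assumed rule).
    rest = [mode for mode in table if mode not in head]
    return head + rest


table = list(range(9))                                  # modes 0..8
updated = update_frequency_table(table, left_mode=4, upper_mode=2)
```

Because the update depends only on modes that have already been encoded, a decoder that applies the same function to the same neighbours arrives at the same table without any side information.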

In the present embodiment, the frame to be processed is divided into rectangular blocks of, for example, 16×16 pixels, and the blocks are encoded in order from the upper left of the screen toward the lower right, but another processing order may be used. For example, processing may proceed from the lower right to the upper left, from the upper right to the lower left, in a spiral outward from the center of the screen, or from the periphery of the screen toward the center.

また、実施の形態においては、変換量子化ブロックサイズを１６ｘ１６画素単位のマクロブロックとして分割し、さらにフレーム内予測の処理単位として、８ｘ８画素ブロックや４ｘ４画素ブロックの場合について説明しているが、処理対象ブロックは均一なブロック形状にする必要は無く、１６ｘ８画素、８ｘ１６画素、８ｘ４画素、４ｘ８画素、などのブロックサイズに関しても適用可能である。例えば、８ｘ４画素ブロックや２x２画素ブロックに対しても、同様の枠組みで実現が可能である。更に、１つのマクロブロック中で、均一なブロックサイズを取る必要はなく、夫々異なるブロックの大きさを選択しても良い。例えば、マクロブロック内で８ｘ８画素ブロックと４ｘ４画素ブロックを混在させても良い。この場合、分割数が増えると、分割情報を符号化するための符号量が増加するが、より精度の高い予測が可能であり、予測誤差を削減することが可能である。よって、変換係数の符号量と局所復号画像とのバランスを考慮して、ブロックサイズを選択すればよい。即ち、符号化モード毎に対応する予測画素ブロックのサイズを特定の画素ブロックサイズ内で切り替えてもよい。 In the embodiment, the transform quantization block size is divided as a macro block of 16 × 16 pixel units, and the case of an 8 × 8 pixel block or a 4 × 4 pixel block as an intra-frame prediction processing unit has been described. The target block does not need to have a uniform block shape, and can be applied to block sizes of 16 × 8 pixels, 8 × 16 pixels, 8 × 4 pixels, 4 × 8 pixels, and the like. For example, an 8 × 4 pixel block and a 2 × 2 pixel block can be realized with the same framework. Furthermore, it is not necessary to take a uniform block size in one macroblock, and different block sizes may be selected. For example, an 8 × 8 pixel block and a 4 × 4 pixel block may be mixed in a macro block. In this case, as the number of divisions increases, the amount of codes for encoding the division information increases, but more accurate prediction is possible and prediction errors can be reduced. Therefore, the block size may be selected in consideration of the balance between the code amount of the transform coefficient and the locally decoded image. That is, the size of the prediction pixel block corresponding to each coding mode may be switched within a specific pixel block size.

また、実施の形態においては、変換量子化部１０４、逆量子化逆変換部１０６及び内部変換量子化部３０２、内部逆量子化逆変換部３０４が設けられている。しかし、必ずしも全ての予測誤差信号に対して変換量子化及び逆量子化逆変換を行う必要は無く、予測誤差信号をそのまま符号化処理部１０５、仮符号化処理部３０３で符号化してもよいし、量子化及び逆量子化処理を省略しても良い。同様に、変換処理と逆変換処理を行わなくても良い。 In the embodiment, a transform quantization unit 104, an inverse quantization inverse transform unit 106, an internal transform quantization unit 302, and an internal inverse quantization inverse transform unit 304 are provided. However, it is not always necessary to perform transform quantization and inverse quantization inverse transform on all prediction error signals, and the prediction error signal may be directly encoded by the encoding processing unit 105 and the provisional encoding processing unit 303. The quantization and inverse quantization processes may be omitted. Similarly, the conversion process and the inverse conversion process may not be performed.

The above is the configuration of the internal prediction and mode determination unit 102 according to the present embodiment. The moving picture encoding method according to the present invention is described below with reference to FIG. 6, taking as an example the case in which it is carried out by the moving picture encoding apparatus.

符号化部１００に1フレーム分の入力画像信号１１０が入力される（ステップＳ１）と画像分割部１０１は、入力画像信号１１０を複数のマクロブロックに分割し、更に複数の小画素ブロックへと分割する（ステップＳ２）。入力画像信号１１０がブロック単位で内部予測及びモード判定部１０２へと入力される。このとき、モード判定部１０３では、モードを示すインデックスやコストの初期化を行う（ステップＳ３）。 When the input image signal 110 for one frame is input to the encoding unit 100 (step S1), the image dividing unit 101 divides the input image signal 110 into a plurality of macroblocks and further divides it into a plurality of small pixel blocks. (Step S2). The input image signal 110 is input to the internal prediction and mode determination unit 102 in units of blocks. At this time, the mode determination unit 103 initializes the index indicating the mode and the cost (step S3).

入力画像信号１１０を用いて、内部予測及びモード判定部１０２にて、符号化対象ブロックで選択可能な１つの予測モードにおける予測画像信号を生成する（ステップＳ４）。このとき使用された予測モードによって頻度テーブルが仮更新される（ステップＳ５）。 Using the input image signal 110, the internal prediction and mode determination unit 102 generates a prediction image signal in one prediction mode that can be selected in the encoding target block (step S4). The frequency table is temporarily updated according to the prediction mode used at this time (step S5).

予測画像信号１１１と入力画像信号１１０の差分を取り、予測誤差信号１１２を生成する。予測モードの符号量ＯＨと予測誤差信号１１２の絶対値和ＳＡＤからコストｃｏｓｔを計算する。又は符号化歪Ｄと符号量Ｒから式（２）を用いて符号化ｃｏｓｔを計算する（ステップＳ６）。 A difference between the predicted image signal 111 and the input image signal 110 is taken to generate a prediction error signal 112. The cost cost is calculated from the code amount OH in the prediction mode and the absolute value sum SAD of the prediction error signal 112. Alternatively, the encoding cost is calculated using the equation (2) from the encoding distortion D and the code amount R (step S6).

The mode determination unit 103 determines whether the calculated cost is smaller than the minimum cost min_cost (step S7). If it is smaller (YES), the minimum cost is updated with that cost, and the encoding mode at that time is held in the frequency information table as the best_mode index (step S8). At the same time, the predicted image signal 111 is temporarily held in the internal memory (step S9). If the calculated cost is greater than the minimum cost min_cost, the index indicating the mode number is incremented, and it is determined whether the incremented index is the last mode (step S10).

ｉｎｄｅｘがモードの最後の番号であるＭＡＸよりも大きい場合（ＹＥＳ）、決定されたベストモードによって頻度情報テーブルが更新される（ステップＳ１１）。ｂｅｓｔ＿ｍｏｄｅの符号化モード情報及び予測誤差信号１１２が変換量子化部１０４へと渡され、変換及び量子化が行われる（ステップＳ１２）。量子化された変換係数１１３が符号化処理部１０５へと入力され、予測情報１０９が符号化処理部１０５でエントロピー符号化される（ステップＳ１３）。 If the index is larger than MAX, which is the last number of the mode (YES), the frequency information table is updated with the determined best mode (step S11). The coding mode information of best_mode and the prediction error signal 112 are passed to the transform quantization unit 104, and transformation and quantization are performed (step S12). The quantized transform coefficient 113 is input to the encoding processing unit 105, and the prediction information 109 is entropy encoded by the encoding processing unit 105 (step S13).

一方、ｉｎｄｅｘがモードの最後の番号であるＭＡＸよりも小さい場合（ＮＯ）、頻度情報テーブルはリセットされ（ステップＳ１４）され、処理はステップＳ４に戻り、次のｉｎｄｅｘで示される符号化モードの予測画像信号１１１が生成される。 On the other hand, if the index is smaller than MAX, which is the last number of the mode (NO), the frequency information table is reset (step S14), the process returns to step S4, and the prediction of the encoding mode indicated by the next index is performed. An image signal 111 is generated.
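The mode search in steps S4 to S10 can be sketched as a simple minimum-cost loop. The cost form SAD + λ × OH is an assumption about Equation (1), which is not reproduced in this section; `lam`, `mode_bits` and the function names are hypothetical, and blocks are flat lists of pixel values.

```python
def sad(block, prediction):
    """Sum of absolute differences between the input and the predicted block."""
    return sum(abs(x - p) for x, p in zip(block, prediction))


def choose_best_mode(block, predictions, mode_bits, lam):
    """predictions: mode -> predicted block; mode_bits: mode -> code amount OH."""
    best_mode, min_cost = None, float("inf")      # step S3: initialization
    for mode, prediction in predictions.items():  # steps S4 and S10: next mode
        cost = sad(block, prediction) + lam * mode_bits[mode]  # step S6 (assumed form)
        if cost < min_cost:                       # step S7
            min_cost, best_mode = cost, mode      # step S8
    return best_mode, min_cost


block = [10, 12, 14, 16]
predictions = {0: [10, 10, 10, 10], 2: [13, 13, 13, 13]}
result = choose_best_mode(block, predictions, mode_bits={0: 1, 2: 3}, lam=1)
```

Here mode 0 costs 12 + 1 = 13 and mode 2 costs 8 + 3 = 11, so mode 2 is selected even though it needs more bits to signal.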

ｂｅｓｔ＿ｍｏｄｅでの符号化が行われると、量子化された変換係数１１３が逆量子化逆変換部１０６へと入力され、逆量子化及び逆変換が行われる（ステップＳ１５）。復号された予測誤差信号１１２とモード判定部１０３から供給されるｂｅｓｔ＿ｍｏｄｅの予測画像信号１１１が加算され、復号画像信号１１４として、参照画像メモリ１０７へと保存される（ステップＳ１６）。 When encoding with the best_mode is performed, the quantized transform coefficient 113 is input to the inverse quantization inverse transform unit 106, and inverse quantization and inverse transform are performed (step S15). The decoded prediction error signal 112 and the predicted image signal 111 of the best_mode supplied from the mode determination unit 103 are added and stored in the reference image memory 107 as a decoded image signal 114 (step S16).

ここで、１フレームの符号化が終了しているかどうかの判定が行なわれる（ステップＳ１７）。処理が完了している場合（ＹＥＳ）、処理はステップＳ１に戻り、次のフレームの入力画像信号が入力され、符号化処理が行われる。一方、１フレームの符号化処理が完了していない場合（ＮＯ）、処理はステップＳ２に戻り、次の小画素ブロックの入力信号が入力され、符号化処理が継続される。 Here, it is determined whether or not one frame has been encoded (step S17). When the process is completed (YES), the process returns to step S1, the input image signal of the next frame is input, and the encoding process is performed. On the other hand, if the encoding process for one frame is not completed (NO), the process returns to step S2, the input signal of the next small pixel block is input, and the encoding process is continued.

本実施の形態において、フレーム単位のマルチパスで符号化する場合、頻度情報テーブルのインデックス長を変えて、毎回符号化する必要はなく、符号量の増加のみを別途テーブル化して累積しておき、符号化コストを計算し、最適なインデックス長を決定することが可能である。よって再符号化を利用せずとも予測誤差が変わらないため、処理を大幅に削減することが可能である。 In this embodiment, when encoding by multi-pass in units of frames, it is not necessary to change the index length of the frequency information table every time, and only the increase in the code amount is tabulated separately and accumulated, It is possible to calculate the coding cost and determine the optimal index length. Therefore, since the prediction error does not change without using re-encoding, the processing can be greatly reduced.

The above is the outline of the moving picture encoding apparatus according to the present embodiment. Next, the method of encoding the syntax used in this prediction scheme is described.

FIG. 10 shows an outline of the syntax structure used in the present embodiment. The syntax consists mainly of three parts. The high-level syntax (401) carries syntax information for the layers at or above the slice. The slice-level syntax (402) specifies the information required for each slice, and the macroblock-level syntax (403) specifies the quantization parameter change values, mode information, and other data required for each macroblock.

Each part consists of more detailed syntax. The high-level syntax (401) consists of sequence-level and picture-level syntax such as the sequence parameter set syntax (404) and the picture parameter set syntax (405). The slice-level syntax (402) consists of the slice header syntax (406), the slice data syntax (407), and so on. The macroblock-level syntax (403) consists of the macroblock layer syntax (408), the macroblock prediction syntax (409), and so on.

本実施の形態で、必要となるシンタクス情報はシーケンスパラメータセットシンタクス（４０４）、ピクチャパラメータセットシンタクス（４０５）、スライスヘッダーシンタクス（４０６）、マクロブロックレイヤーシンタクス（４０８）であり、夫々のシンタクスを以下で説明する。 In the present embodiment, the required syntax information is a sequence parameter set syntax (404), a picture parameter set syntax (405), a slice header syntax (406), and a macroblock layer syntax (408). I will explain it.

図１１のシーケンスパラメータセットシンタクス内に示されるｓｅｑ_ｆｌｅｘｂｌｅ＿ｍｏｄｅ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇは、フレキシブルモード予測の利用可否をシーケンス毎に変更するかどうかを示すフラグであり、当該フラグがＴＲＵＥであるときは、フレキシブルモード予測を利用するかどうかを、シーケンス単位で切り替えることが可能である。一方、フラグがＦＡＬＳＥであるときは、シーケンス内ではフレキシブルモード予測を用いることが出来ない。 The seq_flexible_mode_prediction_flag shown in the sequence parameter set syntax of FIG. 11 is a flag indicating whether or not the availability of the flexible mode prediction is changed for each sequence, and when the flag is TRUE, whether to use the flexible mode prediction. It is possible to switch whether or not in sequence units. On the other hand, when the flag is FALSE, flexible mode prediction cannot be used in the sequence.

pic_flexible_mode_prediction_flag, shown in the picture parameter set syntax of FIG. 12, is a flag indicating whether the availability of flexible mode prediction is changed for each picture. When this flag is TRUE, whether to use flexible mode prediction can be switched in picture units. When the flag is FALSE, flexible mode prediction cannot be used in the picture. Whenever seq_flexible_mode_prediction_flag is TRUE, pic_flexible_mode_prediction_flag is transmitted. At this time, if pic_flexible_mode_prediction_flag is FALSE, table_index_length is transmitted. This syntax element represents the index length of the frequency information table available for flexible mode prediction.

図１３のスライスヘッダーシンタクス内に示されるｓｌｉｃｅ_ｆｌｅｘｂｌｅ＿ｍｏｄｅ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇは、フレキシブルモード予測の利用可否をスライス毎に変更するかどうかを示すフラグであり、このフラグがＴＲＵＥであるときは、フレキシブルモード予測を利用するかどうかを、スライス単位で切り替えることが可能である。一方、フラグがＦＡＬＳＥであるときは、スライス内ではフレキシブルモード予測を用いることが出来ない。ｐｉｃ_ｆｌｅｘｂｌｅ＿ｍｏｄｅ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇがＴＲＵＥであるときは、必ずｓｌｉｃｅ_ｆｌｅｘｂｌｅ＿ｍｏｄｅ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇが送信される。この時ｓｌｉｃｅ_ｆｌｅｘｂｌｅ＿ｍｏｄｅ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇがＦＡＬＳＥの場合は、ｔａｂｌｅ＿ｉｎｄｅｘ＿ｌｅｎｇｔｈが送信される。本シンタクスは、フレキシブルモード予測で利用可能な頻度情報テーブルのインデックス長を表している。 The slice_flexible_mode_prediction_flag shown in the slice header syntax of FIG. 13 is a flag indicating whether or not the availability of the flexible mode prediction is changed for each slice. When this flag is TRUE, whether or not the flexible mode prediction is used. Can be switched in units of slices. On the other hand, when the flag is FALSE, the flexible mode prediction cannot be used in the slice. When pic_flexible_mode_prediction_flag is TRUE, slice_flexible_mode_prediction_flag is always transmitted. At this time, if slice_flexible_mode_prediction_flag is FALSE, table_index_length is transmitted. This syntax represents the index length of the frequency information table that can be used in flexible mode prediction.

mb_flexible_mode_prediction_flag, shown in the macroblock layer syntax of FIG. 14, is a flag indicating whether flexible mode prediction is used in the encoding target macroblock. When this flag is TRUE, flexible mode prediction is used; when it is FALSE, flexible mode prediction is not used. When the flag is TRUE, mode_index is always transmitted. mode_index is the prediction mode index of the encoding target macroblock and indicates which entry of the frequency information table has been selected. BlkSize in the syntax represents the number of encoding target pixel blocks: 16 for 4×4 pixel blocks, 4 for 8×8 pixel blocks, and 1 for a 16×16 pixel block. The code amount of mode_index changes in the binarization process according to table_index_length, which is described in the upper-level syntax. For example, when table_index_length is 4, table indexes 0 to 15 can be used according to Equation (3). mode_index represents these 16 table indexes and becomes a 4-bit syntax element when a fixed-length code is assigned. The binarization is preferably designed so that the code amount is minimized according to the frequency information of the corresponding syntax element. An example of binarization is described with reference to FIG. 17.
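The relationship between table_index_length, the number of usable table indexes, and the fixed-length code for mode_index can be sketched as follows. The function names are hypothetical; only the arithmetic (L = 1 << N, N bits per index, and the block counts per 16×16 macroblock) comes from the text.

```python
def blocks_per_macroblock(block_size):
    """Number of square blocks of the given size in a 16x16 macroblock (BlkSize)."""
    return (16 // block_size) ** 2


def binarize_mode_index(mode_index, table_index_length):
    """Fixed-length binarization of mode_index using table_index_length bits."""
    if table_index_length < 1:
        raise ValueError("table_index_length must be at least 1")
    table_size = 1 << table_index_length          # Equation (3): L = 1 << N
    if not 0 <= mode_index < table_size:
        raise ValueError("mode_index is outside the table")
    return format(mode_index, "0{}b".format(table_index_length))


bits = binarize_mode_index(5, 4)   # table_index_length = 4, indexes 0..15
```

With a fixed-length code, a macroblock of sixteen 4×4 blocks spends 16 × 4 = 64 bits on mode_index when table_index_length is 4, and 16 × 2 = 32 bits when it is 2.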

FIG. 17 shows an example of binarization when the index length is L = 16 (N = 4). When L = 16, the table index can take 16 values, 0 to 15 (the numbers in the first column of the table). If the occurrence probabilities of these table index numbers are not known, the simplest approach is to binarize with a fixed-length code (second column of the table). The bit strings in the table represent mode_index. If the occurrence probabilities of the table index numbers are known in advance, binarizing with a Huffman code matched to those probabilities reduces the number of bits of mode_index. The third column of the table in FIG. 17 shows an example of a Huffman code. Because the frequency information table is kept ordered by occurrence frequency, assigning the short codes shown in the table to the top table indexes further reduces the code amount of the table index. The fourth column of the table shows the occurrence probabilities used to generate the Huffman code.
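How a Huffman binarization could be derived from known occurrence probabilities can be sketched with the standard library. The probabilities below are illustrative only and are not the values in FIG. 17; the function name is hypothetical.

```python
import heapq
from itertools import count


def build_huffman_codes(probabilities):
    """Return one bit string per table index; probabilities[i] belongs to index i."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    codes = [""] * len(probabilities)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # two least probable groups
        p1, _, group1 = heapq.heappop(heap)
        for i in group0:
            codes[i] = "0" + codes[i]
        for i in group1:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p0 + p1, next(tiebreak), group0 + group1))
    return codes


codes = build_huffman_codes([0.5, 0.25, 0.125, 0.125])
```

For these four probabilities the code lengths are 1, 2, 3 and 3 bits, an average of 1.75 bits per index compared with 2 bits for the fixed-length code. Since the frequency information table keeps frequent modes at low indexes, the short codes land on the indexes that are selected most often.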

The table index number representing the prediction mode selected by the mode determination unit 103 and the prediction mode setting unit 203 is binarized using a conversion table such as the one in FIG. 17, and the resulting bit string is entropy encoded (for example, by arithmetic coding) by the encoding processing unit 105 and the provisional encoding processing unit 303. Besides the codes described above, the index may be binarized using another method such as an entropy code, a Shannon code, or an arithmetic code.

As another form of this embodiment, the macroblock_data syntax shown in FIG. 14 may be replaced with the syntax shown in FIG. 15. The difference from FIG. 14 is that when the value of mode_index[iBlk] equals a predetermined ESCAPE_CODE, ecs_mode_index[iBlk] is additionally sent.

For example, if the code string 1111 in FIG. 17 (or 11111 in the Huffman code) is ESCAPE_CODE, it indicates that the applicable prediction mode is not contained in the portion of the frequency information table covered by the current table length table_index_length. In that case ecs_mode_index[iBlk] gives an index number beyond the table length indicated by table_index_length. FIG. 18 shows an example in which the length of the frequency information table (the total number of prediction modes) is M = 15 and the index length is L = 8. When the selected index of the frequency information table falls outside the index length L = 8, ESCAPE_CODE is set in mode_index[iBlk], and the ecs_mode_index[iBlk] corresponding to the overflowing index is also sent. For example, when the selected index in the figure is 10, ESCAPE_CODE = 111 is set in mode_index[iBlk] and 011 is set in ecs_mode_index[iBlk]. Because every prediction mode can be signaled to the decoder in this way, extensions such as adding or removing prediction modes become easy, and no dedicated syntax needs to be designed for each prediction mode.
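
The escape mechanism can be sketched as follows. This is only an illustration under stated assumptions: fixed-length binarization is used, ESCAPE_CODE is assumed to occupy the last codeword of the L-entry range (111 for L = 8), and ecs_mode_index is assumed to carry the offset from that position. The function names are hypothetical; the assumptions were chosen so that the sketch reproduces the example in the text (index 10 gives 111 followed by 011).

```python
def encode_index(index, index_len, table_len):
    """Return (mode_index bits, ecs_mode_index bits or None)."""
    n_bits = (index_len - 1).bit_length()   # 3 bits when L = 8
    escape = index_len - 1                  # assumed: last codeword is ESCAPE_CODE
    if index < escape:
        return format(index, f"0{n_bits}b"), None
    # Indexes from `escape` up to table_len - 1 go through the escape path
    n_escaped = table_len - escape
    ecs_bits = max(1, (n_escaped - 1).bit_length())
    return format(escape, f"0{n_bits}b"), format(index - escape, f"0{ecs_bits}b")

def decode_index(mode_bits, ecs_bits, index_len):
    """Inverse of encode_index."""
    escape = index_len - 1
    value = int(mode_bits, 2)
    if value != escape:
        return value
    return escape + int(ecs_bits, 2)

print(encode_index(10, 8, 15))  # M = 15, L = 8, selected index 10
```

Under these assumptions, indexes 0 to 6 are sent directly and indexes 7 to 14 take the escape path, so all M = 15 entries remain reachable.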

As yet another form of this embodiment, the syntax shown in FIG. 16 may be used. In this case seq_flexble_mode_prediction_flag, pic_flexble_mode_prediction_flag, slice_flexble_mode_prediction_flag, and table_index_length, which are added to the higher-level syntax, are not needed. Because the index length is not transmitted, all entries of the table are always available. Since updating the frequency information table keeps the most frequent information for the current block at the top of the table, preparing a binarization table such as the one in the third column of FIG. 18 makes it possible to send the table index number with a small code amount.

As described above, this embodiment uses the frequency information table of selected prediction modes: for the block to be encoded, only the frequently occurring prediction modes at the top of the table are extracted from the prediction modes the table provides and used to generate predicted image signals, and the index length of the table at the time of extraction is multiplexed into the syntax and transmitted to the decoder. This makes it possible to generate predicted images at reduced hardware computation cost while maintaining higher coding efficiency than conventional predicted image generation methods. In other words, coding suited to the content of each pixel block can be performed.

(Encoding: Second Embodiment)
FIG. 19 is a block diagram showing the prediction control unit 600, the block that differs from the first embodiment in the configuration of the video encoding apparatus according to the second embodiment. Unlike the prediction control unit 501 described in the first embodiment, the prediction control unit 600 has two different prediction units, an L0 prediction unit 604 and an L1 prediction unit 608. Corresponding to the prediction units 604 and 608, it also has L0 and L1 frequency information table extraction units 601 and 605, L0 and L1 frequency information table generation units 602 and 606, and L0 and L1 prediction mode setting units 603 and 607. In addition, an adaptive filter unit 609 is provided to filter the predicted images output from the L0 and L1 prediction units 604 and 608. Components with the same functions as those already described are given the same reference numerals, and their description is omitted.

The internal prediction and mode determination unit 102 contains the prediction control unit 600, which corresponds to the prediction control unit 501 shown in FIG. 2, and the mode control unit 502. Like the prediction control unit 501, the prediction control unit 600 is connected to the mode control unit 502: the predicted image signal 111 output from the prediction control unit 600 is input to the mode control unit 502, and the decoded signal 306 output from the mode control unit 502 is input to the prediction control unit 600. The input image signal 110 is input to the subtractor 506, where the predicted image signal 111 output from the prediction control unit 600 is subtracted from it to generate the prediction error signal 112. So that prediction with block sizes smaller than the macroblock size can also be performed, the mode control unit 502 has, as shown in FIG. 4, an internal mode determination unit 301, an internal transform and quantization unit 302, a provisional encoding processing unit 303, an internal inverse quantization and inverse transform unit 304, and an adder 305.

The L0 prediction unit 604 performs prediction using already-encoded reference images (locally decoded images) that lie in the temporal past. The L1 prediction unit 608, in contrast, performs prediction using already-encoded reference images (locally decoded images) that lie in the temporal future. Each prediction requires prediction information 612 such as the number of the corresponding reference image (hereinafter L0REF and L1REF), the shape of the block to be predicted, and motion vector information.

The L0 frequency information table generation unit 602 tabulates the frequency of the prediction information 612 related to L0 prediction for the pixel blocks encoded so far. When the encoding target pixel block is encoded, the unit updates the L0 frequency information table according to the prediction information 612 (described later) given by the control unit 210, and outputs the updated table to the L0 frequency information table extraction unit 601. The L0 frequency information table extraction unit 601 extracts from the input table L items of L0 prediction information, where L is the table index length set in the control unit 210, and outputs them to the L0 prediction mode setting unit 603. The L0 prediction mode setting unit 603 selects one of the extracted items. The selected L0 prediction information is set in the control unit 210 as L0 table information 613, and the reference image (L0REF) needed for L0 prediction is read from the internal reference image memory 610. The reference image (L0REF) is input to the L0 prediction unit 604, which uses it to perform L0 prediction. The L0 predicted image signal 614 generated by the L0 prediction unit 604 is input to the adaptive filter unit 609.
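
The generate, extract, and select flow described above can be sketched roughly as follows. The update rule of the frequency information table is defined in the first embodiment and is not reproduced here; this sketch assumes a plain count-and-sort rule, and the class name FrequencyTable is hypothetical.

```python
from collections import Counter

class FrequencyTable:
    """Assumed model of a frequency information table (count-and-sort update)."""

    def __init__(self, modes):
        self.counts = Counter({m: 0 for m in modes})
        self.order = list(modes)  # initial order stands in for a default ranking

    def update(self, selected_mode):
        # Called once per pixel block after its mode decision is complete
        self.counts[selected_mode] += 1
        # Stable sort: entries with equal counts keep their previous order
        self.order.sort(key=lambda m: -self.counts[m])

    def extract(self, index_len):
        # Only the top L entries are offered to the prediction mode setting unit
        return self.order[:index_len]

table = FrequencyTable(range(6))
for mode in [4, 4, 2, 4, 2, 5]:
    table.update(mode)
candidates = table.extract(3)
print(candidates)
```

Only the extracted candidates need to be evaluated by the prediction unit, which is where the reduction in computation comes from.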

Likewise, the L1 frequency information table generation unit 606 tabulates the frequency of the prediction information 612 related to L1 prediction for the pixel blocks encoded so far. When the encoding target pixel block is encoded, the unit updates the L1 frequency information table according to the prediction information 612 (described later) given by the control unit 210, and outputs the updated table to the L1 frequency information table extraction unit 605. The L1 frequency information table extraction unit 605 extracts from the input table L items of L1 prediction information, where L is the table index length set in the control unit 210, and outputs them to the L1 prediction mode setting unit 607. The L1 prediction mode setting unit 607 selects one of the extracted items. The selected L1 prediction information is set in the control unit 210 as L1 table information 617, and the reference image (L1REF) needed for L1 prediction is read from the internal reference image memory 610. The reference image (L1REF) is input to the L1 prediction unit 608, which performs L1 prediction. The L1 predicted image signal 615 generated by the L1 prediction unit 608 is input to the adaptive filter unit 609.

The adaptive filter unit 609 filters the two input signals using the following equation.

$$\mathrm{Pred} = (\mathrm{L0Pred} + \mathrm{L1Pred}) \gg 1 \tag{6}$$

Here, Pred is the predicted image signal obtained after filtering, L0Pred is the L0 predicted image signal 614 for the pixel at the same position, and L1Pred is the L1 predicted image signal 615 for the pixel at the same position. A filter other than the averaging filter of equation (6) may also be used. Specifically, a filter that weights the L0 and L1 directions may be used, as in the following equation.

$$\mathrm{Pred} = (W_{L0} \times \mathrm{L0Pred} + W_{L1} \times \mathrm{L1Pred}) \gg \mathrm{BIT\_SHIFT} \tag{7}$$

WL0 and WL1 are the filter weight coefficients for the L0 predicted image signal 614 and the L1 predicted image signal 615, respectively. BIT_SHIFT is a shift coefficient introduced to avoid division. The weight coefficients and the shift coefficient satisfy the following relationship.

$$W_{L0} + W_{L1} = (1 \ll \mathrm{BIT\_SHIFT}) \tag{8}$$

A filter that uses an offset, as follows, may also be used.

$$\mathrm{Pred} = \bigl((W_{L0} \times \mathrm{L0Pred} + W_{L1} \times \mathrm{L1Pred}) \gg \mathrm{BIT\_SHIFT}\bigr) + \mathrm{OFFSET} \tag{9}$$

Changing the OFFSET value makes it possible to predict temporally continuous changes in luminance effectively.
The predicted image signal 111 generated by the adaptive filter unit 609 is output to the mode control unit 502.
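
Equations (6) to (9) can be written out as a short sketch operating on lists of pixel values. The function names are illustrative only, the offset is assumed to be added after the shift, and no clipping to the pixel range is applied because the text does not specify one.

```python
def average_filter(l0_pred, l1_pred):
    """Equation (6): integer average of the two predictions."""
    return [(a + b) >> 1 for a, b in zip(l0_pred, l1_pred)]

def weighted_filter(l0_pred, l1_pred, w_l0, w_l1, bit_shift, offset=0):
    """Equations (7) and (9): weighted sum, right shift, optional offset."""
    # Equation (8): the weights must sum to 1 << BIT_SHIFT
    assert w_l0 + w_l1 == (1 << bit_shift)
    return [((w_l0 * a + w_l1 * b) >> bit_shift) + offset
            for a, b in zip(l0_pred, l1_pred)]

l0 = [100, 50]
l1 = [102, 51]
print(average_filter(l0, l1))
print(weighted_filter([100], [200], 3, 1, 2))            # 3:1 weighting
print(weighted_filter([100], [200], 3, 1, 2, offset=2))  # with OFFSET
```

With WL0 = WL1 = 1 and BIT_SHIFT = 1, the weighted filter reduces to the averaging filter of equation (6).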

Next, the prediction information 612 of the control unit 210 is described. Inter coding (inter-frame coding) requires prediction mode information indicating which prediction method is used, a reference image index indicating which reference image is used, motion vector information indicating which pixel block in the reference image the prediction target pixel block refers to, and pixel block shape information indicating the shape of the prediction target pixel block. In this embodiment, frequency information tables are generated for the prediction mode information and the reference image index. FIG. 20 shows the correspondence between predicted pixel blocks and modes for the L0 and L1 prediction modes. In FIG. 20, 16x16 pixel prediction, 8x8 pixel prediction, and 4x4 pixel prediction are shown for L0 prediction modes 0 to 3 and L1 prediction modes 0 to 3, respectively. The frequency information table is now described concretely. FIG. 7 shows how the frequency information table is updated. The numbers in FIG. 7 are prediction mode numbers or the corresponding reference image index numbers. Each time the mode decision for one pixel block is completed, the frequency information table is updated according to the number of the selected prediction mode or reference image index. Since the update of the frequency information table has already been described in the first embodiment, its description is omitted here.

Next, the L0 and L1 prediction units 604 and 608 are described with reference to FIG. 21. The L0 prediction unit 604 performs prediction using already-encoded reference images (locally decoded images) in the temporal past. Specifically, it creates an interpolated image with 1/4-pixel accuracy from the reference image L0REF and performs block matching against the prediction target pixel block. The numbers written in the region labeled as L0 prediction reference images in the figure are L0REF numbers. The displacement between the matched pixel block and the prediction target block is measured as the motion vector. The prediction target pixel block is then filled with the matched pixel block of the reference image, and the predicted image is generated in this way. Similarly, the L1 prediction unit 608 performs prediction using already-encoded reference images (locally decoded images) in the temporal future. Specifically, it creates an interpolated image with 1/4-pixel accuracy from the reference image L1REF and performs block matching against the encoding target pixel block. The numbers written in the region of the L1 prediction reference images in the figure are L1REF numbers. Here too, the displacement between the matched pixel block and the prediction target block is measured as the motion vector, and the predicted pixel block is filled with the matched pixel block of the reference image. The interpolated image may instead be generated with 1/2-pixel or 1/8-pixel accuracy.
The above is the configuration of the prediction control unit 600 according to this embodiment.
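
The block matching step can be illustrated with a simplified sketch. It assumes an exhaustive integer-pixel search with a sum-of-absolute-differences cost; the 1/4-pixel interpolation described in the text is omitted, and the function names and the cost measure are assumptions rather than details taken from the source.

```python
def sad(block, ref, top, left):
    """Sum of absolute differences between block and the ref region at (top, left)."""
    return sum(abs(block[y][x] - ref[top + y][left + x])
               for y in range(len(block)) for x in range(len(block[0])))

def block_match(block, ref, pos, search_range):
    """Return (motion vector, predicted block) for the best integer-pixel match."""
    bh, bw = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = pos[0] + dy, pos[1] + dx
            # Skip candidates that fall outside the reference image
            if top < 0 or left < 0 or top + bh > len(ref) or left + bw > len(ref[0]):
                continue
            cost = sad(block, ref, top, left)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    dy, dx = best[1]
    top, left = pos[0] + dy, pos[1] + dx
    # The prediction is the matched block copied from the reference image
    pred = [row[left:left + bw] for row in ref[top:top + bh]]
    return (dy, dx), pred

ref = [[0] * 8 for _ in range(8)]
ref[3][4], ref[3][5], ref[4][4], ref[4][5] = 10, 20, 30, 40
block = [[10, 20], [30, 40]]
mv, pred = block_match(block, ref, pos=(2, 2), search_range=2)
print(mv, pred)
```

The returned displacement plays the role of the motion vector, and the copied block is what the text calls filling the prediction target pixel block with the matched reference block.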

In FIG. 19 of this embodiment, the block diagram is drawn so that the L0 prediction unit 604 and the L1 prediction unit 608 always generate predicted images when the index length information 611 is input. In an actual coded frame structure, however, a future reference image is sometimes unavailable. In that case L1REF prohibition information for the reference image is added to the prediction information 612 given by the control unit 210 and input to the L1 frequency information table generation unit 606, which restricts the available L1 prediction modes. The L1 prediction mode setting unit 607 then passes the L1REF prohibition information to the L1 prediction unit 608, and the control unit 210 passes it to the adaptive filter unit 609. When the L1REF prohibition information is input, the adaptive filter unit 609 switches the predicted image signal according to the following equation (10).

$$\mathrm{L1Pred} = \mathrm{L0Pred} \tag{10}$$
Alternatively, L0Pred is output directly as the predicted image signal.
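
A small sketch, assuming the averaging filter of equation (6), shows why the two alternatives give the same result: substituting L0Pred for L1Pred and averaging returns L0Pred unchanged. The function name and the boolean parameter are hypothetical.

```python
def predict(l0_pred, l1_pred, l1_ref_prohibited):
    """Average L0 and L1 predictions, falling back to L0 only when L1 is prohibited."""
    if l1_ref_prohibited:
        l1_pred = l0_pred  # equation (10): L1Pred = L0Pred
    return [(a + b) >> 1 for a, b in zip(l0_pred, l1_pred)]

print(predict([10, 21], None, True))       # L1 unavailable: output equals L0Pred
print(predict([10, 20], [20, 30], False))  # both directions available
```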

Even when both the L0REF required by the L0 prediction unit 604 and the L1REF required by the L1 prediction unit 608 are available, a one-sided predicted image signal can be output: the L1REF prohibition information described above, or L0REF prohibition information, is added to the prediction information 612 and input to the frequency information table generation units 602 and 606. In this way the L0 predicted image signal 614, the L1 predicted image signal 615, and the filtered predicted image signal can each be output separately.

In this embodiment, the prediction modes of the left and upper pixel blocks are referenced when the prediction mode frequency information table is updated, but the table may be updated from a wider neighborhood of the prediction target pixel block. Specifically, the prediction modes of co-located pixel blocks in temporally preceding or following pictures may be used, or the frequency information table may be updated using the prediction modes selected in any available upper-right, upper-left, two-above, or two-left pixel blocks.

This embodiment has been described in detail for inter prediction (inter-frame prediction), but intra prediction (intra-frame prediction) can be implemented with the same encoder structure. More specifically, one of the directional prediction modes defined in H.264 and shown in FIGS. 8 and 9 (for example, vertical prediction in 4x4 pixel prediction) is treated as L0 prediction, and another directional prediction mode (for example, vertical-left prediction in 4x4 pixel prediction) is treated as L1 prediction. The L0 predicted image signal 614 and the L1 predicted image signal 615 generated in the prediction control unit 600 are then input to the adaptive filter unit 609, which generates a new predicted image signal by filtering the two. An L0 frequency information table is generated for the L0 prediction modes, and an L1 frequency information table is generated for the L1 prediction modes. In this way a new predicted image signal can be generated from two predicted image signals.

次に本予測方式で用いるシンタクスの符号化方法について説明する。 Next, a syntax encoding method used in this prediction method will be described.

FIG. 10 outlines the syntax structure used in this embodiment. The syntax consists mainly of three parts; the high-level syntax (401) carries syntax information for layers at the slice level and above. The sequence parameter set syntax, picture parameter set syntax, and slice header syntax have already been described in detail above with reference to FIGS. 11 to 13, so their description is omitted.

The mb_flexble_mode_prediction_flag shown in the macroblock layer syntax of FIG. 22 is a flag indicating whether flexible mode prediction is used in the prediction target macroblock. When this flag is TRUE, flexible mode prediction is used; when it is FALSE, flexible mode prediction is not used. When the flag is TRUE, mode_index_l0 is always transmitted. This is the L0 prediction mode index of the macroblock and indicates which entry of the L0 frequency information table has been selected. mode_index_l1 is the L1 prediction mode index and indicates which entry of the L1 frequency information table has been selected.

マクロブロックレイヤーシンタクス内のＢｌｋＳｉｚｅは、予測対象画素ブロックの数を表しており、４ｘ４画素ブロックでは１６、８ｘ８画素ブロックでは４、１６ｘ１６画素予測では１が対応する。また、Ｌ１ＰｒｅｄＡｖａｉｌａｂｌｅＦｌａｇは予測対象画素ブロックでＬ１予測が選択できるかどうかを示すフラグであり、このフラグがＴＲＵＥであるときは、ｍｏｄｅ＿ｉｎｄｅｘ＿ｌ１が送信される。一方、ＦＡＬＳＥであるときは、ｍｏｄｅ＿ｉｎｄｅｘ＿ｌ１は送信されない。 BlkSize in the macroblock layer syntax represents the number of prediction target pixel blocks, which corresponds to 16 for 4 × 4 pixel blocks, 4 for 8 × 8 pixel blocks, and 1 for 16 × 16 pixel prediction. L1PredAvailableFlag is a flag indicating whether or not L1 prediction can be selected in the prediction target pixel block. When this flag is TRUE, mode_index_l1 is transmitted. On the other hand, when it is FALSE, mode_index_l1 is not transmitted.

The code amounts of mode_index_l0 and mode_index_l1 change during binarization according to the table_index_length described in the higher-level syntax. For example, when table_index_length is 4 (L = 16), table indexes 0 to 15 are available according to Equation 3. mode_index_l0 and mode_index_l1 represent these 16 table indexes, and with a fixed-length code each becomes a 4-bit syntax element. The binarization should preferably be designed to minimize the code amount according to the frequency information of the corresponding syntax element. An example of binarization has already been described with reference to FIG. 17, so its description is omitted.
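
How the flags and table_index_length determine what is sent for one macroblock can be sketched as follows. The exact ordering of syntax elements in FIG. 22 is not reproduced in the text, so the per-block interleaving of mode_index_l0 and mode_index_l1, the fixed-length binarization, and the function name write_macroblock are all assumptions made for illustration.

```python
def write_macroblock(flex_flag, blk_size, l0_indexes, l1_indexes,
                     l1_available, table_index_length):
    """Return the bit string for one macroblock under the assumed layout."""
    bits = ["1" if flex_flag else "0"]  # mb_flexble_mode_prediction_flag
    if not flex_flag:
        return "".join(bits)
    width = table_index_length          # N bits address L = 1 << N table entries
    for i in range(blk_size):
        bits.append(format(l0_indexes[i], f"0{width}b"))      # mode_index_l0
        if l1_available:                                       # L1PredAvailableFlag
            bits.append(format(l1_indexes[i], f"0{width}b"))  # mode_index_l1
    return "".join(bits)

# One 16x16 block (BlkSize = 1) with both directions, table_index_length = 4
print(write_macroblock(True, 1, [5], [15], True, 4))
# Four 8x8 blocks (BlkSize = 4), L1 not available
print(len(write_macroblock(True, 4, [0, 1, 2, 3], [], False, 4)))
```

With table_index_length = 4, each index costs 4 bits in this fixed-length layout; a variable-length binarization such as the Huffman example of FIG. 17 would shorten the frequently chosen indexes.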

As another example of this embodiment, the macroblock_data syntax shown in FIG. 22 may be replaced with the syntax shown in FIG. 23. The difference from FIG. 22 is that when the value of mode_index[iBlk] equals a predetermined ESCAPE_CODE, ecs_mode_index_l0[iBlk] and ecs_mode_index_l1[iBlk] are additionally sent.

For example, if the code string 1111 in FIG. 17 (or 11111 in the Huffman code) is ESCAPE_CODE, it indicates that the prediction mode applicable to the prediction target pixel block is not contained in the portion of the frequency information table covered by the current table length table_index_length. ecs_mode_index_l0[iBlk] and ecs_mode_index_l1[iBlk] give index numbers beyond the table length indicated by table_index_length. FIG. 18 shows an example in which the length of the frequency information table (the total number of prediction modes) is M = 15 and the index length is L = 8. When the selected index of the frequency information table falls outside the index length L = 8, ESCAPE_CODE = 111 is set in mode_index_l0[iBlk] or mode_index_l1[iBlk], and the ecs_mode_index_l0[iBlk] or ecs_mode_index_l1[iBlk] corresponding to the overflowing index is also sent. Because every L0 and L1 prediction mode can be signaled to the decoder in this way, extensions such as adding or removing prediction modes become easy, and no dedicated syntax needs to be designed for each prediction mode.

As yet another example of this embodiment, the syntax shown in FIG. 24 may be used. In this case seq_flexble_mode_prediction_flag, pic_flexble_mode_prediction_flag, slice_flexble_mode_prediction_flag, and table_index_length, which are added to the higher-level syntax, are not needed. Because the index length is not transmitted, all entries of the table are always available. Since updating the frequency information table keeps the most frequent information for the prediction target block at the top of the table, preparing a binarization table such as the one in the third column of FIG. 18 described above makes it possible to send the table index numbers corresponding to mode_index_l0 and mode_index_l1 with a small code amount.

このように、各モードすべてについて負担の大きい符号化処理を行う必要がなく、選択されたモードでの符号化のみ行うようにすればよいので、演算負担の増加も抑制することができる。すなわち、本実施の形態では、高速かつ好適なモード選択と、高速で圧縮効率の高い動画像符号化を実現することが可能となる。 As described above, it is not necessary to perform a heavy encoding process for all the modes, and it is only necessary to perform the encoding in the selected mode, so that an increase in calculation load can be suppressed. That is, in this embodiment, it is possible to realize high-speed and suitable mode selection and high-speed and high-efficiency video coding.

Note that, when encoding in the selected mode as described above, the decoded image signal needs to be generated only for the selected mode; it does not necessarily have to be generated inside the loop for prediction mode determination.

(2) Configuration of the moving picture decoding apparatus
(Decoding: First Embodiment)
FIG. 25 shows the configuration of the decoding unit 400 of the video decoding apparatus according to this embodiment. Encoded data sent from the encoding unit 100 through a transmission system or storage system is first stored in the input buffer 401, and the multiplexed encoded data is demultiplexed by the demultiplexing unit 402. The separated encoded data is input to the code string decoding unit 403, where it is parsed frame by frame on the basis of the syntax. That is, following the syntax structure shown in FIG. 10, the code string decoding unit 403 sequentially decodes the code string of each syntax element of the encoded data for the high-level syntax, the slice-level syntax, and the macroblock-level syntax, and restores the quantized transform coefficients, quantization matrix, quantization parameters, prediction mode information, information table index length, and so on. The prediction mode information includes the index number described later.

Of the data decoded by the code string decoding unit 403, the decoded transform coefficients are input to the inverse quantization inverse transform unit 404, which inversely quantizes the input transform coefficients 415. The quantization parameters required here are set by the code string decoding unit 403 in the decoding control unit 409 and are read at the time of inverse quantization. The inversely quantized transform coefficients are then inversely transformed (for example, by an inverse discrete cosine transform) and output as the error signal 413. Although an inverse orthogonal transform has been described here, when the encoder uses a wavelet transform, independent component analysis, or the like, the inverse quantization inverse transform unit 404 may execute the corresponding inverse wavelet transform, inverse independent component analysis, or the like.

The error signal 413 is input to the adder 405 and added to the predicted image signal 411 output from the prediction unit 407 described later. The sum of the error signal 413 and the predicted image signal 411 is the decoded signal 414, which is output to the reference image memory 406. The decoded signal 414 is further output from the moving picture decoding unit 400 via the reference image memory 406, accumulated in the output buffer 408 or the like, and then output at a timing managed by the decoding control unit 409. If the decoded signal belongs to a reference image, the reference image memory 406 sends the decoded signal 414 to the output buffer and also stores it in its internal memory. The stored decoded signal 414 is used for prediction as the reference signal 412. If the decoded signal belongs to a non-reference image, the decoded signal 414 is sent to the output buffer without being stored in the internal memory. A signal indicating whether the decoded signal is a reference image is multiplexed into the encoded data.
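As an illustration only, the addition performed by the adder 405 can be sketched as follows. The function name `reconstruct_block`, the list-of-lists block layout, and the clipping of the sum to the sample range are assumptions made for this sketch; the text above only states that the two signals are added.

```python
def reconstruct_block(pred, err, bit_depth=8):
    """Add an error block to a predicted block, sample by sample.

    pred, err: 2-D lists of ints with the same shape (hypothetical layout).
    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + e, 0), max_val) for p, e in zip(pred_row, err_row)]
        for pred_row, err_row in zip(pred, err)
    ]


# Hypothetical 2x2 example: predicted image signal plus error signal.
predicted = [[100, 102], [250, 3]]
error = [[-4, 5], [10, -7]]
decoded = reconstruct_block(predicted, error)
```

In this sketch the result plays the role of the decoded signal 414; whether it is also kept as a reference would depend on the reference-image indication carried in the encoded data.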

Meanwhile, the prediction mode information 409, the information table index length 410, and so on decoded by the code string decoding unit 403 are input to the prediction unit 407. The already decoded reference signal 412 is also supplied from the reference image memory 406 to the prediction unit 407. Based on the input mode information and the like, the prediction unit 407 generates the predicted image signal 411 and supplies it to the adder 405.

The decoding control unit 409 controls the output timing of the input buffer 401 and the output buffer 408, the decoding timing, and so on.

The above is the configuration of the moving picture decoding apparatus according to this embodiment. An example in which the decoding unit 400 carries out the moving picture decoding method according to the present invention is described below. In this moving picture decoding, the prediction control unit 407 has the same configuration as the prediction control unit 501 of FIG. 3 used in the encoding unit 100 of FIG. 1, and is therefore described with reference to FIG. 3.

The frequency information table generation unit 202 tabulates the frequency of the prediction mode information of the pixel blocks decoded so far. When a pixel block is decoded, the frequency information table is updated according to the prediction information 209 given from the control unit 210. The updated frequency information table is sent to the frequency information table extraction unit 201.

The frequency information table is now described concretely. FIG. 7 shows how the frequency information table is updated. The numbers shown in FIG. 7 are prediction mode numbers. Each time the mode determination for one pixel block is completed, the frequency information table is updated according to the number of the selected prediction mode. First, the prediction modes of the pixel blocks adjacent above and to the left of the decoding target pixel block are sorted. Take the rightmost pixel block in the figure as an example. The prediction mode of the block above it is 1, and that of the block to its left is 7. In this case, prediction mode 7 of the left neighbor is located in the preceding frequency information table and moved to first place (table index 0). Next, prediction mode 1 of the upper neighbor is located in the table and moved to second place (table index 1). By sorting the prediction modes of the upper and left neighbors of each pixel block to the top of the frequency information table in this way, frequency information on the prediction modes is obtained. With this frequency information table, prediction modes selected before the decoding target pixel block sit near the top of the table, and unused prediction modes sit near the bottom.
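The table update described above can be sketched as a move-to-front operation. This is a minimal sketch under stated assumptions: the function name `update_frequency_table` is hypothetical, the starting order (modes 0 to 8 in numerical order) is chosen only for the example because FIG. 7 is not reproduced here, and the handling of the case where both neighbors use the same mode is an assumption.

```python
def update_frequency_table(table, left_mode, up_mode):
    """Return a new table with left_mode at index 0 and up_mode at index 1.

    table: list of prediction mode numbers, most frequent first.
    If both neighbours use the same mode it is moved to index 0 only
    (an assumption; the text does not cover this case).
    """
    updated = list(table)
    # Move the left neighbour's mode to first place (table index 0).
    updated.remove(left_mode)
    updated.insert(0, left_mode)
    # Move the upper neighbour's mode to second place (table index 1).
    if up_mode != left_mode:
        updated.remove(up_mode)
        updated.insert(1, up_mode)
    return updated


# Example from the text: upper neighbour uses mode 1, left neighbour mode 7.
previous_table = list(range(9))  # hypothetical starting order 0..8
new_table = update_frequency_table(previous_table, left_mode=7, up_mode=1)
```

Because the encoder and decoder apply the same update to the same neighbor modes, both sides hold identical tables without the table itself being transmitted.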

The index length set in the control unit 210 defines the length of the table index shown in FIG. 7. When a prediction mode is decoded, only the bits corresponding to the table index length need to be decoded, so redundant bits can be reduced.

Prediction using this method is hereinafter referred to as flexible mode prediction.
The frequency information table generated by the frequency information table generation unit 202 is output to the frequency information table extraction unit 201. From the input frequency information table, the frequency information table extraction unit 201 extracts the prediction mode number corresponding to the index number decoded by the code string decoding unit. The extracted prediction mode is output to the prediction mode setting unit 203. The prediction mode setting unit 203 sets the input extracted prediction mode in the control unit 210 and switches the prediction changeover switch 207 to the corresponding prediction mode. Each prediction mode set by the prediction mode setting unit 203 corresponds to the number of a prediction unit, and prediction is performed by a predefined prediction method. Here, as an example, the 4x4 pixel (directional) prediction defined in H.264 is performed.
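The extraction step can be illustrated with a short lookup. This is only a sketch: the function name `extract_prediction_mode` and the example table contents are hypothetical, and rejecting indices outside the signalled index length is an assumption about how a decoder might guard the lookup.

```python
def extract_prediction_mode(table, index, index_length):
    """Map a decoded table index number to a prediction mode number.

    table: frequency information table (mode numbers, most frequent first).
    index_length: number of leading table entries that can be addressed.
    """
    if not 0 <= index < min(index_length, len(table)):
        raise ValueError("index outside the signalled table index length")
    return table[index]


# Hypothetical table state and decoded index number.
table = [7, 1, 0, 2, 3, 4, 5, 6, 8]
mode = extract_prediction_mode(table, index=2, index_length=4)
```

The returned mode number is what would then select the matching predictor through the prediction changeover switch 207.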

H.264 has nine prediction modes and, as shown in FIG. 8(a), each mode except mode 2 has a prediction direction that differs from the others in steps of 22.5 degrees. Modes 0 to 8 are defined, and mode 2 is DC prediction. FIG. 8(b) shows the relationship between the prediction block and the reference pixels in 4x4 pixel prediction. The pixels labeled with uppercase letters A to M are reference pixels, and the pixels labeled with lowercase letters a to p are the prediction pixels to be decoded.

The prediction method of the predictor 204 is described next. When mode 2 (DC prediction) is selected, the predictor 204 calculates the prediction pixels using Equation (3).

When some reference pixels are unavailable, the prediction uses the average of the reference pixels that are available. If no reference pixel is available at all, the prediction value is set to half the maximum luminance value of the decoding apparatus (128 for 8 bits).
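Equation (3) is not reproduced in this text, so the following is a hedged sketch of DC prediction rather than the equation itself. It assumes the usual H.264 labelling in which A to D lie above the block and I to L to its left, and it uses a rounded integer average; the function name `dc_predict_4x4` and its parameters are hypothetical.

```python
def dc_predict_4x4(top=None, left=None, bit_depth=8):
    """Return a 4x4 block filled with the DC prediction value.

    top:  luminance values of the pixels above the block (A..D), or None.
    left: luminance values of the pixels to the left (I..L), or None.
    None marks an unavailable neighbour (assumed convention of this sketch).
    """
    available = list(top or []) + list(left or [])
    if available:
        n = len(available)
        # Rounded average of whichever reference pixels are available.
        dc = (sum(available) + n // 2) // n
    else:
        # No reference pixel: half the maximum value (128 for 8 bits).
        dc = 1 << (bit_depth - 1)
    return [[dc] * 4 for _ in range(4)]


both = dc_predict_4x4(top=[10, 20, 30, 40], left=[50, 60, 70, 80])
top_only = dc_predict_4x4(top=[10, 20, 30, 40])
neither = dc_predict_4x4()
```

With eight available pixels the rounding term reduces to the familiar "add 4, shift right by 3" form, and with four pixels to "add 2, shift right by 2".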

When any other mode is selected, the predictor 204 uses a prediction method that copies prediction values interpolated from the reference pixels along the prediction direction shown in FIG. 8(a). As a concrete example, the generation of prediction values when mode 0 (vertical prediction) is selected is described using Equation (4).

This mode can be selected only when reference pixels A to D are available. FIG. 8(c) shows the details of the prediction method. The luminance values of reference pixels A to D are copied unchanged in the vertical direction and used as the prediction values.
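Vertical prediction as described above amounts to repeating the row A to D in every row of the block. The sketch below illustrates this; the function name `vertical_predict_4x4` and the use of an exception for unavailable reference pixels are assumptions of the example, and Equation (4) itself is not reproduced in this text.

```python
def vertical_predict_4x4(top):
    """Copy reference pixels A..D down each column of a 4x4 block.

    top: list of the four luminance values A, B, C, D above the block.
    Raises ValueError when A..D are not available, since the mode can
    only be selected in that case.
    """
    if top is None or len(top) != 4:
        raise ValueError("vertical prediction requires reference pixels A..D")
    # Every row of the predicted block is a copy of the row above the block.
    return [list(top) for _ in range(4)]


block = vertical_predict_4x4([12, 34, 56, 78])
```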

Prediction methods other than prediction modes 0 and 2 use much the same framework: interpolated values are generated from the reference pixels available for the prediction direction, and those values are copied along the prediction direction. FIG. 9 shows the correspondence between the prediction modes and the prediction pixel block shapes in this embodiment.

The predicted image signal 411 output by the predictor 204 is output from the prediction control unit 407 and is added, in the adder 405 described above, to the error signal output from the inverse quantization inverse transform unit 404, generating the decoded signal 414.

The above is the configuration of the decoding unit 400 in this embodiment of the present invention. An example in which the moving picture decoding unit 400 carries out the moving picture decoding method according to the present invention is described below. The syntax structure, the individual syntaxes, and the binarization used for this moving picture decoding are the same as the syntax structure of FIG. 10 and the syntaxes of FIGS. 11 to 14 used for moving picture encoding, so their description is omitted. The binarization example is likewise the same as the one described for encoding with reference to FIG. 17, so its description is omitted.

The table index number representing the prediction mode selected by the mode determination unit 103 and the internal mode setting unit 203 has been binarized using a conversion table such as the one shown in FIG. 17, and the binarized bit string is entropy decoded (for example, by arithmetic decoding) by the code string decoding unit 403. Besides the decoding described above, the binarization may have been performed using a method such as an entropy code, a Shannon code, or an arithmetic code. In any case, the encoding unit and the decoding unit must use the same binarization scheme.

As another example of this embodiment, the macroblock_data syntax shown in FIG. 14 may be replaced with the syntax shown in FIG. 15. The difference from FIG. 14 is that, when the value of mode_index[iBlk] is a predetermined ESCAPE_CODE, ecs_mode_index[iBlk] is additionally sent.

For example, when the code string 1111 in FIG. 17 (or 11111 for the Huffman code) is ESCAPE_CODE, it indicates that the prediction mode for the decoding target block is not contained in the portion of the frequency information table covered by the current table length table_index_length. ecs_mode_index[iBlk] then indicates an index number beyond the table length given by table_index_length. FIG. 18 shows an example in which the length of the frequency information table (the total number of prediction modes) is M = 15 and the index length is L = 8. When the selected index of the frequency information table falls outside the index length L = 8, ESCAPE_CODE is set in mode_index[iBlk], and the ecs_mode_index[iBlk] corresponding to the overflowing index is also set. For example, when the selected index of the frequency information table in the figure is 10, ESCAPE_CODE = 111 is set in mode_index[iBlk] and 011 is set in ecs_mode_index[iBlk]. In this way every prediction mode can be received, which makes extensions such as adding or removing prediction modes easy. It also removes the need to design syntax for each individual prediction mode.
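The escape mechanism can be sketched as below. FIG. 18 is not reproduced here, so the binarization is an assumption inferred from the single worked example (M = 15, L = 8, index 10 gives 111 followed by 011): a fixed-length code in which the last code word of the index length is reserved as ESCAPE_CODE and ecs_mode_index carries the offset from that reserved position. The function names are hypothetical and the actual tables in the figures may differ.

```python
def encode_mode_index(index, index_length, table_length):
    """Return (mode_index bits, ecs_mode_index bits or None).

    Assumed fixed-length binarization: indices below index_length - 1 are
    sent directly; anything else is sent as ESCAPE_CODE plus an offset.
    """
    if not 0 <= index < table_length:
        raise ValueError("index outside the frequency information table")
    escape = index_length - 1                 # reserved code word (111 for L = 8)
    width = max(1, escape.bit_length())
    if index < escape:
        return format(index, f"0{width}b"), None
    ecs_width = max(1, (table_length - escape - 1).bit_length())
    return format(escape, f"0{width}b"), format(index - escape, f"0{ecs_width}b")


def decode_mode_index(mode_bits, ecs_bits, index_length):
    """Inverse of encode_mode_index under the same assumptions."""
    escape = index_length - 1
    value = int(mode_bits, 2)
    if value != escape:
        return value
    return escape + int(ecs_bits, 2)


# Worked example from the text: M = 15, L = 8, selected table index 10.
mode_bits, ecs_bits = encode_mode_index(10, index_length=8, table_length=15)
```

Under these assumptions the decoder reads ecs_mode_index only after seeing ESCAPE_CODE, so frequently used modes near the top of the table cost no extra bits.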

As another example of this embodiment, the syntax shown in FIG. 16 may be used. In this case, seq_flexble_mode_prediction_flag, pic_flexble_mode_prediction_flag, slice_flexble_mode_prediction_flag, and table_index_length, which are added to the higher-level syntax, are not needed. Because no index length is received, all entries of the table are always available. Since updating the frequency information table always places the most frequent information for the decoding target block at the top of the table, the table index number can be received by preparing a binarization table such as the one in the third column of FIG. 16 described above.

(Decoding: Second Embodiment)
In this embodiment, the prediction control unit 407 provided in the decoding unit 400 of the decoding apparatus of FIG. 25 is configured in the same way as the prediction control unit 501 shown in FIG. 19, which is provided in the encoding apparatus of the second embodiment. That is, unlike the first embodiment, two different predictors are provided: an L0 predictor 604 and an L1 predictor 608. In addition, corresponding to the L0 predictor 604 and the L1 predictor 608 respectively, an L0 frequency information table generation unit 602, an L0 frequency information table extraction unit 601, an L0 prediction mode setting unit 603, an L1 frequency information table generation unit 606, an L1 frequency information table extraction unit 605, and an L1 prediction mode setting unit 607 are provided. An adaptive filter unit 609 that filters the predicted images output from the two predictors 604 and 608 is also provided. The operation of this prediction control unit 407 is the same as that of the prediction control unit 501 of the encoding apparatus, so a detailed description is omitted.

The predicted image signal generated by the adaptive filter unit 609 is output to the adder 405 of the decoding unit 400. The prediction information 612 and the frequency information tables have the same configuration and function as in the encoding apparatus, so their description is omitted. The same applies to the L0 and L1 predictors 604 and 608.

Although this embodiment has been described in detail for inter prediction (inter-frame prediction), intra prediction (intra-frame prediction) can be implemented with the same decoder structure. More specifically, one of the directional prediction modes defined in H.264 and shown in FIGS. 8 and 9 (for example, vertical prediction in 4x4 pixel prediction) is used as the L0 prediction, and another directional prediction mode (for example, vertical-left prediction in 4x4 pixel prediction) is used as the L1 prediction. The L0 predicted image signal 614 and the L1 predicted image signal 615 generated in the prediction control unit 600 are input to the adaptive filter unit 609, which filters the two signals to generate a new predicted image signal. An L0 frequency information table is generated for the L0 prediction modes, and an L1 frequency information table is generated for the L1 prediction modes. Generating the predicted image signal in this way makes it possible to derive a new predicted image signal from two predicted image signals.
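How two predicted image signals might be merged into one can be illustrated with a simple weighted average. This section does not specify the coefficients of the adaptive filter unit 609, so the equal default weights, the rounding, and the function name `blend_predictions` are stand-ins chosen for the sketch rather than the filter actually used.

```python
def blend_predictions(l0, l1, w0=1, w1=1):
    """Combine two predicted blocks sample by sample with integer weights.

    l0, l1: 2-D lists of ints with identical shape (L0 and L1 predictions).
    w0, w1: hypothetical non-negative integer weights; the default is a
    plain rounded average.
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(l0, l1)
    ]


# Hypothetical 2x2 L0 and L1 predicted blocks.
l0_pred = [[10, 20], [30, 40]]
l1_pred = [[20, 20], [31, 41]]
combined = blend_predictions(l0_pred, l1_pred)
```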

The syntax decoding method used in this prediction scheme is the same as in the first decoding embodiment, so its description is omitted.

With this scheme, the prediction mode used for prediction can be changed for each sequence, each slice, or each macroblock, so a highly accurate predicted image can be generated for each block. Although this embodiment has been described using moving picture encoding as an example, the present invention can also be applied to still image encoding.

As described above, according to the present invention, the frequency information table of selected prediction modes is used so that, for the encoding target block, only the frequently occurring prediction modes near the top of the table are extracted from the prediction modes given by the table and used to generate predicted image signals, and the index length of the table at the time of extraction is multiplexed into the syntax and transmitted to the decoder. This makes it possible to generate predicted images at reduced hardware computation cost while maintaining coding efficiency higher than that of conventional predicted image generation methods.

FIG. 1: Block diagram showing the configuration of a moving picture encoding apparatus according to an embodiment of the present invention.
FIG. 2: Block diagram showing the internal prediction and mode determination unit that forms part of the moving picture encoding apparatus according to the embodiment.
FIG. 3: Block diagram showing the prediction control unit that forms part of the internal prediction and mode determination unit according to the embodiment.
FIG. 4: Block diagram showing the mode control unit that forms part of the internal prediction and mode determination unit according to the embodiment.
FIG. 5: Diagram showing the encoding order and block sizes according to the embodiment.
FIG. 6: Flowchart showing the flow of the encoding process according to the embodiment.
FIG. 7: Diagram showing the method of updating the frequency information table according to the embodiment.
FIG. 8: Diagram showing the prediction directions of the reference image used for directional prediction according to the embodiment.
FIG. 9: Table listing the names of the intra prediction methods according to the embodiment.
FIG. 10: Schematic diagram of the syntax structure according to the embodiment.
FIG. 11: Diagram showing the data structure of the sequence parameter set syntax according to the embodiment.
FIG. 12: Diagram showing the data structure of the picture parameter set syntax according to the embodiment.
FIG. 13: Diagram showing the data structure of the slice header syntax according to the embodiment.
FIG. 14: Diagram showing the data structure of the macroblock layer syntax according to the embodiment.
FIG. 15: Diagram showing the data structure of the macroblock layer syntax according to the embodiment.
FIG. 16: Diagram showing the data structure of the macroblock layer syntax according to the embodiment.
FIG. 17: Table showing the binarization of the frequency information table index according to the embodiment.
FIG. 18: Table outlining the binarization of the frequency information table index and the escape code according to the embodiment.
FIG. 19: Block diagram showing the prediction control unit that forms part of the internal prediction and mode determination unit according to an embodiment of the present invention.
FIG. 20: Table listing the prediction names of the L0 prediction modes and L1 prediction modes according to an embodiment of the present invention.
FIG. 21: Schematic diagram illustrating the prediction method using the L0 prediction mode and the L1 prediction mode according to an embodiment of the present invention.
FIG. 22: Diagram showing the data structure of the macroblock layer syntax according to an embodiment of the present invention.
FIG. 23: Diagram showing the data structure of the macroblock layer syntax according to an embodiment of the present invention.
FIG. 24: Diagram showing the data structure of the macroblock layer syntax according to an embodiment of the present invention.
FIG. 25: Block diagram showing the configuration of a moving picture decoding apparatus according to an embodiment of the present invention.

Explanation of symbols

100 ... encoding unit, 101 ... screen division unit, 102 ... internal prediction/mode determination unit, 103 ... mode determination unit, 104 ... transform quantization unit, 105 ... encoding processing unit, 106 ... inverse quantization inverse transform unit, 107 ... reference image memory, 108 ... encoding control unit,
201 ... frequency information table extraction unit, 202 ... frequency information table generation unit, 203 ... prediction mode setting unit, 204 ... predictor, 205 ... internal reference image memory, 207 ... prediction changeover switch, 210 ... control unit,
301 ... internal mode determination unit, 302 ... internal transform quantization unit, 303 ... provisional encoding processing unit, 304 ... internal inverse quantization inverse transform unit,
401 ... input buffer, 402 ... demultiplexing unit, 403 ... code string decoding unit, 404 ... inverse quantization inverse transform unit, 405 ... adder, 406 ... reference image memory, 407 ... prediction control unit, 408 ... output buffer, 409 ... decoding control unit,
501 ... prediction control unit, 502 ... mode control unit, 506 ... subtractor,
600 ... prediction control unit, 601 ... L0 frequency information table extraction unit, 602 ... L0 frequency information table generation unit, 603 ... L0 prediction mode setting unit, 604 ... L0 prediction unit, 605 ... L1 frequency information table extraction unit, 606 ... L1 frequency information table generation unit, 607 ... L1 prediction mode setting unit, 608 ... L1 prediction unit, 609 ... adaptive filter unit, 610 ... internal reference image memory

Claims

An image encoding method comprising:
preparing a frequency information table indicating the selection frequency of incidental information related to prediction modes;
dividing an input image into a plurality of pixel blocks;
selecting incidental information related to a prediction mode according to an encoding target pixel block among the pixel blocks;
generating a predicted image for the encoding target pixel block using a reference image based on the selected incidental information;
determining an optimum prediction mode based on a prediction error between the input image and the predicted image and on a code amount of the prediction mode, and rearranging the selection frequency order of the prediction modes in the frequency information table according to the determined prediction mode;
generating an index of the rearranged frequency information table;
extracting one or more pieces of incidental information from the index for the encoding target pixel block;
generating a prediction signal corresponding to the extracted incidental information;
calculating a cost of the prediction mode and selecting one encoding mode based on the cost; and
encoding, according to the selected encoding mode, a prediction error signal and a table length of the frequency information table, and encoding an index number in the frequency information table indicating the selected encoding mode.

An image encoding method comprising:
preparing a frequency information table indicating the selection frequency of a plurality of prediction modes;
dividing an input image into a plurality of pixel blocks;
selecting a prediction mode according to an encoding target pixel block among the pixel blocks;
generating a predicted image for the encoding target pixel block using a reference image based on the selected prediction mode;
determining an optimum prediction mode based on a prediction error between the input image and the predicted image and on a code amount of the prediction mode, and rearranging the selection frequency order of the prediction modes in the frequency information table according to the determined prediction mode;
generating an index of the rearranged frequency information table;
extracting one or more prediction modes from the prediction modes corresponding to the index for the encoding target pixel block;
generating a prediction signal and prediction mode information corresponding to the extracted prediction mode;
calculating an encoding cost of the prediction mode and selecting one encoding mode based on the encoding cost; and
encoding a prediction error signal in the selected encoding mode, a table length of the frequency information table, and an index number in the frequency information table indicating the selected encoding mode.

Dividing the input image into a plurality of pixel blocks;
Selecting a prediction mode according to an encoding target pixel block of the pixel block;
Generating a first frequency information table by tabulating the selection frequency of the prediction mode for the selected first type prediction mode;
Generating a second frequency information table by tabulating the selection frequency of the prediction mode with respect to the selected second type prediction mode;
Generating an index of the first and second type frequency information table;
A prediction mode extraction step of extracting one or more prediction modes from the prediction modes given from the first and second frequency information tables for the encoding target pixel block of the pixel block;
Generating a first type prediction signal, a second type prediction signal, and prediction mode information corresponding to the prediction mode extracted from the first and second frequency information tables, respectively.
Performing a filtering process on the first type prediction signal and the second type prediction signal to generate one prediction signal;
Calculating a prediction error signal for the prediction mode and selecting one encoding mode;
The first type and the prediction error signal generated in the selected encoding mode, the table length of the first frequency information table, the table length of the second information table, and the selected encoding mode; An encoding step for encoding an index number corresponding to the second type;
An image encoding method comprising:
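
As an illustration of the two-table claim above, the following minimal Python sketch pairs every entry of a first table with every entry of a second table and merges the two prediction signals into one. The claim only says "a filtering process"; the rounded average used here is an assumption, and `blend`, `candidate_predictions`, `predict_a`, and `predict_b` are hypothetical names, not taken from the specification.

```python
from itertools import product


def blend(pred_a, pred_b):
    """Merge two prediction blocks sample by sample with a rounded average.

    Assumption: the claimed "filtering process" is modelled as a plain average.
    """
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_a, pred_b)]


def candidate_predictions(table_a, table_b, predict_a, predict_b):
    """Yield ((index_a, index_b), blended_block) for every pair of table entries.

    table_a / table_b are lists of prediction modes, already ordered by
    selection frequency. predict_a / predict_b are caller-supplied functions
    that turn a mode into a prediction block (list of rows).
    """
    for (ia, mode_a), (ib, mode_b) in product(enumerate(table_a), enumerate(table_b)):
        yield (ia, ib), blend(predict_a(mode_a), predict_b(mode_b))
```

Each yielded pair of index numbers is what an encoder along these lines would signal for the chosen candidate, together with the two table lengths.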

The image encoding method according to claim 1 or 2, wherein the encoding step includes a step of performing a transform process on the prediction error signal, and a step of performing a quantization process on the transformed coefficients to generate quantized transform coefficients.
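
The transform and quantization steps can be sketched as follows. This is only an illustration: a 2x2 Hadamard transform stands in for whatever orthogonal transform the codec actually uses, the quantizer is a uniform one with rounding to nearest, and the names `hadamard_2x2` and `quantize` are hypothetical.

```python
def hadamard_2x2(block):
    """Apply a 2x2 Hadamard transform (H * X * H) to a 2x2 residual block."""
    (a, b), (c, d) = block
    return [[a + b + c + d, a - b + c - d],
            [a + b - c - d, a - b - c + d]]


def quantize(coeffs, step):
    """Uniformly quantize coefficients with an integer step size.

    Magnitudes are rounded to the nearest level; the sign is restored afterwards.
    """
    def level(c):
        magnitude = (abs(c) + step // 2) // step
        return magnitude if c >= 0 else -magnitude

    return [[level(c) for c in row] for row in coeffs]


residual = [[5, 3], [1, -1]]                       # prediction error signal
levels = quantize(hadamard_2x2(residual), step=4)  # quantized transform coefficients
```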

The image encoding method according to claim 1, further comprising a step of switching a size of a prediction pixel block corresponding to each encoding mode within a specific pixel block size.

The image encoding method according to claim 1, wherein the prediction mode extracting step includes a step of sending an index length of the frequency information table for each encoding sequence, each picture, or each encoding slice.

The image encoding method according to claim 1 or 2, wherein the prediction mode extraction step includes a step of switching whether or not to extract the prediction mode depending on whether the quantization scale value of the encoding target macroblock is large or small.

The image encoding method described, wherein the prediction mode information extraction step switches whether to extract the mode information according to whether the resolution of the input image signal is high or low.

The image encoding method according to claim 1, wherein the encoding mode selection step includes: a code amount calculation step of calculating a code amount when a signal generated in the selected encoding mode is encoded;
a step of locally decoding the signal generated in the selected encoding mode to generate a locally decoded image; and an encoding distortion calculation step of calculating an encoding distortion representing the difference between the locally decoded image and the input image signal.
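
A mode decision based on code amount and encoding distortion is commonly written as a Lagrangian cost J = D + lambda * R. The sketch below assumes that form, with the sum of squared errors as the distortion; the claim itself does not fix either choice, and `sse`, `select_mode`, and `lam` are hypothetical names.

```python
def sse(block_a, block_b):
    """Sum of squared differences between two blocks (lists of rows)."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def select_mode(original, candidates, lam):
    """Pick the encoding mode with the lowest cost D + lam * R.

    candidates maps a mode to (locally_decoded_block, code_amount_in_bits).
    Returns (best_mode, best_cost).
    """
    best_mode, best_cost = None, None
    for mode, (decoded, bits) in candidates.items():
        cost = sse(original, decoded) + lam * bits
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```

With a large `lam` the cheaper-to-code mode tends to win even if it distorts more; with a small `lam` the lower-distortion mode wins.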

The image encoding method according to claim 1, further comprising a step of sending the table length in sequence units, picture units, or slice units, and sending the index in macroblock units or block units.

The image encoding method according to claim 1, further comprising a step of sending the table length included in header data in sequence units or slice units, and/or sending the index included in header data in macroblock units.

Decoding the encoded signal for each pixel block according to the encoding mode of the encoded signal;
Decoding the table length of the frequency information table for tabulating the selection frequency of the additional information regarding the prediction mode of the decoded pixel block;
Generating the frequency information table based on the decoded table length;
Decoding an index number of the frequency information table;
Extracting additional information corresponding to the decoded pixel block from the index;
Generating a prediction signal and a prediction mode corresponding to the extracted additional information;
Generating a prediction error signal based on the decoded signal;
Adding a prediction signal and a prediction error signal to generate a decoded image;
An image decoding method comprising:

Decoding the encoded signal for each pixel block according to the encoding mode of the encoded signal;
Decoding a table length of a frequency information table for tabulating a selection frequency of a prediction mode of a decoded pixel block;
Generating the frequency information table based on the decoded table length;
Decoding an index number of the frequency information table;
Extracting one or more prediction modes from prediction modes corresponding to the index;
Generating a prediction signal corresponding to the extracted prediction mode;
Generating a prediction error signal based on the decoded signal;
Adding a prediction signal and a prediction error signal to generate a decoded image;
An image decoding method comprising:
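
The decoding steps above can be sketched in Python as follows. The sketch assumes the decoder already holds the modes ordered by selection frequency, truncates that order to the decoded table length, and looks up the decoded index number. Clipping to an 8-bit sample range is also an assumption, and all function names are hypothetical.

```python
def decode_block_mode(frequency_order, table_length, index_number):
    """Map a decoded index number back to a prediction mode.

    frequency_order: modes sorted by descending selection frequency.
    table_length:    decoded table length (number of entries kept).
    """
    table = frequency_order[:table_length]
    if not 0 <= index_number < len(table):
        raise ValueError("index number outside the signalled table")
    return table[index_number]


def reconstruct(prediction, residual, max_value=255):
    """Add the prediction signal and the prediction error signal.

    Results are clipped to [0, max_value] (assumed 8-bit samples).
    """
    return [[min(max(p + r, 0), max_value) for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, residual)]
```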

Decoding the encoded signal for each pixel block according to the encoding mode of the encoded signal;
Decoding the table length of the frequency information table for tabulating the selection frequencies of the first type prediction mode and the second type prediction mode of the decoded pixel block;
Generating a first type frequency information table by tabulating the selection frequency of the prediction mode for the first type prediction mode;
Generating a second type frequency information table by tabulating the selection frequency of the prediction mode for the second type of prediction mode;
Respectively decoding the index numbers of the first and second type frequency information tables;
Extracting a prediction mode corresponding to each of the first type and second type prediction modes from the index;
Generating a first type prediction signal, a second type prediction signal, and prediction mode information corresponding to the extracted first type and second type prediction modes;
Performing a filtering process on the first type prediction signal and the second type prediction signal to generate one prediction signal;
Generating a prediction error signal based on the decoded signal;
Adding a prediction signal and a prediction error signal to generate a decoded image;
An image decoding method comprising:

The image decoding method, wherein the prediction error signal generation step includes a step of inversely quantizing decoded coefficients and a step of generating the prediction error signal by inversely transforming the inversely quantized transform coefficients.
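
Inverse quantization followed by an inverse transform can be sketched as below. As an assumption for illustration only, a uniform quantizer and a 2x2 Hadamard transform stand in for the codec's real quantizer and transform; `dequantize` and `inverse_hadamard_2x2` are hypothetical names.

```python
def dequantize(levels, step):
    """Scale quantized levels back to transform coefficients."""
    return [[level * step for level in row] for row in levels]


def inverse_hadamard_2x2(coeffs):
    """Invert a 2x2 Hadamard transform: X = (H * Y * H) / 4.

    Integer floor division is used; it is exact when the coefficients
    come from an unquantized forward transform.
    """
    (a, b), (c, d) = coeffs
    return [[(a + b + c + d) // 4, (a - b + c - d) // 4],
            [(a + b - c - d) // 4, (a - b - c + d) // 4]]


levels = [[2, 1], [2, 0]]  # decoded quantized coefficients
residual = inverse_hadamard_2x2(dequantize(levels, step=4))
```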

The image decoding method according to claim 12 or 13, further comprising a step of switching a size of a predicted pixel block corresponding to each encoding mode within a specific pixel block size.

The image decoding method according to claim 12, wherein the prediction mode extraction step sends the index length of the frequency table for each sequence, each picture, or each slice when extracting the prediction mode.

The image decoding method according to claim 12 or 13, wherein the prediction mode extraction step extracts the prediction mode depending on whether a quantization scale value of a decoding target macroblock is large or small.

The image decoding method according to claim 10 or 11, wherein the prediction mode extracting step extracts the mode information according to whether the resolution of the decoding target input image signal is high or low.

A memory for storing a frequency information table indicating the selection frequency of the incidental information related to the prediction mode;
A dividing unit for dividing the input image into a plurality of pixel blocks;
A selection unit that selects incidental information related to a prediction mode according to a pixel block to be encoded of the pixel block;
A prediction unit that generates a prediction image for the encoding target pixel block using a reference image based on the selected incidental information;
A table updating unit that determines an optimal prediction mode based on a prediction error between an input image and a prediction image and a code amount of the prediction mode, and rearranges a selection frequency order of prediction modes in the frequency information table according to the determined prediction mode;
An index generation unit for generating an index of the sorted frequency information table;
An extraction unit that extracts one or more pieces of incidental information from the index for the encoding target pixel block;
A prediction signal generation unit that generates a prediction signal corresponding to the extracted incidental information;
A selection unit that calculates a cost of the prediction mode and selects one encoding mode from the cost;
An encoding unit that encodes the prediction error signal, the table length of the frequency information table according to the selected encoding mode, and an index number in the frequency information table indicating the selected encoding mode;
An image encoding apparatus comprising:

A decoding unit that decodes the encoded signal for each pixel block according to an encoding mode of the encoded signal;
A table decoding unit that decodes the table length of the frequency information table for tabulating the selection frequency of the additional information related to the prediction mode of the decoded pixel block;
A frequency information table generating unit that generates the frequency information table based on the decoded table length;
An index decoding unit for decoding an index number of the frequency information table;
An additional information extraction unit that extracts additional information corresponding to the decoded pixel block from the index;
A prediction signal generation unit that generates a prediction signal and a prediction mode corresponding to the extracted additional information;
A prediction error signal generation unit that generates a prediction error signal based on the decoded signal;
A decoded image generating unit that generates a decoded image by adding the prediction signal and the prediction error signal;
An image decoding apparatus comprising:

A procedure for preparing a frequency information table indicating the selection frequency of the incidental information related to the prediction mode;
Dividing the input image into a plurality of pixel blocks;
A procedure for selecting incidental information related to a prediction mode according to an encoding target pixel block of the pixel block;
Generating a prediction image for the encoding target pixel block using a reference image based on the selected incidental information;
A step of determining an optimal prediction mode based on a prediction error between an input image and a prediction image and a code amount of the prediction mode, and rearranging a selection frequency order of prediction modes in the frequency information table according to the determined prediction mode;
A procedure for generating an index of the sorted frequency information table;
A procedure for extracting one or more pieces of incidental information from the index for the encoding target pixel block;
Generating a prediction signal corresponding to the extracted incidental information;
Calculating a cost of the prediction mode and selecting one encoding mode from the cost;
Encoding the prediction error signal, the table length of the frequency information table according to the selected encoding mode, and the index number in the frequency information table indicating the selected encoding mode;
An image encoding program for causing a computer to execute:

A procedure for decoding the encoded signal for each pixel block according to an encoding mode of the encoded signal;
A procedure for decoding the table length of the frequency information table for tabulating the selection frequency of the additional information related to the prediction mode of the decoded pixel block;
Generating the frequency information table based on the decoded table length;
A procedure for decoding an index number of the frequency information table;
A procedure for extracting additional information corresponding to a decoded pixel block from the index;
Generating a prediction signal and a prediction mode corresponding to the extracted additional information;
Generating a prediction error signal based on the decoded signal;
A procedure for adding a prediction signal and a prediction error signal to generate a decoded image;
An image decoding program for causing a computer to execute:

A table generating step of generating, based on a selection history of a plurality of pieces of information related to prediction modes in blocks encoded before each of a plurality of blocks into which an input image is divided, a table in which the plurality of pieces of information are arranged, according to a predetermined rule, in descending order of the likelihood of being selected in each block;
A selection step of selecting selection information to be used for prediction of each block from the plurality of pieces of information;
A prediction step of generating a prediction residual signal of each block from the image signal of each block by performing prediction according to the selection information;
An encoding step of generating encoded data by encoding the prediction residual signal of each block, information indicating the length of the table, and an index number corresponding to the selection information in the selection table;
An image encoding method comprising:

25. The image encoding method according to claim 24, wherein, in the table generation step, the table is generated using a subset of the plurality of pieces of information, extracted in descending order of the likelihood of being selected.
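
A table built from the selection history and truncated to its most likely entries might look like the following Python sketch. The "predetermined rule" is assumed here to be a simple selection count with ties keeping their previous order; the claims do not state the rule, and the class and method names are hypothetical.

```python
from collections import Counter


class SelectionTable:
    """Keeps candidate items ordered by how often earlier blocks selected them."""

    def __init__(self, all_items, table_length):
        self.counts = Counter({item: 0 for item in all_items})
        self.order = list(all_items)      # initial order acts as default priority
        self.table_length = table_length  # number of entries kept in the table

    def table(self):
        """Return only the most likely entries (the truncated table)."""
        return self.order[:self.table_length]

    def index_of(self, item):
        """Index number to encode, or None if the item is not in the table."""
        current = self.table()
        return current.index(item) if item in current else None

    def update(self, item):
        """Record that a block selected `item` and re-rank the candidates."""
        self.counts[item] += 1
        # list.sort is stable, so equal counts keep their previous order.
        self.order.sort(key=lambda candidate: -self.counts[candidate])
```

How an encoder should treat an item that falls outside the truncated table is not stated here, so `index_of` simply returns `None` in that case.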

A table generation unit that generates, based on a selection history of a plurality of pieces of information related to prediction modes in blocks encoded before each of a plurality of blocks into which an input image is divided, a table in which the plurality of pieces of information are arranged, according to a predetermined rule, in descending order of the likelihood of being selected in each block;
A selection unit that selects selection information used for prediction of each block from the plurality of pieces of information;
A prediction unit that generates a prediction residual signal of each block from the image signal of each block by performing prediction according to the selection information;
An encoding unit that generates encoded data by encoding the prediction residual signal of each block, information indicating the length of the table, and an index number corresponding to the selection information in the selection table;
An image encoding device having:

27. The image encoding apparatus according to claim 26, wherein the table generation unit generates the table using a subset of the plurality of pieces of information, extracted in descending order of the likelihood of being selected.