JP4227167B2

JP4227167B2 - Moving picture decoding method and apparatus

Info

Publication number: JP4227167B2
Application number: JP2006327621A
Authority: JP
Inventors: 健中條; 晋一郎古藤; 義浩菊池; 昭行谷沢
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2006-12-04
Filing date: 2006-12-04
Publication date: 2009-02-18
Anticipated expiration: 2022-11-22
Also published as: JP2007097218A

Description

The present invention relates to a moving image encoding/decoding method and apparatus that encode and decode with high efficiency, particularly for fade images and dissolve images.

In moving image coding standards such as ITU-T H.261, H.263, ISO/IEC MPEG-2, and MPEG-4, motion-compensated predictive interframe coding is used as one of the coding modes. The prediction model adopted in motion-compensated predictive interframe coding is one whose prediction efficiency is highest when brightness does not change in the time direction. For a fade image, in which the brightness of the image changes (for example, a fade-in from a black image to a normal image), no method is known for predicting appropriately with respect to the change in brightness. Consequently, a large code amount is required to maintain image quality in fade images as well.

To address this problem, Japanese Patent No. 3166716, “Fade Image-Adaptive Moving Image Encoding Apparatus and Encoding Method” (Patent Document 1), for example, detects fade image portions and changes the allocation of the code amount. Specifically, for a fade-out image, a large code amount is allocated to the beginning of the fade-out, where the luminance changes. The final part of a fade-out usually becomes a monochrome image and is therefore easy to encode, so its code amount allocation is reduced. In this way, the overall image quality is improved without greatly increasing the total code amount.

On the other hand, Japanese Patent No. 2938412, “Method for Compensating Luminance Change of Moving Image, Moving Image Encoding Apparatus, Moving Image Decoding Apparatus, Recording Medium Recording Moving Image Encoding or Decoding Program, and Recording Medium Recording Encoded Moving Image Data” (Patent Document 2), proposes a coding scheme that handles fade images by compensating the reference image according to two parameters, a luminance change amount and a contrast change amount.

Thomas Wiegand and Bernd Girod, “Multi-frame motion-compensated prediction for video transmission”, Kluwer Academic Publishers 2001 (Non-Patent Document 1) proposes a coding scheme based on a plurality of frame buffers. This scheme aims to improve prediction efficiency by selectively creating a predicted image from a plurality of reference frames held in the frame buffers.
Patent Document 1: Japanese Patent No. 3166716
Patent Document 2: Japanese Patent No. 2938412
Non-Patent Document 1: Thomas Wiegand and Bernd Girod, “Multi-frame motion-compensated prediction for video transmission”, Kluwer Academic Publishers 2001

Patent Document 1 improves image quality in fade image coding without increasing the total code amount by detecting fade image portions and changing the code amount allocation. It therefore has the advantage that it can be realized within the framework of existing coding schemes, but because it does not fundamentally raise prediction efficiency, a large improvement in coding efficiency cannot be expected.

Patent Document 2, on the other hand, has the merit of improving prediction efficiency for fade images, but it does not achieve sufficient prediction efficiency for so-called dissolve images (also called cross-fade images), in which one image gradually changes into another.

The scheme of Non-Patent Document 1 does not deal adequately with fade images or dissolve images, and prediction efficiency cannot be improved even if a plurality of reference frames are prepared.

As described above, the conventional techniques require a large code amount to encode fade images and dissolve images while maintaining high image quality, and an improvement in coding efficiency cannot be expected.

Accordingly, an object of the present invention is to provide moving image encoding and decoding methods and apparatuses that enable highly efficient coding with a small amount of computation, particularly for moving images whose luminance changes over time, such as fade images and dissolve images.

To solve the above problem, in a first aspect of the present invention, the encoding side performs motion-compensated predictive coding on an input moving image signal using at least one reference image signal and a motion vector between the input moving image signal and the reference image signal. When the number of reference images used for the motion-compensated predictive coding is one, a first predicted image signal generation method is used. This method generates a predicted image signal according to the reference image number and prediction parameter of one combination, selected for each coding target region of the input moving image signal from among a plurality of combinations, prepared in advance, of at least one reference image number and a prediction parameter.

When a plurality of reference images are used for the motion-compensated predictive coding, on the other hand, a second predicted image signal generation method is used. This method generates a predicted image signal, for each coding target region, according to the reference image numbers of the plurality of reference images and a prediction parameter calculated from the inter-image distance of the plurality of reference images.

A prediction error signal is then generated that represents the error of the predicted image signal obtained in this way with respect to the input moving image signal. The prediction error signal, the motion vector information, and index information indicating either the selected combination or the reference image numbers of the plurality of reference images are encoded.

In another aspect of the present invention, the first predicted image signal generation method is used when the prediction type of the coding target region of the input moving image signal is a first prediction type that uses one reference image for motion-compensated predictive coding. The second predicted image signal generation method is used when the prediction type of the coding target region is a bidirectional prediction type and a plurality of reference images are used for motion-compensated predictive coding.

On the decoding side, encoded data is decoded that contains a prediction error signal representing the error of a predicted image signal with respect to a moving image signal, motion vector information, and index information indicating either a combination of one reference image number and a prediction parameter or the reference image numbers of a plurality of reference images. When the decoded index information indicates a combination, a predicted image signal is generated according to the reference image number and prediction parameter of that combination. When the decoded index information indicates the reference image numbers of a plurality of reference images, a predicted image signal is generated according to those reference image numbers and a prediction parameter calculated from the inter-image distance of the plurality of reference images. A reproduced moving image signal is generated using the prediction error signal and the predicted image signal obtained in this way.

Thus, according to the present invention, two methods are prepared: a first predicted image generation method that generates a predicted image signal according to a combination of a reference image number and a prediction parameter, and a second predicted image generation method that generates a predicted image signal using a prediction parameter calculated from the inter-frame distance of a plurality of selected reference images. One of the two is selected and used according to the number of reference images used for motion-compensated predictive coding and the prediction type.

As a result, an appropriate predicted image signal can be created by a prediction scheme with higher prediction efficiency, even for input moving image signals such as fade images and dissolve images, for which the prediction schemes of ordinary moving image coding cannot create an appropriate predicted image signal.

Moreover, because the number of multiplications per pixel can be reduced to one, the hardware scale and computation cost can be reduced on both the encoding side and the decoding side.

Furthermore, the encoding side does not send the reference image number and prediction parameter information themselves to the decoding side. Instead, it sends index information indicating a combination of a reference image number and a prediction parameter or, when the reference image number is sent separately, index information indicating a combination of prediction parameters. Coding efficiency can thereby be improved.

As described above, the present invention makes it possible to perform moving image encoding/decoding that is highly efficient and requires little computation by predicting appropriately, particularly for moving images whose luminance changes over time, such as fade images and dissolve images.

Embodiments of the present invention are described below with reference to the drawings.
[First Embodiment]
(Encoding side)
FIG. 1 shows the configuration of a moving image encoding apparatus according to the first embodiment of the present invention. In this example, a moving image signal 100 is input to the moving image encoding apparatus in units of frames, for example. The moving image signal 100 is input to a subtractor 101, where the difference from a predicted image signal 212 is taken to generate a prediction error signal. A mode selection switch 102 selects either the prediction error signal or the input moving image signal 100, and an orthogonal transformer 103 applies an orthogonal transform, for example a discrete cosine transform (DCT), to the selected signal. The orthogonal transformer 103 yields orthogonal transform coefficient information, for example DCT coefficient information. The orthogonal transform coefficient information is quantized by a quantizer 104 and then split into two branches. One branch of the quantized orthogonal transform coefficient information 210 is fed to a variable length encoder 111.

The other branch of the quantized orthogonal transform coefficient information 210 passes through an inverse quantizer 105 and an inverse orthogonal transformer 106, which apply in sequence the inverse of the processing of the quantizer 104 and the orthogonal transformer 103, to yield a signal similar to the prediction error signal. An adder 107 then adds this signal to the predicted image signal 212 input via a switch 109, generating a locally decoded image signal 211. The locally decoded image signal 211 is input to a frame memory/predicted image generator 108.

The frame memory/predicted image generator 108 selects one combination from among a plurality of combinations, prepared in advance, of a reference frame number and prediction parameters. For the image signal (locally decoded image signal 211) of the reference frame indicated by the reference frame number in the selected combination, it calculates a linear sum according to the prediction parameters in the selected combination and then adds an offset according to the prediction parameters, generating in this example a reference image signal in units of frames. The frame memory/predicted image generator 108 then performs motion compensation on the reference image signal using a motion vector to generate the predicted image signal 212.

In this process, the frame memory/predicted image generator 108 generates motion vector information 214 and index information 215 indicating the selected combination of reference frame number and prediction parameters, and also sends the information needed to select a coding mode to a mode selector 110. The motion vector information 214 and the index information 215 are input to the variable length encoder 111. The frame memory/predicted image generator 108 is described in detail later.

The mode selector 110 selects a coding mode for each macroblock based on prediction information P from the frame memory/predicted image generator 108, choosing between intraframe coding (hereinafter, intra coding) and motion-compensated predictive interframe coding (hereinafter, inter coding), and outputs switch control signals M and S.

In the intra coding mode, the switch control signals M and S set the switches 102 and 112 to the A side, and the input moving image signal 100 is input to the orthogonal transformer 103. In the inter coding mode, the switch control signals M and S set the switches 102 and 112 to the B side, so that the prediction error signal from the subtractor 101 is input to the orthogonal transformer 103 and the predicted image signal 212 from the frame memory/predicted image generator 108 is input to the adder 107. The mode selector 110 outputs mode information 213, which is input to the variable length encoder 111.

The variable length encoder 111 variable-length encodes the orthogonal transform coefficient information 210, the mode information 213, the motion vector information 214, and the index information 215. The resulting variable length codes are multiplexed by a multiplexer 114 and then smoothed by an output buffer 115. Encoded data 116 output from the output buffer 115 is sent to a transmission system or storage system (not shown).

An encoding controller 113 controls an encoding unit 112 made up of the elements from the subtractor 101 to the variable length encoder 111. Specifically, for example, it monitors the buffer occupancy of the output buffer 115 and controls coding parameters such as the quantization step size of the quantizer 104 so that the buffer occupancy remains constant.

(Frame memory/predicted image generator 108)
FIG. 2 shows the detailed configuration of the frame memory/predicted image generator 108 in FIG. 1. In FIG. 2, the locally decoded image signal 211 input from the adder 107 in FIG. 1 is stored in a frame memory set 202 under the control of a memory controller 201. The frame memory set 202 has a plurality (N) of frame memories FM1 to FMN for temporarily holding the locally decoded image signal 211 as reference frames.

A prediction parameter controller 203 holds a plurality of combinations of a reference frame number and prediction parameters, prepared in advance as a table. Based on the input moving image signal 100, it selects the combination of reference frame number and prediction parameters to be used for generating the predicted image signal 212, and outputs index information 215 indicating the selected combination.

A multi-frame motion evaluator 204 creates a reference image signal according to the combination of reference frame number and index information selected by the prediction parameter controller 203. From this reference image signal and the input image signal 100 it evaluates the amount of motion and the prediction error, and outputs motion vector information 214 that minimizes the prediction error. A multi-frame motion compensator 205 generates the predicted image signal 212 by performing motion compensation, according to the motion vector, on the reference image signal selected for each block by the multi-frame motion evaluator 204.
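The search for a motion vector that minimizes the prediction error can be illustrated with a generic full-search block matcher. This is a minimal sketch, not the evaluator 204 itself: the sum of absolute differences (SAD) cost, the function names, and the search range are all assumptions, and the weighting of the reference image by the prediction parameters is left out.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` at (bx, by)
    and the block of `ref` displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Try every displacement within +/- `search` pixels and return
    (dx, dy, cost) for the candidate with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference.
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

In a multi-frame setting the same search would be repeated for each candidate reference frame, keeping the frame and vector with the lowest cost.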

(Generation of the predicted image)
Equations (1), (2), and (3) below are examples of prediction equations, prepared in the prediction parameter controller 203, that use a reference image number and prediction parameters. The examples shown here are the prediction equations applied in two cases: when a predicted image signal is generated by motion-compensated prediction using one reference image (reference picture) for a coding target image called a P picture, and when a predicted image signal is generated by motion-compensated prediction using only one of two reference images for a coding target image called a B picture.

Here, Y is the predicted image signal of the luminance signal; Cb and Cr are the predicted image signals of the two color difference signals; and R_Y(i), R_Cb(i), and R_Cr(i) are the pixel values of the luminance signal and the two color difference signals, respectively, of the reference image signal of index i. D₁(i) and D₂(i) are the prediction coefficient and offset of the luminance signal for index i. E₁(i) and E₂(i) are the prediction coefficient and offset of the color difference signal Cb for index i. F₁(i) and F₂(i) are the prediction coefficient and offset of the color difference signal Cr for index i. The index i takes a value from 0 to (maximum number of reference images - 1), is encoded for each coding target block (for example, each macroblock), and is transmitted to the moving image decoding apparatus.

The prediction parameters D₁(i), D₂(i), E₁(i), E₂(i), F₁(i), and F₂(i) are shared by the two apparatuses in one of two ways: they are values determined in advance between the moving image encoding apparatus and the decoding apparatus, or they are encoded together with the encoded data in a predetermined coding unit such as a frame, field, or slice and transmitted from the moving image encoding apparatus to the decoding apparatus.

Equations (1), (2), and (3) are prediction equations in which the denominator of the prediction coefficient that multiplies the reference image signal is chosen to be a power of 2, that is, 2, 4, 8, 16, and so on, so that division is avoided and the calculation can be done with an arithmetic shift. This avoids the increase in computation cost that division would cause.

That is, >> in equations (1), (2), and (3) is the operator for which a >> b arithmetically shifts the integer a to the right by b bits. The function clip( ) is a clipping function that sets the value in ( ) to 0 when it is less than 0 and to 255 when it is greater than 255, and returns an integer from 0 to 255.
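Equations (1) to (3) themselves are not reproduced in this text, so the following is only a sketch of how a coefficient with a power-of-2 denominator, an arithmetic shift, an offset, and clip( ) could fit together for one luminance pixel. The exact arrangement of terms (in particular the rounding constant 2^(shift-1) and the offset being added after the shift) is an assumption, as are the function and parameter names.

```python
def clip(value: int) -> int:
    """Clamp to the range 0..255, as the text describes for clip( )."""
    return max(0, min(255, value))


def weighted_luma(ref_pixel: int, coeff: int, offset: int, shift: int) -> int:
    """Scale a reference pixel by coeff / 2**shift using only a multiply
    and an arithmetic right shift, then add an offset and clip.

    Assumed form: clip(((coeff * ref + 2**(shift - 1)) >> shift) + offset)
    """
    rounding = (1 << (shift - 1)) if shift > 0 else 0
    # Python's >> on integers is an arithmetic shift, so negative
    # intermediate values are handled the way the text requires.
    return clip(((coeff * ref_pixel + rounding) >> shift) + offset)
```

With shift = 4 a coefficient of 16 leaves the pixel unchanged and a coefficient of 8 halves it, which is the kind of brightness scaling a fade needs, and no division is performed per pixel.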

Here, L_Y is the shift amount for the luminance signal and L_C is the shift amount for the color difference signals. The shift amounts L_Y and L_C are shared by the two apparatuses either by using values determined in advance between the moving image encoding apparatus and the decoding apparatus, or by being encoded in the moving image encoding apparatus, together with the table and the encoded data, in a predetermined coding unit such as a frame, field, or slice and transmitted to the moving image decoding apparatus.

In this embodiment, the prediction parameter controller 203 in FIG. 2 prepares a table of combinations of reference image numbers and prediction parameters such as that shown in FIG. 3. This table is used when the number of reference images is one. In FIG. 3, the index i corresponds to a predicted image that can be selected for each block. In this example there are four kinds of predicted image, corresponding to index i values 0 to 3. The reference image number is, in other words, the number of the locally decoded image used as the reference image. The table shown in FIG. 3 holds the prediction parameters D₁(i), D₂(i), E₁(i), E₂(i), F₁(i), and F₂(i) assigned to the luminance signal and the two color difference signals in correspondence with equations (1), (2), and (3).

Flag indicates whether the prediction equations using the prediction parameters are applied to the reference image number indicated by index i. If Flag is “0”, motion-compensated prediction is performed using the locally decoded image of the reference image number indicated by index i, without using the prediction parameters. If Flag is “1”, a predicted image is created according to equations (1), (2), and (3) using the locally decoded image of the reference image number indicated by index i and the prediction parameters, and motion-compensated prediction is performed. The Flag information is likewise shared by the two apparatuses, either by using values determined in advance between the moving image encoding apparatus and the decoding apparatus, or by being encoded in the moving image encoding apparatus, together with the table and the encoded data, in a predetermined coding unit such as a frame, field, or slice and transmitted to the moving image decoding apparatus.

In this example, for reference image number 105, a predicted image is created using the prediction parameters when the index i is 0, and motion-compensated prediction is performed without the prediction parameters when i = 1. In this way, a plurality of prediction schemes may exist for the same reference image number.
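The table lookup with Flag can be sketched as follows. FIG. 3 is not reproduced in this text, so the coefficient and offset values, the shift amount, and the names `Entry`, `TABLE`, and `predict_pixel` are hypothetical; only the luminance path is shown, and the weighting form is the same assumed one as for a single reference image.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    ref_number: int   # number of the locally decoded image used as reference
    flag: int         # 1: apply the prediction parameters, 0: plain prediction
    coeff: int        # luminance prediction coefficient (hypothetical value)
    offset: int       # luminance offset (hypothetical value)


SHIFT = 4  # assumed luminance shift amount

# Two indexes may point at the same reference image number with different
# prediction schemes, as the text describes for i = 0 and i = 1.
TABLE = {
    0: Entry(ref_number=105, flag=1, coeff=12, offset=4),
    1: Entry(ref_number=105, flag=0, coeff=0, offset=0),
}


def predict_pixel(index: int, ref_pixel: int) -> int:
    """Return the luminance prediction for one motion-compensated pixel."""
    entry = TABLE[index]
    if entry.flag == 0:
        # Flag "0": use the reference pixel as-is.
        return ref_pixel
    scaled = (entry.coeff * ref_pixel + (1 << (SHIFT - 1))) >> SHIFT
    return max(0, min(255, scaled + entry.offset))
```

Because only the index is signalled per block, the encoder can switch between the weighted and unweighted schemes without resending the parameters.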

Equations (4), (5), and (6) below are examples of prediction equations, prepared in the prediction parameter controller 203, based on reference image numbers and prediction parameters for the case where a predicted image signal is created using two reference images.

Since the relation of equation (5) holds, equation (4) can be rewritten as follows.
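The equations referred to here do not survive in this text. As an illustration only, suppose equation (4) is a weighted sum of the two reference pixels followed by a right shift by L, and equation (5) states that the two coefficients sum to 2^L. Both are assumptions, and the symbols w_i and w_j for the two coefficients are hypothetical. Under those assumptions the weighted sum can be rearranged so that only one multiplication per pixel remains, since multiplying by 2^L is a left shift:

```latex
\begin{aligned}
w_i + w_j &= 2^{L} \\
w_i\,R(i) + w_j\,R(j) &= \bigl(2^{L} - w_j\bigr)\,R(i) + w_j\,R(j) \\
                      &= 2^{L}\,R(i) + w_j\,\bigl(R(j) - R(i)\bigr)
\end{aligned}
```

This is consistent with the statement that the number of multiplications per pixel can be reduced to one.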

The example shown here is a prediction equation for bidirectional prediction in a B picture. In this case there are two indexes, i and j, and R(i) and R(j) are the reference images corresponding to indexes i and j, respectively. Two pieces of information, i and j, are therefore sent as index information. W(i,j) is the prediction coefficient for indexes i and j. The function U used to calculate the prediction coefficient W represents inter-image distance: U(i,j) is the distance between the reference image indicated by index i and the reference image indicated by index j. n is the position of the coding target image currently being encoded.

In this embodiment, images further in the past have position information with smaller values. Accordingly, U(i,j) > 0 if the reference image indicated by index i is temporally later than the reference image indicated by index j; U(i,j) = 0 if index i and index j indicate the same reference image in time; and U(i,j) < 0 if the reference image indicated by index i is temporally earlier than the reference image indicated by index j. When U(i,j) is 0, the value of the prediction coefficient W is 2^(L-1).

Specifically, the temporal positional relationship between the coding target image currently being encoded and the two reference images is expressed using index i and index j as shown in FIGS. 4 to 7. FIG. 4 shows an example in which the coding target image n lies between the reference image indicated by index i and the reference image indicated by index j.

Here, T_n, T_i, and T_j denote the positions of the coding target image, the reference image indicated by index i, and the reference image indicated by index j, respectively, and take larger values toward the right. The relation T_i < T_n < T_j therefore holds. The function U used to calculate the prediction coefficient W is obtained as U(n,i) = T_n - T_i and U(j,i) = T_j - T_i, so that U(n,i) > 0 and U(j,i) > 0.
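A distance-based coefficient can be sketched from these definitions. The text gives U(n,i) = T_n - T_i, U(j,i) = T_j - T_i, and a coefficient of 2^(L-1) when the two references coincide; the ratio U(n,i) * 2^L / U(j,i) used below for the general case is an assumption (linear interpolation or extrapolation along the time axis), the range limiting by clip2 is omitted, and all function names are hypothetical.

```python
def distance(t_a: int, t_b: int) -> int:
    """Inter-image distance U(a, b) = T_a - T_b."""
    return t_a - t_b


def bipred_weights(t_n: int, t_i: int, t_j: int, shift: int):
    """Return (w_i, w_j), the weights for R(i) and R(j), with
    w_i + w_j == 2**shift."""
    u_ji = distance(t_j, t_i)
    if u_ji == 0:
        # Both indexes point at the same image: equal weights, 2**(shift-1).
        w_j = 1 << (shift - 1)
    else:
        # Assumed form. This division happens once per reference pair,
        # not once per pixel. Floor division is a simplification.
        w_j = (distance(t_n, t_i) << shift) // u_ji
    return (1 << shift) - w_j, w_j


def bipred_pixel(r_i: int, r_j: int, w_j: int, shift: int) -> int:
    """Blend two reference pixels using a single multiplication."""
    acc = (r_i << shift) + w_j * (r_j - r_i) + (1 << (shift - 1))
    return max(0, min(255, acc >> shift))
```

When the target image lies midway between the two references the weights are equal, which is the average a dissolve calls for. When both references lie on the same side of the target, one weight exceeds 2^L and the other becomes negative, so the brightness trend is extrapolated.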

FIG. 5 shows an example in which the reference image indicated by index i and the reference image indicated by index j are both earlier in time than the encoding target image n. In this case, U(n, i) > 0 and U(j, i) ≤ 0.

FIG. 6 shows another example in which the reference image indicated by index i and the reference image indicated by index j are both earlier in time than the encoding target image n. In this case, U(n, i) > 0 and U(j, i) ≥ 0.

FIG. 7 shows an example in which the reference image indicated by index i and the reference image indicated by index j are both later in time than the encoding target image n. In this case, U(n, i) < 0 and U(j, i) ≥ 0.

In equations (4) to (8), L is a shift amount. It is either a value agreed in advance between the video encoding apparatus and the decoding apparatus, or it is encoded together with the table and the encoded data in a predetermined coding unit such as a frame, field, or slice, transmitted from the encoding apparatus to the decoding apparatus, and shared by both. The function clip2 in equations (6) and (9) limits the maximum and minimum of the value inside clip2( ) (hereinafter simply "the value") and returns an integer. Several configurations of the function clip2 are described below.

In the first configuration, clip2 is a clipping function that sets the value to -2^M when it is smaller than -2^M and to (2^M - 1) when it is larger than (2^M - 1), returning an integer from -2^M to (2^M - 1) inclusive. With this configuration, if pixels are 8 bits, 9 bits are needed to represent (R(j) - R(i)) and (M + 1) bits to represent the prediction coefficient W, so the predicted image value can be computed with (M + 10)-bit arithmetic precision. M is a non-negative integer greater than or equal to L.

In the second configuration, clip2 returns 2^(L-1) when the value is smaller than -2^M or larger than (2^M - 1), and otherwise returns the value itself, an integer from -2^M to (2^M - 1) inclusive. With this configuration, whenever the distance relationship between the two reference images is exceptional, the prediction always becomes average-value prediction.

In the third configuration, clip2 is a clipping function that sets the value to 1 when it is smaller than 1 and to 2^M when it is larger than 2^M, returning an integer from 1 to 2^M inclusive. The difference from the first configuration is that the prediction coefficient W never becomes negative, which further restricts the positional relationship of the reference images. Consequently, even for the same pair of reference images, prediction with the prediction coefficient W and average-value prediction can be switched by reversing which image index i and index j point to, as in the relationship between FIG. 5 and FIG. 6.

In the fourth configuration, clip2 is a clipping function that sets the value to 0 when it is smaller than 0 and to 2^L when it is larger than 2^L, returning an integer from 0 to 2^L inclusive. With this configuration, the prediction coefficient W is always a non-negative value no greater than 2^L, so extrapolation prediction is prohibited; instead, even in bidirectional prediction, only one of the two reference images is used for prediction in such cases.

In the fifth configuration, clip2 returns 2^(L-1) when the value is smaller than 1 or larger than 2^L, returning an integer from 1 to 2^L - 1 inclusive. With this configuration, the prediction coefficient W is always a non-negative value no greater than 2^L - 1, so extrapolation prediction is prohibited; instead, the average of the two reference images is used for prediction in such cases.
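The five configurations of clip2 described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the input is assumed to be an integer already, and L is assumed to be at least 1 so that 2^(L-1) is an integer.

```python
def clip2_first(value: int, M: int) -> int:
    """Clamp to the range [-2**M, 2**M - 1]."""
    return max(-(1 << M), min(value, (1 << M) - 1))


def clip2_second(value: int, M: int, L: int) -> int:
    """Out-of-range values fall back to 2**(L-1), i.e. average-value prediction."""
    if -(1 << M) <= value <= (1 << M) - 1:
        return value
    return 1 << (L - 1)


def clip2_third(value: int, M: int) -> int:
    """Clamp to [1, 2**M]; the coefficient never becomes negative."""
    return max(1, min(value, 1 << M))


def clip2_fourth(value: int, L: int) -> int:
    """Clamp to [0, 2**L]; extrapolation is prohibited."""
    return max(0, min(value, 1 << L))


def clip2_fifth(value: int, L: int) -> int:
    """Values below 1 or above 2**L fall back to 2**(L-1)."""
    if 1 <= value <= (1 << L):
        return value
    return 1 << (L - 1)
```

The first, third, and fourth configurations saturate at the range limits, whereas the second and fifth replace any out-of-range value with the averaging coefficient 2^(L-1).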

When the distance between the two reference images is unknown or undefined, for example when either or both of the reference images are reference images for background or long-term storage, the prediction coefficient W takes the value 2^(L-1). Because the prediction coefficient W can be calculated in advance for each coding unit such as a frame, field, or slice, only one multiplication per pixel is needed even when the predicted image signal is created from two reference images.
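The following sketch illustrates how a coefficient could be precomputed once per coding unit and then applied with a single multiplication per pixel. The equations themselves are not reproduced in this section, so the weight form (distance ratio scaled by 2^L), the floor division, the choice of the fourth clip2 configuration, and the prediction form R(i) + ((W × (R(j) − R(i)) + 2^(L−1)) >> L) are all assumptions made for illustration; the function names are hypothetical.

```python
from typing import List, Optional


def prediction_coefficient(t_n: int, t_i: Optional[int],
                           t_j: Optional[int], L: int) -> int:
    """Return W for one (i, j) pair.

    A position of None models a background or long-term reference image
    whose distance is undefined.
    """
    if t_i is None or t_j is None or t_j == t_i:
        return 1 << (L - 1)  # fall back to average-value prediction
    # Assumed weight form: U(n, i) * 2**L / U(j, i), floored.
    w = ((t_n - t_i) << L) // (t_j - t_i)
    return max(0, min(w, 1 << L))  # assumed: fourth clip2 configuration


def predict(ref_i: List[int], ref_j: List[int], w: int, L: int) -> List[int]:
    """Weighted prediction of 8-bit pixels with one multiplication per pixel."""
    rounding = 1 << (L - 1)
    return [max(0, min(255, a + ((w * (b - a) + rounding) >> L)))
            for a, b in zip(ref_i, ref_j)]
```

With L = 5 and positions T_i = 2, T_n = 4, T_j = 8, the coefficient is computed once and reused for every pixel of the coding unit.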

Equation (9) is another example, obtained by modifying equation (4). Equation (7) requires R(i) to be arithmetically shifted left by L bits beforehand, whereas equation (10) omits this arithmetic shift by moving R(i) outside the parentheses, which reduces the amount of computation accordingly. On the other hand, the rounding direction of the shift then depends on which of R(i) and R(j) is larger, so the result is not strictly identical to that of equation (4).

Equations (10) to (20) below may be used instead of equations (4) to (8). In this method, a predicted image from the single reference image of index i and a predicted image from the single reference image of index j are each created in the same way as in the single-reference case, and the final predicted image is obtained by averaging them. Because the same processing routine as for a single reference image can be used up to an intermediate stage, this has the advantage of reducing the amount of hardware and code.

(Procedure for selecting the prediction method and determining the coding mode)
Next, an example of a specific procedure for selecting the prediction method (combination of reference image number and prediction parameters) and determining the coding mode for each macroblock in this embodiment is described with reference to FIG. 8.
First, the largest value that can be assumed is set in the variable min_D (step S101). LOOP1 (step S102) denotes the iteration for selecting the prediction method in inter coding, and the variable i represents the value of the index shown in FIG. 3. So that the optimal motion vector can be found for each prediction method, the evaluation value D of each index (combination of reference frame number and prediction parameters) is calculated from the code amount related to motion vector information 214 (the code amount of the variable-length code output from variable-length encoder 111 for motion vector information 214) and the sum of absolute prediction errors, and the motion vector that minimizes the evaluation value D is selected (step S103). This evaluation value D is compared with min_D (step S104); if D is smaller than min_D, D is stored in min_D and the index i is stored in min_i (step S105).

Next, the evaluation value D for intra coding is calculated (step S106) and compared with min_D (step S107). If min_D is smaller, the mode MODE is determined to be inter coding and min_i is assigned to the index information INDEX (step S108). If the evaluation value D is smaller, the mode MODE is determined to be intra coding (step S109). Here, the evaluation value D is an estimate of the code amount at the same quantization step size.
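The procedure of steps S101 to S109 can be sketched as follows. This is an illustration under stated assumptions: `inter_cost` and `intra_cost` are hypothetical callables standing in for the evaluation-value computation, and a tie between the two costs is resolved in favor of intra coding, which the description does not specify.

```python
import math
from typing import Callable, Iterable, Optional, Tuple


def select_mode(indices: Iterable[int],
                inter_cost: Callable[[int], float],
                intra_cost: Callable[[], float]) -> Tuple[str, Optional[int]]:
    """Return (MODE, INDEX) for one macroblock."""
    min_d = math.inf                 # S101: largest assumable value
    min_i: Optional[int] = None
    for i in indices:                # S102: LOOP1 over table indexes
        d = inter_cost(i)            # S103: D at the best motion vector for i
        if d < min_d:                # S104
            min_d, min_i = d, i      # S105
    d_intra = intra_cost()           # S106
    if min_d < d_intra:              # S107
        return "INTER", min_i        # S108
    return "INTRA", None             # S109
```

For example, with inter costs of 50, 30, and 40 for indexes 0 to 2 and an intra cost of 35, the sketch selects inter coding with index 1.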

(Decoding side)
Next, a video decoding apparatus corresponding to the video encoding apparatus shown in FIG. 1 is described. FIG. 9 shows the configuration of the video decoding apparatus according to this embodiment. Encoded data 300, sent from the video encoding apparatus of FIG. 1 through a transmission system or storage system, is first stored in input buffer 301, separated frame by frame according to the syntax by demultiplexer 302, and then input to variable-length decoder 303. Variable-length decoder 303 decodes the variable-length code of each syntax element of encoded data 300 and reproduces the quantized orthogonal transform coefficients, mode information 413, motion vector information 414, and index information 415.

Of the reproduced information, the quantized orthogonal transform coefficients are dequantized by inverse quantizer 304 and inverse orthogonally transformed by inverse orthogonal transformer 305. When mode information 413 indicates the intra coding mode, a reproduced image signal is output from inverse orthogonal transformer 305 and passes through adder 306 to become the final reproduced image signal 310. When mode information 413 indicates the inter coding mode, a prediction error signal is output from inverse orthogonal transformer 305 and mode selection switch 309 is turned on. Adder 306 adds the prediction error signal and predicted image signal 412 output from frame memory/predicted image generator 308, and reproduced image signal 310 is output. Reproduced image signal 310 is stored in frame memory/predicted image generator 308 as a reference image signal.

Mode information 413, motion vector information 414, and index information 415 are input to frame memory/predicted image generator 308. Mode information 413 is also input to mode selection switch 309, which is turned off in the intra coding mode and on in the inter coding mode.

Like frame memory/predicted image generator 108 on the encoding side shown in FIG. 1, frame memory/predicted image generator 308 holds, as a table, a plurality of prepared combinations of reference image numbers and prediction parameters, and selects from them the one combination indicated by index information 415. For the image signal (reproduced image signal 310) of the reference image indicated by the reference image number in the selected combination, it computes a linear sum according to the prediction parameters in the selected combination and adds an offset according to the prediction parameters, thereby generating a reference image signal. It then performs motion compensation on the generated reference image signal using the motion vector indicated by motion vector information 414, thereby generating predicted image signal 412.

(Frame memory/predicted image generator 308)
FIG. 10 shows the detailed configuration of frame memory/predicted image generator 308 in FIG. 9. In FIG. 10, reproduced image signal 310 output from adder 306 in FIG. 9 is stored in frame memory set 402 under the control of memory controller 401. Frame memory set 402 has a plurality (N) of frame memories FM1 to FMN for temporarily holding reproduced image signal 310 as reference images.

Prediction parameter controller 403 holds combinations of reference image numbers and prediction parameters prepared in advance as a table similar to that shown in FIG. 3, and, based on index information 415 from variable-length decoder 303 in FIG. 9, selects the combination of reference image number and prediction parameters to be used for generating predicted image signal 412. Multi-frame motion compensator 404 creates a reference image signal according to the combination of reference image number and prediction parameters selected by prediction parameter controller 403, and generates predicted image signal 412 by performing block-by-block motion compensation on this reference image signal according to the motion vector indicated by motion vector information 414 from variable-length decoder 303 in FIG. 9.

(Syntax of the index information)
FIG. 11 shows an example of the syntax used when index information is encoded for each block. First, mode information MODE exists for each block. Whether index information IDi, indicating the value of index i, and index information IDj, indicating the value of index j, are encoded is determined according to mode information MODE. After the encoded index information, motion vector information MVi for the motion-compensated prediction of index i and motion vector information MVj for the motion-compensated prediction of index j are encoded as the motion vector information of the block.
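The ordering of syntax elements described above can be sketched as follows. The function name and the `num_refs` parameter are hypothetical stand-ins for what mode information MODE conveys, and the treatment of a block with no reference image (MODE only) is an assumption; residual coefficient data is omitted.

```python
from typing import List


def block_syntax(num_refs: int) -> List[str]:
    """Return the ordered syntax elements of one block header."""
    elements = ["MODE"]
    index_info = ["IDi", "IDj"][:num_refs]     # index information first
    motion_info = ["MVi", "MVj"][:num_refs]    # then motion vector information
    return elements + index_info + motion_info
```

The key point is that all index information precedes all motion vector information, rather than the two being interleaved per reference image.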

(Data structure of the encoded bitstream)
FIG. 12 shows a specific example of the encoded bitstream for each block when a predicted image is created using one reference image. Mode information MODE is followed by index information IDi, and then by motion vector information MVi. Motion vector information MVi is normally two-dimensional vector information, but depending on the motion compensation method within the block indicated by the mode information, a plurality of two-dimensional vectors may additionally be sent.

FIG. 13 shows a specific example of the encoded bitstream for each block when a predicted image is created using two reference images. Mode information MODE is followed by index information IDi and index information IDj, and then by motion vector information MVi and motion vector information MVj. Motion vector information MVi and MVj are normally two-dimensional vector information, but depending on the motion compensation method within the block indicated by the mode information, a plurality of two-dimensional vectors may additionally be sent.

(Summary)
As described above, according to this embodiment, when a predicted image is created using one reference image, it is created by linear prediction using a prediction coefficient and an offset as prediction parameters. This makes it possible to create an appropriate predicted image for a fade image, that is, video mixed with a single-color picture. With a method that simply selects, for each block to be encoded, one combination from a plurality of combinations of reference picture numbers and prediction parameters, multiple multiplications per pixel are required when there are multiple reference pictures, which increases the amount of computation; in this embodiment, only one multiplication per pixel is needed.

When a predicted image is created using two reference images, on the other hand, it is created by taking a weighted average of the two reference images with a weighting coefficient that can be derived from the distance between them. This makes it possible to create an appropriate predicted image for a dissolve image, in which two videos are mixed. With the equations used in this embodiment, only one multiplication per pixel is needed.

Thus, according to this embodiment, an appropriate predicted image can be created with one multiplication per pixel for both fade video and dissolve video. Because only one multiplication per pixel is required, the hardware scale and computation cost can be reduced on both the encoding side and the decoding side.

In the above description, the predicted image creation method is switched according to the number of reference images, but it may also be switched per picture or per slice according to the prediction type, that is, the so-called picture type or slice type. For example, when only one of the reference images is used in a B picture, no predicted image is created using prediction parameters, and ordinary motion-compensated prediction using the local decoded image is performed.

The procedure for creating a predicted image when the creation method is switched according to the prediction type as well as the number of reference images is described in detail with reference to FIG. 14. In this example, the predicted image creation method is switched per slice.

First, the prediction type (slice type) of the encoding target slice, which is the encoding target region, is determined, and processing branches three ways: an I (intra-frame prediction) slice, which is intra-frame coded; a P (unidirectional prediction) slice, which is predicted using one reference image; and a B (bidirectional prediction) slice, which is predicted using up to two reference images (step S201).

If the determination in step S201 shows that the encoding target slice is an I slice, intra-frame coding (intra coding) is performed (step S202). If the encoding target slice is a P slice, the prediction method based on the combination of one reference image and prediction parameters described above is used (step S203).

If the encoding target slice is a B slice, the number of reference images is checked (step S204) and the prediction method is switched accordingly. That is, if the encoding target slice is a B slice with one reference image, ordinary motion-compensated prediction is used (step S205). If the encoding target slice is a B slice with two reference images, the prediction method based on the inter-image distance between the two selected reference images described above is used (step S206).
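The branching of steps S201 to S206 can be sketched as a simple dispatch. The function name and the returned labels are hypothetical and stand in for the actual prediction routines.

```python
def select_prediction_method(slice_type: str, num_refs: int = 0) -> str:
    """Choose the predicted image creation method for one slice."""
    if slice_type == "I":                                    # S201 -> S202
        return "intra coding"
    if slice_type == "P":                                    # S201 -> S203
        return "single reference with prediction parameters"
    if slice_type == "B":                                    # S201 -> S204
        if num_refs == 1:                                    # S205
            return "ordinary motion-compensated prediction"
        if num_refs == 2:                                    # S206
            return "inter-image-distance weighted prediction"
    raise ValueError(f"unsupported slice: {slice_type!r}, refs={num_refs}")
```

Note that a P slice and a B slice with one reference image are handled differently even though both use a single reference image.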

[Second Embodiment]
Next, a second embodiment of the present invention is described. The overall configuration of the video encoding apparatus and video decoding apparatus in this embodiment is substantially the same as in the first embodiment, so only the differences from the first embodiment are described. This embodiment shows examples in which the first embodiment is combined with other methods.

Equation (21) below is the prediction formula for the predicted image in so-called B-picture bidirectional prediction using two reference images. It represents a first method, which simply averages the motion-compensated predicted images of the two reference images.

In this first method, information for switching between the prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formula given by equation (21) is encoded as a switching flag together with the encoded data in a predetermined coding unit such as a picture, frame, field, or slice, and transmitted from the video encoding apparatus to the decoding apparatus so that both apparatuses share it. That is, the prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formula given by equation (21) are switched as needed.

With this first method, the weighted average according to the inter-image distance and the simple average of the two reference images can be switched adaptively, and an improvement in prediction efficiency can be expected. Because equation (21) contains no multiplication, the amount of computation does not increase.

Equations (22) to (27) and equations (28) to (33) below show methods of creating the prediction parameters for the two-reference-image case from the prediction parameters for the single-reference-image case. This embodiment shows examples that combine the first embodiment with these methods. First, equations (22) to (27) represent a second method, in which the predicted value is obtained by averaging the values of the single-reference-image prediction formulas.

Here, P_Y(i), P_Cb(i), and P_Cr(i) are intermediate results of the predicted values of the luminance signal Y, the color difference signal Cb, and the color difference signal Cr, respectively.
In this second method, information for switching between the prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formulas given by equations (22) to (27) is encoded as a switching flag together with the encoded data in a predetermined coding unit such as a picture, frame, field, or slice, and transmitted from the video encoding apparatus to the decoding apparatus so that both apparatuses share it. In this way, the prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formulas given by equations (22) to (27) are switched as needed.

With this second method, the weighted average according to the inter-image distance and the predicted image obtained by linear prediction using two reference images can be switched adaptively, and an improvement in prediction efficiency can be expected. Although the prediction formulas of equations (22) to (27) require two multiplications per pixel, they offer greater freedom in the prediction coefficients, so a further improvement in prediction efficiency can be expected.

Equations (28) to (33) below show, as another prediction formula, an example of a linear prediction formula for the two-reference-image case created from two sets of prediction parameters for the single-reference-image case.

In this third method, information for switching between the prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formulas given by equations (28) to (33) is encoded as a switching flag together with the encoded data in a predetermined coding unit such as a picture, frame, field, or slice, and transmitted from the video encoding apparatus to the decoding apparatus so that both apparatuses share it. The prediction formula given by any of equations (4) to (6), equations (7) to (8), equation (9), or equations (10) to (20) and the prediction formulas given by equations (28) to (33) are switched as needed.

With this third method, the weighted average according to the inter-image distance and the predicted image obtained by linear prediction using two reference images can be switched adaptively, and an improvement in prediction efficiency can be expected. Although the prediction formulas of equations (28) to (33) require two multiplications per pixel, they offer greater freedom in the prediction coefficients, so a further improvement in prediction efficiency can be expected.

The above embodiments have been described using a video encoding/decoding scheme based on block-wise orthogonal transform as an example, but the methods of the present invention described in the above embodiments can be applied in the same way when another transform technique, such as a wavelet transform, is used.

The video encoding and decoding processes according to the present invention may be implemented as hardware (an apparatus) or executed by software on a computer. Some of the processing may be implemented in hardware and the rest performed by software. Accordingly, the present invention can provide a program for causing a computer to perform the video encoding or decoding process described above, or a storage medium storing that program.

Block diagram showing the configuration of the moving picture encoding apparatus according to the first embodiment of the present invention.
Block diagram showing the detailed configuration of the frame memory / predicted image creator in FIG. 2.
Diagram showing an example of the table of combinations of reference image numbers and prediction parameters used in the embodiment.
Diagram showing the first positional relationship between the two reference images and the encoding target image in the embodiment.
Diagram showing the second positional relationship between the two reference images and the encoding target image in the embodiment.
Diagram showing the third positional relationship between the two reference images and the encoding target image in the embodiment.
Diagram showing the fourth positional relationship between the two reference images and the encoding target image in the embodiment.
Flowchart showing an example of the procedure for selecting the prediction scheme (combination of reference image number and prediction parameters) for each macroblock and for determining the encoding mode in the embodiment.
Block diagram showing the configuration of the moving picture decoding apparatus according to the embodiment.
Block diagram showing the detailed configuration of the frame memory / predicted image generator in FIG. 9.
Diagram showing an example of the per-block syntax used when index information is encoded.
Diagram showing a concrete example of an encoded bitstream when a predicted image is created using one reference image.
Diagram showing a concrete example of an encoded bitstream when a predicted image is created using two reference images.
Flowchart showing the procedure for switching the prediction scheme according to the type of encoding target region in an embodiment of the present invention.

Explanation of symbols

100 ... Input moving image signal
101 ... Subtractor
102, 109 ... Mode selection switch
103 ... Orthogonal transformer
104 ... Quantizer
105 ... Inverse quantizer
106 ... Inverse orthogonal transformer
107 ... Adder
108 ... Frame memory / predicted image creator
110 ... Mode selector
111 ... Variable-length encoder
112 ... Encoding unit
113 ... Encoding controller
114 ... Multiplexer
115 ... Output buffer
116 ... Encoded data
201 ... Memory controller
202 ... Multiple frame memory
203 ... Prediction parameter controller
204 ... Multiple frame motion estimator
205 ... Multiple frame motion compensator
211 ... Local decoded image signal
212 ... Predicted image signal
213 ... Mode information
214 ... Motion vector information
215 ... Index information
300 ... Encoded data
301 ... Input buffer
302 ... Demultiplexer
303 ... Variable-length decoder
304 ... Inverse quantizer
305 ... Inverse orthogonal transformer
306 ... Adder
307 ... Frame memory / predicted image creator
308 ... Adder
309 ... Mode changeover switch
310 ... Reproduced image signal
401 ... Memory controller
402 ... Multiple frame memory
403 ... Prediction parameter controller
404 ... Multiple frame motion compensator
412 ... Predicted image signal
413 ... Mode information
414 ... Motion vector information
415 ... Index information

Claims

A moving picture decoding method for performing motion-compensated predictive decoding on a decoding target block of a decoding target image, the method comprising:
a first step of decoding encoded data of the decoding target image to obtain quantized orthogonal transform coefficients, motion vector information, and index information;
a second step of obtaining a first distance between a first reference image indicated by the index information and the decoding target image, and a second distance between the first reference image and a second reference image indicated by the index information;
a third step of determining a first weighting factor for the first reference image and a second weighting factor for the second reference image based on a ratio of the first distance to the second distance;
a fourth step of generating a predicted image by calculating, according to the first weighting factor and the second weighting factor, a linear sum of an image of a region specified by the motion vector information in the first reference image and an image of a region specified by the motion vector information in the second reference image;
a fifth step of obtaining a prediction residual by inverse quantization and inverse orthogonal transform of the quantized orthogonal transform coefficients; and
a sixth step of generating a reproduced image signal by adding the prediction residual and the predicted image,
wherein, in the third step, when the second weighting factor is larger than an upper limit or smaller than a lower limit, both the first weighting factor and the second weighting factor are set to values for average prediction, and, also when at least one of the first reference image and the second reference image is a long-term reference image, both the first weighting factor and the second weighting factor are set to the values for average prediction.

A moving picture decoding apparatus that performs motion-compensated predictive decoding on a decoding target block of a decoding target image, the apparatus comprising:
first means for decoding encoded data of the decoding target image to obtain quantized orthogonal transform coefficients, motion vector information, and index information;
second means for obtaining a first distance between a first reference image indicated by the index information and the decoding target image, and a second distance between the first reference image and a second reference image indicated by the index information;
third means for determining a first weighting factor for the first reference image and a second weighting factor for the second reference image based on a ratio of the first distance to the second distance;
fourth means for generating a predicted image by calculating, according to the first weighting factor and the second weighting factor, a linear sum of an image of a region specified by the motion vector information in the first reference image and an image of a region specified by the motion vector information in the second reference image;
fifth means for obtaining a prediction residual by inverse quantization and inverse orthogonal transform of the quantized orthogonal transform coefficients; and
sixth means for generating a reproduced image signal by adding the prediction residual and the predicted image,
wherein the third means sets both the first weighting factor and the second weighting factor to values for average prediction when the second weighting factor is larger than an upper limit or smaller than a lower limit, and also sets both the first weighting factor and the second weighting factor to the values for average prediction when at least one of the first reference image and the second reference image is a long-term reference image.
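
The weight determination and linear sum recited above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the claimed implementation: the function names, the floating-point weights, the mapping from the distance ratio to the two weights (second weight equal to the ratio, first weight equal to one minus it), the bound values `lower=-1.0` and `upper=2.0`, and the guard against a zero second distance are all hypothetical choices that the text does not specify.

```python
AVERAGE_WEIGHTS = (0.5, 0.5)  # values for average prediction


def determine_weights(d1, d2, ref1_long_term, ref2_long_term,
                      lower=-1.0, upper=2.0):
    """Return (w1, w2) for the first and second reference images.

    d1: distance between the first reference image and the decoding target.
    d2: distance between the first and second reference images.
    lower, upper: hypothetical limits on the second weighting factor.
    """
    # Long-term reference images fall back to average prediction.
    if ref1_long_term or ref2_long_term:
        return AVERAGE_WEIGHTS
    # Assumption: a zero second distance is also treated as average prediction.
    if d2 == 0:
        return AVERAGE_WEIGHTS
    w2 = d1 / d2  # assumed mapping from the distance ratio to the weight
    if w2 > upper or w2 < lower:
        return AVERAGE_WEIGHTS
    return (1.0 - w2, w2)


def predict_block(block1, block2, w1, w2):
    """Linear sum of the two motion-compensated regions, pixel by pixel."""
    return [w1 * a + w2 * b for a, b in zip(block1, block2)]


# Extrapolation case: the target lies twice as far from the first
# reference image as the second reference image does.
w1, w2 = determine_weights(2, 1, False, False)
predicted = predict_block([100], [80], w1, w2)
```

The fallback keeps the weights bounded when the distance ratio becomes extreme, and avoids relying on distances to long-term reference images.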