JP2012080151A - Method and apparatus for moving image encoding and moving image decoding using geometry-transformed/motion-compensated prediction

Info

Publication number: JP2012080151A
Application number: JP2009027747A
Authority: JP
Inventors: Akiyuki Tanizawa; 昭行谷沢; Taiichiro Shiodera; 太一郎塩寺; Takeshi Nakajo; 健中條
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2009-02-09
Filing date: 2009-02-09
Publication date: 2012-04-19
Also published as: WO2010090335A1

Abstract

PROBLEM TO BE SOLVED: To provide a moving image encoding apparatus, a moving image decoding apparatus, a moving image encoding method and a moving image decoding method that improve prediction efficiency without increasing the amount of code by reducing motion detection processing required to estimate a geometric transform parameter used for geometry-transformed/motion-compensated prediction.

SOLUTION: The moving image encoding apparatus includes: a motion information acquisition section for acquiring motion information about one or more of adjacent blocks adjacent to one of divided pixel blocks of an image signal; a geometric transform information acquisition section for acquiring on the basis of the motion information a geometric transform parameter that is information on the form of mapping by a geometric transform of the pixel block in a reference image signal in the process of motion compensation of the pixel block; a geometry-transformed prediction section for performing a geometry-transformed motion prediction including a geometric transform between the reference image signal and the pixel block with the use of the reference image signal geometrically transformed with the geometric transform parameter; and an encoding section for encoding a prediction error value of the pixel block subjected to the geometry-transformed motion prediction.

Description

The present invention relates to a method, a program, and an apparatus for moving image encoding and moving image decoding in which geometric transformation parameters are estimated using motion information of adjacent blocks and of a prediction target block, and geometric transformation prediction of the prediction target block is performed on the basis of the estimated geometric transformation parameters.

In recent years, a moving image encoding method with greatly improved coding efficiency has been recommended jointly by ITU-T and ISO/IEC as ITU-T Rec. H.264 and ISO/IEC 14496-10 (hereinafter referred to as "H.264"). In H.264, prediction processing, transform processing, and entropy encoding processing are performed in units of rectangular blocks (for example, 16 × 16 pixels or 8 × 8 pixels). For this reason, when H.264 predicts an object that cannot be represented by rectangular blocks, it raises prediction efficiency by selecting smaller prediction blocks (such as 4 × 4 pixels). Methods for predicting such objects effectively include preparing a plurality of prediction patterns for a rectangular block and applying motion compensation using an affine transformation to a deformed object.

For example, Japanese Unexamined Patent Application Publication No. 2007-312397 (Patent Document 1) discloses inventions such as a video frame transfer method that uses prediction taking into account enlargement, reduction, rotation, and the like of an object, by modeling the motion of the object as an affine transformation and calculating the optimum affine transformation parameters for each prediction target block.

Non-Patent Document 1 discloses a technique that approximates motion-compensated prediction under an affine transformation model: on the basis of motion vectors calculated under a translational motion model, a block is divided into triangular patches, and affine transformation parameters are estimated for each patch.

Patent Document 1: JP 2007-312397 A

Non-Patent Document 1: R. C. Kordasiewicz, M. D. Gallant, and S. Shirani, "Affine Motion Prediction Based on Translational Motion Vectors," IEEE Trans. on Circuits and Systems for Video Technology, Vol. 17, No. 10, October 2007.

However, in the method described in Patent Document 1, six affine transformation parameters are transmitted for each pixel block, which increases overhead. In addition, calculating these parameters requires block matching between a plurality of reference images and the input image corresponding to the prediction target block, which increases the amount of computation.
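For reference, a general two-dimensional affine motion model can be written with six parameters. The notation below (parameters a through f, source position (x, y), mapped position (x', y')) is illustrative only and is not taken from this publication or from Patent Document 1:

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a & b \\ d & e \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} c \\ f \end{pmatrix}
```

Under this model, the overhead described above corresponds to sending all six values a, b, c, d, e, f for every pixel block.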

The method described in Non-Patent Document 1 estimates the affine transformation parameters using the motion vectors of eight adjacent blocks (above, below, left, right, and so on) neighboring the pixel block to be predicted, together with the motion vector calculated for the pixel block to be predicted. Obtaining the optimum motion vectors therefore requires re-encoding the frame. If, on the other hand, only the motion vector calculation is performed on a frame basis, the result is not optimal in terms of code amount and coding distortion, and coding efficiency falls.

The present invention has been made in view of the above points in order to solve these problems. Its object is to provide a moving image encoding apparatus, a moving image decoding apparatus, a moving image encoding method, and a moving image decoding method that reduce the motion detection processing required to estimate the geometric transformation parameters used for geometric transformation motion-compensated prediction, and that improve prediction efficiency without increasing the code amount.

To achieve the above object, the moving image encoding apparatus of the present invention adopts the following configuration.

The moving image encoding apparatus of the present invention may comprise: a motion information acquisition unit that acquires motion information of one or more adjacent blocks among the blocks adjacent to one of the pixel blocks into which an image signal is divided; a geometric transformation information acquisition unit that acquires, on the basis of the motion information, a geometric transformation parameter, which is information on the shape of the mapping, by geometric transformation, of the pixel block in the reference image signal used for motion compensation of the pixel block; a geometric transformation prediction unit that performs geometric transformation motion prediction, which includes a geometric transformation between the reference image signal and the pixel block, using the reference image signal geometrically transformed with the geometric transformation parameter; and an encoding unit that encodes a prediction error value of the pixel block on which the geometric transformation motion prediction has been performed.

To achieve the above object, the moving image encoding method of the present invention may comprise: a motion information acquisition step of acquiring motion information of one or more adjacent blocks among the blocks adjacent to one of the pixel blocks into which an image signal is divided; a geometric transformation information acquisition step of acquiring, on the basis of the motion information, a geometric transformation parameter, which is information on the shape of the mapping, by geometric transformation, of the pixel block in the reference image signal used for motion compensation of the pixel block; a geometric transformation prediction step of performing geometric transformation motion prediction, which includes a geometric transformation between the reference image signal and the pixel block, using the reference image signal geometrically transformed with the geometric transformation parameter; and an encoding step of encoding a prediction error value of the pixel block on which the geometric transformation motion prediction has been performed.

To achieve the above object, the moving image decoding apparatus of the present invention may comprise: a decoding unit that decodes code data in which an image signal is encoded, the code data including a prediction error value obtained by geometric transformation motion prediction that includes a geometric transformation between a pixel block into which the image signal is divided and a reference image signal used for motion compensation of the pixel block; a motion information acquisition unit that acquires motion information of one or more adjacent blocks among the blocks adjacent to one of the pixel blocks; a geometric transformation information acquisition unit that acquires, on the basis of the motion information, a geometric transformation parameter, which is information on the shape of the mapping, by geometric transformation, of the pixel block in the reference image signal; a geometric transformation prediction unit that performs the geometric transformation motion prediction using the reference image signal geometrically transformed with the geometric transformation parameter and generates a predicted value; and an adder that adds the decoded prediction error value and the generated predicted value.

To achieve the above object, the moving image decoding method of the present invention may comprise: a decoding step of decoding code data in which an image signal is encoded, the code data including a prediction error value obtained by geometric transformation motion prediction that includes a geometric transformation between a pixel block into which the image signal is divided and a reference image signal used for motion compensation of the pixel block; a motion information acquisition step of acquiring motion information of one or more adjacent blocks among the blocks adjacent to one of the pixel blocks; a geometric transformation information acquisition step of acquiring, on the basis of the motion information, a geometric transformation parameter, which is information on the shape of the mapping, by geometric transformation, of the pixel block in the reference image signal; a geometric transformation prediction step of performing the geometric transformation motion prediction, which includes a geometric transformation between the reference image signal and the pixel block, using the reference image signal geometrically transformed with the geometric transformation parameter, and generating a predicted value; and an addition step of adding the decoded prediction error value and the generated predicted value.

According to the moving image encoding apparatus, moving image decoding apparatus, moving image encoding method, and moving image decoding method of the present invention, the motion detection processing required to estimate the geometric transformation parameters used for geometric transformation motion-compensated prediction can be reduced, and prediction efficiency can be improved without increasing the code amount.

FIG. 1 is a block diagram showing the configuration of a moving image encoding apparatus according to the first embodiment.
FIG. 2 is a block diagram showing the configuration of the inter prediction unit according to the first embodiment.
FIG. 3 is a diagram showing a 16 × 16 pixel block.
FIG. 4 is a diagram showing the flow of the encoding process.
FIG. 5 is a diagram showing how the motion vector relates to the positional relationship between the reference image signal and the prediction target image.
FIG. 6 is a diagram showing an example in which pixels at fractional positions are generated by interpolation when motion-compensated prediction is performed.
FIG. 7A is a diagram showing the sizes of motion compensation blocks in macroblock units.
FIG. 7B is a diagram showing the sizes of motion compensation blocks in sub-block units.
FIG. 8 is a diagram showing the configuration of the geometric transformation parameter derivation unit.
FIG. 9A is a diagram showing the positional relationship between a pixel block to be encoded or decoded and its adjacent blocks.
FIG. 9B is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is at the upper left of the macroblock.
FIG. 9C is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is at the upper right of the macroblock.
FIG. 9D is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is at the lower left of the macroblock.
FIG. 9E is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is at the lower right of the macroblock.
FIG. 10A is a diagram showing the positional relationship when the block size of the adjacent blocks is smaller than that of the pixel block to be encoded or decoded.
FIG. 10B is a diagram showing the positional relationship when the block size of the adjacent blocks is smaller than that of the pixel block to be encoded or decoded.
FIG. 10C is a diagram showing the positional relationship when the block size of the adjacent blocks is smaller than that of the pixel block to be encoded or decoded.
FIG. 10D is a diagram showing the positional relationship when the block size of the adjacent blocks is smaller than that of the pixel block to be encoded or decoded.
FIG. 11A is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is smaller than the adjacent blocks, for the case of the upper left of the pixel block.
FIG. 11B is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is smaller than the adjacent blocks, for the case of the upper right of the pixel block.
FIG. 11C is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is smaller than the adjacent blocks, for the case of the lower left of the pixel block.
FIG. 11D is a diagram showing the positional relationship of the adjacent blocks when the pixel block to be encoded or decoded is smaller than the adjacent blocks, for the case of the lower right of the pixel block.
FIG. 12 is a diagram showing the positional relationship when the block size of the adjacent blocks is smaller than that of the pixel block to be encoded or decoded.
FIG. 13A is a diagram showing the division of the prediction target pixels when geometric transformation is applied to them.
FIG. 13B is a diagram showing the division of the prediction target pixels when geometric transformation is applied to them.
FIG. 14 is a diagram showing the configuration of the geometric transformation prediction unit.
FIG. 15 is a diagram showing an example of predicted value generation in motion-compensated prediction and in geometric transformation prediction.
FIG. 16 is a diagram showing an example in which pixel values at fractional positions resulting from geometric transformation are generated by interpolation.
FIG. 17 is a diagram showing an example of the deformation of an object under geometric transformation.
FIG. 18 is a flowchart showing the flow of the geometric transformation prediction processing of the first embodiment.
FIG. 19 is a diagram showing the syntax structure.
FIG. 20 is a diagram showing the information included in the slice header syntax.
FIG. 21 is a diagram showing the information included in the slice data syntax in the first embodiment.
FIG. 22 is a diagram showing the information included in the macroblock layer syntax in the first embodiment.
FIG. 23 is a diagram showing the information included in the macroblock prediction syntax in the first embodiment.
FIG. 24 is a diagram showing the information included in the sub-macroblock prediction syntax in the first embodiment.
FIG. 25 is a diagram showing the information included in the macroblock layer syntax in a modification of the first embodiment.
FIG. 26 is a block diagram showing a moving image encoding apparatus according to the second embodiment.
FIG. 27A is a diagram showing a 16 × 16 pixel block used for intra prediction.
FIG. 27B is a diagram showing a 4 × 4 pixel block used for intra prediction.
FIG. 27C is a diagram showing an 8 × 8 pixel block used for intra prediction.
FIG. 28 is a block diagram showing an inter prediction unit according to the third embodiment.
FIG. 29 is a diagram showing the information included in the macroblock layer syntax in the fourth embodiment.
FIG. 30 is a diagram showing the information included in the sub-macroblock layer syntax in the fourth embodiment.
FIG. 31 is a block diagram showing a moving image decoding apparatus according to the fifth embodiment.
FIG. 32 is a block diagram showing a moving image decoding apparatus according to the fifth embodiment.
FIG. 33 is a block diagram showing a moving image decoding apparatus according to the sixth embodiment.

The first to seventh embodiments are described below with reference to the drawings. The first to fourth embodiments concern a moving image encoding apparatus, and the fifth to seventh embodiments concern a moving image decoding apparatus. Note that the "judgment parameter" in the following embodiments corresponds to the "decision parameter".

<Moving image encoding apparatus>
The moving image encoding apparatus described in the following embodiments divides each frame of an input image signal into a plurality of pixel blocks, compression-encodes the divided pixel blocks, and outputs a code string.

[First Embodiment]
<Moving image encoding apparatus 100>
FIG. 1 is a diagram showing the configuration of a moving image encoding apparatus 100 that implements an encoding method using geometric transformation prediction. FIG. 2 is a block diagram of the inter prediction unit 130 included in the moving image encoding apparatus 100.

The moving image encoding apparatus 100 of FIG. 1 performs inter prediction (inter-frame prediction) encoding on the input image signal 114 on the basis of encoding parameters input from the encoding control unit 126, generates the predicted image signal 123, and outputs the encoded data 124.

The input image signal 114 of a moving image or still image is input to the moving image encoding apparatus 100 after being divided into pixel blocks, for example macroblocks. The input image signal is one unit of encoding processing, a term that covers both frames and fields. This embodiment describes an example in which a frame is the unit of encoding processing.

The moving image encoding apparatus 100 performs encoding in a plurality of prediction modes that differ in block size and in the method of generating the predicted image signal 123. The methods of generating the predicted image signal 123 fall broadly into intra prediction (intra-frame prediction), which generates a predicted image using only the frame being encoded, and inter prediction, which performs prediction using a plurality of temporally different reference frames. This embodiment describes an example in which the predicted image signal is generated using inter prediction.

In the first to seventh embodiments, the macroblock is the basic processing block size of the encoding process. A macroblock is typically the 16 × 16 pixel block shown in FIG. 3, but may instead be a 32 × 32 pixel block or an 8 × 8 pixel block. The macroblock also need not be square. Hereinafter, the macroblock of the input image signal 114 that is to be encoded is referred to as the "prediction target block".

In the first to seventh embodiments, to simplify the description, encoding is assumed to proceed from the upper left toward the lower right as shown in FIG. 4. In FIG. 4, within the frame f being encoded, the blocks located to the left of and above the block c to be encoded are already-encoded blocks p.
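The upper-left to lower-right processing order can be sketched as a raster scan over block indices. This is a minimal illustration assuming a row-by-row scan; the function names and the (row, column) indexing are hypothetical and do not appear in the publication.

```python
def raster_scan_order(blocks_wide, blocks_high):
    """Return (row, col) block indices from the upper left to the lower right."""
    return [(r, c) for r in range(blocks_high) for c in range(blocks_wide)]


def is_already_encoded(candidate, current, blocks_wide):
    """True if block `candidate` precedes block `current` in raster-scan order."""
    cand_index = candidate[0] * blocks_wide + candidate[1]
    cur_index = current[0] * blocks_wide + current[1]
    return cand_index < cur_index
```

With such an order, the blocks to the left of and above the current block have always been processed first, which is why their motion information is available as adjacent-block information.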

The moving image encoding apparatus 100 includes a subtractor 101, a transform / quantization unit 102, an inverse quantization / inverse transform unit 103, an adder 104, a reference image memory 105, a motion estimation unit 106, and an inter prediction unit 130. The moving image encoding apparatus 100 is connected to the encoding control unit 126.

Next, the flow of encoding in the moving image encoding apparatus 100 is described. First, the input image signal 114 is input to the subtractor 101. The predicted image signal 123 for each prediction mode, output from the inter prediction unit 130, is also input to the subtractor 101. The subtractor 101 calculates the prediction error signal 115 by subtracting the predicted image signal 123 from the input image signal 114. The prediction error signal 115 is input to the transform / quantization unit 102.
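The subtraction performed by the subtractor 101 can be sketched as a per-pixel difference between two equally sized blocks. The function name and the list-of-rows block representation below are assumptions made for illustration.

```python
def prediction_error(input_block, predicted_block):
    """Per-pixel difference: input image block minus predicted image block."""
    return [
        [s - p for s, p in zip(input_row, predicted_row)]
        for input_row, predicted_row in zip(input_block, predicted_block)
    ]
```

The better the prediction, the closer these differences are to zero and the fewer bits the later stages need to spend on them.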

The transform / quantization unit 102 applies an orthogonal transform such as the discrete cosine transform (DCT) to the prediction error signal 115 to generate transform coefficients. Besides the discrete cosine transform used in H.264, the transform in the transform / quantization unit 102 may be a discrete sine transform, a wavelet transform, component analysis, or the like.

The transform / quantization unit 102 quantizes the transform coefficients in accordance with quantization information, typified by a quantization parameter and a quantization matrix, supplied by the encoding control unit 126. The transform / quantization unit 102 outputs the quantized transform coefficients 116 to the entropy encoding unit 112 and also to the inverse quantization / inverse transform unit 103.
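As a rough illustration of quantization and its inverse, the sketch below uses a uniform scalar quantizer with a single step size. This is a simplification chosen for clarity: it is not the H.264 quantizer, it ignores quantization matrices, and the names `quantize`, `dequantize`, and `step` are hypothetical.

```python
def quantize(coeffs, step):
    """Map each transform coefficient to an integer level (rounding half away from zero)."""
    return [
        [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1) for c in row]
        for row in coeffs
    ]


def dequantize(levels, step):
    """Scale integer levels back to approximate coefficient values."""
    return [[level * step for level in row] for row in levels]
```

The dequantized values differ from the original coefficients by at most half a step, which is where the coding distortion of a lossy codec of this kind originates.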

The entropy encoding unit 112 performs entropy encoding, for example Huffman encoding or arithmetic encoding, on the quantized transform coefficients 116. The entropy encoding unit 112 also performs entropy encoding on the various encoding parameters used when the target block was encoded, including the prediction information output from the encoding control unit 126. Encoded data is thereby generated.

The encoding parameters are the parameters required for decoding, such as prediction information, information on transform coefficients, and information on quantization. The encoding parameters of the prediction target block are held in an internal memory of the encoding control unit 126 and are used when the prediction target block serves as an adjacent block of another pixel block.

The encoded data 124 generated by the entropy encoding unit 112 is output from the moving image encoding apparatus 100, multiplexed, temporarily stored in the output buffer 113, and then output as encoded data 124 in accordance with the output timing managed by the encoding control unit 126. The encoded data 124 is sent, for example, to a storage system (storage medium) or a transmission system (communication line), neither of which is shown.

In the inverse quantization / inverse transform unit 103, inverse quantization is applied to the quantized transform coefficients 116 output from the transform / quantization unit 102. Here, quantization information corresponding to that used in the transform / quantization unit 102 is loaded from the internal memory of the encoding control unit 126, and the inverse quantization is performed. The quantization information is, for example, a set of parameters typified by a quantization parameter and a quantization matrix.

The inverse quantization / inverse transform unit 103 further applies an inverse orthogonal transform, such as the inverse discrete cosine transform (IDCT), to the inverse-quantized transform coefficients, thereby reconstructing the decoded prediction error signal 117.

The decoded prediction error signal 117 is input to the adder 104. The adder 104 adds the decoded prediction error signal 117 and the predicted image signal 123 output from the inter prediction unit 130, thereby generating a decoded image signal 118. The decoded image signal 118 is a locally decoded image signal and is stored in the reference image memory 105 as the reference image signal 120. The reference image signal 120 stored in the reference image memory 105 is output to the motion estimation unit 106, the inter prediction unit 130, and so on, and is referred to during prediction.

The motion estimation unit 106 uses the input image signal 114 and the reference image signal 120 to calculate a motion vector 119 suitable for the prediction target block. The motion information may be represented, for example, by a motion vector. It may also be, for example, the predicted value used when a motion vector is predicted from other motion vectors.

The motion estimation unit 106 calculates the motion vector 119 by performing block matching between the prediction target block of the input image signal 114 and an interpolated image of the reference image signal 120. As the matching criterion, for example, the value obtained by accumulating, pixel by pixel, the difference between the input image signal 114 and the matched interpolated image is used.
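The matching criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the difference is accumulated as a sum of absolute differences (SAD), the search is an exhaustive integer-pixel search, and the function names `sad` and `full_search` are hypothetical and do not appear in the specification.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the size x size block at (bx, by)
    in cur and the block displaced by (dx, dy) in ref (assumed criterion)."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return (dx, dy, cost) with the smallest SAD in an integer-pel window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue  # candidate block lies outside the reference picture
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

A fractional-precision search would repeat the same comparison against interpolated reference samples around the best integer position.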

Besides the method described above, the motion vector 119 may be determined using a value obtained by transforming the difference between the predicted image and the original image, and the decision may also take into account the magnitude of the motion vector, the code amount of the motion vector, and the like. Costs such as those of equations (29) and (30), described later, may also be used. The matching may exhaustively search the matching range based on search range information provided from outside the moving image encoding apparatus 100, or may be performed hierarchically for each pixel precision.

The motion vectors 119 calculated in this way for the plurality of reference image signals are input to the inter prediction unit 130 and used to generate the predicted image signal 123. The plurality of reference image signals are locally decoded images with different display times.

The calculated motion vector 119 is also output to the entropy encoding unit 112, where it is entropy-encoded and then multiplexed into the encoded data. Furthermore, the motion vector 119 with which the target pixel block was encoded is stored in the internal memory of the encoding control unit 126 and is loaded by the inter prediction unit 130 as needed.

＜Inter prediction unit 130＞
The inter prediction unit 130 in FIG. 2 includes a motion compensation prediction unit 107, a geometric transformation parameter derivation unit 108, a geometric transformation prediction unit 109, a determination parameter derivation unit 127, a prediction separation switch 110, a prediction switching unit 111, and an entropy encoding unit 112.

The motion compensation prediction unit 107 generates the predicted image signal 123 using the input motion vector 119 and the reference image signal 120. FIG. 5 is a diagram illustrating an example of the inter prediction performed by the motion compensation prediction unit 107. In FIG. 5, the motion vector is obtained by finding, for the prediction target block in frame (t), a position in frame (t−1) at which the prediction error is small.

In inter prediction, interpolation is performed using the plurality of reference image signals 120 stored in the reference image memory 105, and the predicted image signal 123 is generated based on the displacement between the generated interpolated image and the pixel block at the same position in the input image signal 114. The interpolation uses, for example, 1/2-pixel or 1/4-pixel precision, and the interpolated pixel values are generated by interpolation such as filtering of the reference image signal 120. For example, in H.264, which allows interpolation up to 1/4-pixel precision for the luminance signal, the displacement is expressed at four times the integer-pixel precision. This displacement is called a motion vector.

According to the information of the motion vector 119, the motion compensation prediction unit 107 determines the position referenced by the motion vector 119 from the position of the prediction target block using the following equation (1). The description here takes the 1/4-pixel precision interpolation of H.264 as an example. When each component of the motion vector is a multiple of 4, the vector points to an integer pixel position. Otherwise, the predicted position corresponds to an interpolated position with fractional precision.

Here, (x, y) are the indices in the horizontal and vertical directions representing the start position of the prediction target block, and (x_pos, y_pos) represents the corresponding predicted position in the reference image signal. (mv_x, mv_y) denotes a motion vector with 1/4-pixel precision. Next, for the position thus determined, a predicted pixel is generated by padding or interpolating the corresponding pixel position of the reference image signal 120.
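The position calculation can be illustrated with the short sketch below. It assumes that equation (1), which is not reproduced in this text, adds the motion vector scaled by 1/4 to the block position; the function name `reference_position` and the split into integer and fractional parts are illustrative choices, not taken from the specification.

```python
def reference_position(x, y, mv_x, mv_y):
    """Split a quarter-pel motion vector into an integer reference position
    and a fractional phase (0..3) per axis. Assumes pos = block + mv / 4."""
    int_x = x + (mv_x >> 2)   # arithmetic shift floors negative values too
    int_y = y + (mv_y >> 2)
    frac_x = mv_x & 3         # 0 means an integer pixel position
    frac_y = mv_y & 3
    return int_x, int_y, frac_x, frac_y


# A vector whose components are multiples of 4 lands on integer pixels.
print(reference_position(16, 32, 8, -4))
```

When both fractional phases are 0 the reference pixel can be copied directly; otherwise the interpolation filters described next are needed.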

FIG. 6 is a diagram showing an example of predicted pixel generation in H.264. In the figure, capital letters indicate pixels at integer positions, and dot-hatched squares indicate interpolated pixels at 1/2-pixel positions. Diagonally hatched squares indicate interpolated pixels at 1/4-pixel positions. For example, the 1/2-pixel interpolation for the positions labeled b and h in the figure is calculated by the following equation (2).

The 1/4-pixel interpolation for the positions labeled a and d in the figure is calculated by the following equation (3).

In equation (2), the interpolated pixel at a 1/2-pixel position is generated using a 6-tap FIR filter (tap coefficients: (1, −5, 20, 20, −5, 1) / 32). In equation (3), the interpolated pixel at a 1/4-pixel position is calculated using a 2-tap averaging filter (tap coefficients: (1, 1) / 2). The 1/2-pixel sample at position j, located midway between four integer pixel positions, is generated by applying the 6-tap filter in both the vertical and the horizontal direction. Interpolated values for the remaining pixel positions can be generated by the same rules.
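The two filters can be sketched as follows. The tap coefficients are those given above; the rounding offsets (+16 and +1) and the clipping to the 8-bit range follow common H.264 practice and are assumptions here, since equations (2) and (3) are not reproduced in this text. The function names are hypothetical.

```python
def clip255(value):
    """Clamp to the 8-bit sample range (assumed bit depth)."""
    return max(0, min(255, value))


def half_pel(p0, p1, p2, p3, p4, p5):
    """6-tap FIR (1, -5, 20, 20, -5, 1) / 32 over six integer samples
    in a row or column; the half-pel sample lies between p2 and p3."""
    acc = p0 - 5 * p1 + 20 * p2 + 20 * p3 - 5 * p4 + p5
    return clip255((acc + 16) >> 5)   # +16 rounds before dividing by 32


def quarter_pel(s0, s1):
    """2-tap average (1, 1) / 2 of two neighbouring samples, with rounding."""
    return (s0 + s1 + 1) >> 1


b = half_pel(0, 0, 100, 100, 0, 0)   # half-pel sample between the two 100s
a = quarter_pel(100, b)              # quarter-pel sample next to it
```

On a flat area the 6-tap filter returns the input value unchanged, because its coefficients sum to 32.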

In inter prediction, a block size suited to the current target pixel block can be selected from among a plurality of prediction blocks. FIG. 7A shows the sizes of the motion compensation pixel blocks in units of macroblocks, and FIG. 7B shows the sizes of the motion compensation pixel blocks in units of sub-blocks (8×8 pixel blocks or smaller).

FIG. 7A shows MB1 of 16×16 pixels, MB2 composed of two 16×8 pixel blocks, MB3 composed of two 8×16 pixel blocks, and MB4 composed of four 8×8 pixel blocks. FIG. 7B shows SB1 of 8×8 pixels, SB2 composed of two 8×4 pixel blocks, SB3 composed of two 4×8 pixel blocks, and SB4 composed of four 4×4 pixel blocks.

Since a motion vector can be held for each of these prediction target block sizes, the optimal prediction target block shape and motion vector can be used according to the local properties of the input image signal 114. In H.264, the information indicating the reference image signal for which the motion vector was calculated is carried as Ref_idx and can be changed for each pixel block of 8×8 pixels at the smallest.

The predicted image signal 123 generated by the motion compensation prediction unit 107 is input to the prediction separation switch 110 and is selected according to where the output terminal of the switch is connected. The switch is controlled by the prediction switching information 122 output from the prediction switching unit 111, described later.

Meanwhile, the motion vector 119 output from the motion estimation unit 106 is input to the geometric transformation parameter derivation unit 108, which generates the geometric transformation parameter 121. Here, the geometric transformation parameter 121 is a parameter set for applying a geometric transformation to the reference image signal 120. The geometric transformation parameter 121 output from the geometric transformation parameter derivation unit 108 is input to both the geometric transformation prediction unit 109 and the determination parameter derivation unit 127.

The geometric transformation prediction unit 109 generates a geometrically transformed predicted image signal 123 using the input geometric transformation parameter 121 and the reference image signal 120. The geometric transformation parameter 121 of the prediction target block is held in the internal memory of the encoding control unit 126 and is used when the prediction target block becomes an adjacent block of another pixel block.

Based on the input geometric transformation parameter 121, the determination parameter derivation unit 127 derives a determination parameter for deciding whether to use the predicted image signal generated by the motion compensation prediction unit 107 or the one generated by the geometric transformation prediction unit 109. The determination parameter 125 generated by the determination parameter derivation unit 127 is input to the prediction switching unit 111.

The prediction switching unit 111 outputs the prediction switching information 122 of the prediction target block according to the input determination parameter 125. The prediction separation switch 110 determines, according to the input prediction switching information 122, whether to connect the output terminal of the switch to the motion compensation prediction unit 107 or to the geometric transformation prediction unit 109. In accordance with this decision, one of the predicted image signals 123 is output to the subtracter 101 and the adder 104.
The above is the flow of the encoding process of the moving image encoding apparatus 100.

＜Geometric transformation prediction processing in the inter prediction unit 130＞
The geometric transformation prediction processing according to the present embodiment is described in more detail below.

＜Geometric transformation parameter derivation unit 108＞
First, the processing of the geometric transformation parameter derivation unit 108 is described specifically. The geometric transformation parameter derivation unit 108 derives the geometric transformation parameter of the prediction target block using the motion vector 119 of the prediction target block output from the motion estimation unit 106 and the motion vectors stored in the encoding control unit 126. The motion vectors stored in the encoding control unit 126 are those of adjacent blocks and are hereinafter referred to as "adjacent motion vectors".

FIG. 8 is a block diagram showing the configuration of the geometric transformation parameter derivation unit 108. The geometric transformation parameter derivation unit 108 includes a motion vector acquisition unit 181 and a parameter derivation unit 182. The motion vector acquisition unit 181 determines, from among a plurality of adjacent blocks, the adjacent block from which motion information is to be acquired, and acquires the motion information of that block, for example its motion vector. The parameter derivation unit 182 derives the geometric transformation parameter from the motion vectors of the adjacent blocks.

The process by which the motion vector acquisition unit 181 derives the adjacent motion vectors is described below with reference to FIGS. 9 to 11.

≪Derivation of adjacent blocks and adjacent motion vectors (Part 1) - when the block sizes are the same≫
FIGS. 9A to 9E are diagrams showing the relationship between the prediction target block and its adjacent blocks. FIG. 9A shows an example in which the prediction target block and the adjacent blocks have the same size (for example, 16×16 pixel blocks).

In FIG. 9A, the diagonally hatched pixel blocks p are pixel blocks for which encoding or prediction has already been completed (hereinafter referred to as "predicted pixel blocks"). The dot-hatched block c is the prediction target block, and the pixel blocks n shown in white are unencoded (unpredicted) pixel blocks. X in the figure denotes the pixel block to be encoded (predicted).

Adjacent block A is the block to the left of the prediction target block X, adjacent block B is the block above it, adjacent block C is the block to its upper right, and adjacent block D is the block to its upper left.

The adjacent motion vectors held in the internal memory of the encoding control unit 126 are only those of predicted pixel blocks. As shown in FIG. 4, pixel blocks are encoded and predicted from the upper left toward the lower right, so when the pixel block X is predicted, the pixel blocks to its right and below it have not yet been encoded. Adjacent motion vectors therefore cannot be derived from those blocks.
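The availability rule above can be sketched as follows for equally sized blocks processed in raster order. The block-unit coordinates, the handling of picture boundaries, and the function name `available_neighbors` are assumptions made for illustration; the specification does not prescribe this form.

```python
def available_neighbors(bx, by, blocks_per_row):
    """Return the block coordinates of adjacent blocks A, B, C, D that have
    already been processed when block (bx, by) is predicted in raster order.
    Blocks to the right of and below the target are never candidates."""
    candidates = {
        "A": (bx - 1, by),       # left
        "B": (bx, by - 1),       # above
        "C": (bx + 1, by - 1),   # upper right
        "D": (bx - 1, by - 1),   # upper left
    }
    return {
        name: pos
        for name, pos in candidates.items()
        if 0 <= pos[0] < blocks_per_row and pos[1] >= 0  # inside the picture
    }
```

For a block in the last column, for instance, C falls outside the picture and only A, B, and D remain.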

FIGS. 9B to 9E are diagrams showing examples of adjacent blocks when the prediction target block is an 8×8 pixel block. In FIGS. 9B to 9E, thick lines represent macroblock boundaries. FIG. 9B shows the case where the prediction target block is the pixel block at the upper left of the macroblock, FIG. 9C the pixel block at the upper right, FIG. 9D the pixel block at the lower left, and FIG. 9E the pixel block at the lower right.

Since encoding inside a macroblock likewise proceeds from the upper left to the lower right, the positions of the adjacent blocks change according to the encoding order of the 8×8 pixel blocks. When the encoding or predicted image generation of an 8×8 pixel block is completed, that block becomes an encoded pixel block and is used as an adjacent block of pixel blocks processed later. In FIG. 9E, the upper-right pixel block corresponding to adjacent block C is an unencoded pixel block, so the pixel block located at the upper right of the encoded pixel block is used as the adjacent block.

≪Derivation of adjacent blocks and adjacent motion vectors (Part 2) - when the block sizes are different≫
Next, the relationship between adjacent blocks is described for the case where the adjacent block and the prediction target block differ in size. When the block sizes differ, several definitions of the adjacent block are possible.

FIGS. 10A to 10D are diagrams illustrating examples in which the prediction target block is large and the adjacent blocks are small. FIG. 10A is an example in which the pixel block containing the pixel closest to the upper-left pixel of the prediction target block is used as the adjacent block. FIG. 10B is an example in which, among the pixel blocks adjacent to the prediction target block, the one located at the lower right is used as the adjacent block.

FIG. 10C is an example in which the pixel block located at the center of the pixel blocks adjacent to the prediction target block is used as the adjacent block. In FIG. 10C, the center of the pixel blocks adjacent to the left of the prediction target block X falls on a boundary between 8×8 pixel blocks. When the center position lies on a block boundary in this way, the left pixel block is used as the adjacent block.

FIG. 10D is also an example in which the pixel block located at the center of the pixel blocks adjacent to the prediction target block is used as the adjacent block. As in FIG. 10C, the center point lies on a pixel block boundary, and here the pixel block located at the upper right of the center point is used as the adjacent block. When the center lies on a block boundary, any of the blocks touching it may be defined as the center block, provided the same definition is applied to all adjacent blocks. This allows the positional relationship to each adjacent block to be expressed with the same phase when the motion information of the prediction target block is estimated from the motion information of the adjacent blocks.

For simplicity, the present embodiment uses the definition of adjacent blocks shown in FIG. 10A.

FIGS. 11A to 11D are diagrams illustrating examples in which the prediction target block is small and the adjacent blocks are large. As in FIG. 9E, when the corresponding pixel block is an unencoded pixel block, it is replaced with an available encoded pixel block that is close to the prediction target block.

The above description used 16×16 pixels and 8×8 pixels as examples, but adjacent blocks may be determined within the same framework for square pixel blocks such as 32×32 or 4×4 pixels and for rectangular pixel blocks such as 16×8 or 8×16 pixels.

In inter prediction, the encoding process, that is, the motion vector estimation, can be performed independently of the encoding order within the macroblock. Therefore, even for 8×8 pixel blocks, the adjacent blocks may be determined using any of FIGS. 10A to 10D. The adjacent blocks may likewise be determined using any of FIGS. 10A to 10D when pixel blocks of different sizes are mixed.

Besides using the four pixel blocks A, B, C, and D, the adjacent blocks may be defined more broadly. For example, the pixel block further to the left of adjacent block A or the pixel block further above adjacent block B may be used. These adjacent blocks may be defined in the same way as in FIGS. 9 to 11 described above.

≪Derivation of geometric transformation parameters≫
Next, the method of deriving the geometric transformation parameter 121 in the geometric transformation parameter derivation unit 108 is described. The derivation of the geometric transformation parameter 121 is performed by the parameter derivation unit 182. The adjacent motion vectors held by the adjacent blocks are defined by equations (4) to (7), respectively.

The motion vector 119 provided by the motion estimation unit 106 is defined by equation (8). The motion vector 119 is the motion vector of the prediction target block X.

The geometric transformation parameter 121 is derived using the motion vector and the adjacent motion vectors expressed by equations (4) to (8). When the geometric transformation is an affine transformation, the transformation is expressed by the following equation (9).

In equation (9), the coordinates (x, y) are converted to the coordinates (u, v) by the affine transformation. The six parameters a, b, c, d, e, and f in equation (9) are the geometric transformation parameters. Because the affine transformation requires these six parameters to be estimated, six or more input values are needed. Using the motion vectors of adjacent blocks A and B and of the prediction target block X, the geometric transformation parameters are derived by the following equation (10). It is assumed here that the motion vectors have 1/4-pixel precision.
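As an illustration of why three motion vectors supply the six required input values, the sketch below solves a general affine model from three point correspondences. It assumes the common form u = a·x + b·y + c, v = d·x + e·y + f for equation (9), and it is not the closed form of equation (10), which is not reproduced in this text. The function names and the choice of reference points are hypothetical.

```python
from fractions import Fraction


def apply_affine(params, x, y):
    """Map (x, y) to (u, v) with u = a*x + b*y + c and v = d*x + e*y + f."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


def solve_affine(src, dst):
    """Solve (a, b, c, d, e, f) exactly from three correspondences
    src[i] -> dst[i]. Each correspondence contributes two input values."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("source points are collinear")
    det = Fraction(det)

    def solve_row(t0, t1, t2):
        # Cramer's rule on the two difference equations, then back-substitute.
        p = ((t1 - t0) * (y2 - y0) - (t2 - t0) * (y1 - y0)) / det
        q = ((x1 - x0) * (t2 - t0) - (x2 - x0) * (t1 - t0)) / det
        return p, q, t0 - p * x0 - q * y0

    a, b, c = solve_row(*(u for u, _ in dst))
    d, e, f = solve_row(*(v for _, v in dst))
    return a, b, c, d, e, f


# Three reference points displaced by the same quarter-pel vector (4, 8),
# that is, (1, 2) in pixels, give a pure translation.
src = [(0, 0), (16, 0), (0, 16)]
dst = [(x + Fraction(4, 4), y + Fraction(8, 4)) for x, y in src]
params = solve_affine(src, dst)
```

Exact rational arithmetic is used only to keep the example free of rounding; a codec would fix an integer precision for the parameters instead.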

Here, ax and ay are variables based on the size of the prediction target block and are calculated by the following equation (11).

Here, mb_size_x and mb_size_y denote the horizontal and vertical sizes of the macroblock; for a 16×16 pixel block, mb_size_x = 16 and mb_size_y = 16. Likewise, blk_size_x and blk_size_y denote the horizontal and vertical sizes of the prediction target block; in the case of FIG. 9B, blk_size_x = 8 and blk_size_y = 8.

The example here derives the geometric transformation parameters using the motion vectors of adjacent blocks A and B as input values, but these need not be the ones used. Motion vectors calculated from adjacent blocks C and D or from other adjacent blocks may be used, or the geometric transformation parameters may be obtained by parameter fitting from the motion vectors of several such adjacent blocks. In equation (10), a, b, d, and e are each obtained as real numbers, but they can easily be converted to integers by fixing the arithmetic precision of these parameters in advance.

≪Derivation of adjacent motion vectors - edge consideration≫
When an object edge lies on the boundary with an adjacent block, or when the blocks contain different objects, the geometric transformation parameters may not be derived correctly. Therefore, when the absolute difference between an adjacent motion vector to be used and the motion vector of the prediction target block is large, the motion vector of that adjacent block may be excluded from the input values. For example, |mv_a - mv_x| may be calculated, and mv_a may be left out of the input values when the result exceeds a predefined threshold D.
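The exclusion rule can be sketched as below. The text does not state which norm |mv_a - mv_x| denotes, so the sum of absolute component differences is assumed here; the function name `filter_neighbors` and the dictionary layout are likewise hypothetical.

```python
def filter_neighbors(mv_x, neighbors, threshold):
    """Keep only adjacent motion vectors whose difference from the target
    block's vector mv_x does not exceed the threshold D.
    Assumes |mv_a - mv_x| is the L1 distance between the two vectors."""
    kept = {}
    for name, mv in neighbors.items():
        diff = abs(mv[0] - mv_x[0]) + abs(mv[1] - mv_x[1])
        if diff <= threshold:
            kept[name] = mv
    return kept


# Block B moves very differently from X (e.g. another object) and is dropped.
usable = filter_neighbors((4, 4), {"A": (5, 3), "B": (40, -20)}, threshold=8)
```

If too few neighbours survive the test, the derivation would need another adjacent block or a fallback to ordinary motion compensation; the text leaves this choice open.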

When there are several candidate adjacent blocks to the left of the prediction target block, the median of their motion vectors may be calculated so that motion vectors with greatly differing values are excluded. FIG. 12 is a diagram showing an example in which the median is calculated over four 8×8 pixel blocks. When the prediction target block X is large and the adjacent pixel blocks are small, there are several pixel blocks that are candidates for the adjacent block. If the motion vectors of the four pixel blocks on the left are denoted mv_a, mv_b, mv_c, and mv_d, the motion vector is determined by the following equation (12).

Equation (12) shows an example using the median of the two-dimensional vectors, but the median may instead be taken for each vector component, as shown in equation (13).

Alternatively, the average for each component, a randomly selected two-dimensional vector, or a randomly selected value for each vector component may be used. The same processing may also be applied to adjacent blocks B, C, and D to obtain their adjacent motion vectors.
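The two median variants can be sketched as follows. Since equations (12) and (13) are not reproduced in this text, both definitions are assumptions: the two-dimensional median is taken as the candidate vector with the smallest total L1 distance to the others, and the component-wise median of an even number of values is taken as the mean of the two middle values, which is how Python's `statistics.median` behaves.

```python
from statistics import median


def vector_median(mvs):
    """Return the candidate with the smallest summed L1 distance to all
    candidates (one common definition of a 2-D vector median)."""
    def total_distance(c):
        return sum(abs(c[0] - m[0]) + abs(c[1] - m[1]) for m in mvs)
    return min(mvs, key=total_distance)


def componentwise_median(mvs):
    """Take the median of the x components and of the y components
    independently; the result need not be one of the input vectors."""
    return median(mv[0] for mv in mvs), median(mv[1] for mv in mvs)


# Four left-hand candidates; the last one is an outlier.
candidates = [(4, 4), (8, 0), (0, 8), (40, 40)]
picked = vector_median(candidates)
per_component = componentwise_median(candidates)
```

The vector median always returns one of the input vectors, whereas the component-wise variant can produce a vector that no candidate block actually has.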

≪Division of the prediction target pixels≫
Next, the region of the prediction target block on which the geometric transformation is performed is described. The region for which the geometric transformation parameters are derived corresponds to the region on which the geometric transformation is performed. Equation (10) showed an example in which the geometric transformation parameters are derived for a rectangular pixel block. However, the parameters need not be derived for a rectangular pixel block; the rectangular block may be divided into the triangular patches shown in FIG. 13. FIGS. 13A and 13B show examples in which the prediction target block is split along a diagonal into two triangular patches. In FIG. 13A, the geometric transformation parameters of each patch are derived as follows. The prediction target triangular patch X1 is defined by the following equation (14) using the motion vectors of adjacent blocks A and D and of the prediction target block X1.

予測対象三角パッチＸ２は、隣接ブロックＢ、Ｄ及び予測対象ブロックＸ２の動きベクトルを用いて次式（１５）で定義される。

The prediction target triangular patch X2 is defined by the following equation (15) using the motion vectors of the adjacent blocks B and D and the prediction target block X2.

式（１４）及び式（１５）に示す例のように、予測対象三角パッチ（または予測対象ブロック）に対して空間的距離の近い隣接ブロックの動きベクトルを用いて幾何変換パラメータを導出する。 As in the examples shown in Expression (14) and Expression (15), the geometric transformation parameter is derived using the motion vector of the adjacent block having a spatial distance close to the prediction target triangular patch (or the prediction target block).

図１３Ｂの場合も、同様にして、幾何変換パラメータが導出できる。しかし、予測対象三角パッチＸ２は、未符号化画素ブロック側に空間的距離が近く、隣接ブロックとの空間的距離が遠い。一般的に、オブジェクトの動きは空間的相関が高いため、利用可能な隣接ブロックが多く取れるように分割形状を定義するとよい。 In the case of FIG. 13B as well, geometric transformation parameters can be derived in the same manner. However, the prediction target triangular patch X2 is spatially close to the not-yet-encoded pixel blocks and spatially far from the adjacent blocks. In general, since object motion has high spatial correlation, it is preferable to define the division shape so that as many adjacent blocks as possible can be used.

なお、本実施の形態では、予測対象ブロックを２つの三角パッチで分割する例を示したが、分割形状は、更に複数の三角パッチで分割してもよく、また、矩形、曲線、台形、平行四辺形、及びこれらの組み合わせを用いて分割しても良い。 In the present embodiment, an example in which the prediction target block is divided into two triangular patches has been described. However, the block may be divided into a larger number of triangular patches, or divided using rectangles, curves, trapezoids, parallelograms, or combinations of these.

本実施の形態では、幾何変換の例としてアフィン変換を用いた例を示したが、共一次変換、ヘルマート変換、二次等角変換、射影変換、3次元射影変換、などのいずれの幾何変換を用いてもよい。例えば射影変換は、次式（１６）で表される。 In this embodiment, affine transformation has been used as the example of geometric transformation, but any geometric transformation such as bilinear transformation, Helmert transformation, quadratic conformal transformation, projective transformation, or three-dimensional projective transformation may be used. For example, the projective transformation is expressed by the following equation (16).

式（１６）において、分子分母をスカラーで通分すると、解くべきパラメータは８種類となる。そこで、利用可能な隣接ブロック数を多く定義することにより、アフィン変換と同様の枠組みで幾何変換パラメータを導出することが可能である。
以上が、幾何変換パラメータ導出部１０８の処理の概要である。 In equation (16), when the numerator and denominator are normalized by a common scalar, eight parameters remain to be solved. Thus, by defining a larger number of usable adjacent blocks, geometric transformation parameters can be derived within the same framework as for the affine transformation.
The above is the outline of the processing of the geometric transformation parameter derivation unit 108.
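Expression (16) is not reproduced in this text. As a hedged illustration of the eight-parameter statement above, one commonly used normalized form of a planar projective transformation is shown below. The symbols a to f follow the affine parameters used elsewhere in this description, while g and h are names assumed for this sketch only.

```latex
u = \frac{a x + b y + c}{g x + h y + 1}, \qquad
v = \frac{d x + e y + f}{g x + h y + 1}
```

Multiplying out the denominators gives two equations per point correspondence that are linear in the eight unknowns:

```latex
a x + b y + c - g x u - h y u = u, \qquad
d x + e y + f - g x v - h y v = v
```

Under this form, four correspondences (for example, motion vectors at four block positions) suffice to determine the eight parameters, and setting g = h = 0 reduces the mapping to the six-parameter affine case.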

＜幾何変換予測部１０９＞
次に、幾何変換予測部１０９の処理について具体的に説明する。幾何変換予測部１０９は入力された幾何変換パラメータ１２１を基にして、参照画像信号１２０に対して幾何変換を実施し、幾何変換予測を行う。図１４は、幾何変換予測部１０９の構成の例を示すブロック図である。幾何変換予測部１０９は、幾何変換部１９１と内挿補間部１９２とを有する。幾何変換部１９１は、参照画像信号１２０に対する幾何変換を行い、予測画素の位置を算出する。内挿補間部１９２は、幾何変換により求められた予測画素の分数位置に対応する予測画素の値を、内挿補間等により算出する。 <Geometric transformation prediction unit 109>
Next, the processing of the geometric transformation prediction unit 109 will be specifically described. The geometric transformation prediction unit 109 performs geometric transformation on the reference image signal 120 based on the input geometric transformation parameter 121 and thereby performs geometric transformation prediction. FIG. 14 is a block diagram illustrating an example of the configuration of the geometric transformation prediction unit 109. The geometric transformation prediction unit 109 includes a geometric transformation unit 191 and an interpolation unit 192. The geometric transformation unit 191 performs geometric transformation on the reference image signal 120 and calculates the positions of the predicted pixels. The interpolation unit 192 calculates, by interpolation or the like, the predicted pixel values at the fractional positions obtained by the geometric transformation.

図１５は、予測対象ブロックに対する幾何変換予測と動き補償予測の例を示す図である。図１５は、１６×１６画素ブロックの例である。 FIG. 15 is a diagram illustrating an example of geometric transformation prediction and motion compensation prediction for a prediction target block. FIG. 15 is an example of a 16 × 16 pixel block.

図中、予測対象ブロックは三角で示される画素からなる正方形画素ブロックＣＲである。動き補償予測の対応する画素は黒丸で示される。黒丸で示される画素からなる画素ブロックＭＥＲは、正方形である。一方、幾何変換予測の対応する画素は×で示され、これらの画素からなる画素ブロックＧＴＲは、平行四辺形となる。 In the figure, the prediction target block is a square pixel block CR composed of pixels indicated by triangles. The corresponding pixels of motion compensated prediction are indicated by black circles. A pixel block MER composed of pixels indicated by black circles is a square. On the other hand, pixels corresponding to the geometric transformation prediction are indicated by x, and a pixel block GTR composed of these pixels is a parallelogram.

動き補償後の領域と幾何変換後の領域は、参照画像信号の対応する領域を符号化対象のフレームの座標に合わせて記述している。このように、幾何変換予測を用いることによって、矩形画素ブロックの回転、拡大・縮小、せん断、鏡面変換などの変形に合わせた予測画像信号の生成が可能となる。 The region after motion compensation and the region after geometric transformation describe the corresponding region of the reference image signal according to the coordinates of the frame to be encoded. As described above, by using the geometric transformation prediction, it is possible to generate a predicted image signal in accordance with deformation such as rotation, enlargement / reduction, shearing, and mirror transformation of the rectangular pixel block.

幾何変換部１９１は、式（１０）、式（１４）、及び、式（１５）を用いて算出された幾何変換パラメータ１２１を用い、式（９）により、幾何変換後の座標（ｕ，ｖ）を算出する。算出された幾何変換後の座標（ｕ，ｖ）は、実数値である。そこで、座標（ｕ，ｖ）に対応する輝度値を参照画像信号から内挿補間することによって予測値を生成する。 The geometric transformation unit 191 uses the geometric transformation parameter 121 calculated with equations (10), (14), and (15) to calculate the coordinates (u, v) after geometric transformation according to equation (9). The calculated coordinates (u, v) are real-valued. Therefore, the predicted value is generated by interpolating the luminance value corresponding to the coordinates (u, v) from the reference image signal.

式（１０）、式（１４）、及び、式（１５）より、６ビット演算で誤差なく分数位置を計算できるため、内挿補間の際の画素精度を６ビットとする。これにより、整数画素間は６４個の分数画素に分割される。 From Equation (10), Equation (14), and Equation (15), since the fractional position can be calculated without error by 6-bit calculation, the pixel accuracy at the time of interpolation is 6 bits. Thereby, the integer pixels are divided into 64 fractional pixels.

図１６は、共一次内挿法による輝度値補間処理の例を示す図である。白丸ｃｗ０ないしｃｗ３は整数画素位置の輝度値を示し、黒丸ｃｂが補間画素位置（ｕ，ｖ）を示している。図１６では、分数精度の位置に隣接する周囲４つの整数画素値を用いて、それぞれの距離の比から補間画素値を生成する。共一次内挿法は次式（１７）で表される。 FIG. 16 is a diagram illustrating an example of luminance value interpolation processing by bilinear interpolation. White circles cw0 to cw3 indicate luminance values at integer pixel positions, and black circles cb indicate interpolation pixel positions (u, v). In FIG. 16, the interpolated pixel value is generated from the ratio of the distances using the surrounding four integer pixel values adjacent to the fractional accuracy position. The bilinear interpolation method is expressed by the following equation (17).

ここでＰ（ｕ,ｖ）は内挿補間処理後の予測画素値を示しており、Ｒ（ｘ，ｙ）は、利用した参照画像信号の整数画素値を表している。（ｘ−ｕ）＝Ｕ／６４、（ｙ−ｖ）＝Ｖ／６４とすると、式（１７）は、式（１８）に示す整数演算に変形できる。 Here, P (u, v) represents the predicted pixel value after the interpolation process, and R (x, y) represents the integer pixel value of the used reference image signal. Assuming that (x−u) = U / 64 and (y−v) = V / 64, Equation (17) can be transformed into an integer operation shown in Equation (18).

式（１８）において、ｆは丸めのオフセット（０≦ｆ＜２^12）を表している。本実施の形態ではｆ＝０としている。

In Expression (18), f represents a rounding offset (0 ≦ f < 2^12). In this embodiment, f = 0.

以上のように、幾何変換を行った予測対象ブロック内の座標毎に内挿補間を適用することによって、新たな予測画像信号を生成する。 As described above, a new predicted image signal is generated by applying interpolation for each coordinate in the prediction target block subjected to geometric transformation.
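The integer form of bilinear interpolation described above can be sketched as follows. Expressions (17) and (18) are not reproduced in this text, so this is a minimal sketch under stated assumptions: coordinates are passed in 1/64-pel units, U and V are taken as the fractional offsets from the top-left integer pixel, coordinates are non-negative, and the function name `bilinear_predict` and the border clamping are hypothetical choices.

```python
from typing import List

FRAC_BITS = 6            # 1/64-pel accuracy, as in the text
ONE = 1 << FRAC_BITS     # 64

def bilinear_predict(ref: List[List[int]], u64: int, v64: int, f: int = 0) -> int:
    """Predict one pixel at (u64/64, v64/64) from the four surrounding integer pixels.

    ref is indexed as ref[y][x]; f is the rounding offset, 0 <= f < 2**12.
    """
    # Split each coordinate into an integer pixel index and a 6-bit fraction.
    x, U = u64 >> FRAC_BITS, u64 & (ONE - 1)
    y, V = v64 >> FRAC_BITS, v64 & (ONE - 1)
    # Clamp the right/bottom neighbors so that border positions stay inside ref.
    x1 = min(x + 1, len(ref[0]) - 1)
    y1 = min(y + 1, len(ref) - 1)
    # The four weights always sum to 64 * 64 = 2**12.
    acc = (ref[y][x] * (ONE - U) * (ONE - V)
           + ref[y][x1] * U * (ONE - V)
           + ref[y1][x] * (ONE - U) * V
           + ref[y1][x1] * U * V
           + f)
    return acc >> (2 * FRAC_BITS)

# Hypothetical 2 x 2 reference patch.
ref = [[10, 20],
       [30, 40]]
```

Because the weights sum to 2^12, the final right shift by 12 bits normalizes the result, which is consistent with the stated range of the rounding offset f. With f = 0, as in the embodiment, the result is truncated rather than rounded.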

なお、本実施の形態では、内挿補間の方法として共一次内挿法を用いる例を示したが、最近接内挿法、３次畳み込み内挿法、線形フィルタ内挿法、ラグランジュ補間法、スプライン補間法、ランツォシュ補間法などのいかなる内挿補間法を適用しても構わない。 In the present embodiment, the bilinear interpolation method is used as the interpolation method. However, the closest interpolation method, the cubic convolution interpolation method, the linear filter interpolation method, the Lagrange interpolation method, Any interpolation method such as a spline interpolation method or a Lanczos interpolation method may be applied.

なお、本実施の形態では、参照画像信号の整数画素位置からの内挿補間についての例を説明したが、動き推定部１０６で、既に参照画像信号１２０の補間画像信号を保持している場合には、分数精度の補間画像信号を再利用しても良い。例えば、動き推定部１０６で１／４画素精度の動きベクトルを算出する際に、参照画像信号を４倍に拡大した拡大参照画像信号を生成して保持している場合には、１／４画素精度の補間画像を利用して１／６４画素精度の補間画像を生成してもよい。これにより、１／４精度の補間画像から更に１／１６精度の内挿補間処理を行って１／６４画素精度の補間画像を生成することができる。なお、内挿補間処理の画素精度は更に細かく指定することも可能である。この場合、指定した補間精度に応じて内挿補間処理を行えばよい。
以上が、幾何変換予測部１０９の処理の概要である。 In this embodiment, interpolation from the integer pixel positions of the reference image signal has been described. However, when the motion estimation unit 106 already holds an interpolated image signal of the reference image signal 120, that fractional-precision interpolated image signal may be reused. For example, when the motion estimation unit 106 generates and holds an enlarged reference image signal, obtained by enlarging the reference image signal by a factor of four, in order to calculate motion vectors with 1/4 pixel accuracy, the 1/64 pixel accuracy interpolated image may be generated from the 1/4 pixel accuracy interpolated image. That is, a further 1/16 accuracy interpolation is applied to the 1/4 accuracy interpolated image to obtain the 1/64 pixel accuracy interpolated image. The pixel accuracy of the interpolation may also be specified more finely, in which case the interpolation is performed according to the specified accuracy.
The above is the outline of the processing of the geometric transformation prediction unit 109.

＜判定パラメータ導出部１２７＞
次に判定パラメータ導出部１２７について具体的に説明する。判定パラメータ導出部１２７は、幾何変換パラメータ導出部１０８から出力された幾何変換パラメータ１２１を用いて、動き補償予測部１０７から出力された予測画像信号と幾何変換予測部１０９から出力された予測画像信号との何れの信号を出力するかを判定するための判定パラメータ１２５を生成し、予測切替部１１１へと出力する。幾何変換がアフィン変換の場合には、平行移動指標、回転指標、拡大・縮小指標、変形指標などを用いて、幾何変換の度合いを評価することができる。 <Determination Parameter Deriving Unit 127>
Next, the determination parameter deriving unit 127 will be specifically described. The determination parameter deriving unit 127 uses the geometric transformation parameter 121 output from the geometric transformation parameter deriving unit 108 and the predicted image signal output from the motion compensation prediction unit 107 and the predicted image signal output from the geometric transformation prediction unit 109. The determination parameter 125 for determining which signal to output is generated and output to the prediction switching unit 111. When the geometric transformation is an affine transformation, the degree of geometric transformation can be evaluated using a parallel movement index, a rotation index, an enlargement / reduction index, a deformation index, and the like.

一般的な動画像では、時間方向への相関が高い。動きベクトルは時間的に異なる画像間のオブジェクトの移動を示す値であるため、動きの空間相関も比較的高いことが予想される。そこでこれらの指標を判定パラメータとして利用し、予測方法を動的に切り替える。平行移動指標は次式（１９）で与えられる。 A general moving image has a high correlation in the time direction. Since the motion vector is a value indicating the movement of an object between temporally different images, it is expected that the spatial correlation of motion is relatively high. Therefore, the prediction method is dynamically switched using these indexes as determination parameters. The parallel movement index is given by the following equation (19).

式（１９）において、ｃnは、隣接ブロック、又は、予測対象ブロックを分割した隣接する三角パッチ、すなわち、図１３における同じ番号の隣接ブロック内の三角パッチの幾何変換パラメータのｃ成分を示しており、ｃｘは隣接ブロックの幾何変換パラメータのｃ成分を示している。三角パッチに分割した際は、分割した同じ形状の三角パッチとの隣接関係を参照している。ｆ成分についても同様である。 In Equation (19), c_n represents the c component of the geometric transformation parameter of the adjacent block, or of the adjacent triangular patch obtained by dividing the prediction target block, that is, the triangular patch with the same number in the adjacent block in FIG. 13, and c_x represents the c component of the geometric transformation parameter of the adjacent block. When the block is divided into triangular patches, the adjacency relationship with the triangular patch of the same shape is used. The same applies to the f component.

図１７は、アフィン変換による画素ブロックの変化の例を示す図である。図１７では、座標（１，０）及び（０，１）がそれぞれアフィン変換によって座標（ａ，ｄ）、（ｂ，ｅ）に変換され、変換前のベクトルの中心角度４５°からθＤ回転している。ａ，ｂ，ｄ，ｅはそれぞれ、予測対象ブロックで得られた幾何変換パラメータを示している。回転指標は次式（２０）で与えられる。 FIG. 17 is a diagram illustrating an example of the change of a pixel block under affine transformation. In FIG. 17, the coordinates (1, 0) and (0, 1) are mapped by the affine transformation to the coordinates (a, d) and (b, e), respectively, and the center vector is rotated by θD from its pre-transformation angle of 45°. Here a, b, d, and e are the geometric transformation parameters obtained for the prediction target block. The rotation index is given by the following equation (20).

式（２０）において、ｓｇｎ（Ａ）は、Ａの符号を返す関数である。矩形ブロックの中心ベクトルがどの程度回転したかを表している。

In equation (20), sgn (A) is a function that returns the sign of A. This shows how much the center vector of the rectangular block has been rotated.

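拡大・縮小指標は図１７で示されるアフィン変換後の面積に相当し、値が１より大きい場合は拡大方向に、小さい場合は縮小方向に変形していることが判る。そこで、次式で拡大・縮小指標を定義する。 The enlargement / reduction index corresponds to the area after the affine transformation shown in FIG. 17; a value greater than 1 indicates deformation in the enlargement direction, and a value less than 1 indicates deformation in the reduction direction. The enlargement / reduction index is therefore defined by the following equation.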
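Expression (21) is not reproduced in this text. As a hedged reconstruction of the area argument, assume the affine map has the form u = ax + by + c, v = dx + ey + f, so that (1, 0) is carried to (a, d) as described for FIG. 17. The unit square is then mapped to a parallelogram whose signed area is the determinant of the linear part:

```latex
\mathrm{Det} = \begin{vmatrix} a & b \\ d & e \end{vmatrix} = a e - b d
```

Under this assumption, a pure rotation by an angle \theta (a = e = \cos\theta, d = -b = \sin\theta) gives Det = 1, a uniform scaling by a factor s gives Det = s^2, and a negative value indicates a mirror transformation.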

式（２１）でＤｅｔ≒１となる場合は、更に次式を用いて回転指標を算出してもよい。

When Det≈1 in equation (21), the rotation index may be calculated using the following equation.

図１７のθCは次式（２３）の変形指標に対応しており、アフィン変換後の図形の変形角度を定義している。

θC in FIG. 17 corresponds to the deformation index of the following equation (23) and defines the deformation angle of the figure after the affine transformation.

また、式（２４）に示す、隣接ブロックで算出された幾何変換パラメータの各成分と予測対象ブロックで算出された幾何変換パラメータの各成分の差分値もそれぞれ判定パラメータの１つのパラメータとしている。 Further, the difference value between each component of the geometric transformation parameter calculated in the adjacent block and each component of the geometric transformation parameter calculated in the prediction target block shown in Expression (24) is also used as one parameter of the determination parameter.

また、式（２０）、式（２１）、式（２２）、式（２３）を用いて算出された予測対象ブロックと隣接ブロックの指標を用いて以下のような判定パラメータを利用しても良い。 In addition, the following determination parameters may be used by using the prediction target block and the adjacent block index calculated using Expression (20), Expression (21), Expression (22), and Expression (23). .

例えばＤｅｔｎは隣接ブロック或いは隣接ブロックの三角パッチが保持する幾何変換パラメータから算出される。算出された幾何変換パラメータに対応する指標と幾何変換パラメータの差分値が判定パラメータ１２５として、予測切替部１１１へと出力される。
以上が、判定パラメータ導出部１２７の処理の概要である。 For example, Det_n is calculated from the geometric transformation parameters held by the adjacent block or by the triangular patch of the adjacent block. The indices computed from the geometric transformation parameters and the difference values of the geometric transformation parameters are output to the prediction switching unit 111 as the determination parameter 125.
The above is the outline of the processing of the determination parameter deriving unit 127.

＜予測切替部１１１−予測の動的切替＞
次に予測切替部１１１について具体的に説明する。予測切替部１１１は、判定パラメータ導出部１２７から出力された判定パラメータ１２５に基づいて、どちらの予測画像信号を用いかが記載された予測切替情報１２２を生成する。 <Prediction switching unit 111-Dynamic switching of prediction>
Next, the prediction switching unit 111 will be specifically described. Based on the determination parameter 125 output from the determination parameter deriving unit 127, the prediction switching unit 111 generates prediction switching information 122 that describes which prediction image signal is used.

先ず、式（２０）ないし式（２３）を用いる場合の切替方法について説明する。回転指標、拡大・縮小指標、変形指標は、動きベクトルのみの平行移動に加え、画素ブロックの回転、拡大、縮小、変形の度合いを図る指標である。算出された幾何変換パラメータの指標が、極端に大きい場合や小さい場合には、幾何変換パラメータの推定が不適切であることが予想され、このパラメータを利用して幾何変換予測で生成される予測画像信号は誤差を多く含んでいることが予想される。そこで、これらの指標が、予め定めた閾値の範囲を超えた場合には、幾何変換予測の予測画像信号を利用しないように予測切替情報１２２を生成する。例として、式（２１）の拡大・縮小指標における閾値範囲について説明する。Ｄｅｔは拡大・縮小の度合いを示す指標である。予測切替情報１２２を次式（２８）で定義する。 First, the switching method used with Expressions (20) to (23) will be described. The rotation index, the enlargement / reduction index, and the deformation index measure the degree of rotation, enlargement, reduction, and deformation of the pixel block, in addition to the parallel movement expressed by the motion vector alone. When an index calculated from the geometric transformation parameters is extremely large or extremely small, the estimation of the geometric transformation parameters is likely to be inappropriate, and the predicted image signal generated by geometric transformation prediction with these parameters is likely to contain large errors. Therefore, when these indices fall outside a predetermined threshold range, the prediction switching information 122 is generated so that the predicted image signal of the geometric transformation prediction is not used. As an example, the threshold range for the enlargement / reduction index of Expression (21) will be described. Det is an index indicating the degree of enlargement / reduction. The prediction switching information 122 is defined by the following formula (28).

ここでｐｒｅｄ＿ｆｌａｇは予測切替情報１２２を表しており、ＴｈＤｅｔは閾値を表している。Ｄｅｔがそれぞれ閾値の範囲を超えた場合、ｐｒｅｄ＿ｆｌａｇは０となり、動き補償予測を選択する。一方、それ以外の場合は、ｐｒｅｄ＿ｆｌａｇは１となり、幾何変換予測を選択する。他の判定パラメータに関しても同様な閾値判定を行い、予測切替情報１２２を生成する。 Here, pred_flag represents the prediction switching information 122, and ThDet represents a threshold value. When Det exceeds the threshold range, pred_flag is 0, and motion compensation prediction is selected. On the other hand, in other cases, pred_flag is 1, and geometric transformation prediction is selected. Similar threshold determination is performed for other determination parameters, and prediction switching information 122 is generated.
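The threshold decision described above can be sketched in Python as follows. Expression (28) is not reproduced in this text, so the two-sided range with explicit bounds `th_low` and `th_high`, the helper `scale_index` (which assumes Det = ae - bd for the linear part of the affine map), and the example threshold values are all assumptions of this sketch.

```python
def scale_index(a: float, b: float, d: float, e: float) -> float:
    """Signed area of the unit square after the linear part of the affine map."""
    return a * e - b * d

def pred_flag(det: float, th_low: float, th_high: float) -> int:
    """Return 1 to select geometric transformation prediction, 0 for motion compensation."""
    return 1 if th_low <= det <= th_high else 0

# Hypothetical threshold range around Det = 1 (no change in area).
TH_LOW, TH_HIGH = 0.5, 2.0

flag_identity = pred_flag(scale_index(1.0, 0.0, 0.0, 1.0), TH_LOW, TH_HIGH)
flag_huge_zoom = pred_flag(scale_index(3.0, 0.0, 0.0, 3.0), TH_LOW, TH_HIGH)
```

A Det value far from 1 suggests an implausible estimate, so the sketch falls back to motion compensated prediction in that case, mirroring the behavior described for pred_flag.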

次に、式（１９）、式（２４）ないし式（２７）を用いる際の切替方法について説明する。隣接ブロックと予測対象ブロックの幾何変換パラメータには高い空間相関があることが推定されるため、それぞれの差分値が予め定めた閾値の大きさを超える場合には、幾何変換予測の予測画像信号を利用しないように予測切替情報１２２を生成する。 Next, a switching method when using Expression (19), Expression (24) to Expression (27) will be described. Since it is estimated that the geometric transformation parameters of the adjacent block and the prediction target block have a high spatial correlation, if each difference value exceeds the predetermined threshold value, the prediction image signal of the geometric transformation prediction is The prediction switching information 122 is generated so as not to be used.

次に隣接ブロックと予測対象ブロックの符号化パラメータとを用いたときの切替方法について説明する。符号化パラメータに含まれる量子化パラメータは、量子化ステップサイズを規定するパラメータである。量子化パラメータが大きい場合、変換係数が粗く量子化される。これにより、局部復号された参照画像信号が入力画像信号に対して大きな誤差を持ち、動きベクトルの推定精度が低下する。幾何変換予測では、動きベクトルを利用して幾何変換パラメータを算出するため、動きベクトルの推定精度が低下すると幾何変換パラメータの算出精度も低下する。そこで、量子化パラメータが大きくなった場合、各種判定パラメータの閾値範囲を狭くする処理を行う。つまり、各種パラメータの閾値は量子化パラメータに対する減少関数となる。 Next, a switching method when using the adjacent block and the encoding parameter of the prediction target block will be described. The quantization parameter included in the encoding parameter is a parameter that defines the quantization step size. When the quantization parameter is large, the transform coefficient is roughly quantized. As a result, the locally decoded reference image signal has a large error with respect to the input image signal, and the motion vector estimation accuracy decreases. In geometric transformation prediction, since a geometric transformation parameter is calculated using a motion vector, if the estimation accuracy of the motion vector is lowered, the calculation accuracy of the geometric transformation parameter is also lowered. Therefore, when the quantization parameter becomes large, processing for narrowing the threshold range of various determination parameters is performed. That is, the threshold value of each parameter is a decreasing function for the quantization parameter.
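One way to make a threshold a decreasing function of the quantization parameter is sketched below. The text does not specify the function, so the linear form, the clamp, the default constants, and the names `th_det_for_qp` and `det_range_for_qp` are purely illustrative assumptions.

```python
from typing import Tuple

def th_det_for_qp(qp: int, th_at_qp0: float = 2.0, slope: float = 0.015,
                  th_min: float = 1.1) -> float:
    """Upper bound of the allowed Det range; it shrinks toward 1 as QP grows."""
    return max(th_min, th_at_qp0 - slope * qp)

def det_range_for_qp(qp: int) -> Tuple[float, float]:
    """Symmetric (in ratio) threshold range around Det = 1 for a given QP."""
    hi = th_det_for_qp(qp)
    return (1.0 / hi, hi)
```

With these hypothetical constants the permitted range narrows as the quantization parameter rises, so geometric transformation prediction is rejected more readily when the motion vectors are expected to be less reliable.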

なお、本実施の形態では、量子化パラメータを利用して閾値の範囲を制御する例を示したが、符号化パラメータに含まれる他の情報を用いてもよい。例えば、符号化した動画像の解像度、予測対象ブロックのタイプ、参照画像のＲｅｆ＿ｉｄｘ、動きベクトルの値、又は、変換サイズに関する情報などを利用してもよい。 In the present embodiment, an example in which the threshold range is controlled using the quantization parameter is shown, but other information included in the encoding parameter may be used. For example, information regarding the resolution of the encoded moving image, the type of the prediction target block, the Ref_idx of the reference image, the value of the motion vector, or the transform size may be used.

また、隣接ブロックの全ての隣接動きベクトルと予測対象ブロックの動きベクトルとが同一の値を持つとき、幾何変換前と後での座標が変化しない。このような時は、周辺ブロックが同一のオブジェクトの平行移動と考えられるため、幾何変換予測を行う必要がない。そこで、この条件が成り立つときは、予測切替情報１２２は、幾何変換予測を利用しない旨の値を有する。 Further, when all adjacent motion vectors of adjacent blocks and the motion vector of the prediction target block have the same value, the coordinates before and after geometric transformation do not change. In such a case, it is considered that the peripheral block is a parallel movement of the same object, so that it is not necessary to perform geometric transformation prediction. Therefore, when this condition is satisfied, the prediction switching information 122 has a value indicating that the geometric transformation prediction is not used.

また、それぞれの指標に対して用いる閾値を符号化パラメータの値に応じて変化する関数としてもよい。例えば、式（２８）における閾値ＴｈＤｅｔを量子化パラメータに依存する関数として定義し、量子化パラメータが増加するに従って、閾値が減少するような関数とすることで、推定精度の低下する低ビットレート帯での効果的な予測切替が可能となる。
以上が、予測切替部１１１及び予測分離スイッチ１１０の処理の概要である。 Further, the threshold used for each index may be a function that changes according to the value of an encoding parameter. For example, by defining the threshold ThDet in Equation (28) as a function of the quantization parameter such that the threshold decreases as the quantization parameter increases, effective prediction switching becomes possible in the low-bit-rate range, where the estimation accuracy decreases.
The above is the outline of the processing of the prediction switching unit 111 and the prediction separation switch 110.

＜幾何変換予測を含むインター予測の処理フロー＞
図１８は、インター予測部１３０の予測画像信号生成の処理を示すフロー図である。処理が開始される（Ｓ５０１）と、インター予測部１３０の外部から入力されてきた予測対象ブロックの動きベクトル１１９が幾何変換パラメータ導出部１０８へと入力される。これを受けて幾何変換パラメータ導出部１０８では、予測対象ブロックに対応する隣接ブロックを決定する（Ｓ５０２）。 <Processing flow of inter prediction including geometric transformation prediction>
FIG. 18 is a flowchart showing the process of generating a predicted image signal by the inter prediction unit 130. When the processing is started (S501), the motion vector 119 of the prediction target block input from the outside of the inter prediction unit 130 is input to the geometric transformation parameter deriving unit 108. In response to this, the geometric transformation parameter derivation unit 108 determines an adjacent block corresponding to the prediction target block (S502).

次に対応する隣接ブロックの隣接動きベクトルを、符号化制御部１２６が保持する内部メモリから導出する（Ｓ５０３）。幾何変換パラメータ導出部１０８が、導出された隣接動きベクトルと予測対象ブロックの動きベクトルを用いて幾何変換パラメータ１２１を導出する（Ｓ５０４）。 Next, the adjacent motion vector of the corresponding adjacent block is derived from the internal memory held by the encoding control unit 126 (S503). The geometric transformation parameter deriving unit 108 derives the geometric transformation parameter 121 using the derived adjacent motion vector and the motion vector of the prediction target block (S504).

導出された幾何変換パラメータ１２１を用いて幾何変換予測部１０９にて予測対象ブロックの幾何変換処理が行われる（Ｓ５０５）。幾何変換予測部１０９は、幾何変換によって新たに算出された分数画素位置の予測画像信号を内挿補間によって生成する（Ｓ５０６）。 Using the derived geometric transformation parameter 121, the geometric transformation processing of the prediction target block is performed in the geometric transformation prediction unit 109 (S505). The geometric transformation prediction unit 109 generates a prediction image signal at a fractional pixel position newly calculated by geometric transformation by interpolation (S506).

幾何変換パラメータ導出部１０８で算出された幾何変換パラメータ１２１は判定パラメータ導出部１２７へと入力され、判定パラメータ１２５が算出される（Ｓ５０７）。算出された判定パラメータ１２５が予測切替部１１１へと入力され、予測切替情報１２２が生成される（Ｓ５０８）。 The geometric transformation parameter 121 calculated by the geometric transformation parameter deriving unit 108 is input to the determination parameter deriving unit 127, and the determination parameter 125 is calculated (S507). The calculated determination parameter 125 is input to the prediction switching unit 111, and the prediction switching information 122 is generated (S508).

予測対象ブロックで算出された幾何変換パラメータ１２１が、符号化制御部１２６内に所持する内部メモリへと保存される（Ｓ５０９）。動きベクトル１１９が符号化制御部１２６内に所持する内部メモリへと保存される（Ｓ５１０）。 The geometric transformation parameter 121 calculated in the prediction target block is stored in an internal memory possessed in the encoding control unit 126 (S509). The motion vector 119 is stored in an internal memory possessed in the encoding control unit 126 (S510).

予測切替情報１２２に従って予測分離スイッチ１１０は、予測対象ブロックの予測方法が、幾何変換予測部１０９で生成された予測画像信号を利用するかどうかを判断する（Ｓ５１２）。かかる判断がＹＥＳの場合、予測分離スイッチ１１０は、幾何変換予測部１０９の出力端をスイッチと接続し、予測画像信号を出力する（Ｓ５１３）。 According to the prediction switching information 122, the prediction separation switch 110 determines whether or not the prediction method of the prediction target block uses the predicted image signal generated by the geometric transformation prediction unit 109 (S512). When this determination is YES, the prediction separation switch 110 connects the output end of the geometric transformation prediction unit 109 to the switch, and outputs a predicted image signal (S513).

一方、かかる判断がＮＯの場合、動き補償予測部１０７では、動きベクトル１１９に従って動き補償予測処理を行う（Ｓ５１４）。予測分離スイッチ１１０は動き補償予測部１０７の出力端へスイッチを接続し、予測画像信号を出力する（Ｓ５１５）。 On the other hand, when this determination is NO, the motion compensation prediction unit 107 performs motion compensation prediction processing according to the motion vector 119 (S514). The prediction separation switch 110 connects the switch to the output terminal of the motion compensation prediction unit 107 and outputs a prediction image signal (S515).

次に、符号化制御部１２６は、予測対象ブロックが、マクロブロックにおける最終ブロックであるかどうかを判断し（Ｓ５１６）、かかる判断がＹＥＳの場合は、当該マクロブロックの予測処理を終了する（Ｓ５１７）。かかる判断がＮＯの場合、処理は最初に戻り、マクロブロックの次の画素ブロックの予測画像生成処理を行う。以上が本発明の本実施の形態におけるインター予測の予測画像生成処理の流れである。 Next, the encoding control unit 126 determines whether the prediction target block is the last block in the macroblock (S516). If the determination is YES, the prediction process for the macroblock ends (S517). If the determination is NO, the process returns to the beginning and the predicted image generation process is performed for the next pixel block in the macroblock. The above is the flow of the predicted image generation processing for inter prediction in the present embodiment.

＜シンタクス構造＞
次に、動画像符号化装置１００におけるシンタクス構造について説明する。図１９は、シンタクス１６００の構成を示す図である。図１９に示すとおり、シンタクス１６００は主に３つのパートを有する。ハイレベルシンタクス１６０１は、スライス以上の上位レイヤのシンタクス情報を有する。スライスレベルシンタクス１６０２は、スライス毎に復号に必要な情報を有し、マクロブロックレベルシンタクス１６０３は、マクロブロック毎に復号に必要とされる情報を有する。 <Syntax structure>
Next, a syntax structure in the moving picture coding apparatus 100 will be described. FIG. 19 is a diagram illustrating a configuration of the syntax 1600. As shown in FIG. 19, the syntax 1600 mainly has three parts. The high-level syntax 1601 has higher layer syntax information that is equal to or higher than a slice. The slice level syntax 1602 has information necessary for decoding for each slice, and the macroblock level syntax 1603 has information necessary for decoding for each macroblock.

各パートは、更に詳細なシンタクスで構成されている。ハイレベルシンタクス１６０１は、シーケンスパラメータセットシンタクス１６０４とピクチャパラメータセットシンタクス１６０５などの、シーケンス及びピクチャレベルのシンタクスを含む。スライスレベルシンタクス１６０２は、スライスヘッダーシンタクス１６０６、スライスデータシンタクス１６０７等を含む。マクロブロックレベルシンタクス１６０３は、マクロブロックレイヤーシンタクス１６０８、マクロブロックプレディクションシンタクス１６０９等を含む。 Each part has a more detailed syntax. High level syntax 1601 includes sequence and picture level syntax, such as sequence parameter set syntax 1604 and picture parameter set syntax 1605. The slice level syntax 1602 includes a slice header syntax 1606, a slice data syntax 1607, and the like. The macroblock level syntax 1603 includes a macroblock layer syntax 1608, a macroblock prediction syntax 1609, and the like.

図２０は、スライスヘッダーシンタクス１６０６の例を示す図である。図中に示されるｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇは、当該スライスに幾何変換予測を適用するかどうかを示すシンタクス要素である。ｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇが0である場合、予測切替部１１１は、当該スライスにおいて常に動き補償予測部１０７の出力端を出力するように予測切替情報１２２を設定して予測分離スイッチ１１０を切り替える。つまり、このスライスに対しては、幾何変換予測を適用しないことを意味する。一方、ｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇが１である場合、当該スライスにおいて判定パラメータ１２５が指し示す情報に基づいて予測切替情報１２２が設定され、予測分離スイッチ１１０は予測画像信号を動的に切り替える。 FIG. 20 is a diagram illustrating an example of the slice header syntax 1606. The slice_affine_motion_prediction_flag shown in the figure is a syntax element indicating whether geometric transformation prediction is applied to the slice. When slice_affine_motion_prediction_flag is 0, the prediction switching unit 111 sets the prediction switching information 122 so that the output of the motion compensation prediction unit 107 is always selected in the slice, and switches the prediction separation switch 110 accordingly. That is, geometric transformation prediction is not applied to this slice. On the other hand, when slice_affine_motion_prediction_flag is 1, the prediction switching information 122 is set based on the information indicated by the determination parameter 125 in the slice, and the prediction separation switch 110 dynamically switches the predicted image signal.

図２１は、スライスデータシンタクス１６０７の例を示す図である。図中に示されるｍｂ＿ｓｋｉｐ＿ｆｌａｇは、当該マクロブロックがスキップモードで符号化されているかどうかを示すフラグである。スキップモードである場合、変換係数や動きベクトルなどは符号化されない。 FIG. 21 is a diagram illustrating an example of the slice data syntax 1607. The mb_skip_flag shown in the figure is a flag indicating whether the macroblock is encoded in skip mode. In skip mode, transform coefficients, motion vectors, and the like are not encoded.

ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅは当該マクロブロックで幾何変換予測が利用できるかどうかを示す内部パラメータである。ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅが０の場合、判定パラメータ１２５で算出した各種値によって、幾何変換予測を利用しないように予測切替情報１２２が設定されていることを意味する。また、隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つときもＡｖａｉｌＡｆｆｉｎｅＭｏｄｅは０となる。 AvailAffineMode is an internal parameter indicating whether geometric transformation prediction can be used in the macroblock. When AvailAffineMode is 0, it means that the prediction switching information 122 is set not to use the geometric transformation prediction by various values calculated by the determination parameter 125. Also, AvailAffineMode is 0 when the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value.

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが符号化される。ｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが１の場合、当該スキップモードに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。 On the other hand, when AvailAffineMode is 1, mb_affine_motion_skip_flag indicating whether to use geometric transformation prediction or motion compensation prediction is encoded. When mb_affine_motion_skip_flag is 1, it means that geometric transformation prediction is applied to the skip mode. When mb_affine_motion_skip_flag is 0, it means that motion compensation prediction is applied.
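The conditional signaling described above can be sketched as a small decoder-side routine. FIG. 21 is not reproduced in this text, so the exact nesting of the conditions is an assumption: here mb_affine_motion_skip_flag is read only when the macroblock is skipped, the slice-level flag is 1, and AvailAffineMode is 1. The function name `parse_skip_flags` and the plain bit iterator are hypothetical and stand in for the real entropy decoder.

```python
from typing import Dict, Iterator

def parse_skip_flags(bits: Iterator[int],
                     slice_affine_motion_prediction_flag: int,
                     avail_affine_mode: int) -> Dict[str, int]:
    """Read mb_skip_flag and, only when it is signaled, mb_affine_motion_skip_flag."""
    out = {"mb_skip_flag": next(bits), "mb_affine_motion_skip_flag": 0}
    if (out["mb_skip_flag"]
            and slice_affine_motion_prediction_flag
            and avail_affine_mode):
        # 1: geometric transformation prediction, 0: motion compensated prediction
        out["mb_affine_motion_skip_flag"] = next(bits)
    return out

# Hypothetical bit sequence: a skipped macroblock that uses geometric prediction.
example = parse_skip_flags(iter([1, 1]), 1, 1)
```

When AvailAffineMode is 0 no extra bit is consumed, which is how the scheme avoids spending code on a flag whose value can be inferred by both encoder and decoder.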

図２２は、マクロブロックレイヤーシンタクス１６０８の例を示す図である。図中に示されるｍｂ＿ｔｙｐｅは、マクロブロックタイプ情報を示している。すなわち、現在のマクロブロックがイントラ符号化されているか、インター符号化されているか、或いはどのようなブロック形状で予測が行われているか、予測の方向が単方向予測か双方向予測か、などの情報を含んでいる。ｍｂ＿ｔｙｐｅは、マクロブロックプレディクションシンタクスと更にマクロブロック内のサブブロックのシンタクスを示すサブマクロブロックプレディクションシンタクスなどに渡される。 FIG. 22 is a diagram illustrating an example of the macroblock layer syntax 1608. The mb_type shown in the figure indicates macroblock type information. That is, it contains information such as whether the current macroblock is intra-coded or inter-coded, which block shape is used for prediction, and whether the prediction direction is unidirectional or bidirectional. The mb_type is passed to the macroblock prediction syntax and to the sub-macroblock prediction syntax, which describes the syntax of the sub-blocks in the macroblock.

図２３は、マクロブロックプレディクションシンタクスの例を示す図である。図中に示されるＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＭｂは当該画素ブロックで幾何変換予測が利用できるかどうかを示す内部パラメータである。ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＭｂが０の場合、判定パラメータ１２５で算出した各種値によって、幾何変換予測を利用しないように予測切替情報１２２が設定されていることを意味する。また、隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つときもＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＭｂは０となる。 FIG. 23 is a diagram illustrating an example of macroblock prediction syntax. AvailAffineModeMb shown in the figure is an internal parameter indicating whether or not geometric transformation prediction can be used in the pixel block. When AvailAffineModeMb is 0, it means that the prediction switching information 122 is set not to use the geometric transformation prediction by various values calculated by the determination parameter 125. Also, AvailAffineModeMb is 0 when the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value.

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＭｂが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが符号化される。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが１の場合、当該画素ブロックに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。 On the other hand, when AvailAffineModeMb is 1, mb_affine_pred_flag indicating whether to use geometric transformation prediction or motion compensation prediction is encoded. When mb_affine_pred_flag is 1, it means that geometric transformation prediction is applied to the pixel block. When mb_affine_pred_flag is 0, it means that motion compensation prediction is applied.
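The conditional signaling described above can be sketched as follows. This is a minimal illustration, not the syntax table of FIG. 23: the function name `read_affine_pred_flag` and the `read_bit` callback are hypothetical, and the sketch assumes that the flag is simply absent from the bitstream when AvailAffineModeMb is 0, so that motion compensation prediction is implied.

```python
def read_affine_pred_flag(avail_affine_mode_mb, read_bit):
    """Return 1 for geometric transformation prediction, 0 for motion compensation.

    avail_affine_mode_mb: internal parameter (0 or 1), derived, not signaled.
    read_bit: hypothetical callback returning the next flag bit from the stream.
    """
    if avail_affine_mode_mb == 0:
        # Assumption: no flag is coded, so no bit is consumed and
        # motion compensation prediction is used.
        return 0
    # The flag is present only when geometric transformation prediction is available.
    return read_bit()
```

Because the availability parameter is derived on both sides from already known values, no extra bit is spent in blocks where geometric transformation prediction cannot be used.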

NumMbPart() is an internal function that returns the number of block partitions specified by mb_type. It outputs 1 for a 16×16 pixel block, 2 for a 16×8 or 8×16 pixel block, and 4 for an 8×8 pixel block.
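The mapping performed by NumMbPart() can be illustrated with a small lookup. As a simplification, this sketch keys on the partition shape (width, height) rather than on an mb_type code, since the mb_type code values are not given in this section; the name `num_mb_part` is illustrative.

```python
# Partition shape (width, height) in pixels -> number of partitions in a 16x16 macroblock.
_PARTITIONS = {
    (16, 16): 1,
    (16, 8): 2,
    (8, 16): 2,
    (8, 8): 4,
}

def num_mb_part(width, height):
    """Return the number of partitions for the given partition shape."""
    try:
        return _PARTITIONS[(width, height)]
    except KeyError:
        raise ValueError(f"unsupported partition shape {width}x{height}")
```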

図２４は、サブマクロブロックプレディクションシンタクスの例を示す図である。図中に示されるＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂは当該画素ブロックで幾何変換予測が利用できるかどうかを示す内部パラメータである。ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂが０の場合、判定パラメータ１２５で算出した各種値によって、幾何変換予測を利用しないように予測切替情報１２２が設定されていることを意味する。また、隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つときもＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂは０となる。 FIG. 24 is a diagram illustrating an example of sub macroblock prediction syntax. AvailAffineModeSubMb shown in the figure is an internal parameter indicating whether or not geometric transformation prediction can be used in the pixel block. When AvailAffineModeSubMb is 0, it means that the prediction switching information 122 is set not to use the geometric transformation prediction by various values calculated by the determination parameter 125. The AvailAffineModeSubMb is also 0 when the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value.

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが符号化される。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが１の場合、当該画素ブロックに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。ＮｕｍＳｕｂＭｂＰａｒｔ()は、ｍｂ＿ｔｙｐｅに規定されたブロック分割数を返す内部関数である。 On the other hand, when AvailAffineModeSubMb is 1, mb_affine_pred_flag indicating whether to use geometric transformation prediction or motion compensation prediction is encoded. When mb_affine_pred_flag is 1, it means that geometric transformation prediction is applied to the pixel block. When mb_affine_pred_flag is 0, it means that motion compensation prediction is applied. NumSubMbPart () is an internal function that returns the number of block divisions defined in mb_type.

FIG. 25 is a diagram illustrating an example of macroblock layer syntax as an example of encoding parameters. The coded_block_pattern shown in the figure indicates, for each 8×8 pixel block, whether transform coefficients exist. For example, a value of 0 means that the target block has no transform coefficients. The mb_qp_delta in the figure carries information on the quantization parameter: it represents the difference from the quantization parameter of the block encoded immediately before the target block.

The ref_idx_l0 and ref_idx_l1 in the figure are reference image indexes that indicate which reference image was used to predict the target block when inter prediction is selected. The mv_l0 and mv_l1 in the figure indicate motion vector information. The transform_8x8_flag in the figure is transform information indicating whether the target block uses the 8×8 transform.

Note that syntax elements not defined in the present embodiment may be inserted between the rows of the syntax tables shown in FIGS. 19 to 25, and descriptions of other conditional branches may be included. A syntax table may also be divided into a plurality of tables, or a plurality of syntax tables may be integrated. The same terms need not be used, and they may be changed arbitrarily depending on the form of use. Furthermore, each syntax element described in the macroblock layer syntax may be changed so as to be specified in the macroblock data syntax described later.
The above is the description of the video encoding apparatus 100 according to the first embodiment.

(Second Embodiment: Addition of Intra Prediction)
FIG. 26 shows the structure of the video encoding apparatus 200 used in the second embodiment. Blocks having the same functions as in the first embodiment are given the same reference numerals, and their description is omitted here. In FIG. 26, an intra prediction unit 201, a mode determination unit 202, and a prediction mode separation switch 203 are newly added to the functions of the video encoding apparatus 100.

The intra prediction unit 201 generates a predicted image signal 123 using only information within the picture, based on the input reference image signal 120. H.264 intra prediction is described here as an example of the prediction modes of the intra prediction unit 201. The pixel blocks used for H.264 intra prediction are shown in FIGS. 27A to 27C. FIG. 27A shows the pixel block used for 16×16 pixel intra prediction, FIG. 27B that for 4×4 pixel intra prediction, and FIG. 27C that for 8×8 pixel intra prediction. H.264 defines these three types of intra prediction. In intra prediction, interpolated pixels are created from the reference image signal 120 stored in the reference image memory 105 and copied in the spatial direction to generate the predicted values.

<Mode determination unit 202>
Next, an outline of the mode determination unit 202 is given. The mode determination unit 202 outputs prediction mode switching information 204 to the prediction mode separation switch 203 according to the information of the slice currently being encoded. The prediction mode switching information 204 describes whether the switch is to be connected to the output terminal of the intra prediction unit 201 or to that of the inter prediction unit 130.

Next, the function of the mode determination unit 202 is described. When the slice currently being encoded is an intra-coded slice, the mode determination unit 202 connects the output terminal of the prediction mode separation switch 203 to the intra prediction unit 201. When the slice currently being encoded is an inter-coded slice, the mode determination unit 202 determines whether to connect the prediction mode separation switch 203 to the output terminal of the intra prediction unit 201 or to that of the inter prediction unit 130.

More specifically, in the above case the mode determination unit 202 performs mode determination using, for example, the cost of equation (29). Let OH be the code amount of the prediction information required when each prediction mode is selected (for example, the code amount of the motion vector 119 or of the block shape), and let SAD be the sum of absolute differences between the input image signal 114 and the predicted image signal 123, that is, the cumulative absolute sum of the prediction error signal 115. The following mode determination equation (29) is then used.

ここでＫはコスト、λは定数をそれぞれ表す。λは量子化スケールや量子化パラメータの値に基づいて決められるラグランジュ未定乗数である。式（２９）により得られたコストＫを基に、モード判定が行われる。すなわち、コストＫが最も小さい値を与えるモードが最適な予測モードとして選択される。 Here, K represents a cost, and λ represents a constant. λ is a Lagrangian undetermined multiplier determined based on the quantization scale and the value of the quantization parameter. Mode determination is performed based on the cost K obtained by the equation (29). That is, the mode that gives the smallest value of cost K is selected as the optimal prediction mode.
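The body of equation (29) does not survive in this text. The sketch below assumes the customary form K = SAD + λ × OH, which matches the symbols defined above but is an assumption rather than a quotation of the patent. The function names and the candidate dictionary layout are illustrative only.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(o - p) for o, p in zip(original, predicted))

def mode_cost(original, predicted, overhead_bits, lam):
    """Assumed form of equation (29): K = SAD + lambda * OH."""
    return sad(original, predicted) + lam * overhead_bits

def select_mode(original, candidates, lam):
    """Pick the mode with the smallest cost K.

    candidates: dict mapping a mode name to (predicted_pixels, overhead_bits).
    """
    return min(
        candidates,
        key=lambda name: mode_cost(original, candidates[name][0], candidates[name][1], lam),
    )
```

For example, with `original = [10, 20, 30, 40]`, an intra candidate `([10, 20, 30, 44], 2)` and an inter candidate `([10, 20, 30, 40], 10)`, a large λ favors the candidate with less side information and a small λ favors the candidate with the smaller prediction error.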

Instead of equation (29), the mode determination unit 202 may perform mode determination using (a) only the prediction information or (b) only the SAD, and may use a value obtained by applying a Hadamard transform to (a) or (b), or a value approximating it. Furthermore, the mode determination unit 202 may create the cost using the activity of the input image signal 114, that is, the variance of the signal values, or may create the cost function using the quantization scale or the quantization parameter.

As yet another example, a provisional encoding unit may be provided, and mode determination may be performed using the code amount obtained when the prediction error signal 115 generated in a given prediction mode is actually encoded by the provisional encoding unit, together with the squared error between the input image signal 114 and the decoded image signal 118. The mode determination equation in this case is the following equation (30).

式（３０）において、Ｊは符号化コスト、Ｄは入力画像信号１１４と復号画像信号１１８との間の二乗誤差を表す符号化歪みである。一方、Ｒは仮符号化によって見積もられた符号量を表している。 In Equation (30), J is the encoding cost, and D is the encoding distortion representing the square error between the input image signal 114 and the decoded image signal 118. On the other hand, R represents a code amount estimated by provisional encoding.
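The body of equation (30) is not reproduced in this text. A form consistent with the definitions of J, D, and R given here would be the usual rate-distortion cost below; this is an assumed reconstruction, and it further assumes that λ is the same kind of Lagrange multiplier as in equation (29).

```latex
J = D + \lambda \times R
```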

式（３０）の符号化コストＪを用いると、予測モード毎に仮符号化と局部復号処理が必要となるため、回路規模または演算量は増大する。反面、より正確な符号量と符号化歪みを用いるため、高い符号化効率を維持することができる。式（３０）に代えてＲのみ、またはＤのみを用いてコストを算出してもよく、また、ＲまたはＤを近似した値を用いてコスト関数を作成してもよい。 When the encoding cost J of Expression (30) is used, provisional encoding and local decoding processing are required for each prediction mode, so that the circuit scale or the amount of calculation increases. On the other hand, since a more accurate code amount and encoding distortion are used, high encoding efficiency can be maintained. The cost may be calculated using only R or only D instead of Expression (30), and the cost function may be created using a value approximating R or D.

As described above, it is determined whether to select the predicted image signal generated by the intra prediction unit 201 or that generated by the inter prediction unit 130, and the output terminal of the prediction mode separation switch 203 is switched accordingly. The predicted image signal 123 of the selected prediction mode is output to the subtracter 101 and to the adder 104.
The above is the description of the processing of the video encoding apparatus 200 according to the present embodiment.
(Third Embodiment: Re-search for Motion Vectors)

FIG. 28 shows the structure of the inter prediction unit 300 used in the third embodiment. Blocks having the same functions as in the first embodiment are given the same reference numerals, and their description is omitted here. Because its input and output signals are identical to those of the inter prediction unit 130, the inter prediction unit 300 can be used in place of the inter prediction unit 130 included in the video encoding apparatus 100 and the video encoding apparatus 200.

図２８では、インター予測部３００の機能に加え、新たに動きベクトル再探索部３０１が追加されている。動きベクトル再探索部３０１は、入力されてきた動きベクトル１１９を基準として、更に周辺部分の動きベクトルの再探索を行う。 In FIG. 28, in addition to the function of the inter prediction unit 300, a motion vector re-search unit 301 is newly added. The motion vector re-search unit 301 further re-searches the motion vectors in the peripheral part with the input motion vector 119 as a reference.

外部から入力された予測対象ブロックの動きベクトル１１９と符号化制御部１２６に保持されていた隣接画素ベクトルとが幾何変換パラメータ導出部１０８へと入力され、幾何変換パラメータ１２１が算出される。幾何変換パラメータ１２１は幾何変換予測部１０９へと入力されて幾何変換予測が行われ、予測画像信号が生成される。 The motion vector 119 of the prediction target block inputted from the outside and the adjacent pixel vector held in the encoding control unit 126 are inputted to the geometric transformation parameter deriving unit 108, and the geometric transformation parameter 121 is calculated. The geometric transformation parameter 121 is input to the geometric transformation prediction unit 109 to perform geometric transformation prediction, and a predicted image signal is generated.

The motion vector re-search unit 301 changes the motion vector of the prediction target block within the search range given by the encoding control unit 126, using the input motion vector 119 as the starting point, and re-derives the geometric transformation parameters. The re-derived geometric transformation parameters are input to the geometric transformation prediction unit 109, geometric transformation prediction is performed again, and a predicted image signal is generated. In this way, geometric transformation prediction is performed for every position in the re-search range, using the motion vectors of the prediction target block generated by the motion vector re-search unit 301.

再探索の際に、式（２９）又は式（３０）を用いてコストが算出され、コストが小さい動きベクトルと予測画像信号のみが残される。再探索範囲内の幾何変換予測が完了すると、最小のコストを与える、動きベクトルと予測画像信号が予測分離スイッチ１１０へと出力される。予測対象ブロックの動きベクトルを予測誤差が小さくなるように変更することによって符号化効率を向上させることが可能となる。 In the re-search, the cost is calculated using the formula (29) or the formula (30), and only the motion vector and the prediction image signal with the low cost are left. When the geometric transformation prediction within the re-search range is completed, a motion vector and a predicted image signal that give the minimum cost are output to the prediction separation switch 110. Coding efficiency can be improved by changing the motion vector of the prediction target block so that the prediction error becomes small.
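The re-search loop described above can be sketched as an exhaustive search around the input motion vector. In this sketch `cost_of` is a hypothetical callback standing in for the whole chain of re-deriving the geometric transformation parameters, generating the predicted image, and evaluating the cost of equation (29) or (30). The square search window and integer step are also assumptions.

```python
def refine_motion_vector(mv, search_range, cost_of):
    """Return (best_mv, best_cost) within +/- search_range of the input vector.

    mv: (mvx, mvy) input motion vector used as the search center.
    cost_of: hypothetical callable mapping a candidate (mvx, mvy) to its cost.
    """
    best_mv, best_cost = mv, cost_of(mv)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            candidate = (mv[0] + dx, mv[1] + dy)
            cost = cost_of(candidate)
            # Keep only the cheapest candidate seen so far.
            if cost < best_cost:
                best_mv, best_cost = candidate, cost
    return best_mv, best_cost
```

Only the running minimum is kept, which mirrors the statement that only the motion vector and predicted image signal with the small cost are retained during the re-search.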

(Fourth Embodiment: Signaling the Motion Vector Difference)
The configuration of the video encoding apparatus according to the fourth embodiment is the same as that of the third embodiment. In addition to the operation of the video encoding apparatus of the third embodiment, the video encoding apparatus of the fourth embodiment encodes the difference between the motion vector of the prediction target block and the motion vector calculated by the re-search. When the motion vector re-searched by the motion vector re-search unit 301 is used for geometric transformation motion compensated prediction, the original motion vector is changed in order to reduce the prediction error. If this re-searched motion vector is then used as an adjacent motion vector, the derivation of geometric transformation parameters for subsequent blocks may become inappropriate.

Therefore, the original motion vector and the amount of deviation from it are encoded separately, so that both the re-searched motion vector needed to derive the geometric transformation parameters and the original motion vector needed as an adjacent motion vector are retained. This encoding process is preferably performed by the entropy encoding unit 112.

The syntax changes according to the present embodiment are shown in FIGS. 29 and 30. Among the characters in the syntax elements, L0 and l0 denote a motion vector on the reference image L0, and L1 and l1 denote a motion vector on the reference image L1. Let the original motion vector before the re-search be mv_org = (mvx_org, mvy_org) and the motion vector after the re-search be mv_refine = (mvx_refine, mvy_refine). The motion vector difference mvd_affine[2] is then calculated by the following equation.

Here, mvd_affine[0] is the motion vector difference in the vertical direction and mvd_affine[1] is that in the horizontal direction; both correspond to syntax elements. The characters l0 and l1 that appear in the syntax of FIGS. 29 and 30 are omitted here, but the motion vector difference of equation (31) is calculated for each of the L0 and L1 indexes. In addition, mvd_l0 and mvd_l1 are calculated as the difference between the motion vector corresponding to each reference image signal and a predicted value of that motion vector, and are generated using other motion vector prediction techniques not described in this embodiment.
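The body of equation (31) is not reproduced in this text. The sketch below assumes the difference is taken as the re-searched vector minus the original vector; the sign convention is an assumption, and the only requirement is that encoder and decoder agree on it. The index order ([0] vertical, [1] horizontal) follows the description above. The function names are illustrative.

```python
def mvd_affine(mv_org, mv_refine):
    """Difference between re-searched and original vectors (assumed sign).

    mv_org, mv_refine: (mvx, mvy) tuples.
    Returns [vertical difference, horizontal difference].
    """
    mvx_org, mvy_org = mv_org
    mvx_refine, mvy_refine = mv_refine
    return [mvy_refine - mvy_org, mvx_refine - mvx_org]

def reconstruct_refined(mv_org, mvd):
    """Recover the re-searched vector from the original vector and the difference."""
    return (mv_org[0] + mvd[1], mv_org[1] + mvd[0])
```

With both values available, the re-searched vector can be used for deriving the geometric transformation parameters of the current block while the original vector is kept for later use as an adjacent motion vector.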

When encoding of the prediction target block is completed, the motion vector before the re-search, mv_org = (mvx_org, mvy_org), is stored in the internal memory of the encoding control unit 126. By separately holding the motion vector used in the prediction target block and the motion vector used when the block later serves as an adjacent block, propagation of geometric transformation parameter errors from adjacent blocks can be prevented.

As described above, the first to fourth embodiments prevent excessive block division, and the resulting increase in block division information, when predicting a moving object that does not fit rectangular blocks. Without increasing additional information, the moving area and the background area within a block are separated and the optimal prediction method is applied to each, which improves coding efficiency and also subjective image quality.

<Video decoding apparatus>
Next, fifth to seventh embodiments relating to video decoding are described.

(Fifth Embodiment)
FIG. 31 shows a video decoding apparatus according to the fifth embodiment. The video decoding apparatus 400 of FIG. 31 decodes, for example, encoded data generated by the video encoding apparatus according to the first embodiment.

The video decoding apparatus 400 of FIG. 31 decodes the encoded data 411 stored in the input buffer 401 and outputs a decoded image signal 420 to the output buffer 419. The encoded data 411 is multiplexed encoded data that is sent out from, for example, the video encoding apparatus 100, passes through a storage system or a transmission system, and is stored once in the input buffer 401.

動画像復号化装置４００は、符号化データ復号部４０２、逆量子化・逆変換部４０３、加算器４０４、参照画像メモリ４０５、動き補償予測部４０６、幾何変換パラメータ導出部４０７、幾何変換予測部４０８、判定パラメータ導出部４２２、予測切替部４０９、及び、予測分離スイッチ４１０、を有する。動画像復号化装置４００は、また、入力バッファ４０１、出力バッファ４１９、及び、復号化制御部４２１と接続される。 The moving picture decoding apparatus 400 includes an encoded data decoding unit 402, an inverse quantization / inverse transform unit 403, an adder 404, a reference image memory 405, a motion compensation prediction unit 406, a geometric transformation parameter derivation unit 407, and a geometric transformation prediction unit. 408, a determination parameter deriving unit 422, a prediction switching unit 409, and a prediction separation switch 410. The video decoding device 400 is also connected to the input buffer 401, the output buffer 419, and the decoding control unit 421.

符号化データ復号部４０２は、符号化データを１フレーム又は１フィールド毎にシンタクスに基づいて構文解析による解読を行う。符号化データ復号部４０２は、順次各シンタクスの符号列をエントロピー復号化し、動きベクトル４１５、及び、対象ブロックの符号化パラメータ等を再生する。符号化パラメータとは、予測情報、変換係数に関する情報、量子化に関する情報、等の復号の際に必要になるパラメータである。 The encoded data decoding unit 402 decodes the encoded data by syntax analysis based on the syntax for each frame or field. The encoded data decoding unit 402 sequentially entropy-decodes the code string of each syntax, and reproduces the motion vector 415, the encoding parameter of the target block, and the like. The encoding parameter is a parameter required for decoding prediction information, information on transform coefficients, information on quantization, and the like.

The transform coefficients decoded by the encoded data decoding unit 402 are input to the inverse quantization / inverse transform unit 403. The various items of quantization information decoded by the encoded data decoding unit 402, that is, the quantization parameter and the quantization matrix, are set in the internal memory of the decoding control unit 421 and are loaded when they are used in the inverse quantization process.

Using the loaded quantization information, the inverse quantization / inverse transform unit 403 first performs inverse quantization. The inverse-quantized transform coefficients are then subjected to an inverse transform, for example an inverse discrete cosine transform. An inverse orthogonal transform has been described here, but when a wavelet transform or the like is performed in the encoding apparatus, the inverse quantization / inverse transform unit 403 should execute the corresponding inverse quantization and inverse wavelet transform.

逆量子化・逆変換部４０３を通って、復元された予測誤差信号４１２は加算器４０４へと入力される。加算器４０４は、予測誤差信号４１２と後述する動き補償予測部４０６又は幾何変換予測部４０８で生成された予測画像信号４１８とを加算し、復号画像信号４２０を生成する。 The restored prediction error signal 412 is input to the adder 404 through the inverse quantization / inverse transform unit 403. The adder 404 adds a prediction error signal 412 and a predicted image signal 418 generated by a motion compensation prediction unit 406 or a geometric transformation prediction unit 408 described later to generate a decoded image signal 420.
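The role of the adder 404 can be sketched as a sample-wise sum of the restored prediction error and the predicted image. Clipping the result to the valid sample range is an assumption added here (it is common practice but is not stated in this section), as is the default 8-bit depth. The function name is illustrative.

```python
def reconstruct(prediction_error, predicted, bit_depth=8):
    """Add the prediction error to the prediction, sample by sample.

    The clip to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        min(max(error + pred, 0), max_value)
        for error, pred in zip(prediction_error, predicted)
    ]
```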

生成された復号画像信号４２０は、動画像復号化装置４００から出力されて、出力バッファ４１９に一旦蓄積された後、復号化制御部４２１が管理する出力タイミングに従って出力される。また、この復号画像信号４２０は参照画像メモリ４０５へと保存され、参照画像信号４１３となる。 The generated decoded image signal 420 is output from the moving image decoding apparatus 400, temporarily stored in the output buffer 419, and then output according to the output timing managed by the decoding control unit 421. The decoded image signal 420 is stored in the reference image memory 405 and becomes a reference image signal 413.

The reference image signal 413 is read from the reference image memory 405 sequentially, frame by frame or field by field, and is input to the motion compensation prediction unit 406 or the geometric transformation prediction unit 408. The motion vector 415 and the geometric transformation parameter 414 used in the target pixel block are stored in the decoding control unit 421, and are loaded and used as needed by the geometric transformation parameter deriving unit 407 and the determination parameter deriving unit 422 described later.

Note that the motion compensation prediction unit 406, the geometric transformation parameter deriving unit 407, the geometric transformation prediction unit 408, the determination parameter deriving unit 422, the prediction switching unit 409, and the prediction separation switch 410 in FIG. 31 have the same functions and configurations as the units of the same names shown in FIG. 2. More specifically, they are identical except that the motion vector input to the motion compensation prediction unit 107 is obtained by the motion estimation unit 106, whereas the motion vector input to the motion compensation prediction unit 406 is decoded by the encoded data decoding unit 402.

FIG. 32 is a diagram showing an example in which the motion compensation prediction unit 406, the geometric transformation parameter deriving unit 407, the geometric transformation prediction unit 408, the determination parameter deriving unit 422, the prediction switching unit 409, and the prediction separation switch 410 of FIG. 31 are replaced with the inter prediction unit 130. Each of these units has the same function and configuration as the corresponding unit of the inter prediction unit 130 shown in FIG. 2. Therefore, the configuration shown in FIG. 31 can be realized by having the video decoding apparatus 400 include the inter prediction unit 130 of the video encoding apparatus 100.

<Geometric transformation parameter deriving unit 407>
The geometric transformation parameter deriving unit 407 derives the geometric transformation parameters of the prediction target block using the motion vector 415 of the prediction target block decoded by the encoded data decoding unit 402 and the motion vectors stored in the decoding control unit 421 (hereinafter referred to as "adjacent motion vectors").

The relationship of adjacent blocks to the prediction target block is described with reference to FIG. 4 and FIGS. 9 to 13. In the description of FIG. 4 and FIGS. 9 to 13, the terms "block to be encoded or predicted", "pixel block for which encoding or prediction has been completed", and "unencoded or unpredicted pixel block" used in the first to fourth embodiments are read in the fifth to seventh embodiments as "block to be decoded or predicted", "pixel block for which decoding or prediction has been completed", and "undecoded or unpredicted pixel block", respectively.

≪Derivation of adjacent blocks and adjacent motion vectors (Part 1): when the block sizes are the same≫
FIGS. 9A to 9E are diagrams explaining the relationship of adjacent blocks to the prediction target block. FIG. 9A shows an example in which the prediction target block and the adjacent blocks have the same size (for example, 16×16 pixel blocks).

In FIG. 9A, the pixel blocks p with diagonal hatching are pixel blocks for which decoding or prediction has already been completed (hereinafter referred to as "predicted pixel blocks"), the block c with dot hatching is the prediction target block, and the pixel blocks n shown in white are undecoded (unpredicted) pixel blocks. X in the figure denotes the pixel block to be decoded (predicted).

復号化制御部４２１の内部メモリに保持されている隣接動きベクトルは、予測済画素ブロックの動きベクトルのみである。図４では、復号化処理をされている復号化フレームｆにおいて、復号化対象となるブロックｃよりも左及び上に位置するブロックが、復号済みブロックｐである。図４で示したように画素ブロックは左上から右下に向かって復号化及び予測の処理がされていくため、画素ブロックＸの予測を行う際には、右及び下方向の画素ブロックは未だ復号化が行われていない。そこで、これらの隣接ブロックから隣接動きベクトルを導出することができない。 The adjacent motion vector held in the internal memory of the decoding control unit 421 is only the motion vector of the predicted pixel block. In FIG. 4, in the decoded frame f subjected to the decoding process, a block located on the left and above the block c to be decoded is a decoded block p. As shown in FIG. 4, since the pixel block is decoded and predicted from the upper left to the lower right, when the pixel block X is predicted, the right and lower pixel blocks are still decoded. Has not been made. Therefore, an adjacent motion vector cannot be derived from these adjacent blocks.
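The availability rule described above follows directly from the raster processing order. The sketch below assumes block coordinates counted in block units from the top-left of the frame and a fixed number of blocks per row. The function name and parameters are illustrative and not taken from the patent.

```python
def is_decoded(nx, ny, cx, cy, blocks_per_row):
    """Return True if block (nx, ny) is processed before the current block (cx, cy).

    Blocks are processed left to right, top to bottom (raster order).
    Positions outside the frame are never available.
    """
    if nx < 0 or ny < 0 or nx >= blocks_per_row:
        return False
    # Earlier row, or same row and further to the left.
    return ny < cy or (ny == cy and nx < cx)
```

For a current block at (2, 1) in a frame four blocks wide, the left, upper, and upper-right neighbors are available, while the right and lower neighbors are not, which is why no adjacent motion vector can be taken from them.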

マクロブロックの内部も同様に左上から右下に向かって復号化処理が行われるため、８×８画素ブロックの復号化順序に応じて隣接ブロックの位置が変化する。対応する８×８画素ブロックの復号化処理又は予測画像生成処理が完了すると、その画素ブロックは復号化済み画素ブロックとなり、後に処理される画素ブロックの隣接ブロックとして利用される。図９Ｅでは、隣接ブロックＣに対応する右上の画素ブロックが未復号化画素ブロックであるため、復号化済み画素ブロックの右上に位置する画素ブロックを隣接ブロックとする。 Similarly, the decoding process is performed from the upper left to the lower right inside the macro block, so that the position of the adjacent block changes according to the decoding order of the 8 × 8 pixel block. When the decoding process or predicted image generation process for the corresponding 8 × 8 pixel block is completed, the pixel block becomes a decoded pixel block and is used as an adjacent block of the pixel block to be processed later. In FIG. 9E, since the upper right pixel block corresponding to the adjacent block C is an undecoded pixel block, the pixel block located at the upper right of the decoded pixel block is set as the adjacent block.

図１１Ａないし図１１Ｄは、予測対象ブロックが小さく、隣接ブロックのブロックサイズが大きい場合の例を説明する図である。図９Ｅと同様に、対応する画素ブロックが未復号化画素ブロックである場合は、予測対象ブロックに距離的に近い利用可能な復号化済みの画素ブロックで置き換える。 FIG. 11A to FIG. 11D are diagrams for explaining an example when the prediction target block is small and the block size of the adjacent block is large. Similarly to FIG. 9E, when the corresponding pixel block is an undecoded pixel block, it is replaced with an available decoded pixel block that is close in distance to the prediction target block.

また、インター予測では、マクロブロック内の復号化順序に依存せずに復号化処理、すなわち、動きベクトルの推定を行うことが可能なため、８×８画素ブロックの場合においても、図１０Ａないし図１０Ｄのいずれかを用いて隣接ブロックを決定してもよい。また、ブロックサイズの大きさが異なる画素ブロックが混在している場合にも、図１０Ａないし図１０Ｄのいずれかを用いて隣接ブロックを決定してもよい。 In inter prediction, the decoding process, that is, the motion vector estimation, can be performed without depending on the decoding order within the macroblock. Therefore, even in the case of 8 × 8 pixel blocks, the adjacent blocks may be determined using any of FIGS. 10A to 10D. Likewise, when pixel blocks of different block sizes are mixed, the adjacent blocks may be determined using any of FIGS. 10A to 10D.

≪幾何変換パラメータの導出≫ ≪Derivation of geometric transformation parameters≫
次に幾何変換パラメータ導出部４０７における幾何変換パラメータ４１４の導出方法について説明する。隣接ブロックが保持する隣接動きベクトルをそれぞれ式（４）ないし（７）により定義する。 Next, a method for deriving the geometric transformation parameter 414 in the geometric transformation parameter deriving unit 407 will be described. The adjacent motion vectors held by the adjacent blocks are defined by Equations (4) to (7), respectively.

式（４）ないし（８）で表される動きベクトル及び隣接動きベクトルを用いて、幾何変換パラメータ１２１を導出する。幾何変換がアフィン変換の場合には、変換式は式（９）で表される。式（９）では、座標（ｘ、ｙ）がアフィン変換によって座標（ｕ，ｖ）へ変換される。式（９）に含まれるａ、ｂ、ｃ、ｄ、ｅ、ｆの６個のパラメータが幾何変換パラメータを表している。アフィン変換ではこの６種類のパラメータを推定するため、６個以上の入力値が必要となる。 The geometric transformation parameter 121 is derived using the motion vector and the adjacent motion vector represented by the equations (4) to (8). When the geometric transformation is an affine transformation, the transformation formula is expressed by Equation (9). In equation (9), coordinates (x, y) are converted to coordinates (u, v) by affine transformation. Six parameters a, b, c, d, e, and f included in Expression (9) represent geometric transformation parameters. In the affine transformation, since these six types of parameters are estimated, six or more input values are required.
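
Because Equation (9) is not reproduced in this text, the following Python sketch only illustrates a six-parameter affine mapping of the kind described. The parameter layout u = a·x + b·y + c, v = d·x + e·y + f is an assumption chosen so that a, b, d, and e form the linear part, and the function name `affine_map` is hypothetical.

```python
def affine_map(params, x, y):
    """Map (x, y) to (u, v) with six affine parameters.

    Assumed layout (not taken from the specification):
        u = a*x + b*y + c
        v = d*x + e*y + f
    """
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


# A pure translation by (2.5, -1.0): the linear part is the identity.
identity_shift = (1.0, 0.0, 2.5, 0.0, 1.0, -1.0)
print(affine_map(identity_shift, 4, 8))  # (6.5, 7.0)
```

Each point correspondence contributes two equations (one for u, one for v), which is why three motion vectors are enough to determine the six parameters.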

隣接ブロックＡ、Ｂ及び予測対象ブロックＸのそれぞれの動きベクトルを用いると、式（１０）により幾何変換パラメータが導出される。ここでは、動きベクトルが１／４精度であることを前提としている。但し、ａｘ、ａｙは予測対象ブロックのサイズに基づく変数であり、式（１１）により算出される。 When the motion vectors of the adjacent blocks A and B and the prediction target block X are used, a geometric transformation parameter is derived by Expression (10). Here, it is assumed that the motion vector is ¼ precision. However, ax and ay are variables based on the size of the prediction target block, and are calculated by Expression (11).

式（１１）において、ｍｂ＿ｓｉｚｅ＿ｘ及びｍｂ＿ｓｉｚｅ＿ｙはマクロブロックの水平、垂直方向のサイズを示しており、１６×１６画素ブロックの場合には、ｍｂ＿ｓｉｚｅ＿ｘ＝１６、ｍｂ＿ｓｉｚｅ＿ｙ＝１６となる。また、ｂｌｋ＿ｓｉｚｅ＿ｘ及びｂｌｋ＿ｓｉｚｅ＿ｙは予測対象ブロックの水平、垂直サイズを表しており、図９Ｂの場合は、ｂｌｋ＿ｓｉｚｅ＿ｘ＝８、ｂｌｋ＿ｓｉｚｅ＿ｙ＝８となる。 In Expression (11), mb_size_x and mb_size_y indicate the size of the macroblock in the horizontal and vertical directions. In the case of a 16 × 16 pixel block, mb_size_x = 16 and mb_size_y = 16. Also, blk_size_x and blk_size_y represent the horizontal and vertical sizes of the prediction target block. In the case of FIG. 9B, blk_size_x = 8 and blk_size_y = 8.
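
As a hedged illustration of how six parameters can be obtained from three motion vectors, the sketch below solves the affine system for three anchor points. It is not Equation (10): the anchor positions, the function names `solve3` and `affine_from_motion`, and the parameter layout u = a·x + b·y + c, v = d·x + e·y + f are assumptions. Only the quarter-pel motion vector precision is taken from the text.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(n):
        return (n[0][0] * (n[1][1] * n[2][2] - n[1][2] * n[2][1])
                - n[0][1] * (n[1][0] * n[2][2] - n[1][2] * n[2][0])
                + n[0][2] * (n[1][0] * n[2][1] - n[1][1] * n[2][0]))
    d = det(m)
    if d == 0:
        raise ValueError("anchor points are collinear")
    out = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]  # replace column k with the right-hand side
        out.append(det(mk) / d)
    return out


def affine_from_motion(anchors, mvs_qpel):
    """Estimate (a, b, c, d, e, f) from three anchors and quarter-pel motion vectors."""
    m = [[x, y, 1.0] for x, y in anchors]
    # Each anchor moves to its own position plus its motion vector in pixels.
    us = [x + mvx / 4.0 for (x, _), (mvx, _) in zip(anchors, mvs_qpel)]
    vs = [y + mvy / 4.0 for (_, y), (_, mvy) in zip(anchors, mvs_qpel)]
    a, b, c = solve3(m, us)
    d, e, f = solve3(m, vs)
    return a, b, c, d, e, f


# Hypothetical anchors: block X at the origin, A 16 pixels to the left, B 16 pixels above.
anchors = [(0.0, 0.0), (-16.0, 0.0), (0.0, -16.0)]
params = affine_from_motion(anchors, [(8, 4), (8, 4), (8, 4)])
```

When all three motion vectors are identical, as here, the linear part reduces to the identity and only the translation (c, f) remains, so the result is equivalent to ordinary motion compensation.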

また、予測対象ブロックの左に位置する隣接ブロックの候補が複数存在する場合、これらの候補の中で動きベクトルのメディアン値を計算し、値が大きく異なる動きベクトルを除外してもよい。図１２は、４つの８×８画素ブロック毎に、メディアン値を計算する例を示す図である。予測対象ブロックＸのブロックサイズが大きく、隣接する画素ブロックのブロックサイズが小さい場合、隣接画素の候補となる画素ブロックが複数存在する。ここで、左に位置する４個の画素ブロックの動きベクトルを、それぞれ、ｍｖ_ａ、ｍｖ_ｂ、ｍｖ_ｃ、ｍｖ_ｄとすると、式（１２）により動きベクトルを決定する。 Further, when there are a plurality of adjacent block candidates located to the left of the prediction target block, the median value of the motion vectors of these candidates may be calculated, and motion vectors whose values differ greatly may be excluded. FIG. 12 is a diagram illustrating an example in which a median value is calculated over four 8 × 8 pixel blocks. When the block size of the prediction target block X is large and the block size of the adjacent pixel blocks is small, there are a plurality of pixel blocks that are candidates for the adjacent block. Here, letting the motion vectors of the four pixel blocks located on the left be mv_a, mv_b, mv_c, and mv_d, respectively, the motion vector is determined by Equation (12).

式（１２）では、二次元ベクトルのメディアン値を用いる例を示したが、式（１３）に示すように、ベクトルの要素毎のメディアン値でもよい。この他に、要素毎の平均値、二次元ベクトルのランダム値、又は、ベクトルの要素別のランダム値等を用いてもよい。また、隣接ブロックＢ、Ｃ、Ｄにおいても同様の処理を行って隣接動きベクトルを求めてもよい。 Although the example using the median value of the two-dimensional vector is shown in Expression (12), as shown in Expression (13), the median value for each vector element may be used. In addition, an average value for each element, a random value of a two-dimensional vector, a random value for each element of the vector, or the like may be used. Also, the adjacent motion vectors may be obtained by performing the same processing in the adjacent blocks B, C, and D.
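
The two median variants mentioned above can be sketched as follows. Equations (12) and (13) are not reproduced in this text, so both definitions are assumptions: the vector median is taken here as the candidate with the smallest summed L1 distance to the other candidates, and the per-element median uses the usual convention of averaging the two middle values when the number of candidates is even.

```python
from statistics import median


def componentwise_median(mvs):
    """Median taken separately over the horizontal and vertical components."""
    return median(v[0] for v in mvs), median(v[1] for v in mvs)


def vector_median(mvs):
    """Return the candidate with the smallest summed L1 distance to all candidates.

    Assumption: this is one common definition of a two-dimensional vector median.
    """
    def cost(v):
        return sum(abs(v[0] - w[0]) + abs(v[1] - w[1]) for w in mvs)
    return min(mvs, key=cost)


# Four left-neighbour motion vectors; the last one is an outlier.
left = [(4, 0), (5, 1), (6, 2), (40, -30)]
print(vector_median(left))         # (5, 1)
print(componentwise_median(left))  # (5.5, 0.5)
```

Both variants suppress the outlier (40, -30). The vector median always returns one of the actual candidates, whereas the per-element median may produce a vector that no neighbour holds.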

≪予測対象画素の分割≫ ≪Division of prediction target pixel≫
次に予測対象ブロックに対して幾何変換を実施する領域を説明する。幾何変換パラメータを導出する領域は、幾何変換を実施する領域に対応している。式（１０）では矩形画素ブロックに対して幾何変換パラメータを導出する例を示した。しかし、必ずしも矩形画素ブロックで幾何変換パラメータを導出する必要はなく、図１３で示す三角パッチで矩形ブロックを分割してもよい。図１３Ａ、及び、図１３Ｂは予測対象ブロックを対角線で分け２つの三角パッチで分割した例を示している。図１３Ａでは、それぞれの幾何変換パラメータを次式で導出する。予測対象三角パッチＸ１は、隣接ブロックＡ、Ｄ及び予測対象ブロックＸ１の動きベクトルを用いて式（１４）で定義される。 Next, the region of the prediction target block on which the geometric transformation is performed will be described. The region for which the geometric transformation parameter is derived corresponds to the region on which the geometric transformation is performed. Equation (10) showed an example in which the geometric transformation parameter is derived for a rectangular pixel block. However, the geometric transformation parameter need not be derived for a rectangular pixel block, and the rectangular block may be divided into the triangular patches shown in FIG. 13. FIGS. 13A and 13B show examples in which the prediction target block is divided along a diagonal into two triangular patches. In FIG. 13A, the respective geometric transformation parameters are derived by the following equations. The prediction target triangular patch X1 is defined by Equation (14) using the motion vectors of the adjacent blocks A and D and the prediction target block X1.
予測対象三角パッチＸ２は、隣接ブロックＢ、Ｄ及び予測対象ブロックＸ２の動きベクトルを用いて式（１５）で定義される。 The prediction target triangular patch X2 is defined by Equation (15) using the motion vectors of the adjacent blocks B and D and the prediction target block X2.

図１３Ｂの場合も、同様にして、幾何変換パラメータが導出できる。しかし、予測対象三角パッチＸ２は、未復号化画素ブロック側に空間的距離が近く、隣接ブロックとの空間的距離が遠い。一般的に、オブジェクトの動きは空間的相関が高いため、利用可能な隣接ブロックが多く取れるように分割形状を定義するとよい。 In the case of FIG. 13B as well, the geometric transformation parameters can be derived in the same manner. However, the prediction target triangular patch X2 is spatially close to the undecoded pixel blocks and spatially far from the adjacent blocks. In general, object motion has a high spatial correlation, so it is preferable to define the division shape so that as many adjacent blocks as possible can be used.

本実施の形態では、幾何変換の例としてアフィン変換を用いた例を示したが、共一次変換、ヘルマート変換、二次等角変換、射影変換、3次元射影変換、などのいずれの幾何変換を用いてもよい。例えば射影変換は、式（１６）で表される。 In this embodiment, an affine transformation is used as the example of geometric transformation, but any geometric transformation may be used, such as a bilinear transformation, a Helmert transformation, a quadratic conformal transformation, a projective transformation, or a three-dimensional projective transformation. For example, the projective transformation is expressed by Equation (16).

式（１６）において、分子分母をスカラーで通分すると、解くべきパラメータは８種類となる。そこで、利用可能な隣接ブロック数を多く定義することにより、アフィン変換と同様の枠組みで幾何変換パラメータを導出することが可能である。 In Equation (16), when the numerator and denominator are normalized by a common scalar, eight parameters remain to be solved. Accordingly, by defining a larger number of usable adjacent blocks, the geometric transformation parameters can be derived in the same framework as for the affine transformation.
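
Equation (16) is not reproduced in this text, so the sketch below only shows the usual eight-parameter form of a projective mapping after normalization. The parameter order `h[0]` to `h[7]` and the function name `projective_map` are assumptions made for illustration.

```python
def projective_map(h, x, y):
    """Map (x, y) to (u, v) with eight projective parameters.

    Assumed form (denominator constant normalized to 1):
        w = h6*x + h7*y + 1
        u = (h0*x + h1*y + h2) / w
        v = (h3*x + h4*y + h5) / w
    """
    w = h[6] * x + h[7] * y + 1.0
    if w == 0:
        raise ZeroDivisionError("point maps to infinity")
    return (h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w


# With h6 = h7 = 0 the mapping reduces to an affine transformation.
affine_like = (1.0, 0.0, 2.5, 0.0, 1.0, -1.0, 0.0, 0.0)
print(projective_map(affine_like, 4, 8))  # (6.5, 7.0)
```

Since each point correspondence supplies two equations, at least four motion vectors are needed to determine the eight parameters, one more than in the affine case.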

以上が、幾何変換パラメータ導出部４０７の処理の概要である。 The above is the outline of the processing of the geometric transformation parameter deriving unit 407.

＜幾何変換予測部４０８＞ <Geometric transformation prediction unit 408>
次に、幾何変換予測部４０８の処理について、図１５ないし図１７を用いて説明する。なお、図１５ないし図１７の説明において、第１ないし第４の実施の形態における「参照画像信号１２０」は、局所復号画像であったのに対し、第５ないし第７の実施の形態では、「参照画像信号４１３」は、復号画像である。 Next, processing of the geometric transformation prediction unit 408 will be described with reference to FIGS. 15 to 17. In the description of FIGS. 15 to 17, the “reference image signal 120” in the first to fourth embodiments was a locally decoded image, whereas the “reference image signal 413” in the fifth to seventh embodiments is a decoded image.

幾何変換予測部４０８は入力された幾何変換パラメータ４１４を基にして、参照画像信号４１３に対して幾何変換を実施する。図１５は、予測対象ブロックに対する幾何変換予測と動き補償予測の例を示す図である。図１５は、１６×１６画素ブロックの例である。 The geometric transformation prediction unit 408 performs geometric transformation on the reference image signal 413 based on the input geometric transformation parameter 414. FIG. 15 is a diagram illustrating an example of geometric transformation prediction and motion compensation prediction for a prediction target block. FIG. 15 is an example of a 16 × 16 pixel block.

動き補償後の領域と幾何変換後の領域は、参照画像信号の対応する領域を復号化対象のフレームの座標に合わせて記述している。このように、幾何変換予測を用いることによって、矩形画素ブロックの回転、拡大・縮小、せん断、鏡面変換などの変形に合わせた予測画像信号の生成が可能となる。 The region after motion compensation and the region after geometric transformation describe the corresponding region of the reference image signal according to the coordinates of the frame to be decoded. As described above, by using the geometric transformation prediction, it is possible to generate a predicted image signal in accordance with deformation such as rotation, enlargement / reduction, shearing, and mirror transformation of the rectangular pixel block.

幾何変換予測部１０９では、式（１０）、式（１４）、及び、式（１５）を用いて算出された幾何変換パラメータ４１４を用い、式（９）により、幾何変換後の座標（ｕ，ｖ）を算出する。算出された幾何変換後の座標（ｕ，ｖ）は、実数値である。そこで、座標（ｕ，ｖ）に対応する輝度値を参照画像信号から内挿補間することによって予測値を生成する。 The geometric transformation prediction unit 109 uses the geometric transformation parameters 414 calculated using Equations (10), (14), and (15) to calculate the coordinates (u, v) after geometric transformation by Equation (9). The calculated coordinates (u, v) after geometric transformation are real values. Therefore, the predicted value is generated by interpolating the luminance value corresponding to the coordinates (u, v) from the reference image signal.

図１６は、共一次内挿法による輝度値補間処理の例を示す図である。白丸ｃｗ０ないしｃｗ３は整数画素位置の輝度値を示し、黒丸ｃｂが補間画素位置（ｕ，ｖ）を示している。図１６では、分数精度の位置に隣接する周囲４つの整数画素値を用いて、それぞれの距離の比から補間画素値を生成する。共一次内挿法は式（１７）で表される。 FIG. 16 is a diagram illustrating an example of luminance value interpolation processing by bilinear interpolation. White circles cw0 to cw3 indicate luminance values at integer pixel positions, and black circles cb indicate interpolation pixel positions (u, v). In FIG. 16, the interpolated pixel value is generated from the ratio of the distances using the surrounding four integer pixel values adjacent to the fractional accuracy position. The bilinear interpolation method is expressed by Expression (17).

式（１７）において、Ｐ（ｕ,ｖ）は内挿補間処理後の予測画素値を示しており、Ｒ（ｘ，ｙ）は、利用した参照画像信号の整数画素値を表している。（ｘ−ｕ）＝Ｕ／６４、（ｙ−ｖ）＝Ｖ／６４とすると、式（１７）は、式（１８）に示す整数演算に変形できる。式（１８）において、ｆは丸めのオフセット（０≦ｆ＜２^１２）を表している。本実施の形態ではｆ＝０としている。 In Equation (17), P(u, v) represents the predicted pixel value after the interpolation process, and R(x, y) represents an integer pixel value of the reference image signal used. Letting (x − u) = U / 64 and (y − v) = V / 64, Equation (17) can be transformed into the integer operation shown in Equation (18). In Equation (18), f represents a rounding offset (0 ≦ f < 2^12). In this embodiment, f = 0.
以上のように、幾何変換を行った予測対象ブロック内の座標毎に内挿補間を適用することによって、新たな予測画像信号を生成する。 As described above, a new predicted image signal is generated by applying interpolation to each coordinate in the prediction target block subjected to geometric transformation.
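
The integer form of bilinear interpolation with 1/64-pixel weights can be sketched as follows. Equations (17) and (18) are not reproduced in this text, so the assignment of weights to the four surrounding samples follows the standard bilinear convention and is an assumption, as are the function name `bilinear_int` and the border clamping. The shift by 12 bits follows from the product of two 1/64 weights (64 × 64 = 2^12), which is consistent with the stated range of the rounding offset f.

```python
def bilinear_int(ref, u64, v64, f=0):
    """Predict one sample at position (u64/64, v64/64) using integer arithmetic.

    ref is a list of rows of integer samples; u64 and v64 are non-negative
    coordinates in 1/64-pixel units; f is the rounding offset (0 <= f < 2**12).
    """
    x0, fu = divmod(u64, 64)  # integer column and horizontal fraction (0..63)
    y0, fv = divmod(v64, 64)  # integer row and vertical fraction (0..63)
    x1 = min(x0 + 1, len(ref[0]) - 1)  # clamp at the right border (assumption)
    y1 = min(y0 + 1, len(ref) - 1)     # clamp at the bottom border (assumption)
    acc = (ref[y0][x0] * (64 - fu) * (64 - fv)
           + ref[y0][x1] * fu * (64 - fv)
           + ref[y1][x0] * (64 - fu) * fv
           + ref[y1][x1] * fu * fv
           + f)
    return acc >> 12  # divide by 64 * 64


ref = [[0, 64],
       [128, 192]]
print(bilinear_int(ref, 32, 32))  # 96, the average of the four samples
```

With f = 0, as in this embodiment, the result is truncated; choosing f = 2^11 would give round-to-nearest instead.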

なお、本実施の形態では、参照画像信号４１３の整数画素位置からの内挿補間についての例を説明したが、動き補償予測部４０６で、既に参照画像信号４１３の補間画像信号を生成している場合には、分数精度の補間画像信号を再利用しても良い。例えば、符号化データ復号部４０２が、１／４画素精度の動きベクトルを復号し、動き補償予測部４０６が、その動きベクトルに対応する参照画像信号４１３を４倍に拡大した拡大参照画像信号を生成して保持している場合には、１／４画素精度の補間画像を利用して１／６４画素精度の補間画像を生成してもよい。これにより、１／４精度の補間画像から更に１／１６精度の内挿補間処理を行って１／６４画素精度の補間画像を生成することができる。 In this embodiment, an example of interpolation from the integer pixel positions of the reference image signal 413 has been described. However, when the motion compensation prediction unit 406 has already generated an interpolated image signal of the reference image signal 413, the fractional-accuracy interpolated image signal may be reused. For example, when the encoded data decoding unit 402 decodes a motion vector with 1/4-pixel accuracy and the motion compensation prediction unit 406 generates and holds an enlarged reference image signal obtained by enlarging the reference image signal 413 corresponding to that motion vector by a factor of four, an interpolated image with 1/64-pixel accuracy may be generated from the interpolated image with 1/4-pixel accuracy. In this case, the 1/64-pixel-accuracy interpolated image is obtained by applying a further 1/16-accuracy interpolation to the 1/4-accuracy interpolated image.
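
The reuse of a quarter-pel interpolated image can be illustrated in one dimension as follows. The function names and the linear weighting between neighbouring quarter-pel samples are assumptions; the text states only that a further 1/16-accuracy interpolation is applied to the 1/4-accuracy image (1/4 × 1/16 = 1/64).

```python
def split_position(pos64):
    """Split a 1/64-pel coordinate into a quarter-pel sample index and a 1/16 remainder."""
    qpel_index, frac16 = divmod(pos64, 16)
    return qpel_index, frac16


def interp_from_qpel(row_qpel, pos64):
    """Interpolate linearly between two neighbouring quarter-pel samples.

    row_qpel is one row of the 4x-enlarged reference image (assumed layout).
    """
    q, r = split_position(pos64)
    nxt = min(q + 1, len(row_qpel) - 1)  # clamp at the end of the row (assumption)
    return (row_qpel[q] * (16 - r) + row_qpel[nxt] * r) >> 4


row = [0, 16, 32, 48, 64, 80]
print(split_position(70))         # (4, 6)
print(interp_from_qpel(row, 70))  # 70
```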

なお、内挿補間処理の画素精度は更に細かく指定することも可能である。この場合、指定した補間精度に応じて内挿補間処理を行えばよい。 Note that the pixel accuracy of the interpolation process can be specified more finely. In this case, the interpolation process may be performed according to the designated interpolation accuracy.
以上が、幾何変換予測部４０８の処理の概要である。 The above is the outline of the processing of the geometric transformation prediction unit 408.

＜判定パラメータ導出部４２２＞ <Determination parameter deriving unit 422>
次に判定パラメータ導出部４２２について具体的に説明する。判定パラメータ導出部４２２は、幾何変換パラメータ導出部４０７から出力された幾何変換パラメータ４１４を用いて、動き補償予測部４０６から出力された予測画像信号と幾何変換予測部４０８から出力された予測画像信号との何れの信号を出力するかを判定するための判定パラメータ４１６を生成し、予測切替部４０９へと出力する。幾何変換がアフィン変換の場合には、平行移動指標、回転指標、拡大・縮小指標、変形指標などを用いて、幾何変換の度合いを評価することができる。 Next, the determination parameter deriving unit 422 will be specifically described. Using the geometric transformation parameter 414 output from the geometric transformation parameter deriving unit 407, the determination parameter deriving unit 422 generates a determination parameter 416 for determining which of the predicted image signal output from the motion compensation prediction unit 406 and the predicted image signal output from the geometric transformation prediction unit 408 is to be output, and outputs it to the prediction switching unit 409. When the geometric transformation is an affine transformation, the degree of geometric transformation can be evaluated using a translation index, a rotation index, an enlargement/reduction index, a deformation index, and the like.

一般的な動画像では、時間方向への相関が高い。動きベクトルは時間的に異なる画像間のオブジェクトの移動を示す値であるため、動きの空間相関も比較的高いことが予想される。そこでこれらの指標を判定パラメータとして利用し、予測方法を動的に切り替える。平行移動指標は式（１９）で与えられる。 A general moving image has a high correlation in the time direction. Since the motion vector is a value indicating the movement of an object between temporally different images, it is expected that the spatial correlation of motion is relatively high. Therefore, the prediction method is dynamically switched using these indexes as determination parameters. The translation index is given by equation (19).

図１７は、アフィン変換による画素ブロックの変化の例を示す図である。図１７では、座標（１，０）及び（０，１）がそれぞれアフィン変換によって座標（ａ，ｄ）、（ｂ、ｃ）に変換され、変換前のベクトルの中心角度４５°からθＤ回転している。ａ，ｂ，ｄ，ｅはそれぞれ、予測対象ブロックで得られた幾何変換パラメータを示している。回転指標は式（２０）で与えられる。式（２０）において、ｓｇｎ（Ａ）は、Ａの符号を返す関数である。矩形ブロックの中心ベクトルがどの程度回転したかを表している。 FIG. 17 is a diagram illustrating an example of the change of a pixel block caused by an affine transformation. In FIG. 17, the coordinates (1, 0) and (0, 1) are converted by the affine transformation into the coordinates (a, d) and (b, c), respectively, and the center vector is rotated by θD from its pre-transformation angle of 45°. Here, a, b, d, and e are the geometric transformation parameters obtained for the prediction target block. The rotation index is given by Equation (20), in which sgn(A) is a function that returns the sign of A. The rotation index represents how far the center vector of the rectangular block has rotated.

拡大・縮小指標は図１７で示されるアフィン変換後の面積に相当し、値が１より大きい場合は拡大方向に、小さい場合は縮小方向に変形していることが判る。そこで、式（２１）で拡大・縮小指標を定義する。式（２１）でＤｅｔ≒１となる場合は、更に式（２２）を用いて回転指標を算出してもよい。図１７のθCは式（２３）の変形指標に対応しており、アフィン変換後の図形の変形角度を定義している。 The enlargement/reduction index corresponds to the area after the affine transformation shown in FIG. 17; a value greater than 1 indicates deformation in the enlargement direction, and a value smaller than 1 indicates deformation in the reduction direction. The enlargement/reduction index is therefore defined by Equation (21). When Det ≈ 1 in Equation (21), the rotation index may additionally be calculated using Equation (22). θC in FIG. 17 corresponds to the deformation index of Equation (23) and defines the deformation angle of the figure after the affine transformation.
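
Equations (20) to (23) are not reproduced in this text, so the following Python sketch offers only plausible stand-ins for the three indices, computed from the linear part (a, b; d, e) of an affine transformation. The function names and the exact formulas are assumptions: the enlargement/reduction index is taken as the determinant (the area of the transformed unit square), the rotation index as the angle by which the diagonal vector (1, 1) departs from 45°, and the deformation index as the angle between the images of (1, 0) and (0, 1).

```python
import math


def scale_index(a, b, d, e):
    """Determinant of the linear part: area of the transformed unit square."""
    return a * e - b * d


def rotation_deg(a, b, d, e):
    """Angle in degrees by which the diagonal (1, 1) turns away from 45 degrees."""
    return math.degrees(math.atan2(d + e, a + b)) - 45.0


def shear_deg(a, b, d, e):
    """Angle in degrees between the images of (1, 0) and (0, 1); 90 means no shear."""
    return math.degrees(math.atan2(e, b) - math.atan2(d, a))


# A pure 90-degree rotation: (1, 0) -> (0, 1) and (0, 1) -> (-1, 0).
rot90 = (0.0, -1.0, 1.0, 0.0)
print(scale_index(*rot90))  # 1.0
```

For this rotation the area is unchanged (determinant 1), the diagonal turns by about 90°, and the two axes stay perpendicular, so only the rotation index departs from its neutral value.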

また、式（２０）、式（２１）、式（２２）、式（２３）を用いて算出された予測対象ブロックと隣接ブロックの指標を用いて式（２５）ないし式（２７）に示す判定パラメータを利用しても良い。 Further, the determination parameters shown in Equations (25) to (27) may be used; these are obtained from the indices of the prediction target block and the adjacent blocks calculated using Equations (20), (21), (22), and (23).

例えばＤｅｔｎは隣接ブロック或いは隣接ブロックの三角パッチが保持する幾何変換パラメータ４１４から算出される。算出された幾何変換パラメータに対応する指標と幾何変換パラメータの差分値が判定パラメータ４１６として、予測切替部４０９へと出力される。 For example, Detn is calculated from the geometric transformation parameter 414 held by the adjacent block or by the triangular patch of the adjacent block. The index corresponding to the calculated geometric transformation parameter and the difference value of the geometric transformation parameter are output to the prediction switching unit 409 as the determination parameter 416.
以上が、判定パラメータ導出部４２２の処理の概要である。 The above is the outline of the processing of the determination parameter deriving unit 422.

＜予測切替部４０９−予測の動的切替＞ <Prediction switching unit 409 - Dynamic switching of prediction>
次に予測切替部４０９について具体的に説明する。予測切替部４０９は、判定パラメータ導出部４２２から出力された判定パラメータ４１６に基づいて、どちらの予測画像信号を用いたかが記載された予測切替情報４１７を生成する。 Next, the prediction switching unit 409 will be specifically described. Based on the determination parameter 416 output from the determination parameter deriving unit 422, the prediction switching unit 409 generates prediction switching information 417 that describes which predicted image signal is used.

先ず、式（２０）ないし式（２３）を用いる場合の切替方法について説明する。回転指標、拡大・縮小指標、変形指標は、動きベクトルのみの平行移動に加え、画素ブロックの回転、拡大、縮小、変形の度合いを図る指標である。算出された幾何変換パラメータの指標が、極端に大きい場合や小さい場合には、幾何変換パラメータの推定が不適切であることが予想され、このパラメータを利用して幾何変換予測で生成される予測画像信号は誤差を多く含んでいることが予想される。そこで、これらの指標が、予め定めた閾値の範囲を超えた場合には、幾何変換予測の予測画像信号を利用しないように予測切替情報４１７を生成する。 First, a switching method in the case of using Equations (20) to (23) will be described. The rotation index, the enlargement/reduction index, and the deformation index measure the degree of rotation, enlargement, reduction, and deformation of the pixel block beyond the translation expressed by the motion vector alone. When a calculated index of the geometric transformation parameter is extremely large or extremely small, the estimation of the geometric transformation parameter is likely to be inappropriate, and the predicted image signal generated by geometric transformation prediction using this parameter is likely to contain large errors. Therefore, when these indices fall outside a predetermined threshold range, the prediction switching information 417 is generated so that the predicted image signal of the geometric transformation prediction is not used.

例として、式（２１）の拡大・縮小指標における閾値範囲について説明する。Ｄｅｔは拡大・縮小の度合いを示す指標である。予測切替情報４１７を式（２８）で定義する。 As an example, the threshold range in the enlargement / reduction index of Expression (21) will be described. Det is an index indicating the degree of enlargement / reduction. Prediction switching information 417 is defined by equation (28).

式（２８）におけるｐｒｅｄ＿ｆｌａｇは予測切替情報４１７を表しており、ＴｈＤｅｔは閾値を表している。Ｄｅｔがそれぞれ閾値の範囲を超えた場合、ｐｒｅｄ＿ｆｌａｇは０となり、動き補償予測を選択する。一方、それ以外の場合は、ｐｒｅｄ＿ｆｌａｇは１となり、幾何変換予測を選択する。他の判定パラメータに関しても同様な閾値判定を行い、予測切替情報４１７を生成する。 In formula (28), pred_flag represents the prediction switching information 417, and ThDet represents a threshold value. When Det exceeds the threshold range, pred_flag is 0, and motion compensation prediction is selected. On the other hand, in other cases, pred_flag is 1, and geometric transformation prediction is selected. Similar threshold determination is performed for other determination parameters, and prediction switching information 417 is generated.
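
The threshold decision on Det can be sketched as follows. Equation (28) is not reproduced in this text, so the shape of the allowed range is an assumption: it is taken here to be symmetric around 1 in the multiplicative sense, [1/ThDet, ThDet] with ThDet > 1. The function name `pred_flag_from_det` is hypothetical; the meaning of the returned values (1 for geometric transformation prediction, 0 for motion compensation prediction) follows the text.

```python
def pred_flag_from_det(det, th_det):
    """Return 1 (geometric transformation prediction) or 0 (motion compensation).

    Assumption: Det is acceptable when 1/th_det <= det <= th_det, with th_det > 1.
    """
    if th_det <= 1.0:
        raise ValueError("th_det must be greater than 1")
    return 1 if 1.0 / th_det <= det <= th_det else 0


print(pred_flag_from_det(1.1, 1.5))  # 1: mild enlargement, keep geometric prediction
print(pred_flag_from_det(2.0, 1.5))  # 0: implausibly large enlargement
```

A negative Det, which corresponds to a mirrored block, also falls outside this assumed range and therefore selects motion compensation prediction.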

次に、式（１９）、式（２４）ないし式（２７）を用いる際の切替方法について説明する。隣接ブロックと予測対象ブロックの幾何変換パラメータには高い空間相関があることが推定されるため、それぞれの差分値が予め定めた閾値の大きさを超える場合には、幾何変換予測の予測画像信号を利用しないように予測切替情報４１７を生成する。 Next, a switching method when using Equation (19) and Equations (24) to (27) will be described. Since the geometric transformation parameters of the adjacent blocks and the prediction target block are presumed to have a high spatial correlation, when any of the difference values exceeds a predetermined threshold, the prediction switching information 417 is generated so that the predicted image signal of the geometric transformation prediction is not used.

なお、本実施の形態では、量子化パラメータを利用して閾値の範囲を制御する例を示したが、符号化パラメータに含まれる他の情報を用いてもよい。例えば、復号化した動画像の解像度、予測対象ブロックのタイプ、参照画像のＲｅｆ＿ｉｄｘ、動きベクトルの値、又は、変換サイズに関する情報などを利用してもよい。 In the present embodiment, an example in which the threshold range is controlled using the quantization parameter is shown, but other information included in the encoding parameter may be used. For example, the resolution of the decoded moving image, the type of the prediction target block, the Ref_idx of the reference image, the value of the motion vector, or information on the transform size may be used.

また、隣接ブロックの全ての隣接動きベクトルと予測対象ブロックの動きベクトルとが同一の値を持つとき、幾何変換前と後での座標が変化しない。このような時は、周辺ブロックが同一のオブジェクトの平行移動と考えられるため、幾何変換予測を行う必要がない。そこで、この条件が成り立つときは、予測切替情報４１７は、幾何変換予測を利用しない旨の値を有する。 Further, when all adjacent motion vectors of adjacent blocks and the motion vector of the prediction target block have the same value, the coordinates before and after geometric transformation do not change. In such a case, it is considered that the peripheral block is a parallel movement of the same object, so that it is not necessary to perform geometric transformation prediction. Therefore, when this condition is satisfied, the prediction switching information 417 has a value indicating that the geometric transformation prediction is not used.

また、それぞれの指標に対して用いる閾値を符号化パラメータの値に応じて変化する関数としてもよい。例えば、式（２８）における閾値ＴｈＤｅｔを量子化パラメータに依存する関数として定義し、量子化パラメータが増加するに従って、閾値が減少するような関数とすることで、推定精度の低下する低ビットレート帯での効果的な予測切替が可能となる。 Further, the threshold used for each index may be a function that changes according to the value of an encoding parameter. For example, by defining the threshold ThDet in Equation (28) as a function of the quantization parameter such that the threshold decreases as the quantization parameter increases, effective prediction switching becomes possible in the low-bit-rate range, where the estimation accuracy decreases.
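
One possible threshold function of this kind is sketched below. The linear shape, the function name `th_det_for_qp`, and all numeric values (`th_max`, `th_min`, and a maximum quantization parameter of 51) are illustrative assumptions and are not taken from the specification; the only property carried over from the text is that the threshold decreases as the quantization parameter increases.

```python
def th_det_for_qp(qp, th_max=2.0, th_min=1.1, qp_max=51):
    """Shrink the threshold linearly from th_max at qp = 0 to th_min at qp = qp_max."""
    qp = max(0, min(qp, qp_max))  # clamp to the assumed valid range
    return th_max - (th_max - th_min) * qp / qp_max


# A coarser quantizer (larger QP) gives a tighter threshold.
for qp in (0, 26, 51):
    print(qp, th_det_for_qp(qp))
```

A tighter threshold at high quantization parameters makes the decoder fall back to motion compensation prediction more readily when the neighbouring motion vectors are less reliable.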

以上が、予測切替部４０９及び予測分離スイッチ４１０の処理の概要である。 The above is the outline of the processing of the prediction switching unit 409 and the prediction separation switch 410.

＜シンタクス構造＞ <Syntax structure>
次に、動画像復号化装置４００が復号する符号化データのシンタクス構造について説明する。動画像復号化装置４００が復号する符号化データ４１１は、動画像符号化装置１００と同一のシンタクス構造を有するとよい。そこで、ここでは、図１９ないし図２５を用いて説明する。 Next, the syntax structure of the encoded data decoded by the video decoding device 400 will be described. The encoded data 411 decoded by the video decoding device 400 preferably has the same syntax structure as that of the video encoding device 100. The description here is therefore given with reference to FIGS. 19 to 25.

図１９は、シンタクス１６００の構成を示す図である。図１９に示すとおり、シンタクス１６００は主に３つのパートを有する。ハイレベルシンタクス１６０１は、スライス以上の上位レイヤのシンタクス情報を有する。スライスレベルシンタクス１６０２は、スライス毎に復号に必要な情報を有し、マクロブロックレベルシンタクス１６０３は、マクロブロック毎に復号に必要とされる情報を有する。 FIG. 19 is a diagram illustrating a configuration of the syntax 1600. As shown in FIG. 19, the syntax 1600 mainly has three parts. The high-level syntax 1601 has higher layer syntax information that is equal to or higher than a slice. The slice level syntax 1602 has information necessary for decoding for each slice, and the macroblock level syntax 1603 has information necessary for decoding for each macroblock.

各パートは、更に詳細なシンタクスで構成されている。ハイレベルシンタクス１６０１は、シーケンスパラメータセットシンタクス１６０４とピクチャパラメータセットシンタクス１６０５などの、シーケンス及びピクチャレベルのシンタクスを含む。スライスレベルシンタクス１６０２は、スライスヘッダーシンタクス１６０５、スライスデータシンタクス１６０６等を含む。マクロブロックレベルシンタクス１６０３は、マクロブロックレイヤーシンタクス１６０７、マクロブロックプレディクションシンタクス１６０８等を含む。 Each part has a more detailed syntax. High level syntax 1601 includes sequence and picture level syntax, such as sequence parameter set syntax 1604 and picture parameter set syntax 1605. The slice level syntax 1602 includes a slice header syntax 1605, a slice data syntax 1606, and the like. The macroblock level syntax 1603 includes a macroblock layer syntax 1607, a macroblock prediction syntax 1608, and the like.

図２０は、スライスヘッダーシンタクス１６０５の例を示す図である。図中に示されるｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇは、当該スライスに幾何変換予測を適用するかどうかを示すシンタクス要素である。ｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇが0である場合、予測切替部４０９は、当該スライスにおいて常に動き補償予測部４０６の出力端を出力するように予測切替情報４１７を設定して予測分離スイッチ４１０を切り替える。つまり、このスライスに対しては、幾何変換予測を適用しないことを意味する。一方、ｓｌｉｃｅ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｐｒｅｄｉｃｔｉｏｎ＿ｆｌａｇが１である場合、当該スライスにおいて判定パラメータ４１６が指し示す情報に基づいて予測切替情報４１７が設定され、予測分離スイッチ４１０は予測画像信号を動的に切り替える。 FIG. 20 is a diagram illustrating an example of the slice header syntax 1605. The slice_affine_motion_prediction_flag shown in the figure is a syntax element indicating whether or not geometric transformation prediction is applied to the slice. When slice_affine_motion_prediction_flag is 0, the prediction switching unit 409 sets the prediction switching information 417 so as to always output the output terminal of the motion compensation prediction unit 406 in the slice, and switches the prediction separation switch 410. That is, this means that geometric transformation prediction is not applied to this slice. On the other hand, when slice_affine_motion_prediction_flag is 1, the prediction switching information 417 is set based on the information indicated by the determination parameter 416 in the slice, and the prediction separation switch 410 dynamically switches the prediction image signal.

図２１は、スライスデータシンタクス１６０６の例を示す図である。図中に示されるｍｂ＿ｓｋｉｐ＿ｆｌａｇは、当該マクロブロックがスキップモードで符号化されているかどうかを示すフラグである。スキップモードである場合、変換係数や動きベクトルなどは符号化されていない。 FIG. 21 is a diagram illustrating an example of the slice data syntax 1606. Mb_skip_flag shown in the drawing is a flag indicating whether or not the macroblock is encoded in the skip mode. In the skip mode, conversion coefficients, motion vectors, and the like are not encoded.

ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅは当該マクロブロックで幾何変換予測が利用できるかどうかを示す内部パラメータである。ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅが０の場合、判定パラメータ４１６で算出した各種値によって、幾何変換予測を利用しないように予測切替情報４１７が設定されていることを意味する。また、隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つときもＡｖａｉｌＡｆｆｉｎｅＭｏｄｅは０となる。 AvailAffineMode is an internal parameter indicating whether geometric transformation prediction can be used in the macroblock. When AvailAffineMode is 0, it means that the prediction switching information 417 is set not to use the geometric transformation prediction by various values calculated by the determination parameter 416. Also, AvailAffineMode is 0 when the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value.

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが符号化されている。ｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが１の場合、当該スキップモードに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｍｏｔｉｏｎ＿ｓｋｉｐ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。 On the other hand, when AvailAffineMode is 1, mb_affine_motion_skip_flag indicating whether to use geometric transformation prediction or motion compensation prediction is encoded. When mb_affine_motion_skip_flag is 1, it means that geometric transformation prediction is applied to the skip mode. When mb_affine_motion_skip_flag is 0, it means that motion compensation prediction is applied.
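
The conditional presence of mb_affine_motion_skip_flag can be sketched as a small parser. This is a simplification under stated assumptions: the flags are read as raw single bits from an iterator, whereas the actual entropy coding is not described in this passage, and gating on the slice-level flag is inferred from the description of FIG. 20. The function name `parse_skip_mb` and its return convention are hypothetical.

```python
def parse_skip_mb(bits, slice_affine_flag, avail_affine_mode):
    """Read the skip-related flags of one macroblock from an iterator of bits.

    Returns (is_skip, use_geometric); use_geometric is None for non-skip blocks.
    """
    mb_skip_flag = next(bits)
    if not mb_skip_flag:
        return False, None  # non-skip macroblocks are parsed by the macroblock layer
    use_geometric = False
    if slice_affine_flag and avail_affine_mode:
        # The flag is present only when geometric prediction is available.
        use_geometric = bool(next(bits))  # mb_affine_motion_skip_flag
    return True, use_geometric


print(parse_skip_mb(iter([1, 1]), 1, 1))  # (True, True)
print(parse_skip_mb(iter([1]), 1, 0))     # (True, False)
```

When AvailAffineMode is 0, no bit is consumed for the flag, and motion compensation prediction is used for the skipped macroblock.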

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＭｂが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが符号化されている。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが１の場合、当該画素ブロックに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。 On the other hand, when AvailAffineModeMb is 1, mb_affine_pred_flag indicating which of geometric transformation prediction and motion compensation prediction is used is encoded. When mb_affine_pred_flag is 1, it means that geometric transformation prediction is applied to the pixel block. When mb_affine_pred_flag is 0, it means that motion compensation prediction is applied.

図２４は、サブマクロブロックプレディクションシンタクスの例を示す図である。図中に示されるＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂは当該画素ブロックで幾何変換予測が利用できるかどうかを示す内部パラメータである。ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂが０の場合、判定パラメータ１２５で算出した各種値によって、幾何変換予測を利用しないように予測切替情報４１７が設定されていることを意味する。また、隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つときもＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂは０となる。 FIG. 24 is a diagram illustrating an example of sub macroblock prediction syntax. AvailAffineModeSubMb shown in the figure is an internal parameter indicating whether or not geometric transformation prediction can be used in the pixel block. When AvailAffineModeSubMb is 0, it means that the prediction switching information 417 is set not to use the geometric transformation prediction by various values calculated by the determination parameter 125. The AvailAffineModeSubMb is also 0 when the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value.

一方、ＡｖａｉｌＡｆｆｉｎｅＭｏｄｅＳｕｂＭｂが１の場合は、幾何変換予測と動き補償予測のどちらを利用するかを示すｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが符号化されている。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが１の場合、当該画素ブロックに対して幾何変換予測が適用されることを意味する。ｍｂ＿ａｆｆｉｎｅ＿ｐｒｅｄ＿ｆｌａｇが０の場合、動き補償予測が適用されることを意味する。ＮｕｍＳｕｂＭｂＰａｒｔ()は、ｍｂ＿ｔｙｐｅに規定されたブロック分割数を返す内部関数である。 On the other hand, when AvailAffineModeSubMb is 1, mb_affine_pred_flag indicating whether to use geometric transformation prediction or motion compensation prediction is encoded. When mb_affine_pred_flag is 1, it means that geometric transformation prediction is applied to the pixel block. When mb_affine_pred_flag is 0, it means that motion compensation prediction is applied. NumSubMbPart () is an internal function that returns the number of block divisions defined in mb_type.

なお、図１９ないし図２５に示すシンタクスの表中の行間には、本実施の形態において規定していないシンタクス要素が挿入されてもよく、その他の条件分岐に関する記述が含まれていてもよい。また、シンタクステーブルを複数のテーブルに分割し、または複数のシンタクステーブルを統合してもよい。また、必ずしも同一の用語を用いる必要は無く、利用する形態によって任意に変更してもよい。更に、当該マクロブロックレイヤーシンタクスに記述されている各々のシンタクス要素は、後述するマクロブロックデータシンタクスに明記されるように変更しても良い。 It should be noted that syntax elements not defined in the present embodiment may be inserted between the rows in the syntax tables shown in FIGS. 19 to 25, and descriptions regarding other conditional branches may be included. Further, the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used. Further, each syntax element described in the macroblock layer syntax may be changed as specified in a macroblock data syntax described later.

以上が、動画像復号化装置４００の説明である。 The above is the description of the moving picture decoding apparatus 400.

（第６の実施形態：イントラ予測の追加） (Sixth embodiment: addition of intra prediction)
図３３は、第６の実施の形態で用いられる動画像復号化装置５００の構造を示す図である。図３３において、図３２の動画像復号化装置４００と同位置の機能を持つ各部には同一の符号を付し、ここでは説明を省略する。 FIG. 33 is a diagram illustrating the structure of a moving picture decoding apparatus 500 used in the sixth embodiment. In FIG. 33, units having the same functions as those of the moving picture decoding apparatus 400 in FIG. 32 are denoted by the same reference numerals, and their description is omitted here.

動画像復号化装置５００は、動画像復号化装置４００が有する各部に加え、イントラ予測部５０１、及び、予測切替スイッチ５０２が追加されている。イントラ予測部５０１は、入力された参照画像信号４１３を基にして画面内の情報のみを用いた予測画像信号４１８を生成する。 In addition to the units included in the video decoding device 400, the video decoding device 500 includes an intra prediction unit 501 and a prediction changeover switch 502. The intra prediction unit 501 generates a predicted image signal 418 using only information in the screen based on the input reference image signal 413.

符号化データ復号部４０２で復号された予測情報から予測切替情報５０３が予測切替スイッチ５０２へと入力される。予測切替情報５０３には、イントラ予測部５０１の出力端とインター予測部１３０の出力端との、どちらとスイッチを繋ぐかの情報が記述されている。 Prediction switching information 503 is input to the prediction switching switch 502 from the prediction information decoded by the encoded data decoding unit 402. The prediction switching information 503 describes information about which of the output terminal of the intra prediction unit 501 and the output terminal of the inter prediction unit 130 is connected to the switch.

When intra prediction is selected, the prediction changeover switch 502 connects to the output terminal of the intra prediction unit 501 and outputs the predicted image signal 418 obtained by the intra prediction unit 501 to the adder 404. When inter prediction is selected, the switch connects to the output terminal of the inter prediction unit 130 and outputs the predicted image signal 418 obtained by the inter prediction unit 130 to the adder 404.

In this way, it is determined whether to select the predicted image signal generated by the intra prediction unit 501 or the one generated by the inter prediction unit 130, and the output terminal of the prediction mode separation switch 502 is switched accordingly.
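The switching behaviour described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the names `INTRA`, `INTER`, `select_prediction`, and `reconstruct` are hypothetical, blocks are modelled as nested lists of integers, and the clipping to an 8-bit range in the adder is an assumption that the text does not state.

```python
from typing import List

Block = List[List[int]]

# Hypothetical values for the prediction switching information 503.
INTRA = "intra"
INTER = "inter"


def select_prediction(switch_info: str, intra_pred: Block, inter_pred: Block) -> Block:
    """Model of the switch 502: pass through one of the two predictor outputs."""
    if switch_info == INTRA:
        return intra_pred
    if switch_info == INTER:
        return inter_pred
    raise ValueError(f"unknown prediction switching information: {switch_info!r}")


def reconstruct(pred_error: Block, prediction: Block, bit_depth: int = 8) -> Block:
    """Model of the adder 404: prediction error plus predicted image signal.

    Clipping to [0, 2**bit_depth - 1] is an assumption made for this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(e + p, 0), max_val) for e, p in zip(err_row, pred_row)]
        for err_row, pred_row in zip(pred_error, prediction)
    ]
```

Under these assumptions, a decoder loop would call `select_prediction` once per block with the decoded switching information and pass the result to `reconstruct` together with the decoded prediction error.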

The above is the description of the processing of the moving picture decoding apparatus 500 according to the sixth embodiment.

(Seventh embodiment: signaling a motion vector difference)
The moving picture decoding apparatus according to the seventh embodiment of the present invention decodes, for example, encoded data generated by the moving picture encoding apparatus according to the fourth embodiment and generates a decoded image signal. In addition to the functions of the moving picture decoding apparatus 400 according to the fifth embodiment, the moving picture decoding apparatus according to the seventh embodiment decodes a motion vector difference that corrects the motion vector of the prediction target block and the geometric transformation parameter.

The changes to the syntax of the encoded data decoded by the moving picture decoding apparatus according to the present embodiment, relative to the syntax of the encoded data decoded by the moving picture decoding apparatus 400 according to the fifth embodiment, are the same as those shown in FIGS. 29 and 30.

In FIGS. 29 and 30, among the characters included in the syntax elements, L0 and l0 indicate a motion vector on the reference image L0, and L1 and l1 indicate a motion vector on the reference image L1. Let the original motion vector before the re-search be mv_org = (mvx_org, mvy_org) and the motion vector after the re-search be mv_refine = (mvx_refine, mvy_refine). The motion vector difference mvd_affine[2] is then calculated by Equation (31).

In Equation (31), mvd_affine[0] is the motion vector difference value for the vertical direction and mvd_affine[1] is the one for the horizontal direction; each corresponds to a syntax element. The characters l0 and l1 included in the syntax shown in FIGS. 29 and 30 are omitted here, but the motion vector difference of Equation (31) is calculated for each of the indexes L0 and L1. In addition, mvd_l0 and mvd_l1 are each calculated as the difference between the motion vector corresponding to the respective reference image signal and a predicted value of that motion vector, the predicted value being generated using another motion vector prediction technique not specified in this embodiment.
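As a rough illustration of the motion vector difference described above, the sketch below computes a two-element difference between the re-searched and the original motion vectors. Equation (31) itself is not reproduced in this text, so the sign convention (refined minus original) is an assumption; the index order (0 = vertical, 1 = horizontal) follows the description. The names `MotionVector`, `mvd_affine`, and `apply_mvd` are hypothetical.

```python
from typing import NamedTuple, Tuple


class MotionVector(NamedTuple):
    x: int  # horizontal component (mvx)
    y: int  # vertical component (mvy)


def mvd_affine(mv_org: MotionVector, mv_refine: MotionVector) -> Tuple[int, int]:
    """Difference between the re-searched and the original motion vector.

    Index 0 is the vertical difference and index 1 the horizontal difference,
    as stated in the description. The sign (refine - org) is an assumption.
    """
    return (mv_refine.y - mv_org.y, mv_refine.x - mv_org.x)


def apply_mvd(mv_org: MotionVector, mvd: Tuple[int, int]) -> MotionVector:
    """Decoder-side counterpart: recover the re-searched vector from the difference."""
    return MotionVector(x=mv_org.x + mvd[1], y=mv_org.y + mvd[0])
```

With this convention, applying `apply_mvd` to the original vector and the signaled difference returns the re-searched vector, and the same computation would be repeated separately for L0 and L1.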

(Modifications of the first to seventh embodiments)
(1) The first to seventh embodiments describe the case in which the processing target frame is divided into rectangular blocks of, for example, 16 × 16 pixels and, as shown in FIG. 4, the blocks are encoded and decoded in order from the upper-left block of the picture toward the lower right. However, the encoding order and decoding order are not limited to this. For example, encoding and decoding may be performed in order from the lower right to the upper left, or in a spiral starting from the center of the picture. Encoding and decoding may also be performed in order from the upper right to the lower left, or from the periphery of the picture toward its center.
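The alternative block orders mentioned in modification (1) can be sketched as simple generators of block coordinates. This is only an illustration under stated assumptions: blocks are addressed as (row, column) pairs, the spiral direction (clockwise, starting at the upper left for the inward case) is chosen arbitrarily, and all function names are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (block row, block column)


def raster_order(rows: int, cols: int) -> List[Coord]:
    """Upper left to lower right, row by row."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def reverse_raster_order(rows: int, cols: int) -> List[Coord]:
    """Lower right to upper left."""
    return list(reversed(raster_order(rows, cols)))


def inward_spiral_order(rows: int, cols: int) -> List[Coord]:
    """Periphery toward the center, walking each ring clockwise."""
    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    order: List[Coord] = []
    while top <= bottom and left <= right:
        for c in range(left, right + 1):          # top edge
            order.append((top, c))
        for r in range(top + 1, bottom + 1):      # right edge
            order.append((r, right))
        if top < bottom:
            for c in range(right - 1, left - 1, -1):  # bottom edge
                order.append((bottom, c))
        if left < right:
            for r in range(bottom - 1, top, -1):      # left edge
                order.append((r, left))
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


def outward_spiral_order(rows: int, cols: int) -> List[Coord]:
    """Center toward the periphery: the inward spiral walked backwards."""
    return list(reversed(inward_spiral_order(rows, cols)))
```

Each function visits every block exactly once, so any of them could stand in for the raster order as long as the encoder and the decoder use the same one.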

(2) The first to seventh embodiments have been described with block sizes of 4 × 4 pixels and 8 × 8 pixels. However, the prediction target blocks need not have a uniform block shape and may have any block size, such as 16 × 8, 8 × 16, 8 × 4, or 4 × 8 pixels. In addition, the blocks within one macroblock need not all be the same, and blocks of different sizes may be mixed. In this case, as the number of divisions increases, the code amount required to encode or decode the division information increases. The block size should therefore be selected in consideration of the balance between the code amount of the transform coefficients and the locally decoded image or the decoded image.
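One common way to express the balance described in modification (2) is a Lagrangian rate-distortion cost. The text does not say that this particular cost is used, so the sketch below is an assumption: the cost form `distortion + lam * bits`, the class `PartitionCandidate`, and the functions `rd_cost` and `choose_partition` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PartitionCandidate:
    name: str          # e.g. "16x16", "8x8"
    distortion: float  # error between the input block and the locally decoded block
    coeff_bits: int    # code amount of the transform coefficients
    split_bits: int    # code amount of the division information


def rd_cost(c: PartitionCandidate, lam: float) -> float:
    """Assumed Lagrangian cost: distortion plus lam times the total code amount."""
    return c.distortion + lam * (c.coeff_bits + c.split_bits)


def choose_partition(candidates: Iterable[PartitionCandidate], lam: float) -> PartitionCandidate:
    """Pick the candidate with the lowest cost for the given multiplier."""
    return min(candidates, key=lambda c: rd_cost(c, lam))
```

In this sketch a larger `lam` penalizes the division information more heavily and so favors coarser partitions, while `lam = 0` simply picks the lowest-distortion candidate.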

(3) The first to seventh embodiments have been described as examples limited to one color signal component, without separating the luminance signal and the color difference signal. However, when the prediction processing differs between the luminance signal and the color difference signal, different prediction methods may be used for each, or the same prediction method may be used. When different prediction methods are used, the prediction method selected for the color difference signal is encoded or decoded in the same manner as for the luminance signal.

(4) The first to fourth embodiments describe an example in which the determination parameter is not included in the encoded data. However, the determination parameter for each pixel block may be included in the encoded data. The fifth to seventh embodiments describe an example in which the determination parameter is calculated for each pixel block based on the geometric transformation parameter. However, when the encoded data includes the determination parameter, the moving picture decoding apparatus is preferably configured to determine whether to perform motion compensated prediction by geometric transformation from the decoded determination parameter, without calculating the determination parameter itself.
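The decoder-side choice in modification (4) can be sketched as below. Several things here are assumptions made only for illustration: the determination parameter is taken to be a rotation angle derived from the 2 × 2 linear part of an affine parameter set, the acceptance range of ±30 degrees is arbitrary, and the names `derive_rotation_angle` and `use_geometric_prediction` are hypothetical.

```python
import math
from typing import Optional, Tuple

# Linear part of an affine transform, [[a, b], [c, d]] (assumed layout).
AffineLinear = Tuple[float, float, float, float]


def derive_rotation_angle(affine: AffineLinear) -> float:
    """Rotation angle in degrees, assuming a rotation-plus-scale linear part."""
    a, _b, c, _d = affine
    return math.degrees(math.atan2(c, a))


def use_geometric_prediction(
    affine: AffineLinear,
    decoded_param: Optional[float] = None,
    valid_range: Tuple[float, float] = (-30.0, 30.0),  # hypothetical range
) -> bool:
    """Decide whether to perform geometric transformation motion prediction.

    If the encoded data carried a determination parameter, it is used as is;
    otherwise the parameter is derived from the geometric transformation parameter.
    """
    param = decoded_param if decoded_param is not None else derive_rotation_angle(affine)
    low, high = valid_range
    return low <= param <= high
```

The point of the sketch is the branch on `decoded_param`: when the parameter is present in the encoded data, the derivation step is skipped entirely.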

The proposed method prevents the excessive block division that would otherwise be applied to predict a moving object that does not fit the translational motion model, and thus prevents the block division information from increasing. In other words, by predicting the geometric deformation of the objects within a block and applying a suitable geometric transformation parameter to each, without increasing additional information, the method improves coding efficiency and also improves subjective image quality.

Note that the present invention is not limited to the above embodiments as they are, and in the implementation stage the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from all the constituent elements shown in an embodiment. Furthermore, constituent elements of different embodiments may be combined as appropriate.

As described above, the moving picture encoding apparatus, moving picture decoding apparatus, moving picture encoding method, and moving picture decoding method according to the present invention are useful for highly efficient moving picture encoding. They are particularly suitable for moving picture encoding that reduces the motion detection processing required to estimate the geometric transformation parameters used for geometric transformation motion compensated prediction.

DESCRIPTION OF SYMBOLS
100 Moving picture encoding apparatus
101 Subtractor
102 Transform/quantization unit
103 Inverse quantization/inverse transform unit
104 Adder
105 Reference image memory
106 Motion estimation unit
107 Motion compensation prediction unit
108 Geometric transformation parameter derivation unit
109 Geometric transformation prediction unit
110 Prediction separation switch
111 Prediction switching unit
112 Entropy encoding unit
113 Output buffer
114 Input image signal
115 Prediction error signal
116 Transform coefficient
117 Decoded prediction error signal
118 Decoded image signal
119 Motion vector
120 Reference image signal
121 Geometric transformation parameter
122 Prediction switching information
123 Predicted image signal
124 Encoded data
125 Determination parameter
126 Encoding control unit
127 Determination parameter derivation unit
130 Inter prediction unit
181 Motion vector acquisition unit
182 Parameter derivation unit
191 Geometric transformation unit
192 Interpolation unit
200 Moving picture encoding apparatus
201 Intra prediction unit
202 Mode determination unit
203 Prediction mode separation switch
204 Prediction mode switching information
300 Inter prediction unit
301 Vector re-search unit
400 Moving picture decoding apparatus
401 Input buffer
402 Encoded data decoding unit
403 Inverse quantization/inverse transform unit
404 Adder
405 Reference image memory
406 Motion compensation prediction unit
407 Geometric transformation parameter derivation unit
408 Geometric transformation prediction unit
409 Prediction switching unit
410 Prediction separation switch
411 Encoded data
412 Prediction error signal
413 Reference image signal
414 Geometric transformation parameter
415 Motion vector
416 Determination parameter
417 Prediction switching information
418 Predicted image signal
419 Output buffer
420 Decoded image signal
421 Decoding control unit
422 Determination parameter derivation unit
500 Moving picture decoding apparatus
501 Intra prediction unit
502 Prediction mode separation switch
502 Prediction changeover switch
503 Prediction switching information

Claims

A moving image encoding apparatus comprising:
a motion information acquisition unit that acquires motion information of one or more adjacent blocks among adjacent blocks adjacent to one of the pixel blocks into which an image signal is divided;
a geometric transformation information acquisition unit that acquires, based on the motion information, a geometric transformation parameter that is information related to a shape of a mapping by geometric transformation of the pixel block in a reference image signal when performing motion compensation on the pixel block;
a geometric transformation prediction unit that performs geometric transformation motion prediction including geometric transformation between the reference image signal and the pixel block, using the reference image signal subjected to geometric transformation by the geometric transformation parameter; and
an encoding unit that encodes a prediction error value of the pixel block on which the geometric transformation motion prediction has been performed.

The moving image coding apparatus according to claim 1, further comprising a geometric transformation unit that performs geometric transformation on the reference image signal according to the geometric transformation parameter.

The moving image encoding apparatus according to claim 1 or 2, further comprising a determination unit that determines whether or not to perform the geometric transformation motion prediction based on a determination parameter that is one or more of: information relating to the geometric transformation parameter, information relating to a rotation angle obtained from the geometric transformation parameter, information relating to an amount of enlargement or reduction obtained from the geometric transformation parameter, information relating to an amount of deformation obtained from the geometric transformation parameter, and information relating to an amount of movement obtained from the geometric transformation parameter.

The moving image encoding apparatus according to claim 3, wherein the encoding unit further encodes the determination parameter.

The moving image encoding apparatus according to claim 1, wherein the geometric transformation prediction unit further generates motion information of the pixel block, and the encoding unit further encodes the motion information.

The moving image encoding apparatus according to claim 1, wherein the motion information acquisition unit acquires motion information of an adjacent block for which prediction has already been completed among the adjacent blocks.

The moving image encoding apparatus according to claim 1, wherein the geometric transformation information acquisition unit acquires the geometric transformation parameters by converting the motion information based on a relative position between the adjacent block and the pixel block.

The moving image encoding apparatus according to claim 1, wherein the motion information acquisition unit acquires one or more pieces of motion information from among: motion information of a pixel adjacent to the first pixel in raster order in the pixel block; motion information of pixels adjacent to the other vertex pixels of the pixel block, excluding the first pixel in raster order; motion information of the first pixel; and motion information of a pixel located at the center of the pixel block.

The moving image encoding apparatus according to claim 1, wherein, for each of a plurality of the adjacent blocks, the motion information acquisition unit acquires one piece of motion information based on a motion vector obtained by performing motion prediction on the reference image signal.

The moving image encoding apparatus according to claim 1, wherein the motion information acquisition unit acquires a motion vector of the pixel block from the motion information, and the encoding unit encodes information relating to the motion vector of whichever of the following has the smaller prediction error value: the geometric transformation motion prediction based on the motion vector and the geometric transformation parameter; and the geometric transformation motion prediction based on the geometric transformation parameter and a motion vector obtained by a motion search based on the motion vector.

The moving image encoding apparatus according to claim 1, wherein the encoding unit further encodes the geometric transformation parameter.

The moving image encoding apparatus according to claim 1, wherein the encoding unit further encodes correction information used for correction for reducing the prediction error value with respect to the geometric transformation parameter.

The moving image encoding apparatus according to claim 3, wherein the determination unit determines to perform the geometric transformation motion prediction when a value of the determination parameter is within a predetermined range.

The moving image encoding apparatus according to claim 3, wherein the determination unit determines whether to perform the geometric transformation motion prediction based on, in addition to the determination parameter, a predetermined parameter used when the pixel block is encoded or a predetermined parameter used when an already encoded pixel block included in the image signal was encoded.

The moving image encoding apparatus according to claim 14, wherein the predetermined parameter is at least one of prediction type information, prediction mode information, quantization parameter information, transform coefficient information, coefficient existence information, motion information, and block division information of the pixel block or of the already encoded pixel block of the image signal.

The moving image encoding apparatus according to claim 1, wherein the geometric transformation is one or more of an affine transformation, a bilinear transformation, a Helmert transformation, a quadratic conformal transformation, a projective transformation, and a three-dimensional projective transformation.

The moving image encoding apparatus according to claim 2, wherein, when a pixel of the mapping obtained by the geometric transformation is located between integer pixels, the geometric transformation unit acquires the pixel value of that pixel by interpolation using one of a bilinear interpolation method, a nearest neighbor interpolation method, and a cubic convolution interpolation method.

A moving image encoding method comprising:
a motion information acquisition step of acquiring motion information of one or more adjacent blocks among adjacent blocks adjacent to one of the pixel blocks into which an image signal is divided;
a geometric transformation information acquisition step of acquiring, based on the motion information, a geometric transformation parameter that is information related to a shape of a mapping by geometric transformation of the pixel block in a reference image signal when performing motion compensation on the pixel block;
a geometric transformation prediction step of performing geometric transformation motion prediction including geometric transformation between the reference image signal and the pixel block, using the reference image signal subjected to geometric transformation by the geometric transformation parameter; and
an encoding step of encoding a prediction error value of the pixel block on which the geometric transformation motion prediction has been performed.

A moving picture decoding apparatus comprising:
a decoding unit that decodes encoded data obtained by encoding an image signal, the encoded data including a prediction error value obtained by geometric transformation motion prediction including geometric transformation between a pixel block into which the image signal is divided and a reference image signal used when performing motion compensation on the pixel block;
a motion information acquisition unit that acquires motion information of one or more adjacent blocks among adjacent blocks adjacent to one of the pixel blocks;
a geometric transformation information acquisition unit that acquires, based on the motion information, a geometric transformation parameter that is information related to a shape of a mapping by geometric transformation of the pixel block in the reference image signal;
a geometric transformation prediction unit that performs the geometric transformation motion prediction using the reference image signal subjected to geometric transformation by the geometric transformation parameter, and generates a prediction value; and
an adder that adds the decoded prediction error value and the generated prediction value.

The moving picture decoding apparatus according to claim 19, further comprising a geometric transformation unit that performs geometric transformation on the reference image signal according to the geometric transformation parameter.

The moving picture decoding apparatus according to claim 19 or 20, further comprising a determination unit that determines whether or not to perform the geometric transformation motion prediction based on a determination parameter that is one or more of: information relating to the geometric transformation parameter, information relating to a rotation angle obtained from the geometric transformation parameter, information relating to an amount of enlargement or reduction obtained from the geometric transformation parameter, information relating to an amount of deformation obtained from the geometric transformation parameter, and information relating to an amount of movement obtained from the geometric transformation parameter.

The moving picture decoding apparatus according to claim 21, wherein the decoding unit further decodes the determination parameter.

The video decoding device according to claim 19, wherein the decoding unit further decodes motion information of the pixel block.

The moving picture decoding apparatus according to claim 19, wherein the motion information acquisition unit acquires motion information of an adjacent block for which prediction has already been completed among the adjacent blocks.

The moving picture decoding apparatus according to claim 19, wherein the geometric transformation information acquisition unit acquires the geometric transformation parameters by converting the motion information based on a relative position between the adjacent block and the pixel block.

The moving picture decoding apparatus according to claim 19, wherein the motion information acquisition unit acquires one or more pieces of motion information from among: motion information of a pixel adjacent to the first pixel in raster order in the pixel block; motion information of pixels adjacent to the other vertex pixels of the pixel block, excluding the first pixel in raster order; motion information of the first pixel; and motion information of a pixel located at the center of the pixel block.

The moving picture decoding apparatus according to claim 19, wherein, for each of a plurality of the adjacent blocks, the motion information acquisition unit acquires one piece of motion information based on a motion vector obtained by performing motion prediction on the reference image signal.

The moving picture decoding apparatus according to claim 19, wherein the encoded data includes information on the geometric transformation parameter, and the decoding unit further decodes the geometric transformation parameter.

The moving picture decoding apparatus according to claim 19, wherein the encoded data further includes correction information used for correction that reduces the prediction error value with respect to the geometric transformation parameter, and the decoding unit further decodes the correction information.

The moving picture decoding apparatus according to claim 21, wherein the determination unit determines to perform the geometric transformation motion prediction when the value of the determination parameter is within a predetermined range.

The moving picture decoding apparatus according to claim 21, wherein the determination unit determines whether to perform the geometric transformation motion prediction based on, in addition to the determination parameter, a predetermined parameter used when the pixel block was encoded or a predetermined parameter used when an already encoded pixel block included in the image signal was encoded.

The moving picture decoding apparatus according to claim 31, wherein the predetermined parameter is at least one of prediction type information, prediction mode information, quantization parameter information, transform coefficient information, coefficient existence information, motion information, and block division information of the pixel block or of the already encoded pixel block of the image signal.

The moving picture decoding apparatus according to claim 19, wherein the geometric transformation is one or more of an affine transformation, a bilinear transformation, a Helmert transformation, a quadratic conformal transformation, a projective transformation, and a three-dimensional projective transformation.

The moving picture decoding apparatus according to claim 20, wherein, when a pixel of the mapping obtained by the geometric transformation is located between integer pixels, the geometric transformation unit acquires the pixel value of that pixel by interpolation using one of a bilinear interpolation method, a nearest neighbor interpolation method, and a cubic convolution interpolation method.

A moving picture decoding method comprising:
a decoding step of decoding encoded data obtained by encoding an image signal, the encoded data including a prediction error value obtained by geometric transformation motion prediction including geometric transformation between a pixel block into which the image signal is divided and a reference image signal used when performing motion compensation on the pixel block;
a motion information acquisition step of acquiring motion information of one or more adjacent blocks among adjacent blocks adjacent to one of the pixel blocks;
a geometric transformation information acquisition step of acquiring, based on the motion information, a geometric transformation parameter that is information related to a shape of a mapping by geometric transformation of the pixel block in the reference image signal;
a geometric transformation prediction step of performing the geometric transformation motion prediction using the reference image signal subjected to geometric transformation by the geometric transformation parameter, and generating a prediction value; and
an adding step of adding the decoded prediction error value and the generated prediction value.