JP6871442B2 - Moving image coding method and moving image decoding method

Info

Publication number: JP6871442B2
Application number: JP2020007855A
Authority: JP
Inventors: 塩寺 太一郎; 谷沢 昭行; 山影 朋夫; 中條 健
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2020-01-21
Filing date: 2020-01-21
Publication date: 2021-05-12
Anticipated expiration: 2031-03-09
Also published as: JP2020074582A

Description

Embodiments of the present invention relate to a motion information compression method, a moving image coding method, and a moving image decoding method used in coding and decoding moving images.

In recent years, an image coding method with greatly improved coding efficiency has been recommended jointly by ITU-T and ISO/IEC as ITU-T Rec. H.264 and ISO/IEC 14496-10 (hereinafter, H.264). In H.264, prediction, transform, and entropy coding are performed in units of rectangular blocks (for example, 16×16 or 8×8 pixel blocks). In prediction, motion compensation is performed: the rectangular block to be coded (the coding target block) is predicted in the temporal direction by referring to an already coded frame (a reference frame). Such motion compensation requires that motion information, including a motion vector that describes the spatial shift between the coding target block and the block referenced in the reference frame, be coded and sent to the decoding side. Furthermore, when motion compensation uses a plurality of reference frames, a reference frame number must be coded together with the motion information. The code amount for the motion information and the reference frame numbers can therefore increase. In addition, there are motion information prediction methods that derive predicted motion information for the coding target block by referring to motion information stored in the motion information memory of a reference frame (Patent Document 1 and Non-Patent Document 2), and these can increase the capacity of the motion information memory that stores the motion information.

As one example of a method for reducing the capacity of the motion information memory, Non-Patent Document 2 derives representative motion information within a predetermined block and stores only that representative motion information in the motion information memory.
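The memory reduction described above can be illustrated with a short Python sketch that keeps one motion vector per compression block. The 4×4-pixel storage unit, the 16×16-pixel compression block, the choice of the top-left entry as the representative, and the name `compress_motion_field` are all assumptions made for illustration; they are not taken from the cited document.

```python
def compress_motion_field(field, ratio=4):
    """Keep one representative motion vector per ratio x ratio group.

    field: 2-D list of (mvx, mvy) tuples, one per small storage unit
           (assumed here to be a 4x4-pixel unit).
    ratio: storage units per compression block side
           (4 units * 4 pixels = 16 pixels, an assumed block size).
    The top-left entry of each group is taken as the representative;
    this position is hypothetical and only one of several possibilities.
    """
    return [
        [field[y][x] for x in range(0, len(field[0]), ratio)]
        for y in range(0, len(field), ratio)
    ]


# 8x8 storage units (a 32x32-pixel area), each holding a distinct vector.
field = [[(x, y) for x in range(8)] for y in range(8)]
compressed = compress_motion_field(field)
```

In this sketch the 64 stored vectors shrink to 4, a 16-fold reduction, at the cost of discarding every vector that is not at a representative position.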

Patent Document 1: Japanese Patent No. 4020789
Non-Patent Document 1: J. Jung et al., "Temporal MV predictor modification for MV-Comp, Skip, Direct and Merge schemes", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Document JCTVC-D164, January 2011.
Non-Patent Document 2: Yeping Su et al., "CE9: Reduced resolution storage of motion vector data", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Document JCTVC-D072, January 2011.

However, when the method of deriving predicted motion information shown in Non-Patent Document 1 differs from the method of deriving representative motion information shown in Non-Patent Document 2, the temporal correlation of the predicted motion information decreases, and the code amount for the motion information increases.

The problem to be solved by the present invention is to address the above issue by providing a moving image coding device and a moving image decoding device that include a motion information compression device capable of improving coding efficiency.

According to an embodiment, a moving image coding method divides an input image signal into pixel blocks and performs inter prediction on the divided pixel blocks. The method includes selecting predicted motion information from a motion information buffer that holds motion information of a coded region, and predicting the motion information of a coding target block using the predicted motion information. The method further includes acquiring representative motion information from among a plurality of pieces of motion information in a region whose coding has finished, in accordance with first information indicating the method of selecting the predicted motion information, so that only the representative motion information is obtained.

Block diagram schematically showing the configuration of the image coding device according to the first embodiment.
Explanatory diagram of the predictive coding order of pixel blocks.
Explanatory diagram of an example of a pixel block size.
Explanatory diagram of another example of a pixel block size.
Explanatory diagram of another example of a pixel block size.
Explanatory diagram of an example of a pixel block in a coding tree unit.
Explanatory diagram of another example of a pixel block in a coding tree unit.
Explanatory diagram of another example of a pixel block in a coding tree unit.
Explanatory diagram of another example of a pixel block in a coding tree unit.
Block diagram schematically showing the configuration of the entropy coding unit of FIG. 1.
Explanatory diagram schematically showing the configuration of the motion information memory of FIG. 1.
Explanatory diagram of an example of the inter prediction processing executed by the inter prediction unit of FIG. 1.
Explanatory diagram of another example of the inter prediction processing executed by the inter prediction unit of FIG. 1.
Explanatory diagram of an example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram of another example of a prediction unit.
Explanatory diagram showing the skip mode, merge mode, and inter mode.
Block diagram schematically showing the configuration of the motion information coding unit of FIG. 4.
Explanatory diagram showing an example of the positions of predicted motion information candidates relative to the coding target prediction unit.
Explanatory diagram showing still another example of the positions of predicted motion information candidates relative to the coding target prediction unit.
Explanatory diagram showing an example of a list that relates the block positions of a plurality of predicted motion information candidates to the index Mvpidx.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 32×32.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 32×16.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×32.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×16.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×8.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 8×16.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 32×32.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 32×16.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×32.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×16.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 16×8.
Explanatory diagram showing still another example of the reference motion information acquisition position indicating the center of the prediction unit when the coding target prediction unit is 8×16.
Explanatory diagram of the spatial-direction reference motion information memory 501 and the temporal-direction reference motion information memory 502.
Flowchart showing an example of the operation of the motion information compression unit of FIG. 1.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 32×32.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 32×16.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 16×32.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 16×16.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 16×8.
Explanatory diagram showing an example of the reference motion information acquisition position indicating the upper-left corner of the prediction unit when the coding target prediction unit is 8×16.
Explanatory diagram showing an example of representative motion information positions.
Explanatory diagram showing another example of representative motion information positions.
Explanatory diagram showing an example of the center of the prediction unit for each prediction size.
Explanatory diagram showing an example of representative motion information positions when the centroid of the plurality of reference motion information acquisition positions in each motion information compression block is set as the representative motion information position.
Explanatory diagram showing another example of representative motion information positions when the centroid of the plurality of reference motion information acquisition positions in each motion information compression block is set as the representative motion information position.
Explanatory diagram showing an example of representative motion information positions.
Explanatory diagram showing another example of representative motion information positions.
Diagram showing a syntax structure according to one embodiment.
Diagram showing an example of sequence parameter set syntax according to one embodiment.
Diagram showing another example of sequence parameter set syntax according to one embodiment.
Diagram showing an example of prediction unit syntax according to one embodiment.
Block diagram schematically showing the image decoding device according to the second embodiment.
Block diagram schematically showing the entropy decoding unit of FIG. 25.
Block diagram schematically showing the motion information decoding unit of FIG. 26.

Hereinafter, the moving image coding device and the moving image decoding device according to each embodiment are described in detail with reference to the drawings. In the following description, the term "image" may be read as "video", "pixel", "image signal", "image data", or a similar term as appropriate. In the following embodiments, parts given the same reference number operate in the same way, and redundant description is omitted.
(First Embodiment)
The first embodiment relates to an image coding device. The moving image decoding device corresponding to the image coding device of this embodiment is described in the second embodiment. This image coding device can be implemented in hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array). It can also be implemented by having a computer execute an image coding program.

As shown in FIG. 1, the image coding device 100 according to this embodiment includes a subtraction unit 101, an orthogonal transform unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transform unit 105, an addition unit 106, a reference image memory 107, an inter prediction unit 108, a motion information compression unit 109, a motion information memory 110, and an entropy coding unit 112. The coding control unit 114 and the output buffer 113 are normally located outside the image coding device 100.

The image coding device 100 of FIG. 1 divides each frame, field, or slice constituting the input image signal 151 into a plurality of pixel blocks, performs predictive coding on these pixel blocks, and outputs coded data 163. For simplicity, the following description assumes that pixel blocks are predictively coded from upper left to lower right, as shown in FIG. 2A. In FIG. 2A, within the frame f being coded, the already coded pixel blocks p are located to the left of and above the coding target pixel block c.

Here, a pixel block refers to a unit for processing an image, such as an M×N block (where N and M are natural numbers), a coding unit, a macroblock, a sub-block, or a single pixel. In the following description, "pixel block" is basically used to mean a coding unit, but it can also be interpreted in the broader sense above by rereading the description as appropriate. A coding unit is typically, for example, the 16×16 pixel block shown in FIG. 2B, but it may also be the 32×32 pixel block shown in FIG. 2C, the 64×64 pixel block shown in FIG. 2D, or an 8×8 or 4×4 pixel block (not shown). A coding unit does not need to be square. Hereinafter, the coding target block or coding unit of the input image signal 151 may also be called the "prediction target block". The unit of coding is not limited to pixel blocks such as coding units; a frame, a field, a slice, or a combination of these may be used.

FIGS. 3A to 3D show specific examples of coding units. FIG. 3A shows a coding unit of size 64×64 (N = 32). Here N represents the reference coding unit size: the size after splitting is defined as N, and the size without splitting as 2N. A coding tree unit has a quadtree structure; when it is split, the four resulting pixel blocks are indexed in Z-scan order. FIG. 3B shows an example of quadtree splitting of the 64×64 pixel block of FIG. 3A. The numbers in the figure indicate the Z-scan order. A further quadtree split can be applied within any one quadtree index of a coding unit. The depth of splitting is defined as Depth, so FIG. 3A shows an example with Depth = 0. FIG. 3C shows an example of a 32×32 (N = 16) coding tree unit with Depth = 1. The largest such coding tree unit is called a large coding tree unit or tree block, and, as shown in FIG. 2A, the input image signal is coded in raster scan order in this unit.
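The Z-scan indexing of a quadtree split can be sketched as follows. This is a minimal illustration in Python that splits every block uniformly down to a given Depth; an actual coding tree may split each quadrant independently, and the function name `z_scan_blocks` is hypothetical.

```python
def z_scan_blocks(x, y, size, depth):
    """Yield (x, y, size) for the blocks obtained by splitting a square
    block `depth` times, in Z-scan order: top-left, top-right,
    bottom-left, bottom-right, applied recursively."""
    if depth == 0:
        yield (x, y, size)
        return
    half = size // 2
    for dy in (0, half):        # upper row first, then lower row
        for dx in (0, half):    # left block first, then right block
            yield from z_scan_blocks(x + dx, y + dy, half, depth - 1)


# One quadtree split of a 64x64 block gives four 32x32 blocks (Depth = 1).
blocks = list(z_scan_blocks(0, 0, 64, 1))
```

The position of each tuple in `blocks` corresponds to the Z-scan index of that block.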

Based on coding parameters input from the coding control unit 114, the image coding device 100 of FIG. 1 performs inter prediction (also called inter-picture prediction, inter-frame prediction, or motion-compensated prediction) or intra prediction (also called intra-picture prediction or intra-frame prediction; not shown) on a pixel block to generate a predicted image signal 159. The image coding device 100 orthogonally transforms and quantizes the prediction error signal 152 between the pixel block (input image signal 151) and the predicted image signal 159, performs entropy coding, and generates and outputs coded data 163.

The image coding device 100 of FIG. 1 performs coding by selectively applying a plurality of prediction modes that differ in block size and in the method of generating the predicted image signal 159. The methods of generating the predicted image signal 159 fall broadly into two types: intra prediction, which predicts within the coding target frame, and inter prediction, which predicts using one or more temporally different reference frames.

Each element included in the image coding device 100 of FIG. 1 is described below.
The subtraction unit 101 subtracts the corresponding predicted image signal 159 from the coding target block of the input image signal 151 to obtain the prediction error signal 152. The subtraction unit 101 inputs the prediction error signal 152 to the orthogonal transform unit 102.

The orthogonal transform unit 102 applies an orthogonal transform, such as the discrete cosine transform (DCT), to the prediction error signal 152 from the subtraction unit 101 to obtain transform coefficients 153. The orthogonal transform unit 102 outputs the transform coefficients 153 to the quantization unit 103.

The quantization unit 103 quantizes the transform coefficients 153 from the orthogonal transform unit 102 to obtain quantized transform coefficients 154. Specifically, the quantization unit 103 performs quantization according to quantization information, such as a quantization parameter and a quantization matrix, specified by the coding control unit 114. The quantization parameter indicates the fineness of quantization. The quantization matrix is used to weight the fineness of quantization for each component of the transform coefficients, but whether a quantization matrix is used is not essential to the embodiments of the present invention. The quantization unit 103 outputs the quantized transform coefficients 154 to the entropy coding unit 112 and the inverse quantization unit 104.
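The roles of the quantization parameter and the quantization matrix can be illustrated with a small Python sketch of uniform scalar quantization. The function `quantize`, the use of a plain step size in place of a quantization parameter, and rounding to the nearest integer are assumptions for illustration; this is not the H.264 quantizer.

```python
def quantize(coeffs, step, matrix=None):
    """Uniform scalar quantization of a 2-D block of transform coefficients.

    step:   quantization step size (larger means coarser); a stand-in for
            the quantization parameter, not the H.264 QP-to-step mapping.
    matrix: optional per-coefficient weights (a quantization matrix);
            a weight of 1 everywhere is assumed when it is omitted.
    """
    rows, cols = len(coeffs), len(coeffs[0])
    if matrix is None:
        matrix = [[1] * cols for _ in range(rows)]
    return [
        [round(coeffs[r][c] / (step * matrix[r][c])) for c in range(cols)]
        for r in range(rows)
    ]


coeffs = [[100.0, -30.0], [7.0, 44.0]]
flat = quantize(coeffs, step=10)
# A weight of 4 on the last coefficient quantizes it four times more coarsely.
weighted = quantize(coeffs, step=10, matrix=[[1, 1], [1, 4]])
```

In this example only the weighted coefficient changes between `flat` and `weighted`, which is the per-component control that a quantization matrix provides.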

The entropy coding unit 112 performs entropy coding (for example, Huffman coding or arithmetic coding) on various coding parameters, such as the quantized transform coefficients 154 from the quantization unit 103, the motion information 160 from the inter prediction unit 108, the prediction information 165 specified by the coding control unit 114, the reference position information 164 from the coding control unit 114, and the quantization information, and generates coded data 163. Coding parameters are the parameters required for decoding, such as the prediction information 165, information about transform coefficients, and information about quantization. For example, the coding control unit 114 has an internal memory (not shown) that holds the coding parameters, and when a prediction target block is coded, the coding parameters of adjacent, already coded pixel blocks are used.

Specifically, as shown in FIG. 4, the entropy coding unit 112 includes a parameter coding unit 401, a transform coefficient coding unit 402, a motion information coding unit 403, and a multiplexing unit 404. The parameter coding unit 401 codes coding parameters such as the prediction information 165 received from the coding control unit 114 to generate coded data 451A. The transform coefficient coding unit 402 codes the quantized transform coefficients 154 received from the quantization unit 103 to generate coded data 451B.

The motion information coding unit 403 codes the motion information 160 received from the inter prediction unit 108, referring to the reference motion information 166 received from the motion information memory 110 and the reference position information 164 received from the coding control unit 114, to generate coded data 451C. The motion information coding unit 403 is described in detail later.

The multiplexing unit 404 multiplexes the coded data 451A, 451B, and 451C to generate the coded data 163. The generated coded data 163 contains, together with the motion information 160 and the prediction information 165, all parameters needed for decoding, such as information about transform coefficients and information about quantization.

The coded data 163 generated by the entropy coding unit 112 is temporarily accumulated in the output buffer 113, for example after multiplexing, and is output as coded data 163 at an appropriate output timing managed by the coding control unit 114. The coded data 163 is output, for example, to a storage system (storage medium) or a transmission system (communication line), neither of which is shown.

The inverse quantization unit 104 inversely quantizes the quantized transform coefficients 154 from the quantization unit 103 to obtain restored transform coefficients 155. Specifically, the inverse quantization unit 104 performs inverse quantization according to the quantization information used in the quantization unit 103. The quantization information used in the quantization unit 103 is loaded from the internal memory of the coding control unit 114. The inverse quantization unit 104 outputs the restored transform coefficients 155 to the inverse orthogonal transform unit 105.

The inverse orthogonal transform unit 105 applies to the restored transform coefficients 155 from the inverse quantization unit 104 an inverse orthogonal transform, such as the inverse discrete cosine transform, that corresponds to the orthogonal transform performed in the orthogonal transform unit 102, and obtains a restored prediction error signal 156. The inverse orthogonal transform unit 105 outputs the restored prediction error signal 156 to the addition unit 106.

The addition unit 106 adds the restored prediction error signal 156 to the corresponding predicted image signal 159 to generate a local decoded image signal 157. The decoded image signal 157 is passed through a deblocking filter, a Wiener filter, or the like (not shown) and is input to the reference image memory 107.
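The chain from subtraction through addition can be traced end to end with a small Python sketch: residual, forward DCT, quantization, inverse quantization, inverse DCT, and addition of the prediction. It assumes an orthonormal floating-point DCT-II, a plain step size, and no in-loop filtering; the helper names (`dct_matrix`, `local_reconstruct`) are hypothetical and the code is not the transform or quantizer of any particular standard.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    return [
        [math.sqrt((1 if k == 0 else 2) / n)
         * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def local_reconstruct(block, prediction, step):
    """Return the locally decoded block for a square input block."""
    n = len(block)
    c = dct_matrix(n)
    ct = transpose(c)
    # Subtraction: prediction error (residual).
    residual = [[block[r][k] - prediction[r][k] for k in range(n)]
                for r in range(n)]
    coeffs = matmul(matmul(c, residual), ct)                     # forward 2-D DCT
    levels = [[round(v / step) for v in row] for row in coeffs]  # quantization
    restored = [[lv * step for lv in row] for row in levels]     # inverse quantization
    rec_residual = matmul(matmul(ct, restored), c)               # inverse 2-D DCT
    # Addition: restored residual plus the same prediction.
    return [[prediction[r][k] + rec_residual[r][k] for k in range(n)]
            for r in range(n)]


prediction = [[100] * 4 for _ in range(4)]
block = [[108] * 4 for _ in range(4)]
reconstructed = local_reconstruct(block, prediction, step=16)
```

Because the encoder reconstructs from the quantized levels rather than from the original residual, its reference images match what a decoder can rebuild from the coded data.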

The reference image memory 107 stores the locally decoded, filtered image signal 158, which the inter prediction unit 108 refers to as the reference image signal 158 when it generates a predicted image as needed.

The inter prediction unit 108 performs inter prediction using the reference image signal 158 stored in the reference image memory 107. Specifically, the inter prediction unit 108 performs block matching between the prediction target block and the reference image signal 158 to derive the amount of motion displacement (the motion vector). Based on this motion vector, the inter prediction unit 108 performs motion compensation (with interpolation for fractional-precision motion) to generate an inter predicted image. H.264 allows interpolation down to 1/4-pixel precision. The derived motion vector is entropy coded as part of the motion information 160.
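Block matching as described above can be sketched as an exhaustive integer-pixel search. The following Python example uses the sum of absolute differences (SAD) as the matching cost and a square search window; both are common choices assumed here for illustration, and fractional-pixel interpolation is omitted. The names `sad` and `block_match` are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block at (rx, ry)."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def block_match(cur, ref, bx, by, size, search):
    """Full search over integer displacements within +/- search pixels.
    Returns the motion vector (mvx, mvy) with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


# Reference frame: a simple ramp. Current frame: the same content
# displaced so that the true motion vector is (2, 1).
ref = [[x + 100 * y for x in range(8)] for y in range(8)]
cur = [[(x + 2) + 100 * (y + 1) for x in range(8)] for y in range(8)]
mv = block_match(cur, ref, bx=2, by=2, size=4, search=2)
```

A practical encoder would usually refine this integer result to fractional precision and weigh the cost of coding the vector, neither of which is modeled here.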

The motion information memory 110 includes the motion information compression unit 109, compresses the motion information 160 as appropriate to reduce the amount of information, and temporarily stores the result as reference motion information 166. As shown in FIG. 5, the motion information memory 110 is held in units of frames (or slices) and further includes a spatial-direction reference motion information memory 501, which stores the motion information 160 of the current frame as reference motion information 166, and a temporal-direction reference motion information memory 502, which stores the motion information 160 of already encoded frames as reference motion information 166. A plurality of temporal-direction reference motion information memories 502 may be provided, according to the number of reference frames that the frame to be encoded uses for prediction.

The spatial-direction reference motion information memory 501 and the temporal-direction reference motion information memory 502 may be logical partitions of the same physical memory. Furthermore, the spatial-direction reference motion information memory 501 may hold only the spatial-direction motion information required by the frame currently being encoded, and spatial-direction motion information that no longer needs to be referenced may be compressed in sequence and stored in the temporal-direction reference motion information memory 502.

The reference motion information 166 is held in the spatial-direction reference motion information memory 501 and the temporal-direction reference motion information memory 502 in units of a predetermined region (for example, 4×4 pixel blocks). The reference motion information 166 further includes information indicating whether the region was encoded by inter prediction or by intra prediction, both described later. Even when a coding unit (or prediction unit) is inter predicted using motion information 160 predicted from an already encoded region, without the motion vector value in the motion information 160 being encoded, as in the skip mode and direct mode defined in H.264 or the merge mode described later, the motion information of that coding unit (or prediction unit) is held as reference motion information 166.

When the encoding of the frame or slice to be encoded is finished, the spatial-direction reference motion information memory 501 of that frame is thereafter treated as the temporal-direction reference motion information memory 502 used for the frame to be encoded next. At this time, in order to reduce the memory capacity of the temporal-direction reference motion information memory 502, the motion information 160 compressed by the motion information compression unit 109, described later, is stored in the temporal-direction reference motion information memory 502.

The prediction information 165 follows the prediction mode controlled by the coding control unit 114. As described above, either inter prediction or intra prediction (not shown) can be selected to generate the predicted image signal 159, and a plurality of modes can further be selected for each of intra prediction and inter prediction. The coding control unit 114 determines one of the plurality of intra and inter prediction modes to be the optimum prediction mode and sets the prediction information 165 accordingly.

For example, the coding control unit 114 determines the optimum prediction mode using the cost function shown in the following expression (1).

K = SAD + λ × OH   (1)

In expression (1) (hereinafter referred to as the simple coding cost), OH denotes the code amount of the prediction information 160 (for example, motion vector information and prediction block size information), and SAD denotes the sum of absolute differences between the prediction target block and the predicted image signal 159 (that is, the cumulative sum of the absolute values of the prediction error signal 152). λ denotes a Lagrange multiplier determined from the value of the quantization information (quantization parameter), and K denotes the coding cost. When expression (1) is used, the prediction mode that minimizes the coding cost K is determined to be the optimum prediction mode in terms of generated code amount and prediction error. As variations of expression (1), the coding cost may be estimated from OH alone or from SAD alone, or by using a value obtained by applying a Hadamard transform to the SAD, or an approximation of that value.
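The simple coding cost can be illustrated with a short sketch. It assumes that expression (1) has the usual Lagrangian form K = SAD + λ × OH; the function names, the sample blocks, and the values of λ and OH are hypothetical and chosen only to show how a mode with a slightly larger prediction error can still win when its side information is cheaper.

```python
def sad(target, predicted):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(t - p)
               for t_row, p_row in zip(target, predicted)
               for t, p in zip(t_row, p_row))


def simple_cost(target, predicted, overhead_bits, lam):
    """Simple coding cost K = SAD + lambda * OH (assumed form of expression (1))."""
    return sad(target, predicted) + lam * overhead_bits


# Hypothetical 2x2 blocks: mode "a" predicts better but needs more side information.
target = [[10, 12], [14, 16]]
pred_a = [[10, 11], [15, 16]]   # SAD = 2
pred_b = [[9, 12], [14, 18]]    # SAD = 3
k_a = simple_cost(target, pred_a, overhead_bits=12, lam=0.5)
k_b = simple_cost(target, pred_b, overhead_bits=4, lam=0.5)
best = min(("a", k_a), ("b", k_b), key=lambda item: item[1])[0]
```

Because no reconstruction is needed, this cost can be evaluated for every available mode at low expense.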

It is also possible to determine the optimum prediction mode by using a provisional coding unit (not shown). For example, the coding control unit 114 determines the optimum prediction mode using the cost function shown in the following expression (2).

J = D + λ × R   (2)

In expression (2), D denotes the sum of squared errors between the prediction target block and the locally decoded image (that is, the coding distortion), R denotes the code amount, estimated by provisional coding, of the prediction error between the prediction target block and the predicted image signal 159 of the prediction mode, and J denotes the coding cost. Deriving the coding cost J of expression (2) (hereinafter referred to as the detailed coding cost) requires provisional coding and local decoding for every prediction mode, which increases the circuit scale or the amount of computation. On the other hand, because the coding cost J is derived from a more accurate coding distortion and code amount, the optimum prediction mode can be determined with high accuracy and high coding efficiency is easier to maintain. As variations of expression (2), the coding cost may be estimated from R alone or from D alone, or by using approximations of R or D. These costs may also be used hierarchically. Based on information obtained in advance about the prediction target block (the prediction modes of surrounding pixel blocks, the result of image analysis, and so on), the coding control unit 114 may narrow down beforehand the number of prediction mode candidates to which the decision using expression (1) or expression (2) is applied.
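A minimal sketch of the detailed coding cost follows. It assumes that expression (2) has the form J = D + λ × R and that the locally decoded block and the provisional-coding bit count are already available; how they are produced (transform, quantization, entropy coding) is outside the sketch, and all names are hypothetical.

```python
def ssd(target, reconstructed):
    """Sum of squared errors between the target block and its local reconstruction."""
    return sum((t - r) ** 2
               for t_row, r_row in zip(target, reconstructed)
               for t, r in zip(t_row, r_row))


def detailed_cost(target, reconstructed, rate_bits, lam):
    """Detailed coding cost J = D + lambda * R (assumed form of expression (2))."""
    return ssd(target, reconstructed) + lam * rate_bits


# Hypothetical values: distortion 5, 10 bits from provisional coding, lambda 0.5.
j = detailed_cost([[1, 2], [3, 4]], [[1, 3], [3, 2]], rate_bits=10, lam=0.5)
```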

As a modification of the present embodiment, a two-stage mode decision that combines expression (1) and expression (2) makes it possible to further reduce the number of prediction mode candidates while maintaining coding performance. Unlike expression (2), the simple coding cost of expression (1) requires no local decoding and can therefore be computed quickly. Because the moving image coding apparatus of the present embodiment has more prediction modes than H.264, a mode decision that uses the detailed coding cost for every mode is not practical. Therefore, as a first step, a mode decision using the simple coding cost is performed over the prediction modes available for the pixel block, and prediction mode candidates are derived.

Here, the number of prediction mode candidates is varied by exploiting the property that the correlation between the simple coding cost and the detailed coding cost becomes higher as the quantization parameter, which defines the coarseness of quantization, becomes larger.
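The two-stage decision can be sketched as follows. The text states only that the number of candidates changes with the quantization parameter; the thresholds in `candidates_for_qp` are hypothetical, as are the function names and the sample costs. The sketch shortlists modes by the simple cost and then picks the shortlist member with the lowest detailed cost.

```python
def candidates_for_qp(qp, max_candidates=4):
    """Hypothetical rule: a larger QP means the two costs correlate better,
    so fewer candidates need the expensive second stage."""
    if qp >= 36:
        return 1
    if qp >= 28:
        return 2
    return max_candidates


def two_stage_decision(modes, simple_cost, detailed_cost, qp):
    """Stage 1: rank by simple cost. Stage 2: detailed cost on the shortlist only."""
    count = min(candidates_for_qp(qp), len(modes))
    shortlist = sorted(modes, key=simple_cost)[:count]
    return min(shortlist, key=detailed_cost)


# Hypothetical per-mode costs, looked up from tables for illustration.
modes = ["intra", "inter16", "inter8", "skip"]
simple = {"intra": 9, "inter16": 5, "inter8": 6, "skip": 7}
detailed = {"intra": 8, "inter16": 7, "inter8": 4, "skip": 3}
chosen = two_stage_decision(modes, simple.get, detailed.get, qp=30)
```

With a shortlist of one, the second stage never changes the first-stage result, so the provisional coding could be skipped entirely in that case.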

Next, the prediction process of the image coding apparatus 100 will be described.
Although not shown, the image coding apparatus 100 of FIG. 1 provides a plurality of prediction modes, which differ from one another in the method of generating the predicted image signal 159 and in the motion compensation block size. The methods by which the prediction unit 108 generates the predicted image signal 159 fall broadly into two kinds: intra prediction (intra-frame prediction), which generates a predicted image using the reference image signal 158 of the frame (or field) to be encoded, and inter prediction (inter-frame prediction), which generates a predicted image using the reference image signal 158 of one or more already encoded reference frames (or reference fields). The prediction unit 108 selectively switches between intra prediction and inter prediction to generate the predicted image signal 159 of the block to be encoded.

FIG. 6A shows an example of inter prediction. Inter prediction is typically executed in units of prediction units, and each prediction unit can have different motion information 160. In inter prediction, as shown in FIG. 6A, the predicted image signal 159 is generated using the reference image signal 158 of a block 602. The block 602 is located at a position spatially shifted, according to the motion vector included in the motion information 160, from a block 601, which is a pixel block in an already encoded reference frame (for example, the encoded frame one frame earlier) at the same position as the prediction unit to be encoded. In other words, the generation of the predicted image signal 159 uses the reference image signal 158 of the block 602 in the reference frame, which is specified by the position (coordinates) of the block to be encoded and the motion vector included in the motion information 160.
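For integer-precision motion, the relationship between block 601, the motion vector, and block 602 can be sketched as below. The function name and the list-of-rows frame layout are assumptions made for illustration, and the sketch simply rejects vectors that point outside the reference frame rather than padding it.

```python
def motion_compensate(ref_frame, x, y, width, height, mv):
    """Copy the block displaced by mv = (mv_x, mv_y) from the co-located position (x, y)."""
    mv_x, mv_y = mv
    left, top = x + mv_x, y + mv_y
    if left < 0 or top < 0 or top + height > len(ref_frame) or left + width > len(ref_frame[0]):
        raise ValueError("displaced block lies outside the reference frame")
    return [row[left:left + width] for row in ref_frame[top:top + height]]


# Hypothetical 4x4 reference frame whose sample value encodes its position.
ref = [[r * 4 + c for c in range(4)] for r in range(4)]
pred = motion_compensate(ref, x=1, y=1, width=2, height=2, mv=(1, -1))
```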

In inter prediction, motion compensation with fractional-pixel accuracy (for example, 1/2-pixel or 1/4-pixel accuracy) is possible, and the values of the interpolated pixels are generated by filtering the reference image signal 158. For example, in H.264, interpolation down to 1/4-pixel accuracy is possible for the luminance signal. The interpolation can be executed using any filtering, not only the filtering defined in H.264.
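As one concrete instance of such filtering, the sketch below applies the six-tap filter (1, −5, 20, 20, −5, 1)/32 that H.264 uses for half-sample luma positions to a single row of samples, and derives a quarter-sample value by rounded averaging of the neighbouring integer and half-sample values. Clamping the index at the row ends and the function names are assumptions of this sketch; the embodiment allows any other filter.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264 half-sample luma filter, normalised by 32


def half_pel(samples, i):
    """Half-sample value between samples[i] and samples[i + 1] (8-bit range)."""
    last = len(samples) - 1
    # Clamp indices at the row ends (edge padding is an assumption here).
    window = [samples[min(max(i + k, 0), last)] for k in range(-2, 4)]
    acc = sum(tap * s for tap, s in zip(TAPS, window))
    return min(max((acc + 16) >> 5, 0), 255)


def quarter_pel(samples, i):
    """Quarter-sample value between samples[i] and the following half-sample position."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
h = half_pel(ramp, 3)      # between 30 and 40
q = quarter_pel(ramp, 3)   # between 30 and the half-sample value
```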

Inter prediction is not limited to using the reference frame one frame earlier as shown in FIG. 6A; any already encoded reference frame may be used, as shown in FIG. 6B. When the reference image signals 158 of a plurality of reference frames at different temporal positions are held, the information indicating from which temporal position the predicted image signal 159 was generated is expressed by a reference frame number. The reference frame number is included in the motion information 160 and can be changed in units of regions (pictures, slices, blocks, and so on). That is, a different reference frame can be used for each prediction unit. As one example, when the encoded reference frame one frame earlier is used for prediction, the reference frame number of the region is set to 0, and when the encoded reference frame two frames earlier is used for prediction, the reference frame number of the region is set to 1. As another example, when the reference image memory 107 holds the reference image signal 158 of only one frame (only one reference frame is held), the reference frame number is always set to 0.
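The numbering in the first example (0 for the most recent encoded frame, 1 for the one before it) can be sketched with a small buffer. The class and method names are hypothetical, and the assumption that the oldest frame is discarded when the buffer is full is made only for illustration.

```python
from collections import deque


class ReferenceBuffer:
    """Holds the most recent encoded frames; index 0 is the frame one frame earlier."""

    def __init__(self, max_frames):
        self.frames = deque(maxlen=max_frames)

    def push(self, frame):
        # The newest frame goes to the front; the oldest is dropped when full.
        self.frames.appendleft(frame)

    def get(self, ref_frame_number):
        return self.frames[ref_frame_number]


buf = ReferenceBuffer(max_frames=2)
for name in ("frame0", "frame1", "frame2"):
    buf.push(name)
```

With `max_frames=1`, only reference frame number 0 is ever valid, which matches the second example in the text.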

Furthermore, in inter prediction, a prediction unit size suited to the block to be encoded can be selected from a plurality of sizes prepared in advance. For example, motion compensation can be performed for each prediction unit obtained by dividing a coding tree unit as shown in FIGS. 7A to 7G. Motion compensation can also be performed for each prediction unit obtained by a non-rectangular division, as shown in FIGS. 7F and 7G.

As described above, the motion information 160 of the already encoded pixel blocks (for example, 4×4 pixel blocks) in the frame to be encoded, which is used for inter prediction, is held as the reference motion information 166, so the optimum motion compensation block shape, motion vector, and reference frame number can be used according to the local properties of the input image signal 151. Coding units and prediction units can be combined arbitrarily. When the coding tree unit is a 64×64 pixel block, each of the four coding tree units (32×32 pixel blocks) obtained by dividing the 64×64 pixel block can be further divided into four, so that blocks from 64×64 pixels down to 16×16 pixels can be used hierarchically. In the same way, blocks from 64×64 pixels down to 8×8 pixels can be used hierarchically. If a prediction unit is obtained by dividing a coding tree unit into four, hierarchical motion compensation from 64×64 pixel blocks down to 4×4 pixel blocks can be executed.
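The hierarchy of block sizes described here can be enumerated with a few lines. The function and parameter names are hypothetical; the sketch assumes square blocks and that each quad split halves the side length.

```python
def block_sizes(ctu_size=64, min_cu_size=8, pu_split=True):
    """Side lengths reachable by repeated quad splits, largest first."""
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2
    if pu_split:
        # One further four-way split at prediction-unit level.
        sizes.append(sizes[-1] // 2)
    return sizes


sizes = block_sizes()
```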

In inter prediction, bidirectional prediction using two kinds of motion compensation can also be executed for the pixel block to be encoded. In H.264, two kinds of motion compensation are applied to the pixel block to be encoded, and a new predicted image signal is obtained as the weighted average of the two resulting predicted image signals (not shown). In bidirectional prediction, the two kinds of motion compensation are called list 0 prediction and list 1 prediction, respectively.
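The weighted average of the list 0 and list 1 predictions can be sketched as follows. Integer weights with round-to-nearest division are an assumption of this sketch (equal weights give the plain rounded average), and the function name is hypothetical.

```python
def bi_predict(pred0, pred1, w0=1, w1=1):
    """Weighted average of two predicted blocks, rounded to the nearest integer."""
    total = w0 + w1
    return [[(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred0, pred1)]


list0 = [[100, 50]]
list1 = [[110, 53]]
averaged = bi_predict(list0, list1)          # equal weights
weighted = bi_predict(list0, list1, 3, 1)    # list 0 weighted three times as much
```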

<Description of skip mode, merge mode, and inter mode>
The image coding apparatus 100 according to the present embodiment uses a plurality of prediction modes with different coding processes, as shown in FIG. 8. The skip mode in the figure encodes only the syntax relating to the predicted motion information position 954, described later, and encodes no other syntax. The merge mode encodes only the syntax relating to the predicted motion information position 954 and the transform coefficient information 153, and encodes no other syntax. The inter mode encodes the syntax relating to the predicted motion information position 954, the differential motion information 953 described later, and the transform coefficient information 153. These modes are switched by the prediction information 165 controlled by the coding control unit 114.
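The difference between the three modes comes down to which syntax elements are written. The table below restates that as a lookup; the key names (`mvp_idx`, `mvd`, `coefficients`) are hypothetical labels for the predicted motion information position 954, the differential motion information 953, and the transform coefficient information 153.

```python
# Syntax elements encoded in each mode (labels are illustrative, not bitstream names).
SYNTAX_BY_MODE = {
    "skip": ("mvp_idx",),
    "merge": ("mvp_idx", "coefficients"),
    "inter": ("mvp_idx", "mvd", "coefficients"),
}


def coded_elements(mode, block):
    """Return only the fields of `block` that the given mode writes to the bitstream."""
    return {name: block[name] for name in SYNTAX_BY_MODE[mode]}


block = {"mvp_idx": 2, "mvd": (1, 0), "coefficients": [7, 0, -1]}
skip_fields = coded_elements("skip", block)
```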

<Motion information coding unit 403>
The motion information coding unit 403 will now be described with reference to FIG. 9.

The motion information coding unit 403 includes a reference motion vector acquisition unit 901, a predicted motion vector selection switch (also called a predicted motion information selection switch) 902, a subtraction unit 903, a differential motion information coding unit 904, a predicted motion information position coding unit 905, and a multiplexing unit 906.

The reference motion vector acquisition unit 901 receives the reference motion information 166 and the reference position information 164 as inputs and generates at least one predicted motion information candidate (also called a predicted motion vector candidate) 951 (951A, 951B, ...). FIGS. 10 and 11 show examples of the positions of the predicted motion information candidates 951 relative to the target prediction unit. FIG. 10 shows the positions of the prediction units that are spatially adjacent to the target prediction unit. AX (X = 0 to nA−1) denotes the prediction units adjacent to the left of the target prediction unit, BY (Y = 0 to nB−1) denotes the prediction units adjacent above the target prediction unit, and C, D, and E denote the prediction units adjacent to the upper right, upper left, and lower left of the target prediction unit, respectively. FIG. 11 shows the position of a prediction unit in an already encoded reference frame relative to the prediction unit to be encoded. Col in FIG. 11 denotes the prediction unit in the reference frame that is at the same position as the prediction unit to be encoded. FIG. 12 shows an example of a list that relates the block positions of the plurality of predicted motion information candidates 951 to the index Mvpidx. Mvpidx values 0 to 2 indicate predicted motion vector candidates 951 located in the spatial direction, and Mvpidx value 3 indicates the predicted motion vector candidate 951 located in the temporal direction. Prediction unit position A is the position, among the AX shown in FIG. 10, of the inter-predicted prediction unit (that is, a prediction unit having reference motion information 166) with the smallest value of X. Likewise, prediction unit position B is the position, among the BY shown in FIG. 10, of the inter-predicted prediction unit (that is, a prediction unit having reference motion information 166) with the smallest value of Y. When prediction unit position C is not inter predicted, the reference motion information 166 of prediction unit position D is used in place of that of prediction unit position C. When neither prediction unit position C nor D is inter predicted, the reference motion information 166 of prediction unit position E is used in place of that of prediction unit position C.
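The selection rules for positions A, B, and C can be sketched as below. Each neighbour is represented by its motion vector, or by `None` when it is not inter predicted. The ordering A, B, C for Mvpidx 0 to 2 is an assumption (the text states only that 0 to 2 are spatial and 3 is temporal), the zero-vector substitution follows the fallback the description gives for missing reference motion information, and the function names are hypothetical.

```python
ZERO = (0, 0)


def first_inter(units):
    """First unit in scan order that carries motion information, else None."""
    for unit in units:
        if unit is not None:
            return unit
    return None


def build_candidates(left, above, c, d, e, col):
    """Candidate list indexed by Mvpidx: [A, B, C (falling back to D, then E), Col]."""
    picked = [first_inter(left), first_inter(above), first_inter([c, d, e]), col]
    # A position without reference motion information yields a zero vector.
    return [ZERO if mv is None else mv for mv in picked]


cands = build_candidates(
    left=[None, (1, 2)],     # A0 is intra, so A1 becomes position A
    above=[(3, 4), (9, 9)],  # B0 is inter, so it becomes position B
    c=None, d=None, e=(5, 6),
    col=(7, 8),
)
```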

When the prediction unit to be encoded is larger than the minimum prediction unit, the prediction unit at position Col may hold a plurality of pieces of reference motion information 166 in the temporal-direction reference motion information memory 502. In this case, the reference motion information 166 in the prediction unit at position Col is acquired according to the reference position information 164. Hereinafter, the position from which the reference motion information 166 is acquired in the prediction unit at position Col is called the reference motion information acquisition position. FIGS. 13A to 13F show, for each size of the prediction unit to be encoded (32x32 to 16x16), an example of the reference motion information acquisition position when the reference position information 164 indicates the center of the prediction unit at position Col. Each block in the figures represents a 4x4 prediction unit, and the circle marks the position of the 4x4 prediction unit acquired as the predicted motion information candidate 951. Another example of the reference motion information acquisition position is shown in FIGS. 14A to 14F. In FIGS. 14A to 14F, no 4x4 prediction unit exists at the position of the circle, so the predicted motion information candidate 951 is generated by a predetermined method, such as the average or the median of the reference motion information 166 of the four 4x4 prediction units adjacent to the circle. As yet another example of the reference motion information acquisition position, the reference motion information 166 of the 4x4 prediction unit at the upper-left corner of the prediction unit at position Col may be used as the predicted motion information candidate 951. Beyond these examples, the predicted motion information candidate 951 may be generated using any position and method, as long as it is predetermined.
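Two of the acquisition rules can be sketched as follows. Since the figures are not reproduced here, the choice of which 4x4 unit counts as "the center" (the one containing the center sample) is an assumption, as is taking the median of four values as the mean of the two middle ones; all names are hypothetical.

```python
def center_block(pu_width, pu_height, unit=4):
    """(column, row) of the 4x4 unit that contains the centre sample of the Col unit."""
    return (pu_width // 2) // unit, (pu_height // 2) // unit


def median_of_four(mvs):
    """Component-wise median of four motion vectors (mean of the two middle values)."""
    def middle(values):
        ordered = sorted(values)
        return (ordered[1] + ordered[2]) // 2
    return middle([mv[0] for mv in mvs]), middle([mv[1] for mv in mvs])


pos_16 = center_block(16, 16)
pos_32x16 = center_block(32, 16)
blended = median_of_four([(0, 0), (4, 2), (2, 8), (10, -4)])
```

Taking the upper-left 4x4 unit instead corresponds to always returning `(0, 0)` from `center_block`.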

When no reference motion information 166 exists, motion information 160 having a zero vector is output as the predicted motion information candidate 951.

In this way, at least one predicted motion information candidate 951 is output from the reference motion blocks. When the reference frame number of a predicted motion information candidate 951 differs from the reference frame number of the prediction unit to be encoded, the predicted motion information candidate 951 may be scaled according to the two reference frame numbers.
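One way such scaling could work is to stretch the candidate vector by the ratio of temporal distances. The sketch below assumes that reference frame number n means "n + 1 frames earlier", in line with the example given for reference frame numbers 0 and 1; the text does not specify the scaling formula, so the formula, the rounding rule, and the names are all assumptions.

```python
def _div_round(numerator, denominator):
    """Integer division rounded to nearest, halves away from zero."""
    sign = 1 if numerator >= 0 else -1
    return sign * ((abs(numerator) * 2 + denominator) // (2 * denominator))


def scale_mv(mv, cand_ref_number, target_ref_number):
    """Scale a candidate vector from its own reference distance to the target's."""
    cand_distance = cand_ref_number + 1      # assumed: number n is n + 1 frames back
    target_distance = target_ref_number + 1
    return (_div_round(mv[0] * target_distance, cand_distance),
            _div_round(mv[1] * target_distance, cand_distance))


shrunk = scale_mv((8, -4), cand_ref_number=1, target_ref_number=0)
stretched = scale_mv((3, -3), cand_ref_number=0, target_ref_number=1)
```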

The predicted motion information selection switch 902 selects one of the plurality of predicted motion information candidates 951 according to a command from the coding control unit 114 and outputs it as the predicted motion information 952. The predicted motion information selection switch 902 may also output the predicted motion information position information 954 described later. The selection may be made using an evaluation function such as expression (1) or (2). The subtraction unit 903 subtracts the predicted motion information 952 from the motion information 160 and outputs the differential motion information 953 to the differential motion information coding unit 904. The differential motion information coding unit 904 encodes the differential motion information 953 and outputs encoded data 960A. In the skip mode and the merge mode, the differential motion information coding unit 904 does not need to encode the differential motion information 953.
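The selection and subtraction steps can be sketched together. As a stand-in for the evaluation function, the sketch picks the candidate that minimises the sum of absolute components of the difference vector, on the assumption that a smaller difference costs fewer bits; this proxy and the function name are hypothetical.

```python
def select_predictor(mv, candidates):
    """Return (index of the chosen candidate, differential motion vector)."""
    def difference_size(candidate):
        # Proxy for the code amount of the differential motion information.
        return abs(mv[0] - candidate[0]) + abs(mv[1] - candidate[1])

    index = min(range(len(candidates)), key=lambda i: difference_size(candidates[i]))
    chosen = candidates[index]
    return index, (mv[0] - chosen[0], mv[1] - chosen[1])


index, mvd = select_predictor((5, 3), [(0, 0), (4, 3), (6, 6)])
```

A decoder that builds the same candidate list can recover the motion vector as candidate plus difference, which is why only the index and the difference need to be sent.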

The predicted motion information position coding unit 905 encodes the predicted motion information position information 954 (Mvpidx), which indicates which predicted motion information candidate 951 in the list of FIG. 12 was selected, and outputs encoded data 960B. The predicted motion information position information 954 is encoded using a fixed-length code or a variable-length code generated from the total number of predicted motion information candidates 951. Variable-length coding that exploits the correlation with adjacent blocks may also be used. Furthermore, when several predicted motion information candidates 951 carry identical information, a code table may be created from the total number of predicted motion information candidates 951 that remain after the duplicates are removed, and the predicted motion information position information 954 may be encoded with that table. When there is only one predicted motion information candidate 951, that candidate is determined to be the predicted motion information 952, so the predicted motion information position information 954 does not need to be encoded.
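The effect of removing duplicate candidates on a fixed-length index code can be sketched as follows. The function names are hypothetical, and the bit count assumes a plain fixed-length binary code; a variable-length code would behave differently.

```python
from math import ceil, log2


def unique_candidates(candidates):
    """Drop repeated candidates while keeping the original order."""
    kept = []
    for candidate in candidates:
        if candidate not in kept:
            kept.append(candidate)
    return kept


def index_bits(num_candidates):
    """Bits for a fixed-length index; nothing is sent when only one candidate remains."""
    return 0 if num_candidates <= 1 else ceil(log2(num_candidates))


raw = [(1, 2), (1, 2), (0, 0), (1, 2)]
pruned = unique_candidates(raw)
bits_before = index_bits(len(raw))
bits_after = index_bits(len(pruned))
```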

The method of deriving the predicted motion information candidates 951 does not have to be the same in the skip mode, the merge mode, and the inter mode, and may be set independently for each. In the present embodiment, the skip mode and the inter mode are described as using the same derivation method, and the merge mode as using a different one.

<Details of motion information compression unit 109>
First, the motion information compression process will be described with reference to FIG. 15. FIG. 15 shows how the reference motion information 166 of the spatial-direction reference motion information memory 501 is compressed and stored in the temporal-direction reference motion information memory 502. For each motion information compression block (a 16x16 pixel block in the figure) of the spatial-direction reference motion information memory 501, the reference motion information 166 held at the representative motion information position is stored in the temporal-direction reference motion information memory 502. When the motion information coding process described above is performed, the reference motion information 166 held at the reference motion information acquisition position described above is set as the predicted motion information candidate 951. Alternatively, the motion information compression block may be regarded as virtually holding the same reference motion information 166 throughout, and the reference motion information 166 held at the reference motion information acquisition position may be set as the predicted motion information candidate 951 (the same predicted motion information candidate 951 is derived).
Next, the motion information compression unit 109 will be described with reference to the flowchart shown in FIG. 16. When the encoding of a frame (or of any other unit, such as a slice or a coding unit) is finished, the motion information compression unit 109 compresses the motion information 160 and stores it in the temporal-direction reference motion information memory 502.

First, the reference position information 164 is acquired from the coding control unit 114 (step S1601), and the frame is divided into motion information compression blocks, which are the units in which the motion information 160 is compressed (step S1602). A motion information compression block is a pixel block larger than the unit in which the motion information 160 is held by the motion compensation process (typically a 4x4 pixel block), and is typically a 16x16 pixel block. The motion information compression block may also be a 64x64 pixel block, a 32x32 pixel block, an 8x8 pixel block, a rectangular pixel block, or a pixel region of arbitrary shape.

Next, the representative motion information position is generated according to the reference position information 164 (step S1603). As one example, when the motion information compression block is a 16x16 pixel block, the reference motion information acquisition position for a prediction unit of size 16x16, shown in FIGS. 13D, 14D, and 17D respectively, is used as the representative motion information position. Next, the reference motion information 166 at the generated representative motion information position is set as the representative motion information (step S1604), and the representative motion information is stored in the temporal-direction reference motion information memory (step S1605). Steps S1604 and S1605 are executed for all motion information compression blocks.
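The flow of steps S1601 to S1605 can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the motion field is assumed to be a 2D list with one entry per 4x4 pixel block, the names `compress_motion_field` and `rep_offset` are hypothetical, and the clamping at the frame edge is an added assumption.

```python
def compress_motion_field(field, block=4, rep_offset=(2, 2)):
    """Keep one motion entry per motion information compression block.

    field      -- 2D list, one motion entry per 4x4 pixel block (assumed layout)
    block      -- compression block size in 4x4 units (4 -> 16x16 pixels)
    rep_offset -- (x, y) of the representative position inside each block,
                  in 4x4 units (hypothetical stand-in for the reference
                  position information 164)
    """
    rows, cols = len(field), len(field[0])
    ox, oy = rep_offset
    compressed = []
    for by in range(0, rows, block):        # S1602: split into compression blocks
        out_row = []
        for bx in range(0, cols, block):
            y = min(by + oy, rows - 1)      # S1603: representative position,
            x = min(bx + ox, cols - 1)      #        clamped at the frame edge
            out_row.append(field[y][x])     # S1604: representative motion info
        compressed.append(out_row)          # S1605: store in the temporal memory
    return compressed


# A 32x32-pixel frame has 8x8 entries at 4x4 granularity; each entry is (mvx, mvy).
field = [[(x, y) for x in range(8)] for y in range(8)]
small = compress_motion_field(field)
print(small)  # one entry per 16x16 block remains
```

With 16x16 compression blocks, the 64 entries of the example frame shrink to 4.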

If the unit in which the motion information 160 is held is an MxM block and the size of the motion information compression block is NxN (N being a multiple of M), executing the above motion information compression process reduces the capacity of the reference motion information memory to (MxM)/(NxN) of its original size.
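As a worked check of this ratio, take the typical sizes mentioned in this description, a 4x4 holding unit (M = 4) and a 16x16 compression block (N = 16):

```latex
\frac{M \times M}{N \times N} = \frac{4 \times 4}{16 \times 16} = \frac{16}{256} = \frac{1}{16}
```

That is, one motion information entry is kept for every sixteen 4x4 blocks.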

<Another embodiment of the representative motion information position>
As another example of generating the representative motion information position, the center of a plurality of reference motion information acquisition positions may be used as the representative motion information position. FIGS. 18A and 18B show the representative motion information position for each motion information compression block of size 16x16. FIG. 18A shows the representative motion information positions when the reference motion information acquisition positions are those shown in FIG. 13D, and FIG. 18B shows them when the reference motion information acquisition positions are those shown in FIG. 17D. The circles in FIGS. 18A and 18B indicate the reference motion information acquisition positions when the prediction unit is a 16x16 block, and the representative motion information position, marked with a cross, is placed at the center (also called the centroid) of four reference motion information acquisition positions.

As yet another example of generating the representative motion information position, the reference motion information acquisition positions for each of a plurality of prediction unit sizes may be held as the reference position information 164, and the representative motion information position may be generated from these reference motion information acquisition positions.

As an example of generating the representative motion information position from the reference motion information acquisition positions held for each of a plurality of prediction unit sizes, FIG. 19 shows, for a tree block that is a 64x64 pixel block, the center of the prediction unit (the reference motion information acquisition position) for each prediction unit size of 16x16 or larger.

As another example of generating the representative motion information position, the representative motion information position may be set using the reference motion information acquisition positions located in each motion information compression block. FIG. 20A shows an example in which the centroid of the plurality of reference motion information acquisition positions in each motion information compression block is set as the representative motion information position. If the centroid does not coincide with the position of a 4x4 block, the nearest 4x4 block may be used as the representative motion information position, or the reference motion vector 166 at the centroid may be generated by an interpolation method such as bilinear interpolation.
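The centroid variant can be sketched as below. This is only an illustration: the function name `centroid_block` and the pixel-coordinate input are assumptions, and "nearest 4x4 block" is interpreted here as the 4x4 block that contains the centroid, which is one possible reading.

```python
def centroid_block(positions, unit=4):
    """Return the 4x4 block (in block units) that contains the centroid.

    positions -- list of (x, y) pixel coordinates of reference motion
                 information acquisition positions (illustrative input)
    unit      -- edge length of the motion holding unit in pixels
    """
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    # Snap the centroid to a 4x4 block; interpolating the neighbouring
    # vectors (for example bilinearly) would be the alternative.
    return int(cx // unit), int(cy // unit)


# Four acquisition positions placed symmetrically in a 16x16 block.
print(centroid_block([(4, 4), (12, 4), (4, 12), (12, 12)]))  # (2, 2)
```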

FIG. 20B shows an example in which one of the plurality of reference motion information acquisition positions is selected for each motion information compression block and set as the representative motion information position.

FIGS. 21A and 21B show further examples in which the reference motion information acquisition position is the same in every motion information compression block within the tree block. Because the representative motion information position is the same in all motion information compression blocks, there is no need to switch it according to the position within the tree block. Besides the positions shown in FIGS. 21A and 21B, the representative motion information position may be anywhere in the motion information compression block, such as its upper-left or upper-right corner.

As an example of generating the representative motion information position, the position may be indicated using BlkIdx, which gives the positions of the 4x4 blocks in the motion information compression block in Z-scan order. When the size of the motion information compression block is 16x16, the representative motion information position shown in FIG. 21A corresponds to BlkIdx = 12, and the one shown in FIG. 21B corresponds to BlkIdx = 15.
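To make the BlkIdx values concrete, the sketch below converts a Z-scan index into block coordinates. It assumes the usual Z-scan convention (0 upper left, 1 upper right, 2 lower left, 3 lower right within each quadrant); the function name `blkidx_to_xy` is hypothetical.

```python
def blkidx_to_xy(blk_idx, bits=2):
    """Convert a Z-scan index to (x, y) in 4x4-block units.

    bits -- number of bits per coordinate (2 for a 16x16 block of 4x4 units)
    """
    x = y = 0
    for i in range(bits):
        x |= ((blk_idx >> (2 * i)) & 1) << i       # even bits carry x
        y |= ((blk_idx >> (2 * i + 1)) & 1) << i   # odd bits carry y
    return x, y


print(blkidx_to_xy(12))  # (2, 2): pixel offset (8, 8) in a 16x16 block
print(blkidx_to_xy(15))  # (3, 3): pixel offset (12, 12), the lower-right 4x4 block
```

Under this convention, BlkIdx = 12 is the 4x4 block just below and to the right of the block center, and BlkIdx = 15 is the lower-right corner block.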

As another example of the motion information compression process, the reference frame number may be included in the compression in order to reduce the memory capacity needed for reference frame numbers. In this case, the reference frame number held at the representative motion information position is stored in the memory for reference frame numbers. Accordingly, the spatial-direction reference motion information memory 501 and the temporal-direction reference motion information memory 502 shown in FIG. 5 store the reference frame number in addition to the motion vector information.

As yet another example of the motion information compression process, when the reference frame number is not included in the compression, the motion vector information in the motion information at the representative motion information position may be scaled using the reference frame number and then stored in the motion information memory 110. A typical example of the scaling is linear scaling referenced to reference frame number zero: when the reference frame number has a value other than zero, the motion vector information is linearly scaled so that it refers to the reference frame corresponding to reference frame number zero. The reference for the scaling may also be a reference frame number other than zero. If the linear scaling involves a division, the division may be tabulated in advance and carried out by a table lookup each time.
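A table-based linear scaling of this kind might look like the following sketch. Everything specific here is an assumption for illustration: the scaling is expressed through temporal distances to the reference frames (`dist_src` for the frame actually referenced, `dist_ref0` for reference frame number zero), and the 14-bit fixed-point reciprocal table is a generic technique, not a table taken from this description.

```python
SHIFT = 14       # fixed-point precision (assumed)
MAX_DIST = 16    # largest temporal distance tabulated (assumed)
# Precomputed reciprocals: RECIP[d] is about 2**SHIFT / d, so scaling
# needs only a multiply and a shift at run time.
RECIP = [0] + [((1 << SHIFT) + d // 2) // d for d in range(1, MAX_DIST + 1)]


def scale_component(v, dist_src, dist_ref0):
    """Scale one vector component by dist_ref0 / dist_src via table lookup."""
    factor = dist_ref0 * RECIP[dist_src]
    mag = (abs(v) * factor + (1 << (SHIFT - 1))) >> SHIFT   # round to nearest
    return mag if v >= 0 else -mag


def scale_mv(mv, dist_src, dist_ref0):
    """Linearly rescale mv so that it refers to reference frame number zero."""
    return tuple(scale_component(c, dist_src, dist_ref0) for c in mv)


# A vector pointing two frames back, rescaled to point one frame back.
print(scale_mv((8, -4), dist_src=2, dist_ref0=1))  # (4, -2)
```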

When the size of the motion information compression block is other than 16x16, the representative motion information position is generated by the same process as described above. In one example, when the size of the motion information compression block is 64x64, the reference motion information acquisition position for a prediction unit of size 64x64 is used as the representative motion information position. In yet another example, the representative motion information position for a 16x16 motion information compression block, as shown in FIGS. 21A, 21B, and so on, may be scaled horizontally and vertically according to the size of the motion information compression block, and the resulting position used as the representative motion information position.

When no reference motion information exists at the representative motion information position because the position lies outside the picture or slice, a position within the motion information compression block at which reference motion information can be acquired, such as the upper-left corner of the block, may be substituted as the new representative motion information position. The same substitution may be performed when the representative motion information position lies in a region to which intra prediction has been applied and therefore has no reference motion information.
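The substitution rule can be sketched as a simple lookup with a fallback. The dictionary representation and the name `pick_representative` are assumptions made for the example; positions without motion information (intra-coded, or outside the picture or slice) are simply absent from the dictionary.

```python
def pick_representative(block, rep_pos, fallback_pos=(0, 0)):
    """Return motion info at rep_pos, else at fallback_pos, else None.

    block        -- dict mapping (x, y) in 4x4 units to motion info
    rep_pos      -- preferred representative motion information position
    fallback_pos -- substitute position, here the upper-left corner
    """
    if rep_pos in block:
        return block[rep_pos]
    return block.get(fallback_pos)


# The preferred position (2, 2) is intra-coded, so the upper-left entry is used.
print(pick_representative({(0, 0): (1, 1)}, rep_pos=(2, 2)))  # (1, 1)
```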

<Syntax configuration>
The syntax used by the image coding apparatus 100 of FIG. 1 is described below.
The syntax represents the structure of the coded data (for example, the coded data 163 of FIG. 1) produced when the image coding apparatus encodes moving image data. When decoding this coded data, the moving image decoding apparatus interprets the syntax by referring to the same syntax structure. FIG. 22 illustrates the syntax 2200 used by the moving image coding apparatus of FIG. 1.

The syntax 2200 includes three parts: high-level syntax 2201, slice-level syntax 2202, and coding-tree-level syntax 2203. The high-level syntax 2201 contains syntax information for layers above the slice. A slice is a rectangular region or a contiguous region contained in a frame or field. The slice-level syntax 2202 contains the information needed to decode each slice. The coding-tree-level syntax 2203 contains the information needed to decode each coding tree (that is, each coding tree unit). Each of these parts contains more detailed syntax.

The high-level syntax 2201 includes sequence-level and picture-level syntax such as the sequence parameter set syntax 2204 and the picture parameter set syntax 2205. The slice-level syntax 2202 includes the slice header syntax 2206, the slice data syntax 2207, and so on. The coding-tree-level syntax 2203 includes the coding tree unit syntax 2208, the transform unit syntax 2209, the prediction unit syntax 2210, and so on.

The coding tree unit syntax 2208 can have a quadtree structure. Specifically, the coding tree unit syntax 2208 can be called recursively as a syntax element of the coding tree unit syntax 2208; that is, one coding tree unit can be subdivided by a quadtree. The coding tree unit syntax 2208 also contains the transform unit syntax 2209 and the prediction unit syntax 2210, which are called in each coding tree unit syntax 2208 at the leaves of the quadtree. The prediction unit syntax 2210 describes information relating to prediction, and the transform unit syntax 2209 describes information relating to inverse orthogonal transformation, quantization, and the like.

FIG. 23 illustrates the sequence parameter set syntax 2204 according to the present embodiment. The motion_vector_buffer_comp_flag shown in FIGS. 23A and 23B is a syntax element indicating whether the motion information compression according to the present embodiment is enabled or disabled for the sequence. When motion_vector_buffer_comp_flag is 0, the motion information compression according to the present embodiment is disabled for the sequence, and the processing of the motion information compression unit shown in FIG. 1 is therefore skipped. When motion_vector_buffer_comp_flag is 1, for example, the motion information compression according to the present embodiment is enabled for the sequence. The motion_vector_buffer_comp_ratio_log2 shown in FIGS. 23A and 23B is information indicating the unit of the motion information compression process, and is signaled when motion_vector_buffer_comp_flag is 1. It indicates, for example, the size of the motion information compression block according to the present embodiment: the size of the motion information compression block is the minimum unit of motion compensation multiplied by 2^{(motion_vector_buffer_comp_ratio_log2)}. The following example assumes that the minimum unit of motion compensation is a 4x4 pixel block, that is, that the reference motion information memory holds entries in units of 4x4 pixel blocks. When motion_vector_buffer_comp_ratio_log2 is 1, the size of the motion information compression block according to the present embodiment is an 8x8 pixel block. Similarly, when motion_vector_buffer_comp_ratio_log2 is 2, the size is a 16x16 pixel block. The motion_vector_buffer_comp_position shown in FIG. 23B is information indicating the representative motion information position within the motion information compression block, and is signaled when motion_vector_buffer_comp_flag is 1. It may, for example, indicate the reference motion information position within the motion information compression block as shown in FIGS. 21A and 21B, or the reference motion information position for each motion information compression block as shown in FIGS. 20A and 20B. The position may also be at the center of a plurality of blocks.
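The relation between motion_vector_buffer_comp_ratio_log2 and the compression block size can be checked with a one-line sketch. The function name `compression_block_size` and the default 4-pixel minimum unit are illustrative assumptions that follow the example in the text.

```python
def compression_block_size(ratio_log2, min_unit=4):
    """Edge length in pixels of the motion information compression block.

    ratio_log2 -- value of motion_vector_buffer_comp_ratio_log2
    min_unit   -- minimum unit of motion compensation in pixels (4x4 assumed)
    """
    return min_unit << ratio_log2   # min_unit * 2**ratio_log2


print(compression_block_size(1))  # 8  -> 8x8 pixel block
print(compression_block_size(2))  # 16 -> 16x16 pixel block
```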

As another example, whether the prediction according to the present embodiment is enabled or disabled may be specified for each local region inside the slice, in the syntax of a layer lower than that of motion_vector_buffer_comp_flag, motion_vector_buffer_comp_ratio_log2, and motion_vector_buffer_comp_position (the picture parameter set syntax, slice-level syntax, coding tree unit, transform unit, and so on).

FIG. 24 shows an example of the prediction unit syntax. The skip_flag in the figure is a flag indicating whether the prediction mode of the coding unit to which the prediction unit syntax belongs is skip mode. When skip_flag is 1, syntax other than the predicted motion information position information 954 (coding unit syntax, prediction unit syntax, transform unit syntax) is not encoded. NumMVPCand(L0) and NumMVPCand(L1) indicate the number of predicted motion information candidates 951 in list 0 prediction and list 1 prediction, respectively. When predicted motion information candidates 951 exist (NumMVPCand(LX) > 0, X = 0 or 1), mvp_idx_lX, which indicates the predicted motion information position information 954, is encoded.

When skip_flag is 0, the prediction mode of the coding unit to which the prediction unit syntax belongs is not skip mode. NumMergeCandidates indicates the number of predicted motion information candidates 951 derived as in FIG. 12 and elsewhere. When predicted motion information candidates 951 exist (NumMergeCandidates > 0), merge_flag, a flag indicating whether the prediction unit is in merge mode, is encoded. A merge_flag value of 1 indicates that the prediction unit is in merge mode, and a value of 0 indicates that the prediction unit uses inter mode. When merge_flag is 1 and two or more predicted motion information candidates 951 exist (NumMergeCandidates > 1), merge_idx, which is the predicted motion information 952 indicating from which block among the predicted motion information candidates 951 to merge, is encoded.

When merge_flag is 1, no prediction unit syntax other than merge_flag and merge_idx needs to be encoded.

When merge_flag is 0, the prediction unit is in inter mode. In inter mode, the following are encoded: mvd_lX (X = 0 or 1), which indicates the differential motion vector information contained in the differential motion information 953; the reference frame number ref_idx_lX; and, for a B slice, inter_pred_idc, which indicates whether the prediction unit uses unidirectional prediction (list 0 or list 1) or bidirectional prediction. In addition, NumMVPCand(L0) and NumMVPCand(L1) are acquired as in skip mode, and when predicted motion information candidates 951 exist (NumMVPCand(LX) > 0, X = 0 or 1), mvp_idx_lX, which indicates the predicted motion information position information 954, is encoded.
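The conditions under which each prediction unit syntax element is encoded can be summarized in a small sketch. It is a reading of the prose above, not a reproduction of FIG. 24: the function `pu_syntax_elements`, its parameters, and the order in which elements are listed are assumptions, and skip_flag itself is treated as an input rather than as part of the returned list.

```python
def pu_syntax_elements(skip_flag, merge_flag, num_merge_cand,
                       num_mvp_cand, is_b_slice, lists=(0,)):
    """List the syntax elements encoded for one prediction unit.

    num_mvp_cand -- dict: list index X -> NumMVPCand(LX)
    lists        -- reference lists in use (hypothetical parameter)
    """
    elems = []
    if skip_flag:                        # skip mode: only the predictor index
        for x in lists:
            if num_mvp_cand[x] > 0:
                elems.append(f"mvp_idx_l{x}")
        return elems
    if num_merge_cand > 0:               # merge_flag is present only then
        elems.append("merge_flag")
        if merge_flag:
            if num_merge_cand > 1:       # an index is needed only for 2+ candidates
                elems.append("merge_idx")
            return elems
    if is_b_slice:                       # inter mode
        elems.append("inter_pred_idc")
    for x in lists:
        elems += [f"ref_idx_l{x}", f"mvd_l{x}"]
        if num_mvp_cand[x] > 0:
            elems.append(f"mvp_idx_l{x}")
    return elems


print(pu_syntax_elements(0, 1, 2, {0: 2}, False))  # ['merge_flag', 'merge_idx']
```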

This concludes the description of the syntax configuration according to the present embodiment.

(Second Embodiment)
The second embodiment relates to a moving image decoding apparatus. The moving image coding apparatus corresponding to the moving image decoding apparatus of the present embodiment is as described in the first embodiment. That is, the moving image decoding apparatus according to the present embodiment decodes coded data generated by, for example, the moving image coding apparatus according to the first embodiment.

As shown in FIG. 25, the moving image decoding apparatus according to the present embodiment includes an entropy decoding unit 2501, an inverse quantization unit 2502, an inverse orthogonal transform unit 2503, an addition unit 2504, a reference image memory 2505, an inter-prediction unit 2506, a reference motion information memory 2507, a reference motion information compression unit 2508, and a decoding control unit 2510.

The moving image decoding apparatus of FIG. 25 decodes the coded data 2550, accumulates the decoded image signal 2554 in the output buffer 2511, and outputs it as an output image. The coded data 2550 is output from, for example, the moving image coding apparatus of FIG. 1 and is input to the moving image decoding apparatus 2500 via a storage system or transmission system (not shown).

The entropy decoding unit 2501 parses the coded data 2550 on the basis of the syntax in order to decode it. The entropy decoding unit 2501 entropy-decodes the code string of each syntax element in turn and reproduces the coding parameters of the block being decoded, such as the motion information 2559 and the quantized transform coefficients 2551. The coding parameters are the parameters required for decoding, such as prediction information, information on transform coefficients, and information on quantization.

Specifically, as shown in FIG. 26, the entropy decoding unit 2501 includes a separation unit 2601, a parameter decoding unit 2602, a transform coefficient decoding unit 2603, and a motion information decoding unit 2604. The separation unit 2601 separates the coded data 2550 and outputs the coded data 2651A relating to parameters to the parameter decoding unit 2602, the coded data 2651B relating to transform coefficients to the transform coefficient decoding unit 2603, and the coded data 2651C relating to motion information to the motion information decoding unit 2604. The parameter decoding unit 2602 decodes the coding parameters 2570, such as prediction information, and outputs them to the decoding control unit 2510. The transform coefficient decoding unit 2603 receives the coded data 2651B, decodes the transform coefficient information 2551, and outputs it to the inverse quantization unit 2502.

The motion information decoding unit 2604 receives the coded data 2651C from the separation unit 2601, the reference position information 2560 from the decoding control unit 2510, and the reference motion information 2558 from the reference motion information memory 2507, and outputs the motion information 2559. The output motion information 2559 is input to the inter-prediction unit 2506.

As shown in FIG. 27, the motion information decoding unit 2604 includes a separation unit 2701, a differential motion information decoding unit 2702, a predicted motion information position decoding unit 2703, a reference motion information acquisition unit 2704, a predicted motion information selection switch 2705, and an addition unit 2706.

The coded data 2651C relating to motion information is input to the separation unit 2701 and separated into the coded data 2751 relating to differential motion information and the coded data 2752 relating to the predicted motion information position. The differential motion information decoding unit 2702 receives the coded data 2751 relating to differential motion information and decodes the differential motion information 2753. The addition unit 2706 adds the differential motion information 2753 to the predicted motion information 2756, described later, and outputs the motion information 2759. The predicted motion information position decoding unit 2703 receives the coded data 2752 relating to the predicted motion information position and decodes the predicted motion information position 2754.

The predicted motion information position 2754 is input to the predicted motion information selection switch 2705, which selects the predicted motion information 2756 from among the predicted motion information candidates 2755. The predicted motion information position information 2754 is decoded using fixed-length decoding or variable-length decoding generated from the number of predicted motion information candidates 2755. It may also be variable-length decoded using the correlation with adjacent blocks. Furthermore, when several predicted motion information candidates 2755 are duplicates of one another, the predicted motion information position information 2754 may be decoded from a code table generated from the total number of predicted motion information candidates 2755 remaining after the duplicates are removed. When only one kind of predicted motion information candidate 2755 remains, that candidate is determined to be the predicted motion information 2756, so the predicted motion information position information 2754 need not be decoded.
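The effect of removing duplicate candidates on the index length can be illustrated for the fixed-length case. This is a sketch only: the function name `mvp_index_bits` is hypothetical, candidates are modeled as plain tuples, and a fixed-length code of ceil(log2(n)) bits is assumed for n distinct candidates.

```python
def mvp_index_bits(candidates):
    """Return the de-duplicated candidates and the fixed-length index size.

    candidates -- list of hashable motion info entries (illustrative model)
    """
    unique = list(dict.fromkeys(candidates))   # drop duplicates, keep order
    if len(unique) <= 1:
        return unique, 0                       # single candidate: no index decoded
    return unique, (len(unique) - 1).bit_length()


print(mvp_index_bits([(1, 0), (1, 0)]))                  # ([(1, 0)], 0)
print(mvp_index_bits([(1, 0), (0, 2), (1, 0), (3, 3)]))  # ([(1, 0), (0, 2), (3, 3)], 2)
```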

The reference motion information acquisition unit 2704 has the same configuration and performs the same processing as the reference motion information acquisition unit 901 described in the first embodiment.

The reference motion information acquisition unit 2704 receives the reference motion information 2558 and the reference position information 2560 as inputs and generates at least one predicted motion information candidate 2755 (2755A, 2755B, ...). FIGS. 10 and 11 show an example of the positions of the predicted motion information candidates 2755 relative to the prediction unit to be decoded. FIG. 10 shows the positions of the prediction units spatially adjacent to the prediction unit to be decoded. AX (X = 0 to nA-1) denotes the prediction units adjacent to the left of the target prediction unit, BY (Y = 0 to nB-1) denotes the prediction units adjacent above the target prediction unit, and C, D, and E denote the prediction units adjacent to the upper right, upper left, and lower left of the prediction unit to be decoded, respectively. FIG. 11 shows the position of a prediction unit in an already decoded reference frame relative to the prediction unit to be decoded. Col in the figure denotes the prediction unit that lies in the reference frame at the same position as the prediction unit to be decoded. FIG. 12 shows an example of a list relating the block positions of the plurality of predicted motion information candidates 2755 to the index Mvpidx. Mvpidx values 0 to 2 indicate predicted motion information candidates 2755 located in the spatial direction, and Mvpidx value 3 indicates a predicted motion information candidate 2755 located in the temporal direction. Prediction unit position A is the position, among the AX shown in FIG. 10, of the inter-predicted prediction unit (that is, a prediction unit having reference motion information 2558) with the smallest value of X. Likewise, prediction unit position B is the position, among the BY shown in FIG. 10, of the inter-predicted prediction unit (that is, a prediction unit having reference motion information 2558) with the smallest value of Y. When prediction unit position C is not inter-predicted, the reference motion information 2558 of prediction unit position D is used in place of the reference motion information 2558 of prediction unit position C. When neither prediction unit position C nor D is inter-predicted, the reference motion information 2558 of prediction unit position E is used in place of the reference motion information 2558 of prediction unit position C.
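The selection of positions A, B, and C described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the `MotionInfo` type, the function names, and the assumed list order (Mvpidx 0 = A, 1 = B, 2 = C, 3 = Col) are hypothetical, and `None` stands for a unit that has no reference motion information (for example, an intra-predicted unit).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class MotionInfo:
    """Hypothetical container for one unit's reference motion information."""
    mv: Tuple[int, int]   # motion vector (x, y)
    ref_idx: int          # reference frame number


def first_inter(units: Sequence[Optional[MotionInfo]]) -> Optional[MotionInfo]:
    """Return the motion info of the inter-predicted unit with the smallest index.

    None entries stand for units without reference motion information.
    """
    for info in units:
        if info is not None:
            return info
    return None


def build_candidates(left, above, c, d, e, col) -> List[Optional[MotionInfo]]:
    """Build the candidate list indexed by Mvpidx (assumed order: A, B, C, Col)."""
    a = first_inter(left)    # position A: smallest X among AX
    b = first_inter(above)   # position B: smallest Y among BY
    # Position C falls back to D, then to E, when it is not inter-predicted.
    c_eff = c if c is not None else (d if d is not None else e)
    return [a, b, c_eff, col]
```

In this sketch an entry of `None` in the returned list simply means that no reference motion information was found at that position; how such entries are handled is left to the caller.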

When the prediction unit to be decoded is larger than the minimum prediction unit, the prediction unit at position Col may hold a plurality of pieces of reference motion information 2558 in the temporal-direction reference motion information memory 2507. In this case, the reference motion information 2558 in the prediction unit at position Col is acquired according to the reference position information 2560. Hereinafter, the position from which the reference motion information 2558 is acquired in the prediction unit at position Col is referred to as the reference motion information acquisition position. FIGS. 13A to 13F show, for each size of the prediction unit to be decoded (32x32 to 16x16), an example of the reference motion information acquisition position when the reference position information 2560 indicates the center of the prediction unit at position Col. Each block in the figures represents a 4x4 prediction unit, and the circle indicates the position of the 4x4 prediction unit acquired as the predicted motion information candidate 2755. Another example of the reference motion information acquisition position is shown in FIGS. 14A to 14F. In FIGS. 14A to 14F, no 4x4 prediction unit exists at the position of the circle, so the predicted motion information candidate 2755 is generated by a predetermined method, such as taking the average or the median of the reference motion information 2558 in the four 4x4 prediction units adjacent to the circle. As yet another example of the reference motion information acquisition position, the reference motion information 2558 of the 4x4 prediction unit located at the upper left end of the prediction unit at position Col may be used as the predicted motion information candidate 2755. Apart from the above examples, the predicted motion information candidate 2755 may be generated using any position and method, as long as the method is predetermined.
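As a rough illustration of the two center-based variants, the sketch below computes a center acquisition position on the 4x4 grid and averages four neighboring vectors. The function names are hypothetical, and both the choice of the unit just right of and below the geometric center and the rounding rule are assumptions made for the sketch, not details taken from the figures.

```python
from typing import Sequence, Tuple


def center_unit(x: int, y: int, width: int, height: int, unit: int = 4) -> Tuple[int, int]:
    """Return grid coordinates (column, row) of the unit holding the center pixel.

    (x, y) is the top-left pixel of the prediction unit at position Col.
    The pixel (x + width // 2, y + height // 2) lies just right of and below
    the geometric center (an assumed convention).
    """
    return ((x + width // 2) // unit, (y + height // 2) // unit)


def average_of_four(mvs: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Component-wise mean of four motion vectors.

    Rounds to the nearest integer, with ties going toward positive infinity
    (an assumed rounding rule).
    """
    if len(mvs) != 4:
        raise ValueError("exactly four motion vectors are expected")
    sum_x = sum(mv[0] for mv in mvs)
    sum_y = sum(mv[1] for mv in mvs)
    return ((sum_x + 2) >> 2, (sum_y + 2) >> 2)
```

For a 16x16 prediction unit whose top-left pixel is at (16, 32), `center_unit(16, 32, 16, 16)` gives grid position (6, 10) under these assumptions. A median could be substituted for the mean in `average_of_four` in the same way.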

If no reference motion information 2558 exists, motion information 2559 having a zero vector is output as the predicted motion information candidate 2755.

As described above, at least one predicted motion information candidate 2755 is output from the reference motion block. When the reference frame number of a predicted motion information candidate 2755 differs from the reference frame number of the prediction unit to be decoded, the predicted motion information candidate 2755 may be scaled according to the reference frame number of the candidate and the reference frame number of the prediction unit to be decoded. The predicted motion information selection switch 2705 selects one of the plurality of predicted motion information candidates 2755 according to the predicted motion information position 2754 and outputs it as the predicted motion information 952.
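The text does not specify the scaling rule. The sketch below assumes one common choice, linear scaling by the ratio of temporal distances to the respective reference frames, and then selects a candidate by its list position. The function names, the distance parameters, and the rounding are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def scale_candidate(mv: Tuple[int, int], cand_distance: int,
                    target_distance: int) -> Tuple[int, int]:
    """Scale a candidate vector by the ratio of temporal distances (assumed rule).

    cand_distance: distance from the candidate's frame to its reference frame.
    target_distance: distance from the current frame to the target reference frame.
    """
    if cand_distance == target_distance or cand_distance == 0:
        return mv  # nothing to scale, or scaling is undefined
    return (round(mv[0] * target_distance / cand_distance),
            round(mv[1] * target_distance / cand_distance))


def select_candidate(candidates: Sequence[Tuple[int, int]],
                     position: int) -> Tuple[int, int]:
    """Pick one candidate by its index, mimicking the selection switch."""
    return candidates[position]
```

A real codec would normally use fixed-point arithmetic with a defined rounding rule so that the encoder and decoder stay bit-exact; the floating-point division here is only for readability.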

The inverse quantization unit 2502 inversely quantizes the quantized transform coefficient 2551 from the entropy decoding unit 2501 to obtain a restored transform coefficient 2552. Specifically, the inverse quantization unit 2502 performs the inverse quantization according to the quantization-related information decoded by the entropy decoding unit 2501. The inverse quantization unit 2502 outputs the restored transform coefficient 2552 to the inverse orthogonal transform unit 2503.

The inverse orthogonal transform unit 2503 applies, to the restored transform coefficient 2552 from the inverse quantization unit 2502, the inverse orthogonal transform corresponding to the orthogonal transform performed on the coding side, and obtains a restored prediction error signal 2553. The inverse orthogonal transform unit 2503 inputs the restored prediction error signal 2553 to the addition unit 2504.

The addition unit 2504 adds the restored prediction error signal 2553 and the corresponding predicted image signal 2556 to generate a decoded image signal 2554. The decoded image signal 2554 is processed by a deblocking filter, a Wiener filter, or the like (not shown), temporarily stored in the output buffer 2511 as an output image, and also stored in the reference image memory 2505 as the reference image signal 2555. The decoded image signal 2554 stored in the reference image memory 2505 is referred to as the reference image signal 2555 by the inter-prediction unit 2506, in frame units or field units as necessary. The decoded image signal 2554 temporarily stored in the output buffer 2511 is output according to the output timing managed by the decoding control unit 2510.

The inter-prediction unit 2506 performs inter-prediction using the reference image signal 2555 stored in the reference image memory 2505. Specifically, the inter-prediction unit 2506 acquires, from the entropy decoding unit 2501, the motion information 2559 including the amount of motion displacement (motion vector) between the prediction target block and the reference image signal 2555, and performs interpolation processing (motion compensation) based on this motion vector to generate an inter-predicted image. The generation of the inter-predicted image is the same as in the first embodiment, so its description is omitted.

The decoding control unit 2510 controls each element of the moving image decoding device of FIG. 25. Specifically, the decoding control unit 2510 outputs the reference position information 2560, described later, to the entropy decoding unit 2501 and performs various kinds of control for the decoding process, including the operations described above.

<Explanation of skip mode, merge mode, and inter mode>
The image decoding device 2500 according to the present embodiment uses a plurality of prediction modes with different decoding processes, as shown in FIG. 8. The skip mode in the figure is a mode in which only the syntax relating to the predicted motion information position 2754, described later, is decoded and no other syntax is decoded. The merge mode is a mode in which only the syntax relating to the predicted motion information position 2754 and the transform coefficient information 2551 are decoded and no other syntax is decoded. The inter mode is a mode in which the syntax relating to the predicted motion information position 2754, the differential motion information 2753 described later, and the transform coefficient information 2551 are decoded. These modes are switched by the prediction information 2571 controlled by the decoding control unit 2510.
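The difference between the three modes can be summarized as a small lookup table. The sketch below is illustrative only; the mode keys and syntax element names are hypothetical labels for the items named in the text, not identifiers from the actual syntax.

```python
# Which groups of syntax are decoded in each prediction mode.
# Names are hypothetical labels for the items described in the text.
DECODED_SYNTAX = {
    "skip": frozenset({"predicted_motion_position"}),
    "merge": frozenset({"predicted_motion_position", "transform_coefficients"}),
    "inter": frozenset({"predicted_motion_position", "differential_motion",
                        "transform_coefficients"}),
}


def is_decoded(mode: str, element: str) -> bool:
    """Return True if the given syntax group is decoded in the given mode."""
    return element in DECODED_SYNTAX[mode]
```

Read this way, skip mode carries the least side information and inter mode the most, with merge mode in between.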

The moving image decoding device of FIG. 25 uses syntax that is the same as or similar to the syntax described with reference to FIG. 28, so a detailed description of it is omitted.

<Details of motion information compression unit 2508>
Next, the motion information compression unit 2508 will be described with reference to the flowchart shown in FIG. 16. When the decoding process for a frame (or for any other unit, such as a slice or a coding unit) is completed, the motion information compression unit 2508 compresses the motion information 2559 and stores it in the temporal-direction reference motion information memory 502.

First, the reference position information 2560 is acquired from the decoding control unit 2510 (step S1601), and the frame is divided into motion information compression blocks, which are the units in which the motion information 2559 is compressed (step S1602). A motion information compression block is a pixel block larger than the unit in which the motion information 2559 is held by the motion compensation processing (typically a 4x4 pixel block), and is typically a 16x16 pixel block. The motion information compression block may also be a 32x32 pixel block, an 8x8 pixel block, a rectangular pixel block, or a pixel region of arbitrary shape.

Next, a representative motion information position is generated according to the reference position information 2560 (step S1603). As an example of generating the representative motion information position, when the motion information compression block is a 16x16 pixel block, the reference motion information acquisition position for a prediction unit of size 16x16, as shown in FIGS. 13D, 14D, and 17D, respectively, is used as the representative motion information position. Next, the reference motion information 2558 at the generated representative motion information position is set as the representative motion information (step S1605), and the representative motion information is stored in the temporal-direction reference motion information memory (step S1606). Steps S1604 to S1605 above are executed for all motion information compression blocks.
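The compression flow above can be sketched as follows, under simplifying assumptions: the frame's motion information is held as a 2-D list with one entry per 4x4 unit, compression blocks are square, and the representative position is given as a fixed offset inside each block. The function name, the parameters, and the clamping at the frame border are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def compress_motion_field(field: Sequence[Sequence[T]], ratio: int = 4,
                          rep: Tuple[int, int] = (0, 0)) -> List[List[T]]:
    """Keep one representative entry per compression block.

    field: per-unit motion information indexed as field[row][col].
    ratio: compression block size divided by the unit size (16 // 4 = 4).
    rep:   (row, col) offset of the representative unit inside each block.
    """
    rows, cols = len(field), len(field[0])
    compressed = []
    for block_row in range(0, rows, ratio):
        out_row = []
        for block_col in range(0, cols, ratio):
            # Clamp so that partial blocks at the frame border stay in range
            # (an assumption; the text handles such cases separately).
            r = min(block_row + rep[0], rows - 1)
            c = min(block_col + rep[1], cols - 1)
            out_row.append(field[r][c])
        compressed.append(out_row)
    return compressed
```

With `ratio=4`, an 8x8 grid of 4x4 units (a 32x32 pixel area) is reduced to a 2x2 grid, one entry per 16x16 compression block.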

If the unit in which the motion information 2559 is held is an MxM block and the size of the motion information compression block is NxN (where N is a multiple of M), executing the above motion information compression processing reduces the capacity of the reference motion information memory to (MxM)/(NxN) of its original size. For example, with M = 4 and N = 16, the capacity is reduced to 16/256 = 1/16.

<Other embodiments of the representative motion information position>
As another example of generating the representative motion information position, the center of a plurality of reference motion information acquisition positions may be used as the representative motion information position. FIGS. 18A and 18B show the representative motion information position for each motion information compression block of size 16x16. FIG. 18A shows the representative motion information positions when the reference motion information acquisition positions are those shown in FIG. 13D, and FIG. 18B likewise shows the representative motion information positions when the reference motion information acquisition positions are those shown in FIG. 17D. The circles in FIGS. 18A and 18B indicate the reference motion information acquisition positions when the prediction unit is 16x16, and the representative motion information position, marked with a cross, is placed at the center of the four reference motion information acquisition positions.

As yet another example of generating the representative motion information position, the reference position information 2560 may hold a reference motion information acquisition position for each of a plurality of prediction unit sizes, and the representative motion information position may be generated from these reference motion information acquisition positions. FIG. 19 shows the centers (reference motion information acquisition positions) of the prediction units for each prediction unit size of 16x16 or larger when the tree block is a 64x64 pixel block.

FIGS. 21A and 21B show a further example in which the reference motion information acquisition position is the same in every motion information compression block within a tree block. Because the representative motion information position is the same in all motion information compression blocks, there is no need to switch the representative motion information position according to the position within the tree block. In addition to the positions shown in FIGS. 21A and 21B, the representative motion information position may be at any position in the motion information compression block, such as the upper left end or the upper right end.

When no reference motion information exists because the representative motion information position is outside the picture or slice, a position within the motion information compression block from which reference motion information can be acquired, such as the upper left end of the motion information compression block, may be substituted as the new representative motion information position. The same processing may also be executed to substitute a new representative motion information position when the representative motion information position lies in a region to which intra prediction has been applied and therefore no reference motion information exists.
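The substitution of a new representative motion information position can be sketched as below. This is a hypothetical illustration: units outside the picture or slice are modeled as missing dictionary keys, intra-predicted units as `None`, and the raster-order search starting from the upper left end is an assumed policy (the text only requires that the new position be one where reference motion information can be acquired).

```python
from typing import Dict, Optional, Tuple

Position = Tuple[int, int]  # (row, col) of a 4x4 unit inside the compression block


def resolve_representative(units: Dict[Position, Optional[object]],
                           preferred: Position) -> Optional[Position]:
    """Return a position in the block that has reference motion information.

    units maps each in-picture unit position to its motion information,
    or to None when the unit is intra-predicted. Positions outside the
    picture or slice are simply absent from the mapping.
    """
    if units.get(preferred) is not None:
        return preferred
    # Fall back to the first available unit in raster order, which starts
    # at the upper left end of the block (assumed search order).
    for pos in sorted(units):
        if units[pos] is not None:
            return pos
    return None  # no unit in the block has reference motion information
```

When the function returns `None`, a caller could fall back to a zero vector, in line with the handling described for missing reference motion information.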

Modifications of the embodiments are listed and introduced below.
In the first and second embodiments, an example has been described in which a frame is divided into rectangular blocks, such as blocks of 16 × 16 pixels, and coding/decoding is performed in order from the block at the upper left of the screen toward the lower right (see FIG. 2A). However, the coding order and the decoding order are not limited to this example. For example, coding and decoding may be performed in order from the lower right toward the upper left, or in a spiral from the center of the screen toward the edge of the screen. Furthermore, coding and decoding may be performed in order from the upper right toward the lower left, or in a spiral from the edge of the screen toward the center of the screen.

In the first and second embodiments, prediction target block sizes such as a 4 × 4 pixel block, an 8 × 8 pixel block, and a 16 × 16 pixel block have been used as examples, but the prediction target blocks need not have a uniform block shape. For example, the prediction target block (prediction unit) size may be a 16 × 8 pixel block, an 8 × 16 pixel block, an 8 × 4 pixel block, a 4 × 8 pixel block, or the like. It is also not necessary to use a single block size within one coding tree unit; a plurality of different block sizes may be mixed. When a plurality of different block sizes are mixed within one coding tree unit, the amount of code needed to encode or decode the division information increases as the number of divisions increases. It is therefore desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.

In the first and second embodiments, for simplicity, the luminance signal and the color difference signal are not distinguished, and the color signal components are described collectively. However, when the prediction processing differs between the luminance signal and the color difference signal, the same or different prediction methods may be used. If different prediction methods are used for the luminance signal and the color difference signal, the prediction method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.

In the first and second embodiments, for simplicity, the luminance signal and the color difference signal are not distinguished, and the color signal components are described collectively. However, when the orthogonal transform processing differs between the luminance signal and the color difference signal, the same or different orthogonal transform methods may be used. If different orthogonal transform methods are used for the luminance signal and the color difference signal, the orthogonal transform method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.

In the first and second embodiments, syntax elements not defined in the embodiments may be inserted between the rows of the tables shown in the syntax structure, and descriptions of other conditional branches may be included. Alternatively, a syntax table may be divided into a plurality of tables, or a plurality of tables may be merged. The same terms need not always be used, and they may be changed arbitrarily depending on the form of use.

As described above, each embodiment can realize highly efficient orthogonal transformation and inverse orthogonal transformation while alleviating the difficulties of hardware and software implementation. Therefore, according to each embodiment, the coding efficiency is improved, and the subjective image quality is improved as a result.

The instructions given in the processing procedures of the embodiments described above can be executed based on a program, that is, software. A general-purpose computer system that stores this program in advance and reads it can obtain effects similar to those of the moving image coding device and the moving image decoding device of the embodiments described above. The instructions described in the above embodiments are recorded, as a program that can be executed by a computer, on a magnetic disk (flexible disk, hard disk, etc.), an optical disc (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, DVD±RW, etc.), a semiconductor memory, or a similar recording medium. The storage format may be of any form as long as the recording medium can be read by a computer or an embedded system. If the computer reads the program from this recording medium and causes the CPU to execute the instructions described in the program, operation similar to that of the moving image coding device and the moving image decoding device of the embodiments described above can be realized. Of course, the computer may acquire or read the program through a network.
An OS (operating system) running on the computer, database management software, MW (middleware) such as network software, or the like may execute part of each process for realizing the present embodiment, based on the instructions of the program installed from the recording medium onto the computer or embedded system.
Furthermore, the recording medium in the embodiments of the present invention is not limited to a medium independent of the computer or embedded system; it also includes a recording medium on which a program transmitted via a LAN, the Internet, or the like has been downloaded and stored or temporarily stored. The program that realizes the processing of each of the above embodiments may also be stored on a computer (server) connected to a network such as the Internet and downloaded to a computer (client) via the network.
The recording medium is not limited to a single medium. A case where the processing in the present embodiment is executed from a plurality of media is also covered by the recording medium in the embodiments of the present invention, and the media may have any configuration.

The computer or embedded system in the embodiments of the present invention executes each process of the present embodiment based on the program stored in the recording medium, and may have any configuration, such as a single device (for example, a personal computer or a microcomputer) or a system in which a plurality of devices are connected via a network.
The computer in the embodiments of the present invention is not limited to a personal computer; it also includes arithmetic processing units, microcomputers, and the like contained in information processing equipment, and is a general term for equipment and devices capable of realizing the functions of the embodiments of the present invention by means of a program.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are also included in the invention described in the claims and its equivalents.

100 ... image coding device, 101 ... subtraction unit, 102 ... orthogonal transform unit, 103 ... quantization unit, 104, 2502 ... inverse quantization unit, 105, 2503 ... inverse orthogonal transform unit, 106, 2504, 2706 ... addition unit, 107, 2505 ... reference image memory, 108, 2506 ... inter-prediction unit, 109 ... motion information compression unit, 110 ... motion information memory, 112 ... entropy coding unit, 113 ... output buffer, 114 ... coding control unit, 401 ... parameter coding unit, 402 ... transform coefficient coding unit, 403 ... motion information coding unit, 404 ... multiplexing unit, 901 ... reference motion vector acquisition unit, 902 ... predicted motion vector selection switch, 903 ... subtraction unit, 904 ... differential motion information coding unit, 905 ... predicted motion information position coding unit, 906 ... multiplexing unit, 2500 ... moving image decoding device, 2501 ... entropy decoding unit, 2507 ... reference motion information memory, 2508 ... reference motion information compression unit, 2510 ... decoding control unit, 2601, 2701 ... separation unit, 2602 ... parameter decoding unit, 2603 ... transform coefficient decoding unit, 2604 ... motion information decoding unit, 2702 ... differential motion information decoding unit, 2503 ... predicted motion information position decoding unit, 2704 ... reference motion information acquisition unit, 2705 ... predicted motion information selection switch.

Claims

A video decoding method comprising:
acquiring coded data including at least a first frame that includes a target block;
decoding a merge flag for determining whether at least a motion vector is predicted from a merge block in an inter-prediction mode;
when the merge flag specifies that at least the motion vector is predicted from the merge block in the inter-prediction mode, deriving a candidate for a first motion vector from at least one adjacent block of the target block;
when the merge flag specifies that at least the motion vector is predicted from the merge block in the inter-prediction mode, deriving a candidate for a second motion vector from a reference block included in a second frame different from the first frame;
decoding a merge index that specifies the merge block from among the at least one adjacent block and the reference block; and
deriving the motion vector of the target block from one of the candidate for the first motion vector and the candidate for the second motion vector according to the merge index,
wherein the at least one adjacent block comprises at least one of (1) a block on the lower left side of the target block, (2) a block on the left side of the target block, (3) a block on the upper right side of the target block, (4) a block above the target block, and (5) a block on the upper left side of the target block,
the candidate for the second motion vector of the reference block is derived according to a representative motion information position, and
when the representative motion information position is outside the frame, a new representative motion information position is set in order to derive the candidate for the second motion vector of the reference block.

A video coding method comprising:
acquiring data including at least a first frame that includes a target block;
setting a merge flag for determining whether at least a motion vector is predicted from a merge block in an inter-prediction mode;
when the merge flag specifies that at least the motion vector is predicted from the merge block in the inter-prediction mode, setting a candidate for a first motion vector from at least one adjacent block of the target block;
when the merge flag specifies that at least the motion vector is predicted from the merge block in the inter-prediction mode, setting a candidate for a second motion vector from a reference block included in a second frame different from the first frame;
setting a merge index that specifies the merge block from among the at least one adjacent block and the reference block; and
setting the motion vector of the target block from one of the candidate for the first motion vector and the candidate for the second motion vector according to the merge index,
wherein the at least one adjacent block comprises at least one of (1) a block on the lower left side of the target block, (2) a block on the left side of the target block, (3) a block on the upper right side of the target block, (4) a block above the target block, and (5) a block on the upper left side of the target block,
the candidate for the second motion vector of the reference block is derived according to a representative motion information position,
when the representative motion information position is outside the frame, a new representative motion information position is set in order to derive the candidate for the second motion vector of the reference block, and
the merge flag and the merge index are encoded.