JP5304709B2

JP5304709B2 - Moving picture decoding apparatus, moving picture decoding method, and moving picture decoding program

Info

Publication number: JP5304709B2
Application number: JP2010083143A
Authority: JP
Inventors: 基晴上田; 智坂爪; 茂福島; 徹熊倉
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2010-03-31
Filing date: 2010-03-31
Publication date: 2013-10-02
Anticipated expiration: 2030-03-31
Also published as: JP2011217105A

Description

The present invention relates to a technique for decoding moving picture signals.

In recent years, services that distribute digitized image and audio content over broadcast waves (satellite, terrestrial, and so on) and over networks have come into practical use. This creates a need for high-efficiency coding techniques that can efficiently record and transmit content carrying an enormous amount of information. High-efficiency coding of moving pictures, as typified by MPEG4-AVC, compresses information by exploiting the correlation between spatially adjacent pixels within the same frame of a moving picture signal and the correlation between temporally adjacent frames or fields.

For compression that exploits temporal correlation, MPEG4-AVC uses motion-compensated prediction. For a target image (the frame being encoded), the locally decoded image of an already encoded frame serves as a reference image. In units of two-dimensional blocks of a predetermined size (hereinafter "target blocks"), the amount of motion between the target image and the reference image (hereinafter a "motion vector") is detected, and a predicted image is generated from the target block and the motion vector.

MPEG4-AVC improves the accuracy of motion-compensated prediction, and thereby reduces the amount of information, through three techniques: varying the size of the target block within the 16×16-pixel two-dimensional block that is the unit of encoding (hereinafter a "macroblock") and predicting with a motion vector for each target block; storing a plurality of reference images and selecting the one used for prediction; and obtaining motion vectors between two reference images and the target block to generate a motion-predicted image.

Motion-compensated prediction also requires the generated motion vectors to be encoded and transmitted. To keep motion vectors from increasing the amount of information, a form of motion-compensated prediction called direct mode is available: encoding uses a predicted motion vector value derived from the motion vectors of already decoded blocks around the target block, so no motion vector is transmitted.

However, this motion vector prediction is not always accurate. As shown in Patent Document 1, a technique has therefore been proposed in which both the encoding side and the decoding side detect motion vectors between reference images and, assuming that those motion vectors are continuous in time, generate a predicted motion vector for the target block to form a direct mode.

Patent Document 1: JP 2008-154015 A

Motion-compensated prediction in conventional moving picture coding, typified by MPEG4-AVC, leaves the following problems unsolved, which hinders improvement of coding efficiency.

The first problem is degraded quality of the motion-compensated predicted image caused by degraded quality of the decoded image used as the reference image. Particularly under high-compression coding, degradation components mixed into the motion-compensated predicted image lower the prediction accuracy, and the information needed to restore those components must be encoded as a prediction difference, which increases the amount of information.

The second problem is that, for image signals with little temporal or spatial continuity of motion, motion vector prediction is not accurate enough, so the predicted image in direct mode is of poor quality and the mode does not work effectively. Spatially, this degradation occurs when adjacent blocks that straddle an object have different motion. Temporally, it occurs when the motion is large, because the motion vector used for prediction then belongs to a block at a position displaced from the actual target block by the amount of that motion. Prediction likewise fails, and degradation occurs, when the motion changes over time.

The third problem is the increase in the code amount needed to transmit motion vectors when prediction from two reference images or motion-compensated prediction in fine block units is used. With two reference images, adding the reference images smooths prediction degradation and reduces the influence of degradation components, but the corresponding motion vectors must be transmitted, which increases the code amount. Likewise, motion compensation in fine block units can find appropriate motion along object boundaries and improves the accuracy of the predicted image, but motion vectors must be transmitted in fine units, which also increases the code amount.

Patent Document 1 proposes a technique to solve the second problem. When motion is spatially uniform, the motion vector obtained between reference images passes through the position of the target block, so motion vector prediction accuracy improves. When motion is not spatially uniform, however, the predicted motion vector is obtained without using information from the target block, so it differs from the motion of the target block and the prediction is inaccurate. In addition, capturing large motion requires both the encoding apparatus and the decoding apparatus to run motion vector detection over a wide range between reference images, which increases the amount of computation.

The present invention has been made in view of these circumstances. Its object is to provide a technique that raises the quality of the predicted image and improves the efficiency of motion-compensated prediction while limiting the increase in computation in the encoding apparatus and the decoding apparatus.

To solve the above problems, a moving picture decoding apparatus according to one aspect of the present invention comprises: a motion vector decoding unit that decodes, from an encoded stream, a motion vector for a decoding target block; a reference image synthesis unit that generates a synthesized reference block by combining a first reference block, extracted from a first reference image using the motion vector, with a predetermined region of at least one other reference image; and a decoding unit that uses the synthesized reference block as a prediction block and generates a decoded image by adding the prediction block to a prediction difference block decoded from the decoding target block.

To solve the above problems, a moving picture decoding apparatus according to an aspect of the present invention comprises: a motion vector decoding unit that decodes, from an encoded stream, a first motion vector for a decoding target block; a motion vector separation unit that generates a second motion vector from the first motion vector; a reference image synthesis unit that generates a synthesized reference block by combining a first reference block, which covers a specific region at least as large as the decoding target block and is extracted from a first reference image using the second motion vector, with a predetermined region of at least one other reference image; a motion-compensated prediction unit that uses the first motion vector to extract, from the synthesized reference block, a block of the same size as the decoding target block and uses the extracted block as a prediction block; and a decoding unit that generates a decoded image by adding the prediction block to a prediction difference block decoded from the decoding target block.

In the motion vector separation unit, the input first motion vector may have M-pixel accuracy (M is a real number), the generated second motion vector may have N-pixel accuracy (N is a real number, N > M), and the second motion vector may be the first motion vector converted to N-pixel accuracy. The specific region may then cover at least the target block extended by ±N/2 pixels, taking as its reference the position in the first reference image indicated by the second motion vector.

With this configuration, a decoded motion vector of M-pixel accuracy is converted to N-pixel accuracy, which is coarser than M pixels. Using the converted motion vector value as the reference, other reference images are synthesized with the motion-compensated reference image, so the decoding side can perform the same synthesis as the encoding apparatus. The difference between the converted motion vector value and the received motion vector value is used as a phase correction value for the synthesized motion-compensated predicted image. The decoding apparatus can thus obtain, from a single motion vector value, the motion-compensated predicted image with a small prediction residual that was generated on the encoding apparatus side.
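The split of one decoded vector into a coarse vector and a phase correction can be sketched as follows. This is a minimal illustration, assuming the decoded vector is stored in quarter-pixel units (M = 1/4) and the coarse vector has integer-pixel accuracy (N = 1); the function name `separate_motion_vector`, the unit choice, and the tie-breaking rule are assumptions made for this example and are not taken from the patent.

```python
def separate_motion_vector(mv_fine, units_per_n=4):
    """Split one motion-vector component into a coarse part and a phase correction.

    mv_fine     : component in fine (M-pixel) units, e.g. quarter-pixel steps
    units_per_n : number of fine units in one coarse (N-pixel) step (assumed 4)
    Returns (coarse, phase) in fine units, with mv_fine == coarse + phase
    and abs(phase) <= units_per_n / 2.
    """
    # Round to the nearest multiple of units_per_n (ties go toward +infinity).
    coarse = ((2 * mv_fine + units_per_n) // (2 * units_per_n)) * units_per_n
    # The remainder is the phase correction applied after synthesis.
    phase = mv_fine - coarse
    return coarse, phase


if __name__ == "__main__":
    for mv in (5, 6, -6, -5):
        print(mv, separate_motion_vector(mv))
```

Because the phase correction never exceeds half a coarse step, a reference block extended by N/2 pixels on each side is enough to cover every position the original vector can point to.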

The reference image synthesis unit may include an inter-reference-image motion vector detection unit that detects a third motion vector between the first reference block and a second reference image, which is another reference image. The reference image synthesis unit may generate the synthesized reference block by computing the per-pixel average or weighted average of the first reference block and a second reference block extracted from the second reference image using the third motion vector.

With this configuration, a motion vector value to another reference image is obtained for the motion-compensated predicted image predicted from the first reference image, and that predicted image is averaged with the motion-compensated predicted image obtained from the other reference image. This yields a predicted image that removes coding degradation components and follows slight luminance changes of the object being decoded, which improves coding efficiency.
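The per-pixel average or weighted average of two reference blocks can be sketched as below. This is an illustrative sketch only: the function name `synthesize_reference_block`, the integer weights, and the round-to-nearest rule are assumptions, since the text does not specify how the weights are chosen or how the result is rounded.

```python
def synthesize_reference_block(block1, block2, w1=1, w2=1):
    """Per-pixel weighted average of two equally sized blocks (lists of rows).

    With w1 == w2 this is the plain average; integer weights and
    rounding to nearest are assumptions made for this sketch.
    """
    total = w1 + w2
    return [
        [(w1 * a + w2 * b + total // 2) // total for a, b in zip(row1, row2)]
        for row1, row2 in zip(block1, block2)
    ]


if __name__ == "__main__":
    first = [[100, 102], [50, 60]]
    second = [[104, 100], [54, 61]]
    print(synthesize_reference_block(first, second))        # plain average
    print(synthesize_reference_block(first, second, 3, 1))  # weighted toward first
```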

The inter-reference-image motion vector detection unit may detect a plurality of third motion vectors between the first reference block and the second reference image, in units of blocks smaller than the first reference block. The reference image synthesis unit may generate the synthesized reference block by assembling the plurality of small-block second reference blocks extracted from the second reference image using the third motion vectors and computing their per-pixel average or weighted average with the first reference block.

With this configuration, motion vector values are obtained between the motion-compensated predicted image predicted from the first reference image and another reference image, in units finer than that predicted image, and the predicted image is synthesized with the motion-compensated predicted images obtained in those fine units according to each motion vector. This yields a predicted image that follows slight temporal deformation of the object being decoded, which improves coding efficiency.

The inter-reference-image motion vector detection unit may detect the third motion vector by searching for motion within a predetermined range centered on a motion vector value obtained by converting the second motion vector according to two time differences: a first time difference between the first reference image and the decoding target block, and a second time difference between the second reference image and the decoding target block.
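One plausible reading of this conversion is a linear scaling of the second motion vector by the ratio of the two time differences, followed by a small search around the scaled value. The sketch below assumes exactly that; the linear constant-velocity model, the signed time differences, the rounding rule, and the names `scaled_search_center` and `search_candidates` are all assumptions for illustration, not details stated in the text.

```python
def _scale_component(v, t1, t2):
    """Return v * t2 / t1 rounded to the nearest integer (ties away from zero)."""
    num = v * t2
    q, r = divmod(abs(num), abs(t1))
    if 2 * r >= abs(t1):
        q += 1
    return q if (num >= 0) == (t1 > 0) else -q


def scaled_search_center(mv2, t1, t2):
    """Scale the (x, y) vector mv2, valid over time difference t1, to time difference t2."""
    return tuple(_scale_component(c, t1, t2) for c in mv2)


def search_candidates(center, radius):
    """List every (x, y) candidate within +/- radius of the search center."""
    cx, cy = center
    return [
        (cx + dx, cy + dy)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    ]


if __name__ == "__main__":
    center = scaled_search_center((8, -4), t1=1, t2=2)
    print(center, len(search_candidates(center, 1)))
```

Restricting the search to a small window around the scaled vector is what keeps the added computation modest compared with a wide-range search between reference images.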

Any combination of the above components, and any conversion of the expression of the present invention among a method, an apparatus, a system, a recording medium, a computer program, and the like, is also effective as an aspect of the present invention.

According to the present invention, the quality of the predicted image can be raised and the efficiency of motion-compensated prediction improved while limiting the increase in computation in the encoding apparatus and the decoding apparatus.

A block diagram showing the configuration of the moving picture encoding apparatus of Embodiment 1 of the present invention.
A block diagram showing the configuration of the moving picture decoding apparatus of Embodiment 1 of the present invention.
A conceptual diagram showing the synthesized-image motion-compensated prediction technique of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture encoding apparatus of Embodiment 1 of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture decoding apparatus of Embodiment 1 of the present invention.
A block diagram showing the configuration of the moving picture encoding apparatus of Embodiment 2 of the present invention.
A block diagram showing the configuration of the moving picture decoding apparatus of Embodiment 2 of the present invention.
A conceptual diagram showing the operation of synthesized-image motion-compensated prediction processing in Embodiment 2 of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture encoding apparatus of Embodiment 2 of the present invention.
A flowchart explaining the operation of the multiple reference image synthesis unit and the synthesized-image motion-compensated prediction unit in the moving picture encoding apparatus of Embodiment 2 of the present invention.
A diagram showing an example of the processing order of encoding and of reference image management in Embodiment 2 of the present invention.
A diagram showing an example of the motion vector detection range between reference images in Embodiment 2 of the present invention.
A diagram showing an example of information added to the slice header in Embodiment 2 of the present invention.
A diagram showing an example of information added to the motion-compensated prediction mode in Embodiment 2 of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture decoding apparatus of Embodiment 2 of the present invention.
A flowchart explaining the operation of the motion vector separation unit, the multiple reference image synthesis unit, and the synthesized-image motion-compensated prediction unit in the moving picture decoding apparatus of Embodiment 2 of the present invention.
A conceptual diagram showing the operation of reference image synthesis processing in Embodiment 3 of the present invention.
A conceptual diagram showing the operation of synthesized-image motion-compensated prediction processing in Embodiment 4 of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 4 of the present invention.
A flowchart explaining the operation of the synthesis determination unit in the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 4 of the present invention.
A conceptual diagram showing the operation of synthesized-image motion-compensated prediction processing in Embodiment 5 of the present invention.
A block diagram showing the configuration of the multiple reference image synthesis unit in the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 5 of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

(Embodiment 1)
First, the moving picture encoding apparatus of Embodiment 1 is described. FIG. 1 is a block diagram showing the configuration of the moving picture encoding apparatus of Embodiment 1.

As shown in FIG. 1, the moving picture encoding apparatus of Embodiment 1 comprises an input terminal 100, an input image buffer 101, a block division unit 102, an intra-frame prediction unit 103, a motion vector detection unit 104, a motion-compensated prediction unit 105, a motion vector prediction unit 106, a multiple reference image synthesis unit 107, a prediction mode determination unit 109, a subtractor 110, an orthogonal transform unit 111, a quantization unit 112, an inverse quantization unit 113, an inverse orthogonal transform unit 114, an adder 115, an intra-frame decoded image memory 116, a decoded reference image memory 117, an entropy encoding unit 118, a stream buffer 119, an output terminal 120, and a code amount control unit 121.

The provision of the multiple reference image synthesis unit 107, together with the operation of that processing block and of the motion-compensated prediction unit 105, characterizes Embodiment 1 of the present invention. For the other processing blocks, the same processing as in the blocks that make up the encoding process of a moving picture encoding apparatus such as MPEG4-AVC can be applied.

A digital image signal input from the input terminal 100 is stored in the input image buffer 101. The stored digital image signal is supplied to the block division unit 102, which cuts it into encoding target blocks in units of macroblocks of 16×16 pixels. The block division unit 102 supplies each encoding target block to the intra-frame prediction unit 103, the motion vector detection unit 104, the motion-compensated prediction unit 105, and the subtractor 110.
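The cutting of a frame into 16×16 macroblocks can be sketched as follows. This is a minimal sketch, assuming the frame is a list of pixel rows whose width and height are multiples of 16 and that blocks are visited in raster order; the name `split_into_macroblocks` is hypothetical.

```python
def split_into_macroblocks(frame, size=16):
    """Yield (top, left, block) for each size x size block in raster order.

    frame is a list of rows of pixel values; its dimensions are assumed
    to be multiples of size (no padding is handled in this sketch).
    """
    height, width = len(frame), len(frame[0])
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = [row[left:left + size] for row in frame[top:top + size]]
            yield top, left, block


if __name__ == "__main__":
    frame = [[y * 48 + x for x in range(48)] for y in range(32)]
    for top, left, block in split_into_macroblocks(frame):
        print(top, left, len(block), len(block[0]))
```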

The intra-frame prediction unit 103 receives the encoding target block from the block division unit 102 and, from the intra-frame decoded image memory 116, the decoded image of the already encoded region surrounding the encoding target block, and performs prediction using the correlation within the frame. For example, it uses a technique called intra prediction: pixel values are predicted for the encoding target block in units of 4×4, 8×8, or 16×16 pixels along a plurality of predetermined directions, and a predicted image is generated together with information indicating the prediction unit and the selected direction (the intra prediction mode). Prediction thus exploits the correlation between adjacent pixels within the picture. The intra-frame prediction unit 103 outputs the predicted image and the selected intra prediction mode to the prediction mode determination unit 109.

The motion vector detection unit 104 receives the encoding target block from the block division unit 102 and, as a reference image, the decoded image of a fully encoded frame stored in the decoded reference image memory 117, and performs motion estimation between the encoding target block and the reference image. A common motion estimation process is block matching: a block is cut out of the reference image at a position displaced by a given amount from the same position in the picture, the prediction error that results from using that block as the prediction block is evaluated, and the displacement is varied to find the one that minimizes the prediction error, which becomes the motion vector value. The detected motion vector value is output to the motion-compensated prediction unit 105 and the multiple reference image synthesis unit 107.
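The block matching described above can be sketched as an exhaustive search. This is a minimal sketch, assuming the sum of absolute differences (SAD) as the prediction error, integer-pixel displacements, and candidates that fall outside the reference image being skipped; the text names none of these choices, and `sad` and `block_matching` are hypothetical names.

```python
def sad(target, ref, top, left):
    """Sum of absolute differences between target and the ref block at (top, left)."""
    return sum(
        abs(t - ref[top + y][left + x])
        for y, row in enumerate(target)
        for x, t in enumerate(row)
    )


def block_matching(target, ref, top, left, search_range):
    """Full search around (top, left); returns (dx, dy, cost) of the best match."""
    rows, cols = len(target), len(target[0])
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip displacements that would read outside the reference image.
            if y < 0 or x < 0 or y + rows > height or x + cols > width:
                continue
            cost = sad(target, ref, y, x)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


if __name__ == "__main__":
    ref = [[y * 12 + x for x in range(12)] for y in range(12)]
    target = [row[6:10] for row in ref[5:9]]  # block located at (top=5, left=6)
    print(block_matching(target, ref, top=4, left=4, search_range=2))
```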

The motion-compensated prediction unit 105 receives the motion vector values obtained by the motion vector detection unit 104 and generates motion-compensated predicted images for a plurality of block sizes of 16×16 or smaller and for a plurality of reference images. For the encoding target block input from the block division unit 102, it selects the prediction signal that leaves the least difference information to encode; the synthesized reference image signal input from the multiple reference image synthesis unit 107 is likewise treated as a candidate prediction signal in this selection. The motion-compensated prediction unit 105 outputs the selected motion-compensated prediction mode and prediction signal to the prediction mode determination unit 109. The motion-compensated prediction mode includes mode information indicating whether the motion compensation uses the synthesized reference image.

The motion vector prediction unit 106 calculates a predicted motion vector value from the motion vectors of surrounding encoded blocks and supplies it to the motion vector detection unit 104 and the motion-compensated prediction unit 105.

Using the predicted motion vector value, the motion vector detection unit 104 detects the optimal motion vector value while taking into account the code amount needed to encode the difference between the predicted motion vector value and the motion vector value. Similarly, the motion-compensated prediction unit 105 takes the same code amount into account when selecting the optimal block unit for motion-compensated prediction, the reference image to use, and the motion vector value.
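One common way to take the motion vector code amount into account is a cost of the form distortion plus a weight times the number of bits. The sketch below assumes that form, assumes the bit count is the length of a signed Exp-Golomb code for each component of the motion vector difference, and uses an arbitrary weight `lam`; none of these details is stated in the text, and the function names are hypothetical.

```python
def se_golomb_bits(value):
    """Length in bits of the signed Exp-Golomb code for an integer value."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (code_num + 1).bit_length() - 1


def mv_cost(distortion, mv, pred_mv, lam):
    """Distortion plus lam times the bits needed for the motion vector difference."""
    bits = sum(se_golomb_bits(c - p) for c, p in zip(mv, pred_mv))
    return distortion + lam * bits


if __name__ == "__main__":
    pred = (0, 0)
    # A candidate with slightly larger distortion can win if its vector is cheaper to code.
    print(mv_cost(100, (0, 0), pred, lam=4))
    print(mv_cost(90, (2, -1), pred, lam=4))
```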

The multiple reference image synthesis unit 107 receives the motion vector value for one reference image output from the motion vector detection unit 104 and a plurality of reference images stored in the decoded reference image memory 117, and synthesizes a reference image from the plurality of reference images. The synthesized reference image signal is output to the motion-compensated prediction unit 105. The detailed operation of the multiple reference image synthesis unit 107 is described later.

From the prediction modes and predicted images for each prediction technique, input from the intra-frame prediction unit 103 and the motion-compensated prediction unit 105, the prediction mode determination unit 109 selects the prediction signal that leaves the least difference information to encode for the encoding target block input from the block division unit 102. It outputs the predicted image block for the selected prediction technique to the subtractor 110 and the adder 115, and outputs to the entropy encoding unit 118 the prediction mode information, as additional information, together with the information that requires encoding for that prediction mode.

The subtractor 110 computes the difference between the encoding target block supplied from the block division unit 102 and the predicted image block supplied from the prediction mode determination unit 109, and supplies the result to the orthogonal transform unit 111 as a difference block.

The orthogonal transform unit 111 applies a DCT to the difference block in units of 4×4 or 8×8 pixels to generate DCT coefficients, which correspond to the orthogonally transformed frequency component signal. The orthogonal transform unit 111 also gathers the generated DCT coefficients per macroblock and outputs them to the quantization unit 112.
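A two-dimensional DCT of a small block can be sketched as a row transform followed by a column transform. The code below is a floating-point, orthonormal DCT-II written for illustration only; MPEG4-AVC itself specifies an integer approximation of this transform, and the function names here are hypothetical.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values)
        ))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform each row, then each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    for row in dct_2d(flat):
        print([round(c, 6) for c in row])
```

For a flat block all the energy lands in the single top-left (DC) coefficient, which is why smooth difference blocks compress well after the transform.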

The quantization unit 112 quantizes the DCT coefficients by dividing each one by a value that differs for each frequency component. It supplies the quantized DCT coefficients to the inverse quantization unit 113 and the entropy encoding unit 118.

The inverse quantization unit 113 inversely quantizes the quantized DCT coefficients input from the quantization unit 112 by multiplying each one by the value that was used as the divisor during quantization, and outputs the result to the inverse orthogonal transform unit 114 as decoded DCT coefficients.
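
The division and multiplication described for the quantization and inverse quantization units can be sketched as below. The step values and the rounding rule (round half away from zero) are assumptions made for illustration; the text only states that the divisor differs per frequency component.

```python
def round_half_away(value):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(value + 0.5) if value >= 0 else -int(-value + 0.5)


def quantize(coeffs, steps):
    """Divide each coefficient by the step for its frequency position."""
    return [[round_half_away(c / s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Multiply each level by the same step that was used as the divisor."""
    return [[level * s for level, s in zip(l_row, s_row)]
            for l_row, s_row in zip(levels, steps)]


coeffs = [[100, -37], [12, 5]]
steps = [[8, 10], [10, 16]]           # hypothetical per-frequency divisors
levels = quantize(coeffs, steps)      # [[13, -4], [1, 0]]
decoded = dequantize(levels, steps)   # [[104, -40], [10, 0]]
```

The decoded coefficients differ from the originals by at most half a step, which is the loss introduced by quantization.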

The inverse orthogonal transform unit 114 performs an inverse DCT to generate a decoded difference block, which it supplies to the adder 115.

The adder 115 adds the predicted image block supplied from the prediction mode determination unit 109 and the decoded difference block supplied from the inverse orthogonal transform unit 114 to generate a locally decoded block. The locally decoded block is stored, after inverse block conversion, in the intra-frame decoded image memory 116 and the decoded reference image memory 117. In MPEG4-AVC, adaptive filtering may also be applied, before the locally decoded block enters the decoded reference image memory 117, to block boundaries where per-block coding distortion tends to become visible.
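
A minimal sketch of the subtractor and adder pair follows. Clipping the reconstructed samples to the 8-bit range is an assumption of this sketch, and the function names are hypothetical.

```python
def subtract_blocks(target, predicted):
    """Difference block: target minus prediction, sample by sample."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, predicted)]


def add_blocks(predicted, residual, max_value=255):
    """Locally decoded block: prediction plus decoded residual.

    Clipping to [0, max_value] is an assumption, not stated in the text.
    """
    return [[min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


target = [[120, 130], [140, 250]]
predicted = [[118, 133], [140, 252]]
residual = subtract_blocks(target, predicted)   # [[2, -3], [0, -2]]
local_decoded = add_blocks(predicted, residual)
```

With a lossless residual the locally decoded block equals the target; in the real loop the residual has passed through quantization, so the reconstruction is only approximate.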

The entropy encoding unit 118 applies variable-length coding to the quantized DCT coefficients supplied from the quantization unit 112 and to the prediction mode information and mode-dependent information supplied from the prediction mode determination unit 109. Specifically, the information requiring encoding is the intra prediction mode and prediction block size for intra-frame prediction, and the prediction block size, reference image designation, and the difference between the motion vector and the predicted motion vector value for motion compensated prediction and synthesized image motion compensated prediction. The variable-length coded information is output from the entropy encoding unit 118 to the stream buffer 119 as an encoded bitstream.

The encoded bitstream stored in the stream buffer 119 is output to a recording medium or a transmission path via the output terminal 120. For code amount control, the code amount of the bitstream held in the stream buffer 119 is supplied to the code amount control unit 121 and compared with the target code amount, and the fineness of quantization (the quantization scale) in the quantization unit 112 is controlled so that the code amount approaches the target.
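
One way to picture the feedback from the code amount control unit is a simple proportional update of the quantization scale. This is only an illustrative sketch: the text does not specify the control law, and the `gain`, `q_min`, and `q_max` parameters are hypothetical.

```python
def update_quant_scale(q_scale, buffered_bits, target_bits,
                       gain=0.5, q_min=1, q_max=51):
    """Raise the scale when over the target, lower it when under.

    A proportional rule chosen for illustration; gain and limits are
    hypothetical values, not taken from the text.
    """
    error = (buffered_bits - target_bits) / target_bits
    proposed = round(q_scale * (1.0 + gain * error))
    return min(max(proposed, q_min), q_max)


coarser = update_quant_scale(20, buffered_bits=1200, target_bits=1000)
finer = update_quant_scale(20, buffered_bits=800, target_bits=1000)
```

A larger scale means coarser quantization and fewer bits, so the first call returns a value above 20 and the second a value below it.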

Next, the moving picture decoding apparatus that decodes the encoded bitstream generated by the moving picture encoding apparatus of Embodiment 1 is described. FIG. 2 is a configuration diagram of the moving picture decoding apparatus of Embodiment 1.

As shown in FIG. 2, the moving picture decoding apparatus of Embodiment 1 comprises an input terminal 200, a stream buffer 201, an entropy decoding unit 202, a prediction mode decoding unit 203, a predicted image selection unit 204, an inverse quantization unit 205, an inverse orthogonal transform unit 206, an adder 207, an intra-frame decoded image memory 208, a decoded reference image memory 209, an output terminal 210, an intra-frame prediction unit 211, a motion vector prediction decoding unit 212, a motion compensation prediction unit 213, and a multiple reference image synthesis unit 215.

The features of Embodiment 1 of the present invention are the provision of the multiple reference image synthesis unit 215 and the operation of that block and of the motion compensation prediction unit 213. For the other processing blocks, the same processing as in the blocks that make up the decoding process of a moving picture decoding apparatus such as an MPEG4-AVC decoder can be applied.

The encoded bitstream input from the input terminal 200 is supplied to the stream buffer 201, which absorbs fluctuations in the code amount of the bitstream and supplies it to the entropy decoding unit 202 in predetermined units such as frames. From the bitstream received via the stream buffer 201, the entropy decoding unit 202 performs variable-length decoding of the encoded prediction mode information, the additional information corresponding to the prediction mode, and the quantized DCT coefficients. It outputs the quantized DCT coefficients to the inverse quantization unit 205, and the prediction mode information and its additional information to the prediction mode decoding unit 203.

The inverse quantization unit 205, inverse orthogonal transform unit 206, adder 207, intra-frame decoded image memory 208, and decoded reference image memory 209 perform the same processing as the local decoding blocks of the moving picture encoding apparatus of Embodiment 1, namely the inverse quantization unit 113, inverse orthogonal transform unit 114, adder 115, intra-frame decoded image memory 116, and decoded reference image memory 117. The decoded image stored in the intra-frame decoded image memory 208 is sent via the output terminal 210 to a display device and displayed as a decoded image signal.

When the prediction mode information and mode-dependent additional information input from the entropy decoding unit 202 indicate that motion compensated prediction or synthesized image motion compensated prediction was selected, the prediction mode decoding unit 203 outputs to the motion vector prediction decoding unit 212 the motion compensation prediction mode or synthesized image motion compensation prediction mode, which indicates the block unit used for prediction, together with the decoded difference vector value. It also outputs the prediction mode information to the predicted image selection unit 204. In addition, according to the decoded prediction mode information, the prediction mode decoding unit 203 outputs to the intra-frame prediction unit 211 and the motion compensation prediction unit 213 information indicating that the unit has been selected and the additional information for that prediction mode.

According to the prediction mode information input from the prediction mode decoding unit 203, the predicted image selection unit 204 selects the predicted image for the decoding target block output from either the intra-frame prediction unit 211 or the motion compensation prediction unit 213, and outputs it to the adder 207.

When the decoded prediction mode indicates intra-frame prediction, the intra-frame prediction unit 211 receives the intra prediction mode from the prediction mode decoding unit 203 as mode-dependent additional information. According to that intra prediction mode, it reads from the intra-frame decoded image memory 208 the decoded image of the already decoded region around the decoding target block, and performs prediction using intra-frame correlation in the same intra prediction mode as the encoding apparatus. The intra-frame prediction unit 211 outputs the resulting intra-frame predicted image to the predicted image selection unit 204.

For the decoded difference vector value input from the prediction mode decoding unit 203, the motion vector prediction decoding unit 212 calculates a predicted motion vector value from the motion vectors of neighboring decoded blocks, using the same method as the encoding apparatus. It adds the difference vector value and the predicted motion vector value and outputs the sum, as the motion vector value of the decoding target block, to the motion compensation prediction unit 213 and the multiple reference image synthesis unit 215. As many motion vectors are decoded as were encoded for the prediction block units indicated by the motion compensation prediction mode or the synthesized image motion compensation prediction mode.
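
The reconstruction of a motion vector from its transmitted difference can be sketched as follows. The component-wise median of three neighboring vectors is used here as the predictor; that choice mirrors common MPEG4-AVC practice but is an assumption, since this passage only requires that encoder and decoder use the same method. The function names are hypothetical.

```python
def median_predictor(left, above, above_right):
    """Component-wise median of three neighboring motion vectors."""
    xs = sorted([left[0], above[0], above_right[0]])
    ys = sorted([left[1], above[1], above_right[1]])
    return (xs[1], ys[1])


def decode_motion_vector(diff, left, above, above_right):
    """Motion vector = predicted vector + decoded difference vector."""
    pred_x, pred_y = median_predictor(left, above, above_right)
    return (pred_x + diff[0], pred_y + diff[1])


mv = decode_motion_vector(diff=(1, -1),
                          left=(4, 0), above=(6, -2), above_right=(5, 8))
```

Here the predictor is (5, 0), so the decoded vector is (6, -1); only the small difference (1, -1) had to be transmitted.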

The motion compensation prediction unit 213 generates a motion compensated predicted image from the motion vector value input from the motion vector prediction decoding unit 212, the synthesized reference image signal input from the multiple reference image synthesis unit 215, and the mode-dependent additional information input from the prediction mode decoding unit 203, namely the motion compensation prediction mode and the information indicating whether synthesized image motion compensated prediction is used. It outputs the generated motion compensated predicted image to the predicted image selection unit 204.

The multiple reference image synthesis unit 215 receives the motion vector value for one reference image, as indicated by the synthesized image motion compensation prediction mode and output from the motion vector prediction decoding unit 212, together with multiple reference images stored in the decoded reference image memory 209, and synthesizes a reference image from those multiple reference images. The synthesized reference image signal is output to the motion compensation prediction unit 213.

The multiple reference image synthesis unit 215 is the counterpart of the multiple reference image synthesis unit 107 in the moving picture encoding apparatus of Embodiment 1; the detailed operation of this block is described later.

The following describes, with reference to FIG. 3, how a predicted image is generated by motion compensated prediction from a synthesized reference image in the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 1.

FIG. 3(c) is a conceptual diagram of the synthesized image motion compensated prediction method of the present invention. FIGS. 3(a) and 3(b) are conceptual diagrams of motion compensated prediction using multiple reference images as used in MPEG4-AVC.

FIG. 3(a) shows the method called bidirectional prediction: a motion vector is detected between the encoding target block and each of two reference images, the motion vector for each reference image is transmitted, and the average of the reference blocks indicated by the two motion vectors is used as the predicted image. Combining two reference images acts as a motion-adaptive filter in the temporal direction that removes coding degradation components, and the averaging lets the predicted image built from the reference images follow slight luminance changes in the object being encoded.

FIG. 3(b) shows the method called temporal direct mode, which performs prediction from two reference images without transmitting a motion vector. When the block in reference image 2 at the same position as the encoding target block was generated by motion compensated prediction from reference image 1, that motion is assumed to be temporally continuous, motion vector values between the encoding target block and reference images 1 and 2 are derived from it, and bidirectional prediction is performed with those vectors. A predicted image combining two reference images can thus be generated without transmitting any motion vector. However, as FIG. 3(b) shows, when the motion vector value between reference image 1 and reference image 2 is large, the motion it represents belongs to a position spatially distant from the encoding target block, and the vectors are derived implicitly on the assumption that the motion is temporally continuous. Temporal direct mode therefore does not work effectively when the motion vector values have little spatial or temporal continuity.
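
The implicit derivation used by temporal direct mode can be sketched as a scaling of the co-located block's motion vector by temporal distance. The sign conventions, the parameter names `tb` and `td`, and the use of floor division instead of the fixed-point rounding of a real codec are all assumptions of this sketch.

```python
def temporal_direct_mvs(mv_col, tb, td):
    """Derive both vectors of the target block from the co-located vector.

    mv_col: vector of the co-located block in reference image 2,
            pointing to reference image 1 (assumed convention).
    tb:     temporal distance from the target image to reference image 1.
    td:     temporal distance from reference image 2 to reference image 1.
    """
    mv_to_ref1 = (mv_col[0] * tb // td, mv_col[1] * tb // td)
    mv_to_ref2 = (mv_to_ref1[0] - mv_col[0], mv_to_ref1[1] - mv_col[1])
    return mv_to_ref1, mv_to_ref2


# Target image midway between the two reference images.
mv1, mv2 = temporal_direct_mvs(mv_col=(8, -4), tb=1, td=2)
```

With the target midway between the references, the two derived vectors are equal and opposite, (4, -2) and (-4, 2). The derivation is only as good as the assumption that the motion is linear over time, which is the weakness the passage points out.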

The method of Patent Document 1 aims to improve the quality of temporal direct mode. On both the encoding and decoding sides, motion is detected between the reference images using blocks located at symmetric positions about the encoding target block, which yields a temporally continuous motion vector passing through the encoding target block. This works for conditions that lack spatial continuity, but when temporal continuity is low it fails to work effectively, just like temporal direct mode.

In the prediction scheme of the synthesized motion compensated prediction of the present invention, shown in FIG. 3(c), the motion vector to reference image 1 is detected and encoded. For the reference block indicated by that motion vector in reference image 1, motion vector detection between reference images, against reference image 2, is performed on both the encoding and decoding sides. Only the motion vector value for reference image 1 is transmitted, and a synthesized reference image is generated from the two reference images. This allows good bidirectional prediction even for images whose spatial or temporal continuity is not preserved, and achieves motion compensated prediction with less motion vector information than conventional bidirectional prediction.

Next, FIG. 4 shows the configuration of the multiple reference image synthesis unit in the moving picture encoding apparatus of Embodiment 1, and the reference image synthesis operation is described. As shown in FIG. 4, the multiple reference image synthesis unit 107 comprises a standard reference image acquisition unit 400, a motion vector detection range setting unit 401, an inter-reference-image motion vector detection unit 402, a synthesized reference image acquisition unit 403, a reference image synthesis unit 404, and a synthesized image memory 405.

First, the motion vector detection unit 104 inputs the motion vector value MV1 between the first reference image and the encoding target block to the standard reference image acquisition unit 400 and the motion vector detection range setting unit 401. The standard reference image acquisition unit 400 uses MV1 to obtain the reference block of the first reference image from the decoded reference image memory 117, and outputs this first reference block to the inter-reference-image motion vector detection unit 402 and the reference image synthesis unit 404.

Next, the motion vector detection range setting unit 401 sets the range in which a motion vector between the first reference block and the second reference image is to be detected. For this detection range, the encoding apparatus and decoding apparatus can implicitly use the same setting, or the detection range can be transmitted as encoded information per frame or per reference image used. In Embodiment 1 the detection range is set implicitly (for example, ±32 pixels), and the center of the detection range is the position in the reference image that coincides with the position of the encoding target block.

For the first reference block input from the standard reference image acquisition unit 400, the inter-reference-image motion vector detection unit 402 obtains reference blocks of the second reference image within the detection range specified by the motion vector detection range setting unit 401, reading them from the decoded reference image memory 117 via the synthesized reference image acquisition unit 403. It calculates an error value by block matching or a similar measure, and takes the motion vector that gives the smallest error as the inter-reference-image motion vector. For the detection accuracy as well, the encoding apparatus and decoding apparatus can implicitly use the same accuracy, or the accuracy can be transmitted as encoded information per frame or per reference image used. Here the implicit setting is 1/4-pixel accuracy. The inter-reference-image motion vector detection unit 402 outputs the calculated inter-reference-image motion vector to the reference image synthesis unit 404.
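
The search performed by the inter-reference-image motion vector detection unit can be sketched as an exhaustive block-matching search. This sketch is simplified under stated assumptions: it works at integer-pixel accuracy only (the text uses 1/4-pixel accuracy), uses SAD as the error value, skips candidates that fall outside the image, and reports the vector as a displacement from the search center. All function names are hypothetical.

```python
def get_block(frame, x, y, size):
    """Cut a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute differences, used here as the matching error."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def search_inter_reference_mv(ref_block, ref2, center_x, center_y, search_range):
    """Full search in ref2 around (center_x, center_y); returns (dx, dy, cost)."""
    size = len(ref_block)
    height, width = len(ref2), len(ref2[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = center_x + dx, center_y + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate block would leave the image
            cost = sad(ref_block, get_block(ref2, x, y, size))
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Second reference image: an 8x8 ramp so every 2x2 block is distinct.
ref2 = [[y * 8 + x for x in range(8)] for y in range(8)]
first_ref_block = get_block(ref2, 5, 3, 2)   # pretend this came from reference 1
result = search_inter_reference_mv(first_ref_block, ref2,
                                   center_x=4, center_y=4, search_range=2)
```

Because the decoder can run exactly the same search on the same decoded reference images, the resulting vector never needs to be transmitted.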

The reference image synthesis unit 404 receives the first reference block from the standard reference image acquisition unit 400 and the inter-reference-image motion vector from the inter-reference-image motion vector detection unit 402. It obtains the second reference block by reading the reference block of the second reference image indicated by the inter-reference-image motion vector from the decoded reference image memory 117 via the synthesized reference image acquisition unit 403. It then synthesizes the first and second reference blocks. In Embodiment 1 the synthesis is performed, for example, by averaging the first and second reference blocks pixel by pixel to produce the synthesized reference block. The reference image synthesis unit 404 outputs the synthesized reference block to the motion compensation prediction unit 105 via the synthesized image memory 405.
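
The pixel-wise averaging given as the example synthesis can be sketched as below. Rounding the average upward on ties, via `(a + b + 1) >> 1`, is an assumption; the text only says that the per-pixel average is taken.

```python
def synthesize_reference_block(first_block, second_block):
    """Per-pixel average of two reference blocks with rounding (assumed)."""
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first_block, second_block)]


first_block = [[10, 20], [30, 41]]
second_block = [[12, 21], [30, 40]]
synthesized = synthesize_reference_block(first_block, second_block)
```

Whatever rounding rule is chosen, the encoder and decoder must apply the same one so that both sides produce an identical synthesized reference block.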

Next, FIG. 5 shows the configuration of the multiple reference image synthesis unit 215 in the moving picture decoding apparatus of Embodiment 1, and its operation is described. As shown in FIG. 5, the multiple reference image synthesis unit 215 consists of a standard reference image acquisition unit 1000, a motion vector detection range setting unit 1001, an inter-reference-image motion vector detection unit 1002, a synthesized reference image acquisition unit 1003, a reference image synthesis unit 1004, and a synthesized image memory 1005. These operate in the same way as the standard reference image acquisition unit 400, motion vector detection range setting unit 401, inter-reference-image motion vector detection unit 402, synthesized reference image acquisition unit 403, reference image synthesis unit 404, and synthesized image memory 405 shown in FIG. 4.

First, the motion vector prediction decoding unit 212 inputs the decoded motion vector value MV1 to the standard reference image acquisition unit 1000 and the motion vector detection range setting unit 1001. The standard reference image acquisition unit 1000 uses MV1 to obtain the reference block of the first reference image from the decoded reference image memory 209, and outputs this first reference block to the inter-reference-image motion vector detection unit 1002 and the reference image synthesis unit 1004.

Next, the motion vector detection range setting unit 1001 sets the range in which a motion vector between the first reference block and the second reference image is to be detected. In Embodiment 1 the implicit settings are a detection accuracy of 1/4 pixel and a detection range of ±32 pixels centered on the position in the reference image that coincides with the position of the target block. The motion vector detection range setting unit 1001 outputs the detection range information to the inter-reference-image motion vector detection unit 1002.

For the first reference block input from the standard reference image acquisition unit 1000, the inter-reference-image motion vector detection unit 1002 obtains reference blocks of the second reference image within the detection range specified by the motion vector detection range setting unit 1001, reading them from the decoded reference image memory 209 via the synthesized reference image acquisition unit 1003. It calculates an error value by block matching or a similar measure, and takes the motion vector that gives the smallest error as the inter-reference-image motion vector. It outputs the calculated inter-reference-image motion vector to the reference image synthesis unit 1004.

The reference image synthesis unit 1004 receives the first reference block from the standard reference image acquisition unit 1000 and the inter-reference-image motion vector from the inter-reference-image motion vector detection unit 1002. It obtains the second reference block by reading the reference block of the second reference image indicated by the inter-reference-image motion vector from the decoded reference image memory 209 via the synthesized reference image acquisition unit 1003. It then synthesizes the first and second reference blocks and outputs the synthesized reference block to the motion compensation prediction unit 213 via the synthesized image memory 1005.

With the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 1 of the present invention, a motion vector is obtained between the encoding target block and the first reference image, and another reference image is combined with the resulting motion compensated reference image. A motion compensated predicted image with a small prediction residual can thus be obtained with little additional information, since only one motion vector is transmitted.

Furthermore, the inter-reference-image motion vector value and the motion vector value MV1 can be used to generate a motion vector value between the encoding target block and the second reference image. That value can be stored in the motion vector prediction unit 106 and the motion vector prediction decoding unit 212 and used for the predicted motion vector values of subsequent encoding target blocks. This increases the number of motion vector values known to the decoding apparatus and improves motion vector prediction accuracy, giving the additional benefit that motion vectors can be transmitted with less information.
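
The derivation of the motion vector to the second reference image can be sketched as a vector sum. This assumes the inter-reference-image motion vector is expressed as the displacement from the first reference block to the second reference block; if an implementation measures it from a different origin (for example from the search center), the combination changes accordingly. Values are in hypothetical 1/4-pixel units.

```python
def derive_second_reference_mv(mv1, inter_reference_mv):
    """Target block -> second reference image, under the assumed convention.

    mv1:                target block -> first reference block
    inter_reference_mv: first reference block -> second reference block
    """
    return (mv1[0] + inter_reference_mv[0], mv1[1] + inter_reference_mv[1])


mv1 = (12, -4)            # transmitted vector, 1/4-pixel units (hypothetical)
inter_ref_mv = (-20, 6)   # found by the search on both sides, never transmitted
mv2 = derive_second_reference_mv(mv1, inter_ref_mv)
```

Both MV1 and the derived vector can then be kept as candidates when predicting the motion vectors of later blocks.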

(Embodiment 2)
Next, Embodiment 2 is described. In Embodiment 2, the accuracy of the motion vector used for reference image synthesis is made coarser, and fine-accuracy phase alignment is then performed on the synthesized reference image. FIG. 6 is a block diagram showing the configuration of the moving picture encoding apparatus of Embodiment 2.

As shown in FIG. 6, the moving picture encoding apparatus of Embodiment 2 comprises an input terminal 100, an input image buffer 101, a block division unit 102, an intra-frame prediction unit 103, a motion vector detection unit 104, a motion compensation prediction unit 105, a motion vector prediction unit 106, a multiple reference image synthesis unit 107, a synthesized image motion compensation prediction unit 108, a prediction mode determination unit 109, a subtractor 110, an orthogonal transform unit 111, a quantization unit 112, an inverse quantization unit 113, an inverse orthogonal transform unit 114, an adder 115, an intra-frame decoded image memory 116, a decoded reference image memory 117, an entropy encoding unit 118, a stream buffer 119, an output terminal 120, and a code amount control unit 121. Compared with Embodiment 1, the synthesized image motion compensation prediction unit 108 is added and the multiple reference image synthesis unit 107 operates differently. Only the operation of the functional blocks related to the added synthesized image motion compensation prediction unit 108 is described below.

The motion compensation prediction unit 105 receives the motion vector values obtained by the motion vector detection unit 104 and, as in Embodiment 1, generates motion compensated predicted images for multiple block sizes of 16×16 or smaller and for multiple reference images. It selects the prediction signal that leaves the least difference information to encode for the encoding target block input from the block division unit 102, and outputs the selected motion compensation prediction mode and prediction signal to the prediction mode determination unit 109.

The motion vector prediction unit 106 calculates a predicted motion vector value from the motion vectors of surrounding, already encoded blocks by the same method as in Embodiment 1, and supplies it to the motion vector detection unit 104, the motion compensation prediction unit 105, and the synthesized image motion compensation prediction unit 108.

The multiple reference image synthesis unit 107 receives the motion vector value for one reference image output from the motion vector detection unit 104 and the plurality of reference images stored in the decoded reference image memory 117, and performs reference image synthesis using the plurality of reference images. The synthesized reference image signal is output to the synthesized image motion compensation prediction unit 108.

The synthesized image motion compensation prediction unit 108 uses the synthesized reference image signal input from the multiple reference image synthesis unit 107 and the predicted motion vector value input from the motion vector prediction unit 106. For the encoding target block input from the block division unit 102, it selects the prediction signal that leaves the least difference information to be encoded, and outputs the selected synthesized image motion compensation prediction mode and prediction signal to the prediction mode determination unit 109. The detailed operation of the multiple reference image synthesis unit 107 and the synthesized image motion compensation prediction unit 108 is described later.

Next, a moving picture decoding apparatus that decodes the encoded bitstream generated by the moving picture encoding apparatus of Embodiment 2 is described. FIG. 7 is a configuration diagram of the moving picture decoding apparatus of Embodiment 2.

As shown in FIG. 7, the moving picture decoding apparatus according to Embodiment 2 includes an input terminal 200, a stream buffer 201, an entropy decoding unit 202, a prediction mode decoding unit 203, a prediction image selection unit 204, an inverse quantization unit 205, an inverse orthogonal transform unit 206, an adder 207, an intra-frame decoded image memory 208, a decoded reference image memory 209, an output terminal 210, an intra-frame prediction unit 211, a motion vector prediction decoding unit 212, a motion compensation prediction unit 213, a motion vector separation unit 214, a multiple reference image synthesis unit 215, and a synthesized image motion compensation prediction unit 216. Compared with Embodiment 1, the motion vector separation unit 214 and the synthesized image motion compensation prediction unit 216 are added, and the multiple reference image synthesis unit 215 operates differently. Only the operation of the functional blocks related to the added motion vector separation unit 214 and synthesized image motion compensation prediction unit 216 is described below.

The prediction mode decoding unit 203 performs the same processing as in Embodiment 1. It differs in that, according to the decoded prediction mode information, it outputs information indicating which unit was selected, together with additional information corresponding to the prediction mode, to the intra-frame prediction unit 211, the motion compensation prediction unit 213, and the synthesized image motion compensation prediction unit 216.

The prediction image selection unit 204 selects the prediction image for the decoding target block according to the prediction mode information input from the prediction mode decoding unit 203. The candidates are the outputs of the intra-frame prediction unit 211 and the motion compensation prediction unit 213 and, additionally in this embodiment, the output of the synthesized image motion compensation prediction unit 216. The selected prediction image is output to the adder 207.

The motion vector prediction decoding unit 212 calculates the motion vector value of the decoding target block by the same method as in Embodiment 1 and outputs it to the motion compensation prediction unit 213 and the synthesized image motion compensation prediction unit 216. As many motion vectors are decoded as were encoded, according to the block unit of prediction processing indicated by the motion compensation prediction mode or the synthesized image motion compensation prediction mode.

The motion compensation prediction unit 213 generates a motion compensated prediction image from the motion vector value input from the motion vector prediction decoding unit 212 and from the motion compensation prediction mode, which is input from the prediction mode decoding unit 203 as additional information corresponding to the prediction mode. It outputs the generated motion compensated prediction image to the prediction image selection unit 204.

The motion vector separation unit 214 separates the motion vector value input from the motion vector prediction decoding unit 212 into two parts: a motion vector value converted to a predetermined pixel accuracy (hereinafter, the reference motion vector value) and the difference between the input motion vector value and the reference motion vector value (hereinafter, the correction vector value). It outputs the reference motion vector value to the multiple reference image synthesis unit 215 and the correction vector value to the synthesized image motion compensation prediction unit 216. As many reference motion vector values and correction vector values are decoded as were encoded, according to the block unit of prediction processing indicated by the synthesized image motion compensation prediction mode.
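The separation performed by the motion vector separation unit 214 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_motion_vector`, the quarter-pixel integer units, and the round-half-up rule are assumptions, since the text only states that the vector is converted to a predetermined pixel accuracy.

```python
def split_motion_vector(mv, step=4):
    """Split a fine-accuracy motion vector into (reference, correction).

    mv   : (x, y) in fine units (assumed here: 1/4-pixel units).
    step : fine units per coarse step (assumed 4, i.e. N = 1 pixel, M = 1/4 pixel).
    """
    def split_component(v):
        # Round to the nearest multiple of `step`; halves round upward.
        # The rounding rule is an assumption; encoder and decoder must share it.
        reference = ((v + step // 2) // step) * step
        return reference, v - reference

    ref_x, cor_x = split_component(mv[0])
    ref_y, cor_y = split_component(mv[1])
    return (ref_x, ref_y), (cor_x, cor_y)


reference_mv, correction_mv = split_motion_vector((9, -6))
print(reference_mv, correction_mv)
```

With these assumed units, each correction component stays within half a coarse step of zero, and adding the two outputs component by component recovers the input vector.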

The multiple reference image synthesis unit 215 receives the reference motion vector value output from the motion vector separation unit 214 for the one reference image indicated by the synthesized image motion compensation prediction mode, together with the plurality of reference images stored in the decoded reference image memory 209, and performs reference image synthesis using the plurality of reference images. The synthesized reference image signal is output to the synthesized image motion compensation prediction unit 216.

The synthesized image motion compensation prediction unit 216 uses the synthesized reference image signal input from the multiple reference image synthesis unit 215 and the correction vector value output from the motion vector separation unit 214 for the one reference image indicated by the synthesized image motion compensation prediction mode, and cuts out a prediction block for the decoding target block from the synthesized reference image signal. It combines the prediction blocks cut out for all blocks indicated by the synthesized image motion compensation prediction mode into a synthesized motion compensated prediction image and outputs that image to the prediction image selection unit 204.

The multiple reference image synthesis unit 215 and the synthesized image motion compensation prediction unit 216 are the counterparts of the multiple reference image synthesis unit 107 and the synthesized image motion compensation prediction unit 108 in the moving picture encoding apparatus of Embodiment 2 of the present invention. The detailed operation of these blocks and of the motion vector separation unit 214 is described later.

In the following, the overall mechanism of the prediction image generation method for synthesized image motion compensation prediction, which operates in the moving picture encoding apparatus and the moving picture decoding apparatus of Embodiment 2, is first described with reference to FIG. 8, and the detailed operation is described afterwards.

FIG. 8 is a conceptual diagram showing the synthesized image motion compensation prediction process in Embodiment 2 of the present invention. On the encoding apparatus side, the reference image used as the base is first taken as the first reference image, a motion vector between the encoding target frame and the first reference image is detected, and a first motion vector value MV1 is generated. In the configuration of FIG. 6, MV1 is obtained by the motion vector detection unit 104. The accuracy of MV1 is N pixels (for example, 1 pixel). When the motion vector value detected by the motion vector detection unit 104 is finer than N-pixel accuracy, MV1 is generated by rounding the detected value to N-pixel accuracy.

Next, the reference block cut out from the first reference image by MV1 is taken as the first reference block, and an inter-reference-image motion vector between the first reference block and a second reference image is detected. Based on the detected motion, a prediction image of M-pixel accuracy, where M < N (for example, 1/4 pixel), is generated by filtering or similar means for the first reference block and its surroundings. A prediction image of the same accuracy is generated for the second reference block, which is cut out from the second reference image using the inter-reference-image motion vector, and its surroundings. These are used to generate a synthesized prediction image that includes the surroundings.

Finally, motion vector detection at M-pixel accuracy is performed between the generated prediction image including the surroundings and the encoding target block, by block matching or similar means. The resulting second motion vector value MV2 is encoded and transmitted as the motion vector between the encoding target block and the first reference image. The synthesized prediction image at the position specified by MV2 is used as the synthesized motion compensation prediction block and subtracted from the encoding target block, and the difference block is encoded and transmitted.

On the decoding apparatus side, the received second motion vector value MV2 is first rounded to N-pixel accuracy to restore the first motion vector value MV1. Next, the reference block cut out from the first reference image by MV1 is taken as the first reference block, and an inter-reference-image motion vector between the first reference block and the second reference image is detected. Based on the detected motion, a prediction image of M-pixel accuracy is generated for the first reference block and its surroundings by the filtering or similar means defined on the encoding side. A prediction image of the same accuracy is generated for the second reference block, which is cut out from the second reference image using the inter-reference-image motion vector, and its surroundings. These are used to generate a synthesized prediction image that includes the surroundings.

Finally, cutting out the synthesized prediction image at the position specified by the second motion vector value MV2 from the generated prediction image including the surroundings yields the same synthesized motion compensation prediction block as the one generated on the encoding apparatus side.

In this mechanism, the first reference block is treated as a template that carries information close to the encoding target block, and image synthesis with motion compensation is performed against another reference image. As in Embodiment 1, this yields a prediction signal with characteristics close to prediction using two reference images (bidirectional prediction) in motion compensated prediction such as MPEG-4 AVC. In addition, the encoding apparatus does not need to determine the motion vector value by running the synthesis process for every fine-accuracy candidate (for example, at the 1/4-pixel level). Instead, taking a coarse motion vector value (for example, at 1-pixel accuracy) as the base, it corrects the motion vector at fine accuracy (M-pixel accuracy) on the synthesized reference image. With little processing, this gives a motion vector value that accounts for the slight phase shift introduced by synthesis and for the removal of reference-image noise by synthesis, so a highly accurate prediction image block can be generated.

Furthermore, in the decoding apparatus, an image that can be generated by directly fetching reference image pixels at 1-pixel accuracy can be used for the inter-reference-image motion vector detection in the synthesis process. As a result, the fractional-pixel filtering and the motion vector detection can run in parallel.

Next, FIG. 9 shows the configuration of the multiple reference image synthesis unit 107 in the encoding apparatus, which realizes the mechanism shown in FIG. 8, and FIG. 10 shows a flowchart of the synthesized image motion compensation prediction process carried out by the multiple reference image synthesis unit 107 and the synthesized image motion compensation prediction unit 108. Their detailed operation is described below.

As shown in FIG. 9, the multiple reference image synthesis unit 107 includes a standard reference image acquisition unit 400, a motion vector detection range setting unit 401, an inter-reference-image motion vector detection unit 402, a synthesis reference image acquisition unit 403, a reference image synthesis unit 404, and a synthesized image memory 405.

First, the motion vector detection unit 104 inputs the motion vector value MV1 between the first reference image and the encoding target block to the standard reference image acquisition unit 400 and the motion vector detection range setting unit 401. The standard reference image acquisition unit 400 uses MV1 to acquire the reference block of the first reference image from the decoded reference image memory 117. The acquired area is centered on the position in the first reference image displaced from the encoding target block by MV1, and covers what is needed to create an M-pixel-accuracy (M < N) reference image for the target block ± N/2 pixels or more. For example, when N is 1 pixel, M is 1/4 pixel, and the 6-tap filter used in MPEG-4 AVC serves as the interpolation filter for generating the 1/4-pixel-accuracy image, a reference image area of the encoding target block size plus ±3 pixels is acquired as the first reference block. The standard reference image acquisition unit 400 outputs the acquired first reference block to the inter-reference-image motion vector detection unit 402 and the reference image synthesis unit 404.
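As a rough illustration of the acquisition area in the example above (block size plus ±3 pixels around the position displaced by MV1), the window can be computed as below. The function name `reference_fetch_window`, the coordinate convention, and the exclusive right and bottom edges are assumptions made for this sketch; clamping at picture boundaries is not covered by the text and is omitted.

```python
def reference_fetch_window(block_x, block_y, block_w, block_h, mv1, margin=3):
    """Return (left, top, right, bottom) of the first reference block area.

    block_x, block_y : top-left corner of the encoding target block, in pixels.
    mv1              : (dx, dy) integer-pixel motion vector MV1 (N = 1 pixel).
    margin           : extra pixels on every side (3 in the 6-tap filter example).
    right and bottom are exclusive; no picture-boundary clamping is applied.
    """
    left = block_x + mv1[0] - margin
    top = block_y + mv1[1] - margin
    right = left + block_w + 2 * margin
    bottom = top + block_h + 2 * margin
    return left, top, right, bottom


# A 16x16 block at (16, 32) displaced by MV1 = (5, -2) needs a 22x22 pixel area.
print(reference_fetch_window(16, 32, 16, 16, (5, -2)))
```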

Next, the motion vector detection range setting unit 401 sets the range over which a motion vector to the second reference image is detected for the first reference block. For this detection range, the encoding apparatus and the decoding apparatus may implicitly apply the same setting, or the detection range setting may be transmitted as encoded information per frame or per reference image used. The detection range setting algorithm in Embodiment 2 is described later with reference to FIG. 10. The motion vector detection range setting unit 401 outputs the detection range information it has set to the inter-reference-image motion vector detection unit 402.

For the first reference block input from the standard reference image acquisition unit 400, the inter-reference-image motion vector detection unit 402 acquires reference blocks of the second reference image within the detection range specified by the motion vector detection range setting unit 401, from the decoded reference image memory 117 via the synthesis reference image acquisition unit 403. It calculates an error value by block matching or the like and takes the motion vector with a small error value as the inter-reference-image motion vector. As with the detection range, the encoding apparatus and the decoding apparatus may implicitly use the same detection accuracy, or the detection accuracy may be transmitted as encoded information per frame or per reference image used. The inter-reference-image motion vector detection unit 402 outputs the calculated inter-reference-image motion vector to the reference image synthesis unit 404.
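The block matching step can be sketched as an integer-pixel full search. The text does not fix the error measure or the search order, so the sum of absolute differences (SAD), the raster scan, and the names `sad` and `full_search` are assumptions for illustration only.

```python
def sad(block, image, x, y):
    """Sum of absolute differences between `block` and `image` at top-left (x, y)."""
    return sum(abs(block[r][c] - image[y + r][x + c])
               for r in range(len(block))
               for c in range(len(block[0])))


def full_search(block, image, center_x, center_y, search_range):
    """Return (cost, dx, dy) of the best match within +/- search_range pixels.

    Candidates that fall outside the image are skipped. On equal cost the
    first candidate in raster order is kept (an assumption; encoder and
    decoder would need the same tie-break to stay in sync).
    """
    block_h, block_w = len(block), len(block[0])
    image_h, image_w = len(image), len(image[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = center_x + dx, center_y + dy
            if x < 0 or y < 0 or x + block_w > image_w or y + block_h > image_h:
                continue
            cost = sad(block, image, x, y)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Toy second reference image: all zeros except a 2x2 patch at x=4, y=3.
second_reference = [[0] * 8 for _ in range(8)]
first_reference_block = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        second_reference[3 + r][4 + c] = first_reference_block[r][c]

print(full_search(first_reference_block, second_reference, 3, 2, 2))
```

Because the decoder repeats the same search, any implementation of this step has to be deterministic, including its handling of ties and of candidates near the picture edge.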

The reference image synthesis unit 404 receives the first reference block from the standard reference image acquisition unit 400 and the inter-reference-image motion vector from the inter-reference-image motion vector detection unit 402. It obtains the second reference block by acquiring the reference block of the second reference image indicated by the inter-reference-image motion vector from the decoded reference image memory 117 via the synthesis reference image acquisition unit 403. The reference image synthesis unit 404 then synthesizes the first reference block and the second reference block. In Embodiment 2 the synthesis is done, for example, by taking the per-pixel average of the first and second reference blocks to produce the synthesized reference block. The reference image synthesis unit 404 outputs the synthesized reference block to the synthesized image motion compensation prediction unit 108 via the synthesized image memory 405.
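The per-pixel averaging mentioned as an example can be written in a few lines. The function name `synthesize_blocks` and the round-half-up integer average are assumptions; the text only says that the average of each pixel pair is taken.

```python
def synthesize_blocks(first_block, second_block):
    """Average two equally sized blocks pixel by pixel.

    Uses (a + b + 1) >> 1, i.e. integer averaging with halves rounded up.
    The rounding rule is an assumption, not stated in the text.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(row1, row2)]
            for row1, row2 in zip(first_block, second_block)]


print(synthesize_blocks([[10, 20], [30, 41]], [[12, 21], [30, 40]]))
```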

Next, the operation of the synthesized image motion compensation prediction process using these components is described with reference to the flowchart of FIG. 10. FIG. 10 shows the flow of synthesized image motion compensation prediction in the encoding of one frame. For the other processing units, conventional moving picture encoding such as MPEG-4 AVC can be used.

At the start of processing for one frame, the reference image to be synthesized is first determined for each reference image (S500). In Embodiment 2, the reference image for motion compensated prediction can be selected from a plurality of images. FIG. 11 shows an example of the encoding order and reference image management in Embodiment 2, which is described below.

Encoding that does not use motion compensated prediction, called an I slice, is performed for the first frame and intermittently thereafter. The decoded image of a frame encoded as an I slice is stored in the decoded reference image memory 117 and serves as a reference image for frames encoded afterwards.

A P slice is a frame that uses the decoded image of a temporally earlier frame as a reference image and allows compression by temporal correlation using motion compensated prediction. In the example encoding order of Embodiment 2 in FIG. 11, the decoded images of all P slices are used as reference images. Each added reference image is stored in the decoded reference image memory 117, up to a predefined number of reference images.

A B slice is a frame in which motion compensated prediction can be performed by adding two reference images. Using temporally preceding and following reference images allows motion compensated prediction with high accuracy, but when two reference images are used, two motion vectors must be encoded. In the example encoding order of Embodiment 2 in FIG. 11, the decoded images of B slices are not used as reference images.

As in the example of FIG. 11, where a B slice is placed at every other frame and four reference images can be stored, a new reference image is stored after each I frame or P frame is encoded. When storing it would exceed the four-image capacity, one reference image is discarded so that the new decoded image can be used as a reference image. In the example reference image management of Embodiment 2 in FIG. 11, the temporally oldest frame is chosen for discarding.
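The reference image management in this example can be sketched as a small buffer that keeps I and P frames, ignores B frames, and evicts the temporally oldest entry when full. The class name `ReferenceBuffer`, the use of a picture order count (`poc`) as the frame identifier, and the coding order in the usage lines are assumptions for illustration.

```python
class ReferenceBuffer:
    """Holds up to `capacity` reference frames, identified by display time (poc)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.pocs = []

    def on_frame_coded(self, poc, slice_type):
        # In the FIG. 11 example, decoded B slices are not kept as references.
        if slice_type == "B":
            return
        # When full, discard the temporally oldest frame before storing.
        if len(self.pocs) >= self.capacity:
            self.pocs.remove(min(self.pocs))
        self.pocs.append(poc)


# Assumed coding order with a B slice at every other frame: I0 P2 B1 P4 B3 P6 B5 P8 B7
buffer = ReferenceBuffer()
for poc, slice_type in [(0, "I"), (2, "P"), (1, "B"), (4, "P"), (3, "B"),
                        (6, "P"), (5, "B"), (8, "P"), (7, "B")]:
    buffer.on_frame_coded(poc, slice_type)
print(buffer.pocs)
```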

Because a plurality of reference images can be used selectively for the encoding target frame in this way, the reference image to be synthesized is first determined for each reference image. Correct synthesis is possible when an implicit rule is defined and the encoding apparatus and the decoding apparatus make the same determination.

For example, when the encoding target frame is a B slice, the second reference image (the reference image used for synthesis) is the reference image closest to the encoding target frame among those on the opposite side of the encoding target frame in time from the first reference image, which is the base reference image. When the encoding target is a P slice and the first reference image is the reference image closest to the encoding target frame, the second closest reference image is used as the second reference image. Otherwise, the reference image closest to the encoding target frame is used as the second reference image.
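The implicit rule above can be expressed as a small selection function. The name `select_second_reference`, the use of picture order counts for time, and the `None` result when no candidate exists are assumptions; the text does not say what happens when a B slice has no reference on the opposite side.

```python
def select_second_reference(slice_type, poc_cur, poc_ref1, ref_pocs):
    """Pick the second reference image for a given first reference image.

    slice_type : "B" or "P".
    poc_cur    : display time of the encoding target frame.
    poc_ref1   : display time of the first reference image.
    ref_pocs   : display times of all stored reference images.
    """
    by_distance = sorted(ref_pocs, key=lambda poc: abs(poc - poc_cur))
    if slice_type == "B":
        # Nearest reference on the other side of the target frame in time.
        opposite = [poc for poc in by_distance
                    if (poc - poc_cur) * (poc_ref1 - poc_cur) < 0]
        return opposite[0] if opposite else None  # fallback is an assumption
    # P slice: avoid pairing the nearest reference with itself.
    if by_distance[0] == poc_ref1:
        return by_distance[1] if len(by_distance) > 1 else None
    return by_distance[0]


print(select_second_reference("B", 3, 2, [0, 2, 4, 6]))
print(select_second_reference("P", 8, 6, [0, 2, 4, 6]))
print(select_second_reference("P", 8, 2, [0, 2, 4, 6]))
```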

Once the reference image to be synthesized has been determined for every reference image, the motion vector detection accuracy between reference images is determined (S501). Here it is set to 1/4 pixel, the accuracy of the motion vector transmitted for the final synthesized motion compensation prediction. However, by detecting motion at a finer accuracy, for example 1/8 pixel, synthesis can be performed at fine accuracy without raising the accuracy of the transmitted motion vector.

Next, the motion vector detection range between reference images is determined (S502). The whole area of the second reference image could be used as the detection range for every first reference block, and the encoding apparatus of Embodiment 2 functions as long as detection follows the same definition as in the decoding apparatus. To reduce the computation required for inter-reference-image motion vector detection, however, the detection range is set as shown in FIG. 12.

FIG. 12 shows an example of the motion vector detection range between reference images in Embodiment 2. Let Poc_Cur be the input time of the encoding target image, Poc_Ref1 that of the first reference image, and Poc_Ref2 that of the second reference image. Given the motion vector MV1 from the first reference image for the encoding target block, the search center position in the second reference image, measured from the position of the encoding target block, is set to
α = MV1 × (Poc_Cur - Poc_Ref2) / (Poc_Cur - Poc_Ref1)
that is, to the predicted motion vector between the encoding target block and the second reference image under the assumption that the motion is continuous in time.

However, in many situations, such as camera motion or object motion, the change is not continuous in time. The motion vector is therefore searched over a specific region centered on the search center position, so that a reference block of the second reference image suitable for synthesis can be obtained. In the example of FIG. 12, a region of ±4 pixels is specified.
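The search center formula and the ±4-pixel region can be checked numerically with a short sketch. The function names `search_center` and `search_window` are hypothetical, and the result is kept as exact fractions because the text does not say how a fractional search center is rounded.

```python
from fractions import Fraction


def search_center(mv1, poc_cur, poc_ref1, poc_ref2):
    """alpha = MV1 * (Poc_Cur - Poc_Ref2) / (Poc_Cur - Poc_Ref1), per component."""
    scale = Fraction(poc_cur - poc_ref2, poc_cur - poc_ref1)
    return (mv1[0] * scale, mv1[1] * scale)


def search_window(center, search_range=4):
    """Return (min_x, min_y, max_x, max_y) of the +/- search_range region."""
    center_x, center_y = center
    return (center_x - search_range, center_y - search_range,
            center_x + search_range, center_y + search_range)


# Target frame at time 4, first reference at 2 (past), second at 6 (future):
# the scaled vector points the opposite way with the same magnitude.
alpha = search_center((4, -2), poc_cur=4, poc_ref1=2, poc_ref2=6)
print(alpha, search_window(alpha))
```

When both reference images lie on the same side of the target frame, the scale factor is positive, so the search center keeps the direction of MV1 and is stretched or shrunk by the ratio of the time distances.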

Specifically, S502 only fixes the ±4-pixel definition above. The search center position used to obtain the synthesis reference image is calculated separately for each encoding target block.

Next, among the processing definitions fixed per frame, the information that must be transmitted in the encoded bitstream so that the decoding apparatus performs the same processing is transmitted in the slice header, which carries frame-level information. FIG. 13 shows an example of the information added to the slice header in Embodiment 2.

The slice header in FIG. 13 is based on the slice header of MPEG-4 AVC, so only the added information is shown. Because synthesized motion compensation prediction is an inter-frame prediction scheme, it is not used in I slices, and the added information is transmitted only for slices other than I slices.

First, refinement_mc_enable, a 1-bit flag that controls in units of slices whether synthesized motion compensated prediction is performed, is transmitted. When refinement_mc_enable is 1 (synthesized motion compensated prediction is performed), the following three pieces of information are also transmitted.

The first is refinement_mc_adaptive, 1 bit of information indicating whether synthesized motion compensated prediction is switched adaptively with conventional motion compensated prediction or replaces it.

The second is refinement_mc_matching_range_full, 2 bits of data indicating the motion vector detection range between reference images. As an example, the 2-bit value indicates that one of the following detection ranges is defined.
00 ±1 pixel
01 ±2 pixels
10 ±4 pixels
11 ±8 pixels

The third is refinement_mc_matching_subpel, 2 bits of data indicating the motion vector detection accuracy between reference images. As an example, the 2-bit value indicates that one of the following detection accuracies is defined.
00 1-pixel accuracy (no fractional-pixel detection is performed)
01 1/2-pixel accuracy
10 1/4-pixel accuracy
11 1/8-pixel accuracy
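
The slice header additions described above can be illustrated with a short parsing sketch. It is not taken from the patent: the `BitReader` helper, the `RefinementMcHeader` record, and the function name are hypothetical, and the sketch assumes the four fields are fixed-length codes written in the order in which they are listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Example code tables from the text: 2-bit code -> search range in pixels (+/-)
RANGE_TABLE = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 8}
# 2-bit code -> denominator of the pixel accuracy (1 means integer-pixel only)
SUBPEL_TABLE = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 8}


class BitReader:
    """Hypothetical helper that reads fixed-length fields from a string of '0'/'1'."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("not enough bits left in the header")
        self.pos += n
        return int(chunk, 2)


@dataclass
class RefinementMcHeader:
    enable: bool
    adaptive: Optional[bool] = None
    range_pixels: Optional[int] = None
    subpel_denominator: Optional[int] = None


def parse_refinement_mc(reader: BitReader, slice_type: str) -> RefinementMcHeader:
    # The added fields are only present for slices other than I slices.
    if slice_type == "I":
        return RefinementMcHeader(enable=False)
    if reader.read(1) == 0:  # refinement_mc_enable
        return RefinementMcHeader(enable=False)
    adaptive = reader.read(1) == 1               # refinement_mc_adaptive
    range_pixels = RANGE_TABLE[reader.read(2)]   # refinement_mc_matching_range_full
    subpel = SUBPEL_TABLE[reader.read(2)]        # refinement_mc_matching_subpel
    return RefinementMcHeader(True, adaptive, range_pixels, subpel)
```

For instance, under these assumptions the bit string `111001` in a P slice would decode to enable = 1, adaptive = 1, a ±4-pixel range, and 1/2-pixel accuracy.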

After the frame-level settings have been fixed in this way, synthesized motion compensated prediction is performed on each macroblock, which is the encoding target block in the encoding target frame. For each macroblock (S504) and for every reference image (S505), first motion vector detection is first performed with the selected reference image as the first reference image (S506).

This detection can be carried out by the motion vector detection unit 104 used in conventional motion compensated prediction. When conventional motion compensated prediction is not used, an equivalent motion vector detection process can be added to the synthesized motion compensated prediction, or the predicted motion vector value output from the motion vector prediction unit 106 can always be used as the first motion vector value.

When the predicted motion vector value is used as the first motion vector value, the differential motion vector value is simply the displacement from the center position found by the search over the minute range after the reference images are synthesized, which has the advantage of reducing the differential motion vector information to be transmitted.

The first motion vector has 1-pixel accuracy. When the motion vector value input from the motion vector detection unit 104 or the motion vector prediction unit 106 has sub-pixel accuracy, it is rounded to 1-pixel accuracy. For example, when the input motion vector value MV1org has 1/4-pixel accuracy, the first motion vector value MV1 is calculated as follows.
MV1 = (MV1org + 2) >> 2
Next, the first reference block is acquired using the first motion vector (S507). As described with reference to FIG. 9, the first reference block is taken from the first reference image at the position displaced from the encoding target block by MV1. So that a reference image with 1/4-pixel accuracy can be created over at least ±1/2 pixel around the target block, a region extending ±3 pixels beyond the encoding target block size is acquired.
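
The rounding MV1 = (MV1org + 2) >> 2 can be checked with a few values. The sketch below is illustrative only: the function name is hypothetical, vectors are represented as (vertical, horizontal) integer tuples in quarter-pixel units, and it relies on Python's `>>` being an arithmetic (floor) shift for negative integers.

```python
from typing import Tuple


def round_to_integer_pel(mv_quarter: Tuple[int, int]) -> Tuple[int, int]:
    """Round a quarter-pixel motion vector to integer-pixel units.

    Adding 2 (half of 4) before the right shift by 2 rounds to the nearest
    integer pixel; exact halves are rounded toward positive infinity.
    """
    return tuple((c + 2) >> 2 for c in mv_quarter)


# 5/4 = 1.25 px rounds to 1, 6/4 = 1.5 px rounds to 2,
# -5/4 = -1.25 px rounds to -1, -7/4 = -1.75 px rounds to -2
print(round_to_integer_pel((5, 6)))    # (1, 2)
print(round_to_integer_pel((-5, -7)))  # (-1, -2)
```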

Next, the detection range in the second reference image is set from the first motion vector (S508). Which reference image is used as the second reference image is determined by the definition fixed in S500. The detection range is the one shown in FIG. 12 and described for S502. Within the set detection range, motion vector detection between reference images is performed between the first reference block and the second reference image (S509).
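
A minimal sketch of the motion vector detection between reference images follows. The patent does not fix the matching criterion, so this sketch assumes a sum of absolute differences (SAD) and an exhaustive search; it also works at integer-pixel accuracy only and ignores the ±3-pixel margin. All function names are hypothetical, and images are plain lists of rows.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def sad(block_a: Image, block_b: Image) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    return [row[left:left + width] for row in image[top:top + height]]


def search_inter_reference_mv(
    first_block: Image,
    second_image: Image,
    center: Tuple[int, int],
    search_range: int,
) -> Optional[Tuple[int, Tuple[int, int]]]:
    """Full search within +/- search_range pixels around `center` (top, left).

    Returns (best SAD, (dy, dx)) or None if no candidate lies inside the image.
    """
    h, w = len(first_block), len(first_block[0])
    img_h, img_w = len(second_image), len(second_image[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = center[0] + dy, center[1] + dx
            if top < 0 or left < 0 or top + h > img_h or left + w > img_w:
                continue  # candidate falls outside the reference image
            cost = sad(first_block, crop(second_image, top, left, h, w))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best
```

Because the search uses only already-decoded reference images, a decoder can repeat the same search and obtain the same vector without it being transmitted, which is the property the scheme depends on.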

Next, the second reference block is acquired using the detected inter-reference-image motion vector (S510). The second reference block is taken from the second reference image at the position displaced from the first reference block by the inter-reference-image motion vector and, like the first reference block, covers the encoding block size plus ±3 pixels.

Next, the first and second reference blocks are synthesized to generate a synthesized reference image block (S511). The synthesis algorithm computes the per-pixel average of the first reference block and the second reference block. The synthesized reference image block can also support the weighted prediction used in MPEG-4 AVC: weights can be applied to the synthesized reference image block, or a weighted average can be taken in which the contribution of each of the two reference blocks is inversely proportional to its distance from the encoding target image. When these methods are switched, information specifying the addition method is transmitted in units of frames or macroblocks.
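
The two synthesis options mentioned here, a plain per-pixel average and an average weighted inversely to temporal distance, can be sketched as follows. The function names, the integer rounding offsets, and the use of frame-count distances `d1` and `d2` are assumptions made for illustration; the patent does not specify the rounding.

```python
from typing import List

Block = List[List[int]]


def synthesize_average(block1: Block, block2: Block) -> Block:
    """Per-pixel average of two reference blocks, rounding halves up (assumed)."""
    return [[(a + b + 1) >> 1 for a, b in zip(r1, r2)]
            for r1, r2 in zip(block1, block2)]


def synthesize_distance_weighted(block1: Block, block2: Block,
                                 d1: int, d2: int) -> Block:
    """Weighted average with weights inversely proportional to distance.

    d1 and d2 are the (positive) temporal distances of the first and second
    reference images from the target image. A nearer image gets a larger
    weight: w1 = d2 / (d1 + d2), w2 = d1 / (d1 + d2).
    """
    total = d1 + d2
    return [[(a * d2 + b * d1 + total // 2) // total for a, b in zip(r1, r2)]
            for r1, r2 in zip(block1, block2)]


print(synthesize_average([[10, 20]], [[20, 21]]))                  # [[15, 21]]
print(synthesize_distance_weighted([[100]], [[20]], d1=1, d2=3))   # [[80]]
```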

Next, motion vector detection over a minute range is performed between the synthesized reference image block and the encoding target block to generate the second motion vector value (S512). Specifically, when the first motion vector value has 1-pixel accuracy and a motion vector with 1/4-pixel accuracy is to be detected, a block of the same size as the encoding target block is cut out of the synthesized reference image block while the position is moved horizontally and vertically in 1/4-pixel steps within ±1/2 pixel of the position indicated by the first motion vector MV1, and block matching against the encoding target block is performed at each position.

As a result of the block matching, the candidate with the smallest error evaluation value with respect to the encoding target block gives the second motion vector value MV2. Letting MVdelta be the displacement within the searched range, MV2 is output as
MV2 = (MV1 << 2) + MVdelta

Here, because MVdelta is calculated with 1/4-pixel accuracy in the range -2 ≤ MVdelta < 2 both horizontally and vertically, the decoding side can restore the first motion vector from MV2 by computing MV1 = (MV2 + 2) >> 2.
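
The encoder-side refinement can be sketched as an exhaustive loop over the sixteen MVdelta candidates. The function name and the `cost_at` callback, which stands in for block matching against the synthesized reference image block, are hypothetical; vectors are (vertical, horizontal) tuples, with MV1 in integer pixels and MV2 in quarter pixels.

```python
from typing import Callable, Tuple

Vec = Tuple[int, int]


def refine_motion_vector(mv1: Vec, cost_at: Callable[[Vec], int]) -> Tuple[Vec, int]:
    """Search MVdelta in [-2, 2) quarter-pixel steps and compose MV2.

    cost_at(delta) is assumed to return the error evaluation value of the
    block cut out of the synthesized reference block at offset `delta`.
    """
    best_cost, best_delta = None, (0, 0)
    for dy in range(-2, 2):          # -2 <= MVdelta < 2, vertical
        for dx in range(-2, 2):      # -2 <= MVdelta < 2, horizontal
            cost = cost_at((dy, dx))
            if best_cost is None or cost < best_cost:
                best_cost, best_delta = cost, (dy, dx)
    mv2 = ((mv1[0] << 2) + best_delta[0], (mv1[1] << 2) + best_delta[1])
    return mv2, best_cost


# Toy cost whose minimum is at delta = (1, -2)
mv2, cost = refine_motion_vector((3, -2), lambda d: abs(d[0] - 1) + abs(d[1] + 2))
print(mv2, cost)                                  # (13, -10) 0
print(tuple((c + 2) >> 2 for c in mv2))           # (3, -2): MV1 restored
```

Limiting MVdelta to the half-open range [-2, 2) is what makes the rounding on the decoding side land back on MV1 for every candidate.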

Next, a synthesized motion compensated prediction block is cut out of the synthesized reference image block at the position indicated by the obtained second motion vector value MV2, and an error evaluation value is calculated. The error evaluation value need not be only the sum of errors from block matching. It can also be calculated from the code amount and the distortion relative to the input image after decoding, taking into account the code amount needed to transmit the motion vector and other information and the code amount needed to encode the prediction difference block obtained by subtracting the synthesized motion compensated prediction block from the encoding target block.

The processing from S506 to S513 is performed for every reference image. If the current reference image is not the last one (S514: NO), the next reference image is selected as the first reference image (S515) and processing returns to S506. If it is the last one (S514: YES), the second motion vector value with the smallest error evaluation value is selected from those obtained for all the reference images, and the selected second motion vector value and information indicating the first reference image used to calculate it are output to the prediction mode determination unit 109 together with the error evaluation value (S516).
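
One common way to combine distortion and code amount into a single error evaluation value is a Lagrangian cost J = D + λR. The patent does not name this formula, so the sketch below is an assumption used only to illustrate selecting the best reference image across the S506 to S513 loop; `Candidate`, `rd_cost`, `select_best`, and the multiplier `lam` are hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple


class Candidate(NamedTuple):
    ref_idx: int            # which reference image served as the first reference image
    mv2: Tuple[int, int]    # second motion vector found for that reference image
    distortion: int         # e.g. sum of absolute errors of the prediction block
    bits: int               # estimated code amount for the vector and residual


def rd_cost(c: Candidate, lam: float) -> float:
    """Assumed Lagrangian cost: distortion plus lambda times code amount."""
    return c.distortion + lam * c.bits


def select_best(candidates: Sequence[Candidate], lam: float) -> Candidate:
    """Pick the candidate with the smallest error evaluation value."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


candidates = [
    Candidate(ref_idx=0, mv2=(13, -10), distortion=120, bits=10),
    Candidate(ref_idx=1, mv2=(4, 4), distortion=100, bits=40),
]
print(select_best(candidates, lam=1.0).ref_idx)  # 0 (cost 130 vs 140)
print(select_best(candidates, lam=0.0).ref_idx)  # 1 (distortion only)
```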

The prediction mode determination unit 109 compares the error evaluation value with those of the other prediction modes and determines the optimal prediction mode (S517).

The prediction difference block, which is the difference between the predicted image of the determined prediction mode and the encoding target block, and the additional information related to the prediction mode are encoded (S518), which completes the encoding process for one macroblock.

The motion vector value is stored in the motion vector prediction unit 106 for use in motion vector prediction of subsequent macroblocks, in the same way whether conventional motion compensated prediction or synthesized motion compensated prediction was selected. The second motion vector value transmitted in synthesized motion compensated prediction, viewed as the motion vector for the first reference image without the synthesis processing, has the same correlation as in conventional motion compensated prediction. Handling the two identically rather than managing them separately therefore increases the number of motion vector values that can be referenced from neighboring blocks and keeps the motion vector prediction accuracy at the conventional level.

Although the flowchart of FIG. 10 is described with a single encoding target block size within the macroblock, synthesized motion compensated prediction can be applied in units of block sizes such as 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, as in MPEG-4 AVC. In that case, the error evaluation value of synthesized motion compensated prediction is calculated for each block size, the block size with the smallest error evaluation value is selected, and the prediction mode is transmitted so that the decoding device can recognize the selection.

When refinement_mc_adaptive = 1 is transmitted as the slice header information shown in FIG. 13, normal motion compensated prediction and synthesized motion compensated prediction are switched adaptively.

FIG. 14 shows an example of the information added to the motion compensated prediction mode in Embodiment 2. Except for intra-frame prediction (Intra), which uses no motion vector, and direct mode (Direct), which uses only the predicted motion vector value, the switching information is transmitted as 1-bit ON/OFF information for each reference image for which a motion vector is transmitted according to the applied mode. In FIG. 14, refmc_on_l0[ mbPartIdx ] and refmc_on_l1[ mbPartIdx ] are this information.

In bidirectional prediction in B slices (additive prediction using two reference images), whether to use a synthesized reference image as the predicted image can also be selected for each of the two reference images. Combined with the selection of reference images, synthesized motion compensated prediction can then be performed from up to four reference images with two motion vectors, further improving the quality of the predicted image.

When encoding of a macroblock is finished and it is not the last macroblock (S519: NO), the next macroblock is designated (S520) and processing moves to S504. If it is the last macroblock (S519: YES), the encoding process for one frame ends.

Next, FIG. 15 shows the configuration of the multiple reference image synthesis unit 215 in a decoding device that implements the mechanism of FIG. 8, and FIG. 16 shows a flowchart of the synthesized image motion compensated prediction process performed by the motion vector separation unit 214, the multiple reference image synthesis unit 215, and the synthesized image motion compensation prediction unit 216. The detailed operation is described below.

As shown in FIG. 15, the multiple reference image synthesis unit 215 comprises a standard reference image acquisition unit 1000, a motion vector detection range setting unit 1001, an inter-reference image motion vector detection unit 1002, a synthesized reference image acquisition unit 1003, a reference image synthesis unit 1004, and a synthesized image memory 1005. Each operates in the same way as the corresponding unit shown in FIG. 9: the standard reference image acquisition unit 400, the motion vector detection range setting unit 401, the inter-reference image motion vector detection unit 402, the synthesized reference image acquisition unit 403, the reference image synthesis unit 404, and the synthesized image memory 405.

First, the motion vector value MV1, generated in the motion vector separation unit 214 from the decoded motion vector value MV2, is input from the motion vector separation unit 214 to the standard reference image acquisition unit 1000 and the motion vector detection range setting unit 1001.

Specifically, MV1 is generated by the operation MV1 = (MV2 + 2) >> 2, which yields the motion vector value between the first reference image and the encoding target block used in the encoding device. The standard reference image acquisition unit 1000 uses the input MV1 to acquire the reference block of the first reference image from the decoded reference image memory 209, and outputs the acquired first reference block to the inter-reference image motion vector detection unit 1002 and the reference image synthesis unit 1004.

Next, the motion vector detection range setting unit 1001 sets the range over which the motion vector between the first reference block and the second reference image is detected, and outputs information on the set motion vector detection range to the inter-reference image motion vector detection unit 1002.

The reference image synthesis unit 1004 receives the first reference block from the standard reference image acquisition unit 1000 and the inter-reference-image motion vector from the inter-reference image motion vector detection unit 1002. It obtains the second reference block by acquiring the reference block of the second reference image indicated by the inter-reference-image motion vector from the decoded reference image memory 209 via the synthesized reference image acquisition unit 1003. The reference image synthesis unit 1004 then synthesizes the first and second reference blocks and outputs the synthesized reference block to the synthesized image motion compensation prediction unit 216 via the synthesized image memory 1005.

Next, the operation of the synthesized image motion compensated prediction process on the decoding device side using these components is described with reference to the flowchart of FIG. 16. Like FIG. 10, FIG. 16 shows the flow of synthesized image motion compensated prediction within the decoding process for one frame. For the operation of the other processing units, conventional moving picture decoding processing such as MPEG-4 AVC can be used.

At the start of the decoding process for one frame, the slice header is first decoded to obtain information related to the reference images (S1100). The slice header carries information indicating the coding order and information specifying the reference images, as shown in FIG. 11, and the information on synthesized motion compensated prediction shown in FIG. 13 is also decoded.

Next, the reference image to be synthesized is determined for each reference image (S1101). In Embodiment 2, the decoding device makes the same determination as described for the operation of the encoding device.

Once the reference image to be synthesized has been determined for every reference image, the motion vector detection accuracy between reference images is set using the decoded refinement_mc_matching_subpel (S1102).

Similarly, the motion vector detection range between reference images is set using refinement_mc_matching_range_full decoded from the slice header (S1103).

After the frame-level settings have been fixed, a synthesized motion compensated prediction block is generated for each macroblock (the decoding target block in the decoding target frame) for which synthesized motion compensated prediction is used.

For each macroblock (S1104), if the prediction mode is not the synthesized motion compensated prediction mode (S1105: NO), prediction is performed in the other prediction mode and decoding is carried out using the generated predicted image (S1106).

If the prediction mode is the synthesized motion compensated prediction mode (S1105: YES), information indicating the first reference image is acquired (S1107). As in MPEG-4 AVC, the information indicating the reference image is encoded together with the prediction mode and can be obtained along with the decoded prediction mode information of the macroblock.

Next, the motion vector value MV2 decoded by the motion vector predictive decoding unit 212 is acquired (S1108). The motion vector separation unit 214 applies separation processing to MV2 to generate MV1 (S1109). Specifically, MV1 = (MV2 + 2) >> 2 is computed, as described above.

Next, the first reference block is acquired using MV1 (S1110). As described with reference to FIG. 9, the first reference block is taken from the first reference image at the position displaced by MV1. So that a reference image with 1/4-pixel accuracy can be created over at least ±1/2 pixel around the target block, a region extending ±3 pixels beyond the encoding target block size is acquired.

Next, the detection range in the second reference image is set from the first motion vector (S1111). Which reference image is used as the second reference image follows the definition fixed in S1101, so the same selection is made as in the encoding device. The detection range set in S1103 is used. Within the set detection range, motion vector detection between reference images is performed between the first reference block and the second reference image (S1112).

Next, the second reference block is acquired using the detected inter-reference-image motion vector (S1113). The second reference block is taken from the second reference image at the position displaced from the first reference block by the inter-reference-image motion vector and, like the first reference block, covers the encoding block size plus ±3 pixels.

Next, the first and second reference blocks are synthesized to generate a synthesized reference image block (S1114).

Next, from the synthesized reference image block, the image block of the region displaced by MV2 - MV1 from the position designated by MV1, that is, the region corresponding to the position designated by MV2, is extracted (S1115). Because MV1 has 1-pixel accuracy and MV2 has 1/4-pixel accuracy, the 1/4-pixel displacement component is computed as MV2 - (MV1 << 2). The extracted image block is output to the predicted image selection unit 204 as the synthesized motion compensated prediction block (S1116).
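
The separation performed on the decoding side, including the quarter-pixel component MV2 - (MV1 << 2), can be written compactly. This is an illustrative sketch with a hypothetical function name; vectors are (vertical, horizontal) tuples, with MV2 in quarter-pixel units and MV1 in integer-pixel units.

```python
from typing import Tuple

Vec = Tuple[int, int]


def split_motion_vector(mv2: Vec) -> Tuple[Vec, Vec]:
    """Split a decoded quarter-pixel MV2 into (MV1, quarter-pixel offset).

    MV1 = (MV2 + 2) >> 2 is the integer-pixel vector used to fetch the first
    reference block; the offset MV2 - (MV1 << 2) selects the block inside
    the synthesized reference image block.
    """
    mv1 = tuple((c + 2) >> 2 for c in mv2)
    delta = tuple(c - (m << 2) for c, m in zip(mv2, mv1))
    return mv1, delta


print(split_motion_vector((13, -10)))  # ((3, -2), (1, -2))
```

For every integer input the offset falls in the range -2 to 1, so it never points outside the ±1/2-pixel region around MV1.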

Next, the difference information is decoded using the synthesized motion compensated prediction block (S1117), which completes the decoding of one macroblock. If the decoded macroblock is not the last macroblock of the frame (S1118: NO), the macroblock to be decoded next is designated (S1119) and processing returns to S1105.

If the last macroblock of the frame has been decoded (S1118: YES), processing of the frame is complete.

In the synthesized image motion compensated prediction process of the decoding device in Embodiment 2, the final value of MV2 is known in advance. For the first reference block, it is therefore also possible to acquire only an integer-pixel reference image block of the encoding target block size, perform motion vector detection between reference images, and generate only the 1/4-pixel reference image needed to generate the synthesized reference image before performing the synthesis. This limits the additional computation due to filtering during decoding while still generating the same synthesized reference image as the encoding device.

With the moving picture encoding device and moving picture decoding device of Embodiment 2 of the present invention, another reference image is synthesized with the reference image obtained by motion compensated prediction using the motion vector found between the encoding target block and the first reference image, and motion vector detection (correction) over a minute range is performed on the synthesized predicted image. This improves the quality of the predicted image and corrects the motion to account for phase changes, for example at edges, after the improvement, so that a motion compensated predicted image with a small prediction residual can be generated.

Furthermore, when the motion vector obtained for the first reference image has N-pixel accuracy, the range of the motion vector detection (correction) applied to the synthesized predicted image is set to ±N/2 pixels, the correction is made with an accuracy finer than N pixels, and the corrected motion vector value is transmitted. A single motion vector value then allows the decoding device both to acquire the motion compensated predicted image from the first reference image and to acquire the motion compensated predicted image in which the phase change of the synthesized predicted image has been corrected, so a motion compensated predicted image with a small prediction residual can be encoded and decoded without increasing the additional information.

In addition, when the motion vector value for the first predicted image is predicted from the motion vector values of decoded neighboring blocks, only the correction value for the synthesized predicted image needs to be received as the motion vector value to be decoded, which further reduces the amount of motion vector information.

In Embodiment 2 of the present invention, a motion vector value with respect to another reference image is obtained for the motion compensated predicted image predicted using the first reference image, and that image is averaged with the motion compensated predicted image acquired from the other reference image. This makes it possible to generate a predicted image that removes coding degradation components and accommodates slight luminance changes in the decoding target, improving coding efficiency.

For the second motion vector, the encoding device generates the synthesized reference image from a single determined result and performs synthesized motion compensated prediction. However, it is also possible to prepare multiple first motion vectors, perform synthesized motion compensated prediction by the same method based on each of them, and encode the optimal second motion vector. Even in that case, the decoding device can decode with the processing described in Embodiment 2 without any increase in computation. The encoding device can also achieve optimal synthesized motion compensated prediction with M-pixel accuracy from decisions made in units of N pixels, which gives suitable synthesized motion compensated prediction while limiting the increase in encoding processing relative to Embodiment 1.

(Embodiment 3)
Next, the video encoding device and video decoding device of Embodiment 3 are described. In Embodiment 3, both devices have the same configuration as in Embodiment 2; only the reference image synthesis processing in the multiple reference image synthesis unit operates differently. Specifically, the only difference is the arithmetic performed in the reference image synthesis units 404 and 1004 and in flowchart steps S511 and S1114 of the Embodiment 2 description.

FIG. 17 is a conceptual diagram of the reference image synthesis operation in Embodiment 3, and the arithmetic is described with reference to it. In Embodiment 2, the synthesis applied a uniform average to every pixel value in the block. In Embodiment 3, the synthesis instead computes an error value for each pixel between the first reference block and the second reference block, and varies the per-pixel weighting of the second reference block against the first reference block according to the absolute value of that error before computing a weighted average.

Specifically, the configuration is a motion-adaptive filter: when the error is small the two blocks are weighted equally, and at or above a threshold the pixel of the second reference block is not added. Let P1 be the pixel value of the first reference block and P2 the pixel value of the second reference block. The addition ratio α is computed from the absolute pixel error |P1 − P2| using a function such as the one shown in FIG. 17. Using this α, the synthesized pixel value PM for each pixel is given by:
PM = P1 × (1 − α) + P2 × α
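The per-pixel adaptive addition can be sketched as follows. This is only an illustration: the text states that weighting is equal for small errors and that P2 is not added at or above a threshold, but the exact shape of the FIG. 17 function is not reproduced here. The piecewise-linear ramp, the function names, and the parameter values `flat=4` and `threshold=16` are assumptions made for the example.

```python
def addition_ratio(abs_err, threshold=16, flat=4):
    """Hypothetical alpha curve (assumed shape, not taken from FIG. 17).

    alpha = 0.5 (equal weighting) while the error is at most `flat`,
    alpha = 0   (P2 not added) at or above `threshold`,
    with a linear ramp in between.
    """
    if abs_err <= flat:
        return 0.5
    if abs_err >= threshold:
        return 0.0
    return 0.5 * (threshold - abs_err) / (threshold - flat)


def blend_pixel(p1, p2, threshold=16, flat=4):
    """PM = P1 * (1 - alpha) + P2 * alpha, with alpha derived from |P1 - P2|."""
    alpha = addition_ratio(abs(p1 - p2), threshold, flat)
    return p1 * (1.0 - alpha) + p2 * alpha


if __name__ == "__main__":
    print(blend_pixel(100, 102))  # small error: plain average of P1 and P2
    print(blend_pixel(100, 200))  # large error (e.g. an edge): P1 kept as-is
```

With any curve of this kind, flat regions are averaged (suppressing coding noise) while pixels that differ strongly between the two blocks fall back to the first reference block.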

These synthesis processes can be carried out implicitly, with the encoding device and the decoding device performing the same operation. Alternatively, information indicating whether to use simple averaging or per-pixel adaptive addition can be sent in the slice header or a similar location to select between them.

With the simple averaging of Embodiment 2, a predicted image with characteristics equivalent to average-value prediction from two reference images can be generated with a single motion vector. With the adaptive addition of Embodiment 3, the first reference block serves as the basis: distortion introduced into the first reference block by coding degradation and similar causes is removed, while signal characteristics are preserved in regions of large change such as edge components. The result is a predicted image that improves on the quality of the first reference image block, generated by processing common to encoding and decoding. Consequently, in addition to the effects of Embodiment 1, the video encoding and decoding devices of Embodiment 3 can generate a high-quality predicted image in which coding degradation components are removed while the signal characteristics of distinctive regions such as edges are retained, improving coding efficiency.

尚、参照画像が図１１で示すような符号化構造において、１枚のみの状態である場合においても、第２の参照画像として第１の参照画像を指定し、第１の参照画像内でのテクスチャ成分をブロックマッチングさせて適応的に加算することにより、劣化成分の除去は可能となり、本発明の構成を持って、良好な効果を発揮させることができる。 Even when the reference image has only one image in the coding structure as shown in FIG. 11, the first reference image is designated as the second reference image, and the reference image within the first reference image is designated. By performing block matching and adaptively adding the texture components, it is possible to remove the deteriorated components, and with the configuration of the present invention, a good effect can be exhibited.

(Embodiment 4)
Next, the video encoding device and video decoding device of Embodiment 4 are described. Embodiment 4 is characterized in that the multiple reference image synthesis unit uses a plurality of second reference images, and in that motion vector detection from the second reference image for the first reference block is performed in block units smaller than the encoding target block.

図１８は、本発明の実施の形態４における、合成画像動き補償予測処理の動作を示す概念図である。符号化装置側と復号装置側の関係は、図８に示した実施の形態２と同様の関係であるため、符号化装置側の動作を示す概念図のみ記している。 FIG. 18 is a conceptual diagram showing the operation of the composite image motion compensation prediction process in Embodiment 4 of the present invention. Since the relationship between the encoding device side and the decoding device side is the same as that in the second embodiment shown in FIG. 8, only the conceptual diagram showing the operation on the encoding device side is shown.

符号化装置側では、符号化対象フレームと、基準にする参照画像を第１参照画像として動きベクトルの検出を行い、ＭＶ１を生成する処理を、実施の形態１と同様に行う。 On the encoding device side, a motion vector is detected using the encoding target frame and a reference image as a reference as a first reference image, and processing for generating MV1 is performed in the same manner as in the first embodiment.

Next, the reference block cut out of the first reference image by MV1 is divided into block units smaller than the encoding target block, and an inter-reference-image motion vector to the second reference image is detected for each small block. For example, when the encoding target block is 16×16 pixels, the same size as a macroblock, the unit for obtaining inter-reference-image motion vectors is set to 8×8 pixels. The multiple detected inter-reference-image motion vectors to the second reference image (four in this example) are then used to generate, for the region of the first reference block, a plurality of small-block reference blocks to be synthesized.

第２の参照ブロックを生成する際の対象ブロック周辺の画像に対しては、図１８に示しているように、分割した他のブロックに含まれない領域のみを、各小ブロック単位で取得することも可能であるが、全ての小ブロックにおいて実施の形態１と同様に±３画素の周辺領域を取得し、合成処理に用いることも可能であり、小ブロックの境界を滑らかに接続する合成画像を生成することが可能である。上記小ブロックの複数の参照ブロックを用いて、第１の参照ブロックに対応する第２の参照ブロックが生成される。 For the image around the target block when generating the second reference block, as shown in FIG. 18, only the area not included in the other divided blocks is acquired in units of small blocks. However, it is also possible to acquire a peripheral area of ± 3 pixels in all the small blocks and use it for the synthesis processing in the same manner as in the first embodiment, so that a synthesized image that smoothly connects the boundaries of the small blocks can be obtained. It is possible to generate. A second reference block corresponding to the first reference block is generated using the plurality of reference blocks of the small block.

続いて、第２の参照画像とは異なる第３の参照画像に対して、同様に参照画像間の動きベクトルを検出する。第３の参照画像に対しても同様に小さなブロック単位で第１の参照ブロックとの間の参照画像間動きベクトルを検出する。検出した参照画像間動きベクトルにより、複数の参照ブロックを生成し、複数の参照ブロックを用いて、第１の参照ブロックに対応する第３の参照ブロックを生成する。 Subsequently, a motion vector between reference images is similarly detected for a third reference image different from the second reference image. Similarly, for the third reference image, the inter-reference image motion vector between the first reference block is detected in units of small blocks. A plurality of reference blocks are generated based on the detected inter-reference image motion vector, and a third reference block corresponding to the first reference block is generated using the plurality of reference blocks.

合成処理においては、第１の参照ブロックに対して画素毎もしくは小さいブロック毎に第２の参照ブロックを合成させるか、第３の参照ブロックを合成させるか、第２、第３の参照ブロックの両方を用いて合成させるかを、第１の参照ブロックの画素値と第２及び第３の参照ブロックの画素値を用いて判断する。 In the combining process, the second reference block is combined for each pixel or each small block with respect to the first reference block, or the third reference block is combined, or both the second and third reference blocks are combined. Is determined using the pixel values of the first reference block and the pixel values of the second and third reference blocks.

For the predicted image synthesized in this way, including its surrounding area, motion vector detection at M-pixel accuracy is performed against the encoding target block by block matching or a similar method, as in Embodiment 1. The second motion vector value MV2 detected as a result is encoded and transmitted as the motion vector between the encoding target block and the first reference image. The synthesized predicted image designated by MV2 is used as the synthesized motion-compensated prediction block and subtracted from the encoding target block, and the resulting difference block is encoded and transmitted.

The video encoding and decoding devices of Embodiment 4 also have the same configuration as in Embodiment 2, but the configuration and processing of the multiple reference image synthesis unit differ. FIG. 19 is a block diagram of the multiple reference image synthesis unit in Embodiment 4, and FIG. 20 is a flowchart of its operation; Embodiment 4 is described with reference to both. The configuration and operation of the multiple reference image synthesis unit are the same in the encoding and decoding devices, differing only in the processing blocks connected to it, so the description below follows the behavior in the encoding device.

図１９に示すように、実施の形態４における複数参照画像合成部は、基準参照画像取得部１４００、動きベクトル検出範囲設定部１４０１、第２参照画像間動きベクトル検出部１４０２、第２参照画像取得部１４０３、参照画像合成部１４０４、及び合成画像メモリ１４０５、第３参照画像取得部１４０６、第３参照画像間動きベクトル検出部１４０７、及び合成判定部１４０８を備える。実施の形態１の複数参照画像合成部に対して、第３参照画像取得部１４０６、第３参照画像間動きベクトル検出部１４０７、及び合成判定部１４０８における動作が、実施の形態４における新たな効果をもたらす構成であるが、第３参照画像取得部１４０６、第３参照画像間動きベクトル検出部１４０７に関しては、第２参照画像間動きベクトル検出部１４０２及び第２参照画像取得部１４０３と統合して、実施の形態２における構成のように、参照画像間動きベクトル検出部４０２、合成参照画像取得部４０３の構成で動作させることも可能である。図１９においては、動作説明を行うために別々のブロックとして記述している。 As illustrated in FIG. 19, the multiple reference image synthesis unit according to Embodiment 4 includes a standard reference image acquisition unit 1400, a motion vector detection range setting unit 1401, a second inter-reference image motion vector detection unit 1402, and a second reference image acquisition. Unit 1403, reference image synthesis unit 1404, synthesized image memory 1405, third reference image acquisition unit 1406, third inter-reference image motion vector detection unit 1407, and synthesis determination unit 1408. The operations of the third reference image acquisition unit 1406, the third inter-reference image motion vector detection unit 1407, and the synthesis determination unit 1408 are new effects in the fourth embodiment compared to the multiple reference image synthesis unit of the first embodiment. However, the third reference image acquisition unit 1406 and the third inter-reference image motion vector detection unit 1407 are integrated with the second inter-reference image motion vector detection unit 1402 and the second reference image acquisition unit 1403. As in the configuration in the second embodiment, it is also possible to operate with the configuration of the inter-reference image motion vector detection unit 402 and the synthesized reference image acquisition unit 403. In FIG. 19, in order to explain the operation, they are described as separate blocks.

最初に動きベクトル検出部１０４より、基準参照画像取得部１４００及び動きベクトル検出範囲設定部１４０１に、第１参照画像と符号化対象ブロックとの間の動きベクトル値ＭＶ１が入力される。基準参照画像取得部１４００では、入力されたＭＶ１を用いて復号参照画像メモリ１１７から、第１参照画像の参照ブロックを取得する。参照ブロックの取得領域は、符号化対象ブロックに対して、ＭＶ１の値だけ移動した第１参照画像の位置を基準に、対象ブロック±Ｎ／２画素以上のＭ画素精度（Ｍ＜Ｎ）の参照画像を作成する為に必要な領域をとる。基準参照画像取得部１４００は、取得した第１の参照ブロックを第２参照画像間動きベクトル検出部１４０２、合成判定部１４０８、及び第３参照画像間動きベクトル検出部１４０７に出力する。 First, the motion vector detection unit 104 inputs the motion vector value MV1 between the first reference image and the encoding target block to the standard reference image acquisition unit 1400 and the motion vector detection range setting unit 1401. The reference reference image acquisition unit 1400 acquires the reference block of the first reference image from the decoded reference image memory 117 using the input MV1. The reference block acquisition area is a reference with M pixel accuracy (M <N) of the target block ± N / 2 pixels or more based on the position of the first reference image moved by the value of MV1 with respect to the encoding target block. The area necessary for creating the image is taken. The reference reference image acquisition unit 1400 outputs the acquired first reference block to the second inter-reference image motion vector detection unit 1402, the synthesis determination unit 1408, and the third inter-reference image motion vector detection unit 1407.

Next, the motion vector detection range setting unit 1401 sets, for the first reference block, the range in which to detect motion vectors to the second reference image and the range in which to detect motion vectors to the third reference image. The range-setting algorithm is the same as the detection range setting of Embodiment 1, applied separately to the inter-reference-image motion vector detection for the second reference image and to that for the third reference image to fix each range. The motion vector detection range setting unit 1401 outputs the resulting detection range information to the second inter-reference image motion vector detection unit 1402 and the third inter-reference image motion vector detection unit 1407.

The second inter-reference image motion vector detection unit 1402 takes the first reference block input from the standard reference image acquisition unit 1400 and, via the second reference image acquisition unit 1403, acquires from the decoded reference image memory 117 the reference blocks of the second reference image within the motion vector detection range specified by the motion vector detection range setting unit 1401. For each 8×8 block, one quarter the size of the encoding target block, it computes an error value by block matching or a similar method and takes a motion vector whose error value is small as the second inter-reference image motion vector. The unit outputs the four calculated second inter-reference image motion vectors to the synthesis determination unit 1408.
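The per-sub-block search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the error measure is taken to be the sum of absolute differences (SAD), the vector with the minimum SAD is selected, the search is integer-pixel only, and all function and parameter names (`match_sub_blocks`, `origin`, `search`) are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(img, top, left, size):
    """Cut a size x size block out of a 2-D list of pixel rows."""
    return [row[left:left + size] for row in img[top:top + size]]


def match_sub_blocks(ref1_block, ref2_image, origin, search=2, sub=8):
    """Find one motion vector per sub x sub piece of the first reference block.

    origin is the (top, left) position in ref2_image around which the search
    range is centred. Returns {(sy, sx): (dy, dx)} keyed by the sub-block
    offset inside ref1_block; a 16x16 block with sub=8 yields four vectors.
    """
    n = len(ref1_block)
    height, width = len(ref2_image), len(ref2_image[0])
    vectors = {}
    for sy in range(0, n, sub):
        for sx in range(0, n, sub):
            target = crop(ref1_block, sy, sx, sub)
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    top, left = origin[0] + sy + dy, origin[1] + sx + dx
                    if top < 0 or left < 0 or top + sub > height or left + sub > width:
                        continue  # candidate falls outside the picture
                    cost = sad(target, crop(ref2_image, top, left, sub))
                    if best is None or cost < best[0]:
                        best = (cost, (dy, dx))
            vectors[(sy, sx)] = best[1] if best else None
    return vectors
```

Because the search is driven only by already-decoded reference pictures, a decoder running the same routine would obtain the same four vectors without any of them being transmitted.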

Similarly, the third inter-reference image motion vector detection unit 1407 takes the first reference block input from the standard reference image acquisition unit 1400 and, via the third reference image acquisition unit 1406, acquires from the decoded reference image memory 117 the reference blocks of the third reference image within the motion vector detection range specified by the motion vector detection range setting unit 1401. For each 8×8 block, one quarter the size of the encoding target block, it computes an error value by block matching or a similar method and takes a motion vector whose error value is small as the third inter-reference image motion vector. The unit outputs the four calculated third inter-reference image motion vectors to the synthesis determination unit 1408.

合成判定部１４０８では、基準参照画像取得部１４００より入力された第１の参照ブロックに対して、第２参照画像間動きベクトル検出部１４０２より入力された４つの第２参照画像間動きベクトルによって示される、複数の第２参照画像の参照ブロックを用いて、第２参照画像取得部１４０３を介して復号参照画像メモリ１１７より、第１の参照ブロックに対応する第２の参照ブロックを生成する。 In the synthesis determination unit 1408, the first reference block input from the standard reference image acquisition unit 1400 is indicated by the four second reference image motion vectors input from the second reference image motion vector detection unit 1402. A second reference block corresponding to the first reference block is generated from the decoded reference image memory 117 via the second reference image acquisition unit 1403 using the reference blocks of the plurality of second reference images.

合成判定部１４０８は、同様に第３参照画像間動きベクトル検出部１４０７より入力された４つの第３参照画像間動きベクトルによって示される、複数の第３参照画像の参照ブロックを用いて、第３参照画像取得部１４０６を介して復号参照画像メモリ１１７より、第１の参照ブロックに対応する第３の参照ブロックを生成する。 Similarly, the combination determination unit 1408 uses the reference blocks of the plurality of third reference images indicated by the four third reference image motion vectors input from the third reference image motion vector detection unit 1407 to A third reference block corresponding to the first reference block is generated from the decoded reference image memory 117 via the reference image acquisition unit 1406.

合成判定部１４０８は、続いて生成した第２の参照ブロック、第３の参照ブロックと、第１の参照ブロックの間で誤差値を算出し、誤差値の関係を用いて第１の参照ブロックに対して、合成する第２の参照ブロック及び第３の参照ブロックの選択及び、加算比率を確定する。確定アルゴリズムに関しては、後述する。 The synthesis determining unit 1408 calculates an error value between the second reference block, the third reference block, and the first reference block that are generated subsequently, and uses the error value relationship to determine the first reference block. On the other hand, the selection of the second reference block and the third reference block to be combined and the addition ratio are determined. The confirmation algorithm will be described later.

合成判定部１４０８は、確定した加算比率を用いて第１の参照ブロックに対して、第２の参照ブロック及び第３の参照ブロックを合成することで、合成した参照ブロックを生成し、合成した参照ブロックを、合成画像メモリ１４０５を介して合成画像動き補償予測部１０８に出力する。 The combination determination unit 1408 generates a combined reference block by combining the second reference block and the third reference block with the first reference block using the determined addition ratio, and combines the reference The block is output to the composite image motion compensation prediction unit 108 via the composite image memory 1405.

続いて、図２０に合成判定部１４０８における判定処理の動作を説明するためのフローチャートを示し、詳細動作を説明する。符号化対象ブロック単位で、最初に第１の参照ブロックを入力する（Ｓ１５００）。参照画像ブロックは、符号化対象ブロックの大きさに対して±１／２画素の範囲内での1／４画素単位での動き量移動を、合成動き補償予測部で行うために、フィルタ係数を加味して±３画素の領域を取得する（符号化対象ブロックが１６×１６画素の場合、２２×２２画素の領域を取得）。 Next, FIG. 20 shows a flowchart for explaining the operation of the determination process in the combination determination unit 1408, and the detailed operation will be described. First, the first reference block is input in units of encoding target blocks (S1500). The reference image block has a filter coefficient in order to perform a movement amount movement in a unit of 1/4 pixel within a range of ± 1/2 pixel with respect to the size of the encoding target block in the combined motion compensation prediction unit. In addition, an area of ± 3 pixels is acquired (when the encoding target block is 16 × 16 pixels, an area of 22 × 22 pixels is acquired).

続いて、第２参照画像間動きベクトル検出部１４０２により算出された４つの第２参照画像間動きベクトルを入力する（Ｓ１５０１）。入力されたそれぞれの動きベクトルを用いて、８×８画素単位での小参照画像ブロックを第２参照画像より取得する（Ｓ１５０２）。小参照画像ブロックの取得領域は１４×１４画素となる。 Subsequently, the four second reference image motion vectors calculated by the second reference image motion vector detection unit 1402 are input (S1501). Using each input motion vector, a small reference image block in units of 8 × 8 pixels is acquired from the second reference image (S1502). The acquisition area of the small reference image block is 14 × 14 pixels.

小参照画像ブロックの隣接部分に関しては、隣接する参照画像ブロックのオーバーラップ部分の画素を反映させるか否かの判断を行う（Ｓ１５０３）。具体的には、隣接する取得した動きベクトルの差分値が±１画素以内の場合には、オーバーラップさせてブロック隣接部分を滑らかに接続する。±１画素より大きい場合には、異なる物体に対して求められた参照ブロックであると判断し、オーバーラップ部分は反映させず、該当する小参照ブロックの画素をそのまま設定する。上記判断に従って、小参照画像ブロックを重ね合わせ、２２×２２画素で構成される第２の参照ブロックを生成する（Ｓ１５０４）。 For the adjacent portion of the small reference image block, it is determined whether or not the pixels of the overlapping portion of the adjacent reference image block are reflected (S1503). Specifically, when the difference value between adjacent acquired motion vectors is within ± 1 pixel, the adjacent blocks are overlapped and connected smoothly. If it is larger than ± 1 pixel, it is determined that it is a reference block obtained for a different object, the overlapping portion is not reflected, and the pixel of the corresponding small reference block is set as it is. According to the above determination, the small reference image blocks are superimposed to generate a second reference block composed of 22 × 22 pixels (S1504).
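The overlap decision of S1503 can be sketched as below. The text only states that the difference between adjacent motion vectors must be within ±1 pixel; applying that limit to each vector component separately, and averaging the two candidate pixels where overlap is accepted, are assumptions made for this illustration. The function names are hypothetical.

```python
def should_overlap(mv_a, mv_b, limit=1):
    """True when two adjacent small blocks are treated as the same object.

    Assumed reading: both the vertical and the horizontal components of the
    motion vectors (in pixels) differ by at most `limit`.
    """
    return (abs(mv_a[0] - mv_b[0]) <= limit and
            abs(mv_a[1] - mv_b[1]) <= limit)


def merge_overlap(own_pixel, neighbour_pixel, overlap):
    """Pixel value at a position covered by two small reference blocks.

    With overlap accepted, the two candidates are averaged (assumed blend);
    otherwise the pixel of the block that owns the position is kept as-is.
    """
    if overlap:
        return (own_pixel + neighbour_pixel) / 2
    return own_pixel


if __name__ == "__main__":
    same_object = should_overlap((2, 3), (3, 2))
    print(same_object, merge_overlap(10, 20, same_object))
```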

続いて、第３参照画像間動きベクトル検出部１４０７により算出された４つの第３参照画像間動きベクトルを入力する（Ｓ１５０５）。第２参照画像の場合と同様に、それぞれの動きベクトルを用いて小参照画像ブロックを第３参照画像より取得し（Ｓ１５０６）、小参照画像ブロックの隣接境界部分の処理を確定させ（Ｓ１５０７）、小参照画像ブロックを重ね合わせて第３の参照ブロックを生成する（Ｓ１５０８）。 Subsequently, four third reference image motion vectors calculated by the third reference image motion vector detection unit 1407 are input (S1505). As in the case of the second reference image, a small reference image block is acquired from the third reference image using each motion vector (S1506), and the processing of the adjacent boundary portion of the small reference image block is confirmed (S1507). The third reference block is generated by superimposing the small reference image blocks (S1508).

次に、第１の参照ブロック、第２の参照ブロック、第３の参照ブロックを用いて合成処理を画素単位に行う。参照ブロック内の画素単位に（Ｓ１５０９）、第１の参照ブロックの画素値Ｐ１と第２の参照ブロックの画素値Ｐ２との誤差絶対値|Ｐ１−Ｐ２|を算出する（Ｓ１５１０）。同様にＰ１と第３の参照ブロックの画素値Ｐ３との絶対誤差値｜Ｐ１−Ｐ３｜を算出し（Ｓ１５１１）、Ｐ２とＰ３の絶対誤差値｜Ｐ２−Ｐ３｜を算出する（Ｓ１５１２）。 Next, combining processing is performed on a pixel basis using the first reference block, the second reference block, and the third reference block. For each pixel in the reference block (S1509), an error absolute value | P1-P2 | between the pixel value P1 of the first reference block and the pixel value P2 of the second reference block is calculated (S1510). Similarly, an absolute error value | P1-P3 | between P1 and the pixel value P3 of the third reference block is calculated (S1511), and an absolute error value | P2-P3 | of P2 and P3 is calculated (S1512).

The synthesis decision in Embodiment 4 is made by using the three values |P1−P2|, |P1−P3|, and |P3−P2| to fix the addition ratio of P1, P2, and P3 (S1513).

先ず、｜Ｐ１−Ｐ２｜と｜Ｐ１−Ｐ３｜が共に閾値β（例：８）よりも小さい場合には、Ｐ１、Ｐ２、Ｐ３を同じ重み付けで加算平均する。即ち、Ｐ１、Ｐ２、Ｐ３の比率が１：１：１となる。 First, when | P1-P2 | and | P1-P3 | are both smaller than a threshold value β (example: 8), P1, P2, and P3 are added and averaged with the same weight. That is, the ratio of P1, P2, and P3 is 1: 1: 1.

次に、｜Ｐ１―Ｐ２｜が閾値βより小さく、｜Ｐ１−Ｐ３｜が閾値γ（例：１６）よりも大きい場合には、Ｐ２のみをＰ１に加算する。即ち、Ｐ１、Ｐ２、Ｐ３の比率を１：１：０とする。Ｐ２とＰ３の関係が逆の場合、即ち｜Ｐ１―Ｐ３｜が閾値βより小さく、｜Ｐ１−Ｐ２｜が閾値γ（例：１６）よりも大きい場合には、Ｐ１、Ｐ２、Ｐ３の比率が１：０：１になる。 Next, when | P1-P2 | is smaller than the threshold β and | P1-P3 | is larger than the threshold γ (for example, 16), only P2 is added to P1. That is, the ratio of P1, P2, and P3 is 1: 1: 0. When the relationship between P2 and P3 is reversed, that is, when | P1-P3 | is smaller than the threshold β and | P1-P2 | is larger than the threshold γ (for example, 16), the ratio of P1, P2, and P3 is 1: 0: 1.

When |P1−P2| and |P1−P3| are both larger than the threshold γ, the value of |P2−P3| is examined. If |P2−P3| is smaller than the threshold δ (e.g., 4), the pixel value P1 is judged to contain an error caused by degradation or a similar factor, and the addition is performed so as to update the pixel value using P2 and P3. Specifically, the ratio of P1, P2, and P3 is set to 1:2:2.
If |P2−P3| is larger than the threshold γ, P2 and P3 are excluded from the synthesis, and the ratio of P1, P2, and P3 becomes 1:0:0.

上記の条件以外の場合には、Ｐ１に対して、Ｐ２とＰ３の平均値を加算平均する。即ち、Ｐ１、Ｐ２、Ｐ３の比率が２：１：１となる。このようにして確定した比率に応じて、Ｐ１、Ｐ２、Ｐ３の重み付け加算平均を取り、合成した参照ブロックの画素値ＰＭを生成する（Ｓ１５１４）。 In cases other than the above conditions, the average value of P2 and P3 is added to P1 and averaged. That is, the ratio of P1, P2, and P3 is 2: 1: 1. The weighted average of P1, P2, and P3 is calculated according to the ratio determined in this way, and the combined reference block pixel value PM is generated (S1514).
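The ratio rules of S1513 and the weighted average of S1514 can be collected into one small routine. The sketch below follows the conditions and example thresholds given in the text (β = 8, γ = 16, δ = 4). The function names, the use of floating-point division, and the absence of rounding are assumptions, since the text does not specify the arithmetic precision.

```python
def addition_weights(p1, p2, p3, beta=8, gamma=16, delta=4):
    """Return the addition ratio (w1, w2, w3) for pixel values P1, P2, P3."""
    d12, d13, d23 = abs(p1 - p2), abs(p1 - p3), abs(p2 - p3)
    if d12 < beta and d13 < beta:
        return (1, 1, 1)          # all three agree: equal average
    if d12 < beta and d13 > gamma:
        return (1, 1, 0)          # only P2 matches P1
    if d13 < beta and d12 > gamma:
        return (1, 0, 1)          # only P3 matches P1
    if d12 > gamma and d13 > gamma:
        if d23 < delta:
            return (1, 2, 2)      # P2 and P3 agree: P1 judged degraded
        if d23 > gamma:
            return (1, 0, 0)      # nothing agrees: keep P1 alone
    return (2, 1, 1)              # every remaining case


def synthesize_pixel(p1, p2, p3, **thresholds):
    """Weighted average PM of P1, P2, P3 using the ratio chosen above."""
    w1, w2, w3 = addition_weights(p1, p2, p3, **thresholds)
    return (w1 * p1 + w2 * p2 + w3 * p3) / (w1 + w2 + w3)


if __name__ == "__main__":
    for pixels in [(100, 102, 98), (100, 103, 130), (100, 140, 142)]:
        print(pixels, addition_weights(*pixels), synthesize_pixel(*pixels))
```

Note that a case such as both |P1−P2| and |P1−P3| exceeding γ while |P2−P3| lies between δ and γ matches none of the named conditions, so under this reading it falls through to the 2:1:1 ratio.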

参照ブロック内の全ての画素に対して同様の処理を施す。参照ブロック内の最後の画素でない（Ｓ１５１５：ＮＯ）場合には、次の画素を設定し（Ｓ１５１６）、Ｓ１５１０に戻る。参照ブロック内の最後の画素である（Ｓ１５１５：ＹＥＳ）場合には、符号化対象ブロックに対する参照画像の合成処理を終了する。 Similar processing is performed on all the pixels in the reference block. If it is not the last pixel in the reference block (S1515: NO), the next pixel is set (S1516), and the process returns to S1510. If it is the last pixel in the reference block (S1515: YES), the reference image synthesis process for the encoding target block is terminated.

実施の形態４の動画像符号化装置及び動画像復号装置においては、第１の参照画像を用いて予測した動き補償予測画像に対して、他の参照画像との間で対象としている動き補償予測画像よりも細かい単位の動きベクトル値を求め、それぞれの動きベクトルに応じて細かい単位で取得した動き補償予測画像と合成処理を行うことで、伝送する動きベクトルを増加せずに、符号化対象物の物体の時間的な微少な変形に対応した予測画像を生成し、符号化効率を向上させた。また、第１の参照画像を用いて予測した動き補償予測画像に対して、他の参照画像との間の相関性を求め合成に適した参照画像を複数選択し合成処理を行うことで、付加情報を送らずに、複数の参照画像からの適切な合成画像の生成を行うことが出来、更に符号化効率を向上させた。 In the moving image encoding device and the moving image decoding device according to the fourth embodiment, the motion compensated prediction which is the target between the other reference images with respect to the motion compensated predicted image predicted using the first reference image By obtaining a motion vector value in a smaller unit than the image and performing a synthesis process with the motion compensated prediction image obtained in a fine unit according to each motion vector, the encoding target object is not increased. The prediction image corresponding to the minute temporal deformation of the object is generated to improve the encoding efficiency. In addition, the motion compensated prediction image predicted using the first reference image is added by obtaining a correlation with another reference image, selecting a plurality of reference images suitable for combining, and performing a combining process. An appropriate composite image can be generated from a plurality of reference images without sending information, and the encoding efficiency is further improved.

(Embodiment 5)
Next, the video encoding device and video decoding device of Embodiment 5 are described. Embodiment 5 is characterized in that the multiple reference image synthesis unit performs super-resolution enlargement on the first reference image using a plurality of reference images, and the enlarged image resulting from the super-resolution enlargement is used as the synthesized reference image for motion-compensated prediction.

First, FIG. 21 shows a conceptual diagram of the synthesized-image motion-compensated prediction processing in Embodiment 5. In Embodiments 1 through 4, pixel values at fractional-pixel accuracy (finer than one pixel) were generated by filtering each reference image, and the synthesized reference image was generated from the filtered reference images. In Embodiment 5, by contrast, the fractional-pixel-accuracy pixel values of the enlarged first reference image are generated by pasting in pixels from other reference images, and the frequency band is adjusted after pasting. This yields a reference image that is high-definition and less affected by coding degradation.

As shown in FIG. 21, for the first reference block, which is acquired from the first motion vector detected between the encoding target block and the first reference image, a specific range is set in the second reference image and in the third reference image. Motion relative to the first reference block is detected for the pixels within that range, and pixels are pasted in (registration) to generate an enlarged reference block at fractional-pixel accuracy. A filter that shapes the result to a predetermined band is then applied, and its components are reflected in the block.

上記処理を複数回繰り返すことで、第１の参照ブロックにおける符号化劣化を除去し、小数画素精度の高精細参照画像を生成する。このようにして生成された高精細参照信号を合成画像動き補償予測に用いることで、予測残差の高周波成分が少ない予測画像を生成する。 By repeating the above process a plurality of times, the encoding deterioration in the first reference block is removed, and a high-definition reference image with decimal pixel accuracy is generated. By using the high-definition reference signal generated in this way for the composite image motion compensation prediction, a predicted image with less high frequency components of the prediction residual is generated.

図２２に実施の形態５の動画像符号化装置及び動画像復号装置における、複数参照画像合成部の構成を示すブロック図を示し、その動作を説明する。実施の形態４と同様に、実施の形態５においても符号化装置と復号装置における複数参照画像合成部の構成及び動作は、複数参照画像合成部に繋がる処理ブロックの動作が異なるのみで、同様の動作を行うため、符号化装置における振る舞いを示し説明する。 FIG. 22 is a block diagram showing the configuration of the multiple reference image synthesizer in the video encoding device and video decoding device of Embodiment 5, and the operation will be described. Similar to the fourth embodiment, in the fifth embodiment, the configuration and operation of the multiple reference image synthesis unit in the encoding device and the decoding device are the same except that the operation of the processing block connected to the multiple reference image synthesis unit is different. In order to perform the operation, the behavior in the encoding device will be shown and described.

As shown in FIG. 22, the multiple reference image synthesis unit of Embodiment 5 includes a standard reference image acquisition unit 1700, a registration target range setting unit 1701, a registration unit 1702, a synthesized reference image acquisition unit 1703, a band limiting filter unit 1704, a composite image memory 1705, and a reconstruction end determination unit 1706.

First, the motion vector detection unit 104 inputs the motion vector value MV1 between the first reference image and the encoding target block to the standard reference image acquisition unit 1700 and the registration target range setting unit 1701. The standard reference image acquisition unit 1700 uses the input MV1 to acquire the reference block of the first reference image from the decoded reference image memory 117, and outputs the acquired first reference block to the registration unit 1702.

Next, the registration target range setting unit 1701 sets, for the first reference block, the area in each of the other reference images that is to be registered. Specifically, as with the motion vector detection range between reference images in Embodiment 1 shown in FIG. 12, a range of ±L pixels centered on the position indicated by the motion amount obtained by extending or shortening MV1 according to the distance from the encoding target image is set as the registration target area. L must cover a wider range than the motion vector detection range between reference images in Embodiment 1 and is set to, for example, L = 32.
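
The range setting described above can be sketched as follows. This is a minimal illustration, not the implementation of unit 1701: it assumes that MV1 is extended or shortened linearly by the ratio of temporal distances, and the function name `registration_range` and its parameters are hypothetical.

```python
def registration_range(mv1, dist_first, dist_other, L=32):
    """Return (x_min, x_max, y_min, y_max) of the registration window.

    mv1        : (x, y) motion vector toward the first reference image
    dist_first : temporal distance from the target image to the first reference image
    dist_other : temporal distance from the target image to the other reference image
    L          : half-width of the window in pixels (the text gives L = 32 as an example)

    Assumption: MV1 is scaled linearly by dist_other / dist_first.
    """
    scale = dist_other / dist_first
    cx = round(mv1[0] * scale)  # window centre, horizontal
    cy = round(mv1[1] * scale)  # window centre, vertical
    return (cx - L, cx + L, cy - L, cy + L)


# A reference image twice as far away doubles the motion amount.
window = registration_range((8, -4), dist_first=1, dist_other=2)
```

With these inputs the window is centred on (16, -8) and spans ±32 pixels in each direction.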

The target area set by the registration target range setting unit 1701 is sent to the registration unit 1702, where registration processing is performed. The registration unit 1702 first enlarges the first reference block by a factor of X horizontally and vertically. In this embodiment, X = 4, which yields an enlarged image that can be used for motion compensation in 1/4-pixel units.
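
As an illustration of the enlargement step, the following one-dimensional sketch upsamples a row of pixels by X = 4 and records which positions hold original samples. The text does not specify the interpolation filter; linear interpolation is used here purely as an assumption, and the name `enlarge_row` is hypothetical.

```python
def enlarge_row(row, x=4):
    """Upsample a non-empty row so that original samples land at indices 0, x, 2x, ...

    Returns (values, is_original). Interpolated positions use linear
    interpolation, which is an assumption made for this sketch only.
    """
    values, is_original = [], []
    for i in range(len(row) - 1):
        for k in range(x):
            values.append(row[i] + (row[i + 1] - row[i]) * k / x)
            is_original.append(k == 0)  # k == 0 is an original one-pixel position
    values.append(float(row[-1]))
    is_original.append(True)
    return values, is_original


values, is_original = enlarge_row([0, 8])
```

The `is_original` mask is what allows the later registration step to treat one-pixel positions and interpolated positions differently.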

In the enlarged image, pixels at one-pixel-unit positions, where pixel values originally exist, and pixels generated by the enlargement processing are handled differently in the registration processing. As shown in FIG. 21, registration calculates a motion vector for each predetermined pixel unit (e.g., 4 × 4 pixels) by block matching at one-pixel intervals between the enlarged first reference block and a reference block, composed of one-pixel-unit samples, of another reference image acquired from the decoded reference image memory 117 via the synthesized reference image acquisition unit 1703. When the calculated motion vector does not point to a one-pixel-accuracy position, the pixels from the other reference image are pasted onto positions where no pixel originally existed, so the pasted pixel value replaces the pixel value generated by the enlargement filter. When the motion vector points to a one-pixel-accuracy position, or when pixels from a plurality of reference images are pasted at the same location, the most plausible value for that position is calculated after registration is complete, based on the distribution and frequency of the pixel values pasted at each position, and the pixel value is replaced with that value.
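
The pasting rule above can be sketched per pixel as follows. The text says only that the final value is derived from the distribution and frequency of the pasted values; taking the median, and counting the original sample as one candidate at one-pixel positions, are assumptions of this sketch. The names `is_integer_position` and `resolve_pixel` are hypothetical.

```python
from statistics import median

X = 4  # enlargement factor used in this embodiment


def is_integer_position(mv_x, mv_y, x=X):
    """True if a motion vector on the enlarged grid points to a one-pixel-accuracy position."""
    return mv_x % x == 0 and mv_y % x == 0


def resolve_pixel(current, pasted, on_integer_grid):
    """Decide the value of one pixel of the enlarged block after registration.

    current         : value already in the enlarged block (original or interpolated)
    pasted          : list of values pasted at this position from other reference images
    on_integer_grid : True if this position held an original sample before enlargement
    """
    if not pasted:
        return current            # nothing was pasted here
    if not on_integer_grid and len(pasted) == 1:
        return pasted[0]          # a single paste replaces the interpolated value
    # Several candidates: the median stands in for "the most plausible value" (assumption).
    samples = pasted + [current] if on_integer_grid else pasted
    return median(samples)
```

For example, an original sample of 100 with pasted values 104 and 110 resolves to 104 under this rule, while an interpolated sample with a single pasted value of 70 simply becomes 70.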

The registered reference block is output from the registration unit 1702 to the band limiting filter unit 1704. The band limiting filter unit 1704 applies, to the input registered reference block, a band limiting filter designed on the assumption of the frequency characteristics that an enlarged reference image should inherently have.

For pixel positions other than one-pixel-accuracy positions onto which nothing was pasted during registration, the value generated at the initial enlargement is not used; instead, the band limiting filter computes the value from the surrounding registered pixel values. As a result, the influence of registration is also reflected in the pixel values at positions where nothing was pasted.
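
One way to realize this behavior is a normalized filter that ignores unregistered samples, sketched below in one dimension. The tap values and the name `band_limit_registered` are assumptions for illustration; the text does not give the actual filter coefficients.

```python
def band_limit_registered(values, registered, taps=(1, 2, 3, 4, 3, 2, 1)):
    """Low-pass filter a row using only registered samples.

    values     : samples of the enlarged row
    registered : True where the sample is an original or pasted (registered) value
    taps       : symmetric low-pass taps (illustrative values, not from the text)

    Unregistered samples contribute nothing, so a position where nothing was
    pasted is rebuilt purely from its registered neighbours.
    """
    half = len(taps) // 2
    out = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(values) and registered[j]:
                num += w * values[j]
                den += w
        # Fall back to the existing value if no registered neighbour is in reach.
        out.append(num / den if den else float(values[i]))
    return out


row = band_limit_registered([10, 0, 0, 0, 30], [True, False, False, False, True])
```

In this example the three interior positions carry placeholder values that are never read; they are reconstructed from the two registered end samples.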

The reference block resulting from the band limiting filter is stored in the composite image memory 1705 by the band limiting filter unit 1704. The composite image memory 1705 sends the stored reference block to the reconstruction end determination unit 1706.

The reconstruction end determination unit 1706 internally retains the reference block produced by the previous pass of the band limiting filter, sent from the band limiting filter unit 1704, and compares it with the newly input reference block. When the comparison shows that the change has decreased (the amount of change is smaller than that of the previous pass) and the current change is itself small, the unit judges that the reconstruction processing for super-resolution is complete and ends the synthesis processing from the plurality of reference images for the current encoding target block.
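
The two-part end condition can be sketched as follows. The text does not state how the change is measured or what counts as small; the sum of absolute differences and the `threshold` parameter are assumptions, and the function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two blocks given as flat sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def reconstruction_finished(block, prev_block, prev_change, threshold):
    """Return (finished, change) for one pass of the reconstruction loop.

    block       : reference block after the current band limiting pass
    prev_block  : reference block after the previous pass
    prev_change : change measured on the previous pass, or None on the first comparison
    threshold   : upper bound for a change to count as small (assumed parameter)
    """
    change = sad(block, prev_block)
    finished = (prev_change is not None
                and change < prev_change   # the change is shrinking
                and change <= threshold)   # and the current change is itself small
    return finished, change


finished, change = reconstruction_finished([10, 12], [10, 11], prev_change=5, threshold=2)
```

The returned `change` would be passed back as `prev_change` on the next pass.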

On completion, the stored reference block is output from the composite image memory 1705 to the composite image motion compensation prediction unit 108. Otherwise, a filter having the inverse characteristic of the band limiting filter is applied to the difference between the stored reference block and the reference block from the previous pass of the band limiting filter, to extract high-frequency components. An updated reference block, obtained by reflecting the resulting high-frequency information in the previous-pass reference block, is input to the registration unit 1702 again, and registration from the other reference images is performed again. Repeating the registration processing a plurality of times reconstructs high-definition components on the reference block step by step and produces a high-quality reference block.

Methods of super-resolution processing, including the specific registration and the way its result is reflected, exist other than the configuration of Embodiment 5. Applying such a method likewise provides the effect that motion-compensated prediction from a super-resolved synthesized reference image can be realized without transmitting an additional motion vector.

According to the moving picture encoding apparatus and moving picture decoding apparatus of Embodiment 5, an image obtained by applying super-resolution processing, using other reference images, to the motion-compensated prediction image predicted from the first reference image is used as the predicted image. This generates a predicted image in which high-frequency components lost from the reference image are restored. In addition, motion vector detection with fine phase adjustment is performed on the super-resolved reference image, so that a motion vector that takes the phase of the high-frequency components into account can be transmitted. This adds the new effect that the prediction residual of the high-frequency components can be significantly reduced without increasing the additional information.

The moving picture encoding apparatus and moving picture decoding apparatus presented as the first through fifth embodiments can be physically realized by a computer equipped with a CPU (central processing unit), a recording device such as a memory, a display device such as a display, and means for communicating over a transmission path. The means providing each of the presented functions can be implemented and executed as a program on the computer. The program can also be provided recorded on a computer-readable recording medium, provided from a server over a wired or wireless network, or provided as a data broadcast of terrestrial or satellite digital broadcasting.

The present invention has been described above on the basis of embodiments. The embodiments are illustrative; those skilled in the art will understand that various modifications of the combinations of their constituent elements and processing steps are possible, and that such modifications also fall within the scope of the present invention.

DESCRIPTION OF SYMBOLS
101 input image buffer, 102 block division unit, 103 intra-frame prediction unit, 104 motion vector detection unit, 105 compensation prediction unit, 106 motion vector prediction unit, 107 multiple reference image synthesis unit, 108 compensation prediction unit, 109 prediction mode determination unit, 110 subtractor, 111 orthogonal transform unit, 112 quantization unit, 113 inverse quantization unit, 114 inverse orthogonal transform unit, 115 adder, 116 intra-frame decoded image memory, 117 decoded reference image memory, 118 entropy encoding unit, 119 stream buffer, 121 code amount control unit, 201 stream buffer, 202 entropy decoding unit, 203 prediction mode decoding unit, 204 predicted image selection unit, 205 inverse quantization unit, 206 inverse orthogonal transform unit, 207 adder, 208 intra-frame decoded image memory, 209 decoded reference image memory, 211 intra-frame prediction unit, 212 motion vector prediction decoding unit, 213 compensation prediction unit, 214 motion vector separation unit, 215 multiple reference image synthesis unit, 216 compensation prediction unit, 400 standard reference image acquisition unit, 401 motion vector detection range setting unit, 402 inter-reference-image motion vector detection unit, 403 synthesized reference image acquisition unit, 404 reference image synthesis unit, 405 synthesized image memory, 1000 standard reference image acquisition unit, 1001 motion vector detection range setting unit, 1002 motion vector detection unit, 1003 synthesized reference image acquisition unit, 1004 reference image synthesis unit, 1005 synthesized image memory.

Claims

A moving picture decoding apparatus comprising:
a motion vector decoding unit that decodes a first motion vector for a decoding target block from an encoded stream;
a motion vector separation unit that generates a second motion vector from the first motion vector;
a reference image synthesis unit that generates a synthesized reference block by combining a first reference block of a specific area, equal to or larger in size than the decoding target block and extracted from a first reference image using the second motion vector, with a predetermined area of at least one other reference image;
a unit that extracts, using the first motion vector, a block of the same size as the decoding target block from the synthesized reference block and sets the extracted block as a prediction block; and
a decoding unit that generates a decoded image by adding the prediction block and a prediction difference block decoded from the decoding target block.

The moving picture decoding apparatus according to claim 1, wherein, in the motion vector separation unit, the input first motion vector has M-pixel accuracy (M is a real number), the generated second motion vector has N-pixel accuracy (N is a real number, N > M), and the second motion vector is the value obtained by converting the first motion vector to N-pixel accuracy, and
wherein the specific area is a region covering at least the target block ± N/2 pixels, relative to the position in the first reference image indicated by the second motion vector.

The moving picture decoding apparatus according to claim 1 or 2, wherein the reference image synthesis unit includes an inter-reference-image motion vector detection unit that detects a third motion vector between the first reference block and a second reference image that is another reference image, and
wherein the reference image synthesis unit generates the synthesized reference block by calculating, for each pixel, an average value or a weighted average value of the first reference block and a second reference block extracted from the second reference image using the third motion vector.

The moving picture decoding apparatus according to claim 3, wherein the inter-reference-image motion vector detection unit detects a plurality of third motion vectors between the first reference block and the second reference image in units of blocks smaller than the first reference block, and
wherein the reference image synthesis unit combines a plurality of second reference blocks, extracted in the small-block units from the second reference image using the plurality of third motion vectors, and generates the synthesized reference block by calculating an average value or a weighted average value for each pixel.

The moving picture decoding apparatus according to claim 3 or 4, wherein the inter-reference-image motion vector detection unit detects the third motion vector by searching for motion within a predetermined range centered on a motion vector value obtained by converting the second motion vector according to two time differences: a first time difference between the first reference image and the decoding target block, and a second time difference between the second reference image and the decoding target block.

A moving picture decoding method comprising:
a step of decoding a first motion vector for a decoding target block from an encoded stream;
a step of generating a second motion vector from the first motion vector;
a step of generating a synthesized reference block by combining a first reference block of a specific area, equal to or larger in size than the decoding target block and extracted from a first reference image using the second motion vector, with a predetermined area of at least one other reference image;
a step of extracting, using the first motion vector, a block of the same size as the decoding target block from the synthesized reference block and setting the extracted block as a prediction block; and
a step of generating a decoded image by adding the prediction block and a prediction difference block decoded from the decoding target block.

A moving picture decoding program that causes a computer to realize:
a function of decoding a first motion vector for a decoding target block from an encoded stream;
a function of generating a second motion vector from the first motion vector;
a function of generating a synthesized reference block by combining a first reference block of a specific area, equal to or larger in size than the decoding target block and extracted from a first reference image using the second motion vector, with a predetermined area of at least one other reference image;
a function of extracting, using the first motion vector, a block of the same size as the decoding target block from the synthesized reference block and setting the extracted block as a prediction block; and
a function of generating a decoded image by adding the prediction block and a prediction difference block decoded from the decoding target block.