JP2004120626A

JP2004120626A - Moving image compositing method and apparatus, and program

Info

Publication number: JP2004120626A
Application number: JP2002284126A
Authority: JP
Inventors: Wataru Ito; 伊藤　渡; Sukekazu Kameyama; 亀山　祐和
Original assignee: Fuji Photo Film Co Ltd
Current assignee: Fujifilm Holdings Corp
Priority date: 2002-09-27
Filing date: 2002-09-27
Publication date: 2004-04-15
Anticipated expiration: 2022-09-27
Also published as: JP4104947B2

Abstract

PROBLEM TO BE SOLVED: To create a high-definition composite frame by sampling a plurality of consecutive frames of a moving image.
SOLUTION: A sampling means 1 determines the number S of frames to be sampled on the basis of image characteristics of the moving image data M0 and of the composite frame, and samples S frames including a reference frame FrN. A correspondence finding means 2 finds the correspondence between the reference frame and each of the other frames among the S frames, in order starting from the other frame nearest the reference frame. For each other frame processed by the correspondence finding means 2, a stopping means 10 finds the correlation between that frame and the reference frame; when the correlation is lower than a predetermined threshold, the stopping means 10 stops the correspondence finding for that frame and subsequent frames. The composite frame is generated on the basis of the correspondence found by the correspondence finding means 2.
COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、動画像の連続する複数のフレームを統合して、これら複数のフレームよりも高解像度の合成フレームを作成することができる動画像合成方法および装置並びに動画像合成方法をコンピュータに実行させるためのプログラムに関するものである。
【０００２】
【従来の技術】
近年、デジタルビデオカメラの普及により、動画像を１フレーム単位で扱うことが可能となっている。このような動画像のフレームをプリント出力する際には、画質を向上させるためにフレームを高解像度にする必要がある。このため、動画像から複数のフレームをサンプリングし、サンプリングした複数のフレームを統合することにより、これらのフレームよりも高解像度の１の合成フレームを作成する方法が提案されている。
【０００３】
動画像の複数のフレームを統合する際に必要とされるのは、動領域における各フレーム間の画素の対応関係を求めることである。これには通常、ブロックマッチング法や勾配法が用いられるが、従来のブロックマッチング法は、ブロック内の動き量が同一方向であることを仮定したものであるため、回転、拡大、縮小、変形といった様々な動きに対応する柔軟性に欠けている上に、処理時間がかかり、実用的ではないという問題がある。一方、勾配法は、従来のブロックマッチング法と比較して安定に解を求めることができないという問題がある。これらの問題を克服した方法としては、統合される複数のフレームのうちの１つのフレームを基準フレームとし、基準フレームに１または複数の矩形領域からなる基準パッチを、基準フレーム以外の他のフレームに基準パッチと同様のパッチを配置し、パッチ内の画像が基準パッチ内の画像と一致するようにパッチを他のフレーム上において移動および／または変形し、移動および／または変形後のパッチおよび基準パッチに基づいて、他のフレーム上のパッチ内の画素と基準フレーム上の基準パッチ内の画素との対応関係を求めて複数フレームを精度よく合成する方法が提案されている（非特許文献１参照）。
【０００４】
非特許文献１の方法においては、基準フレームと他のフレームとの対応関係を求め、求めた後、他のフレームと基準フレームとを、最終的に必要な解像度を有する統合画像上に割り当てることにより、高精細な合成フレームを得ることができる。
【０００５】
【非特許文献１】
「フレーム間統合による高精細ディジタル画像の獲得」，中沢祐二、小松隆、斉藤隆弘，テレビジョン学会誌，１９９５年，Ｖｏｌ．４９，Ｎｏ．３，ｐ２９９−３０８
【０００６】
【発明が解決しようとする課題】
しかし、非特許文献１に記載された方法においては、動画像から複数のフレームをサンプリングする際に、基準フレームを含むどの範囲のフレーム、すなわち、基準フレームを含む何枚までのフレームを統合に使用するフレームとすることについては、操作者の手動により設定されることになっている。操作者に画像処理の知識を要求すると共に、手間がかかるという問題がある。また、操作者の手動により設定されるので、操作者の主観が入り、必ずしも客観的に適切な範囲を得ることができず、合成フレームの品質に悪い影響を与えてしまうという問題がある。
【０００７】
本発明は、上記事情を鑑みなされたものであり、動画像の複数のフレームを統合して合成フレームを作成する際に、簡単かつ客観的に適切なフレーム範囲を決定し、品質の良い合成フレームを作成することが可能な動画像合成方法および装置並びにプログラムを提供することを目的とするものである。
【０００８】
【課題を解決するための手段】
本発明の動画像合成方法は、動画像の連続する、基準フレームを含む２以上の所定の枚数のフレームをサンプリングし、
前記基準フレーム上に１つまたは複数の矩形領域からなる基準パッチを配置し、
該基準パッチと同様のパッチを前記所定の枚数のフレームのうちの他のフレーム上に配置し、
該パッチ内の画像が前記基準パッチ内の画像と略一致するように、該パッチを前記他のフレーム上において移動および／または変形し、
該移動および／または変形後のパッチおよび前記基準パッチに基づいて、前記他のフレームの夫々のフレーム上の前記パッチ内の画素と前記基準フレーム上の前記基準パッチ内の画素との対応関係を夫々求め、
求められた前記対応関係に基づいて前記所定の枚数のフレームから合成フレームを作成する動画像合成方法において、
前記動画像および／または作成しようとする前記合成フレームの画像特性に基づいて前記所定の枚数を決定して該所定の枚数のフレームをサンプリングすることを特徴とするものである。
【０００９】
ここで、動画像の画像特性とは、動画像から合成フレームを作成する際に合成フレームの品質に影響を与える可能性のある特性を意味し、例えば、動画像の各フレームの画素サイズや解像度、動画像のフレームレート、動画像の圧縮率などを例として挙げることができる。同様に、作成しようとする合成フレームの画像特性とは、該合成フレームを作成するのにサンプリングするフレームの数、若しくは必要とするフレームの数を決定する上で影響を与える可能性のある特性を意味し、例えば、作成しようとする合成フレームの画素サイズや解像度などを例として挙げることができる。また、直接的ではないが、例えば作成しようとする合成フレームの画素サイズが、前記動画像のフレームの画素サイズに対する倍率なども、間接的に前記動画像および作成しようとする前記合成フレームの画像特性である。
【００１０】
本発明の動画像合成方法において、前記画像特性を取得する方式としては、必要な画像特性を取得することができればいかなる方式であってもよく、例えば、動画像の画像特性としては、動画像のタグなどの付属情報を読み取って取得するようにしてもよいし、操作者により入力された値を用いるようにしてもよく、作成しようとする合成フレームの画像特性としては、操作者により入力された値を用いてよいし、固定した目標値を用いるようにしてもよい。
【００１１】
本発明の画像処理方法は、前記基準フレームに近い他のフレームから順に、前記対応関係を求めると共に、前記対応関係が求められる前記他のフレームと前記基準フレームとの相関を求めていき、
前記相関が所定の閾値より低くなったフレームにおいて、前記対応関係を求める処理を中止し、
求められた前記対応関係に基づいて、前記基準フレームおよび前記対応関係が求められた前記他のフレームを用いて前記合成フレームを作成することが好ましい。
【００１２】
ここで、「基準フレームに近い他のフレームから順に」とは、例えば、サンプリングされた複数のフレームにおける基準フレームの時系列的な位置が先頭または末端であれば、「基準フレームより時系列的に早い他のフレームから順に」または「基準フレームより時系列的に遅い他のフレームから順に」とのことを意味するが、サンプリングされた複数のフレームにおける基準フレームの時系列的な位置が先頭および末端ではなければ、「基準フレームより時系列的に早い他のフレームから順に」と「基準フレームより時系列的に遅い他のフレームから順に」との夫々両方のことを意味する。
【００１３】
本発明の動画像合成装置は、動画像の連続する、基準フレームを含む２以上の所定の枚数のフレームをサンプリングするサンプリング手段と、
前記基準フレーム上に１つまたは複数の矩形領域からなる基準パッチを配置し、該基準パッチと同様のパッチを前記所定の枚数のフレームのうちの他のフレーム上に配置し、該パッチ内の画像が前記基準パッチ内の画像と略一致するように、該パッチを前記他のフレーム上において移動および／または変形し、該移動および／または変形後のパッチおよび前記基準パッチに基づいて、前記他のフレームの夫々のフレーム上の前記パッチ内の画素と前記基準フレーム上の前記基準パッチ内の画素との対応関係を夫々求める対応関係求出手段と、
該対応関係求出手段により求められた前記対応関係に基づいて前記所定の枚数のフレームから合成フレームを作成するフレーム統合手段とを備えてなる動画像合成装置であって、
前記サンプリング手段が、前記動画像および／または作成しようとする前記合成フレームの画像特性に基づいて前記所定の枚数を決定するフレーム枚数決定手段を備え、該フレーム枚数決定手段に決定された前記所定の枚数のフレームをサンプリングするものであることを特徴とするものである。
【００１４】
本発明の動画像合成装置は、前記対応関係求出手段の処理を中止する中止手段をさらに備え、
前記対応関係求出手段が、前記基準フレームに近い他のフレームから順に、前記対応関係を求めるものであり、
前記中止手段が、前記対応関係求出手段により前記対応関係が求められる前記他のフレームと前記基準フレームとの相関を求めると共に、前記相関が所定の閾値より低くなったフレームから、前記対応関係求出手段の処理を中止するものであり、
前記フレーム統合手段が、求められた前記対応関係に基づいて、前記基準フレームおよび前記対応関係が求められた前記他のフレームを用いて前記合成フレームを作成するものであることが好ましい。
【００１５】
本発明のプログラムは、動画像および／または該動画像の複数のフレームから作成しようとする合成フレームの画像特性に基づいて、合成に使用する前記フレームの枚数を決定する枚数決定処理と、
前記動画像の連続する、基準フレームを含む前記枚数のフレームをサンプリングするサンプリング処理と、
前記基準フレーム上に１つまたは複数の矩形領域からなる基準パッチを配置し、該基準パッチと同様のパッチを前記枚数のフレームのうちの他のフレーム上に配置し、該パッチ内の画像が前記基準パッチ内の画像と略一致するように、該パッチを前記他のフレーム上において移動および／または変形し、該移動および／または変形後のパッチおよび前記基準パッチに基づいて、前記他のフレームの夫々のフレーム上の前記パッチ内の画素と前記基準フレーム上の前記基準パッチ内の画素との対応関係を夫々求める対応関係求出処理と、
求められた前記対応関係に基づいて前記枚数のフレームから合成フレームを作成するフレーム統合処理とをコンピュータに実行させることを特徴とするものである。
【００１６】
前記対応関係求出処理が、前記基準フレームに近い他のフレームから順に、前記対応関係を求めるものであり、本発明のプログラムは、前記対応関係が求められる前記他のフレームと前記基準フレームとの相関を求めると共に、前記相関が所定の閾値より低くなったフレームから、前記対応関係を求める処理を中止する中止処理をさらにコンピュータに実行させるものであることが好ましい。
【００１７】
【発明の効果】
本発明の動画像合成方法および装置によれば、動画像の複数の連続するフレームをサンプリングして合成フレームを作成する際に、動画像および／または作成しようとする合成フレームの画像特性に基づいてサンプリングするフレームの数を決定するようにしているので、操作者が手動でフレームの数を決定する必要がなく、便利である。また、動画像および／または作成しようとする合成フレームの画像特性に基づいてフレームの数を決定することによって、客観的に適切なフレームの数を決定することができるので、高品質の合成フレームを作成することができる。
【００１８】
本発明の動画像合成方法および装置において、決定されたフレームの数のフレームをサンプリングし、これらのフレームに対して、基準フレームに近い基準フレーム以外の他のフレームから順に、他のフレーム上のパッチ内の画素と基準フレーム上の基準パッチ内の画素との対応関係を求めると共に、他のフレームと基準フレームとの相関を求めていき、相関が所定の閾値以上であれば、次の他のフレームに対して対応関係を求める処理を続行するが、相関が所定の閾値より低くなったフレームを検出すると、決定されたフレームの数に到達していなくても、このフレーム以降の他のフレームに対する対応関係求出処理を中止するようにすることによって、基準フレームと相関が低いフレーム（例えば基準フレームのシーンと切り替わったシーンのフレーム）を用いて合成フレームを作成することを避けることができ、より高品質の合成フレームを作成することが可能となる。
【００１９】
【発明の実施の形態】
以下、図面を参照して本発明の実施形態について説明する。
【００２０】
図１は、本発明の実施形態となる動画像合成装置の構成を示すブロック図である。図１に示すように、本実施形態による動画像合成装置は、入力された動画像データＭ０から複数のフレームをサンプリングするサンプリング手段１と、後述する対応関係求出手段２の処理を中止させる中止手段１０と、サンプリング手段１によりサンプリングした複数のフレームのうち、中止手段１０により中止されるフレームまでのフレーム（中止されなければ複数のフレーム全部）に対して、基準となる１つの基準フレームの画素および基準フレーム以外の他のフレームの画素の対応関係を、基準フレームに近い他のフレームから順に求める対応関係求出手段２と、対応関係求出手段２により求められた対応関係に基づいて、対応関係が求められた他のフレームを夫々基準フレームの座標空間上に座標変換して座標変換済みフレームを取得する座標変換手段３と、対応関係求出手段２において求められた対応関係に基づいて、対応関係が求められた他のフレームに対して補間演算を施して各フレームよりも解像度が高い第１の補間フレームを取得する時空間補間手段４と、基準フレームに対して補間演算を施して各フレームよりも解像度が高い第２の補間フレームを取得する空間補間手段５と、夫々の座標変換済みフレームと基準フレームとの相関を表す相関値を算出する相関値算出手段６と、第１の補間フレームと第２の補間フレームとを重み付け加算するための重み係数を相関値算出手段６において算出された相関値に基づいて算出する重み算出手段７と、重み算出手段７において算出された重み係数に基づいて第１および第２の補間フレームを重み付け加算して合成フレームＦｒＧを取得する合成手段８とを備える。なお、座標変換手段３と、時空間補間手段４と、空間補間手段５と、相関値算出手段６と、重み算出手段７と、合成手段８とは、請求項記載のフレーム統合手段に当たるものである。
【００２１】
図２は、図１に示す動画像合成装置におけるサンプリング手段１の構成を示すブロック図である。図２に示すように、サンプリング手段１は、合成フレームの画素サイズが動画像の１フレームの画素サイズに対する倍率、動画像のフレームレート、および動画像の圧縮クオリティと、サンプリングすべきフレームの数Ｓとを対応付けて作成されたフレーム数決定テーブルを記憶した記憶手段１２と、実際に作成しようとする合成フレームＦｒＧの画素サイズが動画像の１フレームの画素サイズに対する倍率、動画像データＭ０のフレームレート、および動画像データＭ０の圧縮クオリティを入力させるための条件設定手段１４と、記憶手段１２に記憶されたフレーム数決定テーブルを参照し、条件設定手段１４を介して入力された倍率、フレームレート、圧縮クオリティに対応した、サンプリングすべきフレームの数Ｓを検出して、Ｓ枚の連続したフレームを動画像データＭ０からサンプリングするサンプリング実行手段１６とを備えてなるものである。
【００２２】
図３は、図２に示すサンプリング手段１の記憶手段１２に記憶されたフレーム数決定テーブルの１例を示している。図示の例は、下記の式（１）に従って、様々な倍率、フレームレート、圧縮クオリティの組み合わせから、この組み合わせに対応してサンプリングすべきフレームの数Ｓを求めたものである。
【００２３】
Ｓ＝ｍｉｎ（Ｓ１，Ｓ２×Ｓ３）
Ｓ１＝フレームレート×３　　　　　　　　　　　　　　　（１）
Ｓ２＝倍率×１．５
Ｓ３＝１．０（高圧縮クオリティ）
Ｓ３＝１．２（中圧縮クオリティ）
Ｓ３＝１．５（低圧縮クオリティ）
即ち、フレームレートが大きければフレームの数Ｓが多く、倍率が大きければフレームの数Ｓが多く、圧縮クオリティが低ければフレームの数Ｓが多くなる傾向でフレームの数が求められている。
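The rule in formula (1) can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `frames_to_sample`, the quality labels, rounding up to a whole frame, and the lower bound of two frames (taken from the claim wording "two or more") are assumptions added here.

```python
import math

# S3 multipliers from formula (1); the string keys are illustrative labels.
QUALITY_FACTOR = {"high": 1.0, "medium": 1.2, "low": 1.5}


def frames_to_sample(frame_rate, magnification, quality):
    """Return the frame count S = min(S1, S2 * S3) from formula (1)."""
    s1 = frame_rate * 3            # S1 = frame rate x 3
    s2 = magnification * 1.5       # S2 = magnification x 1.5
    s3 = QUALITY_FACTOR[quality]   # S3 depends on compression quality
    s = min(s1, s2 * s3)
    # Assumption: round up to a whole frame and never go below two frames.
    return max(2, math.ceil(s))
```

For a 30 fps clip enlarged 4x, this gives 6 frames at high compression quality and 9 frames at low quality, which matches the stated tendency that lower quality calls for more frames. In the embodiment the values are looked up in a precomputed table rather than evaluated on the fly.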
【００２４】
サンプリング手段１は、サンプリングしたＳ枚のフレームを対応関係求出手段２に出力し、対応関係求出手段２は、このＳ枚のフレーム（中止手段１０により中止されれば、このＳ枚のフレームのうちの中止されたフレームまでのフレーム）のうちの基準フレームの画素および他のフレームの画素の対応関係を、基準フレームに近い他のフレームから順に求める。ここで対応関係求出手段２の動作を説明する。なお、動画像データＭ０はカラーの動画像を表すものであり、各フレームはＹ，Ｃｂ，Ｃｒの輝度色差成分からなるものとする。また、以降の説明において、Ｙ，Ｃｂ，Ｃｒの各成分に対して処理が行われるが、行われる処理は全ての成分について同様であるため、本実施形態においては輝度成分Ｙの処理について詳細に説明し、色差成分Ｃｂ，Ｃｒに対する処理については説明を省略する。
【００２５】
サンプリング手段１から出力されてきたＳ枚のフレームは、例として１つの基準フレームＦｒＮを先頭にして、基準フレームＦｒＮに近い順からＦｒＮ＋１，ＦｒＮ＋２．．．ＦｒＮ＋（Ｓ−１）のように連続して並んだものである。ここで、フレームＦｒＮ＋１と基準フレームＦｒＮとを例にして対応関係求出手段２の動作を説明する。なお、以降では、作成しようとする合成フレームＦｒＧはサンプリングしたフレームの縦横それぞれ２倍（倍率が４倍となる）の画素数を有する場合について説明するが、ｎ倍（ｎ：正数）の画素数を有するものであってもよい。
【００２６】
対応関係求出手段２は、以下のようにしてフレームＦｒＮ＋１と基準フレームＦｒＮとの対応関係を求める。図４はフレームＦｒＮ＋１と基準フレームＦｒＮとの対応関係の求出を説明するための図である。なお、図４において、基準フレームＦｒＮに含まれる円形の被写体が、フレームＦｒＮ＋１においては図面上右側に若干移動しているものとする。
【００２７】
まず、対応関係求出手段２は、基準フレームＦｒＮ上に１または複数の矩形領域からなる基準パッチＰ０を配置する。図４（ａ）は、基準フレームＦｒＮ上に基準パッチＰ０が配置された状態を示す図である。図４（ａ）に示すように、本実施形態においては、基準パッチＰ０は４×４の矩形領域からなるものとする。
次いで、図４（ｂ）に示すように、フレームＦｒＮ＋１の適当な位置に基準パッチＰ０と同様のパッチＰ１を配置し、基準パッチＰ０内の画像とパッチＰ１内の画像との相関を表す相関値を算出する。なお、相関値は下記の式（２）により平均二乗誤差として算出することができる。また、座標軸は紙面左右方向にｘ軸、紙面上下方向にｙ軸をとるものとする。
【００２８】
【数１】
$$E = \frac{1}{N}\sum_{i=1}^{N}\left(p_i - q_i\right)^2 \qquad (2)$$
但し、Ｅ：相関値
ｐｉ，ｑｉ：基準パッチＰ０，Ｐ１内にそれぞれ対応する画素の画素値
Ｎ：基準パッチＰ０およびパッチＰ１内の画素数
次いで、フレームＦｒＮ＋１上のパッチＰ１を上下左右の４方向に一定画素±Δｘ，±Δｙ移動し、このときのパッチＰ１内の画像と基準フレームＦｒＮ上の基準パッチＰ０内の画像との相関値を算出する。ここで、相関値は上下左右方向のそれぞれについて算出され、各相関値をそれぞれＥ（Δｘ，０），Ｅ（−Δｘ，０），Ｅ（０，Δｙ），Ｅ（０，−Δｙ）とする。
【００２９】
そして、移動後の４つの相関値Ｅ（Δｘ，０），Ｅ（−Δｘ，０），Ｅ（０，Δｙ），Ｅ（０，−Δｙ）から相関値が小さく（すなわち相関が大きく）なる勾配方向を相関勾配として求め、この方向に予め設定した実数値倍だけ図４（ｃ）に示すようにパッチＰ１を移動する。具体的には、下記の式（３）により係数Ｃ（Δｘ，０），Ｃ（−Δｘ，０），Ｃ（０，Δｙ），Ｃ（０，−Δｙ）を算出し、これらの係数Ｃ（Δｘ，０），Ｃ（−Δｘ，０），Ｃ（０，Δｙ），Ｃ（０，−Δｙ）から下記の式（４），式（５）により相関勾配ｇｘ，ｇｙを算出する。
【００３０】
【数２】

そして、算出された相関勾配ｇｘ，ｇｙに基づいてパッチＰ１の全体を（−λ１ｇｘ，−λ１ｇｙ）移動し、さらに上記と同様の処理を繰り返すことにより、図４（ｄ）に示すようにパッチＰ１がある位置に収束するまで反復的にパッチＰ１を移動する。ここで、λ１は収束の速さを決定するパラメータであり、実数値をとるものとする。なお、λ１をあまり大きな値とすると反復処理により解が発散してしまうため、適当な値（例えば１０）を選ぶ必要がある。
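The iterative patch movement described above can be illustrated with a simplified Python sketch. It is not the patent's procedure: formulas (3) to (5) and the step parameter λ1 are not reproduced in this text, so the sketch substitutes a greedy integer-pixel descent that evaluates the mean squared error E at the four shifted positions and steps toward the lowest value until E stops improving. The names `patch_mse` and `track_patch` are hypothetical, frames are plain 2D lists of luminance values, and lattice-point deformation is not covered.

```python
def patch_mse(ref, other, top, left, size, dx, dy):
    """Mean squared error E between the reference patch and the shifted patch."""
    total = 0.0
    for r in range(size):
        for c in range(size):
            diff = ref[top + r][left + c] - other[top + dy + r][left + dx + c]
            total += diff * diff
    return total / (size * size)


def track_patch(ref, other, top, left, size, max_iter=50):
    """Shift the patch one pixel at a time toward lower E until E stops improving."""
    height, width = len(other), len(other[0])
    dx = dy = 0
    best = patch_mse(ref, other, top, left, size, dx, dy)
    for _ in range(max_iter):
        scored = []
        # Evaluate E at the four neighbouring shifts (right, left, down, up).
        for cx, cy in ((dx + 1, dy), (dx - 1, dy), (dx, dy + 1), (dx, dy - 1)):
            inside = (0 <= top + cy and top + cy + size <= height
                      and 0 <= left + cx and left + cx + size <= width)
            if inside:
                scored.append((patch_mse(ref, other, top, left, size, cx, cy), cx, cy))
        if not scored:
            break
        e, cx, cy = min(scored)
        if e >= best:  # converged: no neighbouring shift lowers E
            break
        best, dx, dy = e, cx, cy
    return dx, dy, best
```

A real-valued step of (−λ1·gx, −λ1·gy), as in the text, would additionally need sub-pixel sampling of the other frame, which this sketch omits for brevity.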
【００３１】
さらに、パッチＰ１の格子点を座標軸に沿った４方向に一定画素移動させる。
このとき、移動した格子点を含む矩形領域は例えば図５に示すように変形する。
そして、変形した矩形領域について基準パッチＰ０の対応する矩形領域との相関値を算出する。この相関値をそれぞれＥ１（Δｘ，０），Ｅ１（−Δｘ，０），Ｅ１（０，Δｙ），Ｅ１（０，−Δｙ）とする。
【００３２】
そして、上記と同様に、変形後の４つの相関値Ｅ１（Δｘ，０），Ｅ１（−Δｘ，０），Ｅ１（０，Δｙ），Ｅ１（０，−Δｙ）から相関値が小さく（すなわち相関が大きく）なる勾配方向を求め、この方向に予め設定した実数値倍だけパッチＰ１の格子点を移動する。これをパッチＰ１の全ての格子点について行い、これを１回の処理とする。そして格子点の座標が収束するまでこの処理を繰り返す。
【００３３】
これにより、パッチＰ１の基準パッチＰ０に対する移動量および変形量が求まり、これに基づいて基準パッチＰ０内の画素とパッチＰ１内の画素との対応関係を求めることができる。
【００３４】
対応関係求出手段２は、このようにしてサンプリング手段１から出力されてきたＳ枚のフレームに対して、基準フレームＦｒＮに近いフレームからの順、即ちＦｒＮ＋１，ＦｒＮ＋２，．．．の順に対応関係を求めるが、中止手段１０により中止されたときは、中止されたフレームから以降のフレームに対する対応関係の求出を中止する。
【００３５】
図６は、中止手段１０の構成を示すブロック図である。図示のように、中止手段１０は、対応関係求出手段２により処理中のフレームと基準フレームとの相関を求める相関取得手段２２と、相関取得手段２２により求められた相関が所定の閾値以上であれば、対応関係求出手段２の処理を中止しないが、相関が所定の閾値より低ければ、対応関係求出手段２による処理中のフレーム以降のフレームに対する対応関係の求出を中止する中止実行手段２４とを備えてなるものである。
【００３６】
本実施形態において、相関取得手段２２は、対応関係求出手段２において１つのフレームに対して算出された、収束時の相関値Ｅの和を、このフレームと基準フレームとの相関値として用いる。中止実行手段２４は、この相関値が所定の閾値より高ければ（すなわち、相関が所定の閾値より低ければ）、対応関係求出手段２の処理を中止させ、すなわち、処理中のフレーム以降のフレームに対する対応関係の求出処理を中止させる。
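The stopping rule can be sketched in Python as below. The function name `frames_before_abort` is hypothetical, and the sketch assumes that the frame which triggers the stop is itself excluded from compositing; the text says processing stops "from" that frame, so this reading is an interpretation. Note that the threshold test is on the summed converged E, where a larger value means a lower correlation.

```python
def frames_before_abort(e_sums, threshold):
    """Return the indices of the other frames kept for compositing.

    e_sums[i] is the summed converged correlation value E of the i-th other
    frame, ordered from the frame nearest the reference frame outward.
    A larger E means a lower correlation with the reference frame.
    """
    kept = []
    for index, e_sum in enumerate(e_sums):
        if e_sum > threshold:  # correlation fell below the threshold: stop here
            break
        kept.append(index)
    return kept
```

With `e_sums = [3.0, 5.0, 40.0, 4.0]` and a threshold of 10.0, only the first two frames are kept; the fourth is never examined even though its E is small, because a scene change is assumed to separate it from the reference frame.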
【００３７】
座標変換手段３などからなるフレーム統合手段は、対応関係求出手段２により求められた対応関係に基づいて、基準フレームおよび、基準フレームとの対応関係が求められた他のフレームを用いて合成フレームを作成するものである。説明上の便宜のため、まず、対応関係求出手段２により対応関係が求められたフレームはＦｒＮ＋１のみであると仮定して、フレーム統合手段の動作を説明する。
【００３８】
座標変換手段３は以下のようにしてフレームＦｒＮ＋１を基準フレームＦｒＮの座標空間に座標変換して座標変換済みフレームＦｒＴ０を取得する。なお、以降の説明においては、基準フレームＦｒＮの基準パッチＰ０内の領域およびフレームＦｒＮ＋１のパッチＰ１内の領域についてのみ変換、補間演算および合成が行われる。
【００３９】
本実施形態においては、座標変換は双１次変換を用いて行うものとする。双１次変換による座標変換は、下記の式（６），（７）により定義される。
【００４０】
【数３】

式（６），（７）は、２次元座標上の４点（ｘｎ，ｙｎ）（１≦ｎ≦４）で与えられたパッチＰ１内の座標を、正規化座標系（ｕ，ｖ）（０≦ｕ，ｖ≦１）によって補間するものであり、任意の２つの矩形内の座標変換は、式（６），（７）および式（６），（７）の逆変換を組み合わせることにより行うことができる。
【００４１】
ここで、図７に示すように、パッチＰ１（ｘｎ，ｙｎ）内の点（ｘ，ｙ）が対応する基準パッチＰ０（ｘ′ｎ，ｙ′ｎ）内のどの位置に対応するかを考える。
まずパッチＰ１（ｘｎ，ｙｎ）内の点（ｘ，ｙ）について、正規化座標（ｕ，ｖ）を求める。これは式（６），（７）の逆変換により求める。そしてこのときの（ｕ，ｖ）と対応する基準パッチＰ０（ｘ′ｎ，ｙ′ｎ）を元に、式（６），（７）から点（ｘ，ｙ）に対応する座標（ｘ′，ｙ′）を求める。ここで、点（ｘ，ｙ）が本来画素値が存在する整数座標であるのに対し、点（ｘ′，ｙ′）は本来画素値が存在しない実数座標となる場合があるため、変換後の整数座標における画素値は、基準パッチＰ０の整数座標に隣接する８近傍の整数座標に囲まれた領域を設定し、この領域内に変換された座標（ｘ′，ｙ′）の画素値の荷重和として求めるものとする。
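The two-step mapping described above (invert the bilinear transform on patch P1 to get (u, v), then apply the forward transform on reference patch P0) can be sketched in Python. Formulas (6) and (7) are not reproduced in this text, so the sketch assumes the standard bilinear form with corners ordered top-left, top-right, bottom-left, bottom-right, and it inverts the transform with Newton's method, which is an implementation choice made here. All function names are hypothetical.

```python
def bilinear_map(quad, u, v):
    """Map normalized (u, v) in [0, 1]^2 to (x, y) inside the quadrilateral."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = quad
    w1, w2, w3, w4 = (1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v
    return (w1 * x1 + w2 * x2 + w3 * x3 + w4 * x4,
            w1 * y1 + w2 * y2 + w3 * y3 + w4 * y4)


def inverse_bilinear_map(quad, x, y, iterations=20):
    """Find (u, v) such that bilinear_map(quad, u, v) is (x, y), by Newton's method."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = quad
    u = v = 0.5
    for _ in range(iterations):
        fx, fy = bilinear_map(quad, u, v)
        rx, ry = fx - x, fy - y
        # Partial derivatives of the forward map with respect to u and v.
        dxdu = (1 - v) * (x2 - x1) + v * (x4 - x3)
        dxdv = (1 - u) * (x3 - x1) + u * (x4 - x2)
        dydu = (1 - v) * (y2 - y1) + v * (y4 - y3)
        dydv = (1 - u) * (y3 - y1) + u * (y4 - y2)
        det = dxdu * dydv - dxdv * dydu
        if abs(det) < 1e-12:
            break
        u -= (rx * dydv - ry * dxdv) / det
        v -= (ry * dxdu - rx * dydu) / det
    return u, v


def transfer_point(patch_quad, ref_quad, x, y):
    """Map a point of patch P1 to the corresponding position in reference patch P0."""
    u, v = inverse_bilinear_map(patch_quad, x, y)
    return bilinear_map(ref_quad, u, v)
```

The returned position is generally a real-valued coordinate, which is why the text goes on to compute pixel values at integer coordinates as a weighted sum of the transformed pixels that fall nearby.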
【００４２】
具体的には、図８に示すように基準パッチＰ０上における整数座標ｂ（ｘ，ｙ）について、その８近傍の整数座標ｂ（ｘ−１，ｙ−１），ｂ（ｘ，ｙ−１），ｂ（ｘ＋１，ｙ−１），ｂ（ｘ−１，ｙ），ｂ（ｘ＋１，ｙ），ｂ（ｘ−１，ｙ＋１），ｂ（ｘ，ｙ＋１），ｂ（ｘ＋１，ｙ＋１）に囲まれる領域内に変換されたフレームＦｒＮ＋１の画素値に基づいて算出する。ここで、フレームＦｒＮ＋１のｍ個の画素値が８近傍の画素に囲まれる領域内に変換され、変換された各画素の画素値をＩｔｊ（ｘ°，ｙ°）（１≦ｊ≦ｍ）とすると、整数座標ｂ（ｘ，ｙ）における画素値Ｉｔ（ｘ＾，ｙ＾）は、下記の式（８）により算出することができる。なお、式（８）においてφは荷重和演算を表す関数である。
【００４３】
【数４】

但し、Ｗｊ（１≦ｊ≦ｍ）：画素値Ｉｔｊ（ｘ°，ｙ°）が割り当てられた位置における近傍の整数画素から見た座標内分比の積
ここで、簡単のため、図８を用いて８近傍の画素に囲まれる領域内にフレームＦｒＮ＋１の２つの画素値Ｉｔ１，Ｉｔ２が変換された場合について考えると、整数座標ｂ（ｘ，ｙ）における画素値Ｉｔ（ｘ＾，ｙ＾）は下記の式（９）により算出することができる。
【００４４】
【数５】

但し、Ｗ１＝ｕ×ｖ、Ｗ２＝（１−ｓ）×（１−ｔ）
以上の処理をパッチＰ１内の全ての画素について行うことにより、パッチＰ１内の画像が基準フレームＦｒＮの座標空間に変換されて、座標変換済みフレームＦｒＴ０が得られる。
【００４５】
時空間補間手段４は、フレームＦｒＮ＋１に対して補間演算を施して第１の補間フレームＦｒＨ１を取得する。具体的には、まず図９に示すように、最終的に必要な画素数を有する統合画像（本実施形態においては、フレームＦｒＮ，ＦｒＮ＋１の縦横それぞれ２倍の画素数を有する場合について説明するが、ｎ倍（ｎ：正数）の画素数を有するものであってもよい）を用意し、対応関係求出手段２において求められた対応関係に基づいて、フレームＦｒＮ＋１（パッチＰ１内の領域）の画素の画素値を統合画像上に割り当てる。この割り当てを行う関数をΠとすると、下記の式（１０）によりフレームＦｒＮ＋１の各画素の画素値が統合画像上に割り当てられる。
【００４６】
【数６】
$$I1_{N+1}(x^{\circ}, y^{\circ}) = \Pi\bigl(Fr_{N+1}(x, y)\bigr) \qquad (10)$$
但し、Ｉ１Ｎ＋１（ｘ°，ｙ°）：統合画像上に割り当てられたフレームＦｒＮ＋１の画素値
ＦｒＮ＋１（ｘ，ｙ）：フレームＦｒＮ＋１の画素値
このように統合画像上にフレームＦｒＮ＋１の画素値を割り当てることにより画素値Ｉ１Ｎ＋１（ｘ°，ｙ°）を得、各画素についてＩ１（ｘ°，ｙ°）（＝Ｉ１Ｎ＋１（ｘ°，ｙ°））の画素値を有する第１の補間フレームＦｒＨ１を取得する。
【００４７】
ここで、画素値を統合画像上に割り当てる際に、統合画像の画素数とフレームＦｒＮ＋１の画素数との関係によっては、フレームＦｒＮ＋１上の各画素が統合画像の整数座標（すなわち画素値が存在すべき座標）に対応しない場合がある。
本実施形態においては、後述するように合成時において統合画像の整数座標における画素値を求めるものであるが、以下、合成時の説明を容易にするために統合画像の整数座標における画素値の算出について説明する。
【００４８】
統合画像の整数座標における画素値は、統合画像の整数座標に隣接する８近傍の整数座標に囲まれた領域を設定し、この領域内に割り当てられたフレームＦｒＮ＋１上の各画素の画素値の荷重和として求める。
【００４９】
すなわち、図１０に示すように統合画像における整数座標ｐ（ｘ，ｙ）については、その８近傍の整数座標ｐ（ｘ−１，ｙ−１），ｐ（ｘ，ｙ−１），ｐ（ｘ＋１，ｙ−１），ｐ（ｘ−１，ｙ），ｐ（ｘ＋１，ｙ），ｐ（ｘ−１，ｙ＋１），ｐ（ｘ，ｙ＋１），ｐ（ｘ＋１，ｙ＋１）に囲まれる領域内に割り当てられたフレームＦｒＮ＋１の画素値に基づいて算出する。ここで、フレームＦｒＮ＋１のｋ個の画素値が８近傍の画素に囲まれる領域内に割り当てられ、割り当てられた各画素の画素値をＩ１Ｎ＋１ｉ（ｘ°，ｙ°）（１≦ｉ≦ｋ）とすると、整数座標ｐ（ｘ，ｙ）における画素値Ｉ１Ｎ＋１（ｘ＾，ｙ＾）は、下記の式（１１）により算出することができる。なお、式（１１）においてΦは荷重和演算を表す関数である。
【００５０】
【数７】

但し、Ｍｉ（１≦ｉ≦ｋ）：画素値Ｉ１Ｎ＋１ｉ（ｘ°，ｙ°）が割り当てられた位置における近傍の整数画素から見た座標内分比の積
ここで、簡単のため、図１０を用いて８近傍の画素に囲まれる領域内にフレームＦｒＮ＋１の２つの画素値Ｉ１Ｎ＋１１，Ｉ１Ｎ＋１２が割り当てられた場合について考えると、整数座標ｐ（ｘ，ｙ）における画素値Ｉ１Ｎ＋１（ｘ＾，ｙ＾）は下記の式（１２）により算出することができる。
【００５１】
【数８】

但し、Ｍ１＝ｕ×ｖ、Ｍ２＝（１−ｓ）×（１−ｔ）
そして、統合画像の全ての整数座標について、フレームＦｒＮ＋１の画素値を割り当てることにより画素値Ｉ１Ｎ＋１（ｘ＾，ｙ＾）を得ることができる。この場合、第１の補間フレームＦｒＨ１の各画素値Ｉ１（ｘ＾，ｙ＾）はＩ１Ｎ＋１（ｘ＾，ｙ＾）となる。
【００５２】
なお、上記ではフレームＦｒＮ＋１に対して補間演算を施して第１の補間フレームＦｒＨ１を取得しているが、フレームＦｒＮ＋１とともに基準フレームＦｒＮをも用いて第１の補間フレームＦｒＨ１を取得してもよい。この場合、基準フレームＦｒＮの画素は、統合画像の整数座標に補間されて直接割り当てられることとなる。
【００５３】
空間補間手段５は、基準フレームＦｒＮに対して、統合画像上のフレームＦｒＮ＋１の画素が割り当てられた座標（実数座標（ｘ°，ｙ°））に画素値を割り当てる補間演算を施すことにより、第２の補間フレームＦｒＨ２を取得する。ここで、第２の補間フレームＦｒＨ２の実数座標の画素値をＩ２（ｘ°，ｙ°）とすると、画素値Ｉ２（ｘ°，ｙ°）は下記の式（１３）により算出される。
【００５４】
【数９】
$$I2(x^{\circ}, y^{\circ}) = f\bigl(Fr_{N}(x, y)\bigr) \qquad (13)$$
但し、ｆ：補間演算の関数
なお、補間演算としては、線形補間演算、スプライン補間演算等の種々の補間演算を用いることができる。
【００５５】
また、本実施形態においては、合成フレームＦｒＧは基準フレームＦｒＮの縦横それぞれ２倍の画素数であるため、基準フレームＦｒＮに対して縦横方向に画素数を２倍とする補間演算を施すことにより、統合画像の画素数と同一の画素数を有する第２の補間フレームＦｒＨ２を取得してもよい。この場合、補間演算により得られる画素値は統合画像における整数座標の画素値であり、この画素値をＩ２（ｘ＾，ｙ＾）とすると、画素値Ｉ２（ｘ＾，ｙ＾）は下記の式（１４）により算出される。
【００５６】
【数１０】
$$I2(\hat{x}, \hat{y}) = f\bigl(Fr_{N}(x, y)\bigr) \qquad (14)$$
相関値算出手段６は、座標変換済みフレームＦｒＴ０と基準フレームＦｒＮとの相対応する画素同士の相関値ｄ０（ｘ，ｙ）を算出する。具体的には下記の式（１５）に示すように、座標変換済みフレームＦｒＴ０と基準フレームＦｒＮとの対応する画素における画素値ＦｒＴ０（ｘ，ｙ），ＦｒＮ（ｘ，ｙ）との差の絶対値を相関値ｄ０（ｘ，ｙ）として算出する。なお、相関値ｄ０（ｘ，ｙ）は座標変換済みフレームＦｒＴ０と基準フレームＦｒＮとの相関が大きいほど小さい値となる。
【００５７】
【数１１】
$$d0(x, y) = \left| Fr_{T0}(x, y) - Fr_{N}(x, y) \right| \qquad (15)$$
なお、本実施形態では座標変換済みフレームＦｒＴ０と基準フレームＦｒＮとの対応する画素における画素値ＦｒＴ０（ｘ，ｙ），ＦｒＮ（ｘ，ｙ）との差の絶対値を相関値ｄ０（ｘ，ｙ）として算出しているが、差の二乗を相関値として算出してもよい。また、相関値を画素毎に算出しているが、座標変換済みフレームＦｒＴ０および基準フレームＦｒＮを複数の領域に分割し、領域内の全画素値の平均値または加算値を算出して、領域単位で相関値を得てもよい。また、画素毎に算出された相関値ｄ０（ｘ，ｙ）のフレーム全体についての平均値または加算値を算出して、フレーム単位で相関値を得てもよい。また、座標変換済みフレームＦｒＴ０および基準フレームＦｒＮのヒストグラムをそれぞれ算出し、座標変換済みフレームＦｒＴ０および基準フレームＦｒＮのヒストグラムの平均値、メディアン値または標準偏差の差分値、もしくはヒストグラムの差分値の累積和を相関値として用いてもよい。また、基準フレームＦｒＮに対する座標変換済みフレームＦｒＴ０の動きを表す動きベクトルを基準フレームＦｒＮの各画素または小領域毎に算出し、算出された動ベクトルの平均値、メディアン値または標準偏差を相関値として用いてもよく、動ベクトルのヒストグラムの累積和を相関値として用いてもよい。
【００５８】
重み算出手段７は、相関値算出手段６により算出された相関値ｄ０（ｘ，ｙ）から第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２を重み付け加算する際の重み係数α（ｘ，ｙ）を取得する。具体的には、図１１に示すテーブルを参照して重み係数α（ｘ，ｙ）を取得する。なお、図１１に示すテーブルは、相関値ｄ０（ｘ，ｙ）が小さい、すなわち座標変換済みフレームＦｒＴ０および基準フレームＦｒＮの相関が大きいほど、重み係数α（ｘ，ｙ）の値が１に近いものとなる。なお、ここでは相関値ｄ０（ｘ，ｙ）は８ビットの値をとるものとする。
【００５９】
さらに、重み算出手段７は、フレームＦｒＮ＋１を統合画像上に割り当てた場合と同様に重み係数α（ｘ，ｙ）を統合画像上に割り当てることにより、フレームＦｒＮ＋１の画素が割り当てられた座標（実数座標）における重み係数α（ｘ°，ｙ°）を算出する。具体的には、空間補間手段５における補間演算と同様に、重み係数α（ｘ，ｙ）に対して、統合画像上のフレームＦｒＮ＋１の画素が割り当てられた座標（実数座標（ｘ°，ｙ°））に画素値を割り当てる補間演算を施すことにより、重み係数α（ｘ°，ｙ°）を取得する。
【００６０】
なお、統合画像の上記実数座標における重み係数α（ｘ°，ｙ°）を補間演算により算出することなく、基準フレームＦｒＮを統合画像のサイズとなるように拡大または等倍して拡大または等倍基準フレームを取得し、統合画像におけるフレームＦｒＮ＋１の画素が割り当てられた実数座標の最近傍に対応する拡大または等倍基準フレームの画素について取得された重み係数α（ｘ，ｙ）の値をその実数座標の重み係数α（ｘ°，ｙ°）として用いてもよい。
【００６１】
さらに、統合画像の整数座標における画素値Ｉ１（ｘ＾，ｙ＾），Ｉ２（ｘ＾，ｙ＾）が取得されている場合には、統合画像上に割り当てた重み係数α（ｘ°，ｙ°）について上記と同様に荷重和を求めることにより、統合画像の整数座標における重み係数α（ｘ＾，ｙ＾）を算出すればよい。
【００６２】
合成手段８は、第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２を重み算出手段７により算出された重み係数α（ｘ°，ｙ°）に基づいて重み付け加算するとともに荷重和演算を行うことにより、統合画像の整数座標において画素値ＦｒＧ（ｘ＾，ｙ＾）を有する合成フレームＦｒＧを取得する。具体的には、下記の式（１６）により第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２の対応する画素の画素値Ｉ１（ｘ°，ｙ°），Ｉ２（ｘ°，ｙ°）を重み係数α（ｘ°，ｙ°）により重み付け加算するとともに荷重和演算を行い合成フレームＦｒＧの画素値ＦｒＧ（ｘ＾，ｙ＾）を取得する。
【００６３】
【数１２】

なお、式（１６）において、ｋは合成フレームＦｒＧすなわち統合画像の整数座標（ｘ＾，ｙ＾）の８近傍の整数座標に囲まれる領域に割り当てられたフレームＦｒＮ＋１の画素の数であり、この割り当てられた画素がそれぞれ画素値Ｉ１（ｘ°，ｙ°），Ｉ２（ｘ°，ｙ°）および重み係数α（ｘ°，ｙ°）を有するものである。
【００６４】
本実施形態においては、基準フレームＦｒＮと座標変換済みフレームＦｒＴ０との相関が大きいほど、第１の補間フレームＦｒＨ１の重み付けが大きくされて、第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２の重み付け加算が行われる。
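The weighting behaviour can be illustrated per pixel in Python. Two things are assumed here and are not taken from the text: the table of Fig. 11 is replaced by a linear ramp from α = 1 at d0 = 0 to α = 0 at d0 = 255, and the blend is written as α·I1 + (1 − α)·I2, which is consistent with the statement that a higher correlation gives the first interpolated frame more weight. The function names are hypothetical, and the weighted-sum resampling onto integer coordinates is left out.

```python
def weight_from_correlation(d0):
    """Hypothetical stand-in for the Fig. 11 table: alpha is 1 at d0 = 0 and 0 at d0 = 255."""
    clamped = min(max(d0, 0), 255)  # d0 is treated as an 8-bit value
    return 1.0 - clamped / 255.0


def blend_pixel(i1, i2, frt0, frn):
    """Blend one pixel of FrH1 (i1) and FrH2 (i2) using the local correlation."""
    d0 = abs(frt0 - frn)                 # correlation value: absolute difference
    alpha = weight_from_correlation(d0)  # high correlation gives alpha near 1
    return alpha * i1 + (1.0 - alpha) * i2
```

Where the coordinate-converted frame agrees with the reference frame (d0 = 0), the result is taken entirely from the multi-frame interpolation FrH1; where they disagree strongly, it falls back to the spatially interpolated reference frame FrH2, which limits artifacts in regions where the motion estimate failed.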
【００６５】
なお、統合画像の全ての整数座標に画素値を割り当てることができない場合がある。このような場合は、割り当てられた画素値に対して前述した空間補間手段５と同様の補間演算を施して、割り当てられなかった整数座標の画素値を算出すればよい。
【００６６】
また、上記では輝度成分Ｙについての合成フレームＦｒＧを求める処理について説明したが、色差成分Ｃｂ，Ｃｒについても同様に合成フレームＦｒＧが取得される。そして、輝度成分Ｙから求められた合成フレームＦｒＧ（Ｙ）および色差成分Ｃｂ，Ｃｒから求められた合成フレームＦｒＧ（Ｃｂ），ＦｒＧ（Ｃｒ）を合成することにより、最終的な合成フレームが得られることとなる。なお、処理の高速化のためには、輝度成分Ｙについてのみ基準フレームＦｒＮとフレームＦｒＮ＋１との対応関係を求め、色差成分Ｃｂ，Ｃｒについては輝度成分Ｙについて求められた対応関係に基づいて処理を行うことが好ましい。
【００６７】
また、統合画像の整数座標について画素値を有する第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２並びに整数座標の重み係数α（ｘ＾，ｙ＾）を取得した場合には、下記の式（１７）により第１の補間フレームＦｒＨ１および第２の補間フレームＦｒＨ２の対応する画素の画素値Ｉ１（ｘ＾，ｙ＾），Ｉ２（ｘ＾，ｙ＾）を重み係数α（ｘ＾，ｙ＾）により重み付け加算して合成フレームＦｒＧの画素値ＦｒＧ（ｘ＾，ｙ＾）を取得すればよい。
【００６８】
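【数１３】
$$Fr_{G}(\hat{x}, \hat{y}) = \alpha(\hat{x}, \hat{y})\, I1(\hat{x}, \hat{y}) + \{1 - \alpha(\hat{x}, \hat{y})\}\, I2(\hat{x}, \hat{y}) \qquad (17)$$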
図１２は、本実施形態において行われる処理を示すフローチャートである。なお、ここでは統合画像のフレームＦｒＮ＋１の画素が割り当てられた実数座標について第１の補間フレームＦｒＨ１、第２の補間フレームＦｒＨ２および重み係数α（ｘ°，ｙ°）を取得するものとして説明する。図１２に示すように、本実施形態の動画像合成装置の動作は、動画像データＭ０が入力される（Ｓ２）ことから始まる。動画像データＭ０から合成フレームを作成するため、サンプリング手段１の条件設定手段１４を介して倍率、フレームレート、圧縮クオリティが入力される（Ｓ４）と、サンプリング手段１のサンプリング実行手段１６は、記憶手段１２に記憶されたフレーム数決定テーブルを参照し、条件設定手段１４を介して入力された倍率、フレームレート、圧縮クオリティに対応した、サンプリングすべきフレームの数Ｓを検出して、Ｓ枚の連続したフレームを動画像データＭ０からサンプリングして対応関係求出手段２に出力する（Ｓ６）。対応関係求出手段２は、Ｓ枚のフレームのうちの基準フレームＦｒＮ上に基準パッチを配置する（Ｓ８）と共に、フレームＦｒＮ＋１上に基準パッチと同様のパッチを配置して、パッチ内の画像と、基準パッチ内の画像との相関値Ｅが収束するまで、パッチを移動および変形する（Ｓ１２、Ｓ１４）。中止手段１０は、収束時の相関値Ｅの和を求め、Ｅの和が所定の閾値より高ければ（すなわち、このフレームと基準フレームとの相関が所定の閾値より低ければ）、対応関係求出手段２の処理を中止させ、すなわち、処理中のフレーム以降のフレームに対する対応関係の求出処理を中止させることによって、動画像合成装置の処理を、座標変換手段３等からなるフレーム統合手段の処理に移行させる（Ｓ１６：Ｎｏ、Ｓ３０〜Ｓ４０）。
【００６９】
一方、中止手段１０により中止されなければ、対応関係求出手段２は、サンプリング手段１によりサンプリングしたＳ枚のフレームのうち、基準フレームを除く全てのフレームと基準フレームとの対応関係を求めて、フレーム統合手段に供する（Ｓ１６：Ｎｏ、Ｓ１８、Ｓ２０：Ｙｅｓ、Ｓ２５）。
【００７０】
ステップＳ３０からステップＳ４０までは、座標変換手段などからなるフレーム統合手段の動作を示している。ここでも、説明上の便宜のため、例として、対応関係求出手段２からフレームＦｒＮ＋１のみについて基準フレームＦｒＮとの対応関係が求められたとして説明をする。
【００７１】
対応関係求出手段２により求められた対応関係に基づいて、座標変換手段３によりフレームＦｒＮ＋１が基準フレームＦｒＮの座標空間に変換されて座標変換済みフレームＦｒＴ０が取得される（Ｓ３０）。そして、相関値算出手段６により座標変換済みフレームＦｒＴ０と基準フレームＦｒＮとの対応する画素の相関値ｄ０（ｘ，ｙ）が算出される（Ｓ３２）。さらに、相関値ｄ０に基づいて重み算出手段７により重み係数α（ｘ°，ｙ°）が算出される（Ｓ３４）。
【００７２】
一方、求められた対応関係に基づいて、時空間補間手段４により第１の補間フレームＦｒＨ１が取得され（Ｓ３６）、空間補間手段５により第２の補間フレームＦｒＨ２が取得される（Ｓ３８）。
【００７３】
なお、Ｓ３６〜Ｓ３８の処理を先に行ってもよく、ステップＳ３０〜Ｓ３４の処理およびステップＳ３６〜Ｓ３８の処理を並列に行ってもよい。
【００７４】
そして、合成手段８において上記式（１６）により第１の補間フレームＦｒＨ１の画素Ｉ１（ｘ°，ｙ°）と第２の補間フレームＦｒＨ２の画素Ｉ２（ｘ°，ｙ°）とが合成されて、画素ＦｒＧ（ｘ＾，ｙ＾）からなる合成フレームＦｒＧが取得され（Ｓ４０）、処理を終了する。
【００７５】
上述において、説明上の便宜のため、対応関係求出手段２によりＦｒＮ＋１のみについて基準フレームＦｒＮとの対応関係が求められ、フレーム統合手段は、基準フレームＦｒＮとＦｒＮ＋１との２つのフレームを用いて合成フレームを作成することについて説明したが、例えばＴ個（Ｔ≧３）のフレームＦｒＮ＋ｔ′（０≦ｔ′≦Ｔ−１）から合成フレームＦｒＧを取得する場合（すなわち、対応関係求出手段２により２つ以上のフレームと基準フレームとの対応関係が求められた場合）、基準フレームＦｒＮ（＝ＦｒＮ＋０）以外の他のフレームＦｒＮ＋ｔ（１≦ｔ≦Ｔ−１）について、統合画像上に画素値を割り当てて複数の第１の補間フレームＦｒＨ１ｔを得る。なお、第１の補間フレームＦｒＨ１ｔの画素値をＩ１ｔ（ｘ°，ｙ°）とする。
【００７６】
また、基準フレームＦｒＮに対して、統合画像上のフレームＦｒＮ＋ｔの画素が割り当てられた座標（実数座標（ｘ°，ｙ°））に画素値を割り当てる補間演算を施すことにより、フレームＦｒＮ＋ｔに対応した第２の補間フレームＦｒＨ２ｔを取得する。なお、第２の補間フレームＦｒＨ２ｔの画素値をＩ２ｔ（ｘ°，ｙ°）とする。
【００７７】
さらに、求められた対応関係に基づいて、対応する第１および第２の補間フレームＦｒＨ１ｔ，ＦｒＨ２ｔを重み付け加算する重み係数αｔ（ｘ°，ｙ°）を取得する。
【００７８】
そして、互いに対応する第１および第２の補間フレームＦｒＨ１ｔ，ＦｒＨ２ｔを重み係数αｔ（ｘ°，ｙ°）により重み付け加算するとともに荷重和演算を行うことにより、統合画像の整数座標において画素値ＦｒＧｔ（ｘ＾，ｙ＾）を有する中間合成フレームＦｒＧｔを取得する。具体的には、下記の式（１８）により第１の補間フレームＦｒＨ１ｔおよび第２の補間フレームＦｒＨ２ｔの対応する画素の画素値Ｉ１ｔ（ｘ°，ｙ°），Ｉ２ｔ（ｘ°，ｙ°）を対応する重み係数αｔ（ｘ°，ｙ°）により重み付け加算するとともに荷重和演算を行い、中間合成フレームＦｒＧｔの画素値ＦｒＧｔ（ｘ＾，ｙ＾）を取得する。
【００７９】
【数１４】

なお、式（１８）において、ｋは中間合成フレームＦｒＧｔすなわち統合画像の整数座標（ｘ＾，ｙ＾）の８近傍の整数座標に囲まれる領域に割り当てられたフレームＦｒＮ＋ｔの画素の数であり、この割り当てられた画素がそれぞれ画素値Ｉ１ｔ（ｘ°，ｙ°），Ｉ２ｔ（ｘ°，ｙ°）および重み係数αｔ（ｘ°，ｙ°）を有するものである。
【００８０】
そして、中間合成フレームＦｒＧｔを加算することにより合成フレームＦｒＧを取得する。具体的には、下記の式（１９）により中間合成フレームＦｒＧｔを対応する画素同士で加算することにより、合成フレームＦｒＧの画素値ＦｒＧ（ｘ＾，ｙ＾）を取得する。
【００８１】
【数１５】

なお、統合画像の全ての整数座標に画素値を割り当てることができない場合がある。このような場合は、割り当てられた画素値に対して前述した空間補間手段５と同様の補間演算を施して、割り当てられなかった整数座標の画素値を算出すればよい。
【００８２】
また、３以上の複数のフレームから合成フレームＦｒＧを取得する場合、統合画像の整数座標について画素値を有する第１の補間フレームＦｒＨ１ｔおよび第２の補間フレームＦｒＨ２ｔ並びに整数座標の重み係数αｔ（ｘ＾，ｙ＾）を取得してもよい。この場合、各フレームＦｒＮ＋ｔ（１≦ｔ≦Ｔ−１）について、各フレームＦｒＮ＋ｔの画素値ＦｒＮ＋ｔ（ｘ，ｙ）を統合座標の全ての整数座標に割り当てて画素値Ｉ１Ｎ＋ｔ（ｘ＾，ｙ＾）すなわち画素値Ｉ１ｔ（ｘ＾，ｙ＾）を有する第１の補間フレームＦｒＨ１ｔを取得する。そして、全てのフレームＦｒＮ＋ｔについて割り当てられた画素値Ｉ１ｔ（ｘ＾，ｙ＾）と第２の補間フレームＦｒＨ２ｔの画素値Ｉ２ｔ（ｘ＾，ｙ＾）とを加算することにより複数の中間合成フレームＦｒＧｔを取得し、これらをさらに加算して合成フレームＦｒＧを取得すればよい。
【００８３】
具体的には、まず、下記の式（２０）に示すように、全てのフレームＦｒＮ＋ｔについて、統合画像の整数座標における画素値Ｉ１Ｎ＋ｔ（ｘ＾，ｙ＾）を算出する。そして、式（２１）に示すように、画素値Ｉ１ｔ（ｘ＾，ｙ＾）と画素値Ｉ２ｔ（ｘ＾，ｙ＾）とを重み係数αｔ（ｘ＾，ｙ＾）により重み付け加算することにより中間合成フレームＦｒＧｔを得る。そして、上記式（１９）に示すように、中間合成フレームＦｒＧｔを加算することにより合成フレームＦｒＧを取得する。
【００８４】
【数１６】

なお、３以上の複数のフレームから合成フレームＦｒＧを取得する場合、座標変換済みフレームＦｒＴ０は複数取得されるため、相関値および重み係数もフレーム数に対応して複数取得される。この場合、複数取得された重み係数の平均値や中間値を対応する第１および第２の補間フレームＦｒＨ１，ＦｒＨ２を重み付け加算する際の重み係数としてもよい。
【００８５】
このように、本実施形態の動画像合成装置において、サンプリング手段１は、動画像データＭ０の圧縮クオリティおよびフレームレートと、作成しようとする合成フレームの画素サイズの動画像のフレームの画素サイズに対する倍率とに基づいてサンプリングするフレームの数を決定するようにしているので、操作者が手動でフレームの数を決定する必要がなく、便利である。また、動画像と作成しようとする合成フレームとの画像特性に基づいてフレームの数を決定することによって、客観的に適切なフレームの数を決定することができるので、高品質の合成フレームを作成することができる。
【００８６】
また、本実施形態の動画像合成装置において、サンプリングしたＳ枚のフレームに対して、基準フレームＦｒＮに近い基準フレーム以外の他のフレームから順に、他のフレーム上のパッチ内の画素と基準フレーム上の基準パッチ内の画素との対応関係を求めると共に、他のフレームと基準フレームとの相関を求め、相関が所定の閾値以上であれば、次の他のフレームに対して対応関係を求める処理を続行するが、相関が所定の閾値より低くなったフレームを検出すると、決定されたフレームの数に到達していなくても、このフレーム以降の他のフレームに対する対応関係求出処理を中止するようにすることによって、基準フレームと相関が低いフレーム（例えば基準フレームのシーンと切り替わったシーンのフレーム）を用いて合成フレームを作成することを避けることができ、より高品質の合成フレームを作成することを可能とする。
【図面の簡単な説明】
【図１】本発明の実施形態による動画像合成装置の構成を示すブロック図
【図２】図１に示す動画像合成装置のサンプリング手段１の構成を示すブロック図
【図３】フレーム数決定テーブルの一例を示す図
【図４】フレームＦｒＮ＋１と基準フレームＦｒＮとの対応関係の求出を説明するための図
【図５】パッチの変形を説明するための図
【図６】図１に示す動画像合成装置の中止手段１０の構成を示すブロック図
【図７】パッチＰ１と基準パッチＰ０との対応関係を説明するための図
【図８】双１次内挿を説明するための図
【図９】フレームＦｒＮ＋１の統合画像への割り当てを説明するための図
【図１０】統合画像における整数座標の画素値の算出を説明するための図
【図１１】重み係数を求めるテーブルを示す図
【図１２】図１に示す動画像合成装置において行われる処理を示すフローチャート
【符号の説明】
１　　サンプリング手段
２　　対応関係求出手段
３　　座標変換手段
４　　時空間補間手段
５　　空間補間手段
６　　相関値算出手段
７　　重み算出手段
８　　合成手段
１０　　中止手段
１２　　記憶手段
１４　　条件設定手段
１６　　サンプリング実行手段
２２　　相関取得手段
２４　　中止実行手段
[0001]
[Technical Field of the Invention]
The present invention relates to a moving image compositing method and apparatus that can integrate a plurality of consecutive frames of a moving image to create a composite frame with a higher resolution than those frames, and to a program for causing a computer to execute the moving image compositing method.
[0002]
[Prior art]
In recent years, with the spread of digital video cameras, it has become possible to handle moving images frame by frame. When a frame of such a moving image is printed, the frame must be given a higher resolution to improve image quality. For this reason, a method has been proposed in which a plurality of frames are sampled from a moving image and integrated to create a single composite frame with a higher resolution than those frames.
[0003]
Integrating a plurality of frames of a moving image requires finding the pixel correspondence between frames in moving regions. The block matching method or the gradient method is usually used for this. However, the conventional block matching method assumes that all motion within a block is in the same direction, so it lacks the flexibility to handle varied motion such as rotation, enlargement, reduction, and deformation; it is also slow and therefore impractical. The gradient method, for its part, cannot obtain a solution as stably as the conventional block matching method. A method that overcomes these problems has been proposed (see Non-Patent Document 1). In that method, one of the frames to be integrated is taken as a reference frame; a reference patch consisting of one or more rectangular regions is placed on the reference frame, and a patch similar to the reference patch is placed on each frame other than the reference frame; the patch is moved and/or deformed on the other frame so that the image inside the patch matches the image inside the reference patch; and, based on the moved and/or deformed patch and the reference patch, the correspondence between pixels in the patch on the other frame and pixels in the reference patch on the reference frame is obtained, so that the frames can be composited accurately.
[0004]
In the method of Non-Patent Document 1, the correspondence between the reference frame and the other frame is obtained, and the other frame and the reference frame are then assigned onto an integrated image having the finally required resolution, which yields a high-definition composite frame.
[0005]
[Non-Patent Document 1]
"Acquisition of High-Definition Digital Images by Interframe Integration", Yuji Nakazawa, Takashi Komatsu, Takahiro Saito, Journal of the Institute of Television Engineers of Japan, 1995, Vol. 49, No. 3, pp. 299-308
[0006]
[Problems to be solved by the invention]
However, in the method described in Non-Patent Document 1, when a plurality of frames are sampled from a moving image, the range of frames including the reference frame, that is, how many frames including the reference frame are used for integration, is set manually by the operator. This requires the operator to have knowledge of image processing and takes time and effort. Moreover, because the setting is manual, it reflects the operator's subjective judgment, so an objectively appropriate range is not necessarily obtained, which can degrade the quality of the composite frame.
[0007]
The present invention has been made in view of the above circumstances. Its object is to provide a moving image compositing method, apparatus, and program that, when creating a composite frame by integrating a plurality of frames of a moving image, can determine an appropriate frame range simply and objectively and create a high-quality composite frame.
[0008]
[Means for Solving the Problems]
The moving image compositing method of the present invention is a method that:
samples a predetermined number (two or more) of consecutive frames of a moving image, including a reference frame;
places a reference patch consisting of one or more rectangular regions on the reference frame;
places a patch similar to the reference patch on each of the other frames among the predetermined number of frames;
moves and/or deforms the patch on the other frame so that the image in the patch substantially matches the image in the reference patch;
obtains, based on the moved and/or deformed patch and the reference patch, the correspondence between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
creates a composite frame from the predetermined number of frames based on the obtained correspondence,
the method being characterized in that the predetermined number is determined based on image characteristics of the moving image and/or of the composite frame to be created, and that number of frames is sampled.
[0009]
Here, the image characteristics of the moving image are characteristics that may affect the quality of the composite frame when it is created from the moving image; examples include the pixel size and resolution of each frame of the moving image, the frame rate of the moving image, and the compression ratio of the moving image. Similarly, the image characteristics of the composite frame to be created are characteristics that may affect the number of frames to be sampled, or the number of frames needed, to create the composite frame; examples include the pixel size and resolution of the composite frame to be created. In addition, a derived quantity such as the magnification of the pixel size of the composite frame to be created relative to the pixel size of a frame of the moving image also counts, indirectly, as an image characteristic of the moving image and of the composite frame to be created.
[0010]
In the moving image compositing method of the present invention, the image characteristics may be acquired by any means that yields the necessary characteristics. For example, the image characteristics of the moving image may be read from attached information such as tags of the moving image, or values input by the operator may be used. For the image characteristics of the composite frame to be created, values input by the operator or fixed target values may be used.
[0011]
In the image processing method of the present invention, it is preferable that the correspondence be obtained in order starting from the other frame nearest the reference frame, while the correlation between the reference frame and each other frame for which the correspondence is obtained is also obtained;
that the processing for obtaining the correspondence be stopped at the frame whose correlation falls below a predetermined threshold; and
that the composite frame be created, based on the obtained correspondence, using the reference frame and the other frames for which the correspondence was obtained.
[0012]
Here, "in order from the other frames nearest the reference frame" means the following. If the reference frame is at the beginning or the end of the sampled frames in time-series order, it means either "in order from the other frames earlier in time than the reference frame" or "in order from the other frames later in time than the reference frame". If the reference frame is at neither the beginning nor the end of the sampled frames, it means both "in order from the other frames earlier in time than the reference frame" and "in order from the other frames later in time than the reference frame", each applied separately.
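As a small illustration of this ordering, the following Python sketch lists the frame indices on each side of the reference frame, nearest first. The function name `processing_order` and the use of zero-based indices are assumptions made for the example; the two lists would be walked independently, each with its own stop check.

```python
def processing_order(num_frames, ref_index):
    """Return (earlier, later): frame indices on each side of the reference frame, nearest first."""
    earlier = list(range(ref_index - 1, -1, -1))   # frames before the reference frame
    later = list(range(ref_index + 1, num_frames))  # frames after the reference frame
    return earlier, later
```

For five sampled frames with the reference frame in the middle (index 2), the earlier side is processed as 1, 0 and the later side as 3, 4. When the reference frame is first or last, one of the two lists is simply empty.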
[0013]
The moving image compositing apparatus of the present invention comprises:
sampling means for sampling a predetermined number (two or more) of consecutive frames of a moving image, including a reference frame;
correspondence finding means for placing a reference patch consisting of one or more rectangular regions on the reference frame, placing a patch similar to the reference patch on each of the other frames among the predetermined number of frames, moving and/or deforming the patch on the other frame so that the image in the patch substantially matches the image in the reference patch, and obtaining, based on the moved and/or deformed patch and the reference patch, the correspondence between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
frame integration means for creating a composite frame from the predetermined number of frames based on the correspondence obtained by the correspondence finding means,
wherein the sampling means comprises frame number determining means for determining the predetermined number based on image characteristics of the moving image and/or of the composite frame to be created, and samples the predetermined number of frames determined by the frame number determining means.
[0014]
It is preferable that the moving image compositing apparatus of the present invention further comprise stopping means for stopping the processing of the correspondence finding means, wherein
the correspondence finding means obtains the correspondence in order starting from the other frame nearest the reference frame,
the stopping means obtains the correlation between the reference frame and the other frame for which the correspondence finding means obtains the correspondence, and stops the processing of the correspondence finding means from the frame whose correlation falls below a predetermined threshold, and
the frame integration means creates the composite frame, based on the obtained correspondence, using the reference frame and the other frames for which the correspondence was obtained.
[0015]
The program of the present invention causes a computer to execute: a number determination process for determining the number of frames to be used for composition, based on the image characteristics of a moving image and/or of a composite frame to be created from a plurality of frames of the moving image;
a sampling process for sampling that number of consecutive frames of the moving image, including a reference frame;
a correspondence relationship obtaining process for arranging a reference patch composed of one or a plurality of rectangular areas on the reference frame, arranging a patch similar to the reference patch on each of the other frames of that number of frames, moving and/or deforming the patch on the other frame so that the image in the patch substantially matches the image in the reference patch, and obtaining, based on the moved and/or deformed patch and the reference patch, the correspondence relationship between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
a frame integration process for creating a composite frame from that number of frames based on the obtained correspondence relationships.
[0016]
Preferably, the correspondence relationship obtaining process obtains the correspondence relationship in order from the other frames close to the reference frame, and the program of the present invention further causes the computer to execute a stopping process for obtaining the correlation between the reference frame and each other frame for which the correspondence relationship has been obtained, and for stopping the process of obtaining the correspondence relationship from a frame whose correlation is lower than a predetermined threshold.
[0017]
【Effect of the Invention】
According to the moving image synthesizing method and apparatus of the present invention, when a composite frame is created by sampling a plurality of consecutive frames of a moving image, the number of frames to be sampled is determined based on the image characteristics of the moving image and/or of the composite frame to be created. The operator therefore does not need to determine the number of frames manually, which is convenient. In addition, because the number of frames is determined objectively from these image characteristics, a high-quality composite frame can be created.
[0018]
In the moving image synthesizing method and apparatus of the present invention, the determined number of frames are sampled, and, for the frames other than the reference frame, the correspondence relationship between the pixels in the patch on the other frame and the pixels in the reference patch on the reference frame is obtained in order from the frames close to the reference frame, together with the correlation between that other frame and the reference frame. If the correlation is equal to or higher than a predetermined threshold, the process of obtaining the correspondence relationship continues with the next frame. If a frame whose correlation is lower than the predetermined threshold is detected, the process of obtaining the correspondence relationship is stopped for the frames from that frame onward, even if the determined number of frames has not been reached. This avoids creating the composite frame from frames having a low correlation with the reference frame (for example, frames of a scene switched from the scene of the reference frame), so that a composite frame of higher quality can be created.
[0019]
【Embodiments of the Invention】
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0020]
FIG. 1 is a block diagram showing the configuration of a moving image synthesizing apparatus according to an embodiment of the present invention. As shown in FIG. 1, the moving image synthesizing apparatus according to the present embodiment comprises: sampling means 1 for sampling a plurality of frames from input moving image data M0; stopping means 10 for stopping the processing of the correspondence relationship obtaining means 2 described below; correspondence relationship obtaining means 2 for obtaining, among the plurality of frames sampled by the sampling means 1 and for the frames up to the frame at which the stopping means 10 stops the processing (all of the plurality of frames if the processing is not stopped), the correspondence relationship between the pixels of one reference frame serving as a reference and the pixels of the other frames, in order from the other frames close to the reference frame; coordinate conversion means 3 for converting each of the other frames for which the correspondence relationship has been obtained into the coordinate space of the reference frame, based on the correspondence relationship obtained by the correspondence relationship obtaining means 2, to obtain a coordinate-converted frame; spatio-temporal interpolation means 4 for performing an interpolation operation on the other frames for which the correspondence relationship has been obtained, based on that correspondence relationship, to obtain a first interpolation frame having a higher resolution than each frame; spatial interpolation means 5 for performing an interpolation operation on the reference frame to obtain a second interpolation frame having a higher resolution than each frame; correlation value calculation means 6 for calculating a correlation value representing the correlation between the coordinate-converted frame and the reference frame; weight calculation means 7 for calculating, based on the correlation value calculated by the correlation value calculation means 6, a weighting coefficient for the weighted addition of the first interpolation frame and the second interpolation frame; and synthesizing means 8 for performing the weighted addition of the first and second interpolation frames based on the weighting coefficient calculated by the weight calculation means 7 to obtain a composite frame FrG. The coordinate conversion means 3, the spatio-temporal interpolation means 4, the spatial interpolation means 5, the correlation value calculation means 6, the weight calculation means 7, and the synthesizing means 8 correspond to the frame integration means described in the claims.
[0021]
FIG. 2 is a block diagram showing the configuration of the sampling means 1 in the moving image synthesizing apparatus shown in FIG. 1. As shown in FIG. 2, the sampling means 1 comprises: storage means 12 that stores a frame number determination table in which the magnification of the pixel size of the composite frame relative to the pixel size of one frame of the moving image, the frame rate of the moving image, and the compression quality of the moving image are associated with the number S of frames to be sampled; condition setting means 14 for inputting the magnification of the composite frame FrG actually to be created relative to the pixel size of one frame of the moving image, the frame rate of the moving image data M0, and the compression quality of the moving image data M0; and sampling execution means 16 that refers to the frame number determination table stored in the storage means 12, detects the number S of frames to be sampled that corresponds to the magnification, frame rate, and compression quality input via the condition setting means 14, and samples S consecutive frames from the moving image data M0.
[0022]
FIG. 3 shows an example of the frame number determination table stored in the storage means 12 of the sampling means 1 shown in FIG. 2. In the example shown in the figure, the number S of frames to be sampled is obtained for each combination of magnification, frame rate, and compression quality according to the following equation (1).
[0023]
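$$
S = \min(S_1,\; S_2 \times S_3) \qquad (1)
$$
$$
S_1 = \text{frame rate} \times 3, \qquad
S_2 = \text{magnification} \times 1.5, \qquad
S_3 = \begin{cases}
1.0 & \text{(high compression quality)} \\
1.2 & \text{(medium compression quality)} \\
1.5 & \text{(low compression quality)}
\end{cases}
$$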
That is, the number S of frames is determined so that it is larger when the frame rate is higher, larger when the magnification is larger, and larger when the compression quality is lower.
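Equation (1) can be sketched as a small function. This is only an illustration: the function name, the quality labels used as dictionary keys, and the rounding up to a whole number of frames are assumptions, since the text gives the formula but not how fractional results are handled. The lower bound of two frames follows the statement that two or more frames are sampled.

```python
import math

# Multipliers S3 from equation (1), keyed by compression quality (labels assumed).
QUALITY_FACTOR = {"high": 1.0, "medium": 1.2, "low": 1.5}


def frames_to_sample(frame_rate, magnification, quality):
    """Number S of frames to sample, following equation (1)."""
    s1 = frame_rate * 3
    s2 = magnification * 1.5
    s3 = QUALITY_FACTOR[quality]
    # Assumption: round up to a whole frame and sample at least two frames.
    return max(2, math.ceil(min(s1, s2 * s3)))
```

For a 30 frames-per-second moving image enlarged four times, the second term of the minimum is the limiting one, so the result depends on the compression quality rather than on the frame rate.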
[0024]
The sampling means 1 outputs the sampled S frames to the correspondence relationship obtaining means 2. For the S frames (or, when the stopping means 10 stops the processing, for those of the S frames up to the frame at which the processing is stopped), the correspondence relationship obtaining means 2 obtains the correspondence relationship between the pixels of the reference frame and the pixels of the other frames, in order from the other frames close to the reference frame. The operation of the correspondence relationship obtaining means 2 is described below. The moving image data M0 represents a color moving image, and each frame is composed of a luminance component Y and color difference components Cb and Cr. Processing is performed on each of the Y, Cb, and Cr components, but because the processing is the same for all components, only the processing of the luminance component Y is described in detail in the present embodiment, and the description of the processing of the color difference components Cb and Cr is omitted.
[0025]
The S frames output from the sampling means 1 are assumed to be consecutive frames that start from one reference frame FrN and are arranged in order of closeness to it, for example FrN + 1, FrN + 2, ..., FrN + (S-1). Here, the operation of the correspondence relationship obtaining means 2 is described using the frame FrN + 1 and the reference frame FrN as an example. In the following, the composite frame FrG to be created has twice as many pixels as the sampled frames in each of the vertical and horizontal directions (a magnification of 4), but it may have n times as many pixels (n: a positive number).
[0026]
The correspondence relationship obtaining means 2 obtains the correspondence relationship between the frame FrN + 1 and the reference frame FrN as follows. FIG. 4 is a diagram for explaining the calculation of the correspondence between the frame FrN + 1 and the reference frame FrN. In FIG. 4, it is assumed that the circular subject included in the reference frame FrN has moved slightly to the right in the drawing in the frame FrN + 1.
[0027]
First, the correspondence relationship obtaining unit 2 arranges the reference patch P0 including one or a plurality of rectangular areas on the reference frame FrN. FIG. 4A is a diagram illustrating a state in which the reference patch P0 is arranged on the reference frame FrN. As shown in FIG. 4A, in the present embodiment, the reference patch P0 is assumed to be a 4 × 4 rectangular area.
Next, as shown in FIG. 4B, a patch P1 similar to the reference patch P0 is arranged at an appropriate position in the frame FrN + 1, and a correlation value representing the correlation between the image in the reference patch P0 and the image in the patch P1 is calculated. The correlation value can be calculated as a mean square error by the following equation (2). The x axis is taken in the horizontal direction of the drawing and the y axis in the vertical direction of the drawing.
[0028]
[Expression 1]
$$
E = \frac{1}{N}\sum_{i=1}^{N}\left(p_i - q_i\right)^2 \qquad (2)
$$
where
E: correlation value
pi, qi: pixel values of corresponding pixels in the reference patch P0 and the patch P1, respectively
N: number of pixels in the reference patch P0 and the patch P1
Next, the patch P1 on the frame FrN + 1 is moved by a fixed number of pixels ±Δx, ±Δy in each of the four directions up, down, left, and right, and the correlation value between the image in the patch P1 after each movement and the image in the reference patch P0 on the reference frame FrN is calculated. The correlation values for the four directions are denoted E (Δx, 0), E (−Δx, 0), E (0, Δy), and E (0, −Δy), respectively.
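The correlation value of equation (2) and the four shifted evaluations can be sketched as follows. The sketch assumes that a frame is a list of rows of luminance values and that the patch is an axis-aligned square moved by whole pixels; the helper names `crop` and `probe_four_directions` are hypothetical and introduced only for illustration.

```python
def correlation_value(ref_patch, patch):
    """Mean square error E of equation (2); smaller E means higher correlation."""
    p = [v for row in ref_patch for v in row]
    q = [v for row in patch for v in row]
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def crop(frame, left, top, size):
    """Cut a size x size patch whose top-left pixel is (left, top)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def probe_four_directions(ref_patch, frame, left, top, step=1):
    """E for the patch shifted by +/-step along x and y, keyed by (dx, dy)."""
    size = len(ref_patch)
    shifts = [(step, 0), (-step, 0), (0, step), (0, -step)]
    return {
        (dx, dy): correlation_value(ref_patch, crop(frame, left + dx, top + dy, size))
        for dx, dy in shifts
    }
```

The caller is expected to keep the shifted patch inside the frame; the sketch does no bounds checking.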
[0029]
Then, from the four correlation values E (Δx, 0), E (−Δx, 0), E (0, Δy), and E (0, −Δy) after the movement, the gradient direction in which the correlation value becomes smaller (that is, the correlation becomes larger) is obtained as the correlation gradient, and the patch P1 is moved in this direction by a preset real-number multiple, as shown in the figure. Specifically, coefficients C (Δx, 0), C (−Δx, 0), C (0, Δy), and C (0, −Δy) are calculated by the following equation (3), and the correlation gradients gx and gy are calculated from these coefficients by the following equations (4) and (5).
[0030]
[Expression 2]

Then, the entire patch P1 is moved by (−λ1gx, −λ1gy) based on the calculated correlation gradients gx and gy, and the same processing as described above is repeated, so that the patch P1 is moved iteratively until it converges to a certain position, as shown in the figure. Here, λ1 is a real-valued parameter that determines the speed of convergence. If λ1 is too large, the iterative processing causes the solution to diverge, so an appropriate value (for example, 10) must be selected.
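One update of the patch position might look like the sketch below. Equations (3) to (5) are not reproduced in this text, so the sketch substitutes a plain central difference of the four correlation values for the gradient; that substitution, the function name, and the default step size are assumptions for illustration and are not the formulas of the embodiment. Because the scaling of the gradient differs, the default `lam` here is not comparable with the example value of λ1 given above.

```python
def translation_step(offset, e, dx=1, dy=1, lam=0.01):
    """Move the patch offset one step down the correlation-value gradient.

    offset: current (x, y) displacement of patch P1.
    e: correlation values keyed by (dx, 0), (-dx, 0), (0, dy), (0, -dy).
    """
    # Assumed gradient: central differences of E (not equations (3) to (5)).
    gx = (e[(dx, 0)] - e[(-dx, 0)]) / (2 * dx)
    gy = (e[(0, dy)] - e[(0, -dy)]) / (2 * dy)
    # Update by (-lam * gx, -lam * gy) so that E decreases.
    return (offset[0] - lam * gx, offset[1] - lam * gy)
```

Repeating this step until the offset stops changing corresponds to the convergence described above; a real implementation would also need sub-pixel sampling of the patch, which is omitted here.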
[0031]
Further, each lattice point of the patch P1 is moved by a fixed number of pixels in the four directions along the coordinate axes. At this time, the rectangular area containing the moved lattice point is deformed, for example as shown in FIG. 5. Then, the correlation value between the deformed rectangular area and the corresponding rectangular area of the reference patch P0 is calculated. These correlation values are denoted E1 (Δx, 0), E1 (−Δx, 0), E1 (0, Δy), and E1 (0, −Δy), respectively.
[0032]
Similarly to the above, from the four correlation values E1 (Δx, 0), E1 (−Δx, 0), E1 (0, Δy), and E1 (0, −Δy) after the deformation, the gradient direction in which the correlation value becomes smaller (that is, the correlation becomes larger) is obtained, and the lattice point of the patch P1 is moved in this direction by a preset real-number multiple. This is performed for all the lattice points of the patch P1 and constitutes one round of processing. The processing is repeated until the coordinates of the lattice points converge.
[0033]
Thereby, the movement amount and the deformation amount of the patch P1 with respect to the reference patch P0 are obtained, and based on this, the correspondence relationship between the pixels in the reference patch P0 and the pixels in the patch P1 can be obtained.
[0034]
The correspondence relationship obtaining means 2 obtains the correspondence relationship in this way for the S frames output from the sampling means 1, in order from the frame closest to the reference frame FrN, that is, in the order FrN + 1, FrN + 2, and so on. However, when the stopping means 10 stops the processing, the correspondence relationship is no longer obtained for the frames from the frame at which the processing is stopped onward.
[0035]
FIG. 6 is a block diagram illustrating the configuration of the stopping means 10. As shown in the figure, the stopping means 10 comprises correlation obtaining means 22 for obtaining the correlation between the frame being processed by the correspondence relationship obtaining means 2 and the reference frame. If the correlation obtained by the correlation obtaining means 22 is equal to or higher than a predetermined threshold, the stopping means 10 does not stop the processing of the correspondence relationship obtaining means 2. If the correlation is lower than the predetermined threshold, the stopping means 10 stops the correspondence relationship obtaining means 2 from obtaining the correspondence relationship for the frames after the frame being processed.
[0036]
In the present embodiment, the correlation obtaining means 22 uses the sum of the correlation values E at convergence, calculated for one frame by the correspondence relationship obtaining means 2, as the correlation value between that frame and the reference frame. If this correlation value is higher than a predetermined threshold (that is, if the correlation is lower than the predetermined threshold), the processing of the correspondence relationship obtaining means 2 is stopped, that is, the process of obtaining the correspondence relationship for the frames after the frame being processed is stopped.
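The stopping rule can be sketched as a loop over the other frames, nearest to the reference frame first. The function name and the plain list of values are assumptions for illustration; the sketch also assumes that the frame that fails the test is itself left out of the composite frame, which the text does not state explicitly.

```python
def frames_to_integrate(correlation_values, threshold):
    """Indices of the other frames to use, nearest to the reference frame first.

    correlation_values yields, for each frame in order of closeness, the summed
    correlation value E at convergence; a larger value means a lower correlation.
    """
    used = []
    for index, value in enumerate(correlation_values):
        if value > threshold:
            # Correlation too low (for example a scene change): stop here.
            # Assumption: this frame and all later frames are excluded.
            break
        used.append(index)
    return used
```

Passing a generator instead of a list lets the correspondence relationship for later frames go uncomputed once the loop stops, which is the point of stopping early.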
[0037]
The frame integration means, comprising the coordinate conversion means 3 and the other means listed above, creates the composite frame based on the correspondence relationships obtained by the correspondence relationship obtaining means 2, using the reference frame and the other frames for which the correspondence relationship with the reference frame has been obtained. For convenience of explanation, the operation of the frame integration means is first described on the assumption that FrN + 1 is the only frame for which the correspondence relationship has been obtained by the correspondence relationship obtaining means 2.
[0038]
The coordinate conversion unit 3 converts the frame FrN + 1 into the coordinate space of the reference frame FrN as follows to obtain the coordinate-converted frame FrT0. In the following description, conversion, interpolation calculation, and synthesis are performed only for the region in the reference patch P0 of the reference frame FrN and the region in the patch P1 of the frame FrN + 1.
[0039]
In the present embodiment, coordinate transformation is performed using bilinear transformation. Coordinate transformation by bilinear transformation is defined by the following equations (6) and (7).
[0040]
[Equation 3]

Equations (6) and (7) express coordinates in the patch P1, which is given by four points (xn, yn) (1 ≦ n ≦ 4) in two-dimensional coordinates, in a normalized coordinate system (u, v) (0 ≦ u, v ≦ 1). Coordinate transformation between any two rectangular areas can be performed by combining equations (6) and (7) with their inverse transformations.
[0041]
Here, as shown in FIG. 7, consider the position in the reference patch P0 (x′n, y′n) that corresponds to a point (x, y) in the patch P1 (xn, yn).
First, the normalized coordinates (u, v) of the point (x, y) in the patch P1 (xn, yn) are obtained by the inverse transformation of equations (6) and (7). Then, from the (u, v) thus obtained and the corresponding reference patch P0 (x′n, y′n), the coordinates (x′, y′) corresponding to the point (x, y) are obtained by equations (6) and (7). The point (x, y) has integer coordinates at which a pixel value originally exists, whereas the point (x′, y′) may have real-number coordinates at which no pixel value originally exists. The pixel value at integer coordinates after the conversion is therefore obtained as follows: an area surrounded by the eight integer coordinates adjacent to each integer coordinate of the reference patch P0 is set, and the pixel value is obtained as the load sum (weighted sum) of the pixel values at the coordinates (x′, y′) converted into this area.
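The two-step mapping from the patch P1 to the reference patch P0 can be sketched as below. Equations (6) and (7) are not reproduced in this text, so the sketch uses the standard bilinear form and assumes the four corner points are ordered top-left, top-right, bottom-left, bottom-right. The inverse is solved by Newton iteration, which is a solver chosen for this illustration rather than one named in the text, and all function names are hypothetical.

```python
def bilinear_map(corners, u, v):
    """Map normalized (u, v) in [0, 1] x [0, 1] to a point in a four-cornered patch.

    corners: [(x1, y1), (x2, y2), (x3, y3), (x4, y4)], assumed to be ordered
    top-left, top-right, bottom-left, bottom-right.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    w1, w2, w3, w4 = (1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v
    return (w1 * x1 + w2 * x2 + w3 * x3 + w4 * x4,
            w1 * y1 + w2 * y2 + w3 * y3 + w4 * y4)


def inverse_bilinear_map(corners, x, y, iterations=20):
    """Recover (u, v) for a point (x, y) by Newton iteration (assumed solver)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    u, v = 0.5, 0.5
    for _ in range(iterations):
        px, py = bilinear_map(corners, u, v)
        # Partial derivatives of the forward mapping with respect to u and v.
        xu = (1 - v) * (x2 - x1) + v * (x4 - x3)
        yu = (1 - v) * (y2 - y1) + v * (y4 - y3)
        xv = (1 - u) * (x3 - x1) + u * (x4 - x2)
        yv = (1 - u) * (y3 - y1) + u * (y4 - y2)
        det = xu * yv - xv * yu
        rx, ry = x - px, y - py
        # Solve the 2 x 2 linear system for the correction to (u, v).
        u += (rx * yv - ry * xv) / det
        v += (ry * xu - rx * yu) / det
    return u, v


def patch_to_reference(p1_corners, p0_corners, x, y):
    """Position in reference patch P0 corresponding to point (x, y) in patch P1."""
    u, v = inverse_bilinear_map(p1_corners, x, y)
    return bilinear_map(p0_corners, u, v)
```

The sketch assumes a convex, non-degenerate patch, for which the Jacobian determinant stays away from zero inside the patch.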
[0042]
Specifically, as shown in FIG. 8, for integer coordinates b (x, y) on the reference patch P0, the pixel value is calculated based on the pixel values of the frame FrN + 1 converted into the area surrounded by the eight neighboring integer coordinates b (x-1, y-1), b (x, y-1), b (x + 1, y-1), b (x-1, y), b (x + 1, y), b (x-1, y + 1), b (x, y + 1), and b (x + 1, y + 1). Here, suppose that m pixel values of the frame FrN + 1 are converted into the area surrounded by the eight neighboring pixels and that the converted pixel values are Itj (x°, y°) (1 ≦ j ≦ m). Then the pixel value It (x^, y^) at the integer coordinates b (x, y) can be calculated by the following equation (8), where φ is a function representing the load sum calculation.
[0043]
[Expression 4]
$$
I_t(\hat{x},\hat{y}) = \phi\bigl(I_{tj}(x^{\circ},y^{\circ})\bigr)
= \frac{\sum_{j=1}^{m} W_j\, I_{tj}(x^{\circ},y^{\circ})}{\sum_{j=1}^{m} W_j} \qquad (8)
$$
where Wj (1 ≦ j ≦ m): product of the internal division ratios of the coordinates, as viewed from the neighboring integer pixels, at the position to which the pixel value Itj (x°, y°) is assigned.
Here, for simplicity, consider with reference to FIG. 8 the case where two pixel values It1 and It2 of the frame FrN + 1 are converted into the area surrounded by the eight neighboring pixels. The pixel value It (x^, y^) at the integer coordinates b (x, y) can then be calculated by the following equation (9).
[0044]
[Expression 5]
$$
I_t(\hat{x},\hat{y}) = \frac{1}{W_1 + W_2}\bigl(W_1 I_{t1} + W_2 I_{t2}\bigr) \qquad (9)
$$
where W1 = u × v and W2 = (1-s) × (1-t).
By performing the above processing for all the pixels in the patch P1, the image in the patch P1 is converted into the coordinate space of the reference frame FrN, and the coordinate-converted frame FrT0 is obtained.
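The load sum used here can be sketched as a weighted average. The sketch assumes that the sum is normalized by the total of the weights and that the weights (the products of internal division ratios) have already been computed by the caller; the function name is hypothetical.

```python
def load_sum(values, weights):
    """Weighted average of the values, normalized by the sum of the weights."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for w, v in zip(weights, values)) / total
```

With two converted pixel values, `load_sum([it1, it2], [w1, w2])` corresponds to the two-pixel case described above.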
[0045]
The spatio-temporal interpolation means 4 performs an interpolation operation on the frame FrN + 1 to obtain the first interpolation frame FrH1. Specifically, as shown in FIG. 9, an integrated image having the finally required number of pixels is first prepared (in the present embodiment the integrated image has twice the number of pixels of the frames FrN and FrN + 1 in each of the vertical and horizontal directions, but it may have n times the number of pixels (n: a positive number)), and the pixel values of the pixels of the frame FrN + 1 (the area in the patch P1) are assigned onto the integrated image based on the correspondence relationship obtained by the correspondence relationship obtaining means 2. Denoting the function that performs this assignment by Π, the pixel value of each pixel of the frame FrN + 1 is assigned onto the integrated image by the following equation (10).
[0046]
[Expression 6]
$$
I_{1N+1}(x^{\circ},y^{\circ}) = \Pi\bigl(Fr_{N+1}(x,y)\bigr) \qquad (10)
$$
where
I1N + 1 (x°, y°): pixel value of the frame FrN + 1 assigned on the integrated image
FrN + 1 (x, y): pixel value of the frame FrN + 1
By assigning the pixel values of the frame FrN + 1 onto the integrated image in this way, the pixel values I1N + 1 (x°, y°) are obtained, and the first interpolation frame FrH1, in which each pixel has the pixel value I1 (x°, y°) (= I1N + 1 (x°, y°)), is acquired.
[0047]
Here, when the pixel values are assigned to the integrated image, depending on the relationship between the number of pixels of the integrated image and the number of pixels of the frame FrN + 1, a pixel of the frame FrN + 1 may not correspond to integer coordinates of the integrated image (that is, coordinates at which a pixel value should exist).
In the present embodiment, as described later, the pixel values at the integer coordinates of the integrated image are obtained at the time of synthesis. To make the description of the synthesis easier to follow, the calculation of the pixel values at the integer coordinates of the integrated image is described here.
[0048]
The pixel value at integer coordinates of the integrated image is obtained by setting an area surrounded by the eight integer coordinates adjacent to those integer coordinates and taking the load sum of the pixel values of the pixels of the frame FrN + 1 assigned within this area.
[0049]
That is, as shown in FIG. 10, for integer coordinates p (x, y) in the integrated image, the pixel value is calculated based on the pixel values of the frame FrN + 1 assigned to the area surrounded by the eight neighboring integer coordinates p (x-1, y-1), p (x, y-1), p (x + 1, y-1), p (x-1, y), p (x + 1, y), p (x-1, y + 1), p (x, y + 1), and p (x + 1, y + 1). Here, suppose that k pixel values of the frame FrN + 1 are assigned to the area surrounded by the eight neighboring pixels and that the assigned pixel values are I1N + 1i (x°, y°) (1 ≦ i ≦ k). Then the pixel value I1N + 1 (x^, y^) at the integer coordinates p (x, y) can be calculated by the following equation (11), where Φ is a function representing the load sum calculation.
[0050]
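[Expression 7]
$$
I_{1N+1}(\hat{x},\hat{y}) = \Phi\bigl(I_{1N+1,i}(x^{\circ},y^{\circ})\bigr)
= \frac{\sum_{i=1}^{k} M_i\, I_{1N+1,i}(x^{\circ},y^{\circ})}{\sum_{i=1}^{k} M_i} \qquad (11)
$$
where Mi (1 ≦ i ≦ k): product of the internal division ratios of the coordinates, as viewed from the neighboring integer pixels, at the position to which the pixel value I1N + 1i (x°, y°) is assigned.
Here, for simplicity, consider with reference to FIG. 10 the case where two pixel values I1N + 11 and I1N + 12 of the frame FrN + 1 (that is, I1N + 1i for i = 1 and i = 2) are assigned to the area surrounded by the eight neighboring pixels. The pixel value I1N + 1 (x^, y^) at the integer coordinates p (x, y) can then be calculated by the following equation (12).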
[0051]
[Expression 8]
$$
I_{1N+1}(\hat{x},\hat{y}) = \frac{1}{M_1 + M_2}\bigl(M_1 I_{1N+1,1} + M_2 I_{1N+1,2}\bigr) \qquad (12)
$$
where M1 = u × v and M2 = (1-s) × (1-t).
By assigning the pixel values of the frame FrN + 1 in this way for all the integer coordinates of the integrated image, the pixel values I1N + 1 (x^, y^) can be obtained. In this case, each pixel value I1 (x^, y^) of the first interpolation frame FrH1 is I1N + 1 (x^, y^).
[0052]
In the above description, the first interpolation frame FrH1 is obtained by performing the interpolation operation on the frame FrN + 1 only. However, the first interpolation frame FrH1 may be obtained using the reference frame FrN together with the frame FrN + 1. In this case, the pixels of the reference frame FrN are interpolated and assigned directly to the integer coordinates of the integrated image.
[0053]
The spatial interpolation means 5 obtains the second interpolation frame FrH2 by performing, on the reference frame FrN, an interpolation operation that assigns pixel values to the coordinates (real-number coordinates (x°, y°)) on the integrated image to which the pixels of the frame FrN + 1 are assigned. Denoting the pixel value at the real-number coordinates of the second interpolation frame FrH2 by I2 (x°, y°), the pixel value I2 (x°, y°) is calculated by the following equation (13).
[0054]
[Expression 9]
$$
I_2(x^{\circ},y^{\circ}) = f\bigl(Fr_N(x,y)\bigr) \qquad (13)
$$
where f: interpolation calculation function
As the interpolation calculation, various interpolation calculations such as a linear interpolation calculation and a spline interpolation calculation can be used.
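As one possible choice of the interpolation function f, the following sketch evaluates the reference frame at real-number coordinates by linear (bilinear) interpolation. It assumes the coordinates are already expressed in pixel units of the reference frame and that positions outside the frame are clamped to its border; both conventions, and the function name, are assumptions for illustration.

```python
import math


def sample_linear(frame, x, y):
    """Linearly interpolated pixel value of `frame` at real coordinates (x, y).

    frame is a list of rows, indexed frame[row][column]. Coordinates outside
    the frame are clamped to its border (an assumed convention).
    """
    height, width = len(frame), len(frame[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom
```

A spline interpolation could be substituted for the body of this function without changing how it is called.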
[0055]
Further, in the present embodiment, since the composite frame FrG has twice as many pixels as the reference frame FrN in each of the vertical and horizontal directions, a second interpolation frame FrH2 having the same number of pixels as the integrated image may be acquired by performing on the reference frame FrN an interpolation operation that doubles the number of pixels in each of the vertical and horizontal directions. In this case, the pixel values obtained by the interpolation operation are the pixel values at the integer coordinates of the integrated image. Denoting such a pixel value by I2 (x^, y^), the pixel value I2 (x^, y^) is calculated by the following equation (14).
[0056]
[Expression 10]
$$
I_2(\hat{x},\hat{y}) = f\bigl(Fr_N(x,y)\bigr) \qquad (14)
$$
The correlation value calculation means 6 calculates a correlation value d0 (x, y) between corresponding pixels of the coordinate-transformed frame FrT0 and the reference frame FrN. Specifically, as shown in the following equation (15), the absolute difference between the pixel values FrT0 (x, y) and FrN (x, y) in the corresponding pixels of the coordinate-converted frame FrT0 and the reference frame FrN The value is calculated as a correlation value d0 (x, y). The correlation value d0 (x, y) becomes smaller as the correlation between the coordinate-transformed frame FrT0 and the reference frame FrN increases.
[0057]
[Expression 11]
$$
d_0(x,y) = \bigl|\,Fr_{T0}(x,y) - Fr_N(x,y)\,\bigr| \qquad (15)
$$
In the present embodiment, the absolute value of the difference between the pixel values FrT0 (x, y) and FrN (x, y) of corresponding pixels of the coordinate-converted frame FrT0 and the reference frame FrN is calculated as the correlation value d0 (x, y), but the square of the difference may be calculated as the correlation value instead. Further, although the correlation value is calculated for each pixel, the coordinate-converted frame FrT0 and the reference frame FrN may each be divided into a plurality of regions, the average or sum of all the pixel values in each region calculated, and a correlation value obtained for each region. Alternatively, the average or sum over the entire frame of the correlation values d0 (x, y) calculated for each pixel may be calculated to obtain a correlation value for each frame. Alternatively, histograms of the coordinate-converted frame FrT0 and the reference frame FrN may be calculated, and the difference between the two histograms in mean, median, or standard deviation, or the cumulative sum of the differences between the histograms, may be used as the correlation value. Alternatively, a motion vector representing the motion of the coordinate-converted frame FrT0 with respect to the reference frame FrN may be calculated for each pixel or each small region of the reference frame FrN, and the mean, median, or standard deviation of the calculated motion vectors, or the cumulative sum of the histogram of the motion vectors, may be used as the correlation value.
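The per-pixel correlation value of equation (15) and the frame-level average mentioned as a variant can be sketched as follows, assuming both frames are equally sized lists of rows of pixel values. The function names are hypothetical.

```python
def pixel_correlation(transformed, reference):
    """Per-pixel correlation value d0(x, y) = |FrT0(x, y) - FrN(x, y)|."""
    return [[abs(t - r) for t, r in zip(t_row, r_row)]
            for t_row, r_row in zip(transformed, reference)]


def frame_correlation(d0):
    """Average of d0 over the whole frame, one of the variants mentioned."""
    values = [v for row in d0 for v in row]
    return sum(values) / len(values)
```

In both cases a smaller value indicates a higher correlation between the coordinate-converted frame and the reference frame.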
[0058]
The weight calculation means 7 obtains, from the correlation value d0 (x, y) calculated by the correlation value calculation means 6, the weighting coefficient α (x, y) used in the weighted addition of the first interpolation frame FrH1 and the second interpolation frame FrH2. Specifically, the weighting coefficient α (x, y) is obtained by referring to the table shown in FIG. 11. In the table shown in FIG. 11, the value of the weighting coefficient α (x, y) is closer to 1 as the correlation value d0 (x, y) is smaller, that is, as the correlation between the coordinate-converted frame FrT0 and the reference frame FrN is larger. Here, the correlation value d0 (x, y) is assumed to be an 8-bit value.
[0059]
Further, the weight calculation means 7 assigns the weighting coefficient α (x, y) to the integrated image in the same manner as the frame FrN + 1 was assigned to the integrated image, thereby calculating the weighting coefficient α (x°, y°) at the coordinates (real-number coordinates) to which the pixels of the frame FrN + 1 are assigned. Specifically, as in the interpolation operation of the spatial interpolation means 5, an interpolation operation that assigns values to the coordinates (real-number coordinates (x°, y°)) on the integrated image to which the pixels of the frame FrN + 1 are assigned is performed on the weighting coefficient α (x, y) to obtain the weighting coefficient α (x°, y°).
[0060]
Alternatively, instead of calculating the weighting coefficient α (x°, y°) at the real-number coordinates of the integrated image by an interpolation operation, the reference frame FrN may be enlarged or kept at equal size so as to match the size of the integrated image, giving an enlarged or equal-size reference frame, and the value of the weighting coefficient α (x, y) obtained for the pixel of the enlarged or equal-size reference frame nearest to the real-number coordinates to which a pixel of the frame FrN + 1 is assigned in the integrated image may be used as the weighting coefficient α (x°, y°) at those real-number coordinates.
[0061]
Furthermore, when the pixel values I1 (x^, y^) and I2 (x^, y^) at the integer coordinates of the integrated image have been acquired, the weighting coefficient α (x^, y^) at the integer coordinates of the integrated image may be calculated by taking the load sum of the weighting coefficients α (x°, y°) assigned on the integrated image, in the same manner as described above.
[0062]
The synthesizing means 8 performs weighted addition of the first interpolation frame FrH1 and the second interpolation frame FrH2 based on the weighting coefficient α (x°, y°) calculated by the weight calculation means 7 and performs a load sum operation, thereby obtaining the composite frame FrG having the pixel values FrG (x^, y^) at the integer coordinates of the integrated image. Specifically, by the following equation (16), the pixel values I1 (x°, y°) and I2 (x°, y°) of corresponding pixels of the first interpolation frame FrH1 and the second interpolation frame FrH2 are weighted and added with the weighting coefficient α (x°, y°), and a load sum operation is performed to obtain the pixel value FrG (x^, y^) of the composite frame FrG.
[0063]
[Expression 12]

In Expression (16), k is the number of pixels of the frame FrN+1 assigned to the region surrounded by the integer coordinates neighboring the integer coordinates (x^, y^) of the composite frame FrG, that is, of the integrated image. Each assigned pixel has pixel values I1(x°, y°) and I2(x°, y°) and a weighting coefficient α(x°, y°).
[0064]
In the present embodiment, the greater the correlation between the reference frame FrN and the coordinate-transformed frame FrT0, the greater the weight given to the first interpolation frame FrH1 in the weighted addition of the first interpolation frame FrH1 and the second interpolation frame FrH2.
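The weighting behavior can be illustrated with a minimal Python sketch for the simplest case, in which both interpolation frames and the weights are already available at the integer coordinates of the integrated image. The equations themselves are not reproduced in this text, so the convex blend α·I1 + (1 − α)·I2 used here is an assumption, as is the function name `blend_frames`.

```python
def blend_frames(i1, i2, alpha):
    """Per-pixel weighted addition of two frames given as lists of rows.

    Assumed form (not taken from the patent's equations):
        out = alpha * I1 + (1 - alpha) * I2
    so a larger alpha (higher correlation) favours the first interpolation frame.
    """
    return [
        [a * p1 + (1.0 - a) * p2 for p1, p2, a in zip(row1, row2, row_a)]
        for row1, row2, row_a in zip(i1, i2, alpha)
    ]
```

With α = 1 the output equals the first interpolation frame, and with α = 0 it equals the second, which matches the qualitative behavior stated above.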
[0065]
Note that pixel values may not be assigned to every integer coordinate of the integrated image. In such a case, an interpolation calculation similar to that of the spatial interpolation means 5 described above may be applied to the assigned pixel values to calculate the pixel values of the integer coordinates that were left unassigned.
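One simple way to fill such gaps is sketched below. It replaces each unassigned pixel by the mean of its assigned neighbors; this neighbor averaging is a stand-in chosen for brevity, not the interpolation of the spatial interpolation means 5, and the use of `None` to mark unassigned pixels is likewise an assumption.

```python
def fill_unassigned(img):
    """Fill None entries of a list-of-rows image with the mean of assigned 8-neighbours.

    Pixels with no assigned neighbour are left as None.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] is not None:
                continue
            # Gather assigned values from the surrounding 3x3 window.
            vals = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if img[ny][nx] is not None
            ]
            if vals:
                out[y][x] = sum(vals) / len(vals)
    return out
```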
[0066]
The above description covered the process of obtaining the composite frame FrG for the luminance component Y, but a composite frame FrG is obtained in the same way for the color difference components Cb and Cr. The composite frame FrG(Y) obtained from the luminance component Y and the composite frames FrG(Cb) and FrG(Cr) obtained from the color difference components Cb and Cr are then combined to obtain the final composite frame. To increase processing speed, it is preferable to obtain the correspondence relationship between the reference frame FrN and the frame FrN+1 only for the luminance component Y, and to process the color difference components Cb and Cr based on the correspondence relationship obtained for the luminance component Y.
[0067]
When the first interpolation frame FrH1 and the second interpolation frame FrH2 having pixel values at the integer coordinates of the integrated image, and the weighting coefficient α(x^, y^) at the integer coordinates, have been acquired, the pixel values I1(x^, y^) and I2(x^, y^) of the corresponding pixels of the first interpolation frame FrH1 and the second interpolation frame FrH2 are weighted and added using the weighting coefficient α(x^, y^) according to the following equation (17) to obtain the pixel value FrG(x^, y^) of the composite frame FrG.
[0068]
[Formula 13]

FIG. 12 is a flowchart showing the processing performed in the present embodiment. The description here assumes that the first interpolation frame FrH1, the second interpolation frame FrH2, and the weighting coefficient α(x°, y°) are acquired for the real-number coordinates to which the pixels of the frame FrN+1 are assigned on the integrated image. As shown in FIG. 12, the operation of the moving image synthesizing apparatus of the present embodiment starts with the input of moving image data M0 (S2). To create a composite frame from the moving image data M0, the magnification, the frame rate, and the compression quality are input via the condition setting means 14 of the sampling means 1 (S4). The sampling execution means 16 of the sampling means 1 refers to the frame number determination table stored in the storage means 12, finds the number S of frames to be sampled that corresponds to the magnification, frame rate, and compression quality input via the condition setting means 14, samples S consecutive frames from the moving image data M0, and outputs them to the correspondence relationship obtaining means 2 (S6). The correspondence relationship obtaining means 2 places a reference patch on the reference frame FrN among the S frames (S8), places a patch similar to the reference patch on the frame FrN+1, and moves and deforms the patch until the correlation value E between the image in the patch and the image in the reference patch converges (S12, S14). The canceling means 10 obtains the sum of the correlation values E at convergence. If the sum of E is higher than a predetermined threshold (that is, if the correlation between this frame and the reference frame is lower than the predetermined threshold), the canceling means 10 stops the processing of the correspondence relationship obtaining means 2, that is, stops the processing for obtaining the correspondence relationship for the frames after the frame being processed, and the processing of the moving image synthesizing apparatus shifts to the frame integration means including the coordinate conversion means 3 (S16: No, S30 to S40).
[0069]
On the other hand, if the processing is not canceled by the canceling means 10, the correspondence relationship obtaining means 2 obtains the correspondence relationship between the reference frame and every other frame among the S frames sampled by the sampling means 1, and supplies the results to the frame integration means (S16: No, S18, S20: Yes, S25).
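The control flow of the correspondence search and its cancellation can be sketched in Python as follows. The callback `estimate`, which stands in for the patch-based correspondence search and returns a correspondence together with the sum of the converged correlation values E, is hypothetical. Excluding the frame that trips the threshold from the result, and stopping the whole search at that point when the reference frame has neighbors on both sides, are also assumptions of this sketch.

```python
def collect_correspondences(frames, ref_index, estimate, threshold):
    """Gather correspondences frame by frame, nearest to the reference first.

    estimate(ref, frame) -> (correspondence, e_sum) is a hypothetical callback;
    a larger e_sum means a lower correlation with the reference frame.
    """
    ref = frames[ref_index]
    # Visit the other frames in order of temporal distance from the reference.
    order = sorted(
        (i for i in range(len(frames)) if i != ref_index),
        key=lambda i: abs(i - ref_index),
    )
    accepted = []
    for i in order:
        correspondence, e_sum = estimate(ref, frames[i])
        if e_sum > threshold:
            # Correlation too low: stop searching this and all later frames.
            break
        accepted.append((i, correspondence))
    return accepted
```

The returned list would then be handed to the frame integration step, so a scene change partway through the sampled frames simply shortens the list.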
[0070]
Steps S30 to S40 show the operation of the frame integration means including coordinate conversion means. Here, for convenience of explanation, the description will be made on the assumption that the correspondence relationship obtaining means 2 has obtained the correspondence relationship with the reference frame FrN only for the frame FrN + 1.
[0071]
Based on the correspondence obtained by the correspondence obtaining means 2, the frame FrN + 1 is transformed into the coordinate space of the reference frame FrN by the coordinate transformation means 3 to obtain a coordinate transformed frame FrT0 (S30). Then, the correlation value calculation means 6 calculates the correlation value d0 (x, y) of the corresponding pixel between the coordinate-converted frame FrT0 and the reference frame FrN (S32). Further, a weight coefficient α (x °, y °) is calculated by the weight calculation means 7 based on the correlation value d0 (S34).
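Steps S32 and S34 can be illustrated with the Python sketch below. The exact definition of d0 and the table of FIG. 11 are not reproduced in this part of the text, so the absolute pixel difference used for d0, the linear ramp from 1 to 0, and the parameter `d_max` are all assumptions; the only property taken from the description is that higher correlation yields a larger weight.

```python
def weight_from_correlation(frt0, frn, d_max=32.0):
    """Map per-pixel differences between two frames (lists of rows) to weights in [0, 1].

    Assumed stand-ins: d0 = |FrT0 - FrN| and alpha = max(0, 1 - d0 / d_max).
    """
    alpha = []
    for row_t, row_n in zip(frt0, frn):
        alpha.append(
            [max(0.0, 1.0 - abs(t - n) / d_max) for t, n in zip(row_t, row_n)]
        )
    return alpha
```

Identical pixels receive a weight of 1, and pixels that differ by `d_max` or more receive a weight of 0.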
[0072]
Meanwhile, based on the obtained correspondence relationship, the first interpolation frame FrH1 is acquired by the spatiotemporal interpolation means 4 (S36), and the second interpolation frame FrH2 is acquired by the spatial interpolation means 5 (S38).
[0073]
Note that steps S36 to S38 may be performed before steps S30 to S34, or steps S30 to S34 and steps S36 to S38 may be performed in parallel.
[0074]
Then, the synthesis means 8 combines the pixels I1(x°, y°) of the first interpolation frame FrH1 and the pixels I2(x°, y°) of the second interpolation frame FrH2 according to equation (16) above to acquire a composite frame FrG composed of the pixels FrG(x^, y^) (S40), and the processing ends.
[0075]
The above description, for convenience, covered the case where the correspondence relationship obtaining means 2 obtains the correspondence relationship with the reference frame FrN only for the frame FrN+1, and the frame integration means creates the composite frame from the two frames FrN and FrN+1. When, for example, the composite frame FrG is acquired from T (T ≧ 3) frames FrN+t′ (0 ≦ t′ ≦ T−1) (that is, when the correspondence relationship obtaining means 2 has obtained correspondence relationships between the reference frame and two or more other frames), pixel values are assigned to the integrated image for each frame FrN+t (1 ≦ t ≦ T−1) other than the reference frame FrN (= FrN+0), and a plurality of first interpolation frames FrH1t are thereby obtained. The pixel value of the first interpolation frame FrH1t is denoted I1t(x°, y°).
[0076]
In addition, the reference frame FrN is subjected to an interpolation operation that assigns pixel values to the coordinates (real-number coordinates (x°, y°)) to which the pixels of the frame FrN+t are assigned on the integrated image, whereby a second interpolation frame FrH2t corresponding to the frame FrN+t is acquired. The pixel value of the second interpolation frame FrH2t is denoted I2t(x°, y°).
[0077]
Further, based on the obtained correspondence relationship, a weighting coefficient αt (x °, y °) for weighting and adding the corresponding first and second interpolation frames FrH1t and FrH2t is acquired.
[0078]
Then, the mutually corresponding first and second interpolation frames FrH1t and FrH2t are weighted and added using the weighting coefficient αt(x°, y°), and a weighted sum operation is performed, whereby an intermediate composite frame FrGt having the pixel value FrGt(x^, y^) at the integer coordinates of the integrated image is acquired. Specifically, according to the following equation (18), the pixel values I1t(x°, y°) and I2t(x°, y°) of the corresponding pixels of the first interpolation frame FrH1t and the second interpolation frame FrH2t are weighted and added using the corresponding weighting coefficient αt(x°, y°), and a weighted sum operation is performed to obtain the pixel value FrGt(x^, y^) of the intermediate composite frame FrGt.
[0079]
[Expression 14]

In Expression (18), k is the number of pixels of the frame FrN+t assigned to the region surrounded by the integer coordinates neighboring the integer coordinates (x^, y^) of the intermediate composite frame FrGt, that is, of the integrated image. Each assigned pixel has pixel values I1t(x°, y°) and I2t(x°, y°) and a weighting coefficient αt(x°, y°).
[0080]
Then, the composite frame FrG is obtained by adding the intermediate composite frames FrGt together. Specifically, the pixel value FrG(x^, y^) of the composite frame FrG is acquired by adding the intermediate composite frames FrGt between corresponding pixels according to the following equation (19).
[0081]
[Expression 15]

Note that pixel values may not be assigned to every integer coordinate of the integrated image. In such a case, an interpolation calculation similar to that of the spatial interpolation means 5 described above may be applied to the assigned pixel values to calculate the pixel values of the integer coordinates that were left unassigned.
[0082]
Further, when the composite frame FrG is acquired from three or more frames, the first interpolation frame FrH1t and the second interpolation frame FrH2t having pixel values at the integer coordinates of the integrated image, and the weighting coefficient αt(x^, y^) at the integer coordinates, may be acquired. In this case, for each frame FrN+t (1 ≦ t ≦ T−1), the pixel values FrN+t(x, y) of the frame are assigned to all integer coordinates of the integrated image to acquire a first interpolation frame FrH1t having the pixel value I1N+t(x^, y^), that is, I1t(x^, y^). Then, the pixel values I1t(x^, y^) assigned for every frame FrN+t and the pixel values I2t(x^, y^) of the second interpolation frame FrH2t are added to obtain a plurality of intermediate composite frames FrGt, which are in turn added together to obtain the composite frame FrG.
[0083]
Specifically, first, as shown in the following equation (20), the pixel values I1N+t(x^, y^) at the integer coordinates of the integrated image are calculated for every frame FrN+t. Then, as shown in equation (21), the pixel value I1t(x^, y^) and the pixel value I2t(x^, y^) are weighted and added using the weighting coefficient α(x^, y^) to obtain an intermediate composite frame FrGt. Finally, as shown in equation (19) above, the composite frame FrG is acquired by adding the intermediate composite frames FrGt.
[0084]
[Expression 16]

Note that when the composite frame FrG is acquired from three or more frames, a plurality of coordinate-transformed frames FrT0 are acquired, and a corresponding number of correlation values and weighting coefficients are therefore also acquired. In this case, the average value or an intermediate value (for example, the median) of the acquired weighting coefficients may be used as the weighting coefficient for the weighted addition of the corresponding first and second interpolation frames FrH1 and FrH2.
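Reducing several weight maps to a single one can be sketched in Python with the standard `statistics` module. The function name `combine_weights` and the list-of-rows layout are assumptions, and treating the median as the alternative to the average is an interpretation of the text rather than something it states outright.

```python
from statistics import mean, median

def combine_weights(weight_maps, mode="mean"):
    """Reduce several same-sized weight maps (lists of rows) to one, pixel by pixel.

    mode="mean" averages the weights; any other value takes the median.
    """
    reduce = mean if mode == "mean" else median
    h, w = len(weight_maps[0]), len(weight_maps[0][0])
    return [
        [reduce([m[y][x] for m in weight_maps]) for x in range(w)]
        for y in range(h)
    ]
```

The median is less sensitive than the average to a single frame whose weight is an outlier at a given pixel.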
[0085]
As described above, in the moving image synthesizing apparatus according to the present embodiment, the sampling means 1 determines the number of frames to be sampled based on the compression quality and frame rate of the moving image data M0 and on the magnification of the pixel size of the composite frame relative to the pixel size of a frame of the moving image. The operator therefore need not determine the number of frames manually, which is convenient. In addition, because the number of frames is determined from the image characteristics of the moving image and of the composite frame to be created, an appropriate number of frames can be determined objectively, so that a high-quality composite frame can be created.
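The table lookup performed by the sampling means can be sketched in Python as follows. Every table entry, the key layout, the default value, and the function names are hypothetical, since the actual frame number determination table of FIG. 3 is not reproduced in this text. Starting the sampled run at the reference frame follows the frame notation FrN+t with t ≧ 0 used in the description.

```python
# (magnification, frame_rate_fps, compression_quality) -> number of frames S.
# All entries are made-up placeholders, not values from the patent.
FRAME_COUNT_TABLE = {
    (2, 30, "high"): 2,
    (2, 30, "low"): 4,
    (4, 30, "high"): 4,
    (4, 30, "low"): 8,
}

def frames_to_sample(magnification, frame_rate, quality,
                     table=FRAME_COUNT_TABLE, default=2):
    """Look up the number S of frames to sample; fall back to a default if absent."""
    return table.get((magnification, frame_rate, quality), default)

def sample_frames(video, ref_index, s):
    """Return up to s consecutive frames starting at the reference frame."""
    return video[ref_index:ref_index + s]
```

For example, `sample_frames(video, n, frames_to_sample(4, 30, "low"))` would return the reference frame and the frames that follow it, up to the looked-up count.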
[0086]
Further, in the moving image synthesizing apparatus according to the present embodiment, the sampled S frames are processed in order, starting from the other frames closest to the reference frame FrN. For each such frame, the correspondence relationship between the pixels in the patch on that frame and the pixels in the reference patch on the reference frame is obtained, and the correlation between that frame and the reference frame is obtained. If the correlation is higher than a predetermined threshold, the processing continues with the next frame; if a frame whose correlation is lower than the predetermined threshold is detected, the processing for obtaining the correspondence relationship is stopped for the frames after that frame, even if the determined number of frames has not been reached. This avoids creating a composite frame from a frame having low correlation with the reference frame (for example, a frame in a scene that has switched from the scene of the reference frame), making it possible to create a higher-quality composite frame.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a moving image synthesis apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of sampling means 1 of the moving image synthesizing apparatus shown in FIG.
FIG. 3 is a diagram illustrating an example of a frame number determination table.
FIG. 4 is a diagram for explaining the calculation of the correspondence between the frame FrN + 1 and the reference frame FrN.
FIG. 5 is a diagram for explaining a deformation of a patch.
FIG. 6 is a block diagram showing a configuration of the canceling means 10 of the moving image synthesizing apparatus shown in FIG.
FIG. 7 is a diagram for explaining a correspondence relationship between a patch P1 and a reference patch P0.
FIG. 8 is a diagram for explaining bilinear interpolation.
FIG. 9 is a diagram for explaining assignment of a frame FrN+1 to an integrated image.
FIG. 10 is a diagram for explaining calculation of pixel values of integer coordinates in an integrated image.
FIG. 11 is a diagram showing a table for obtaining weighting factors.
FIG. 12 is a flowchart showing processing performed in the moving image synthesizing apparatus shown in FIG.
[Explanation of symbols]
1 Sampling means
2 Correspondence relationship obtaining means
3 Coordinate conversion means
4 Spatio-temporal interpolation means
5 Spatial interpolation means
6 Correlation value calculation means
7 Weight calculation means
8 Synthesis means
10 Canceling means
12 Storage means
14 Condition setting means
16 Sampling execution means
22 Correlation acquisition means
24 Canceling execution means

Claims

1. A moving image synthesizing method comprising:
sampling, from a moving image, a predetermined number, two or more, of consecutive frames including a reference frame;
placing a reference patch consisting of one or more rectangular areas on the reference frame;
placing a patch similar to the reference patch on each other frame of the predetermined number of frames;
moving and/or deforming the patch on the other frame so that the image in the patch substantially matches the image in the reference patch;
obtaining, based on the patch after the movement and/or deformation and on the reference patch, the correspondence relationship between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
creating a composite frame from the predetermined number of frames based on the obtained correspondence relationship,
wherein the predetermined number is determined based on image characteristics of the moving image and/or of the composite frame to be created, and the predetermined number of frames are sampled.

2. The moving image synthesizing method according to claim 1, wherein
the correspondence relationship is obtained in order from the other frames closest to the reference frame, and the correlation between the reference frame and each other frame for which the correspondence relationship has been obtained is determined,
the processing for obtaining the correspondence relationship is stopped at a frame for which the correlation is lower than a predetermined threshold, and
the composite frame is created, based on the obtained correspondence relationship, using the reference frame and the other frames for which the correspondence relationship has been obtained.

3. A moving image synthesizing apparatus comprising:
sampling means for sampling, from a moving image, a predetermined number, two or more, of consecutive frames including a reference frame;
correspondence relationship obtaining means for placing a reference patch consisting of one or more rectangular areas on the reference frame, placing a patch similar to the reference patch on each other frame of the predetermined number of frames, moving and/or deforming the patch on the other frame so that the image in the patch substantially matches the image in the reference patch, and obtaining, based on the patch after the movement and/or deformation and on the reference patch, the correspondence relationship between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
frame integration means for creating a composite frame from the predetermined number of frames based on the correspondence relationship obtained by the correspondence relationship obtaining means,
wherein the sampling means comprises frame number determining means for determining the predetermined number based on image characteristics of the moving image and/or of the composite frame to be created, and samples the predetermined number of frames determined by the frame number determining means.

4. The moving image synthesizing apparatus according to claim 3, wherein
the correspondence relationship obtaining means obtains the correspondence relationship in order from the other frames closest to the reference frame,
the apparatus further comprises canceling means for determining the correlation between the reference frame and each other frame for which the correspondence relationship has been obtained by the correspondence relationship obtaining means, and for canceling the processing of the correspondence relationship obtaining means at a frame for which the correlation is lower than a predetermined threshold, and
the frame integration means creates the composite frame, based on the obtained correspondence relationship, using the reference frame and the other frames for which the correspondence relationship has been obtained.

5. A program causing a computer to execute:
a number determination process for determining the number of frames to be used for synthesis based on image characteristics of a moving image and/or of a composite frame to be created from a plurality of frames of the moving image;
a sampling process for sampling, from the moving image, that number of consecutive frames including a reference frame;
a correspondence relationship obtaining process for placing a reference patch consisting of one or more rectangular areas on the reference frame, placing a patch similar to the reference patch on each other frame of that number of frames, moving and/or deforming the patch on the other frame so that the image in the patch substantially matches the image in the reference patch, and obtaining, based on the patch after the movement and/or deformation and on the reference patch, the correspondence relationship between the pixels in the patch on each of the other frames and the pixels in the reference patch on the reference frame; and
a frame integration process for creating a composite frame from that number of frames based on the obtained correspondence relationship.

6. The program according to claim 5, wherein
the correspondence relationship obtaining process obtains the correspondence relationship in order from the other frames closest to the reference frame, and
the program further causes the computer to determine the correlation between the reference frame and each other frame for which the correspondence relationship has been obtained, and to cancel the processing for obtaining the correspondence relationship at a frame for which the correlation is lower than a predetermined threshold.