JP3832894B2

JP3832894B2 - Image synthesizer

Info

Publication number: JP3832894B2
Application number: JP13363996A
Authority: JP
Inventors: 英夫滝口
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1996-05-28
Filing date: 1996-05-28
Publication date: 2006-10-11
Anticipated expiration: 2016-05-28
Also published as: JPH09322059A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像の一部がオーバーラップしている複数の画像をコンピュータ上で合成を行うパノラマ合成システム等に用いて好適な画像合成装置に関する。
【０００２】
【従来の技術】
画像の一部の辺がオーバーラップしている複数の画像をコンピュータ上で合成する処理を一般的にパノラマ合成と呼ぶ。これはワイドな被写体を撮影して一枚の画像にしたい、という要求からの処理といえる。また、電子カメラにおいては、銀塩カメラやスキャナと比較した短所として、解像度の低さ（画素数の少なさ）が指摘されている。この電子カメラで撮影された画像にとってのパノラマ画像合成は、ワイドな画像を撮るということだけでなく、高解像度な画像を撮る手段としても重要である。具体的には、一枚の紙の原稿や雑誌等を複数に分けて撮影し、スキャナ並みの高解像度データを取得したり、また風景を複数に分割してワイドで高解像度に撮影したりする場合に威力を発揮する。
【０００３】
パノラマ合成において、最も重要でかつ難しいのは、複数の画像のオーバーラップ位置を見つける処理である。これは言い換えると、２つの画像の中から同じ点（以降、対応点と呼ぶ）を探しだす処理である（以降、対応点抽出処理と呼ぶ）。対応点抽出処理の難しさ（エラーレート）は各々の画像によって異なる。オーバーラップしている領域内に、他の箇所にはないユニークな特徴形状が多数あるときは、対応点を間違えないで探すことができる。しかしオーバーラップ内の他の箇所にも、似たようなパターンが存在する場合は（例えば原稿中の文字等）、対応点を間違える場合が発生する。
【０００４】
そこで従来は、ユーザに対応する点を明示的に指定してもらい、その位置をもとに微調整をして合成を行うというのが一般的である。図２１に従来例を示す。まずユーザが合成したい複数の画像を指定すると、図２１のようなウインドウが開く。２枚の画像での対応するポイントをユーザが指定し、マーク２１ａ、２１ｂ、２２ａ、２２ｂをつける。処理としては、各々のマークの中心からごく近傍のパターンを調べて、両者が最も合致する位置関係を求め対応点とする。そしてこの対応点に基づいて合成のためのパラメータを求め，合成処理を行う。
【０００５】
【発明が解決しようとする課題】
しかしながら、上記従来例においては、以下のような問題点がある。まず、ユーザが合成したい画像のセットを指示しなければならない。コンピュータ上で管理している画像の全枚数がそれほど多くないときは問題にはならないが、数千枚以上もの多くの画像を管理している場合は、見落とし等が発生する。また、ユーザが対応点のポイントをほぼ正確に指定しなければならず、これも作業の回数が増えるにつれてユーザの負担となる。
【０００６】
本発明は上記のような実情に鑑みて成されたもので、もっと簡単にパノラマ画像合成を行うことのできる画像合成装置を得ることを目的としている。
【０００７】
【課題を解決するための手段】
請求項１の発明による画像合成装置においては、複数の画像を表示可能な第１の表示領域と、複数の画像を表示可能で前記第１の表示領域とは異なる第２の表示領域とを同一画面上に表示する表示手段と、ユーザの指示によって前記第１の表示領域に表示されている画像から選択した複数枚の画像を前記第２の表示領域へと移動させる移動手段と、前記第１の表示領域から前記第２の表示領域へと移動された複数の画像のオーバーラップ部分を検出する検出手段と、前記検出手段によって検出されたオーバーラップ部分における対応点を抽出する対応点抽出手段と、前記対応点抽出手段によって抽出された対応点に基づいて前記複数枚の画像をつなぎ合わせて１枚の画像を作成する合成処理手段と、を備え、前記検出手段は、前記移動手段によって移動させた複数枚の画像の枚数が所定の枚数未満である場合には、自動で画像のオーバーラップ部分を検出することを特徴とする。
【０００８】
また、請求項２の発明による画像合成装置においては、複数の画像を表示可能な第１の表示領域と、複数の画像を表示可能で前記第１の表示領域とは異なる第２の表示領域とを同一画面上に表示する表示手段と、ユーザの指示によって前記第１の表示領域に表示されている画像から選択した複数枚の画像を前記第２の表示領域へと移動させる移動手段と、前記第１の表示領域から前記第２の表示領域へと移動された複数の画像のオーバーラップ部分を検出する検出手段と、前記検出手段によって検出されたオーバーラップ部分における対応点を抽出する対応点抽出手段と、前記対応点抽出手段によって抽出された対応点に基づいて前記複数枚の画像をつなぎ合わせて１枚の画像を作成する合成処理手段と、を備え、前記移動手段は、所定の枚数以上の画像が入力されたとき、これらの画像を前記第２の表示領域上で移動させて各画像の相対位置を指定し、前記検出手段は、前記指定された相対位置に基づいて前記オーバーラップ部分を検出することを特徴とする。
【０００９】
【作用】
請求項１の発明による画像合成装置によれば、複数の画像を表示可能な第１の表示領域と、複数の画像を表示可能で前記第１の表示領域とは異なる第２の表示領域とを同一画面上に表示し、ユーザの指示によって前記第１の表示領域に表示されている画像から選択した複数枚の画像を前記第２の表示領域へと移動させ、前記第１の表示領域から前記第２の表示領域へと移動された複数枚の画像が所定の枚数未満である場合には、自動でオーバーラップ部分を検出し、検出されたオーバーラップ部分における対応点を抽出し、抽出された対応点に基づいて前記複数枚の画像をつなぎ合わせて１枚の画像を作成する。
【００１０】
請求項２の発明による画像合成装置によれば、複数の画像を表示可能な第１の表示領域と、複数の画像を表示可能で前記第１の表示領域とは異なる第２の表示領域とを同一画面上に表示し、ユーザの指示によって前記第１の表示領域に表示されている画像から選択した複数枚の画像を前記第２の表示領域へと移動させ、前記第１の表示領域から前記第２の表示領域へと移動された複数枚の画像が所定の枚数以上である場合には、これら複数枚の画像を第２の表示領域上で移動させて各画像の相対位置を指定し、この相対位置に基づいて、オーバーラップ部分を検出し、検出されたオーバーラップ部分における対応点を抽出し、抽出された対応点に基づいて前記複数枚の画像をつなぎ合わせて１枚の画像を作成する。
【００１１】
【発明の実施の形態】
本実施の形態は、以下の手順により簡単で便利なパノラマ画像合成を行うものである。
（１）まず、電子カメラで画像を撮影する際に、ユーザは“パノラマ画像撮影モード”にカメラをセットしてから撮影を行う。このモードにより、撮影された画像の属性情報中に、１セットのパノラマ画像を示す識別子が自動的に記録される。
（２）そして、このカメラをコンピュータに接続して、カメラ内のメモリ中にある画像＋属性情報をコンピュータのＨＤにコピーする際に、アプリケーションソフトウェアを介してこの属性情報をチェックする。属性情報中にパノラマ画像撮影モードでの識別子が存在するものから自動的に１セットの画像を抽出する。
【００１２】
次にパノラマ画像合成処理に入る。画像合成においては、画像が２枚のときのみ完全自動で行う後述するフルオート合成を行う。３枚以上のときは、画像の上下左右の相対位置だけをユーザに指定してもらう後述するオート合成を行う。またこれらのフルオート合成、オート合成のチェック段階で対応点が十分求められなかったとき、またはユーザが対応点の検出に要する時間を省いてより短い時間で合成処理を行いたいときは、ユーザがだいたいのオーバーラップ位置を指定する後述するセミオート合成を行う。
【００１３】
（３）「フルオート合成」：上記（２）でパノラマ画像の１セットを抽出して、それが２枚だったときはフルオート合成処理に入る。２枚の合成位置としては、図３（ａ）〜（ｄ）に示すように上下左右の４通りが考えられる。そこでこの４つの場合の対応点を求める処理を行い、対応点が上下左右のうち最も多く求まったところを、正しい合成位置として合成する。このとき、４つのいずれの場合も、対応点の数が所定の量以下のときは、確実性が低いといえる。このときはフルオート合成処理を打ち切り、セミオート合成処理に移行する。このモードでのユーザの操作としては、画像をカメラからコンピュータ中へコピーする操作だけで、あとは自動でやってくれることになる。特殊な用途以外は通常は２枚合成がほとんどと考えられるので、この合成処理が行われる場合が最も多い。
【００１４】
（４）「オート合成」：上記（２）でパノラマ画像の１セットを抽出して、それが３枚以上だったときこのオート合成処理に入る。このオート合成では１セットの画像をウインドウに表示する。図４は４枚の画像を表示する場合を示す。ユーザはこれらの画像をｄｒａｇして配置を色々考え、上下、左右の位置関係のみを指示する。これに基づいて対応点抽出処理を行う。この結果、所定のレベル以上一致している対応点の数が所定の量以上のときは、正しい合成位置として合成する。そうでなければ確実性が低いので、オート合成処理を打ち切り、セミオート合成処理に移行する。
【００１５】
（５）「セミオート合成」：上記（３）（４）で対応点抽出の確実性が低かったとき、または対応点抽出に要する時間を節約してより早い合成結果を得たい場合は、このセミオート処理を行う。ユーザはこれらの画像をｄｒａｇして、図５に示すようにオーバーラップのだいたいの位置を指定する。この位置情報に基づいてオート処理でのときよりずっと狭い範囲で対応点抽出処理を行う。この結果最も一致する位置を求め、合成処理を行う。
【００１６】
上記（４）（５）の場合、ユーザは対応点抽出のための操作を行うが、どちらも画像をｄｒａｇするだけの操作であり、これは最も単純でかつ共通の操作であるので、ユーザの負担は小さい。また、（５）のセミオート合成のとき、画像をｄｒａｇしてだいたいの位置に合わせるだけなので、従来例のポイントを明示的に指定する操作よりもずっと簡単で楽である。
【００１７】
図２は本発明が実施されうるプラットフォームであるパーソナルコンピュータシステムの構成例を示している。図２において、３０１はコンピュータシステム本体、３０２はデータを表示するディスプレイ、３０３は代表的なポインティングデバイスであるマウス、３０４はマウスボタン、３０５はキーボードである。３０７はコンピュータに接続可能な電子カメラであり、これは３０６で示す双方向パラレルインターフェースやＳＣＳＩインターフェース等の、高速で画像転送可能な汎用インターフェースによって接続されている。
【００１８】
図１はソフトウェアとハードウェアとを含むパノラマ画像合成システムの構成を示す図である。図１において、５０９はハードウェアであり、ＣＰＵ５１８を有している。５０５はハードウェア５０９の上で動作するオペレーティングシステム（ＯＳ）であり、５０４はＯＳ５０５の上で動作するアプリケーションソフトウェアである。なお、ハードウェア５０９とＯＳ５０５を構成するブロックのうち、構成要素として当然含まれるが本発明の実施の形態を説明する上で直接必要としないブロックに関しては図示していない。そのような図示していないブロックの例としてハードウェアとしてはメモリ、ＯＳとしてはメモリ管理システム等がある。
【００１９】
図１において５１５はファイルやデータを物理的に格納するハードディスク、５０８はＯＳ５０５を構成するファイルシステムであり、アプリケーションソフトウェア５０４がハードウェア５０９を意識せずにファイルの入出力が行えるようにする機能がある。５１４はファイルシステム５０８がハードディスク５１５の読み書きを行うためのディスクＩＯインターフェースである。５０７はＯＳ５０５を構成する描画管理システムであり、アプリケーションソフトウェア５０４がハードウェア５０９を意識せずに描画が行えるようにする機能がある。
【００２０】
５１３は描画管理システム５０７がディスプレー３０２に描画を行うためのビデオインターフェースである。５０６はＯＳ５０５を構成する入力デバイス管理システムであり、アプリケーションソフトウェア５０４がハードウェア５０９を意識せずユーザの入力を受け取ることができるようにする機能がある。５１０は入力デバイス管理システム５０６がキーボード３０５の入力を受け取るためのキーボードインターフェース、５１２は入力デバイス管理システム５０６がマウス３０３からの入力を受け取ることができるようにするためのマウスインターフェースである。
【００２１】
さらに、電子カメラ３０７は、双方向インターフェースもしくはＳＣＳＩインターフェース等５１６に接続され、入力デバイス管理システム５０６を通して、画像データ等のやりとりを行うことができる。５０１は画像データ管理システムであり、５０２は画像データを属性情報もしくはユーザの入力によるキーワード等で管理するためのデータ管理部である。５０３は管理されている画像データを、その属性情報もしくはユーザの入力によるキーワード等で検索し表示するデータ表示部である。５１７はパノラマ画像合成システムであり、画像データ管理システム５０１からパノラマ撮影モードで撮影した画像を受け取り、パノラマ画像合成処理を行う。そして合成した結果の画像を画像データ管理システム５０１へ登録する。
【００２２】
図６はカメラ内の内蔵メモリに格納される画像データおよび属性情報のデータ構造を示す。まずメモリ内には画像管理テーブル８１が置かれ、対応する画像データ８２と属性情報８３とが参照される。画像データ８２は、カメラ独自のフォーマットデータ（ネィティブデータ）か、ＪＰＥＧ等の汎用フォーマットデータのいずれかで格納されている。ネィティブデータは、例えば、ＣＣＤから出力を単にＡ／Ｄして得られたデータ等である。一般的にネィティブデータは記録に要する時間が短いが、データサイズが大きく、ＪＰＥＧデータは、記録に要する時間はかかるが、データサイズを小さくできるという点が異なる。ユーザは撮影状況に応じてこれらのどちらかを選択して格納することになる。
【００２３】
属性情報８３の中には、ファイル名８４、フィルタタイプ８５、撮影日時８６、撮影モード８７が記録されている。ファイル名８４は、カメラが自動的に付けるユニークなファイル名である。フィルタタイプ８５は、ネィティブデータフォーマットなのかＪＰＥＧフォーマットなのか、あるいはカメラがサポートする他の汎用フォーマットなのかを示す。撮影日時８６は、カメラは内部にカレンダーとタイマとを持っており、カメラのシャッターボタンを押された時点の日時と時間が記録される。撮影モード８７は、カメラの有する数種類の撮影モードのうち撮影時に選択されているモードを示す。これが“パノラマ撮影モード”の場合は、さらに識別子８８が付加される。この識別子はパノラマ撮影モードにセットしたときにセットされるユニークな番号であるモードＩＤ８９とそのモードでの何枚目かを示すデータ９０とが格納される。従ってパノラマ撮影モードにおいて同じモードＩＤ８９を持つ複数の画像が１セットであるということになる。この図６の例では、風景を左右２枚の画像として撮影しているので、８９ａと８９ｂとは同じモードＩＤである。
以上のようにしてカメラ内に画像データ８２および属性情報８３が格納される。
【００２４】
図７はカメラ内のデータをパーソナルコンピュータへコピーするときの画面を示す。図１のカメラ３０７を汎用インターフェース３０６を介してコンピュータに接続し、画像データ管理システム５０２を起動する。画像データ管理システム５０２は、カメラ内のデータを図７のカメラカタログ９１と名前の付けられたウインドウに表示する。ここで９４は画像データの縮小画像（サムネール画像）、９５は属性情報中のファイル名、フィルタタイプ等である。属性情報のうちどこまで表示するかはユーザ指定で変更できる。９２はパーソナルコンピュータのハードディスク中に存在するユーザの画像データベースの一部を表示しているユーザカタログである。
【００２５】
ユーザはカメラカタログ９１中から画像を選択して（９３は選択されたことを表示する枠）ユーザカタログ９２にＤｒａｇ＆Ｄｒｏｐの操作を行うとコピーが行われる。このとき、コピーなのか（カメラ内にデータは残る）、移動なのか（カメラ内のデータは消去される）はユーザの指定でどちらにでも切り替えられる。このコピー操作の最中に、
・ネィティブデータを所定の汎用フォーマットに変換する。
・パノラマ撮影モードで撮影された画像があれば、それらの合成を行う。
【００２６】
以上の操作を自動的にその必要性を検知して実行する。図８にその際のユーザカタログ９１上でのデータ構造を示し、図９にフローチャートを示す。
まず図８の説明をする。まず画像データ管理システム５０８では、内部に格納している画像データについて固有のＩＤ番号を付けて管理している。それがデータ管理テーブル１１０８である。これは、データＩＤ１１０９と、それにリンクされている画像データ、属性情報との対応がとられる。このデータＩＤが管理の基本となる。画像データ管理システム５０２では、ユーザが任意の個数だけユーザカタログ９２を持つことができる。この一個のカタログごとに、カタログテーブル１１００を持つ。またカタログ内の画像データをユーザが複数の画像をひとつのグループとしてカテゴリー分けをする機能を有する。これによりひとつのカタログ内を階層化してデータを管理できる。
【００２７】
カタログテーブル１１００内には、このカタログに属する画像のデータＩＤ１１０１と、属するグループのグループＩＤ１１０２が保持される。そしてグループＩＤ１１０２はグループテーブル１１０３とリンクする。グループ属性テーブル１１０４は基本的にカタログテーブルと同じであり、このグループに属する画像のデータＩＤ、またはこのグループに属するグループＩＤを持つ。違いは頭にグループ属性データ１１０４を持つところが異なる。これにはグループ名１１０６、作成日時１１０７、グループタイプ１１１０が格納される。グループ名１１０６はユーザが付けた任意の名前がつく。パノラマ画像のセットとしてグループが作られたときは、このグループ名はデフォルトで“パノラマ画像”と付けられる。
【００２８】
作成日時は、このグループが作成されたときの日時が格納されている。グループタイプ１１１０は、ユーザが作成した場合は“ユーザ作成”、パノラマ画像のセットとしてグループが作られたときは、“パノラマ画像”と入っている。パノラマ画像のときは、さらに識別子とリンクしモードＩＤ８９ａが納められる。そして実際の画像データ、属性情報は図６の場合と同じ構造で格納され、これらはデータ管理テーブル１１０８から参照される。
【００２９】
次に図９の説明を行う。ステップ１０００で全てのコピーする画像データの処理が終了しない間において、ステップ１００１で、コピー操作の中でまず一個の画像データとそれに付随した属性情報とを取得する。ステップ１００２で、属性情報内のフィルタタイプ８５から、この画像データがネィティブデータかどうかを判断する。そうであればステップ１００３でデフォルトとして決まっている汎用フォーマット（ＪＰＥＧやＴＩＦＦ等）にネィティブデータを変換する。変換が終わったらフィルタタイプ８５も更新する。次にステップ１００４で、撮影モード８７を調べてパノラマ撮影モードで撮影された画であるかをチェックする。パノラマ画像でない場合は、ステップ１００８で通常の画像データとして登録する。具体的には図８でのデータ管理テーブル１１０８に固有のデータＩＤを付けて登録し、そのデータＩＤをカタログテーブル１１００に登録する。
【００３０】
パノラマ画像であるときは、このパノラマ画像用のグループがすでに作成済みかどうかをチェックする。これは図８のカタログテーブル１１００をたどっていって、グループＩＤのモードＩＤ８９ａが、画像のモードＩＤ８９ａと同じかどうかを見ることによって行われる。対応するグループがないときはグループを作成する。これはカタログテーブル１１００に新たにグループＩＤ１１０２を登録し、グループ名１１０６、作成日時１１０７、グループタイプ１１１０を作成する。グループタイプ１１１０には“パノラマ撮影”と記録され、画像の属性情報中のモードＩＤ８９ａが納められる。そしてステップ１００７で、このパノラマ画像データに固有のデータＩＤを付けて管理テーブル１１０８に登録し、そのデータＩＤ１１０５を登録する。
【００３１】
以上の一連の処理をコピーする画像全てに対して行う。全ての画像に対して処理が終わったならば、ステップ１００９に移る。ステップ１００９では、今までコピーしたものの中でパノラマ画像のグループが作られたかをチェックし、そうであればグループ内の画像を用いてステップ１０１０で後述するパノラマ画像合成処理を行う。そうでなければ処理を終了する。
【００３２】
図１０に上記ステップ１０１０のパノラマ画像合成処理のフローチャートを示す。まずステップ１２００でグループ内の画像が２枚か２枚より多いかをチェックする。２枚のときはステップ１２０２で後述するフルオート処理に入る。２枚より多いときはステップ１２０１で後述するオート合成処理に入る。このステップ１２０１、１２０２の処理は、ステップ１２０３、１２０４でその結果が成功か失敗かをチェックする。この成功か失敗かの判断は、画像同士の対応点が所定数以上見つけられたかどうかで判断する。従って、合成処理全体の中では対応点抽出処理の早い段階で判断を下すので、結果が失敗でもユーザがその結果を得るまで待つ時間は短くて済む。そして成功であれば処理は終了で、失敗であれば、ステップ１２０５で後述するセミオート処理を行う。
【００３３】
図１１にオート合成処理のユーザインターフェースを示す。まず画面上に上記ステップ１００９におけるパノラマ画像のグループに属する画像全てがウインドウに入る大きさにリサイズされて表示される。これをユーザが見て正しい順番にＤｒａｇして並べ替える。この図１１の例では、３枚の画像が表示され、同図（ａ）の下に位置している画像が本当は一番右になるので、右側にＤｒａｇすると、その位置から横に３つ並ぶ同図（ｂ）のようなパノラマであることを検知する。そしてウインドウに入りきるように再度リサイズされて表示される。
【００３４】
図１２にオート合成処理のフローチャートを示す。まずステップ１３０１で、ユーザが並べ替えた位置関係の情報を取得する。ステップ１３０２で一致する対応点を見つけるのにサーチする範囲を設定する。これを図１３で説明する。パノラマ画像として撮影するときのルールとして、最小１０％、最大５０％オーバーラップさせることと、それに直交する上下方向のずれをそれぞれ５％以下と決めると、必ずオーバーラップする範囲は図１３（ａ）の１５０４で示す範囲となる。またオーバーラップしている可能性のある範囲は図１３（ｂ）の１５０５で示す範囲となる。上記範囲１５０４の中にあるポイント１５０３は上記のルールに従うと、サーチ範囲中の１５０５の中に対応する点があることになる。後述する対応点抽出処理では、このサーチ範囲１５０５に対してマッチングするかどうかを見ていくことになる。
【００３５】
再び図１２のフローチャートに戻ると、ステップ１３０２では、以上のサーチ範囲１５０５設定に用いるパラメータをセットする。ステップ１３０３では、対応点を抽出する処理を行う。この詳細については後述する。ステップ１３０４では、求まった対応点の数が所定値以上かどうかを判断し、所定値以下のときは十分対応点を自動で見つけることができなかったので、セミオート処理へ進む。所定値より多かったときは、ステップ１３０５の合成パラメータ処理へ進む。ここでは合成の際に用いる移動、拡大（縮小）、回転のパラメータを先の対応点の座標から求める。その詳細は後述する。ステップ１３０６ではこれらパラメータをもとに画像を合成する。その詳細も後述する。
【００３６】
図１４にフルオート処理のフローチャートを示す。まず、ステップ１６０１で対応点抽出のためのサーチ範囲であるマッチング範囲設定を行うが、これは上記ステップ１３０２と処理は同一である。次に４回対応点抽出処理を行う。フルオートの場合、枚数は２枚に限定しているので、考えられる位置関係は画像１と画像２が上下、下上、左右、右左の４通りである。そこでこの４つの場合について対応点抽出処理をして、それぞれ対応点として抽出できた数、対応点としての一致レベルの平均を保持する。これらの処理が、ステップ１６０２から１６０９までの処理である。そしてステップ１６１０で上記４つの場合について対応点が所定数以上の対応点を抽出できたものがあるかをチェックする。もし一つもなければセミオート処理へ進む。
【００３７】
抽出された対応点の数が所定数以上であれば、ステップ１６１１でその中で対応点としての一致レベルの平均が最も高いものを真の位置関係であるとする。通常の画像では、対応点が所定値以上の場合は４つのうちの一つだけになるはずだが、例えば原稿等を分割して撮った場合似たような字が並んでいて、正しくない位置関係のときでも所定値以上を対応点として抽出してしまう場合があり得る。そこでこのステップ１６１１では最もフィットしているもの（平均一致レベルが最も高いもの）を選択するようにする。ステップ１６１２の合成パラメータ処理、ステップ１６１３の画像合成処理は上記ステップ１３０５、１３０６と同一処理である。
【００３８】
図１５にセミオート処理でのユーザインターフェースを示す。まず画面上に前記ステップ１００９でのパノラマ画像のグループに属する画像全てがウインドウに入る大きさにリサイズされて表示される。これをユーザが見てだいたいのオーバーラップ位置を合わせて重ね合わせる。重ね合うところは、画素単位でビットごとにＡＮＤ演算をして表示する。従って重なった部分は両方の画像が透けて見える。そしてウィンドウに入りきるように再度リサイズされて表示される。
【００３９】
以上の操作は、基本的にオート処理での操作と同一であり、ユーザの負担は少ない。違いはマウス等のポインティングデバイスで画像をＤｒａｇして離した位置を、位置関係の情報だけを使って並べ替えて表示する（オート処理）ことと位置の情報をそのまま使ってオーバーラップして表示する（セミオート処理）ことだけである。またＤｒａｇ中にも先のＡＮＤ演算で透けて見えるため、だいたいの位置を容易に合わすことができる。
【００４０】
図１６にセミオート処理でのフローチャートを示す。ここでの処理はオート処理のときとほとんど同一である。ステップ１７０１で、画像のユーザが合わせたオーバーラップ位置情報を取得する。ステップ１７０２でマッチング範囲を設定するが、ここでの範囲は、所定の範囲（想定されるユーザの合わせた位置の誤差範囲＋マージン）となる。従って、オート処理の時の範囲よりはずっと狭い範囲となり、計算時間の短縮と精度の向上が図られる。ステップ１７０３、１７０４、１７０５の対応点抽出処理、合成パラメータ設定処理、画像合成処理は、いずれもオート処理のときと同一である。
【００４１】
図１７に対応点抽出処理フローチャート、図１８にそれを図示したものを示す。図１８は、左と右の画像２枚の例を示す。画像の枚数が２枚より大きいときは、２枚の合成を何回か繰り返せばよいので処理としては基本的に同じである。まず、撮影時のルールにのっとり、テンプレートを設定する範囲２００５は、縦９０％横１０％の範囲に設定する。また、サーチする範囲は、対応する点が存在する可能性の範囲ということで、縦１００％、横５０％の範囲２００６に設定される。画像中の範囲２００５からエッジが所定値以上強い点を探し、そこを中心として縦、横ｎ画素の矩形をテンプレート画像として切り出す。このテンプレート画像２００３をサーチ範囲２００６上に置いて画素単位でその差分をとる。この合計が最小となるところをサーチ範囲２００６上を１画素ずつずらして求める。サーチ範囲２００６上を全てサーチした結果の最小値が、所定値以下であれば、そのポイント同士（ｘ，ｙ）と（ｘ′，ｙ′）を対応点のペアとして保持する。
【００４２】
以上が処理の概要となるが、これを図１７のフローチャートに沿ってもう一度説明する。まずステップ１９０１でエッジ抽出画像を作成する。そしてステップ１９０２で、このエッジ抽出画像の中のテンプレートを設定する範囲２００５からエッジが所定値以上強いポイントを探す。そしてそのポイントがあれば、ステップ１９０３でそのポイントから縦横±ｎ画素ずつの矩形で画像を切り出しテンプレート画像とする。そのポイントの位置からステップ１９０４で右画像中のサーチ範囲を設定する。そしてステップ１９０５でサーチ範囲中の画像とテンプレート画像とを重ね合わせ、画素単位で画素値の差の絶対値をとりその合計を求める。その差分の合計値が、それまでの最小値かどうかをステップ１９０６でチェックし、そうであれば、ステップ１９０７で、そのサーチ範囲中のポイントの座標とその最小値とを保持する。
【００４３】
以上をサーチ範囲全てにおいて繰り返し、最も一致する（最小の差分を持つ）点を見つけだす。ステップ１９０８でサーチ範囲全てにおいて行ったかチェックし、ステップ１９０９で、その結果求められた最小値が十分小さな値であるか（確かな対応点か）を、所定値Ｌと比較して判断する。所定値Ｌより小さかった場合は、ステップ１９１０で、対応点のリストにテンプレート画像を切り出したポイントの座標（ｘ，ｙ）と、最小値が求められたポイントの座標（ｘ′，ｙ′）と、その最小値の値とを登録する。以上をステップ１９１１でテンプレート設定範囲のエッジが所定値以上強いポイント全部に対して行い、終了したら対応点のリスト中の全ての最小値からその平均値を求め、これをステップ１９１２で一致レベル値として保持する。以上で対応点抽出処理を終了する。
【００４４】
次に、合成パラメータ設定処理について説明する。画像を２枚としたときに（２枚以上の合成の場合も２枚の合成の繰り返しなので、まずは２枚で考えてよい）、そのずれは、ｘ、ｙ方向の並進、回転、および拡大率の差で表すことができる。よって対応する点（ｘ，ｙ）、（ｘ′，ｙ′）は以下のように表せる。
【００４５】
【数１】

【００４６】
ここで、θは回転角、ΔｘおよびΔｙは並進、ｍは倍率を示す。従ってパラメータＡ、Ｂ、Ｃ、Ｄを求めることによりこの座標変換を表すことができる。先の対応点抽出処理では、対応点（ｘ，ｙ）、（ｘ′，ｙ′）の複数の組を取得した。これらから最小自乗法を用いてパラメータＡ、Ｂ、Ｃ、Ｄを求める。
【００４７】
【数２】

【００４８】
の条件で、
【００４９】
【数３】

【００５０】
を満たすパラメータＡ、Ｂ、Ｃ、Ｄを求める。
ここで、
【００５１】
【数４】

【００５２】
とすると、パラメータＡ、Ｂ、Ｃ、Ｄは次のように表すことができる。
【００５３】
【数５】

【００５４】
このパラメータｐ₁からｐ₈を求め、上式に代入することにより、パラメータＡ、Ｂ、Ｃ、Ｄを算出する。
【００５５】
最後に、画像合成処理について説明する。すでに、パラメータＡ、Ｂ、Ｃ、Ｄは求められているので、次の式
ｘ′＝Ａｘ＋Ｂｙ＋Ｃ
ｙ′＝−Ｂｘ＋Ａｙ＋Ｄ
に代入すればよい。
【００５６】
図１９にこれを図示したものを示す。画像が左、右画像の場合、左画像の２倍の大きさを合成画像２１０３として確保する。ここに、まず左画像をそのままコピーしてくる。次に合成画像の残りの領域（ｘ，ｙ）について、上式から対応する（ｘ′，ｙ′）を求める。そして右画像の（ｘ′，ｙ′）の画素を（ｘ，ｙ）にコピーする。これを合成画像の残りの領域全てに対して行う。
【００５７】
図２０に同様の内容をフローチャートに示す。ステップＳ２２０１で、第１の画像（図１９での左画像）の２倍の領域を合成画像領域として確保する。ステップＳ２２０２で、第１の画像をこの合成画像領域に単純にコピーする。ステップＳ２２０３で、合成画像の残りの領域（ｘ，ｙ）について、上式から対応する（ｘ′，ｙ′）を求める。ステップＳ２２０４で、（ｘ′，ｙ′）は第２の画像（図１９での右画像）内にあるかどうかをチェックし、あればステップＳ２２０５で（ｘ′，ｙ′）の画素を（ｘ，ｙ）にコピーする。以上を合成画像の残りの領域全てに対して繰り返し、処理は終了する。
以上によって、最終的なパノラマ合成画像を作成することができる。
【００５８】
本実施の形態によれば、まず、自動的にパノラマ画像として合成する画像のセットを抽出してくれる。これによりユーザが合成したい画像のセットを指示する必要がない。また、合成する画像のセットが２枚のときはフルオートで、２枚より多いときもユーザは位置関係を指示するだけ（オート）で合成処理を行うことができ、ユーザの負担は非常に軽いものとなっている。さらにフルオート、オートのときに十分な対応点を検出できなかったときには、セミオート処理に切り替え、この場合もユーザは画像をドラッグして重ね合わせた簡単な位置を指定するだけである。このセミオート処理に切り替わる場合は少ないし、またそうなっても非常に簡単な操作で行えるので、これもユーザの負担は軽いと言える。
【００５９】
【発明の効果】
以上のように本発明による画像形成装置によれば、ユーザは、第１の表示領域に表示されている画像をＤｒａｇ＆Ｄｒｏｐ等により第２の表示領域に移動させて合成したい複数の画像を表示することによって、合成する画像を確認しながら容易な操作で合成画像を得ることができる。
【図面の簡単な説明】
【図１】本発明の実施の形態を示すブロック図である。
【図２】機器の構成例を示す構成図である。
【図３】フルオート処理のときに想定される２枚の画像の組み合わせを示す構成図である。
【図４】オート処理における相対的位置を指示するユーザインターフェースを示す構成図である。
【図５】セミオート処理におけるだいたいのオーバーラップ位置を指示するユーザインターフェースを示す構成図である。
【図６】カメラ内に記録された画像データの構成図である。
【図７】カメラ内の画像データをコンピュータ内へコピーする操作を示す構成図である。
【図８】コンピュータ内で管理するデータの構成図である。
【図９】カメラ内の画像データをコンピュータ内へコピーする際に行う処理のフローチャートである。
【図１０】パノラマ合成のフルオート、オート、セミオートの流れを示すフローチャートである。
【図１１】オート処理のユーザインターフェースを示す構成図である。
【図１２】オート処理のフローチャートである。
【図１３】合成時のマッチング範囲を示す構成図である。
【図１４】フルオート処理のフローチャートである。
【図１５】セミオート処理のユーザインターフェースを示す構成図である。
【図１６】セミオート処理のフローチャートである。
【図１７】対応点抽出処理のフローチャートである。
【図１８】対応点抽出装置でのテンプレート画像とマッチング範囲を示す構成図である。
【図１９】合成処理を示す構成図である。
【図２０】合成処理のフローチャートである。
【図２１】従来の対応点指示を示す構成図である。
【符号の説明】
３０１コンピュータシステム本体
３０２ディスプレイ
３０３マウス
３０５キーボード
３０７電子カメラ
５０４アプリケーションソフトウェア
５０５オペレーティングシステム
５１８ＣＰＵ
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image synthesizing apparatus suitable for use in, for example, a panorama synthesis system that combines, on a computer, a plurality of images that partially overlap one another.
[0002]
[Prior art]
The process of combining, on a computer, a plurality of images whose edges partially overlap is generally called panoramic synthesis. It arises from the desire to photograph a wide subject and obtain a single image. Electronic cameras have also been criticized for their low resolution (small number of pixels) compared with silver halide cameras and scanners. For images taken with such an electronic camera, panoramic image synthesis is therefore important not only as a way to capture a wide image but also as a means of obtaining a high-resolution image. Specifically, it is effective when a single paper document, a magazine page, or the like is photographed in several parts to obtain high-resolution data comparable to that of a scanner, or when a landscape is divided into several parts and photographed at wide angle and high resolution.
[0003]
In the panorama synthesis, the most important and difficult process is a process of finding an overlap position of a plurality of images. In other words, this is a process of finding the same point (hereinafter referred to as a corresponding point) from the two images (hereinafter referred to as a corresponding point extraction process). The difficulty (error rate) of the corresponding point extraction process varies depending on each image. If there are many unique feature shapes not found elsewhere in the overlapping area, it is possible to search for the corresponding points without making a mistake. However, if a similar pattern is present in other places in the overlap (for example, characters in the document), the corresponding points may be mistaken.
[0004]
Conventionally, therefore, the user is generally asked to specify corresponding points explicitly, and synthesis is performed after fine adjustment based on those positions. FIG. 21 shows a conventional example. First, when the user designates a plurality of images to be combined, a window such as that shown in FIG. 21 opens. The user designates corresponding points in the two images and places marks 21a, 21b, 22a, and 22b on them. The processing then examines the pattern in the immediate vicinity of the center of each mark, determines the positional relationship at which the two patterns match best, and takes that as the corresponding point. Parameters for synthesis are obtained from these corresponding points, and the synthesis process is performed.
[0005]
[Problems to be solved by the invention]
However, the conventional example has the following problems. First, the user must indicate the set of images to be combined. This is not a problem when the total number of images managed on the computer is small, but when several thousand or more images are managed, images can be overlooked. In addition, the user must designate the corresponding points almost exactly, which becomes a burden as the number of operations increases.
[0006]
The present invention has been made in view of the above circumstances, and its object is to provide an image synthesizing apparatus with which panoramic image synthesis can be performed more easily.
[0007]
[Means for Solving the Problems]
The image synthesizing apparatus according to claim 1 comprises: display means for displaying, on the same screen, a first display area capable of displaying a plurality of images and a second display area capable of displaying a plurality of images and different from the first display area; moving means for moving, in accordance with a user's instruction, a plurality of images selected from the images displayed in the first display area to the second display area; detecting means for detecting an overlap portion of the plurality of images moved from the first display area to the second display area; corresponding point extracting means for extracting corresponding points in the overlap portion detected by the detecting means; and synthesis processing means for creating a single image by joining the plurality of images based on the corresponding points extracted by the corresponding point extracting means, wherein the detecting means automatically detects the overlap portion of the images when the number of images moved by the moving means is less than a predetermined number.
[0008]
The image synthesizing apparatus according to claim 2 comprises: display means for displaying, on the same screen, a first display area capable of displaying a plurality of images and a second display area capable of displaying a plurality of images and different from the first display area; moving means for moving, in accordance with a user's instruction, a plurality of images selected from the images displayed in the first display area to the second display area; detecting means for detecting an overlap portion of the plurality of images moved from the first display area to the second display area; corresponding point extracting means for extracting corresponding points in the overlap portion detected by the detecting means; and synthesis processing means for creating a single image by joining the plurality of images based on the corresponding points extracted by the corresponding point extracting means, wherein, when a predetermined number or more of images are input, the moving means moves these images on the second display area to designate the relative position of each image, and the detecting means detects the overlap portion based on the designated relative positions.
[0009]
[Action]
According to the image synthesizing apparatus of claim 1, a first display area capable of displaying a plurality of images and a second display area capable of displaying a plurality of images and different from the first display area are displayed on the same screen; a plurality of images selected, in accordance with a user's instruction, from the images displayed in the first display area are moved to the second display area; when the number of images moved from the first display area to the second display area is less than a predetermined number, the overlap portion is detected automatically; corresponding points in the detected overlap portion are extracted; and the plurality of images are joined based on the extracted corresponding points to create a single image.
[0010]
According to the image synthesizing apparatus of claim 2, a first display area capable of displaying a plurality of images and a second display area capable of displaying a plurality of images and different from the first display area are displayed on the same screen; a plurality of images selected, in accordance with a user's instruction, from the images displayed in the first display area are moved to the second display area; when the number of images moved from the first display area to the second display area is a predetermined number or more, these images are moved on the second display area to designate the relative position of each image; the overlap portion is detected based on these relative positions; corresponding points in the detected overlap portion are extracted; and the plurality of images are joined based on the extracted corresponding points to create a single image.
[0011]
DETAILED DESCRIPTION OF THE INVENTION
The present embodiment performs simple and convenient panoramic image synthesis by the following procedure.
(1) First, when shooting an image with an electronic camera, the user sets the camera in the “panoramic image shooting mode” before shooting. In this mode, an identifier indicating a set of panoramic images is automatically recorded in the attribute information of the captured image.
(2) When the camera is connected to a computer and the images and attribute information in the camera's memory are copied to the computer's hard disk, application software checks this attribute information. Images whose attribute information contains a panoramic image shooting mode identifier are automatically extracted as one set.
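The extraction in step (2) can be illustrated with a minimal sketch. It assumes each copied image is represented as a dict; the key names `file_name` and `panorama_set_id` and the function name are hypothetical and do not come from the source.

```python
from collections import defaultdict


def extract_panorama_sets(images):
    """Group file names by panorama set identifier.

    `images` is a list of dicts; the keys "file_name" and "panorama_set_id"
    are hypothetical names chosen for this sketch.
    """
    sets = defaultdict(list)
    for image in images:
        set_id = image.get("panorama_set_id")
        if set_id is not None:  # images shot outside panorama mode carry no identifier
            sets[set_id].append(image["file_name"])
    return dict(sets)


copied = [
    {"file_name": "IMG0001", "panorama_set_id": 7},
    {"file_name": "IMG0002", "panorama_set_id": None},
    {"file_name": "IMG0003", "panorama_set_id": 7},
]
panorama_sets = extract_panorama_sets(copied)
```

Here `panorama_sets` maps identifier 7 to the two images shot in the same panorama session, while the ordinary image is left out.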
[0012]
Next, panoramic image synthesis processing begins. Only when there are two images is full-auto synthesis (described later) performed, which runs completely automatically. When there are three or more images, auto synthesis (described later) is performed, in which the user specifies only the relative positions of the images (above, below, left, or right). If sufficient corresponding points cannot be obtained at the check stage of full-auto or auto synthesis, or if the user wants to skip the time needed to detect corresponding points and finish synthesis sooner, semi-auto synthesis (described later) is performed, in which the user specifies the approximate overlap position.
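The choice among the three modes can be summarized in a short sketch. The enum, the function names, and the `user_wants_fast` flag are assumptions made for illustration; the text only states the conditions under which each mode is used.

```python
from enum import Enum


class Mode(Enum):
    FULL_AUTO = "full-auto"
    AUTO = "auto"
    SEMI_AUTO = "semi-auto"


def choose_initial_mode(image_count, user_wants_fast=False):
    """Pick the first mode to try for a panorama set of `image_count` images."""
    if image_count < 2:
        raise ValueError("a panorama set needs at least two images")
    if user_wants_fast:
        # The user skips automatic overlap detection to get a result sooner.
        return Mode.SEMI_AUTO
    return Mode.FULL_AUTO if image_count == 2 else Mode.AUTO


def next_mode_on_failure(mode):
    """Full-auto and auto fall back to semi-auto; semi-auto has no fallback."""
    if mode in (Mode.FULL_AUTO, Mode.AUTO):
        return Mode.SEMI_AUTO
    return None
```

For example, `choose_initial_mode(2)` selects full-auto, and if too few corresponding points are found, `next_mode_on_failure` leads to semi-auto.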
[0013]
(3) “Full-auto synthesis”: When one set of panoramic images is extracted in (2) above and it contains two images, full-auto synthesis is started. As shown in FIGS. 3A to 3D, there are four possible arrangements of the two images: above, below, left, and right. Corresponding points are therefore sought for each of these four cases, and the arrangement that yields the most corresponding points is taken as the correct position and used for synthesis. If the number of corresponding points is equal to or less than a predetermined amount in all four cases, the certainty is low; full-auto synthesis is then aborted and processing moves to semi-auto synthesis. In this mode the only user operation is copying the images from the camera to the computer; everything else is done automatically. Because most synthesis, apart from special uses, is expected to involve two images, this is the process performed most often.
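As a rough illustration of this selection rule, the sketch below takes the number of corresponding points found for each of the four arrangements and returns the best one, or `None` when every count is at or below the threshold. The arrangement labels, the function name, and the threshold value are hypothetical; the corresponding point extraction itself is not shown.

```python
def pick_arrangement(point_counts, min_points):
    """Return the arrangement with the most corresponding points.

    `point_counts` maps an arrangement label to the number of corresponding
    points found for it. Returns None when even the best count is at or
    below `min_points`, which signals a fallback to semi-auto synthesis.
    """
    best = max(point_counts, key=point_counts.get)
    if point_counts[best] <= min_points:
        return None
    return best


counts = {"above": 3, "below": 1, "left": 2, "right": 25}
chosen = pick_arrangement(counts, min_points=10)
```

With these sample counts, `chosen` is `"right"`; if all four counts were 10 or fewer, the function would return `None` and semi-auto synthesis would take over.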
[0014]
(4) “Auto synthesis”: When one set of panoramic images is extracted in (2) above and it contains three or more images, auto synthesis is started. In auto synthesis, the set of images is displayed in a window. FIG. 4 shows a case where four images are displayed. The user drags these images to try out arrangements and indicates only their relative positions (above, below, left, or right). Corresponding point extraction is performed on this basis. If the number of corresponding points that match at or above a predetermined level is at or above a predetermined amount, the arrangement is taken as the correct position and the images are combined. Otherwise the certainty is low, so auto synthesis is aborted and processing moves to semi-auto synthesis.
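The text does not say how a drop position is turned into an above, below, left, or right relationship. One plausible approach, shown purely as an assumption, is to compare the centers of two thumbnails and use the dominant axis of their offset. Screen coordinates with y increasing downward are assumed, and the function name is hypothetical.

```python
def relative_position(anchor_center, dropped_center):
    """Classify where `dropped_center` lies relative to `anchor_center`.

    Both arguments are (x, y) pairs in screen coordinates, with y growing
    downward. Only the direction is kept; the exact offset is discarded,
    matching the idea that auto synthesis uses the relationship alone.
    """
    dx = dropped_center[0] - anchor_center[0]
    dy = dropped_center[1] - anchor_center[1]
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "below" if dy >= 0 else "above"
```

For instance, a thumbnail dropped at (50, 10) relative to one centered at (0, 0) would be classified as lying to the right.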
[0015]
(5) “Semi-auto synthesis”: This process is performed when the certainty of corresponding point extraction was low in (3) or (4) above, or when the user wants a faster result by saving the time required for corresponding point extraction. The user drags the images and specifies the approximate position of the overlap, as shown in FIG. 5. Based on this position information, corresponding point extraction is performed over a much narrower range than in auto synthesis. The best-matching position is then determined and synthesis is performed.
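The narrower range can be pictured as a small window around the position the user chose. The description of the semi-auto flowchart characterizes the range as the expected error of the user's placement plus a margin; the sketch below assumes a square window clipped to the image, and all parameter names and numeric values are hypothetical.

```python
def semi_auto_search_window(user_position, expected_error, margin, image_size):
    """Return (left, top, right, bottom) of the search window in pixels.

    The window extends `expected_error + margin` pixels in every direction
    from the user-specified position and is clipped to the image bounds.
    """
    radius = expected_error + margin
    x, y = user_position
    width, height = image_size
    left = max(0, x - radius)
    top = max(0, y - radius)
    right = min(width, x + radius)
    bottom = min(height, y + radius)
    return left, top, right, bottom


window = semi_auto_search_window((100, 20), expected_error=8, margin=4, image_size=(640, 480))
```

Because the window covers only a few dozen pixels rather than up to half the image, matching is both faster and less likely to lock onto a similar pattern elsewhere.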
[0016]
In cases (4) and (5) above, the user performs an operation to support corresponding point extraction, but in both cases this is only a matter of dragging images, which is the simplest and most common operation, so the burden on the user is small. Moreover, in the semi-auto synthesis of (5), the user only drags an image to an approximate position, which is much simpler and easier than explicitly specifying points as in the conventional example.
[0017]
FIG. 2 shows a configuration example of a personal computer system which is a platform on which the present invention can be implemented. In FIG. 2, 301 is a computer system main body, 302 is a display for displaying data, 303 is a mouse as a typical pointing device, 304 is a mouse button, and 305 is a keyboard. Reference numeral 307 denotes an electronic camera that can be connected to a computer, which is connected by a general-purpose interface capable of transferring images at high speed, such as a bidirectional parallel interface or a SCSI interface indicated by 306.
[0018]
FIG. 1 is a diagram showing the configuration of a panoramic image synthesis system including software and hardware. In FIG. 1, reference numeral 509 denotes hardware, which has a CPU 518. Reference numeral 505 denotes an operating system (OS) that runs on the hardware 509, and reference numeral 504 denotes application software that runs on the OS 505. Blocks of the hardware 509 and the OS 505 that are naturally included as constituent elements but are not directly needed to describe the embodiment are not shown. Examples of such blocks are memory in the hardware and a memory management system in the OS.
[0019]
In FIG. 1, reference numeral 515 denotes a hard disk that physically stores files and data, and 508 denotes a file system constituting part of the OS 505, which allows the application software 504 to input and output files without being aware of the hardware 509. Reference numeral 514 denotes a disk I/O interface through which the file system 508 reads from and writes to the hard disk 515. Reference numeral 507 denotes a drawing management system constituting part of the OS 505, which allows the application software 504 to draw without being aware of the hardware 509.
[0020]
Reference numeral 513 denotes a video interface for the drawing management system 507 to draw on the display 302. Reference numeral 506 denotes an input device management system constituting the OS 505, which has a function that allows the application software 504 to receive user input without being aware of the hardware 509. Reference numeral 510 denotes a keyboard interface for the input device management system 506 to receive input from the keyboard 305. Reference numeral 512 denotes a mouse interface for the input device management system 506 to receive input from the mouse 303.
[0021]
Further, the electronic camera 307 is connected to a bi-directional interface or a SCSI interface 516, and can exchange image data and the like through the input device management system 506. Reference numeral 501 denotes an image data management system, and reference numeral 502 denotes a data management unit for managing image data with attribute information or keywords input by the user. Reference numeral 503 denotes a data display unit that searches and displays the managed image data using the attribute information or keywords input by the user. Reference numeral 517 denotes a panoramic image synthesis system, which receives an image shot in the panoramic shooting mode from the image data management system 501 and performs panoramic image synthesis processing. The synthesized image is registered in the image data management system 501.
[0022]
FIG. 6 shows the data structure of the image data and attribute information stored in the camera's built-in memory. An image management table 81 is placed in the memory, and the corresponding image data 82 and attribute information 83 are referenced from it. The image data 82 is stored either in the camera's own format (native data) or in a general-purpose format such as JPEG. Native data is, for example, data obtained simply by A/D-converting the output of the CCD. In general, native data takes little time to record but is large, whereas JPEG data takes longer to record but can be made smaller. The user chooses between the two according to the shooting situation.
[0023]
The attribute information 83 records a file name 84, a filter type 85, a shooting date and time 86, and a shooting mode 87. The file name 84 is a unique file name assigned automatically by the camera. The filter type 85 indicates whether the data is in the native format, the JPEG format, or another general-purpose format supported by the camera. The shooting date and time 86 is the date and time at which the shutter button was pressed, taken from the calendar and timer inside the camera. The shooting mode 87 indicates which of the camera's several shooting modes was selected at the time of shooting. If this is the “panoramic shooting mode”, an identifier 88 is added. The identifier stores a mode ID 89, a unique number assigned when the camera is set to the panorama shooting mode, and data 90 indicating the image's position in the sequence of shots taken in that mode. A plurality of images that have the same mode ID 89 in the panorama shooting mode therefore form one set. In the example of FIG. 6, a landscape is photographed as two images, left and right, so 89a and 89b are the same mode ID.
As described above, the image data 82 and the attribute information 83 are stored in the camera.
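The attribute record of FIG. 6 can be pictured with a minimal sketch such as the following. It is only an illustration under stated assumptions: the class names, field names, and the helper group_by_mode_id are hypothetical, and the reference numerals in the comments only indicate which item of the description each field is meant to mirror.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class PanoramaIdentifier:
    """Hypothetical stand-in for identifier 88."""
    mode_id: int      # mode ID 89, shared by all shots of one panorama set
    image_count: int  # data 90, number of pictures taken in that mode


@dataclass
class AttributeInfo:
    """Hypothetical stand-in for attribute information 83."""
    file_name: str      # file name 84
    filter_type: str    # filter type 85, e.g. "native" or "jpeg"
    shot_at: str        # shooting date and time 86
    shooting_mode: str  # shooting mode 87
    panorama: Optional[PanoramaIdentifier] = None  # present only for panorama shots


def group_by_mode_id(records):
    """Collect the file names of panorama shots that share a mode ID."""
    sets = defaultdict(list)
    for rec in records:
        if rec.panorama is not None:
            sets[rec.panorama.mode_id].append(rec.file_name)
    return dict(sets)


records = [
    AttributeInfo("IMG0001", "native", "1996-05-01T10:00", "panorama", PanoramaIdentifier(7, 2)),
    AttributeInfo("IMG0002", "native", "1996-05-01T10:01", "panorama", PanoramaIdentifier(7, 2)),
    AttributeInfo("IMG0003", "jpeg", "1996-05-01T11:00", "normal"),
]
panorama_sets = group_by_mode_id(records)
```

Here the two shots with mode ID 7 end up in one set, which corresponds to the left and right images of the FIG. 6 example.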
[0024]
FIG. 7 shows the screen for copying data in the camera to a personal computer. The camera 307 in FIG. 1 is connected to the computer via the general-purpose interface 306, and the image data management system 502 is activated. The image data management system 502 displays the data in the camera in a window named camera catalog 91 in FIG. 7. Here, 94 is a reduced image (thumbnail image) of the image data, and 95 shows the file name, filter type, and other items from the attribute information. How much of the attribute information is displayed can be changed by the user. A user catalog 92 displays part of the user's image database on the hard disk of the personal computer.
[0025]
When the user selects an image from the camera catalog 91 (93 is a frame indicating that the image has been selected) and performs a Drag & Drop operation onto the user catalog 92, the image is copied. The user can choose whether this is a copy (the data remains in the camera) or a move (the data in the camera is deleted). During this copy operation,
・ Native data is converted into a predetermined general-purpose format.
・ If there are images shot in the panoramic shooting mode, they are combined.
[0026]
The need for the above operations is detected automatically, and they are executed. FIG. 8 shows the data structure of the user catalog 92 at that time, and FIG. 9 shows a flowchart.
First, FIG. 8 will be described. The image data management system 508 manages each piece of image data stored in it under a unique ID number. This is done with the data management table 1108, which associates each data ID 1109 with the image data and attribute information linked to it. This data ID is the basis of management. In the image data management system 502, a user can have an arbitrary number of user catalogs 92. Each catalog has a catalog table 1100. Within a catalog, the user can categorize a plurality of images as one group, so data can be managed hierarchically inside one catalog.
[0027]
The catalog table 1100 holds the data ID 1101 of each image belonging to the catalog and the group ID 1102 of each group belonging to it. The group ID 1102 is linked to the group table 1103. The group table 1103 is basically the same as the catalog table, holding the data IDs of the images and the group IDs of the groups that belong to this group. The difference is that it has group attribute data 1104 at its head, which stores a group name 1106, a creation date 1107, and a group type 1110. The group name 1106 is an arbitrary name given by the user. When a group is created as a set of panoramic images, the group name defaults to “panoramic image”.
[0028]
The creation date 1107 stores the date and time when the group was created. The group type 1110 contains “user created” when the group was created by the user and “panoramic image” when it was created as a set of panoramic images. In the case of a panoramic image, the mode ID 89a is also stored, linked as an identifier. The actual image data and attribute information are stored in the same structure as in FIG. 6 and are referenced from the data management table 1108.
[0029]
Next, FIG. 9 will be described. Step 1000 repeats the following until all image data to be copied has been processed. In step 1001, one piece of image data and its associated attribute information are acquired. In step 1002, the filter type 85 in the attribute information is examined to determine whether the image data is native data. If so, in step 1003 the native data is converted into the general-purpose format (JPEG, TIFF, etc.) set as the default, and once the conversion is complete the filter type 85 is updated accordingly. In step 1004, the shooting mode 87 is examined to determine whether the image was shot in the panoramic shooting mode. If it is not a panoramic image, it is registered as normal image data in step 1008. Specifically, it is registered in the data management table 1108 of FIG. 8 with a unique data ID, and that data ID is registered in the catalog table 1100.
[0030]
If it is a panoramic image, it is checked whether a group for this panoramic image has already been created. This is done by tracing the catalog table 1100 of FIG. 8 and checking whether any group has the same mode ID as the mode ID 89a of the image. If there is no such group, one is created: a group ID 1102 is newly registered in the catalog table 1100, and a group name 1106, a creation date 1107, and a group type 1110 are created. “Panorama shooting” is recorded in the group type 1110, and the mode ID 89a from the attribute information of the image is stored. In step 1007, the panoramic image data is registered in the data management table 1108 with a unique data ID, and that data ID is registered as the data ID 1105.
[0031]
The above series of processing is performed for all images to be copied. If the processing is completed for all the images, the process proceeds to step 1009. In step 1009, it is checked whether or not a group of panoramic images has been created among those copied so far. If so, panoramic image synthesis processing described later is performed in step 1010 using the images in the group. Otherwise, the process is terminated.
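The copy flow of FIG. 9 can be sketched roughly as below. This is a simplified illustration, not the patented implementation: the function name copy_images, the dictionary keys, and the use of plain dictionaries and lists in place of the data management table 1108, catalog table 1100, and group table are all assumptions, and the actual format conversion is omitted. Only step numbers that the description names are cited in the comments.

```python
def copy_images(images, default_format="JPEG"):
    """Register copied images and collect panorama shots into groups.

    images: list of dicts with the hypothetical keys
            'name', 'filter_type', 'shooting_mode', 'mode_id'.
    Returns (data_table, catalog, groups).
    """
    data_table = {}  # data ID -> record (stands in for data management table 1108)
    catalog = []     # data IDs of normal images (stands in for catalog table 1100)
    groups = {}      # mode ID -> list of data IDs (one panorama group per mode ID)
    next_id = 1
    for img in images:                           # steps 1000 and 1001
        rec = dict(img)
        if rec["filter_type"] == "native":       # step 1002
            rec["filter_type"] = default_format  # step 1003 (conversion itself omitted)
        data_id = next_id
        next_id += 1
        data_table[data_id] = rec
        if rec["shooting_mode"] == "panorama":   # step 1004
            # Create the group on first sight of this mode ID, then register (step 1007).
            groups.setdefault(rec["mode_id"], []).append(data_id)
        else:
            catalog.append(data_id)              # step 1008
    return data_table, catalog, groups
```

After the loop, any non-empty entry in groups corresponds to the check in step 1009, and each such group would be handed to the panoramic image synthesis process of step 1010.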
[0032]
FIG. 10 shows a flowchart of the panoramic image synthesis process in step 1010. First, in step 1200, it is checked whether the group contains two images or more than two. If there are two, the full auto process described later is entered in step 1202. If there are more than two, the auto process described later is entered in step 1201. After the processing of steps 1201 and 1202, steps 1203 and 1204 check whether the result was a success or a failure. Success or failure is determined by whether a predetermined number or more of corresponding points between the images were found. Because this determination is made at the corresponding point extraction stage, early in the overall synthesis process, the user does not have to wait long for the outcome even when the result is a failure. On success, the process ends. On failure, the semi-auto process described later is performed in step 1205.
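The branching of FIG. 10 amounts to a small dispatcher. The sketch below assumes, purely for illustration, that the three processes are passed in as callables and that the full auto and auto processes return None when too few corresponding points are found; the function and parameter names are hypothetical, and the group is assumed to contain at least two images.

```python
def synthesize_group(images, full_auto, auto, semi_auto):
    """Choose the synthesis mode from the number of images, with semi-auto as fallback."""
    if len(images) == 2:
        result = full_auto(images)   # step 1202
    else:
        result = auto(images)        # step 1201
    if result is None:               # steps 1203 and 1204: not enough corresponding points
        result = semi_auto(images)   # step 1205
    return result
```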
[0033]
FIG. 11 shows the user interface for the auto process. First, all images belonging to the group of panoramic images from step 1009 are resized so as to fit in the window and displayed on the screen. The user looks at them and drags them into the correct order. In the example of FIG. 11, three images are displayed, and the image at the bottom in FIG. 11A actually belongs at the far right. When the user drags it to the right, the system detects from that position that the panorama consists of three images arranged side by side, as shown in FIG. 11. The images are then resized to fit in the window and displayed again.
[0034]
FIG. 12 shows a flowchart of the auto process. First, in step 1301, the positional relationship information of the images as rearranged by the user is acquired. In step 1302, a search range is set for finding matching points. This will be described with reference to FIG. 13. Suppose the rules for shooting a panoramic image specify an overlap of at least 10% and at most 50%, and a displacement in the perpendicular direction of 5% or less. Then the range that always overlaps is the range shown as 1504 in FIG. 13, and the range that may overlap is the range shown as 1505 in FIG. 13(b). Under these rules, a point 1503 in the range 1504 has its corresponding point within the search range 1505. The corresponding point extraction process described later looks for a match within the search range 1505.
[0035]
Returning to the flowchart of FIG. 12, in step 1302 the parameters used to set the search range 1505 are set. In step 1303, the corresponding points are extracted; details are given later. In step 1304, it is determined whether the number of corresponding points obtained is equal to or greater than a predetermined value. If it is less than the predetermined value, sufficient corresponding points could not be found automatically, and the process proceeds to the semi-auto process. If it is equal to or greater than the predetermined value, the process proceeds to the synthesis parameter processing in step 1305, where the translation, enlargement (reduction), and rotation parameters used in the synthesis are obtained from the coordinates of the corresponding points. In step 1306, the images are synthesized based on these parameters. Details of both steps are given later.
[0036]
FIG. 14 shows a flowchart of the full auto process. First, in step 1601, a matching range, which is the search range for extracting corresponding points, is set. This is the same as step 1302 described above. Next, the corresponding point extraction process is performed four times. In full auto the number of images is limited to two, so there are four possible positional relationships between image 1 and image 2: top and bottom, bottom and top, left and right, and right and left. The corresponding point extraction process is therefore performed for each of these four cases, and for each case the number of points extracted as corresponding points and the average of their matching levels are held. These processes are steps 1602 to 1609. In step 1610, it is checked whether any of the four cases yielded a predetermined number or more of corresponding points. If none did, the process proceeds to the semi-auto process.
[0037]
If any case yielded the predetermined number or more of corresponding points, in step 1611 the case with the best average matching level is taken to be the true positional relationship. For an ordinary image, only one of the four cases should yield the predetermined number or more of corresponding points. However, when a document is photographed in sections, for example, similar characters are lined up, and even a wrong positional relationship may yield the predetermined number or more of corresponding points. For this reason, step 1611 selects the best fit (the best average matching level). The synthesis parameter processing in step 1612 and the image synthesis processing in step 1613 are the same as steps 1305 and 1306.
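The selection made in steps 1610 and 1611 can be sketched as follows. The sketch assumes that the matching level is the mean residual of the template differences, so that a lower value means a better fit; that reading, the function name pick_arrangement, and the arrangement labels are assumptions made for illustration.

```python
ARRANGEMENTS = ("top-bottom", "bottom-top", "left-right", "right-left")


def pick_arrangement(results, min_points):
    """Pick the most plausible of the four arrangements.

    results: dict mapping an arrangement label to (num_points, mean_residual),
             where a lower mean_residual is assumed to mean a better match.
    Returns the chosen label, or None if no arrangement reached min_points
    (in which case the caller would fall back to the semi-auto process).
    """
    candidates = {k: v for k, v in results.items() if v[0] >= min_points}  # step 1610
    if not candidates:
        return None
    return min(candidates, key=lambda k: candidates[k][1])                 # step 1611
```

Note that the arrangement with the most corresponding points is not necessarily chosen; among those that pass the count threshold, the quality of the matches decides.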
[0038]
FIG. 15 shows the user interface for the semi-auto process. First, all images belonging to the group of panoramic images from step 1009 are resized so as to fit in the window and displayed on the screen. The user looks at them and overlaps them at approximately the correct overlap position. The overlapping area is displayed by performing a bitwise AND operation pixel by pixel, so both images can be seen through each other where they overlap. The images are then resized to fit in the window and displayed again.
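The see-through display obtained with a bitwise AND can be sketched with NumPy as below. The sketch assumes 8-bit images, a canvas initialised to all ones (0xFF, white) so that a non-overlapping image is shown unchanged, and an image that fits entirely inside the canvas; the function name overlay_and is hypothetical.

```python
import numpy as np


def overlay_and(canvas, image, top, left):
    """Draw image onto canvas at (top, left), AND-ing it with what is already there."""
    h, w = image.shape[:2]
    region = canvas[top:top + h, left:left + w]
    canvas[top:top + h, left:left + w] = np.bitwise_and(region, image)
    return canvas
```

Where only one image covers a pixel, the AND with 0xFF leaves that image as is; where two images cover the same pixel, the dark parts of both survive, which is what lets the user see both through the overlap.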
[0039]
The above operations are basically the same as those in the auto process, so the burden on the user is small. The difference lies in how the position at which an image is dragged and released with a pointing device such as a mouse is used: in the auto process only the relative positional relationship is taken from it and the images are rearranged and displayed, whereas in the semi-auto process the position is used as is and the images are displayed overlapped. Also, because the AND operation described above keeps both images visible through each other during the drag, the approximate position can be adjusted easily.
[0040]
FIG. 16 shows a flowchart of the semi-auto process. The processing is almost the same as in the auto process. In step 1701, the overlap position information of the images as positioned by the user is acquired. In step 1702, a matching range is set. This is a predetermined range (the assumed error of the position set by the user, plus a margin). It is therefore much narrower than the range used in the auto process, which shortens the calculation time and improves accuracy. The corresponding point extraction process, the synthesis parameter setting process, and the image synthesis process in steps 1703, 1704, and 1705 are all the same as in the auto process.
[0041]
FIG. 17 shows a flowchart of the corresponding point extraction process, and FIG. 18 illustrates it using an example of two images, left and right. When there are more than two images, the combination of two images is simply repeated, so the process is basically the same. First, in accordance with the shooting rules, the range 2005 in which templates are set is 90% of the image vertically and 10% horizontally. The search range 2006, in which a corresponding point may exist, is 100% vertically and 50% horizontally. Within the range 2005, points whose edge strength exceeds a predetermined value are searched for, and a rectangle of n pixels vertically and horizontally centered on each such point is cut out as a template image 2003. The template image 2003 is placed on the search range 2006 and the pixel-by-pixel differences are summed. The template is shifted one pixel at a time over the search range 2006 to find the position where this sum is smallest. If that minimum is not more than a predetermined value, the points (x, y) and (x ′, y ′) are held as a pair of corresponding points.
[0042]
The above is an outline of the processing. It will now be described again following the flowchart of FIG. 17. First, in step 1901, an edge-extracted image is created. In step 1902, the range 2005 for setting templates in the edge-extracted image is searched for a point whose edge is stronger than a predetermined value. If there is such a point, in step 1903 a rectangle extending ±n pixels vertically and horizontally from the point is cut out and used as the template image. In step 1904, a search range in the right image is set from the position of the point. In step 1905, the template image is overlaid on the image in the search range, the absolute value of the difference between pixel values is obtained pixel by pixel, and these are summed. In step 1906, it is checked whether this sum of differences is smaller than the minimum so far. If so, in step 1907 the coordinates of the point in the search range and the new minimum are held.
[0043]
The above is repeated over the entire search range to find the point with the best match (the smallest difference). In step 1908, it is checked whether the entire search range has been searched. In step 1909, the resulting minimum is compared with a predetermined value L to determine whether it is sufficiently small (that is, whether this is a reliable corresponding point). If it is smaller than L, in step 1910 the coordinates (x, y) of the point from which the template image was cut out, the coordinates (x ′, y ′) of the point at which the minimum was obtained, and the minimum value are registered in the list of corresponding points. Step 1911 repeats the above for every point in the template setting range whose edge is stronger than the predetermined value. When this is complete, in step 1912 the average of all the minimum values in the list of corresponding points is computed and held as the matching level value. The corresponding point extraction process is thus completed.
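The inner loop of steps 1903 to 1910 is a sum-of-absolute-differences template search, which can be sketched with NumPy as below. The function name, the argument layout, and the convention that search_box holds candidate centre coordinates (x0, y0, x1, y1) with exclusive upper bounds are assumptions for illustration; the template point is assumed to lie at least n pixels inside the left image.

```python
import numpy as np


def find_corresponding_point(left_edges, right_edges, point, n, search_box, threshold):
    """Return ((x, y), (x2, y2), sad) for the best match, or None if it is unreliable."""
    x, y = point
    # Step 1903: cut out a (2n+1) x (2n+1) template centred on the edge point.
    template = left_edges[y - n:y + n + 1, x - n:x + n + 1].astype(np.int64)
    x0, y0, x1, y1 = search_box                     # step 1904
    height, width = right_edges.shape[:2]
    best = None
    for cy in range(max(y0, n), min(y1, height - n)):
        for cx in range(max(x0, n), min(x1, width - n)):
            window = right_edges[cy - n:cy + n + 1, cx - n:cx + n + 1].astype(np.int64)
            sad = int(np.abs(window - template).sum())  # step 1905
            if best is None or sad < best[1]:           # steps 1906 and 1907
                best = ((cx, cy), sad)
    if best is None or best[1] >= threshold:            # step 1909: compare with L
        return None
    return (x, y), best[0], best[1]                     # step 1910
```

Calling this for every strong edge point in the template range and averaging the returned sad values would give the matching level value of step 1912.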
[0044]
Next, the synthesis parameter setting process will be described. Consider the case of two images (when more than two images are combined, the combination of two images is repeated, so it suffices to consider two). The displacement between them can be expressed by translation in the x and y directions, rotation, and a difference in magnification. Accordingly, the corresponding points (x, y) and (x ′, y ′) are related as follows.
[0045]
[Expression 1]

[0046]
Here, θ is the rotation angle, Δx and Δy are the translations, and m is the magnification. The coordinate transformation is therefore determined by obtaining the parameters A, B, C, and D. The preceding corresponding point extraction process yields a plurality of pairs of corresponding points (x, y) and (x ′, y ′). The parameters A, B, C, and D are obtained from these by the least squares method.
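Expression 1 itself is not reproduced in this text. As a hedged reading, the mapping used later in the image composition process (x ′ = Ax + By + C, y ′ = −Bx + Ay + D) is a similarity transform. Assuming that A and B carry the rotation and magnification while C and D carry the translation terms, the parameters relate to θ and m as sketched below. The sign convention for θ and the exact way Δx and Δy enter C and D are assumptions, not taken from the original expression.

```latex
\begin{aligned}
x' &= A x + B y + C, & y' &= -B x + A y + D,\\
A &= m\cos\theta, & B &= m\sin\theta,\\
m &= \sqrt{A^{2}+B^{2}}, & \theta &= \operatorname{atan2}(B, A).
\end{aligned}
```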
[0047]
[Expression 2]

[0048]
In the condition of
[0049]
[Expression 3]

[0050]
Parameters A, B, C, and D that satisfy the above are obtained.
Here, if we define
[0051]
[Expression 4]

[0052]
Then, the parameters A, B, C, and D can be expressed as follows.
[0053]
[Expression 5]

[0054]
The parameters p₁ to p₈ are computed, and the parameters A, B, C, and D are then calculated from them.
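The closed-form solution in terms of p₁ to p₈ is not reproduced in this text. As a substitute illustration, the same least squares problem for the model x ′ = Ax + By + C, y ′ = −Bx + Ay + D can be solved numerically with NumPy's general solver. The function name fit_similarity and the array layout are assumptions; this is a sketch of the least squares idea, not the expressions of the specification.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least squares fit of A, B, C, D from corresponding points.

    src: (N, 2) array-like of (x, y); dst: (N, 2) array-like of (x', y').
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    x, y = src[:, 0], src[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    # Unknown vector is [A, B, C, D].
    # Rows for x' = A*x + B*y + C, followed by rows for y' = -B*x + A*y + D.
    design = np.vstack([
        np.column_stack([x, y, ones, zeros]),
        np.column_stack([y, -x, zeros, ones]),
    ])
    rhs = np.concatenate([dst[:, 0], dst[:, 1]])
    params, *_ = np.linalg.lstsq(design, rhs, rcond=None)
    return tuple(params)
```

At least two non-coincident point pairs are needed for the four unknowns; with more pairs, mismatched points are averaged out to some extent.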
[0055]
Finally, the image composition process will be described. Since the parameters A, B, C, and D have already been obtained, they can be substituted into the following formulas:
x ′ = Ax + By + C
y ′ = − Bx + Ay + D
[0056]
FIG. 19 illustrates this. When the images are a left and a right image, an area twice the size of the left image is reserved as the composite image 2103. First, the left image is copied into it as is. Next, for each position (x, y) in the remaining area of the composite image, the corresponding (x ′, y ′) is obtained from the above formulas. The pixel at (x ′, y ′) in the right image is then copied to (x, y). This is done for the entire remaining area of the composite image.
[0057]
FIG. 20 shows the same content as a flowchart. In step S2201, an area twice as large as the first image (the left image in FIG. 19) is reserved as the composite image area. In step S2202, the first image is simply copied into this area. In step S2203, for a position (x, y) in the remaining area of the composite image, the corresponding (x ′, y ′) is obtained from the above formulas. In step S2204, it is checked whether (x ′, y ′) lies within the second image (the right image in FIG. 19); if it does, in step S2205 the pixel at (x ′, y ′) is copied to (x, y). The above is repeated for the entire remaining area of the composite image, and the process ends.
As described above, a final panorama composite image can be created.
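The composition steps can be sketched as follows for the left and right case. The sketch assumes that "twice as large" means twice the width of the left image, uses nearest-neighbour rounding of (x ′, y ′), and leaves unmapped pixels at zero; these choices and the function name compose_left_right are assumptions for illustration.

```python
import numpy as np


def compose_left_right(left, right, A, B, C, D):
    """Build a composite by copying left, then pulling pixels from right."""
    h, w = left.shape[:2]
    out = np.zeros((h, 2 * w) + left.shape[2:], dtype=left.dtype)  # S2201
    out[:, :w] = left                                              # S2202
    for y in range(h):
        for x in range(w, 2 * w):                                  # remaining area
            xs = int(round(A * x + B * y + C))                     # S2203
            ys = int(round(-B * x + A * y + D))
            if 0 <= xs < right.shape[1] and 0 <= ys < right.shape[0]:  # S2204
                out[y, x] = right[ys, xs]                          # S2205
    return out
```

Mapping from composite coordinates back into the right image, rather than pushing right-image pixels forward, avoids holes in the composite when the transform includes rotation or enlargement.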
[0058]
According to this embodiment, the set of images to be combined as a panoramic image is first extracted automatically, so the user does not need to designate the set of images to be combined. When there are two images to be combined, the process is fully automatic (full auto); when there are more than two, the user only has to indicate the positional relationship (auto). In both cases the burden on the user is very light. Further, when sufficient corresponding points cannot be detected in full auto or auto, the process switches to semi-auto, in which the user merely indicates the approximate position by dragging the images so that they overlap. Switching to semi-auto is rare, and even then the operation is very simple, so the burden on the user remains light.
[0059]
[Effect of the invention]
As described above, according to the image synthesizing apparatus of the present invention, the user moves images displayed in the first display area to the second display area by Drag & Drop or the like, so that the plurality of images to be combined are displayed there. A synthesized image can thus be obtained by an easy operation while confirming the images to be synthesized.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an embodiment of the present invention.
FIG. 2 is a configuration diagram illustrating a configuration example of a device.
FIG. 3 is a configuration diagram showing a combination of two images assumed in the full auto process.
FIG. 4 is a configuration diagram showing a user interface for designating a relative position in auto processing.
FIG. 5 is a block diagram showing a user interface for instructing a substantial overlap position in semi-automatic processing.
FIG. 6 is a configuration diagram of image data recorded in a camera.
FIG. 7 is a configuration diagram showing an operation for copying image data in a camera into a computer.
FIG. 8 is a configuration diagram of data managed in a computer.
FIG. 9 is a flowchart of processing performed when copying image data in a camera into a computer.
FIG. 10 is a flowchart showing the flow of panoramic synthesis full-auto, auto, and semi-auto.
FIG. 11 is a block diagram showing a user interface for auto processing.
FIG. 12 is a flowchart of auto processing.
FIG. 13 is a configuration diagram showing a range of matching at the time of synthesis.
FIG. 14 is a flowchart of a full auto process.
FIG. 15 is a block diagram showing a user interface for semi-automatic processing.
FIG. 16 is a flowchart of a semi-auto process.
FIG. 17 is a flowchart of corresponding point extraction processing.
FIG. 18 is a configuration diagram showing a template image and a matching range in the corresponding point extraction device.
FIG. 19 is a configuration diagram showing a synthesis process.
FIG. 20 is a flowchart of a composition process.
FIG. 21 is a block diagram showing a conventional corresponding point instruction.
[Explanation of symbols]
301 Computer system body
302 Display
303 Mouse
305 Keyboard
307 Electronic camera
504 Application software
505 Operating system
518 CPU

Claims

1. An image synthesizing apparatus comprising:
display means for displaying a first display area capable of displaying a plurality of images and a second display area, different from the first display area, capable of displaying a plurality of images;
moving means for moving a plurality of images selected from the images displayed in the first display area to the second display area in accordance with a user instruction;
detecting means for detecting overlapping portions of the plurality of images moved from the first display area to the second display area;
corresponding point extracting means for extracting corresponding points in the overlapping portions detected by the detecting means; and
combining processing means for joining the plurality of images based on the corresponding points extracted by the corresponding point extracting means to create one image,
wherein the detecting means automatically detects the overlapping portions of the images when the number of images moved by the moving means is less than a predetermined number.

2. An image synthesizing apparatus comprising:
display means for displaying a first display area capable of displaying a plurality of images and a second display area, different from the first display area, capable of displaying a plurality of images;
moving means for moving a plurality of images selected from the images displayed in the first display area to the second display area in accordance with a user instruction;
detecting means for detecting overlapping portions of the plurality of images moved from the first display area to the second display area;
corresponding point extracting means for extracting corresponding points in the overlapping portions detected by the detecting means; and
combining processing means for joining the plurality of images based on the corresponding points extracted by the corresponding point extracting means to create one image,
wherein the moving means, when a predetermined number of images or more are input, moves these images on the second display area to designate the relative position of each image, and
the detecting means detects the overlapping portions based on the designated relative positions.

3. The image synthesizing apparatus according to claim 2, wherein the detecting means detects the overlapping portion of each image from position information of each image designated by overlapping the images on the second display area with the moving means.

4. The image synthesizing apparatus according to claim 2, wherein the plurality of images include identification information indicating that the images form one set of images to be combined, and the detecting means and the combining processing means perform processing based on the identification information.