JP3913076B2 - Image composition processing device

Info

Publication number: JP3913076B2
Application number: JP2002043123A
Authority: JP
Inventors: 憲司守田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2002-02-20
Filing date: 2002-02-20
Publication date: 2007-05-09
Anticipated expiration: 2022-02-20
Also published as: JP2003244726A

Description

[0001]
[Technical Field of the Invention]
The present invention relates to an image composition processing device that composites two or more systems of live-action video with virtual images in real time.
[0002]
[Prior art]
One method of synchronizing the input of two systems of live-action video has been to assign the two NTSC video signals to the ODD field and the EVEN field of a single video signal, thereby producing one NTSC signal, and then to capture one frame of that signal at once with a single input device.
[0003]
Further, even when two systems of live-action video are captured and displayed asynchronously, the synchronization shift between the input videos can be kept to a level that causes no significant practical problem as long as a processing speed of about 30 fps is maintained.
[0004]
FIG. 3 and FIG. 4 are a flowchart and a conceptual diagram of the internal processing in the conventional example, in which two systems of live-action video are captured asynchronously. FIG. 3 describes the processing procedure as applied to the system shown in FIGS. 1 and 2.
[0005]
A procedure for generating the mixed reality space video in the processing device will be described based on the flowchart of FIG. 3.
[0006]
First, the viewpoint position and line-of-sight direction, and the position and direction of the hand transmitted from the position sensor main body are captured (step S301). In step S301, a thread S311 that periodically captures the viewpoint position and line-of-sight direction and the position and direction of the hand transmitted from the position sensor main body is used.
[0007]
Next, the time of the virtual space is updated, and the state of the virtual space (the type, position, and state of each virtual space object) is updated (step S302). At this time, if there is a virtual space object whose position and direction change in synchronization with the position and direction of a real space object, its state is updated as well. For example, when a virtual character is to appear as if it is always resting on the hand, the position and direction of the glove are updated in step S302.
[0008]
Next, the relationship between the position and direction of the real space objects (hand position, viewpoint position) and the position and direction of the virtual space objects is examined, and if it is determined that a predefined event has occurred, the state of the virtual space is updated according to that event (step S303). For example, a virtual space object may be made to explode when it is touched by the hand.
[0009]
Next, the real space image in the observer's viewpoint position and line-of-sight direction obtained from the video camera 111 is captured (step S304). In step S304, a thread S314 that periodically acquires the real space video obtained from the video camera 111 from the video capture card is used.
[0010]
Then, a virtual space image from the viewpoint position and line-of-sight direction of the observer acquired in step S301 is generated according to the state of the virtual space updated in steps S302 and S303 (step S305). For example, here, virtual characters such as 170 and 171 are drawn as if they exist in the real space.
[0011]
Finally, the virtual space image generated in step S305 and the real space image captured in step S304 are combined and output to the LCD 112 of the HMD 110 (step S306).
[0012]
The above processing is repeatedly executed until any termination operation is performed (step S307).
[0013]
Next, S304 and S314 in FIG. 3 will be described in detail using the conceptual diagram of real space image acquisition in FIG. 4.
[0014]
In S314, the video capture thread operates at the same rate as the repetition rate of the flow of FIG. 3. That is, the image display rate and the capture rate are the same.
[0015]
As an example, the processing of S314 will be described for a case where the display content is so heavy that it can be displayed at only about 10 fps.
[0016]
The right-eye capture board 450 and the left-eye capture board 451 operate asynchronously with each other and, in accordance with the display rate, each record real space images at 10 fps in the right-eye image buffer 492 and the left-eye image buffer, respectively. Because the two boards run asynchronously at 10 fps, the recorded images are shifted in time by up to 1/6 second, and by about 1/12 second on average. This time shift is large enough for a person to perceive, and stereoscopic viewing becomes impossible in scenes where the real space image changes rapidly.
[0017]
[Problems to be solved by the invention]
The method of assigning each of the two NTSC video signals to the ODD field and the EVEN field of one video signal to form a single NTSC signal, and then capturing one frame of it at once with a single input device, has the following problems.
・ A special device that combines two video signals into one is necessary and expensive.
・ Image quality deteriorates when the signals are combined. Since the EVEN and ODD fields are used, the vertical resolution is reduced to 1/2.
・ A delay occurs inside the combining device.
・ When a general field-sequential camera is used for input, a steady shift of 1/60 second occurs between the two systems of images.
[0018]
On the other hand, when two systems of live-action video are captured asynchronously, a processing speed of 30 fps is essential, which makes the required high-performance device expensive. Alternatively, if the display content is to be enriched, either it cannot be realized because of insufficient processing capacity, or the processing speed drops, the left and right live-action images fall out of synchronization, and stereoscopic viewing becomes impossible.
[0019]
An object of the present invention is to make it possible, with an inexpensive device, to display in real time and without synchronization shift the composite images obtained by compositing a virtual space image with each of two or more systems of live-action video.
[0020]
[Means for Solving the Problems]
In order to achieve the above object, the present invention is characterized by having the following configuration.
[0021]
The invention of claim 1 of the present application is an image composition processing apparatus that composites a virtual space image with two or more systems of live-action video and outputs the result, the apparatus comprising: means for asynchronously storing the video signal from each of a plurality of cameras in a buffer at a high frame rate substantially the same as the frame rate of the video signals sent from the cameras; means for simultaneously acquiring, asynchronously with that frame rate, the image last recorded in the buffer from each camera; combining means for combining the virtual space image with each acquired image; and means for displaying each of the combined images.
[0022]
The invention of claim 3 of the present application is an image composition processing apparatus that composites a virtual space image with each of captured right-eye and left-eye live-action images and displays a stereo image in real time, the apparatus comprising: right-eye and left-eye capture boards that acquire video signals from the respective cameras and store them in a memory as right-eye and left-eye live-action images; and a graphic board that composites the generated virtual space image with the right-eye and left-eye live-action images read from the memory and generates right-eye and left-eye display images, wherein the right-eye and left-eye capture boards acquire the video signals at a high frame rate, asynchronously with each other and asynchronously with the operation of the graphic board.
[0023]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0024]
[First Embodiment]
FIG. 1 is a system configuration diagram showing a schematic configuration of a mixed reality system to which the first embodiment of the present invention is applied.
[0025]
The first observer 100 wears an HMD (Head Mount Display) 110 on his head and a glove 120 on his hand.
[0026]
As shown in FIG. 2, the HMD 110 includes a video camera 111, an LCD 112, a position / direction measuring device receiver (eye position sensor) 113, and optical prisms 114 and 115. The video camera 111 captures a real space image of the observer's viewpoint position and line-of-sight direction guided by the optical prism 115. The eye position sensor 113 is used to measure the observer's viewpoint position and line-of-sight direction. The LCD 112 displays a mixed reality space image, and the image is guided to the observer's pupil by the optical prism 114.
[0027]
The glove 120 incorporates a hand position sensor 121 and a speaker 122 (not shown). The hand position sensor 121 is used as a sensor for measuring the position and direction of the observer's hand. The speaker 122 generates a sound corresponding to an event that occurs at the position of the hand. Examples of such sounds include the sound made when a virtual space object is touched or hit with the hand, and the sound generated when the state of a virtual space object displayed in synchronization with the position of the hand changes.
[0028]
Reference numeral 130 denotes a position / direction measuring device transmitter (position sensor transmitter), and 131 denotes a position / direction measuring device main body (position sensor main body). The eye position sensor 113, the hand position sensor 121, and the position sensor transmitter 130 are connected to the position sensor main body 131. Magnetism is transmitted from the position sensor transmitter 130, and this magnetism is received by the eye position sensor 113 and the hand position sensor 121. The position sensor main body 131 calculates the positions and directions of the eyes and hands based on the received intensity signals from the eye position sensor 113 and the hand position sensor 121, respectively. Here, FASTRAK manufactured by Polhemus of the United States can be used as the position / direction measuring device.
[0029]
Reference numeral 140 denotes a processing device that generates a mixed reality space image for one observer and displays it on the HMD 110. The processing device 140 includes, for example, a personal computer, a video capture card, a video card having a CG drawing function, a sound card, and the like. The processing device 140 is connected to the HMD 110, the speaker 122, and the position sensor main body 131.
[0030]
Reference numeral 170 denotes a virtual character that is composited so as to appear to be resting on the hand of the first observer 100. Reference numeral 180 indicates the line of sight of the first observer. The line of sight can be measured by the eye position sensor 113, the position sensor transmitter 130, and the position sensor main body 131.
[0031]
The second observer 101 has the same configuration as the first observer.
[0032]
Next, the procedure for generating the mixed reality space image in the processing device 140 will be described based on the flowchart of FIG. 8.
[0033]
First, the processing device 140 captures the viewpoint position and line-of-sight direction, and the hand position and direction transmitted from the position sensor main body 131 (step S2001). In step S2001, a thread S2011 that periodically captures the viewpoint position and the line-of-sight direction and the hand position and direction transmitted from the position sensor main body 131 is used.
[0034]
Next, the time of the virtual space is updated, and the state of the virtual space (the type, position, and state of each virtual space object) is updated (step S2002). At this time, if there is a virtual space object whose position and direction change in synchronization with the position and direction of a real space object, its state is updated as well. For example, when a virtual character is to appear as if it is always resting on the hand, the position and direction of the glove are updated in step S2002.
[0035]
Next, the relationship between the position and direction of the real space objects (hand position, viewpoint position) and the position and direction of the virtual space objects is examined, and if it is determined that a predefined event has occurred, the state of the virtual space is updated according to that event (step S2003). For example, a virtual space object may be made to explode when it is touched by the hand.
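The event check of step S2003 can be illustrated with a minimal sketch. The specification does not say how contact is detected, so the distance threshold, the function names `touches` and `update_events`, and the dictionary layout of an object are all assumptions made for illustration.

```python
import math


def touches(hand_position, object_position, radius):
    """Return True when the hand is within `radius` of the object's centre."""
    return math.dist(hand_position, object_position) <= radius


def update_events(hand_position, objects, radius=0.05):
    """Mark every touched object as exploding; other objects are unchanged.

    `objects` is a list of dicts with hypothetical keys "position" and "state".
    """
    for obj in objects:
        if touches(hand_position, obj["position"], radius):
            obj["state"] = "exploding"
    return objects
```

In a real system the hand position would come from the hand position sensor 121 via the position sensor main body 131, and the resulting state change would feed the drawing of the virtual space image in step S2008.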
[0036]
Next, both the left and right real space image buffers are locked so that they cannot be changed (step S2004).
[0037]
Next, the video for real space drawing is updated with the contents captured in the buffer (step S2005). Images from the camera 111 are written into the real space image buffer by a separate process; that part is specific to the present embodiment and is described in detail later. In the present embodiment, the process of writing images from the camera into the real space image buffer and the generation of the display image based on the image written in the real space image buffer are performed independently and asynchronously.
[0038]
After updating in step S2005, the buffer locked in step S2004 is released. (Step S2006)
Next, the real space image updated in step S2005 is drawn. (Step S2007)
Then, a virtual space image from the observer's viewpoint position and line-of-sight direction acquired in step S2001 is generated according to the state of the virtual space updated in steps S2002 and S2003 (step S2008).
[0039]
Finally, the virtual space image generated in step S2008 and the real space image captured in step S2007 are combined and output to the LCD 112 of the HMD 110 (step S2009).
[0040]
The above processing is repeatedly executed until any termination operation is performed (step S2010).
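One iteration of the display loop of FIG. 8 might be sketched as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: `EyeBuffer`, `snapshot_both`, `render_iteration`, and the `draw_virtual` and `composite` callbacks are hypothetical names, and frames are treated as opaque values.

```python
import threading
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EyeBuffer:
    """Real space image buffer for one eye, guarded by its own lock."""
    lock: threading.Lock = field(default_factory=threading.Lock)
    frame: Any = None


def snapshot_both(right, left):
    """Take the latest right and left frames while both buffers are locked.

    Holding both locks (S2004) means neither capture thread can overwrite
    its buffer between the two reads (S2005); leaving the `with` block
    releases them (S2006).
    """
    with right.lock, left.lock:
        return right.frame, left.frame


def render_iteration(right, left, draw_virtual, composite):
    real_r, real_l = snapshot_both(right, left)   # S2004 to S2006
    virtual_r, virtual_l = draw_virtual()         # S2008, may take a long time
    # S2009: combine each real image with its virtual image.
    return composite(real_r, virtual_r), composite(real_l, virtual_l)
```

The locks are held only for the snapshot, so a slow `draw_virtual` call does not stall the capture side, which matches the independence between capture and display that the embodiment relies on.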
[0041]
The processing time of these series of processing varies greatly depending on the contents of the virtual space video rendered in step S2008, and there are many cases where processing cannot be performed in 1/30 seconds.
[0042]
However, in the present embodiment, the time shift between the left and right real space images can be suppressed to a level that causes no problem. The live-action image used in S2005 is the most recent one written by a capture process that updates the buffer at 30 fps (every 1/30 second), so its delay is at most 1/30 second and 1/60 second on average. In other words, it is possible to prevent a bias in delay between the updates of the left and right images.
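The two delay figures can be reproduced with a short calculation. The assumption here, which the specification does not state explicitly, is that the display loop reads the buffer at an instant that is uniformly distributed within the capture period T, so that the age A of the newest frame is uniform on [0, T).

```latex
\[
A \sim \mathrm{Uniform}(0, T), \qquad T = \tfrac{1}{30}\ \mathrm{s}
\]
\[
\sup A = T = \tfrac{1}{30}\ \mathrm{s}, \qquad
\mathbb{E}[A] = \frac{1}{T}\int_{0}^{T} a \, da = \frac{T}{2} = \tfrac{1}{60}\ \mathrm{s}
\]
```

Under the same assumption, each eye's frame is never older than one capture period regardless of how slowly the display loop runs.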
[0043]
FIGS. 5 and 6 are external views of the HMD 110. FIG. 5 is an external view seen from the direction of the photographing unit, and FIG. 6 is an external view seen from the direction of the display unit.
[0044]
Reference numeral 201 denotes an HMD display unit. The HMD display unit 201 consists of a right-eye display unit 201R and a left-eye display unit 201L, both of which have a color liquid crystal display and a prism, and display a mixed reality space image corresponding to the observer's viewpoint position and line-of-sight direction.
[0045]
Reference numerals 204 to 208 denote components for mounting on the head. To mount the HMD 110 on the head, the observer first puts it on with the length adjusting unit 206 loosened by the adjuster 205. Then, after bringing the forehead mounting unit 208 into close contact with the forehead, the observer tightens the length adjusting unit 206 with the adjuster 205 so that the temporal mounting units 204 and the occipital mounting unit 207 are in close contact with the temporal and occipital regions, respectively.
[0046]
Reference numeral 203 denotes an HMD photographing unit for photographing a real space image in the viewpoint position and line-of-sight direction of the observer. The HMD photographing unit 203 consists of a right-eye photographing unit 203R and a left-eye photographing unit 203L, both of which are small NTSC video cameras. The captured real space image is superimposed on the virtual space image to become a mixed reality space image.
[0047]
The receiver 302 is used to receive the magnetism emitted from the transmitter 304 as information for measuring the observer's viewpoint position and line-of-sight direction. Three receiver joints 200R, 200L, and 200C are formed on the HMD 301 as attachment points for the receiver 302, and the receiver 302 can be detachably attached to any of them. That is, in FIGS. 5 and 6 the receiver 302 is attached to the receiver joint 200R on the right side with respect to the observer's line of sight, but it can also be attached to the receiver joint 200L on the left side, or to the receiver joint 200C on the observer's midline.
[0048]
In this embodiment, the receiver joints 200R, 200L, and 200C each have an insertion port into which the receiver 302 is fitted and fixed, but other detachable joining (attachment) methods may be adopted.
[0049]
Reference numeral 210 denotes a receiver signal line that is exposed to the outside of the HMD 301 from the vicinity of the receiver joint 200C. The receiver signal line 210 has a sufficient length so that the receiver 302 can be attached to any of the receiver joints 200R, 200L, and 200C.
[0050]
Reference numeral 209 denotes a bundling wire that collects various lines such as a signal line and a power supply line to the HMD display unit 201 and the HMD photographing unit 203 and the receiver signal line 210 and is attached to the occipital mounting unit 207. The signal lines and power supply lines to the left and right HMD display units 201R and 201L and the HMD photographing units 203R and 203L in the binding wire 209 pass through the left and right temporal mounting units 204, respectively.
[0051]
FIG. 7 is a block diagram showing a hardware configuration for one observer in the system of FIG. The processing device (computer) 307 includes a right-eye video capture board 350, a left-eye video capture board 351, a right-eye graphic board 352, a left-eye graphic board 353, an I / O interface 354, and a network interface 359. These components are connected to the CPU 356, the HDD 355, and the memory 357.
[0052]
The left-eye and right-eye video capture boards 351 and 350 are connected to the left-eye and right-eye video cameras 203L and 203R, respectively, and convert the live-action video captured by these cameras into a format that the processing device 307 can composite with the virtual space video. Likewise, the left-eye and right-eye graphic boards 353 and 352 are connected to the left-eye and right-eye display units (devices) 201L and 201R, respectively, and perform display control for these display units.
[0053]
The I / O interface 354 is connected to the position / direction measuring device main body 306, and the network interface 359 is connected to the network 330.
[0054]
Next, the processing of the real space image buffer unique to the present embodiment will be described with reference to FIG. 11, which shows the concept of the relevant parts inside the computer 140, and FIG. 9. FIG. 10 shows the processing for the left eye, which is equivalent to the processing for the right eye in FIG. 9.
[0055]
First, it waits for the capture of the video signal for one frame to be completed by the right-eye capture card 350 (step S2101). Next, an image for one frame is acquired from the right-eye capture card 350. (Step S2102)
Next, the right-eye real space image buffer 2302 in the memory 357 is locked (step S2103). This lock prevents the right-eye real-space image buffer 2302 that is being updated from being used for rendering and rendering a distorted video that is being updated.
[0056]
Next, the real space image is copied into the right-eye real space image buffer 2302 (step S2104). At this point it is important that the buffer is overwritten. If images were instead placed in a queue and processed in order, then whenever the display rate is slower than the capture rate the images would be processed from the oldest data first, and a delay of the following amount would occur.
[0057]
Delay = number of images in the queue x 1/30 seconds (depending on capture rate)
Depending on how the other parts are implemented, it is also possible to provide several buffers, such as a ring buffer or a double buffer, so that reading and writing can proceed at the same time and the copy can be eliminated for higher speed. In any case, what matters in the present embodiment is that the most recently written buffer is tracked and that drawing always refers to the latest image written.
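The difference between queuing frames and overwriting a single buffer can be made concrete with a small sketch. The function names and the use of `collections.deque` are illustrative assumptions; frames are represented by their capture index only.

```python
from collections import deque

CAPTURE_PERIOD = 1 / 30  # seconds per captured frame, per the formula above


def queue_delay(frames_in_queue, capture_period=CAPTURE_PERIOD):
    """Delay that accumulates when queued frames are consumed oldest first."""
    return frames_in_queue * capture_period


def simulate(capture_count, use_queue):
    """Capture `capture_count` frames, then let the renderer read one frame."""
    buffer = deque() if use_queue else deque(maxlen=1)
    for frame_index in range(capture_count):
        # With maxlen=1 the previous frame is discarded: an overwrite.
        buffer.append(frame_index)
    return buffer.popleft()  # the frame the renderer would draw
```

For example, if three frames arrive while one display frame is being drawn, `simulate(3, use_queue=True)` returns frame 0 (the oldest), whereas `simulate(3, use_queue=False)` returns frame 2 (the newest). `queue_delay(3)` gives roughly 0.1 second for the queued case.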
[0058]
Finally, the right-eye real space image buffer 2302 is unlocked (S2106), and the next frame is processed.
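The per-eye capture procedure of FIG. 9 can be sketched as a thread that repeatedly overwrites a single locked buffer. This is a minimal sketch: `LatestFrameBuffer`, `capture_loop`, and the `grab_frame` callback (standing in for the capture card) are hypothetical names, not part of the specification.

```python
import threading


class LatestFrameBuffer:
    """Single-slot buffer: each write overwrites the previous frame."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def write(self, frame):
        # Lock, overwrite, unlock (cf. steps S2103, S2104, S2106).
        with self._lock:
            self._frame = frame

    def read(self):
        # The renderer takes the same lock, so it never sees a frame
        # that is only partly written.
        with self._lock:
            return self._frame


def capture_loop(grab_frame, buffer, stop_event):
    """Run on its own thread, one per eye, independent of the display loop."""
    while not stop_event.is_set():
        frame = grab_frame()  # blocks until one full frame arrives (S2101, S2102)
        buffer.write(frame)
```

One such thread would be started per eye, for example with `threading.Thread(target=capture_loop, args=(grab_right, right_buffer, stop_event))`, where `grab_right` is whatever function returns the next frame from the right-eye capture board.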
[0059]
According to the real space image buffer processing of the present embodiment, images from the cameras are captured at a high frame rate, so the delay of the real space video can be kept to a level that causes no practical problem, and so can the time difference between the left and right real space videos.
[0060]
Conventionally, it has been common not to process the next frame immediately but to wait until the latest image has been drawn; although such a method reduces the CPU load, the delay of the real space video becomes conspicuous.
[0061]
As described above, according to the present embodiment, in an apparatus that generates and displays a mixed reality space image by compositing a real space image and a virtual space image, the delay of the real space image can be kept to a level that causes no practical problem, and the time difference between the left and right real space images can be minimized, even with a relatively inexpensive device.
[0062]
[Effects of the Invention]
According to the present invention, in an apparatus that generates and displays a mixed reality space image by compositing a real space image and a virtual space image, the delay of the real space image can be kept to a level that causes no practical problem, and the time difference between the left and right real space images can be minimized, even with a relatively inexpensive device.
[Brief description of the drawings]
FIG. 1 is a system configuration diagram showing a schematic configuration of a mixed reality system to which a third embodiment of the present invention is applied.
FIG. 2 is a configuration diagram showing a configuration of an HMD.
FIG. 3 is a flowchart showing a conventional mixed reality space video generation process.
FIG. 4 is a conceptual diagram of conventional hardware.
FIG. 5 is an external view when the HMD is viewed from the direction of the photographing unit.
FIG. 6 is an external view when the HMD is viewed from the direction of the display unit.
FIG. 7 is a block diagram showing a hardware configuration for one observer in the system of FIG. 1.
FIG. 8 is a flowchart illustrating mixed reality space image generation processing according to the present invention.
FIG. 9 is a flowchart showing processing of a real space image of the right eye according to the present invention.
FIG. 10 is a flowchart showing processing of a real space image of the left eye according to the present invention.
FIG. 11 is a diagram illustrating the concept of the portion of the hardware of FIG. 7 that relates to the embodiment.

Claims

In an image composition processing apparatus that synthesizes and outputs a virtual space image to two or more live-action images,
Means for asynchronously storing the video signal from each camera in a buffer at a high frame rate substantially the same as the frame rate of the video signal sent from each of the plurality of cameras;
Means for simultaneously acquiring the last recorded image in the buffer from each camera asynchronously with the frame rate;
Combining means for combining the virtual space image and each acquired image;
An image composition processing apparatus comprising: means for displaying each of the composite images.

A position sensor;
The image composition processing apparatus according to claim 1, further comprising a generation unit configured to generate the virtual space image according to position information of the photographed video obtained by the position sensor.

In an image composition processing apparatus that synthesizes a virtual space image with each of the photographed real image for the right eye and the left eye, and displays a stereo image in real time,
Right-eye and left-eye capture boards that acquire video signals from the respective cameras and store them in a memory as right-eye and left-eye live-action images;
A graphic board for compositing the generated virtual space image with the right-eye and left-eye live-action images read from the memory, and generating right-eye and left-eye display images;
An image composition processing apparatus, wherein the right-eye and left-eye capture boards each acquire a video signal at a high frame rate, asynchronously with each other and asynchronously with the operation of the graphic board.