JP3734805B2 - Information recording device

Info

Publication number: JP3734805B2
Application number: JP2003138979A
Authority: JP
Inventors: 俊和金子; 隆司松谷; 武邦山本
Original assignee: MegaChips Corp
Current assignee: MegaChips Corp
Priority date: 2003-05-16
Filing date: 2003-05-16
Publication date: 2006-01-11
Anticipated expiration: 2019-12-13
Also published as: JP2004032726A

Description

【０００１】
【発明の属する技術分野】
この発明は、音声情報や画像情報等のいわゆるマルチメディア情報を記録する情報記録装置および再生する情報再生装置に関する。
【０００２】
【従来の技術】
マイクロプロセッサの情報処理能力の向上にともない、マルチメディア情報を記録する情報記録装置および再生する情報再生装置の能力が急速に発展しつつある。例えば、音声情報の分野では、ＤＳＰ（デジタルシグナルプロセッサ）を用いて再生音に遅延処理を加えたり反響音を生み出す残響処理を施して様々な音場を創出することが可能なステレオコンポが存在し、画像情報の分野では画像をデジタル情報として記録して様々な画像処理を行えるデジタルカメラやパーソナルコンピュータが存在する。
【０００３】
【発明が解決しようとする課題】
これら従来の情報記録装置および情報再生装置においては、その再生手段として少数のスピーカやディスプレイを用いることから、情報が平面的に記録されており、充分に現実感や立体感を得ることや情報の利便性を得ることができなかった。ここで、情報が平面的に記録される、とは、音源や被写体の奥行きや上下方向等の正確な位置に関する情報が全く記録されない、または不十分にしか記録されないことを指している。
【０００４】
例えばステレオ音声情報を記録する場合、左右のチャネルの音量バランスや時間差等により左右方向の音像定位が行われる。すなわち、図１８に示すように、右スピーカＳＰＲおよび左スピーカＳＰＬから再生される音声情報がリスニングポイントＬＰに位置するリスナーに届いたときに、音像がスピーカ間の距離ＤＣのうちのどこかに定位するようステレオ音声情報が記録される。このことを音場データのイメージとして表したのが、図１９である。図１９において、リスニングポイントＬＰの前に広がっている音場ＳＦ２は、その上に示した右チャネルＲｃｈの音場データイメージと左チャネルＬｃｈの音場データイメージとから成り立っている。この音場データイメージにおける丸印ＳＤ１Ｌ〜ＳＤ３Ｌ，ＳＤ１Ｒ〜ＳＤ３Ｒは、各音源の音量の大小と音場中の分布とを示したものである。例えば、ある音源に対応する左右の音量ＳＤ２Ｒ，ＳＤ２Ｌは同程度であるので、音場中の定位は中央付近になる。一方、別の音源に対応する左右の音量ＳＤ３Ｒ，ＳＤ３Ｌは右側が左側よりも大きいので、音場中の定位は右側よりとなる。
【０００５】
このように左右のスピーカの音量比を制御する方法では、左右方向の定位については得られるものの、奥行き感や上下、前後の感覚は得られない。
【０００６】
なお、これを改善するものとして、左右のスピーカで発音時間をずらす（位相差を設ける）ことで奥行き感を出したり、また、リスナーの耳介による音源位置特定作用を考慮に入れて上下や前後の方向感覚を出すようにした、３Ｄサウンドなどと呼ばれる音声信号の補正技術が存在する。図２０は、この技術を音場データのイメージとして表したものである。各音源の音量の大小と音場中の分布とを示す丸印ＳＤ１Ｌ〜ＳＤ３Ｌ，ＳＤ１Ｒ〜ＳＤ３Ｒには、さらに、位相差等の補正に関する付加情報ＡＤ１Ｌ〜ＡＤ３Ｌ，ＡＤ１Ｒ〜ＡＤ３Ｒが加わっている。これにより、音場ＳＦ３はスピーカの外側やリスニングポイントの前後左右上下へと広がり、音場ＳＦ２と比べて大きくなっている。
【０００７】
しかし、この技術によれば、各音源からの音声情報を記録する段階で録音技術者が付加情報を加えるために、録音技術者の経験や主観が大きな要素を占めていた。よって、必ずしも正確な位置に関する情報が記録されていたわけではない。
【０００８】
また、図１８に示したような、スピーカ間距離ＤＣおよび左右のスピーカとリスニングポイントとの間の距離ＤＬ，ＤＲとで囲まれる三角形の領域からリスナーが踏み出してしまうと、音場がアンバランスとなり臨場感を得ることが難しくなるという問題もあった。
【０００９】
一方、画像情報については、例えば風景の中に人物などを配置して記録する場合がよくある。この場合も、平面的に画像が記録されるだけであり被写体の位置や奥行きに関する情報が記録されるわけではない。よって、例えば、デジタルカメラでそのような画像情報を取得し、パソコンにおいて背景から人物だけを切り出す場合などにおいては、人物と背景との色調の差やピントの合い具合などから両者を区別するほかなく、その区別が難しい場合もあった。
【００１０】
本発明は、上記の課題を解決するものであり、音声情報や画像情報等に音源や被写体の位置に関する情報を付加して記録し、それら情報の再生時に位置に関する情報を有効に利用する情報記録装置および情報再生装置を実現するものである。
【００１１】
【課題を解決するための手段】
請求項１に記載の発明は、被写体および背景までの距離を位置情報として付加しつつ前記被写体および前記背景の画像情報を記録する情報記録装置であって、前記被写体および背景の位置情報が時間的に変化し、前記被写体の前記情報記録装置からの等距離面が画面中で揺れるかどうかを検出するものである。
【００１８】
【発明の実施の形態】
＜実施の形態１＞
この発明の実施の形態１は、音声情報に対して音源の空間的な位置を規定する位置情報を付加しつつ録音する情報記録装置と、位置情報が付加された音声情報を位置情報を利用しつつ再生する情報再生装置とを示すものである。
【００１９】
図１は、本実施の形態に係る情報記録装置が用いられる場面を示す図である。図１では、ステージ上でのバンドの演奏を録音する状況が示されている。なお、録音に際しては一般にマルチトラックレコーディングが行われ、各楽器ごとにトラックが割り当てられ演奏が記録される。ここでは、例としてテナーサックスＴｓ、アルトサックスＡｓ、ソプラノサックスＳｓにマイクＭｃ１〜Ｍｃ３が、ピアノｐｆにマイクＭｃ４が、ドラムズＤｓにマイクＭｃ５が、トランペットＴｐ１〜Ｔｐ３にマイクＭｃ６〜Ｍｃ８が、トロンボーンＴｂにマイクＭｃ９が、ベースＢにマイクＭｃ１０が、それぞれ割り当てられている。
【００２０】
なお、このステージ上での位置は、例として図１に示すように、最前部の向かって左端を原点とし、奥行き方向をＹ軸、左右方向をＸ軸とした座標成分で表されるものとする。
【００２１】
【表１】

【００２２】
表１は、各マイクＭｃ１〜Ｍｃ１０とその位置、および録音される音声トラックデータ番号ＳＤ１〜ＳＤ１０を示したものである。本実施の形態に係る情報記録装置においては、従来の場合とは異なり、マルチトラックのデータをステレオ２チャンネルにミックスダウンするのではなく、記録した音声情報をマルチトラックのまま保持しておく。
【００２３】
本実施の形態においては、各トラックの録音時には、音声情報だけでなく音源の空間的な位置を規定する位置情報をも記録しておく。音源の位置情報は、各トラックに位置情報専用のトラックを設け、そこに書きこむようにしてもよいし、音声情報を書きこむトラックの空き部分に書きこむようにしてもよい。そして、固定値として一度だけ書きこむ、または、変化する値として定期的に書きこむ、あるいは位置情報に変化のあった場合にのみ書きこむ、などしておけばよい。
【００２４】
なお、音源の位置情報は、マイクの位置に基づいて決定してもよいし、演奏者あるいは楽器の位置に基づいて決定してもよい。
【００２５】
また、図１や表１においては、表示を簡単にするためＸ軸、Ｙ軸の２次元の位置情報の場合を示しているが、両軸に垂直なＺ軸方向の座標成分を加えて３次元の位置情報としてもよい。
【００２６】
このようにして位置情報が記録された音声情報の利用について、以下に説明する。音声は上述のＤＳＰ内蔵のステレオコンポのように、遅延処理や残響処理を施すことで、奥行き感を出すことができる。また、耳介による音源位置特定作用や位相差等を考慮に入れて音声信号を補正することで、上下や前後の方向感覚を出すことができる。このような遅延処理や残響処理、補正処理は、従来のステレオコンポや３Ｄサウンド技術で用いられている技術をそのまま適用すればよい。遅延処理や残響処理、補正処理は、音源およびリスナーの位置関係に大きく依存するものであり、これらの処理に関するパラメータは、残響レベルや遅延時間の多少、伝播媒体や壁の材質等を予め決めておけば、音源およびリスナーの位置関係が決定されることで自動的に決まる。なお、これらの処理に関するパラメータのことを本願では「音声情報の伝播特性」と表現する。音声情報の伝播特性には、遅延処理や残響処理、耳介による音源位置特定作用や位相差等による補正処理の他、音量レベルを時間的に変化させることで風の影響や壁などの材質の影響を表現したり、遅延処理や音量レベルの変化を工夫して音速の変化要素である気温や伝播媒体の種類（水や空気等）や密度を表現したりすることも含まれる。
【００２７】
さて、本実施の形態に係る情報記録装置によって記録された音声情報にはそれぞれ音源の位置情報が付加されているので、音源ごとに音声情報の伝播特性を決定することができる。すなわち、例えば従来のステレオコンポでは、ステレオ音声情報について伝播特性を決定する場合には、音源ごとではなくミックスダウンされた音声情報に一律に処理がなされてしまい、立体感が得にくかったが、音源ごとに音声情報の伝播特性を決定することができれば、より現実感の増した音声情報を再生することができる。また、３Ｄサウンド技術によれば、録音技術者の経験や主観が大きな要素を占めていたため、必ずしも正確な位置に関する情報が記録されていたわけではなかったが、音源ごとの位置情報が付加されておれば、正確な位置情報を用いつつ音声情報の伝播特性をより精度よく決定することが可能となる。
【００２８】
図２は、音源ごとに音声情報の伝播特性を決定する、本実施の形態に係る情報再生装置が用いられる場面を示す図である。図２では、表１に示された各音声トラックデータＳＤ１〜ＳＤ１０がスピーカＳＰＬ，ＳＰＲから再生されたときに形成される音場ＳＦ１の音場データイメージが示されている。各音声トラックデータＳＤ１〜ＳＤ１０の音場データイメージは、図１に示した実際のステージ上での各楽器の配置と対応している。
【００２９】
なお、この音場データイメージは、リスナーがリスニングポイントＬＰ１にいるときに最適となるように、音声情報の伝播特性が決定された場合を示している。仮に、リスナーがリスニングポイントＬＰ１からリスニングポイントＬＰ２へと移動した場合は、そのままでは音声情報の伝播特性が最適ではなくなってしまうので、リスニングポイントＬＰ２を検知した上で、新たに音声情報の伝播特性を決定するようにすればよい。なお、リスナーの場所の特定には、リスナーからの位置情報の入力を待つようにしてもよいし、本実施の形態に係る情報再生装置にＣＣＤ測距センサや赤外線センサを設けて自動検知するようにしてもよい。
【００３０】
また図２では、例として２本のスピーカで音場形成する場合を示しているが、もちろんそれ以上の複数のスピーカが存在する場合には、各スピーカの配置に応じて出力させる音声情報を変化させるようにしておけばよい。また、本実施の形態に係る情報再生装置の音声情報の処理能力が低く、マルチトラックの全てについて独立に再生を行うことが困難である場合には、例えば、音源の位置が近いもの同士の音声情報を一つに合成して、トラック数を減らすようにしてもよい。
【００３１】
なお、音源が移動する場合（例えばワイヤレスマイク等を用いる場合など）には、音源の位置とリスナーの位置との間で生じるドップラー効果を考慮して、音声情報の周波数を変更しつつ音声情報を再生するようにしておけばよい。ドップラー効果は、移動する音源から発せられる音声の周波数が静止時のときと比べて変化する現象のことを指す。この現象は、
【００３２】
【数１】

【００３３】
のように定量的に表わされる。なお数１において、ｆはリスナーが受け取る音声情報の周波数を、ｆ₀は静止時の音源から発せられる音声の周波数を、ｃは音声の速度を、それぞれ表わす。また、その他のパラメータについては、図３に示すとおりである。すなわち、ｖ₀はリスナーの現在地点０における移動速度の絶対値を、ｖ_Sは音源の現在地点Ｓにおける移動速度の絶対値を、φおよびθはリスナーの現在地点０と音源の現在地点Ｓとを結ぶ直線からのリスナーの移動速度の角度および音源の移動速度の角度を、それぞれ示している。
【００３４】
よって、移動する音源から発せられる音声情報については、ｃ，ｖ₀，ｖ_S，φおよびθで決定される数１におけるｆ₀の係数を、音声情報の周波数に乗算する補正処理を施せばよい。音声の速度ｃは、気温や伝播媒体等のパラメータを決めることで決定され、ｖ₀，ｖ_S，φおよびθは、音源の位置情報の時間変化およびリスナーの位置情報の時間変化を計算することにより得ることができるので、数１におけるｆ₀の係数を求めることは困難ではない。
【００３５】
上記のドップラー効果を再現する機能を備えた、情報再生装置のブロック図を図４に示す。図４において、相対関係算出処理ブロックＳＴ１は音源位置情報ＩＦＳおよびリスナー位置情報ＩＦＬを得て、両者間の距離等の位置情報を算出し、また、音源およびリスナーの位置情報の時間変化からｖ₀，ｖ_S，φおよびθを算出する。そしてそれらの情報を、ピッチ変更処理ブロックＳＴ２および伝播特性変更処理ブロックＳＴ３へと送る。ピッチ変更処理ブロックＳＴ２においては音声情報および仮想空間における環境情報（伝播媒体の種類や記音等に関する情報）が与えられてドップラー効果を音声情報に付加し、伝播特性変更処理ブロックＳＴ３においてはピッチ変更処理ブロックＳＴ２からの出力および仮想空間における環境情報が与えられて音声情報に伝播特性を付加する。そして、伝播特性変更処理ブロックＳＴ３の出力は、音声再生処理ブロックＳＴ４に与えられてリスナーに伝えられる。
【００３６】
また、リスナーが複数存在し、それぞれのリスナーが異なる位置に存在する場合は、情報再生装置に、図５に示すブロック図のように、リスナーごとに相対関係算出処理ブロックＳＴ１ａ〜ＳＴ１ｃ、音声再生加工処理ブロックＳＴ２３ａ〜ＳＴ２３ｃ、リスナー別音楽再生処理ブロックＳＴ４ａ〜ＳＴ４ｃを設けるようにすればよい。相対関係算出処理ブロックＳＴ１ａ〜ＳＴ１ｃがリスナーごとに設けられることに伴い、リスナー位置情報ＩＦＬａ〜ＩＦＬｃもリスナーごとに採取され、対応する相対関係算出処理ブロックにそれぞれ入力される。なお、音声再生加工処理ブロックＳＴ２３ａ〜ＳＴ２３ｃは、図４におけるピッチ変更処理ブロックＳＴ２および伝播特性変更処理ブロックＳＴ３をまとめて示したものである。また、再生処理ブロックは、他のリスナーとの干渉を防ぐためにリスナー別に設けられている。リスナー別音楽再生処理ブロックＳＴ４ａ〜ＳＴ４ｃの具体例としては、ヘッドフォンや超指向性スピーカ等がある。
【００３７】
この場合、同一の音声情報に対し、リスナーごとに異なった再生プロセスを通すので、各リスナーに適した音場を形成することが可能となる。このようにすれば、例えばバーチャルリアリティ空間で発音音源とリスナーとが動き回る状況を形成することや、車内のオーディオ再生装置でドライバーやナビゲーターの座席位置に応じた音場を個別に設定すること、家庭のオーディオ再生装置でコンサートホールの座席配置を考慮した音場補正を行うこと、コンサートホールで客席の位置による音場の差異の補正を行うこと、が可能となる。
【００３８】
本実施の形態に係る情報記録装置を用いれば、音源の位置情報を付加しつつ音源から発せられる音声情報を録音するので、音声情報の再生時に音源の位置情報を用いて音声情報に対して加工を行うことができる。
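[0039]
Further, with the information reproducing apparatus according to this embodiment, audio information is reproduced while its propagation characteristics are determined using the position information of the sound source, so the listener can be given audio information with a sense of reality and depth. If the listener's position information is also used when the propagation characteristics are determined, the listener can be given more realistic and three-dimensional audio information matched to the listener's position. If the frequency of the audio information is changed in consideration of the Doppler effect arising between the position of the sound source and the position of the listener, still more realistic and three-dimensional audio information can be given to the listener. When there are multiple listeners, the audio information is reproduced for each of them while the propagation characteristics are determined using the position information corresponding to each listener, or additionally while the frequency of the audio information is changed, so that each listener receives more realistic and three-dimensional audio information.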
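[0040]
<Embodiment 2>
Embodiment 2 of the present invention shows an information recording apparatus that records image information while adding the distances to a subject and a background as position information, and an information reproducing apparatus that reproduces the image information to which position information has been added while making use of that position information.
[0041]
FIG. 6 is a diagram showing the configuration of the information recording apparatus according to this embodiment. In FIG. 6, image information captured by an imaging device CM such as a digital camera and position information of a subject SB0 and a background BG captured by a sensor element SS mounted near the imaging device CM are both converted into data, yielding image information GA to which position information has been added. The sensor element SS measures distance and is, for example, an infrared sensor, a CCD distance-measuring sensor, an ultrasonic sensor, or a gravity/pressure sensor. The position information of the subject SB0 and the background BG refers to the distance between the imaging device CM and the subject SB0 and the distance between the imaging device CM and the background BG.
[0042]
In the image information GA, the subject SB0 and the background BG are not merely depicted: the distance between the imaging device CM and the subject SB0 or the background BG is recorded for each unit section (for example, each of several equal divisions of the screen in the vertical or horizontal direction or, ultimately, each pixel). The subject SB0 consists of three objects SB0a, SB0b and SB0c. In the example of FIG. 6, the distance to the front of the right object SB0c, which is nearest, is shown as 2.5 m; the distance to the front of the left object SB0a, the second nearest, as 2.7 m; and the distance to the front of the central object SB0b, the farthest, as 3.0 m. The distance to the background BG is shown as 10.0 m.
[0043]
The information reproducing apparatus according to this embodiment is an apparatus that displays the image information GA recorded in this way, either together with or separately from the position information of the subject SB0 and the background BG. When position information of the subject is attached to the image information in this way, the background and the subject can easily be distinguished when the image information is reproduced, which facilitates image processing such as cutting out only a person from the background.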
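As a minimal sketch of how recorded distances could separate subject from background, the following assumes the image and its position information are stored as two equally sized grids (one pixel value and one distance per unit section). The function name `cut_out_subject` and the threshold parameter are hypothetical and do not come from the patent text.

```python
def cut_out_subject(pixels, depths, max_subject_distance, fill=None):
    """Keep pixels whose recorded distance is within the threshold; blank the rest.

    pixels and depths are lists of rows with identical shape (an assumption).
    """
    return [
        [p if d <= max_subject_distance else fill for p, d in zip(prow, drow)]
        for prow, drow in zip(pixels, depths)
    ]


# Distances loosely modeled on FIG. 6: objects at 2.5 m to 3.0 m, background at 10.0 m.
pixels = [["a", "b", "c"], ["d", "e", "f"]]
depths = [[10.0, 2.7, 10.0], [2.5, 3.0, 10.0]]
subject_only = cut_out_subject(pixels, depths, max_subject_distance=5.0)
```

Any threshold between the farthest object and the background would give the same result here; choosing it automatically is outside the scope of this sketch.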
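[0044]
When the imaging device CM is a video camera capable of shooting moving images, the position information added to the image information can also be used for camera-shake correction, as shown in FIG. 7. That is, if the equidistant surfaces of the subject from the imaging device shake in small, rapid movements across the whole screen, this can be detected as camera shake. By correcting for the displacement caused by the shake, the moving image can be recorded as if there were no camera shake.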
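One way to picture the shake check is sketched below. It is a simplification under stated assumptions: only the horizontal centroid of a single equidistant band is tracked per frame, and "small, rapid shaking" is approximated as small frame-to-frame steps that keep reversing direction. The function names and thresholds are hypothetical.

```python
def surface_centroid_x(depths, near, far):
    """Horizontal centroid of the cells whose distance lies in [near, far]."""
    xs = [x for row in depths for x, d in enumerate(row) if near <= d <= far]
    return sum(xs) / len(xs) if xs else None


def looks_like_camera_shake(centroids, max_amplitude, min_reversals):
    """True when the centroid moves in small steps that repeatedly change sign."""
    steps = [b - a for a, b in zip(centroids, centroids[1:])]
    steps = [s for s in steps if s != 0]
    reversals = sum(1 for a, b in zip(steps, steps[1:]) if a * b < 0)
    small = all(abs(s) <= max_amplitude for s in steps)
    return bool(steps) and small and reversals >= min_reversals


frame = [[9, 2, 2, 9], [9, 2, 2, 9]]
cx = surface_centroid_x(frame, near=1, far=3)
shaky = looks_like_camera_shake([10, 11, 10, 11, 10], max_amplitude=2, min_reversals=3)
panning = looks_like_camera_shake([10, 12, 14, 16], max_amplitude=2, min_reversals=3)
```

A steady pan moves the centroid in one direction, so it is not flagged. Once shake is detected, shifting each frame back by the measured displacement would be one possible correction.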
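[0045]
The information recording apparatus according to this embodiment may also be used in combination with the information recording apparatus for audio information of Embodiment 1. That is, when an object OB (corresponding to a sound source in Embodiment 1) delimited by equidistant surfaces within the screen GA is recognized by a technique such as image recognition while image information is being recorded, as shown in FIG. 8, the position information of the sound source being recorded is updated as the object moves. In this way, even an information recording apparatus for audio information of Embodiment 1 that cannot record data on temporal changes in the sound source position can move the sound source in step with the movement of the object OB.
[0046]
Likewise, the information reproducing apparatus according to this embodiment may be used in combination with the information reproducing apparatus for audio information of Embodiment 1. That is, when an object OB (corresponding to a sound source) delimited by equidistant surfaces within the screen GA is recognized by a technique such as image recognition while image information is being reproduced, as shown in FIG. 8, the position information of the sound source being reproduced is updated as the object moves. In this way, even an information reproducing apparatus for audio information of Embodiment 1 that has no data on temporal changes in the sound source position can move the sound source in step with the movement of the object OB.
[0047]
With the information recording apparatus according to this embodiment, image information of the subject and background is recorded with their position information added, so the image information can be processed using that position information when it is reproduced. Camera shake can be detected by detecting whether the equidistant surfaces move in small, rapid steps across the whole screen. In addition, by arranging for the sound source position information to be updated as the subject moves, the sound source can be moved in step with the subject even by an information recording apparatus that cannot record data on temporal changes in the sound source position.
[0048]
With the information reproducing apparatus according to this embodiment, the portion to be processed is determined using the position information of the subject, and the image information is reproduced while image processing is applied to that portion. It is therefore possible, for example, to raise the compression ratio of distant subjects or to separate the subject from the background. In addition, by arranging for the sound source position information to be updated as the subject moves, the sound source can be moved in step with the subject even by an information reproducing apparatus that has no data on temporal changes in the sound source position.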
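[0049]
<Embodiment 3>
Embodiment 3 of the present invention uses the information recording apparatus of Embodiment 2 to obtain an image with a large depth of field.
[0050]
FIG. 9 explains depth of field. An image taken with an imaging device CM such as an ordinary analog camera, digital camera or video camera normally has an in-focus point (the position that is in focus; the distance from the in-focus point to the imaging device is called the focusing distance) and a depth of field (the range in front of and behind the in-focus point that is in focus).
[0051]
The larger the depth of field, the wider the range in the depth direction that is in focus, and the crisper the resulting image.
[0052]
Depth of field becomes shallow (short) under three conditions: (1) the focal length of the taking lens is long, (2) the aperture value of the taking lens is small (the aperture is open), and (3) the shooting distance to the subject is short. For example, when (1) a taking lens with a longish focal length (about 100 to 200 mm in 35 mm film terms) is used, (2) a flower or the like is photographed at very close range (several tens of centimeters), and (3) the aperture value (focal length ÷ effective pupil diameter) is f = 2.8 or less, close to wide open, the total depth of field is only a few centimeters.
[0053]
With a depth of field of a few centimeters, focusing on the center of a flower, for example, leaves the surrounding petals out of focus. If one tries to bring the whole flower, or the stem and leaves as well, into focus, the only option is to increase the aperture value of (3) (stop down). The exposure then inevitably drops and the shutter speed must be lowered (the shutter is held open for a long time: from a fraction of a second to several seconds when stopped down to about f = 32 in typical light), so blur from camera shake or wind appears and the photograph becomes unusable.
[0054]
To solve the problem of shallow depth of field in close-up photography, some cameras have a mechanism that stops down to f = 45 and compensates for the underexposure with a strobe. However, differences between natural and artificial light (color, angle of incidence, light distribution, diffusion and so on) make the finished photograph look considerably different. New problems also arise: the strobe light is reflected by the subject and appears in the picture, and the strobe light does not reach beyond a certain distance (strobe reach = guide number ÷ aperture value × film-speed correction).
[0055]
This depth-of-field problem arises not only in analog cameras but equally in digital cameras and video cameras that use optical systems. In actual photography, however, deliberately blurring the background can serve photographic or artistic expression, so a shallow depth of field is not in itself a defect of the optical system as a whole. Rather, the problem is that setting and controlling the depth of field the photographer intends, so that it suits conditions (1) to (3) above and the available light, is extremely difficult without knowledge and experience.
[0056]
Therefore, an image with a large depth of field is obtained by using the information recording apparatus of Embodiment 2.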
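[0057]
First, as shown in FIG. 10, which is a top view of the subjects SB0a to SB0c, the subjects are photographed with the information recording apparatus including the imaging device CM while the in-focus point is changed stepwise from FP1 to, for example, FP7, yielding image information with position information. The depths of field corresponding to the in-focus points FP1 to FP7 are denoted D1 to D7. The spacing between in-focus points is preferably determined from a rough estimate of the depth of field so that the depths of field leave no gaps, but a fixed value such as 3 cm or 5 cm may be set as appropriate.
[0058]
In the above example, shooting while changing the in-focus point stepwise from FP1 to FP7 gives seven pieces of image information that differ in focus. Extracting the in-focus portions from each of the seven and combining them gives an image with a deep depth of field.
[0059]
To extract the in-focus portion from each piece of image information, the position information on the distance between the imaging device CM and the subjects SB0a to SB0c contained in each piece of image information is used: the part of the image information for which the distance to the in-focus point is close to the distance to the imaged surface of the subject is extracted.
[0060]
FIG. 11 shows how the in-focus portions are extracted from each piece of image information and combined. In FIG. 11, the range WA of the image shot with in-focus point FP2 and depth of field D2 is selected as the in-focus portion of subject SB0c. Reference sign A1 denotes part of FIG. 10, and reference sign A2 is a view showing only the range WA of the image shot with depth of field D2. Similarly, the range WB of the image shot with in-focus point FP3 and depth of field D3 is selected as the in-focus portion of subject SB0b, and the range WC of the image shot with in-focus point FP5 and depth of field D5 is selected as the in-focus portion of subject SB0a. The portion selected as range WB should be chosen from the area excluding range WA, and the portion selected as range WC from the area excluding ranges WA and WB. Extracting and combining the in-focus portions one after another in this way yields an image with a large depth of field.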
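The selection rule above can be sketched as follows, assuming each shot is a grid of pixel values, all shots share one grid of recorded distances, and the focusing distance of each shot is known. The name `stack_by_depth` and this data layout are illustrative assumptions rather than the patent's own implementation.

```python
def stack_by_depth(frames, focus_distances, depths):
    """Compose an image by taking each cell from the best-focused shot.

    frames[k][y][x] is the pixel of the shot focused at focus_distances[k];
    depths[y][x] is the recorded distance for that cell.
    """
    result = []
    for y, drow in enumerate(depths):
        out_row = []
        for x, d in enumerate(drow):
            # Pick the shot whose focusing distance is closest to the cell's distance.
            k = min(range(len(frames)), key=lambda i: abs(focus_distances[i] - d))
            out_row.append(frames[k][y][x])
        result.append(out_row)
    return result


frames = [[["a0", "a1"]], [["b0", "b1"]]]
stacked = stack_by_depth(frames, focus_distances=[2.5, 3.0], depths=[[2.6, 3.1]])
```

Because every cell is assigned to exactly one shot, the ranges never overlap, which has the same effect as choosing WB from outside WA and WC from outside WA and WB.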
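[0061]
In this way it is also possible, as shown in FIG. 12, to obtain an image in which the whole of a wall surface of a subject SB1 that is not parallel to the imaging plane is in focus. With analog cameras, the inclined surfaces of products or buildings have been photographed using a mechanism that tilts the optical axis, such as a shift lens. Here, an image with the whole of such a non-parallel wall surface in focus can be obtained without any such mechanism, which is highly effective.
[0062]
With the information recording apparatus according to this embodiment, the in-focus portions are extracted and combined, so an image with a large depth of field can be obtained.
[0063]
Even when an information recording apparatus other than that of Embodiment 2 is used, that is, even when the image information does not contain position information on the distance between the imaging device CM and the subject SB0, an information recording apparatus with the same effect can be realized. Namely, multiple pieces of image information are obtained while the in-focus point is changed stepwise, and the in-focus portions are extracted from each and combined to give an image with a deep depth of field. In this case, the in-focus portions can be identified by applying image processing that extracts high-frequency components to each of the pieces of image information.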
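The text does not say which high-frequency filter is used. The sketch below swaps in a simple stand-in: the sum of absolute differences between a pixel and its four neighbors, used as a per-pixel sharpness score on grayscale grids. The function names are hypothetical.

```python
def sharpness(img, y, x):
    """Sum of absolute differences to the four neighbors (edges are clamped)."""
    h, w = len(img), len(img[0])
    c = img[y][x]
    total = 0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny = min(max(y + dy, 0), h - 1)
        nx = min(max(x + dx, 0), w - 1)
        total += abs(c - img[ny][nx])
    return total


def stack_by_sharpness(frames):
    """For each pixel, take the value from the frame that is sharpest there."""
    h, w = len(frames[0]), len(frames[0][0])
    return [
        [max(frames, key=lambda f: sharpness(f, y, x))[y][x] for x in range(w)]
        for y in range(h)
    ]


# Frame A has detail on the left and is flat on the right; frame B is the reverse.
frame_a = [[0, 9, 0, 5, 5, 5]]
frame_b = [[3, 3, 3, 0, 9, 0]]
merged = stack_by_sharpness([frame_a, frame_b])
```

A practical version would probably smooth the score over a neighborhood before choosing, since a single-pixel score is sensitive to noise.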
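[0064]
<Embodiment 4>
Embodiment 4 of the present invention uses the information reproducing apparatus of Embodiment 2 to obtain stereoscopic video.
[0065]
FIG. 13 shows the principle of stereoscopic vision. When a person looks at a triangular-prism-shaped subject SB2 such as that in FIG. 13, the left side face S1 of the subject SB2 appears larger than the right side face S2 to the left eye, and the right side face S2 appears larger than the left side face S1 to the right eye. This parallax between the right and left eyes gives a person the sense of three-dimensional depth.
[0066]
Accordingly, the subject SB2 is recorded as a single piece of image information using the information recording apparatus of Embodiment 2, with the position information of its left side face S1 and right side face S2 added.
[0067]
The information reproducing apparatus of Embodiment 2 is then modified to reproduce a left-eye image and a right-eye image separately, taking parallax into account. Specifically, as shown in FIG. 14, the position information is used to create a left-eye image SB2L consisting of a left side face S1L lengthened horizontally by the parallax and a right side face S2L shortened horizontally by the parallax, and a right-eye image SB2R consisting of a right side face S2R lengthened horizontally by the parallax and a left side face S1R shortened horizontally by the parallax. The left-eye and right-eye images are then each reproduced.
[0068]
The left-eye image SB2L and the right-eye image SB2R of course contain the background as well as the subject. The background may also be corrected horizontally in the same way as the subject. However, the amount of correction applied to the background may differ from that applied to the subject, so horizontal correction of the subject may open a gap between the background and the subject. In that case, the gap can be filled, for example, with a color obtained by averaging the colors of the surrounding pixels.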
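The gap-filling step can be illustrated with a per-pixel variant of the idea. Note that this sketch swaps in a different technique from the face-by-face stretching described above: each pixel of one image row is shifted horizontally by an amount inversely proportional to its recorded distance, and any gap is filled with the average of the nearest filled neighbors. The `scale` constant, the shift directions and the function name are assumptions for illustration.

```python
def shift_row(row, depths, scale, direction):
    """Shift each pixel by round(scale / depth) * direction, then fill gaps."""
    w = len(row)
    out = [None] * w
    # Draw far pixels first so that nearer pixels overwrite them.
    for x in sorted(range(w), key=lambda i: -depths[i]):
        nx = x + direction * round(scale / depths[x])
        if 0 <= nx < w:
            out[nx] = row[x]
    for x in range(w):
        if out[x] is None:
            # Fill a gap with the average of the nearest filled neighbors.
            left = next((out[i] for i in range(x - 1, -1, -1) if out[i] is not None), None)
            right = next((out[i] for i in range(x + 1, w) if out[i] is not None), None)
            known = [v for v in (left, right) if v is not None]
            out[x] = sum(known) / len(known) if known else 0
    return out


row = [10, 20, 30, 40, 50]
depths = [10, 10, 2, 10, 10]  # one near pixel in front of a distant background
left_eye = shift_row(row, depths, scale=2, direction=+1)
right_eye = shift_row(row, depths, scale=2, direction=-1)
```

Only the near pixel moves in this example, and the hole it leaves behind is filled with the average of its two neighbors.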
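[0069]
The two reproduced images become stereoscopic video when viewed through stereoscopic glasses or the like.
[0070]
With the information reproducing apparatus according to this embodiment, a left-eye image and a right-eye image whose horizontal distances are corrected by the parallax are created from a single piece of image information using the position information, so there is no need to record both right-eye and left-eye images as in conventional stereoscopic video.
[0071]
<Embodiment 5>
Embodiment 5 of the present invention uses the information recording apparatus of Embodiment 2 to improve the accuracy of position measuring devices for mobile units that use GPS or PHS. That is, the information recording apparatus of Embodiment 2 further includes a position measuring device for measuring its own position; the position measured by that device is taken as a provisional current location, and the position information added to the image information is used to improve the accuracy of the measurement and find the true current location.
[0072]
For example, as shown in FIG. 15, the distances from the current location to two objects B1 and B2 serving as landmarks, such as buildings, are measured using the information recording apparatus of Embodiment 2. Objects that appear on the map held in the position measuring device are chosen as B1 and B2. Next, as shown in FIG. 16, the range AR1 of the current location identified by the GPS- or PHS-based position measuring device is displayed on the map MP. If the center P1 of the range AR1 were the current location, the distance DG1 between P1 and the object B1 and the distance DG2 between P1 and the object B2 should match the distances obtained by the information recording apparatus of Embodiment 2. If they do not match, the current location is not P1.
[0073]
In that case, using the distances from the current location to the objects B1 and B2 obtained with the information recording apparatus of Embodiment 2, a circle CL1 centered on the object B1 is drawn with the distance DS1 from the current location to B1 as its radius, and likewise a circle CL2 centered on the object B2 is drawn with the distance DS2 from the current location to B2 as its radius. Of the two intersections P2 and P3 of the circles, the one nearer to the range AR1 obtained by the position measuring device is adopted as the true current location.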
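The construction with circles CL1 and CL2 can be sketched with standard two-circle intersection geometry. Map coordinates are assumed to be planar (x, y) pairs in the same unit as the measured distances, and "nearer to the range AR1" is approximated as nearer to the provisional point P1. The function name is hypothetical.

```python
import math


def refine_position(b1, r1, b2, r2, provisional):
    """Intersect the circles around landmarks b1 and b2; pick the point nearest P1."""
    d = math.dist(b1, b2)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return None  # the circles do not intersect
    # Distance from b1 to the foot of the chord, and half the chord length.
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    ux, uy = (b2[0] - b1[0]) / d, (b2[1] - b1[1]) / d
    mx, my = b1[0] + a * ux, b1[1] + a * uy
    p2 = (mx - h * uy, my + h * ux)
    p3 = (mx + h * uy, my - h * ux)
    return min((p2, p3), key=lambda p: math.dist(p, provisional))


true_location = refine_position((0.0, 0.0), 5.0, (6.0, 0.0), 5.0, provisional=(2.5, 3.0))
no_solution = refine_position((0.0, 0.0), 1.0, (10.0, 0.0), 1.0, provisional=(0.0, 0.0))
```

When measurement error keeps the circles from touching, the function returns `None`; how the apparatus should behave in that case is not stated in the text.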
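[0074]
The information recording apparatus according to this embodiment further includes a position measuring device, displays a provisional current location on a map, selects two objects from the image information, draws two circles centered on the objects with the respective distances to them as radii, and judges the intersection nearer to the provisional current location to be the true current location. The accuracy of the position measuring device can therefore be improved.
[0075]
<Embodiment 6>
In Embodiment 6 of the present invention, in the information recording apparatus and information reproducing apparatus of Embodiment 2, where the subject or background contains characters, the characters are recognized from the image and replaced with text information before the information is stored.
[0076]
When the subject or background contains characters, holding the information coded as text is more data-efficient than storing it as bitmap data. Moreover, holding it as text makes processing easy when the position information of the background or subject changes, such as changing the font size of text information C1 to C2 or C3 as shown in FIG. 17. The color or other attributes of the text may likewise be changed to follow changes in the position information of the background or subject.
[0077]
[Effects of the invention]
According to the invention of claim 1, image information of the subject and background is recorded with their position information added, so the image information can be processed using that position information when it is reproduced. In addition, since it is detected whether the equidistant surface shakes in the screen, camera shake can be detected.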
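[Brief description of the drawings]
[FIG. 1] A diagram showing a scene in which the information recording apparatus according to Embodiment 1 of the present invention is used.
[FIG. 2] A diagram showing a scene in which the information reproducing apparatus according to Embodiment 1 of the present invention is used.
[FIG. 3] A diagram showing the parameters of the Doppler effect.
[FIG. 4] A block diagram showing the configuration of the information reproducing apparatus according to Embodiment 1 of the present invention.
[FIG. 5] A block diagram showing another configuration of the information reproducing apparatus according to Embodiment 1 of the present invention.
[FIG. 6] A block diagram showing the configuration of the information recording apparatus according to Embodiment 2 of the present invention.
[FIG. 7] A diagram showing camera-shake correction in the information recording apparatus according to Embodiment 2 of the present invention.
[FIG. 8] A diagram showing movement of a subject in the information recording apparatus or information reproducing apparatus according to Embodiment 2 of the present invention.
[FIG. 9] A diagram explaining depth of field.
[FIG. 10] A diagram showing subjects being photographed with the information recording apparatus according to Embodiment 3 of the present invention.
[FIG. 11] A diagram showing images being combined with the information recording apparatus according to Embodiment 3 of the present invention.
[FIG. 12] A diagram showing the photographing of a subject that has a surface not parallel to the imaging plane.
[FIG. 13] A diagram explaining stereoscopic vision.
[FIG. 14] A diagram showing the images created by the information recording apparatus according to Embodiment 4 of the present invention.
[FIG. 15] A diagram showing the two objects serving as landmarks in the information recording apparatus according to Embodiment 5 of the present invention.
[FIG. 16] A diagram showing the method of determining the current location in the information recording apparatus according to Embodiment 5 of the present invention.
[FIG. 17] A diagram showing how the character size changes in the information recording apparatus according to Embodiment 6 of the present invention.
[FIG. 18] A diagram showing conventional stereo audio information.
[FIG. 19] A diagram showing an image of the sound field data of conventional stereo audio information.
[FIG. 20] A diagram showing an image of the sound field data of conventional 3D sound technology.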
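[Explanation of reference signs]
SD1 to SD10: audio track data
IFS: sound source position information
IFL, IFLa to IFLc: listener position information
ST1, ST1a to ST1c: relative relationship calculation processing blocks
ST2: pitch change processing block
ST3: propagation characteristic change processing block
ST23a to ST23c: audio playback modification processing blocks
ST4: audio reproduction processing block
ST4a to ST4c: listener-specific audio reproduction processing blocks
CM: imaging device
SS: sensor element
SB0, SB0a to SB0c, SB1, SB2: subjects
OB: object
FP1 to FP7: in-focus points
D1 to D7: depths of field
AR1: range identified by the position measuring device
B1, B2: objects serving as landmarks
C1 to C3: character fonts

[0001]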
[Technical field of the invention]
The present invention relates to an information recording apparatus that records so-called multimedia information such as audio information and image information, and an information reproducing apparatus that reproduces the information.
[0002]
[Prior art]
As the information processing capability of microprocessors improves, the capabilities of information recording apparatuses that record multimedia information and information reproducing apparatuses that reproduce it are advancing rapidly. In the field of audio information, for example, there are component stereo systems that use a DSP (digital signal processor) to apply delay processing to the reproduced sound, or reverberation processing that generates echoes, and so create a variety of sound fields. In the field of image information, there are digital cameras and personal computers that record images as digital information and can perform various kinds of image processing.
[0003]
[Problems to be solved by the invention]
Because these conventional information recording and reproducing apparatuses use a small number of speakers or a display as the reproducing means, information is recorded in a planar manner, and it has not been possible to obtain a sufficient sense of reality or depth, or to make the information sufficiently convenient to use. Here, "recorded in a planar manner" means that information on exact position, such as the depth or vertical position of a sound source or subject, is not recorded at all or is recorded only insufficiently.
[0004]
When stereo audio information is recorded, for example, sound images are localized in the left-right direction by the volume balance, time difference and so on between the left and right channels. That is, as shown in FIG. 18, stereo audio information is recorded so that, when the audio information reproduced from the right speaker SPR and the left speaker SPL reaches a listener located at the listening point LP, the sound image is localized somewhere along the distance DC between the speakers. FIG. 19 represents this as an image of sound field data. In FIG. 19, the sound field SF2 spreading in front of the listening point LP is made up of the sound field data image of the right channel Rch and that of the left channel Lch, both shown above it. The circles SD1L to SD3L and SD1R to SD3R in these sound field data images indicate the volume of each sound source and its distribution in the sound field. For example, the left and right volumes SD2R and SD2L of one sound source are about equal, so it is localized near the center of the sound field. The left and right volumes SD3R and SD3L of another sound source, on the other hand, are larger on the right than on the left, so it is localized toward the right.
[0005]
Thus, controlling the volume ratio of the left and right speakers provides left-right localization but gives no sense of depth, height, or front-back position.
[0006]
As an improvement, there is an audio signal correction technique, called 3D sound among other names, that gives a sense of depth by offsetting the sounding time between the left and right speakers (introducing a phase difference), and gives a sense of vertical and front-back direction by taking into account the way the listener's auricle helps locate sound sources. FIG. 20 represents this technique as an image of sound field data. Additional information AD1L to AD3L and AD1R to AD3R on corrections such as phase difference is added to the circles SD1L to SD3L and SD1R to SD3R, which indicate the volume of each sound source and its distribution in the sound field. As a result, the sound field SF3 extends outside the speakers and in front of, behind, to the left and right of, and above and below the listening point, and is larger than the sound field SF2.
[0007]
With this technique, however, the recording engineer adds the additional information when the audio information from each sound source is recorded, so the engineer's experience and subjective judgment play a large part. Accurate position information is therefore not necessarily recorded.
[0008]
There was also the problem that, if the listener steps outside the triangular region of FIG. 18 bounded by the inter-speaker distance DC and the distances DL and DR between the left and right speakers and the listening point, the sound field becomes unbalanced and it becomes difficult to obtain a sense of presence.
[0009]
As for image information, a person or the like is often placed in a landscape and recorded. Here too, the image is merely recorded in a planar manner, and no information on the position or depth of the subject is recorded. Consequently, when such image information is captured with a digital camera and only the person is to be cut out from the background on a personal computer, for example, the two can be distinguished only by differences in color tone between person and background, by how well each is in focus, and so on, and the distinction is sometimes difficult.
[0010]
The present invention solves the above problems and realizes an information recording apparatus and an information reproducing apparatus that record audio information, image information and the like with information on the position of the sound source or subject added, and that make effective use of the position information when the information is reproduced.
[0011]
[Means for Solving the Problems]
The invention according to claim 1 is an information recording apparatus that records image information of a subject and a background while adding the distances to the subject and the background as position information, wherein the position information of the subject and the background changes over time, and the apparatus detects whether an equidistant surface of the subject from the information recording apparatus shakes within the screen.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
<Embodiment 1>
Embodiment 1 of the present invention shows an information recording apparatus that records audio information while adding position information defining the spatial position of the sound source, and an information reproducing apparatus that reproduces the audio information to which position information has been added while making use of that position information.
[0019]
FIG. 1 is a diagram showing a scene in which the information recording apparatus according to this embodiment is used. FIG. 1 shows a band's performance on a stage being recorded. Recording is generally done by multitrack recording, in which a track is assigned to each instrument and its performance is recorded. Here, as an example, microphones Mc1 to Mc3 are assigned to the tenor saxophone Ts, alto saxophone As and soprano saxophone Ss, microphone Mc4 to the piano pf, microphone Mc5 to the drums Ds, microphones Mc6 to Mc8 to the trumpets Tp1 to Tp3, microphone Mc9 to the trombone Tb, and microphone Mc10 to the bass B.
[0020]
The position on the stage is expressed, as shown in FIG. 1 by way of example, in coordinates whose origin is the left end of the front edge as seen facing the stage, with the depth direction as the Y axis and the left-right direction as the X axis.
[0021]
[Table 1]

[0022]
Table 1 shows the microphones Mc1 to Mc10, their positions, and the recorded audio track data numbers SD1 to SD10. Unlike the conventional case, the information recording apparatus according to the present embodiment does not mix down multitrack data into two stereo channels, but retains recorded audio information as multitrack.
[0023]
In this embodiment, when each track is recorded, not only the audio information but also position information defining the spatial position of the sound source is recorded. The position information may be written to a dedicated position-information track provided for each track, or to an unused part of the track to which the audio information is written. It may be written once as a fixed value, written periodically as a changing value, or written only when the position information changes.
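The "write only when the position changes" option could look like the following minimal sketch. The `PositionTrack` class, its sample layout of (time, x, y), and its method names are hypothetical; the patent does not specify a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class PositionTrack:
    """Hypothetical side track holding (time, x, y) samples for one audio track."""

    samples: list = field(default_factory=list)

    def write_on_change(self, t, x, y):
        # Store a sample only when the position differs from the last stored one.
        if not self.samples or self.samples[-1][1:] != (x, y):
            self.samples.append((t, x, y))

    def position_at(self, t):
        # Return the most recent stored position at or before time t.
        current = None
        for ts, x, y in self.samples:
            if ts > t:
                break
            current = (x, y)
        return current


track = PositionTrack()
track.write_on_change(0.0, 1.0, 2.0)
track.write_on_change(1.0, 1.0, 2.0)  # unchanged, so nothing is stored
track.write_on_change(2.0, 3.0, 2.0)
```

A fixed source needs a single sample, so the same structure also covers the "write once" case.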
[0024]
Note that the position information of the sound source may be determined based on the position of the microphone, or may be determined based on the position of the performer or the musical instrument.
[0025]
FIG. 1 and Table 1 show two-dimensional position information on the X and Y axes to keep the presentation simple, but a coordinate component along a Z axis perpendicular to both may be added to give three-dimensional position information.
[0026]
The use of audio information recorded with position information in this way is described below. As with the DSP-equipped component stereo systems mentioned above, applying delay processing and reverberation processing to the sound can give a sense of depth. Correcting the audio signal in consideration of the auricle's role in locating sound sources, phase differences and so on can give a sense of vertical and front-back direction. For such delay, reverberation and correction processing, the techniques used in conventional component stereo systems and 3D sound technology can be applied as they are. These kinds of processing depend heavily on the positional relationship between the sound source and the listener. Once the reverberation level, the amount of delay, the propagation medium, the wall materials and so on have been decided in advance, the processing parameters are determined automatically when the positional relationship between sound source and listener is fixed. In the present application, these processing parameters are referred to as the "propagation characteristics of audio information". Besides delay processing, reverberation processing, and correction processing based on the auricle's sound-locating action, phase differences and the like, the propagation characteristics of audio information also include varying the volume level over time to express the influence of wind or of the materials of walls and the like, and devising the delay processing and volume changes to express the air temperature and the type (water, air, etc.) and density of the propagation medium, which are factors that change the speed of sound.
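To illustrate how parameters can follow from the positional relationship alone, the sketch below derives just two of them, a propagation delay and a distance gain, from source and listener coordinates. The inverse-distance gain law, the default speed of sound of 343 m/s, and the function name are assumptions chosen for illustration; the text does not prescribe particular formulas.

```python
import math


def propagation_params(source, listener, speed_of_sound=343.0, ref_distance=1.0):
    """Return (delay_seconds, gain) for one source/listener pair."""
    distance = math.dist(source, listener)
    delay = distance / speed_of_sound
    # Inverse-distance attenuation, clamped so the gain never exceeds 1.
    gain = ref_distance / max(distance, ref_distance)
    return delay, gain


delay, gain = propagation_params((0.0, 0.0), (3.0, 4.0))
_, near_gain = propagation_params((0.0, 0.0), (0.5, 0.0))
```

Changing `speed_of_sound` is one simple way to reflect a different temperature or propagation medium in the delay.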
[0027]
Because the audio information recorded by the information recording apparatus according to this embodiment has the position information of each sound source attached, the propagation characteristics of the audio information can be determined for each sound source. In a conventional component stereo system, for example, when propagation characteristics are determined for stereo audio information, the processing is applied uniformly to the mixed-down audio rather than to each sound source, and a three-dimensional effect is hard to obtain. If the propagation characteristics can be determined for each sound source, audio information with a greater sense of reality can be reproduced. Likewise, with 3D sound technology the recording engineer's experience and subjective judgment play a large part, so accurate position information is not necessarily recorded; if position information is attached to each sound source, the propagation characteristics can be determined more precisely using accurate position information.
[0028]
FIG. 2 is a diagram showing a scene in which the information reproducing apparatus according to the present embodiment that determines the propagation characteristics of audio information for each sound source is used. FIG. 2 shows a sound field data image of the sound field SF1 formed when each of the sound track data SD1 to SD10 shown in Table 1 is reproduced from the speakers SPL and SPR. The sound field data image of each audio track data SD1 to SD10 corresponds to the arrangement of each instrument on the actual stage shown in FIG.
[0029]
This sound field data image shows the case in which the propagation characteristics of the audio information have been determined so as to be optimal when the listener is at the listening point LP1. If the listener moves from the listening point LP1 to the listening point LP2, the propagation characteristics are no longer optimal as they stand, so the listening point LP2 should be detected and the propagation characteristics determined anew. To identify the listener's location, the apparatus may wait for the listener to enter position information, or the information reproducing apparatus according to this embodiment may be provided with a CCD distance-measuring sensor or an infrared sensor to detect it automatically.
[0030]
FIG. 2 shows, as an example, a sound field formed by two speakers, but when more speakers are present the audio information output by each may of course be varied according to the arrangement of the speakers. When the audio processing capability of the information reproducing apparatus according to this embodiment is low and it is difficult to reproduce every track of the multitrack data independently, the audio information of sound sources located close to one another may, for example, be combined into one to reduce the number of tracks.
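Combining nearby sources can be sketched as a simple greedy grouping. The assumptions here are that each track is a (position, samples) pair with equal-length sample lists, that a track joins the first group whose first member lies within `max_distance`, and that a merged track takes the centroid position and the sample-wise sum. None of these choices come from the text.

```python
import math


def merge_nearby_tracks(tracks, max_distance):
    """tracks: list of (position, samples). Returns a shorter list of the same shape."""
    groups = []
    for pos, samples in tracks:
        for group in groups:
            if math.dist(group[0][0], pos) <= max_distance:
                group.append((pos, samples))
                break
        else:
            groups.append([(pos, samples)])
    merged = []
    for group in groups:
        n = len(group)
        dims = len(group[0][0])
        centroid = tuple(sum(p[i] for p, _ in group) / n for i in range(dims))
        mixed = [sum(vals) for vals in zip(*(s for _, s in group))]
        merged.append((centroid, mixed))
    return merged


tracks = [
    ((0.0, 0.0), [1, 2]),
    ((0.5, 0.0), [10, 20]),
    ((5.0, 5.0), [100, 200]),
]
reduced = merge_nearby_tracks(tracks, max_distance=1.0)
```

The first two sources are 0.5 apart and collapse into one track; the third stays separate.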
[0031]
When a sound source moves (for example, when a wireless microphone is used), the audio information should be reproduced while its frequency is changed in consideration of the Doppler effect arising between the position of the sound source and the position of the listener. The Doppler effect is the phenomenon in which the frequency of sound emitted from a moving sound source differs from its frequency when the source is stationary. This phenomenon is expressed quantitatively as in Expression 1.
[0032]
[Expression 1]

[0033]
In Expression 1, f is the frequency of the audio information received by the listener, f₀ is the frequency of the sound emitted by the sound source when stationary, and c is the speed of sound. The other parameters are as shown in FIG. 3: v₀ is the absolute value of the listener's velocity at the listener's current point 0, v_S is the absolute value of the sound source's velocity at its current point S, and φ and θ are the angles that the listener's velocity and the sound source's velocity, respectively, make with the straight line connecting the listener's current point 0 and the sound source's current point S.
[0034]
Therefore, for audio information emitted from a moving sound source, a correction may be applied that multiplies the frequency of the audio information by the coefficient of f₀ in Expression 1, which is determined by c, v₀, v_S, φ and θ. The speed of sound c is determined by fixing parameters such as the air temperature and the propagation medium, and v₀, v_S, φ and θ can be obtained by calculating the change over time of the position information of the sound source and of the listener, so finding the coefficient of f₀ in Expression 1 is not difficult.
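Expression 1 itself is not reproduced in this text, so the sketch below uses the standard textbook form of the Doppler relation, f = f₀ (c + v₀ cos φ) / (c − v_S cos θ). It assumes that both angles are measured so that 0 means "moving directly toward the other party"; the sign convention of FIG. 3 may differ. The function name is hypothetical.

```python
import math


def doppler_coefficient(c, v0, vs, phi, theta):
    """Coefficient that multiplies f0, under the angle convention stated above."""
    denominator = c - vs * math.cos(theta)
    if denominator <= 0:
        raise ValueError("source speed along the line must be below the speed of sound")
    return (c + v0 * math.cos(phi)) / denominator


at_rest = doppler_coefficient(340.0, 0.0, 0.0, 0.0, 0.0)
approaching = doppler_coefficient(340.0, 0.0, 34.0, 0.0, 0.0)
receding = doppler_coefficient(340.0, 0.0, 34.0, 0.0, math.pi)
```

A source approaching at one tenth of the speed of sound raises the received frequency by a factor of 340/306, while the same source receding lowers it by a factor of 340/374.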
[0035]
FIG. 4 shows a block diagram of an information reproducing apparatus with a function for reproducing the Doppler effect described above. In FIG. 4, the relative relationship calculation processing block ST1 receives the sound source position information IFS and the listener position information IFL, calculates positional quantities such as the distance between the two, and calculates v₀, v_S, φ and θ from the change over time of the position information of the sound source and the listener. It sends this information to the pitch change processing block ST2 and the propagation characteristic change processing block ST3. The pitch change processing block ST2 is given the audio information and environment information for the virtual space (information such as the type of propagation medium and the air temperature) and adds the Doppler effect to the audio information. The propagation characteristic change processing block ST3 is given the output of the pitch change processing block ST2 and the environment information for the virtual space and adds propagation characteristics to the audio information. The output of the propagation characteristic change processing block ST3 is given to the audio reproduction processing block ST4 and conveyed to the listener.
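The quantities computed by block ST1 can be sketched from two successive position samples. This assumes 2D positions sampled `dt` seconds apart and measures each angle between the mover's velocity and the line toward the other party; the function name and the finite-difference velocity estimate are illustrative choices, not details given in the text.

```python
import math


def speed_and_angle(prev, curr, other, dt):
    """Speed of a mover and the angle between its velocity and the line to `other`."""
    vx, vy = (curr[0] - prev[0]) / dt, (curr[1] - prev[1]) / dt
    speed = math.hypot(vx, vy)
    lx, ly = other[0] - curr[0], other[1] - curr[1]
    line = math.hypot(lx, ly)
    if speed == 0.0 or line == 0.0:
        return speed, 0.0  # the angle is undefined; 0 is returned by convention
    cos_a = (vx * lx + vy * ly) / (speed * line)
    # Clamp against rounding error before taking the arc cosine.
    return speed, math.acos(max(-1.0, min(1.0, cos_a)))


# Source moving straight toward a listener at (10, 0), sampled every 0.5 s.
v_s, theta = speed_and_angle((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), dt=0.5)
# Listener moving at right angles to the line toward the source.
v_0, phi = speed_and_angle((0.0, 0.0), (0.0, 1.0), (5.0, 1.0), dt=1.0)
```

Calling the function once for the source and once for the listener yields the four values that the pitch change block needs.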
[0036]
Further, when there are a plurality of listeners and each listener is at a different position, relative relationship calculation processing blocks ST1a to ST1c, audio reproduction processing blocks ST23a to ST23c, and listener-specific audio reproduction processing blocks ST4a to ST4c may be provided for each listener, as shown in the block diagram of FIG. 5. Because the relative relationship calculation processing blocks ST1a to ST1c are provided for each listener, the listener position information IFLa to IFLc is also collected for each listener and input to the corresponding relative relationship calculation processing block. Note that the audio reproduction processing blocks ST23a to ST23c collectively represent the pitch change processing block ST2 and the propagation characteristic change processing block ST3 in FIG. 4. The reproduction processing block is provided for each listener in order to prevent interference with other listeners. Specific examples of the listener-specific audio reproduction processing blocks ST4a to ST4c include headphones and superdirective speakers.
[0037]
In this case, since a different reproduction process is performed on the same audio information for each listener, a sound field suited to each listener can be formed. This makes it possible, for example, to create a situation in which the sound source and the listener move around in a virtual reality space, to set the sound field individually according to the seat position of the driver or navigator with an in-car audio reproduction apparatus, to perform sound field correction that takes the seat arrangement of a concert hall into account with the audio reproduction apparatus of FIG. 1, or to correct the difference in sound field that depends on the position of an audience seat in a concert hall.
[0038]
If the information recording apparatus according to the present embodiment is used, the audio information emitted from the sound source is recorded while the position information of the sound source is added, so processing that uses the position information of the sound source can be applied to the audio information when it is reproduced.
[0039]
Also, if the information reproducing apparatus according to the present embodiment is used, the audio information is reproduced while its propagation characteristics are determined using the position information of the sound source, so audio information with a sense of reality and a stereoscopic effect can be given to the listener. Furthermore, if the position information of the listener is also used when determining the propagation characteristics, the listener can be given more realistic and stereoscopic audio information according to the listener's position. Further, if the frequency of the audio information is changed in consideration of the Doppler effect arising between the position of the sound source and the position of the listener, audio information with an even greater sense of reality and stereoscopic effect can be given to the listener. In addition, when there are a plurality of listeners, the audio information is reproduced for each listener while the propagation characteristics are determined, or additionally while the frequency of the audio information is changed, using the position information corresponding to each listener, so audio information with a sense of reality and a stereoscopic effect can be given to each of the plurality of listeners.
[0040]
<Embodiment 2>
Embodiment 2 of the present invention shows an information recording apparatus that records image information while adding the distances to the subject and the background as position information, and an information reproducing apparatus that reproduces the image information to which the position information has been added, using that position information.
[0041]
FIG. 6 is a diagram showing the configuration of the information recording apparatus according to the present embodiment. In FIG. 6, both the image information captured by the imaging device CM, such as a digital camera, and the position information of the subject SB0 and the background BG captured by a distance-measuring sensor element SS (an infrared sensor, a CCD distance measuring sensor, an ultrasonic sensor, a gravity/pressure sensor, or the like) provided near the imaging device CM are converted into data, and image information GA to which the position information is added is obtained. Note that the position information of the subject SB0 and the background BG indicates the distance between the imaging device CM and the subject SB0 and the distance between the imaging device CM and the background BG.
[0042]
The image information GA records not only the images of the subject SB0 and the background BG but also the distance between the imaging device CM and the subject SB0 or the background BG for each unit section (for example, the screen divided vertically and horizontally into several equal parts, and ultimately in units of pixels). The subject SB0 includes three objects SB0a, SB0b, and SB0c. In FIG. 6, as an example, the distance to the front part of the right object SB0c in the foreground is 2.5 m, the distance to the front part of the left object SB0a in the foreground is 2.7 m, and the distance to the front part of the center object SB0b at the back is 3.0 m. The distance to the background BG is shown as 10.0 m.
[0043]
In addition, the information reproducing apparatus according to the present embodiment is an apparatus that displays the image information GA recorded in this way together with, or separately from, the position information of the subject SB0 and the background BG. When the position information of the subject is added to the image information in this way, the background and the subject can be easily distinguished when the image information is reproduced. For example, image processing such as cutting out only a person from the background becomes easy.
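The following Python sketch illustrates how a per-pixel distance could be used to cut the subject out of the background. The function name `cut_out_subject`, the list-of-rows representation, and the 5.0 m threshold are assumptions chosen for this example; the distance values follow the FIG. 6 example (2.5 m, 2.7 m, 3.0 m for the subject, 10.0 m for the background).

```python
def cut_out_subject(pixels, distances, max_distance_m, fill=None):
    """Keep pixels whose recorded distance is at most max_distance_m.

    pixels and distances are lists of rows of the same shape; every other
    pixel is replaced with fill.
    """
    return [
        [p if d <= max_distance_m else fill for p, d in zip(prow, drow)]
        for prow, drow in zip(pixels, distances)
    ]

# Hypothetical 2 x 3 image: letters stand for subject pixels, "bg" for background.
pixels = [["a", "bg", "b"], ["c", "bg", "bg"]]
distances = [[2.7, 10.0, 3.0], [2.5, 10.0, 10.0]]  # metres
subject_only = cut_out_subject(pixels, distances, max_distance_m=5.0)
```

A single threshold is the simplest possible rule; a practical device would presumably choose it per scene.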
[0044]
Further, when the imaging device CM is a video camera capable of shooting moving images, the position information added to the image information can be used for camera shake correction, as shown in FIG. 7. That is, if the equidistant surface of the subject from the imaging device shakes in small increments across the entire screen, camera shake can be detected. If the movement due to camera shake is then corrected, a moving image can be recorded as if there were no camera shake.
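One conceivable way to detect such a whole-screen shift is sketched below in Python. It compares the distance maps of two consecutive frames and searches for the small translation that aligns them best. The function name `estimate_shift`, the brute-force search, and the mean absolute difference criterion are assumptions for this illustration, not a method stated in this description.

```python
def estimate_shift(prev, curr, max_shift=2):
    """Return (dx, dy) such that curr[y + dy][x + dx] best matches prev[y][x].

    prev and curr are distance maps (lists of rows) of the same size.
    """
    h, w = len(prev), len(prev[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, n = 0.0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        err += abs(curr[yy][xx] - prev[y][x])
                        n += 1
            # Mean absolute difference over the overlapping area.
            if n and err / n < best_err:
                best_err, best = err / n, (dx, dy)
    return best
```

A small nonzero shift that applies to the whole distance map and reverses direction from frame to frame would suggest camera shake, and the frame could then be translated back by the estimated amount.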
[0045]
Further, the information recording apparatus according to the present embodiment may be used in combination with the information recording apparatus for audio information according to the first embodiment. That is, when image information is recorded, an object OB (corresponding to the sound source in the first embodiment) delimited by equidistant surfaces in the screen GA, as shown in FIG. 8, is recognized by a method such as image recognition, and the position information of the sound source to be recorded is updated as the object moves. Then, even if the information recording apparatus for audio information in the first embodiment cannot record temporal change data of the position information of the sound source, the sound source can be moved in accordance with the movement of the object OB.
[0046]
Further, the information reproducing apparatus according to the present embodiment may likewise be used in combination with the information reproducing apparatus for audio information according to the first embodiment. That is, an object OB (corresponding to the sound source) delimited by equidistant surfaces in the screen GA, as shown in FIG. 8, is recognized by a technique such as image recognition, and the position information of the sound source to be reproduced is updated as the object moves. Then, even if the information reproducing apparatus for audio information in the first embodiment has no temporal change data for the position information of the sound source, the sound source can be moved in accordance with the movement of the object OB.
[0047]
If the information recording apparatus according to the present embodiment is used, the image information of the subject and the background is recorded while the position information of the subject and the background is added, so processing that uses the position information of the subject and the background can be applied to the image information when it is reproduced. Further, camera shake can be detected by detecting whether or not the equidistant surface moves in small increments on the screen. In addition, since the position information of the sound source is updated as the subject moves, even an information recording apparatus that cannot record temporal change data of the position information of the sound source can move the sound source in accordance with the movement of the subject.
[0048]
In addition, if the information reproducing apparatus according to the present embodiment is used, the portion to be image-processed is determined using the position information of the subject, and the image information is reproduced while image processing is applied to that portion, so it is possible, for example, to increase the compression ratio of a subject that is far away or to separate the subject from the background. In addition, since the position information of the sound source is updated as the subject moves, even an information reproducing apparatus that has no temporal change data for the position information of the sound source can move the sound source in accordance with the movement of the subject.
[0049]
<Embodiment 3>
The third embodiment of the present invention uses the information recording apparatus shown in the second embodiment in order to obtain an image with a large depth of field.
[0050]
FIG. 9 explains the depth of field. An image taken with an imaging device CM such as an ordinary analog camera, digital camera, or video camera usually has a focal point (the in-focus position, which lies at some distance from the imaging device) and a depth of field (the range that is in focus in front of and behind the focal point).
[0051]
The greater the depth of field, the wider the range of focus in the depth direction, and a clear image can be obtained.
[0052]
The depth of field becomes shallow under the following three conditions: (1) the focal length of the taking lens is long, (2) the shooting distance to the subject is short, and (3) the aperture value of the taking lens is small (the aperture is open). For example, when (1) a taking lens with a long focal length (about 100 to 200 mm in 35 mm film terms) is used, (2) a flower is photographed at close range (several tens of centimeters), and (3) the aperture value (focal length / effective pupil diameter) is near wide open, f = 2.8 or less, the entire depth of field is only a few centimeters.
[0053]
With a depth of field of a few centimeters, when photographing a flower, for example, focusing on the center of the flower leaves the surrounding petals out of focus. To bring the whole flower, or the stem and leaves, into focus, the only option is to increase the aperture value of (3) (stop down). The amount of exposure light then inevitably decreases and the shutter speed becomes slow (with ordinary shooting light, stopping down to about f = 32 keeps the shutter open from a fraction of a second to a few seconds), so the picture becomes unusable because of camera shake and wind.
[0054]
To address the shallow depth of field in close-up shooting, some cameras have a mechanism that stops the aperture down to f = 45 and compensates for the underexposure with a strobe. However, because natural light and artificial light differ (in color, incident angle, light distribution, diffusion, and so on), the finished photograph looks considerably different. In addition, new problems arise: the strobe light is reflected in the subject, and the strobe light does not reach beyond a certain distance (strobe light reach = guide number / aperture value × film sensitivity correction).
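The reach formula quoted above can be evaluated with a few lines of Python. In this sketch the "film sensitivity correction" is taken to be sqrt(ISO / 100), the usual convention when the guide number is quoted in metres at ISO 100; that convention, the function name `strobe_reach_m`, and the guide number 36 used in the example are assumptions.

```python
import math

def strobe_reach_m(guide_number, f_number, iso=100):
    """Approximate strobe reach in metres for a guide number quoted at ISO 100."""
    return guide_number / f_number * math.sqrt(iso / 100)

# A hypothetical strobe with guide number 36, stopped down to f = 45.
reach = strobe_reach_m(36, 45)
```

Under these assumptions the reach is well under one metre at f = 45, which shows why strongly stopping down limits how far the strobe can light the scene.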
[0055]
Such a depth-of-field problem occurs not only in analog cameras but also in digital cameras and video cameras that use optical systems. In actual shooting, however, intentionally blurring the background can serve photographic and artistic expression, so a shallow depth of field is not in itself a drawback of optical systems in general. Rather, the problem has been that setting and controlling the depth of field the photographer intends, by matching the above three conditions (1) to (3) to the amount of light on the spot, is difficult without knowledge and experience and depends on skill.
[0056]
Therefore, an image with a large depth of field is obtained by using the information recording apparatus described in the second embodiment.
[0057]
First, as shown in FIG. 10, which is a view of the subjects SB0a to SB0c from above, the subjects SB0a to SB0c are photographed with an information recording apparatus that includes the imaging device CM while the focal point is changed stepwise, for example from FP1 to FP7, to obtain image information with position information. The depths of field corresponding to the focal points FP1 to FP7 are denoted D1 to D7. The spacing between focal points is preferably determined by roughly calculating the depth of field so that the depths of field leave no gaps, but a fixed value such as 3 cm or 5 cm may also be set as appropriate.
[0058]
In the above example, photographing is performed while the focal point is changed stepwise from FP1 to FP7, so seven pieces of image information with different focus conditions are obtained. If the in-focus portion is extracted from each of the seven pieces of image information and the portions are combined, an image with a deep depth of field can be obtained.
[0059]
To extract the in-focus portion from each piece of image information, the position information on the distance between the imaging device CM and the subjects SB0a to SB0c contained in each piece of image information is used: it suffices to extract the part of the image information for which the distance to the imaged surface of the subject is close to the distance to the focal point.
[0060]
FIG. 11 shows how the in-focus portion is extracted from each piece of image information and the portions are combined. In FIG. 11, the range WA is selected, as the portion in which the subject SB0c is in focus, from the image taken at the focal point FP2 with the depth of field D2. Note that reference numeral A1 indicates a part of FIG. 10, and reference numeral A2 indicates only the range WA of the image taken with the depth of field D2. Similarly, the range WB is selected, as the portion in which the subject SB0b is in focus, from the image taken at the focal point FP3 with the depth of field D3, and the range WC is selected, as the portion in which the subject SB0a is in focus, from the image taken at the focal point FP5 with the depth of field D5. Note that the portion selected as the range WB may be chosen from the area excluding the range WA, and the portion selected as the range WC may be chosen from the area excluding the ranges WA and WB. If the in-focus portions are extracted and combined one after another in this way, an image with a large depth of field is obtained.
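The extraction and combination described above can be sketched in Python as follows. The function name `stack_by_distance` and the tuple layout `(focal_distance_m, pixels, distances)` are hypothetical. The sketch simply picks, for every pixel, the shot whose focal distance is closest to the distance recorded for that pixel; it does not model the extent of the depths of field D1 to D7.

```python
def stack_by_distance(shots):
    """Combine shots taken at different focal points into one image.

    shots: list of (focal_distance_m, pixels, distances), where pixels and
    distances are lists of rows and all maps share the same size.
    """
    _, first_pixels, _ = shots[0]
    h, w = len(first_pixels), len(first_pixels[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The shot focused nearest to this pixel's recorded distance is
            # assumed to render the pixel most sharply.
            best = min(shots, key=lambda s: abs(s[2][y][x] - s[0]))
            out[y][x] = best[1][y][x]
    return out
```

For instance, with one shot focused at 2.5 m and another at 3.0 m, a pixel whose recorded distance is 3.0 m is taken from the second shot.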
[0061]
In this way, as shown in FIG. 12, it is also possible to obtain an image in which an entire wall surface of the subject SB1 that is not parallel to the imaging surface is in focus. With analog cameras, inclined surfaces of products and buildings have been photographed using a mechanism that tilts the optical axis, such as a shift lens, but here an image in which the entire wall surface not parallel to the imaging surface is in focus can be obtained without such a mechanism, which is very effective.
[0062]
If the information recording apparatus according to the present embodiment is used, an in-focus portion is extracted and combined, so that an image with a large depth of field can be obtained.
[0063]
Even when an information recording apparatus other than the one described in the second embodiment is used, that is, even when each piece of image information does not contain position information on the distance between the imaging apparatus CM and the subject SB0, an information recording apparatus having the same effect as described above can be realized. That is, if a plurality of pieces of image information are obtained by changing the focal point stepwise, and the in-focus portion is extracted from each piece of image information and the portions are combined, an image with a deep depth of field can be obtained. In this case, the in-focus portion of each piece of image information can be identified by applying image processing that extracts high-frequency components to each of the plurality of pieces of image information.
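A minimal Python sketch of this variant is given below. It uses the absolute value of a 4-neighbour Laplacian as the high-frequency measure, which is one common choice and an assumption here, since the description does not name a specific filter. The function names and the grayscale list-of-rows representation are likewise hypothetical.

```python
def laplacian_energy(img, y, x):
    """Absolute 4-neighbour Laplacian at (y, x); edge pixels are clamped."""
    h, w = len(img), len(img[0])
    up = img[max(y - 1, 0)][x]
    down = img[min(y + 1, h - 1)][x]
    left = img[y][max(x - 1, 0)]
    right = img[y][min(x + 1, w - 1)]
    return abs(up + down + left + right - 4 * img[y][x])

def stack_by_sharpness(images):
    """For each pixel, take the value from the image with the most local detail."""
    h, w = len(images[0]), len(images[0][0])
    return [
        [max(images, key=lambda im: laplacian_energy(im, y, x))[y][x]
         for x in range(w)]
        for y in range(h)
    ]
```

A per-pixel decision like this tends to be noisy on real photographs; averaging the measure over a small window before comparing would be a natural refinement.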
[0064]
<Embodiment 4>
The fourth embodiment of the present invention uses the information reproducing apparatus shown in the second embodiment in order to obtain a stereoscopic image.
[0065]
FIG. 13 is a diagram illustrating the principle of stereoscopic vision. For example, when a person views a triangular-prism-shaped subject SB2 as shown in FIG. 13, the left side surface S1 of the subject SB2 appears larger than the right side surface S2 to the left eye, and the right side surface S2 appears larger than the left side surface S1 to the right eye. Because parallax arises between the right eye and the left eye in this way, a human perceives three-dimensional depth.
[0066]
Therefore, using the information recording apparatus shown in the second embodiment, the subject SB2 is recorded as one piece of image information while adding the position information of the left side surface S1 and the right side surface S2.
[0067]
Then, the information reproducing apparatus shown in the second embodiment is modified to reproduce a left-eye video and a right-eye video that take the parallax into account. Specifically, as shown in FIG. 14, the position information is used to create and reproduce a left-eye image SB2L, composed of a left side surface S1L lengthened horizontally by the amount of parallax and a right side surface S2L shortened horizontally by the amount of parallax, and a right-eye image SB2R, composed of a right side surface S2R lengthened horizontally by the amount of parallax and a left side surface S1R shortened horizontally by the amount of parallax.
[0068]
Of course, the left-eye video SB2L and the right-eye video SB2R include not only the subject but also the background. The background may be corrected in the horizontal direction in the same way as the subject. However, since the amount of correction applied to the background may differ from that applied to the subject, performing the horizontal correction on the subject may open a gap between the background and the subject. In that case, a measure such as filling the gap with a color obtained by averaging the colors of the surrounding pixels is sufficient.
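The gap filling mentioned above can be illustrated with the following Python sketch, which processes one image row. Note that it substitutes a simpler technique for the surface-by-surface lengthening and shortening described in this embodiment: each pixel is shifted horizontally by an amount inversely proportional to its recorded distance. The function name `shift_row`, the `gain` parameter, and the use of the nearest filled neighbours for averaging are all assumptions.

```python
def shift_row(values, distances, gain, direction):
    """Shift each pixel horizontally by round(gain / distance) * direction.

    direction is +1 for the left-eye view and -1 for the right-eye view, so
    that near pixels are displaced in opposite directions in the two views.
    """
    w = len(values)
    out = [None] * w
    # Draw far pixels first so that near pixels cover them.
    for x in sorted(range(w), key=lambda i: -distances[i]):
        nx = x + direction * round(gain / distances[x])
        if 0 <= nx < w:
            out[nx] = values[x]
    # Fill each gap with the average of the nearest filled neighbours.
    for x in range(w):
        if out[x] is None:
            left = next((out[i] for i in range(x - 1, -1, -1)
                         if out[i] is not None), None)
            right = next((out[i] for i in range(x + 1, w)
                          if out[i] is not None), None)
            near = [v for v in (left, right) if v is not None]
            out[x] = sum(near) / len(near) if near else 0
    return out
```

With a subject at 2.5 m in front of a background at 10.0 m and `gain = 2.5`, the subject moves by one pixel while the background stays in place, and the uncovered pixel receives the average of its two neighbours.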
[0069]
Then, both reproduced images are viewed as stereoscopic images by using stereoscopic glasses or the like.
[0070]
If the information reproducing apparatus according to the present embodiment is used, the left-eye video and the right-eye video, in which horizontal distances are corrected by the amount of parallax, are created from a single piece of image information using the position information, so there is no need to record both a right-eye image and a left-eye image as is done for a stereoscopic image.
[0071]
<Embodiment 5>
The fifth embodiment of the present invention uses the information recording apparatus shown in the second embodiment to improve the accuracy of a position measuring apparatus for a moving body that uses GPS or PHS. That is, the information recording apparatus shown in the second embodiment further includes a position measuring apparatus that measures its own position; the position measured by the position measuring apparatus is treated as a temporary current location, and the position information added to the image information is used to improve the accuracy of the position measurement and find the true current location.
[0072]
For example, as shown in FIG. 15, the distances from the current location to two objects B1 and B2, which are target points such as buildings, are measured using the information recording apparatus shown in the second embodiment. Objects that appear on the map in the position measuring device are selected as the objects B1 and B2. Next, as shown in FIG. 16, the range AR1 of the current location specified by the position measuring apparatus for moving bodies using GPS or PHS is displayed on the map MP. If the center P1 of the range AR1 is the current location, the distance DG1 between P1 and the object B1 and the distance DG2 between P1 and the object B2 should match the distance values obtained by the information recording apparatus shown in the second embodiment. If they do not match, it is determined that the current location is not P1.
[0073]
In that case, using the distances from the current location to the objects B1 and B2 obtained with the information recording apparatus described in the second embodiment, a circle CL1 whose radius is the distance DS1 from the current location to the object B1 is drawn with the object B1 as the center. Similarly, a circle CL2 whose radius is the distance DS2 from the current location to the object B2 is drawn with the object B2 as the center. Of the intersections P2 and P3 of the two circles, the one closer to the range AR1 obtained by the position measuring device may be adopted as the true current location.
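The construction with the circles CL1 and CL2 can be written out as a short Python sketch. The function names, the planar map coordinates, and the use of the center P1 as a stand-in for "closer to the range AR1" are assumptions made for this illustration.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the intersection points of two circles (0, 1 or 2 points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    # Distance from c1 to the foot of the common chord, and half its length.
    a = (r1**2 - r2**2 + d**2) / (2 * d)
    h = math.sqrt(max(r1**2 - a**2, 0.0))
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    p = (mx + h * (y2 - y1) / d, my - h * (x2 - x1) / d)
    q = (mx - h * (y2 - y1) / d, my + h * (x2 - x1) / d)
    return [p] if h == 0 else [p, q]

def true_location(b1, ds1, b2, ds2, p1):
    """Pick the intersection of CL1 and CL2 nearest to the temporary location p1."""
    points = circle_intersections(b1, ds1, b2, ds2)
    if not points:
        return None
    return min(points, key=lambda p: math.hypot(p[0] - p1[0], p[1] - p1[1]))
```

For example, with B1 at (0, 0), B2 at (6, 0), DS1 = DS2 = 5 and P1 at (2.5, 3), the circles intersect at (3, -4) and (3, 4), and (3, 4) is selected. Measurement error can leave the circles without any intersection, in which case this sketch returns `None`.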
[0074]
If the information recording apparatus according to the present embodiment is used, the apparatus further includes a position measuring device, displays a temporary current location on a map, identifies two objects from the image information, draws two circles whose radii are the distances to the respective objects, and determines, among the intersections of these circles, the one closest to the temporary current location to be the true current location, so the accuracy of the position measuring device can be improved.
[0075]
<Embodiment 6>
In Embodiment 6 of the present invention, in the information recording apparatus and information reproducing apparatus shown in Embodiment 2, when characters are included in the subject or the background, the characters are recognized by image recognition, replaced with text information, and stored.
[0076]
When characters are included in the subject or the background, storing them encoded as text information is more efficient than storing them as bitmap data. Furthermore, with text information, when the position information of the background or the subject changes, processing such as changing the font size of the text information from C1 to C2 and C3, as shown in FIG. 17, can be done easily. The color or other attributes of the text information may also be changed in accordance with the change in the position information of the background or the subject.
[0077]
[Effects of the Invention]
According to the first aspect of the present invention, the image information of the subject and the background is recorded while the position information of the subject and the background is added, so processing that uses the position information of the subject and the background can be applied to the image information when it is reproduced. Further, since it is detected whether or not the equidistant surface is shaking in the screen, camera shake can be detected.
[Brief description of the drawings]
FIG. 1 is a diagram showing a scene in which an information recording apparatus according to Embodiment 1 of the present invention is used.
FIG. 2 is a diagram showing a scene in which the information reproducing apparatus according to Embodiment 1 of the present invention is used.
FIG. 3 is a diagram illustrating parameters in the Doppler effect.
FIG. 4 is a block diagram showing a configuration of an information reproducing apparatus according to Embodiment 1 of the present invention.
FIG. 5 is a block diagram showing another configuration of the information reproducing apparatus according to Embodiment 1 of the present invention.
FIG. 6 is a block diagram showing a configuration of an information recording apparatus according to Embodiment 2 of the present invention.
FIG. 7 is a diagram showing camera shake correction in an information recording apparatus according to Embodiment 2 of the present invention.
FIG. 8 is a diagram showing movement of a subject in an information recording apparatus or information reproduction apparatus according to Embodiment 2 of the present invention.
FIG. 9 is a diagram illustrating the depth of field.
FIG. 10 is a diagram showing how a subject is photographed using an information recording apparatus according to Embodiment 3 of the present invention.
FIG. 11 is a diagram showing a state in which an image is synthesized using the information recording apparatus according to Embodiment 3 of the present invention.
FIG. 12 is a diagram illustrating a state in which a subject having a surface that is not parallel to the imaging surface is captured.
FIG. 13 is a diagram illustrating a stereoscopic view.
FIG. 14 is a diagram showing an image created by an information recording apparatus according to Embodiment 4 of the present invention.
FIG. 15 is a diagram showing two objects serving as target points in an information recording apparatus according to Embodiment 5 of the present invention.
FIG. 16 is a diagram showing a method for determining a current location in an information recording apparatus according to Embodiment 5 of the present invention.
FIG. 17 is a diagram showing how the character size changes in the information recording apparatus according to Embodiment 6 of the present invention.
FIG. 18 is a diagram showing conventional stereo audio information.
FIG. 19 is a diagram showing an image of sound field data of conventional stereo sound information.
FIG. 20 is a diagram showing an image of sound field data of a conventional 3D sound technology.
[Explanation of symbols]
SD1 to SD10 Audio track data
IFS sound source position information
IFL, IFLa to IFLc Listener position information
ST1, ST1a to ST1c Relative relationship calculation processing block
ST2 Pitch change processing block
ST3 Propagation characteristics change processing block
ST23a-ST23c Sound playback processing block
ST4 Audio playback processing block
ST4a to ST4c Listener-specific audio playback processing block
CM imaging device
SS sensor element
SB0, SB0a to SB0c, SB1, SB2 Subject
OB object
FP1 to FP7 Focusing point
D1-D7 depth of field
AR1 Range specified by position measuring device
B1, B2 Objects that are target points
C1-C3 character font

Claims

An information recording apparatus that records image information of a subject and a background while adding the distances to the subject and the background as position information,
wherein the position information of the subject and the background changes with time,
and the information recording apparatus detects whether or not an equidistant surface of the subject from the information recording apparatus is shaking within a screen.