JP2003333621A

JP2003333621A - Multi-viewpoint image transmission method

Info

Publication number: JP2003333621A
Application number: JP2003154399A
Authority: JP
Inventors: Takeo Azuma; 健夫吾妻; Kenya Uomori; 謙也魚森; Atsushi Morimura; 森村　　淳
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-04-05
Filing date: 2003-05-30
Publication date: 2003-11-21
Anticipated expiration: 2017-04-02
Also published as: JP3735617B2

Abstract

PROBLEM TO BE SOLVED: To provide a multi-viewpoint image transmission system which enables a display side to precisely calculate a visual field angle for photographing.

SOLUTION: In the multi-viewpoint image transmission method, information of an image pickup size of a camera for picking up an image and a distance between the center of a lens and an image pickup face is transmitted at the time of transmitting an image with two or more viewpoints, so that information of a visual field angle for image pickup can be obtained on the display side.

COPYRIGHT: (C)2004, JPO

Description

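[Detailed Description of the Invention]

[0001] [Technical Field of the Invention] The present invention relates to a method for transmitting multi-viewpoint images.

[0002] [Prior Art] Various stereoscopic video systems have been proposed. Among them, the multi-view stereoscopic system based on multi-viewpoint images is promising as a system that lets several people watch stereoscopic moving images without special glasses. In a multi-view stereoscopic system, the more cameras and display devices are used, the more natural the motion parallax perceived by the observer and the easier it becomes for many people to watch. However, constraints such as the scale of the imaging system and the setting of the camera optical axes limit the number of cameras that can be used in practice. In transmission and storage, it is also desirable to reduce the amount of information, which grows in proportion to the number of cameras.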
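[0003] If the display side can generate intermediate-viewpoint images from a two-view stereo image pair and thereby display multi-view stereoscopic images, the burden on the imaging system is reduced and so is the amount of information to be transmitted and stored. To generate, from several images with different viewpoints, the intermediate-viewpoint image that should be seen from an arbitrary viewpoint between them, pixel correspondences between the images must be found and depth must be estimated.

[0004] MPEG-1 and MPEG-2 have been proposed as image compression schemes for digital transmission of moving images, and attempts are being made to extend MPEG-2 to transmit multi-viewpoint images (ISO/IEC13818-2/PDAM3). Fig. 28 is a schematic diagram of the MPEG-2 syntax. MPEG-2 transmission is carried out by encoding and decoding image data that has a hierarchical structure of Sequence, GOP (Group Of Picture), and Picture. According to ISO/IEC13818-2/PDAM3, transmission of multi-viewpoint images by extending MPEG-2 appears to be realized by extending the GOP layer (this is not stated explicitly, so it is not certain).

[0005] Fig. 29 shows the spatiotemporal relationship of the transmitted multi-viewpoint images. In addition to the motion compensation used in conventional MPEG-2, disparity compensation is used in an attempt to raise coding efficiency. When multi-viewpoint images are transmitted, information about each camera (camera parameters such as the camera position and the direction of the optical axis) must be attached. ISO/IEC13818-2/PDAM3 states that the camera parameters are transmitted within Pic.Extension (the extension of the Picture layer) of Fig. 28, but it does not describe the camera parameters concretely.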
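[0006] As for describing camera parameters, OpenGL, a computer-graphics language, defines the camera position, the direction of the camera's optical axis, and the distance between the camera position and the image plane as camera parameters (OpenGL Programming Guide, The Official Guide to Learning OpenGL, Release 1, Addison-Wesley Publishing Company, 1993).

[0007] Fig. 30 is an explanatory diagram of the camera parameter definition in OpenGL. In Fig. 30, A is the lens center, B is the center of the image plane (that is, the imaging surface), and C is the foot of the perpendicular dropped from B to the top edge of the image. The coordinates of A, B, and C are defined as (optical center X, optical center Y, optical center Z), (image plane center X, image plane center Y, image plane center Z), and (image plane vertical X, image plane vertical Y, image plane vertical Z), respectively.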
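[0008] It is readily conceivable to transmit multi-viewpoint images with the OpenGL-defined camera parameter information added to Pic.Extension.

[0009] [Problems to Be Solved by the Invention] However, such a conventional method carries no information on the nearest and farthest points in the image (that is, the nearest and farthest points of the subject). When the display side tries to produce a display that causes little eye fatigue (for example, by disparity control), it therefore cannot tell which depth range the control should be applied to.

[0010] Furthermore, when multi-viewpoint images are displayed, the viewing distance must be determined appropriately from conditions such as the viewing angle at capture, the size of the imaging surface, and the distance between the lens center and the imaging surface, so that the display neither looks unnatural because of excessive disparity nor lacks a stereoscopic effect. OpenGL, however, does not define the size of the imaging surface (the physical size of the CCD), and it always treats the distance between the lens center and the image-forming plane as equal to the focal length of the lens. The display side therefore cannot know the viewing angle at capture and cannot determine an appropriate viewing angle for display, that is, an appropriate viewing distance, so the stereoscopic display may look unnatural.

[0011] In view of these points, an object of the present invention is to provide a multi-viewpoint image transmission method that allows the display side to calculate the viewing angle at capture accurately.

[0012] [Means for Solving the Problems] The present invention is a multi-viewpoint image transmission method for transmitting images of two or more viewpoints, characterized in that information on the imaging size of the camera that captures the images and on the distance between the lens center and the imaging surface is transmitted, so that the display side can obtain information on the viewing angle at capture.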
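[0013] [Embodiments of the Invention] The present invention is described below with reference to the drawings showing its embodiments.

(First Embodiment) Fig. 4 shows the parameters defined in the image transmission method of the first embodiment of the present invention. In Fig. 4, A1 and A2 denote the positions of the camera lens centers, and B1 and B2 denote the centers of the imaging surfaces (to simplify the explanation, each imaging surface is drawn folded back to the subject side of the lens center).

[0014] OpenGL defines the distances A1B1 and A2B2 in Fig. 4 as the focal length of the camera lens. In the present invention, the distance between the lens center and the imaging surface is defined independently of the focal length of the lens. With this definition, the in-focus distance between the lens center and the imaging surface can be calculated according to the subject distance, and an accurate viewing angle can be calculated. The viewing angle is calculated from the size of the imaging surface and the distance between the lens center and the imaging surface.

[0015] Using Fig. 2, the following explains that the in-focus distance between the lens center and the imaging surface changes with the distance between the subject and the lens center. Fig. 2 shows the relationship among the subject position, the in-focus position of the imaging surface, and the focal length. In Fig. 2, A is the position of the subject, B is the point where light from A forms an image, O is the lens center, F is the point where parallel light is focused by the lens, a is the distance between the subject and the lens center O, b is the distance between the image point B and the lens center O, and f is the focal length of the lens. It is known that a, b, and f satisfy the relationship of (Equation 1).

[0016] [Equation 1]

[0017] From (Equation 1), when the subject is far enough from the lens that the focal length is negligible (a >> f), 1/a → 0 and b can be approximated as b = f. When the subject is relatively close to the lens, however, the 1/a term cannot be neglected and b ≠ f. To calculate the viewing angle correctly even when the subject is relatively close to the lens, the distance between the lens center and the image-forming plane must therefore be defined independently of the focal length. Letting w_in be the width and h_in the height of the imaging surface, the viewing angle at capture is given by (Equation 2).

[0018] [Equation 2]

[0019] Hence, letting w_out be the width and h_out the height of the displayed image, the viewing distance that reproduces the viewing angle at capture is given by (Equation 3).

[0020]-[0021] [Equation 3]

[0022] Next, improving ease of viewing on the display side based on the nearest and farthest points in the image is described. Fig. 3 illustrates the positional relationship among the convergence distance, the nearest point, and the farthest point when two projectors are used for convergent projection. In Fig. 3, C is the convergence point, A is the nearest point, and B is the farthest point.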
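As a numerical companion to the viewing-angle discussion in paragraphs [0015] to [0021], the sketch below assumes that (Equation 1) is the thin-lens relation 1/a + 1/b = 1/f (consistent with the limits stated in [0017]) and that (Equation 2) and (Equation 3) follow from simple pinhole geometry. The equation images are not reproduced in this text, so these forms and all function names are illustrative assumptions rather than the patent's own formulas.

```python
import math


def image_distance(a, f):
    """In-focus lens-to-imaging-surface distance b, assuming 1/a + 1/b = 1/f."""
    if a <= f:
        raise ValueError("subject distance must exceed the focal length")
    return a * f / (a - f)


def viewing_angles(w_in, h_in, b):
    """Assumed form of the capture viewing angles (radians): 2*atan(size / (2*b))."""
    return 2.0 * math.atan(w_in / (2.0 * b)), 2.0 * math.atan(h_in / (2.0 * b))


def viewing_distance(w_out, w_in, b):
    """Distance at which a display of width w_out subtends the capture angle.

    Under the assumptions above this reduces to b * w_out / w_in.
    """
    theta_w, _ = viewing_angles(w_in, w_in, b)
    return w_out / (2.0 * math.tan(theta_w / 2.0))


# A subject 1 m from a 50 mm lens focuses about 52.6 mm behind the lens,
# so using f in place of b would overestimate the viewing angle.
b_near = image_distance(1000.0, 50.0)
```

The point of the example is the gap between `b_near` and the focal length for close subjects, which is the reason the text transmits b separately from f.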
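[0023] In convergent projection, the disparity is zero when the observer looks at the convergence point C (in Fig. 3 both eyes look at the image center, so there is no difference in the relative positions within the images seen by the left and right eyes). When the observer looks at the nearest point A, the eyes turn inward (a crossed state), and a disparity of Da arises on the image in the crossed direction. In Fig. 3, compared with looking at the convergence point C, each eye looks at a point shifted inward by Da/2. Conversely, when the observer looks at the farthest point B, the eyes turn outward (an uncrossed state), and a disparity of Db arises on the image in the uncrossed direction.

[0024] Fig. 1 shows, for parallel projection, the positional relationship among the nearest point, the farthest point, and the point at which the observer's convergence and accommodation coincide. In Fig. 1, A is the nearest point of the displayed image, B is the farthest point, and C is the point at which the observer's convergence and accommodation coincide. In the parallel projection of Fig. 1, an image point with disparity Dc is displayed at the same position on the screen, and the observer's convergence and accommodation coincide.

[0025] The disparities within the images of Fig. 3 and Fig. 1 are perceived by the observer as a stereoscopic impression of being in front of or behind the screen plane (the plane containing C). When the disparity becomes large, however, the images no longer fuse (they appear double) or the observer feels unease or discomfort.

[0026] Ease of viewing can be improved by shifting the images, based on the nearest point, the farthest point, and the convergence point at capture, in the direction shown in Fig. 3 (image 1 and image 2 are each shifted horizontally within the plane perpendicular to its projection axis), which changes the positional relationship between the convergence point and the farthest and nearest distances. As for how to shift, shifting so as to cancel the mean disparity between the images, for example, makes the whole image uniformly easier to view.

[0027] Fig. 5 is a block diagram of such processing.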
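For simplicity, Fig. 5 shows an example for two-view (binocular) data. In Fig. 5, 1 is an image decoding means, 2 is a disparity estimation means, 3 is a mean disparity calculation means, 4a and 4b are image shift means, and 5a and 5b are image display means. The operation of each means is described below.

[0028] The image decoding means 1 receives the multi-viewpoint image data encoded on the transmitting side and decodes it. The left and right images decoded by the image decoding means 1 are sent to the disparity estimation means 2. The disparity estimation means 2 calculates the disparity at each pixel (a disparity map) from the decoded left and right images. As an example, calculating the disparity by block matching with the left image as the reference is explained below using Fig. 6. First, a window region is set in the left image. Next, the sum of squared differences (SSD) given by (Equation 4) is calculated.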
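[0029] [Equation 4]

[0030] (Equation 4) is evaluated at one-pixel intervals for d in the range dmin to dmax. The value of d that minimizes the SSD within that range is taken as the disparity of the window region. The disparity at each pixel of the image is obtained by setting the window region at successively shifted positions and repeating this calculation.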
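The block-matching search of paragraphs [0028] to [0030] can be sketched as follows. (Equation 4) is not reproduced in this text, so the exact SSD expression, the sign convention (here d = x_left - x_right, so the matching window in the right image sits at x - d), and the function names are assumptions made for illustration.

```python
def ssd(left, right, x, y, d, half):
    """Sum of squared differences between the left window centred at (x, y)
    and the right window centred at (x - d, y). Images are lists of rows."""
    total = 0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            diff = left[j][i] - right[j][i - d]
            total += diff * diff
    return total


def block_match(left, right, x, y, d_min, d_max, half=1):
    """Return (disparity, minimum SSD) over d_min..d_max in one-pixel steps."""
    width = len(left[0])
    best_d, best = None, None
    for d in range(d_min, d_max + 1):
        # Skip candidates whose window would fall outside the right image.
        if x - half - d < 0 or x + half - d >= width:
            continue
        s = ssd(left, right, x, y, d, half)
        if best is None or s < best:
            best_d, best = d, s
    return best_d, best
```

Returning the minimum SSD alongside the disparity is a design choice of this sketch; the later embodiments use that minimum as an input to a reliability measure.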
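[0031] The range dmin to dmax over which the SSD is calculated can be computed from the information on the nearest and farthest points. How to obtain dmin and dmax for parallel imaging and for convergent imaging is explained below using Figs. 7 and 8.

[0032] Fig. 7 shows the case of parallel imaging.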
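In the coordinate system of Fig. 7, let the coordinates of the left and right lens centers be (-D/2, 0) and (D/2, 0), the distance between the imaging surface and the lens center be b, the horizontal coordinate of an object position in three-dimensional space be X0, its depth coordinate be Z0, and the horizontal positions at which light from the object at (X0, Z0) is imaged on the left and right imaging surfaces be xl0 and xr0 (xl0 and xr0 are horizontal coordinates in a plane coordinate system whose origin is the intersection of the camera optical axis and the imaging surface). From the geometry, (Equation 5) holds.

[0033] [Equation 5]

[0034] The disparities with the left image and with the right image as the reference are then given by (Equation 6).

[0035] [Equation 6]

[0036] Letting Zmin be the depth of the nearest point in the image and Zmax the depth of the farthest point, the upper limit dmax and the lower limit dmin of the range over which the SSD is calculated are given by (Equation 7).

[0037] [Equation 7]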
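For the parallel-camera case of paragraphs [0032] to [0037], the search range can be sketched from pinhole geometry. The forms used here, xl0 = b(X0 + D/2)/Z0 and xr0 = b(X0 - D/2)/Z0, the sign convention d = xl0 - xr0 = bD/Z0, and the conversion to pixels with n_x/w_in are assumptions; (Equation 5) to (Equation 7) are not reproduced in this text and may use a different sign or form.

```python
import math


def disparity_px(depth, baseline, b, w_in, n_x):
    """Disparity in pixels of a point at `depth` for parallel cameras.

    Assumes d = b * baseline / depth on the imaging surface, scaled by the
    number of horizontal pixels n_x per imaging-surface width w_in.
    """
    return (b * baseline / depth) * (n_x / w_in)


def search_range(z_min, z_max, baseline, b, w_in, n_x):
    """Integer (d_min, d_max) covering all depths between z_min and z_max."""
    d_near = disparity_px(z_min, baseline, b, w_in, n_x)
    d_far = disparity_px(z_max, baseline, b, w_in, n_x)
    return math.floor(min(d_near, d_far)), math.ceil(max(d_near, d_far))


# Hypothetical numbers: 65 mm baseline, b = 50 mm, 10 mm wide sensor,
# 640 pixels per line, scene depth between 1 m and 5 m.
d_min, d_max = search_range(1000.0, 5000.0, 65.0, 50.0, 10.0, 640)
```

Restricting the block-matching search to this interval is what the transmitted nearest-point and farthest-point information makes possible.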
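[0038] Fig. 8 shows the case of convergent imaging. In the coordinate system of Fig. 8, let the coordinates of the convergence point (the intersection of the optical axes of the left and right cameras) be (0, C), the coordinates of the left and right lens centers be (-D/2, 0) and (D/2, 0), the distance between the imaging surface and the lens center be b, the horizontal coordinate of an object position in three-dimensional space be X0, its depth coordinate be Z0, and the horizontal positions at which light from the object at (X0, Z0) is imaged on the left and right imaging surfaces be xl0 and xr0 (xl0 and xr0 are horizontal coordinates in a plane coordinate system whose origin is the intersection of the camera optical axis and the imaging surface). From the geometry, (Equation 8) holds.

[0039] [Equation 8]

[0040] The disparities with the left image and with the right image as the reference are then given by (Equation 9).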
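[0041] [Equation 9]

[0042] Because X0 remains in (Equation 9), in convergent imaging the disparity differs with horizontal position even at the same depth (that is, the reproduced stereoscopic image is distorted). For simplicity, consider the disparity at points on X0 = 0 (the Z axis); substituting X0 = 0 into (Equation 9) gives (Equation 10).

[0043] [Equation 10]

[0044] From (Equation 10), the upper limit dmax and the lower limit dmin of the disparity in pixels can be determined from the positional relationship among the depth Zmin of the nearest point in the image, the depth Zmax of the farthest point, and the depth C of the convergence point, together with the number of horizontal pixels nx and the width w_in of the imaging surface (CCD).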
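[0045] When the disparity at points off the Z axis is to be considered, the upper limit dmax and the lower limit dmin can be determined by computing the maximum and minimum of (Equation 9).

[0046] As explained above, given the depth of the nearest point in the image, the depth of the farthest point, the camera positions, and the directions of the camera optical axes, the range of values the disparity can take can be calculated, and the range over which the SSD is computed in disparity estimation can be determined.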
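The mean disparity calculation means 3 computes the mean of the disparity map calculated by the disparity estimation means 2. The mean of the disparity map is obtained by evaluating (Equation 11).

[0047] [Equation 11]

[0048] The image shift means 4a and 4b shift the images so that points at the depth having the mean disparity obtained by the mean disparity calculation means 3 are displayed at the same depth as the display surface (that is, with zero disparity on the display surface).

[0049] In Fig. 1, which shows display by parallel projection, A is the depth of the nearest point in the displayed image, B is the depth of the farthest point, and C is the depth of the mean disparity. Fig. 1 shows that in parallel projection, when there is a disparity of Dc given by (Equation 12) between the left and right images, the disparity on the screen vanishes and the display is natural, with convergence and accommodation coinciding.

[0050] [Equation 12]

[0051] The image shift means 4a shifts the left image by the shift amount given by (Equation 13) (a shift to the right is taken as positive).

[0052] [Equation 13]

[0053] The image shift means 4b shifts the right image by the same amount in the opposite direction. As a result of the shifts by the image shift means 4a and 4b, points having the mean disparity are displayed at the same depth as the screen.

[0054] In Fig. 3, which shows display by convergent projection, A is the depth of the nearest point in the displayed image, B is the depth of the farthest point, and C is the depth of the mean disparity. In convergent projection, a point is displayed at the same depth as the screen when its disparity is zero at the image center. In convergent projection, therefore, the image shift means 4a and 4b shift the left and right images by -1/2 times the mean disparity.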
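The mean-disparity shift of paragraphs [0046] to [0054] for convergent projection can be sketched as below. (Equation 11) is assumed to be a plain arithmetic mean, disparity is assumed to be measured as x_left - x_right in whole pixels, and vacated pixels are padded with a fill value; these choices and the function names are illustrative, not taken from the patent.

```python
def mean_disparity(disp_map):
    """Arithmetic mean of a disparity map given as a list of rows."""
    values = [d for row in disp_map for d in row]
    return sum(values) / len(values)


def shift_row(row, shift, fill=0):
    """Shift one image row by an integer number of pixels (positive = right)."""
    n = len(row)
    out = [fill] * n
    for i, v in enumerate(row):
        j = i + shift
        if 0 <= j < n:
            out[j] = v
    return out


def shift_pair_convergent(left, right, disp_map):
    """Shift the left image by -mean/2 and the right image by +mean/2,
    so that points with the mean disparity end up with zero disparity."""
    s = round(-0.5 * mean_disparity(disp_map))
    shifted_left = [shift_row(r, s) for r in left]
    shifted_right = [shift_row(r, -s) for r in right]
    return shifted_left, shifted_right
```

With a mean disparity of 4 pixels, for example, the left image moves 2 pixels to the left and the right image 2 pixels to the right, which removes 4 pixels of disparity in total.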
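[0055] As described above, according to this embodiment, adding information on the nearest and farthest points in the image when transmitting multi-viewpoint images enables the display side to produce a display that does not tire the eyes (disparity control).

[0056] In addition, by transmitting information on the size of the camera's imaging surface (CCD), the distance between the imaging surface and the lens center, and the focal length of the lens, the display side can accurately calculate the viewing angle at capture, even for footage shot close to the subject, when it tries to produce a display matched to that viewing angle.

[0057] When the images are transmitted without information on the nearest and farthest points in the multi-viewpoint images, a dedicated code indicating that this information is not attached is transmitted in its place, and the display side can estimate the disparity at the nearest and farthest points in the image by calculating disparity within a preset range. This is also included in the present invention.

[0058] Furthermore, by setting the nearest-point and farthest-point information to specific depth values on the transmitting side, disparity control can be performed so that the disparity within that specified depth range falls within the fusional range. This is also included in the present invention.

[0059] Although an example in which the disparity is calculated on the display side has been described, the disparity contained in the encoded images may be used instead, and this is also included in the present invention. Such an example is described using Fig. 10.

[0060] In Fig. 10, the components other than the image decoding means 6 operate in the same way as in the disparity control scheme of Fig. 5, so their description is omitted and only the operation of the image decoding means 6 is described. The image decoding means 6 decodes the encoded image data and outputs the left and right images and the disparity referenced to the left image. When a binocular image is transmitted by the MPEG-2 multi-viewpoint transmission scheme, the compression ratio is raised by disparity compensation referenced to the left image. Extracting the disparity from the encoded image data removes the need to calculate disparity on the display side and reduces the amount of computation there.

[0061] The mean disparity calculation means 3 may compute the mean with emphasis on the central part of the screen, using the weighted mean of (Equation 14). This enables disparity control that makes fusion easier at the center of the image, and is included in the present invention.

[0062] [Equation 14]

[0063] Figs. 9(a), (b), and (c) show examples of weight distributions used for the weighted mean of (Equation 14). They are drawn one-dimensionally for simplicity, but in practice they are two-dimensional distributions that take larger values at the center of the image than at the periphery. All weight values are 0 or greater (non-negative).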
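The center-weighted mean of paragraphs [0061] to [0063] can be illustrated with a small sketch. (Equation 14) and the weight shapes of Fig. 9 are not reproduced in this text, so the separable triangular weights used here are only one hypothetical distribution that is non-negative and largest at the image center.

```python
def center_weights(height, width):
    """Separable triangular weights: largest at the centre, never negative."""
    def tri(n):
        c = (n - 1) / 2.0
        return [1.0 - abs(i - c) / (c + 1.0) for i in range(n)]

    wy, wx = tri(height), tri(width)
    return [[a * b for b in wx] for a in wy]


def weighted_mean_disparity(disp_map, weights):
    """Weighted mean: sum(w * d) / sum(w) over all pixels."""
    num = sum(w * d for wr, dr in zip(weights, disp_map) for w, d in zip(wr, dr))
    den = sum(w for wr in weights for w in wr)
    return num / den
```

For a 3 x 3 map whose only nonzero disparity (10) is at the center, these weights give a mean of 2.5, compared with about 1.1 for the unweighted mean, so the shift is driven more strongly by the central region.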
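(Second Embodiment) Fig. 11 is a block diagram of the disparity control scheme of the second embodiment of the present invention. In Fig. 11, the components other than the frequency calculation means 7 and the shift amount calculation means 8 operate in the same way as in the first embodiment, so they carry the same reference numerals as in the figures of the first embodiment and their description is omitted. The operations of the frequency calculation means 7 and the shift amount calculation means 8 are described below.

[0064] The frequency calculation means 7 calculates the frequency of the left-image-referenced disparity decoded by the image decoding means 6. The disparity frequency is the number of pixels counted for each disparity value within a certain region of the image (for example, the whole image, or a specific region chosen by a fixed criterion). The shift amount calculation means 8 calculates, from the disparity frequency (between images) calculated by the frequency calculation means 7 and the fusional range of the human eye corresponding to the viewing angle of the image, the shift amount that maximizes the sum of the frequencies of the disparities within the fusional range, and outputs it to the image shift means 4a and 4b.

[0065] Fig. 12 shows an example of the configuration of the shift amount calculation means 8. In Fig. 12, 9 is an MPU and 10 is a fusional range table. The MPU 9 calculates the horizontal viewing angle given by (Equation 15) from the width of the image display surface and the viewing distance, and reads the fusional range for that viewing angle from the fusional range table 10.

[0066] [Equation 15]

[0067] Fig. 13 shows an example of the characteristics of the fusional range table. In Fig. 13, the horizontal axis is the horizontal viewing angle of the image display surface and the vertical axis is the fusional range of disparity (converted to an angle by (Equation 16)).

[0068] [Equation 16]

[0069] On the vertical axis of Fig. 13, negative values are disparities perceived in front of the display surface and positive values are disparities perceived behind it. Fig. 14 illustrates the geometric meaning of (Equation 16): the angle-converted disparity θ is the disparity Δ on the image display surface converted into a visual angle.

[0070] In the parallel projection and convergent projection of Figs. 1 and 3, the relationship between the image positions xl1, xr1 (for a liquid-crystal projector, for example, the pixel positions on the liquid-crystal panel) and the positions Xl, Xr on the display surface is given by (Equation 17) and (Equation 19), respectively, and the disparity on the display surface by (Equation 18) and (Equation 20).

[0071] [Equation 17]

[0072] [Equation 18]

[0073] [Equation 19]

[0074] [Equation 20]

[0075] The relationship between the coordinates (xl0, yl0), (xr0, yr0) on the imaging surface at capture and the image positions (xl1, yl1), (xr1, yr1) at projection (for a liquid-crystal projector, for example, the pixel positions on the liquid-crystal panel) is given by (Equation 21).

[0076] [Equation 21]

[0077] Here the imaging-surface width w_in is obtained from the camera parameters, and the displayed image width w_out is a value specific to the display system.

[0078] xl0 and xr0 are calculated with (Equation 5) or (Equation 8) according to the capture condition (parallel or convergent imaging) and converted to xl1 and xr1 with (Equation 21). Then (Equation 18) or (Equation 20) is evaluated according to the projection condition (parallel or convergent projection). In this way the disparity on the display screen can be calculated taking both the capture and projection conditions into account.

[0079] The MPU 9 converts the fusional range read from the fusional range table 10 into a disparity (distance) on the display surface and thus determines the fusional range of disparity on the image display surface. Using the above relationship between the disparity in the image data and the disparity on the image display surface, the MPU 9 then calculates the shift amount for the image data that maximizes the sum of the frequencies of the disparities within the fusional range (shifting the images by disparity control corresponds to moving the disparity frequency distribution horizontally in Fig. 15).

[0080] The image shift means 4a and 4b shift the images in opposite directions by the output shift amount, and the image display means 5a and 5b display them. The resulting display maximizes the sum of the frequencies of the disparities within the fusional range (that is, it maximizes the area of pixels that fuse within the image).

[0081] As described above, according to this embodiment, performing disparity control according to the fusional range of the human eye brings the disparity within the fusional range over a larger part of the image at display.

[0082] This embodiment has described disparity control that maximizes the sum of the disparity frequencies within the fusional range, but controlling the disparity so that the mean disparity lies at the center of the fusional range gives nearly the same effect, and is included in the present invention.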
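The histogram-based shift selection of paragraphs [0064] to [0082] can be sketched as follows. The sketch assumes integer pixel disparities and a fusional range already converted to pixel units (the conversion through (Equation 15) to (Equation 21) is not reproduced in this text); the function names and the tie-breaking rule are illustrative assumptions.

```python
from collections import Counter


def disparity_histogram(disp_map):
    """Number of pixels for each disparity value in the region."""
    return Counter(d for row in disp_map for d in row)


def best_shift(hist, fuse_min, fuse_max):
    """Offset s added to every disparity that maximises the pixel count
    inside [fuse_min, fuse_max]. Ties go to the smallest absolute offset."""
    lo = fuse_min - max(hist)
    hi = fuse_max - min(hist)
    best_s, best_count = 0, -1
    for s in range(lo, hi + 1):
        count = sum(n for d, n in hist.items() if fuse_min <= d + s <= fuse_max)
        if count > best_count or (count == best_count and abs(s) < abs(best_s)):
            best_s, best_count = s, count
    return best_s, best_count
```

The returned offset is the total change in disparity; in the scheme described above it would be split between the two images, each shifted by half in opposite directions.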
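[0083] Further, the transmitting side may set the nearest and farthest points to values different from the actual nearest and farthest points in the image, and the display side may control the disparity so that the mean of the disparities corresponding to the set nearest and farthest points lies at the center of the fusional range. Images at the depth intended by the image creator can then be presented preferentially to the observer. This is included in the present invention.

(Third Embodiment) The third embodiment of the present invention is a disparity estimation method and apparatus that takes a pair of images as input, calculates an initial disparity and the reliability of the initial disparity, detects object contours from the base image and the reliability of the initial disparity, and determines the disparity in regions near object contours where the reliability of the initial disparity is low from the initial disparity, its reliability, and the detected object contours. The disparity is determined so that it changes at the object contour and connects smoothly with the surrounding disparity.

[0084] With the configuration described above, this embodiment calculates the initial disparity and its reliability from an image pair consisting of a base image and a reference image, detects object contours from the base image and the reliability of the initial disparity, and, from the initial disparity, its reliability, and the detected contours, determines the disparity in low-reliability regions near object contours so that it changes at the contour and connects smoothly with the surrounding disparity.

[0085] Fig. 16 is a block diagram of the disparity estimation apparatus of the third embodiment of the present invention.

[0086] In Fig. 16, 201 is an initial disparity estimation unit that calculates the initial disparity by block matching, 202 is a reliability evaluation unit for the initial disparity estimation, 203 is a contour detection unit, and 204 is a disparity estimation unit for the vicinity of object contours.

[0087] The operation of this configuration is described below.

[0088] The initial disparity estimation unit 201 calculates the sum of squared differences (SSD) given by (Equation 22). The SSD of (Equation 22) is small where the distributions of pixel values in the window set in the base image and the window set in the reference image are similar, and large where they differ. The initial disparity estimation unit 201 takes the inter-image shift d that minimizes the SSD within a predetermined search range as the disparity at the point of interest (x, y), outputs that disparity to the disparity estimation unit 204 for the vicinity of object contours, and outputs the minimum SSD within the search range to the reliability evaluation unit 202.

[0089] [Equation 22]

[0090] Fig. 17 illustrates the initial disparity estimation (block matching) by the initial disparity estimation unit 201. In Fig. 17, the window region set around the point of interest (x, y) is the integration region W of (Equation 22). The initial disparity over the whole image is obtained by setting the window region at successively shifted positions and calculating the SSD.

[0091] The reliability evaluation unit 202 calculates the reliability evaluation value of the correspondence given by (Equation 23) from the minimum SSD within the search range obtained in the disparity calculation by the initial disparity estimation unit 201, the number of pixels in the window region (block), the variance of the noise between the images, and the mean of the squared horizontal and vertical luminance gradients of the base image within the window region.

[0092] [Equation 23]

[0093] A smaller value of (Equation 23) indicates higher reliability of the disparity estimate, and a larger value indicates lower reliability.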
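One plausible shape for the reliability evaluation value of paragraphs [0091] to [0093] is sketched below. (Equation 23) itself is not reproduced in this text. The sketch assumes a ratio whose numerator is the per-pixel minimum SSD with a noise term removed and whose denominator is the mean squared luminance gradient, which is consistent with the inputs listed in [0091] and with the simplifications mentioned in [0153] to [0155]. The factor 2 on the noise variance, the forward-difference gradient, and the small epsilon are assumptions of this sketch.

```python
def reliability(ssd_min, n_pixels, noise_var, window):
    """Assumed reliability value: smaller means a more trustworthy match.

    window: luminance values of the base-image window, as a list of rows.
    """
    h, w = len(window), len(window[0])
    grad_sq, count = 0.0, 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = window[y][x + 1] - window[y][x]   # horizontal gradient
            gy = window[y + 1][x] - window[y][x]   # vertical gradient
            grad_sq += gx * gx + gy * gy
            count += 1
    mean_grad_sq = grad_sq / count
    numerator = ssd_min / n_pixels - 2.0 * noise_var
    # The epsilon only guards against division by zero in flat windows.
    return numerator / (mean_grad_sq + 1e-12)
```

Dividing by the gradient term means that the same residual counts as less reliable in a flat window than in a strongly textured one.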
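[0094] Fig. 18 is a block diagram showing an example of the configuration of the contour detection unit 203. In Fig. 18, 205 is a YC separation circuit that separates the base image into a luminance component and color components; 206A, 206B, and 206C are edge detection circuits that detect edges from the separated luminance component Y and color components R-Y and B-Y, respectively; 207 is a ridge detection unit that outputs only the strength on the ridges of the edge detection result; and 208 is a weight generation circuit that outputs a weight of 1 in regions where the reliability of the initial disparity estimate is low and a weight of 0 in regions where it is high.

[0095] The operation of this configuration is described below.

[0096] The YC separation circuit 205 separates the base image into the luminance component Y and the color components R-Y and B-Y and outputs them.

[0097] The edge detection circuits 206A, 206B, and 206C detect edge components from the Y, R-Y, and B-Y components, respectively. Fig. 19 is a block diagram showing an example of the configuration of the edge detection circuit 206. In Fig. 19, 209A, 209B, and 209C are directional filter groups that detect edge components in the low, middle, and high spatial frequency bands, respectively, and 210, 211, 212, and 213 are the directional filters that make up each group. Fig. 20 shows an example of the spatial weights of the directional filters: Figs. 20(a), (b), and (c) detect edges that continue in the vertical direction, and (d), (e), and (f) detect diagonal edges.

[0098] Here (a) and (d) are examples of weight distributions for the high spatial frequency band, (b) and (e) for the middle band, and (c) and (f) for the low band. Edges in the horizontal direction and in the other diagonal direction are detected by rotating the coefficient arrangement of Fig. 20 by 90 degrees. The edge directions need not be limited to 45-degree steps; 30-degree steps, for example, may of course be used.

[0099] The spatial weights of the directional filters need not be limited to those of Fig. 20; any differential-type weight distribution for each direction will of course do. The edge strength for each direction is calculated as in (Equation 24).

[0100] [Equation 24]

[0101] The integration unit 214 combines the outputs of the directional filters 210, 211, 212, and 213. One example of this combination is given by (Equation 25).

[0102] [Equation 25]

[0103] The combination by the integration unit 214 need not be the sum-of-squares form of (Equation 25); a sum of absolute values, for example, may of course be used.

[0104] For each of the luminance component Y and the color components R-Y and B-Y, the edge strengths combined by the integration units 214A, 214B, and 214C in the high, middle, and low spatial frequency bands are multiplied together and output. The edge strengths for the Y, R-Y, and B-Y components are then added and passed to the ridge detection unit 207.

[0105] The separation of the base image in the contour detection unit 203 need not be into Y, R-Y, and B-Y; separation into other components such as R, G, and B may of course be used. Nor do the edge strengths for Y, R-Y, and B-Y have to be added before being passed to the ridge detection unit 207; they may be multiplied instead.

[0106] Returning to Fig. 18, the ridge detection unit 207 outputs only the values on the ridges of the edge strength summed over Y, R-Y, and B-Y. Fig. 21 shows an example of the configuration of the ridge detection unit 207. In Fig. 21, the horizontal ridge detection circuit 215 outputs 1 when the edge strength at the pixel of interest is greater than the edge strengths at both the pixel above and the pixel below it, and 0 otherwise.

[0107] Similarly, the vertical ridge detection circuit 216 outputs 1 when the edge strength at the pixel of interest is greater than the edge strengths at both the pixel to its left and the pixel to its right, and 0 otherwise. The outputs of the horizontal ridge detection circuit 215 and the vertical ridge detection circuit 216 are ORed and then multiplied by the input signal. That is, the ridge detection unit 207 outputs the edge strength only at pixels whose edge strength exceeds that of their horizontally or vertically adjacent pixels (pixels on a ridge) and outputs 0 for all other pixels.

[0108] Returning again to Fig. 18, the weight generation circuit 208 outputs 1 when the reliability evaluation value of the initial disparity estimate is at or above a threshold and 0 when it is below the threshold. Multiplying the output of the weight generation circuit 208 by the output of the ridge detection unit 207 extracts the edges located where the reliability of the initial disparity estimate is low, that is, the object contours at which the disparity changes discontinuously. The output of the weight generation circuit 208 is also stored in the computation region memory of the disparity estimation unit 204 described later. The extraction of object contours is expressed by (Equation 26).

[0109] [Equation 26]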
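The ridge detection and reliability gating of paragraphs [0106] to [0108] can be sketched directly from the description. Leaving border pixels at 0 and the function names are assumptions of this sketch; (Equation 26) is not reproduced in this text.

```python
def ridge_strength(edge):
    """Keep the edge strength only at ridge pixels, 0 elsewhere.

    A pixel is on a ridge if it is strictly stronger than both its upper and
    lower neighbours (circuit 215) or than both its left and right neighbours
    (circuit 216). Border pixels are left at 0 in this sketch.
    """
    h, w = len(edge), len(edge[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            e = edge[y][x]
            horizontal = e > edge[y - 1][x] and e > edge[y + 1][x]
            vertical = e > edge[y][x - 1] and e > edge[y][x + 1]
            if horizontal or vertical:
                out[y][x] = e
    return out


def object_contours(edge, reliability_map, threshold):
    """Ridge strength gated by a binary weight: 1 where the reliability
    evaluation value is at or above the threshold (low reliability)."""
    ridges = ridge_strength(edge)
    return [
        [r if j >= threshold else 0 for r, j in zip(ridge_row, rel_row)]
        for ridge_row, rel_row in zip(ridges, reliability_map)
    ]
```

Because a large evaluation value means low reliability, the gate keeps ridges only where block matching was uncertain, which is where disparity discontinuities are expected.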
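[0110] The outputs of the edge detection circuits 206A, 206B, and 206C need not be added before being input to the ridge detection unit 207; they may be multiplied instead. Also, the weight produced by the weight generation circuit 208, which is multiplied by the output of the ridge detection unit 207, need not be binary (0 or 1); a continuous value depending on the reliability of the initial disparity estimate may of course be output.

[0111] The disparity estimation unit 204 for the vicinity of object contours recalculates the disparity in low-reliability regions near object contours from the contour strength and the initial disparity. It computes the disparity distribution that minimizes the energy of the disparity distribution defined by (Equation 27).

[0112] [Equation 27]

[0113] The weight function w(x, y) is defined by (Equation 28) in terms of a smoothness parameter and the contour strength.

[0114] [Equation 28]

[0115] The condition for the disparity distribution that minimizes (Equation 27) is (Equation 29).

[0116] [Equation 29]

[0117] The differential equation of (Equation 29) can be solved numerically by known techniques such as the finite element method (FEM).

[0118] Fig. 22 is a block diagram showing an example of the configuration of the disparity estimation unit 204 for the vicinity of object contours. In Fig. 22, 217 is a weight generation circuit that generates the weights for the disparity distribution energy, 218 is a computation region memory, 219 is a disparity memory, 220 is a weight memory, and 221 is an FEM computation circuit.

[0119] The weight generation circuit 217 computes the value of the weight function of (Equation 28) from the contour strength and the smoothness parameter λ and writes it to the weight memory 220. The FEM computation circuit 221 solves (Equation 29) by the finite element method and computes the disparity distribution.

[0120] As described above, according to this embodiment, object contours are detected in regions where the reliability of the block-matching disparity estimate is low, and the disparity can be estimated so that it changes discontinuously at the detected contours.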
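[0121] This embodiment also allows the disparity to be estimated so that it changes discontinuously at object contours of arbitrary shape.

[0122] Disparity estimation near object contours only requires that the disparity change at the contour and connect smoothly with the surrounding disparity; it is not limited to computing the disparity that minimizes the energy of (Equation 27). Such an example is described below.

(Fourth Embodiment) Fig. 23 is a block diagram showing the configuration of the disparity estimation apparatus of the fourth embodiment of the present invention. In Fig. 23, 201 is an initial disparity estimation unit that calculates the initial disparity by block matching, 202 is a reliability evaluation unit for the initial disparity estimation, 222 is a contour detection unit, and 223 is a disparity estimation unit for the vicinity of object contours.

[0123] In this configuration, the components other than the contour detection unit 222 and the disparity estimation unit 223 operate in the same way as in the third embodiment, so their description is omitted and the operations of the contour detection unit 222 and the disparity estimation unit 223 are described below.

[0124] First, the contour detection unit 222 performs the same contour detection as the contour detection unit of the third embodiment and outputs the result in binary form (for example, 0 and 1). The disparity estimation unit 223 calculates the disparity in low-reliability regions near object contours from the initial disparity and the object contours detected by the contour detection unit 222.

[0125] Fig. 24 illustrates disparity estimation by the disparity estimation unit 223. In Fig. 24, 291 is a region where the reliability of the initial disparity estimate is low, 292 is an object contour detected by the contour detection unit 222, 293 is a region where the reliability of the initial disparity estimate is high, 294 is the point of interest at which the disparity is to be calculated, and 295 is a window region set so as to contain the point of interest.

[0126] The disparity at the point of interest 294 (x, y) is determined using the disparity in the surrounding region that adjoins the low-reliability region 291 within the window (here, the high-reliability region 293a), so that it is influenced by the disparity values of the surrounding region according to the distance between that region and the point of interest 294. By preventing the disparity of the surrounding region from influencing the point of interest 294 across the object contour 292, the disparity can be determined so that it changes at the contour 292 and connects smoothly with the surrounding disparity. One example of this estimation is given by (Equation 30).

[0127] [Equation 30]

[0128] The estimation by the disparity estimation unit 223 need not be limited to (Equation 30); any method in which the disparity changes at the object contour and connects smoothly with the surrounding disparity will of course do.

[0129] As described above, according to this embodiment, object contours are detected in regions where the reliability of the block-matching disparity estimate is low, and the disparity can be estimated so that it changes discontinuously at the detected contours.

[0130] This embodiment also allows the disparity to be estimated so that it changes discontinuously at object contours of arbitrary shape.

[0131] Furthermore, according to this embodiment, in regions where the reliability of the initial disparity estimate is low, the disparity is computed by referring to a relatively small number of surrounding disparities near the point of interest, so it can be computed with little memory and little computation.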
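The contour-aware interpolation of paragraphs [0124] to [0131] can be illustrated with a deliberately simplified one-dimensional sketch. (Equation 30) is not reproduced in this text; the version below works along a single image row, uses inverse-distance weights, and takes only the nearest reliable pixel on each side. All of these are assumptions chosen to show the idea that a contour blocks influence from the far side.

```python
def fill_row(disp, reliable, contour):
    """Re-estimate disparity at unreliable pixels of one row.

    disp:     initial disparities
    reliable: True where the initial estimate is trusted
    contour:  True at detected object-contour pixels (binary)
    """
    n = len(disp)
    out = list(disp)
    for x in range(n):
        if reliable[x]:
            continue
        num = den = 0.0
        for step in (-1, 1):
            i = x + step
            # Walk outward to the nearest reliable pixel; stop at a contour.
            while 0 <= i < n and not contour[i] and not reliable[i]:
                i += step
            if 0 <= i < n and reliable[i] and not contour[i]:
                weight = 1.0 / abs(i - x)   # closer pixels count more
                num += weight * disp[i]
                den += weight
        if den > 0:
            out[x] = num / den
    return out
```

With a contour between two reliable regions, pixels on each side take the disparity of their own side, so the result jumps at the contour; without a contour the same pixels receive a distance-weighted blend of both sides.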
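[0132] Using the disparity estimates described in the third and fourth embodiments, the left and right images can be shifted and merged to generate an image at a given intermediate viewpoint between the viewpoints of the left and right images. Disparity estimation and intermediate-viewpoint image generation may be performed at different places. The transmission and reception methods for that case are described below.

(Fifth Embodiment) Fig. 25 shows an example of the transmitting block of a system of the fifth embodiment of the present invention in which disparity estimation (or motion estimation) is performed on the transmitting side.

[0133] In Fig. 25, 170 is a disparity estimation means that estimates the disparity VL referenced to the left image, 171 is a disparity estimation means that estimates the disparity VR referenced to the right image, 172a to 172d are encoders, 173a and 173b are decoders, 174 is a prediction means that predicts the right image R from the left image L and the left-referenced disparity VL, 175 is a prediction means that predicts the right-referenced disparity VR from the left-referenced disparity VL, and 176a and 176b are hole-filling means that determine the disparity in regions where it is not estimated correctly. The operation of this configuration is described below.

[0134] First, the left image L is encoded by the encoder 172a. The disparity estimation means 170 and 171 estimate the disparities VL and VR referenced to the left and right images, respectively. For regions where the disparity is not estimated correctly because of occlusion and the like, the disparity is determined by the hole-filling means 176a and 176b, which use the disparity estimation method described in the third or fourth embodiment.

[0135] Next, the hole-filled disparity referenced to the left image is encoded by the encoder 172b. The encoded hole-filled left-referenced disparity is decoded by the decoder 173a and used by the predictor 174 to predict the right image R and by the predictor 175 to predict the hole-filled disparity referenced to the right image. The predictor 175 computes the prediction of the right-referenced disparity VR from the left-referenced disparity according to (Equation 31).

[0136] [Equation 31]

[0137] For the right image R, the residual from the image predicted by the predictor 174 is taken and encoded by the encoder 172d. For the hole-filled right-referenced disparity VR, the residual from the disparity predicted by the predictor 175 is taken and encoded by the encoder 172c.

[0138] Fig. 26 shows an example of the receiving block of a system in which disparity estimation is performed on the receiving side. In Fig. 26, 181a to 181d are decoders, 174 is the predictor for the right image R, and 175 is the predictor for the right-referenced disparity. The encoded left image L, the left-referenced disparity VL, the prediction error of the right-referenced disparity VR, and the prediction error of the right image R are decoded by the decoders 181a to 181d, respectively. The right image R is restored by adding the prediction from the predictor 174 and the decoded prediction error of the right image. The right-referenced disparity VR is restored by adding the prediction from the predictor 175 and the decoded prediction error.

[0139] Once the left image L, the right image R, the left-referenced disparity VL, and the right-referenced disparity VR are restored, an image at an intermediate viewpoint between the left and right images can be generated, for example by the intermediate-viewpoint image generation method of Japanese Patent Application No. H7-109821, and displayed together with the left and right images as multi-viewpoint images.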
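The prediction of the right-referenced disparity from the left-referenced disparity in paragraphs [0135] and [0136] can be sketched as below. (Equation 31) is not reproduced in this text; the sketch assumes the commonly used relation VR(x + VL(x, y), y) = -VL(x, y), with VL given as an integer horizontal displacement from the left image to the right image. The function name, the hole marker, and the rule that a later write overwrites an earlier one are illustrative assumptions.

```python
def predict_right_disparity(v_left, hole=None):
    """Predict the right-referenced disparity map from the left-referenced one.

    v_left[y][x] is the displacement such that left pixel x corresponds to
    right pixel x + v_left[y][x]. Right pixels that no left pixel maps to
    keep the `hole` marker and would need hole filling.
    """
    predicted = []
    for row in v_left:
        n = len(row)
        pred = [hole] * n
        for x, v in enumerate(row):
            xr = x + v
            if 0 <= xr < n:
                pred[xr] = -v   # same correspondence seen from the right image
        predicted.append(pred)
    return predicted
```

Only the difference between this prediction and the actual right-referenced disparity then has to be encoded, which is the residual described in [0137].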
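[0140] As described above, with this configuration, performing disparity estimation and hole filling on the transmitting side reduces the amount of computation on the receiving side and allows the receiving apparatus to be made smaller.

[0141] When multi-viewpoint images are transmitted, generating intermediate-viewpoint images on the transmitting side allows transmission with a reduced amount of data. Such an example is described below.

(Sixth Embodiment) Fig. 27 is a configuration diagram of the transmitting side of the multi-viewpoint image compression and transmission system of the sixth embodiment of the present invention. In Fig. 27, 101a to 101d are cameras that capture images at the respective viewpoint positions, 102 is an image compression encoding unit that compresses and encodes the images of camera 1 and camera 4, 103a is a decoding and image expansion unit that decodes and expands the image data compressed and encoded by the image compression encoding unit 102, 104a is an intermediate-viewpoint image generation unit that predicts and generates images at the viewpoints of camera 2 and camera 3 from the camera 1 and camera 4 images decoded and expanded by the unit 103a, and 105 is a residual compression encoding unit that compresses and encodes the residuals between the camera 2 and camera 3 images and the images generated by the intermediate-viewpoint image generation unit 104a. The operation of this configuration is described below.

[0142] The image compression encoding unit 102 compresses and encodes several images among the multi-viewpoint images (in this embodiment, the images at the two end viewpoints of the four) by an existing technique that exploits, for example, block correlation between images. Fig. 31 shows an example of the configuration of the image compression encoding unit 102. In Fig. 31, 107a and 107b are DCT means that perform the DCT on every 8 x 8 or 16 x 16 pixels and calculate DCT coefficients, 108a and 108b are quantization means that quantize the DCT coefficients, 109a is an inverse quantization means, 110a is an inverse DCT means that performs the inverse DCT, 111 is a disparity detection means, 112a is a disparity compensation means, and 113a is an encoding means that encodes the quantized DCT coefficients and the disparity. The operation of this configuration is described below.

[0143] The DCT means 107a processes the camera 1 image block by block and calculates the DCT coefficients for each block. The quantization means 108a quantizes those DCT coefficients. The inverse quantization means 109a inverse-quantizes the quantized DCT coefficients. The inverse DCT means 110a inverse-transforms the inverse-quantized DCT coefficients and restores the camera 1 image as it will be obtained on the receiving side. The disparity detection means 111 performs block matching between the restored camera 1 image and the camera 4 image and calculates the disparity referenced to the camera 1 image for each block. The disparity compensation means 112a predicts the camera 4 image from the restored camera 1 image and the per-block disparity (processing equivalent to motion compensation for moving images). The DCT means 107b processes the residual between the camera 4 image and the predicted image block by block and calculates DCT coefficients. The quantization means 108b quantizes the DCT coefficients of the residual. The encoding means 113a encodes the quantized DCT coefficients of the camera 1 image, the per-block disparity, and the quantized DCT coefficients of the disparity-compensation residual.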
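The disparity-compensated prediction and residual of paragraph [0143] can be sketched without the DCT, quantization, and entropy-coding stages. The block size, the convention that each block of the predicted image copies the reference image displaced by that block's disparity, and the function names are assumptions of this simplified illustration.

```python
def predict_from_blocks(ref, block_disp, block=2, fill=0):
    """Predict the other view from a reference image and per-block disparity.

    Each pixel (x, y) of the prediction copies ref[y][x + d], where d is the
    disparity of the block containing (x, y). Pixels whose source falls
    outside the reference image are set to `fill`.
    """
    h, w = len(ref), len(ref[0])
    pred = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = block_disp[y // block][x // block]
            xs = x + d
            if 0 <= xs < w:
                pred[y][x] = ref[y][xs]
    return pred


def residual(actual, pred):
    """Pixel-wise prediction error that would be passed on for encoding."""
    return [[a - p for a, p in zip(ar, pr)] for ar, pr in zip(actual, pred)]
```

When the prediction is good, the residual is mostly zero and compresses well, which is the reason for predicting the camera 4 image from the restored camera 1 image instead of coding it independently.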
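[0144] The decoding and image expansion unit 103a decodes and expands the image data compressed and encoded by the image compression encoding unit 102. Fig. 32 shows an example of the configuration of the decoding and image expansion unit 103a. In Fig. 32, 114a is a decoding means, 109b and 109c are inverse quantization means, 110b and 110c are inverse DCT means, and 112b is a disparity compensation means. The operation of this configuration is described below.

[0145] The decoding means 114a decodes the compressed and encoded data and expands it into the quantized DCT coefficients of the camera 1 image, the per-block disparity, and the quantized DCT coefficients of the disparity-compensation residual. The quantized DCT coefficients of the camera 1 image are inverse-quantized by the inverse quantization means 109b and expanded into an image by the inverse DCT means 110b. The disparity compensation means 112b generates the predicted camera 4 image from the expanded camera 1 image and the decoded disparity. The camera 4 image is then expanded by adding the residual expanded by the inverse quantization means 109c and the inverse DCT means 110c to the predicted image.

[0146] The intermediate-viewpoint image generation unit 104a calculates the per-pixel disparity from the camera 1 and camera 4 images by the method of either the third or the fourth embodiment of the present invention and predicts and generates the camera 2 and camera 3 images.

[0147] The residual compression encoding unit 105 compresses and encodes the residuals between the camera 2 and camera 3 images and the predicted images. Because the intermediate-viewpoint image generation unit 104a calculates the disparity per pixel, it estimates the disparity more accurately than block-wise disparity calculation by block matching. As a result, the prediction error (residual) of the intermediate-viewpoint images is reduced, compression efficiency is raised, bits can be allocated more effectively, and compression that maintains image quality is possible. Fig. 33 shows an example of the configuration of the residual compression encoding unit. In Fig. 33, 107c and 107d are DCT means, 108c and 108d are quantization means, and 113b is an encoding means. The residuals of the camera 2 and camera 3 images are converted into DCT coefficients by the DCT means 107c and 107d, quantized by the quantization means 108c and 108d, and encoded by the encoding means 113b.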
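[0148] Fig. 34 is a configuration diagram of the receiving side of the multi-viewpoint image compression and transmission system of the sixth embodiment of the present invention. In Fig. 34, 103b is a decoding and image expansion unit that decodes and expands the camera 1 and camera 4 image data compressed and encoded by the image compression encoding unit 102 on the transmitting side, 104b is an intermediate-viewpoint image generation unit that predicts and generates images at the viewpoints of camera 2 and camera 3 from the camera 1 and camera 4 images decoded and expanded by the unit 103b, and 106 is a decoding and residual expansion unit that decodes and expands the prediction errors (residuals) of the predicted images at the camera 2 and camera 3 viewpoints. The decoding and image expansion unit 103b and the intermediate-viewpoint image generation unit 104b operate in the same way as the units 103a and 104a on the transmitting side, so their description is omitted and the operation of the decoding and residual expansion unit is described below.

[0149] The decoding and residual expansion unit 106 decodes and expands the prediction errors (residuals) of the predicted images at the camera 2 and camera 3 viewpoints that were compressed and encoded by the residual compression encoding unit 105 on the transmitting side. Fig. 35 shows an example of the configuration of the decoding and residual expansion unit 106. In Fig. 35, 114b is a decoding means, 109d and 109e are inverse quantization means, and 110d and 110e are inverse DCT means. The compressed and encoded residual data of the camera 2 and camera 3 images is decoded by the decoding means 114b, inverse-quantized by the inverse quantization means 109d and 109e, and expanded by the inverse DCT means 110d and 110e. The decoded and expanded residuals of the camera 2 and camera 3 images are superimposed on the images generated by the intermediate-viewpoint image generation unit 104b to restore the images at the camera 2 and camera 3 viewpoints.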
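[0150] As described above, according to this embodiment, the transmitting side generates intermediate-viewpoint images from two non-adjacent images among the multi-viewpoint images, obtains the residuals between the generated images and the actual images at those intermediate viewpoints, and compresses, encodes, and transmits the two images and the residuals. The receiving side decodes and expands the transmitted two images and residuals, generates intermediate-viewpoint images from the two images, and superimposes the decoded residuals to restore images corresponding to the actual images at the intermediate viewpoints. In this way multi-viewpoint images can be compressed and transmitted efficiently while maintaining image quality.

[0151] Generation of intermediate-viewpoint images is not limited to generating them from the images at the two end viewpoints of the multi-viewpoint images (the viewpoints of camera 1 and camera 4). For example, images at the viewpoints of camera 1 and camera 3 may be generated from the camera 2 and camera 4 images, or images at the viewpoints of camera 2 and camera 4 from the camera 1 and camera 3 images. Images at the viewpoints of camera 1 and camera 4 may even be generated from the camera 2 and camera 3 images. Each of these is included in the present invention.

[0152] The number of viewpoints is not limited to four, and intermediate-viewpoint images between the respective viewpoints may clearly be generated from images at two or more viewpoints; this is included in the present invention.

[0153] In the third and fourth embodiments of the present invention, the reliability evaluation value of the initial disparity estimate need not be that of (Equation 23). Using only the numerator of (Equation 23) as the reliability evaluation value gives nearly the same effect, although it is influenced by the luminance gradient of the reference image, and is included in the present invention.

[0154] When the noise level of the images is low, computing the reliability evaluation value with the noise term neglected of course gives the same effect, and is included in the present invention.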
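[0155] As a further simplification, the minimum SSD per pixel, or the minimum SSD itself, may be used as the reliability evaluation value; this allows computation with a simpler circuit and is included in the present invention.

[0156] The difference between the disparities estimated in both directions, given by (Equation 32), may also be used as the reliability evaluation value of the initial disparity estimate, and is included in the present invention.

[0157] [Equation 32]

[0158] Combining two or more of the above as the reliability evaluation value of the initial disparity estimate gives a more stable reliability evaluation, and is included in the present invention.

[0159] In the third and fourth embodiments of the present invention, the correlation measure between images for initial disparity estimation need not be the sum of squared differences (SSD); the sum of absolute differences (SAD) gives the same effect, and such embodiments are of course included in the present invention.

[0160] In the sixth embodiment of the present invention, the method of compressing and encoding the images at the two non-adjacent viewpoints need not exploit the correlation between images (between viewpoints); a method exploiting correlation in the time direction may be used, and is included in the present invention.

[0161] [Effects of the Invention] As described above, according to the present invention, by transmitting information on the size of the camera's imaging surface (CCD), the distance between the imaging surface and the lens center, and the focal length of the lens, the display side can accurately calculate the viewing angle at capture, even for footage shot close to the subject, when producing a display matched to that viewing angle, and can accurately determine the viewing distance that reproduces the same viewing angle as at capture.

[0162] Adding information on the nearest and farthest points in the image when transmitting multi-viewpoint images enables a display that does not tire the eyes (disparity control).

[0163] Performing disparity control according to the fusional range of the human eye brings the disparity within the fusional range over a larger part of the image at display.

[0164] When the transmitting side sets, as the nearest-point and farthest-point information to be attached, values different from the actual nearest and farthest points in the image, and the display side controls the disparity so that the mean of the disparity corresponding to the set nearest point and the disparity corresponding to the set farthest point lies at the center of the fusional range, images at the depth intended by the image creator can be presented preferentially to the observer.

[0165] According to the present invention, object contours are detected in regions where the reliability of the block-matching disparity estimate is low, and the disparity can be estimated so that it changes discontinuously at the detected contours.

[0166] The disparity can also be estimated so that it changes discontinuously at object contours of arbitrary shape.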
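[0167] Performing disparity hole filling on the transmitting side (the disparity estimation of the present invention, in which the disparity changes at object contours and connects smoothly with the surrounding disparity) reduces the amount of computation on the receiving side and allows the receiving apparatus to be made smaller.

[0168] Generating intermediate-viewpoint images on both the transmitting and receiving sides of a multi-viewpoint image transmission system reduces the amount of data transmitted for the intermediate-viewpoint images (the residuals), so that multi-viewpoint images can be compressed and transmitted efficiently while maintaining image quality.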
About the method. [0002] 2. Description of the Related Art Conventionally, various types of stereoscopic video systems have been proposed.
3D video without special glasses
Multi-view images can be used by multiple people to observe images.
A multi-view stereoscopic video system is promising. Multi-view stereoscopic video
In the formula, the number of cameras and display devices used is
The more, the more observer feels natural motion parallax
And observation by a large number of people becomes easy. Only
However, restrictions such as the size of the imaging system and the setting of the optical axis of the camera
Limit the number of cameras that can be used practically
There is a degree. In the transmission and storage process,
It is desirable to reduce the amount of information that increases in proportion to the number
You. Therefore, on the display side, a binocular stereo
Multi-view stereo by generating intermediate viewpoint images from images
If images can be displayed, the burden on the imaging system can be reduced, and transmission and storage can be performed.
The amount of information at the time of integration can be reduced. point of view
Arbitrary views between different viewpoints from different images
To generate an intermediate viewpoint image that should be visible at a point, the image
It is necessary to estimate the depth by finding the correspondence of pixels between them. [0004] An image for digitally transmitting a moving image is also provided.
MPEG-1 and MPEG-2 are proposed as image compression methods
Have been. In addition, MPEG-2 has been expanded to
Attempts have been made to transmit images (ISO / IEC13818-2 / PDA
M3). FIG. 28 is a schematic diagram of MPEG-2 syntax.
is there. MPEG-2 transmission includes Sequence, GOP
(Group Of Picture), has a hierarchical structure called Picture
This is performed by encoding and decoding image data. ISO / IE
According to C13818-2 / PDAM3, many
The transmission of the viewpoint image is not clear (because it is not specified.
It does not seem to be implemented by extending the GOP layer. FIG. 29 shows a spatiotemporal image of a transmitted multi-viewpoint image.
It shows the relationship between directions. For conventional MPEG-2
Use parallax compensation in addition to the conventional motion compensation
To improve the coding efficiency. Multi-view images
When transmitting the information, information about each camera (camera
Camera parameters such as the position and orientation of the optical axis of the camera)
Must be transmitted. ISO / IEC13818-2 / PDAM3 includes:
The camera parameters are Pic.Extension (Picture layer) in FIG.
Is included in the transmission), but the
The description of the physical camera parameters is described.
Absent. For description of camera parameters, see CG
In the language OpenGL, the camera position,
The direction of the optical axis of the camera and the distance between the camera
Parameters ("Open
L Programming Guide "(OpenGL Programming Gu
ide, The Official Guide to Learning OpenGL, Release
1, Addison-Wesley Publishing Company, 1993)). FIG. 30 shows a camera parameter based on OpenGL.
It is explanatory drawing which shows the definition of a meter. In FIG. 30, A
Is the center of the lens, B is the center of the image plane (that is, the imaging plane),
C is the intersection of the perpendicular drawn from B to the top of the image and the top of the image
Show. The coordinate values of A, B, and C are respectively (optical cent
er X, optical center Y, optical center Z), (image pl
ane center X, image plane center Y, image plane cent
er Z), (image plane vertical X, image plane vertic
al Y, image plane vertical Z)
You. [0008] The camera path defined by the above OpenGL
Multi-view image by adding parameter information to Pic.Extension
Can easily be transmitted. [0009] However, as described above,
In conventional methods, such as the closest point and the farthest point (that is,
(The closest point, the farthest point of the subject)
Perform a display that does not cause eye fatigue (for example, parallax control).
Which depth range to use when trying to
I had the problem of not knowing. In displaying a multi-viewpoint image, a photograph
The viewing angle, the size of the imaging surface, and the distance between the lens center and the imaging surface.
Observation distance must be determined appropriately based on conditions such as distance
(There is a display that makes the display uncomfortable due to too much parallax,
So that the display does not have a poor sensation). Only
In OpenGL, the size of the imaging surface (the physical
Size) is not defined, and the lens center and
The distance from the image plane is always equal to the focal length of the lens.
I treat it. Therefore, the size of the viewing angle during imaging is
The display side does not know the
It is not possible to determine a sharp observation distance, resulting in a strange 3D display.
There was a problem that there is a possibility. The present invention has been made in view of the above, and has been described in view of the followings.
Of multi-view images that can calculate the viewing angle of
It is intended to provide a transmission method. [0012] According to the present invention, there are provided two or more viewpoints.
In image transmission, the imaging size of the camera that captures the image
Information about the distance between the lens and the center of the lens and the imaging surface.
With the above, information on the viewing angle at the time of imaging is obtained on the display side.
Multi-view image transmission method characterized in that
You. [0013] BEST MODE FOR CARRYING OUT THE INVENTION The present invention will now be described with reference to the embodiments.
Will be described based on the drawings showing (First Embodiment) FIG. 4 shows a first embodiment of the present invention.
Showing the parameters defined by the image transmission method in the state
It is. In FIG. 4, A1 and A2 are the center of the lens of the camera.
Indicates the position, and B1 and B2 indicate the center of the imaging surface.
The image pickup surface on the subject side with respect to the lens center.
I'm thinking back.) In OpenGL, the distance between A1B1 and A2B2 in FIG.
Is defined as the focal length of the camera lens.
In the invention, the distance between the lens center of the camera and the imaging surface is
Defined independently of the focal length of the lens. By this definition
Distance between the center of the lens and the imaging surface during focusing.
It can be calculated according to the distance, and the accurate viewing angle can be calculated. Field of view
The angle is the size of the imaging surface and the distance between the lens center and the imaging surface
Can be calculated from The center of the lens at the time of focusing will be described with reference to FIG.
The distance between the camera and the imaging surface depends on the distance between the subject and the center of the lens.
Will be described. FIG. 2 shows the position of the subject,
FIG. 4 is a diagram illustrating a relationship between a position of an imaging surface and a focal length during focusing.
In FIG. 2, A is the position of the subject, B is the point where light from A forms an image, O is the center of the lens, F is the point where parallel light is focused by the lens, a is the distance between the subject and the lens center O, b is the distance between the point B where light from A forms an image and the lens center O, and f is the focal length of the lens. It is known that the relationship of (Equation 1) holds among a, b, and f.

[0016] (Equation 1)

According to (Equation 1), when the subject is far enough from the lens that the focal length can be ignored (a >> f), 1/a → 0 and b can be approximated as b = f. However, when the subject is relatively close to the lens, the 1/a term cannot be ignored and b ≠ f. Therefore, to calculate the viewing angle correctly even when the subject is relatively close to the lens, the distance between the lens center and the imaging surface must be defined independently of the focal length. If the width of the imaging surface is win and its height is hin, the viewing angle at the time of imaging is given by (Equation 2).

[0018] (Equation 2)

Therefore, if the width of the image at the time of display is wout and its height is hout, the observation distance that reproduces the viewing angle at the time of imaging is given by (Equation 3).

[0020] (Equation 3)

Next, the improvement of visibility on the display side based on the nearest point and the farthest point in the image will be described. FIG. 3 is a diagram for explaining the positional relationship among the convergence distance, the nearest point, and the farthest point when convergence projection is performed using two projectors. In FIG. 3, C is the convergence point, A is the nearest point, and B is the farthest point. In projection with convergence, the parallax is 0 when the observer looks at point C (in FIG. 3, both eyes see the center of the image, so there is no difference in relative position within the images seen by the left and right eyes). When looking at the nearest point A, the eyes are in a so-called cross-eyed state, and a parallax of Da occurs on the image in the cross-eyed direction. In FIG. 3, each eye of the observer looks at a point shifted by Da/2 toward the cross-eyed side. Conversely, when looking at the farthest point B, a parallax of Db occurs on the image in the direction in which the eyes diverge.

FIG. 1 is a diagram showing, for the case of parallel projection, the positional relationship among the nearest point, the farthest point, and the point at which the observer's convergence and accommodation match. In FIG. 1, A is the nearest point of the displayed image, B is the farthest point, and C is the point at which the observer's convergence and accommodation match. In the parallel projection shown in FIG. 1, when an image having a parallax of Dc is displayed, it appears at the same point on the screen, and the observer's convergence and accommodation match.

The parallax in the images of FIGS. 3 and 1 described above is perceived by the observer as a stereoscopic effect in front of or behind the screen surface (the plane containing C). When the images can no longer be fused (they appear double), however, the observer experiences a sense of incongruity and discomfort.

The observer's visibility can be improved by shifting the images in the directions shown in FIG. 3 (image 1 and image 2 are each shifted horizontally within the plane perpendicular to its projection axis), on the basis of the nearest point, the farthest point, and the convergence point at the time of imaging, thereby changing the positional relationship between the convergence point and the farthest and nearest distances. For example, by shifting the images so as to cancel the average value of the parallax between them, the whole image can be made uniformly easy to view. FIG. 5 is a block diagram of such a process.
In FIG. 5, for simplicity, the case of two-lens (two-viewpoint) data is shown. In FIG. 5, 1 is image decoding means, 2 is parallax estimating means, 3 is average parallax calculating means, 4a and 4b are image shift means, and 5a and 5b are image display means. The operation of each means is described below.

[0028] The image decoding means 1 receives the multi-viewpoint image data and decodes it. The left and right images decoded by the image decoding means 1 are sent to the parallax estimating means 2. The parallax estimating means 2 calculates the parallax at each pixel (a parallax map) from the decoded left and right images. The case where the parallax is calculated by block matching with the left image as the reference is described below with reference to FIG. 6. First, a window region is set in the left image. Next, the sum of squared differences (SSD) shown in (Equation 4) is calculated.

[0029] (Equation 4)

(Equation 4) is evaluated for d at one-pixel intervals over the range dmin to dmax. The value of d that minimizes the SSD within the range dmin to dmax is taken as the parallax of the window region. The parallax at each pixel of the image is obtained by shifting the window region sequentially and repeating this calculation.

The limits dmin and dmax of the SSD calculation range can be calculated from the information on the nearest point and the farthest point. How dmin and dmax are obtained for parallel photographing and for convergence photographing is described below with reference to FIGS. 7 and 8.

FIG. 7 is a diagram showing the case of parallel photographing.
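The block-matching search described above can be sketched in a few lines of Python. This is an illustrative sketch only: since (Equation 4) is not reproduced in this text, the sign convention (pixel (x, y) of the left image is compared with pixel (x + d, y) of the right image), the square window, and the names `ssd` and `block_match_disparity` are assumptions.

```python
def ssd(left, right, x, y, d, half):
    """Sum of squared differences between the window centered at (x, y) in the
    left image and the window centered at (x + d, y) in the right image."""
    total = 0.0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            diff = left[y + j][x + i] - right[y + j][x + i + d]
            total += diff * diff
    return total


def block_match_disparity(left, right, d_min, d_max, half=1):
    """Parallax map referenced to the left image.

    For each interior pixel, d is scanned from d_min to d_max in one-pixel
    steps and the d with the smallest SSD is kept. Border pixels stay 0.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_d, best = 0, float("inf")
            for d in range(d_min, d_max + 1):
                if x + d - half < 0 or x + d + half >= w:
                    continue  # shifted window would leave the right image
                s = ssd(left, right, x, y, d, half)
                if s < best:
                    best, best_d = s, d
            disp[y][x] = best_d
    return disp
```

Restricting the scan to dmin..dmax is what makes the transmitted nearest-point and farthest-point information useful: a narrower range means fewer SSD evaluations and fewer false matches.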
In the coordinate system shown in FIG. 7, let the coordinates of the left and right lens centers be (−D/2, 0) and (D/2, 0), the distance between the imaging surface and the lens center be b, the horizontal coordinate of the object position in three-dimensional space be X0, its coordinate in the depth direction be Z0, and the horizontal positions at which light from the object at position (X0, Z0) is imaged on the left and right imaging surfaces be xl0 and xr0, respectively (xl0 and xr0 are horizontal coordinates in plane coordinate systems whose origin is the intersection of each camera's optical axis with its imaging surface). Then, from the geometric relationship, (Equation 5) holds.

[0033] (Equation 5)

Therefore, the parallaxes referenced to the left image and to the right image are each given by (Equation 6).

[0035] (Equation 6)

Here, if the depth value of the nearest point in the image is Zmin and the depth value of the farthest point is Zmax, the upper limit dmax and the lower limit dmin of the SSD calculation range are given by (Equation 7).

[0037] (Equation 7)

FIG. 8 is a diagram showing the case of convergence photographing.
In the coordinate system shown in FIG. 8, let the coordinates of the intersection of the optical axes of the left and right cameras (the convergence point) be (0, C), the coordinates of the left and right lens centers be (−D/2, 0) and (D/2, 0), the distance between the imaging surface and the lens center be b, the horizontal coordinate of the object position in three-dimensional space be X0, its coordinate in the depth direction be Z0, and the horizontal positions at which light from the object at position (X0, Z0) is imaged on the left and right imaging surfaces be xl0 and xr0, respectively (xl0 and xr0 are horizontal coordinates in plane coordinate systems whose origin is the intersection of each camera's optical axis with its imaging surface). Then, from the geometric relationship, (Equation 8) holds.

[0039] (Equation 8)

Therefore, the parallaxes referenced to the left image and to the right image are each given by (Equation 9).
[0041] (Equation 9)

Because X0 remains in (Equation 9), in convergence photographing the parallax differs with horizontal position even at the same depth (that is, the reproduced stereoscopic image is distorted). For simplicity, consider the parallax at points with X0 = 0 (that is, on the Z axis). Substituting X0 = 0 into (Equation 9) gives (Equation 10).

[0043] (Equation 10)

From (Equation 10), the upper limit dmax and the lower limit dmin of the parallax, in pixels, can be determined from the depth value Zmin of the nearest point in the image, the depth value Zmax of the farthest point, the depth value C of the convergence point, the number of horizontal pixels nx, and the width win of the imaging surface (CCD). When the parallax at points off the Z axis is also taken into account, the upper limit dmax and the lower limit dmin can be determined by calculating the maximum and minimum values of (Equation 9).

As described above, given the depth value of the nearest point in the image, the depth value of the farthest point, the camera positions, and the orientations of the camera optical axes, the range of values that the parallax can take can be calculated, and the SSD calculation range for parallax estimation can be determined.
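As a rough illustration of how a search range follows from the nearest and farthest depths, the sketch below handles the parallel-photographing case only. It assumes pinhole geometry with lens centers at (−D/2, 0) and (D/2, 0), so that the parallax magnitude on the imaging surface is D·b/Z, and converts it to pixels with nx / win. Equations 5 to 7 are not reproduced in this text, so the formula, the use of magnitudes rather than signed values, and the function names are assumptions.

```python
import math


def parallax_pixels(Z, D, b, w_in, n_x):
    """Parallax magnitude in pixels for a point at depth Z (parallel cameras).

    D: distance between lens centers, b: lens-to-imaging-surface distance,
    w_in: imaging surface width, n_x: number of horizontal pixels.
    """
    return (D * b / Z) * (n_x / w_in)


def ssd_search_range(Z_min, Z_max, D, b, w_in, n_x):
    """(lower, upper) bounds of the parallax magnitude in whole pixels.

    The farthest point gives the smallest parallax and the nearest point the
    largest; the sign depends on the reference image and axis orientation.
    """
    far = parallax_pixels(Z_max, D, b, w_in, n_x)
    near = parallax_pixels(Z_min, D, b, w_in, n_x)
    return math.floor(far), math.ceil(near)


# Hypothetical values in millimetres: 65 mm baseline, b = 50 mm,
# 36 mm wide imaging surface with 720 pixels, scene from 1 m to 10 m.
search = ssd_search_range(1000.0, 10000.0, 65.0, 50.0, 36.0, 720)
```

For convergence photographing the parallax also depends on the convergence depth C and on X0, so this sketch would need the relationships of (Equation 9) or (Equation 10) instead.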
The average parallax calculating means 3 calculates the average of the parallax map calculated by the parallax estimating means 2. The average of the parallax map is obtained by calculating (Equation 11).

[0047] (Equation 11)

The image shift means 4a and 4b shift the images so that points having the average parallax obtained by the average parallax calculating means 3 are displayed at the same depth as the display surface (that is, with zero parallax on the display surface).

In FIG. 1, which shows display by parallel projection, A is the depth of the nearest point in the displayed image, B is the depth of the farthest point, and C is the depth of the average parallax. FIG. 1 shows that, in parallel projection, when there is a parallax of Dc given by (Equation 12) between the left and right images, the parallax on the screen becomes zero and the display is a natural one in which convergence and accommodation match.

[0050] (Equation 12)

The image shift means 4a shifts the left image by the shift amount given by (Equation 13) (a shift to the right is positive).

[0052] (Equation 13)

The image shift means 4b shifts the right image by the same amount in the opposite direction. As a result of the shifts by the image shift means 4a and 4b, points having the average parallax are displayed at the same depth as the screen.

In FIG. 3, which shows display by convergence projection, A is the depth of the nearest point in the displayed image, B is the depth of the farthest point, and C is the depth of the average parallax. In convergence projection, a point is displayed at the same depth as the screen when its parallax is 0 at the center of the image. Therefore, in the case of convergence projection, the image shift means 4a and 4b shift the left and right images by −1/2 times the average parallax.

As described above, according to the present embodiment, adding information on the nearest point and the farthest point in the image when a multi-viewpoint image is transmitted makes it possible to perform, on the display side, display that causes little eye fatigue (parallax control).

In addition, by adding and transmitting information on the size of the imaging surface (CCD) of the camera, the distance between the imaging surface and the lens center, and the focal length of the lens, the display side can accurately calculate the viewing angle at the time of photographing when attempting display that matches that viewing angle, even for images photographed close to the subject.

When a multi-viewpoint image is transmitted without information on the nearest point and the farthest point, a special code indicating that this information has not been added may be transmitted in its place, and the display side may estimate the parallax at the nearest point and the farthest point by calculating the parallax within a preset range. This is also included in the present invention.

Further, by setting the information on the nearest point and the farthest point to specific depth values on the transmitting side, parallax control can be performed so that the parallax within the specified depth range falls within the fusion range. This is also included in the present invention.

In the present invention, an example in which the parallax is calculated on the display side has been described, but the parallax contained in the encoded image may be used instead, and this is also included in the present invention. Such an example is described with reference to FIG. 10.

In FIG. 10, the operation of the components other than the image decoding means 6 is the same as in the parallax control system described above, so its description is omitted; the operation of the image decoding means 6 is described below. The image decoding means 6 decodes the encoded image data and outputs the left and right images and the parallax referenced to the left image.
When a binocular image is transmitted by the MPEG-2 multi-view image transmission method, the compression ratio is increased by parallax compensation referenced to the left image. Extracting the parallax from the encoded image data removes the need to calculate the parallax on the display side and so reduces the amount of computation there.

The average parallax calculated by the average parallax calculating means 3 may be a weighted average that emphasizes the center of the screen, as given by (Equation 14). This allows parallax control that makes fusion easier at the center of the image, and is included in the present invention.

[0062] (Equation 14)

FIGS. 9(a), 9(b) and 9(c) show examples of the distribution of weights used to calculate the weighted average by (Equation 14).
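The averaging and shifting steps performed by means 3, 4a and 4b can be sketched as follows. Equations 11 to 14 are not reproduced in this text, so this is an illustration under assumptions: the weight shape in `center_weights` is a hypothetical separable triangular profile, and `shift_image` takes an integer shift that the caller derives from the mean parallax according to the projection method.

```python
def center_weights(h, w):
    """Hypothetical non-negative weights that are largest at the image center."""
    wy = [1.0 + min(y, h - 1 - y) for y in range(h)]
    wx = [1.0 + min(x, w - 1 - x) for x in range(w)]
    return [[wy[y] * wx[x] for x in range(w)] for y in range(h)]


def mean_parallax(disp, weights=None):
    """Plain mean of a parallax map, or a weighted mean if weights are given."""
    num = den = 0.0
    for y, row in enumerate(disp):
        for x, d in enumerate(row):
            wt = 1.0 if weights is None else weights[y][x]
            num += wt * d
            den += wt
    return num / den


def shift_image(img, s, fill=0):
    """Shift every row horizontally by s pixels (positive = to the right),
    filling vacated pixels with `fill`."""
    w = len(img[0])
    out = []
    for row in img:
        if s >= 0:
            out.append([fill] * min(s, w) + row[:max(w - s, 0)])
        else:
            out.append(row[min(-s, w):] + [fill] * min(-s, w))
    return out


# A single pixel with parallax 6 at the center of a 3x3 map:
disp = [[0, 0, 0], [0, 6, 0], [0, 0, 0]]
plain = mean_parallax(disp)                           # 6 / 9
weighted = mean_parallax(disp, center_weights(3, 3))  # center counts more
```

The left and right images would then be shifted in opposite directions, for example `shift_image(left, s)` and `shift_image(right, -s)`.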
Although the weights are shown one-dimensionally for simplicity, the actual distribution is two-dimensional, with larger values at the center of the image than at the periphery. The weight values are all 0 or greater (none is negative).

(Second Embodiment)

FIG. 11 is a block diagram of the parallax control system in the second embodiment of the present invention. In FIG. 11, the components other than the frequency calculating means 7 and the shift amount calculating means 8 operate in the same way as in the first embodiment; they are given the same reference numerals as in the description of the first embodiment, and their description is omitted. The operation of the frequency calculating means 7 and the shift amount calculating means 8 is described below.

The frequency calculating means 7 calculates the frequency of the decoded parallax referenced to the left image. The frequency of the parallax is the number of pixels, counted for each parallax value, within a region of the image (for example, the whole image, or a specific region determined by some criterion). The shift amount calculating means 8 calculates, from the parallax frequency (between the images) calculated by the frequency calculating means 7 and the fusion range of the human eye corresponding to the viewing angle of the image, the shift amount that maximizes the sum of the parallax frequencies within the fusion range, and outputs it to the image shift means 4a and 4b.

FIG. 12 shows an example of the configuration of the shift amount calculating means 8. In FIG. 12, 9 is an MPU and 10 is a fusion range table. The MPU 9 calculates the horizontal viewing angle given by (Equation 15) from the width of the image display surface and the observation distance, and reads the fusion range at that viewing angle from the fusion range table 10.

[0066] (Equation 15)

FIG. 13 shows an example of the characteristics of the fusion range table.
In FIG. 13, the horizontal axis shows the horizontal viewing angle of the image display surface, and the vertical axis shows the parallax fusion range (converted into an angle by (Equation 16)).

[0068] (Equation 16)

On the vertical axis of FIG. 13, the negative side indicates parallax perceived in front of the display surface, and the positive side indicates parallax perceived behind the display surface. The geometric meaning of (Equation 16) is that the angle-converted parallax θ is the parallax Δ on the image display surface converted into a viewing angle.

On the other hand, in the parallel projection and the convergence projection described above, the relationship between the image positions xl1, xr1 (for example, the pixel positions on the liquid crystal panel in the case of a liquid crystal projector) and the positions Xl, Xr on the display surface is given by (Equation 17) and (Equation 19), respectively, and the parallax on the display surface is given by (Equation 18) and (Equation 20).

[0071] (Equation 17)

[0072] (Equation 18)

[0073] (Equation 19)

[0074] (Equation 20)

The relationship between the coordinate values (xl0, yl0), (xr0, yr0) on the imaging surface at the time of photographing and the image positions (xl1, yl1), (xr1, yr1) at the time of projection (for example, the pixel positions on the liquid crystal panel in the case of a liquid crystal projector) is given by (Equation 21).

[0076] (Equation 21)

Here, the width win of the imaging surface is obtained from the camera parameters, and the image width wout at the time of display is a value specific to the display system.

Depending on the imaging conditions (parallel photographing or convergence photographing), xl0 and xr0 are calculated using (Equation 5) or (Equation 8) and converted into xl1, xr1 by (Equation 21). Then, depending on the projection conditions (parallel projection or convergence projection), (Equation 18) or (Equation 20) is calculated, which gives the parallax on the display screen with both the imaging conditions and the projection conditions taken into account.

The MPU 9 converts the fusion range read from the fusion range table 10 into a parallax (distance) on the display surface, and thereby determines the parallax fusion range on the image display surface. Using the above relationship between the parallax in the image data and the parallax on the image display surface, the MPU 9 then calculates the shift amount of the image data that maximizes the sum of the parallax frequencies within the fusion range (an image shift by parallax control corresponds to moving the frequency distribution of FIG. 15 horizontally).

The image shift means 4a and 4b shift the images in opposite directions by the output shift amount, and the image display means 5a and 5b display them, so that the display is one in which the sum of the parallax frequencies within the fusion range is maximized (that is, the number of pixels in the image that can be fused is maximized).
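The shift-amount search performed by the frequency calculating means 7 and the shift amount calculating means 8 can be illustrated with a histogram and a scan over candidate offsets. This sketch works directly in pixels and assumes the fusion range has already been converted from a viewing angle to parallax limits on the display surface; the names `parallax_histogram` and `best_shift`, and the idea of passing an explicit candidate list, are assumptions for illustration.

```python
from collections import Counter


def parallax_histogram(disp):
    """Number of pixels for each parallax value in the map."""
    return Counter(d for row in disp for d in row)


def best_shift(hist, fuse_min, fuse_max, candidates):
    """Offset s, added to every parallax, that maximizes the number of pixels
    whose shifted parallax d + s lies within [fuse_min, fuse_max].

    If several offsets tie, the first one in `candidates` is returned.
    """
    def pixels_inside(s):
        return sum(n for d, n in hist.items() if fuse_min <= d + s <= fuse_max)

    return max(candidates, key=pixels_inside)


# Hypothetical parallax map: most pixels near 10-12, one outlier at 30.
disp = [[10, 10, 11], [12, 30, 10]]
hist = parallax_histogram(disp)
s = best_shift(hist, -2, 2, range(-40, 41))
```

Adding s to every parallax corresponds to sliding the frequency distribution horizontally; in a display system the offset would be realized by shifting the left and right images in opposite directions.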
Thus, by performing parallax control according to the fusion range of the human eye, the parallax in a larger part of the image can be kept within the fusion range when the image is displayed.

In the present embodiment, parallax control that maximizes the sum of the parallax frequencies within the fusion range has been described, but controlling the parallax so that the average value of the parallax lies at the center of the fusion range produces almost the same effect, and is included in the present invention.

Further, the transmitting side may set the nearest point and the farthest point to values different from the nearest point and the farthest point in the actual image, and the display side may control the parallax so that the average of the parallaxes corresponding to the set nearest point and farthest point lies at the center of the fusion range. This allows images at the depth intended by the image creator to be presented preferentially to the observer, and is included in the present invention.

(Third Embodiment)

The third embodiment of the present invention relates to a parallax estimation method and apparatus that take a pair of images as input, calculate an initial parallax and the reliability of the initial parallax, detect object contour lines from the reference image and the reliability of the initial parallax, and, from the initial parallax, its reliability, and the detected object contour lines, determine the parallax in regions near the object contour lines where the reliability of the initial parallax is low. The parallax is determined so that it changes at the object contour line and connects smoothly with the surrounding parallax.
In the present embodiment, with the above configuration, the initial parallax and the reliability of the initial parallax are calculated from a pair of images, object contour lines are detected from the reference image and the reliability of the initial parallax, and, from the initial parallax, its reliability, and the detected object contour lines, the parallax in regions near the object contour lines where the initial parallax has low reliability is determined so that it changes at the object contour line and connects smoothly with the surrounding parallax.

FIG. 16 is a block diagram of a parallax estimating apparatus according to the third embodiment of the present invention.

In FIG. 16, 201 is an initial parallax estimating unit that calculates the initial parallax by block matching, 202 is a reliability evaluation unit for the initial parallax estimation, 203 is a contour detection unit, and 204 is a parallax estimating unit for the vicinity of object contours.

The operation of the above configuration is described below.

The initial parallax estimating unit 201 calculates the sum of squared differences (hereinafter SSD) given by (Equation 22). The SSD value of (Equation 22) is small where the distribution of pixel values in the window region set in the reference image is similar to that in the window region set in the other image of the pair, and large where the two distributions differ. The initial parallax estimating unit 201 takes the shift amount d between the images that minimizes the SSD value within a predetermined search range as the parallax at the point of interest (x, y), outputs this parallax value to the parallax estimating unit 204 for the vicinity of object contours, and outputs the minimum SSD value within the search range to the reliability evaluation unit 202 for the initial parallax estimation.

[0089] (Equation 22)

FIG. 17 is a diagram illustrating the initial parallax estimation (block matching) performed by the initial parallax estimating unit 201.
In FIG. 17, the window region set around the point of interest (x, y) is the integration region W of (Equation 22). By setting the window region at successively shifted positions and performing the above SSD calculation at each, the initial parallax for the whole image is obtained.

The reliability evaluation unit 202 for the initial parallax estimation calculates the reliability evaluation value of the correspondence given by (Equation 23) from the minimum SSD value within the search range obtained in the parallax calculation by the initial parallax estimating unit 201, the number of pixels in the window region (block), the noise variance between the images, and the mean squared horizontal and vertical luminance gradients of the reference image within the window region.

[0092] (Equation 23)

A smaller value of (Equation 23) indicates higher reliability of the parallax estimate, and a larger value indicates lower reliability.

FIG. 18 is a block diagram showing an example of the configuration of the contour detection unit 203.
In FIG. 18, 205 is a YC separation circuit that separates the reference image into a luminance component and color components; 206A, 206B, and 206C are edge detection circuits that detect edges from the separated luminance component Y and color components R-Y and B-Y, respectively; 207 is a ridge line detection unit that outputs only the intensity at the ridge lines of the edge detection result; and 208 is a weight generation circuit that outputs a weight of 1 in regions where the initial parallax estimate has low reliability and a weight of 0 in regions where it has high reliability.

The operation of the above configuration is described below.

The YC separation circuit 205 separates the reference image into the luminance component Y and the color components R-Y and B-Y and outputs them.

The edge detection circuits 206A, 206B, and 206C detect edge components from the Y, R-Y, and B-Y components, respectively. FIG. 19 is a block diagram showing an example of the configuration of the edge detection circuit 206. In FIG. 19, 209A, 209B, and 209C are groups of direction-specific filters that detect edge components in the low, medium, and high spatial frequency ranges, respectively; 210, 211, 212, and 213 are the direction-specific filters that make up each group. FIG. 20 shows examples of the spatial weights of these direction-specific filters. FIGS. 20(a), (b), and (c) detect edges that are continuous in the vertical direction, and (d), (e), and (f) detect edges in an oblique direction. (a) and (d) are examples of weight distributions for the high spatial frequency range, (b) and (e) for the medium spatial frequency range, and (c) and (f) for the low spatial frequency range. Edges in the horizontal direction and in the other oblique direction can be detected by rotating these coefficient arrangements by 90 degrees. The edge directions need not be limited to 45-degree steps; 30-degree steps, for example, may of course be used. Nor do the spatial weights of the direction-specific filters need to be limited to those shown in the figure; it is sufficient that the weight distribution is of a differential type with respect to each direction. The edge strength for each direction is calculated by (Equation 24).

[0100] (Equation 24)

The integrating unit 214 integrates the outputs of the direction-specific filters 210, 211, 212, and 213. An example of the integration by the integrating unit 214 is given by (Equation 25).

[0102] (Equation 25)

The integration by the integrating unit 214 need not be limited to the sum-of-squares form shown in (Equation 25); a sum of absolute values, for example, may of course be used.

For each of the luminance component Y and the color components R-Y and B-Y, the edge strengths in the high, medium, and low spatial frequency ranges, integrated by the integrating units 214A, 214B, and 214C respectively, are multiplied together and output. The edge strengths for the Y, R-Y, and B-Y components are then added and passed to the ridge line detection unit 207.

The separation into luminance and color components in the contour detection unit 203 need not be limited to Y, R-Y, and B-Y; separation into other components such as R, G, and B is of course also possible. Likewise, the edge strengths for Y, R-Y, and B-Y need not be added before being passed to the ridge line detection unit 207; they may be multiplied and then passed on. Returning to FIG. 18, the ridge line detection unit 207
outputs, of the edge strengths summed over Y, R-Y, and B-Y, only the values at ridge lines. FIG. 21 shows an example of the configuration of the ridge line detection unit 207. In FIG. 21, the horizontal ridge line detection circuit 215 outputs 1 if the edge strength at the pixel of interest is greater than the edge strengths at both the pixel above and the pixel below it, and outputs 0 otherwise.

Similarly, the vertical ridge line detection circuit 216 outputs 1 if the edge strength at the pixel of interest is greater than the edge strengths at both the pixel to its left and the pixel to its right, and outputs 0 otherwise. The outputs of the horizontal ridge line detection circuit 215 and the vertical ridge line detection circuit 216 are ORed, and the result is multiplied by the input signal and output. That is, the ridge line detection unit 207 outputs the edge strength only at pixels whose edge strength is greater than that of both adjacent pixels in the horizontal or the vertical direction (that is, pixels on a ridge line), and outputs 0 for all other pixels. Returning to FIG. 18 again, the weight generation circuit 208
outputs 1 when the reliability evaluation value of the initial parallax estimate is at or above a threshold, and outputs 0 when it is below the threshold. Multiplying the output of the weight generation circuit 208 by the output of the ridge line detection unit 207 extracts the edges where the reliability of the initial parallax estimate is low, that is, the object contour lines at which the parallax changes discontinuously. The output of the weight generation circuit 208 is also stored in the calculation area memory of the parallax estimating unit 204 for the vicinity of object contours, described later. The extraction of the object contour lines is expressed by (Equation 26).

[0109] (Equation 26)

The outputs of the edge detection circuits 206A, 206B, and 206C need not be added before being input to the ridge line detection unit 207; they may be multiplied and then input. In addition, the weights generated by the weight generation circuit 208, which are multiplied by the output of the ridge line detection unit 207, need not be limited to the two values 0 and 1; continuous values corresponding to the reliability of the initial parallax estimation may of course be output. The parallax estimating unit 204 for the vicinity of object contours
recalculates the parallax in regions near the object contour lines where the initial parallax estimate has low reliability, from the contour strength and the initial parallax. The parallax estimating unit 204 calculates the parallax distribution that minimizes the energy of the parallax distribution defined by (Equation 27).

[0112] (Equation 27)

The weight function w(x, y) is defined by (Equation 28) in terms of the smoothness parameter and the contour strength.

[0114] (Equation 28)

The condition on the parallax distribution that minimizes (Equation 27) is (Equation 29).

[0116] (Equation 29)

The differential equation (Equation 29) can be solved numerically by a known technique such as the finite element method (FEM).

FIG. 22 is a block diagram showing an example of the configuration of the parallax estimating unit 204 for the vicinity of object contours. In FIG. 22, 217 is a parallax distribution energy weight generation circuit that generates the weights for the parallax distribution energy, 218 is a calculation area memory, 219 is a parallax memory, 220 is a weight memory, and 221 is an FEM calculation circuit.

The parallax distribution energy weight generation circuit 217 calculates the value of the weight function of (Equation 28) from the contour strength and the smoothness parameter λ, and writes it into the weight memory 220. The FEM calculation circuit 221 calculates the parallax distribution by solving (Equation 29) with the finite element method.

As described above, according to the present embodiment, object contour lines are detected in regions where the reliability of the parallax estimate by block matching is low, and the parallax can be estimated so that it changes discontinuously at the detected object contour lines.

Also, according to the present embodiment, the parallax can be estimated so that it changes discontinuously at object contour lines of arbitrary shape.

The parallax estimation near object contours only needs to make the parallax change at the object contour line and connect smoothly with the surrounding parallax; it need not be limited to the method of calculating the parallax that minimizes the energy shown in (Equation 27). Such an example is described below.

(Fourth Embodiment)

FIG. 23 is a block diagram showing the configuration of a parallax estimating apparatus according to the fourth embodiment of the present invention.
In FIG. 23, 201 is an initial parallax estimating unit that calculates the initial parallax by block matching, 202 is a reliability evaluation unit for the initial parallax estimation, 222 is a contour detection unit, and 223 is a parallax estimating unit for the vicinity of object contours.

In the above configuration, the operation of the components other than the contour detection unit 222 and the parallax estimating unit 223 is the same as in the third embodiment of the present invention, so its description is omitted; the operation of the contour detection unit 222 and the parallax estimating unit 223 is described below.

First, the contour detection unit 222 performs the same contour detection as the contour detection unit in the third embodiment of the present invention, binarizes the detection result (for example, into 0 and 1), and outputs it.
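The ridge-line test, the reliability mask, and the binarization described for the contour detection units can be sketched together as follows. The sketch assumes the combined edge strength has already been computed (the direction-specific filters and (Equation 26) are not reproduced in this text), leaves border pixels at 0, and uses the convention stated earlier that a larger reliability evaluation value means a less reliable parallax estimate. The function names are illustrative.

```python
def ridge_lines(edge):
    """Keep the edge strength only at ridge pixels, 0 elsewhere.

    A pixel is on a ridge if its edge strength exceeds both the pixels above
    and below it, or both the pixels to its left and right.
    """
    h, w = len(edge), len(edge[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            e = edge[y][x]
            horizontal = e > edge[y - 1][x] and e > edge[y + 1][x]
            vertical = e > edge[y][x - 1] and e > edge[y][x + 1]
            if horizontal or vertical:
                out[y][x] = e
    return out


def object_contour(edge, evaluation, threshold):
    """Binary contour map: 1 at ridge pixels whose reliability evaluation
    value is at or above the threshold (low reliability), 0 elsewhere."""
    ridge = ridge_lines(edge)
    h, w = len(edge), len(edge[0])
    return [[1 if ridge[y][x] > 0 and evaluation[y][x] >= threshold else 0
             for x in range(w)] for y in range(h)]
```

Replacing the 0/1 mask with a continuous weight would give the non-binary variant mentioned for the weight generation circuit.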
The parallax estimating unit 223 for the vicinity of object contours calculates the parallax in regions near the object contour lines where the initial parallax estimate has low reliability, from the initial parallax and the object contour lines detected by the contour detection unit 222.

FIG. 24 is a diagram showing how the parallax is estimated by the parallax estimating unit 223. In FIG. 24, 291 is a region where the initial parallax estimate has low reliability, 292 is an object contour line detected by the contour detection unit 222, 293 is a region where the initial parallax estimate has high reliability, 294 is the point of interest at which the parallax is to be calculated, and 295 is a window region set so as to include the point of interest. The parallax at the point of interest 294 (x, y) is determined using the parallax in the surrounding region that adjoins the low-reliability region 291 within the window region (in this case, the high-reliability region 293a), so that it is influenced by the parallax values in the surrounding region according to the distance between the surrounding region and the point of interest 294. By ensuring that the parallax in the surrounding region does not influence the point of interest 294 across the object contour line 292, the parallax can be determined so that it changes at the object contour line 292 and connects smoothly with the surrounding parallax. One example of an expression for the parallax estimation by the parallax estimating unit 223 is (Equation 30).

[0127] (Equation 30)

However, the parallax estimation by the parallax estimating unit 223 need not be limited to (Equation 30); any method in which the parallax changes at the object contour line and connects smoothly with the surrounding parallax may of course be used.
In areas where the reliability of parallax estimation values by
The object contour, and
It is possible to perform disparity estimation so that disparity changes discontinuously in
According to the present embodiment, parallax can be estimated so that it changes discontinuously at an object contour line of arbitrary shape. Furthermore, according to the present embodiment, in regions where the initial parallax estimate is unreliable, the parallax is calculated by referring to the parallax in a comparatively small neighborhood of the point of interest, so the parallax can be calculated with a small amount of memory and computation.
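The near-contour estimation described for FIG. 24 can be illustrated with a minimal Python sketch. Since (Equation 30) is not reproduced in this text, the sketch does not implement it; it assumes an inverse-distance weighted average of the reliable parallax values inside the window, and it excludes any reliable pixel whose straight line to the point of interest passes through a contour pixel. The function names, the window half-size `half`, and the line-sampling test are hypothetical choices made only for illustration.

```python
import numpy as np

def crosses_contour(contour, p, q):
    """Return True if the straight segment p -> q passes through a contour pixel.

    p and q are (row, col) tuples; only interior samples are tested.
    """
    (y0, x0), (y1, x1) = p, q
    n = max(abs(y1 - y0), abs(x1 - x0))
    for i in range(1, n):
        y = round(y0 + (y1 - y0) * i / n)
        x = round(x0 + (x1 - x0) * i / n)
        if contour[y, x]:
            return True
    return False

def fill_disparity(disp, reliable, contour, y, x, half=3):
    """Estimate the parallax at (y, x) from reliable neighbours in the window.

    Assumption: weights are 1/distance (not the patent's Equation 30).
    Neighbours on the far side of the binarized contour are ignored, so the
    result may change discontinuously across the contour line.
    """
    h, w = disp.shape
    num = den = 0.0
    for v in range(max(0, y - half), min(h, y + half + 1)):
        for u in range(max(0, x - half), min(w, x + half + 1)):
            if (v, u) == (y, x) or not reliable[v, u]:
                continue
            if crosses_contour(contour, (y, x), (v, u)):
                continue
            wgt = 1.0 / np.hypot(v - y, u - x)
            num += wgt * disp[v, u]
            den += wgt
    return num / den if den > 0 else None  # None: no usable support in the window
```

Because only the pixels inside one small window are visited per point of interest, the memory and computation stay small, which is the property the embodiment emphasizes.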
Further, by shifting and integrating the left and right images using the parallax estimation results described in the third and fourth embodiments, an image at a predetermined intermediate viewpoint between the viewpoints corresponding to the left and right images can be generated. Here, parallax estimation and intermediate viewpoint image generation may be performed at different locations.
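Shifting and integrating the left and right images can be sketched as follows. This is a simplified illustration under stated assumptions, not the generation method of the embodiments: parallax is horizontal only, a left-image pixel at column x corresponds to the right-image pixel at column x - VL, a right-image pixel at column x corresponds to the left-image pixel at column x + VR, and the viewpoint parameter `t` runs from 0 (left view) to 1 (right view). The names `warp` and `intermediate_view` are hypothetical.

```python
import numpy as np

def warp(img, disp, scale):
    """Forward-shift each pixel horizontally by scale * disp (rounded).

    Simplification: when several source pixels land on the same target,
    the last one written wins; no occlusion ordering is applied.
    """
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xt = int(round(x + scale * disp[y, x]))
            if 0 <= xt < w:
                out[y, xt] = img[y, x]
                filled[y, xt] = True
    return out, filled

def intermediate_view(left, right, vl, vr, t):
    """Integrate the two shifted images into the view at position t in [0, 1]."""
    from_l, ok_l = warp(left, vl, -t)        # left pixels move toward the right view
    from_r, ok_r = warp(right, vr, 1.0 - t)  # right pixels move toward the left view
    both = ok_l & ok_r
    out = np.where(both, (1.0 - t) * from_l + t * from_r,
                   np.where(ok_l, from_l, from_r))
    return out, ok_l | ok_r                  # second value marks pixels that were filled
```

Pixels that neither shifted image reaches remain unfilled in the returned mask and would need separate handling.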
The following describes transmission and reception methods for the case where parallax estimation and intermediate viewpoint image generation are performed at different locations.
(Fifth Embodiment) FIG. 25 shows an example of the transmission block of a system that performs parallax estimation (or motion estimation) on the transmitting side in the fifth embodiment of the present invention.
In FIG. 25, 170 is parallax estimating means for estimating the parallax VL based on the left image, 171 is parallax estimating means for estimating the parallax VR based on the right image, 172a to 172d are encoders, 173a and 173b are decoders, 174 is prediction means for predicting the right image R from the left image L and the parallax VL based on the left image, 175 is prediction means for predicting the parallax VR based on the right image from the parallax VL based on the left image, and 176a and 176b are hole-filling means for determining the parallax in regions where the parallax is not correctly estimated. The operation of the above configuration is described below.
First, the left image L is encoded by the encoder 172a. The parallax estimating means 170 and 171 estimate the parallaxes VL and VR based on the left and right images, respectively. For regions where the parallax is not correctly estimated because of occlusion or the like, the hole-filling means 176a and 176b determine the parallax using the parallax estimation method described in the third or fourth embodiment.
Next, the hole-filled parallax based on the left image is encoded by the encoder 172b. The encoded hole-filled parallax based on the left image is decoded by the decoder 173a and is used for the prediction of the right image R by the predictor 174 and for the prediction of the hole-filled parallax based on the right image by the predictor 175. The predictor 175 predicts the parallax VR based on the right image using the parallax based on the left image, as in (Equation 31).
[0136] (Equation 31)
The residual between the right image R and the image predicted by the predictor 174 is calculated and encoded by the encoder 172d.
The residual between the hole-filled parallax VR based on the right image and the parallax predicted by the predictor 175 is calculated and encoded by the encoder 172c.
FIG. 26 shows an example of the reception block of the system that performs parallax estimation on the transmitting side. In FIG. 26, 181a to 181d are decoders, 174 is the predictor for the right image R, and 175 is the predictor for the parallax based on the right image. The encoded left image L, the parallax VL based on the left image, the prediction error of the parallax VR based on the right image, and the prediction error of the right image R are decoded by the decoders 181a to 181d, respectively. The right image R is restored by adding the decoded prediction error of the right image to the prediction result of the predictor 174. The parallax VR based on the right image is restored by adding the decoded prediction error to the prediction result of the predictor 175.
Once the left image L, the right image R, the parallax VL based on the left image, and the parallax VR based on the right image have been restored, an image at an intermediate viewpoint between the left and right images can be generated, for example by the method shown in No. 7-109821, and displayed as a multi-viewpoint image together with the left and right images.
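The prediction-plus-residual handling of the right image in FIGS. 25 and 26 can be sketched as follows. This is an illustration under assumptions rather than the embodiment itself: parallax is integer-valued and horizontal, a left-image pixel at column x corresponds to the right-image column x - VL, and the encoders and decoders are treated as lossless so that only the prediction structure is shown. The prediction of VR from VL by (Equation 31) is not reproduced in this text and is therefore left out. All function names are hypothetical.

```python
import numpy as np

def predict_right(left, vl):
    """Parallax-compensated prediction of the right image (role of predictor 174)."""
    h, w = left.shape
    pred = np.zeros((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            xt = x - int(vl[y, x])      # assumed sign convention
            if 0 <= xt < w:
                pred[y, xt] = left[y, x]
    return pred

def transmit(left, right, vl):
    """Transmitting side: send L, VL and the prediction error of R."""
    residual = right - predict_right(left, vl)
    return left, vl, residual

def receive(left, vl, residual):
    """Receiving side: the same predictor plus the decoded prediction error restores R."""
    return predict_right(left, vl) + residual
```

Because both sides run the same predictor on the same decoded inputs, only the prediction error has to be transmitted for the right image; where the prediction is good, that error is mostly zero.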
As described above, with the above configuration, performing the parallax estimation and hole-filling processing on the transmitting side reduces the amount of computation on the receiving side, so the scale of the receiving apparatus can be reduced.
When transmitting multi-viewpoint images, generating intermediate viewpoint images on the transmitting side makes it possible to transmit the images with a reduced amount of transmitted data. Such an example is described below.
(Sixth Embodiment) FIG. 27 is a configuration diagram of the transmitting side of a multi-viewpoint image compression transmission system according to the sixth embodiment of the present invention. In FIG. 27, 101a to 101d are cameras that capture images at the respective viewpoint positions, 102 is an image compression encoding unit that compresses and encodes the images of camera 1 and camera 4, 103a is a decoded image decompression unit that decodes and decompresses the image data compressed and encoded by the image compression encoding unit 102, 104a is an intermediate viewpoint image generation unit that predicts and generates the images at the viewpoints of camera 2 and camera 3 from the images of camera 1 and camera 4 decoded and decompressed by the decoded image decompression unit 103a, and 105 is a residual compression encoding unit that compresses and encodes the residuals between the images of camera 2 and camera 3 and the images generated by the intermediate viewpoint image generation unit 104a.
Next, the operation of the above configuration is described. The image compression encoding unit 102 compresses and encodes a plurality of images of the multi-viewpoint image (in the present embodiment, the images of camera 1 and camera 4 among the four viewpoint images) by an existing technique that uses, for example, block correlation between images.
FIG. 31 shows an example of the configuration of the image compression encoding unit 102. In FIG. 31, 107a and 107b are DCT means that perform the DCT calculation for each block of 8 × 8 pixels or 16 × 16 pixels and calculate the DCT coefficients, 108a and 108b are quantization means that quantize the DCT coefficients, 109a is inverse quantization means, 110a is inverse DCT means that performs the inverse DCT calculation, 111 is parallax detection means, 112a is parallax compensation means, and 113a is encoding means that encodes the quantized DCT coefficients and the parallax. The operation of this configuration is as follows.
The DCT means 107a processes the image of camera 1 block by block and calculates the DCT coefficients for each block. The quantization means 108a quantizes the DCT coefficients. The inverse quantization means 109a inversely quantizes the quantized DCT coefficients. The inverse DCT means 110a applies the inverse transform to the inversely quantized DCT coefficients and restores the image of camera 1 as it will be obtained on the receiving side. The parallax detection means 111 performs block matching between the restored image of camera 1 and the image of camera 4 and calculates the parallax based on the image of camera 1 for each block. The parallax compensation means 112a predicts the image of camera 4 using the restored image of camera 1 and the parallax of each block (that is, it performs processing corresponding to motion compensation). The DCT means 107b processes the residual between the image of camera 4 and the predicted image block by block and calculates the DCT coefficients. The quantization means 108b quantizes the DCT coefficients of the residual. The encoding means 113a encodes the quantized DCT coefficients of the image of camera 1, the parallax of each block, and the quantized DCT coefficients of the parallax-compensation residual.
Further, the decoded image decompression unit 103a decodes and decompresses the image data compressed and encoded by the image compression encoding unit 102. FIG. 32 shows an example of the configuration of the decoded image decompression unit 103a. In FIG. 32, 114a is decoding means, 109b and 109c are inverse quantization means, 110b and 110c are inverse DCT means, and 112b is parallax compensation means.
The operation of this configuration is as follows. The decoding means 114a decodes the compressed and encoded data and expands it into the quantized DCT coefficients of the image of camera 1, the parallax of each block, and the quantized DCT coefficients of the parallax-compensation residual. The quantized DCT coefficients of the image of camera 1 are inversely quantized by the inverse quantization means 109b and decompressed as an image by the inverse DCT means 110b. The parallax compensation means 112b generates a predicted image of camera 4 from the decompressed image of camera 1 and the decoded parallax. The residual decompressed by the inverse quantization means 109c and the inverse DCT means 110c is then added to the predicted image, which decompresses the image of camera 4.
[0146] The intermediate viewpoint image generation unit 104a calculates the parallax of each pixel from the images of camera 1 and camera 4 by the method shown in either the third or the fourth embodiment of the present invention, and predicts and generates the images of camera 2 and camera 3.
[0147] The residual compression encoding unit 105 compresses and encodes the residuals between the images of camera 2 and camera 3 and the respective predicted images.
Because the intermediate viewpoint image generation unit 104a calculates the parallax for each pixel, it can estimate parallax more accurately than the block-by-block parallax calculation by block matching. As a result, the prediction error (that is, the residual) of the intermediate viewpoint images is reduced, compression efficiency is increased, bits can be allocated more effectively, and image quality can be maintained.
FIG. 33 shows an example of the configuration of the residual compression encoding unit. In FIG. 33, 107c and 107d are DCT means, 108c and 108d are quantization means, and 113b is encoding means. The residuals of the images of camera 2 and camera 3 are converted into DCT coefficients by the DCT means 107c and 107d, respectively, quantized by the quantization means 108c and 108d, and encoded by the encoding means 113b.
FIG. 34 is a configuration diagram of the receiving side of the multi-viewpoint image compression transmission system according to the sixth embodiment of the present invention.
In FIG. 34, 103b is a decoded image decompression unit that decodes and decompresses the image data of camera 1 and camera 4 compressed and encoded by the image compression encoding unit 102 on the transmitting side, 104b is an intermediate viewpoint image generation unit that predicts and generates the images at the viewpoints of camera 2 and camera 3 from the images of camera 1 and camera 4 decoded and decompressed by the decoded image decompression unit 103b, and 106 is a decoding residual decompression unit that decodes and decompresses the prediction errors (residuals) of the predicted images at the viewpoints of camera 2 and camera 3.
The operations of the decoded image decompression unit 103b and the intermediate viewpoint image generation unit 104b are the same as those of the decoded image decompression unit 103a and the intermediate viewpoint image generation unit 104a on the transmitting side, so their description is omitted; the operation of the decoding residual decompression unit is described below.
The decoding residual decompression unit 106 decodes and decompresses the prediction errors (residuals) of the predicted images at the viewpoints of camera 2 and camera 3 that were compressed and encoded by the residual compression encoding unit 105 on the transmitting side. FIG. 35 shows an example of the configuration of the decoding residual decompression unit 106. In FIG. 35, 114b is decoding means, 109d and 109e are inverse quantization means, and 110d and 110e are inverse DCT means. The compressed and encoded residual data of the images of camera 2 and camera 3 are decoded by the decoding means 114b, inversely quantized by the inverse quantization means 109d and 109e, respectively, and decompressed by the inverse DCT means 110d and 110e. The decoded and decompressed residuals of the images of camera 2 and camera 3 are superimposed on the respective images generated by the intermediate viewpoint image generation unit 104b, which restores the images at the viewpoints of camera 2 and camera 3.
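The residual path for one 8 × 8 block of an intermediate viewpoint image (DCT and quantization as in FIG. 33, inverse quantization, inverse DCT, and superposition as in FIG. 35) can be sketched as follows. The sketch rests on assumptions that the description does not fix: an orthonormal DCT-II built directly with NumPy, a uniform quantizer with a hypothetical step size `q`, and no entropy coding (the roles of the encoding means 113b and decoding means 114b are omitted). The predicted block stands in for the output of the intermediate viewpoint image generation units 104a and 104b, which must be identical on both sides.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that C @ block @ C.T is the 2-D DCT."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def encode_residual(actual, predicted, q=1.0):
    """Transmitting side: DCT and uniform quantization of (actual - predicted)."""
    c = dct_matrix(actual.shape[0])
    coefs = c @ (actual - predicted) @ c.T
    return np.round(coefs / q).astype(int)

def decode_residual(predicted, levels, q=1.0):
    """Receiving side: inverse quantization, inverse DCT, then superposition."""
    c = dct_matrix(predicted.shape[0])
    residual = c.T @ (levels * q) @ c
    return predicted + residual
```

With this quantizer each coefficient is off by at most q/2, so the restored block stays close to the actual block, and the better the prediction, the fewer non-zero levels have to be sent.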
As described above, according to the present embodiment, the transmitting side generates an intermediate viewpoint image from two non-adjacent images of the multi-viewpoint image, obtains the residual between the generated intermediate viewpoint image and the actual image at that intermediate viewpoint, and compresses, encodes, and transmits the two images and the residual of the intermediate viewpoint image. The receiving side decodes and decompresses the two transmitted images and the residual of the intermediate viewpoint image, generates the intermediate viewpoint image from the two images, and superimposes the decoded and decompressed residual of the intermediate viewpoint image to restore the image corresponding to the actual image at the intermediate viewpoint.
In this way, multi-viewpoint images can be compressed and transmitted efficiently while maintaining image quality.
It should be noted that the generation of intermediate viewpoint images need not be limited to a configuration that generates the intermediate-viewpoint images from the images at the two viewpoints at both ends of the multi-viewpoint image (the viewpoints of camera 1 and camera 4). For example, the image at the viewpoint of camera 3 may be generated from the images of camera 2 and camera 4, or the images at the viewpoints of camera 2 and camera 4 may be generated from the images of camera 1 and camera 3. Furthermore, the images at the viewpoints of camera 1 and camera 4 may be generated from the images of camera 2 and camera 3. Each of these is included in the present invention.
The number of viewpoints of the multi-viewpoint image need not be limited to four, and it is clear that intermediate viewpoint images between the respective viewpoints may be generated; this is also included in the present invention.
Further, in the third and fourth embodiments of the present invention, the reliability evaluation value of the initial parallax estimate need not be limited to the one shown in (Equation 23). Even if only the numerator of (Equation 23) is used as the reliability evaluation value, almost the same effect can be obtained, although the value is then affected by the brightness distribution; this is included in the present invention. When the noise level of the image is low, it is natural that the same effect is obtained even if a value calculated by ignoring the noise term is used as the reliability evaluation value, and this is included in the present invention.
As a further simplification, the minimum value of the residual sum of squares per pixel, or the minimum value of the residual sum of squares, may be used as the reliability evaluation value; this allows calculation with a simpler circuit and is included in the present invention.
Further, the difference between the estimated parallaxes shown in (Equation 32) may be used as the reliability evaluation value of the initial parallax estimate, and this is included in the present invention.
[0157] (Equation 32)
Combining two or more of the above as the reliability evaluation value of the initial parallax estimation makes a more stable reliability evaluation possible, and this is included in the present invention.
Further, in the third and fourth embodiments of the present invention, the correlation calculation between images for the initial parallax estimation need not be limited to the residual sum of squares (SSD). The same effect can be obtained by using the sum of absolute differences (SAD).
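The interchangeability of the two correlation measures in block matching can be illustrated with a short sketch. The search below is a generic left-referenced block match written for illustration; the block size, the search range `max_d`, and the convention that a left-image block at column x is compared with right-image blocks at column x - d are assumptions, not values taken from the embodiments.

```python
import numpy as np

def ssd(a, b):
    """Residual sum of squares between two blocks."""
    return float(np.sum((a - b) ** 2))

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return float(np.sum(np.abs(a - b)))

def block_match(left, right, y, x, size=4, max_d=8, cost=ssd):
    """Return (parallax, cost) minimizing the chosen measure for the block at (y, x)."""
    # Convert to float so that differences of unsigned pixel types cannot wrap around.
    ref = left[y:y + size, x:x + size].astype(float)
    best_d, best_c = 0, float("inf")
    for d in range(max_d + 1):
        if x - d < 0:
            break
        cand = right[y:y + size, x - d:x - d + size].astype(float)
        c = cost(ref, cand)
        if c < best_c:
            best_d, best_c = d, c
    return best_d, best_c
```

Passing `cost=sad` instead of the default `cost=ssd` changes only the measure; the search itself is untouched, and SAD avoids the multiplications that SSD requires.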
Such an embodiment is of course included in the present invention.
In the sixth embodiment of the present invention, the method of compressing and encoding the images at the two non-adjacent viewpoints need not be limited to one that uses the correlation between images (between viewpoints); other methods may be used, and these are also included in the present invention.
[0161] As described above, according to the present invention, information on the size of the camera's imaging surface (CCD), the distance between the imaging surface and the lens center, and the focal length of the lens is added and transmitted. When a display matched to the viewing angle at the time of shooting is attempted, the display side can therefore calculate the viewing angle at the time of shooting with high accuracy even for images shot close to the subject, and the observation distance that reproduces the same viewing angle as at the time of shooting can be determined accurately.
When transmitting a multi-viewpoint image, adding information on the nearest point and the farthest point in the image makes possible a display (parallax control) that avoids eyestrain.
Also, by performing parallax control according to the fusion range of the human eye, the parallax can be brought within the fusion range over a larger part of the image when it is displayed.
Further, on the transmitting side, values different from the nearest point and the farthest point in the actual image are set as the information on the nearest point and the farthest point to be added, and on the display side parallax control is performed so that the average of the parallaxes corresponding to the nearest point and the farthest point of the set values falls at the center of the fusion range. In this way, an image at the depth intended by the image creator can be presented to the observer.
According to the present invention, in regions where the reliability of the parallax estimate by block matching is low, the object contour is detected, and parallax can be estimated so that it changes discontinuously at the detected object contour.
Also, parallax can be estimated so that it changes discontinuously at an object contour line of arbitrary shape.
Also, by performing the parallax hole-filling process (the parallax estimation process of the present invention in which the parallax changes at the object contour and connects smoothly with the surrounding parallax) on the transmitting side, the amount of computation on the receiving side can be reduced, and the scale of the receiving apparatus can be reduced.
Further, by generating the intermediate viewpoint images on both the transmitting side and the receiving side of the multi-viewpoint image transmission system, the amount transmitted for the intermediate viewpoint images (the amount of residual data) can be reduced, and as a result multi-viewpoint images can be compressed and transmitted efficiently.
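The display-side calculation enabled by the transmitted imaging-surface size and lens-center-to-imaging-surface distance can be sketched as follows. The sketch assumes an ideal pinhole geometry for the viewing angle and an ideal thin lens for the relation between focal length, subject distance, and imaging-surface distance; these idealizations and all function and parameter names are assumptions made for illustration and are not taken from the equations of the first embodiment.

```python
import math

def viewing_angle(sensor_width, lens_to_sensor):
    """Horizontal viewing angle in radians: 2 * atan(w / (2 * d))."""
    return 2.0 * math.atan(sensor_width / (2.0 * lens_to_sensor))

def observation_distance(display_width, sensor_width, lens_to_sensor):
    """Distance at which a display of the given width subtends the shooting angle."""
    half = viewing_angle(sensor_width, lens_to_sensor) / 2.0
    return display_width / (2.0 * math.tan(half))

def image_distance(focal_length, subject_distance):
    """Thin-lens imaging-surface distance when focused at subject_distance.

    From 1/f = 1/a + 1/d; d approaches f only for distant subjects, so using
    f in place of d overestimates the viewing angle for close-up shots.
    """
    return focal_length * subject_distance / (subject_distance - focal_length)
```

For example, with a 36 mm wide imaging surface and a 50 mm lens focused at 500 mm, `image_distance(50.0, 500.0)` is larger than 50 mm, so `viewing_angle(36.0, image_distance(50.0, 500.0))` is smaller than the angle computed from the focal length alone.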

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram showing the positional relationship among the nearest point, the farthest point, and the point where the observer's convergence and accommodation coincide, in the case of parallel projection in the first embodiment of the present invention.
FIG. 2 is a diagram showing the relationship among the position of the subject, the position of the imaging surface when in focus, and the focal length in the same embodiment.
FIG. 3 is a diagram showing the positional relationship among the convergence distance, the nearest point, and the farthest point when convergent projection is performed using two projectors in the same embodiment.
FIG. 4 is a diagram showing the parameters defined in the image transmission method according to the first embodiment of the present invention.
FIG. 5 is a block diagram of the processing that shifts images so as to cancel the average value of the parallax between images.
FIG. 6 is a diagram showing the case where parallax is calculated by block matching with the left image as reference.
FIG. 7 is a diagram showing the case of parallel shooting.
FIG. 8 is a diagram showing the case of convergent shooting.
FIGS. 9(a) to 9(c) are diagrams showing examples of the distribution of the weights used in calculating the weighted average by (Equation 14).
FIG. 10 is a diagram showing the operation of the image decoding means.
FIG. 11 is a block diagram of the parallax control system according to the second embodiment of the present invention.
FIG. 12 is a diagram showing an example of the configuration of the shift operation means.
FIG. 13 is a characteristic diagram of the fusion range table.
FIG. 14 is a diagram showing the graphical meaning of (Equation 16).
FIG. 15 is a frequency distribution diagram of parallax.
FIG. 16 is a configuration diagram of the parallax estimating device according to the third embodiment of the present invention.
FIG. 17 is a diagram showing block matching in the same embodiment.
FIG. 18 is a configuration diagram of the contour detection unit in the same embodiment.
FIG. 19 is a configuration diagram showing an example of the configuration of the edge detection unit in the same embodiment.
FIGS. 20(a) to 20(f) are diagrams showing examples of the weight coefficients of the direction-specific filters in the same embodiment.
FIG. 21 is a configuration diagram of the ridge line detection unit in the same embodiment.
FIG. 22 is a configuration diagram of the parallax estimating unit near the object contour in the same embodiment.
FIG. 23 is a configuration diagram of the parallax estimating device according to the fourth embodiment of the present invention.
FIG. 24 is a diagram showing parallax estimation near the object contour line in the same embodiment.
FIG. 25 is a configuration diagram of the transmission unit of a system that performs parallax estimation on the transmitting side in the fifth embodiment of the present invention.
FIG. 26 is a configuration diagram of the reception unit of a system that performs parallax estimation on the transmitting side in the fifth embodiment of the present invention.
FIG. 27 is a configuration diagram of the transmission unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
FIG. 28 is a schematic diagram of the MPEG-2 syntax.
FIG. 29 is a diagram showing the relationship of the transmitted multi-viewpoint images in the spatio-temporal directions.
FIG. 30 is a diagram showing the definition of camera parameters in OpenGL.
FIG. 31 is a diagram showing an example of the configuration of the image compression encoding unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
FIG. 32 is a diagram showing an example of the configuration of the decoded image decompression unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
FIG. 33 is a diagram showing an example of the configuration of the residual compression encoding unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
FIG. 34 is a configuration diagram of the reception unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
FIG. 35 is a diagram showing an example of the configuration of the decoding residual decompression unit of the multi-viewpoint image transmission system according to the sixth embodiment of the present invention.
[Explanation of reference numerals]
A: nearest point of the displayed image
B: farthest point
C: point where the observer's convergence and accommodation coincide
A1, A2: lens centers of the cameras
B1, B2: centers of the image planes
C1: convergence point
201: initial parallax estimating unit
202: reliability evaluation unit for the initial parallax estimation
203: contour detection unit
204: parallax estimating unit near the object contour

Continuation of front page
(72) Inventor: Atsushi Morimura, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka
F-term (reference): 5C061 AA21 AB04 AB08 AB12

Claims

1. A multi-viewpoint image transmission method, wherein, when transmitting images from two or more viewpoints, information on the imaging size of the camera that captures the images and on the distance between the lens center and the imaging surface is transmitted, so that information on the viewing angle at the time of image capture can be obtained on the display side.