JP4584405B2 - 3D object detection apparatus, 3D object detection method, and recording medium

Info

Publication number: JP4584405B2
Application number: JP2000134407A
Authority: JP
Inventors: 收文中山; 守人塩原
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2000-05-08
Filing date: 2000-05-08
Publication date: 2010-11-24
Anticipated expiration: 2020-05-08
Also published as: JP2001319224A

Description

【０００１】
【発明の属する技術分野】
本発明は、侵入者検出、車両（自動車等）位置・速度検出、移動ロボット用ナビゲーションなど、距離情報に基づく３次元物体の位置・速度の検出を行う装置（或いはシステム）に利用され、特に、何らかの距離計測方法によって３次元位置情報（距離情報）が与えられた時に、３次元空間の広範囲に存在し得る３次元物体の位置、形状、移動速度を検出する３次元物体検出装置と３次元物体検出方法及び記録媒体に関する。
【０００２】
【従来の技術】
以下、従来例について説明する。
【０００３】
§１：従来例１
従来より、距離情報を利用して物体の位置を検出する方法が幾つか提案されている。それらを大別すると、（Ａ１）：距離情報のみを利用してその距離分布の集合を検出することで物体を得る方法と、（Ａ２）：距離情報と画像情報の両者を必ず用いる方法とがある。以下、これらの方法について説明する。
【０００４】
(1) ：（Ａ１）の説明
前記（Ａ１）の代表として、実吉ら「ステレオ画像を用いた運転支援のための前方状況認識システム」、電子情報通信学会、技術研究報告、ＰＲＭＵ９７−２５〜３６、ｐｐ．３９−４６の方法を説明する。
【０００５】
この方法では、先ず、車両の前方の路上情景を２つのカメラで観測して、２つの画像より両眼立体視により距離情報（距離画像）を取得する。得られた距離画像を、次に定めるように短冊状の領域群に分割する。領域は、予め定めた小さな幅で、高さは事前に取得したカメラの配置情報と路面位置情報の関係から、路面の上部だけを含むような高さを持つように設定する。
【０００６】
そして、それぞの領域毎に、含まれる距離情報を用いて奥行きを横軸にとり頻度を縦軸にとる奥行きのヒストグラムを作成し、ヒストグラムのピーク位置より短冊領域を代表する距離を定める。次に、隣接する領域群で距離の近いものをまとめて領域グループを作成する。
【０００７】
そして、得られたグループの３次元空間での傾きを調べて、観測方向に対して垂直に近ければ車両の後方面とし、観測方向に平行に近ければ車両側面と分類する。更に、この前面と側面の位置関係による組み合わせにより車両位置、形状を検出する。
【０００８】
(2) ：（Ａ２）の説明
前記（Ａ２）の距離情報と画像情報を必ず用いて物体を検出する方法を、下村倫子ら「ステレオ視差と先行車の高さ変化を用いた車間距離計測のばらつき低減に関する考察」、電子情報通信学会技術研究報告、ＰＲＭＵ９８−９２〜１０５、ｐｐ．２１−２８を例に説明する。
【０００９】
この方法では、先ず、車両の前方の路上情景を左右２つのカメラで観測して２つの画像を得る。一方の画像より白線を検出して自車両の走行領域を得て、その中に含まれる明確で長い水平線で最も下方に位置するものを車両下端として得る。そして、下端線の上部に下端線の幅を持つテンプレートと呼ぶ一定の大きさを持つ領域を定めて、他方画像を探索してテンプレート領域と輝度パターンが最も類似している領域を得る。これにより両眼立体視により車両（物体）までの距離が得られる。
【００１０】
更に、車両は上端、下端部に安定した水平線があると仮定して、上記テンプレート位置を参考に設けた別の領域内について、エッジ点の水平方向への投影ヒストグラムを作成して、そのピーク位置より車両の上端、下端を定める。この上端と下端の幅より画像での車両の高さ（ｈ）を定める。これにより、車両の位置、車両形状（大きさ）を検出する。
【００１１】
§２：従来例２
従来例２として、移動速度計測について説明する。物体の位置を検出しながら移動速度を算出する方法は、大きく次の（Ｂ１）：２時刻間の物体位置の差を移動速度とする方法、（Ｂ２）：過去の幾つかの物体位置を用いて現在の移動速度を算出する方法、に大別される。
【００１２】
(1) ：（Ｂ１）の説明
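In method (B1), the object position is detected at each time step, and the simple difference between the object position at the current time and the object position at the previous time is taken as the moving speed.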
【００１３】
(2) ：（Ｂ２）の説明
前記（Ｂ２）の代表的な方法として、今川和幸、呂山、猪木誠二、松尾英明、「顔によるオクルージョンを考慮した手話動画像からの実時間掌追跡」、電子情報通信学会技術研究報告、ＰＲＭＵ９７−１０４〜１１０、ｐｐ．１５−２２を例に説明する。
【００１４】
この例は、対象として２次元的な手領域を追跡する方法であるが、対象を追跡するという観点からは３次元であっても同様なので、本方法を例に説明する。先ず、画像より色情報を用いて手領域を抽出し、その領域の重心点位置を領域位置ｐとする。
【００１５】
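Then, assuming uniform (constant-velocity) motion as the motion model of the region, a Kalman filter is constructed with the region position p as the observation vector and with the region position and region velocity as the parameter vector x to be estimated. The filter estimates x, and the region position and region velocity given by x are taken as the required quantities.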
【００１６】
このように、カルマンフィルタを用いると、各時刻で観測される領域位置に誤差を含んでいても、線形推定の意味で最も安定した位置と速度が推定できるため、速度のばらつきを抑えて安定した速度計測が可能となる。
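The constant-velocity Kalman filter described above can be sketched as follows. This is a minimal textbook form for one coordinate, not the formulation of the cited paper: the class name, the noise parameters `q` and `r`, the initial variance `var0`, and the simplified diagonal process noise are all assumptions made for illustration.

```python
class ConstantVelocityKalman1D:
    """Textbook 1-D Kalman filter: state (position p, velocity v), position observed."""

    def __init__(self, p0=0.0, v0=0.0, var0=100.0, q=1e-4, r=1.0):
        self.p, self.v = p0, v0
        # State covariance [[a, b], [c, d]] kept as four floats (assumed initial values).
        self.a, self.b, self.c, self.d = var0, 0.0, 0.0, var0
        self.q, self.r = q, r  # assumed process / observation noise variances

    def predict(self, dt=1.0):
        # Constant-velocity model: p <- p + dt * v, v unchanged.
        self.p += dt * self.v
        a, b, c, d = self.a, self.b, self.c, self.d
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (simplification).
        self.a = a + dt * (b + c) + dt * dt * d + self.q
        self.b = b + dt * d
        self.c = c + dt * d
        self.d = d + self.q

    def update(self, z):
        # Only the position is observed: H = [1, 0].
        s = self.a + self.r              # innovation variance
        k0, k1 = self.a / s, self.c / s  # Kalman gain
        y = z - self.p                   # innovation (observed minus predicted position)
        self.p += k0 * y
        self.v += k1 * y
        a, b, c, d = self.a, self.b, self.c, self.d
        # P <- (I - K H) P
        self.a, self.b = (1 - k0) * a, (1 - k0) * b
        self.c, self.d = c - k1 * a, d - k1 * b
```

Calling `predict()` and then `update(z)` once per time step yields a smoothed position `p` and velocity `v`; the velocity estimate needs several steps to settle, which is the start-up delay discussed later in the text.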
【００１７】
【発明が解決しようとする課題】
前記のような従来のものにおいては、次のような課題があった。
【００１８】
(1) ：前記（Ａ１）の場合、一般的に３次元情報は、遠方になる程その奥行き、及び空間情報の解像度が低くなり、距離情報や位置情報のもつ信頼性が低下する。つまり、複数物体が存在するときに近傍では距離の違いで分離できる場合も、物体間の距離差は同じであっても、遠方にあるときは、距離の違いが検出できず、物体を分離するのが困難になることを意味する。従って、ある程度遠方にある物体については、距離情報だけを用いて、その距離差から物体領域を分離することは困難である。
【００１９】
更に、得られる距離情報の性質として、物体がセンサの近傍で観測される場合には、その物体は大きく観測され、得られる３次元情報の量（数）は多いが、同一物体であっても遠方に位置すると、観測される物体領域そのものも小さくなり得られる３次元情報の数（量）も少なくなるというように、対象までの距離により距離情報が増減するという性質がある。
【００２０】
これにより、例えば、実吉らの方法のように、距離画像を固定サイズの領域（以下「抽出領域」と呼ぶ）に区切って、その領域毎に物体位置を算出する方法であると、例えば、遠方の物体を区別するためには、抽出領域の大きさを小さくする必要があり、この結果として、物体が近傍に存在する場合には、十分な数の距離情報が存在しているにもかかわらず、領域幅が小さいため、領域毎に集められる３次元情報の数が減少し、物体位置の精度が低下する。
【００２１】
逆に、近傍での位置の精度を高めるために抽出領域の大きさを大きくすると、遠方物体を分離できなくなるという問題がある。更に、遠方では得られる距離情報が距離画像上で２次元的にも粗くなり、１つの物体面が、必ずしも連続した３次元位置を確定できる抽出領域として得られるとは限らない、という問題がある。
【００２２】
(2) ：前記（Ａ２）の場合、距離情報と画像情報を用いて物体を検出するのに、物体が遠方にあるときは、物体内部の模様は殆ど観測されず物体輪郭が支配的となる。従って、このような場合は、前記下村の方法のように、エッジ点の投影によって物体輪郭を抽出して物体領域を定めることは有効である。
【００２３】
ところが、物体が観測者の近傍にあるときは、物体輪郭よりも物体内部の模様が支配的となるので、その結果として、エッジ点の投影ヒストグラムでは、それら物体内部の模様エッジが輪郭部分のピークの形成を阻害して、安定に輪郭線を検出するのは困難となる。従って、常に、画像特徴と距離情報を利用すると、画像特徴が障害となって物体位置を検出できないことがある。
【００２４】
(3) ：前記（Ｂ１）の場合、各時刻の物体位置を求めて、現時刻の物体位置と前時刻の物体位置の単純な差で移動速度を求めると、各時刻における物体位置の計測精度の影響を大きく受け、計測速度のばらつきが大きく、計測精度が悪い、という問題がある。
【００２５】
(4) ：前記（Ｂ２）の場合、（Ｂ２）の方法により検出された過去の幾つかの物体位置を用いて現在の移動速度を算出する方法が考えられるが、この方法では、或る物体が存在している間に、その物体の時刻間での対応をいかに安定に求めるかが課題となる。
【００２６】
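Here, consider for example detecting objects in an outdoor scene. Changes in the surrounding lighting conditions, such as variations in sunlight and the presence or absence of shadows, and changes in appearance caused by the object's movement may prevent the object from being detected. In that case the object cannot be associated across time steps, so its moving speed cannot be measured. Moreover, when a filter is used, the effect of the missed detection persists over several time steps, and the speed information during that period is erroneous.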
【００２７】
更に、未検出であった物体を再び検出した時には、新たに運動パラメータの推定を開始する必要があるため、再び物体を検出してから数時刻は、運動パラメータの推定値は誤差を多く含み、誤った速度情報となる。このように、安定した速度検出に対して物体の未検出の影響が大きく現れるという問題がある。
【００２８】
本発明は、このような従来の課題を解決し、物体の距離に影響されることなく、安定した物体の位置や速度の検出を可能にすることを目的とする。
【００２９】
【課題を解決するための手段】
図１は本発明の原理説明図であり、図１中、４は物体検出器、４−１は基本領域検出部、４−２は物体面当てはめ／物体位置決定部、５は物体速度計測器を示す。本発明は前記目的を達成するため、次のように構成した。
【００３０】
(1) ：３次元位置情報（距離情報）が、２次元（ｘ，ｙ）の配列の各場所に奥行き情報を持つ画像形式、または３次元位置（Ｘ，Ｙ，Ｚ）情報の集合の形で与えられた時に、３次元空間に存在し得る３次元物体の位置、形状を検出する３次元物体検出装置において、前記３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する基本領域検出手段（基本領域検出部４−１）と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる物体面当てはめ手段（物体面当てはめ／物体位置決定部４−２）と、前記物体面当てはめ手段による当てはめ結果の良否より物体の有無を判定する物体領域判定手段（物体面当てはめ／物体位置決定部４−２の一部）を備え、前記基本領域検出手段は、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元物体の３次元位置により変化する距離に合わせて設定する機能を備えている。
【００３１】
(2) ：３次元位置情報（距離情報）が、２次元（ｘ，ｙ）の配列の各場所に奥行き情報を持つ画像形式や、３次元位置（Ｘ，Ｙ，Ｚ）情報の集合の形で与えられた時に、３次元空間に存在し得る３次元物体の位置、形状を検出する３次元物体検出装置において、前記３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する基本領域検出手段と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる物体面当てはめ手段と、前記物体面当てはめ手段による当てはめ結果の良否より物体の有無を判定する物体領域判定手段を備え、前記物体面当てはめ手段は、当てはめ対象である基本領域の３次元距離によって、距離情報の他に、対象情景を撮像する濃淡或いはカラー画像から得られる情報を使用するか不使用とするかを切り換える機能を備えている。
【００３２】
(3) ：前記(1) の３次元物体検出装置において、前記物体領域判定手段は、前記物体当てはめ手段による当てはめ結果の良否により物体位置を検出し、検出された物体位置を時系列的に求めていき、その得られた物体位置の時系列的な対応を求めることで、各物体の速度情報を算出する機能を有し、前記物体領域判定手段で物体が検出できず、その結果として時系列での対応が求まらない時、一定回数だけその物体を未対応状態で位置を残しておき、前記物体領域判定手段で再び物体が検出された時には、未対応であった期間での平均移動速度を参考に未対応状態の物体と矛盾なく対応できるかを調べ、もし矛盾なく対応できるならば、先の検出された物体を未対応物体が再び検出されたとして、未対応間の平均移動速度を再追跡の初期速度として与えて再び物体の追跡を行うことで、速度計測を再開できるようにした速度検出手段を備えている。
【００３３】
(4) ：３次元位置情報が、２次元の配列の各場所に奥行き情報を持つ画像形式、または３次元位置情報の集合の形で与えられた時に、３次元空間に存在し得る３次元物体の位置、形状を検出する３次元物体検出方法において、前記３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する基本領域検出処理と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる物体面当てはめ処理と、前記物体面当てはめ手段による当てはめ結果の良否より物体の有無を判定する物体領域判定処理を有し、前記基本領域検出処理では、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元物体の３次元位置により変化する距離に合わせて設定するようにした。
【００３４】
(5) ：３次元位置情報（距離情報）が、２次元（ｘ，ｙ）の配列の各場所に奥行き情報を持つ画像形式、または３次元位置（Ｘ，Ｙ，Ｚ）情報の集合の形で与えられた時に、３次元空間に存在し得る３次元物体の位置、形状を検出する３次元物体検出装置に、３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する第１の手順と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる第２の手順と、前記第２の手順による当てはめ結果の良否より物体の有無を判定する第３の手順と、前記第１の手順を行う場合、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元物体の３次元位置により変化する距離に合わせて設定する第４の手順とを実行させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体。
【００３５】
（作用）
前記構成に基づく本発明の作用を、図１に基づいて説明する。
【００３６】
(a) ：前記(1) の作用
基本領域検出手段は、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出する。
【００３７】
この場合、距離情報（視差情報）に物体面モデルを当てはめてその良否から物体の有無を決定する「物体検出処理」で必要となる、距離（視差）画像から近隣の距離情報をまとめて平面部分を物体面候補として抽出する。この処理において、３次元位置（奥行き）により変化する物体の観測大きさに合わせた適切な領域を「近隣」として近隣領域内の距離情報をまとめることで過不足ない距離情報を用いることを可能にする。
【００３８】
その結果、従来の固定大きさを「近隣」とする方法で生じる、遠方で近隣する他物体を誤って一つの物体としてしまう過併合の問題や、近傍で一つの物体面が多くの小平面に分割されて検出され、本来の物体面に統合するのが困難になる過分割の問題を生じることなく、物体の３次元位置（奥行き）によらず、常に安定して一つの物体面を一つの平面として抽出することができる。
【００３９】
(b) ：前記(2) の作用
物体面当てはめ手段は、当てはめ対象である基本領域の３次元距離によって、距離情報の他に、対象情景を撮像する濃淡或いはカラー画像から得られる情報を使用するか不使用とするかを切り換える。
【００４０】
この場合、物体検出処理に関して、対象物体が遠方にあると物体の観測大きさが小さくなり、得られる距離情報の数が不足して、距離情報への物体面モデルの当てはめではモデル位置を変化させても当てはめ度合いが殆ど変化せず、物体位置の特定が困難となる。
【００４１】
そのため、物体が閾値以上に遠い場合は、３次元距離情報の他に、ビデオカメラ（或いはテレビカメラ）等で撮像された情景の濃淡画像から得られる物体輪郭線情報を用いて画像中での物体位置を補正することで、物体の３次元位置（奥行き）によらず、常に安定して物体位置を得ることができる。
【００４２】
(c) ：前記(3) の作用
前記３次元物体検出装置では、物体領域判定手段は、物体当てはめ手段による当てはめ結果の良否により物体位置を検出し、検出された物体位置を時系列的に求めていき、その得られた物体位置の時系列的な対応を求めることで、各物体の速度情報を算出する。
そして、速度検出手段では、物体領域判定手段で物体が検出できず、その結果として時系列での対応が求まらない時、一定回数だけその物体を未対応状態で位置を残しておき、前記物体領域判定手段で再び物体が検出された時には、未対応であった期間での平均移動速度を参考に未対応状態の物体と矛盾なく対応できるかを調べ、もし矛盾なく対応できるならば、先の検出された物体を未対応物体が再び検出されたとして、未対応間の平均移動速度を再追跡の初期速度として与えて再び物体の追跡を行うことで、速度計測を再開する。
【００４３】
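In this case, regarding the object speed measurement process, when an object region cannot be detected and no correspondence is found, the position information of the object is retained in an unmatched state for a fixed number of time steps. When the object is detected again, the average moving speed over the unmatched period is given as the initial speed and tracking of the object is resumed, which makes it possible to measure the speed stably immediately after re-detection.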
【００４４】
また、再検出物体を新規物体として追跡し始める従来方式では、速度情報を安定に得るには、更に、数フレームの物体追跡が必要で、速度情報の出力までに時間遅れを要していたが、本発明では、物体の検出ができない場合があっても、時間遅れなく、速度情報を得ることができる。
【００４５】
(d) ：前記(4) の作用
３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する基本領域検出処理と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる物体面当てはめ処理と、前記物体面当てはめ手段による当てはめ結果の良否より物体の有無を判定する物体領域判定処理を有し、前記基本領域検出処理では、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出するようにした。
【００４６】
このようにすれば、従来の固定大きさを「近隣」とする方法で生じる、遠方で近隣する他物体を誤って一つの物体としてしまう過併合の問題や、近傍で一つの物体面が多くの小平面に分割されて検出され、本来の物体面に統合するのが困難になる過分割の問題を生じることなく、物体の３次元位置（奥行き）によらず、常に安定して一つの物体面を一つの平面として抽出することができる。
【００４７】
(e) ：前記(5) の作用
前記３次元物体検出装置が、記録媒体のプログラムを読み出して実行することにより、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とする際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出する。
【００４８】
この場合、距離情報（視差情報）に物体面モデルを当てはめてその良否から物体の有無を決定する「物体検出処理」で必要となる、距離（視差）画像から近隣の距離情報をまとめて平面部分を物体面候補として抽出する。この処理において、３次元位置（奥行き）により変化する物体の観測大きさに合わせた適切な領域を「近隣」として近隣領域内の距離情報をまとめることで過不足ない距離情報を用いることを可能にする。
【００４９】
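As a result, the problems of the conventional method that uses a fixed size as the "neighborhood" are avoided: over-merging, in which separate objects that lie close together at a distance are mistakenly treated as one object, and over-segmentation, in which a single nearby object surface is detected as many small planes that are difficult to recombine into the original surface. One object surface can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.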
【００５０】
【発明の実施の形態】
以下、本発明の実施の形態を図面に基づいて詳細に説明する。
【００５１】
§１：３次元物体検出装置の概要
本発明の３次元物体検出装置では、何らかの距離計測方法によって３次元位置情報（距離情報）が与えられた時に、３次元空間の広範囲に存在し得る３次元物体の位置、形状、移動速度を検出するものである。ここで、３次元位置情報とは、空間上の各位置或いは特定の３次元位置（Ｘ，Ｙ，Ｚ）を表す情報の集合で、与えられる形式は、２次元（Ｘ，Ｙ）の配列の各場所に奥行き情報を持つ画像形式であっても良いし、３次元位置（Ｘ，Ｙ，Ｚ）情報の集合の形をとっても良い（Ｘ、Ｙ、Ｚ：３次元空間を表す座標）。
【００５２】
先ず、３次元位置情報から任意の３次元物体の位置、形状を検出する方法について説明する。予め、検出したい３次元物体の概略大きさ、形状を表す３次元形状モデルを用意する。そして、３次元位置情報から、３次元的に位置が近く、しかも平面を構成する部分をまとめて、いくつかの面素（以下、「基本領域」と呼ぶ）を抽出する。
【００５３】
基本領域について、先の３次元モデルを構成する各面を当てはめていき、３次元モデルで観測できる全てについて問題なく当てはまるなら、その位置に３次元物体が存在するとする。ここで、一般に３次元上の物体は、遠方にあるほど観測される大きさは小さく、逆に近くにあるほど大きく観測されることを考えると、物体面を当てはめるべき基本領域の大きさは、対象の物体の位置（距離）によって変わってくるという問題が生じる。
【００５４】
そこで、基本領域を形成する際に、物体の３次元距離に応じて領域形成に利用する３次元位置情報の範囲を変化させて、遠方では小さい領域となり、逆に近傍では大きな領域となるようにする。観測される物体の大きさに合わせることで、できるだけ他の物体を含まず安定した基本領域を得る。
【００５５】
また、３次元情報は、遠方になるほど奥行き情報の解像度が低くなり、距離情報の持つ信頼性が低下する。更に、物体そのものも遠方になると小さくなり得られる３次元情報の数（量）も少なくなる。これにより、物体が遠方にあると先の３次元モデルの当てはめが困難になる。
【００５６】
そこで、３次元物体が遠方にあり、モデルの当てはめが困難になった時には、ビデオカメラ（或いはテレビカメラ）などで得られる情景の濃淡画像（或いはカラー画像）から得られる物体輪郭線を用いて物体位置を正確に検出する。逆に、物体が近傍にある時には、３次元情報が多く得られることと、濃淡画像で物体内部の模様が支配的になり輪郭線と区別するのが困難になるので、物体輪郭情報は利用しない。このように、距離に応じて適応的に画像情報を利用することで、物体の距離に影響することなく、安定して物体位置を検出する。
【００５７】
次に、物体の速度計測の方法を説明する。基本的には、先で得られた物体の３次元位置の時間変化から物体の移動速度を検出するが、検出速度の安定化のために、一般的に良く行われる物体の運動モデルを用いて過去の位置情報も利用して現在の移動速度を定める。
【００５８】
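In methods using such a motion model, the same object must be correctly associated over a certain number of time steps before the speed information stabilizes. However, in outdoor object detection, for example, when an object cannot be detected because of the surrounding lighting conditions or changes in appearance caused by the object's movement, temporal association of the object is impossible and the speed can no longer be measured.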
【００５９】
また、同一物体が再び観測された時には、速度情報が安定して得られるまで、再度、ある程度の時刻間の対応付けを要し、即座に速度情報を確定することは困難である。このように、対応の消失は、安定した速度検出への影響が大きく現れる。
【００６０】
そこで、物体が検出できず対応が求まらない時は、一定の回数だけその物体を未対応状態で位置、大きさの情報を残しておき、再び物体が検出された時には、未対応であった期間での平均移動速度を初期速度として与えて、再び物体の追跡を続行する。これにより、例え物体位置がある期間検出できなかったとしても、物体位置を再検出してから速度を安定させるための一定期間の物体対応が不要なので、速度情報を速やかに安定させることができる。
【００６１】
§２：３次元物体検出／速度検出処理の説明
以下、３次元物体検出／速度検出処理について説明する。
【００６２】
(1) ：本発明では物体面位置を検出するための核となる領域（基本領域）を３次元位置情報から抽出しておき、その基本領域に予め、定めた物体のモデルを当てはめて、当てはまり度合いが良いときに物体が存在するという処理を行う。
【００６３】
この基本領域として、距離画像の３次元位置情報を持つ画素について、画像上での２次元距離と３次元奥行きがそれぞれ近傍しており、しかも平面を構成する３次元位置情報をまとめた領域とするが、この画像上での２次元距離の大きさを、対象の３次元奥行きに応じて可変する。すなわち、物体が近傍にあるときは、対象物体は大きく観測できているので、画像上での近接度合いを測る距離差を大きくとり、広い領域から十分な数の３次元位置情報を抽出して安定した基本領域を得る。
【００６４】
一方、物体が遠方にある時は、対象物体は小さいので、画像上での近接度合いを測る距離差を小さくし、近傍にある他物体と融合した誤った基本領域とならないようにする。このように、物体の位置（距離）に応じて変化する観測物体の大きさに合わせた適切な大きさで特徴量をまとめることで、固定サイズでの基本領域抽出で生じる物体位置の違いによる基本領域の検出精度の問題に影響されることなく、物体面の核となる基本領域を物体の位置によらず安定して求めることができる。
【００６５】
更に、３次元物体が遠方にあると、距離情報の解像度が不足して、モデルの当てはめが困難になるので、物体が遠方にあるときには、３次元位置情報の他にビデオカメラ（或いはテレビカメラ）などで撮像される情景の濃淡画像、或いはカラー画像から得られる物体輪郭線を用いて物体位置を正確に検出する。
【００６６】
逆に、物体が近傍にある時は、３次元情報が多く得られることと、濃淡画像或いはカラー画像で物体内部の模様が主体になり、輪郭線と区別するのが困難になるので、物体輪郭情報は利用しない。このように、必要に応じて画像情報を利用することで、物体の距離に影響することなく、物体位置を検出できる。
【００６７】
(2) ：物体の移動速度検出に関して、本発明では、基本的には、物体の運動モデルを用いて過去の幾つかの位置情報を元に、現在の移動速度を定めることで安定した速度を得るが、物体が検出できず対応が求まらない時は、一定の回数だけその物体を未対応状態で位置、大きさの情報を残しておき、再び物体が検出された時には、未対応であった期間での平均移動速度を参考に未対応状態の物体と矛盾なく対応できるかを調べる。
【００６８】
そして、もし、矛盾なく対応できるならば、先の検出された物体を未対応物体が再び検出されたとして平均移動速度を初期速度として再び物体の追跡を行う。これにより、例え物体位置が或る期間検出できなかったとしても、物体位置を再検出してから速度を安定させるための一定期間物体対応が不要なので、速度情報をいつでも安定させることができる。
【００６９】
§３：具体例による説明
以下、具体例に基づいて、前記処理を詳細に説明する。
【００７０】
(1) ：システム全体の説明
図２のＡにシステム概略図を示す。このシステムは距離計測器１と、視差画像の変換処理を行う視差画像変換器２と、物体の位置、大きさ（形状）を検出する物体検出器４と、物体の移動速度を計測する物体速度計測器５と、カメラ３等を備えている。
【００７１】
前記距離計測器１とカメラ３は既存のものを使用する。そして、距離計測器１としては、対象物体を含む情景の３次元情報を取得できるもの、例えば、２台のビデオカメラ（或いはテレビカメラ）を所定距離だけ離して配置したステレオカメラと呼ばれるものを使用する。
【００７２】
また、カメラ３は対象物体を含む情景を撮像して濃淡画像（モノクロ）、或いはカラー画像が出力できるものを使用するが、前記距離計測器１が２台のカメラで構成されている場合には、その１台をカメラ３として使用しても良い。
【００７３】
このシステムの機能は次の通りである。先ず、距離計測器１により距離情報（３次元位置情報）を取得し、必要があれば、視差画像変換器２において、２次元配列（ｘ，ｙ）に視差値を持つ視差画像の形式に変換し、視差画像を出力する。続いて、物体検出器４により、視差画像と、カメラ３で撮像された濃淡（或いは、カラー）画像を用いて物体位置を検出する。
【００７４】
そして、物体速度計測器５により、各時刻で得られた物体位置の時間変化より物体の移動速度を計測する。このようにして、システム出力として、物体の位置、大きさ（形状）、移動速度情報を出力する。
【００７５】
以下、前記各部の機能や処理内容について説明する。距離計測器１は情景の３次元情報を取得できる手段であれば、その方式については言及しない。出力として、２次元配列（ｘ，ｙ）の各位置において、３次元距離或いは視差を有する画像の形式の他、３次元位置（Ｘ，Ｙ，Ｚ）の情報の集合であっても構わない。
【００７６】
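The disparity image converter 2 is provided only when needed. When the output of the distance measuring device 1 is given as a set of three-dimensional positions (X, Y, Z), the converter uses the position (X, Y), the distance (Z), and virtual camera parameters (f: camera focal length, b: distance between the cameras) to convert the data into a disparity image {disparity value d at position (x, y)} by the equations x = f × X / Z, y = f × Y / Z, d = f × b / Z. When the distance measuring device 1 itself outputs a disparity image, the disparity image converter 2 is unnecessary.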
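The conversion performed by the disparity image converter 2 can be sketched as below, using only the three equations given in the text. The function names and the error handling for non-positive values are assumptions added for illustration.

```python
def to_disparity_point(X, Y, Z, f, b):
    """Map a 3-D point (X, Y, Z) to image position (x, y) and disparity d.

    f: camera focal length, b: distance between the cameras.
    """
    if Z <= 0:
        raise ValueError("Z must be positive (point in front of the camera)")
    x = f * X / Z
    y = f * Y / Z
    d = f * b / Z
    return x, y, d


def disparity_to_depth(d, f, b):
    """Invert d = f * b / Z to recover the depth Z from a disparity value."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return f * b / d
```

Because d is inversely proportional to Z, a large disparity corresponds to a near point and a small disparity to a far point.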
【００７７】
(2) ：物体検出器の説明
図２のＢに物体検出器の構成図を示す。物体検出器４は、視差画像とカメラ３で撮像した情景の濃淡画像（或いはカラー画像）を入力して、複数の物体位置を検出するものであり、基本領域を検出する基本領域検出部４−１と、物体面の当てはめ処理及び物体位置を決定する処理を行う物体面当てはめ／物体位置決定部４−２とを備えている。
【００７８】
この物体検出器４の機能は次の通りである。先ず、基本領域検出部４−１で、入力した視差画像から物体面を当てはめるのに十分な情報を持つ核となる領域（基本領域Ｒ_i）を求める。次に、予め、物体の形状モデルを用意しておき、物体面当てはめ／物体位置決定部４−２で、得られた基本領域Ｒ_iについてモデル物体面を当てはめていき、そこに物体が存在するか否かを決定する。結果として、検出された物体（位置、大きさ情報を有する）のリスト（物体情報配列）ＯＬとして、
【００７９】
【数１】

【００８０】
が得られる。
【００８１】
(3) ：基本領域検出部の説明
図３のＡに基本領域検出部の構成図を示す。基本領域検出部４−１は、視差画像の間引き処理を行う視差画像間引き部１１と、単位領域を検出する単位領域検出部１２と、基本領域を生成する基本領域生成部１３を備えている。
【００８２】
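The processing of the basic region detection unit 4-1 is as follows. First, the disparity image thinning unit 11 aggregates the input disparity image into regions of n × m pixels, which yields parts with locally stable disparity information and, by thinning out the number of pixels, reduces the overall processing load. If processing time is not a concern, the disparity image thinning unit 11 may be omitted. Here, a pixel of the disparity image that has a disparity value is called a "disparity point".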
【００８３】
次に、単位領域検出部１２により、間引かれた視差画像の視差点位置において、３次元上で単位大きさを持つ平面（単位平面モデルＭＳで示される）が、その視差点の位置（距離）で観測された場合の視差画像上での大きさ（領域）を求めて、その領域内にある間引き前の視差点の分布より、安定して平面を構成するもの（「単位領域Ｅ_i」と呼ぶ）を抽出して、単位領域リストＥＬ_iを得る。
【００８４】
そして、基本領域生成部１３において、単位領域群の中から同一平面を構成できるものをまとめて基本領域Ｒ_iを生成し、基本領域リストＲＬ_iを得る。
【００８５】
(4) ：基本領域検出部の詳細な処理の説明
以下、基本領域検出部の詳細な処理を説明する。
【００８６】
(4) −１：視差画像間引き部１１の処理の説明
間引き視差画像と元の視差画像の位置関係を図４に示す。図４において、Ａは元の視差画像、Ｂは間引き視差画像を示す。
【００８７】
視差画像間引き部１１では、元の視差画像の幅をＮ画素、高さをＭ画素とする時、視差画像を横ｎ画素、縦ｍ画素の小領域ｗ_iに区切り（図４のＡ参照）、それぞれの小領域ｗ_i毎に、以下の条件を満足するか否かを調べ、満足する小領域ｗ_iについてはその平均視差を画素値とする（集約する）、幅Ｎ／ｎ、高さＭ／ｍの新たな間引き視差画像（図４のＢ参照）を得る。
【００８８】
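Condition: a small region w_i is regarded as satisfying the condition when either (1) or (2) below holds.
【００８９】
(1): (number of disparity points in small region w_i) > threshold 1, and (variance of the disparities of the disparity points in small region w_i) < threshold 2
(2): (number of disparity points in small region w_i) / (n × m) > threshold 3
The thresholds 1, 2, and 3 are determined in advance.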
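The thinning step and its acceptance condition can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the disparity image is represented as a list of rows in which `None` marks a pixel without a disparity value, incomplete blocks at the image border are ignored, and the function and parameter names are hypothetical.

```python
def thin_disparity(disp, n, m, thr_count, thr_var, thr_ratio):
    """Aggregate each n-wide, m-high block into one pixel of the thinned image.

    A block keeps its mean disparity when condition (1) or (2) holds,
    otherwise the thinned pixel is None.
    """
    rows, cols = len(disp), len(disp[0])
    thinned = []
    for by in range(0, rows - m + 1, m):
        out_row = []
        for bx in range(0, cols - n + 1, n):
            vals = [disp[y][x]
                    for y in range(by, by + m)
                    for x in range(bx, bx + n)
                    if disp[y][x] is not None]
            value = None
            if vals:
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals)
                cond1 = len(vals) > thr_count and var < thr_var  # condition (1)
                cond2 = len(vals) / (n * m) > thr_ratio          # condition (2)
                if cond1 or cond2:
                    value = mean
            out_row.append(value)
        thinned.append(out_row)
    return thinned
```

For an N × M input the result has N/n columns and M/m rows, matching the thinned disparity image described above.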
【００９０】
(4) −２：単位領域検出部１２の処理の説明
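The unit region detection unit 12 first defines, in three-dimensional space, a plane of size MS_x (vertical) by MS_y (horizontal) as the unit plane model MS. Then, for each disparity point (x_i, y_i, d_i) of the thinned disparity image, the observed position (px_i, py_i) and size (nx_i, ny_i) in the original disparity image when the unit plane model MS is placed at that distance (disparity) are obtained by the equations {px_i = x_i × n, py_i = y_i × m, nx_i = MS_x × d_i / b, ny_i = MS_y × d_i / b}.
【００９１】
Here, x_i is the position on the X coordinate, y_i is the position on the Y coordinate, d_i is the disparity value, and b is the distance between the two cameras. A region of size (nx_i, ny_i) centered at the position (px_i, py_i) in the original disparity image is then set, and the following evaluation value is calculated using the disparity points of the original disparity image contained in it. A region whose evaluation value is at or above a threshold is taken as a unit region E_i.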
【００９２】
【数２】

【００９３】
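Here, N_j is the number of disparity points contained in the above region, f is the focal length of the camera, and DIFF is the distance (threshold) used to distinguish points three-dimensionally. As a result of the operation of the unit region detection unit 12, the unit region list EL is obtained as
【００９４】
【数３】
EL = {E_i}, i ∈ {1, 2, ..., N_E}
【００９５】
(N_E is the number of elements of EL).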
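The distance-dependent window used in the unit region detection step can be sketched as below. Only the position and size equations stated in the text are implemented; the evaluation value itself is not reproduced here. The function name is hypothetical, and MS_x, MS_y, and b are assumed to be given in the same length unit.

```python
def unit_plane_window(x_i, y_i, d_i, n, m, ms_x, ms_y, b):
    """Return the centre (px, py) and size (nx, ny) of the search window
    in the original disparity image for one thinned disparity point."""
    px = x_i * n              # thinned column -> original column
    py = y_i * m              # thinned row -> original row
    nx = ms_x * d_i / b       # window size grows with disparity (near objects)
    ny = ms_y * d_i / b
    return (px, py), (nx, ny)
```

With the same unit plane model, a point with disparity 10 gets a window ten times larger in each direction than a point with disparity 1, which is how the grouping range follows the apparent size of the object.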
【００９６】
(4) −３：基本領域生成部１３の処理の説明
図５は基本領域生成部の処理フローチャートである。以下、図５に基づいて、基本領域生成部１３の処理を説明する。なお、図５において、Ｓ１〜Ｓ１２は各処理ステップを示す。
【００９７】
前記基本領域とは、同一平面をなす幾つかの単位領域を集めたもので、その奥行きは基本領域に属す単位領域群の平均距離とし、基本領域の大きさは、属す単位領域群を含む最大矩形領域とする。この基本領域を図５の処理で求める。
【００９８】
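First, the basic region generation unit 13 receives the disparity image and the unit region list EL (S1), looks in the unit region list EL for a unit region E_i that does not yet belong to any basic region, and determines whether such a unit region E_i exists (S2). If no unit region E_i remains that does not belong to a basic region, the process ends.
【００９９】
If there is a unit region E_i that does not belong to a basic region, one such unit region E_i is taken out and a basic region R_k is generated from it (S3). Next, the "merged" flag is cleared (reset) (S4), and another unit region E_j that differs from E_i and does not yet belong to the basic region R_k is acquired (S5). If such a unit region E_j exists (S6), it is checked whether E_j belongs to the basic region R_k (S7).
【０１００】
If E_j belongs to R_k, E_j is merged into R_k (S8) and the "merged" flag is set (S9). The parameter j is then updated (j = j + 1) (S10) and the process returns to S5 to acquire the next unit region E_j; this is repeated. When no unit region E_j remains to be examined (S6), the state of the "merged" flag is checked (S11).
【０１０１】
If the "merged" flag is set, a merge was performed during that pass, so unit regions that could not be merged before may now have become mergeable. In that case the process returns to S4 and the unit region list EL is searched again for mergeable unit regions E_j.
【０１０２】
If the flag is not set, no unit region E_j remains to be processed for R_k, so a new basic region R_k+1 is generated and the same processing is repeated. That is, when the "merged" flag is not set in S11, the parameter i is updated (i = i + 1) (S12) and the process moves to S2.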
【０１０３】
As a result, the basic region list
【０１０４】
【数４】
RL = {R_i}, i ∈ {1, 2, ..., N_R}
【０１０５】
is obtained.
【０１０６】
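The condition for deciding whether a unit region E_j belongs to the basic region R_k is as follows. Let E_k be the unit region that, among the unit regions constituting R_k, is closest to E_j on the image. With L1 the distance on the image from E_j to E_k and L2 the difference in disparity between them, E_j is regarded as belonging to R_k when both are within predetermined thresholds.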
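The merging loop of FIG. 5 together with the membership condition above can be sketched as follows. This is an illustrative reading, with assumptions: each unit region is reduced to a tuple (x, y, d) of image position and disparity, the image distance L1 is taken to be Euclidean, and the function names are hypothetical.

```python
import math


def belongs(region, e, max_l1, max_l2):
    """Membership test: compare e with the member of region nearest on the image."""
    nearest = min(region, key=lambda u: math.hypot(u[0] - e[0], u[1] - e[1]))
    l1 = math.hypot(nearest[0] - e[0], nearest[1] - e[1])  # image distance L1
    l2 = abs(nearest[2] - e[2])                            # disparity difference L2
    return l1 <= max_l1 and l2 <= max_l2


def build_basic_regions(units, max_l1, max_l2):
    """Group unit regions (x, y, d) into basic regions by repeated merging."""
    assigned = [False] * len(units)
    regions = []
    for i in range(len(units)):
        if assigned[i]:
            continue
        region = [units[i]]          # S3: seed a new basic region
        assigned[i] = True
        merged = True
        while merged:                # S11: rescan while the last pass merged something
            merged = False           # S4: clear the flag
            for j in range(len(units)):
                if not assigned[j] and belongs(region, units[j], max_l1, max_l2):
                    region.append(units[j])   # S8: merge
                    assigned[j] = True
                    merged = True             # S9: set the flag
        regions.append(region)
    return regions
```

The rescan matters because a unit region that was too far from the seed on the first pass may become reachable once an intermediate unit region has been merged.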
【０１０７】
(4) −４：前記処理の補足説明
図６は処理説明図（その１）であり、Ａは画像例、Ｂは視差の説明図、Ｃは画像を示す。図７は処理説明図（その２）であり、Ａはメモリ上の視差画像、Ｂは視差画像の一部詳細説明図、Ｃは間引き視差画像を示す。図８は処理説明図（その３）であり、Ａは視差画像の説明図、Ｂは視差画像生成時の説明図である。
【０１０８】
前記視差画像は、物体検出器４に入力すると、一旦、メモリに格納され、図６のＡに示す状態となる。この場合、図６のＡのＸ、Ｙは座標軸を示し、図の斜線部分は視差（距離）である。すなわち、視差画像は、視差（距離）を含んだ情報となっている。
【０１０９】
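For example, as shown in FIG. 6B, let L be the image captured by the left camera (solid circle) and R the image captured by the right camera (dotted circle). Superimposing L and R gives an image with parallax: if the offset between the solid circle and the dotted circle when the two images are superimposed is xR, this offset xR is the disparity (distance). Based on the disparity image with such disparities, blocks of m × n pixels are taken out and the thinning process described above is performed, as shown in FIG. 6C.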
【０１１０】
この場合、図７のＡに示すように、視差画像をメモリに格納する。そして、前記視差画像を横ｎ画素、縦ｍ画素の小領域ｗ_iに区切り（図７のＢ参照）、それぞれの小領域ｗ_i毎に、前記条件を満足するか否かを調べ、満足する小領域ｗ_iについてはその平均視差を画素値とする（集約する）、間引き視差画像（図７のＣ参照）を得る。
【０１１１】
また、元の視差画像上に単位平面（ｍ×ｎ画像）を写し込むと、図８のＡの状態になる。この場合、近い物体は大きくなり、遠い物体は小さくなる。例えば、近い物体の一部の単位平面（ｍ×ｎ画像）が、視差＝１０（例えば、距離＝１ｍ）であり、遠い物体の一部の単位平面（ｍ×ｎ画像）が、視差＝１（例えば、距離＝１０ｍ）のようになる。
【０１１２】
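In this case, if the unit plane model of size MS_x (vertical) by MS_y (horizontal), which serves as the reference unit plane, is set to MS_x × MS_y = (40 cm × 50 cm), the calculation determines how large a 40 cm × 50 cm plate (object) appears when it is 10 m away.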
【０１１３】
このように、近い物体であれば、視差は大きくなり、遠い物体では視差が小さくなる。このように、視差が大きければ距離が小さく、視差が小さければ距離が大きい、という関係になっている。従って、視差は距離情報と同じである。
【０１１４】
更に、前記視差画像は、２つのカメラで構成したステレオカメラを使用して得ることができる。例えば、図８のＢに示すように、第１のカメラＬと第２のカメラＲを所定距離（例えば、距離ｂ）だけ離して設置し、これら２つのカメラで撮像した濃淡画像（白黒）から、エッジ画像（物体の輪郭画像）を得て、Ｌ、Ｒの各エッジ画像を合成して視差画像が生成できる。
【０１１５】
(5) ：物体面当てはめ／物体位置決定部の説明
(5) −１：定義
説明に先立ち、物体モデルについて説明する。物体モデルは、物体の形状、大きさに合わせて複数用意し、その１つの物体モデルをモデルクラスＣ_iとし、それらモデルクラスＣ_iに合わせて、
【０１１６】
【数５】

【０１１７】
但し、Ｎ_cはモデルクラスＣ_iの数を表す。各モデルクラスＣ_iは、１つ或いは複数の幾何形状（ここでは直方体）よりなり、対象に合わせて大きさが調整されている。本例では、直方体を用いるので、直方体を構成する面の内、観測者から見える面は３面あるが、本例では、簡単化のために、直方体の面のうち、垂直面として位置する２面について取り上げて考える。
【０１１８】
但し、これは３面以上となっても、それぞれの方向へ投影面を考えれば話は同様である。これら２面を次のように分類しておく。観測者の視線方向に直交する面に近い方を「前面領域Ｓ_f,k」と呼び、視線方向に並行する面に近い方を「側面領域Ｓ_s,k」と呼ぶ。
【０１１９】
(5) −２：物体面当てはめ／物体位置決定部の構成の説明
物体面当てはめ／物体位置決定部の構成図を図３のＢに示す。物体面当てはめ／物体位置決定部４−２には、前面領域の当てはめ処理を行う前面領域当てはめ部１４と、側面領域の当てはめ処理を行う側面領域当てはめ部１５を備えている。
【０１２０】
物体面当てはめ／物体位置決定部４−２には、視差画像と基本領域リストＲＬ_i、ｉ∈｛１，２，・・・，Ｎ_R｝を入力として、先ず、前面領域当てはめ部１４により、各基本領域Ｒ_iを物体の前面領域と考えたときに、矛盾のない３次元位置分布が得られる物体領域Ｏ_f,iを得る。
【０１２１】
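Next, the side region fitting unit 15 obtains the object region O_s,i for the case in which the basic region R_i is regarded as a side region of an object. By detecting object regions for all combinations in this way, objects can be detected without omission regardless of their position. Details are given below.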
【０１２２】
(5) −３：前面領域当てはめ処理（全体処理）
図９は、前面領域当てはめ部の処理フローチャート（その１）である。以下、図９に基づいて、前面領域当てはめ部の処理を説明する。なお、Ｓ３１〜Ｓ３６は各処理ステップを示す。
【０１２３】
前面領域当てはめ部１４では、基本領域リストに属す基本領域Ｒ_i、ｉ∈｛１，２，・・・，Ｎ_R｝各々について、全てのモデルクラスＣ_k、ｋ∈｛１，２，・・・，Ｎ_C｝に対して、図９に示す処理でモデル当てはめを行い、モデルと合致する物体領域を検出する。
【０１２４】
その内容として、先ず、基本領域Ｒ_iが示す３次元位置にモデルクラスＣ_kを構成する前面領域Ｓ_f,kを置いた時に、その面を支持する視差点があるか調べる。この場合、当てはめ度合いは、当てはめ得点ＳＣ１（ＳＣ：スコア）として与えられている。
【０１２５】
もし、当てはめ度合いが良ければ、その前面領域Ｓ_f,kを元に、位置関係より生成できる側面領域Ｓ_s,kを生成し、側面領域を支持する視差点があるか調べる。この場合、当てはめ度合いは、当てはめ得点ＳＣ２（ＳＣ：スコア）として与えられている。当てはめ度合いが良ければ、モデルクラスＣ_kが与える位置に物体領域Ｏ_f,iが存在するとする。また、この物体領域の評価値ＳＣ＝ＳＣ１＋ＳＣ２とする。もし、いずれかの当てはめ度合いが悪い場合は、物体領域は無いとする。
【０１２６】
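That is, the front region fitting unit 14 receives the disparity image, the basic region R_i, and the model class list CL (S31), and fits the model surface by treating the basic region R_i as the front region S_f,k of the model class C_k. The fitting score SC1_i,k is obtained at this time (S32).
【０１２７】
The front region fitting unit 14 then determines whether the front region exists (S33). If it does not, the process ends with no object region and object score SC = 0. If the front region S_f,k exists, the side region S_s,k is generated for the model class C_k and it is checked whether the side region S_s,k is present. The fitting score SC2_i,k is obtained at this time (S34).
【０１２８】
The front region fitting unit 14 then determines whether the side region exists (S35). If the side region S_s,k does not exist, the process ends with no object region and object score SC = 0. If it exists, the position of the model class C_k is taken as the position of the object region O_i,k, and the fitting score of the object region O_i,k is set to SC = SC1_i,k + SC2_i,k (S36). In this way, the result "object region O_i,k present, fitting score SC = SC1_i,k + SC2_i,k" is obtained.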
【０１２９】
(5) −４：前面領域当てはめ処理（一部詳細処理１）
図１０は、前面領域当てはめ部の処理フローチャート（その２）であり、前面領域当てはめ部の一部詳細処理（図９のＳ３２の詳細な処理）を示したものである。以下、図１０に基づいて、前面領域当てはめ部の一部詳細処理を説明する。なお、Ｓ４１〜Ｓ４８は各処理ステップを示す。
【０１３０】
前面領域当てはめ部１４は、基本領域Ｒ_i、モデルの前面領域Ｓ_f,k、視差画像を入力し（Ｓ４１）、前面領域Ｓ_f,kを基本領域Ｒ_iが示す３次元位置に置いた時の視差画像での投影領域ｈ_fを得る（Ｓ４２）。
【０１３１】
この場合、対象物体が遠方にあると、物体の観測大きさが小さくなり得られる距離情報の数が不足して、距離情報へ物体面モデルの当てはめでは、モデル位置を変化させても当てはめ度合いが殆ど変化せず、物体位置の特定が困難となる。このため、物体が閾値以上に遠い場合は３次元距離情報の他に、ビデオカメラ（或いはテレビカメラ）などで撮像された情景の濃淡画像から得られる物体輪郭線情報を用いて、画像中での物体位置を補正することで、物体の３次元位置（奥行き）によらず、常に安定して物体位置を得る。
【０１３２】
すなわち、前面領域当てはめ部１４は、基本領域Ｒ_iの３次元距離を調べ（Ｓ４３）、その距離が閾値以上に遠い場合は距離情報の信頼性が低下しており距離情報だけで物体位置を定めるのは困難であるため、カメラ３から出力された濃淡画像を取り込み、投影領域ｈ_fを基準に物体輪郭の有無を求める（Ｓ４４）。
【０１３３】
そして、輪郭線があったか否か（輪郭線の有無）を調べ（Ｓ４５）、もし、輪郭線が無ければ、物体はないものとする（前面領域無し）。また、輪郭線があった場合は、前面領域Ｓ_f,kを距離情報が支持するか否かを調べる（Ｓ４６）。また、Ｓ４３の処理で、基本領域Ｒ_iの距離が閾値以下であれば、距離情報の信頼性があるので、輪郭線の検査は行わず、前面領域Ｓ_f,kを距離情報が支持するか否かを調べる（Ｓ４６）。
【０１３４】
そして、前面領域Ｓ_f,kを距離情報が支持するか否かを調べる処理では、投影領域ｈ_f内の視差点を用いて前面領域Ｓ_f,kに当てはめを行い、その絶対当てはめ得点Ｓ_aと、平均当てはめ得点Ｓ_rを計算する（Ｓ４６）。このようにして、Ｓ４６の処理では、前面領域を距離情報が支持するか否かを調べる処理を行うことによって、絶対的当てはめ得点Ｓ_aが得られる。
【０１３５】
そして、平均当てはめ得点Ｓ_rが閾値以下であれば（Ｓ４７）、前面領域Ｓ_f,k無しとするが、前面領域Ｓ_f,kへの距離情報の当てはめの平均当てはめ得点Ｓ_rが閾値より大きければ、安定して物体面を支持する３次元情報があると見なせるので、前面領域Ｓ_f,kが存在するとする。更に、その時の当てはめ得点ＳＣ１として、絶対得点Ｓ_aを用いる（Ｓ４８）。このような処理により、前面領域Ｓ_f,kが存在するか否かの情報が得られる。
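The decision flow of S43 to S48 described above can be sketched as follows. It is a simplification under assumptions: the scores S_r and S_a are passed in already computed, the contour test is supplied as a callable so that it runs only for distant regions, and the treatment of a distance exactly equal to the threshold is a choice made here. All names are hypothetical.

```python
def front_region_decision(distance, far_threshold, contour_check,
                          s_r, s_a, score_threshold):
    """Return (front_region_exists, SC1) for one basic region and model surface."""
    if distance >= far_threshold:       # S43: region is far, distance data unreliable
        if not contour_check():         # S44/S45: look for an object contour in the image
            return False, 0             # no contour -> no front region
    # S46/S47: the distance information must support the model surface.
    if s_r <= score_threshold:
        return False, 0
    return True, s_a                    # S48: SC1 is the absolute score
```

For nearby regions `contour_check` is never called, which reflects the point that interior texture makes contour detection unreliable at short range.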
【０１３６】
(5) −５：前面領域当てはめ処理（一部詳細処理２）
図１１は、輪郭検出処理の説明図であり、図１０のＳ４３、Ｓ４４の処理を詳細に説明したものである。ここでは、先ず、物体輪郭検出領域としての輪郭探索領域Ｈ_fを、先に求めた物体面の投影領域ｈ_fと中心を同じくして、大きさを投影領域ｈ_fと一定倍大きくした領域として定める。
【０１３７】
次に、輪郭探索領域Ｈ_fに含まれる濃淡画像でのエッジについて縦方向に投影を取る。投影像は、横軸を位置（ｘ）とし、縦軸に輝度をとるヒストグラムとなる。このヒストグラムから、閾値以上の輝度を持つピーク位置を検出することで、輪郭線の位置を決定する。
【０１３８】
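If multiple positions are obtained, all combinations are checked for a pair of peaks whose spacing is close to the width of the model region (the projection region h_f is used as the model region). If there is a peak pair with a similar width, the contour of an object supported by the model is regarded as present. If there is no peak pair with a similar width, it is concluded that no contour supports the object.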
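The contour test described above (vertical projection of edges, peak detection, and a search for a peak pair matching the model width) can be sketched as follows. The edge image is assumed to be a list of rows of non-negative edge strengths, and the peak definition and the width tolerance are assumptions; the function names are hypothetical.

```python
from itertools import combinations


def column_projection(edge, x0, x1, y0, y1):
    """Sum edge strength down each column of the search region H_f."""
    return [sum(edge[y][x] for y in range(y0, y1)) for x in range(x0, x1)]


def find_peaks(hist, threshold):
    """Indices of local maxima of the projection that reach the threshold."""
    peaks = []
    for i, v in enumerate(hist):
        left = hist[i - 1] if i > 0 else float("-inf")
        right = hist[i + 1] if i + 1 < len(hist) else float("-inf")
        if v >= threshold and v > left and v >= right:
            peaks.append(i)
    return peaks


def has_contour_pair(peaks, model_width, tolerance):
    """True if some pair of peaks is spaced close to the model region width."""
    return any(abs((b - a) - model_width) <= tolerance
               for a, b in combinations(peaks, 2))
```

Peak indices are relative to the left edge `x0` of the search region, so only their spacing is compared with the model width.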
【０１３９】
(5) −６：前面領域当てはめ処理（一部詳細処理３）
以下、図１０のＳ４６の処理を詳細に説明する。この処理は、前面領域を距離情報が支持するか否かを定める処理であり、以下、その内容を詳細に説明する。
【０１４０】
先ず、モデル領域ｈ_f内の視差点をｐ（ｘ_i，ｙ_i，ｄ_i）とする。この時、位置（ｘ_i，ｙ_i）でのモデル面Ｓ_f,kの視差値ｄ_Sとして、以下の操作を行う。
【０１４１】
【数６】

【０１４２】
但し、Ｓ_aは、絶対当てはめ得点（視差情報の当てはめの絶対得点）を表し、初期値はＳ_a＝０とする。この操作をモデル領域ｈ_f内の全ての視差点について行う。そして、平均当てはめ得点Ｓ_rを、Ｓ_r＝Ｓ_a／ＭＡＸ（ｈ_fの幅，ｈ_fの高さ）の式により求める。この処理で、物体面を当てはめた時の、当てはめの良否（Ｓ_r，Ｓ_a）が得られる。
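The relation between the absolute score S_a and the average score S_r can be sketched as follows. The per-point operation of the equation above is not reproduced in this text, so the rule used here (add 1 to S_a when the disparity of a point lies within a tolerance of the model disparity d_S) is purely an assumption for illustration; only the normalization S_r = S_a / MAX(width of h_f, height of h_f) is taken from the description.

```python
def fit_scores(points, model_disparity, region_w, region_h, tol):
    """Return (S_r, S_a) for disparity points (x, y, d) inside the model region h_f.

    model_disparity(x, y) gives the disparity d_S of the model surface at (x, y).
    """
    s_a = 0                                   # initial value S_a = 0
    for x, y, d in points:
        # Assumed support rule: the point agrees with the model surface.
        if abs(d - model_disparity(x, y)) <= tol:
            s_a += 1
    s_r = s_a / max(region_w, region_h)       # normalization stated in the text
    return s_r, s_a
```

Normalizing by the larger side of the projected region makes S_r comparable between near (large) and far (small) regions, while S_a keeps the raw amount of support.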
【０１４３】
(5) −７：側面領域当てはめ処理（確認用）の説明
図１２は、側面領域当てはめ処理（確認用）フローチャートであり、図９のＳ３４の詳細な処理を示したものである。以下、図１２に基づいて、図９のＳ３４の処理を詳細に説明する。なお、Ｓ５１〜Ｓ５７は各処理ステップを示す。
【０１４４】
この処理は、物体の存在を確認するための側面領域の当てはめ処理である。先ず、前面領域Ｓ_f,k、モデルクラスＣ_k、視差画像を入力し（Ｓ５１）、側面領域位置の決定を行う（Ｓ５２）。この場合、先に求まった前面領域Ｓ_f,kの位置を元に、側面領域Ｓ_s,kの３次元位置を計算する。側面領域としては両横の２つが考えられるが、現在の視点から見えている方に限定する。そして、側面領域Ｓ_s,kの視差画像への投影である投影領域ｈ_sを得る（Ｓ５３）。
【０１４５】
次に、側面領域Ｓ_s,kの視差画像での側面領域Ｓ_s,kの最前方位置と最後方位置の差（奥行き変化量）を調べて、その差が距離情報の解像度に比べて十分大きいならば（Ｓ５４）、距離情報による当てはめが行えるので、以降の処理を行なう。また、もし、解像度が不足しているなら、側面領域の存在を問うこと自体が意味を持たないので、側面領域検査不要として処理を終える。
【０１４６】
また、Ｓ５４の処理で、奥行き解像度が十分な時（Ｓ５４）には、前記Ｓ３４の処理により側面領域Ｓ_s,kを支持する視差点が十分あるかを調べ、物体面当てはめ処理を行う。この場合、物体面当てはめ度合いの良否判断の結果は、平均当てはめ得点Ｓ_r、絶対当てはめ得点Ｓ_aとして返される（Ｓ５５）。
【０１４７】
次に、前面領域と同様に、平均当てはめ得点Ｓ_rが閾値以上なら（Ｓ５６）、側面領域Ｓ_s,kが存在していると見なし、当てはめ得点の評価値ＳＣ２として、絶対当てはめ得点Ｓ_aを返す（Ｓ５７）。また、平均当てはめ得点Ｓ_rが閾値以下なら、当該側面領域Ｓ_s,kは無いとする。
【０１４８】
(5) −８：側面領域当てはめ部の説明
以下、図３のＢに示した「側面領域当てはめ部１５」の処理について説明する。この処理では、各基本領域Ｒ_iについて、最初に側面領域Ｓ_s,kを求めて、該側面領域Ｓ_s,kが有れば、それを支持する前面領域Ｓ_f,kを求めて、その双方があれば物体領域Ｏ_s,kがあるとする。
【０１４９】
この意味において、前面領域当てはめ部１４との違いは前面領域と側面領域の探索順序の違いだけで、その内容は実質的に同じである。従って、これまで説明した処理を、前面を側面領域に、側面領域を前面にそれぞれ入れ換えたものと同値であるから、詳細な説明は省略する。
【０１５０】
(6) ：物体速度計測器の説明
以下、図２に示した物体速度計測器５について説明する。物体の速度は、各時刻で得られる物体位置を時刻間で対応づけることで、物体の位置変化を取得して求める。すなわち、物体の速度計測は、時刻間の物体の移動量より求める。この対応付けを行う主体として、追跡器ＴＡ_i（ｉは他の追跡器と区別するための添え字）を定義する。
【０１５１】
更に、
【０１５２】
【数７】

【０１５３】
そのために、更に、
【０１５４】
【数８】

【０１５５】
カルマンフィルタのシステム式及び推定式は以下の通りである。
【０１５６】
【数９】

【０１５７】
【数１０】

【０１５８】
【数１１】

【０１５９】
更に、時刻間の対応付けにおいて対応する物体領域Ｏ_iが無くなった時には、追跡器ＴＡ_iはロスト追跡器ＴＬ_iに遷移するとする。このロスト追跡器ＴＬ_iを集めた集合、ロスト追跡器リストＴＬＬ＝｛ＴＬ_i｝、ｉ∈｛１，２，・・・，Ｎ_n｝を定義する。
【０１６０】
(7) ：物体追跡処理（追跡器の時刻間で対応付ける方法）の説明
図１３は物体追跡処理フローチャートである。以下、図１３に基づいて物体追跡処理を説明する。なお、Ｓ６１〜Ｓ６９は各処理ステップを示す。
【０１６１】
先ず、物体領域リストＯＬ、追跡器リストＴＡＬ、ロスト追跡器リストＴＬＬを入力し（Ｓ６１）、追跡器リストＴＡＬから、物体領域と対応付けられていない未対応の追跡器ＴＡ_iを得る（Ｓ６２）。もし、そのような追跡器ＴＡ_iがなければ（Ｓ６３）、Ｓ６８の処理（後述する）へ移行する。
【０１６２】
次に、追跡器ＴＡ_iがあった場合、追跡器ＴＡ_iについて、対応可能な物体領域Ｏ_jを物体領域リストＯＬから選択する（Ｓ６４）。もし、対応する物体領域Ｏ_jがあれば（Ｓ６５）、追跡器ＴＡ_iを物体領域Ｏ_jを用いて更新処理する（Ｓ６６）。そして、処理をＳ６２に戻す。
【０１６３】
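If there is no corresponding object region O_j, correspondence-lost processing is performed on the tracker TA_i (S67) and the process returns to S62. When association with object regions has been completed for all matched trackers TA_i in this way, the process moves to S68. At this stage, the object region list contains only new object regions that were not associated with any tracker.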
【０１６４】
その中には、実際に新たに加わった物体領域もあれば、以前の対応追跡器中に何らかの理由で対応をロストして、その同一物体領域がこの時点で再び観測されたものもある。そこで、先ず、後者の場合を想定して、ロスト追跡器ＴＬ_iと矛盾なく対応する物体領域Ｏ_jを探索する（Ｓ６８）。
【０１６５】
もし、そのような物体領域Ｏ_jがあれば、対応するロスト追跡器ＴＬ_iの情報を用いて初期速度を定め、対応追跡器ＴＡ_iに戻す。初期速度が定まっているので、カルマンフィルタで推定するパラメータの収束が早まり、短時間で位置、速度情報が安定する。
【０１６６】
次に、前者の新規物体領域について、追跡器ＴＡを設定して、以降の追跡処理を開始する。そして、最後に、ロスト追跡器ＴＬ_iのロスト回数をインクリメントし、一定回数以上ロスト状態が続くロスト追跡器ＴＬ_iを除去する更新処理を行う（Ｓ６９）。
【０１６７】
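The processing of S68 is now described in detail. In this processing, the current position is predicted from the position of the lost tracker TL_i at the time it lost the object, and if that predicted position is sufficiently close to the position of the object region O_j, the two are regarded as consistent. If they are consistent, tracking is resumed with the speed of the lost tracker TL_i as the initial value. The lost tracker TL_i is then returned to the tracker list TAL as a tracker TA_i. Because the moving speed of the object is already known when tracking resumes, tracking can be performed stably.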
【０１６８】
更に、各処理について詳細に説明する。先ず、Ｓ６４の対応可能物体領域の選択では、追跡器ＴＡ_iの予測位置Ｐ_p（ＸＰ_i，ＺＰ_i）を基準に、最も３次元距離が近い物体領域を対応候補とし、その対応候補との距離が閾値以下の場合は対応可能とする。それ以外は対応不可能とする。
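The selection rule of S64 (nearest object region to the predicted position, accepted only within a threshold) can be sketched as follows. Positions are assumed to be (X, Z) pairs and the function name is hypothetical.

```python
import math


def select_object(predicted, objects, max_dist):
    """Return the index of the object region nearest to the predicted
    position, or None when even the nearest one is farther than max_dist."""
    if not objects:
        return None
    j = min(range(len(objects)), key=lambda k: math.dist(predicted, objects[k]))
    return j if math.dist(predicted, objects[j]) <= max_dist else None
```

Returning `None` corresponds to the "cannot be associated" case, which leads to the lost processing of S67.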
【０１６９】
次に、Ｓ６６の追跡器の更新では、対応可能物体領域Ｏ_jの３次元位置（ＸＯ_j，ＺＯ_j）を観測位置として、前記式(2) 、式(4) で追跡器の位置、速度の推定値を更新し、その速度を追跡物体の移動速度（出力）とする。更に、前記式(1) 、式(3) を用いて、次回の位置、速度の予測値を求めておき、次時刻の対応付けに利用する。
【０１７０】
また、Ｓ６７の追跡器のロスト処理では、その追跡器ＴＡ_iが物体をロストした時点での位置と速度を、それぞれロスト位置Ｐｌ_i（ＸＬ_i，ＺＬ_i）、ロスト速度Ｖｌ_i（ＶＸＬ_i，ＶＺＬ_i）に記録しておき、再対応時に備える。更に、ロスト回数カウンタｌ_ciを１に設定する（ｌ_ci＝１）。そして、追跡器ＴＡ_iを追跡器リストＴＡＬから除去し、それをロスト追跡器リストＴＬＬに追加する。
【０１７１】
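In the processing of S68, the difference l between the predicted position and the observed position is calculated by the following equation from the position (XO_j, ZO_j) of the new object region O_j and the lost position pl_i and lost speed vl_i of each lost tracker TL_i, and it is checked whether they can be associated without contradiction.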
【０１７２】
【数１２】

【０１７３】
もし、前記ｌ（エル）が閾値以内ならば（ｌ≦閾値）、物体領域Ｏ_jとロスト追跡器ＴＬ_iは矛盾なく対応できるとして、物体領域Ｏ_jの位置を用いてロスト追跡器ＴＬ_iの３次元位置と速度をカルマンフィルタで推定すると共に、その追跡器をロスト追跡器リストＴＬＬから除き、ロスト回数を０に設定してから、再度（対応）追跡器リストＴＡＬに加えて、以降の対応付けを開始する。
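The handling of lost trackers (S67 to S69) can be sketched as follows. The equation for l is not reproduced in this text, so the prediction used here, lost position plus lost speed multiplied by the number of lost time steps, is an assumption based on the prose description, with speed expressed per time step. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LostTracker:
    x: float              # lost position XL_i
    z: float              # lost position ZL_i
    vx: float             # lost speed VXL_i (per time step, assumed unit)
    vz: float             # lost speed VZL_i
    lost_count: int = 1   # set to 1 when the tracker becomes lost


def predicted_position(t):
    """Assumed prediction: extrapolate from the lost position at constant speed."""
    return (t.x + t.vx * t.lost_count, t.z + t.vz * t.lost_count)


def rematch(lost, obj_pos, max_gap):
    """Index of the lost tracker whose prediction is closest to obj_pos
    and within max_gap, or None if no tracker is consistent."""
    best, best_l = None, None
    for i, t in enumerate(lost):
        l = math.dist(predicted_position(t), obj_pos)
        if l <= max_gap and (best_l is None or l < best_l):
            best, best_l = i, l
    return best


def age_lost_trackers(lost, max_lost):
    """S69: increment the lost counters and drop trackers lost too long."""
    for t in lost:
        t.lost_count += 1
    return [t for t in lost if t.lost_count < max_lost]
```

A tracker returned by `rematch` would be moved back to the active list with its stored speed as the initial speed, so the filter does not have to re-converge from zero.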
【０１７４】
§４：具体的な装置例と記録媒体の説明
図１４は具体的な装置例である。前記３次元物体検出／速度計測装置は、パーソナルコンピュータ、ワークステーション等の任意のコンピュータを利用して実現することができる。この３次元物体検出／速度計測装置は、コンピュータ本体２０と、該コンピュータ本体２０に接続されたディスプレイ装置３１、入力装置（キーボード／マウス等）３２、リムーバブルディスクドライブ（「ＲＤＤ」という）３３、磁気ディスク装置（「ＭＤＤ」という）３４、距離計測器１、カメラ３等で構成されている。
【０１７５】
そして、コンピュータ本体２０には、内部の各種制御や処理を行うＣＰＵ２１と、プログラムや各種データを格納しておくためのＲＯＭ２２（不揮発性メモリ）と、メモリ２３と、インタフェース制御部（「Ｉ／Ｆ制御部」という）２４と、通信制御部２５と、入出力制御部（Ｉ／Ｏ制御部）２６等が設けてある。なお、前記ＲＤＤ３３には、フレキシブルディスクドライブ（フロッピィディスクドライブ）や光ディスクドライブ等が含まれる。
【０１７６】
前記構成の装置において、例えば、ＲＯＭ２２、或いはＭＤＤ３４の磁気ディスク（記録媒体）に、３次元物体検出／速度計測装置の前記処理を実現するためのプログラムを格納しておき、このプログラムをＣＰＵ２１が読み出して実行することにより、前記３次元物体検出／速度計測処理を実行する。
【０１７７】
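However, the present invention is not limited to this example. For instance, the program may be stored on the magnetic disk of the MDD 34 in either of the following ways, and the three-dimensional object detection / speed measurement processing may be performed by the CPU 21 executing this program.
【０１７８】
(1): A program stored on a removable disk created by another device (program data created by another device) is read by the RDD 33 and stored on the recording medium of the MDD 34.
【０１７９】
(2): Data such as a program transmitted from another device over a communication line such as a LAN is received via the communication control unit 25 and stored on the recording medium (magnetic disk) of the MDD 34.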
【０１８０】
【発明の効果】
本発明によれば次のような効果がある。
【０１８１】
(1) ：請求項１では、基本領域検出手段は、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出する。
【０１８２】
この場合、距離情報（視差情報）に物体面モデルを当てはめてその良否から物体の有無を決定する「物体検出処理」で必要となる、距離（視差）画像から近隣の距離情報をまとめて平面部分を物体面候補として抽出する。この処理において、３次元位置（奥行き）により変化する物体の観測大きさに合わせた適切な領域を「近隣」として近隣領域内の距離情報をまとめることで過不足ない距離情報を用いることを可能にする。
【０１８３】
その結果、従来の固定大きさを「近隣」とする方法で生じる、遠方で近隣する他物体を誤って一つの物体としてしまう過併合の問題や、近傍で一つの物体面が多くの小平面に分割されて検出され、本来の物体面に統合するのが困難になる過分割の問題を生じることなく、物体の３次元位置（奥行き）によらず、常に安定して一つの物体面を一つの平面として抽出することができる。
【０１８４】
(2) ：請求項２では、物体面当てはめ手段は、当てはめ対象である基本領域の３次元距離によって、距離情報の他に、対象情景を撮像する濃淡或いはカラー画像から得られる情報を使用するか不使用とするかを切り換える。
【０１８５】
この場合、物体検出処理に関して、対象物体が遠方にあると物体の観測大きさが小さくなり、得られる距離情報の数が不足して、距離情報への物体面モデルの当てはめではモデル位置を変化させても当てはめ度合いが殆ど変化せず、物体位置の特定が困難となる。
【０１８６】
そのため、物体が閾値以上に遠い場合は、３次元距離情報の他に、ビデオカメラ（或いはテレビカメラ）等で撮像された情景の濃淡画像から得られる物体輪郭線情報を用いて画像中での物体位置を補正することで、物体の３次元位置（奥行き）によらず、常に安定して物体位置を得ることができる。
【０１８７】
(3) ：請求項３では、速度検出手段は、位置検出手段で物体が検出できず、その結果として時系列での対応が求まらない時、一定回数だけその物体を未対応状態で位置を残しておき、前記位置検出手段で再び物体が検出された時には、未対応であった期間での平均移動速度を参考に未対応状態の物体と矛盾なく対応できるかを調べ、もし矛盾なく対応できるならば、先の検出された物体を未対応物体が再び検出されたとして、未対応間の平均移動速度を再追跡の初期速度として与えて再び物体の追跡を行うことで、余分な追跡を必要とせず、速やかに速度計測を再開する。
【０１８８】
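In this case, regarding the object speed measurement process, when an object region cannot be detected and no correspondence is found, the position information of the object is retained in an unmatched state for a fixed number of time steps. When the object is detected again, the average moving speed over the unmatched period is given as the initial speed and tracking of the object is resumed, which makes it possible to measure the speed stably immediately after re-detection.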
【０１８９】
また、再検出物体を新規物体として追跡し始める従来方式では、速度情報を安定に得るには、更に、数フレームの物体追跡が必要で、速度情報の出力までに時間遅れを要していたが、本発明では、物体の検出ができない場合があっても、時間遅れなく、速度情報を得ることができる。
【０１９０】
(4) ：請求項４では、３次元位置情報から物体面を構成する面の全部或いは一部を、部分特徴である基本領域として検出する基本領域検出処理と、前記検出された物体面の基本領域について、予め用意した物体形状モデルの構成面を当てはめる物体面当てはめ処理と、前記物体面当てはめ手段による当てはめ結果の良否より物体の有無を判定する物体領域判定処理を有し、前記基本領域検出処理では、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とするが、その際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出するようにした。
【０１９１】
このようにすれば、従来の固定大きさを「近隣」とする方法で生じる、遠方で近隣する他物体を誤って一つの物体としてしまう過併合の問題や、近傍で一つの物体面が多くの小平面に分割されて検出され、本来の物体面に統合するのが困難になる過分割の問題を生じることなく、物体の３次元位置（奥行き）によらず、常に安定して一つの物体面を一つの平面として抽出することができる。
【０１９２】
(5) ：請求項５では、３次元物体検出装置が、記録媒体のプログラムを読み出して実行することにより、或る近隣範囲に存在する３次元位置情報をまとめて基本領域とする際、このまとめる近隣範囲の大きさを、３次元位置により変化する物体の見かけの大きさに合わせた適切な大きさに設定することで、距離情報を過不足なくまとめることを可能にし、その結果として物体位置を検出する。
【０１９３】
この場合、距離情報（視差情報）に物体面モデルを当てはめてその良否から物体の有無を決定する「物体検出処理」で必要となる、距離（視差）画像から近隣の距離情報をまとめて平面部分を物体面候補として抽出する。この処理において、３次元位置（奥行き）により変化する物体の観測大きさに合わせた適切な領域を「近隣」として近隣領域内の距離情報をまとめることで過不足ない距離情報を用いることを可能にする。
【０１９４】
その結果、従来の固定大きさを「近隣」とする方法で生じる、遠方で近隣する他物体を誤って一つの物体としてしまう過併合の問題や、近傍で一つの物体面が多くの小平面に分割されて検出され、本来の物体面に統合するのが困難になる過分割の問題を生じることなく、物体の３次元位置（奥行き）によらず、常に安定して一つの物体面を一つの平面として抽出することができる。
【図面の簡単な説明】
【図１】本発明の原理説明図である。
【図２】本発明の実施の形態におけるシステム説明図である。
【図３】本発明の実施の形態における基本領域検出部、及び物体面当てはめ／物体位置決定部の構成図である。
【図４】本発明の実施の形態における間引き視差画像と元の視差画像の位置関係を示した図である。
【図５】本発明の実施の形態における基本領域生成部の処理フローチャートである。
【図６】本発明の実施の形態における処理説明図（その１）である。
【図７】本発明の実施の形態における処理説明図（その２）である。
【図８】本発明の実施の形態における処理説明図（その３）である。
【図９】本発明の実施の形態における前面領域当てはめ部の処理フローチャート（その１）である。
【図１０】本発明の実施の形態における前面領域当てはめ部の処理フローチャート（その２）である。
【図１１】本発明の実施の形態における輪郭検出処理の説明図である。
【図１２】本発明の実施の形態における側面領域当てはめ処理（確認用）フローチャートである。
【図１３】本発明の実施の形態における物体追跡処理フローチャートである。
【図１４】本発明の実施の形態における具体的な装置例である。
【符号の説明】
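1: Distance measuring device
2: Disparity image converter
3: Camera (video camera)
4: Object detector
4-1: Basic region detection unit
4-2: Object surface fitting / object position determination unit
5: Object speed measuring device
11: Disparity image thinning unit
12: Unit region detection unit
13: Basic region generation unit
14: Front region fitting unit
15: Side region fitting unit
20: Computer main body
21: CPU
22: ROM
23: Memory
24: I/F control unit (interface control unit)
25: Communication control unit
26: I/O control unit (input/output control unit)
31: Display device
32: Input device
33: Removable disk drive (RDD)
34: Magnetic disk device (MDD)
[0001]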
[Technical field of the invention]
The present invention is used in apparatuses (or systems) that detect the position and speed of a three-dimensional object based on distance information, such as intruder detection, vehicle (automobile, etc.) position and speed detection, and navigation for mobile robots. In particular, it relates to a three-dimensional object detection apparatus, a three-dimensional object detection method, and a recording medium that, when three-dimensional position information (distance information) is given by some distance measurement method, detect the position, shape, and moving speed of three-dimensional objects that can exist over a wide range of three-dimensional space.
[0002]
[Prior art]
A conventional example will be described below.
[0003]
§1: Conventional example 1
Several methods for detecting the position of an object using distance information have been proposed. They are roughly classified into (A1): methods that use only distance information and obtain an object by detecting clusters in the distance distribution, and (A2): methods that always use both distance information and image information. These methods are described below.
[0004]
(1): Explanation of (A1)
As a representative of (A1), the method of Saneyoshi (実吉) et al., "Front Situation Recognition System for Driving Support Using Stereo Images", IEICE Technical Report, PRMU 97-25 to 36, pp. 39-46, is described.
[0005]
In this method, first, the road scene in front of the vehicle is observed with two cameras, and distance information (a distance image) is acquired from the two images by binocular stereo. The obtained distance image is divided into a group of strip-shaped regions defined as follows. Each region has a small predetermined width, and its height is set so that it contains only the part above the road surface, based on the relationship between the camera arrangement information and the road surface position information acquired in advance.
[0006]
Then, for each region, a depth histogram with depth on the horizontal axis and frequency on the vertical axis is created from the distance information the region contains, and the distance representing the strip region is determined from the peak position of the histogram. Next, adjacent regions whose representative distances are close are combined to form region groups.
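The strip-wise histogram step described above can be sketched roughly as follows. This is only an illustrative sketch, not the cited authors' implementation: the function names, the `strip_width`, `bin_size`, and `max_gap` parameters, and the use of `None` for pixels without a distance measurement are all assumptions made for the example.

```python
from collections import Counter

def strip_depths(depth_rows, strip_width, bin_size):
    """Return one representative depth per vertical strip (None for an empty strip).

    depth_rows: list of rows of a distance image; None marks a missing measurement.
    """
    n_cols = len(depth_rows[0])
    result = []
    for x0 in range(0, n_cols, strip_width):
        hist = Counter()
        for row in depth_rows:
            for z in row[x0:x0 + strip_width]:
                if z is not None:
                    hist[int(z // bin_size)] += 1   # depth histogram of this strip
        if hist:
            peak_bin = max(hist, key=hist.get)       # histogram peak
            result.append((peak_bin + 0.5) * bin_size)
        else:
            result.append(None)
    return result

def group_strips(depths, max_gap):
    """Group adjacent strips whose representative depths differ by at most max_gap."""
    groups, current = [], []
    for i, z in enumerate(depths):
        if z is None:
            if current:
                groups.append(current)
                current = []
            continue
        if current and abs(z - depths[current[-1]]) > max_gap:
            groups.append(current)
            current = []
        current.append(i)
    if current:
        groups.append(current)
    return groups
```

Each returned group is a list of strip indices; the subsequent orientation test on each group is not part of this sketch.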
[0007]
Then, the inclination of each obtained group in three-dimensional space is examined: a group that is nearly perpendicular to the observation direction is classified as the rear surface of a vehicle, and a group that is nearly parallel to the observation direction is classified as a vehicle side surface. Further, the vehicle position and shape are detected from combinations based on the positional relationship between these front-facing and side surfaces.
[0008]
(2): Explanation of (A2)
The method (A2), which detects an object by always using both distance information and image information, will be described using as an example Ryoko Shimomura et al., "Study on Reducing Variation in Inter-vehicle Distance Measurement Using Stereo Parallax and Height Change of Preceding Car", IEICE Technical Report, PRMU98-92~105, pp. 21-28.
[0009]
In this method, first, a road scene in front of a vehicle is observed with two cameras on the left and right to obtain two images. A white line is detected from one of the images to obtain a travel region of the host vehicle, and a clear and long horizontal line included therein that is located at the lowest position is obtained as the lower end of the vehicle. Then, an area having a certain size called a template having the width of the lower end line is defined above the lower end line, and the other image is searched to obtain an area where the brightness pattern is most similar to the template area. Thereby, the distance to the vehicle (object) is obtained by binocular stereoscopic vision.
[0010]
Further, assuming that the vehicle has stable horizontal lines at its upper and lower ends, a histogram of edge points projected in the horizontal direction is created within another region set with reference to the template position, and the upper and lower ends of the vehicle are determined from the peak positions of this histogram. The height (h) of the vehicle in the image is determined from the distance between the upper end and the lower end. In this way, the position of the vehicle and the vehicle shape (size) are detected.
[0011]
§2: Conventional example 2
As a second conventional example, moving speed measurement will be described. Methods for calculating the moving speed while detecting the position of an object are roughly divided into (B1): methods that take the difference in object position between two times as the moving speed, and (B2): methods that calculate the current moving speed using several past object positions.
[0012]
(1): Explanation of (B1)
In the method (B1), the object position is detected at each time, and the moving speed is a simple difference between the object position at the current time and the object position at the previous time.
[0013]
(2): Explanation of (B2)
As a representative of (B2), the method of Kazuyuki Imakawa, Ryoyama, Seiji Kashiwagi, and Hideaki Matsuo, "Real-time palm tracking from sign language video considering occlusion by face", IEICE Technical Report, PRMU97-104~110, pp. 15-22, will be described as an example.
[0014]
This example is a method of tracking a two-dimensional hand region as the object. It is nevertheless used as the example here because, from the viewpoint of object tracking, the approach is the same in three dimensions. First, a hand region is extracted from the image using color information, and the position of the center of gravity of the region is taken as the region position p.
[0015]
Constant-velocity motion is assumed as the motion model of the region. The observation vector is the region position p, and the parameter vector x to be estimated consists of the region position and the region velocity. A Kalman filter is configured to estimate the parameter vector x, and the region position and region velocity indicated by the parameter vector are obtained as the required quantities.
[0016]
When a Kalman filter is used in this way, the position and speed that are most stable in the sense of linear estimation can be estimated even if the region position observed at each time contains errors, so stable measurement is possible.
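As a rough illustration of the constant-velocity Kalman filter outlined above, the following one-dimensional sketch estimates position and velocity from noisy position observations. It is not the cited authors' implementation: the scalar state layout, the simplified process noise `q` added to both diagonal terms, the measurement noise `r`, and the function name are assumptions chosen to keep the example short.

```python
def kalman_cv_step(x, P, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    x: (position, velocity) estimate; P: 2x2 covariance as nested lists;
    z: observed position. q and r are assumed noise variances.
    """
    p, v = x
    # Predict: x' = F x, P' = F P F^T + Q, with F = [[1, dt], [0, 1]]
    p_pred = p + dt * v
    v_pred = v
    P00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    P01 = P[0][1] + dt * P[1][1]
    P10 = P[1][0] + dt * P[1][1]
    P11 = P[1][1] + q
    # Update with observation matrix H = [1, 0] (only position is observed)
    S = P00 + r                 # innovation variance
    K0, K1 = P00 / S, P10 / S   # Kalman gain
    resid = z - p_pred
    x_new = (p_pred + K0 * resid, v_pred + K1 * resid)
    P_new = [[(1 - K0) * P00, (1 - K0) * P01],
             [P10 - K1 * P00, P11 - K1 * P01]]
    return x_new, P_new

# Usage: feed the observed positions one time step at a time.
state, cov = (0.0, 0.0), [[10.0, 0.0], [0.0, 10.0]]
for t in range(1, 61):
    state, cov = kalman_cv_step(state, cov, z=2.0 * t)
```

Because the velocity is part of the estimated state, a single noisy position has less influence on the speed than it would with a two-frame difference.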
[0017]
[Problems to be solved by the invention]
The conventional apparatus as described above has the following problems.
[0018]
(1): In the case of (A1), the depth resolution and spatial resolution of three-dimensional information generally decrease as the distance increases, and the reliability of the distance information and position information decreases accordingly. That is, when there are multiple objects nearby, they can be separated by their difference in distance, but for the same difference in distance between distant objects, the difference cannot be detected and the objects become difficult to separate. It is therefore difficult to separate object regions by distance difference, using only distance information, for objects beyond a certain distance.
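The loss of depth resolution with distance can be made concrete under one common assumption, namely that the distance information comes from stereo triangulation, as in the stereo-based method of (A1). The symbols f (focal length in pixels), B (camera baseline), d (disparity), and δd (disparity quantization step) are introduced here only for illustration and do not appear in the original text.

```latex
Z = \frac{fB}{d}, \qquad
\left|\frac{\partial Z}{\partial d}\right| = \frac{fB}{d^{2}} = \frac{Z^{2}}{fB}, \qquad
\delta Z \approx \frac{Z^{2}}{fB}\,\delta d
```

Under this assumption, the smallest resolvable depth difference grows with the square of the distance Z, which is consistent with the statement that two objects separated by the same depth difference are separable nearby but not far away.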
[0019]
Furthermore, the amount of distance information obtained varies with the distance to the object: when an object is near the sensor, it is observed as large and the amount (number) of three-dimensional information obtained is large, whereas when the object is far away, the amount (number) of three-dimensional information that can be observed is small.
[0020]
As a result, in a method that divides the distance image into fixed-size regions (hereinafter referred to as "extraction regions") and obtains the object position for each region, as in the (A1) method of Saneyoshi et al., the extraction regions must be made small in order to separate distant objects from one another. Consequently, when an object is nearby, although a sufficient amount of distance information exists, the narrow region width reduces the number of three-dimensional data points gathered per region, and the accuracy of the object position decreases.
[0021]
Conversely, if the extraction regions are enlarged in order to increase the position accuracy for nearby objects, distant objects can no longer be separated. A further problem is that the distance information obtained for distant objects is sparse in two dimensions on the distance image, so that a single object surface is not necessarily obtained as an extraction region with continuous three-dimensional positions.
[0022]
(2): In the case of (A2), where an object is detected using both distance information and image information, when the object is far away the pattern inside the object is hardly observed and the object contour becomes dominant. In such a case, it is effective to define the object region by extracting the object contour from the projection of edge points, as in the method of Shimomura et al.
[0023]
However, when the object is near the observer, the pattern inside the object is more dominant than the object contour. As a result, in the projection histogram of edge points, the pattern edges inside the object hinder the formation of peaks at the contour, and it becomes difficult to detect the contour lines stably. Therefore, when image features are always used together with distance information, the image features may become an obstacle and the object position may not be detected.
[0024]
(3): In the case of (B1), where the object position is obtained at each time and the moving speed is taken as the simple difference between the object position at the current time and that at the previous time, the speed is strongly affected by the measurement accuracy of the object position at each time, so the measurement accuracy of the speed is poor.
[0025]
(4): In the case of (B2), the current moving speed is calculated using several past detected object positions. The problem with this approach is how to obtain the correspondence of an object between time steps stably.
[0026]
For example, when objects are detected in an outdoor scene, an object may fail to be detected because of surrounding lighting conditions, such as changes in sunlight and the presence of shadows, or because of changes in appearance caused by the movement of the object. In that case the object cannot be associated between time steps, so its moving speed cannot be measured. Moreover, when a filter is used, the effect of the missed detection persists for several time steps, and the speed information during that period is incorrect.
[0027]
Furthermore, when an object that was not detected is detected again, estimation of the motion parameters must be started anew. For several time steps after the object is detected again, the estimated motion parameters therefore contain large errors, and the speed information is incorrect. As described above, missed detections significantly affect stable speed detection.
[0028]
An object of the present invention is to solve such a conventional problem and to enable stable detection of the position and speed of an object without being affected by the distance of the object.
[0029]
[Means for Solving the Problems]
FIG. 1 is a diagram for explaining the principle of the present invention. In FIG. 1, 4 denotes an object detector, 4-1 a basic area detector, 4-2 an object surface fitting / object position determining unit, and 5 an object velocity measuring device. In order to achieve the above object, the present invention is configured as follows.
[0030]
(1): A three-dimensional object detection apparatus for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information (distance information) is given either in an image format having depth information at each location of a two-dimensional (x, y) array or in the form of a set of three-dimensional position (X, Y, Z) information, the apparatus comprising: basic area detecting means (basic area detecting unit 4-1) for detecting, from the three-dimensional position information, all or part of the surfaces constituting an object surface as basic regions, which are partial features; object surface fitting means (object surface fitting / object position determining unit 4-2) for fitting the constituent surfaces of an object shape model prepared in advance to the detected basic regions of the object surface; and object region determination means (a part of the object surface fitting / object position determining unit 4-2) for judging the presence or absence of an object based on the quality of the fit obtained by the object surface fitting means, wherein the basic area detecting means has a function of gathering the three-dimensional position information existing within a certain neighborhood range into a basic region and setting the size of that neighborhood range according to the distance to the three-dimensional object, which varies with its three-dimensional position.
[0031]
(2): A three-dimensional object detection apparatus for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information (distance information) is given either in an image format having depth information at each location of a two-dimensional (x, y) array or in the form of a set of three-dimensional position (X, Y, Z) information, the apparatus comprising: basic area detecting means for detecting, from the three-dimensional position information, all or part of the surfaces constituting an object surface as basic regions, which are partial features; object surface fitting means for fitting the constituent surfaces of an object shape model prepared in advance to the detected basic regions of the object surface; and object region determination means for judging the presence or absence of an object based on the quality of the fit obtained by the object surface fitting means, wherein the object surface fitting means has a function of switching, according to the three-dimensional distance of the basic region to be fitted, whether or not to use, in addition to the distance information, information obtained from a grayscale or color image of the target scene.
[0032]
(3): In the three-dimensional object detection apparatus of (1) above, the object region determination means has a function of detecting the object position based on the quality of the fit obtained by the object surface fitting means, obtaining the detected object positions in time series, and calculating the speed information of each object by finding the time-series correspondence of the obtained object positions. The apparatus further comprises speed detection means that operates as follows: when the object region determination means cannot detect the object and, as a result, the time-series correspondence cannot be obtained, the position of the object is retained in an unmatched state for a certain number of time steps; when the object region determination means detects an object again, it is checked, with reference to the average moving speed during the unmatched period, whether the detection can be associated with the unmatched object without contradiction; and if it can, the unmatched object is regarded as having been detected again, and the average moving speed during the unmatched period is given as the initial speed for re-tracking, so that speed measurement can be resumed by tracking the object again.
[0033]
(4): A three-dimensional object detection method for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information is given either in an image format having depth information at each location of a two-dimensional array or in the form of a set of three-dimensional position information, the method comprising: a basic area detection process for detecting, from the three-dimensional position information, all or part of the surfaces constituting an object surface as basic regions, which are partial features; an object surface fitting process for fitting the constituent surfaces of an object shape model prepared in advance to the detected basic regions of the object surface; and an object region determination process for judging the presence or absence of an object based on the quality of the fit obtained by the object surface fitting process, wherein, in the basic area detection process, the three-dimensional position information existing within a certain neighborhood range is gathered into a basic region, and the size of that neighborhood range is set according to the distance to the three-dimensional object, which varies with its three-dimensional position.
[0034]
(5): A computer-readable recording medium on which is recorded a program for causing a three-dimensional object detection apparatus, which detects the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information (distance information) is given either in an image format having depth information at each location of a two-dimensional (x, y) array or in the form of a set of three-dimensional position (X, Y, Z) information, to execute: a first procedure for detecting, from the three-dimensional position information, all or part of the surfaces constituting an object surface as basic regions, which are partial features; a second procedure for fitting the constituent surfaces of an object shape model prepared in advance to the detected basic regions of the object surface; a third procedure for judging the presence or absence of an object based on the quality of the fit obtained by the second procedure; and a fourth procedure for, when performing the first procedure, gathering the three-dimensional position information existing within a certain neighborhood range into a basic region and setting the size of that neighborhood range according to the distance to the three-dimensional object, which varies with its three-dimensional position.
[0035]
(Function)
The operation of the present invention based on the above configuration will be described with reference to FIG.
[0036]
(a): Action of (1) above
The basic area detecting means gathers the three-dimensional position information existing within a certain neighborhood range into a basic region. By setting the size of this neighborhood range appropriately according to the apparent size of the object, which changes with its three-dimensional position, distance information can be gathered without excess or deficiency, and as a result the object position is detected.
[0037]
In this case, for use in the "object detection process", which fits an object plane model to the distance information (parallax information) and judges the presence or absence of an object from the quality of the fit, the necessary distance information is extracted from the distance (parallax) image as object plane candidates. In this extraction, the "neighborhood" is defined as a range sized appropriately to the observed size of the object, which changes with its three-dimensional position (depth), and the distance information within that neighborhood is gathered, so that distance information can be used without excess or deficiency.
[0038]
As a result, neither of the problems caused by the conventional method of using a fixed-size "neighborhood" arises: over-merging, in which separate objects that lie close together at a distance are mistakenly merged into one object, and over-division, in which a single nearby object surface is detected as many small planes that are difficult to reintegrate into the original object surface. One object surface can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
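One simple way to realize a neighborhood size that follows the apparent object size is to scale it with the pinhole projection of an expected object width. The sketch below is an illustration under that assumption only; the function name, the expected width `object_width_m`, the focal length `focal_px`, the `fraction` factor, and the pixel clamps are all hypothetical values, not taken from the text.

```python
def neighborhood_radius_px(depth_m, object_width_m=1.5, focal_px=800.0,
                           fraction=0.25, min_px=1, max_px=64):
    """Image-plane neighborhood radius (pixels) for a point at the given depth.

    Assumes a pinhole camera: an object of width W at depth Z appears about
    focal_px * W / Z pixels wide, so the radius shrinks as depth grows.
    """
    apparent_width_px = focal_px * object_width_m / depth_m
    radius = int(round(fraction * apparent_width_px))
    return max(min_px, min(max_px, radius))   # clamp to a usable range

# Usage: a near point gets a wide neighborhood, a far point a narrow one.
near_radius = neighborhood_radius_px(5.0)
far_radius = neighborhood_radius_px(50.0)
```

A wide radius nearby counters over-division, and a narrow radius far away counters over-merging.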
[0039]
(b): Action of (2) above
The object plane fitting means switches between using or not using information obtained from the grayscale or color image for capturing the target scene, in addition to the distance information, depending on the three-dimensional distance of the basic region to be fitted.
[0040]
In this case, regarding the object detection process, if the target object is far away, its observed size is small and the amount of distance information obtained is insufficient. When the object plane model is fitted to the distance information, the degree of fit then hardly changes even if the model position is changed, and it is difficult to determine the object position.
[0041]
For this reason, when the object is farther away than a threshold distance, the object position in the image is corrected using, in addition to the three-dimensional distance information, object contour information obtained from a grayscale image of the scene captured by a video camera (or television camera). In this way, the object position can always be obtained stably regardless of the three-dimensional position (depth) of the object.
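The distance-dependent switching described above can be sketched as follows. The `Fit` record, the `refine_position` function, and the choice to correct only the lateral coordinate with the contour position are assumptions for illustration; the text does not specify how the contour information corrects the position.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fit:
    x: float   # lateral position of the fitted object model
    z: float   # depth of the fitted object model

def refine_position(range_fit: Fit, contour_x: Optional[float],
                    far_threshold: float) -> Fit:
    """Use the range-only fit for near objects; for far objects, correct the
    lateral position with the image contour when one is available."""
    if range_fit.z <= far_threshold or contour_x is None:
        return range_fit                        # near: image contour is not used
    return Fit(x=contour_x, z=range_fit.z)      # far: contour corrects the position

# Usage with hypothetical values (threshold of 30 distance units).
near = refine_position(Fit(x=1.0, z=10.0), contour_x=1.4, far_threshold=30.0)
far = refine_position(Fit(x=1.0, z=60.0), contour_x=1.4, far_threshold=30.0)
```

Here `near` keeps the range-based position while `far` takes the contour-based lateral position.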
[0042]
(c): Action of (3) above
In the three-dimensional object detection apparatus, the object region determination means detects the object position based on the quality of the fit obtained by the object surface fitting means, obtains the detected object positions in time series, and calculates the speed information of each object by finding the time-series correspondence of the obtained object positions. In the speed detection means, when the object region determination means cannot detect the object and, as a result, the time-series correspondence cannot be obtained, the position of the object is retained in an unmatched state for a certain number of time steps. When the object region determination means detects an object again, it is checked, with reference to the average moving speed during the unmatched period, whether the detection can be associated with the unmatched object without contradiction. If it can, the unmatched object is regarded as having been detected again, the average moving speed during the unmatched period is given as the initial speed for re-tracking, and speed measurement is resumed by tracking the object again.
[0043]
In this case, regarding the object speed measurement process, when the object region cannot be detected and the correspondence cannot be obtained, the position information is retained in the unmatched state for a certain number of time steps. When the object is detected again, the average moving speed during the unmatched period is given as the initial speed and the object is tracked again, which makes it possible to measure the speed stably immediately after re-detection.
[0044]
In the conventional method, in which a re-detected object is tracked as a new object, several more frames of object tracking are required before speed information can be obtained stably, so a time delay occurs before the speed information is output. In the present invention, speed information can be obtained without such a time delay even if there are times when the object cannot be detected.
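The handling of missed detections described above can be sketched in one dimension as follows. This is a simplified illustration: the `Track` record, the `max_missed` limit, and in particular the consistency test (comparing the average speed over the gap with the speed before the loss, within a `gate` tolerance) are assumptions, since the text does not state the exact criterion.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Track:
    position: float     # last confirmed object position
    velocity: float     # last estimated speed
    missed: int = 0     # consecutive time steps without a matching detection

def update_track(track: Track, detection: Optional[float], dt: float = 1.0,
                 max_missed: int = 5, gate: float = 2.0) -> bool:
    """Update a track with one time step; return False if it should be dropped."""
    if detection is None:
        track.missed += 1                 # keep the position in the unmatched state
        return track.missed <= max_missed
    elapsed = (track.missed + 1) * dt
    avg_velocity = (detection - track.position) / elapsed
    if track.missed > 0 and abs(avg_velocity - track.velocity) > gate:
        return False                      # inconsistent: treat as a different object
    # Matched (or re-detected): the average speed over the gap becomes the
    # initial speed for re-tracking. A motion-model filter could replace this.
    track.velocity = avg_velocity
    track.position = detection
    track.missed = 0
    return True

# Usage: two missed time steps followed by a consistent re-detection.
t = Track(position=0.0, velocity=1.0)
update_track(t, None)
update_track(t, None)
update_track(t, 3.3)
```

After the last call the track resumes with a speed of about 1.1 per time step, without a warm-up period.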
[0045]
(d): Action of (4) above
The method comprises a basic area detection process for detecting, from the three-dimensional position information, all or part of the surfaces constituting an object surface as basic regions, which are partial features; an object surface fitting process for fitting the constituent surfaces of an object shape model prepared in advance to the detected basic regions of the object surface; and an object region determination process for judging the presence or absence of an object based on the quality of the fit. In the basic area detection process, the three-dimensional position information existing within a certain neighborhood range is gathered into a basic region. By setting the size of this neighborhood range appropriately according to the apparent size of the object, which changes with its three-dimensional position, distance information can be gathered without excess or deficiency, and as a result the object position is detected.
[0046]
In this way, neither of the problems caused by the conventional method of using a fixed-size "neighborhood" arises: over-merging, in which separate objects that lie close together at a distance are mistakenly merged into one object, and over-division, in which a single nearby object surface is detected as many small planes that are difficult to reintegrate into the original object surface. One object surface can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
[0047]
(e): Action of (5) above
The three-dimensional object detection apparatus reads and executes the program on the recording medium. When it gathers the three-dimensional position information existing within a certain neighborhood range into a basic region, it sets the size of this neighborhood range appropriately according to the apparent size of the object, which changes with its three-dimensional position, so that distance information can be gathered without excess or deficiency, and as a result the object position is detected.
[0048]
In this case, for use in the "object detection process", which fits an object plane model to the distance information (parallax information) and judges the presence or absence of an object from the quality of the fit, the necessary distance information is extracted from the distance (parallax) image as object plane candidates. In this extraction, the "neighborhood" is defined as a range sized appropriately to the observed size of the object, which changes with its three-dimensional position (depth), and the distance information within that neighborhood is gathered, so that distance information can be used without excess or deficiency.
[0049]
As a result, neither of the problems caused by the conventional method of using a fixed-size "neighborhood" arises: over-merging, in which separate objects that lie close together at a distance are mistakenly merged into one object, and over-division, in which a single nearby object surface is detected as many small planes that are difficult to reintegrate into the original object surface. One object surface can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
[0050]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0051]
§1: Overview of 3D object detection device
The three-dimensional object detection apparatus of the present invention detects the position, shape, and moving speed of a three-dimensional object that can exist over a wide range of three-dimensional space when three-dimensional position information (distance information) is given by some distance measurement method. Here, the three-dimensional position information is a set of information representing the three-dimensional position (X, Y, Z) of every location, or of particular locations, in the space. It may be given in an image format having depth information at each location of a two-dimensional (X, Y) array, or in the form of a set of three-dimensional position (X, Y, Z) information (X, Y, Z: coordinates representing the three-dimensional space).
[0052]
First, a method for detecting the position and shape of an arbitrary three-dimensional object from the three-dimensional position information will be described. A three-dimensional shape model representing the approximate size and shape of a three-dimensional object to be detected is prepared in advance. Then, from the three-dimensional position information, portions that are three-dimensionally close and that form a plane are collected, and several surface elements (hereinafter referred to as “basic regions”) are extracted.
[0053]
Each surface constituting the three-dimensional model is then fitted to the basic regions, and if all the surfaces of the model that should be observable fit without problems, a three-dimensional object is judged to exist at that position. In general, however, a three-dimensional object is observed as smaller the farther away it is and as larger the closer it is, so the size of the basic region to which an object surface should be fitted varies with the position (distance) of the target object.
[0054]
Therefore, when a basic region is formed, the range of three-dimensional position information used to form it is changed according to the three-dimensional distance of the object, so that the region is small for distant objects and large for nearby objects. By matching the range to the observed size of the object, a stable basic region can be obtained that includes other objects as little as possible.
[0055]
Further, the farther away the object is, the lower the resolution of the depth information in the three-dimensional information becomes, and the lower the reliability of the distance information. In addition, the farther away the object itself is, the smaller the amount (number) of three-dimensional information that can be observed. This makes it difficult to fit the three-dimensional model described above when the object is far away.
[0056]
Therefore, when a three-dimensional object is far away and it is difficult to fit the model, the object position is detected accurately using the object contour lines obtained from a grayscale image (or color image) of the scene captured by a video camera (or TV camera) or the like. Conversely, when the object is nearby, a large amount of three-dimensional information is obtained, and the pattern inside the object dominates the grayscale image and is difficult to distinguish from the contour lines, so the object contour information is not used. By using image information adaptively according to the distance in this way, the object position can be detected stably without being affected by the distance of the object.
[0057]
Next, a method for measuring the speed of an object will be described. Basically, the moving speed of the object is detected from the time change of the three-dimensional position of the object obtained in the above, but in order to stabilize the detection speed, a commonly used object motion model is used. The current moving speed is determined using past position information.
[0058]
A method using such a motion model requires that the same object be correctly associated over a certain number of time steps until the speed information stabilizes. However, in outdoor object detection, for example, if the object is not detected because of surrounding lighting conditions or changes in appearance caused by the movement of the object, the object cannot be associated over time and its speed cannot be measured.
[0059]
Further, when the same object is observed again, it must again be associated over some number of time steps before the speed information can be obtained stably, so the speed information cannot be determined immediately. Thus, the loss of correspondence has a significant effect on stable speed detection.
[0060]
Therefore, when an object cannot be detected and the correspondence cannot be obtained, its position and size information is retained in an unmatched state for a certain number of time steps. When the object is detected again, the average moving speed during the unmatched period is given as the initial speed and tracking of the object is continued. As a result, even if the object position cannot be detected for a certain period, the object information over a certain period that would otherwise be needed to stabilize the speed after re-detection is unnecessary, so the speed information can be stabilized quickly.
[0061]
§2: Explanation of 3D object detection / speed detection processing
Hereinafter, the three-dimensional object detection / speed detection process will be described.
[0062]
(1): In the present invention, regions that serve as nuclei for detecting object surface positions (basic regions) are extracted from the three-dimensional position information, an object model determined in advance is fitted to the basic regions, and an object is judged to exist when the degree of fit is good.
[0063]
A basic region is a region that gathers those pixels of the distance image having three-dimensional position information whose two-dimensional distances on the image and three-dimensional depths are close to one another and whose three-dimensional positions form a plane. The allowed two-dimensional distance on the image is varied according to the three-dimensional depth of the object. That is, when the object is nearby, it is observed as large, so a large distance is allowed when judging proximity on the image, and a sufficient amount of three-dimensional position information is extracted from a wide area to obtain a stable basic region.
[0064]
On the other hand, when the object is far away, it is observed as small, so the distance allowed when judging proximity on the image is reduced so that the basic region is not erroneously fused with other nearby objects. By gathering features over a range that matches the observed size of the object, which changes with the position (distance) of the object, the basic regions that serve as nuclei of object surfaces can be obtained stably regardless of the object position, without the position-dependent loss of detection accuracy that arises when basic regions are extracted at a fixed size.
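The grouping of pixels into basic regions with a depth-dependent image distance can be sketched as a simple region-growing procedure. This is an illustrative simplification: the function name, the `(u, v, z)` point format, and the `radius_for_depth` callback are assumptions, and the check that the gathered points actually form a plane is omitted.

```python
import math
from collections import deque

def extract_basic_regions(points, radius_for_depth, depth_tol):
    """Group points into candidate basic regions by region growing.

    points: list of (u, v, z), image coordinates plus depth.
    radius_for_depth: function mapping depth z to the allowed image distance
    (larger for near points, smaller for far points).
    depth_tol: maximum depth difference between linked points.
    Returns a list of regions, each a sorted list of point indices.
    """
    visited = [False] * len(points)
    regions = []
    for seed in range(len(points)):
        if visited[seed]:
            continue
        visited[seed] = True
        region, queue = [], deque([seed])
        while queue:
            i = queue.popleft()
            region.append(i)
            ui, vi, zi = points[i]
            r = radius_for_depth(zi)          # neighborhood shrinks with depth
            for j, (uj, vj, zj) in enumerate(points):
                if visited[j]:
                    continue
                if abs(zj - zi) <= depth_tol and math.hypot(uj - ui, vj - vi) <= r:
                    visited[j] = True
                    queue.append(j)
        regions.append(sorted(region))
    return regions

# Usage with a hypothetical radius law of 400 / z pixels.
pts = [(0, 0, 5.0), (50, 0, 5.0), (100, 0, 5.2),
       (300, 0, 50.0), (305, 0, 50.5), (320, 0, 50.0)]
regions = extract_basic_regions(pts, lambda z: 400.0 / z, depth_tol=1.0)
```

In this toy input the near points 50 pixels apart are linked into one region, while the far point 15 pixels from its neighbor stays separate.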
[0065]
Furthermore, when the three-dimensional object is far away, the resolution of the distance information is insufficient and fitting the model is difficult. Therefore, when the object is far away, the object position is detected accurately by using, in addition to the three-dimensional position information, object contour lines obtained from a grayscale or color image of the scene captured by a video camera (or TV camera).
[0066]
Conversely, when the object is nearby, plenty of three-dimensional information is available, while the grayscale or color image is dominated by patterns inside the object that are hard to distinguish from contour lines, so the image information is not used. By using image information only when necessary in this way, the object position can be detected without being affected by the distance to the object.
[0067]
(2): Regarding detection of the moving speed of an object, the present invention basically obtains a stable speed by using an object motion model to determine the current moving speed from several past positions. However, when the object cannot be detected and no correspondence is obtained, its position and size information is retained in an unmatched state for a certain number of times. When an unmatched object is detected again, the average moving speed over a certain period is used to examine whether that object can be associated with the retained one without contradiction.
[0068]
If it can, the newly detected unmatched object is taken to be the previously detected object appearing again, and tracking is resumed with the average moving speed as the initial speed. Thus, even if the object position cannot be detected for a certain period, no run of correspondences over a certain period is needed after the position is detected again, so the speed information stabilizes promptly.
[0069]
§3: Explanation by specific example
Hereinafter, the process will be described in detail based on specific examples.
[0070]
(1): Explanation of the entire system
A schematic diagram of the system is shown in FIG. This system includes a distance measuring device 1, a parallax image converter 2 that performs parallax image conversion, an object detector 4 that detects the position and size (shape) of an object, an object speed measuring instrument 5 that measures the moving speed of the object, and a camera 3.
[0071]
The distance measuring device 1 and the camera 3 are existing devices. As the distance measuring device 1, a device capable of acquiring three-dimensional information of a scene including the target object is used, for example, a so-called stereo camera in which two video cameras (or TV cameras) are arranged a predetermined distance apart.
[0072]
The camera 3 captures the scene including the target object and outputs a grayscale (monochrome) image or a color image. When the distance measuring device 1 consists of two cameras, one of them may be used as the camera 3.
[0073]
The function of this system is as follows. First, distance information (three-dimensional position information) is acquired by the distance measuring device 1, and if necessary, converted into a parallax image format having a parallax value in a two-dimensional array (x, y) by a parallax image converter 2. Then, a parallax image is output. Subsequently, the object position is detected by the object detector 4 using the parallax image and the grayscale (or color) image captured by the camera 3.
[0074]
Then, the object speed measuring device 5 measures the moving speed of the object from the time change of the object position obtained at each time. In this way, the position, size (shape), and movement speed information of the object are output as the system output.
[0075]
Hereinafter, the functions and processing of each unit will be described. The distance measuring device 1 may use any method as long as it can acquire three-dimensional information of a scene. Its output may be an image format holding a three-dimensional distance or parallax at each position of a two-dimensional array (x, y), or a set of three-dimensional positions (X, Y, Z).
[0076]
The parallax image converter 2 is provided as necessary. When the output of the distance measuring device 1 is given as a set of three-dimensional positions (X, Y, Z), the converter uses the position (X, Y), the distance (Z), and the virtual camera parameters (f: camera focal length, b: inter-camera distance) to convert the data into a parallax image {parallax value d} at position (x, y) by x = f × X / Z, y = f × Y / Z, d = f × b / Z. When the distance measuring device 1 already outputs a parallax image, the parallax image converter 2 is unnecessary.
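The conversion performed by the parallax image converter 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `points_to_disparity`, the list-of-lists image with `None` for pixels without a parallax value, the image-center offset, and the rule of keeping the largest parallax per pixel are all assumptions added for the example. Only the three formulas come from the text.

```python
def points_to_disparity(points, f, b, width, height):
    """Project 3D points (X, Y, Z) into a parallax image using
    x = f*X/Z, y = f*Y/Z, d = f*b/Z. Pixels with no point stay None."""
    disp = [[None] * width for _ in range(height)]
    # Assumption: the optical axis passes through the image center.
    cx, cy = width / 2.0, height / 2.0
    for X, Y, Z in points:
        if Z <= 0:
            continue  # a point at or behind the camera has no parallax
        x = int(round(f * X / Z + cx))
        y = int(round(f * Y / Z + cy))
        d = f * b / Z
        if 0 <= x < width and 0 <= y < height:
            # Assumption: when several points hit one pixel, keep the
            # nearest surface, i.e. the largest parallax.
            if disp[y][x] is None or d > disp[y][x]:
                disp[y][x] = d
    return disp
```

With f = 100 and b = 0.5, a point at Z = 10 on the optical axis maps to the image center with parallax d = 5.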
[0077]
(2): Explanation of object detector
FIG. 2B shows a configuration diagram of the object detector. The object detector 4 receives a parallax image and a grayscale image (or color image) of the scene captured by the camera 3 and detects the positions of a plurality of objects. It consists of a basic region detection unit 4-1 that detects basic regions and an object surface fitting / object position determining unit 4-2 that performs object surface fitting and object position determination.
[0078]
The function of the object detector 4 is as follows. First, the basic region detection unit 4-1 detects, from the input parallax image, core regions (basic regions R_i) that hold enough information for fitting an object plane. Next, using object shape models prepared in advance, the object surface fitting / object position determination unit 4-2 fits the model object planes to each obtained basic region R_i and determines whether an object exists there. As a result, the list (object information array) OL of detected objects (each having position and size information),
[0079]
[Expression 1]

[0080]
Is obtained.
[0081]
(3): Basic area detector
FIG. 3A shows a configuration diagram of the basic area detection unit. The basic region detection unit 4-1 includes a parallax image thinning unit 11 that performs a thinning process of parallax images, a unit region detection unit 12 that detects a unit region, and a basic region generation unit 13 that generates a basic region.
[0082]
The processing of the basic area detection unit 4-1 is as follows. First, the parallax image thinning unit 11 aggregates the input parallax image in n × m pixel areas, thereby keeping only the parts that have locally stable parallax information and reducing the number of pixels so that the overall processing amount is reduced. When processing time is not a concern, the parallax image thinning unit 11 may be omitted. Here, a pixel of a parallax image that holds a parallax value is referred to as a “parallax point”.
[0083]
Next, for each parallax point position of the thinned-out parallax image, the unit area detection unit 12 computes the size (region) on the parallax image at which a plane of unit size in three dimensions (given by the unit plane model MS) would be observed if it were placed at the position (distance) of that parallax point. From the distribution of the pre-thinning parallax points within that region, it extracts the regions in which a plane is stably formed ("unit regions E_i") and obtains the unit area list EL.
[0084]
Then, the basic area generation unit 13 groups unit regions that can constitute the same plane into basic regions R_i and obtains the basic area list RL.
[0085]
(4): Detailed processing of the basic area detector
Hereinafter, detailed processing of the basic area detection unit will be described.
[0086]
(4) -1: Description of processing of parallax image thinning unit 11
FIG. 4 shows the positional relationship between the thinned-out parallax image and the original parallax image. In FIG. 4, A shows the original parallax image, and B shows the thinned-out parallax image.
[0087]
In the parallax image thinning unit 11, when the original parallax image is N pixels wide and M pixels high, the parallax image is divided into small areas w_i of n horizontal pixels by m vertical pixels (see A in FIG. 4). Each small area w_i is checked against the following condition, and for each small area w_i that satisfies it, the average parallax is used as the pixel value (aggregation), giving a new thinned-out parallax image (see B in FIG. 4) of width N / n and height M / m.
[0088]
Condition: a small area w_i satisfies the condition when (1) or (2) below holds.
[0089]
(1): (number of parallax points in small area w_i) > threshold 1 and (variance of the parallax points in small area w_i) < threshold 2
(2): (number of parallax points in small area w_i / (n × m)) > threshold 3
The threshold values 1, 2, and 3 are determined in advance.
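The thinning rule above can be sketched as follows. This is a hedged illustration only: the function name `thin_disparity`, the parameter names `th_count`, `th_var`, and `th_ratio` (standing in for thresholds 1, 2, and 3), the use of `None` for pixels without a parallax value, and the use of the population variance are assumptions introduced for the example.

```python
from statistics import pvariance

def thin_disparity(disp, n, m, th_count, th_var, th_ratio):
    """Aggregate a parallax image (list of rows, None = no parallax point)
    into blocks of n horizontal x m vertical pixels."""
    rows, cols = len(disp) // m, len(disp[0]) // n
    out = [[None] * cols for _ in range(rows)]
    for by in range(rows):
        for bx in range(cols):
            # Collect the parallax points inside small area w_i.
            vals = [disp[y][x]
                    for y in range(by * m, (by + 1) * m)
                    for x in range(bx * n, (bx + 1) * n)
                    if disp[y][x] is not None]
            if not vals:
                continue
            # Condition (1): enough points with low parallax variance.
            cond1 = len(vals) > th_count and pvariance(vals) < th_var
            # Condition (2): the area is densely filled with points.
            cond2 = len(vals) / (n * m) > th_ratio
            if cond1 or cond2:
                out[by][bx] = sum(vals) / len(vals)  # average parallax
    return out
```

A block that fails both conditions is left empty in the thinned image, so sparse or inconsistent areas do not produce parallax points for the later stages.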
[0090]
(4) -2: Explanation of processing of the unit area detection unit 12
In the unit region detection unit 12, a plane in three-dimensional space with vertical size MS_x and horizontal size MS_y is first defined as the unit plane model MS. Then, for each parallax point (x_i, y_i, d_i), the position (px_i, py_i) and size (nx_i, ny_i) that the unit plane model MS occupies in the original parallax image when placed at that distance (parallax) are obtained from the expressions {px_i = x_i × n, py_i = y_i × m, nx_i = MS_x × d_i / b, ny_i = MS_y × d_i / b}.
[0091]
Here, x_i is the position on the X coordinate, y_i is the position on the Y coordinate, d_i is the parallax value, and b is the distance between the two cameras. For the region of size (nx_i, ny_i) around the position (px_i, py_i) in the original parallax image, the following evaluation value is calculated using the parallax points of the original parallax image contained in it. A region whose evaluation value is equal to or greater than a threshold value is taken as a unit region E_i.
[0092]
[Expression 2]

[0093]
Here, N_j is the number of parallax points contained in the above region, f is the focal length of the camera, and DIFF is the distance (threshold) to be distinguished in three dimensions. As a result of the operation of the unit region detection unit 12, the unit region list EL,
[0094]
[Equation 3]

[0095]
(where N_E is the number of elements of EL) is obtained.
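The window in which a unit region is evaluated follows directly from the expressions in paragraph [0090]. The sketch below only computes that window; the evaluation value itself is not reproduced here. The function name `unit_plane_window` and the argument names are assumptions for illustration.

```python
def unit_plane_window(x_i, y_i, d_i, n, m, ms_x, ms_y, b):
    """Return (px, py, nx, ny): where the unit plane model MS lands in the
    original parallax image for a thinned parallax point (x_i, y_i, d_i).

    n, m       : thinning block size (horizontal, vertical pixels)
    ms_x, ms_y : unit plane model sizes MS_x, MS_y (same length unit as b)
    b          : distance between the two cameras
    """
    px = x_i * n              # position back in original-image coordinates
    py = y_i * m
    nx = ms_x * d_i / b       # window size grows with parallax, so near
    ny = ms_y * d_i / b       # objects get a larger window than far ones
    return px, py, nx, ny
```

For example, with b = 0.5 m, MS_x = 0.4 m, and MS_y = 0.5 m, a parallax of 10 gives a window of 8 × 10 pixels, while a parallax of 1 gives 0.8 × 1 pixels, which illustrates why the window must scale with distance.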
[0096]
(4) -3: Explanation of processing of the basic area generation unit 13
FIG. 5 is a process flowchart of the basic area generation unit. Hereinafter, the processing of the basic area generation unit 13 will be described with reference to FIG. In FIG. 5, S1 to S12 indicate each processing step.
[0097]
A basic area is a collection of unit areas lying on the same plane. Its depth is the average distance of the unit areas belonging to it, and its extent is the rectangular area that encloses all of those unit areas. The basic area is obtained by the process of FIG. 5.
[0098]
First, the basic area generation unit 13 receives the parallax image and the unit area list EL (S1). It then looks in the unit area list EL for a unit area E_i that does not yet belong to any basic area and determines whether such a unit area exists (S2). If no such unit area E_i remains in the unit area list EL, the process ends.
[0099]
If there is a unit area E_i that does not belong to any basic area, one such unit area E_i is taken out and a new basic area R_k is generated from it (S3). Next, the integration flag is lowered (reset) (S4), and another unit area E_j that differs from E_i and does not belong to the basic area R_k is acquired (S5). If such a unit area E_j exists (S6), it is checked whether E_j belongs to the basic area R_k (S7).
[0100]
If the unit area E_j belongs to the basic area R_k, E_j is integrated into R_k (S8) and the integration flag is set (S9). The parameter j is then updated (j = j + 1) (S10), and the process returns to S5 to acquire the next unit area E_j and repeat. When this iteration leaves no unit area E_j to examine (S6), the state of the integration flag is checked (S11).
[0101]
If the integration flag is set, a new integration was performed during the iteration, so unit areas that could not be integrated before may now be integrable. In that case the process returns to S4, and the unit area list EL is checked again for unit areas E_j that can be integrated.
[0102]
If the flag is not set, no unit area E_j remains to be integrated, so the process moves on to the next basic area R_{k+1} and the same processing is repeated. That is, if the integration flag is not set at S11, the parameter i is updated (i = i + 1) (S12), and the process proceeds to S2.
[0103]
As a result,
[0104]
[Expression 4]

[0105]
Is obtained.
[0106]
The condition for determining whether a unit region E_j belongs to the basic region R_k is as follows. Among the unit regions constituting the basic region R_k, the one closest to E_j on the image is denoted E_k. Let L1 be the distance on the image from E_j to E_k and L2 the difference in their parallax. When both are within predetermined thresholds, E_j belongs to R_k.
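The grouping loop of FIG. 5 together with the membership condition above can be sketched as follows. This is an illustrative reading of steps S1 to S12, not the patented implementation: each unit region is reduced to a tuple (x, y, d) of image position and parallax, and the names `belongs`, `build_basic_regions`, `th_l1`, and `th_l2` are assumptions.

```python
import math

def belongs(e_j, region, th_l1, th_l2):
    """E_j joins the region if its nearest member E_k on the image is within
    th_l1 (image distance L1) and th_l2 (parallax difference L2)."""
    e_k = min(region, key=lambda e: math.hypot(e[0] - e_j[0], e[1] - e_j[1]))
    l1 = math.hypot(e_k[0] - e_j[0], e_k[1] - e_j[1])
    l2 = abs(e_k[2] - e_j[2])
    return l1 <= th_l1 and l2 <= th_l2

def build_basic_regions(units, th_l1, th_l2):
    """Group unit regions (x, y, d) into basic regions."""
    unassigned = list(units)
    regions = []
    while unassigned:                     # S2: any unit region left?
        region = [unassigned.pop(0)]      # S3: seed a new basic region R_k
        merged = True
        while merged:                     # S11: rescan while something merged
            merged = False                # S4: reset the integration flag
            for e_j in list(unassigned):  # S5, S6, S10: scan the candidates
                if belongs(e_j, region, th_l1, th_l2):  # S7
                    region.append(e_j)    # S8: integrate E_j into R_k
                    unassigned.remove(e_j)
                    merged = True         # S9: set the integration flag
        regions.append(region)
    return regions
```

The rescan after each successful merge matters because a newly integrated unit region can bring previously distant unit regions within range of the basic region.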
[0107]
(4) -4: Supplementary explanation of the above process
FIG. 6 is an explanatory diagram of processing (part 1), in which A is an image example, B is an explanatory diagram of parallax, and C is an image. FIG. 7 is an explanatory diagram of processing (part 2), in which A is a parallax image on the memory, B is a partial detailed explanatory diagram of the parallax image, and C is a thinned-out parallax image. FIG. 8 is an explanatory diagram of the process (No. 3), in which A is an explanatory diagram of a parallax image, and B is an explanatory diagram when a parallax image is generated.
[0108]
When the parallax image is input to the object detector 4, the parallax image is temporarily stored in the memory and is in a state shown in FIG. In this case, X and Y in A of FIG. 6 indicate coordinate axes, and the shaded portion in the figure is parallax (distance). That is, the parallax image is information including parallax (distance).
[0109]
For example, as shown in FIG. 6B, let L be the image captured by the left camera (solid circle) and R the image captured by the right camera (dotted circle). Superimposing L and R gives an image with parallax: if the shift between the solid circle and the dotted circle when the two images are overlapped is xR, this shift xR is the parallax (distance). From a parallax image of this kind, m × n pixel areas are extracted as shown in FIG. 6C and the thinning process is performed.
[0110]
In this case, the parallax image is stored in the memory as shown in A of FIG. 7. The parallax image is divided into small areas w_i of n horizontal pixels by m vertical pixels (see B in FIG. 7), each small area w_i is checked against the above condition, and for each small area w_i that satisfies it the average parallax is used as the pixel value to obtain the thinned-out parallax image (see C in FIG. 7).
[0111]
Further, when unit planes (m × n images) are overlaid on the original parallax image, the result is the state shown in FIG. A near object appears large and a far object appears small. For example, a unit plane (m × n image) on a near object may have parallax = 10 (for example, distance = 1 m), while a unit plane (m × n image) on a far object may have parallax = 1 (for example, distance = 10 m).
[0112]
In this case, if the reference unit plane model with vertical size MS_x and horizontal size MS_y is MS_x × MS_y = (40 cm × 50 cm), the calculation determines how large a 40 cm × 50 cm plate (object) appears when it is at a distance of 10 m.
[0113]
In this way, the parallax is large for a near object and small for a distant object: a large parallax means a small distance, and a small parallax means a large distance. The parallax is therefore equivalent to distance information.
[0114]
Furthermore, the parallax image can be obtained with a stereo camera composed of two cameras. For example, as shown in FIG. 8B, a first camera L and a second camera R are set a predetermined distance (for example, distance b) apart, an edge image (an outline image of the object) is obtained from the grayscale (black and white) image captured by each camera, and a parallax image is generated by combining the L and R edge images.
[0115]
(5): Explanation of object surface fitting / object position determination unit
(5) -1: Definition
Before the description, the object model is explained. A plurality of object models are prepared according to the shape and size of the objects, and each object model is called a model class C_i. The model classes C_i are collected as
[0116]
[Equation 5]

[0117]
Here, N_c is the number of model classes C_i. Each model class C_i is composed of one or more geometric shapes (here, rectangular parallelepipeds) whose size is adjusted to the object. Because a rectangular parallelepiped is used in this example, at most three of its faces are visible to the observer. For simplicity, this example considers the two faces of the rectangular parallelepiped that are positioned as vertical surfaces.
[0118]
However, even when three or more faces are involved, the treatment is the same if the projected face in each direction is considered. The two faces are classified as follows: the face closer to a plane perpendicular to the observer's line of sight is the "front region S_{f,k}", and the face closer to a plane parallel to the line of sight is the "side region S_{s,k}".
[0119]
(5) -2: Explanation of configuration of object surface fitting / object position determination unit
FIG. 3B shows a configuration diagram of the object surface fitting / object position determining unit. The object surface fitting / object position determination unit 4-2 includes a front surface region fitting unit 14 that performs a front surface region fitting process and a side surface region fitting unit 15 that performs a side surface region fitting process.
[0120]
The object surface fitting / object position determination unit 4-2 receives the parallax image and the basic area list RL (basic regions R_i, i ∈ {1, 2, ..., N_R}) as input. First, the front region fitting unit 14 treats each basic region R_i as a front region and obtains an object region O_{f,i} for which a consistent three-dimensional position distribution is obtained.
[0121]
Next, the side region fitting unit 15 treats each basic region R_i as a side region and obtains the corresponding object region O_{s,i}. By detecting object regions for all combinations in this way, objects can be detected without omission regardless of their position. Details are described below.
[0122]
(5) -3: Front area fitting process (whole process)
FIG. 9 is a process flowchart (part 1) of the front region fitting unit. The process of the front region fitting unit is described below with reference to FIG. 9. S31 to S36 indicate the processing steps.
[0123]
For each basic region R_i, i ∈ {1, 2, ..., N_R}, in the basic region list, the front region fitting unit 14 performs model fitting against every model class C_k, k ∈ {1, 2, ..., N_C}, by the processing shown in FIG. 9 and detects object regions that match the model.
[0124]
First, it is checked whether there are parallax points supporting the surface when the front region S_{f,k} of model class C_k is placed at the three-dimensional position indicated by the basic region R_i. The degree of fit is given as a fitting score SC1 (SC: score).
[0125]
If the degree of fit is good, the side region S_{s,k} that follows from its positional relationship to the front region S_{f,k} is generated, and it is checked whether there are parallax points supporting the side region. The degree of fit is given as a fitting score SC2 (SC: score). If this fit is also good, an object region O_{f,i} of model class C_k is taken to exist at that position, and the evaluation value of this object region is SC = SC1 + SC2. If either degree of fit is poor, no object region is taken to exist.
[0126]
That is, the front region fitting unit 14 receives the parallax image, the basic region R_i, and the model class list CL (S31), and fits the model surface with the basic region R_i regarded as the front region S_{f,k} of model class C_k (S32). The fitting score SC1_{i,k} is obtained at this time (S32).
[0127]
The front region fitting unit 14 then determines whether the front region exists (S33). If it does not, processing ends with no object region and object score SC = 0. If the front region S_{f,k} exists, the side region S_{s,k} of model class C_k is generated and it is checked whether that side region S_{s,k} exists. The fitting score SC2_{i,k} is obtained at this time (S34).
[0128]
The front region fitting unit 14 then determines whether the side region exists (S35). If the side region S_{s,k} does not exist, processing ends with no object region and object score SC = 0. If the side region S_{s,k} exists, the position of model class C_k is taken as the object region O_{i,k}, whose fitting score is SC = SC1_{i,k} + SC2_{i,k} (S36). In this way, the presence of the object region O_{i,k} and its fitting score SC = SC1_{i,k} + SC2_{i,k} are obtained.
[0129]
(5) -4: Front area fitting process (partially detailed process 1)
FIG. 10 is a process flowchart (No. 2) of the front surface area fitting portion, and shows a partial detailed process (detailed processing in S32 of FIG. 9) of the front surface area fitting portion. Hereinafter, based on FIG. 10, a partial detailed process of the front surface area fitting portion will be described. S41 to S48 indicate each processing step.
[0130]
The front region fitting unit 14 receives the basic region R_i, the front region S_{f,k} of the model, and the parallax image (S41), and obtains the projection region h_f in the parallax image when the front region S_{f,k} is placed at the three-dimensional position indicated by the basic region R_i (S42).
[0131]
When the target object is far away, the amount of distance information that can be obtained is small and insufficient, the information shows almost no variation, and it is difficult to specify the object position. For this reason, when the object is farther than a threshold value, object contour information obtained from a grayscale image of the scene captured by a video camera (or TV camera) is used in addition to the three-dimensional distance information to correct the object position, so that the object position is obtained stably regardless of the three-dimensional position (depth) of the object.
[0132]
That is, the front region fitting unit 14 checks whether the distance of the basic region R_i exceeds a threshold value (S43). If it does, the reliability of the distance information is reduced and the object position is hard to determine from distance information alone, so the grayscale image is read and the presence or absence of an object contour is determined on the basis of the projection region h_f (S44).
[0133]
It is then checked whether a contour line exists (S45). If there is no contour line, no object (no front region) is taken to exist. If there is a contour line, it is checked whether the distance information supports the front region S_{f,k} (S46). If, in S43, the distance of the basic region R_i is less than or equal to the threshold, the distance information is reliable, so the contour check is skipped and it is checked directly whether the distance information supports the front region S_{f,k} (S46).
[0134]
In checking whether the distance information supports the front region S_{f,k}, the parallax points in the projection region h_f are fitted to the front region S_{f,k}, and the absolute fit score S_a and the average fit score S_r are calculated (S46).
[0135]
If the average fit score S_r is less than or equal to the threshold (S47), the front region S_{f,k} is taken not to exist. If the average fit score S_r of the distance information fitted to the front region S_{f,k} is larger than the threshold, three-dimensional information that stably supports the object surface can be considered to exist, so the front region S_{f,k} is taken to exist. The absolute score S_a is then used as the fitting score SC1 (S48). This processing yields the information on whether the front region S_{f,k} exists.
[0136]
(5) -5: Front area fitting process (partly detailed process 2)
FIG. 11 is an explanatory diagram of the contour detection process and details the processes of S43 and S44 of FIG. First, the contour search region H_f used for object contour detection is defined, relative to the previously obtained projection region h_f of the object plane, as a region a certain size larger than the projection region h_f.
[0137]
Next, the edges in the grayscale image contained in the contour search region H_f are projected in the vertical direction. The projection is a histogram whose horizontal axis is position (x) and whose vertical axis is luminance. The position of a contour line is determined by detecting peak positions in the histogram whose luminance is equal to or higher than a threshold.
[0138]
If multiple positions are obtained, it is checked whether there is a pair of peaks whose separation is close to the width of the model region (projection region h_f). If such a pair exists, a contour of the object supporting the corresponding model is taken to exist. If no pair of peaks with a matching width exists, no contour supporting the object is taken to exist.
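The contour check described above can be sketched as follows. This is a simplified illustration under stated assumptions: the edge image inside the contour search region H_f is given as a list of rows of edge strengths, a peak is any column whose summed value reaches the threshold and is a local maximum, and the names `has_supporting_contour`, `peak_th`, and `width_tol` are hypothetical.

```python
def has_supporting_contour(edge, model_width, peak_th, width_tol):
    """Return True if the vertical projection of the edge image contains
    two peaks separated by about model_width pixels."""
    # Vertical projection: one accumulated value per x position.
    hist = [sum(col) for col in zip(*edge)]
    last = len(hist) - 1
    peaks = [x for x in range(len(hist))
             if hist[x] >= peak_th
             and (x == 0 or hist[x] >= hist[x - 1])
             and (x == last or hist[x] > hist[x + 1])]
    # Look for a pair of peaks whose spacing matches the model width.
    for i, left in enumerate(peaks):
        for right in peaks[i + 1:]:
            if abs((right - left) - model_width) <= width_tol:
                return True
    return False
```

A single peak, or peaks whose spacing does not match the projected model width, yields no supporting contour, which corresponds to the "no object" outcome of S45.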
[0139]
(5) -6: Front area fitting process (partly detailed process 3)
Hereinafter, the process of S46 of FIG. 10 will be described in detail. This process is a process for determining whether or not the distance information supports the front area, and the details thereof will be described below.
[0140]
First, a parallax point P(x_i, y_i, d_i) in the model region h_f is taken. Letting d_S be the parallax value of the model surface S_{f,k} at the position (x_i, y_i), the following operation is performed.
[0141]
[Formula 6]

[0142]
Here, S_a is the absolute fitting score (the absolute score for the fit of the parallax information), with initial value S_a = 0. This operation is performed for all parallax points in the model region h_f. The average fit score S_r is then obtained as S_r = S_a / MAX(width of h_f, height of h_f). This processing yields the quality of the fit (S_r, S_a) when fitting the object surface.
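The normalization and the decision of S47 and S48 can be sketched as follows. The per-point update of S_a is not reproduced here, so the sketch takes S_a as an input. The function name `front_region_decision` and the choice to return a score of 0 when the region is rejected are assumptions for illustration.

```python
def front_region_decision(s_a, proj_width, proj_height, threshold):
    """Return (exists, sc1) for a front region.

    s_a                     : absolute fit score accumulated over h_f
    proj_width, proj_height : size of the projection region h_f in pixels
    threshold               : minimum average fit score for acceptance
    """
    # Average fit score: S_r = S_a / MAX(width of h_f, height of h_f).
    s_r = s_a / max(proj_width, proj_height)
    if s_r > threshold:
        return True, s_a   # S48: the absolute score becomes SC1
    return False, 0.0      # S47: region rejected (score 0 is an assumption)
```

Dividing by the longer side of h_f makes the acceptance test independent of how large the model appears in the image, while the absolute score S_a still rewards regions supported by more parallax points.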
[0143]
(5) -7: Explanation of side area fitting process (for confirmation)
FIG. 12 is a flowchart of the side region fitting process (for confirmation) and shows the detailed process of S34 of FIG. 9. The process of S34 of FIG. 9 is described in detail below with reference to FIG. 12. S51 to S57 indicate the processing steps.
[0144]
This process is a side region fitting process for confirming the presence of an object. First, the front region S_{f,k}, the model class C_k, and the parallax image are input (S51), and the position of the side region is determined (S52): the position of the side region S_{s,k} is calculated from the position of the front region S_{f,k} found earlier. Two side faces are possible, but only the one visible from the current viewpoint is considered. Then the projection region h_s, the projection of the side region S_{s,k} onto the parallax image, is obtained (S53).
[0145]
Next, the difference between the foremost and rearmost positions of the side region S_{s,k} in the parallax image (the amount of depth change) is examined. If the difference is sufficiently larger than the resolution of the distance information (S54), fitting with distance information is possible and the subsequent processing is performed. If the resolution is insufficient, questioning the existence of the side region is meaningless, so processing ends with the side region check treated as unnecessary.
[0146]
When the depth resolution is sufficient in S54, it is checked, as part of the process of S34, whether there are enough parallax points supporting the object surface of the side region S_{s,k}, and object surface fitting is performed. The result of judging the quality of the fit is obtained as the average fitting score S_r and the absolute fitting score S_a (S55).
[0147]
Next, as with the front region, if the average fit score S_r is greater than or equal to the threshold (S56), the side region S_{s,k} is taken to exist and the absolute fitting score S_a is returned as the fitting evaluation value SC2 (S57). If the average fit score S_r is below the threshold, the side region S_{s,k} is taken not to exist.
[0148]
(5) -8: Explanation of side area fitting part
Hereinafter, the process of the “side surface area fitting portion 15” illustrated in FIG. 3B will be described. In this process, each basic region R_i is first regarded as a side region S_{s,k} and checked. If the side region S_{s,k} exists, it is checked whether a front region S_{f,k} supporting it exists, and if both exist, an object region O_{s,k} is taken to exist.
[0149]
In this sense, the only difference from the front region fitting unit 14 is the order in which the front and side regions are searched, and the content is otherwise substantially the same. The processing is equivalent to that described so far with the front region and the side region interchanged, so a detailed description is omitted.
[0150]
(6): Explanation of object velocity measuring instrument
Hereinafter, the object velocity measuring instrument 5 shown in FIG. 2 will be described. The speed of an object is obtained by associating the object positions obtained at successive times with one another and thereby acquiring the change in the object's position. That is, the speed is obtained from the amount the object moves between times. The entity that carries out this association is the tracker TA_i (i is a subscript distinguishing it from other trackers).
[0151]
Furthermore,
[0152]
[Expression 7]

[0153]
To that end,
[0154]
[Equation 8]

[0155]
The system formula and estimation formula of the Kalman filter are as follows.
[0156]
[Equation 9]

[0157]
[Expression 10]

[0158]
[Expression 11]

[0159]
Further, when the corresponding object region O_i no longer exists in the association between times, the tracker TA_i becomes a lost tracker TL_i. For these lost trackers TL_i, the lost tracker list TLL = {TL_i}, i ∈ {1, 2, ..., N_n} is defined.
[0160]
(7): Explanation of object tracking processing (method of matching between tracker times)
FIG. 13 is a flowchart of the object tracking process. Hereinafter, the object tracking process will be described with reference to FIG. 13. S61 to S69 denote the individual processing steps.
[0161]
First, an object region list OL, a tracker list TAL, and a lost tracker list TLL are input (S61). From the tracker list TAL, an unmatched tracker TA_i, that is, a tracker not yet associated with an object region, is obtained (S62). If no such tracker TA_i exists (S63), the process proceeds to S68 (described later).
[0162]
Next, if such a tracker TA_i exists, an object region O_j that can be matched to the tracker TA_i is selected from the object region list OL (S64). If a matching object region O_j exists (S65), the tracker TA_i is updated using the object region O_j (S66). The process then returns to S62.
[0163]
If no matching object region O_j exists, lost processing is performed for the tracker TA_i (S67), and the process returns to S62. When every active tracker TA_i has been associated with an object region in this way, the process proceeds to S68. At this stage, the object region list still contains new object regions that are not associated with any tracker.
[0164]
Some of these object regions are genuinely new, while others belong to a tracker that lost its object earlier for some reason, so that the same object is now being observed again. Therefore, assuming the latter case first, an object region O_j corresponding to a lost tracker TL_i is searched for (S68).
[0165]
If such an object region O_j exists, the initial speed is determined using the information of the corresponding lost tracker TL_i, and the lost tracker is returned to the active trackers TA_i. Because the initial speed is already determined, the parameters estimated by the Kalman filter converge quickly, and the position and speed information stabilizes in a short time.
[0166]
Next, for the former case, a tracker TA is created for each genuinely new object region, and tracking starts from that point on. Finally, the lost trackers are updated: the lost count of each lost tracker TL_i is incremented, and any lost tracker TL_i that has remained lost for more than a certain number of times is removed (S69).
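The control flow of S61 to S69 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class `Tracker`, the function `track_step`, and the callables `match` and `rematch` are hypothetical names, the tracker update in S66 is a simple placeholder rather than the Kalman filter, and the choice to increment the lost count only for trackers that were already lost before the current pass is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Tracker:
    position: tuple                  # last known (x, z) position
    velocity: tuple = (0.0, 0.0)     # last known (vx, vz) speed
    lost_count: int = 0              # 0 while the tracker is matched


def track_step(regions, trackers, lost, match, rematch, max_lost):
    """Run one pass of the S61-S69 flow; return (active, lost) tracker lists.

    match(tracker, regions) and rematch(tracker, regions) are caller-supplied
    (hypothetical) functions returning one element of `regions` or None.
    """
    unclaimed = list(regions)                        # S61: inputs
    active, newly_lost = [], []
    for tracker in trackers:                         # S62/S63: each unmatched tracker
        region = match(tracker, unclaimed)           # S64/S65: find a matching region
        if region is not None:                       # S66: placeholder update
            unclaimed.remove(region)
            tracker.velocity = (region[0] - tracker.position[0],
                                region[1] - tracker.position[1])
            tracker.position = region
            active.append(tracker)
        else:                                        # S67: lost processing
            tracker.lost_count = 1
            newly_lost.append(tracker)
    still_lost = []
    for tracker in lost:                             # S68: try to resume lost trackers
        region = rematch(tracker, unclaimed)
        if region is not None:
            unclaimed.remove(region)
            tracker.position = region                # stored velocity is kept as initial speed
            tracker.lost_count = 0
            active.append(tracker)
        else:                                        # S69: age and prune lost trackers
            tracker.lost_count += 1
            if tracker.lost_count <= max_lost:
                still_lost.append(tracker)
    # Remaining regions are treated as genuinely new objects.
    active.extend(Tracker(position=region) for region in unclaimed)
    return active, still_lost + newly_lost
```

The matching rules are passed in as functions so that the same loop can be used with any gating criterion for S64 and S68.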
[0167]
The process of S68 will be described in detail. In this process, the position of the lost tracker TL_i at the current time is predicted from its position and speed at the time it lost the object. If this predicted position is sufficiently close to the position of the object region O_j, the match is regarded as free of contradiction. If there is no contradiction, tracking is resumed with the speed of the lost tracker TL_i as the initial value. At this point, the lost tracker TL_i is returned to the tracker list TAL as a tracker TA_i. Because the moving speed of the object is then already known when tracking resumes, tracking can be performed stably.
[0168]
Each process will now be described in more detail. First, in the selection of the matching object region in S64, the object region whose three-dimensional distance from the predicted position P_p (XP_i, ZP_i) of the tracker TA_i is smallest is taken as the matching candidate. If the distance to this candidate is less than or equal to a threshold, the match is accepted; otherwise, no match is possible.
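The selection rule of S64 amounts to a nearest-neighbor search followed by a distance gate. The sketch below assumes that positions are (X, Z) pairs, as in the notation P_p (XP_i, ZP_i); the function name `select_matching_region` and the parameter `max_distance` are hypothetical.

```python
import math


def select_matching_region(predicted, regions, max_distance):
    """Return the index of the region nearest to `predicted`, or None.

    predicted:    (x, z) predicted position of the tracker.
    regions:      list of (x, z) object-region positions.
    max_distance: gating threshold (a value chosen by the caller).
    """
    best_index, best_distance = None, float("inf")
    for index, (x, z) in enumerate(regions):
        distance = math.hypot(x - predicted[0], z - predicted[1])
        if distance < best_distance:
            best_index, best_distance = index, distance
    # Accept the nearest candidate only if it lies within the threshold.
    if best_index is not None and best_distance <= max_distance:
        return best_index
    return None
```

Returning None for both "no regions" and "nearest region too far" lets the caller fall through to the lost processing of S67 in either case.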
[0169]
Next, in the update of the tracker in S66, the three-dimensional position (XO_j, ZO_j) of the matched object region O_j is used as the observed position, the estimated position and speed of the tracker are updated by the above formulas (2) and (4), and the resulting speed is output as the moving speed of the tracked object. Further, using the above formulas (1) and (3), predicted values of the next position and speed are obtained and used for the association at the next time.
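Since formulas (1) to (4) are not reproduced in this text, the following sketch uses the textbook constant-velocity Kalman filter as a stand-in; it is an assumption and may differ from the formulas of the embodiment. The class name `AxisKalman` and the noise values `q_pos`, `q_vel`, and `r` are hypothetical. One filter instance handles one axis, so a tracker would hold one for X and one for Z.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: position, speed)."""
    pos: float = 0.0
    vel: float = 0.0
    p00: float = 100.0    # state covariance entries; large values = uncertain start
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0
    q_pos: float = 1e-4   # process noise on position (assumed value)
    q_vel: float = 1e-4   # process noise on speed (assumed value)
    r: float = 1e-2       # measurement noise on position (assumed value)

    def predict(self, dt=1.0):
        """Prediction: x <- F x, P <- F P F^T + Q, with F = [[1, dt], [0, 1]]."""
        self.pos += self.vel * dt
        a = self.p00 + dt * self.p10
        b = self.p01 + dt * self.p11
        self.p00 = a + dt * b + self.q_pos
        self.p01 = b
        self.p10 = self.p10 + dt * self.p11
        self.p11 = self.p11 + self.q_vel
        return self.pos, self.vel

    def update(self, measured_pos):
        """Estimation: correct position and speed from an observed position."""
        s = self.p00 + self.r                    # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain for position, speed
        residual = measured_pos - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        p00, p01 = self.p00, self.p01
        self.p00 = (1.0 - k0) * p00              # P <- (I - K H) P with H = [1, 0]
        self.p01 = (1.0 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01
        return self.pos, self.vel
```

In each frame, `predict` supplies the position used for matching and `update` is called with the matched observation; the `vel` field after `update` corresponds to the output moving speed.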
[0170]
In the lost processing of the tracker in S67, the position and speed of the tracker TA_i at the time it lost the object are stored as the lost position Pl_i (XL_i, ZL_i) and the lost speed Vl_i (VXL_i, VZL_i), respectively, in preparation for re-matching. In addition, the lost count counter l_ci is set to 1 (l_ci = 1). The tracker TA_i is then removed from the tracker list TAL and added to the lost tracker list TLL.
[0171]
In the process of S68, the position (XO_j, ZO_j) of the new object region O_j and the lost position Pl_i and lost speed Vl_i of each lost tracker TL_i are used to calculate the difference l between the predicted position and the observed position, and it is checked whether the two can be matched without contradiction.
[0172]
[Equation 12]

[0173]
If l is within the threshold (l ≤ threshold), the object region O_j and the lost tracker TL_i are judged to match without contradiction. The three-dimensional position and speed of the lost tracker TL_i are then estimated with the Kalman filter using the position of the object region O_j, the tracker is removed from the lost tracker list TLL, its lost count is set to 0, and it is added back to the (active) tracker list TAL so that tracking starts again.
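The consistency check of S68 can be illustrated as follows. Because the equation for l is not reproduced in this text, the sketch assumes a linear extrapolation of the lost position by the lost speed over the number of lost frames, followed by a Euclidean distance; this form, the function name `can_resume`, and the parameter `frame_interval` are assumptions.

```python
import math


def can_resume(lost_position, lost_velocity, lost_count, observed,
               threshold, frame_interval=1.0):
    """Check whether an observed region is consistent with a lost tracker.

    lost_position: (XL, ZL) stored when the object was lost.
    lost_velocity: (VXL, VZL) stored when the object was lost.
    lost_count:    number of frames the tracker has been lost.
    observed:      (XO, ZO) position of the new object region.
    """
    elapsed = lost_count * frame_interval
    # Assumed prediction: the object kept moving at the lost speed.
    predicted_x = lost_position[0] + lost_velocity[0] * elapsed
    predicted_z = lost_position[1] + lost_velocity[1] * elapsed
    difference = math.hypot(observed[0] - predicted_x, observed[1] - predicted_z)
    return difference <= threshold
```

When the check succeeds, the stored lost speed can serve as the initial speed of the resumed tracker, which is what allows the estimate to settle quickly.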
[0174]
§4: Description of specific device examples and recording media
FIG. 14 shows a specific apparatus example. The three-dimensional object detection / velocity measurement apparatus can be realized using an arbitrary computer such as a personal computer or a workstation. The apparatus comprises a computer main body 20, a display device 31 connected to the computer main body 20, an input device (keyboard, mouse, etc.) 32, a removable disk drive (referred to as "RDD") 33, a magnetic disk device (referred to as "MDD") 34, a distance measuring instrument 1, a camera 3, and the like.
[0175]
The computer main body 20 includes a CPU 21 that performs various internal controls and processes, a ROM 22 (nonvolatile memory) for storing programs and various data, a memory 23, an interface control unit (referred to as "I/F control unit") 24, a communication control unit 25, an input / output control unit (I/O control unit) 26, and the like. The RDD 33 includes a flexible disk drive (floppy disk drive), an optical disk drive, and the like.
[0176]
In the apparatus configured as described above, for example, a program for realizing the processing of the three-dimensional object detection / velocity measurement apparatus is stored in the ROM 22 or on the magnetic disk (recording medium) of the MDD 34, and the CPU 21 reads and executes this program to perform the three-dimensional object detection / velocity measurement process.
[0177]
However, the present invention is not limited to such an example. For example, the program may be stored on the magnetic disk of the MDD 34 in either of the following ways, and the CPU 21 may then perform the connected region extraction process by executing the program.
[0178]
(1): A program (program data) stored on a removable disk created by another device is read by the RDD 33 and stored in the recording medium of the MDD 34.
[0179]
(2): Data such as a program transmitted from another device via a communication line such as a LAN is received via the communication control unit 25, and the data is stored in a recording medium (magnetic disk) of the MDD 34.
[0180]
[Effects of the invention]
The present invention has the following effects.
[0181]
(1): In claim 1, the basic area detecting means groups the three-dimensional position information existing within a certain neighboring range into a basic region. At this time, by setting the size of the neighboring range to an appropriate size that matches the apparent size of the object, which changes with its three-dimensional position, distance information can be gathered without excess or deficiency, and as a result the object position is detected.
[0182]
In this case, in the "object detection process", which fits an object plane model to the distance information (parallax information) and determines the presence or absence of an object from the quality of the fit, the necessary distance information must be gathered from the distance (parallax) image and extracted as object plane candidates. When the distance information within a neighboring range is gathered in this process, the "neighborhood" is set to an appropriate range that matches the observed size of the object, which changes with its three-dimensional position (depth), so that distance information can be used without excess or deficiency.
[0183]
As a result, the problems caused by the conventional method, which uses a fixed size as the "neighborhood", are avoided. These are over-merging, in which separate objects that lie close together at a distance are mistakenly combined into one object, and over-division, in which a single nearby object plane is detected as many small planes that are difficult to integrate back into the original object plane. One object plane can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
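The dependence of the apparent object size on depth can be illustrated with the standard pinhole camera and stereo relations. This is a sketch of the underlying geometry only, and the rule actually used by the basic area detecting means may differ; the function names and the parameters `object_width_m`, `focal_px`, and `baseline_m` are hypothetical.

```python
def neighbourhood_width_px(depth_m, object_width_m, focal_px):
    """Apparent width in pixels of an object of given width at a given depth.

    Pinhole model: width_px = focal_px * object_width_m / depth_m.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * object_width_m / depth_m


def neighbourhood_width_from_disparity(disparity_px, object_width_m, baseline_m):
    """Same quantity expressed with stereo disparity.

    Since disparity_px = focal_px * baseline_m / depth_m, the apparent width
    equals object_width_m * disparity_px / baseline_m.
    """
    if baseline_m <= 0:
        raise ValueError("baseline must be positive")
    return object_width_m * disparity_px / baseline_m
```

For example, with a focal length of 800 pixels, a 1.5 m wide object spans 120 pixels at a depth of 10 m but only 30 pixels at 40 m, which is why a fixed neighborhood is too large for distant objects and too small for nearby ones.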
[0184]
(2): In claim 2, the object plane fitting means switches, depending on the three-dimensional distance of the basic region to be fitted, whether or not to use information obtained from a grayscale or color image of the target scene in addition to the distance information.
[0185]
In this case, regarding the object detection process, when the target object is far away, its observed size is small and too few distance measurements are obtained. Consequently, when the object plane model is fitted to the distance information, the degree of fit hardly changes even if the model position is shifted, and it is difficult to pinpoint the object position.
[0186]
For this reason, when the object is farther away than a threshold, the object position is corrected using object contour information obtained from a grayscale image of the scene captured by a video camera (or television camera), in addition to the three-dimensional distance information. The object position can thus always be obtained stably regardless of the three-dimensional position (depth) of the object.
[0187]
(3): In claim 3, when the position detection means cannot detect the object and, as a result, no correspondence can be found in the time series, the speed detection means retains the object's position in an unmatched state for a certain number of times. When the position detection means detects an object again, the speed detection means checks, with reference to the average moving speed during the unmatched period, whether the object can be matched consistently with the unmatched object. If it can, that is, if the unmatched object has been detected again as the previously detected object, the average moving speed during the unmatched period is given as the initial speed for re-tracking and the object is tracked again, so that speed measurement can be resumed promptly without extra tracking.
[0188]
In this case, regarding the object velocity measurement process, when the object region cannot be detected and no correspondence can be found, the position information is retained in the unmatched state for a certain number of times. When the object is detected again, the average moving speed during the unmatched period is given as the initial speed and the object is tracked again, so that the speed can be measured stably immediately after re-detection.
[0189]
In addition, in the conventional method, in which a re-detected object is tracked as a new object, several more frames of object tracking are needed before speed information can be obtained stably, so there is a time delay until the speed information is output. In the present invention, even if an object temporarily cannot be detected, speed information can be obtained without such a delay.
[0190]
(4): In claim 4, the method comprises a basic region detection process that detects all or part of the surfaces constituting an object surface from the three-dimensional position information as a basic region, which is a partial feature; an object surface fitting process that fits a component surface of an object shape model prepared in advance to the detected basic region of the object surface; and an object region determination process that determines the presence or absence of an object from the quality of the fitting result by the object surface fitting means. In the basic region detection process, the three-dimensional position information existing within a certain neighboring range is grouped into a basic region. At this time, by setting the size of the neighboring range to an appropriate size that matches the apparent size of the object, which changes with its three-dimensional position, distance information can be gathered without excess or deficiency, and as a result the object position is detected.
[0191]
In this way, the problems caused by the conventional method, which uses a fixed size as the "neighborhood", are avoided. These are over-merging, in which separate objects that lie close together at a distance are mistakenly combined into one object, and over-division, in which a single nearby object plane is detected as many small planes that are difficult to integrate back into the original object plane. One object plane can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
[0192]
(5): In claim 5, the three-dimensional object detection device reads and executes the program on the recording medium. When the three-dimensional position information existing within a certain neighboring range is grouped into a basic region, the size of the neighboring range is set to an appropriate size that matches the apparent size of the object, which changes with its three-dimensional position. Distance information can thus be gathered without excess or deficiency, and as a result the object position is detected.
[0193]
In this case, in the "object detection process", which fits an object plane model to the distance information (parallax information) and determines the presence or absence of an object from the quality of the fit, the necessary distance information must be gathered from the distance (parallax) image and extracted as object plane candidates. When the distance information within a neighboring range is gathered in this process, the "neighborhood" is set to an appropriate range that matches the observed size of the object, which changes with its three-dimensional position (depth), so that distance information can be used without excess or deficiency.
[0194]
As a result, the problems caused by the conventional method, which uses a fixed size as the "neighborhood", are avoided. These are over-merging, in which separate objects that lie close together at a distance are mistakenly combined into one object, and over-division, in which a single nearby object plane is detected as many small planes that are difficult to integrate back into the original object plane. One object plane can therefore always be extracted stably as one plane, regardless of the three-dimensional position (depth) of the object.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating the principle of the present invention.
FIG. 2 is a system explanatory diagram according to the embodiment of the present invention.
FIG. 3 is a configuration diagram of a basic area detection unit and an object plane fitting / object position determination unit in the embodiment of the present invention.
FIG. 4 is a diagram showing a positional relationship between a thinned-out parallax image and an original parallax image in the embodiment of the present invention.
FIG. 5 is a process flowchart of a basic area generation unit in the embodiment of the present invention.
FIG. 6 is a process explanatory diagram (part 1) according to the embodiment of the present invention.
FIG. 7 is a process explanatory diagram (part 2) according to the embodiment of the present invention.
FIG. 8 is a process explanatory diagram (part 3) according to the embodiment of the present invention.
FIG. 9 is a process flowchart (part 1) of the front area fitting part in the embodiment of the present invention.
FIG. 10 is a process flowchart (part 2) of the front area fitting part in the embodiment of the present invention.
FIG. 11 is an explanatory diagram of contour detection processing according to the embodiment of the present invention.
FIG. 12 is a flowchart of a side area fitting process (for confirmation) in the embodiment of the present invention.
FIG. 13 is an object tracking processing flowchart according to the embodiment of the present invention.
FIG. 14 is a specific apparatus example according to the embodiment of the present invention.
[Explanation of symbols]
1 Distance measuring instrument
2 Parallax image converter
3 Camera (video camera)
4 Object detector
4-1 Basic area detector
4-2 Object surface fitting / object position determination unit
5 Object velocity measuring instrument
11 Parallax image thinning unit
12 Unit area detector
13 Basic region generator
14 Front area fitting part
15 Side area fitting part
20 Computer body
21 CPU
22 ROM
23 Memory
24 I / F control unit (interface control unit)
25 Communication control unit
26 I / O control unit (input / output control unit)
31 Display device
32 Input device
33 Removable disk drive (RDD)
34 Magnetic disk unit (MDD)

Claims

A three-dimensional object detection device for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information is given in the form of an image having depth information at each location of a two-dimensional array or in the form of a set of three-dimensional position information, the device comprising:
basic area detecting means for detecting all or part of the surface constituting the object surface from the three-dimensional position information as a basic region that is a partial feature;
object surface fitting means for fitting a component surface of an object shape model prepared in advance to the detected basic region of the object surface; and
object region determination means for determining the presence or absence of an object from the quality of the fitting result by the object surface fitting means,
wherein the basic area detecting means groups the three-dimensional position information existing within a certain neighboring range into a basic region and, in doing so, has a function of setting the size of the neighboring range according to the distance that varies with the three-dimensional position of the three-dimensional object.

A three-dimensional object detection device for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information is given in the form of an image having depth information at each location of a two-dimensional array or in the form of a set of three-dimensional position information, the device comprising:
basic area detecting means for detecting all or part of the surface constituting the object surface from the three-dimensional position information as a basic region that is a partial feature;
object surface fitting means for fitting a component surface of an object shape model prepared in advance to the detected basic region of the object surface; and
object region determination means for determining the presence or absence of an object from the quality of the fitting result by the object surface fitting means,
wherein the object surface fitting means has a function of switching, depending on the three-dimensional distance of the basic region to be fitted, whether or not to use information obtained from a grayscale or color image capturing the target scene in addition to the distance information.

The three-dimensional object detection device according to claim 1, wherein the object region determination means has a function of detecting an object position from the quality of the fitting result by the object surface fitting means, obtaining the detected object positions in time series, and calculating speed information for each object by finding correspondences among the obtained object positions in the time series,
the device further comprising speed detection means which, when the object region determination means cannot detect the object and as a result no correspondence can be found in the time series, retains the position in an unmatched state for a certain number of times; which, when the object region determination means detects an object again, checks, with reference to the average moving speed during the unmatched period, whether the object can be matched consistently with the unmatched object; and which, if it can, that is, if the unmatched object has been detected again, gives the average moving speed during the unmatched period as the initial speed for re-tracking and tracks the object again so that speed measurement can be resumed.

A three-dimensional object detection method for detecting the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information is given in the form of an image having depth information at each location of a two-dimensional array or in the form of a set of three-dimensional position information, the method comprising:
a basic region detection process for detecting all or part of the surface constituting the object surface from the three-dimensional position information as a basic region that is a partial feature;
an object surface fitting process for fitting a component surface of an object shape model prepared in advance to the detected basic region of the object surface; and
an object region determination process for determining the presence or absence of an object from the quality of the fitting result by the object surface fitting means,
wherein the basic region detection process groups the three-dimensional position information existing within a certain neighboring range into a basic region and, in doing so, sets the size of the neighboring range according to the distance that varies with the three-dimensional position of the three-dimensional object.

A computer-readable recording medium on which is recorded a program for causing a three-dimensional object detection device, which detects the position and shape of a three-dimensional object that can exist in a three-dimensional space when three-dimensional position information is given in the form of an image having depth information at each location of a two-dimensional array or in the form of a set of three-dimensional position information, to execute:
a first procedure for detecting all or part of a surface constituting an object surface from the three-dimensional position information as a basic region that is a partial feature;
a second procedure for fitting a component surface of an object shape model prepared in advance to the detected basic region of the object surface;
a third procedure for determining the presence or absence of an object from the quality of the fitting result of the second procedure; and
a fourth procedure for setting, when the first procedure groups the three-dimensional position information existing within a certain neighboring range into a basic region, the size of the neighboring range according to the distance that varies with the three-dimensional position of the three-dimensional object.