JP4144300B2

JP4144300B2 - Plane estimation method and object detection apparatus using stereo image

Info

Publication number: JP4144300B2
Application number: JP2002256453A
Authority: JP
Inventors: 知禎相澤
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2002-09-02
Filing date: 2002-09-02
Publication date: 2008-09-03
Anticipated expiration: 2022-09-02
Also published as: JP2004094707A

Description

【０００１】
【発明の属する技術分野】
本発明は、カメラ等の撮像手段によって得られた画像から車輌等の物体を検出する装置に用いられて有効な技術に関し、特に、ステレオ画像から物体が移動する平面の３次元的位置を推定する技術に関するものである。
【０００２】
【従来の技術】
従来より、高速道路や駐車場などの映像をカメラで撮影し、得られた画像から走行車輌の数や種別等を検出する装置が実用化されている。例えば、特許文献１では、２台のカメラで得られたステレオ画像から走行車輌の３次元情報を算出し、マッチング処理によって車輌種別を認識する装置が提案されている。
【０００３】
このように複数のカメラで撮像して得たステレオ画像から物体の３次元情報を求める場合には、あらかじめその物体が移動する移動平面（道路の路面等）の空間位置（３次元的位置）を定めておく必要がある。その平面の空間位置を基準にして、物体の高さ情報を算出するためである。
【０００４】
平面の３次元的位置は、設置されるカメラの平面に対する相対的な位置関係で規定される。しかしながら、一般的には、カメラを設置する際にその設置高さや俯角（カメラの光軸と平面とのなす角）を正確に調整することは難しい。そこで、おおよその位置にカメラを固定した後に、撮像した画像から平面を推定し、カメラと平面の相対的な位置関係を設定する方法が採られることがある。
【０００５】
ステレオ画像を用いた平面推定は、例えば次のようにして行うことができる。すなわち、２台のカメラから得られた画像それぞれについて、画像中の平面物体の特徴点を少なくとも３点抽出する。次に、各画像同士の特徴点の座標を比較・対応付けすることにより基準画像（基準撮像装置となる一方のカメラから得られた画像）上の各特徴点の視差を算出し、三角測量の原理で特徴点各々の実空間上の座標値を求める。そして、最小２乗法あるいはハフ変換などの統計的処理を用いて平面方程式を算出するのである。
【０００６】
ただし、この方法は、特徴点の実空間座標値を用いるため、その座標導出誤差の影響を受けやすいという問題がある。レンズの精度や量子化誤差に起因して、画像中の特徴点の座標はある程度の誤差を含んでいる。この誤差によって実空間座標上で生じる誤差は、図８に示すように、カメラからの距離が遠くなるほど大きくなる傾向にある。したがって、特徴点の画像上の座標値に依存して、実空間座標の導出精度（測距精度）にバラツキが出る。その結果、図９に示すように、平面推定精度が座標導出精度の悪い特徴点の座標導出結果に大きく左右されることとなるのである。
【０００７】
かかる問題を解決するために、画像上に設けた直交２次元座標軸に視差軸を加えた３次元空間（ＳＤＳ：Spatio-Disparity Space）を利用した平面推定方法が提案されている（非特許文献１参照）。この方法は、ＳＤＳ上では、図１０に示すように、基準画像中の特徴点の座標の誤差がカメラからの距離にかかわらず一定となること、さらに、実空間上における平面はＳＤＳ上においても平面になることに着目したものである。
【０００８】
この方法では、基準画像上の各特徴点の視差を算出した後、その特徴点各々についてＳＤＳ上の座標値を求める。そして、ＳＤＳ上にて最小２乗法あるいはハフ変換などの統計的処理を用いて平面方程式を算出する。この方法によれば、各々の特徴点の座標導出精度のバラツキがなく、図１１に示すように、平面推定を安定した精度で行うことができる。
【０００９】
なお、これに類する手法としては、特許文献２に開示された手法も知られている。ここでも、視差軸を加えた３次元空間においてハフ変換による平面推定を行っている。
【００１０】
【特許文献１】
特開平１１−２５９６５８号公報
【特許文献２】
特開平１０−９６６０７号公報
【非特許文献１】
Y.Yang et al., "Local, global, and multilevel stereo matching," Proc. CVPR, pp.274-279, 1993.
【００１１】
【発明が解決しようとする課題】
ところで、上記のような画像処理においては、撮像エリア内を移動する物体の存在が平面推定処理に悪影響を及ぼすことがある。画像中に車輌等の物体が映り込んでいると、特徴点抽出処理において、平面を示す特徴点に加えて当該物体のエッジ部分等も特徴点として抽出される。平面推定処理にあっては、これらの物体を示す特徴点はノイズ情報にしかすぎず、最小２乗法あるいはハフ変換などの統計的処理において外れ値として作用し、平面推定精度を著しく低下させてしまうのである。それゆえ、平面推定を行う際には、撮像エリア内に平面以外の物体が存在しない状況下で画像を撮影するのが理想的である。
【００１２】
しかしながら、現実にはそのような撮影が困難な場合も多い。例えば、走行車輌や通行人がほとんど途切れることなく通過するエリアに装置を設置するケースでは、撮影タイミングを見つけることが難しく、設置作業に長時間を要してしまうことがある。とはいえ、設置作業のために交通規制をすることは多方面に与える影響やコストを考えると現実的とは言い難い。
【００１３】
本発明は上記実情に鑑みてなされたものであって、その目的とするところは、撮像エリア内に平面以外の物体が存在する場合でも、高精度に平面を推定可能な技術を提供することにある。
【００１４】
【課題を解決するための手段】
本発明者は、従来技術が有する上述の課題を解決すべく、鋭意検討を行った。以下にその概要を説明する。
【００１５】
撮像エリア内を移動する物体の存在により平面推定精度が低下するのは、物体を示す特徴点と平面を示す特徴点とが混在する特徴点群に基づいて平面の３次元的位置を算出することに原因がある。したがって、基準画像から抽出された複数の特徴点の中から、平面以外の物体を示す点をあらかじめ除外すればよい。
【００１６】
しかしながら、平面の３次元的位置が未定の段階では、画像上の座標系と実空間座標系との対応をとることができないため、特徴点の画像上の座標値のみから、当該特徴点が平面を示す点か否かを正確に判別することは困難である。
【００１７】
そこで、本発明者は、複数の特徴点の相対的な関係に基づいて、上記判別処理を精度良く行うことができないか、との着想を得た。
【００１８】
その一方で、本発明者による実験の結果、特徴点同士を相対的に比較する際に、実空間上で遠く離れた点同士を比較したとしても、妥当性に欠け、有効な判別処理を行うことができないことが明らかとなった。
【００１９】
さらに検討を行った結果、特徴点の視差が撮像手段からの距離に反比例している事実に着目し、この視差に基づいて対比対象となる特徴点を選択するという手法を想起するに至った。換言すれば、この手法は、特徴点の画像上の座標値だけでなく、特徴点の視差をも考慮することで、特徴点同士の相対的な位置関係を３次元的に比較するものである。
【００２０】
すなわち、上記目的を達成するために、本発明にあっては、物体が移動する平面を俯瞰するように複数の撮像手段を設置する。このとき、各撮像手段を所定の間隔で配置することにより、平面を俯瞰するようなステレオ画像を撮像することができる。なお、撮像手段の数は、２以上であればよい。
【００２１】
ここで「物体が移動する平面」は、撮像エリアの選び方により種々のものが想定できる。たとえば、撮像エリアが車道や歩道等の場合には、「物体」は車輌，歩行者，自転車等であり、「平面」は道路の路面である。また、撮像エリアが踏切や線路等の場合には、「物体」は電車であり、「平面」は線路である。さらには、撮像エリアが工場内の製造現場や物流現場の場合には、「物体」は製造物や搬送物であり、「平面」はベルトコンベア等の搬送装置の搬送面である。
【００２２】
このような撮像エリアを上記複数の撮像手段で撮像し、ステレオ画像を得る。そしてまず、ステレオ画像のうちの基準画像について複数の特徴点を抽出する。「特徴点」とは、画像中の他の領域と比較して明確に区別し得る特徴的な部分をいい、たとえば、物体を示す特徴点としては物体のエッジ部分や模様部分等が該当し、平面を示す特徴点としては、レーンマーキング等の平面上の模様部分や物体が平面上につくる影部分等が該当する。
【００２３】
次に、抽出した各特徴点について、他の画像（参照画像）中の対応点を探索することにより視差を求める。「対応点」とは、基準画像中の特徴点と同一の部分を示す参照画像中の点である。特徴点と対応点は撮像手段からの距離に応じて画像上の座標値が異なる。この差が特徴点の視差となる。
【００２４】
視差を求めた後、各特徴点について、視差および画像上の座標値に基づいて他の特徴点との対比を行うことによって、当該特徴点が物体を示す点か否かを判別する。視差および画像上の座標値を考慮することで、特徴点同士の相対的な位置関係を３次元的に比較することができ、良好な判別処理を行うことができる。
【００２５】
ここで、たとえば、視差に基づき対比すべき他の特徴点を選択するとともに、選択した他の特徴点と対比することによって当該特徴点が物体を示す点か否かを判別することが好適である。視差に基づき対比対象を選択することで、実空間上の位置について関連性の高い特徴点の組を見い出すことができる。そして、このように選択された特徴点同士を対比することによって、それらの相対的な位置関係を高い妥当性をもって比較することができ、当該特徴点が物体を示す点か否かの判別が可能となる。
【００２６】
そして、上記判別処理により物体を示す点と判別された特徴点を除く、残りの特徴点から平面の３次元的位置を算出する。残りの特徴点はすべて平面を示す点である蓋然性が高いので、高精度に平面の推定を行うことが可能となる。
【００２７】
上記平面推定方法において、典型的には、略同じ視差を有する特徴点を対比すべき他の特徴点として選択するとよい。これにより、たとえば物体を示す点とその物体が平面上につくる影を示す点など、より妥当性の高い対比対象を選択することができる。
【００２８】
また、典型的には、特徴点と他の特徴点の画像上の座標値の差が所定の閾値よりも大きい場合に、当該特徴点を物体を示す点と判別するとよい。このとき閾値を特徴点の視差に応じて変化させることが好ましい。
【００２９】
また、典型的には、統計的処理により平面の３次元的位置を算出するとよい。これにより、ロバスト性の高い平面推定を行うことができる。
【００３０】
さらに、画像上の座標軸と視差軸からなる３次元座標系で、前記特徴点が前記物体を示す点か否かを判別する処理、および、前記平面の３次元的位置を算出する処理を行うとよい。これにより、画像中の特徴点の座標の誤差の影響を受けず、平面推定を安定した精度で行うことができる。また、両処理を同一の座標系で行うことにより、座標変換等の余分な処理が不要となり、処理コストを削減することができる。
【００３１】
なお、基準画像中の平面の部分を対象領域として指定し、指定された対象領域から複数の特徴点を抽出することも好ましい。このように平面以外の部分（対象領域以外の部分）からは特徴点を抽出しないことにより、平面推定処理に要する処理コストを削減でき、また、あらかじめノイズ情報が削減されるため、平面推定の精度をより向上させることが可能となる。
【００３２】
以上述べた方法によれば、撮像エリア内に平面以外の物体が存在する場合であっても、高精度に平面を推定することが可能である。
【００３３】
本発明の物体検出装置は、物体が移動する平面を俯瞰するように設置されステレオ画像を撮像する複数の撮像手段、上記平面推定方法を行う平面推定手段、および、この平面推定手段によって推定した平面の３次元的位置を基準として撮像エリア内の平面上を移動する物体を検出する検出手段を備えることを特徴とする。
【００３４】
これにより、撮像エリア内に平面以外の物体が存在する状況下でも、即座に平面推定処理を実施することができ、短時間かつ簡易に設置作業を行うことが可能となる。また、高精度に平面推定を行うことができるので、物体検出の信頼性を向上することもできる。
【００３５】
【発明の実施の形態】
以下に図面を参照して、この発明の好適な実施の形態を例示的に詳しく説明する。
【００３６】
なお、以下の実施の形態に記載されている構成部分の形状、大きさ、その相対配置などは、特に特定的な記載がない限りは、この発明の範囲をそれらのみに限定する趣旨のものではない。
【００３７】
（第１の実施形態）
図１は、本発明の物体検出装置の一実施形態に係る車輌検出装置の設置例を示す。
【００３８】
車輌検出装置１は、道路ＲＤの脇に設置された支柱４に取り付けられており、道路ＲＤの各車道毎の通過車輌の台数や車種の判別、特定車輌の通過速度の計測、渋滞状況の把握、違法駐車中の車輌の検出等を自動的に行う装置である。この車輌検出装置１は、２台の撮像装置２ａ，２ｂと制御装置３とを有して構成される。
【００３９】
撮像装置２ａ，２ｂは、車輌５が移動する道路ＲＤを俯瞰するように設置された撮像手段である。撮像装置２ａ，２ｂとしては、たとえばビデオカメラやＣＣＤカメラなどを用いることができる。
【００４０】
各撮像装置２ａ，２ｂは、焦点距離が同じものを用いる。また、各撮像装置２ａ，２ｂは、互いの光軸が平行になり、かつ、各撮像面が同一平面上に位置するようにして、所定間隔をあけて縦並びに取り付けられている。したがって、撮像装置２ａ，２ｂにより、道路ＲＤを俯瞰するようなステレオ画像を撮像することができる。
【００４１】
なお、同図の例では、２台の撮像装置を用いているが、これに限らず、３台以上の撮像装置を用いてもよい。また、撮像装置の配置は縦並びに限らず、横並びにしてもよい。
【００４２】
制御装置３は、ＣＰＵ（中央演算処理装置），ＲＯＭ（Read Only Memory），ＲＡＭ（Random Access Memory），画像メモリ等を基本ハードウエアとして備える制御手段である。装置稼動時には、ＲＯＭに格納されたプログラムがＣＰＵに読み込まれ実行されることで、後述する各機能が実現される。なお、制御装置３は、保守，点検などの必要から支柱４の基部付近に設置することが好ましい。
【００４３】
図２は、車輌検出装置１の機能構成を示す機能ブロック図である。同図に示すように、制御装置３は、概略、画像入力部３０，平面推定処理部３１，物体検出処理部３２，記憶部３３，出力部３４を有している。
【００４４】
画像入力部３０は、撮像装置２ａ，２ｂから得られる画像信号を制御装置３に入力するための機能である。撮像装置２ａ，２ｂからの入力が動画像の場合には、画像入力部３０によって１フレームの静止画像が取り込まれる。また、前記画像信号がアナログ量の場合には、画像入力部３０によってＡ／Ｄ変換されデジタル画像として取り込まれる。取り込まれた２枚の画像データは各々画像メモリに格納される。なお、ここで取り込まれる画像はカラー画像でもモノクロ画像（濃淡画像）でもよいが、車輌検出が目的であればモノクロ画像で十分である。
【００４５】
平面推定処理部３１は、画像メモリに取り込まれたステレオ画像から、車輌５が移動する平面（道路ＲＤ）の３次元的位置を推定する平面推定手段として機能する。車輌検出装置１が設置された直後は、撮像装置２ａ，２ｂと道路ＲＤの相対的な位置関係が未定の状態であり、物体検出処理を行うことができない。そこで、最初に平面推定処理を実行し、道路ＲＤの３次元的位置（具体的には撮像装置２ａ，２ｂの道路ＲＤに対する高さおよび俯角）を算出するのである。なお、この処理は、車輌検出装置１を設置した際に１回実行すれば足りる。
【００４６】
平面推定処理部３１にて算出された道路ＲＤの３次元的位置のデータ（以下、単に「平面データ」という。）は、記憶部３３に格納される。また、平面推定が正常に行われたかどうかを確認するために、必要に応じて出力部３４から平面データを出力することも可能である。
【００４７】
物体検出処理部３２は、道路ＲＤの３次元的位置を基準として撮像エリア内の道路ＲＤ上を移動する物体（車輌５）を検出する検出手段として機能する。
【００４８】
物体検出処理部３２では、画像メモリに取り込まれたステレオ画像のうち一方の基準画像に対してエッジ強度抽出処理を行って、車輌５の輪郭部分などを示す特徴点を抽出する。各々の特徴点について、他方の参照画像中の類似した濃淡パターンを探索することにより対応点を見つけ、視差を求める。そして、各特徴点について、三角測量の原理に基づき実空間上における３次元座標を算出する。
【００４９】
三角測量の原理について図３を参照して説明する。同図では、説明の簡単のため２次元で示している。
【００５０】
同図の点Ｃ_ａ，Ｃ_ｂはそれぞれ基準撮像装置２ａ，参照撮像装置２ｂを表す。基準撮像装置２ａの設置高さはＨであり、撮像装置２ａ，２ｂの俯角はθ、互いの間隔（ベース長）はＢである。撮像装置２ａ，２ｂの焦点距離をｆとすると、撮像された画像Ｉ_ａ，Ｉ_ｂは、図示のように点Ｃ_ａ，Ｃ_ｂから距離ｆだけ離れた平面として観念することができる。
【００５１】
実空間上の点Ｐは、画像Ｉ_ａ，Ｉ_ｂ中の点ｐ_ａ，ｐ_ｂの位置に現れる。点ｐ_ａが点Ｐを表す特徴点であり、点ｐ_ｂが特徴点ｐ_ａに対応する対応点である。特徴点ｐ_ａの画像Ｉ_ａ中の座標値と対応点ｐ_ｂの画像Ｉ_ｂ中の座標値は異なり、この差（ｄ_ａ＋ｄ_ｂ）が点Ｐの視差ｄとなる。
【００５２】
このとき、撮像装置２ａ，２ｂの撮像面から点Ｐまでの距離Ｌは、
Ｌ＝Ｂｆ／ｄ
により算出できる。そして、俯角θと設置高さＨが既知であれば、距離Ｌから点Ｐの実空間上における３次元座標を算出することができる。これが三角測量の原理である。
【００５３】
ここで、俯角θと設置高さＨは、記憶部３３に格納された平面データによって与えられる。すなわち、物体検出処理では、平面推定処理部３１にて推定された平面（道路ＲＤ）が高さ０であるとしたときの３次元座標が算出されるのである。
【００５４】
このようにして、物体検出処理部３２では、道路ＲＤ上に存在する物体（車輌等）の３次元形状を復元することができる。さらに、あらかじめＲＯＭに車輌のモデルデータを格納しておき、そのモデルデータと復元された３次元形状とのテンプレートマッチング処理を行えば、車輌の数や車種などを判別することも可能となる。
【００５５】
以上のように、物体検出処理は道路ＲＤの平面データが既知であることを前提としたものである。そして、物体検出の精度を高めるためには、高い精度で平面の３次元的位置を推定することが重要となる。
【００５６】
では次に、図４のフローチャートを参照して、上記平面推定処理部３１における平面推定処理について詳しく説明する。
【００５７】
まず、ステップＳ１において、各撮像装置２ａ，２ｂによってステレオ画像を撮像する。撮像装置２ａ，２ｂから取り込まれた画像信号は、画像入力部３０によってデジタルデータに変換される。生成されたデジタル量の濃淡画像データは、撮像装置２ａから取り込まれたものは基準画像として、撮像装置２ｂから取り込まれたものは参照画像として、それぞれ画像メモリに格納される。
【００５８】
ステップＳ２において、平面推定処理部３１は、画像メモリに格納された基準画像からエッジ抽出処理によって複数の特徴点を抽出する。エッジ抽出処理は、ラプラシアンフィルタやソーベルフィルタなどのエッジ抽出フィルタで画像を走査することにより行うことができる。これにより、車輌５の輪郭部分，道路ＲＤのレーンマーキングやひび割れ、道路ＲＤにうつる影部分などが特徴点として抽出される。
【００５９】
次に、平面推定処理部３１は、抽出した各特徴点について参照画像中の対応点を探索する対応付け処理を行い、視差を求める（ステップＳ３）。この対応付け処理は、たとえば特徴点の周囲数近傍の小画像をサンプルパターンとして用意し、このサンプルパターンと類似する濃度パターンを参照画像中から探索することにより行うことができる。そして、各特徴点についてＳＤＳ座標を求め、それらを候補点群としてＲＡＭに格納する。ＳＤＳ座標とは、画像上に設けた直交２次元座標軸に視差軸を加えた３次元座標であり、特徴点の画像上の水平方向の座標値をｘ、垂直方向の座標値をｙ、視差をｄとしたとき、（ｘ，ｙ，ｄ）で表される。
【００６０】
ステップＳ４では、各特徴点について、視差および画像上の座標値に基づいて他の特徴点との対比を行うことによって、当該特徴点が道路ＲＤを示す点か道路ＲＤ上の物体（車輌等）を示す点かを判別する。この判別処理は、特徴点が物体を示す点か否かを判別する判別式により行う。上記判別処理によって物体を示す点と判別された特徴点はＲＡＭ中の候補点群から除去される。
【００６１】
そして、平面推定処理部３１は、物体を示す点と判別された特徴点を除いた残りの特徴点（候補点群）から道路ＲＤの３次元的位置を算出する（ステップＳ５）。平面位置算出処理は、たとえば最小２乗法やハフ変換などの統計的処理により行うことができる。このとき特徴点は３点以上、好ましくは８点以上あるとよい。なお、上記判別処理によって特徴点が３点よりも少なくなった場合には、再びステップＳ１からの処理を繰り返す。
【００６２】
算出結果は、撮像装置２ａ，２ｂの俯角θおよび設置高さＨの形式で記憶部３３に格納されるとともに、確認のため出力部３４から出力される（ステップＳ６）。
【００６３】
上記ステップＳ４の判別処理で用いる判別式は、たとえば、次のように決定される。
【００６４】
２台の撮像装置２ａ，２ｂの間隔（ベース長）をＢ、撮像装置２ａ，２ｂのレンズの焦点距離をｆとする。基準画像上に設けた直交２次元座標系（ｘ，ｙ）に対して、実空間上において、カメラ座標系（Ｘ′，Ｙ′，Ｚ′）と地面座標系（Ｘ，Ｙ，Ｚ）を図５のように定める。ここで地面座標系とは、道路面がＸ−Ｚ平面となり、かつ、Ｘ軸と基準画像上のｘ軸とが平行になるように定めた座標系である。また、図５において、Ｈは撮像装置２ａの設置高さを、θは撮像装置２ａの俯角（光軸と道路面とのなす角）を表している。
【００６５】
このとき、視差をｄとして、式（１）が成り立つ。
【数１】

【００６６】
また、カメラ座標系（Ｘ′，Ｙ′，Ｚ′）と地面座標系（Ｘ，Ｙ，Ｚ）との関係が式（２）で定義される。
【数２】

【００６７】
さらに、道路面は式（３）を満たす平面とみなせる。
【数３】

【００６８】
一方、道路ＲＤ上の車輌５は、Ｙ＞０を満たす点の集合体とみなせる。また、車輌５の幅は道路幅より狭く、レーンマーキングなど車輌５の周囲の少なくとも一部の道路面上の特徴も併せて撮像されている蓋然性が高いことを考慮すれば、ある車輌を示す特徴点Ｐ_Ｖ（Ｘ_Ｖ，Ｙ_Ｖ，Ｚ_Ｖ）について、式（４），（５）を満たすような道路を示す特徴点Ｐ_Ｒ（Ｘ_Ｒ，Ｙ_Ｒ，Ｚ_Ｒ）が存在する。
【００６９】
なお、ΔＹ_ｔｈは、ΔＹ_ｔｈ＞０であるような所定の閾値である。ΔＹ_ｔｈの値としては、車輌のボンネットの高さ（約６０ｃｍ）を考慮して、その半分の３０ｃｍ程度に定めればよい。
【数４】

【００７０】
ここで、特徴点Ｐ_Ｖ，Ｐ_Ｒについて、ＳＤＳにおいて対応する特徴点座標を各々ｐ_Ｖ（ｘ_Ｖ，ｙ_Ｖ，ｚ_Ｖ），ｐ_Ｒ（ｘ_Ｒ，ｙ_Ｒ，ｚ_Ｒ）とすると、式（１），（２）より、
【数５】

【００７１】
式（５）よりＺは定数としてよいから、両辺をｙで偏微分して近似式（６）が得られる。
【数６】

【００７２】
一方、式（１），（２），（３）より、
【数７】

【００７３】
したがって、ＳＤＳにおいて、道路面上の２つの特徴点ｐ_Ｒ１（ｘ_Ｒ１，ｙ_Ｒ１，ｄ_Ｒ１）およびｐ_Ｒ２（ｘ_Ｒ２，ｙ_Ｒ２，ｄ_Ｒ２）について、近似式（７）が得られる。
【数８】

【００７４】
車輌検出装置１の実際の設置場面を想定して、たとえば、Ｂは数十ｃｍ、Ｈは数ｍ、Ｚは数十ｍのオーダー、θは約２０°とすると、式（６），（７）より２つの近似式（８），（９）が得られる。
【数９】

【００７５】
式（８）より式（１０）が得られる。
【数１０】

【００７６】
さらに、式（１），（２），（４）より、
【数１１】

【００７７】
式（１０）より、ｄ_Ｒ＝ｄ_Ｖゆえ、
【数１２】

【００７８】
θ≒２０°のとき、cosθ≒１ゆえ、次式が得られる。
【数１３】

【００７９】
以上より次のことが成り立つ。すなわち、もし、
【数１４】

ならば、式（９）より、
【数１５】

【００８０】
したがって、ＳＤＳにおける２つの特徴点ｐ_１（ｘ_１，ｙ_１，ｄ_１）およびｐ_２（ｘ_２，ｙ_２，ｄ_２）について、次の２式が満たされるとき、前記２つの特徴点のうち少なくとも特徴点ｐ_１は物体を示す特徴点であるといえる。
【数１６】

【００８１】
本実施形態では、式（１４），（１５）を判別式として採用する。
【００８２】
この判別式の特徴を以下にまとめる。
【００８３】
・判別式（１４），（１５）は、視差ｄと画像上の座標値ｙから構成されており、２つの特徴点の視差ｄおよび座標値ｙを対比するものである。すなわち、この式を用いた判別処理は、ある特徴点について、視差および画像上の座標値に基づいて他の特徴点との対比を行うことによって、当該特徴点が物体を示す点か否かを判別する処理となる。このように視差および画像上の座標値を考慮することで、特徴点同士の相対的な位置関係を３次元的に比較することができ、良好な判別処理を行うことができる。
【００８４】
・判別式（１４）は、特徴点の視差に基づき対比すべき他の特徴点を選択する処理に相当する。これにより実空間上の位置について関連性の高い特徴点の組を見い出すことができる。そして、このように選択された特徴点同士を判別式（１５）により対比することによって、それらの相対的な位置関係を高い妥当性をもって比較することができる。また、判別式（１４）によって対比対象を絞り込むことで処理コストを削減することができる。
【００８５】
・判別式（１４）では、略同じ視差を有する特徴点を対比すべき他の特徴点として選択する。これにより、たとえば物体を示す点とその物体が平面上につくる影を示す点など、より妥当性の高い対比対象を選択することができる。
【００８６】
・判別式（１５）は、特徴点と他の特徴点の画像上の座標値の差（ｙ_１−ｙ_２）が所定の閾値（ｄ_１Δｙ_ｔｈ）よりも大きい場合に、当該特徴点を物体を示す点と判別する。略同じ視差を有する特徴点同士の関係では、実空間上の高さの比較を、画像上の座標値ｙの比較に近似的に置き換えできることに着目したものである。
【００８７】
・判別式（１５）の閾値（ｄ_１Δｙ_ｔｈ）は、特徴点の視差ｄ_１に応じて変化する。撮像面からの距離が遠い物体ほど、画像上は小さく映る。つまり、特徴点同士の座標値の差は、実空間上で一定であっても、画像上では撮像面からの距離が遠くなるほど小さくなってしまう。したがって、閾値を固定値にするのは妥当でない。判別式（１５）では、視差に応じて閾値も変化させることで、画像全体にわたり良好な判別処理を行うことが可能となる。
【００８８】
・判別式（１４），（１５）は、ＳＤＳ座標における座標値（ｘ，ｙ，ｄ）に関する式である。また、上述したように平面位置算出処理もＳＤＳ座標において行われる。すなわち、画像上の座標軸と視差軸からなるＳＤＳ座標系で、判別処理および平面位置算出処理が行われるので、座標変換等の余分な処理が不要となり、処理コストを削減することができる。
【００８９】
以上述べたように、本実施形態の平面推定方法によれば、撮像エリア内の道路上に車輌等が存在する場合であっても、車輌等を示す特徴点を判別し除外することができるので、高精度に路面の３次元的位置を推定することが可能となる。
【００９０】
したがって、車輌検出装置を設置する際に、交通規制等を行うまでもなく、即座に平面推定処理を実施することができ、短時間かつ簡易に設置作業を行うことが可能となる。
【００９１】
また、高精度に平面推定を行うことができるので、物体検出の信頼性を向上することもできる。
【００９２】
（第２の実施形態）
図６は、本発明の第２の実施形態に係る車輌検出装置の機能構成を示す機能ブロック図である。
【００９３】
本実施形態の車輌検出装置は、領域指定部３５を有している。その他の構成および作用については第１の実施形態と同一なので、同一の構成部分については同一の符号を付して、その説明は省略する。
【００９４】
領域指定部３５は、基準画像中の平面の部分を特徴点抽出のための対象領域として指定する手段である。本実施形態では、モバイルコンピュータやＰＤＡなど、ディスプレイを備えた携帯端末装置により領域指定部３５を実現している。領域指定を行う際には、この携帯端末装置を制御装置３の通信Ｉ／Ｆに接続し、制御装置３と協働させることによって以下の処理を実行する。なお、領域指定部３５を制御装置３の内部機能としてもよい。
【００９５】
平面推定処理のために撮像装置２ａ，２ｂによって撮像された画像は、画像入力部３０によって取り込まれ画像メモリに格納される。すると、画像メモリ内の基準画像が領域指定部３５に送信され、ディスプレイに表示される。
【００９６】
ユーザ（設置作業者）は、ディスプレイに表示された基準画像を参照しつつ、画像中の道路の部分を範囲指定する。図７にその操作画面の一例を示す。ここでは、道路と道路外の領域との境界を示す左右２本のライン（破線）を画像上で指定することにより、対象領域（斜線部分）の指定を行っている。
【００９７】
指定された対象領域の情報は、通信Ｉ／Ｆを介して制御装置３のＲＡＭに格納される。
【００９８】
平面推定処理部３１は、基準画像からエッジ抽出処理によって特徴点を抽出する際に、ＲＡＭから指定対象領域の情報を読み込み、この領域内のみから特徴点を抽出する。
【００９９】
このように道路以外の部分（対象領域以外の部分）からは特徴点を抽出しないことにより、平面推定処理に要する処理コストを削減できる。また、あらかじめノイズ情報が削減されるため、平面推定の精度をより向上させることが可能となる。
【０１００】
【発明の効果】
以上説明したように、本発明の平面推定方法によれば、撮像エリア内に平面以外の物体が存在する場合であっても、物体を示す特徴点を判別し除外することができるので、高精度に平面の３次元的位置を推定することが可能となる。
【０１０１】
また、本発明の物体検出装置によれば、撮像エリア内に平面以外の物体が存在する状況下でも、物体を示す特徴点を判別し除外することで、即座に平面推定処理を実施することができ、短時間かつ簡易に設置作業を行うことが可能となる。
【０１０２】
また、高精度に平面推定を行うことができるので、物体検出の信頼性を向上することもできる。
【図面の簡単な説明】
【図１】本発明の物体検出装置の一実施形態に係る車輌検出装置の設置例を示す図である。
【図２】第１の実施形態に係る車輌検出装置の機能構成を示す機能ブロック図である。
【図３】三角測量の原理を説明する図である。
【図４】平面推定処理のフローチャートである。
【図５】基準画像上に設けた直交２次元座標系と、実空間上におけるカメラ座標系および地面座標系の関係を示す図である。
【図６】第２の実施形態に係る車輌検出装置の機能構成を示す機能ブロック図である。
【図７】領域指定処理の操作画面の一例である。
【図８】実空間座標における特徴点の座標導出精度を説明する図である。
【図９】実空間座標における平面推定精度を説明する図である。
【図１０】ＳＤＳ座標における特徴点の座標導出精度を説明する図である。
【図１１】ＳＤＳ座標における平面推定精度を説明する図である。
【符号の説明】
１車輌検出装置（物体検出装置）
２ａ，２ｂ撮像装置（撮像手段）
３制御装置
３０画像入力部
３１平面推定処理部（平面推定手段）
３２物体検出処理部（検出手段）
３３記憶部
３４出力部
３５領域指定部
４支柱
５車輌（物体）
ＲＤ道路（平面）
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a technique that is effective when used in an apparatus that detects an object such as a vehicle from an image obtained by imaging means such as a camera, and in particular to a technique for estimating, from a stereo image, the three-dimensional position of the plane on which the object moves.
[0002]
[Prior art]
Apparatuses that capture video of an expressway, a parking lot, or the like with a camera and detect the number, type, and so on of traveling vehicles from the obtained images have been in practical use. For example, Patent Document 1 proposes an apparatus that calculates three-dimensional information of a traveling vehicle from stereo images obtained by two cameras and recognizes the vehicle type by matching processing.
[0003]
When three-dimensional information of an object is obtained in this way from stereo images captured by a plurality of cameras, the spatial position (three-dimensional position) of the plane on which the object moves (such as the road surface) must be determined in advance. This is because the height information of the object is calculated with the spatial position of that plane as the reference.
[0004]
The three-dimensional position of the plane is defined by the position of the installed camera relative to the plane. In general, however, it is difficult to adjust the installation height and the depression angle (the angle formed by the optical axis of the camera and the plane) accurately when installing a camera. Therefore, a method is sometimes adopted in which the camera is first fixed at an approximate position, the plane is then estimated from a captured image, and the relative positional relationship between the camera and the plane is set accordingly.
[0005]
Plane estimation using a stereo image can be performed, for example, as follows. For each of the images obtained from the two cameras, at least three feature points of a planar object in the image are extracted. Next, the parallax of each feature point on the base image (the image obtained from the camera that serves as the base imaging device) is calculated by comparing and associating the coordinates of the feature points between the images, and the real-space coordinates of each feature point are obtained by the principle of triangulation. A plane equation is then calculated using statistical processing such as the least squares method or the Hough transform.
[0006]
However, since this method uses the real space coordinate value of the feature point, there is a problem that it is easily influenced by the coordinate derivation error. Due to the accuracy of the lens and the quantization error, the coordinates of the feature points in the image include a certain amount of error. As shown in FIG. 8, the error generated on the real space coordinates due to this error tends to increase as the distance from the camera increases. Therefore, the derivation accuracy (ranging accuracy) of the real space coordinates varies depending on the coordinate value of the feature point on the image. As a result, as shown in FIG. 9, the plane estimation accuracy greatly depends on the coordinate derivation result of the feature point with poor coordinate derivation accuracy.
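The growth of the error with distance can be illustrated numerically. The following is a minimal sketch, assuming the parallel-stereo relation L = Bf/d that this description uses for triangulation (distance L, base length B, focal length f, parallax d). The function names and the numeric values for B and f are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(baseline_m, focal_px, disparity_px):
    """Distance from the imaging surface, L = B * f / d."""
    return baseline_m * focal_px / disparity_px


def depth_error(baseline_m, focal_px, disparity_px, disparity_err_px=1.0):
    """Change in L when the measured parallax is too small by disparity_err_px."""
    true_l = depth_from_disparity(baseline_m, focal_px, disparity_px)
    wrong_l = depth_from_disparity(
        baseline_m, focal_px, disparity_px - disparity_err_px
    )
    return wrong_l - true_l


if __name__ == "__main__":
    B, F = 0.3, 1000.0  # hypothetical base length (m) and focal length (pixels)
    for d in (60.0, 30.0, 15.0, 6.0):
        dist = depth_from_disparity(B, F, d)
        err = depth_error(B, F, d)
        print(f"d={d:5.1f} px  L={dist:6.2f} m  error for 1 px: {err:6.2f} m")
```

The same one-pixel parallax error produces a much larger distance error for a far point (small d) than for a near point (large d), which is the behavior the paragraph above attributes to real-space coordinates.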
[0007]
To solve this problem, a plane estimation method has been proposed that uses a three-dimensional space (SDS: Spatio-Disparity Space) obtained by adding a parallax axis to the orthogonal two-dimensional coordinate axes provided on the image (see Non-Patent Document 1). This method exploits two facts: as shown in FIG. 10, the error of the coordinates of a feature point in the base image is constant in the SDS regardless of the distance from the camera, and a plane in real space is also a plane in the SDS.
[0008]
In this method, after the parallax of each feature point on the base image is calculated, the coordinate values in the SDS are obtained for each feature point. A plane equation is then calculated in the SDS using statistical processing such as the least squares method or the Hough transform. With this method, the coordinate derivation accuracy does not vary from one feature point to another, and plane estimation can be performed with stable accuracy, as shown in FIG. 11.
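As an illustration of the least-squares variant, the sketch below fits a plane to feature points given as SDS coordinates (x, y, d). It assumes the plane can be written as d = a·x + b·y + c. That parameterization, the function names, and the pure-Python solver are choices made for this example and are not taken from the cited documents.

```python
def fit_plane_sds(points):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) triples."""
    if len(points) < 3:
        raise ValueError("at least three feature points are required")
    # Accumulate the normal equations (A^T A) p = A^T d for rows A = [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atd = [0.0] * 3
    for x, y, d in points:
        row = (float(x), float(y), 1.0)
        for i in range(3):
            atd[i] += row[i] * d
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return _solve3(ata, atd)


def _solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("feature points are degenerate (e.g. collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    sol = [0.0] * 3
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / a[r][r]
    return sol  # [a, b, c]
```

Because the fit minimizes the residual along the parallax axis, every feature point contributes with the same weight regardless of its distance from the camera. A plain least-squares fit is still sensitive to outliers, which is the problem the invention addresses.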
[0009]
As a method similar to this, the method disclosed in Patent Document 2 is also known. Also here, plane estimation is performed by Hough transform in a three-dimensional space to which a parallax axis is added.
[0010]
[Patent Document 1]
Japanese Patent Laid-Open No. 11-259658
[Patent Document 2]
JP-A-10-96607
[Non-Patent Document 1]
Y. Yang et al., "Local, global, and multilevel stereo matching," Proc. CVPR, pp.274-279, 1993.
[0011]
[Problems to be solved by the invention]
In image processing of the kind described above, the presence of an object moving within the imaging area can adversely affect the plane estimation processing. When an object such as a vehicle appears in the image, the feature point extraction process extracts edge portions and the like of that object as feature points in addition to the feature points indicating the plane. For plane estimation, the feature points indicating such objects are nothing more than noise: they act as outliers in statistical processing such as the least squares method or the Hough transform and significantly reduce the plane estimation accuracy. Ideally, therefore, the image used for plane estimation is captured when no object other than the plane is present in the imaging area.
[0012]
However, in reality, there are many cases where such shooting is difficult. For example, in a case where the apparatus is installed in an area where a traveling vehicle or a passerby passes almost without interruption, it is difficult to find the shooting timing, and the installation work may take a long time. Nonetheless, traffic regulation for installation work is not practical considering the impact and costs on many fronts.
[0013]
The present invention has been made in view of the above circumstances, and its object is to provide a technique capable of estimating a plane with high accuracy even when an object other than the plane exists in the imaging area.
[0014]
[Means for Solving the Problems]
The present inventor has intensively studied to solve the above-described problems of the prior art. The outline will be described below.
[0015]
The plane estimation accuracy decreases in the presence of an object moving within the imaging area because the three-dimensional position of the plane is calculated from a group of feature points in which points indicating the object and points indicating the plane are mixed. It therefore suffices to exclude, in advance, the points indicating objects other than the plane from the plurality of feature points extracted from the base image.
[0016]
However, while the three-dimensional position of the plane is still undetermined, no correspondence can be established between the image coordinate system and the real-space coordinate system. It is therefore difficult to determine accurately, from the image coordinates of a feature point alone, whether or not that feature point indicates the plane.
[0017]
Therefore, the present inventor has come up with the idea that the discrimination processing can be performed with high accuracy based on the relative relationship between a plurality of feature points.
[0018]
On the other hand, experiments by the present inventor showed that, when feature points are compared relative to one another, comparing points that are far apart in real space lacks validity and does not yield an effective discrimination.
[0019]
After further study, the inventor focused on the fact that the parallax of a feature point is inversely proportional to its distance from the imaging means, and arrived at a method of selecting the feature points to be compared on the basis of this parallax. In other words, this method compares the relative positional relationship between feature points three-dimensionally by considering not only the image coordinates of the feature points but also their parallax.
[0020]
That is, in order to achieve the above object, in the present invention, a plurality of imaging means are installed so as to overlook a plane on which an object moves. At this time, by arranging the imaging units at a predetermined interval, it is possible to capture a stereo image that looks down on the plane. Note that the number of imaging means may be two or more.
[0021]
Here, various “planes on which the object moves” can be assumed depending on how the imaging area is selected. For example, when the imaging area is a roadway or a sidewalk, the “object” is a vehicle, a pedestrian, a bicycle, etc., and the “plane” is the road surface of the road. When the imaging area is a railroad crossing or a railroad track, the “object” is a train and the “plane” is a railroad track. Furthermore, when the imaging area is a manufacturing site or a distribution site in a factory, the “object” is a product or a conveyed product, and the “plane” is a conveying surface of a conveying device such as a belt conveyor.
[0022]
Such an imaging area is imaged by the plurality of imaging means to obtain a stereo image. First, a plurality of feature points are extracted from the base image of the stereo images. A “feature point” is a characteristic part that can be clearly distinguished from other regions in the image. For example, feature points indicating an object include edge portions and pattern portions of the object, and feature points indicating the plane include pattern portions on the plane, such as lane markings, and shadows cast by objects on the plane.
[0023]
Next, for each extracted feature point, parallax is obtained by searching for a corresponding point in another image (reference image). The “corresponding point” is a point in the reference image that shows the same part as the feature point in the base image. The feature point and the corresponding point have different coordinate values on the image according to the distance from the imaging means. This difference is the feature point parallax.
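The corresponding-point search can be sketched as simple block matching. The example below is an illustration under stated assumptions, not the method of the embodiment: it uses the sum of absolute differences (SAD) as the similarity measure, assumes rectified images, and searches along the column axis. For cameras mounted one above the other, the search would run along the row axis instead. The function names, window size, and search range are hypothetical.

```python
def sad(base, ref, r0, c0, r1, c1, half):
    """Sum of absolute differences between two (2*half+1)-square windows."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(base[r0 + dr][c0 + dc] - ref[r1 + dr][c1 + dc])
    return total


def find_disparity(base, ref, row, col, half=1, max_disp=16):
    """Return the shift d >= 0 that minimizes SAD between the window around
    base[row][col] and the window around ref[row][col - d].

    The caller must choose (row, col) so that the window lies inside the
    base image.
    """
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        c1 = col - d
        if c1 - half < 0:
            break  # window would leave the reference image
        cost = sad(base, ref, row, col, row, c1, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

The returned shift is the parallax of the feature point in pixels. Practical systems usually add sub-pixel refinement and a check that the best match is clearly better than the runner-up. Both are omitted here for brevity.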
[0024]
After obtaining the parallax, each feature point is compared with other feature points based on the parallax and the coordinate values on the image to determine whether the feature point is a point indicating an object. By considering the parallax and the coordinate values on the image, the relative positional relationship between the feature points can be compared three-dimensionally, and a good discrimination process can be performed.
[0025]
Here, for example, it is preferable to select, on the basis of parallax, the other feature points to be compared, and to determine whether or not the feature point indicates an object by comparing it with the selected feature points. Selecting the comparison targets on the basis of parallax makes it possible to find sets of feature points whose real-space positions are closely related. Comparing the feature points selected in this way allows their relative positional relationship to be compared with high validity, which makes it possible to determine whether or not the feature point indicates an object.
[0026]
Then, the three-dimensional position of the plane is calculated from the remaining feature points excluding the feature points determined as the points indicating the object by the determination process. Since the remaining feature points are highly likely to be plane points, the plane can be estimated with high accuracy.
[0027]
In the plane estimation method, typically, feature points having substantially the same parallax may be selected as other feature points to be compared. Thereby, for example, it is possible to select a more appropriate comparison target such as a point indicating an object and a point indicating a shadow that the object creates on a plane.
[0028]
Also, typically, when the difference between the coordinate values of the feature point and other feature points on the image is larger than a predetermined threshold, the feature point may be determined as a point indicating an object. At this time, it is preferable to change the threshold according to the parallax of the feature points.
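The selection and threshold test described in the two preceding paragraphs can be sketched as follows. The form of the test follows the description of the embodiment, in which the difference (y_1 − y_2) between two feature points of substantially the same parallax is compared with a threshold d_1·Δy_th that scales with the parallax. The tolerance used to decide that two parallaxes are "substantially the same", the value of the threshold coefficient, and the function name are assumptions made for this example.

```python
def remove_object_points(points, dy_th, d_tol=0.5):
    """Return the feature points that are not judged to indicate an object.

    points: list of (x, y, d) with image coordinates x, y and parallax d.
    dy_th:  threshold coefficient; the effective threshold is d * dy_th.
    d_tol:  hypothetical tolerance for "substantially the same" parallax.
    """
    kept = []
    for i, (x1, y1, d1) in enumerate(points):
        is_object = False
        for j, (_x2, y2, d2) in enumerate(points):
            if i == j or abs(d1 - d2) > d_tol:
                continue  # compare only with points of about the same parallax
            if y1 - y2 > d1 * dy_th:
                is_object = True  # p1 lies well above p2 at a similar distance
                break
        if not is_object:
            kept.append((x1, y1, d1))
    return kept
```

Whether a larger y means "higher above the plane" depends on the orientation of the image y axis. The sketch keeps the sign used in the description and would need the comparison reversed for the opposite convention. The remaining points can then be passed to the plane calculation.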
[0029]
Typically, the three-dimensional position of the plane is calculated by statistical processing. Thereby, plane estimation with high robustness can be performed.
[0030]
Furthermore, the process of determining whether or not a feature point indicates the object and the process of calculating the three-dimensional position of the plane are preferably performed in a three-dimensional coordinate system composed of the image coordinate axes and a parallax axis. Plane estimation can thereby be performed with stable accuracy without being affected by errors in the coordinates of the feature points in the image. In addition, performing both processes in the same coordinate system makes extra processing such as coordinate conversion unnecessary and reduces the processing cost.
[0031]
It is also preferable to designate the plane portion of the base image as a target area and to extract the plurality of feature points from the designated target area. Because feature points are not extracted from portions other than the plane (portions outside the target area), the processing cost of the plane estimation process can be reduced, and because noise information is reduced in advance, the accuracy of plane estimation can be further improved.
[0032]
According to the method described above, even when an object other than a plane exists in the imaging area, the plane can be estimated with high accuracy.
[0033]
An object detection apparatus according to the present invention comprises: a plurality of imaging means that are installed so as to overlook a plane on which an object moves and that capture a stereo image; plane estimation means for performing the above plane estimation method; and detection means for detecting an object moving on the plane in the imaging area with reference to the three-dimensional position of the plane estimated by the plane estimation means.
[0034]
Thereby, even in a situation where an object other than a plane exists in the imaging area, the plane estimation process can be performed immediately, and the installation work can be performed in a short time and easily. In addition, since plane estimation can be performed with high accuracy, the reliability of object detection can be improved.
[0035]
DETAILED DESCRIPTION OF THE INVENTION
Exemplary embodiments of the present invention will be described in detail below with reference to the drawings.
[0036]
Note that the shapes, sizes, relative arrangement, and the like of the components described in the following embodiments are not intended to limit the scope of the present invention to those alone, unless otherwise specified.
[0037]
(First embodiment)
FIG. 1 shows an installation example of a vehicle detection device according to an embodiment of the object detection device of the present invention.
[0038]
The vehicle detection device 1 is attached to a support column 4 installed at the side of the road RD. It is a device that automatically counts the vehicles passing in each lane of the road RD, determines their vehicle types, measures the passing speed of specific vehicles, grasps the state of congestion, detects illegally parked vehicles, and so on. The vehicle detection device 1 includes two imaging devices 2a and 2b and a control device 3.
[0039]
The imaging devices 2a and 2b are imaging means installed so as to overlook the road RD on which the vehicle 5 moves. For example, video cameras or CCD cameras can be used as the imaging devices 2a and 2b.
[0040]
The imaging devices 2a and 2b have the same focal length. The imaging devices 2a and 2b are mounted one above the other at a predetermined interval so that their optical axes are parallel to each other and their imaging surfaces lie in the same plane. The imaging devices 2a and 2b can therefore capture a stereo image overlooking the road RD.
[0041]
In the example of FIG. 1, two imaging devices are used, but three or more imaging devices may be used. The imaging devices may also be arranged side by side rather than one above the other.
[0042]
The control device 3 is a control means including a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), an image memory, and the like as basic hardware. When the apparatus is in operation, a program stored in the ROM is read and executed by the CPU, thereby realizing each function described below. The control device 3 is preferably installed in the vicinity of the base portion of the support column 4 for maintenance and inspection.
[0043]
FIG. 2 is a functional block diagram showing a functional configuration of the vehicle detection device 1. As shown in the figure, the control device 3 generally includes an image input unit 30, a plane estimation processing unit 31, an object detection processing unit 32, a storage unit 33, and an output unit 34.
[0044]
The image input unit 30 is a function for inputting image signals obtained from the imaging devices 2a and 2b to the control device 3. When the input from the imaging devices 2a and 2b is a moving image, a still image of one frame is captured by the image input unit 30. When the image signal is analog, the image input unit 30 performs A/D conversion and captures it as a digital image. The two captured images are each stored in the image memory. The captured images may be color images or monochrome (grayscale) images, but monochrome images are sufficient for the purpose of vehicle detection.
[0045]
The plane estimation processing unit 31 functions as plane estimation means for estimating the three-dimensional position of the plane (road RD) on which the vehicle 5 moves from the stereo image captured in the image memory. Immediately after the vehicle detection device 1 is installed, the relative positional relationship between the imaging devices 2a and 2b and the road RD is undetermined, and object detection processing cannot be performed. Therefore, the plane estimation process is executed first to calculate the three-dimensional position of the road RD (specifically, the height and depression angle of the imaging devices 2a and 2b with respect to the road RD). This process only needs to be executed once, when the vehicle detection device 1 is installed.
[0046]
Data of the three-dimensional position of the road RD calculated by the plane estimation processing unit 31 (hereinafter simply referred to as “plane data”) is stored in the storage unit 33. Further, in order to confirm whether or not the plane estimation has been normally performed, it is possible to output plane data from the output unit 34 as necessary.
[0047]
The object detection processing unit 32 functions as a detection unit that detects an object (vehicle 5) moving on the road RD in the imaging area with reference to the three-dimensional position of the road RD.
[0048]
In the object detection processing unit 32, edge strength extraction processing is performed on the reference image, which is one of the stereo images captured in the image memory, and feature points indicating the contour of the vehicle 5 and the like are extracted. For each feature point, a corresponding point is found by searching the other image for a similar shading pattern, and the parallax is obtained. Then, for each feature point, three-dimensional coordinates in real space are calculated based on the principle of triangulation.
[0049]
The principle of triangulation will be described with reference to FIG. 3. The figure is drawn in two dimensions for ease of explanation.
[0050]
Points C_a and C_b in the figure represent the reference imaging device 2a and the other imaging device 2b, respectively. The installation height of the reference imaging device 2a is H, the depression angle of the imaging devices 2a and 2b is θ, and the distance between them (base length) is B. If the focal length of the imaging devices 2a and 2b is f, the captured images I_a and I_b can be regarded as planes located at a distance f from the points C_a and C_b, as shown in the figure.
[0051]
A point P in real space appears at the positions of points p_a and p_b in the images I_a and I_b. Point p_a is the feature point representing point P, and point p_b is the corresponding point for feature point p_a. The coordinate value of feature point p_a in image I_a differs from the coordinate value of corresponding point p_b in image I_b, and this difference (d_a + d_b) is the parallax d of point P.
[0052]
At this time, the distance L from the imaging surface of the imaging devices 2a and 2b to the point P can be calculated as
L = Bf / d
If the depression angle θ and the installation height H are known, the three-dimensional coordinates of the point P in real space can be calculated from the distance L. This is the principle of triangulation.
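As an illustration of the relation L = Bf / d, the following is a minimal Python sketch. The function name, the choice of units (base length in meters, focal length and parallax in pixels), and the sample values are assumptions made for this example and are not taken from the specification.

```python
def distance_from_parallax(base_length_m: float,
                           focal_length_px: float,
                           parallax_px: float) -> float:
    """Return the distance L = B * f / d from the imaging surface to a point.

    Units are an assumption of this sketch: B in meters, f and d in pixels,
    so that L comes out in meters.
    """
    if parallax_px <= 0:
        raise ValueError("parallax must be positive")
    return base_length_m * focal_length_px / parallax_px


# Hypothetical values: B = 0.5 m, f = 1000 px, d = 25 px.
print(distance_from_parallax(0.5, 1000.0, 25.0))  # 20.0
```

Because d is in the denominator, halving the parallax doubles the computed distance, which is why a fixed error in d produces a larger distance error for far points.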
[0053]
Here, the depression angle θ and the installation height H are given by the plane data stored in the storage unit 33. That is, in the object detection process, three-dimensional coordinates are calculated with the plane (road RD) estimated by the plane estimation processing unit 31 taken as height 0.
[0054]
In this way, the object detection processing unit 32 can restore the three-dimensional shape of an object (such as a vehicle) existing on the road RD. Further, if the model data of the vehicle is stored in the ROM in advance and the template matching process is performed between the model data and the restored three-dimensional shape, it is possible to determine the number of vehicles and the vehicle type.
[0055]
As described above, the object detection process is based on the assumption that the plane data of the road RD is known. In order to increase the accuracy of object detection, it is important to estimate the three-dimensional position of the plane with high accuracy.
[0056]
Next, the plane estimation process in the plane estimation processing unit 31 will be described in detail with reference to the flowchart of FIG. 4.
[0057]
First, in step S1, a stereo image is captured by the imaging devices 2a and 2b. The image signals from the imaging devices 2a and 2b are converted into digital data by the image input unit 30. The resulting digital grayscale image data is stored in the image memory as the reference image when it comes from the imaging device 2a, and as the other image (the one searched for corresponding points) when it comes from the imaging device 2b.
[0058]
In step S2, the plane estimation processing unit 31 extracts a plurality of feature points by edge extraction processing from the reference image stored in the image memory. The edge extraction process can be performed by scanning an image with an edge extraction filter such as a Laplacian filter or a Sobel filter. As a result, the contour portion of the vehicle 5, the lane markings and cracks of the road RD, and the shadow portion on the road RD are extracted as feature points.
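The edge extraction of step S2 can be sketched as follows in plain Python, using the Sobel filter named above. The function name, the list-of-rows image representation, the use of |gx| + |gy| as an approximation of gradient magnitude, and the threshold are assumptions of this sketch, not details given in the specification.

```python
def sobel_feature_points(image, threshold):
    """Return (x, y) pixels whose Sobel response exceeds `threshold`.

    `image` is a list of rows of grayscale values. Border pixels are skipped
    because the 3x3 kernel does not fit there.
    """
    kx = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # horizontal gradient
    ky = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # vertical gradient
    height, width = len(image), len(image[0])
    points = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    value = image[y + j - 1][x + i - 1]
                    gx += kx[j][i] * value
                    gy += ky[j][i] * value
            # |gx| + |gy| approximates the gradient magnitude.
            if abs(gx) + abs(gy) > threshold:
                points.append((x, y))
    return points


# A 5x5 image with a vertical brightness step between columns 1 and 2.
step_image = [[0, 0, 100, 100, 100] for _ in range(5)]
print(sobel_feature_points(step_image, 200))
```

The returned points cluster along the brightness step, which corresponds to the contours, lane markings, cracks, and shadow boundaries described above.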
[0059]
Next, the plane estimation processing unit 31 performs association processing that searches the other image for the point corresponding to each extracted feature point, and obtains the parallax (step S3). This association processing can be performed, for example, by preparing a small image covering several pixels around the feature point as a sample pattern and searching the other image for a density pattern similar to this sample pattern. Then, the SDS coordinates are obtained for each feature point and stored in the RAM as a candidate point group. The SDS coordinates are three-dimensional coordinates obtained by adding a parallax axis to the orthogonal two-dimensional coordinate axes provided on the image. When the horizontal coordinate value of the feature point on the image is x, the vertical coordinate value is y, and the parallax is d, the SDS coordinates are represented by (x, y, d).
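The association processing of step S3 can be sketched as a simple pattern search. The sketch below uses the sum of absolute differences (SAD) as the similarity measure and searches only along the vertical image axis in the positive direction, on the assumption that the imaging devices are stacked vertically; the cost function, the search direction, the window radius, and all names are assumptions of this example rather than details from the specification.

```python
def sad(img_a, img_b, xa, ya, xb, yb, radius):
    """Sum of absolute differences between two square windows."""
    total = 0
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            total += abs(img_a[ya + j][xa + i] - img_b[yb + j][xb + i])
    return total


def sds_coordinate(ref_img, other_img, x, y, max_parallax, radius=1):
    """Return the SDS coordinate (x, y, d) of feature point (x, y).

    The feature point must lie at least `radius` pixels from the image
    border. d is None if no candidate window fits inside the other image.
    """
    height = len(other_img)
    best_d, best_cost = None, None
    for d in range(max_parallax + 1):
        yb = y + d                      # assumed search direction
        if yb + radius >= height:
            break
        cost = sad(ref_img, other_img, x, y, x, yb, radius)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return (x, y, best_d)


# Synthetic check: the other image is the reference image shifted down 2 rows.
ref = [[10 * r * r + c for c in range(5)] for r in range(8)]
other = [[0] * 5, [0] * 5] + [row[:] for row in ref[:6]]
print(sds_coordinate(ref, other, 2, 2, max_parallax=4))  # (2, 2, 2)
```

Collecting the returned tuples for all feature points gives the candidate point group that the later steps operate on.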
[0060]
In step S4, each feature point is compared with another feature point based on the parallax and the coordinate value on the image, to determine whether the feature point indicates the road RD or an object (a vehicle or the like) on the road RD. This discrimination process uses a discriminant that determines whether or not a feature point is a point indicating an object. The feature points determined to be points indicating an object are removed from the candidate point group in the RAM.
[0061]
Then, the plane estimation processing unit 31 calculates the three-dimensional position of the road RD from the remaining feature points (candidate point group), excluding the feature points determined to be points indicating an object (step S5). The plane position calculation can be performed by statistical processing such as the least squares method or the Hough transform. This calculation requires at least 3 feature points, and preferably 8 or more. If fewer than 3 feature points remain after the discrimination process, the process is repeated from step S1.
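One way to picture the least squares variant of step S5 is the sketch below, which fits a plane to the candidate point group directly in SDS coordinates. Writing the plane as d = a·x + b·y + c, solving the normal equations by Cramer's rule, and the function names are all assumptions of this example; the specification only states that a statistical process such as least squares or the Hough transform is used.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_sds_plane(points):
    """Least-squares fit of d = a*x + b*y + c to SDS points (x, y, d)."""
    if len(points) < 3:
        raise ValueError("at least 3 feature points are required")
    sxx = sxy = sx = syy = sy = sxd = syd = sd = 0.0
    for x, y, d in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxd += x * d
        syd += y * d
        sd += d
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxd, syd, sd]
    det = det3(normal)
    if abs(det) < 1e-12:
        raise ValueError("feature points are collinear; plane is undefined")
    coeffs = []
    for k in range(3):
        replaced = [row[:] for row in normal]
        for r in range(3):
            replaced[r][k] = rhs[r]      # Cramer's rule: swap in column k
        coeffs.append(det3(replaced) / det)
    return tuple(coeffs)


# Hypothetical road points lying on d = 0.1*y + 2.
road = [(0, 0, 2.0), (10, 0, 2.0), (0, 10, 3.0), (10, 10, 3.0), (5, 20, 4.0)]
a, b, c = fit_sds_plane(road)
print(round(a, 6), round(b, 6), round(c, 6))
```

A plain least squares fit is sensitive to outliers, which is why removing object points in step S4 before this fit matters.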
[0062]
The calculation result is stored in the storage unit 33 in the form of the depression angle θ and the installation height H of the imaging devices 2a and 2b, and is output from the output unit 34 for confirmation (step S6).
[0063]
The discriminant used in the discriminating process in step S4 is determined as follows, for example.
[0064]
The interval (base length) between the two imaging devices 2a and 2b is B, and the focal length of their lenses is f. The camera coordinate system (X′, Y′, Z′) and the ground coordinate system (X, Y, Z) in real space are defined relative to the orthogonal two-dimensional coordinate system (x, y) provided on the reference image, as shown in FIG. 5. Here, the ground coordinate system is a coordinate system in which the road surface is the XZ plane and the X axis is parallel to the x axis on the reference image. In FIG. 5, H represents the installation height of the imaging device 2a, and θ represents the depression angle (the angle formed by the optical axis and the road surface) of the imaging device 2a.
[0065]
At this time, Equation (1) is established with parallax as d.
[Expression 1]

[0066]
Further, the relationship between the camera coordinate system (X ′, Y ′, Z ′) and the ground coordinate system (X, Y, Z) is defined by Expression (2).
[Expression 2]

[0067]
Furthermore, the road surface can be regarded as a plane satisfying the expression (3).
[Equation 3]
$Y = 0 \quad (3)$
[0068]
On the other hand, the vehicle 5 on the road RD can be regarded as a set of points satisfying Y > 0. Further, assuming that the vehicle 5 is narrower than the road and that at least part of the road surface surrounding the vehicle 5, such as lane markings, is therefore very likely to be captured as well, for a feature point P_V(X_V, Y_V, Z_V) indicating a vehicle there exists a feature point P_R(X_R, Y_R, Z_R) indicating the road that satisfies expressions (4) and (5).
[0069]
ΔY_th is a predetermined threshold such that ΔY_th > 0. Considering the height of a vehicle bonnet (about 60 cm), the value of ΔY_th may be set to about 30 cm, which is half that height.
[Expression 4]

[0070]
Here, letting the SDS coordinates of the feature points corresponding to P_V and P_R be p_V(x_V, y_V, d_V) and p_R(x_R, y_R, d_R), respectively, the following is obtained from equations (1) and (2).
[Equation 5]

[0071]
Since Z can be regarded as a constant from equation (5), approximate equation (6) is obtained by partially differentiating both sides with respect to y.
[Formula 6]

[0072]
On the other hand, from the equations (1), (2), (3)
[Expression 7]

[0073]
Therefore, for two feature points p_R1(x_R1, y_R1, d_R1) and p_R2(x_R2, y_R2, d_R2) on the road surface in SDS, approximate expression (7) is obtained.
[Equation 8]

[0074]
Assuming an actual installation of the vehicle detection device 1, in which, for example, B is several tens of centimeters, H is several meters, Z is on the order of several tens of meters, and θ is about 20°, equations (6) and (7) give the two approximate expressions (8) and (9).
[Equation 9]

[0075]
Equation (10) is obtained from Equation (8).
[Expression 10]

[0076]
Furthermore, from the equations (1), (2), (4),
[Expression 11]

[0077]
From equation (10), d_R = d_V. Therefore,
[Expression 12]

[0078]
When θ≈20 °, cos θ≈1, so the following equation is obtained.
[Formula 13]

[0079]
From the above, the following holds. That is, if
[Expression 14]

Then, from equation (9),
[Expression 15]

[0080]
Therefore, for two feature points p₁(x₁, y₁, d₁) and p₂(x₂, y₂, d₂) in SDS, when the following two expressions are satisfied, at least the feature point p₁ of the two is a feature point indicating an object.
[Expression 16]
$|d_1 - d_2| \approx 0 \quad (14)$
$(y_1 - y_2) / d_1 > \Delta y_{th} \quad (15)$
[0081]
In this embodiment, formulas (14) and (15) are adopted as discriminants.
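The discrimination of step S4 with these two discriminants can be sketched as follows, using the forms stated in the claims: |d_i − d_j| ≈ 0 for selecting the comparison target and (y_i − y_j) / d_i > Δy_th for the decision. The tolerance `d_tol` used to interpret "≈ 0", the function name, and the sample values are assumptions of this sketch; the orientation of the y axis follows whatever convention the image coordinate system uses.

```python
def object_point_indices(points, dy_th, d_tol):
    """Return indices of SDS points (x, y, d) judged to indicate an object.

    A point p_i is flagged when some other point p_j has nearly the same
    parallax (|d_i - d_j| <= d_tol) and (y_i - y_j) / d_i > dy_th.
    """
    flagged = set()
    for i, (_, yi, di) in enumerate(points):
        if di <= 0:
            continue                       # no valid parallax to scale by
        for j, (_, yj, dj) in enumerate(points):
            if i == j:
                continue
            if abs(di - dj) <= d_tol and (yi - yj) / di > dy_th:
                flagged.add(i)
                break
    return flagged


# Hypothetical candidate point group.
candidates = [(10, 50, 20), (12, 10, 20), (30, 5, 10)]
flagged = object_point_indices(candidates, dy_th=1.0, d_tol=0.5)
remaining = [p for k, p in enumerate(candidates) if k not in flagged]
print(flagged)     # {0}
print(remaining)   # [(12, 10, 20), (30, 5, 10)]
```

The double loop compares every pair; restricting the inner loop to points of similar parallax (for example, by sorting on d first) would reduce the cost, in line with the narrowing effect attributed to discriminant (14).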
[0082]
The features of this discriminant are summarized below.
[0083]
The discriminants (14) and (15) are composed of the parallax d and the coordinate value y on the image, and compare the parallax d and the coordinate value y of two feature points. In other words, the discrimination process using these discriminants compares a feature point with another feature point based on the parallax and the coordinate value on the image, and thereby determines whether or not the feature point is a point indicating an object. By considering both the parallax and the coordinate value on the image in this way, the relative positional relationship between feature points can be compared three-dimensionally, and a good discrimination process can be performed.
[0084]
The discriminant (14) corresponds to a process of selecting another feature point to be compared based on the parallax of the feature point. This makes it possible to find a set of feature points that are highly relevant to positions in the real space. Then, by comparing the feature points selected in this way with the discriminant (15), their relative positional relationships can be compared with high validity. Further, the processing cost can be reduced by narrowing down the comparison target by the discriminant (14).
[0085]
In discriminant (14), feature points having substantially the same parallax are selected as other feature points to be compared. Thereby, for example, it is possible to select a more appropriate comparison target such as a point indicating an object and a point indicating a shadow that the object creates on a plane.
[0086]
With the discriminant (15), if the difference (y₁ − y₂) between the coordinate values on the image of the feature point and the other feature point is greater than a predetermined threshold (d₁Δy_th), the feature point is determined to be a point indicating an object. This relies on the fact that, between feature points having substantially the same parallax, a height comparison in real space can be approximately replaced by a comparison of the coordinate values y on the image.
[0087]
The threshold (d₁Δy_th) of the discriminant (15) changes according to the parallax d₁ of the feature point. Objects farther from the imaging surface appear smaller on the image. That is, even if the difference between the coordinate values of two feature points is constant in real space, the difference between them on the image becomes smaller as the distance from the imaging surface increases. Therefore, it is not appropriate to set the threshold to a fixed value. In the discriminant (15), changing the threshold in accordance with the parallax makes good discrimination possible over the entire image.
[0088]
The discriminants (14) and (15) are equations relating to coordinate values (x, y, d) in SDS coordinates. Further, as described above, the plane position calculation process is also performed on the SDS coordinates. That is, since the discrimination process and the plane position calculation process are performed in the SDS coordinate system composed of the coordinate axis and the parallax axis on the image, an extra process such as coordinate conversion becomes unnecessary, and the processing cost can be reduced.
[0089]
As described above, according to the plane estimation method of the present embodiment, even when a vehicle or the like is present on the road in the imaging area, feature points indicating the vehicle or the like can be discriminated and excluded, so the three-dimensional position of the road surface can be estimated with high accuracy.
[0090]
Therefore, when installing the vehicle detection device, the plane estimation process can be performed immediately without performing traffic regulation and the like, and the installation operation can be performed in a short time and simply.
[0091]
In addition, since plane estimation can be performed with high accuracy, the reliability of object detection can be improved.
[0092]
(Second Embodiment)
FIG. 6 is a functional block diagram showing a functional configuration of a vehicle detection device according to the second embodiment of the present invention.
[0093]
The vehicle detection device according to the present embodiment includes an area specifying unit 35. Since other configurations and operations are the same as those in the first embodiment, the same components are denoted by the same reference numerals, and description thereof is omitted.
[0094]
The area specifying unit 35 is a means for designating the plane portion in the reference image as the target area for feature point extraction. In the present embodiment, the area specifying unit 35 is realized by a portable terminal device having a display, such as a mobile computer or a PDA. When area designation is performed, the portable terminal device is connected to the communication I/F of the control device 3 and cooperates with the control device 3 to execute the following processing. Note that the area specifying unit 35 may instead be an internal function of the control device 3.
[0095]
Images captured by the imaging devices 2a and 2b for the plane estimation processing are captured by the image input unit 30 and stored in the image memory. Then, the reference image in the image memory is transmitted to the area specifying unit 35 and displayed on the display.
[0096]
The user (installer) designates a range of the road portion in the image while referring to the reference image displayed on the display. FIG. 7 shows an example of the operation screen. Here, the target region (shaded portion) is specified by specifying on the image two left and right lines (broken lines) indicating the boundary between the road and the region outside the road.
[0097]
Information on the designated target area is stored in the RAM of the control device 3 via the communication I / F.
[0098]
When extracting the feature points from the reference image by the edge extraction process, the plane estimation processing unit 31 reads the information on the designated target region from the RAM and extracts the feature points only from this region.
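Restricting feature points to the designated target region can be sketched as follows. The sketch assumes each boundary line is stored as two image points and that neither line is horizontal; this representation, the function names, and the sample coordinates are assumptions of this example, since the specification does not state how the region information is encoded in the RAM.

```python
def x_on_line(line, y):
    """Return the x coordinate at height y of a line through two image points.

    The line must not be horizontal (the two points need different y).
    """
    (x0, y0), (x1, y1) = line
    return x0 + (x1 - x0) * (y - y0) / (y1 - y0)


def in_target_region(point, left_line, right_line):
    """True if the image point lies between the left and right boundaries."""
    x, y = point
    return x_on_line(left_line, y) <= x <= x_on_line(right_line, y)


# Hypothetical boundaries of a road that widens toward the bottom of the image.
left = ((40, 0), (0, 100))
right = ((60, 0), (100, 100))
features = [(50, 50), (10, 50), (10, 100)]
print([p for p in features if in_target_region(p, left, right)])
# [(50, 50), (10, 100)]
```

Applying this test before or during edge extraction means pixels outside the road never become candidate points, which is the source of the cost and noise reduction described for this embodiment.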
[0099]
Thus, by not extracting feature points from portions other than the road (portions other than the target region), the processing cost required for the plane estimation processing can be reduced. In addition, since noise information is reduced in advance, it is possible to further improve the accuracy of plane estimation.
[0100]
[Effects of the invention]
As described above, according to the plane estimation method of the present invention, even when an object other than the plane exists in the imaging area, feature points indicating the object can be discriminated and excluded, so the three-dimensional position of the plane can be estimated.
[0101]
In addition, according to the object detection device of the present invention, even when an object other than the plane exists in the imaging area, the plane estimation process can be performed immediately by discriminating and excluding the feature points indicating the object, so installation work can be carried out quickly and easily.
[0102]
In addition, since plane estimation can be performed with high accuracy, the reliability of object detection can be improved.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating an installation example of a vehicle detection device according to an embodiment of an object detection device of the present invention.
FIG. 2 is a functional block diagram showing a functional configuration of the vehicle detection device according to the first embodiment.
FIG. 3 is a diagram for explaining the principle of triangulation.
FIG. 4 is a flowchart of plane estimation processing.
FIG. 5 is a diagram illustrating a relationship between an orthogonal two-dimensional coordinate system provided on a reference image and a camera coordinate system and a ground coordinate system in real space.
FIG. 6 is a functional block diagram showing a functional configuration of a vehicle detection device according to a second embodiment.
FIG. 7 is an example of an operation screen for region designation processing.
FIG. 8 is a diagram for explaining the coordinate derivation accuracy of feature points in real space coordinates.
FIG. 9 is a diagram for explaining plane estimation accuracy in real space coordinates.
FIG. 10 is a diagram illustrating the coordinate derivation accuracy of feature points in SDS coordinates.
FIG. 11 is a diagram for explaining plane estimation accuracy in SDS coordinates.
[Explanation of symbols]
1 Vehicle detection device (object detection device)
2a, 2b Imaging device (imaging means)
3 Control device
30 Image input unit
31 Plane estimation processing unit (plane estimation means)
32 Object detection processing unit (detection means)
33 Storage unit
34 Output unit
35 Area specifying unit
4 Support column
5 Vehicle (object)
RD road (plane)

Claims

A plane estimation method using a stereo image, comprising:
capturing a stereo image with a plurality of imaging means installed so as to overlook a plane on which an object moves;
extracting a plurality of feature points from a reference image of the stereo image;
obtaining a parallax for each extracted feature point by searching for a corresponding point in another image;
for each feature point, selecting another feature point having substantially the same parallax as a comparison target, and comparing coordinate values on the image with the selected other feature point to determine whether the feature point is a point indicating the object; and
calculating a three-dimensional position of the plane from the remaining feature points, excluding the feature points determined to be points indicating the object.

The plane estimation method using a stereo image according to claim 1, wherein the feature point is determined to be a point indicating the object if the difference between the coordinate values on the image of the feature point and the other feature point is greater than a predetermined threshold.

The plane estimation method using a stereo image according to claim 2, wherein the predetermined threshold changes according to the parallax of the feature point.

The plane estimation method using a stereo image according to any one of claims 1 to 3, wherein the three-dimensional position of the plane is calculated by statistical processing.

The plane estimation method using a stereo image according to any one of claims 1 to 4, wherein the process of determining whether the feature point is a point indicating the object and the process of calculating the three-dimensional position of the plane are performed in a three-dimensional coordinate system consisting of the coordinate axes on the image and a parallax axis.

The plane estimation method using a stereo image according to claim 1, wherein, when the coordinate value in the vertical direction on the image of a feature point p_i (i is a natural number) is y_i and its parallax is d_i,
a feature point p_j satisfying
|d_i − d_j| ≈ 0 (j is a natural number; i ≠ j)
is selected as the other feature point to be compared, and
the feature point p_i is determined to be a point indicating the object when
(y_i − y_j) / d_i > Δy_th
(Δy_th is a predetermined constant)
is satisfied.

The plane estimation method using a stereo image according to any one of claims 1 to 6, wherein a portion of the plane in the reference image is designated as a target region, and the plurality of feature points are extracted from the designated target region.

An object detection apparatus comprising:
a plurality of imaging means installed so as to overlook a plane on which an object moves;
plane estimation means for estimating a three-dimensional position of the plane from a stereo image obtained by the plurality of imaging means; and
detecting means for detecting an object moving on the plane in the imaging area with reference to the estimated three-dimensional position of the plane,
wherein the plane estimation means
extracts a plurality of feature points from a reference image of the stereo image,
obtains a parallax for each extracted feature point by searching for a corresponding point in another image,
for each feature point, selects another feature point having substantially the same parallax as a comparison target and compares coordinate values on the image with the selected other feature point to determine whether the feature point is a point indicating the object, and
calculates the three-dimensional position of the plane from the remaining feature points, excluding the feature points determined to be points indicating the object.

The object detection apparatus according to claim 8, wherein the plane estimation means determines the feature point to be a point indicating the object when the difference between the coordinate values on the image of the feature point and the other feature point is larger than a predetermined threshold.

The object detection apparatus according to claim 9, wherein the plane estimation means changes the predetermined threshold according to the parallax of the feature point.

The object detection apparatus according to any one of claims 8 to 10, wherein the plane estimation means calculates the three-dimensional position of the plane by statistical processing.

The object detection apparatus according to any one of claims 8 to 11, wherein the plane estimation means performs, in a three-dimensional coordinate system consisting of the coordinate axes on the image and a parallax axis, the process of determining whether the feature point is a point indicating the object and the process of calculating the three-dimensional position of the plane.

The object detection apparatus according to claim 8, wherein, when the coordinate value in the vertical direction on the image of a feature point p_i (i is a natural number) is y_i and its parallax is d_i, the plane estimation means
selects, as the other feature point to be compared, a feature point p_j satisfying
|d_i − d_j| ≈ 0 (j is a natural number; i ≠ j)
and determines the feature point p_i to be a point indicating the object when
(y_i − y_j) / d_i > Δy_th
(Δy_th is a predetermined constant)
is satisfied.

The object detection apparatus according to any one of claims 8 to 13, further comprising means for designating a portion of the plane in the reference image as a target region, wherein the plane estimation means extracts the plurality of feature points from the designated target region.