JP2004171189A - Moving object detection device, moving object detection method and moving object detection program

Info

Publication number: JP2004171189A
Application number: JP2002334970A
Authority: JP
Inventors: Nobuo Higaki; 信男檜垣
Original assignee: Honda Motor Co Ltd
Current assignee: Honda Motor Co Ltd
Priority date: 2002-11-19
Filing date: 2002-11-19
Publication date: 2004-06-17
Anticipated expiration: 2022-11-19
Also published as: JP3952460B2

Abstract

PROBLEM TO BE SOLVED: To provide a moving object detection device, a moving object detection method, and a moving object detection program capable of detecting objects individually even when a plurality of objects are adjacent to each other in an image captured by a moving camera.

SOLUTION: In this moving object detection device 1, a target distance setting part 21 determines the distance to the moving object that has moved the most, using a distance image in which the distance to the imaging target is embedded as distance information and a difference image in which the movement of the moving object is embedded as motion information. A target distance image generation part 22 generates a target distance image corresponding to that distance. An outline extraction part 24 extracts an outline from the target distance image, whereby the moving object detection device 1 detects the moving object.

COPYRIGHT: (C) 2004, JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、カメラによって撮像された画像から、その画像上に存在する移動物体を検出する移動物体検出装置、移動物体検出方法及び移動物体検出プログラムに関する。
【０００２】
【従来の技術】
従来、ＣＣＤ等のカメラによって撮像された画像から、その画像上に存在する物体を検出する技術としては、例えば、画像内で物体の初期の曖昧な輪郭を輪郭モデルとして設定し、その輪郭モデルを所定の規則に従って収縮変形することで物体の輪郭を抽出して物体を検出する技術（動的輪郭モデル：ＳＮＡＫＥＳ）が存在する。なお、この輪郭抽出に基づいた物体検出技術においては、時間的に連続した画像により、動きのある物体（移動物体）のエッジを検出し、輪郭モデルをそのエッジに連結させることで移動物体の輪郭を抽出して移動物体を検出している（例えば、特許文献１参照。）。
【０００３】
また、移動カメラで撮像した画像から移動物体を検出する技術としては、時間的に連続する画像の輝度情報から移動カメラの動きを解析し、その動きを背景の動きであると仮定し、連続する画像の差分と背景の動きとに基づいて、移動物体の領域を検出し、輪郭として抽出する技術が存在する（例えば、非特許文献１参照。）。
【０００４】
【特許文献１】
特開平８−３２９２５４号公報（第７頁、第９−１０図）
【非特許文献１】
松岡，荒木，山澤，竹村，横矢，「移動カメラ画像からの移動物体輪郭の抽出・追跡とＤＳＰによる実時間処理」、社団法人電子情報通信学会、信学技報、ＰＲＭＵ９７−２３５、１９９８
【０００５】
【発明が解決しようとする課題】
しかし、前記従来の技術において、第１の例である、輪郭モデルを連続する画像から検出されるエッジに連結することで移動物体の輪郭を抽出して物体を検出する技術では、撮像した画像上で、複数の物体が隣接して存在する場合、その複数の物体を一つの物体として認識してしまうという問題がある。
【０００６】
また、前記従来の技術において、第２の例である、移動カメラによって移動物体を検出する技術では、移動カメラで撮像された画像全体を輪郭抽出の対象領域として処理を行うため、計算量が多くなり、実時間で移動物体の輪郭を逐次抽出するためには高速の演算装置が必要になるという問題がある。さらに、前記第１の例と同様に、撮像した画像上で、複数の物体が隣接して存在する場合、その複数の物体を一つの物体として認識してしまうという問題がある。
【０００７】
本発明は、以上のような問題点に鑑みてなされたものであり、移動カメラで撮像した画像であっても、移動物体の輪郭抽出を行う演算処理を軽減し、また、撮像した画像上に複数の物体が隣接した場合でも、個別に物体を検出することを可能にした移動物体検出装置、移動物体検出方法及び移動物体検出プログラムを提供することを目的とする。
【０００８】
【課題を解決するための手段】
本発明は、前記目的を達成するために創案されたものであり、まず、請求項１に記載の移動物体検出装置は、同期した複数の撮像手段で、撮像対象を撮像した複数の撮像画像から、前記撮像対象内に存在する移動物体を検出する移動物体検出装置であって、前記複数の撮像画像の視差に基づいて、前記撮像対象までの距離を距離情報として生成する距離情報生成手段と、前記複数の撮像手段の中の少なくとも一つの撮像手段から、時系列に入力される撮像画像の差分に基づいて、前記移動物体の動きを動き情報として生成する動き情報生成手段と、前記距離情報及び前記動き情報に基づいて、前記移動物体が存在する対象距離を設定する対象距離設定手段と、前記距離情報に基づいて、前記対象距離設定手段で設定された対象距離に対応する画素からなる対象距離画像を生成する対象距離画像生成手段と、前記対象距離画像内に、少なくとも前記対象距離に対応して、前記移動物体を検出する対象となる対象領域を設定する対象領域設定手段と、この対象領域設定手段で設定された対象領域から輪郭を抽出することで、前記移動物体を検出する輪郭抽出手段と、を備える構成とした。
【０００９】
かかる構成によれば、移動物体検出装置は、距離情報生成手段によって、複数の撮像画像の視差に基づいて、撮像対象までの距離を距離情報として生成する。例えば、複数の撮像画像から視差が検出された画素において、その視差の大きさ（視差量）を、撮像対象までの視差（距離）として各画素毎に埋め込んだ距離画像（距離情報）を生成する。
【００１０】
また、移動物体検出装置は、動き情報生成手段によって、複数の撮像手段の中の少なくとも一つの撮像手段から、時系列に入力される撮像画像の差分に基づいて、移動物体の動きを動き情報として生成する。例えば、時系列に入力される２枚の撮像画像の差分をとって、値が“０”でない画素値をすべて“１”にした差分画像を移動物体の動き情報として生成する。
【００１１】
そして、移動物体検出装置は、対象距離設定手段によって、距離情報と動き情報とにより、最も動き量の多い視差（距離）を特定し、その視差（距離）を対象距離として設定する。
【００１２】
また、移動物体検出装置は、対象距離画像生成手段によって、距離画像（距離情報）から対象距離に対応する画素を抽出して対象距離画像を生成する。例えば、対象距離にある程度の幅（例えば、数十ｃｍ等）を持たせ、その距離に対応する画素を距離画像から抽出する。さらに、対象領域設定手段によって、対象距離画像内に、少なくとも前記対象距離に対応して、移動物体を検出する対象となる対象領域を設定する。例えば、対象距離に対応する画素で生成された対象距離画像で、画素が存在する領域を対象領域とする。これによって、対象距離画像の中で移動物体が存在すると想定される領域を絞り込むことができる。そして、輪郭抽出手段によって、対象距離画像内の対象領域から移動物体の輪郭を抽出することで移動物体を検出する。
【００１３】
また、請求項２に記載の移動物体検出装置は、請求項１に記載の移動物体検出装置において、前記対象距離設定手段が、距離毎に動きがあった画素の累積値を求め、その累積値に基づいて、前記移動物体が存在する対象距離を設定することを特徴とする。
【００１４】
かかる構成によれば、移動物体検出装置は、対象距離設定手段によって、距離情報に含まれる視差（距離）毎に、動き情報に含まれる動きのあった画素値を累計（ヒストグラム化）し、その累計値が最も多くなる視差（距離）に、最も動き量の多い移動物体が存在していると判定し、その視差（距離）を対象距離として設定する。このように、画素を累計するという簡単な動作で対象距離を設定することができ、処理を高速化することができる。
【００１５】
さらに、請求項３に記載の移動物体検出装置は、請求項１又は請求項２に記載の移動物体検出装置において、前記対象距離画像生成手段が、少なくとも前記対象距離を基準として奥行き方向の所定範囲内に存在する画素からなる対象距離画像を生成することを特徴とする。
【００１６】
かかる構成によれば、移動物体検出装置は、対象距離画像生成手段によって、例えば、対象距離を基準とした奥行き方向（前後方向）で、予め定めた範囲（所定範囲）内に存在する画素のみを抽出することで対象距離画像を生成する。これによって、同一方向に複数移動物体が存在していても、その中から対象距離に存在する移動物体を特定した対象距離画像を生成することができる。
【００１７】
また、請求項４に記載の移動物体検出装置は、請求項１乃至請求項３のいずれか１項に記載の移動物体検出装置において、前記対象領域設定手段が、前記対象距離画像内における垂直方向の画素量に基づいて、その画素量のピークから水平方向の所定範囲内に対象領域を設定することを特徴とする。
【００１８】
かかる構成によれば、移動物体検出装置は、移動物体が存在する対象領域を設定する際に、対象領域設定手段によって、対象距離画像内における移動物体の垂直方向の画素量に基づいて、移動物体の水平方向の位置を特定する。例えば、移動物体の垂直方向の画素量が最も多い箇所（ピーク）を、水平方向における移動物体の中心として、その中心から所定範囲を移動物体の存在領域として設定する。これによって、同一距離に複数の移動物体が存在している場合でも、その中の一つを検出することができる。
【００１９】
さらに、請求項５に記載の移動物体検出装置は、請求項１乃至請求項４のいずれか１項に記載の移動物体検出装置において、前記対象領域設定手段が、少なくとも前記撮像手段のチルト角及び設置面からの高さに基づいて、前記対象領域の垂直方向の範囲を設定することを特徴とする。
【００２０】
かかる構成によれば、移動物体検出装置は、移動物体が存在する対象領域を設定する際に、対象領域設定手段によって、撮像手段であるカメラのチルト角や、そのカメラの基準となる設置面からの高さ等のカメラパラメータに基づいて、移動物体の垂直方向の存在領域の範囲を設定する。例えば、移動物体の高さを特定の大きさ（人間であれば２ｍ等）に定めることで、その大きさとカメラパラメータとに基づいて、移動物体が対象距離画像内のどの範囲に位置するかを特定することができる。
【００２１】
また、請求項６に記載の移動物体検出装置は、請求項１乃至請求項５のいずれか１項に記載の移動物体検出装置において、前記撮像画像の各画素の色情報又は濃淡情報に基づいて、その撮像画像のエッジを抽出したエッジ画像を生成するエッジ画像生成手段を備え、前記対象距離画像生成手段が、前記距離情報に基づいて、前記対象距離に対応する前記エッジ画像の画素を抽出して、前記対象距離画像を生成することを特徴とする。
【００２２】
かかる構成によれば、移動物体検出装置は、エッジ画像生成手段によって、撮像画像の色情報又は濃淡情報から、撮像画像のエッジを抽出したエッジ画像を生成する。例えば、撮像画像の明るさ（輝度）に基づいて、その明るさが大きく変化する部分をエッジとして検出することで、エッジのみからなるエッジ画像を生成する。なお、撮像画像がカラー画像で、移動物体を人物として特定する場合は、例えば、人物の顔の色（肌色）等を色情報として検出することで、エッジを検出することも可能である。
【００２３】
そして、移動物体検出装置は、対象距離画像生成手段によって、エッジ画像から対象距離の範囲内に存在する対象距離画像を生成する。これによって、輪郭抽出手段が対象距離画像から輪郭を抽出する際に、エッジを検出する動作を省くことができる。
【００２４】
さらに、請求項７に記載の移動物体検出装置は、請求項１乃至請求項６のいずれか１項に記載の移動物体検出装置において、前記輪郭抽出手段で抽出された輪郭の内部領域を、前記移動物体の抽出済領域として、前記距離情報を更新する距離情報更新手段を備えたことを特徴とする。
【００２５】
かかる構成によれば、移動物体検出装置は、距離情報更新手段によって、輪郭抽出手段で抽出された輪郭の内部領域を、すでに移動物体の輪郭を抽出した抽出済領域とすることで、距離情報を更新する。これにより、すでに抽出された移動物体の情報が距離情報から削除されることになるので、別の移動物体を順次検出することが可能になる。
【００２６】
さらに、請求項８に記載の移動物体検出方法は、同期した複数の撮像手段で撮像された撮像画像に基づいて生成された撮像対象までの距離情報と、前記複数の撮像手段の中の少なくとも一つの撮像手段から時系列に入力される撮像画像に基づいて生成された動き情報とにより、前記撮像対象内で動きのある移動物体を検出する移動物体検出方法であって、前記距離情報及び前記動き情報に基づいて、前記移動物体が存在する対象距離を設定する対象距離設定ステップと、前記距離情報に基づいて、前記対象距離設定ステップで設定された対象距離に対応する画素からなる対象距離画像を生成する対象距離画像生成ステップと、前記対象距離画像内に、少なくとも前記対象距離に対応して、前記移動物体を検出する対象となる対象領域を設定する対象領域設定ステップと、この対象領域設定ステップで設定された対象領域から輪郭を抽出することで、前記移動物体を検出する輪郭抽出ステップと、を含んでいることを特徴とする。
【００２７】
この方法によれば、移動物体検出方法は、対象距離設定ステップにおいて、同期した複数の撮像手段で撮像された撮像画像に基づいて生成された撮像対象までの距離情報と、複数の撮像手段の中の少なくとも一つの撮像手段で時系列に入力される撮像画像に基づいて生成された動き情報とにより、最も動き量の多い視差（距離）を特定し、その視差（距離）を対象距離として設定する。
【００２８】
そして、対象距離画像生成ステップにおいて、距離画像（距離情報）から対象距離に対応する画素を抽出して対象距離画像を生成する。例えば、対象距離にある程度の幅（例えば、数十ｃｍ等）を持たせ、その距離に対応する画素を距離画像から抽出する。さらに、対象領域設定ステップにおいて、対象距離画像内に、少なくとも前記対象距離に対応して、移動物体を検出する対象となる対象領域を設定する。これによって、対象距離画像の中で移動物体が存在すると想定される領域を絞り込むことができる。そして、輪郭抽出ステップにおいて、対象距離画像内の対象領域から移動物体の輪郭を抽出することで移動物体を検出する。
【００２９】
また、請求項９に記載の移動物体検出プログラムは、同期した複数の撮像手段で撮像された撮像画像に基づいて生成された撮像対象までの距離情報と、前記複数の撮像手段の中の少なくとも一つの撮像手段から時系列に入力される撮像画像に基づいて生成された動き情報とにより、前記撮像対象内で動きのある移動物体を検出するために、コンピュータを、以下の手段によって機能させる構成とした。
【００３０】
すなわち、前記距離情報及び前記動き情報に基づいて、前記移動物体が存在する対象距離を設定する対象距離設定手段、前記距離情報に基づいて、前記対象距離設定手段で設定された対象距離に対応する画素からなる対象距離画像を生成する対象距離画像生成手段、前記対象距離画像内に、少なくとも前記対象距離に対応して、前記移動物体を検出する対象となる対象領域を設定する対象領域設定手段、この対象領域設定手段で設定された対象領域から輪郭を抽出することで、前記移動物体を検出する輪郭抽出手段、とした。
【００３１】
かかる構成によれば、移動物体検出プログラムは、対象距離設定手段によって、距離情報と動き情報とにより、最も動き量の多い視差（距離）を特定し、その視差（距離）を対象距離として設定する。
【００３２】
そして、対象距離画像生成手段によって、距離画像（距離情報）から対象距離に対応する画素を抽出して対象距離画像を生成し、対象領域設定手段によって、対象距離画像の中で移動物体が存在すると想定される領域を絞り込んだ対象領域を設定する。
そして、輪郭抽出手段によって、対象距離画像内の対象領域から移動物体の輪郭を抽出することで移動物体を検出する。
【００３３】
【発明の実施の形態】
以下、本発明の実施の形態について図面を参照して説明する。
［第一の実施の形態］
（移動物体検出装置の構成）
図１は、本発明における第一の実施の形態である移動物体検出装置１の構成を示したブロック図である。図１に示すように移動物体検出装置１は、２台のカメラ（撮像手段）２で撮像されたカメラ画像（撮像画像）から、動きを伴う物体（移動物体）を検出するものである。ここでは、移動物体検出装置１を、入力されたカメラ画像を解析する入力画像解析手段１０と、解析されたカメラ画像から物体を検出する物体検出手段２０とで構成した。なお、２台のカメラ２は、左右に距離Ｂだけ離れて配置されており、それぞれを右カメラ２ａ及び左カメラ２ｂとする。
【００３４】
入力画像解析手段１０は、撮像対象を撮像した２台のカメラ２（撮像手段：２ａ、２ｂ）から同期して入力されるカメラ画像（撮像画像）を解析して、距離情報を含んだ距離画像と動き情報を含んだ差分画像とを生成するものである。ここでは、入力画像解析手段１０を、距離情報生成部１１と、動き情報生成部１２とで構成した。
【００３５】
距離情報生成部（距離情報生成手段）１１は、同時刻に右カメラ２ａと左カメラ２ｂとで撮影された２枚のカメラ画像の視差を、カメラ２からカメラ２で撮像した撮像対象までの距離情報（より正確には、カメラ２の焦点位置からの距離）として埋め込み、距離画像として生成するものである。
【００３６】
この距離情報生成部１１では、右カメラ２ａを基準カメラ（基準撮像手段）として、この基準カメラ（右カメラ２ａ）で撮像されたカメラ画像（基準撮像画像）と、左カメラ２ｂで撮像されたカメラ画像（同時刻撮像画像）とで、特定の大きさのブロック（例えば１６×１６画素）でブロックマッチングを行うことで、基準撮像画像からの視差を計測する。そして、その視差の大きさ（視差量）を基準撮像画像の各画素に対応付けた距離画像を生成する。
【００３７】
なお、視差をＺとしたとき、この視差Ｚに対応するカメラ２から物体までの距離Ｄ（図示せず）は、カメラ２の焦点距離をｆ（図示せず）、右カメラ２ａと左カメラ２ｂとの距離をＢとすると、（１）式で求めることができる。
【００３８】
Ｄ＝Ｂ×ｆ／Ｚ …（１）
【００３９】
動き情報生成部（動き情報生成手段）１２は、基準カメラ（右カメラ２ａ）で時系列に撮像された２枚のカメラ画像の差分に基づいて、カメラ画像内の移動物体の動きを動き情報として埋め込んだ、差分画像を生成するものである。
【００４０】
この動き情報生成部１２では、右カメラ２ａを基準カメラ（基準撮像手段）として、この基準カメラ（右カメラ２ａ）で時系列（時刻ｔ及び時刻ｔ＋１）に撮像された２枚のカメラ画像の差分をとる。そして、差のあった画素には動きのあった画素として画素値“１”を与え、差のなかった画素には動きのなかった画素として画素値“０”を与えた差分画像を生成する。なお、動き情報生成部１２では、さらに差分画像に対して、メディアンフィルタ等のフィルタリング処理を行うことで、ノイズを除去しておく。
【００４１】
なお、カメラ２を移動カメラとし、撮像されたカメラ画像内の背景が変化する場合は、カメラ２からカメラ画像毎のパン、チルト等のカメラ移動量を入力し、例えば、時刻ｔ＋１のカメラ画像をそのカメラ移動量分補正することで、時刻ｔ及び時刻ｔ＋１において、動きのあった画素のみを検出する。
【００４２】
ここで、図４を参照（適宜図１参照）して、距離情報生成部１１で生成される距離画像、及び動き情報生成部１２で生成される差分画像の内容について説明する。図４は、距離画像ＤＥ及び差分画像ＤＩの画像内容と、各画像の画素値（距離画像画素値ＤＥＢ及び差分画像画素値ＤＩＢ）の一例を示したものである。ここでは、カメラ２から約１ｍ、２ｍ及び３ｍ離れた位置に人物が存在しているものとする。
【００４３】
図４に示したように、距離画像ＤＥは、時刻ｔの右カメラ画像と左カメラ画像との視差を画素値で表現することで生成される。この視差は、その値が大きいほど人物の位置がカメラ２に近いことを表し、値が小さいほど人物の位置がカメラ２から遠いことを表している。例えば、距離画像画素値ＤＥＢに示したように、距離画像ＤＥの画素位置（０，０）は視差が０であり、カメラ２からの距離が無限大（∞）であることを意味している。また、距離画像ＤＥの画素位置（３０，５０）は視差が２０であり、カメラ２からの距離が視差２０に対応する距離、例えば２．２ｍであることを意味している。このように、距離画像ＤＥは、視差を画素値として表現するため、例えば、カメラ２に近いほど明るく、遠いほど暗い画像となる。
【００４４】
また、差分画像ＤＩは、時刻ｔの右カメラ画像と時刻ｔ＋１の右カメラ画像との差分をとり、差のあった画素を画素値“１”、差のなかった画素を画素値“０”として表現することで生成される。この差のあった画素が、実際に人物が動いた領域を表している。例えば、差分画像画素値ＤＩＢに示したように、差分画像ＤＩの画素位置（０，０）は“０”「停止」で、動きがなかったことを意味している。また、差分画像ＤＩの画素位置（３０，５０）は“１”「動き」で、動きがあったことを意味している。
図１に戻って、説明を続ける。
【００４５】
物体検出手段２０は、入力画像解析手段１０で解析された画像（距離画像及び差分画像）に基づいて、動きのある移動物体の領域を検出し、移動物体の輪郭を抽出するものである。ここでは、物体検出手段２０を、対象距離設定部２１と、対象距離画像生成部２２と、対象領域設定部２３と、輪郭抽出部２４と、距離情報更新部２５とで構成した。
【００４６】
対象距離設定部（対象距離設定手段）２１は、入力画像解析手段１０の距離情報生成部１１で生成された距離画像と、動き情報生成部１２で生成された差分画像とに基づいて、最も動き量の多い移動物体を特定し、対象となる移動物体が存在する視差（対象距離）を設定するものである。この対象距離は、対象距離画像生成部２２へ通知される。
【００４７】
この対象距離設定部２１では、距離画像で表された視差（距離）毎に、その視差に対応する画素と同じ位置にある差分画像の画素値を累計し、その累計が最も多くなる視差（最多視差）に、最も動き量の多い移動物体が存在していると判定する。なお、対象距離設定部２１は、距離情報生成部１１で生成された距離画像と、動き情報生成部１２で生成された差分画像とを、図示していないメモリ等の記憶手段に記憶することとする。
【００４８】
対象距離画像生成部（対象距離画像生成手段）２２は、距離情報生成部１１で生成された視差量を埋め込んだ距離画像から、対象距離設定部２１で設定された対象距離に対応する画素を抽出した対象距離画像を生成するものである。
【００４９】
なお、ここでは人物を検出することと仮定して、対象距離（最多視差）±α（数十ｃｍ）分の視差の幅（奥行き）を、最も動き量の多い移動物体が存在する視差の範囲とする。このαの値は、対象距離を基準とした奥行き方向の範囲（所定範囲）であって、検出する対象となる物体の奥行き方向の大きさによって予め定めた値である。
【００５０】
例えば、最多視差におけるカメラ２から移動物体までの距離Ｄを前記（１）式で算出したとすると、その視差の範囲Ｚｒは（１）式を変形することで、（２）式を得る。ただし、カメラ２の焦点距離をｆ、右カメラ２ａと左カメラ２ｂとの距離をＢとする。
【００５１】
Ｂ×ｆ／（Ｄ＋α）＜Ｚｒ＜Ｂ×ｆ／（Ｄ−α） …（２）
【００５２】
この対象距離画像生成部２２では、前記（２）式の範囲の視差に対応する画素を抽出した対象距離画像を生成するものとする。
なお、この対象距離画像の生成は、基準カメラ（右カメラ２ａ）で撮像されたカメラ画像（原画像）から、対象距離（視差の範囲）に対応する画素位置のみの画素を抽出することとしてもよい。
【００５３】
ここで、図５を参照（適宜図１参照）して、対象距離設定部２１及び対象距離画像生成部２２で、検出対象となる移動物体が存在する距離に対応する画像（対象距離画像）を生成する手順について説明する。図５（ａ）は、距離画像ＤＥ及び差分画像ＤＩ（図４）に基づいて、視差（距離）と動きのある画素を累計した動き量（画素数）との関係を示したグラフである。図５（ｂ）は、距離画像ＤＥ（図４）から対象距離の画像のみを抽出した対象距離画像ＴＤＥを示している。
【００５４】
図５（ａ）に示したように、距離画像ＤＥ（図４）の視差（距離）と動き量（画素数）との関係をグラフ化すると、視差（距離）が１ｍ、２．２ｍ、３ｍの位置で動き量がピークとなる。そこで、対象距離設定部２１は、動き量が最大となる視差（２．２ｍ）に移動物体が存在するものとして、２．２ｍを対象距離に設定する。なお、移動物体を人物と仮定すると、カメラ２から２．２±αｍ（α＝０．５ｍ）の範囲に人物が存在すると判定することができる。
【００５５】
そこで、対象距離画像生成部２２は、図５（ｂ）に示したように、距離情報生成部１１で生成された距離画像から、対象距離設定部２１で設定された対象距離±αｍ（２．２±０．５ｍ）に存在する画素を抽出した対象距離画像ＴＤＥを生成する。これによって、カメラ２から１ｍ、３ｍ離れた位置に存在している人物の画像を削除し、２．２±０．５ｍ離れた位置に存在している人物のみを抽出した対象距離画像ＴＤＥを生成することができる。
図１に戻って、説明を続ける。
【００５６】
対象領域設定部（対象領域設定手段）２３は、対象距離画像生成部２２で生成された対象距離画像の垂直方向の画素数を累計し、その垂直方向の画素数の累計が最も多くなる位置（ピーク）を移動物体の中心の水平位置であると特定して、その移動物体を含んだ領域（対象領域）を設定するものである。
【００５７】
より詳しくは、この対象領域設定部２３では、対象距離画像生成部２２で生成された対象距離画像の垂直方向の画素数をカウントすることでヒストグラム化し、そのヒストグラムが最大（ピーク）となる位置を移動物体の中心の水平位置であると特定する。ここでは人物を検出することと仮定して、ヒストグラムが最大となる水平位置を中心に、左右に特定の大きさ（例えば０．５〜０．６（ｍ））の範囲を対象領域の水平方向の存在領域（範囲）として設定する。また、縦方向は特定の大きさ（例えば２（ｍ））を対象領域の高さとする。このとき、対象領域設定部２３は、カメラ２から入力されるチルト角、床（設置面）からの高さ等のカメラパラメータに基づいて、対象領域の垂直方向の存在領域（範囲）を設定する。
【００５８】
なお、このようにヒストグラムが最大となる位置を移動物体の中心と判定することで、同一距離に複数の移動物体（人物等）が存在していても、その中の一つ（一人）を検出することができる。
【００５９】
ここで、図６を参照（適宜図１参照）して、対象領域設定部２３が、対象距離画像ＴＤＥの中から一つ（一人）の移動物体の領域（対象領域）を設定する手順について説明する。図６（ａ）は、対象距離画像生成部２２で生成された対象距離画像ＴＤＥにおける垂直方向の画素数の累計をヒストグラムＨＩで表したものである。図６（ｂ）は、対象距離画像ＴＤＥの中で移動物体を人物として対象領域Ｔを設定した状態を示したものである。なお、図６（ａ）（ｂ）では、ヒストグラムＨＩを対象距離画像ＴＤＥに重畳させているが、これは、説明の都合上重畳させているだけである。
【００６０】
対象領域設定部２３は、図６（ａ）に示したように、対象距離画像ＴＤＥの垂直方向の画素数を累計したヒストグラムＨＩを生成する。このように対象距離画像ＴＤＥをヒストグラム化することで、そのヒストグラムＨＩが最大となる位置に移動物体の中心の水平位置が存在すると判定することが可能になる。例えば、ヒストグラムＨＩを使用せずに対象距離画像ＴＤＥの中で最も高位置に存在する０でない画素位置を、移動物体の中心の水平位置と判定すると、人物が手を上げた場合、その手の先を人物（移動物体）の中心であると判定してしまうことになる。そこで、ここでは、ヒストグラムＨＩを使用することとする。
【００６１】
そして、対象領域設定部２３は、図６（ｂ）に示したように、ヒストグラムＨＩが最大となる水平位置を中心に、左右に特定の大きさ（例えば０．５ｍ）の範囲を対象領域Ｔの水平方向の範囲とする。また、縦方向は特定の大きさ（例えば２ｍ）を対象領域Ｔの垂直方向の範囲とする。
【００６２】
この対象領域Ｔの大きさについては、図７を参照（適宜図１参照）してさらに説明を行う。図７は、カメラ２が移動ロボット（図示せず）に組み込まれ、移動物体Ｍと同じ床からある高さ（カメラ高）Ｈに位置しているときに、移動物体Ｍが対象距離画像（ａ´、ｂ´）上のどの高さに位置するかを説明するための説明図である。なお、図７（ａ）は、カメラ２のチルト角が０（°）の場合、図７（ｂ）はカメラ２のチルト角がθ_Ｔ（≠０）の場合におけるカメラ２と移動物体Ｍとの対応関係を示している。
【００６３】
まず、図７（ａ）を参照して、チルト角が０（°）の場合において、移動物体Ｍが対象距離画像（ａ´）上で縦方向のどの位置に存在するかを特定する方法について説明する。
ここで、カメラ２の垂直画角をθ_ｖ、カメラ２から移動物体Ｍまでの距離をＤ、対象距離画像（ａ´）の縦方向の解像度をＹ、カメラ２の床からの高さ（カメラ高）をＨ、移動物体Ｍの床からの仮想の高さを２（ｍ）とする。このとき、カメラ２の光軸と、カメラ２から移動物体Ｍの仮想の上端（床から２ｍ）までを結んだ直線との角度θ_Ｈは（３）式で表すことができる。
【００６４】
θ_Ｈ＝ｔａｎ^−１（（２−Ｈ）／Ｄ） …（３）
【００６５】
これにより、移動物体Ｍの対象距離画像（ａ´）上での上端ｙ_Ｔは（４）式で求めることができる。
【００６６】
ｙ_Ｔ＝Ｙ／２−θ_Ｈ×Ｙ／θ_ｖ＝Ｙ／２−（Ｙ／θ_ｖ）ｔａｎ^−１（（２−Ｈ）／Ｄ） …（４）
【００６７】
また、カメラ２の光軸と、カメラ２から移動物体Ｍの下端（床）までを結んだ直線との角度θ_Ｌは（５）式で表すことができる。
【００６８】
θ_Ｌ＝ｔａｎ^−１（Ｈ／Ｄ） …（５）
【００６９】
これにより、移動物体Ｍの対象距離画像（ａ´）上での下端ｙ_Ｂは（６）式で求めることができる。
【００７０】
ｙ_Ｂ＝Ｙ／２＋θ_Ｌ×Ｙ／θ_ｖ＝Ｙ／２＋（Ｙ／θ_ｖ）ｔａｎ^−１（Ｈ／Ｄ） …（６）
【００７１】
次に、図７（ｂ）を参照して、チルト角がθ_Ｔ（≠０）の場合において、移動物体Ｍが対象距離画像（ｂ´）上で縦方向のどの位置に存在するかを特定する方法について説明する。
ここで、カメラ２の垂直画角をθ_ｖ、チルト角をθ_Ｔ、移動物体Ｍまでの距離をＤ、対象距離画像の縦方向の解像度をＹ、カメラ２の床からの高さ（カメラ高）をＨ、移動物体Ｍの床からの仮想の高さを２（ｍ）とする。このとき、カメラ２の光軸とカメラ２から移動物体Ｍの仮想の上端（床から２ｍ）までを結んだ直線との角度θ_Ｈと、チルト角θ_Ｔとの差分角度（θ_Ｈ−θ_Ｔ）は（７）式で表すことができる。
【００７２】
θ_Ｈ−θ_Ｔ＝ｔａｎ^−１（（２−Ｈ）／Ｄ） …（７）
【００７３】
これにより、移動物体Ｍの対象距離画像（ｂ´）上での上端ｙ_Ｔは（８）式で求めることができる。
【００７４】
ｙ_Ｔ＝Ｙ／２−θ_Ｔ×Ｙ／θ_ｖ−（θ_Ｈ−θ_Ｔ）×Ｙ／θ_ｖ＝Ｙ／２−θ_Ｔ×Ｙ／θ_ｖ−（Ｙ／θ_ｖ）ｔａｎ^−１（（２−Ｈ）／Ｄ） …（８）
【００７５】
また、カメラ２の光軸とカメラ２から移動物体Ｍの下端（床）までを結んだ直線との角度θ_Ｌと、チルト角θ_Ｔとの加算角度（θ_Ｌ＋θ_Ｔ）は（９）式で表すことができる。
【００７６】
θ_Ｌ＋θ_Ｔ＝ｔａｎ^−１（Ｈ／Ｄ） …（９）
【００７７】
これにより、移動物体Ｍの対象距離画像（ｂ´）上での下端ｙ_Ｂは（１０）式で求めることができる。
【００７８】
ｙ_Ｂ＝Ｙ／２−θ_Ｔ×Ｙ／θ_ｖ＋（θ_Ｌ＋θ_Ｔ）×Ｙ／θ_ｖ＝Ｙ／２−θ_Ｔ×Ｙ／θ_ｖ＋（Ｙ／θ_ｖ）ｔａｎ^−１（Ｈ／Ｄ） …（１０）
【００７９】
このように求めた対象距離画像（ａ´又はｂ´）の上端ｙ_Ｔ及び下端ｙ_Ｂによって、対象領域Ｔ（図６（ｂ））の垂直方向の範囲が決定される。
なお、移動ロボット（図示せず）が階段等を昇降し、移動物体Ｍと同一の床に存在しない場合は、移動ロボット本体のエンコーダ等によって昇降量を検出し、その昇降量を移動物体Ｍの床からの高さに対して加算又は減算することで、移動物体Ｍの対象距離画像（ａ´又はｂ´）における縦方向の位置を特定することができる。あるいは、移動ロボットに地図情報を保持しておき、移動物体Ｍの方向及び距離で特定される床の高さを、その地図情報から取得することとしてもよい。
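The vertical-range calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: angles are in radians, image row 0 is the top row, the optical axis passes through row Y/2, and angles map linearly to rows at Y/θ_v rows per radian (the same proportional mapping that equation (11) uses horizontally). The relations θ_H − θ_T = tan⁻¹((2 − H)/D) and θ_L + θ_T = tan⁻¹(H/D) are equations (7) and (9); setting θ_T = 0 gives equations (3) and (5). The function name, parameter names, and camera values are hypothetical.

```python
import math


def vertical_target_range(distance_m, camera_height_m, tilt_rad,
                          vertical_fov_rad, rows, object_height_m=2.0):
    """Return (y_top, y_bottom) image rows of an object standing on the floor."""
    rows_per_rad = rows / vertical_fov_rad
    # Equation (7): theta_H - theta_T = atan((2 - H) / D)
    theta_h = tilt_rad + math.atan(
        (object_height_m - camera_height_m) / distance_m)
    # Equation (9): theta_L + theta_T = atan(H / D)
    theta_l = math.atan(camera_height_m / distance_m) - tilt_rad
    # Assumed linear angle-to-row mapping around the image center row.
    y_top = rows / 2 - rows_per_rad * theta_h
    y_bottom = rows / 2 + rows_per_rad * theta_l
    return y_top, y_bottom


# Hypothetical setup: camera 1 m high, object 1 m away, 90 degree vertical
# view angle, 480 rows, no tilt.
y_top, y_bottom = vertical_target_range(1.0, 1.0, 0.0, math.pi / 2, 480)
```

The returned rows are not clamped to the image, so a caller would typically limit them to the range 0 to Y before using them as the target area bounds.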
【００８０】
また、対象領域Ｔ（図６（ｂ））の水平方向の範囲は、例えば、図示していないが、カメラ２の水平画角をθ_ｈ、カメラ２から対象とする移動物体Ｍまでの距離をＤ、対象距離画像の横方向の解像度をＸとすると、対象領域の幅の半分（移動物体の中心からの距離）を０．５（ｍ）としたときの、対象距離画像上での水平画素数α_Ｈは、（１１）式で求めることができる。
【００８１】
α_Ｈ＝（Ｘ／θ_ｈ）ｔａｎ^−１（０．５／Ｄ） …（１１）
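Equation (11) can be evaluated directly, as in the short Python sketch below. It assumes the horizontal view angle θ_h and the arctangent are both expressed in radians; the function name and the camera values (640 columns, 60 degree view angle) are hypothetical and only serve the example.

```python
import math


def half_width_in_pixels(distance_m, horizontal_fov_rad, columns,
                         half_width_m=0.5):
    """Equation (11): alpha_H = (X / theta_h) * atan(0.5 / D)."""
    return (columns / horizontal_fov_rad) * math.atan(half_width_m / distance_m)


# Hypothetical camera: 640 columns, 60 degree horizontal view angle,
# target at 2.2 m.
alpha_h = half_width_in_pixels(2.2, math.radians(60), 640)
```

The target area then extends α_H pixels to the left and right of the histogram peak.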
図１に戻って、説明を続ける。
【００８２】
輪郭抽出部（輪郭抽出手段）２４は、対象距離画像生成部２２で生成された対象距離画像において、対象領域設定部２３で設定した移動物体の領域（対象領域）内で、既知の輪郭抽出技術を用いて輪郭の抽出を行うものである。ここで抽出された輪郭（輪郭情報）は、移動物体検出装置１の出力として、外部に出力されるとともに、距離情報更新部２５へ通知される。なお、この輪郭抽出部２４で輪郭が抽出されることで、移動物体が検出されたことになる。
【００８３】
ここで、既知の技術である輪郭抽出の手順の概要を説明する。
まず、対象領域内の画素値の変化に基づいてエッジを検出する。例えば、ある画素の近傍領域の画素に対して重み係数を持つオペレータ（係数行列：Ｓｏｂｅｌオペレータ、Ｋｉｒｓｃｈオペレータ等）を画素毎に乗算することで、エッジの検出を行う。そして、この検出されたエッジに対して、適当な閾値によって２値化を行い、メディアンフィルタ等によって孤立点の除去を行う。このように２値化されたエッジを連結することで、対象領域内から移動物体の輪郭を抽出することができる。なお、エッジから輪郭を抽出する手法として、動的輪郭モデル（ＳＮＡＫＥＳ）を適用することとしてもよい。これによって、例えば、図８に示したように、対象距離画像ＴＤＥの中で移動物体が一つ（一人）に限定された対象領域Ｔ内で輪郭Ｏを抽出することができる。
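The edge detection and binarization steps can be illustrated with the Sobel operator. The Python sketch below is only one possible reading: it uses the standard 3×3 Sobel kernels, approximates the gradient magnitude by |gx| + |gy|, leaves border pixels at 0, and omits the isolated-point removal and edge linking. The function name and the threshold value are assumptions for the example.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold):
    """Return a binary edge image (1 = edge) using 3x3 Sobel kernels."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            # |gx| + |gy| is an assumed approximation of the magnitude.
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges


# A vertical step from 0 to 10 between columns 1 and 2.
image = [[0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edges(image, threshold=20)
```

Both columns adjacent to the step respond, which is typical of a 3×3 gradient operator.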
【００８４】
距離情報更新部（距離情報更新手段）２５は、輪郭抽出部２４で抽出された輪郭（輪郭情報）に基づいて、対象距離設定部２１で記憶手段（図示せず）に記憶した距離画像を更新するものである。例えば、輪郭を含んだ内部領域に対応する距離画像の画素値を“０”にする。これによって、輪郭抽出を完了した移動物体の領域が距離画像から削除されたことになる。なお、距離情報更新部２５は、この距離画像の更新が完了したことを、更新情報として、対象距離設定部２１へ通知する。
【００８５】
例えば、図９に示したように、図８で抽出した輪郭Ｏ内（輪郭Ｏを含んだ内部領域）に対応する距離画像ＤＥの内容（距離画像画素値ＤＥＢ）を更新する。すなわち、輪郭Ｏの領域内における全ての画素値、例えば輪郭Ｏ内の画素位置（３０，５０）等、の視差を０に変更する。このように輪郭Ｏの領域内の視差を０に変更することで、輪郭Ｏとして抽出された移動物体は、カメラ２からの距離が無限大になり、距離画像ＤＥ上には存在しなくなる。
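The distance image update can be sketched in a few lines of Python. The representation of the extracted region as a per-pixel mask (`inside_contour`) and the function name are assumptions made for this example; the description only states that the parallax inside the contour is set to 0.

```python
def erase_extracted_region(parallax_image, inside_contour):
    """Return a copy with parallax 0 wherever inside_contour is truthy."""
    return [[0 if inside else z for z, inside in zip(p_row, c_row)]
            for p_row, c_row in zip(parallax_image, inside_contour)]


parallax = [[20, 20, 10],
            [20, 10, 0]]
inside = [[1, 1, 0],
          [1, 0, 0]]
updated = erase_extracted_region(parallax, inside)
```

Because a parallax of 0 corresponds to infinite distance, the erased pixels no longer contribute when the next target distance is chosen.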
【００８６】
以上、第一の実施の形態である移動物体検出装置１の構成について説明したが、移動物体検出装置１は、コンピュータにおいて各手段を各機能プログラムとして実現することも可能であり、各機能プログラムを結合して移動物体検出プログラムとして動作させることも可能である。
【００８７】
また、ここでは、移動物体検出装置１の距離情報生成部１１が、２台のカメラ２で撮像したカメラ画像に基づいて距離画像を生成したが、３台以上のカメラを用いて距離画像を生成することとしてもよい。例えば、３行３列に配置した９台のカメラで、中央に配置したカメラを基準カメラとして、他のカメラとの視差に基づいて距離画像を生成することで、移動物体までの距離をより正確に測定することもできる。
【００８８】
また、この移動物体検出装置１を、移動ロボット、自動車等の移動体に組み込んで、人物等の物体を検出するために用いることも可能である。例えば、移動ロボットに本発明を適用することで、移動ロボットが、人込みにおいても人物を認識することが可能になる。さらに、人物を個別に検出することができるので、例えば、顔認識等を行うことで、その人物を追跡したり、人物毎に異なる動作を行わせる等の輪郭抽出後の処理が容易になる。
【００８９】
（移動物体検出装置１の動作）
次に、図１乃至図３を参照して、移動物体検出装置１の動作について説明する。図２及び図３は、移動物体検出装置１の動作を示すフローチャートである。
【００９０】
＜カメラ画像入力ステップ＞
まず、移動物体検出装置１は、同期した２台のカメラ２から時系列にカメラ画像を入力する（ステップＳ１）。なお、ここでは、ある時刻ｔに右カメラ２ａ（基準カメラ）と左カメラ２ｂとから入力されたカメラ画像と、次の時刻ｔ＋１（例えば、１フレーム後）に右カメラ２ａ（基準カメラ）から入力されたカメラ画像とに基づいて、移動物体の輪郭を抽出するものとする。
【００９１】
＜距離画像生成ステップ＞
そして、移動物体検出装置１は、距離情報生成部１１によって、時刻ｔに右カメラ２ａ（基準カメラ）と左カメラ２ｂとから入力された２枚のカメラ画像から、撮像対象までの視差（距離）を埋め込んだ距離画像を生成する（ステップＳ２）。
【００９２】
＜差分画像生成ステップ＞
さらに、移動物体検出装置１は、動き情報生成部１２によって、右カメラ２ａ（基準カメラ）で時刻ｔと時刻ｔ＋１に撮像された２枚のカメラ画像（基準カメラ画像）の差分をとり、差のあった画素を画素値“１”、差のなかった画素を画素値“０”とした差分画像を生成する（ステップＳ３）。
【００９３】
＜対象距離設定ステップ＞
また、移動物体検出装置１は、対象距離設定部２１によって、ステップＳ２及びステップＳ３で生成した距離画像及び差分画像から、距離画像で表された視差（距離）毎に、動きのあった画素数を累計する（ステップＳ４）。例えば、距離画像から、ある視差（距離）の画素のみを抽出し、この抽出された画素と対応する差分画像の画素の画素値を累計する。そして、この動き（差分）のある画素数の累計が最大となる距離を、検出する移動物体の対象距離として設定する（ステップＳ５）。
【００９４】
＜対象距離画像生成ステップ＞
そして、移動物体検出装置１は、対象距離画像生成部２２によって、距離画像から対象距離±αに対応する画素を抽出した対象距離画像を生成する（ステップＳ６）。なお、ここでは人物を検出することと仮定して、αを数十ｃｍとする。
【００９５】
＜対象領域設定ステップ＞
そして、移動物体検出装置１は、対象領域設定部２３によって、ステップＳ６で生成した対象距離画像の垂直方向（縦方向）の画素数をヒストグラム化することで計測する（ステップＳ７）。そして、このヒストグラムが最大（ピーク）となる水平位置を中心に、左右に特定の大きさ（例えば０．５〜０．６（ｍ））の範囲を対象領域の水平方向の範囲として設定する（ステップＳ８）。
さらに、対象領域設定部２３では、カメラ２から入力されるチルト角、床（設置面）からの高さ等のカメラパラメータに基づいて、対象領域の垂直（上下）方向の範囲を設定する（ステップＳ９）。
【００９６】
例えば、カメラ２のチルト角、床からの高さに基づいて、対象距離画像における画像中の床の位置（対象領域の下端）を求める。そして、カメラ２の画角と移動物体までの距離とに基づいて、床から２ｍまでの範囲を、画素数に換算することにより対象領域の対象距離画像における床からの画素数を求める。これによって、対象距離画像における対象領域の上端を求めることができる。この対象領域の上端は、カメラ２のチルト角、床からの高さに基づいて、対象距離画像における画像中の２ｍの位置（高さ）を直接求めることとしてもよい。なお、この２ｍは、一例であって、他の長さ（高さ）であっても構わない。
【００９７】
＜輪郭抽出ステップ＞
また、移動物体検出装置１は、輪郭抽出部２４によって、ステップＳ６で生成した対象距離画像において、ステップＳ８及びステップＳ９で設定した対象領域内で輪郭の抽出を行う（ステップＳ１０）。例えば、対象領域内でエッジを検出し、そのエッジに対して動的輪郭モデル（ＳＮＡＫＥＳ）を適用することによって輪郭の抽出を行う。
【００９８】
そして、輪郭の抽出に成功したかどうかを判定する（ステップＳ１１）。なお、ここで輪郭抽出の成功及び失敗の判定は、ステップＳ１０において輪郭が抽出できたかどうかの判定だけではなく、例えば、対象距離が予め定めた距離よりも遠い場合や、対象領域が予め定めた大きさよりも小さい場合、さらには、すべての物体の輪郭抽出を完了した等の理由によって、物体の輪郭抽出を行わないとする判定をも含むものとする。
このステップＳ１１で輪郭の抽出に成功した場合（Ｙｅｓ）は、ステップＳ１２へ進む。一方、輪郭の抽出に失敗した（あるいは抽出を行わない）場合（Ｎｏ）は、本動作を終了する。
【００９９】
＜距離情報更新ステップ＞
そして、移動物体検出装置１は、距離情報更新部２５によって、ステップＳ１０で抽出した輪郭内（輪郭を含んだ内部領域）に対応する距離画像を更新する（ステップＳ１２）。例えば、輪郭を含んだ内部領域に対応する距離画像の画素値を“０”にする。これによって、すでに抽出を終わった移動物体の領域が距離画像から削除されることになる。そして、ステップＳ４へ戻って、処理を継続する。
【０１００】
以上の各ステップによって、本実施の形態の移動物体検出装置１によれば、カメラ２から入力されたカメラ画像から、そのカメラ画像に存在する移動物体を検出することができる。なお、ここでは、ある時刻ｔ（ｔ＋１）において移動物体の輪郭を抽出したが、時々刻々と入力されるカメラ画像に基づいて、前記ステップ（ステップＳ１〜ステップＳ１２）を動作させることで、例えば、移動ロボット等の移動体が、人物を検出し続けることができる。
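The repetition of steps S4 to S12 can be pictured as a loop that picks the parallax with the most motion, extracts an object there, erases it from the distance image, and starts again. The Python sketch below is a deliberately simplified illustration: in place of the target area setting and contour extraction it simply collects every pixel at exactly the chosen parallax, and it stops when no moving pixel remains. The function name, the `max_objects` guard, and the stopping rule are assumptions, not part of the description.

```python
from collections import Counter


def detect_all(parallax_image, difference_image, max_objects=10):
    """Return a list of (parallax, pixels) found by repeated extraction."""
    parallax = [row[:] for row in parallax_image]  # work on a copy
    found = []
    for _ in range(max_objects):
        # Steps S4-S5: count moving pixels per parallax, pick the largest.
        votes = Counter(
            z
            for p_row, m_row in zip(parallax, difference_image)
            for z, moved in zip(p_row, m_row)
            if z > 0 and moved
        )
        if not votes:
            break
        target = votes.most_common(1)[0][0]
        # Stand-in for steps S6-S10: take every pixel at the target parallax.
        pixels = [(x, y)
                  for y, row in enumerate(parallax)
                  for x, z in enumerate(row) if z == target]
        found.append((target, pixels))
        # Step S12: erase the extracted region from the distance image.
        for x, y in pixels:
            parallax[y][x] = 0
    return found


parallax = [[20, 20, 10],
            [20, 10, 0]]
motion = [[1, 1, 1],
          [1, 0, 1]]
found = detect_all(parallax, motion)
```

Each pass removes one object, so objects are reported in decreasing order of motion amount.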
【０１０１】
［第二の実施の形態］
（移動物体検出装置の構成）
次に、図１０を参照して、本発明における第二の実施の形態である移動物体検出装置１Ｂの構成について説明する。図１０は、移動物体検出装置１Ｂの構成を示したブロック図である。図１０に示すように移動物体検出装置１Ｂは、２台のカメラ（撮像手段）２から撮像されたカメラ画像（撮像画像）から、動きを伴う物体（移動物体）を検出するものである。
【０１０２】
ここでは、移動物体検出装置１Ｂを、距離情報生成部１１、動き情報生成部１２及びエッジ画像生成部１３からなる入力画像解析手段１０Ｂと、対象距離設定部２１、対象距離画像生成部２２Ｂ、対象領域設定部２３、輪郭抽出部２４Ｂ及び距離情報更新部２５からなる物体検出手段２０Ｂとで構成した。なお、エッジ画像生成部１３、対象距離画像生成部２２Ｂ及び輪郭抽出部２４Ｂ以外の構成は、図１に示したものと同一であるので、同一の符号を付し、説明を省略する。
【０１０３】
エッジ画像生成部（エッジ画像生成手段）１３は、カメラ２（２ａ）から距離情報生成部１１と動き情報生成部１２とに入力される同時刻のカメラ画像（基準撮像画像）を入力し、そのカメラ画像からエッジを抽出したエッジ画像を生成するものである。このエッジ画像生成部１３では、カメラ２（２ａ）から入力されたカメラ画像の明るさ（輝度：濃淡情報）に基づいて、その明るさが大きく変化する部分をエッジとして検出し、そのエッジのみからなるエッジ画像を生成する。例えば、ある画素の近傍領域の画素に対して重み係数を持つオペレータ（係数行列：Ｓｏｂｅｌオペレータ、Ｋｉｒｓｃｈオペレータ等）を画素毎に乗算することで、エッジの検出を行う。
【０１０４】
すなわち、入力画像解析手段１０Ｂでは、図１３に示すように、時刻ｔの右カメラ画像と左カメラ画像との視差を画素値で表現した距離画像ＤＥと、時刻ｔの右カメラ画像からエッジを抽出したエッジ画像ＥＤと、時刻ｔの右カメラ画像と時刻ｔ＋１の右カメラ画像との差分をとり、差のあった画素を画素値“１”、差のなかった画素を画素値“０”として表現した差分画像ＤＩとが生成されることになる。
なお、エッジ画像生成部１３では、カメラ画像がカラー画像で、移動物体を人物として特定する場合は、例えば、人物の顔の色（肌色）等を色情報として検出することで、エッジを検出することも可能である。
【０１０５】
対象距離画像生成部（対象距離画像生成手段）２２Ｂは、対象距離設定部２１で設定された対象距離に対応する画素からなる対象距離画像を生成するものである。この対象距離画像生成部２２Ｂでは、まず、距離情報生成部１１で生成された視差量を埋め込んだ距離画像から、対象距離設定部２１から通知される対象距離±α（このαは、人物を検出することと仮定した場合、数十ｃｍ）に対応する画素位置を求める。そして、その画素位置に対応する画素のみをエッジ画像生成部１３で生成されたエッジ画像から抽出し、対象距離画像を生成する。すなわち、この対象距離画像は、対象距離に存在する移動物体をエッジで表現した画像になる。
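In the second embodiment the parallax range selects pixel positions, but the pixel values are taken from the edge image. A minimal Python sketch of that masking step follows; the function name and the example bounds (16.3 and 25.9, roughly what equation (2) gives for 2.2 m ± 0.5 m if B × f were 44) are assumptions for illustration.

```python
def edge_target_distance_image(edge_image, parallax_image, lower, upper):
    """Keep edge pixels whose parallax lies strictly between lower and upper."""
    return [[e if lower < z < upper else 0 for e, z in zip(e_row, p_row)]
            for e_row, p_row in zip(edge_image, parallax_image)]


edges = [[1, 1, 0],
         [0, 1, 1]]
parallax = [[44, 20, 20],
            [15, 18, 0]]
tde = edge_target_distance_image(edges, parallax, 16.3, 25.9)
```

The result contains only the edges of objects at the target distance, so the contour extraction can start from it without running edge detection again.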
【０１０６】
輪郭抽出部（輪郭抽出手段）２４Ｂは、対象距離画像生成部２２Ｂで生成された対象距離画像において、対象領域設定部２３で設定した移動物体の領域（対象領域）内で輪郭の抽出を行うものである。ここで抽出された輪郭（輪郭情報）は、移動物体検出装置１Ｂの出力として、外部に出力されるとともに、距離情報更新部２５へ通知される。この輪郭抽出部２４Ｂで輪郭が抽出されることで、移動物体が検出されたことになる。
【０１０７】
なお、この輪郭抽出部２４Ｂでは、対象距離画像生成部２２Ｂで生成された対象距離画像が、すでにエッジで表現されているため、そのエッジから動的輪郭モデル（ＳＮＡＫＥＳ）等によって輪郭を抽出する。すなわち、輪郭抽出部２４Ｂでは、輪郭抽出部２４（図１）で行ったエッジ検出を省略することができる。
【０１０８】
以上、第二の実施の形態である移動物体検出装置１Ｂの構成について説明したが、移動物体検出装置１Ｂは、コンピュータにおいて各手段を各機能プログラムとして実現することも可能であり、各機能プログラムを結合して移動物体検出プログラムとして動作させることも可能である。
【０１０９】
また、移動物体検出装置１Ｂは、距離情報生成部１１において、３台以上のカメラを用いて距離画像を生成することとしてもよい。この場合、動き情報生成部１２及びエッジ画像生成部１３は、基準となるカメラから入力されるカメラ画像に基づいて、差分画像及びエッジ画像を生成することとする。
さらに、移動物体検出装置１Ｂは、移動ロボット、自動車等の移動体に組み込んで、人物等の物体を検出するために用いることも可能である。
【０１１０】
（移動物体検出装置１Ｂの動作）
次に、図１０、図１１及び図１２を参照して、移動物体検出装置１Ｂの動作について簡単に説明する。図１１及び図１２は、移動物体検出装置１Ｂの動作を示すフローチャートである。
【０１１１】
まず、移動物体検出装置１Ｂは、同期した２台のカメラ２から時系列にカメラ画像を入力する（ステップＳ２１）。そして、距離情報生成部１１によって、時刻ｔに右カメラ２ａ（基準カメラ）と左カメラ２ｂとから入力された２枚のカメラ画像から、撮像対象までの視差（距離）を埋め込んだ距離画像を生成する（ステップＳ２２）。さらに、動き情報生成部１２によって、右カメラ２ａ（基準カメラ）で時刻ｔと時刻ｔ＋１に撮像された２枚のカメラ画像（基準カメラ画像）の差分をとり、差のあった画素を画素値“１”、差のなかった画素を画素値“０”とした差分画像を生成する（ステップＳ２３）。そして、エッジ画像生成部１３によって、右カメラ２ａ（基準カメラ）で時刻ｔに撮像されたカメラ画像（基準カメラ画像）からエッジを抽出したエッジ画像を生成する（ステップＳ２４）。
【０１１２】
そして、移動物体検出装置１Ｂは、対象距離設定部２１によって、ステップＳ２２及びステップＳ２３で生成した距離画像及び差分画像から、距離画像で表された視差（距離）毎に、その視差に対応する画素と同じ位置にある差分画像の画素値を累計する（ステップＳ２５）。そして、この動き（差分）のある画素数（画素値の累計）が最大となる距離を、検出する移動物体の対象距離として設定する（ステップＳ２６）。そして、対象距離画像生成部２２Ｂによって、エッジ画像から対象距離±αに対応する画素を抽出した対象距離画像を生成する（ステップＳ２７）。なお、ここでは人物を検出することと仮定して、αを数十ｃｍとする。
【０１１３】
そして、移動物体検出装置１Ｂは、対象領域設定部２３によって、ステップＳ２７で生成した対象距離画像の垂直方向（縦方向）の画素値をヒストグラム化することで計測する（ステップＳ２８）。そして、このヒストグラムが最大となる水平位置を中心に、左右に特定の大きさ（例えば０．５〜０．６（ｍ））の範囲を対象領域の水平方向の範囲として設定する（ステップＳ２９）。さらに、カメラ２から入力されるチルト角、床（設置面）からの高さ等のカメラパラメータに基づいて、対象領域の垂直方向の範囲を設定する（ステップＳ３０）。
【０１１４】
また、移動物体検出装置１Ｂは、輪郭抽出部２４Ｂによって、ステップＳ２７で生成した対象距離画像において、ステップＳ２９及びステップＳ３０で設定した対象領域内で輪郭の抽出を行い（ステップＳ３１）、輪郭の抽出に成功したかどうかを判定する（ステップＳ３２）。このステップＳ３２で輪郭の抽出に成功した場合（Ｙｅｓ）は、ステップＳ３３へ進む。一方、輪郭の抽出に失敗した（あるいは抽出を行わない）場合（Ｎｏ）は、本動作を終了する。
【０１１５】
そして、移動物体検出装置１Ｂは、距離情報更新部２５によって、ステップＳ３１で抽出した輪郭内（輪郭を含んだ内部領域）に対応する画素位置を更新情報として生成し、対象距離設定部２１が、その更新情報に基づいて、距離画像の情報を削除する（ステップＳ３３）。これによって、すでに抽出を終わった移動物体の領域が距離画像から削除されることになる。そして、ステップＳ２５へ戻って、処理を継続する。
【０１１６】
以上の各ステップによって、本実施の形態の移動物体検出装置１Ｂによれば、カメラ２から入力されたカメラ画像から、そのカメラ画像に存在する移動物体を検出することができる。なお、移動物体検出装置１Ｂでは、ステップＳ２４でエッジ画像を生成し、ステップＳ３１における輪郭の抽出には、すでにエッジを検出した対象距離画像を用いるため、同じ距離に複数の移動物体（人物等）が並んで存在している場合でも高速に輪郭の抽出を行うことが可能になる。
【０１１７】
【発明の効果】
以上説明したとおり、本発明に係る移動物体検出装置、移動物体検出方法及び移動物体検出プログラムでは、以下に示す優れた効果を奏する。
【０１１８】
本発明によれば、複数のカメラで撮像されたカメラ画像から生成される距離画像（距離情報）と、時系列に入力されるカメラ画像から生成される差分画像（動き情報）とに基づいて、動きのある移動物体のカメラからの距離を特定し、その距離のみに着目した画像（対象距離画像）を生成することができる。これによって、カメラ画像上では繋がっている移動物体（例えば、人物等）を、距離によって識別し分離することで、別の移動物体として検出することが可能になる。
【０１１９】
また、本発明によれば、対象距離画像における移動物体の垂直方向の画素量に基づいて、移動物体の水平方向の範囲を絞り込むことができるため、同じ距離に横並びに存在する複数の移動物体を分離して、別の移動物体として検出することが可能になる。
【０１２０】
さらに、本発明によれば、カメラのチルト角や、床からの高さに基づいて、対象距離画像における移動物体の垂直方向の範囲を絞り込むことができるため、輪郭抽出にかかる計算量を抑え、移動物体の検出にかかる処理速度を早めることができる。
【０１２１】
また、本発明によれば、予めカメラ画像からエッジを抽出したエッジ画像を生成しておくため、個々の移動物体の領域（対象領域）に対する輪郭抽出時にエッジを検出する必要がない。このため、移動物体がカメラ画像上に複数繋がって存在する場合であっても、重複した領域でエッジの抽出を行わないため、高速に移動物体を検出することが可能になる。
【図面の簡単な説明】
【図１】本発明の第一の実施の形態である移動物体検出装置の全体構成を示すブロック図である。
【図２】本発明の第一の実施の形態である移動物体検出装置の動作を示すフローチャート（１／２）である。
【図３】本発明の第一の実施の形態である移動物体検出装置の動作を示すフローチャート（２／２）である。
【図４】距離画像及び差分画像の内容の一例を示す図である。
【図５】視差（距離）毎の動き量（画素値）に基づいて、対象距離画像を生成するための手順を説明するための説明図である。
【図６】ヒストグラムに基づいて、対象領域を設定する手順を説明するための説明図である。
【図７】カメラパラメータに基づいて、移動物体が対象距離画像上のどの高さに位置するかを算出する手順を説明するための説明図である。
【図８】対象距離画像の対象領域で輪郭を抽出した例を示す図である。
【図９】輪郭を抽出した移動物体の領域に基づいて、距離画像の内容を更新した例を示す図である。
【図１０】本発明の第二の実施の形態である移動物体検出装置の全体構成を示すブロック図である。
【図１１】本発明の第二の実施の形態である移動物体検出装置の動作を示すフローチャート（１／２）である。
【図１２】本発明の第二の実施の形態である移動物体検出装置の動作を示すフローチャート（２／２）である。
【図１３】距離画像、差分画像及びエッジ画像の内容の一例を示す図である。
【符号の説明】
１、１Ｂ …… 移動物体検出装置
１０、１０Ｂ…… 入力画像解析手段
１１ …… 距離情報生成部（距離情報生成手段）
１２ …… 動き情報生成部（動き情報生成手段）
１３ …… エッジ画像生成部（エッジ画像生成手段）
２０、２０Ｂ…… 物体検出手段
２１ …… 対象距離設定部（対象距離設定手段）
２２、２２Ｂ…… 対象距離画像生成部（対象距離画像生成手段）
２３ …… 対象領域設定部（対象領域設定手段）
２４、２４Ｂ…… 輪郭抽出部（輪郭抽出手段）
２５ …… 距離情報更新部（距離情報更新手段）
[0001]
[Technical field of the invention]
The present invention relates to a moving object detection device, a moving object detection method, and a moving object detection program that detect, from an image captured by a camera, a moving object present in that image.
[0002]
[Prior art]
Conventionally, one technique for detecting an object in an image captured by a camera such as a CCD camera sets an initial, rough contour of the object in the image as a contour model, and extracts the contour of the object by shrinking and deforming that contour model according to predetermined rules, thereby detecting the object (active contour model: SNAKES). In object detection based on this contour extraction, the edges of an object in motion (a moving object) are detected from temporally consecutive images, and the contour of the moving object is extracted by linking the contour model to those edges, thereby detecting the moving object (see, for example, Patent Document 1).
[0003]
As a technique for detecting a moving object from images captured by a moving camera, there is also a technique that analyzes the motion of the moving camera from the luminance information of temporally consecutive images, assumes that this motion is the motion of the background, detects the region of the moving object based on the difference between consecutive images and the motion of the background, and extracts that region as a contour (see, for example, Non-Patent Document 1).
[0004]
[Patent Document 1]
JP-A-8-329254 (page 7, FIGS. 9 and 10)
[Non-patent document 1]
Matsuoka, Araki, Yamazawa, Takemura, Yokoya, "Extraction and Tracking of Moving Object Contour from Moving Camera Image and Real-Time Processing by DSP", The Institute of Electronics, Information and Communication Engineers, IEICE Technical Report, PRMU97-235, 1998
[0005]
[Problems to be solved by the invention]
However, the first conventional technique, which extracts the contour of a moving object by linking a contour model to edges detected from consecutive images and thereby detects the object, has the problem that when a plurality of objects are adjacent to one another in the captured image, they are recognized as a single object.
[0006]
The second conventional technique, which detects a moving object with a moving camera, processes the entire image captured by the moving camera as the target region for contour extraction. The amount of computation is therefore large, and a high-speed arithmetic unit is required to extract the contour of the moving object successively in real time. In addition, as in the first example, when a plurality of objects are adjacent to one another in the captured image, they are recognized as a single object.
[0007]
The present invention has been made in view of the above problems. Its object is to provide a moving object detection device, a moving object detection method, and a moving object detection program that reduce the computation needed to extract the contour of a moving object even for images captured by a moving camera, and that can detect objects individually even when a plurality of objects are adjacent to one another in the captured image.
[0008]
[Means for Solving the Problems]
The present invention has been devised to achieve the above object. First, the moving object detection device according to claim 1 is a moving object detection device that detects a moving object present in an imaging target from a plurality of captured images of the imaging target taken by a plurality of synchronized imaging means, the device comprising: distance information generating means for generating the distance to the imaging target as distance information, based on the parallax between the plurality of captured images; motion information generating means for generating the motion of the moving object as motion information, based on the difference between captured images input in time series from at least one of the plurality of imaging means; target distance setting means for setting a target distance at which the moving object is present, based on the distance information and the motion information; target distance image generating means for generating, based on the distance information, a target distance image composed of pixels corresponding to the target distance set by the target distance setting means; target area setting means for setting, in the target distance image, a target area in which the moving object is to be detected, corresponding at least to the target distance; and contour extracting means for detecting the moving object by extracting a contour from the target area set by the target area setting means.
[0009]
With this configuration, the moving object detection device uses the distance information generating means to generate the distance to the imaging target as distance information, based on the parallax between the plurality of captured images. For example, for each pixel at which parallax is detected between the captured images, the magnitude of the parallax (the parallax amount) is embedded as the parallax (distance) to the imaging target, producing a distance image (distance information).
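As a hedged illustration of how a parallax value stored in the distance image relates to metric distance, the following Python sketch applies equation (1) of the description, D = B × f / Z. The function names and the calibration values (baseline B = 0.1 m, focal length f = 440 pixels) are assumptions made for this example, chosen so that a parallax of 20 maps to 2.2 m as in the description's sample distance image; they are not values from the patent.

```python
import math


def parallax_to_distance(z: float, baseline_m: float, focal_px: float) -> float:
    """Equation (1): D = B * f / Z. A parallax of 0 means infinite distance."""
    if z <= 0:
        return math.inf
    return baseline_m * focal_px / z


def distance_image_in_meters(parallax_image, baseline_m, focal_px):
    """Convert every pixel of a parallax (distance) image to meters."""
    return [[parallax_to_distance(z, baseline_m, focal_px) for z in row]
            for row in parallax_image]


# Hypothetical calibration: B = 0.1 m, f = 440 px (so B * f is about 44).
meters = distance_image_in_meters([[0, 20], [44, 11]], 0.1, 440.0)
```

Larger parallax values correspond to nearer objects, which is why the distance image appears brighter close to the camera.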
[0010]
The moving object detection device also uses the motion information generating means to generate the motion of the moving object as motion information, based on the difference between captured images input in time series from at least one of the plurality of imaging means. For example, it takes the difference between two captured images input in time series and generates, as the motion information of the moving object, a difference image in which every pixel whose value is not "0" is set to "1".
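The binary difference image can be sketched in Python as follows. The function name and the optional `threshold` parameter are assumptions for this example; with the default `threshold=0` the function follows the rule in the text, where any nonzero difference becomes 1.

```python
def difference_image(frame_t, frame_t1, threshold=0):
    """Return a binary image: 1 where the two frames differ, 0 elsewhere.

    threshold=0 reproduces the rule in the text (any nonzero difference
    becomes 1); a larger threshold is an assumed extension for noisy input.
    """
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_t, row_t1)]
            for row_t, row_t1 in zip(frame_t, frame_t1)]


frame_t = [[10, 10, 10], [10, 50, 10]]
frame_t1 = [[10, 10, 12], [10, 10, 10]]
motion = difference_image(frame_t, frame_t1)
```

Frames are assumed to be equally sized grayscale images given as lists of rows.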
[0011]
The moving object detection device then uses the target distance setting means to identify, from the distance information and the motion information, the parallax (distance) with the largest amount of motion, and sets that parallax (distance) as the target distance.
[0012]
The moving object detection device also uses the target distance image generating means to extract the pixels corresponding to the target distance from the distance image (distance information) and generate a target distance image. For example, the target distance is given a certain width (for example, several tens of centimeters), and the pixels corresponding to that range of distances are extracted from the distance image. The target area setting means then sets, in the target distance image, a target area in which the moving object is to be detected, corresponding at least to the target distance. For example, in the target distance image generated from the pixels corresponding to the target distance, the region in which pixels exist is taken as the target area. This narrows down the region of the target distance image in which a moving object is assumed to be present. Finally, the contour extracting means extracts the contour of the moving object from the target area in the target distance image, thereby detecting the moving object.
[0013]
The moving object detection device according to claim 2 is the moving object detection device according to claim 1, wherein the target distance setting means obtains, for each distance, the cumulative value of the pixels in which motion occurred, and sets the target distance at which the moving object is present based on that cumulative value.
[0014]
With this configuration, the moving object detection device uses the target distance setting means to accumulate (build a histogram of), for each parallax (distance) contained in the distance information, the values of the pixels with motion contained in the motion information. It determines that the moving object with the largest amount of motion is present at the parallax (distance) with the largest cumulative value, and sets that parallax (distance) as the target distance. Because the target distance can be set by the simple operation of accumulating pixels, processing can be sped up.
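The accumulation step might look like the following Python sketch, which counts moving pixels per parallax value and returns the parallax with the highest count. The function name is hypothetical, and skipping pixels with parallax 0 (infinitely far) is an assumption of this example rather than something the text states.

```python
from collections import Counter


def set_target_parallax(parallax_image, difference_image):
    """Return the parallax with the most moving pixels, or None if none moved."""
    votes = Counter()
    for p_row, m_row in zip(parallax_image, difference_image):
        for z, moved in zip(p_row, m_row):
            # Assumption: parallax 0 (infinitely far) is never a candidate.
            if z > 0 and moved:
                votes[z] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


parallax = [[20, 20, 10],
            [20, 10, 0]]
motion = [[1, 1, 1],
          [0, 0, 1]]
target = set_target_parallax(parallax, motion)
```

A single pass over the two images is enough, which reflects the low cost of this step.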
[0015]
Furthermore, the moving object detection device according to claim 3 is the moving object detection device according to claim 1 or 2, wherein the target distance image generation means generates a target distance image composed of pixels existing within a predetermined range in the depth direction based at least on the target distance.
[0016]
According to this configuration, the moving object detection device uses the target distance image generation means to generate a target distance image by extracting, for example, only the pixels existing within a predetermined range in the depth direction (front-back direction) with respect to the target distance. Thus, even if a plurality of moving objects exist in the same direction, a target distance image in which the moving object existing at the target distance is specified can be generated.
[0017]
According to a fourth aspect of the present invention, in the moving object detecting apparatus according to any one of the first to third aspects, the target area setting means sets the target area within a predetermined range in the horizontal direction from the peak of the pixel amount, based on the vertical pixel amount in the target distance image.
[0018]
According to this configuration, when setting the target area where the moving object exists, the moving object detection device uses the target area setting means to specify the horizontal position of the moving object based on the vertical pixel amount of the moving object in the target distance image. For example, the position (peak) where the vertical pixel amount of the moving object is largest is taken as the horizontal center of the moving object, and a predetermined range from that center is set as the area where the moving object exists. Thus, even when a plurality of moving objects exist at the same distance, one of them can be detected.
[0019]
Furthermore, the moving object detection device according to claim 5 is the moving object detection device according to any one of claims 1 to 4, wherein the target area setting means sets the vertical range of the target area based on at least the tilt angle of the imaging means and its height from the installation surface.
[0020]
According to this configuration, when setting the target area where the moving object exists, the moving object detection device uses the target area setting means to set the vertical range of the area in which the moving object exists, based on camera parameters such as the tilt angle of the camera serving as the imaging means and its height from the reference installation surface. For example, by fixing the height of the moving object to a specific size (for example, 2 m for a person), the range of the target distance image in which the moving object is located can be identified from that size and the camera parameters.
[0021]
A moving object detection device according to a sixth aspect is the moving object detection device according to any one of the first to fifth aspects, further comprising edge image generating means for generating an edge image by extracting edges of the captured image based on color information or shading information of each pixel of the captured image, wherein the target distance image generating means generates the target distance image by extracting the pixels of the edge image corresponding to the target distance based on the distance information.
[0022]
According to this configuration, the moving object detection device uses the edge image generation means to generate an edge image by extracting the edges of the captured image from its color information or shading information. For example, portions where the brightness (luminance) of the captured image changes sharply are detected as edges, producing an edge image consisting only of edges. When the captured image is a color image and the moving object is specified as a person, edges can also be detected by using, for example, the color of the person's face (skin color) as the color information.
[0023]
Then, the moving object detection device uses the target distance image generating means to generate, from the edge image, a target distance image composed of the pixels within the target distance range. This makes it possible to omit the edge detection operation when the contour extracting means extracts a contour from the target distance image.
[0024]
Further, the moving object detecting device according to claim 7 is the moving object detecting device according to any one of claims 1 to 6, further comprising distance information updating means for updating the distance information, treating the internal region of the contour extracted by the contour extracting means as an already-extracted region of the moving object.
[0025]
According to this configuration, the moving object detection device uses the distance information updating means to treat the internal region of the contour extracted by the contour extracting means as an extracted region, that is, a region in which the contour of a moving object has already been extracted, and updates the distance information accordingly. As a result, the information of the moving object that has already been extracted is deleted from the distance information, so that other moving objects can be detected one after another.
[0026]
Further, the moving object detection method according to claim 8 is a moving object detection method for detecting a moving object that is moving within an imaging target, based on distance information to the imaging target generated from captured images captured by a plurality of synchronized imaging means and on motion information generated from captured images input in time series from at least one of the plurality of imaging means, the method comprising: a target distance setting step of setting, based on the distance information and the motion information, a target distance at which the moving object exists; a target distance image generating step of generating, based on the distance information, a target distance image composed of pixels corresponding to the target distance set in the target distance setting step; a target area setting step of setting, in the target distance image, a target area in which the moving object is to be detected, corresponding to at least the target distance; and a contour extracting step of detecting the moving object by extracting a contour from the target area set in the target area setting step.
[0027]
According to this method, in the target distance setting step, the parallax (distance) having the largest amount of motion is specified based on the distance information to the imaging target, generated from the captured images captured by the plurality of synchronized imaging means, and on the motion information, generated from the captured images input in time series from at least one of the plurality of imaging means, and that parallax (distance) is set as the target distance.
[0028]
Then, in a target distance image generation step, a pixel corresponding to the target distance is extracted from the distance image (distance information) to generate a target distance image. For example, the target distance has a certain width (for example, several tens of cm), and pixels corresponding to the distance are extracted from the distance image. Further, in the target area setting step, a target area for detecting a moving object is set in the target distance image at least corresponding to the target distance. As a result, it is possible to narrow down an area in the target distance image where a moving object is assumed to be present. Then, in the contour extraction step, the moving object is detected by extracting the contour of the moving object from the target area in the target distance image.
[0029]
The moving object detection program according to claim 9 causes a computer to function as the following means in order to detect a moving object that is moving within an imaging target, based on distance information to the imaging target generated from captured images captured by a plurality of synchronized imaging means and on motion information generated from captured images input in time series from at least one of the plurality of imaging means.
[0030]
That is, the means are: target distance setting means for setting, based on the distance information and the motion information, a target distance at which the moving object exists; target distance image generating means for generating, based on the distance information, a target distance image composed of pixels corresponding to the target distance set by the target distance setting means; target area setting means for setting, in the target distance image, a target area in which the moving object is to be detected, corresponding to at least the target distance; and contour extraction means for detecting the moving object by extracting a contour from the target area set by the target area setting means.
[0031]
According to this configuration, the moving object detection program uses the target distance setting means to specify the parallax (distance) having the largest amount of motion based on the distance information and the motion information, and sets that parallax (distance) as the target distance.
[0032]
Then, the target distance image generating means extracts the pixels corresponding to the target distance from the distance image (distance information) to generate a target distance image, and the target area setting means sets a target area that narrows down the area of the target distance image in which a moving object is assumed to exist.
Then, the moving object is detected by extracting the contour of the moving object from the target area in the target distance image by the contour extracting means.
[0033]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[First embodiment]
(Configuration of moving object detection device)
FIG. 1 is a block diagram showing a configuration of a moving object detection device 1 according to a first embodiment of the present invention. As shown in FIG. 1, a moving object detection device 1 detects an object (moving object) with motion from camera images (captured images) captured by two cameras (imaging means) 2. Here, the moving object detection device 1 includes an input image analysis unit 10 that analyzes an input camera image, and an object detection unit 20 that detects an object from the analyzed camera image. Note that the two cameras 2 are arranged left and right at a distance B, and are respectively referred to as a right camera 2a and a left camera 2b.
[0034]
The input image analysis means 10 analyzes the camera images (captured images) input synchronously from the two cameras 2 (imaging means: 2a, 2b) that have captured the imaging target, and generates a distance image containing distance information and a difference image containing motion information. Here, the input image analysis means 10 includes a distance information generation unit 11 and a motion information generation unit 12.
[0035]
The distance information generation unit (distance information generation means) 11 generates a distance image by embedding the parallax between the two camera images captured at the same time by the right camera 2a and the left camera 2b as information on the distance from the camera 2 (more precisely, from the focal position of the camera 2) to the imaging target.
[0036]
The distance information generation unit 11 uses the right camera 2a as a reference camera (reference imaging means) and measures the parallax from the reference captured image by performing block matching, in blocks of a specific size (for example, 16 × 16 pixels), between the camera image captured by the reference camera (the reference captured image) and the camera image captured at the same time by the left camera 2b. It then generates a distance image in which the magnitude of the parallax (parallax amount) is associated with each pixel of the reference captured image.
[0037]
When the parallax is Z, the distance D (not shown) from the camera 2 to the object corresponding to the parallax Z can be obtained by equation (1), where f (not shown) is the focal length of the camera 2 and B is the distance between the right camera 2a and the left camera 2b.
[0038]
D = B × f / Z (1)
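As a hedged illustration of equation (1), the following Python sketch converts a parallax into a distance. The function name `parallax_to_distance` and the choice to express f in pixels (so that a parallax in pixels yields D in the units of B) are assumptions made for this example and are not taken from the embodiment.

```python
import math


def parallax_to_distance(parallax, baseline, focal_length):
    """Equation (1): D = B * f / Z.

    A parallax of 0 (or less) is treated as an infinitely distant point,
    matching the later description of parallax 0 in the distance image.
    """
    if parallax <= 0:
        return math.inf
    return baseline * focal_length / parallax
```

For instance, with the hypothetical values B = 0.1 m and f = 440 pixels, a parallax of 20 corresponds to roughly 2.2 m.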
[0039]
The motion information generation unit (motion information generation means) 12 generates a difference image in which the motion of the moving object in the camera image is embedded as motion information, based on the difference between two camera images captured in time series by the reference camera (right camera 2a).
[0040]
The motion information generation unit 12 uses the right camera 2a as the reference camera (reference imaging means) and takes the difference between two camera images captured by it in time series (at time t and time t + 1). It then generates a difference image in which a pixel with a difference is given the pixel value “1” as a pixel with motion, and a pixel without a difference is given the pixel value “0” as a pixel without motion. The motion information generation unit 12 further removes noise from the difference image by filtering, for example with a median filter.
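The difference image described above can be sketched in Python as follows. This is only a minimal illustration on images stored as lists of rows; the function names, the `threshold` parameter, and the 3 × 3 window size of the median filter are assumptions, since the embodiment does not specify them.

```python
def difference_image(frame_t, frame_t1, threshold=0):
    """Return a binary image: 1 where the two frames differ (motion), else 0."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_t, row_t1)]
        for row_t, row_t1 in zip(frame_t, frame_t1)
    ]


def median_filter_3x3(image):
    """Remove isolated noise pixels with a 3x3 median (border pixels unchanged)."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(
                image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out
```

A single isolated "1" surrounded by "0" pixels is removed by the median filter, while larger moving regions are preserved.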
[0041]
When the camera 2 is a moving camera and the background in the captured camera image changes, the camera movement amount, such as pan and tilt, for each camera image is input from the camera 2, and, for example, the camera image at time t + 1 is corrected by that camera movement amount, so that only the pixels that moved between time t and time t + 1 are detected.
[0042]
Here, the contents of the distance image generated by the distance information generation unit 11 and the content of the difference image generated by the motion information generation unit 12 will be described with reference to FIG. FIG. 4 shows an example of the image contents of the distance image DE and the difference image DI, and an example of the pixel values (the distance image pixel value DEB and the difference image pixel value DIB) of each image. Here, it is assumed that a person exists at positions about 1 m, 2 m, and 3 m away from the camera 2.
[0043]
As shown in FIG. 4, the distance image DE is generated by expressing the parallax between the right camera image and the left camera image at time t as pixel values. The larger the parallax, the closer the person is to the camera 2; the smaller the parallax, the farther the person is from the camera 2. For example, as shown in the distance image pixel value DEB, the pixel position (0, 0) of the distance image DE has a parallax of 0, which means that the distance from the camera 2 is infinite (∞). The pixel position (30, 50) of the distance image DE has a parallax of 20, which means that the distance from the camera 2 is the distance corresponding to a parallax of 20, for example 2.2 m. Because the distance image DE expresses parallax as pixel values, it is, for example, brighter for objects closer to the camera 2 and darker for objects farther away.
[0044]
The difference image DI is generated by taking the difference between the right camera image at time t and the right camera image at time t + 1 and expressing a pixel with a difference as the pixel value “1” and a pixel without a difference as the pixel value “0”. The pixels with a difference represent the area where the person actually moved. For example, as shown in the difference image pixel value DIB, the pixel position (0, 0) of the difference image DI is “0” (“stop”), which means that there is no motion, and the pixel position (30, 50) is “1” (“movement”), which means that there is motion.
Returning to FIG. 1, the description will be continued.
[0045]
The object detecting means 20 detects the area of a moving object with motion based on the images (distance image and difference image) analyzed by the input image analysis means 10, and extracts the contour of the moving object. Here, the object detecting means 20 includes a target distance setting unit 21, a target distance image generation unit 22, a target area setting unit 23, a contour extraction unit 24, and a distance information updating unit 25.
[0046]
The target distance setting unit (target distance setting means) 21 specifies the moving object with the largest amount of motion based on the distance image generated by the distance information generation unit 11 of the input image analysis means 10 and the difference image generated by the motion information generation unit 12, and sets the parallax (target distance) at which the target moving object exists. This target distance is notified to the target distance image generation unit 22.
[0047]
The target distance setting unit 21 accumulates, for each parallax (distance) represented in the distance image, the pixel values of the difference image at the same positions as the pixels having that parallax, and determines that the moving object with the largest amount of motion exists at the parallax with the largest total (called the maximum parallax here: the parallax with the most motion, not the largest parallax value). Note that the target distance setting unit 21 stores the distance image generated by the distance information generation unit 11 and the difference image generated by the motion information generation unit 12 in a storage unit such as a memory (not shown).
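The accumulation performed by the target distance setting unit 21 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name is hypothetical, the images are plain lists of rows, and skipping pixels with parallax 0 (infinitely distant or already erased) is an assumption of this example.

```python
def set_target_parallax(distance_image, difference_image):
    """Return (parallax, motion_count) for the parallax with the most moving pixels.

    distance_image holds a parallax per pixel; difference_image holds 1 for
    pixels with motion and 0 otherwise, at the same pixel positions.
    """
    motion_by_parallax = {}
    for parallax_row, motion_row in zip(distance_image, difference_image):
        for parallax, moved in zip(parallax_row, motion_row):
            # Assumption: parallax 0 means "no object", so it is not counted.
            if parallax > 0 and moved:
                motion_by_parallax[parallax] = motion_by_parallax.get(parallax, 0) + moved
    if not motion_by_parallax:
        return None, 0  # no motion anywhere
    best = max(motion_by_parallax, key=motion_by_parallax.get)
    return best, motion_by_parallax[best]
```

The result is a single pass over both images, which reflects why the text describes the accumulation as a simple and fast operation.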
[0048]
The target distance image generation unit (target distance image generation unit) 22 extracts a pixel corresponding to the target distance set by the target distance setting unit 21 from the distance image in which the parallax amount generated by the distance information generation unit 11 is embedded. The target distance image is generated.
[0049]
Here, assuming that a person is to be detected, the range of parallax (depth) corresponding to the target distance (maximum parallax) ± α (several tens of cm) is taken as the parallax range in which the moving object with the largest amount of motion exists. The value of α is a range (predetermined range) in the depth direction based on the target distance, and is determined in advance according to the size of the object to be detected in the depth direction.
[0050]
For example, if the distance D from the camera 2 to the moving object in the maximum parallax is calculated by the above equation (1), the range Zr of the parallax is obtained by modifying the equation (1) to obtain the equation (2). Here, the focal length of the camera 2 is f, and the distance between the right camera 2a and the left camera 2b is B.
[0051]
B × f / (D + α) < Zr < B × f / (D − α) (2)
[0052]
The target distance image generation unit 22 generates a target distance image in which pixels corresponding to the parallax in the range of the expression (2) are extracted.
The target distance image may also be generated by extracting, from the camera image (original image) captured by the reference camera (right camera 2a), only the pixels at the pixel positions corresponding to the target distance (parallax range).
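Equation (2) and the pixel extraction that follows it can be sketched in Python as below. The function names are hypothetical, f is assumed to be expressed in pixels, and writing 0 for the pixels that are not extracted is an assumption of this example.

```python
def parallax_range(target_distance, alpha, baseline, focal_length):
    """Equation (2): B*f/(D + alpha) < Zr < B*f/(D - alpha)."""
    if not 0 < alpha < target_distance:
        raise ValueError("alpha must satisfy 0 < alpha < target_distance")
    bf = baseline * focal_length
    return bf / (target_distance + alpha), bf / (target_distance - alpha)


def target_distance_image(distance_image, z_min, z_max):
    """Keep pixels whose parallax lies strictly inside (z_min, z_max); zero the rest."""
    return [
        [z if z_min < z < z_max else 0 for z in row]
        for row in distance_image
    ]
```

With the hypothetical values B = 0.1 m and f = 440 pixels, a target distance of 2.2 m and α = 0.5 m give a parallax range of roughly 16.3 to 25.9, so pixels with parallax 44 (about 1 m) or 14 (about 3.1 m) are dropped.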
[0053]
Here, the procedure by which the target distance setting unit 21 and the target distance image generation unit 22 generate an image (target distance image) corresponding to the distance at which the moving object to be detected exists will be described with reference to FIG. 5 (see FIG. 1 as appropriate). FIG. 5A is a graph showing the relationship between the parallax (distance) and the amount of motion (number of pixels) obtained by accumulating the moving pixels based on the distance image DE and the difference image DI (FIG. 4). FIG. 5B shows a target distance image TDE obtained by extracting only the image at the target distance from the distance image DE (FIG. 4).
[0054]
As shown in FIG. 5A, when the relationship between the parallax (distance) and the amount of motion (number of pixels) in the distance image DE (FIG. 4) is graphed, the amount of motion peaks at the parallaxes (distances) corresponding to 1 m, 2.2 m, and 3 m. The target distance setting unit 21 therefore sets 2.2 m as the target distance, on the assumption that the moving object exists at the parallax (2.2 m) where the amount of motion is largest. Assuming that the moving object is a person, the person can be judged to exist within a range of 2.2 ± α m (α = 0.5 m) from the camera 2.
[0055]
Therefore, as shown in FIG. 5B, the target distance image generation unit 22 generates the target distance image TDE by extracting, from the distance image generated by the distance information generation unit 11, the pixels existing at the target distance ± α m (2.2 ± 0.5 m) set by the target distance setting unit 21. As a result, the images of the persons located 1 m and 3 m away from the camera 2 are deleted, and a target distance image TDE in which only the person located 2.2 ± 0.5 m away is extracted can be generated.
Returning to FIG. 1, the description will be continued.
[0056]
The target area setting unit (target area setting means) 23 accumulates the number of pixels in the vertical direction of the target distance image generated by the target distance image generation unit 22, specifies the position where the vertical pixel count is largest (the peak) as the horizontal position of the center of the moving object, and sets an area (target area) containing the moving object.
[0057]
More specifically, the target area setting unit 23 counts the number of pixels in the vertical direction of the target distance image generated by the target distance image generation unit 22 to form a histogram, and specifies the position where the histogram is largest (the peak) as the horizontal position of the center of the moving object. Here, assuming that a person is to be detected, a range of a specific size (for example, 0.5 to 0.6 m) to the left and right of the horizontal position where the histogram is largest is set as the horizontal existence area (range) of the target area. In the vertical direction, a specific size (for example, 2 m) is set as the height of the target area. At this time, the target area setting unit 23 sets the vertical existence area (range) of the target area based on camera parameters input from the camera 2, such as the tilt angle and the height from the floor (installation surface).
[0058]
By judging the position where the histogram is largest to be the center of the moving object in this way, one moving object (one person) can be detected even when a plurality of moving objects (persons or the like) exist at the same distance.
[0059]
Here, the procedure by which the target area setting unit 23 sets the area (target area) of one moving object (one person) in the target distance image TDE will be described with reference to FIG. 6 (see FIG. 1 as appropriate). FIG. 6A shows the total number of pixels in the vertical direction in the target distance image TDE generated by the target distance image generation unit 22 as a histogram HI. FIG. 6B shows a state in which the target area T is set in the target distance image TDE, with the moving object being a person. Although the histogram HI is superimposed on the target distance image TDE in FIGS. 6A and 6B, this is only for convenience of explanation.
[0060]
As shown in FIG. 6A, the target area setting unit 23 generates a histogram HI in which the number of pixels in the vertical direction of the target distance image TDE is accumulated. By converting the target distance image TDE into a histogram in this manner, the horizontal position of the center of the moving object can be judged to be at the position where the histogram HI is largest. If, for example, the highest non-zero pixel position in the target distance image TDE were taken as the horizontal position of the center of the moving object without using the histogram HI, then when the person raises a hand, the tip of the hand would be judged to be the center of the person (moving object). For this reason, the histogram HI is used here.
[0061]
Then, as shown in FIG. 6B, the target area setting unit 23 sets a range of a specific size (for example, 0.5 m) to the left and right of the horizontal position where the histogram HI is largest as the horizontal range of the target area T. In the vertical direction, a specific size (for example, 2 m) is set as the vertical range of the target area T.
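The histogram-based search for the horizontal center can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the half-width is passed in pixels (`half_width_px`), and a non-zero value is assumed to mark a pixel that belongs to the target distance image.

```python
def set_horizontal_range(target_image, half_width_px):
    """Return (x_left, x_right, x_peak) around the column with the most pixels.

    Each histogram bin counts the non-zero pixels in one image column, so a
    raised hand (few pixels in its column) does not outweigh the torso.
    """
    width = len(target_image[0])
    histogram = [sum(1 for row in target_image if row[x] != 0) for x in range(width)]
    x_peak = max(range(width), key=histogram.__getitem__)
    # Clamp the range to the image borders.
    x_left = max(0, x_peak - half_width_px)
    x_right = min(width - 1, x_peak + half_width_px)
    return x_left, x_right, x_peak
```

If several columns share the largest count, this sketch picks the leftmost one; the embodiment does not state how such ties are handled.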
[0062]
The size of the target area T will be further described with reference to FIG. 7 (see FIG. 1 as appropriate). FIG. 7 is an explanatory diagram for explaining at which height the moving object M is positioned on the target distance image (a′, b′) when the camera 2 is incorporated in a mobile robot (not shown) and is located at a certain height (camera height) H from the same floor as the moving object M. FIG. 7A shows the correspondence between the camera 2 and the moving object M when the tilt angle of the camera 2 is 0 (°), and FIG. 7B shows the correspondence when the tilt angle of the camera 2 is θ_T (≠ 0).
[0063]
First, with reference to FIG. 7A, a method for specifying the vertical position at which the moving object M exists on the target distance image (a′) when the tilt angle is 0 (°) will be described.
Here, let the vertical angle of view of the camera 2 be θ_v, the distance from the camera 2 to the moving object M be D, the vertical resolution of the target distance image (a′) be Y, the height of the camera 2 from the floor (camera height) be H, and the virtual height of the moving object M from the floor be 2 (m). Then the angle θ_H between the optical axis of the camera 2 and the straight line connecting the camera 2 to the virtual upper end of the moving object M (2 m from the floor) can be expressed by equation (3).
[0064]
θ_H = tan^-1((2 − H) / D) (3)
[0065]
Thus, the upper end y_T of the moving object M on the target distance image (a′) can be obtained by equation (4).
[0066]
y_T = Y / 2 − θ_H × Y / θ_v = Y / 2 − (Y / θ_v) tan^-1((2 − H) / D) (4)
[0067]
The angle θ_L between the optical axis of the camera 2 and the straight line connecting the camera 2 to the lower end (floor) of the moving object M can be expressed by equation (5).
[0068]
θ_L = tan^-1(H / D) (5)
[0069]
As a result, the lower end y_B of the moving object M on the target distance image (a′) can be obtained by equation (6).
[0070]
y_B = Y / 2 + θ_L × Y / θ_v = Y / 2 + (Y / θ_v) tan^-1(H / D) (6)
[0071]
Next, with reference to FIG. 7B, a method for specifying the vertical position at which the moving object M exists on the target distance image (b′) when the tilt angle is θ_T (≠ 0) will be described.
Here, let the vertical angle of view of the camera 2 be θ_v, the tilt angle be θ_T, the distance to the moving object M be D, the vertical resolution of the target distance image be Y, the height of the camera 2 from the floor (camera height) be H, and the virtual height of the moving object M from the floor be 2 (m). Then the difference angle (θ_H − θ_T) between the angle θ_H, formed by the optical axis of the camera 2 and the straight line connecting the camera 2 to the virtual upper end of the moving object M (2 m from the floor), and the tilt angle θ_T can be expressed by equation (7).
[0072]
θ_H − θ_T = tan^-1((2 − H) / D) (7)
[0073]
Thereby, the upper end y_T of the moving object M on the target distance image (b′) can be obtained by equation (8).
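[0074]
y_T = Y / 2 − θ_T × Y / θ_v − (θ_H − θ_T) × Y / θ_v = Y / 2 − θ_T × Y / θ_v − (Y / θ_v) tan^-1((2 − H) / D) (8)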
[0075]
The sum angle (θ_L + θ_T) of the angle θ_L, formed by the optical axis of the camera 2 and the straight line connecting the camera 2 to the lower end (floor) of the moving object M, and the tilt angle θ_T can be expressed by equation (9).
[0076]
θ_L + θ_T = tan^-1(H / D) (9)
[0077]
As a result, the lower end y_B of the moving object M on the target distance image (b′) can be obtained by equation (10).
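[0078]
y_B = Y / 2 − θ_T × Y / θ_v + (θ_L + θ_T) × Y / θ_v = Y / 2 − θ_T × Y / θ_v + (Y / θ_v) tan^-1(H / D) (10)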
[0079]
The vertical range of the target area T (FIG. 6B) is determined by the upper end y_T and the lower end y_B of the target distance image (a′ or b′) obtained in this way.
If the mobile robot (not shown) goes up or down stairs or the like and is not on the same floor as the moving object M, the amount of ascent or descent is detected by an encoder or the like in the mobile robot body and added to or subtracted from the height of the moving object M from the floor, so that the vertical position of the moving object M in the target distance image (a′ or b′) can be specified. Alternatively, map information may be stored in the mobile robot, and the floor height specified by the direction and distance of the moving object M may be acquired from the map information.
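The vertical range can be sketched numerically as follows, building on the angles of equations (3), (5), (7), and (9). Several points are assumptions of this sketch rather than statements of the embodiment: angles are mapped to pixel rows linearly (Y / θ_v pixels per radian, by analogy with equation (11)), row 0 is the top of the image, the optical axis passes through row Y / 2, and the tilt angle is positive when the camera tilts downward, which is the sign convention implied by equations (7) and (9). The function and parameter names are hypothetical.

```python
import math


def vertical_range(distance, camera_height, tilt, vertical_fov, y_resolution,
                   object_height=2.0):
    """Return (y_top, y_bottom) as pixel rows, with row 0 at the top.

    Angles are in radians; tilt is positive when the camera tilts downward
    (an assumption consistent with equations (7) and (9)).
    """
    pixels_per_radian = y_resolution / vertical_fov
    # Elevation of the virtual upper end above horizontal: equations (3)/(7).
    angle_to_top = math.atan((object_height - camera_height) / distance)
    # Depression of the floor contact point below horizontal: equations (5)/(9).
    angle_to_floor = math.atan(camera_height / distance)
    # Row of the horizontal direction; equals Y/2 when the tilt is 0.
    horizon_row = y_resolution / 2 - tilt * pixels_per_radian
    y_top = horizon_row - angle_to_top * pixels_per_radian
    y_bottom = horizon_row + angle_to_floor * pixels_per_radian
    return y_top, y_bottom
```

With a tilt of 0 the function reduces to the case of FIG. 7A; a positive (downward) tilt shifts both ends toward the top of the image by the same number of rows.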
[0080]
As for the horizontal range of the target area T (FIG. 6B), although not illustrated, let the horizontal angle of view of the camera 2 be θ_h, the distance from the camera 2 to the target moving object M be D, and the horizontal resolution of the target distance image be X. If half the width of the target area (the distance from the center of the moving object) is 0.5 (m), the number of horizontal pixels α_H on the target distance image can be obtained by equation (11).
[0081]
α_H = (X / θ_h) tan^-1(0.5 / D) (11)
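Equation (11) can be evaluated with a short Python sketch such as the following. The function name and the rounding to a whole number of pixels are assumptions of this example; the angle of view is taken in radians.

```python
import math


def half_width_pixels(distance, horizontal_fov, x_resolution, half_width=0.5):
    """Equation (11): alpha_H = (X / theta_h) * atan(half_width / D), in pixels."""
    return round(x_resolution / horizontal_fov * math.atan(half_width / distance))
```

For a hypothetical camera with a 60° horizontal angle of view and X = 640, an object at D = 2.2 m gives a half-width of 137 pixels, and the value shrinks as the object moves farther away.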
Returning to FIG. 1, the description will be continued.
[0082]
The contour extraction unit (contour extraction means) 24 extracts a contour, using a known contour extraction technique, within the area of the moving object (target area) set by the target area setting unit 23 in the target distance image generated by the target distance image generation unit 22. The extracted contour (contour information) is output to the outside as the output of the moving object detection device 1 and is also notified to the distance information updating unit 25. The extraction of the contour by the contour extraction unit 24 means that the moving object has been detected.
[0083]
Here, an outline of a procedure of contour extraction which is a known technique will be described.
First, edges are detected based on changes in pixel value within the target area. For example, edges are detected by applying, to each pixel, an operator with weighting coefficients for the pixels in its neighborhood (a coefficient matrix such as the Sobel operator or the Kirsch operator). The detected edges are then binarized with an appropriate threshold, and isolated points are removed with a median filter or the like. By connecting the edges binarized in this way, the contour of the moving object can be extracted from within the target area. The active contour model (SNAKES) may also be applied as the method of extracting the contour from the edges. In this way, for example, as shown in FIG. 8, the contour O can be extracted within the target area T, in which the moving objects in the target distance image TDE have been narrowed down to one (one person).
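The edge detection and binarization steps can be sketched as follows, using the Sobel operator as one of the operators named above. This is a minimal illustration: the function name, the use of |Gx| + |Gy| as the edge strength, and leaving the border pixels at 0 are assumptions of this example, and the subsequent isolated-point removal and edge linking are not shown.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where |Gx| + |Gy| exceeds the threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            # Weighted sum over the 3x3 neighborhood of pixel (x, y).
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    value = image[y + dy][x + dx]
                    gx += SOBEL_X[dy + 1][dx + 1] * value
                    gy += SOBEL_Y[dy + 1][dx + 1] * value
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges
```

On an image with a vertical step in brightness, only the interior pixels adjacent to the step are marked as edges.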
[0084]
The distance information updating unit (distance information updating means) 25 updates the distance image stored in the storage unit (not shown) by the target distance setting unit 21, based on the contour (contour information) extracted by the contour extraction unit 24. For example, the pixel values of the distance image corresponding to the internal region including the contour are set to “0”. As a result, the area of the moving object for which contour extraction has been completed is deleted from the distance image. The distance information updating unit 25 notifies the target distance setting unit 21 of the completion of the update of the distance image as update information.
[0085]
For example, as shown in FIG. 9, the content (distance image pixel values DEB) of the distance image DE corresponding to the inside of the contour O extracted in FIG. 8 (the internal region including the contour O) is updated. That is, the parallax of every pixel within the region of the contour O, for example the pixel at position (30, 50), is changed to 0. By changing the parallax within the region of the contour O to 0 in this manner, the moving object extracted as the contour O is treated as being at an infinite distance from the camera 2 and no longer exists on the distance image DE.
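The update can be sketched as follows, assuming the internal region including the contour has already been rasterized into a binary mask of the same size as the distance image. The mask representation and the name `clear_region` are assumptions made for illustration.

```python
def clear_region(distance_image, region_mask):
    """Return a copy of the distance image with parallax set to 0 wherever
    region_mask is 1 (inside or on the extracted contour)."""
    return [
        [0 if inside else parallax for parallax, inside in zip(d_row, m_row)]
        for d_row, m_row in zip(distance_image, region_mask)
    ]
```

Pixels outside the mask keep their parallax, so other objects remain available for later passes.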
[0086]
The configuration of the moving object detection device 1 according to the first embodiment has been described above. Each unit of the moving object detection device 1 can also be realized on a computer as a functional program, and these functional programs can be combined and run as a moving object detection program.
[0087]
Here, the distance information generation unit 11 of the moving object detection device 1 generates the distance image based on the camera images captured by the two cameras 2, but the distance image may instead be generated using three or more cameras. For example, by using nine cameras arranged in three rows and three columns, taking the camera at the center as the reference camera, and generating the distance image based on the parallax with respect to the other cameras, the distance to the moving object can be measured more accurately.
[0088]
Further, the moving object detection device 1 can be incorporated in a mobile body such as a mobile robot or an automobile and used to detect objects such as people. For example, by applying the present invention to a mobile robot, the mobile robot can recognize people even in a crowd. Furthermore, because people can be detected individually, processing after contour extraction becomes easy, for example tracking a particular person or performing a different operation for each person after face recognition or the like.
[0089]
(Operation of the moving object detection device 1)
Next, the operation of the moving object detection device 1 will be described with reference to FIGS. 2 and 3, which are flowcharts illustrating the operation of the moving object detection device 1.
[0090]
<Camera image input step>
First, the moving object detection device 1 inputs camera images in time series from the two synchronized cameras 2 (step S1). Here, it is assumed that the contour of the moving object is extracted based on the camera images input from the right camera 2a (reference camera) and the left camera 2b at a certain time t and the camera image input from the right camera 2a (reference camera) at the next time t + 1 (for example, one frame later).
[0091]
<Distance image generation step>
Then, the moving object detection device 1 uses the distance information generation unit 11 to generate, from the two camera images input from the right camera 2a (reference camera) and the left camera 2b at time t, a distance image in which the parallax (distance) to the imaging target is embedded (step S2).
[0092]
<Difference image generation step>
Further, the moving object detection device 1 uses the motion information generation unit 12 to take the difference between the two camera images (reference camera images) captured by the right camera 2a (reference camera) at time t and time t + 1, and generates a difference image in which pixels with a difference have the pixel value “1” and pixels without a difference have the pixel value “0” (step S3).
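A minimal sketch of this frame differencing, assuming two grayscale frames of equal size stored as lists of rows. The `tol` parameter is a hypothetical noise tolerance that this description does not mention; with the default of 0, any change in pixel value counts as a difference.

```python
def difference_image(frame_t, frame_t1, tol=0):
    """Binary difference image: 1 where the two reference-camera frames differ
    by more than tol, 0 elsewhere."""
    return [
        [1 if abs(a - b) > tol else 0 for a, b in zip(row_t, row_t1)]
        for row_t, row_t1 in zip(frame_t, frame_t1)
    ]
```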
[0093]
<Target distance setting step>
In addition, the moving object detection device 1 uses the target distance setting unit 21 to accumulate, from the distance image and the difference image generated in steps S2 and S3, the number of pixels that have moved for each parallax (distance) represented in the distance image (step S4). For example, only the pixels having a certain parallax (distance) are extracted from the distance image, and the pixel values of the difference image at the positions of the extracted pixels are accumulated. Then, the distance at which the cumulative number of pixels with motion (difference) is the maximum is set as the target distance of the moving object to be detected (step S5).
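Steps S4 and S5 can be sketched as a vote per parallax value. The sketch assumes integer parallax values and treats a parallax of 0 as "no distance information", which is an assumption made here; returning `None` when nothing moved is likewise an illustrative choice.

```python
from collections import Counter


def target_parallax(distance_image, diff_image):
    """Return the parallax with the largest number of moving pixels,
    or None if no pixel with a valid parallax has moved."""
    votes = Counter()
    for d_row, m_row in zip(distance_image, diff_image):
        for parallax, moved in zip(d_row, m_row):
            if parallax > 0:              # skip pixels without distance information
                votes[parallax] += moved  # moved is 1 or 0 in the difference image
    if not votes or max(votes.values()) == 0:
        return None
    return max(votes, key=votes.get)
```

For example, with parallax 5 carrying three moving pixels and parallax 3 carrying one, the function returns 5.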
[0094]
<Target distance image generation step>
Then, in the moving object detection device 1, the target distance image generation unit 22 generates a target distance image by extracting from the distance image the pixels corresponding to the target distance ± α (step S6). Here, it is assumed that a person is to be detected, and α is set to several tens of centimeters.
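The extraction in step S6 can be sketched as follows. Because the distance image stores parallax, the sketch converts each parallax to a distance with the standard stereo relation Z = f · B / d; the parameter names `focal_px` (focal length in pixels) and `baseline_m` (camera separation in meters), and the binary mask output, are assumptions made for illustration.

```python
def target_distance_image(parallax_image, target_m, alpha_m, focal_px, baseline_m):
    """Binary image that keeps only pixels whose distance lies within
    target_m +/- alpha_m."""
    out = []
    for row in parallax_image:
        out_row = []
        for d in row:
            if d > 0:
                z = focal_px * baseline_m / d      # stereo relation: Z = f * B / d
                keep = abs(z - target_m) <= alpha_m
            else:
                keep = False                       # parallax 0: no valid distance
            out_row.append(1 if keep else 0)
        out.append(out_row)
    return out
```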
[0095]
<Target area setting step>
Then, the moving object detection device 1 uses the target area setting unit 23 to count the pixels of the target distance image generated in step S6 along the vertical direction, column by column, to form a histogram (step S7). Then, a range of a specific size (for example, 0.5 to 0.6 (m)) to the left and right of the horizontal position at which the histogram is at its maximum (peak) is set as the horizontal range of the target area (step S8).
Further, the target area setting unit 23 sets the range of the target area in the vertical (up-down) direction based on camera parameters input from the camera 2, such as the tilt angle and the height from the floor (installation surface) (step S9).
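Steps S7 and S8 can be sketched as a column-wise histogram followed by a window around its peak. The sketch assumes a binary target distance image and a hypothetical `half_width_px` parameter, that is, the 0.5 to 0.6 m half-width already converted to pixels for the target distance.

```python
def horizontal_range(target_image, half_width_px):
    """Return (histogram, peak column, left column, right column)."""
    h, w = len(target_image), len(target_image[0])
    # Count the pixels in each column (vertical direction).
    hist = [sum(target_image[y][x] for y in range(h)) for x in range(w)]
    peak_x = max(range(w), key=hist.__getitem__)
    # Clamp the window to the image borders.
    left = max(0, peak_x - half_width_px)
    right = min(w - 1, peak_x + half_width_px)
    return hist, peak_x, left, right
```

If several columns share the maximum, this sketch picks the leftmost one; the description does not say how ties are resolved.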
[0096]
For example, based on the tilt angle of the camera 2 and its height from the floor, the position of the floor in the target distance image (the lower end of the target area) is obtained. Then, based on the angle of view of the camera 2 and the distance to the moving object, the range from the floor up to a height of 2 m is converted into a number of pixels, which gives the number of pixels from the floor to the upper end of the target area in the target distance image. The upper end of the target area may also be obtained directly as the position (height) of 2 m in the target distance image, based on the tilt angle of the camera 2 and its height from the floor. Note that 2 m is an example, and other lengths (heights) may be used.
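One way to picture this conversion is with a simple pinhole camera model. The sketch below is not the formula of this description (FIG. 7 is not reproduced here); it assumes a downward tilt angle in radians, a vertical angle of view, a horizontal distance to the object, and image rows numbered from the top. All parameter names are hypothetical.

```python
import math


def row_of_height(height_m, cam_height_m, tilt_down_rad, distance_m, rows, vfov_rad):
    """Image row (0 = top) onto which a point at height_m above the floor projects."""
    focal_px = (rows / 2) / math.tan(vfov_rad / 2)
    # Angle of the point below the optical axis.
    angle = math.atan2(cam_height_m - height_m, distance_m) - tilt_down_rad
    return rows / 2 + focal_px * math.tan(angle)


def vertical_range(cam_height_m, tilt_down_rad, distance_m, rows, vfov_rad, top_m=2.0):
    """Return (top row, bottom row) of the target area, clamped to the image."""
    def clamp(r):
        return max(0, min(rows - 1, round(r)))
    args = (cam_height_m, tilt_down_rad, distance_m, rows, vfov_rad)
    bottom = clamp(row_of_height(0.0, *args))    # floor position
    top = clamp(row_of_height(top_m, *args))     # 2 m above the floor by default
    return top, bottom
```

With a camera 1 m above the floor, no tilt, a 90-degree vertical angle of view, a 240-row image, and an object 2 m away, the floor projects to row 180 and the 2 m height to row 60.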
[0097]
<Contour extraction step>
In the moving object detection device 1, the contour extraction unit 24 extracts a contour from the target distance image generated in step S6 within the target area set in steps S8 and S9 (step S10). For example, an edge is detected in a target area, and a contour is extracted by applying a dynamic contour model (SNAKES) to the edge.
[0098]
Then, it is determined whether or not the contour has been successfully extracted (step S11). Here, the determination of success or failure is not limited to whether a contour could be extracted in step S10. It also covers cases in which contour extraction is not performed at all, for example because the target distance is longer than a predetermined distance, because the target area is smaller than a predetermined size, or because contour extraction has been completed for all objects.
If the contour has been successfully extracted in step S11 (Yes), the process proceeds to step S12. On the other hand, if the extraction of the contour has failed (or the extraction is not performed) (No), the operation ends.
[0099]
<Distance information update step>
Then, the moving object detection device 1 updates the distance image corresponding to the inside of the contour extracted in step S10 (the internal region including the contour) by the distance information updating unit 25 (step S12). For example, the pixel value of the distance image corresponding to the internal region including the contour is set to “0”. As a result, the area of the moving object that has already been extracted is deleted from the distance image. Then, the process returns to step S4 to continue the processing.
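The loop formed by steps S4 to S12 can be sketched as follows. To keep the sketch short, contour extraction (steps S6 to S10) is replaced by a stand-in that simply takes every pixel at the target parallax as the object region; this is a simplification made here, not the method of this description. The name `detect_all` and the `max_objects` limit are also assumptions.

```python
from collections import Counter


def detect_all(distance_image, diff_image, max_objects=10):
    """Repeatedly pick the parallax with the most motion, record its region,
    and delete that region from a working copy of the distance image."""
    dist = [row[:] for row in distance_image]    # working copy
    found = []
    for _ in range(max_objects):
        votes = Counter()
        for d_row, m_row in zip(dist, diff_image):
            for d, m in zip(d_row, m_row):
                if d > 0:
                    votes[d] += m
        if not votes or max(votes.values()) == 0:
            break                                # nothing left that moves: stop
        target = max(votes, key=votes.get)       # steps S4 and S5
        # Stand-in for steps S6 to S10 (no real contour extraction here).
        region = [(y, x) for y, row in enumerate(dist)
                  for x, d in enumerate(row) if d == target]
        found.append((target, region))
        for y, x in region:                      # step S12: delete the region
            dist[y][x] = 0
    return found
```

Each pass removes one object from the working copy, so the next pass selects the next largest moving object at a different distance.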
[0100]
Through the above steps, the moving object detection device 1 of the present embodiment can detect a moving object present in the camera image input from the camera 2. Although the contour of the moving object is extracted here for a certain time t (t + 1), performing the above steps (steps S1 to S12) on the camera images that are input from moment to moment allows, for example, a mobile body such as a mobile robot to keep detecting a person.
[0101]
[Second embodiment]
(Configuration of moving object detection device)
Next, the configuration of a moving object detection device 1B according to a second embodiment of the present invention will be described with reference to FIG. 10. FIG. 10 is a block diagram showing the configuration of the moving object detection device 1B. As shown in FIG. 10, the moving object detection device 1B detects an object in motion (moving object) from camera images (captured images) captured by two cameras (imaging units) 2.
[0102]
Here, the moving object detection device 1B comprises an input image analysis unit 10B, which includes a distance information generation unit 11, a motion information generation unit 12, and an edge image generation unit 13, and an object detection unit 20B, which includes a target distance setting unit 21, a target distance image generation unit 22B, a target area setting unit 23, a contour extraction unit 24B, and a distance information updating unit 25. The configuration other than the edge image generation unit 13, the target distance image generation unit 22B, and the contour extraction unit 24B is the same as in the first embodiment.
[0103]
The edge image generation unit (edge image generation means) 13 receives from the camera 2 (2a) the camera image (reference captured image) of the same time as that input to the distance information generation unit 11 and the motion information generation unit 12, and generates an edge image by extracting edges from that camera image. Based on the brightness (luminance, that is, shade information) of the camera image input from the camera 2 (2a), the edge image generation unit 13 detects portions where the brightness changes greatly as edges and generates an edge image consisting only of those edges. For example, edge detection is performed by applying, at each pixel, an operator with weighting coefficients (a coefficient matrix such as a Sobel operator or a Kirsch operator) to the pixels in the neighborhood of that pixel.
[0104]
That is, as shown in FIG. 13, the input image analysis unit 10B generates three images: the distance image DE, which expresses the parallax between the right camera image and the left camera image at time t as pixel values; the edge image ED, obtained by extracting edges from the right camera image at time t; and the difference image DI, obtained by taking the difference between the right camera image at time t and the right camera image at time t + 1 and expressing pixels with a difference as the pixel value “1” and pixels without a difference as the pixel value “0”.
When the camera image is a color image and the moving object is specified as a person, the edge image generation unit 13 detects an edge by detecting, for example, the color (skin color) of a person's face as color information. It is also possible.
[0105]
The target distance image generating unit (target distance image generating means) 22B generates a target distance image consisting of pixels corresponding to the target distance set by the target distance setting unit 21. The target distance image generating unit 22B first obtains, from the distance image in which the parallax generated by the distance information generation unit 11 is embedded, the pixel positions corresponding to the target distance ± α notified from the target distance setting unit 21 (where α is several tens of centimeters when a person is to be detected). Then, only the pixels at those pixel positions are extracted from the edge image generated by the edge image generation unit 13, and the target distance image is generated. That is, the target distance image is an image in which a moving object present at the target distance is expressed by its edges.
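A minimal sketch of this masking step, assuming the target distance ± α has already been converted into an inclusive parallax range. The parameter names `parallax_lo` and `parallax_hi` are hypothetical.

```python
def edge_target_image(edge_image, parallax_image, parallax_lo, parallax_hi):
    """Keep an edge pixel only where the parallax at the same position lies
    within [parallax_lo, parallax_hi]; all other pixels become 0."""
    return [
        [e if parallax_lo <= d <= parallax_hi else 0 for e, d in zip(e_row, d_row)]
        for e_row, d_row in zip(edge_image, parallax_image)
    ]
```

Because the result already consists of edges, a later contour extraction step can work on it without running edge detection again.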
[0106]
The contour extracting unit (contour extracting means) 24B extracts a contour within the moving object area (target area) set by the target area setting unit 23 in the target distance image generated by the target distance image generating unit 22B. The extracted contour (contour information) is output to the outside as the output of the moving object detection device 1B and is also notified to the distance information updating unit 25. Extraction of a contour by the contour extracting unit 24B means that a moving object has been detected.
[0107]
In the contour extracting unit 24B, since the target distance image generated by the target distance image generating unit 22B is already expressed by an edge, a contour is extracted from the edge by using a dynamic contour model (SNAKES) or the like. That is, in the contour extracting unit 24B, the edge detection performed by the contour extracting unit 24 (FIG. 1) can be omitted.
[0108]
The configuration of the moving object detection device 1B according to the second embodiment has been described above. Each unit of the moving object detection device 1B can also be realized on a computer as a functional program, and these functional programs can be combined and run as a moving object detection program.
[0109]
Further, the moving object detection device 1B may generate the distance image using the three or more cameras in the distance information generation unit 11. In this case, the motion information generation unit 12 and the edge image generation unit 13 generate a difference image and an edge image based on a camera image input from a reference camera.
Further, the moving object detection device 1B can be incorporated in a moving object such as a mobile robot or an automobile and used for detecting an object such as a person.
[0110]
(Operation of the moving object detection device 1B)
Next, the operation of the moving object detection device 1B will be briefly described with reference to FIGS. 10, 11, and 12. FIGS. 11 and 12 are flowcharts showing the operation of the moving object detection device 1B.
[0111]
First, the moving object detection device 1B inputs camera images in time series from the two synchronized cameras 2 (step S21). Then, the distance information generation unit 11 generates, from the two camera images input from the right camera 2a (reference camera) and the left camera 2b at time t, a distance image in which the parallax (distance) to the imaging target is embedded (step S22). Further, the motion information generation unit 12 takes the difference between the two camera images (reference camera images) captured by the right camera 2a (reference camera) at time t and time t + 1, and generates a difference image in which pixels with a difference have the pixel value “1” and pixels without a difference have the pixel value “0” (step S23). Then, the edge image generation unit 13 generates an edge image by extracting edges from the camera image (reference camera image) captured by the right camera 2a (reference camera) at time t (step S24).
[0112]
Then, the moving object detection device 1B uses the target distance setting unit 21 to accumulate, from the distance image and the difference image generated in steps S22 and S23, for each parallax (distance) represented in the distance image, the pixel values of the difference image at the same positions as the pixels having that parallax (step S25). Then, the distance at which the number of pixels with motion (difference), that is, the cumulative total of the pixel values, is the maximum is set as the target distance of the moving object to be detected (step S26). Then, the target distance image generation unit 22B generates a target distance image by extracting from the edge image the pixels corresponding to the target distance ± α (step S27). Here, it is assumed that a person is to be detected, and α is set to several tens of centimeters.
[0113]
Then, the moving object detection device 1B uses the target area setting unit 23 to measure, as a histogram, the pixel values of the target distance image generated in step S27 accumulated along the vertical direction (step S28). Then, a range of a specific size (for example, 0.5 to 0.6 (m)) to the left and right of the horizontal position at which the histogram is at its maximum is set as the horizontal range of the target area (step S29). Further, the vertical range of the target area is set based on camera parameters input from the camera 2, such as the tilt angle and the height from the floor (installation surface) (step S30).
[0114]
In the moving object detection device 1B, the contour extraction unit 24B extracts a contour within the target area set in steps S29 and S30 in the target distance image generated in step S27 (step S31), and it is determined whether or not the contour has been successfully extracted (step S32). If the contour has been successfully extracted in step S32 (Yes), the process proceeds to step S33. On the other hand, if the extraction of the contour has failed (or the extraction is not performed) (No), the operation ends.
[0115]
Then, the moving object detection device 1B uses the distance information updating unit 25 to generate, as update information, the pixel positions corresponding to the inside of the contour extracted in step S31 (the internal region including the contour), and the target distance setting unit 21 deletes the corresponding information from the distance image based on this update information (step S33). As a result, the area of the moving object that has already been extracted is deleted from the distance image. Then, the process returns to step S25 to continue the processing.
[0116]
Through the above steps, the moving object detection device 1B of the present embodiment can detect a moving object present in the camera image input from the camera 2. In the moving object detection device 1B, an edge image is generated in step S24, and a target distance image in which edges have already been detected is used for the contour extraction in step S31. Contours can therefore be extracted at high speed even when multiple moving objects are present side by side.
[0117]
[Effects of the invention]
As described above, the moving object detection device, the moving object detection method, and the moving object detection program according to the present invention have the following excellent effects.
[0118]
According to the present invention, based on a distance image (distance information) generated from camera images captured by a plurality of cameras and a difference image (motion information) generated from camera images input in time series, it is possible to identify the distance from the camera to an object in motion and to generate an image (target distance image) that focuses only on that distance. This makes it possible to detect moving objects (for example, people) that appear connected on the camera image as separate moving objects by distinguishing and separating them according to distance.
[0119]
Further, according to the present invention, the horizontal range of the moving object can be narrowed down based on the vertical pixel count of the moving object in the target distance image, so that moving objects adjacent to each other in the horizontal direction can be separated and each detected as a separate moving object.
[0120]
Furthermore, according to the present invention, the range in the vertical direction of the moving object in the target distance image can be narrowed based on the tilt angle of the camera and the height from the floor, so that the amount of calculation for contour extraction is suppressed, The processing speed for detecting a moving object can be increased.
[0121]
Further, according to the present invention, since an edge image in which edges are extracted from a camera image is generated in advance, it is not necessary to detect edges when extracting a contour for each moving object region (target region). For this reason, even when a plurality of moving objects are present on the camera image, edges are not extracted repeatedly in overlapping regions, so that the moving objects can be detected at high speed.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating an entire configuration of a moving object detection device according to a first embodiment of the present invention.
FIG. 2 is a flowchart (1/2) illustrating an operation of the moving object detection device according to the first embodiment of the present invention.
FIG. 3 is a flowchart (2/2) showing an operation of the moving object detection device according to the first embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of the contents of a distance image and a difference image.
FIG. 5 is an explanatory diagram for describing a procedure for generating a target distance image based on a motion amount (pixel value) for each parallax (distance).
FIG. 6 is an explanatory diagram illustrating a procedure for setting a target area based on a histogram.
FIG. 7 is an explanatory diagram illustrating a procedure for calculating a height of a moving object on a target distance image based on camera parameters.
FIG. 8 is a diagram illustrating an example in which a contour is extracted from a target area of a target distance image.
FIG. 9 is a diagram illustrating an example in which the content of a distance image is updated based on a region of a moving object from which a contour has been extracted.
FIG. 10 is a block diagram showing an entire configuration of a moving object detection device according to a second embodiment of the present invention.
FIG. 11 is a flowchart (1/2) illustrating an operation of the moving object detection device according to the second embodiment of the present invention.
FIG. 12 is a flowchart (2/2) showing an operation of the moving object detection device according to the second embodiment of the present invention.
FIG. 13 is a diagram illustrating an example of the contents of a distance image, a difference image, and an edge image.
[Explanation of symbols]
1, 1B ... moving object detection device
10, 10B ... input image analysis means
11 ... distance information generation unit (distance information generation means)
12 ... motion information generation unit (motion information generation means)
13 ... edge image generation unit (edge image generation means)
20, 20B ... object detecting means
21 ... target distance setting unit (target distance setting means)
22, 22B ... target distance image generating unit (target distance image generating means)
23 ... target area setting unit (target area setting means)
24, 24B ... contour extraction unit (contour extraction means)
25 ... distance information updating unit (distance information updating means)

Claims

A moving object detection device that detects a moving object existing in the imaging target from a plurality of captured images of the imaging target with a plurality of synchronized imaging units,
A distance information generating unit configured to generate a distance to the imaging target as distance information based on a parallax of the plurality of captured images;
Motion information generating means for generating the motion of the moving object as motion information based on a difference between captured images input in time series from at least one of the plurality of image capturing means;
A target distance setting unit that sets a target distance at which the moving object exists based on the distance information and the motion information;
Based on the distance information, a target distance image generation unit that generates a target distance image composed of pixels corresponding to the target distance set by the target distance setting unit,
In the target distance image, at least corresponding to the target distance, a target region setting means for setting a target region to be a target for detecting the moving object,
By extracting a contour from the target area set by the target area setting means, a contour extracting means for detecting the moving object,
A moving object detection device comprising:

2. The moving object detection device according to claim 1, wherein the target distance setting unit obtains, for each distance, a cumulative value of pixels that have moved, and sets the target distance at which the moving object is present based on the cumulative value.

3. The moving object detection device according to claim 1, wherein the target distance image generation unit generates a target distance image including pixels existing within a predetermined range in a depth direction with reference to at least the target distance.

4. The moving object detection device according to any one of claims 1 to 3, wherein the target area setting unit sets the target area within a predetermined range in a horizontal direction from a peak of the pixel amount, based on a vertical pixel amount in the target distance image.

5. The moving object detection device according to claim 1, wherein the target area setting unit sets a vertical range of the target area based on at least a tilt angle of the imaging unit and a height from an installation surface.

6. The moving object detection device according to any one of the preceding claims, further comprising edge image generating means for generating an edge image by extracting edges of the captured image based on color information or shade information of each pixel of the captured image,
wherein the target distance image generating unit extracts pixels of the edge image corresponding to the target distance based on the distance information to generate the target distance image.

7. The moving object detection device according to claim 1, further comprising a distance information updating unit configured to update the distance information by treating the inner region of the contour extracted by the contour extracting means as an already extracted region of the moving object.

A moving object detection method for detecting a moving object that is moving in an imaging target by using distance information to the imaging target, generated based on captured images captured by a plurality of synchronized imaging units, and motion information, generated based on captured images input in time series from at least one of the plurality of imaging units,
Based on the distance information and the motion information, a target distance setting step of setting a target distance where the moving object is present,
Based on the distance information, a target distance image generating step of generating a target distance image composed of pixels corresponding to the target distance set in the target distance setting step,
In the target distance image, at least corresponding to the target distance, a target region setting step of setting a target region to be a target for detecting the moving object,
A contour extraction step of detecting the moving object by extracting a contour from the target area set in the target area setting step;
A moving object detection method comprising:

In order to detect a moving object that is moving in an imaging target by using distance information to the imaging target, generated based on captured images captured by a plurality of synchronized imaging units, and motion information, generated based on captured images input in time series from at least one of the plurality of imaging units, a computer is caused to function as:
A target distance setting unit that sets a target distance at which the moving object exists based on the distance information and the motion information;
A target distance image generation unit configured to generate a target distance image including pixels corresponding to the target distance set by the target distance setting unit based on the distance information;
In the target distance image, at least corresponding to the target distance, a target region setting means for setting a target region to be a target for detecting the moving object,
Contour extracting means for detecting the moving object by extracting a contour from the target area set by the target area setting means;
A moving object detection program characterized by causing the computer to function as the above means.