JP3664892B2 - Stereo image processing method and apparatus and intruding object monitoring system

Info

Publication number: JP3664892B2
Application number: JP27877698A
Authority: JP
Inventors: 本泰之道; 田勝政恩
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1998-09-30
Filing date: 1998-09-30
Publication date: 2005-06-29
Anticipated expiration: 2018-09-30
Also published as: JP2000115810A

Description

【０００１】
【発明の属する技術分野】
本発明は、ステレオ画像処理を用いた侵入者等の監視方法および装置に関する。
【０００２】
【従来の技術】
画像処理を用いて侵入者等を検出する従来の技術として、一台の撮像装置を用いる単眼画像処理では、撮影時刻の異なる過去と現在の画像同士を差分処理することによって、変化が生じた画像領域を抽出し、この領域を侵入者が存在する領域と判断し、画像上での侵入者領域の重心位置を２次元的に追跡するなどして、画像上での２次元的な移動量、見かけの大きさ、移動方向等から前記動物体が侵入者であるか、検出する必要のない小動物等であるかを識別していた。
【０００３】
一方、ステレオ画像処理を用いた技術としては、例えば特開平７−９７３３７号公報に記載の侵入物監視装置がある。検出すべき物体の識別において、あらかじめ物体の存在しない時の３次元情報を獲得しておき、現在の３次元情報と比較して、過去と現在の３次元情報の間に変化があり、かつ物体の大きさが所定の大きさであれば警報を出力するようになっている。
【０００４】
具体的な構成を図８を用いて説明する。光軸が互いに平行であり、かつ互いの画像が水平になるように配置された２台の撮像装置からなるステレオカメラ１０によって撮影された画像は、ステレオ画像処理装置２０に入力される。ステレオ画像処理装置２０では、例えば左の撮像装置の画像をいくつかの小領域に分割し、各小領域ごとに右画像から小領域と相関の高い領域を見つける。左右画像間にて相関の高い領域、すなわち同一物体の同一部分が撮像されている領域を見つける処理は一般的に対応付け、あるいはマッチングと呼ばれる。左右画像間において対応付けられた領域の水平方向位置のズレ量が視差であり、視差は遠方の物体ほど小さくなる。侵入物検出装置７０では、まず、視差を３画測量の原理を用いて座標変換することによって、実空間上の一点を示す位置データに変換する。
【０００５】
次に、実空間を仮想的に０. ２ｍ〜０. ５ｍの立方格子状に分割し、各立方格子内に含まれる位置データの数をカウントし、一定以上の位置データが獲得できている場合に物体が存在すると判断する。既存の物体ではないと判断されれば、次に大きさによって物体を識別し、検出すべき物体であれば警報装置９０によって警報を発する。そして、物体が撮像範囲から外れないように物体を追跡処理し、必要があればカメラ回転装置１３によりカメラを回転しながら撮影する。また、ステレオ画像処理では、複数の撮像装置からなるステレオカメラを用いるが、撮像装置と画像処理部間の画像データの伝送は、図８からも明らかなように、撮像装置の台数分の映像ケーブル６０を必要としていた。
【０００６】
【発明が解決しようとする課題】
しかしながら、従来の単眼画像処理による侵入者の検出では、２次元情報である画像を用いているため、撮像装置の光軸方向である、奥行き方向の位置や実空間での大きさは把握することができなかった。代わりに画像上で２次元的に物体を追跡し、見かけの移動量や大きさを特徴量として用いて、侵入者等とその他の物体を識別していたが、識別の精度は十分でなく、誤った警報が発生しやすかった。
【０００７】
一方、従来のステレオ画像処理を応用した侵入者の検出では、左右２枚の画像を用いることによって、実空間の３次元情報を獲得しており、過去と現在、時刻の異なる３次元情報を比較して侵入物体を抽出し、実空間での大きさを計測して侵入物体が侵入者であるかを識別していた。しかし、ある瞬間だけの大きさを用いて、侵入物体が侵入者であるかその他の物体であるかを識別するのでは、３次元情報等の計測誤差の影響を受けやすく、誤った識別結果となることが考えられる。また、移動方向や移動速度等といった３次元情報の時間的変化も計測することが出来ない。
【０００８】
また、従来のステレオ画像処理装置において、複数の撮像装置からの画像データの入力は、撮像装置の台数分の映像ケーブルを必要としていたため、配線が通常の監視カメラに比較して複雑であり、配線の簡単化が望まれている。
【０００９】
本発明は、上記従来の課題を解決するものであり、高精度で物体識別が可能であり、結果として誤報が少ないステレオ画像処理方法および装置を提供することを目的とする。また本発明は、画像を多重化することで、一本の伝送路で複数の撮像装置の画像データを伝送することが可能な侵入物体監視システムを提供することを目的とする。
【００１０】
【課題を解決するための手段】
上記目的を達成するために、本発明は、ステレオ画像処理によって基準平面より上に存在する物体を抽出し、これによって得られる物体の実空間における３次元位置を順次追跡することで、見かけの大きさや移動量ではなく、実際の大きさや移動量といった、３次元情報の時間的変化を獲得することにより、高精度で物体識別が可能であり、誤報を少なくすることができる。
【００１１】
【発明の実施の形態】
請求項１に記載の発明は、複数台の撮像装置によって撮影された複数の画像を用いて、前記複数の画像間の対応付けをすることで視差を計測し、前記視差を用いて基準平面を３次元的に推定し、前記基準平面より上に存在する物体を抽出し、前記物体の３次元位置を順次追跡し、追跡中に得られる前記物体の３次元情報の時間的変化を特徴量として前記物体を識別するステレオ画像処理方法である。
【００１２】
請求項２に記載の発明は、複数台の撮像装置によって撮影された複数の画像を用いて、前記複数の画像間の視差を計測する対応付け部と、前記視差を用いて基準平面を３次元的に推定する平面推定部と、前記基準平面より上に存在する物体を抽出する物体抽出部と、前記物体の３次元位置を順次追跡する追跡部と、追跡中に得られる前記物体の３次元情報の時間的変化を特徴量として前記物体を識別する識別部を持つステレオ画像処理装置である。
【００１３】
これらの構成により、ステレオ画像処理によって撮像空間の３次元情報を獲得し、道路、床等基準平面より上に存在する物体を抽出し、物体の３次元位置を順次追跡する事ができるため、算出される特徴量の精度も向上し、結果的にこれを用いて行う物体識別の精度をも向上することができる。特徴量として用いている３次元情報の時間的変化とは具体的に以下の様である。
【００１４】
第一の特徴量として、検出開始時刻と検出終了時間の差分を計測する。一定時間以上の間検出され続けなかった物体は、３次元情報に含まれる単発的なエラーや、飛来する鳥等、侵入者以外であると判断することができる。
【００１５】
第二の特徴量として、現在検出されている物体を３次元的に追跡し、検出開始時から検出終了時までの実空間での移動量を算出する。これによって何らかの要因で静止物体等が監視領域に置かれた場合等と、移動する侵入者を識別することができる。
【００１６】
第三の特徴量として、現在検出されている物体の追跡中に、物体の実空間での幅、高さ、奥行き等の大きさの最大値、最小値、平均値、分散値等を算出する。これによって侵入者と小動物等の物体を識別することができる。
【００１７】
第四の特徴量として、現在検出されている物体の追跡中に、物体の実空間での移動方向を算出する。これによって侵入者と退出者を区別することができる。
【００１８】
第五の特徴量として、現在検出されている物体の追跡中に、物体の実空間での移動速度を算出する。これによって侵入者と鳥等を区別することができる。なお、第一から第五の特徴量を統合的にすべて用いて物体を識別しても良いし、一部のみで侵入者を精度よく識別することも可能であるし、その他の特徴量も併せて用いてもかまわない。
【００１９】
請求項３に記載の発明は、複数台の撮像装置と、請求項２記載のステレオ画像処理装置と、前記ステレオ画像処理装置からの警報信号を受けて、前記警報信号が発せられる前後の画像を記録する画像蓄積部とを持つことを特徴とする侵入物体監視システムである。以上の構成により、請求項１、請求項２の発明の効果と同様に、ステレオ画像処理によって、撮像空間の３次元情報を獲得し、床、道路等基準平面の位置を実空間座標系で推定し、基準平面より上に存在する物体を抽出し、抽出された物体の実空間での３次元位置や大きさ等を用いて追跡する事ができるため、算出される特徴量の精度も向上し、結果的には物体を識別する精度をも向上することができる。
【００２０】
請求項４に記載の発明は、複数台の撮像装置と、前記複数台の撮像装置によって得られる複数の画像を多重して伝送する画像多重伝送部と、多重化された画像を再度複数の画像に復元する画像多重受信部と、請求項２記載のステレオ画像処理装置と、前記ステレオ画像処理装置からの警報信号を受けて、前記警報信号が発せられる前後の画像を記録する画像蓄積部を持つことを特徴とする侵入物体監視システムとしたものである。以上の構成により、複数の撮像装置とステレオ画像処理装置間の画像データ伝送において、複数の撮像装置によって得られる複数の画像を多重化して伝送する事ができるため、画像データの伝送路である映像用ケーブルが一本で済み、既設の監視カメラ用映像ケーブルを用いても複数の撮像装置の映像をステレオ画像処理装置に伝送することができると同時に、請求項１、請求項２、請求項３の発明と同様にステレオ画像処理によって得られた撮像範囲の３次元情報を用いて、基準平面を推定し、基準平面より上にある物体を抽出し、抽出された物体の実空間での３次元位置や大きさ等を用いて高度に追跡する事ができるため、算出される特徴量の精度も向上し、結果的には物体を識別する精度をも向上することができる。
【００２１】
（第１の実施の形態）
以下、本発明の実施の形態を図面を参照して説明する。
本発明の第１の実施の形態は、対応付け部と平面推定部と物体抽出部と追跡部と識別部を備えたステレオ画像処理装置であり、以下、図１、図２、図３を用いて説明する。図１において、ＳＴＣＡＭは複数台の撮像装置を示しており、ここでは光軸が平行でかつ水平に配置された左右２台の撮像装置から構成されるステレオカメラであるとする。ＳＴＰＲＣはステレオ画像処理装置を示し、その内部に対応付け部３ＤＳＮＳ、平面推定部ＥＳＴ、物体抽出部ＥＸＴ、追跡部ＴＲＣ、識別部ＲＣＧ、等を備える。以下、各部の機能について具体的に説明する。
【００２２】
（対応付け部３ＤＳＮＳ）
対応付け部３ＤＳＮＳでは、ステレオカメラによって得られた画像を逐次入力する。２台のカメラから得られる画像の内、左カメラの画像を基準画像として考える。基準画像を水平Ｘ方向へＮ分割、垂直Ｙ方向へＭ分割し、図２の基準画像I ＭＰようにＮ×Ｍ個の格子状のブロックに区切る。このブロックの単位で、もう一方の右画像中から相関の高い同一矩形領域を見つけだす対応付け処理を行う。例えば基準画像内における水平位置Ｌｘのブロックと、右画像内の水平位置Ｒｘの領域が対応づけられた場合、（１）式によって左右画像間における水平方向のズレ量Ｄが得られ、このズレ量は視差と呼ばれている。基準画像中のブロックの水平方向インデックスをＸ、垂直方向のインデックスをＹとして各ブロック毎の視差Ｄは以下のように求めることができる。
Ｄ（Ｘ，Ｙ）＝｜Ｒｘ−Ｌｘ｜・・・（１）
【００２３】
（平面推定部ＥＳＴ）
次に、計測されたブロック単位の視差を用い、道路、床等、画像中の大部分を占める、床や道路等の基準平面の位置を３次元的に推定し、平面の視差ＤＰ（Ｘ，Ｙ）として出力する。まず、ブロック毎に得られた視差Ｄ（Ｘ，Ｙ）を、図２に示すＸ−Ｙ−Ｄ座標上にプロットする。インデックスＹ（１≦Ｙ≦Ｎ）を固定したときの平面自体の視差Ｄ（Ｘ，Ｙ）は一本の直線上にあるはずなので、ハフ変換によってこの直線を近似する。この直線は推定しようとする平面を通過していることから平面通過直線ＰＬＮＬと呼ぶことにする。ハフ変換によって得られる平面通過直線の直線パラメータとして、直線からＹ軸へ下ろした垂線の長さρと、垂線の傾きθを用いることにすると、各平面通過直線ＰＬＮＬには（２）式のような関係が成立する。
ρ_Y＝Ｘsin θ_Y＋Ｙsin θ_Y ｛１≦Ｘ≦Ｍ、１≦Ｙ≦Ｎ｝・・・（２）
ただし、ρ_Y：Ｙ軸から平面通過直線までの距離
θ_Y：平面通過直線から下ろした垂線とＸ−Ｙ平面がなす角度
（添字Ｙは各インデックスＹを示す）
【００２４】
同様にすべての各インデックスＹ毎に平面通過直線を近似すると、結果として基準平面ＰＬＮは平面通過直線群で近似できるようになり、平面の位置すなわち平面の視差ＤＰ（Ｘ，Ｙ）を大局的に推定する事ができる。平面の位置を推定する場合は、画像中に検出すべき物体が無い状況で行うことが望ましいが、必ずしも無い状況でなければならない訳ではない。
【００２５】
（物体抽出部ＥＸＴ）
物体抽出部では、視差Ｄ（Ｘ，Ｙ）と、推定した平面の視差ＤＰ（Ｘ，Ｙ）を用いて座標変換を行い、物体の高さＨ、平面上での奥行き方向の距離Ｖ、平面上における幅方向の距離Ｕを算出した後、基準平面より上に存在している物体を抽出する。
実空間での物体の高さＨ（Ｘ，Ｙ）は、図３に示すような複数台の撮像装置ＳＴＣＡＭと基準平面ＰＬＮと物体ＯＢＪとの幾何学的関係において（３）式を用いて算出できる。
【数１】

・・・（３）
ただし、Ｄ（Ｘ，Ｙ）：物体ＯＢＪの視差
ｈ：撮像装置の道路面からの高さ
ＤＰ（Ｘ，Ｙ）：平面ＰＬＮの視差
【００２６】
図３中、平面上での奥行き方向の距離Ｖ（Ｘ，Ｙ）もまた、撮像装置と基準平面と物体との幾何学的関係から（４）式を用いて算出できる。（４）式では撮像装置からの距離ＺおよびＺＰが必要となるが、これらは（５）式によって視差Ｄおよび平面の視差ＤＰからそれぞれ変換して得られる。
【数２】

ただし、Ｚ（Ｘ，Ｙ）：撮像装置から物体までの距離
ｈ：撮像装置の平面からの高さ
ＺＰ（Ｘ，Ｙ）：撮像装置から平面までの距離
θ ：撮像装置の俯角
距離＝Ｂ×ｆ／視差・・・（５）
ただし、Ｂ：撮像装置の配置間隔
ｆ：撮像装置のレンズ焦点距離
【００２７】
また、基準画像上での水平方向位置Ｌｘを、物体の実空間における幅方向位置Ｕに変換するには、（６）式を用いる。
【数３】

ただし、ｈ：ステレオカメラの平面からの高さ
θ ：ステレオカメラの俯角
f ：レンズの焦点距離
Ｌｘ，Ｌｙ：基準画像中のブロックの水平位置と垂直位置
以上の座標変換処理により、物体の各部の位置を図３のような実空間における３次元座標系（Ｕ，Ｖ，Ｈ）で示すことができるようになる。
【００２８】
次に、物体の存在しているであろう領域を抽出する物体抽出処理を行う。床、道路等の平面から一定以上の高さＨをもつブロックを抽出し、実空間で互いに近距離にあるならば、同一物体と見なしてこれらをラベリングする事により実空間における重心位置を算出し、物体の３次元位置とする。
【００２９】
（追跡部ＴＲＣ）
追跡部ＴＲＣでは、前記物体検出処理にて基準平面より上に物体が存在すると判断された場合、物体の実空間上での３次元位置を順次追跡する。ある時刻Ｔにおける物体の３次元位置をＰ１（ｕ１，ｖ１．ｈ１）とし、以降の時刻Ｔ＋Δ（Δは対応付け部の処理周期）における位置をＰ２（ｕ２，ｖ２，ｈ２）とすると、実空間での移動量は（７）式で算出され、処理周期Δ間における位置変化量がある範囲以内であり、物体の幅、高さ等の変化量も一定範囲内であれば、時刻Ｔと時刻Ｔ＋Δに抽出された物体は同一物体であると考えられ、物体はＰ１（ｕ１，ｖ１. ｈ１）からＰ２（ｕ２，ｖ２，ｈ２）に移動したこととする。以上の追跡処理は物体抽出部において物体が抽出されている期間、繰り返し行われる。
【数４】

各時刻における物体の位置や大きさを、追跡データとして追跡終了まで保持する。
【００３０】
（識別部ＲＣＧ）
識別部ＲＣＧでは、追跡によって得られる前記物体の３次元情報である位置や大きさの時間的変化を示す特徴量を算出して前記物体を識別する。３次元情報の時間的変化とは、例えば検出時間、移動量、大きさ、移動方向、移動速度等の特徴量である。
【００３１】
検出時間は、同一物体を追跡によって得られる追跡開始時刻Ｔ１から追跡終了時刻Ｔ２までの差分によって算出される。一般的に鳥等は比較的高速で移動して撮像範囲から外れるので連続して検出される時間が短く、人間は比較的低速で移動するので連続して撮像範囲中で検出される時間が長い。検出時間の長短によって鳥等と侵入者とを識別することができる。
【００３２】
移動量は、物体の追跡開始時から追跡終了時までにおいて、実空間での３次元的な追跡開始位置から追跡終了位置までの距離や、追跡開始位置から物体が最遠となる距離である。対応付けのエラーや、照明変化等によって、新たな物体があるかのような３次元情報が物体抽出部により得られた場合や、木が大きく揺れている場合であっても、これらがさらに侵入者等のように移動することは考えられないので、移動する侵入者等と識別することができる。
【００３３】
大きさは、実空間での物体の幅や高さ等であり、追跡中に随時計測する。追跡中の平均の大きさや分散であっても、追跡中の最大あるいは最小の大きさであってもよい。人間の標準的な大きさと比較してある割合以上小さい場合や、ある割合以上大きい場合にこれらを侵入者以外の物体であると識別することができる。実空間における幅は物体抽出部において得られる平面上での幅方向の距離Ｕ（Ｘ，Ｙ）を用いて、物体の左端位置と右端位置の差分によって得られ、高さは床、道路等からの高さであり、物体抽出部で得られる高さＨ（Ｘ，Ｙ）を用いる。
【００３４】
移動方向は、実空間で物体が移動する方向である。追跡開始時から追跡終了時までに物体がどの方向に移動したかを算出する。例えば実空間における追跡開始点と追跡終了点を結ぶことで得られるベクトルから求めることができる。移動方向によって、施設内からの退出者と、施設外からの侵入者を識別することができる。
【００３５】
移動速度は、実空間で物体が移動する速度である。追跡中のある時刻Ｔ１からＴ２までの実空間での移動量を、所要時間であるＴ１とＴ２の差分絶対値で除算することで得られる。移動速度は鳥等小動物が比較的速く、人間等は比較的遅いので移動速度が閾値以下であれば侵入者等人間であると判断することができる。
【００３６】
検出時間、移動量、大きさ、移動方向、移動速度の合計５つの特徴量を統合的に用いて物体が何であるかの識別を行う方法としては、まず各特徴量毎に物体が侵入者であるかどうかを識別し、識別結果が真である特徴量の数が閾値ＴＨ以上あれば物体は侵入者であると判断することが考えられる。閾値ＴＨはこの場合５以下１以上の値である。
【００３７】
なお、上記の５つの特徴量をすべて用いて物体を識別しても、一部を用いて識別しても、その他の３次元情報の時間的変化を示す特徴量と併せて用いて物体識別を行ってもかまわない。
【００３８】
以上のように、本実施の形態１によれば、ステレオ画像処理によって得られる物体の実空間上の位置や大きさを用いて３次元的に追跡できるため、物体識別の特徴量である３次元情報の時間的変化を高精度に算出でき、結果として侵入物体の識別の精度も向上し、誤報の少ないステレオ画像処理装置とすることができる。
【００３９】
なお、本実施の形態１と、従来の単眼画像処理と、従来のステレオ画像処理との構成の違いを比較したのが表１である。
【表１】

【００４０】
（第２の実施の形態）
本発明の第２の実施の形態は、左右画像の対応付けによって視差を計測し、視差を用いて基準平面の位置を３次元的に推定し、基準平面上に存在する物体を抽出し、これを３次元的に追跡して、検出時間、移動量、大きさ、移動方向、移動速度等を算出し、これらを統合的に用いた識別処理を行うことで、物体の種別を識別するステレオ画像処理方法である。図４の処理のフロー図を用いて各処理ステップの処理概要を説明する。なお、各ステップのより具体的処理内容は第１の実施の形態でも説明したので省略する。
【００４１】
ステップＦ０１では、ステレオ画像を入力する。ステップＦ０２では、左右画像間で対応付け処理を行い、左画像に設定したブロック単位で視差を獲得する。ステップＦ０３では、床、道路等基準平面の位置を３次元的に推定し、ステップＦ０４では、平面位置を基準とした座標系で物体の３次元情報を表現できるよう座標変換を行う。ステップＦ０５では、平面上に存在する物体を抽出する。ステップＦ０６では、Ｆ０５にて平面上に物体が存在すると判断された場合はステップＦ０７に進み、物体がないと判断された場合はＳＴＡＲＴに戻る分岐処理を行う。ステップＦ０７では、処理周期毎に物体の位置を３次元的に順次追跡する。ステップＦ１１では、追跡によって得られる追跡開始時刻と追跡終了時刻の差分から検出時間を算出し、検出時間の長短によって物体を識別できるようにする。ステップＦ１２では、追跡によって、物体の実空間での移動方向を計測し、例えば入場者と退場者を識別できるようにする。ステップＦ１３では、物体の実空間における幅や高さ等の大きさを計測し、大きさによって、猫、鳥等小動物と、侵入者等を識別することができるようにする。ステップＦ１４では、物体の実空間における移動速度を計測し、鳥等高速に移動する物体と人間等、低速で移動する物体を区別できるようにする。ステップＦ１５では、物体の実空間における移動量を算出し、移動しない静止物体と移動する動物体を識別できるようにする。ステップＦ１６では、ステップＦ１１〜Ｆ１５にて得られた特徴量の一部あるいは全部を利用して物体の識別を行う。数種の特徴量を統合的に用いて物体を識別する方法としては、例えば侵入者の条件を満たしているかを各特徴量毎にまず判定し、各特徴量毎の識別結果が真であった数が一定以上であった場合に、物体は侵入者であると識別することが考えられる。そして、侵入者等警報を発すべき物体であったと識別された場合は、警報信号を出力する。
【００４２】
以上のように、本実施の形態２によれば、ステレオ画像処理によって得られる物体の３次元情報を用いて高度に物体追跡することによって、物体の３次元情報の時間的変化を示す特徴量が正しく得られ、結果的に精度良く物体の識別が行えるため、誤報の少ないステレオ画像処理方法とすることができる。
【００４３】
（第３の実施の形態）
本発明の第３の実施の形態は、ステレオ画像処理を用いた侵入物体監視システムである。図５に本実施の形態３におけるの侵入物体監視システムの構成を示す。複数の撮像装置ＳＴＣＡＭによって撮影された左右一組のステレオ画像は、第１の実施の形態に記載したステレオ画像処理装置ＳＴＯＲＣに入力され、このステレオ画像処理装置内の識別部によって、侵入者等警報を発すべき物体と識別された場合、警報部ＡＬＭと画像蓄積部ＳＴＲに警報信号を出力し、警報部ＡＬＭは警報を発し、画像蓄積部ＳＴＲは警報の前後の画像を磁気ディスク等に蓄積する。なお、複数の撮像装置ＳＴＣＡＭ、ステレオ画像処理装置ＳＴＰＲＣについては第１の実施の形態でも説明したので詳細は省略する。
【００４４】
警報部ＡＬＭは、画像中に侵入者等が存在すると判断された場合に、識別部ＲＣＧより出力される警報信号を受けて、音、光、振動等で監視員に注意を喚起し、あるいは侵入者に警報を発する。画像蓄積部ＳＴＲは、識別部ＲＣＧから警報信号が出力された時点の前後の画像をＨＤＤや光ディスク、テープに記録蓄積する。画像蓄積部ＳＴＲでは、画像遅延手段であるフレームバッファ等を用いることで、警報信号が発せられた数秒間前の画像も記録することができる。なお、画像のみではなく、音声等も同時に記録しておくことも可能である。モニターＭＮＴＲは、複数の画像を表示したり、画像蓄積部ＳＴＲに記録された画像を表示したりする。
【００４５】
以上のように、本実施の形態３によれば、物体の３次元位置と大きさ等ステレオ画像処理によって得られる３次元情報を用いて３次元的に物体を追跡する事で、物体識別の特徴量である３次元情報の時間的変化が高精度に得られ、最終的には誤報が少ない侵入物体監視システムとすることができる。
【００４６】
（第４の実施の形態）
本発明の第４の実施の形態は、図６に示すように、第３の実施の形態で説明した侵入物体監視装置システムの複数の撮像装置ＳＴＣＡＭとステレオ画像処理装置ＳＴＰＲＣの間に、画像多重伝送部ＳＮＤと画像多重受信部ＲＣＶを挿入した侵入物体監視システムである。複数の撮像装置ＳＴＣＡＭによって撮影された左右一組のステレオ画像は、画像多重伝送部ＳＮＤに入力される。画像多重伝送部ＳＮＤでは、複数の画像であるステレオ画像を多重化して伝送、送出する。画像多重受信部ＲＣＶでは、多重化された画像データを再度複数の画像、つまりステレオ画像に戻す処理を行う。ステレオ画像処理装置ＳＴＰＲＣ内の識別部ＲＣＧで物体を識別し、侵入者等警報を発すべき物体であるとされた場合、警報部ＡＬＭと画像蓄積部ＳＴＲに警報信号を出力し、警報部ＡＬＭは警報を発し、画像蓄積部ＳＴＲは、警報の前後の画像を磁気ディスク等に蓄積する。なお、複数の撮像装置ＳＴＣＡＭ、ステレオ画像処理装置ＳＴＰＲＣについては第１の実施の形態、警報部ＡＬＭおよび画像蓄積部ＳＴＲについては第３の実施の形態で説明したので詳細は省略する。
【００４７】
画像多重伝送部ＳＮＤは、同一時刻に撮影された複数の画像を多重して伝送する。複数の画像の多重化方法としては、フィールド毎に多重、周波数多重、時分割多重が考えられる。フィールド毎に多重する場合は、２枚のフィールド画像を同一フレーム内の第一フィールドと第二フィールドに多重化して画像を伝送する。図７を用いて具体的に説明する。ステレオ画像処理においては、同時刻に撮影された左右２枚からなるステレオ画像を用いる必要があるため、一旦それぞれの画像をフィールドメモリまたはフレームメモリＦＭＲ、ＦＭＬに保持する。次に左右の画像が保持されているメモリから、奇数あるいは偶数のいずれか同じフィールドの画像を取り出し、一枚のフレーム画像として合成、多重したのが左右多重画像ＭＬＴＩＭＧである。画像多重伝送部ＲＣＶでは、この左右多重画像を伝送、送出する。この場合、画像多重受信部ＲＣＶでは、左右多重画像像からそれぞれ左フィールド画像と右フィールド画像を分離して復調する。
【００４８】
周波数多重して伝送する場合は、多重化する左右画像の映像信号帯域が互いに重ならないように周波数をずらして周波数軸上に配置して伝送する。この場合、画像多重受信部ＲＣＶは、複数の画像が周波数多重された信号から、それぞれの画像の映像信号帯域を、帯域通過フィルタを用いて分離して復調する。周波数多重は、放送、電話網等でも用いられている。
【００４９】
時分割多重して伝送する場合は、複数の画像が同時刻に撮影された右画像と左画像の２つの画像であるとした場合、時刻Ｔ１には右画像の信号、時刻Ｔ２には左画像の信号、時刻Ｔ３には右画像の信号といった順に順次切り替えて伝送する。この場合画像多重受信部では、伝送された順番で各画像のデータを受信して左右の画像を復調する。なお、いずれの多重化方法も伝送路は無線であってもかまわないし、光ケーブルを用いて多重化した画像を伝送することも可能である。
【００５０】
以上のように、本実施の形態４によれば、ステレオ画像処理によって得られる物体の３次元位置と大きさ等を用いて３次元的に追跡する事で、特徴量である３次元情報の時間的変化が高精度に得られ、最終的には誤報なく侵入者等を識別することができる。また、複数の撮像装置によって撮影された画像を一本の伝送路に多重化して伝送することで、複数の撮像装置と３次元情報処理部との間の配線を簡単にすることができ、また、既存の監視カメラ用の映像信号伝送路をそのまま用いても、複数の画像からなるステレオ画像を伝送することができるようになる。
【００５１】
【発明の効果】
以上のように、本発明によれば、第１の効果として、ステレオ画像処理によって、撮像範囲内の視差を獲得し、床や道路等基準平面を３次元的に推定し、基準平面より上にある物体を抽出し、この物体の３次元位置を順次追跡する事で、見かけではなく実際の移動量、大きさ、移動方向、移動速度等、物体の３次元情報の時間的変化を示す特徴量を精度良く取得でき、これを用いて物体を識別するため、結果として誤報の少ないステレオ画像処理方法およびステレオ画像処理装置と、侵入物体監視システムを提供することができる。
【００５２】
第２の効果として、複数の撮像装置からステレオ画像処理装置へ画像を伝送する際、多重化して伝送することによって、複数の伝送路を必要としない侵入物体監視システムとする事ができる。
【図面の簡単な説明】
【図１】本発明の第１の実施の形態におけるステレオ画像処理装置の構成図
【図２】本発明の第１の実施の形態における平面推定部が道路、床等の平面の位置を３次元的に推定する際の原理を示す説明図
【図３】本発明の第１の実施の形態にけるステレオ画像処理装置の物体抽出部が座標変換を行う様子の説明図
【図４】本発明の第２の実施の形態におけるステレオ画像処理方法の処理の流れを示すフロー図
【図５】本発明の第３の実施の形態における侵入物体監視システムの構成図
【図６】本発明の第４の実施の形態における侵入物体監視システムの構成図
【図７】本発明の第４の実施の形態における画像多重伝送部の動作内容を示す説明図
【図８】従来の技術である侵入者監視装置の構成図
【符号の説明】
ＳＴＣＡＭ複数台の撮像装置、ステレオカメラ
ＳＴＰＲＣステレオ画像処理装置
３ＤＳＮＳ対応付け部
ＥＳＴ平面推定部
ＥＸＴ物体抽出部
ＴＲＣ追跡部
ＲＣＧ識別部
ＰＬＮ道路、床等の平面
ＰＬＮＬ平面通過直線
ＩＭＰ基準画像面
ＯＢＪ侵入者等の物体
ＳＮＤ画像多重伝送部
ＲＣＶ画像多重受信部
ＳＴＲ画像蓄積部
ＭＮＴＲモニタ
ＡＬＭ警報部
ＦＭＲ，ＦＭＬ左右それぞれの画像データを保持するフレームメモリ
ＭＬＴＩＭＧ左右多重画像
ＬＩＭＧ，ＲＩＭＧ左右それぞれのフィールド画像
[0001]
[Technical Field of the Invention]
The present invention relates to a method and apparatus for monitoring intruders and the like using stereo image processing.
[0002]
[Prior art]
As a conventional technique for detecting intruders and the like by image processing, monocular image processing with a single imaging device takes the difference between past and current images captured at different times, extracts the image region in which a change has occurred, and judges this region to be the region where an intruder is present. The centroid of the intruder region is then tracked two-dimensionally on the image, and the two-dimensional amount of movement on the image, the apparent size, the moving direction, and so on are used to identify whether the moving object is an intruder or something that need not be detected, such as a small animal.
[0003]
On the other hand, as a technique using stereo image processing, there is, for example, the intruding-object monitoring apparatus described in JP-A-7-97337. To identify the object to be detected, three-dimensional information is acquired in advance while no object is present and is compared with the current three-dimensional information. An alarm is output if the three-dimensional information has changed between past and present and the object is of a predetermined size.
[0004]
A specific configuration will be described with reference to FIG. 8. Images taken by the stereo camera 10, which consists of two imaging devices arranged so that their optical axes are parallel and their images are horizontally aligned, are input to the stereo image processing device 20. In the stereo image processing device 20, for example, the image of the left imaging device is divided into several small regions, and for each small region a region with high correlation is found in the right image. The process of finding regions with high correlation between the left and right images, that is, regions in which the same part of the same object is imaged, is generally called association or matching. The horizontal offset between associated regions in the left and right images is the parallax, and the parallax becomes smaller the farther away the object is. The intruding-object detection device 70 first converts each parallax into position data indicating a point in real space by a coordinate conversion based on the principle of triangulation.
[0005]
Next, the real space is virtually divided into a cubic grid with cells of 0.2 m to 0.5 m, the number of position data points contained in each cell is counted, and an object is judged to be present when at least a certain number of position data points have been acquired. If the object is judged not to be an existing object, it is then identified by its size, and if it is an object to be detected, the alarm device 90 issues an alarm. The object is then tracked so that it does not leave the imaging range, and if necessary the camera is rotated by the camera rotating device 13 while shooting. Stereo image processing uses a stereo camera consisting of a plurality of imaging devices, and, as is clear from FIG. 8, transmitting the image data between the imaging devices and the image processing unit required as many video cables 60 as there are imaging devices.
[0006]
[Problems to be solved by the invention]
However, conventional intruder detection by monocular image processing uses images, which are two-dimensional information, and therefore could not grasp the position in the depth direction (the optical axis direction of the imaging device) or the size in real space. Instead, objects were tracked two-dimensionally on the image, and the apparent amount of movement and size were used as feature amounts to distinguish intruders from other objects, but the identification accuracy was insufficient and false alarms were likely to occur.
[0007]
On the other hand, conventional intruder detection based on stereo image processing acquires three-dimensional information of real space from the two left and right images, extracts intruding objects by comparing three-dimensional information from different times (past and present), and measures the size in real space to identify whether the intruding object is an intruder. However, identifying whether an intruding object is an intruder or some other object from its size at a single instant is easily affected by measurement errors in the three-dimensional information and can give a wrong identification result. In addition, temporal changes in the three-dimensional information, such as the moving direction and moving speed, cannot be measured.
[0008]
Further, in the conventional stereo image processing apparatus, the input of image data from a plurality of imaging devices requires video cables as many as the number of imaging devices, so the wiring is complicated compared to a normal surveillance camera, Simplification of wiring is desired.
[0009]
The present invention solves the conventional problems described above, and its object is to provide a stereo image processing method and apparatus that can identify objects with high accuracy and consequently produce few false alarms. A further object of the present invention is to provide an intruding object monitoring system that, by multiplexing images, can transmit the image data of a plurality of imaging devices over a single transmission path.
[0010]
[Means for Solving the Problems]
In order to achieve the above object, the present invention extracts objects existing above a reference plane by stereo image processing and sequentially tracks the three-dimensional position of each such object in real space. By acquiring temporal changes in three-dimensional information, such as the actual size and amount of movement rather than the apparent size and amount of movement, objects can be identified with high accuracy and false alarms can be reduced.
[0011]
[Embodiments of the Invention]
The invention according to claim 1 is a stereo image processing method that uses a plurality of images captured by a plurality of imaging devices, measures parallax by associating the plurality of images with one another, estimates a reference plane three-dimensionally using the parallax, extracts an object existing above the reference plane, sequentially tracks the three-dimensional position of the object, and identifies the object using, as feature amounts, the temporal changes in the three-dimensional information of the object obtained during tracking.
[0012]
The invention according to claim 2 is a stereo image processing apparatus having an association unit that measures the parallax between a plurality of images captured by a plurality of imaging devices, a plane estimation unit that estimates a reference plane three-dimensionally using the parallax, an object extraction unit that extracts an object existing above the reference plane, a tracking unit that sequentially tracks the three-dimensional position of the object, and an identification unit that identifies the object using, as feature amounts, the temporal changes in the three-dimensional information of the object obtained during tracking.
[0013]
With these configurations, three-dimensional information of the imaging space is acquired by stereo image processing, objects existing above a reference plane such as a road or floor are extracted, and the three-dimensional position of each object can be tracked sequentially. The accuracy of the calculated feature amounts therefore improves, and as a result so does the accuracy of the object identification that uses them. The temporal changes in three-dimensional information used as feature amounts are specifically as follows.
[0014]
As the first feature amount, the difference between the detection start time and the detection end time is measured. An object that was not detected continuously for at least a certain period can be judged to be something other than an intruder, such as a sporadic error in the three-dimensional information or a passing bird.
[0015]
As the second feature amount, the currently detected object is tracked three-dimensionally, and the movement amount in the real space from the detection start time to the detection end time is calculated. Accordingly, it is possible to distinguish a moving intruder from a case where a stationary object or the like is placed in the monitoring area for some reason.
[0016]
As the third feature amount, the maximum value, minimum value, average value, variance, and so on of the size of the object in real space, such as its width, height, and depth, are calculated while the currently detected object is being tracked. This makes it possible to distinguish an intruder from an object such as a small animal.
[0017]
As the fourth feature amount, the moving direction of the object in real space is calculated while the currently detected object is being tracked. This makes it possible to distinguish people entering from people leaving.
[0018]
As the fifth feature amount, the moving speed of the object in real space is calculated while the currently detected object is being tracked. This makes it possible to distinguish intruders from birds and the like. The object may be identified using all of the first to fifth feature amounts together, an intruder can also be identified accurately using only some of them, and other feature amounts may be used in combination as well.
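The five feature amounts listed above can be sketched as a single function over a stored track. This is only an illustration: the `Sample` record, the field names, and the choice of the start-to-end vector for direction and speed are assumptions made for the example, not details taken from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One tracked observation (hypothetical record layout)."""
    t: float       # time in seconds
    u: float       # width-direction position on the plane (m)
    v: float       # depth-direction position on the plane (m)
    h: float       # height of the object position above the plane (m)
    width: float   # measured object width (m)
    height: float  # measured object height (m)


def track_features(track):
    """Compute the five feature amounts from a list of Samples."""
    first, last = track[0], track[-1]
    duration = last.t - first.t                                # 1: detection time
    du, dv, dh = last.u - first.u, last.v - first.v, last.h - first.h
    displacement = math.sqrt(du * du + dv * dv + dh * dh)      # 2: movement amount
    heights = [s.height for s in track]
    mean_height = sum(heights) / len(heights)                  # 3: size (mean height)
    direction_deg = math.degrees(math.atan2(dv, du))           # 4: direction on the plane
    speed = displacement / duration if duration > 0 else 0.0   # 5: moving speed
    return {
        "duration": duration,
        "displacement": displacement,
        "mean_height": mean_height,
        "direction_deg": direction_deg,
        "speed": speed,
    }
```

Each returned value would then be compared against a threshold chosen for the installation; those thresholds are not specified here.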
[0019]
The invention according to claim 3 is an intruding object monitoring system having a plurality of imaging devices, the stereo image processing apparatus according to claim 2, and an image storage unit that receives an alarm signal from the stereo image processing apparatus and records the images before and after the alarm signal is issued. With this configuration, as with the inventions of claims 1 and 2, three-dimensional information of the imaging space is acquired by stereo image processing, the position of a reference plane such as a floor or road is estimated in the real-space coordinate system, objects existing above the reference plane are extracted, and each extracted object can be tracked using its three-dimensional position, size, and so on in real space. The accuracy of the calculated feature amounts therefore improves, and as a result so does the accuracy of object identification.
[0020]
The invention according to claim 4 is an intruding object monitoring system having a plurality of imaging devices, an image multiplex transmission unit that multiplexes and transmits the plurality of images obtained by the plurality of imaging devices, an image multiplex reception unit that restores the multiplexed images to a plurality of images, the stereo image processing apparatus according to claim 2, and an image storage unit that receives an alarm signal from the stereo image processing apparatus and records the images before and after the alarm signal is issued. With this configuration, the plurality of images obtained by the plurality of imaging devices can be multiplexed for transmission between the imaging devices and the stereo image processing apparatus, so a single video cable suffices as the image data transmission path, and the video from a plurality of imaging devices can be transmitted to the stereo image processing apparatus even over an existing surveillance camera video cable. At the same time, as in the inventions of claims 1, 2, and 3, the three-dimensional information of the imaging range obtained by stereo image processing is used to estimate the reference plane, objects above the reference plane are extracted, and each extracted object can be tracked in a sophisticated manner using its three-dimensional position, size, and so on in real space. The accuracy of the calculated feature amounts therefore improves, and as a result so does the accuracy of object identification.
[0021]
(First embodiment)
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
The first embodiment of the present invention is a stereo image processing apparatus provided with an association unit, a plane estimation unit, an object extraction unit, a tracking unit, and an identification unit, and is described below with reference to FIGS. 1, 2, and 3. In FIG. 1, STCAM denotes a plurality of imaging devices, here assumed to be a stereo camera composed of two imaging devices, left and right, arranged horizontally with parallel optical axes. STPRC denotes the stereo image processing apparatus, which contains the association unit 3DSNS, the plane estimation unit EST, the object extraction unit EXT, the tracking unit TRC, the identification unit RCG, and so on. The function of each unit is described specifically below.
[0022]
(Association unit 3DSNS)
The association unit 3DSNS takes in the images obtained by the stereo camera one after another. Of the images obtained from the two cameras, the image of the left camera is taken as the reference image. The reference image is divided into N parts in the horizontal X direction and M parts in the vertical Y direction, giving N × M grid-like blocks as in the reference image IMP in FIG. 2. For each block, an association process is performed to find the same rectangular region with high correlation in the other (right) image. For example, when the block at horizontal position Lx in the reference image is associated with the region at horizontal position Rx in the right image, the horizontal offset D between the left and right images is obtained by equation (1); this offset is called the parallax. With X as the horizontal index and Y as the vertical index of a block in the reference image, the parallax D of each block is obtained as follows.
D(X, Y) = |Rx − Lx| ... (1)
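The block-wise association and equation (1) can be illustrated with a small search over horizontal shifts. This is a minimal sketch, assuming rectified grayscale images held as NumPy arrays and a sum-of-absolute-differences (SAD) cost; the text only says "high correlation", so the cost function, the function name, and the parameters are assumptions.

```python
import numpy as np


def block_disparity(left, right, block=8, max_disp=16):
    """Return per-block disparities D(X, Y) using SAD block matching.

    left, right: 2-D arrays of equal shape (left image is the reference).
    The result has shape (rows // block, cols // block).
    """
    rows, cols = left.shape
    m, n = rows // block, cols // block
    disp = np.zeros((m, n), dtype=int)
    for y in range(m):
        for x in range(n):
            r0, c0 = y * block, x * block
            ref = left[r0:r0 + block, c0:c0 + block].astype(float)
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                c1 = c0 - d  # with parallel cameras the match shifts left in the right image
                if c1 < 0:
                    break
                cand = right[r0:r0 + block, c1:c1 + block].astype(float)
                cost = np.abs(ref - cand).sum()
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d  # D = |Rx - Lx|
    return disp
```

An exhaustive search like this is slow for full-size images; it is shown only to make the definition of D(X, Y) concrete.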
[0023]
(Plane estimation unit EST)
Next, the parallax measured for each block is used to estimate three-dimensionally the position of the reference plane, such as a floor or a road, that occupies most of the image, and the result is output as the plane parallax DP(X, Y). First, the parallax D(X, Y) obtained for each block is plotted in the X-Y-D coordinate system shown in FIG. 2. With the index Y (1 ≤ Y ≤ N) fixed, the parallax D(X, Y) of the plane itself should lie on a single straight line, so this line is approximated by the Hough transform. Because this line passes through the plane to be estimated, it is called the plane-passing line PLNL. If the length ρ of the perpendicular dropped from the line to the Y axis and the inclination θ of that perpendicular are used as the line parameters obtained by the Hough transform, the relation in equation (2) holds for each plane-passing line PLNL.
ρ_Y = X sin θ_Y + Y sin θ_Y {1 ≤ X ≤ M, 1 ≤ Y ≤ N} ... (2)
where ρ_Y: distance from the Y axis to the plane-passing line
θ_Y: angle between the perpendicular dropped from the plane-passing line and the X-Y plane
(the subscript Y denotes each index Y)
[0024]
When a plane-passing line is approximated in the same way for every index Y, the reference plane PLN can be approximated by the resulting group of plane-passing lines, and the position of the plane, that is, the plane parallax DP(X, Y), can be estimated globally. It is desirable to estimate the position of the plane when no object to be detected is in the image, but this is not strictly required.
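The per-row line fit can be sketched with a small Hough accumulator. The sketch below assumes the standard normal form of a line in the X-D plane, ρ = X cos θ + D sin θ, which may not match the exact parameterization intended by equation (2); the function names, the 1-degree angle step, and the ρ bin width are likewise assumptions.

```python
import numpy as np


def hough_line_row(xs, ds, rho_step=0.25):
    """Fit one plane-passing line to the (X, D) points of a single row Y.

    Returns (theta, rho) of the accumulator peak for the line
    rho = X*cos(theta) + D*sin(theta).
    """
    xs = np.asarray(xs, dtype=float)
    ds = np.asarray(ds, dtype=float)
    thetas = np.deg2rad(np.arange(1, 180))  # 1..179 degrees, so sin(theta) > 0
    rho = np.outer(xs, np.cos(thetas)) + np.outer(ds, np.sin(thetas))
    idx = np.round(rho / rho_step).astype(int)
    offset = idx.min()                      # shift so that bin indices start at 0
    acc = np.zeros((thetas.size, idx.max() - offset + 1), dtype=int)
    t_idx = np.broadcast_to(np.arange(thetas.size), idx.shape)
    np.add.at(acc, (t_idx, idx - offset), 1)  # one vote per point per angle
    t, r = np.unravel_index(acc.argmax(), acc.shape)
    return thetas[t], (r + offset) * rho_step


def plane_disparity(xs, theta, rho):
    """Evaluate the fitted line: plane parallax DP at each X for this row."""
    return (rho - np.asarray(xs, dtype=float) * np.cos(theta)) / np.sin(theta)
```

Because the peak is decided by voting, a few blocks that belong to an object rather than the plane do not pull the fitted line, which fits the remark that the plane can be estimated even when an object is present.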
[0025]
(Object extraction unit EXT)
The object extraction unit performs a coordinate conversion using the parallax D(X, Y) and the estimated plane parallax DP(X, Y) to calculate the height H of the object, the distance V in the depth direction on the plane, and the distance U in the width direction on the plane, and then extracts objects that exist above the reference plane.
The height H(X, Y) of the object in real space can be calculated with equation (3) from the geometric relationship among the plurality of imaging devices STCAM, the reference plane PLN, and the object OBJ shown in FIG. 3.
[Expression 1]

... (3)
Where D (X, Y): parallax of the object OBJ
h: Height of the imaging device from the road surface
DP (X, Y): parallax of plane PLN
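Equation (3) itself is not reproduced in this text. One form that is consistent with the symbols listed above can be derived by similar triangles, assuming that the object point and the plane point are seen along the same line of sight and that distance is inversely proportional to parallax as in equation (5). This is a reconstruction under those assumptions and may differ from the expression printed in the patent.

```latex
% Along one line of sight, height above the plane falls linearly from h
% (at the camera) to 0 (at the plane, distance Z_P). For a point at distance Z:
\[
  H(X,Y) = h\left(1 - \frac{Z(X,Y)}{Z_P(X,Y)}\right)
\]
% With Z = Bf/D and Z_P = Bf/D_P, the ratio Z/Z_P equals D_P/D, so
\[
  H(X,Y) = h\,\frac{D(X,Y) - D_P(X,Y)}{D(X,Y)}
\]
```

As a check, D = DP gives H = 0 (the point lies on the plane), and a larger parallax than the plane's (a closer point on the same line of sight) gives H > 0.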
[0026]
In FIG. 3, the distance V (X, Y) in the depth direction on the plane can also be calculated from the geometric relationship among the imaging device, the reference plane, and the object using equation (4). In the equation (4), the distances Z and ZP from the imaging device are required, and these are obtained by converting from the parallax D and the planar parallax DP, respectively, according to the equation (5).
[Expression 2]

Where Z (X, Y): distance from the imaging device to the object
h: Height of the imaging device above the plane
ZP (X, Y): Distance from the imaging device to the plane
θ: Depression angle of the imaging device
Distance = B × f / Parallax (5)
Where B: Arrangement interval of the imaging devices
f: Lens focal length of the imaging device
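Equation (5) can be written as a one-line conversion. The sketch assumes the parallax and the focal length are both expressed in pixels and the baseline in metres, so that the result is in metres; the function and parameter names are chosen for illustration.

```python
def disparity_to_distance(disparity_px, baseline_m, focal_px):
    """Equation (5): distance = B * f / parallax.

    disparity_px: parallax in pixels (must be positive)
    baseline_m:   spacing B between the two imaging devices, in metres
    focal_px:     lens focal length f expressed in pixels
    """
    if disparity_px <= 0:
        raise ValueError("parallax must be positive to yield a finite distance")
    return baseline_m * focal_px / disparity_px
```

The same function serves for both Z (from D) and ZP (from DP), since only the parallax value differs.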
[0027]
Further, the equation (6) is used to convert the horizontal position Lx on the reference image into the width direction position U in the real space of the object.
[Equation 3]

Where h: Height of the stereo camera above the plane
θ: Depression angle of the stereo camera
f: Focal length of lens
Lx, Ly: horizontal position and vertical position of the block in the reference image
With the above coordinate conversion processing, the position of each part of the object can be indicated by a three-dimensional coordinate system (U, V, H) in real space as shown in FIG.
[0028]
Next, an object extraction process is performed to extract the region where an object is likely to be present. Blocks whose height H above the plane (floor, road, or the like) is at least a certain value are extracted, and blocks that are close to one another in real space are regarded as the same object and labeled. The centroid in real space is then calculated for each label and taken as the three-dimensional position of the object.
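The height test, labeling, and centroid calculation can be sketched as a simple distance-based clustering. The thresholds `min_height` and `max_gap`, the tuple layout, and the use of straight-line distance for "close in real space" are assumptions for the example.

```python
import math


def extract_objects(blocks, min_height=0.3, max_gap=0.5):
    """Group blocks above the plane into objects and return their centroids.

    blocks: list of (u, v, h) positions in metres, one per image block.
    Returns a list of (u, v, h) centroids, one per labeled object.
    """
    pts = [b for b in blocks if b[2] >= min_height]  # keep blocks above the plane
    labels = [-1] * len(pts)
    n_labels = 0
    for i in range(len(pts)):
        if labels[i] != -1:
            continue
        labels[i] = n_labels
        stack = [i]
        while stack:  # flood-fill over blocks within max_gap of each other
            a = stack.pop()
            for b in range(len(pts)):
                if labels[b] == -1 and math.dist(pts[a], pts[b]) <= max_gap:
                    labels[b] = n_labels
                    stack.append(b)
        n_labels += 1
    centroids = []
    for k in range(n_labels):
        members = [p for p, lab in zip(pts, labels) if lab == k]
        centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centroids
```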
[0029]
(Tracking unit TRC)
The tracking unit TRC sequentially tracks the three-dimensional position of the object in real space when the object extraction process has determined that an object exists above the reference plane. If the three-dimensional position of the object at a certain time T is P1(u1, v1, h1) and its position at the subsequent time T + Δ (Δ is the processing period of the association unit) is P2(u2, v2, h2), the amount of movement in real space is calculated by equation (7). If the change in position over the processing period Δ is within a certain range, and the changes in the object's width, height, and so on are also within a certain range, the objects extracted at time T and at time T + Δ are considered to be the same object, and the object is taken to have moved from P1(u1, v1, h1) to P2(u2, v2, h2). This tracking process is repeated for as long as the object extraction unit continues to extract the object.
[Expression 4]

The position and size of the object at each time are stored as tracking data until the end of tracking.
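The same-object test used during tracking might look like the sketch below. Equation (7) is not reproduced in this text, so the example assumes the movement amount is the Euclidean distance between P1 and P2. The function name `same_object` and the gate values `max_move` and `max_size_change` are hypothetical.

```python
from math import dist


def same_object(p1, p2, size1, size2, max_move=1.0, max_size_change=0.3):
    """Decide whether detections at times T and T + delta are one object.

    p1, p2: (u, v, h) positions in real space.
    size1, size2: (width, height) of the object at each time.
    max_move, max_size_change: hypothetical gates in metres.
    Returns (is_same, movement).
    """
    # Assumed form of the movement amount: Euclidean distance in real space.
    movement = dist(p1, p2)
    # Width and height must each change by no more than the allowed amount.
    size_ok = all(abs(a - b) <= max_size_change for a, b in zip(size1, size2))
    return movement <= max_move and size_ok, movement
```

A detection that jumps several metres within one processing period fails the gate and would be treated as a different object.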
[0030]
(Identification part RCG)
The identification unit RCG identifies the object by calculating feature amounts that indicate temporal changes in the position and size of the object, that is, in the three-dimensional information obtained by tracking. These feature amounts include the detection time, movement amount, size, movement direction, and movement speed.
[0031]
The detection time is calculated as the difference between the tracking start time T1 and the tracking end time T2 obtained by tracking the same object. Generally, birds and the like move at a relatively high speed and soon leave the imaging range, so the time for which they are continuously detected is short. Humans move at a relatively low speed, so the time for which they are continuously detected within the imaging range is long. A bird or the like and an intruder can therefore be distinguished by the length of the detection time.
[0032]
The movement amount is either the distance in real space from the three-dimensional tracking start position to the tracking end position, or the distance between the tracking start position and the farthest position the object reaches before tracking ends. The object extraction unit may obtain three-dimensional information as if a new object were present because of an association error or a lighting change, or because a tree is swaying strongly. Such apparent objects cannot be expected to travel the way an intruder does, so they can be distinguished from a moving intruder.
[0033]
The size is the width or height of an object in real space, and is measured continually during tracking. The average or variance of the size during tracking may be used, or the maximum or minimum size during tracking. An object can be identified as something other than an intruder when its size is smaller than a certain ratio, or larger than a certain ratio, of the standard human size. The width in real space is obtained as the difference between the left end position and the right end position of the object, using the width-direction distance U (X, Y) on the plane obtained by the object extraction unit. The height uses the height H (X, Y) from the floor, road, or the like obtained by the object extraction unit.
[0034]
The moving direction is the direction in which an object moves in real space, calculated from the start of tracking to the end of tracking. For example, it can be obtained from the vector connecting the tracking start point and the tracking end point in real space. The moving direction makes it possible to distinguish a person leaving the facility from an intruder entering from outside the facility.
[0035]
The moving speed is the speed at which an object moves in real space. It is obtained by dividing the amount of movement in real space between times T1 and T2 during tracking by the elapsed time, that is, the absolute difference between T1 and T2. The moving speed is relatively high for small animals such as birds and relatively low for humans. Therefore, if the moving speed is equal to or lower than a threshold, the object can be judged to be an intruder or the like.
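The five feature quantities described above can be computed from stored tracking data roughly as follows. This is an illustrative sketch only: the `Sample` record, the function `track_features`, and the dictionary keys are hypothetical names, and the example assumes that U and V span the reference plane so that the direction can be expressed as an angle from the U axis.

```python
from dataclasses import dataclass
from math import atan2, degrees, dist


@dataclass
class Sample:
    t: float        # time of the observation, in seconds
    pos: tuple      # (u, v, h) position in real space, in metres
    width: float    # object width in real space, in metres
    height: float   # object height above the plane, in metres


def track_features(track):
    """Compute the feature quantities from a non-empty list of Sample."""
    first, last = track[0], track[-1]
    detection_time = last.t - first.t
    # Movement amount: distance from tracking start to tracking end.
    movement = dist(first.pos, last.pos)
    # Alternative definition: farthest distance reached from the start.
    max_excursion = max(dist(first.pos, s.pos) for s in track)
    # Direction of the start-to-end vector on the plane (assumed U-V plane).
    du = last.pos[0] - first.pos[0]
    dv = last.pos[1] - first.pos[1]
    return {
        "detection_time": detection_time,
        "movement": movement,
        "max_excursion": max_excursion,
        "mean_width": sum(s.width for s in track) / len(track),
        "mean_height": sum(s.height for s in track) / len(track),
        "direction_deg": degrees(atan2(dv, du)),
        "speed": movement / detection_time if detection_time > 0 else 0.0,
    }
```

Averaging the size over the whole track, rather than using one instant, is what makes the size feature less sensitive to measurement error in any single frame.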
[0036]
One conceivable method of identifying an object using all five feature quantities (detection time, movement amount, size, movement direction, and movement speed) is as follows. First, each feature quantity is used on its own to decide whether the object is an intruder. The object is then judged to be an intruder if the number of feature quantities for which the identification result is true is equal to or greater than a threshold value TH. In this case, the threshold value TH is a value from 1 to 5 inclusive.
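The vote-counting rule can be written in a few lines. The sketch below assumes the per-feature decisions have already been made and are passed in as booleans. The function name `is_intruder` and the default `th=3` are illustrative choices, not values given in the text.

```python
def is_intruder(votes, th=3):
    """Combine per-feature decisions by counting how many are true.

    votes: dict mapping a feature name to its True/False decision.
    th: threshold TH, which must lie between 1 and the number of features.
    """
    if not 1 <= th <= len(votes):
        raise ValueError("TH must be between 1 and the number of features")
    true_count = sum(1 for decision in votes.values() if decision)
    return true_count >= th
```

Setting TH to 5 requires every feature to agree, which minimizes false alarms at the cost of missed detections. Setting TH to 1 does the opposite.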
[0037]
Note that the object may be identified using all five of the above feature quantities, or using only some of them together with other feature quantities that indicate temporal changes in three-dimensional information.
[0038]
As described above, according to the first embodiment, the object can be tracked three-dimensionally using its position and size in real space obtained by stereo image processing, so the temporal change of the three-dimensional information, which serves as the feature quantity for object identification, can be calculated with high accuracy. As a result, the accuracy of identifying an intruding object is improved, and a stereo image processing apparatus with few false alarms can be obtained.
[0039]
Table 1 compares the difference in configuration between the first embodiment, conventional monocular image processing, and conventional stereo image processing.
[Table 1]

[0040]
(Second Embodiment)
The second embodiment of the present invention is a stereo image processing method in which parallax is measured by associating left and right images, the position of the reference plane is estimated three-dimensionally using the parallax, an object existing on the reference plane is extracted and tracked three-dimensionally, the detection time, movement amount, size, movement direction, movement speed, and the like are calculated, and the type of object is identified by an identification process that uses these in an integrated manner. The outline of each processing step will be described with reference to the processing flowchart of FIG. 4. The more specific processing content of each step has been described in the first embodiment and is omitted here.
[0041]
In step F01, a stereo image is input. In step F02, a correlation process is performed between the left and right images, and parallax is acquired in units of blocks set for the left image. In step F03, the position of a reference plane such as a floor or a road is estimated three-dimensionally, and in step F04, coordinate conversion is performed so that the three-dimensional information of the object can be expressed in a coordinate system based on the plane position. In step F05, an object existing on the plane is extracted. Step F06 is a branch: if it was determined in F05 that an object is present on the plane, the process proceeds to step F07, and if it was determined that there is no object, the process returns to START. In step F07, the position of the object is sequentially tracked three-dimensionally for each processing cycle. In step F11, the detection time is calculated from the difference between the tracking start time and the tracking end time obtained by tracking, so that the object can be identified by the length of the detection time. In step F12, the moving direction of the object in real space is measured by tracking so that, for example, a person entering and a person leaving can be distinguished. In step F13, the size of the object in real space, such as width and height, is measured, so that small animals such as cats and birds can be distinguished from intruders based on size. In step F14, the moving speed of the object in real space is measured so that an object moving at high speed, such as a bird, can be distinguished from an object moving at low speed, such as a human. In step F15, the amount of movement of the object in real space is calculated so that a stationary object can be distinguished from a moving object. In step F16, the object is identified using some or all of the feature values obtained in steps F11 to F15.
As a method for identifying an object by using several types of feature values in an integrated manner, for example, it is first determined for each feature value whether the condition for an intruder is satisfied, and if the number of feature values for which the result is true is above a certain level, the object may be identified as an intruder. Then, when the object is identified as one for which an alarm should be issued, such as an intruder, an alarm signal is output.
[0042]
As described above, according to the second embodiment, high-level object tracking is performed using the three-dimensional information of the object obtained by stereo image processing, so the feature amounts indicating the temporal change of that three-dimensional information are obtained correctly. As a result, the object can be identified with high accuracy, and a stereo image processing method with few false alarms can be obtained.
[0043]
(Third embodiment)
The third embodiment of the present invention is an intruding object monitoring system using stereo image processing. FIG. 5 shows the configuration of the intruding object monitoring system according to the third embodiment. A set of left and right stereo images taken by a plurality of imaging devices STCAM is input to the stereo image processing device STPRC described in the first embodiment. When the identification unit in the stereo image processing device detects an object for which an alarm should be issued, such as an intruder, an alarm signal is output to the alarm unit ALM and the image storage unit STR. The alarm unit ALM issues an alarm, and the image storage unit STR stores images before and after the alarm on a magnetic disk or the like. Since the plurality of imaging devices STCAM and the stereo image processing device STPRC have been described in the first embodiment, the details are omitted.
[0044]
When it is determined that an intruder or the like is present in the image, the alarm unit ALM receives the alarm signal output from the identification unit RCG and alerts the monitoring personnel with sound, light, vibration, or the like. The image storage unit STR records and accumulates images before and after the alarm signal is output from the identification unit RCG on an HDD, an optical disc, tape, or the like. The image storage unit STR can also record images from several seconds before the alarm signal is issued by using a frame buffer or the like as an image delay unit. Note that not only images but also audio and the like can be recorded simultaneously. The monitor MNTR displays a plurality of images or displays images recorded in the image storage unit STR.
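The image delay unit that lets the storage unit keep frames from before the alarm can be sketched as a fixed-length ring buffer. The class name `PreAlarmRecorder` and its parameters are hypothetical. In this sketch `post_frames` counts the frame on which the alarm arrives, and an alarm that arrives while recording is already in progress is ignored.

```python
from collections import deque


class PreAlarmRecorder:
    """Keep the last few frames so they can be stored when an alarm fires."""

    def __init__(self, pre_frames, post_frames):
        # Ring buffer acting as the image delay unit; old frames drop out.
        self._buffer = deque(maxlen=pre_frames)
        self._post_frames = post_frames
        self._remaining = 0
        self.stored = []  # stands in for the HDD, optical disc, or tape

    def push(self, frame, alarm=False):
        if alarm and self._remaining == 0:
            # Flush the delayed frames that precede the alarm.
            self.stored.extend(self._buffer)
            self._buffer.clear()
            self._remaining = self._post_frames
        if self._remaining > 0:
            # Recording is active: store the frame directly.
            self.stored.append(frame)
            self._remaining -= 1
        else:
            self._buffer.append(frame)
```

With a two-frame delay and an alarm on frame 4, the recorder stores frames 2 and 3 from the buffer, then frame 4 and the frames that follow until `post_frames` is used up.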
[0045]
As described above, according to the third embodiment, the object is tracked three-dimensionally using the three-dimensional information obtained by stereo image processing, such as the three-dimensional position and size of the object, so the temporal change of the three-dimensional information, which serves as the feature quantity for distinguishing objects, can be obtained with high accuracy. As a result, an intruding object monitoring system with few false alarms can be obtained.
[0046]
(Fourth embodiment)
The fourth embodiment of the present invention is, as shown in FIG. 6, an intruding object monitoring system in which an image multiplex transmission unit SND and an image multiplex reception unit RCV are inserted between the plurality of imaging devices STCAM and the stereo image processing device STPRC of the intruding object monitoring system described in the third embodiment. A set of left and right stereo images captured by the plurality of imaging devices STCAM is input to the image multiplex transmission unit SND. The image multiplex transmission unit SND multiplexes and transmits the plurality of stereo images. The image multiplex reception unit RCV restores the multiplexed image data to the original plurality of images, that is, the stereo images. When the identification unit RCG in the stereo image processing device STPRC identifies an object as one for which an alarm should be issued, such as an intruder, an alarm signal is output to the alarm unit ALM and the image storage unit STR. The alarm unit ALM issues an alarm, and the image storage unit STR stores images before and after the alarm on a magnetic disk or the like. The plurality of imaging devices STCAM and the stereo image processing device STPRC have been described in the first embodiment, and the alarm unit ALM and the image storage unit STR have been described in the third embodiment.
[0047]
The image multiplex transmission unit SND multiplexes and transmits a plurality of images taken at the same time. Possible methods of multiplexing a plurality of images include multiplexing for each field, frequency multiplexing, and time division multiplexing. In the case of multiplexing for each field, two field images are multiplexed as the first field and the second field of the same frame, and that frame is transmitted. This will be specifically described with reference to FIG. Since stereo image processing requires a stereo image composed of left and right images taken at the same time, each image is temporarily held in field memories or in the frame memories FMR and FML. Next, the left-right multiplexed image MLTIMG is obtained by taking images of the same field, either odd or even, out of the memories holding the left and right images and combining them into a single frame image. The image multiplex transmission unit SND transmits the left-right multiplexed image. In this case, the image multiplex reception unit RCV separates and demodulates the left field image and the right field image from the left-right multiplexed image.
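Field multiplexing as described above can be illustrated with images represented as lists of rows. This is a sketch under assumptions: the function names are hypothetical, the even-numbered rows are arbitrarily chosen as the field taken from each image, and the left field is placed in the first field of the multiplexed frame.

```python
def multiplex_fields(left, right):
    """Combine one field from each image into a single frame.

    left, right: images as lists of rows, with equal and even row counts.
    Even-numbered rows of the result carry the left field and odd-numbered
    rows carry the right field.
    """
    if len(left) != len(right) or len(left) % 2 != 0:
        raise ValueError("images must have the same even number of rows")
    frame = []
    for row in range(0, len(left), 2):
        frame.append(left[row])    # first field: left image
        frame.append(right[row])   # second field: right image
    return frame


def demultiplex_fields(frame):
    """Split a multiplexed frame back into left and right field images."""
    return frame[0::2], frame[1::2]
```

Each recovered field has half the vertical resolution of the original image, which is the cost of carrying both views in one frame on a single video line.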
[0048]
In the case of transmission by frequency multiplexing, transmission is performed by shifting the frequency so that the video signal bands of the left and right images to be multiplexed do not overlap each other on the frequency axis. In this case, the image multiplex receiver RCV separates and demodulates the video signal band of each image from a signal obtained by frequency multiplexing of a plurality of images using a band pass filter. Frequency multiplexing is also used in broadcasting, telephone networks and the like.
[0049]
When transmitting by time division multiplexing, assuming that the plurality of images are a right image and a left image taken at the same time, the signals are switched in sequence and transmitted in the order of the right image signal at time T1, the left image signal at time T2, the right image signal at time T3, and so on. In this case, the image multiplex reception unit receives the data of each image in the order of transmission and demodulates the left and right images. With any of the multiplexing methods, the transmission path may be wireless, and it is also possible to transmit the multiplexed images over an optical cable.
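The alternating order used in time division multiplexing can be sketched with two small functions. The names `tdm_send` and `tdm_receive` are hypothetical, and the receiver in this sketch relies purely on the transmission order (right first, then left), with no synchronization marker.

```python
def tdm_send(pairs):
    """Yield images in the order right, left, right, left, and so on.

    pairs: iterable of (right, left) images captured at the same time.
    """
    for right, left in pairs:
        yield right
        yield left


def tdm_receive(stream):
    """Regroup a received stream into (right, left) stereo pairs.

    Consecutive items are paired in arrival order. A trailing image
    without a partner is dropped.
    """
    it = iter(stream)
    return zip(it, it)
```

A real link would need some way to recover alignment after a lost image. That is outside what the text describes and is not modelled here.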
[0050]
As described above, according to the fourth embodiment, the object is tracked three-dimensionally using its three-dimensional position and size obtained by stereo image processing, so the temporal change of the three-dimensional information, which serves as the feature amount, can be obtained with high accuracy, and an intruder can ultimately be identified with few false alarms. In addition, by multiplexing the images taken by the plurality of imaging devices onto a single transmission line, the wiring between the plurality of imaging devices and the stereo image processing device can be simplified. Even if an existing video signal transmission path for a surveillance camera is used as it is, a stereo image composed of a plurality of images can be transmitted.
[0051]
[Effects of the invention]
As described above, according to the present invention, as a first effect, the parallax within the imaging range is obtained by stereo image processing, a reference plane such as a floor or a road is estimated three-dimensionally, an object above the reference plane is extracted, and the three-dimensional position of this object is tracked sequentially. The feature quantities indicating the temporal change in the three-dimensional information of the object, such as the actual movement amount, size, movement direction, and movement speed, can therefore be acquired with high accuracy. Since the object is identified using these feature quantities, it is possible to provide a stereo image processing method, a stereo image processing apparatus, and an intruding object monitoring system with few false alarms.
[0052]
As a second effect, when an image is transmitted from a plurality of imaging devices to a stereo image processing device, it is possible to provide an intruding object monitoring system that does not require a plurality of transmission paths by multiplexing and transmitting the images.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a stereo image processing apparatus according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram illustrating a principle when a plane estimation unit according to the first embodiment of the present invention estimates the position of a plane such as a road or a floor in a three-dimensional manner.
FIG. 3 is an explanatory diagram illustrating a state in which the object extraction unit of the stereo image processing apparatus according to the first embodiment of the present invention performs coordinate transformation.
FIG. 4 is a flowchart showing a flow of processing of a stereo image processing method according to the second embodiment of the present invention.
FIG. 5 is a configuration diagram of an intruding object monitoring system according to a third embodiment of the present invention.
FIG. 6 is a configuration diagram of an intruding object monitoring system according to a fourth embodiment of the present invention.
FIG. 7 is an explanatory diagram showing operation contents of an image multiplex transmission unit according to the fourth embodiment of the present invention.
FIG. 8 is a configuration diagram of an intruder monitoring apparatus according to the prior art.
[Explanation of symbols]
STCAM Multiple imaging devices, stereo cameras
STPRC stereo image processing apparatus
3DSNS association unit
EST plane estimation unit
EXT Object extraction unit
TRC tracking unit
RCG identification unit
PLN Road, floor, etc.
PLNL plane passing straight line
IMP reference image plane
OBJ objects such as intruders
SND image multiplex transmission unit
RCV image multiplex receiver
STR image storage unit
MNTR monitor
ALM alarm section
FMR, FML Frame memory for holding left and right image data
MLTIMG Left and right multiple images
LIMG, RIMG Left and right field images

Claims

A stereo image processing method in which, using a plurality of images taken by a plurality of imaging devices, parallax is measured by associating the plurality of images, a reference plane is estimated three-dimensionally using the parallax, an object that exists above the reference plane is extracted, the three-dimensional position of the extracted object is sequentially tracked, and the object is identified using, as a feature quantity, the temporal change in the three-dimensional information of the object obtained during tracking.

A stereo image processing apparatus having: an association unit that measures parallax between a plurality of images taken by a plurality of imaging devices; a plane estimation unit that three-dimensionally estimates a reference plane using the parallax; an object extraction unit that extracts an object existing above the reference plane; a tracking unit that sequentially tracks the three-dimensional position of the object extracted by the object extraction unit; and an identification unit that identifies the object using a temporal change of the three-dimensional information of the object obtained during tracking.