JP2004157908A - Device and method for extracting movement information

Info

Publication number: JP2004157908A
Application number: JP2002324854A
Authority: JP
Inventors: Yutaka Yokota (横田 豊); Akira Shiosaki (汐崎 陽)
Original assignee: Dainippon Pharmaceutical Co Ltd
Current assignee: Dainippon Pharmaceutical Co Ltd
Priority date: 2002-11-08
Filing date: 2002-11-08
Publication date: 2004-06-03

Abstract

PROBLEM TO BE SOLVED: To provide a technique that extracts the movement of a target by relatively simple processing, for a device that extracts, from a moving image, the movement of a target captured in that moving image.
SOLUTION: Blocks of pixels that may contain the target are detected in each still-image frame making up the moving image. Each block is associated with a block in the preceding frame. Among the blocks of the preceding frame, the one whose direction of movement has so far been stable is selected for the association. For example, the block of interest BL1 shown in Figure A is associated with block BLp2, whose movement is more stable, as shown in Figure B. The resulting series of associated blocks gives the track of the target.
COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の技術分野】
この発明は、与えられた動画から対象物の移動情報を抽出する技術に関するものである。
【０００２】
【従来の技術】
動画像中から対象物を抽出して、その動きを検出する技術が広く用いられている。このような技術は、たとえば、店舗などの監視システム、製造ラインにおける物品の流れの監視システム、動物の行動監視などに用いられている。
【０００３】
一般的な認識手法としては、動画を構成する静止画の各フレーム間の差分を算出する方法が知られている。この方法は、図１に示すように、連続するフレームにおいてその差分画像を算出し、移動する対象物を検出するものである。図１Ｃが、フレーム１とフレーム２との差分画像である。つまり、フレーム間差分画像には、フレーム１とフレーム２の間の画像の変化部分（つまり対象物の移動）が表れることになる。動画を構成する全ての連続フレームについて差分画像を算出すれば、対象物の動きを算出することができる。このようなフレーム間差分によって対象物の移動を検出する手法を開示したものとして、特許文献１がある。
【０００４】
しかしながら、フレーム間差分による方法では、静止している対象物（図１の例ではラット）を検出できない場合があった。また、対象物全体の形状を認識する必要がある場合であっても、これを認識できないという問題もあった。
【０００５】
また、各フレームごとに背景との差分を算出して、対象物を検出するという手法も知られている。図２Ａに示すように背景フレームを用意しておき、図２Ｂの対象となるフレームと、背景フレームとの差分を算出すれば、図２Ｃに示すように、対象物の画像を得ることができる。このような手法を開示したものとして、非特許文献１がある。
【０００６】
しかしながら、背景差分による方法では、対象物が水面上にある場合などのように、背景が変化するような場合には、対象物の認識がきわめて困難となってしまう。したがって、背景差分による手法を用いる場合には、この問題を解決する必要がある。
【０００７】
また、特許文献２には、次のような手法が開示されている。フレーム毎に、１つ以上の移動物体を背景差分法などによって検出する。その際、フレーム間における移動物体の図形的特徴の類似度および移動距離に基づいて、２つのフレーム間の移動物体が、同一の移動物体であるか否かを判断する。このようにして、同一の移動物体について、その軌跡を算出する。さらに、所定のフレーム数を越えて軌跡を算出できなかった移動物体は、対象から外す。さらに、この軌跡が滑らかでないものは、対象物でないとする。上記のようにして、対象物の動きを検出している。
【０００８】
しかしながら、この手法では、フレーム間における移動物体が同一であるか否かを、まず決定する必要があり、この判断が困難であるという問題があった。
【０００９】
なお、特許文献３に示されるように、動物の一部に着色を施すことにより、対象物の特定を容易にする手法もある。この手法は、この文献に示されているように背景とする装置が予め定まっている場合には有効である。しかし、背景に現れる物体の色が予測できない場合などにこの手法を適用しようとすると、色による対象物の判定が困難であるという問題があった。
【００１０】
したがって、比較的容易な処理でありながら、対象物の移動をより確実に抽出することのできる技術が望まれている。
【００１１】
【特許文献１】
【００１２】
特開平７−２９０１５号公報
【特許文献２】
【００１３】
特開２００２−１５７５９９号公報
【特許文献３】
【００１４】
特開平８−３２９５９号公報
【非特許文献１】
【００１５】
安居院猛、長尾知晴「画像の処理と認識」昭晃堂、１９９２年。
【００１６】
【課題を解決するための手段および効果】
（１）この発明に係る移動情報抽出装置は、与えられた動画から対象物の移動情報を抽出する移動情報抽出装置であって、当該動画を構成する各静止画において候補画像を認識する候補画像認識手段と、認識した候補画像を、順次、時系列順に関連付けて一連の画像列を認識して、対象物の移動情報を抽出する画像列認識手段とを備えており、
前記画像列認識手段は、注目候補画像を、既に得られている一連の画像列に関連付ける際、複数の一連の画像列が存在する場合には、それぞれの一連の画像列の移動特性に基づいて、いずれの画像列に関連付けるかを決定するよう構成されている。
【００１７】
すなわち、一連の画像列の移動特性に基づいて、対象物の移動特性に合致しているものを選択し、これに注目画像候補を関連付けることができるので、より確実に対象物の移動情報を抽出することができる。
【００１８】
（３）この発明に係る移動情報抽出装置は、注目候補画像をいずれの画像列に関連付けるかを決定する際に、各画像列の移動方向の安定性が最も大きい画像列に関連付けることを特徴としている。
【００１９】
したがって、ノイズなど、各画像列の移動方向が目まぐるしく変化しているものを除外できる可能性を高くできる。
【００２０】
（４）この発明に係る移動情報抽出装置は、注目候補画像をいずれの画像列に関連付けるかを決定する際に、各画像列について、下式によって定義される移動量係数Ｍを算出し、最も大きい移動量係数Ｍを持つ画像列に関連づけることを特徴としている。
【００２１】
【数２】

ここで、ｉは、各静止画に連続して付されたフレーム番号であり、
ＭＸｉは、ｉ−２フレームの静止画における候補画像とｉ−１フレームの静止画における候補画像のＸ軸方向に関する移動方向をＤｉ−１とし、ｉ−１フレームの静止画における候補画像とｉフレームの静止画における候補画像のＸ軸方向に関する移動方向をＤｉとしたとき、移動方向Ｄｉ−１と移動方向Ｄｉが同方向であれば「１」、反対方向であれば「−１」とする。ただし、ｉ−１フレームの静止画とｉフレームの静止画における候補画像のＸ軸方向に関する移動量が所定のしきい値以下である場合には「０」とする。
【００２２】
ＭＹｉは、ｉ−２フレームの静止画における候補画像とｉ−１フレームの静止画における候補画像のＹ軸方向に関する移動方向をＤｉ−１とし、ｉ−１フレームの静止画における候補画像とｉフレームの静止画における候補画像のＹ軸方向に関する移動方向をＤｉとしたとき、移動方向Ｄｉ−１と移動方向Ｄｉが同方向であれば「１」、反対方向であれば「−１」とする。ただし、ｉ−１フレームの静止画とｉフレームの静止画における候補画像のＹ軸方向に関する移動量が所定のしきい値以下である場合には「０」とする。
【００２３】
したがって、簡易な演算処理によって、各画像列の移動方向の非ランダム性を判断することができる。
【００２４】
（５）この発明に係る移動情報抽出装置は、注目候補画像を、既に得られている一連の画像列に関連付ける際、注目候補画像に対して所定以上の関連度を持たない候補画像を直近に有する一連の画像列を、注目候補画像の関連付け対象から除外することを特徴としている。
【００２５】
したがって、一連の画像列に対して関連度の小さい注目候補画像を処理対象から除外し、処理の迅速化を図ることができる。
【００２６】
（６）この発明に係る移動情報抽出装置は、関連度が、直近の候補画像と注目画像の距離に基づいて算出されるものであることを特徴としている。
【００２７】
したがって、対象物の移動速度をある程度予想できる場合には、処理対象とすべき注目候補画像を絞り込むことができる。
【００２８】
（７）この発明に係る移動情報抽出装置は、注目候補画像を、既に得られている一連の画像列に関連付ける際、関連付け対象として複数の一連の画像列があって、かつ、それぞれの一連の画像列の移動特性に基づいて、いずれの画像列に関連付けるかを決定できない場合には、それぞれの一連の画像列における直近の候補画像の内、注目候補画像に最も近似した図形特性を持つ候補画像を有する画像列に関連づけることを特徴としている。
【００２９】
したがって、より正確に対象物の動きを示す一連の画像列を認識することができる。
【００３０】
（８）この発明に係る移動情報抽出装置は、関連度が、直近の候補画像と注目画像の図形特性の差に基づいて算出されるものであることを特徴としている。
【００３１】
したがって、対象物を示す画像の図形特性が大きく変化しないような場合には、この図形特性を用いて関連度を得ることにより、処理対象とすべき注目候補画像を絞り込むことができる。
【００３２】
（９）この発明に係る移動情報抽出装置は、一連の画像列の移動方向の安定性が所定値以上であれば、当該一連の画像列を、対象物であると判断することを特徴としている。
【００３３】
したがって、ノイズなど短時間に変化する移動方向を持つものを対象から除外する可能性を高くできる。
【００３４】
（１０）この発明に係る移動情報抽出装置は、一連の画像列に関連付けるべき注目候補画像が存在しない場合には、これを第１の一連の画像列とし、その後の静止画に基づいて、他の一連の画像列を認識し、その移動方向の安定性が所定値以上となれば、これを第２の一連の画像列とし、第１の一連の画像列の最後の候補画像と、第２の一連の画像列の最初の候補画像とを関連付けることによって、連続した一連の画像列とすることを特徴としている。
【００３５】
したがって、一旦、対象物とおぼしき候補画像が途切れた場合であっても、対象物の移動情報を抽出することができる。
【００３６】
（１１）この発明に係る移動情報抽出装置は、一連の画像列に対して、複数の候補画像が関連付けられて、一連の画像列が分岐を生じた場合には、分岐以後のそれぞれの画像列の移動特性に基づいて、いずれか一つの分岐に係る画像列を選択することを特徴としている。
【００３７】
すなわち、一連の画像列の移動特性に基づいて、対象物の移動特性に合致しているものを選択することができるので、より確実に対象物の移動情報を抽出することができる。
【００３８】
（１２）この発明に係る移動情報抽出装置は、一連の画像列が分岐を生じた場合に、分岐から所定数の静止画まで処理を行った時点において、分岐以後のそれぞれの画像列の内、移動方向の安定性が小さい画像列を、対象物ではないと判断することを特徴としている。
【００３９】
したがって、ノイズなど、各画像列の移動方向が目まぐるしく変化しているものを対象物の移動情報から除外できる可能性を高くできる。
【００４０】
（１３）この発明に係る移動情報抽出手段は、与えられた動画から対象物の移動情報を抽出する移動情報抽出装置であって、当該動画を構成する各静止画において候補画像を認識する候補画像認識手段と、認識した候補画像を、順次、時系列順に関連付けて一連の画像列を認識して、対象物の移動情報を抽出する画像列認識手段とを備えており、
前記画像列認識手段は、一連の画像列の移動方向の安定性が所定値以上であれば、当該一連の画像列を、対象物であると判断し、一連の画像列に関連付けるべき注目候補画像が存在しない場合には、これを第１の一連の画像列とし、その後の静止画に基づいて、他の一連の画像列を認識し、その移動方向の安定性が所定値以上となれば、これを第２の一連の画像列とし、第１の一連の画像列の最後の候補画像と、第２の一連の画像列の最初の候補画像とを関連付けることによって、連続した一連の画像列とする。
【００４１】
したがって、動画の所定期間において、対象物の認識が困難である場合においても、一連の画像列を関連付けて、対象物の動きを抽出することができる。また、この際に、移動方向の安定性に基づいて対象物か否かを判断しているので、誤判断を少なくできる。
【００４２】
（１５）この発明に係る移動情報抽出方法は、与えられた動画から対象物の移動情報を抽出する移動情報抽出方法であって、当該動画を構成する各静止画において候補画像を認識し、認識した候補画像を、順次、時系列順に関連付けて一連の画像列を認識して、対象物の移動情報を抽出するものであって、注目候補画像を、既に得られている一連の画像列に関連付ける際、複数の一連の画像列が存在する場合には、それぞれの一連の画像列の移動特性に基づいて、いずれの画像列に関連付けるかを決定するものである。
【００４３】
すなわち、一連の画像列の移動特性に基づいて、対象物の移動特性に合致しているものを選択し、これに注目画像候補を関連付けることができるので、より確実に対象物の移動情報を抽出することができる。
【００４４】
（１６）この発明の移動情報抽出方法は、関連づけのための移動特性として、対象物について予想される移動と、対象物以外の移動物について予想される移動とを比較した際に、両者を峻別可能な移動特性を用いることを特徴としている。
【００４５】
したがって、より確実に、対象物の動きを、他の物体の動きから区別して抽出することができる。
【００４６】
（１７）この発明に係る移動情報抽出方法は、与えられた動画から対象物の移動情報を抽出する移動情報抽出方法であって、当該動画を構成する各静止画において候補画像を認識し、認識した候補画像を、順次、時系列順に関連付けて一連の画像列を認識して、対象物の移動情報を抽出するものであって、前記画像列の認識においては、各静止画像における候補画像を関連付けて得られる一連の画像列の内、最も移動方向の安定性が高い一連の画像列を選択することを特徴としている。
【００４７】
したがって、ノイズなどを排除して、対象物の移動情報を抽出する可能性を高くできる。
【００４８】
この発明において、「候補画像認識手段」とは、各静止画中にある画像を認識するものである。実施形態においては、図４、ステップＳ５のラベル化処理がこれに対応している。
【００４９】
「画像列認識手段」とは、時系列順に並んだ各静止画中の候補画像を適切に関係づけて、一連の候補画像列を認識するための手段である。実施形態では、ステップＳ６がこれに対応する。
【００５０】
「移動情報の抽出」とは、何らかの形で対象物の移動を認識できるようにすることをいい、対象物の移動軌跡を算出する場合だけでなく、一連の静止画において対象物を特定する場合も含む概念である。
【００５１】
「対象物」とは、移動情報を抽出する対象である。動物、機械、人間、車などを含む概念である。
【００５２】
「候補画像」とは、対象物である可能性のある画像をいう。下記実施形態では、ブロックがこれに該当する。
【００５３】
「移動特性」とは、移動を特徴付けるものであり、速度、加速度、移動方向、移動方向の変化、移動経路などを含む概念である。下記の実施形態では、移動方向の安定性がこれに該当する。
【００５４】
「移動方向の安定性」とは、一連の静止画における候補画像の移動方向の変化が、緩やかである度合いをいうものである。下記実施形態では、移動量係数Ｍがこれに該当する。
【００５５】
「図形特性」とは、図形を特徴づける属性をいい、たとえば、外形、面積、外周長、色などである。
【００５６】
「プログラム」とは、ＣＰＵにより直接実行可能なプログラムだけでなく、ソース形式のプログラム、圧縮処理がされたプログラム、暗号化されたプログラム等を含む概念である。
【００５７】
【発明の実施の形態】
１．発明の適用分野例
この発明に係る移動情報抽出は、対象物の移動を検出する場合全般について適用することができる。たとえば、監視システムにおける不審者進入の監視、動物の行動の測定、生産ラインにおける物品の移動の測定などに用いることができる。
【００５８】
以下に示す実施形態では、対象物である動物の行動の測定に適用した場合について説明を行う。なお、動物の行動の測定は、たとえば、薬品を動物に与えた場合の当該動物の行動を観察し、薬品の薬効や副作用を評価する際に用いられている。
【００５９】
２．装置の全体構成
図３ａに、この発明の一実施形態による移動情報抽出装置２００の全体構成を示す。撮像手段１００は、対象物を撮像して、動画データを移動情報抽出装置２００に与える。候補画像認識手段１０２は、動画を構成する各静止画において、候補画像を認識する。画像列認識手段１０４は、各静止画における候補画像を、時系列順に関連付けて画像列とし、対象物の移動を認識する。軌跡演算手段１０６は、これに基づいて、対象物の軌跡を算出する。出力手段１０８は、この軌跡を表示、印刷、データなどの形式で出力するものである。
【００６０】
３．ハードウエア構成
図３ｂに、図３ａの移動情報抽出装置２００を、コンピュータ２を用いて実現する場合のハードウエア構成を示す。コンピュータ２は、ＣＰＵ６、メモリ８、記録部であるハードディスク１０、表示装置であるディスプレイ１２、入力装置であるマウス／キーボード１４、入出力インターフェイス１６を備えている。入出力インターフェイス１６には、対象物を撮像するためのカメラ（撮像装置）４が接続されている。
なお、この実施形態では、コンピュータ２によって移動情報抽出装置２００を構成したが、カメラ４を含めた全体を、移動情報抽出装置２００としてもよい。
【００６１】
ハードディスク１０には、カメラ４からの動画データに基づいて、対象物の移動情報を抽出するためのプログラムが記録されている。また、ハードディスク１０には、オペレーティングシステム２０（たとえば、マイクロソフト社のＷｉｎｄｏｗｓ（商標））も記録されている。
【００６２】
移動情報抽出プログラム１８は、オペレーティングシステム２０と協働して、ＣＰＵ６に、移動情報抽出処理を行わせる。なお、この実施形態では、オペレーティングシステム２０と協働して、移動情報抽出処理を行うようにしているが、移動情報抽出プログラム１８単独で、移動情報抽出処理を行うようにしてもよい。
【００６３】
以下、マウスの動きをカメラ４で撮像し、その行動を検出する場合を例として説明を行う。
【００６４】
４．移動情報抽出処理
４．１画像取込処理
図４に、移動情報抽出プログラム１８をフローチャートにて示す。まず、ＣＰＵ６は、カメラ４からのカラー動画データ（ＲＧＢデータ）を、インターフェイス１６を介して取り込み、ハードディスク１０に記録する（ステップＳ１）。なお、この実施形態では、予め撮像した動画データを取り込み、移動情報抽出処理を行うようにしている。もちろん、リアルタイムに画像を取り込みつつ、移動情報抽出処理を行ってもよい。
【００６５】
また、カメラ４からの動画がディジタルデータである場合には、そのまま、もしくは、所望のフォーマットに変換した後、取り込む。カメラ４からの動画がアナログデータである場合には、インターフェイス１６にて、Ａ／Ｄ変換処理を行った後、取り込む。
【００６６】
取り込まれたカラー動画は、ハードディスク１０に動画データ２２として記録される。このカラー動画データ２２は、時系列的に連続する複数フレームのカラー静止画として記録される。
【００６７】
４．２表色系変換処理
次に、ＣＰＵ６は、ＲＧＢ表色系で表されているカラー動画データを、下式に基づいて、ＴＳＬ表色系に変換する（ステップＳ２）。なお、Ｔは色合い、Ｓは鮮やかさ、Ｌは輝度である。
【００６８】
【数３】

このようにして得たＴＳＬの各成分のうち、図１７に示すユーザ設定画面において予め選択された成分を、以下の処理において用いる。図１７に示す例では、「解析に用いる表色系成分」の項目において、Ｌ成分（輝度成分）が選択されている。この実施形態では、背景や対象物の色特性に応じて、両者の区別に有用な成分を選択することができるようになっている。また、この実施形態では、２以上の成分を指定（たとえば、Ｔ成分とＬ成分を指定）することも可能にしている。
【００６９】
４．３平滑化処理
次に、ＣＰＵ６は、Ｌ成分による動画データを対象として、ノイズを除去するために、中央値フィルタによる平滑化処理を行う（ステップＳ３）。この平滑化により、微少なノイズが取り除かれるので、後に行う２値化処理において、意味のない微少なブロックが多数発生するのを防ぐことができる。つまり、対象物の大きさと比較して、極めて小さいブロックを、この平滑化によって取り除くようにしている。
【００７０】
中央値フィルタ処理は、下式に基づいて行われる。
【００７１】
【数４】

図６に、中央値フィルタ処理の概要を示す。図において、動画データの画素配列は、動画データのうちの１つの静止画データについて、各画素のＬ成分を示したものである。画素α（注目画素とする）について中央値フィルタ処理を行う場合には、次のようにする。ＣＰＵ６は、注目画素αを含む周囲の９画素のＬ成分を取得する。次に、これを、Ｌ成分の値の順に並べ、中央に来る値を求める。この中央値を、注目画素αのＬ成分値とする。
【００７２】
図６の場合であれば、周囲９画素のＬ成分値は、２，２，３，３，４，４，４，５，１０である。したがって、前から５番目の値（中央値）である４が選択され、これが注目画素αの新たなＬ成分値とされる。以上のような処理を、動画を構成する全ての静止画の全ての画素について行う。
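The median filtering described in [0071] and [0072] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `median_filter_3x3`, the list-of-rows image layout, and the choice to copy border pixels unchanged are assumptions, and the arrangement of the nine values inside the patch is invented (only the values themselves come from the FIG. 6 example).

```python
from statistics import median

def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    img is a list of rows of L-component values. Border pixels are copied
    unchanged (an assumption; the text does not describe border handling).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)  # 5th of the 9 sorted values
    return out

# The nine L values of the FIG. 6 example (layout within the patch is hypothetical).
patch = [[2, 3, 4],
         [4, 10, 2],
         [3, 5, 4]]
print(median_filter_3x3(patch)[1][1])  # 4
```

The outlier value 10 at the pixel of interest is replaced by 4, which is how isolated bright specks are suppressed before binarization.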
【００７３】
なお、図１７のユーザ設定画面において、解析に用いる表色系成分として複数の成分を指定した場合には、それぞれの成分について、独立して平滑化処理を行う。
【００７４】
４．４２値化処理
次に、ＣＰＵ６は、平滑化した動画データを２値化する（ステップＳ４）。この実施形態では、上限下限の２つのしきい値を用いて２値化するようにしている。上限下限の値は、図１７に示すユーザ設定画面によりユーザが入力する。
【００７５】
ＣＰＵ６は、２値化の際に、上限値、下限値として、図１７の画面において入力された値を用いる（図では、下限値として「１９０」、上限値として「２４０」が入力されている）。すなわち、注目画素のＬ成分値につき、上限値と下限値の範囲内にある場合には、当該注目画素の値を「１」とする。また、上記範囲外にある場合には、当該注目画素の値を「０」とする。以上のような処理を、動画を構成する全ての静止画の全ての画素について行う。
【００７６】
上記のように、上限下限の２つのしきい値を用いて２値化することにより、照明などによって発生した高輝度な領域を取り除くことができる。
【００７７】
また、解析に用いる表色系成分として、複数の成分を選択した場合には、それぞれの成分によって２値化し、全ての成分による２値化が「１」となった画素について、「１」とする。
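The two-threshold binarization of [0074] to [0077] can be sketched as below. The function names `binarize` and `combine_masks` are hypothetical, and treating both limits as inclusive is an assumption (the text only says "within the range"). The limits 190 and 240 are the values shown on the FIG. 17 settings screen.

```python
def binarize(img, lower, upper):
    """Return a 0/1 mask: 1 where lower <= value <= upper (inclusive bounds assumed)."""
    return [[1 if lower <= v <= upper else 0 for v in row] for row in img]

def combine_masks(masks):
    """Pixel-wise AND of several masks, one per selected color component."""
    return [[int(all(px)) for px in zip(*rows)] for rows in zip(*masks)]

l_component = [[100, 190, 200],
               [240, 250, 220]]
l_mask = binarize(l_component, 190, 240)
print(l_mask)  # [[0, 1, 1], [1, 0, 1]]

# Hypothetical second component (e.g. T) with its own mask.
t_mask = [[1, 1, 0],
          [1, 1, 1]]
print(combine_masks([l_mask, t_mask]))  # [[0, 1, 0], [1, 0, 1]]
```

The value 250 falls above the upper limit and is dropped, which corresponds to removing overly bright regions caused by lighting.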
【００７８】
４．５ラベル付け処理
次に、ＣＰＵ６は、上記の２値化された動画データについて、ラベル付け処理を行う（ステップＳ５）。ラベル付け処理とは、各静止画において、「１」の値を持つ画素の塊を認識する処理である。この塊をブロックと呼ぶ。つまり、ブロックとは、値「１」を持つ画素の連結成分である。
【００７９】
ＣＰＵ６は、認識した各ブロックを区別可能なように、ブロック識別子を与える。さらに、各ブロックについて、重心座標、面積、周囲長などを算出して、図７に示すようなブロック情報テーブルとしてハードディスク１０に記録する。なお、この実施形態では、面積および周囲長は、画素数によって算出している。また、図において、フレーム番号とは、動画を構成する静止画について時系列順に付した番号である。
【００８０】
なお、このラベル付け処理において、対象物の面積と比較して、あまりにも大きい面積を持つブロックや、あまりにも小さい面積を持つブロックを、以後の処理対象から外すようにしてもよい。また、面積に代えて周囲長によってこの判断を行ってもよい。
【００８１】
面積を用いるか、周囲長を用いるかは、図１７の設定画面によってユーザが決定できる。また、上限、下限も設定できる。
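The labeling step of [0078] to [0081] amounts to finding connected components of "1" pixels and recording per-block statistics. The sketch below assumes 4-connectivity and a dictionary per block; neither is specified in the text. The name `label_blocks` and the `min_area` / `max_area` parameters are hypothetical, and the perimeter column of the block information table is omitted for brevity.

```python
from collections import deque

def label_blocks(mask, min_area=1, max_area=None):
    """Find 4-connected blocks of 1-pixels; return centroid and area of each."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blocks = []
    for y0 in range(h):
        for x0 in range(w):
            if mask[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill from an unvisited 1-pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            area = len(pixels)  # area measured in pixels, as in the text
            if area < min_area or (max_area is not None and area > max_area):
                continue  # too small or too large to be the object
            cx = sum(p[0] for p in pixels) / area
            cy = sum(p[1] for p in pixels) / area
            blocks.append({"id": len(blocks), "centroid": (cx, cy), "area": area})
    return blocks

mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 1]]
print(label_blocks(mask))
print(label_blocks(mask, min_area=3))
```

With `min_area=3` the two-pixel block on the right is discarded, matching the optional size filter described in [0080].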
【００８２】
４．６移動軌跡の識別処理
上記のようにラベル付けを行った後、移動軌跡の識別処理を行う（ステップＳ６）。この実施形態の移動軌跡識別処理では、余分なブロックを関係づけから排除しつつ、各フレームのブロックを時系列順に関連付け、対象物の軌跡を得るようにしている。図５ａ、図５ｂ、図５ｃ、図５ｄに、移動軌跡識別処理の詳細なフローチャートを示す。
【００８３】
ＣＰＵ６は、まず、フレーム「２」を注目フレームとする（図５ａ、ステップＳ６１）。この注目フレームｎは、処理の進行とともに「３」「４」・・・と進んでいく。
【００８４】
次に、当該注目フレームの最初のブロックを注目ブロックとする（ステップＳ６２）。注目ブロックとは、直前フレームのいずれのブロックに関係づけるかの処理を行う対象とするブロックである。つまり、以下においては、注目ブロックを、直前フレームのいずれのブロックに関係づけるかを判断し、適切なものに関係づける処理を行っている。
【００８５】
次に、最大移動量係数Ｍｍａｘを「−１」とする（ステップＳ６３）。移動量係数については、後述する。
【００８６】
続いて、直前フレームの最初のブロックを注目直前ブロックとする（ステップＳ６４）。ここで、たとえば、注目フレームｎが図８Ａのような静止画であったとする。ステップＳ６２において、たとえば、図８ＡのブロックＢＬ１が注目ブロックとして選択される。また、直前フレームｎ−１が図８Ｂのような静止画であったとする。ステップＳ６４において、たとえば、図８ＢのブロックＢＬｐ１が注目直前ブロックとして選択される。
【００８７】
次に、ＣＰＵ６は、注目ブロックＢＬ１と注目直前ブロックＢＬｐ１の重心間の距離Ｄを、図７のブロック情報に基づいて算出する（ステップＳ６５）。この重心距離Ｄが所定値Ｔｄを超えている場合には、当該注目直前ブロックＢＬｐ１を、関係づけの対象から外すようにしている（ステップＳ６６）。つまり、注目直前ブロックＢＬｐ１は、注目ブロックＢＬ１の関連付け対象から除外される。
【００８８】
これを図９によって説明すると、以下のとおりである。図９は、説明のため、注目フレーム（図８Ａ）と直前フレーム（図８Ｂ）を重ねて示したものである。図において、注目フレームのブロックは実線で、直前フレームのブロックは破線で示している。
【００８９】
２点鎖線は、注目ブロックＢＬ１の重心を中心として描いた、所定値Ｔｄを半径とする円である。つまり、注目ブロックＢＬ１に対して関係づけを行う候補となる注目直前ブロックを、所定値Ｔｄの円内にあるものに限定している。これにより、可能性の低い注目直前ブロックを予め排除でき、処理の効率化を図ることができる。なお、所定値Ｔｄは、図１７の設定画面における「ブロック間結合閾値」としてユーザが指定できる。この値は、対象物の移動速度などに基づいて決定すればよい。
【００９０】
このように重心距離Ｄが所定値Ｔｄを超えている場合には、ステップＳ７０を経て、次のブロックＢＬｐ２を注目直前ブロックとし（ステップＳ７１）、再び、ステップＳ６４以下を実行する。
【００９１】
図９に示すように、注目直前ブロックＢＬｐ２と注目ブロックＢＬ１との重心距離Ｄは、所定値Ｔｄを下回っているので、ステップＳ６５、Ｓ６６からＳ６７に進む。ステップＳ６７では、注目直前ブロックＢＬｐ２の有する移動量係数Ｍが、現在の最大移動量係数Ｍｍａｘより大きいか（あるいは等しいか）否かを判断する。
【００９２】
この移動量係数Ｍとは、当該注目直前ブロックを含む、それ以前に関連付けられている一連のブロック（さらに前の一連のフレームの一連のブロック）が、どのように移動してきたかを示す指標である。この実施形態では、関連付けられた一連のブロックの移動方向が、どの程度安定しているか（短時間に極端に変化しないか）を示す数値として移動量係数Ｍを算出するようにしている。すなわち、移動量係数Ｍが大きければ、ブロックの移動方向が安定していることを示すように、移動量係数Ｍを定義している。これにより、ランダムに移動するノイズなどを、認識対象から除外することができる。
【００９３】
図１０Ａは、関連付けられた一連のブロックの移動状況を模式的に示すものである。注目直前ブロックＢＬｐ２は、更にその前のフレームのブロックＢＬｑ２、更にその前のフレームのブロックＢＬｒ１、更にその前のフレームのブロックＢＬｓ２に関連付けられている。この場合には、これら一連のブロックの移動方向は安定しているので、移動量係数Ｍは大きくなる。なお、移動量係数Ｍの算出については、後に詳しく説明する。また、算出された移動量係数Ｍは、図７のブロック情報テーブルに記録されているので、これを用いる。たとえば、ブロックＢＬｐ２の移動量係数Ｍは、「５２」として記録されている。
【００９４】
ここでは、ステップＳ６８において、注目直前ブロックＢＬｐ２の持つ移動量係数Ｍが最大移動量係数Ｍｍａｘとして記憶される。さらに、ＣＰＵ６は、この注目直前ブロックＢＬｐ２を、選択直前ブロックとする（ステップＳ６９）。
【００９５】
以上のように、注目直前ブロックＢＬｐ２に対する処理が終わると、次に、ブロックＢＬｐ３を、注目直前ブロックとし（ステップＳ７０、Ｓ７１）、ステップＳ６４以下を再び実行する。
【００９６】
注目直前ブロックＢＬｐ３と注目ブロックＢＬ１との重心距離Ｄは、所定値Ｔｄより小さいので（図９参照）、ステップＳ６７以下が実行される。ステップＳ６７においては、注目直前ブロックＢＬｐ３の有する移動量係数Ｍ（ここでは「３」）が、最大移動量係数Ｍｍａｘより大きいか（等しいか）否かが判断される。
【００９７】
図１０Ａに示すように、注目直前ブロックＢＬｐ３を含む一連のブロックの移動方向は、ランダムであり、移動量係数Ｍは小さい。したがって、ステップＳ７０に処理が進む。
【００９８】
この直前フレームには、未処理のブロックは残っていないので、注目ブロックＢＬ１に対する処理が終了したことになる。すなわち、注目ブロックＢＬ１との重心距離Ｄが所定値Ｔｄより小さく、かつ、最も大きな移動量係数Ｍを持つ直前ブロックＢＬｐ２がステップＳ６９の選択直前ブロックとして決定される。
【００９９】
次に、ＣＰＵ６は、選択ブロックがあるか否か、複数あるか否かを判断する（図５ｃ、ステップＳ７２、７３）。選択ブロックがなければ、図５ｄのステップＳ７９に進む。ここでは、選択直前ブロックは、１つであるから、ステップＳ７４に進む。ステップＳ７４において、ＣＰＵ６は、この選択直前ブロックＢＬｐ２を、注目ブロックＢＬ１に関係づける。すなわち、ＣＰＵ６は、図７のブロック情報テーブルにおいて、注目ブロックＢＬ１のブロック情報テーブルの「関連付」の欄に、選択直前ブロックＢＬｐ２のブロック識別子を記述する。
【０１００】
図１０Ｂに示すように、注目ブロックＢＬ１は、移動量係数Ｍの大きい（つまり、より対象物の可能性が高い）一連のブロックＢＬｐ２、ＢＬｑ２、ＢＬｒ１、ＢＬｓ２に関連付けられる。
【０１０１】
次に、ＣＰＵ６は、注目ブロックＢＬ１（および一連のブロック）について、下式に基づき、移動量係数Ｍを算出する。
【０１０２】
【数５】

ここで、ｉは、各静止画に連続して付されたフレーム番号である。
【０１０３】
ＭＸｉは、ｉ−２フレームの静止画における候補画像とｉ−１フレームの静止画における候補画像のＸ軸方向に関する移動方向をＤｉ−１とし、ｉ−１フレームの静止画における候補画像とｉフレームの静止画における候補画像のＸ軸方向に関する移動方向をＤｉとしたとき、移動方向Ｄｉ−１と移動方向Ｄｉが同方向であれば「１」、反対方向であれば「−１」とする。ただし、ｉ−１フレームの静止画とｉフレームの静止画における候補画像のＸ軸方向に関する移動量が所定のしきい値以下である場合には「０」とする。
【０１０４】
ＭＹｉは、ｉ−２フレームの静止画における候補画像とｉ−１フレームの静止画における候補画像のＹ軸方向に関する移動方向をＤｉ−１とし、ｉ−１フレームの静止画における候補画像とｉフレームの静止画における候補画像のＹ軸方向に関する移動方向をＤｉとしたとき、移動方向Ｄｉ−１と移動方向Ｄｉが同方向であれば「１」、反対方向であれば「−１」とする。ただし、ｉ−１フレームの静止画とｉフレームの静止画における候補画像のＹ軸方向に関する移動量が所定のしきい値以下である場合には「０」とする。
【０１０５】
なお、この実施形態では、予想される候補画像の面積と同じ面積を有する円を想定し、この円の半径の半分を、上記しきい値として用いるようにしている。
【０１０６】
ＣＰＵ６は、このようにして算出した移動量係数Ｍを、図７のブロック情報テーブルにおける注目ブロックＢＬ１（図７には図示せず）の「移動量係数」の欄に記録する（ステップＳ７５）。なお、移動量係数Ｍの算出は、図７のブロック情報において、既に直前のブロックＢＬｐ２に記録されているので、これを用いて下式に基づいて算出することができる。これにより、処理の迅速化を図ることができる。
【０１０７】
Ｍ＝ＭＸ＋ＭＹ＋ＭＰ
ここで、ＭＰは直前のブロックの移動量係数であり、ＭＸは注目ブロックと直前ブロックとの間のＸ軸に関する移動量係数であり、ＭＹは注目ブロックと直前ブロックとの間のＹ軸に関する移動量係数である。
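The incremental update M = MX + MY + MP, together with the stillness threshold of [0105] (half the radius of a circle whose area equals the expected block area), can be sketched as follows. The names `still_threshold`, `axis_term` and `update_m` are hypothetical. Returning 0 when the previous step had no movement along an axis is an assumption, since the text does not define a direction for that case. The centroid coordinates are invented; the values 52 and 54 are the ones the text gives for BLp2 and BL1.

```python
import math

def still_threshold(expected_area):
    """Half the radius of the circle whose area equals the expected block area."""
    return 0.5 * math.sqrt(expected_area / math.pi)

def axis_term(d_prev, d_cur, thr):
    """MX or MY for one axis: 1 same direction, -1 opposite, 0 if barely moved."""
    if abs(d_cur) <= thr:
        return 0
    if d_prev == 0:
        return 0  # assumption: no previous direction to compare against
    return 1 if (d_prev > 0) == (d_cur > 0) else -1

def update_m(mp, c_prev2, c_prev1, c_cur, thr):
    """New coefficient from the previous block's coefficient mp and three centroids."""
    mx = axis_term(c_prev1[0] - c_prev2[0], c_cur[0] - c_prev1[0], thr)
    my = axis_term(c_prev1[1] - c_prev2[1], c_cur[1] - c_prev1[1], thr)
    return mx + my + mp

thr = still_threshold(314)  # about 5 px for a block of roughly 314 px
print(update_m(52, (0, 0), (10, 10), (20, 20), thr))  # 54
print(update_m(52, (0, 0), (10, 10), (13, 12), thr))  # 52
```

Because only the previous block's stored coefficient and the last three centroids are needed, the whole sequence never has to be rescanned, which is the speed-up mentioned in [0106].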
【０１０８】
次に、ＣＰＵ６は、上記にて算出した注目ブロックの移動量係数Ｍが、所定値Ｔｍ（移動対象認識値）より大きいか否かを判断する（ステップＳ７６）。大きければ、対象物であると判断し、当該注目ブロックおよびこれに関連付けられた一連のブロックのブロック情報中の「対象物フラグ」欄を「１」にする（図７参照）。なお、所定値Ｔｍは、図１７の設定画面によって、ユーザが設定できる。
【０１０９】
続いて、一連のブロックの重心を結ぶ軌跡を算出し、ディスプレイ１２に表示する（ステップＳ７７０）。これにより、ディスプレイ１２上には、図８Ａのように、対象物の移動を示す軌跡Ｌが表示される。
【０１１０】
また、移動量係数Ｍが所定値Ｔｍより小さい場合には、ノイズであるか、あるいは、対象物と判断するにはデータの蓄積が少なすぎると判断し、対象物フラグを「０」のままにしておく。
【０１１１】
ここでは、注目ブロックＢＬ１の移動量係数Ｍ（ここでは「５４」）は、所定値Ｔｍ（ここでは、「１０」）より大きいので、対象物であると判断される。
【０１１２】
次に、ＣＰＵ６は、ステップＳ７９を経て、ステップＳ８２を実行する。ステップＳ８２では、注目フレーム中に、未処理の他のブロックがあるか否かを判断する。未処理のブロックＢＬ２、ＢＬ３が残っているので（図８Ａ参照）、ステップＳ８４に進み、ブロックＢＬ２を注目ブロックとして、再び、図５ａのステップＳ６３以下を実行する。
【０１１３】
注目ブロックＢＬ２に対して、重心距離Ｄが所定値Ｔｄ以下のものは、直前ブロックＢＬｐ３だけである。よって、ブロックＢＬ２は、図１１に示すように、直前ブロックＢＬｐ３と関係づけられる。ただし、ブロックＢＬ２（および一連のブロック）の移動方向はランダムに変化しており、その移動量係数Ｍは、所定値Ｔｍより小さくなるので、これら一連のブロックは、対象物と判定されない。したがって、ＣＰＵ６は、これらのブロックの「対象物フラグ」を「０」のままにしておく。
【０１１４】
以上のようにして、ブロックＢＬ２についての処理を終えると、注目フレーム（図８Ａ参照）には、未処理のブロックが残っていないので、ＣＰＵ６は、ステップＳ８２からＳ８６に進む。ステップＳ８６において、ＣＰＵ６は、未処理のフレームが残っているか否かを判断する。残っていれば、次のフレームを注目フレームとして（ステップＳ８５）、再び、図５ａのステップＳ６２以下を実行する。
【０１１５】
このようにして、全てのフレームの静止画について上述の処理を行うことにより、対象物の移動情報を抽出することができる。すなわち、図７に示すブロック情報テーブルにおいて、対象物フラグが「１」となっており、「関連付」欄において関連づけのなされた一連のブロックとして、対象物の移動情報を得ることができる。また、その軌跡を表示することができる。
【０１１６】
ところで、上記の手法にて処理を行うと、一連のブロックにおいて分岐を生じる可能性がある。この実施形態では、このような分岐に対応するための処理を設けている。
【０１１７】
先ほどの例では、注目フレームとして図８Ａのような静止画を想定した。ここでは、図８Ａの静止画に代えて、図１１Ａのような静止画を注目フレームとして想定する。図１１Ａにおいては、ブロックＢＬ３が存在している。
【０１１８】
この注目フレームについて、移動軌跡識別処理を行うと、図１２に示すように関連づけが行われる。ブロックＢＬ１、ＢＬ２が、それぞれ、直前ブロックＢＬｐ２、ＢＬｐ３に関連付けられるのは、図８Ａの場合と同様である。
【０１１９】
ブロックＢＬ３は、最も移動量係数Ｍの大きい直前ブロックＢＬｐ２に関連付けられる。したがって、直前ブロックＢＬｐ２には、２つのブロックＢＬ１、ＢＬ３が関連付けられることになる。すなわち、分岐を生じてしまう。
【０１２０】
ＣＰＵ６は、このような分岐が生じた場合、次のようにして分岐を解消している。まず、分岐が生じた後に、図５ｄのステップＳ７９を実行すると、ステップＳ８０に進む。ステップＳ８０では、分岐が生じてから、所定のフレーム数（たとえば、４フレーム）が経過しているか否かを判断する。ここでは、まだ、１フレームしか経過していないので、通常の処理に戻る。つまり、ステップＳ８２に進む。
【０１２１】
次のフレームの処理において、図１３に示すように、注目ブロックＢＢから所定値Ｔｄの距離内に、直前ブロックＢＬ１、ＢＬ３があることになる。また、この両直前ブロックＢＬ１、ＢＬ３の移動量係数Ｍは等しい。このため、注目ブロックＢＢをいずれの直前ブロックに関係づけるべきかが問題となる。
【０１２２】
この実施形態では、次のようにして、この問題を解決している。ＣＰＵ６は、移動量係数Ｍが等しい直前ブロックが複数ある場合、当該移動量係数Ｍが最も大きいものであるなら、これら複数の直前ブロックを選択直前ブロックとして選択する（図５ｂのステップＳ６７、Ｓ６９参照）。したがって、図１３の場合、両ブロックＢＬ１、ＢＬ３共に選択直前ブロックとして選択されることになる。
【０１２３】
さらに、ＣＰＵ６は、図５ｃのステップＳ７３において、選択直前ブロックが複数ある場合には、ステップＳ７８に進む。ステップＳ７８では、注目ブロックＢＢの図形的特徴（面積、周囲長など）により近い図形的特徴を有する直前ブロックを、最終的な選択直前ブロックと決定する（ステップＳ７８）。このようにして、図１３に示すように、たとえば、ブロックＢＬ１が選択され、これに対して注目ブロックＢＢの関連づけが行われる。
【０１２４】
さらに処理が進むと、図１４に示すように、分岐したブロックのそれぞれに、ブロックが関連付けられて行く。また、移動方向が絶えず変化しているブロックＢＦ３〜ＢＬ３についても、対象物フラグが「１」となる。これは、ブロックＢＬｐ２が既に、大きな移動量係数Ｍを持っているため、これに関係づけられたブロックＢＦ３〜ＢＬ３も大きな移動量係数を持つことになるためである。
【０１２５】
図１４のような状態において、ステップＳ７９を実行すると、ステップＳ８０に進み、さらに、ステップＳ８１に進む。分岐を生じてから、所定のフレーム数（ここでは４とする）を経過しているからである。ステップＳ８１において、ＣＰＵ６は、ブロックＢＬｐ２〜ＢＦ１までの間の移動量係数Ｍ１を算出する。これは、ブロック情報テーブルにおいて、ブロックＢＦ１の移動量係数からブロックＢＬｐ２の移動量係数を減算することにより得ることができる。ＣＰＵ６は、同様にして、ブロックＢＬｐ２〜ＢＦ３までの間の移動量係数Ｍ２を算出する。
【０１２６】
さらに、ＣＰＵ６は、移動量係数Ｍ１とＭ２を比較し、小さい方について、対象から外す。図１４の場合であれば、移動量係数Ｍ２の方が小さくなる。したがって、ブロックＢＬ３〜ＢＦ３までの各ブロックにつき、対象物フラグを「０」にする。さらに、ブロックＢＬ３とＢＬｐ２との関係づけを外す。つまり、ブロックＢＬ３の「関連付」欄に記述されているブロックＢＬｐ２を削除する。このようにして、ＣＰＵ６は、分岐を解除して正しい軌跡を得るようにしている。
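The branch resolution of [0125] and [0126] compares how much each branch added to the movement amount coefficient since the fork. A minimal sketch follows; the function name `resolve_branch` and the numeric values are hypothetical, while the block names BLp2, BF1 and BF3 follow FIG. 14. The text does not say what happens when the gains are equal, so the sketch simply keeps the first branch in that case.

```python
def resolve_branch(m_at_fork, branch_tips):
    """Pick the branch that gained the most M since the fork.

    m_at_fork   -- cumulative M stored for the block where the fork occurred
    branch_tips -- dict: newest block of each branch -> its cumulative M
    Returns (kept, dropped, gains).
    """
    gains = {tip: m - m_at_fork for tip, m in branch_tips.items()}
    kept = max(gains, key=gains.get)  # ties: first entry wins (assumption)
    dropped = [tip for tip in gains if tip != kept]
    return kept, dropped, gains

# Hypothetical table values: BLp2 holds 52, BF1 holds 60, BF3 holds 53.
kept, dropped, gains = resolve_branch(52, {"BF1": 60, "BF3": 53})
print(gains)    # {'BF1': 8, 'BF3': 1}
print(kept)     # BF1
print(dropped)  # ['BF3']
```

Subtracting the coefficient at the fork matters: both branches inherit the large value of BLp2, so only the gain after the fork separates a steady track from an erratic one.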
【０１２７】
また、図１５Ａ〜図１５Ｆに示すように、照明が写り込んだ領域５２などが生じる場合がある。このような領域５２は、図４のステップＳ４の２値化処理や、ステップＳ５のラベリング処理において、多くの場合、対象から排除される。しかし、図１５Ｃ、図１５Ｄ、図１５Ｅに示すように、対象物の画像が消失してしまうという問題を生じる。
【０１２８】
この実施形態では、このような現象が生じても、対象物の軌跡を追跡できるように、次のような処理を行っている。図１５Ｃのような状態になると、ＣＰＵ６は、対象物フラグが「１」となっている直前ブロック（図１５における画像５０のブロック）に対して、ブロックを関連付けることができない。図５ｄのステップＳ８３において、ＣＰＵ６は、このような状態を判断し、ステップＳ８７に進む。ただし、分岐が生じている場合には、対象物フラグが「１」である全てのブロックに対して、関連づけがなされなかった場合にのみ、ステップＳ８７に進む。
【０１２９】
ステップＳ８７では、対象物の追跡が途切れたことを示すために、消滅フラグを「１」にする。その後、図１５Ｆにおいてブロックが表れると、再び、これを開始点として関連づけを行う（図１６のブロックＢＮ１〜ＢＮ４参照）。これら一連のブロックＢＮ１〜ＢＮ４の移動量係数Ｍが所定値Ｔｍを越えると、この一連のブロックＢＮ１〜ＢＮ４は、対象物として認識される（ステップＳ７６、ステップＳ７７）。
【０１３０】
ＣＰＵ６は、この一連のブロックについて、その移動量係数Ｍが初めて所定値Ｔｍを越えたか否かを判断する（ステップＳ７７１）。ここでは、初めてであるので、ステップＳ７７２に進み、消滅フラグが「１」かどうかを判断する。消滅フラグが「１」であるので、ブロックの関連づけが途切れたブロックＢＺ５０と、この一連のブロックの内の最初のブロックＢＮ１とを関連付ける。このようにして、何らかの要因で一旦途切れたブロック間の関連づけを、復活させている。
【０１３１】
５．その他の実施形態
上記実施形態では、分岐が生じた場合に、対象物でないと判断した一連のブロックに対する関連づけを解除するようにし、対象物に関する一連のブロックのみを残すようにしている。しかし、分岐が生じても、そのまま処理を進め、対象物以外の軌跡も含めて表示を行い、人間が目視によって対象物の軌跡を判断するようにしてもよい。
【０１３２】
また、上記実施形態では、分岐が生じた場合に、対象物でないと判断した一連のブロックに対する関連づけを解除するようにし、対象物に関してのみ、その軌跡を出力（表示）するようにしている。しかしながら、対象物以外の物体の軌跡を、対象物の軌跡と区別可能なように（たとえば、表示色を変えるなど）して出力（表示）してもよい。
【０１３３】
上記実施形態では、対象物が一つである場合について説明をした。対象物が２以上ある場合には、上記の手法を用いつつ、対象物の色特性（輝度、色合い、彩度など）に基づいて、２つの対象物の軌跡を区別するようにすればよい。特に、背景の色と対象物の色が区別し難い場合などに有効である。
【０１３４】
上記実施形態においては、ＲＧＢカラー動画をＴＳＬカラー動画に変換し、これに基づいて処理を行った。しかし、ＨＩＳカラー画像に変換して用いてもよい。また、ＲＧＢカラー画像をそのまま用いてもよい。さらに、白黒画像に基づいて判断するようにしてもよい。
【０１３５】
上記実施形態では、移動方向の安定性を示す指標として移動量係数Ｍを用いた。この移動量係数Ｍは、Ｘ軸成分とＹ軸成分を加算して算出している。しかし、Ｘ軸成分またはＹ軸成分だけを用いてもよい。さらに、前回の動きと今回の動きとの角度差を算出し、当該角度差に基づいて移動方向の安定性を算出してもよい。
【０１３６】
上記実施形態では、移動方向の安定性がより高い直前ブロックに関連づけを行うようにしている（図１０参照）。しかし、対象物が木の葉などであって、その動く方向が目まぐるしく変わるものであり、除外したいノイズが車や人など、その動く方向が一定しているような場合には、移動方向の安定性がより低い直前ブロックに関連づけを行うようにしてもよい。また、移動方向の安定性だけでなく、検出物と検出物以外の移動物とを区別することのできる、その他の移動特性（一連のブロックの総移動距離、移動速度など）を、安定性に代えて、あるいはこれに加えて用いるようにしてもよい。すなわち、検出物について予想される動きと、検出物以外の移動物について予想される動きとを比較した際に、両者を区別できるような移動特性を、上記の関連付けに用いることができる。
【０１３７】
また、上記実施形態では、分岐が生じると、所定フレーム後にこれを解消するようにしている。しかし、分岐が生じたままにしておき、全てのフレームについて処理を終えてから、最も大きい移動方向の安定性を有する軌跡を、対象物の軌跡であると判断してもよい。
【０１３８】
さらに、上記実施形態では、時間的に古いフレームから順番に処理を行ったが、新しいフレームの順に処理を行うようにしてもよい。また、ちょうど中間のフレームから、新しい側、古い側に向けて、並行して処理を行うようにしてもよい。
【０１３９】
上記実施形態では、ステップＳ４の２値化処理において、上下２つのしきい値を用いている。しかし、上限または下限の１つのしきい値を用いて２値化処理を行うようにしてもよい。
【０１４０】
上記実施形態では、移動量係数Ｍの算出において、移動量が所定のしきい値以下である場合、ＭＸｉまたはＭＹｉを「０」としている。このしきい値は、任意の値とすることが可能である。たとえば、しきい値を「０」とすれば、少しでも動きがあった場合には、「１」または「−１」のＭＸｉまたはＭＹｉが得られる。
【０１４１】
また、上記実施形態では、重心間距離によって、ブロック間の関連づけを判断した（ステップＳ６６）。しかしながら、ブロック間の形状的類似度を用いて、ブロック間の関連づけを行うようにしてもよい。
【０１４２】
上記実施形態では、中央値フィルタ処理を用いて平滑化処理を行っている。しかし、移動平均法など他の平滑化手法を用いてもよい。
【０１４３】
上記実施形態では、判定した軌跡を表示するようにしている。しかし、プリントアウトしたり、ファイルとして記録媒体に出力したり、ネットワークで接続された他のコンピュータに出力するなどしてもよい。さらに、軌跡に代えて、あるいはこれとともに、ブロック情報テーブルの一部または全部を出力するようにしてもよい。
【０１４４】
上記実施形態では、対象物の移動情報を出力しているが、対象物の移動情報を抽出することによって、対象物の形状、対象物の存在の有無などを算出するようにしてもよい。
【０１４５】
上記実施形態では、２値化を行うことによって、ブロックを抽出している。しかし、２値化に加えて、背景画像との差分をとる処理を行ってもよい。すなわち、あらかじめ背景画像を用意しておき、各フレーム毎に差分画像を算出して、この差分画像を対象として２値化を行うようにしてもよい。
【０１４６】
上記実施形態では、図３ａに示す各機能を、コンピュータをプログラムに従って動作させることによって実現している。しかし、これら機能のうち一部または全部を、論理回路によって構成してもよい。
【図面の簡単な説明】
【図１】従来のフレーム差分による解析手法を示す図である。
【図２】従来の背景差分による解析手法を示す図である。
【図３ａ】移動情報抽出装置２００の全体的構成を示す図である。
【図３ｂ】移動情報抽出装置２００を、コンピュータを用いて実現した場合のハードウエア構成を示す図である。
【図４】移動情報抽出のためのプログラムのフローチャートである。
【図５ａ】移動軌跡識別処理の詳細を示すフローチャートである。
【図５ｂ】移動軌跡識別処理の詳細を示すフローチャートである。
【図５ｃ】移動軌跡識別処理の詳細を示すフローチャートである。
【図５ｄ】移動軌跡識別処理の詳細を示すフローチャートである。
【図６】中央値フィルタ処理による平滑化を説明するための図である。
【図７】ブロック情報テーブルを示す図である。
【図８】図８Ａは注目フレームを示す図であり、図８Ｂは直前フレームを示す図である。
【図９】図８Ａの注目フレームと図８Ｂの直前フレームとを重ねて示す図である。
【図１０】フレーム間の関連づけを模式的に示す図である。
【図１１】分岐を生じうるような注目フレームの例を示す図である。
【図１２】分岐が生じた状態を、模式的に示す図である。
【図１３】分岐後の処理を説明するための図である。
【図１４】分岐の解消を説明するための図である。
【図１５】対象物の画像が消失した場合の処理を示す図である。
【図１６】対象物の画像が消失した場合に、これを補完する処理を説明するための図である。
【図１７】移動情報抽出プログラムの諸条件設定の画面を示す図である。
【符号の説明】
１００・・・撮像手段
１０２・・・候補画像認識手段
１０４・・・画像列認識手段
１０６・・・軌跡演算手段
１０８・・・出力手段
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a technique for extracting movement information of a target object from a given moving image.
[0002]
[Prior art]
Techniques for extracting an object from a moving image and detecting its movement are widely used. They are applied, for example, in surveillance systems for stores, in systems that monitor the flow of articles on a production line, and in monitoring the behavior of animals.
[0003]
A common recognition method is to calculate the difference between successive still-image frames of the moving image. As shown in FIG. 1, a difference image is calculated from consecutive frames and the moving object is detected from it. FIG. 1C is the difference image between frame 1 and frame 2. The inter-frame difference image thus shows the part of the image that changed between frame 1 and frame 2, that is, the movement of the object. Calculating difference images for all consecutive frames of the moving image yields the motion of the object. Patent Document 1 discloses a technique that detects the movement of an object from such inter-frame differences.
[0004]
However, the inter-frame difference method sometimes fails to detect an object that is stationary (the rat in the example of FIG. 1). It also cannot recognize the shape of the whole object, even when that is required.
[0005]
There is also known a method of calculating a difference from a background for each frame and detecting an object. If a background frame is prepared as shown in FIG. 2A and the difference between the target frame in FIG. 2B and the background frame is calculated, an image of the target object can be obtained as shown in FIG. 2C. Non-Patent Document 1 discloses such a method.
[0006]
However, in the method using the background difference, it is extremely difficult to recognize the object when the background changes, such as when the object is on the water surface. Therefore, it is necessary to solve this problem when using a technique based on background subtraction.
[0007]
Patent Document 2 discloses the following method. In each frame, one or more moving objects are detected by background subtraction or a similar method. Whether moving objects in two frames are the same moving object is then determined from the similarity of their graphic features and the distance moved between the frames. The trajectory of each moving object identified in this way is calculated. A moving object whose trajectory cannot be calculated over more than a predetermined number of frames is excluded, and a moving object whose trajectory is not smooth is judged not to be the target object. The movement of the target object is detected in this manner.
[0008]
However, in this method, it is necessary to first determine whether the moving object is the same between frames, and there is a problem that it is difficult to make this determination.
[0009]
As shown in Patent Document 3, there is also a method that makes the object easier to identify by coloring a part of the animal. This method is effective when the apparatus that forms the background is fixed in advance, as in that document. However, when the colors of objects appearing in the background cannot be predicted, it is difficult to identify the object by color.
[0010]
Therefore, there is a demand for a technique that can extract the movement of an object more reliably while performing relatively easy processing.
[0011]
[Patent Document 1]
[0012]
JP-A-7-29015
[Patent Document 2]
[0013]
JP-A-2002-157599
[Patent Document 3]
[0014]
JP-A-8-32959
[Non-patent document 1]
[0015]
Takeshi Agui, Tomoharu Nagao, "Image Processing and Recognition," Shokodo, 1992.
[0016]
Means and effects for solving the problem
(1) A movement information extraction device according to the present invention extracts movement information of an object from a given moving image, and comprises: candidate image recognition means for recognizing candidate images in each still image making up the moving image; and image sequence recognition means for associating the recognized candidate images one after another in time-series order to recognize a series of images (an image sequence) and extract the movement information of the object.
When associating a candidate image of interest with the image sequences already obtained, and a plurality of image sequences exist, the image sequence recognition means determines which image sequence to associate it with based on the movement characteristics of each image sequence.
[0017]
That is, based on the movement characteristics of the image sequences, the image sequence that matches the movement characteristics of the object can be selected and the candidate image of interest associated with it, so the movement information of the object can be extracted more reliably.
[0018]
(3) The movement information extraction device according to the present invention is characterized in that, when determining which image sequence to associate the candidate image of interest with, it associates the candidate image with the image sequence whose moving direction is most stable.
[0019]
Therefore, image sequences whose moving direction changes erratically, such as noise, are more likely to be excluded.
[0020]
(4) The movement information extraction device according to the present invention is characterized in that, when determining which image sequence to associate the candidate image of interest with, it calculates for each image sequence a movement amount coefficient M defined by the following equation, and associates the candidate image with the image sequence having the largest movement amount coefficient M.
[0021]
(Equation 2)
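The equation image is not reproduced in this text. A reconstruction that is consistent with the definitions of MXi and MYi given here, and with the recursive form M = MX + MY + MP stated in paragraph [0107] of the original description, would be the following. It should be read as an inference rather than as the published formula; the summation presumably starts at the third frame of the sequence, since MXi and MYi need frames i-2, i-1 and i.

```latex
M = \sum_{i} \left( MX_i + MY_i \right)
```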

Here, i is the frame number assigned consecutively to each still image.
MXi is defined as follows. Let Di-1 be the moving direction along the X axis from the candidate image in the still image of frame i-2 to the candidate image in the still image of frame i-1, and let Di be the moving direction along the X axis from the candidate image in frame i-1 to the candidate image in frame i. MXi is "1" if Di-1 and Di are the same direction and "-1" if they are opposite. However, MXi is "0" if the amount of movement along the X axis between the candidate images in frame i-1 and frame i is equal to or smaller than a predetermined threshold.
[0022]
MYi is defined in the same way for the Y axis. Let Di-1 be the moving direction along the Y axis from the candidate image in frame i-2 to the candidate image in frame i-1, and let Di be the moving direction along the Y axis from the candidate image in frame i-1 to the candidate image in frame i. MYi is "1" if Di-1 and Di are the same direction and "-1" if they are opposite. However, MYi is "0" if the amount of movement along the Y axis between the candidate images in frame i-1 and frame i is equal to or smaller than a predetermined threshold.
[0023]
Therefore, the non-randomness of the moving direction of each image sequence can be determined by simple arithmetic processing.
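To illustrate why this coefficient separates steady movement from noise, the sketch below evaluates M over a whole sequence of centroids. It assumes M is the sum of MXi and MYi over the sequence; the function names, the centroid lists and the threshold value are hypothetical, and returning 0 when the previous step had no movement along an axis is an assumption.

```python
def direction_term(d_prev, d_cur, thr):
    """1 if the step keeps its direction, -1 if it reverses, 0 if movement <= thr."""
    if abs(d_cur) <= thr or d_prev == 0:
        return 0
    return 1 if (d_prev > 0) == (d_cur > 0) else -1

def movement_coefficient(centroids, thr):
    """Sum MXi + MYi over every frame i that has two predecessors."""
    m = 0
    for i in range(2, len(centroids)):
        for axis in (0, 1):  # 0: X axis, 1: Y axis
            d_prev = centroids[i - 1][axis] - centroids[i - 2][axis]
            d_cur = centroids[i][axis] - centroids[i - 1][axis]
            m += direction_term(d_prev, d_cur, thr)
    return m

steady = [(0, 0), (5, 5), (10, 10), (15, 15), (20, 20)]
jitter = [(0, 0), (5, 5), (0, 0), (5, 5), (0, 0)]
print(movement_coefficient(steady, thr=1))  # 6
print(movement_coefficient(jitter, thr=1))  # -6
```

A track that keeps its heading accumulates a positive M, while one that reverses at every frame accumulates a negative one, so choosing the largest M favors the steadily moving sequence.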
[0024]
(5) The movement information extraction device according to the present invention is characterized in that, when associating a candidate image of interest with the image sequences already obtained, it excludes from the association any image sequence whose most recent candidate image does not have at least a predetermined degree of relevance to the candidate image of interest.
[0025]
Therefore, candidate images of interest that have low relevance to an image sequence can be excluded from processing, which speeds up processing.
[0026]
(6) The movement information extraction device according to the present invention is characterized in that the degree of relevance is calculated based on the distance between the most recent candidate image and the candidate image of interest.
[0027]
Therefore, when the moving speed of the object can be predicted to some extent, the candidate images of interest to be processed can be narrowed down.
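A distance-based relevance check of this kind can be sketched as a simple gate on centroid distance. The function name `gate_sequences`, the dictionary layout and all coordinates are hypothetical; the threshold is called `td` after the "predetermined value Td" used in the embodiment, and treating a distance exactly equal to `td` as acceptable follows the embodiment's wording that a block is excluded when the distance exceeds Td.

```python
import math

def gate_sequences(candidate_centroid, sequences, td):
    """Keep only sequences whose most recent centroid lies within td of the candidate."""
    return [s for s in sequences
            if math.dist(candidate_centroid, s["last_centroid"]) <= td]

sequences = [
    {"name": "BLp1", "last_centroid": (90, 90)},
    {"name": "BLp2", "last_centroid": (40, 45)},
    {"name": "BLp3", "last_centroid": (60, 62)},
]
near = gate_sequences((50, 50), sequences, td=20)
print([s["name"] for s in near])  # ['BLp2', 'BLp3']
```

Choosing `td` from the fastest movement expected of the object between two frames keeps implausible pairings out of the later comparison.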
[0028]
(7) The movement information extraction device according to the present invention is characterized in that, when associating a candidate image of interest with the image sequences already obtained, if there are a plurality of image sequences to associate with and the movement characteristics of those image sequences do not determine which one to choose, it associates the candidate image with the image sequence whose most recent candidate image has graphic characteristics closest to those of the candidate image of interest.
[0029]
Therefore, it is possible to more accurately recognize a series of image sequences indicating the movement of the object.
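The selection rule with its tie-break can be sketched as follows. The candidates are assumed to have already passed any relevance check. The function names, the dictionary keys and the way the shape difference is scored (sum of relative differences in area and perimeter) are assumptions for illustration; the text only says that the closest graphic characteristics are used.

```python
def shape_difference(a, b):
    """Hypothetical dissimilarity: relative area gap plus relative perimeter gap."""
    return (abs(a["area"] - b["area"]) / max(a["area"], b["area"])
            + abs(a["perimeter"] - b["perimeter"]) / max(a["perimeter"], b["perimeter"]))

def choose_sequence(block, candidates):
    """Pick the sequence with the largest M; break ties by closest shape."""
    if not candidates:
        return None
    best_m = max(c["m"] for c in candidates)
    tied = [c for c in candidates if c["m"] == best_m]
    if len(tied) == 1:
        return tied[0]
    return min(tied, key=lambda c: shape_difference(block, c))

block = {"area": 100, "perimeter": 40}
candidates = [
    {"name": "A", "m": 10, "area": 95, "perimeter": 38},
    {"name": "B", "m": 10, "area": 60, "perimeter": 30},
    {"name": "C", "m": 3, "area": 100, "perimeter": 40},
]
print(choose_sequence(block, candidates)["name"])  # A
```

Sequence C matches the shape exactly but loses on M, so shape is consulted only between A and B, which share the largest coefficient.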
[0030]
(8) The movement information extraction device according to the present invention is characterized in that the degree of association is calculated based on the difference between the graphic characteristics of the most recent candidate image and the image of interest.
[0031]
Therefore, in a case where the graphic characteristics of the image indicating the object do not change significantly, the candidate images of interest to be processed can be narrowed down by obtaining the degree of relevance from the graphic characteristics.
[0032]
(9) The movement information extraction device according to the present invention is characterized in that, if the stability of the moving direction of a series of image sequences is equal to or more than a predetermined value, the series of image sequences is determined to be the object.
[0033]
Therefore, image sequences whose moving direction changes within a short time, such as noise, are more likely to be excluded from the target.
[0034]
(10) The movement information extraction device according to the present invention is characterized in that, if there is no candidate image of interest to be associated with a series of image sequences, that series is taken as a first series of image sequences; another series of image sequences is recognized based on the subsequent still images and, if its stability in the moving direction is equal to or more than a predetermined value, is taken as a second series of image sequences; and the last candidate image of the first series of image sequences is associated with the first candidate image of the second series of image sequences to form one continuous series of image sequences.
[0035]
Therefore, even if the candidate images that may be the object are temporarily interrupted, the movement information of the object can be extracted.
[0036]
(11) The movement information extraction device according to the present invention is characterized in that, when a plurality of candidate images are associated with a series of image sequences and the series of image sequences branches, the image sequence of one of the branches is selected based on the movement characteristics of each image sequence after the branch.
[0037]
That is, based on the movement characteristics of the series of image sequences, the image sequence that matches the movement characteristics of the object can be selected, so that the movement information of the object can be extracted more reliably.
[0038]
(12) The movement information extraction device according to the present invention is characterized in that, when a series of image sequences has a branch, at the time when a predetermined number of still images have been processed since the branch, an image sequence whose stability in the moving direction is low among the image sequences after the branch is determined not to be the object.
[0039]
Therefore, image sequences whose moving direction changes rapidly, such as noise, are more likely to be excluded from the movement information of the object.
[0040]
(13) The movement information extraction device according to the present invention is a movement information extraction device for extracting movement information of an object from a given moving image, comprising: candidate image recognition means for recognizing a candidate image in each still image constituting the moving image; and image sequence recognition means for sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract the movement information of the object,
wherein, if the stability of the moving direction of a series of image sequences is equal to or more than a predetermined value, the image sequence recognition means determines that the series of image sequences is the object; and, if there is no candidate image of interest to be associated with the series of image sequences, it takes that series as a first series of image sequences, recognizes another series of image sequences based on the subsequent still images, takes that other series as a second series of image sequences if its stability in the moving direction is equal to or more than the predetermined value, and associates the last candidate image of the first series of image sequences with the first candidate image of the second series of image sequences to form one continuous series of image sequences.
[0041]
Therefore, even when it is difficult to recognize the target object during the predetermined period of the moving image, the movement of the target object can be extracted by associating a series of image sequences. Further, at this time, since it is determined whether or not the object is the object based on the stability of the moving direction, erroneous determination can be reduced.
[0042]
(15) A movement information extraction method according to the present invention is a method for extracting movement information of an object from a given moving image, in which a candidate image is recognized in each still image constituting the moving image, and the recognized candidate images are sequentially associated with each other in chronological order to recognize a series of image sequences and extract the movement information of the object. The method is characterized in that, when the candidate image of interest is associated with a series of image sequences already obtained and there are a plurality of series of image sequences, which image sequence to associate with is determined based on the movement characteristics of each series of image sequences.
[0043]
That is, based on the movement characteristics of the series of image sequences, the image sequence that matches the movement characteristics of the object can be selected and associated with the candidate image of interest, so that the movement information of the object can be extracted more reliably.
[0044]
(16) The movement information extraction method according to the present invention is characterized in that the movement characteristic used for the association is one that clearly distinguishes the movement expected of the object from the movement expected of moving bodies other than the object.
[0045]
Therefore, the movement of the target object can be more reliably distinguished and extracted from the movement of other objects.
[0046]
(17) A movement information extraction method according to the present invention is a method for extracting movement information of an object from a given moving image, in which a candidate image is recognized in each still image constituting the moving image, and the recognized candidate images are sequentially associated with each other in chronological order to recognize a series of image sequences and extract the movement information of the object. The method is characterized in that, in recognizing the image sequences, the series of image sequences having the highest stability in the moving direction is selected from among the series of image sequences obtained by associating the candidate images in each still image.
[0047]
Therefore, it is possible to increase the possibility of extracting the movement information of the target object by eliminating noise and the like.
[0048]
In the present invention, the "candidate image recognition means" is means for recognizing a candidate image in each still image. In the embodiment, the labeling process in step S5 in FIG. 4 corresponds to this.
[0049]
The “image sequence recognition means” is means for recognizing a series of image sequences by appropriately associating the candidate images in the still images arranged in chronological order. In the embodiment, step S6 corresponds to this.
[0050]
"Extraction of movement information" means making the movement of an object recognizable in some form. The concept covers not only calculating the movement trajectory of the object but also identifying the object in a series of still images.
[0051]
The “target” is a target from which movement information is extracted. This concept includes animals, machines, humans, cars, and so on.
[0052]
“Candidate image” refers to an image that may be an object. In the following embodiment, a block corresponds to this.
[0053]
The “movement characteristic” characterizes movement and is a concept including speed, acceleration, moving direction, change in moving direction, moving path, and the like. In the following embodiment, the stability in the moving direction corresponds to this.
[0054]
“Movement direction stability” refers to the degree to which the change in the movement direction of a candidate image in a series of still images is gradual. In the following embodiment, the movement amount coefficient M corresponds to this.
[0055]
The “graphic characteristic” refers to an attribute characterizing a graphic, such as an outer shape, an area, an outer peripheral length, and a color.
[0056]
The “program” is a concept that includes not only a program directly executable by the CPU but also a source format program, a compressed program, an encrypted program, and the like.
[0057]
BEST MODE FOR CARRYING OUT THE INVENTION
1. Examples of application fields of the invention
The movement information extraction according to the present invention can be applied to general detection of movement of a target object. For example, it can be used for monitoring a suspicious person in a monitoring system, measuring the behavior of an animal, measuring the movement of an article on a production line, and the like.
[0058]
In the embodiment described below, a case will be described in which the present invention is applied to measurement of the behavior of an animal as a target object. The measurement of the behavior of an animal is used, for example, when observing the behavior of an animal when a drug is given to the animal and evaluating the medicinal effect and side effects of the drug.
[0059]
2. Overall configuration of the device
FIG. 3A shows an overall configuration of a movement information extraction device 200 according to an embodiment of the present invention. The imaging unit 100 captures an image of an object and provides moving image data to the movement information extraction device 200. The candidate image recognizing means 102 recognizes a candidate image in each of the still images constituting the moving image. The image sequence recognizing unit 104 recognizes the movement of the object by associating the candidate images in each still image in chronological order to form an image sequence. The trajectory calculating means 106 calculates the trajectory of the object based on this image sequence. The output unit 108 outputs the trajectory in a form such as a display, a printout, or data.
[0060]
3. Hardware configuration
FIG. 3B shows a hardware configuration in a case where the movement information extraction device 200 of FIG. 3A is implemented using a computer. The computer 2 includes a CPU 6, a memory 8, a hard disk 10 as a recording unit, a display 12 as a display device, a mouse / keyboard 14 as an input device, and an input / output interface 16. A camera (imaging device) 4 for imaging an object is connected to the input / output interface 16.
In this embodiment, the movement information extraction device 200 is configured by the computer 2, but the entirety including the camera 4 may be regarded as the movement information extraction device 200.
[0061]
On the hard disk 10, a movement information extraction program 18 for extracting movement information of an object based on moving image data from the camera 4 is recorded. The hard disk 10 also records an operating system 20 (for example, Windows (trademark) of Microsoft Corporation).
[0062]
The movement information extraction program 18 causes the CPU 6 to perform the movement information extraction processing in cooperation with the operating system 20. In this embodiment, the movement information extraction processing is performed in cooperation with the operating system 20. However, the movement information extraction processing may be performed by the movement information extraction program 18 alone.
[0063]
Hereinafter, a case where the movement of the mouse is imaged by the camera 4 and the action is detected will be described as an example.
[0064]
4. Movement information extraction processing
4.1 Image capture processing
FIG. 4 is a flowchart showing the movement information extraction program 18. First, the CPU 6 captures color moving image data (RGB data) from the camera 4 via the interface 16 and records it on the hard disk 10 (step S1). Note that in this embodiment, moving image data captured in advance is taken in, and movement information extraction processing is performed. Of course, the movement information extraction processing may be performed while capturing the image in real time.
[0065]
If the moving image from the camera 4 is digital data, the moving image is captured as it is or after being converted into a desired format. If the moving image from the camera 4 is analog data, the interface 16 performs A / D conversion processing and captures the data.
[0066]
The captured color moving image is recorded as moving image data 22 on the hard disk 10. The color moving image data 22 is recorded as a plurality of frames of color still images that are continuous in time series.
[0067]
4.2 Color system conversion processing
Next, the CPU 6 converts the color moving image data represented by the RGB color system into the TSL color system based on the following equation (step S2). Here, T is tint, S is saturation, and L is luminance.
[0068]
[Equation 3]

Among the components of the TSL obtained in this way, the components selected in advance on the user setting screen shown in FIG. 17 are used in the following processing. In the example illustrated in FIG. 17, the L component (luminance component) is selected in the item of “color system components used for analysis”. In this embodiment, a component useful for discriminating between the two can be selected according to the color characteristics of the background and the object. In this embodiment, two or more components can be designated (for example, a T component and an L component are designated).
[0069]
4.3 Smoothing processing
Next, the CPU 6 performs a smoothing process using a median filter on the moving image data based on the L component in order to remove noise (step S3). This smoothing removes minute noise, so that it is possible to prevent the occurrence of many meaningless minute blocks in the binarization process performed later. That is, blocks that are extremely small compared to the size of the object are removed by this smoothing.
[0070]
The median filter processing is performed based on the following equation.
[0071]
(Equation 4)

FIG. 6 shows an outline of the median filter processing. In the drawing, the pixel array of the moving image data indicates the L component of each pixel for one still image data of the moving image data. When the median filter processing is performed on the pixel α (the pixel of interest), the following is performed. The CPU 6 acquires the L components of nine surrounding pixels including the target pixel α. Next, these are arranged in the order of the values of the L component, and the value at the center is obtained. This median value is set as the L component value of the target pixel α.
[0072]
In the case of FIG. 6, the L component values of the nine surrounding pixels are 2, 2, 3, 3, 4, 4, 4, 5, and 10. Therefore, 4 which is the fifth value (median value) from the front is selected, and this is set as a new L component value of the target pixel α. The above processing is performed for all pixels of all the still images constituting the moving image.
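The median filtering described above can be illustrated with a minimal Python sketch. The function name `median_filter_3x3`, the arrangement of the nine values inside the patch, and the choice to leave border pixels unchanged are assumptions made for illustration; the text specifies only that the median of the nine values around the target pixel is taken.

```python
def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    img is a list of rows of L-component values. Border pixels are copied
    unchanged (an assumption; the source does not describe border handling).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            # The fifth of nine sorted values is the median.
            out[y][x] = sorted(window)[4]
    return out


# Nine values from the worked example (2, 2, 3, 3, 4, 4, 4, 5, 10),
# placed in a hypothetical arrangement with the outlier 10 at the centre.
patch = [[2, 3, 4],
         [4, 10, 2],
         [5, 3, 4]]
filtered = median_filter_3x3(patch)
print(filtered[1][1])  # 4
```

The isolated value 10 at the target pixel is replaced by 4, which is how the filter suppresses minute noise before binarization.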
[0073]
When a plurality of components are specified as the color system components to be used for analysis on the user setting screen in FIG. 17, the smoothing process is performed independently for each component.
[0074]
4.4 Binarization processing
Next, the CPU 6 binarizes the smoothed moving image data (step S4). In this embodiment, binarization is performed using two thresholds, an upper limit and a lower limit. The values of the upper and lower limits are input by the user on the user setting screen shown in FIG. 17.
[0075]
At the time of binarization, the CPU 6 uses the values input on the screen in FIG. 17 as the upper limit value and the lower limit value (in the figure, “190” is input as the lower limit value, and “240” is input as the upper limit value). When the L component value of the target pixel is within the range between the lower limit value and the upper limit value, the value of the target pixel is set to “1”. If the value is outside this range, the value of the target pixel is set to “0”. The above processing is performed for all pixels of all the still images constituting the moving image.
[0076]
As described above, binarization using the two thresholds of the upper limit and the lower limit makes it possible to remove a high-luminance region generated by illumination or the like.
[0077]
When a plurality of components are selected as the color system components used in the analysis, each component is binarized separately, and a pixel is set to “1” only if its binarized value is “1” for all of the selected components.
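As a rough illustration of the two-threshold binarization and of the combination of several components, the following Python sketch may help. The function names, the inclusive treatment of the limits, and the sample pixel values and T-component limits are assumptions; only the L-component limits 190 and 240 come from the example in the text.

```python
def binarize(channel, lower, upper):
    """Set a pixel to 1 when lower <= value <= upper, otherwise 0.

    Treating both limits as inclusive is an assumption.
    """
    return [[1 if lower <= v <= upper else 0 for v in row] for row in channel]


def combine_all(masks):
    """A pixel is 1 only if it is 1 in every per-component mask."""
    h, w = len(masks[0]), len(masks[0][0])
    return [[1 if all(m[y][x] == 1 for m in masks) else 0 for x in range(w)]
            for y in range(h)]


# Hypothetical 2x2 images for the L and T components.
l_channel = [[180, 200],
             [240, 250]]
t_channel = [[50, 50],
             [10, 50]]

l_mask = binarize(l_channel, 190, 240)   # limits from the example in the text
t_mask = binarize(t_channel, 40, 60)     # hypothetical limits
combined = combine_all([l_mask, t_mask])
print(l_mask)     # [[0, 1], [1, 0]]
print(combined)   # [[0, 1], [0, 0]]
```

The pixel with L value 250 is rejected by the upper limit, which corresponds to removing high-luminance regions caused by illumination.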
[0078]
4.5 Labeling process
Next, the CPU 6 performs a labeling process on the binarized moving image data (step S5). The labeling process is a process of recognizing each cluster of pixels having a value of “1” in each still image. Such a cluster is called a block. That is, a block is a connected component of pixels having the value “1”.
[0079]
The CPU 6 assigns a block identifier to each recognized block so that the blocks can be distinguished. Further, for each block, the coordinates of the center of gravity, the area, the perimeter, and the like are calculated and recorded on the hard disk 10 as a block information table as shown in FIG. 7. In this embodiment, the area and the perimeter are calculated based on the number of pixels. In the figure, the frame number is a number assigned in chronological order to each still image constituting the moving image.
[0080]
In the labeling process, blocks having an area that is too large or an area that is too small compared to the area of the target object may be excluded from the subsequent processing. This determination may be made based on the perimeter instead of the area.
[0081]
Whether to use the area or the perimeter can be chosen by the user on the setting screen in FIG. 17. An upper limit and a lower limit can also be set there.
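The labeling process and the size-based exclusion can be sketched as follows in Python. This is only an illustration: the use of 4-connectivity, the dictionary layout of a block record, and the function names `label_blocks` and `filter_by_area` are assumptions, and the perimeter calculation is omitted because the text does not define it beyond being pixel-based.

```python
from collections import deque


def label_blocks(binary):
    """Find connected components of 1-pixels (4-connectivity assumed).

    Returns one record per block with an identifier, the centre of gravity
    as (x, y), and the area in pixels.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blocks = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            area = len(pixels)
            cx = sum(p[0] for p in pixels) / area
            cy = sum(p[1] for p in pixels) / area
            blocks.append({"id": len(blocks) + 1,
                           "centroid": (cx, cy),
                           "area": area})
    return blocks


def filter_by_area(blocks, lower, upper):
    """Keep only blocks whose area lies within the user-set limits."""
    return [b for b in blocks if lower <= b["area"] <= upper]


image = [[1, 1, 0, 0],
         [1, 1, 0, 1],
         [0, 0, 0, 1]]
blocks = label_blocks(image)
kept = filter_by_area(blocks, 3, 10)   # hypothetical limits
print(blocks)
print([b["id"] for b in kept])  # [1]
```

Each record plays the role of one row of the block information table; the association and movement amount coefficient columns would be filled in by the later trajectory identification step.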
[0082]
4.6 Moving trajectory identification processing
After the labeling is performed as described above, the process of identifying the movement trajectory is performed (step S6). In the movement trajectory identification processing of this embodiment, the blocks of each frame are associated in chronological order while extraneous blocks are eliminated from the association, and the trajectory of the object is obtained. FIGS. 5A, 5B, 5C, and 5D show detailed flowcharts of the movement trajectory identification processing.
[0083]
First, the CPU 6 sets the frame “2” as the frame of interest (FIG. 5A, step S61). The frame of interest n advances to “3”, “4”, and so on as the processing proceeds.
[0084]
Next, the first block of the frame of interest is set as the block of interest (step S62). The block of interest is a block on which a process of associating with the block of the immediately preceding frame is performed. That is, in the following, processing is performed to determine which block of the immediately preceding frame is associated with the target block and to associate it with an appropriate block.
[0085]
Next, the maximum movement amount coefficient Mmax is set to “−1” (step S63). The movement amount coefficient will be described later.
[0086]
Subsequently, the first block of the immediately preceding frame is set as the immediately preceding block of interest (step S64). Here, for example, it is assumed that the frame of interest n is a still image as shown in FIG. 8A. In step S62, for example, the block BL1 in FIG. 8A is selected as the block of interest. It is also assumed that the immediately preceding frame n-1 is a still image as shown in FIG. 8B. In step S64, for example, the block BLp1 in FIG. 8B is selected as the immediately preceding block of interest.
[0087]
Next, the CPU 6 calculates the distance D between the centers of gravity of the block of interest BL1 and the immediately preceding block of interest BLp1 based on the block information of FIG. 7 (step S65). If the center-of-gravity distance D exceeds the predetermined value Td, the immediately preceding block of interest BLp1 is excluded from the association targets of the block of interest BL1 (step S66).
[0088]
This will be described below with reference to FIG. FIG. 9 shows the frame of interest (FIG. 8A) and the immediately preceding frame (FIG. 8B) in an overlapping manner. In the figure, the block of the frame of interest is indicated by a solid line, and the block of the immediately preceding frame is indicated by a broken line.
[0089]
The two-dot chain line is a circle of radius Td (the predetermined value) centered on the center of gravity of the block of interest BL1. In other words, the immediately preceding blocks that are candidates for association with the block of interest BL1 are limited to those within this circle. As a result, immediately preceding blocks that are unlikely candidates can be eliminated in advance, which increases the processing efficiency. The predetermined value Td can be specified by the user as the “block connection threshold” on the setting screen of FIG. 17. This value may be determined based on the moving speed of the object.
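The distance check of steps S65 and S66 amounts to keeping only the preceding-frame blocks whose center of gravity lies inside a circle of radius Td around the block of interest. A minimal Python sketch follows; the coordinates and the value of Td are hypothetical, chosen so that BLp1 falls outside the circle and BLp2 and BLp3 fall inside it, as in the description of FIG. 9.

```python
import math


def centroid_distance(a, b):
    """Euclidean distance between two (x, y) centres of gravity."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def candidates_within_gate(target_centroid, prev_centroids, td):
    """Return the names of preceding blocks with D <= Td.

    Blocks whose distance D exceeds Td are excluded from association.
    """
    return [name for name, c in prev_centroids.items()
            if centroid_distance(target_centroid, c) <= td]


bl1 = (10.0, 10.0)                      # block of interest (hypothetical)
previous = {"BLp1": (40.0, 40.0),       # hypothetical coordinates
            "BLp2": (12.0, 13.0),
            "BLp3": (7.0, 9.0)}
print(candidates_within_gate(bl1, previous, td=8.0))  # ['BLp2', 'BLp3']
```

Choosing Td from the expected moving speed of the object keeps the candidate list short, so fewer blocks reach the more expensive stability comparison.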
[0090]
When the center-of-gravity distance D exceeds the predetermined value Td, the next block BLp2 is set as the immediately preceding block of interest via step S70 (step S71), and step S64 and the subsequent steps are executed again.
[0091]
As shown in FIG. 9, the center of gravity distance D between the immediately preceding block of interest BLp2 and the block of interest BL1 is less than the predetermined value Td, so that the process proceeds from steps S65 and S66 to S67. In step S67, it is determined whether or not the movement amount coefficient M of the immediately preceding block BLp2 is greater than (or equal to) the current maximum movement amount coefficient Mmax.
[0092]
The movement amount coefficient M is an index indicating how the series of associated blocks ending at the immediately preceding block of interest (a series of blocks across the preceding frames) has moved. In this embodiment, the movement amount coefficient M is calculated as a numerical value indicating how stable the movement direction of the series of associated blocks is (that is, whether the movement direction avoids extreme changes within a short time). The movement amount coefficient M is defined so that a large value indicates that the moving direction of the block is stable. This makes it possible to exclude noise or the like that moves at random from the recognition targets.
[0093]
FIG. 10A schematically shows a moving state of a series of associated blocks. The immediately preceding block of interest BLp2 is associated with the block BLq2 of the frame before it, and further with the block BLr1 of the frame before that and the block BLs2 of the frame before that. In this case, since the moving direction of this series of blocks is stable, the movement amount coefficient M becomes large. The calculation of the movement amount coefficient M will be described later in detail. The movement amount coefficient M that has already been calculated and recorded in the block information table of FIG. 7 is used here. For example, the movement amount coefficient M of the block BLp2 is recorded as “52”.
[0094]
Here, the movement amount coefficient M of the immediately preceding block of interest BLp2 (“52”) is greater than the current maximum movement amount coefficient Mmax (“-1”), so in step S68 it is stored as the new maximum movement amount coefficient Mmax. Further, the CPU 6 sets the immediately preceding block of interest BLp2 as the selected immediately preceding block (step S69).
[0095]
When the processing for the immediately preceding block of interest BLp2 ends in this way, the block BLp3 is next set as the immediately preceding block of interest (steps S70 and S71), and step S64 and the subsequent steps are executed again.
[0096]
Since the center-of-gravity distance D between the immediately preceding block of interest BLp3 and the block of interest BL1 is smaller than the predetermined value Td (see FIG. 9), step S67 and the subsequent steps are executed. In step S67, it is determined whether the movement amount coefficient M (here, “3”) of the immediately preceding block of interest BLp3 is greater than (or equal to) the maximum movement amount coefficient Mmax.
[0097]
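As shown in FIG. 10A, the moving direction of the series of blocks including the immediately preceding block of interest BLp3 is random, and its movement amount coefficient M is small. Since it is smaller than the maximum movement amount coefficient Mmax (now “52”), the process proceeds to step S70 without updating the selection.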
[0098]
Since no unprocessed blocks remain in the immediately preceding frame, the processing for the block of interest BL1 is complete. That is, the block BLp2, whose center-of-gravity distance D from the block of interest BL1 is within the predetermined value Td and whose movement amount coefficient M is the largest, has been determined as the selected immediately preceding block in step S69.
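The loop over steps S63 to S71 described above can be summarized in a short Python sketch. The function name, the record layout, the coordinates, and the coefficient given to BLp1 are hypothetical; the coefficients 52 and 3 for BLp2 and BLp3 and the initial value -1 for Mmax are taken from the text. This simplified version keeps a single best block and does not yet handle ties.

```python
import math


def select_previous_block(target_centroid, prev_blocks, td):
    """Pick the preceding block within Td that has the largest coefficient M."""
    m_max = -1            # step S63: initialise Mmax
    selected = None
    for blk in prev_blocks:                      # steps S64, S70, S71
        dx = target_centroid[0] - blk["centroid"][0]
        dy = target_centroid[1] - blk["centroid"][1]
        if math.hypot(dx, dy) > td:              # steps S65, S66: too far away
            continue
        if blk["m"] > m_max:                     # step S67
            m_max = blk["m"]                     # step S68
            selected = blk["id"]                 # step S69
    return selected


prev_blocks = [
    {"id": "BLp1", "centroid": (40.0, 40.0), "m": 60},  # hypothetical M, out of range
    {"id": "BLp2", "centroid": (12.0, 13.0), "m": 52},
    {"id": "BLp3", "centroid": (7.0, 9.0), "m": 3},
]
print(select_previous_block((10.0, 10.0), prev_blocks, td=8.0))  # BLp2
```

BLp1 is given a large hypothetical coefficient to show that the distance check takes precedence: a stable but distant series is never considered for association.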
[0099]
Next, the CPU 6 determines whether there is a selected immediately preceding block (steps S72 and S73 in FIG. 5C). If there is none, the process proceeds to step S79 in FIG. 5D. Here, since there is one selected immediately preceding block, the process proceeds to step S74. In step S74, the CPU 6 associates the selected immediately preceding block BLp2 with the block of interest BL1. That is, in the block information table of FIG. 7, the CPU 6 writes the block identifier of the selected immediately preceding block BLp2 in the "association" column of the block of interest BL1.
[0100]
As shown in FIG. 10B, the block of interest BL1 is associated with a series of blocks BLp2, BLq2, BLr1, and BLs2 having a large movement amount coefficient M (that is, having a higher possibility of being an object).
[0101]
Next, the CPU 6 calculates a movement amount coefficient M for the block of interest BL1 (and a series of blocks) based on the following equation.
[0102]
(Equation 5)

Here, i is a frame number continuously assigned to each still image.
[0103]
MXi is defined as follows. Let Di-1 be the moving direction along the X axis from the candidate image in the still image of frame i-2 to the candidate image in the still image of frame i-1, and let Di be the moving direction along the X axis from the candidate image in the still image of frame i-1 to the candidate image in the still image of frame i. MXi is "1" if Di-1 and Di are the same direction, and "-1" if they are opposite directions. However, when the movement amount along the X axis between the candidate images in the still images of frame i-1 and frame i is equal to or smaller than a predetermined threshold, MXi is set to “0”.
[0104]
MYi is defined in the same way for the Y axis. Let Di-1 be the moving direction along the Y axis from the candidate image in the still image of frame i-2 to the candidate image in the still image of frame i-1, and let Di be the moving direction along the Y axis from the candidate image in the still image of frame i-1 to the candidate image in the still image of frame i. MYi is "1" if Di-1 and Di are the same direction, and "-1" if they are opposite directions. However, when the movement amount along the Y axis between the candidate images in the still images of frame i-1 and frame i is equal to or smaller than a predetermined threshold, MYi is set to “0”.
[0105]
In this embodiment, a circle having the same area as the expected area of the candidate image is assumed, and a half of the radius of the circle is used as the threshold.
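The per-axis terms MXi and MYi and the threshold described above can be sketched in Python as follows. The function names are hypothetical, and returning 0 when the earlier step has no direction at all (zero displacement) is an assumption, since the text does not cover that case.

```python
import math


def half_radius_threshold(expected_area):
    """Half the radius of a circle whose area equals the expected block area."""
    return 0.5 * math.sqrt(expected_area / math.pi)


def sign(v):
    return (v > 0) - (v < 0)


def direction_term(c_prev2, c_prev1, c_curr, threshold):
    """Term for one axis, given coordinates at frames i-2, i-1 and i.

    Returns 0 if the latest movement is at or below the threshold,
    1 if the last two movements point the same way, -1 if they are opposite.
    """
    step_now = c_curr - c_prev1
    step_before = c_prev1 - c_prev2
    if abs(step_now) <= threshold:
        return 0
    if sign(step_before) == 0:
        return 0          # assumption: no earlier direction to compare with
    return 1 if sign(step_now) == sign(step_before) else -1


thr = half_radius_threshold(100 * math.pi)   # radius 10, so threshold is about 5
print(direction_term(0, 10, 25, thr))   # 1  (keeps moving in the same direction)
print(direction_term(0, 10, 2, thr))    # -1 (reverses direction)
print(direction_term(0, 10, 13, thr))   # 0  (movement too small to count)
```

The same function serves for MXi and MYi by passing X or Y coordinates respectively, so small jitter of a nearly stationary block contributes nothing to the coefficient.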
[0106]
The CPU 6 records the movement amount coefficient M calculated in this manner in the “movement amount coefficient” column of the block of interest BL1 (not shown in FIG. 7) in the block information table of FIG. 7 (step S75). Because the movement amount coefficient M of the selected immediately preceding block BLp2 is already recorded in the block information of FIG. 7, the coefficient of the block of interest can instead be calculated from that recorded value using the following equation, which speeds up the processing.
[0107]
M = MX + MY + MP
Here, MP is the movement amount coefficient of the immediately preceding block, MX is the movement amount coefficient on the X axis between the block of interest and the immediately preceding block, and MY is the movement amount coefficient on the Y axis between the block of interest and the immediately preceding block.
[0108]
Next, the CPU 6 determines whether the calculated movement amount coefficient M of the block of interest is larger than a predetermined value Tm (moving target recognition value) (step S76). If it is larger, the block is determined to be the target, and the “target flag” column in the block information of the block of interest and of the series of blocks associated with it is set to “1” (see FIG. 7). The predetermined value Tm can be set by the user on the setting screen in FIG. 17.
[0109]
Subsequently, a trajectory connecting the centers of gravity of the series of blocks is calculated and displayed on the display 12 (step S77). Accordingly, a trajectory L indicating the movement of the object is displayed on the display 12, as shown in FIG. 8A.
[0110]
If the movement amount coefficient M is smaller than the predetermined value Tm, the series of blocks is judged either to be noise or to have too little accumulated data to be determined as the target, and the target flag is kept at “0”.
[0111]
Here, since the movement amount coefficient M (here, “54”) of the target block BL1 is larger than the predetermined value Tm (here, “10”), it is determined that the target block BL1 is the target object.
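The incremental update M = MX + MY + MP and the comparison with Tm can be checked with a small Python sketch. It assumes that the full coefficient is the plain sum of the per-frame terms MXi + MYi over the series, which is what the incremental form implies but is not shown in the text available here; the function names and the 26-frame history are hypothetical. The values 52, 54 and 10 are the ones quoted in the example, and MX = MY = 1 is the only combination of terms that raises 52 to 54.

```python
def coefficient_full(terms):
    """Sum MXi + MYi over a series of (MX, MY) pairs (assumed form)."""
    return sum(mx + my for mx, my in terms)


def coefficient_incremental(mp, mx, my):
    """M of the block of interest from the stored M of the preceding block."""
    return mx + my + mp


def is_target(m, tm):
    """Target flag is set when M is larger than Tm (step S76)."""
    return m > tm


# Hypothetical history giving BLp2 a stored coefficient of 52.
history = [(1, 1)] * 26
mp = coefficient_full(history)
m_bl1 = coefficient_incremental(mp, 1, 1)

print(mp)                                            # 52
print(m_bl1)                                         # 54
print(m_bl1 == coefficient_full(history + [(1, 1)])) # True
print(is_target(m_bl1, 10))                          # True
print(is_target(3, 10))                              # False
```

Because only the stored value MP and the two newest terms are needed, the cost of updating M stays constant no matter how long the series of blocks has become.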
[0112]
Next, the CPU 6 executes step S82 after step S79. In step S82, it is determined whether there is another unprocessed block in the frame of interest. Since the unprocessed block BL2 remains (see FIG. 8A), the process proceeds to step S84, and the processing from step S63 onward in FIG. 5A is executed again.
[0113]
Only the immediately preceding block BLp3 has the center of gravity distance D equal to or smaller than the predetermined value Td with respect to the block of interest BL2. Therefore, the block BL2 is related to the immediately preceding block BLp3, as shown in FIG. However, the moving direction of the block BL2 (and a series of blocks) changes randomly, and the moving amount coefficient M thereof becomes smaller than the predetermined value Tm. Therefore, these series of blocks are not determined as the target. Therefore, the CPU 6 keeps the “object flag” of these blocks at “0”.
[0114]
When the processing for the block BL2 is completed as described above, since no unprocessed block remains in the frame of interest (see FIG. 8A), the CPU 6 proceeds from step S82 to S86. In step S86, the CPU 6 determines whether an unprocessed frame remains. If it remains, the next frame is set as the target frame (step S85), and steps S62 and subsequent steps in FIG. 5A are executed again.
[0115]
In this way, by performing the above-described processing on the still images of all the frames, the movement information of the object can be extracted. That is, in the block information table shown in FIG. 7, the movement information of the object is obtained as the series of blocks whose target flag is “1” and which are linked through the “association” column. The trajectory can also be displayed.
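Reading the movement information back out of the block information table can be sketched as a walk along the “association” column. In the Python sketch below, the table layout, the field names, and all coordinates are hypothetical; the chain BLs2, BLr1, BLq2, BLp2, BL1 is the series used in the example.

```python
def trace_trajectory(table, last_id):
    """Follow association links backwards and return centroids in time order.

    table maps a block identifier to a record with "centroid", "assoc"
    (identifier of the associated preceding block, or None) and "target".
    Blocks whose target flag is not 1 are skipped.
    """
    path = []
    current = last_id
    while current is not None:
        row = table[current]
        if row["target"] == 1:
            path.append(row["centroid"])
        current = row["assoc"]
    path.reverse()
    return path


table = {
    "BLs2": {"centroid": (2, 2), "assoc": None, "target": 1},
    "BLr1": {"centroid": (4, 3), "assoc": "BLs2", "target": 1},
    "BLq2": {"centroid": (6, 4), "assoc": "BLr1", "target": 1},
    "BLp2": {"centroid": (8, 5), "assoc": "BLq2", "target": 1},
    "BL1": {"centroid": (10, 6), "assoc": "BLp2", "target": 1},
}
print(trace_trajectory(table, "BL1"))
# [(2, 2), (4, 3), (6, 4), (8, 5), (10, 6)]
```

Connecting the returned points in order gives the polyline that would be drawn as the trajectory on the display.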
[0116]
By the way, when processing is performed by the above method, there is a possibility that a branch occurs in a series of blocks. In this embodiment, a process for coping with such a branch is provided.
[0117]
In the previous example, a still image as shown in FIG. 8A was assumed as the frame of interest. Here, instead of the still image of FIG. 8A, a still image as shown in FIG. 11A is assumed as the frame of interest. In FIG. 11A, a block BL3 exists.
[0118]
When the movement trajectory identification processing is performed on the frame of interest, the association is performed as shown in FIG. The blocks BL1 and BL2 are associated with the immediately preceding blocks BLp2 and BLp3, respectively, as in the case of FIG. 8A.
[0119]
The block BL3 is associated with the immediately preceding block BLp2 having the largest moving amount coefficient M. Therefore, two blocks BL1 and BL3 are associated with the immediately preceding block BLp2. That is, a branch occurs.
[0120]
When such a branch occurs, the CPU 6 resolves the branch as follows. First, after the branch occurs, if step S79 in FIG. 5D is executed, the process proceeds to step S80. In step S80, it is determined whether or not a predetermined number of frames (for example, four frames) has elapsed since the branch occurred. Here, since only one frame has elapsed, the process returns to the normal processing. That is, the process proceeds to step S82.
[0121]
In the processing of the next frame, as shown in FIG. 13, the immediately preceding blocks BL1 and BL3 are located within the distance of the predetermined value Td from the target block BB. In addition, the movement amount coefficients M of the two immediately preceding blocks BL1 and BL3 are equal. For this reason, there is a problem as to which previous block the target block BB should be related to.
[0122]
In this embodiment, this problem is solved as follows. When a plurality of immediately preceding blocks share the same movement amount coefficient M and that value is the largest, the CPU 6 selects all of them as selected immediately preceding blocks (see steps S67 and S69 in FIG. 5B). Therefore, in the case of FIG. 13, both blocks BL1 and BL3 are selected.
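As a non-limiting sketch, the selection of immediately preceding blocks described above (candidates within the distance Td, then every candidate sharing the largest movement amount coefficient M) might be written in Python as follows. The field names `name`, `centroid` and `m`, and all numeric values, are hypothetical.

```python
import math


def select_previous_blocks(block, previous_blocks, td):
    """Return every previous block whose centroid lies within distance td of
    the block of interest and whose movement amount coefficient M is the
    largest among those candidates (ties are all kept)."""
    near = [
        p for p in previous_blocks
        if math.dist(block["centroid"], p["centroid"]) <= td
    ]
    if not near:
        return []
    best = max(p["m"] for p in near)
    return [p for p in near if p["m"] == best]


# Hypothetical values: BL1 and BL3 are both close to BB and share the same M.
bb = {"name": "BB", "centroid": (20, 20)}
previous = [
    {"name": "BL1", "centroid": (17, 20), "m": 55},
    {"name": "BL3", "centroid": (20, 24), "m": 55},
    {"name": "far", "centroid": (60, 60), "m": 90},
]
selected = [p["name"] for p in select_previous_blocks(bb, previous, td=10)]
```

In this example the distant block is ignored despite its larger M, and both BL1 and BL3 remain selected, so a further criterion is needed to choose between them.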
[0123]
Further, if a plurality of selected immediately preceding blocks exist in step S73 of FIG. 5C, the CPU 6 proceeds to step S78. In step S78, the immediately preceding block whose figure characteristics (area, perimeter, etc.) are closest to those of the block of interest BB is determined as the final selected immediately preceding block. In this way, as shown in FIG. 13, for example, the block BL1 is selected, and the block of interest BB is associated with it.
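The comparison of figure characteristics in step S78 could, for example, be sketched as follows. The description only states that area, perimeter and the like are compared; the particular measure used here (the sum of the relative differences in area and perimeter) and the field names are assumptions made for this illustration.

```python
def closest_by_shape(block, candidates):
    """Return the candidate whose area and perimeter are closest to those of
    the block of interest (assumed measure: sum of relative differences)."""
    def difference(candidate):
        d_area = abs(candidate["area"] - block["area"]) / block["area"]
        d_perimeter = (
            abs(candidate["perimeter"] - block["perimeter"]) / block["perimeter"]
        )
        return d_area + d_perimeter

    return min(candidates, key=difference)


# Hypothetical figure characteristics.
block_bb = {"name": "BB", "area": 100, "perimeter": 40}
tied = [
    {"name": "BL1", "area": 96, "perimeter": 41},
    {"name": "BL3", "area": 60, "perimeter": 30},
]
chosen = closest_by_shape(block_bb, tied)["name"]
```

Relative differences are used so that characteristics with different units contribute on a comparable scale.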
[0124]
When the process further proceeds, as shown in FIG. 14, a block is associated with each of the branched blocks. The object flag is also set to “1” for the blocks BF3 to BL3 in which the moving direction is constantly changing. This is because the block BLp2 already has a large moving amount coefficient M, and the blocks BF3 to BL3 associated therewith also have a large moving amount coefficient.
[0125]
When step S79 is executed in the state shown in FIG. 14, the process proceeds to step S80 and then to step S81, because the predetermined number of frames (here, four) has elapsed since the branch occurred. In step S81, the CPU 6 calculates the movement amount coefficient M1 accumulated from the block BLp2 to the block BF1. This is obtained by subtracting the movement amount coefficient of the block BLp2 from that of the block BF1 in the block information table. The CPU 6 similarly calculates the movement amount coefficient M2 accumulated from the block BLp2 to the block BF3.
[0126]
Further, the CPU 6 compares the movement amount coefficients M1 and M2, and excludes the smaller one from the target. In the case of FIG. 14, the movement amount coefficient M2 is smaller. Therefore, the object flag is set to "0" for each of the blocks BL3 to BF3. Further, the association between the blocks BL3 and BLp2 is removed. That is, the block BLp2 described in the “association” column of the block BL3 is deleted. In this way, the CPU 6 cancels the branch and obtains a correct trajectory.
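By way of illustration, the branch cancellation of step S81 might be sketched as follows. The table fields `m`, `object_flag` and `prev` are hypothetical, the chains are shortened relative to FIG. 14, and the handling of equal coefficients (not covered by the description) is an arbitrary choice of this sketch.

```python
def resolve_branch(table, branch_point, tip_a, tip_b):
    """Compare the movement amount coefficients accumulated since the branch
    and exclude the branch with the smaller value. Returns the dropped tip."""
    m_a = table[tip_a]["m"] - table[branch_point]["m"]
    m_b = table[tip_b]["m"] - table[branch_point]["m"]
    # Assumption: on a tie, tip_a is dropped (the description is silent).
    loser = tip_b if m_b < m_a else tip_a
    name = loser
    while name != branch_point:
        entry = table[name]
        entry["object_flag"] = 0
        prev = entry["prev"]
        if prev == branch_point:
            entry["prev"] = None  # delete the association with the branch point
        name = prev
    return loser


# Hypothetical coefficients: BLp2 -> BL1 -> BF1 and BLp2 -> BL3 -> BF3.
branch_table = {
    "BLp2": {"m": 50, "object_flag": 1, "prev": None},
    "BL1": {"m": 52, "object_flag": 1, "prev": "BLp2"},
    "BF1": {"m": 56, "object_flag": 1, "prev": "BL1"},
    "BL3": {"m": 50, "object_flag": 1, "prev": "BLp2"},
    "BF3": {"m": 50, "object_flag": 1, "prev": "BL3"},
}
dropped = resolve_branch(branch_table, "BLp2", "BF1", "BF3")
```

Here M1 is 6 and M2 is 0, so the branch ending at BF3 is excluded and its link to BLp2 is removed.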
[0127]
In addition, as shown in FIGS. 15A to 15F, an area 52 in which illumination is reflected may occur. Such an area 52 is often excluded from the target in the binarization processing in step S4 of FIG. 4 and the labeling processing in step S5. However, as shown in FIGS. 15C, 15D, and 15E, there is a problem that the image of the target object disappears.
[0128]
In this embodiment, the following processing is performed so that the trajectory of the target object can be tracked even if such a phenomenon occurs. In the state shown in FIG. 15C, the CPU 6 cannot associate any block with the immediately preceding block whose object flag is “1” (the block of the image 50 in FIG. 15). In step S83 of FIG. 5D, the CPU 6 detects this state and proceeds to step S87. If a branch exists, however, the process proceeds to step S87 only when no association could be made with any of the blocks whose object flag is “1”.
[0129]
In step S87, the disappearance flag is set to “1” to indicate that the tracking of the object has been interrupted. Thereafter, when a block appears in FIG. 15F, association is performed again using this as a start point (see blocks BN1 to BN4 in FIG. 16). When the movement amount coefficient M of the series of blocks BN1 to BN4 exceeds a predetermined value Tm, the series of blocks BN1 to BN4 is recognized as a target (steps S76 and S77).
[0130]
The CPU 6 determines whether or not the movement amount coefficient M has exceeded the predetermined value Tm for the series of blocks for the first time (step S771). Here, since this is the first time, the process proceeds to step S772, and it is determined whether the disappearance flag is “1”. Since the disappearance flag is “1”, the block BZ50 in which the association of the blocks is interrupted is associated with the first block BN1 in this series of blocks. In this way, the association between blocks that have been interrupted for some reason is restored.
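The restoration of an interrupted association could be sketched, under stated assumptions, as follows. The `state` dictionary, the field names and the numeric values are hypothetical; clearing the disappearance flag after relinking is also an assumption of this sketch, standing in for the first-time check of step S771.

```python
def restore_link(table, series, tm, state):
    """If the newest block of the series has M greater than tm, mark the
    series as the object and, when tracking was interrupted, link the first
    block of the series to the block at which tracking was lost."""
    if table[series[-1]]["m"] <= tm:
        return False  # series not yet recognized as the object
    for name in series:
        table[name]["object_flag"] = 1
    if state["disappearance_flag"] == 1:
        table[series[0]]["prev"] = state["last_block"]
        state["disappearance_flag"] = 0  # assumption: flag cleared once used
        return True
    return False


gap_table = {
    "BZ50": {"m": 60, "object_flag": 1, "prev": None},
    "BN1": {"m": 0, "object_flag": 0, "prev": None},
    "BN2": {"m": 2, "object_flag": 0, "prev": "BN1"},
    "BN3": {"m": 6, "object_flag": 0, "prev": "BN2"},
    "BN4": {"m": 12, "object_flag": 0, "prev": "BN3"},
}
gap_state = {"disappearance_flag": 1, "last_block": "BZ50"}
restored = restore_link(gap_table, ["BN1", "BN2", "BN3", "BN4"], 10, gap_state)
```

After the call, BN1 points back to BZ50, so the two partial trajectories form one continuous series.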
[0131]
5. Other embodiments
In the above embodiment, when a branch occurs, the association with the series of blocks determined not to be the object is released, and only the series of blocks related to the object is left. However, even if a branch occurs, the processing may proceed as it is, and the trajectory of the target object may be determined visually by also displaying the trajectories of blocks other than the target object.
[0132]
Further, in the above-described embodiment, when a branch occurs, the association with a series of blocks determined to be not the object is released, and the trajectory of only the object is output (displayed). However, the trajectory of an object other than the target object may be output (displayed) so as to be distinguishable from the trajectory of the target object (for example, by changing the display color).
[0133]
In the above-described embodiment, the case where the number of objects is one has been described. When there are two or more objects, the trajectory of the two objects may be distinguished based on the color characteristics (brightness, hue, saturation, etc.) of the objects using the above method. This is particularly effective when it is difficult to distinguish the color of the background from the color of the object.
[0134]
In the above embodiment, the RGB color moving image is converted into the TSL color moving image, and the processing is performed based on this. However, it may be converted into an HIS color image and used. Further, an RGB color image may be used as it is. Further, the determination may be made based on a black and white image.
[0135]
In the above embodiment, the movement amount coefficient M is used as an index indicating stability in the movement direction. This movement amount coefficient M is calculated by adding the X-axis component and the Y-axis component. However, only the X-axis component or the Y-axis component may be used. Furthermore, the angle difference between the previous motion and the current motion may be calculated, and the stability of the moving direction may be calculated based on the angle difference.
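As one possible illustration of the angle-based alternative, the angle between successive movements could be computed as follows. Using the mean angle difference as the stability index (smaller meaning more stable) is an assumption of this sketch; the description does not fix a particular formula.

```python
import math


def angle_difference(p0, p1, p2):
    """Absolute angle in degrees (0 to 180) between the movement p0 -> p1
    and the movement p1 -> p2."""
    a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
    diff = abs(a2 - a1)
    if diff > math.pi:
        diff = 2 * math.pi - diff
    return math.degrees(diff)


def mean_angle_difference(points):
    """Mean angle difference over a series of at least three centroids."""
    diffs = [
        angle_difference(points[i - 2], points[i - 1], points[i])
        for i in range(2, len(points))
    ]
    return sum(diffs) / len(diffs)


steady = mean_angle_difference([(0, 0), (1, 0), (2, 0), (3, 0)])
zigzag = mean_angle_difference([(0, 0), (1, 0), (0, 0), (1, 0)])
```

A series moving in a straight line yields a mean of 0 degrees, whereas a series that reverses direction every frame yields 180 degrees.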
[0136]
In the above embodiment, association is made with the immediately preceding block having higher stability in the moving direction (see FIG. 10). However, if the object is a leaf or the like whose direction of movement changes rapidly, and the noise to be excluded moves in a constant direction, such as a car or a person, the association may instead be made with the immediately preceding block having lower stability in the moving direction. Further, instead of or in addition to the stability of the moving direction, other movement characteristics (the total moving distance of a series of blocks, the moving speed, etc.) that can distinguish the detected object from other moving objects may be used. That is, any movement characteristic that distinguishes the movement expected of the detected object from the movement expected of other moving objects can be used for the association.
[0137]
Further, in the above embodiment, when a branch occurs, the branch is resolved after a predetermined number of frames. However, the branch may be left unresolved until all the frames have been processed, and the trajectory having the greatest stability in the moving direction may then be determined to be the trajectory of the target object.
[0138]
Further, in the above-described embodiment, the processing is performed in order from the temporally oldest frame, but it may instead be performed in order from the newest frame. The processing may also start from a middle frame and proceed in parallel toward both the newer side and the older side.
[0139]
In the above embodiment, two upper and lower threshold values are used in the binarization processing in step S4. However, the binarization process may be performed using one threshold value of the upper limit or the lower limit.
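A minimal sketch of binarization with either two thresholds or a single threshold is given below for illustration. The image is represented as a plain list of rows, and treating the bounds as inclusive is an assumption of this example.

```python
def binarize(image, lower=None, upper=None):
    """Return 1 for pixels with lower <= value <= upper and 0 otherwise.
    A bound left as None is not applied, which gives the single-threshold
    variants."""
    def inside(value):
        return (lower is None or value >= lower) and (
            upper is None or value <= upper
        )

    return [[1 if inside(value) else 0 for value in row] for row in image]


# Hypothetical pixel values.
pixels = [[10, 120, 250], [90, 200, 30]]
two_sided = binarize(pixels, lower=80, upper=210)
lower_only = binarize(pixels, lower=80)
```

With both bounds, very bright pixels such as 250 are excluded; with only the lower bound they are kept.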
[0140]
In the above embodiment, when calculating the movement amount coefficient M, if the movement amount is equal to or less than the predetermined threshold value, MXi or MYi is set to “0”. This threshold can be any value. For example, if the threshold is “0”, MXi or MYi of “1” or “−1” is obtained if there is any movement.
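The effect of this threshold can be illustrated with the following sketch. It assumes that M is the running sum of MXi and MYi over the frames of a series, and that a term is “0” when the previous movement along that axis was zero; neither detail is fixed by this passage, and the function names are hypothetical.

```python
def axis_term(prev_delta, cur_delta, threshold=0):
    """MXi or MYi for one axis: 1 if the previous and current movements have
    the same direction, -1 if opposite, 0 if the current movement amount is
    at or below the threshold."""
    if abs(cur_delta) <= threshold:
        return 0
    if prev_delta == 0:
        return 0  # assumption: no previous direction to compare with
    return 1 if (prev_delta > 0) == (cur_delta > 0) else -1


def movement_coefficient(centroids, threshold=0):
    """Assumed form of M: the sum of MXi + MYi over the series."""
    m = 0
    for i in range(2, len(centroids)):
        (x0, y0), (x1, y1), (x2, y2) = centroids[i - 2 : i + 1]
        m += axis_term(x1 - x0, x2 - x1, threshold)
        m += axis_term(y1 - y0, y2 - y1, threshold)
    return m


m_steady = movement_coefficient([(0, 0), (2, 1), (4, 2), (6, 3)])
m_jitter = movement_coefficient([(0, 0), (2, 1), (1, 0), (3, 1)])
m_jitter_thresholded = movement_coefficient(
    [(0, 0), (2, 1), (1, 0), (3, 1)], threshold=1
)
```

With a threshold of 0, every small reversal counts against the jittering series; raising the threshold to 1 suppresses the one-pixel movements so that only the larger reversal contributes.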
[0141]
In the above embodiment, the association between the blocks is determined based on the distance between the centers of gravity (step S66). However, the association between the blocks may be performed using the shape similarity between the blocks.
[0142]
In the above embodiment, the smoothing processing is performed using the median filter processing. However, another smoothing method such as a moving average method may be used.
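For comparison, the two smoothing methods can be sketched on a one-dimensional sequence as follows. The window size of 3, the shrinking of the window at the ends, and the use of a one-dimensional sequence rather than the data of FIG. 6 are simplifications assumed for this illustration.

```python
from statistics import median


def median_filter(values, window=3):
    """Replace each value by the median of its neighborhood."""
    half = window // 2
    return [
        median(values[max(0, i - half) : i + half + 1])
        for i in range(len(values))
    ]


def moving_average(values, window=3):
    """Replace each value by the mean of its neighborhood."""
    half = window // 2
    result = []
    for i in range(len(values)):
        neighborhood = values[max(0, i - half) : i + half + 1]
        result.append(sum(neighborhood) / len(neighborhood))
    return result


# A single outlier (90) in an otherwise flat sequence.
samples = [10, 10, 90, 10, 10]
by_median = median_filter(samples)
by_average = moving_average(samples)
```

The median filter removes the isolated outlier entirely, while the moving average spreads it over the neighboring samples.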
[0143]
In the above embodiment, the determined trajectory is displayed. However, it may be printed out, output as a file to a recording medium, or output to another computer connected via a network. Further, part or all of the block information table may be output instead of or together with the trajectory.
[0144]
In the above embodiment, the movement information of the target is output. However, the movement information of the target may be extracted to calculate the shape of the target, the presence or absence of the target, and the like.
[0145]
In the above embodiment, blocks are extracted by performing binarization. However, in addition to the binarization, a process of obtaining a difference from the background image may be performed. That is, a background image may be prepared in advance, a difference image may be calculated for each frame, and binarization may be performed on the difference image.
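As a hedged illustration of this variation, the difference from a background image followed by binarization might look as follows. The images are plain lists of rows of gray levels, and the function name and threshold value are hypothetical.

```python
def background_difference(frame, background, threshold):
    """Return 1 where the frame differs from the background by more than
    the threshold, and 0 elsewhere."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical gray levels: the middle column contains the object.
background_image = [[50, 50, 50], [50, 50, 50]]
current_frame = [[52, 200, 49], [50, 180, 51]]
mask = background_difference(current_frame, background_image, threshold=20)
```

Small fluctuations of the background stay below the threshold, so only the pixels covered by the object are set to 1.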
[0146]
In the above embodiment, each function shown in FIG. 3A is realized by operating a computer according to a program. However, some or all of these functions may be configured by a logic circuit.
[Brief description of the drawings]
FIG. 1 is a diagram showing a conventional analysis method based on a frame difference.
FIG. 2 is a diagram showing a conventional analysis method based on background difference.
FIG. 3A is a diagram showing an overall configuration of a movement information extraction device 200.
FIG. 3B is a diagram showing a hardware configuration when the movement information extracting device 200 is realized using a computer.
FIG. 4 is a flowchart of a program for extracting movement information.
FIG. 5A is a flowchart illustrating details of a movement trajectory identification process.
FIG. 5B is a flowchart illustrating details of a movement trajectory identification process.
FIG. 5C is a flowchart illustrating details of a movement trajectory identification process.
FIG. 5D is a flowchart illustrating details of a movement trajectory identification process.
FIG. 6 is a diagram for explaining smoothing by median filter processing.
FIG. 7 is a diagram showing a block information table.
FIG. 8A is a diagram illustrating a frame of interest, and FIG. 8B is a diagram illustrating the immediately preceding frame.
FIG. 9 is a diagram showing the frame of interest of FIG. 8A and the immediately preceding frame of FIG. 8B superimposed on each other.
FIG. 10 is a diagram schematically showing association between frames.
FIG. 11 is a diagram illustrating an example of a frame of interest that may cause a branch.
FIG. 12 is a diagram schematically showing a state where a branch has occurred.
FIG. 13 is a diagram for explaining processing after branching.
FIG. 14 is a diagram for explaining the elimination of a branch.
FIG. 15 is a diagram illustrating processing when an image of a target object has disappeared.
FIG. 16 is a diagram illustrating a process for complementing a case where an image of a target object has disappeared.
FIG. 17 is a diagram showing a screen for setting various conditions of a movement information extraction program.
[Explanation of symbols]
100 ... Imaging means
102 ... Candidate image recognition means
104 ... Image sequence recognition means
106 ... Trajectory calculation means
108 ... Output means

Claims

A movement information extraction device that extracts movement information of an object from a given moving image, comprising:
candidate image recognition means for recognizing a candidate image in each still image constituting the moving image; and
image sequence recognition means for sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein the image sequence recognition means,
when associating a candidate image of interest with the series of image sequences already obtained and a plurality of such image sequences exist, determines which image sequence to associate the candidate image of interest with based on the movement characteristics of each series of image sequences.

A program for causing a computer to perform movement information extraction processing for extracting movement information of an object from a given moving image, the program causing the computer to perform:
candidate image recognition processing for recognizing a candidate image in each still image constituting the moving image; and
image sequence recognition processing for sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein, in the image sequence recognition processing,
when a candidate image of interest is to be associated with the series of image sequences already obtained and a plurality of such image sequences exist, the image sequence with which the candidate image of interest is associated is determined based on the movement characteristics of each series of image sequences.

The device according to claim 1 or the program according to claim 2,
wherein, when deciding which image sequence to associate the candidate image of interest with, the candidate image of interest is associated with the image sequence having the highest stability in the moving direction.

The device or program according to claim 3,
wherein, when deciding which image sequence to associate the candidate image of interest with, a movement amount coefficient M defined by the following equation is calculated for each image sequence, and the candidate image of interest is associated with the image sequence having the largest movement amount coefficient M.

Here, i is a frame number continuously assigned to each still image,
MXi is defined as follows: let Di-1 be the moving direction in the X-axis direction from the candidate image in the still image of frame i-2 to the candidate image in the still image of frame i-1, and let Di be the moving direction in the X-axis direction from the candidate image in the still image of frame i-1 to the candidate image in the still image of frame i; MXi is “1” if Di-1 and Di are the same direction and “−1” if they are opposite directions. However, when the amount of movement in the X-axis direction between the candidate images in the still images of frame i-1 and frame i is equal to or smaller than a predetermined threshold, MXi is set to “0”.
MYi is defined as follows: let Di-1 be the moving direction in the Y-axis direction from the candidate image in the still image of frame i-2 to the candidate image in the still image of frame i-1, and let Di be the moving direction in the Y-axis direction from the candidate image in the still image of frame i-1 to the candidate image in the still image of frame i; MYi is “1” if Di-1 and Di are the same direction and “−1” if they are opposite directions. However, when the amount of movement in the Y-axis direction between the candidate images in the still images of frame i-1 and frame i is equal to or smaller than a predetermined threshold, MYi is set to “0”.

An apparatus or program according to any one of claims 1 to 4,
wherein, when the candidate image of interest is to be associated with the series of image sequences already obtained, any series of image sequences whose candidate image does not have a degree of association with the candidate image of interest higher than a predetermined value is excluded from the image sequences to be associated with the candidate image of interest.

The device or program according to claim 5,
wherein the degree of association is calculated based on the distance between the latest candidate image of the image sequence and the candidate image of interest.

An apparatus or program according to any one of claims 1 to 6,
wherein, when the candidate image of interest is to be associated with the series of image sequences already obtained, a plurality of series of image sequences are candidates for the association, and it cannot be determined from the movement characteristics of each series which image sequence to associate with, the candidate image of interest is associated with the image sequence whose latest candidate image has graphic characteristics most similar to those of the candidate image of interest.

The device or program according to claim 5,
wherein the degree of association is calculated based on a difference in graphic characteristics between the latest candidate image of the image sequence and the candidate image of interest.

An apparatus or program according to any one of claims 1 to 8,
wherein, if the stability of the moving direction of a series of image sequences is equal to or greater than a predetermined value, the series of image sequences is determined to be the object.

The device or program according to claim 9,
wherein, if there is no candidate image of interest to be associated with the series of image sequences, that series is taken as a first series of image sequences; another series of image sequences is recognized based on the subsequent still images; if the stability of the moving direction of that other series is equal to or greater than a predetermined value, it is taken as a second series of image sequences; and the last candidate image of the first series is associated with the first candidate image of the second series to form one continuous series of image sequences.

The device or the program according to any one of claims 1 to 10,
wherein, when a plurality of candidate images are associated with a series of image sequences so that the series branches, one of the image sequences after the branch is selected based on the movement characteristics of each image sequence after the branch.

The device or program according to claim 11,
wherein, when a series of image sequences has a branch, at the point when processing has been performed for a predetermined number of still images after the branch, the image sequence after the branch having the lower stability in the moving direction is determined not to be the object.

A movement information extraction device that extracts movement information of an object from a given moving image, comprising:
candidate image recognition means for recognizing a candidate image in each still image constituting the moving image; and
image sequence recognition means for sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein the image sequence recognition means
determines that a series of image sequences is the object if the stability of the moving direction of the series is equal to or greater than a predetermined value, and,
if there is no candidate image of interest to be associated with the series of image sequences, takes that series as a first series of image sequences, recognizes another series of image sequences based on the subsequent still images, takes that other series as a second series of image sequences if the stability of its moving direction is equal to or greater than a predetermined value, and associates the last candidate image of the first series with the first candidate image of the second series to form one continuous series of image sequences.

A program for causing a computer to perform movement information extraction processing for extracting movement information of an object from a given moving image, the program causing the computer to perform:
candidate image recognition processing for recognizing a candidate image in each still image constituting the moving image; and
image sequence recognition processing for sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein the image sequence recognition processing includes:
determining that a series of image sequences is the object if the stability of the moving direction of the series is equal to or greater than a predetermined value; and,
if there is no candidate image of interest to be associated with the series of image sequences, taking that series as a first series of image sequences, recognizing another series of image sequences based on the subsequent still images, taking that other series as a second series of image sequences if the stability of its moving direction is equal to or greater than a predetermined value, and associating the last candidate image of the first series with the first candidate image of the second series to form one continuous series of image sequences.

A movement information extraction method for extracting movement information of an object from a given moving image, comprising:
recognizing a candidate image in each still image constituting the moving image; and
sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein, when a candidate image of interest is to be associated with the series of image sequences already obtained and a plurality of such image sequences exist, the image sequence with which the candidate image of interest is associated is determined based on the movement characteristics of each series of image sequences.

The movement information extraction method according to claim 15,
wherein the movement characteristic used for the association is one that can distinguish the movement expected of the object from the movement expected of a moving object other than the object.

A movement information extraction method for extracting movement information of an object from a given moving image, comprising:
recognizing a candidate image in each still image constituting the moving image; and
sequentially associating the recognized candidate images in chronological order to recognize a series of image sequences and extract movement information of the object,
wherein, in recognizing the image sequence, the series of image sequences having the highest stability in the moving direction is selected from among the series of image sequences obtained by associating the candidate images in the respective still images.