JPH10150593A

JPH10150593A - Device and method for extracting target

Info

Publication number: JPH10150593A
Application number: JP8308174A
Authority: JP
Inventors: Kingo Ozawa; 金吾小沢; Michihisa Dou; 通久堂; Shinichi Horinouchi; 真一堀ノ内
Original assignee: Tokimec Inc
Current assignee: Tokimec Inc
Priority date: 1996-11-19
Filing date: 1996-11-19
Publication date: 1998-06-02

Abstract

PROBLEM TO BE SOLVED: To improve tracking performance by reducing the possibility of missing or misrecognizing a target, by performing template matching over the entire image area while using an evaluation value that reflects the fact that the target is more likely to be found closer to its predicted position. SOLUTION: A template setting part 14 sets a template of a prescribed size centered on the pixels judged to be part of an intruder from their luminance difference with respect to a background image. An evaluation value calculating part 16 computes, for each partial area of the same size as the template over the entire frame, a degree of similarity measured by the total sum of luminance differences. From the coordinates of the predicted position of the template supplied by a predicted position calculating part 18 and the coordinates of the partial area under consideration, a position evaluation value calculating part 15 sends the sum of the absolute separations in the X-axis and Y-axis directions to the evaluation value calculating part 16 as a position evaluation value. The evaluation value calculating part 16 adds the position evaluation value to the degree of similarity and sends the result to a target extracting part 17 as the matching evaluation value.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a target extraction device and a target extraction method for extracting a target from an image obtained by an imaging device such as a TV camera or an infrared camera, and in particular to a target extraction device and a target extraction method used, for example, to extract a monitored object in a surveillance system.

[0002]

[Prior art] In systems that monitor suspicious persons using an imaging device such as a TV camera or an infrared camera, for example, there has recently been a demand for devices that extract a target from the images obtained moment by moment by the imaging device and track it, for purposes such as detecting the movement of the target.

[0003] Conventionally, the template matching method has been used to extract this kind of target. In this method the target is represented as a template, and the similarity between the template and a predetermined search area in the image is evaluated. FIG. 7 illustrates the template matching method. The center of the template t(m, n) representing the target to be extracted is placed on a point (i, j) within the search area S of the image f(x, y), the similarity (degree of matching) between the template t(m, n) and the part of the image it overlaps is measured, and that value is taken as the likelihood that the target is located at the point (i, j). The basic idea is to perform this operation for every point in the search area S and to take the position with the highest similarity as the position of the target. The similarity is measured using, for example, the value D calculated by equation (1) below. That is,

[0004]

D(i, j) = Σ |f(i+m, j+n) − t(m, n)|   (1)

Here Σ denotes the sum of the differences over all points of the template t(m, n), and f and t each represent luminance. The search area S is the region in which the target may exist. Restricting template matching to this search area S is intended to prevent something other than the target from being mistaken for the target.
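The conventional procedure of paragraphs [0003] and [0004] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image and template are plain lists of rows, the window is indexed by its top-left corner rather than by its center, and the function names `sad` and `match_in_search_area` are hypothetical. Because D is a sum of differences, a smaller D means a closer match, so the sketch keeps the minimum.

```python
def sad(image, template, i, j):
    """Sum of absolute luminance differences, as in equation (1), with the
    template's top-left corner placed at column i, row j (an assumption)."""
    return sum(
        abs(image[j + n][i + m] - template[n][m])
        for n in range(len(template))
        for m in range(len(template[0]))
    )


def match_in_search_area(image, template, area):
    """Return (D, i, j) for the best match inside area = (i_min, i_max, j_min, j_max)."""
    i_min, i_max, j_min, j_max = area
    best = None
    for j in range(j_min, j_max + 1):
        for i in range(i_min, i_max + 1):
            d = sad(image, template, i, j)
            if best is None or d < best[0]:  # smaller D = closer match
                best = (d, i, j)
    return best


# Toy 6x5 image with a bright 2x2 blob whose top-left corner is at (i=3, j=2).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
]
template = [[9, 9], [9, 9]]

inside = match_in_search_area(image, template, (2, 4, 1, 3))   # area contains the blob
outside = match_in_search_area(image, template, (0, 1, 0, 1))  # area misses the blob
print(inside, outside)
```

When the search area does not contain the blob, the best D stays large and the reported position is meaningless, which is the weakness of a fixed search area that the text goes on to discuss.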

[0005] In conventional target extraction devices, because the position of the target does not change greatly between frames separated by only a short time, the search area S is set as a region of predetermined size centered on the target position extracted in the immediately preceding frame.

[0006]

[Problem to be solved by the invention] With the conventional method described above, however, the target cannot be detected if it lies outside the search area S, whereas setting the search area S larger diminishes the benefit of having a search area at all. The present invention was made in view of this problem, and the inventions of claims 1 to 12 aim to provide a target extraction device and a target extraction method that can extract a target accurately without providing a search area.

[0007]

[Means for solving the problem] To achieve the above object, the invention of claim 1 is a target extraction device comprising an imaging device that images a target and outputs a video signal, image generating means that sequentially A/D-converts the video signal output from the imaging device and generates an image f(x, y) for each frame, and target extracting means that extracts the target from the image f(x, y) and outputs a target position signal. The device is characterized in that, for each part p(x, y) of the image having the same size as a template t(m, n) representing the target, the target extracting means obtains an evaluation value, which evaluates the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame, predicted from the target positions extracted in the previous frame or earlier, and extracts the target from the image f(x, y) based on the evaluation values obtained for the parts p(x, y).

[0008] The invention of claim 2 is the device of claim 1, characterized in that the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y) to the position evaluation value c, which is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame, predicted from the target positions extracted in the previous frame or earlier.

[0009] The invention of claim 3 is the device of claim 2, characterized in that the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target by

[0010]

c = k(|i − x0| + |j − y0|), where k is a positive constant.

The invention of claim 4 is a target extraction device comprising an imaging device that images a target and outputs a video signal, image generating means that sequentially A/D-converts the video signal output from the imaging device and generates an image f(x, y) for each frame, and target extracting means that extracts the target from the image f(x, y) and outputs a target position signal. The device is characterized in that, for each part p(x, y) of the image having the same size as a template t(m, n) representing the target, the target extracting means obtains an evaluation value, which evaluates the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame, and extracts the target from the image f(x, y) based on the evaluation values obtained for the parts p(x, y).

[0011] The invention of claim 5 is the device of claim 4, characterized in that the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y) to the position evaluation value c, which is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame.

[0012] The invention of claim 6 is the device of claim 5, characterized in that the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame by

[0013]

c = k(|i − x1| + |j − y1|), where k is a positive constant.

The invention of claim 7 is a target extraction method in which a video signal output from an imaging device that images a target is sequentially A/D-converted to generate an image f(x, y) for each frame, and the target is extracted from the image f(x, y). The method is characterized in that, for each part p(x, y) of the image having the same size as a template t(m, n) representing the target, an evaluation value, which evaluates the likelihood that the part p(x, y) is the target, is obtained from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame, predicted from the target positions extracted in the previous frame or earlier, and the target is extracted from the image f(x, y) based on the evaluation values obtained for the parts p(x, y).

[0014] The invention of claim 8 is the method of claim 7, characterized in that the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y) to the position evaluation value c, which is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame, predicted from the target positions extracted in the previous frame or earlier.

[0015] The invention of claim 9 is the method of claim 8, characterized in that the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target by

[0016]

c = k(|i − x0| + |j − y0|), where k is a positive constant.

The invention of claim 10 is a target extraction method in which a video signal output from an imaging device that images a target is sequentially A/D-converted to generate an image f(x, y) for each frame, and the target is extracted from the image f(x, y). The method is characterized in that, for each part p(x, y) of the image having the same size as a template t(m, n) representing the target, an evaluation value, which evaluates the likelihood that the part p(x, y) is the target, is obtained from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame, and the target is extracted from the image f(x, y) based on the evaluation values obtained for the parts p(x, y).

[0017] The invention of claim 11 is the method of claim 10, characterized in that the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y) to the position evaluation value c, which is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame.

[0018] The invention of claim 12 is the method of claim 11, characterized in that the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the target position (x1, y1) extracted in the image f(x, y) of the previous frame by

[0019]

c = k(|i − x1| + |j − y1|), where k is a positive constant.

In the present invention, the degree of separation is an index of how far apart two points are, and any function may be chosen for it. Besides the form expressed by c in claim 3, 6, 9, or 12, the degree of separation may also be the distance between the two points, √{(i − x0)² + (j − y0)²} or √{(i − x1)² + (j − y1)²}.

[0020] The predicted position of the target can be calculated from the target positions in the previous frame or earlier by various methods, for example by the α-β tracker method, which applies an α-β filter.
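As one illustration of the α-β tracker mentioned in paragraph [0020], the following Python sketch implements the textbook form of a one-dimensional α-β filter. The patent does not give the filter equations or gain values, so the class name `AlphaBetaTracker`, the gains `alpha=0.5` and `beta=0.2`, and the one-frame time step are all assumptions made for the example. One instance would be used per axis to obtain x0 and y0.

```python
class AlphaBetaTracker:
    """Textbook one-dimensional alpha-beta filter (gains are illustrative only)."""

    def __init__(self, x_init, alpha=0.5, beta=0.2, dt=1.0):
        self.x = float(x_init)  # smoothed position
        self.v = 0.0            # smoothed velocity, in pixels per frame
        self.alpha, self.beta, self.dt = alpha, beta, dt

    def predict(self):
        """Predicted position for the next frame."""
        return self.x + self.v * self.dt

    def update(self, measured):
        """Fold in the position extracted from the current frame."""
        predicted = self.predict()
        residual = measured - predicted
        self.x = predicted + self.alpha * residual
        self.v = self.v + (self.beta / self.dt) * residual


# A target that starts at x = 10 and moves 2 pixels per frame.
tracker_x = AlphaBetaTracker(x_init=10)
for frame in range(1, 200):
    tracker_x.update(10 + 2 * frame)
print(tracker_x.predict())
```

For a target moving at constant velocity, the prediction settles onto the true next position after an initial transient, which is the property that makes it useful as the reference point (x0, y0).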

[0021]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of the target extraction device of the present invention. In the figure, 11 is an imaging device, 12 is a timing signal generator, 13 is an A/D conversion unit, 20 is target extraction means, and 10 is a display unit.

[0022] The imaging device 11 receives the vertical and horizontal synchronization signals from the timing signal generator 12, images a monitored area or the like, and outputs the result as an electrical video signal. Specifically, it is a TV camera, an infrared camera, or the like. The timing signal generator 12 generates a vertical synchronization signal, a horizontal synchronization signal, and a pixel clock signal. Besides outputting the synchronization signals to the imaging device 11 as described above, it outputs these signals to the A/D conversion unit 13 and to the template setting unit 14, position evaluation value calculation unit 15, evaluation value calculation unit 16, and target frame superimposing unit 19 that make up the target extraction means 20.

[0023] The A/D conversion unit 13 (image generating means) sequentially A/D-converts the video signal from the imaging device 11 into a gray level of, for example, 8 bits (256 levels) per pixel, and outputs raster-scan data of the image f(x, y) for each frame. Based on this data, the target is extracted by the target extraction means 20, which consists of the template setting unit 14, position evaluation value calculation unit 15, evaluation value calculation unit 16, target extraction unit 17, predicted position calculation unit 18, and target frame superimposing unit 19, each described in detail below.

[0024] The display unit 10 displays the video signal from the imaging device 11 and consists of, for example, a CRT display device. The imaging device 11, timing signal generator 12, A/D conversion unit 13, and display unit 10 described above are the same as those used in conventional target extraction devices and operate in the same way.

[0025] Next, the target extraction means 20 is described in detail. Prior to automatic extraction of the target, the template setting unit 14 selects a part of the image f(x, y) as the template t(m, n) representing the target, and sends that data to the evaluation value calculation unit 16 each time a template setting signal is input. For example, when this target extraction device is used in a system for monitoring suspicious persons, the template setting unit 14 detects an intruder, takes the intruder as the target, and outputs the intruder's portion of the image as the template representing the target.

[0026] To realize this function, the template setting unit 14 has, for example, a background image b(x, y) representing the monitored area when no intruder is present, and a frame memory that stores the raster-scan data of the image f(x, y) from the A/D conversion unit 13. It compares the image f(x, y) captured in the frame memory with the background image b(x, y) pixel by pixel, and when the luminance difference is equal to or greater than a predetermined value, it judges that pixel of the image f(x, y) to be part of the intruder. It then detects the intruder as the set of such pixels and sends a region of predetermined size, centered on the center of that set, to the evaluation value calculation unit 16 as the template t(m, n). FIG. 4(a) shows an image f(x, y) of X × Y pixels in which the hatched portion is where the luminance difference is equal to or greater than the predetermined value. A template t(m, n) of M × N pixels is set so that its center coincides with the center (xi, yi) of this portion.
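The background-difference step of paragraph [0026] might look as follows in Python. This is a sketch under assumptions the patent leaves open: the "center" of the changed pixels is taken to be their centroid, the M × N window is clamped so that it stays inside the frame, and the function name `set_template` and the threshold value are hypothetical.

```python
def set_template(frame, background, threshold, M, N):
    """Return (template, (xi, yi)), or None if no pixel differs enough.

    frame, background: lists of rows of luminance values of equal size.
    M, N: template width and height in pixels.
    """
    height, width = len(frame), len(frame[0])
    # Pixels whose luminance differs from the background by >= threshold.
    changed = [
        (x, y)
        for y in range(height)
        for x in range(width)
        if abs(frame[y][x] - background[y][x]) >= threshold
    ]
    if not changed:
        return None
    # Assumption: the center is the centroid of the changed pixels.
    xi = round(sum(x for x, _ in changed) / len(changed))
    yi = round(sum(y for _, y in changed) / len(changed))
    # Assumption: clamp the window so that it stays inside the frame.
    left = min(max(xi - M // 2, 0), width - M)
    top = min(max(yi - N // 2, 0), height - N)
    template = [row[left:left + M] for row in frame[top:top + N]]
    return template, (xi, yi)


background = [[10] * 8 for _ in range(6)]
frame = [row[:] for row in background]
for y in range(1, 4):
    for x in range(3, 6):
        frame[y][x] = 200  # a bright 3x3 "intruder"

result = set_template(frame, background, threshold=50, M=3, N=3)
print(result)
```

A production system would also need to handle noise and several disjoint changed regions; the sketch treats every changed pixel as belonging to one intruder.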

[0027] The function of continuously capturing the image f(x, y) is realized by counters, logic circuits, and memory, based on the vertical synchronization signal, horizontal synchronization signal, and pixel clock signal from the timing signal generator 12. Detecting the intruder and sending the resulting template to the evaluation value calculation unit 16 can also be done, for example, by software processing executed by a CPU and memory.

[0028] The position evaluation value calculation unit 15 calculates the position evaluation value c. The evaluation value calculation unit 16 (described later) uses c when it obtains, for each part p(x, y) of the image f(x, y) having the same size as the template t(m, n), the evaluation value for assessing the likelihood that the part p(x, y) is the target. The position evaluation value c is a non-decreasing function of the degree of separation between the representative point of each part and the predicted position of the target in the current frame, as predicted by the predicted position calculation unit 18 (described later).

[0029] As this position evaluation value c, equation (2) below can be used, for example. Here (i, j) is the center position of the part p(x, y), (x0, y0) is the predicted position of the target in the image f(x, y), and k is a positive constant.

[0030]

c = k(|i − x0| + |j − y0|)   (2)

FIG. 4(b) shows the relationship between the center (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target. That is, c can be defined as k times a degree of separation formed by the sum of |i − x0|, which represents the horizontal separation, and |j − y0|, which represents the vertical separation. The degree of separation is not limited to this example. It may also be the distance between the center (i, j) of the part p(x, y) and the predicted position (x0, y0). In any case, it can be any function that represents how far apart (i, j) and (x0, y0) are, and c should be a non-decreasing function of that degree of separation.
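The two degrees of separation named in paragraph [0030] can be compared side by side. The short Python sketch below uses hypothetical function names and sample coordinates; it only restates equation (2) and the straight-line distance alternative.

```python
import math


def c_city_block(i, j, x0, y0, k=1):
    """Equation (2): k times the sum of the horizontal and vertical offsets."""
    return k * (abs(i - x0) + abs(j - y0))


def c_euclidean(i, j, x0, y0, k=1):
    """Alternative: k times the straight-line distance between the two points."""
    return k * math.hypot(i - x0, j - y0)


# Part centered at (13, 6), predicted position (10, 10), k = 2.
print(c_city_block(13, 6, 10, 10, k=2))
print(c_euclidean(13, 6, 10, 10, k=2))
```

Both are zero at the predicted position and never decrease as the part moves farther away. The city-block form needs only additions and absolute values, which suits the counter-based hardware described for the embodiment.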

[0031] As described later, the evaluation value calculation unit 16 takes the raster-scan data of the image f(x, y) as it arrives and obtains the evaluation value of each part p(x, y). It is therefore preferable that the position evaluation value c for each part p(x, y) be sent to the evaluation value calculation unit 16 at the moment the evaluation value of that part is obtained there. To that end, the position evaluation value calculation unit 15 may be configured, for example, as shown in FIG. 2.

[0032] In FIG. 2, 21 and 24 are up/down counters, 22 and 25 are zero detection circuits, 23 and 26 are flip-flops (F/F), 27 is an adder, and 28 is a shift register. The up/down counter 21 calculates |i − x0|. When the horizontal synchronization signal is input to its LOAD terminal LD, it takes x0, input from the DATA terminal, as its initial value and at first counts down on the pixel clock signal input to its CK terminal, thereby outputting x0 − i. The origin of the coordinate system is the first data item of the raster scan, that is, the upper left of the image. The zero detection circuit 22, which consists of a comparator or the like, detects whether the output value x0 − i has reached 0. When the zero detection circuit 22 detects 0, the output of the flip-flop 23 is inverted, the up/down control U/D switches, and the counter counts up, thereby outputting i − x0. In this way |i − x0| is output.
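The behavior of up/down counter 21 over one scan line can be simulated in software. The Python sketch below is an assumption-laden model, not a description of the actual circuit timing: it assumes x0 is a non-negative integer, that the counter output is sampled once per pixel before the clock edge, and that the direction toggles on the same clock at which zero is detected. The function name `abs_offset_per_clock` is hypothetical.

```python
def abs_offset_per_clock(x0, width):
    """Simulate up/down counter 21 for pixel positions i = 0 .. width - 1."""
    count = x0            # loaded from DATA on the horizontal sync pulse
    counting_up = False   # flip-flop state: the counter starts by counting down
    outputs = []
    for _ in range(width):
        outputs.append(count)        # value seen at this pixel position
        if count == 0:
            counting_up = True       # zero detected: switch to counting up
        count += 1 if counting_up else -1
    return outputs


print(abs_offset_per_clock(x0=3, width=7))
```

The sequence falls to zero at i = x0 and then rises again, so the counter delivers |i − x0| without any subtraction or sign handling. Counter 24 works the same way along the vertical axis, clocked by the horizontal synchronization signal.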

[0033] Similarly, the up/down counter 24 calculates |j − y0|. When the vertical synchronization signal is input to its LOAD terminal LD, it takes y0, input from the DATA terminal, as its initial value and at first counts down on the horizontal synchronization signal input to its CK terminal, thereby outputting y0 − j. The zero detection circuit 25, which consists of a comparator or the like, detects whether that output value has reached 0. When the zero detection circuit 25 detects 0, the output of the flip-flop 26 is inverted, the up/down control U/D switches, and the counter counts up, thereby outputting j − y0. In this way |j − y0| is output.

[0034] The outputs of the up/down counter 21 and the up/down counter 24 are added by the adder 27. The output of the adder 27 is then sent to the shift register 28 and shifted. When k in equation (2) is a power of 2, the multiplication can thus be replaced by a shift operation, and the position evaluation value c is output. In the examples of FIG. 2 and FIG. 4(b), the position evaluation value c is calculated from the center (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target. It is equivalent, however, to calculate the position evaluation value c from the position coordinates of, for example, the lower-right point (i0, j0) of the part p(x, y) and the corresponding lower-right point (x0−M/2, y0−N/2) of the predicted position of the target.
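The adder and shift register stage of paragraph [0034] amounts to a left shift of the summed offsets. A minimal Python sketch, assuming integer coordinates and a hypothetical parameter `shift` such that k = 2**shift:

```python
def position_value_shift(i, j, x0, y0, shift):
    """c = k * (|i - x0| + |j - y0|) with k = 2**shift, using a left shift."""
    separation = abs(i - x0) + abs(j - y0)   # output of adder 27
    return separation << shift               # shift register 28


print(position_value_shift(13, 6, 10, 10, shift=1))  # k = 2
```

Restricting k to powers of 2 trades fine control over the weighting for a multiplier-free circuit.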

[0035] Next, for each part of the image f(x, y) having the same size as the template t(m, n), the evaluation value calculation unit 16 obtains the evaluation value for assessing the likelihood that the part is the target. It does so by adding together (i) the sum D (the similarity) of the luminance differences between each point of the template t(m, n) and the corresponding point of the part p(x, y), and (ii) the position evaluation value c for the part p(x, y) obtained by the position evaluation value calculation unit 15.
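Putting the two terms together, the evaluation of paragraph [0035] can be sketched as a scan over every part of the frame. The Python below is illustrative only: it assumes list-of-rows images, takes the representative point of each part to be its center at (left + M // 2, top + N // 2), selects the part with the smallest D + c, and uses the hypothetical name `extract_target`.

```python
def extract_target(image, template, predicted, k=1):
    """Return (E, i, j): the smallest evaluation value and the center of its part."""
    N, M = len(template), len(template[0])
    x0, y0 = predicted
    best = None
    for top in range(len(image) - N + 1):
        for left in range(len(image[0]) - M + 1):
            d = sum(
                abs(image[top + n][left + m] - template[n][m])
                for n in range(N)
                for m in range(M)
            )
            i, j = left + M // 2, top + N // 2       # center of this part
            e = d + k * (abs(i - x0) + abs(j - y0))  # evaluation value D + c
            if best is None or e < best[0]:
                best = (e, i, j)
    return best


# Two identical blobs; only the position term can tell them apart.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0, 0, 9, 9, 0],
    [0, 9, 9, 0, 0, 0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
template = [[9, 9], [9, 9]]

near_right = extract_target(image, template, predicted=(8, 2))
near_left = extract_target(image, template, predicted=(3, 2))
print(near_right, near_left)
```

Both blobs match the template perfectly (D = 0), so the one closer to the predicted position is selected. A better-matching part far from the prediction can still win if its D is low enough, which is how the whole frame remains searchable without a hard search area.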

[0036] The evaluation value calculation unit 16 can be configured, for example, as shown in FIG. 3. In FIG. 3, 30 is a template matching LSI (for example, the IP90C08 manufactured by Sumitomo Metal Industries, Ltd.), and 31 is a 1H line delay that outputs one horizontal line of image data in FIFO order. The template matching LSI 30 contains a template register 30a, a shift register array 30b, an absolute difference circuit 30c, a summation circuit 30d, a position calculation circuit 30e, an addition circuit 30f, and a register 30g.

[0037] The template register 30a stores the template t(m, n) set by the template setting unit 14, and its contents can be rewritten each time the template setting unit 14 sends a template t(m, n) to the template matching LSI 30. In addition, N−1 1H line delays 31, one fewer than the number N of pixels in the vertical direction of the template t(m, n), are connected in series. The outputs of the individual 1H line delays 31 and the raster-scan image data that has not passed through any 1H line delay 31 are input to the template matching LSI 30 simultaneously. These N data items are input to the shift register array 30b, where the image data of the M × N part p(x, y) of the image f(x, y) is held temporarily.
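The chain of N−1 line delays in paragraph [0037] can be modeled with FIFO queues. The Python sketch below is a software analogy under assumptions: each delay holds exactly one scan line of `width` pixels, blanking intervals are ignored, and the generator name `column_stream` is hypothetical. For every incoming pixel it yields the N vertically aligned pixels that would be presented to the shift register array together.

```python
from collections import deque


def column_stream(raster, width, N):
    """Yield N vertically stacked pixels (oldest line first) per input pixel,
    once N - 1 full scan lines have been buffered."""
    delays = [deque() for _ in range(N - 1)]  # each models one 1H line delay
    for pixel in raster:
        column = [pixel]
        value = pixel
        ready = True
        for fifo in delays:
            fifo.append(value)
            if len(fifo) > width:
                value = fifo.popleft()  # the pixel from exactly one line earlier
                column.append(value)
            else:
                ready = False           # this delay is still filling up
                break
        if ready:
            yield column[::-1]


width = 3
raster = list(range(12))  # four scan lines: 0-2, 3-5, 6-8, 9-11
columns = list(column_stream(raster, width, N=3))
print(columns[0])
```

Feeding successive columns into M-stage shift registers would then give the M × N window without ever storing a whole frame.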

[0038] The absolute difference circuit 30c obtains the absolute value of the difference between corresponding data in the template register 30a and the shift register array 30b, that is, the absolute value of the luminance difference between corresponding pixels. The summation circuit 30d then calculates the sum D of these absolute differences. The sum D is

【００３９】[0039]

【数１１】Ｄ（ｉ，ｊ）＝Σ｜ｐ（ｉ＋ｍ，ｊ＋ｎ）−
ｔ（ｍ，ｎ）｜となる。また、位置算出回路３０ｅは、現在、シフトレ
ジスタアレイ３０ｂに格納されている部分ｐ（ｘ，ｙ）
の中心位置（ｉ，ｊ）を求めるための回路で、タイミン
グ信号発生部１２からの垂直同期信号、水平同期信号及
び画素クロック信号が入力されて、これらに基づき、最
新に入力されたラスタースキャンデータの位置（ｉ０，
ｊ０）を求め、さらにそれから、シフトレジスタアレイ
３０ｂに格納されている部分ｐ（ｘ，ｙ）の中心位置
（ｉ，ｊ）を求めることができる（ｉ＝ｉ０−Ｍ／２、
ｊ＝ｊ０−Ｎ／２）。D (i, j) = Σ | p (i + m, j + n) −
t (m, n) | Further, the position calculation circuit 30e calculates the part p (x, y) currently stored in the shift register array 30b.
A vertical synchronizing signal, a horizontal synchronizing signal, and a pixel clock signal from the timing signal generator 12 are input. Based on these, the latest input raster scan data Position (i0,
j0), and then the center position (i, j) of the portion p (x, y) stored in the shift register array 30b can be obtained (i = i0−M / 2,
j = j0-N / 2).
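
As a rough software counterpart to circuits 30c to 30e, the sketch below computes the sum D for one part and converts the latest raster position into the center of that part. It is an illustration under assumptions: images are nested lists indexed as `image[y][x]`, the part is addressed by its top-left pixel rather than its center, and M/2 and N/2 are taken as integer divisions. The function names are hypothetical.

```python
def sum_abs_diff(image, template, left, top):
    """Sum D of absolute luminance differences between the template and the
    same-sized part of the image whose top-left pixel is (left, top)."""
    total = 0
    for n, template_row in enumerate(template):
        for m, t_value in enumerate(template_row):
            total += abs(image[top + n][left + m] - t_value)
    return total


def center_from_latest(i0, j0, m_size, n_size):
    """Center (i, j) of the stored part, given the position (i0, j0) of the most
    recently input pixel: i = i0 - M/2, j = j0 - N/2 (integer division assumed)."""
    return i0 - m_size // 2, j0 - n_size // 2
```

A part identical to the template gives D = 0, and D grows as the luminance pattern departs from the template.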

[0040] The addition circuit 30f receives D from the summation circuit 30d and the position evaluation value c obtained by the position evaluation value calculation unit 15, and adds D and c. The register 30g compares the output of the addition circuit 30f with the value it already holds. If the output of the addition circuit 30f is smaller, the register overwrites its stored value with that smaller value and, at the same time, stores the position (i, j) obtained from the position calculation circuit 30e. When processing of one frame is complete, the register outputs the stored position.
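
The behavior described for the register 30g amounts to a running minimum over the stream of D + c values. A minimal sketch follows, assuming an "empty" initial state (the text does not say how the register is initialised) and using the hypothetical class name `MinRegister`.

```python
class MinRegister:
    """Keeps the smallest evaluation value seen so far and the position at
    which it occurred, in the manner described for register 30g."""

    def __init__(self):
        self.value = None      # assumed initial state: nothing stored yet
        self.position = None

    def update(self, value, position):
        # Overwrite only when the new value is strictly smaller, so the
        # earliest position wins when two values are equal.
        if self.value is None or value < self.value:
            self.value = value
            self.position = position

    def result(self):
        """Position to output once one frame has been processed."""
        return self.position
```

Because only one value and one position are held, the minimum can be tracked while the frame is still being scanned, without storing the evaluation values of all parts.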

[0041] In this way, the evaluation value D + c for each part p(x, y) can be obtained in real time, and the position (i, j) of the part with the smallest evaluation value D + c, that is, the best match, is output. The IP90C08 has an input terminal EXin for cascade connection and is configured to output, as the final matching result, the sum of D and the value input at EXin. Therefore, when the IP90C08 is used, D and c can be added simply by inputting the position evaluation value c from the position evaluation value calculation unit 15 to the input terminal EXin.
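
The whole-frame search for the smallest D + c can be sketched in software as follows. This is an illustration, not the LSI implementation: it assumes nested-list images indexed as `image[y][x]`, takes the center of a part as its top-left corner plus half the template size (integer division), and uses the position evaluation value c = k(|i − x0| + |j − y0|) with respect to a reference position (x0, y0). The name `find_target` is hypothetical.

```python
def find_target(image, template, reference, k):
    """Scan every template-sized part of the image and return
    (center, score) for the part with the smallest D + c."""
    n_rows, m_cols = len(template), len(template[0])
    x0, y0 = reference
    best = None
    for top in range(len(image) - n_rows + 1):
        for left in range(len(image[0]) - m_cols + 1):
            # D: sum of absolute luminance differences for this part.
            d = sum(
                abs(image[top + n][left + m] - template[n][m])
                for n in range(n_rows)
                for m in range(m_cols)
            )
            # Center of the part and its position evaluation value c.
            i, j = left + m_cols // 2, top + n_rows // 2
            c = k * (abs(i - x0) + abs(j - y0))
            score = d + c
            if best is None or score < best[1]:
                best = ((i, j), score)
    return best
```

When the image contains two regions that match the template equally well, k = 0 simply returns the first one in raster order, whereas a positive k selects the one nearer the reference position. This is the effect the position evaluation value is meant to provide.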

[0042] The target extraction unit 17 regards the center position (x1, y1) of the part p(x, y) with the smallest evaluation value D + c, output from the evaluation value calculation unit 16, as the target position, and outputs a target position signal to the predicted position calculation unit 18 and the target frame superimposing unit 19. The target extraction unit 17 can also be implemented, for example, as software processing executed by a CPU and memory. The predicted position calculation unit 18 predicts the position of the target in the next frame based on the target position signal extracted by the target extraction unit 17.

[0043] The predicted position of the target can be calculated, for example, by applying the α-β filter expressed by equation (3) separately in the horizontal direction x and the vertical direction y. In equation (3), Xm is the target position extracted by the target extraction unit 17, Xs is the smoothed target position, Xp is the predicted target position, Vs is the target velocity, α and β are filter constants, T is the time interval between frames, and n is the frame number.

【００４４】[0044]

$$\begin{aligned}
X_s(n) &= X_p(n) + \alpha\,[X_m(n) - X_p(n)] \\
V_s(n) &= V_s(n-1) + \beta\,[X_m(n) - X_p(n)] / T \\
X_p(n+1) &= X_s(n) + V_s(n) \cdot T
\end{aligned} \tag{3}$$

This α-β filter is a kind of low-pass filter. Reducing the value of α or β emphasizes the smoothing characteristic of the filter, and increasing the value of α or β emphasizes its tracking characteristic (the frequency characteristics of the α-β filter are described in detail in, for example, D. E. Mayiatis, "Comparison of α-β and Kalman filter in track while scan radars"). FIG. 5 shows how the predicted target position Xp(n+1) is obtained using this α-β filter.
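
One update of the α-β filter of equation (3) for a single axis can be sketched as below. The function name and argument names are hypothetical, and the choice of initial values for the prediction and the velocity is left to the caller because the text does not specify it.

```python
def alpha_beta_step(x_meas, x_pred, v_prev, alpha, beta, dt):
    """Apply equation (3) once for one axis.

    x_meas: measured target position Xm(n)
    x_pred: predicted position Xp(n) for the current frame
    v_prev: previous velocity estimate Vs(n-1)
    Returns (Xs(n), Vs(n), Xp(n+1)).
    """
    residual = x_meas - x_pred
    x_smooth = x_pred + alpha * residual        # Xs(n)
    v_smooth = v_prev + beta * residual / dt    # Vs(n)
    x_next = x_smooth + v_smooth * dt           # Xp(n+1)
    return x_smooth, v_smooth, x_next
```

The same function would be called once for x and once for y in each frame, carrying the returned velocity and prediction forward to the next call.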

[0045] The predicted position calculation unit 18 described above can also be implemented as software processing executed by a CPU and memory. The target frame superimposing unit 19 superimposes a target frame indicating the position of the target extracted by the target extraction unit 17 on the video signal from the imaging device 11, and can be composed of, for example, a video switch 41 and a switching signal generation circuit 42 as shown in FIG. 6. The video switch 41 switches between the video signal from the imaging device 11 and a "white" level signal representing the target frame according to a switching signal from the switching signal generation circuit 42. Based on the target position signal, the vertical synchronization signal, the horizontal synchronization signal, and the pixel clock signal, the switching signal generation circuit 42 outputs a switching signal that switches the video switch 41 from the video signal to the "white" level signal at the timing corresponding to the target frame, so that a target frame of a predetermined size centered on the target position extracted by the target extraction unit 17 is displayed.
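
In software terms, the video switch selects, pixel by pixel, either the camera luminance or the white level. The sketch below is only an analogy to FIG. 6: the one-pixel-wide rectangular outline, the 8-bit white level of 255, and the names `on_frame` and `overlay_frame` are assumptions for illustration.

```python
WHITE = 255  # assumed 8-bit "white" level


def on_frame(x, y, center, half_w, half_h):
    """True when pixel (x, y) lies on the outline of the target frame; this
    plays the role of the switching signal."""
    cx, cy = center
    dx, dy = abs(x - cx), abs(y - cy)
    return (dx == half_w and dy <= half_h) or (dy == half_h and dx <= half_w)


def overlay_frame(image, center, half_w, half_h):
    """Return a copy of the image with the target frame drawn in white; this
    plays the role of the video switch."""
    return [
        [WHITE if on_frame(x, y, center, half_w, half_h) else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]
```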

[0046] By displaying the extracted target on the display unit 10 with a target frame centered on the target position, a target such as an intruder can be tracked and its behavior monitored. As another embodiment, the predicted position calculation unit 18 in the target extraction means 20 may be omitted and the target position signal from the target extraction unit 17 sent directly to the position evaluation value calculation unit 15 instead. The position evaluation value c calculated by the position evaluation value calculation unit 15 can then be a non-decreasing function of the degree of separation between the center (i, j) of each part p(x, y) of the image f(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame.

[0047] In this case, the position evaluation value c can be given by the following expression (4).

【００４８】[0048]

$$c = k\,(|i - x_1| + |j - y_1|) \tag{4}$$

When the target moves relatively slowly or moves little, this embodiment is sufficient to extract the target. As described above, according to each embodiment, introducing the position evaluation value c allows matching against the template to be performed over the entire image f(x, y) rather than over only a partial region of it. This reduces the possibility of losing the target because it can no longer be extracted, and also helps prevent misrecognition of the target.

[0049] By adjusting the value of k in the position evaluation value calculation unit 15, the weight given to the degree of separation in the evaluation value can be changed. For example, when the target moves irregularly, the value of c is made smaller (by reducing k), and conversely, when the target's movement is predictable, the value of c is made larger (by increasing k), so that an appropriate evaluation can be made according to the situation.
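
The effect of k can be seen in a small worked example. The numbers below are invented purely for illustration: one candidate part lies exactly at the reference position but matches the template less well, and another matches better but lies farther away.

```python
def evaluation(d, center, reference, k):
    """Evaluation value D + c with c = k(|i - x| + |j - y|)."""
    i, j = center
    x, y = reference
    return d + k * (abs(i - x) + abs(j - y))


REFERENCE = (20, 20)  # hypothetical predicted (or previous) target position
CANDIDATES = {
    "near": {"d": 30, "center": (20, 20)},  # poorer match, at the reference
    "far": {"d": 10, "center": (24, 24)},   # better match, separation 8
}


def winner(k):
    """Name of the candidate with the smallest evaluation value for this k."""
    scores = {
        name: evaluation(cand["d"], cand["center"], REFERENCE, k)
        for name, cand in CANDIDATES.items()
    }
    return min(scores, key=scores.get)
```

With k = 1 the far candidate scores 10 + 8 = 18 against 30 and is selected. With k = 5 it scores 10 + 40 = 50 and the near candidate is selected instead. A small k therefore trusts the luminance match, and a large k trusts the expected position.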

[0050] Furthermore, according to the present embodiment, the device can be constructed simply using an existing commercially available LSI, and matching and target extraction can be performed in real time. Although the position evaluation value calculation unit 15 and the evaluation value calculation unit 16 are implemented with ICs and an LSI in the present embodiment, the entire target extraction means can also be implemented as software executed by a CPU, memory, and the like.

【００５１】[0051]

[Effects of the Invention] As described above, according to the inventions of claims 1 to 3 and claims 7 to 9, the target is extracted from the image f(x, y) as follows. For a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target is obtained from (i) the similarity between the template t(m, n) and the part p(x, y) and (ii) a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame, which is predicted from the target positions extracted in or before the previous frame. The target is then extracted based on the evaluation value obtained for each part p(x, y). Because this takes into account that the target is more likely to exist closer to the predicted position, misrecognition of the target can be prevented and highly reliable target tracking becomes possible. In addition, since a search region of the conventional kind is not needed, the problems associated with setting a conventional search region are eliminated.

[0052] Similarly, according to the inventions of claims 4 to 6 and claims 10 to 12, the target is extracted from the image f(x, y) as follows. For a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target is obtained from (i) the similarity between the template t(m, n) and the part p(x, y) and (ii) a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame. The target is then extracted based on the evaluation value obtained for each part p(x, y). Because this takes into account that the target is more likely to exist closer to its position in the previous frame, misrecognition of the target can be prevented and highly reliable target tracking becomes possible. In addition, since a search region of the conventional kind is not needed, the problems associated with setting a conventional search region are eliminated.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the target extraction device according to the present invention.

FIG. 2 is a block diagram showing a specific configuration example of the position evaluation value calculation unit 15 in FIG. 1.

FIG. 3 is a block diagram showing a specific configuration example of the evaluation value calculation unit 16 in FIG. 1.

FIG. 4(a) is an explanatory diagram of an image showing the setting of the template, and FIG. 4(b) is an explanatory diagram showing the degree of separation between a part p(x, y) of the image and the predicted position of the target in the image.

FIG. 5 is an explanatory diagram of the principle of calculating the predicted target position by applying the α-β filter in the predicted position calculation unit 18 in FIG. 1.

FIG. 6 is a block diagram showing a specific configuration example of the target frame superimposing unit 19 in FIG. 1.

FIG. 7 is an explanatory diagram showing a conventional template matching method.

[Explanation of symbols]

11 Imaging device
13 A/D conversion unit (image generation means)
20 Target extraction means

Claims

[Claims]

1. A target extraction device comprising: an imaging device that images a target and outputs a video signal; image generation means that sequentially A/D-converts the video signal output from the imaging device and generates an image f(x, y) for each frame; and target extraction means that extracts the target from the image f(x, y) and outputs a target position signal, wherein the target extraction means obtains, for a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame predicted based on the target positions extracted in or before the previous frame, and extracts the target from the image f(x, y) based on the evaluation value obtained for each part p(x, y).

2. The target extraction device according to claim 1, wherein the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y), and a position evaluation value c that is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame predicted based on the target positions extracted in or before the previous frame.

3. The target extraction device according to claim 2, wherein the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target as c = k(|i − x0| + |j − y0|), where k is a positive constant.

4. A target extraction device comprising: an imaging device that images a target and outputs a video signal; image generation means that sequentially A/D-converts the video signal output from the imaging device and generates an image f(x, y) for each frame; and target extraction means that extracts the target from the image f(x, y) and outputs a target position signal, wherein the target extraction means obtains, for a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame, and extracts the target from the image f(x, y) based on the evaluation value obtained for each part p(x, y).

5. The target extraction device according to claim 4, wherein the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y), and a position evaluation value c that is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame.

6. The target extraction device according to claim 5, wherein the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame as c = k(|i − x1| + |j − y1|), where k is a positive constant.

7. A target extraction method in which a video signal output from an imaging device that images a target is sequentially A/D-converted and generated as an image f(x, y), and the target is extracted from the image f(x, y), wherein the extraction of the target from the image f(x, y) is performed by obtaining, for a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame predicted based on the target positions extracted in or before the previous frame, and by extracting the target based on the evaluation value obtained for each part p(x, y).

8. The target extraction method according to claim 7, wherein the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y), and a position evaluation value c that is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the predicted position (x0, y0) of the target in the current frame predicted based on the target positions extracted in or before the previous frame.

9. The target extraction method according to claim 8, wherein the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the predicted position (x0, y0) of the target as c = k(|i − x0| + |j − y0|), where k is a positive constant.

10. A target extraction method in which a video signal output from an imaging device that images a target is sequentially A/D-converted and generated as an image f(x, y), and the target is extracted from the image f(x, y), wherein the extraction of the target from the image f(x, y) is performed by obtaining, for a template t(m, n) representing the target and each part p(x, y) of the image having the same size as the template, an evaluation value for evaluating the likelihood that the part p(x, y) is the target, from the similarity between the template t(m, n) and the part p(x, y) and from a position evaluation value c determined by the degree of separation between a representative point of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame, and by extracting the target based on the evaluation value obtained for each part p(x, y).

11. The target extraction method according to claim 10, wherein the evaluation value is obtained by adding the sum D of the luminance differences between corresponding pixels of the template t(m, n) and the part p(x, y), and a position evaluation value c that is a non-decreasing function of the degree of separation between the representative point of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame.

12. The target extraction method according to claim 11, wherein the position evaluation value c is obtained from the representative point (i, j) of the part p(x, y) and the target position (x1, y1) extracted from the image f(x, y) in the previous frame as c = k(|i − x1| + |j − y1|), where k is a positive constant.