JP4613558B2

JP4613558B2 - Human body detection device using images

Info

Publication number: JP4613558B2
Application number: JP2004267269A
Authority: JP
Inventors: 裕之藤井; 忠洋荒川; 啓史松田
Original assignee: Panasonic Corp; Matsushita Electric Works Ltd
Current assignee: Panasonic Corp; Panasonic Electric Works Co Ltd
Priority date: 2003-09-16
Filing date: 2004-09-14
Publication date: 2011-01-19
Anticipated expiration: 2024-09-14
Also published as: JP2005115932A

Description

The present invention relates to an image-based human body detection device that detects the presence or absence of a person in a detection area by using a plurality of time-ordered images obtained by imaging the desired detection area with an imaging means.

Human body detection devices have been proposed that detect the presence or absence of a person in a detection area by using images of the desired detection area captured by an imaging means such as a TV camera. In this type of device, a moving object must be separated from the background within the detection area, and the moving objects must further be separated into people and non-people. Known techniques for separating a moving object from the background compare the current image captured by the imaging means either with a reference image in which the background of the detection area has been registered in advance, or with a past image captured by the imaging means (see, for example, Patent Document 1). In either case, a difference image is generated whose pixel values are the per-pixel differences in luminance between the two images being compared, and change regions are generated as the regions of the difference image made up of pixels whose difference value exceeds a predetermined threshold. Among the change regions, one person may be split into several regions, and noise other than people may also appear as change regions. The technique of Patent Document 1 therefore removes as noise any minute change region whose area is at or below a specified threshold, and merges adjacent change regions among those remaining so that the merged result is treated as the change region corresponding to one person. For each merged change region, a circumscribed rectangle enclosing it (a rectangle with vertical and horizontal sides) is generated, and the interior of that rectangle is taken as an integrated region. An integrated region obtained in this way is a region where a person may be present, and it is evaluated to decide whether it corresponds to a person. Correlation is used for this evaluation: when the correlation of the integrated region between the two images being compared is large (high similarity), the region is regarded as a disturbance, and when the correlation is small it is detected as a moving object, that is, a person.
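The difference-image step of the conventional technique described above can be sketched as follows. This is only a minimal illustration under stated assumptions: images are lists of rows of luminance values, the function names and the threshold are hypothetical, and the noise removal and merging of adjacent change regions are omitted, so a single circumscribed rectangle is computed for all changed pixels.

```python
def change_mask(img_a, img_b, threshold):
    """Binary mask: 1 where the absolute luminance difference exceeds threshold."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]


def bounding_rectangle(mask):
    """Circumscribed rectangle (top, left, bottom, right) of all 1-pixels.

    Returns None when the mask contains no changed pixel.
    """
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))
```

A reference image and a current image of the same size would be passed to `change_mask`, and the resulting mask to `bounding_rectangle`.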

Because using correlation to evaluate the similarity of an integrated region requires a large amount of computation, a simpler evaluation technique has also been considered. The ratio of the vertical to horizontal side lengths (aspect ratio) of the circumscribed rectangle that forms the integrated region in the image is computed and compared with a preset aspect ratio corresponding to a person, and if the similarity is high (the difference in aspect ratio is at or below a specified value), the change region in that integrated region is judged to correspond to a person. The aspect ratio can be used for this evaluation because the integrated region generated for an upright person is vertically long (vertical dimension > horizontal dimension), which distinguishes it from an object such as a dog or other small animal whose integrated region is horizontally long (vertical dimension < horizontal dimension).

Similarly, it has been considered to compute the area of the circumscribed rectangle that forms the integrated region in the image, compare it with a preset area corresponding to a person, and judge the change region in that integrated region to correspond to a person if the similarity is high (the difference in area is at or below a specified value). The area is used for this evaluation in order to distinguish people from large objects such as automobiles and from small objects such as mice.
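The two simplified evaluations just described, by aspect ratio and by area, can be sketched as below. The rectangle layout `(top, left, bottom, right)` with inclusive pixel coordinates, the function names, and any reference values or tolerances passed in are assumptions made for illustration; the source gives no concrete numbers.

```python
def aspect_ratio(rect):
    """Vertical side length divided by horizontal side length."""
    top, left, bottom, right = rect
    return (bottom - top + 1) / (right - left + 1)


def area(rect):
    """Area of the circumscribed rectangle in pixels."""
    top, left, bottom, right = rect
    return (bottom - top + 1) * (right - left + 1)


def matches_reference(value, reference, tolerance):
    """True when the feature differs from the preset person value by at most tolerance."""
    return abs(value - reference) <= tolerance
```

For example, a region 20 pixels tall and 10 pixels wide has an aspect ratio of 2.0 and would match a hypothetical person reference of 2.2 with a tolerance of 0.5, whereas the same region rotated by 90 degrees would not.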

For evaluating the similarity of an integrated region, the technique that uses the aspect ratio of the integrated region and the technique that uses its area both represent the feature of the integrated region in a simple way. The amount of computation needed to calculate the feature is therefore small, and similarity can be evaluated in real time without requiring especially fast processing.
Patent Document 1: Japanese Patent Laid-Open No. 11-41589

Techniques that use correlation to evaluate whether an integrated region is a person or not require a large amount of computation, and evaluating similarity in real time calls for a processor with high processing capability. Evaluation based on aspect ratio or area, on the other hand, reduces the processing load, so similarity can be evaluated in real time without such a processor.

However, with the aspect-ratio technique, when a small animal such as a dog moves in the depth direction of the image, that is, so that only its distance changes within the field of view of the imaging means, the integrated region may become vertically long and indistinguishable from that of a person. Also, when several people overlap in the image, the integrated region may become horizontally long, in which case the people may be mistaken for a small animal.

The area technique, in contrast, easily distinguishes objects whose sizes clearly differ, such as an automobile and a person. Even the same object, however, occupies a different area in the image depending on its distance from the imaging means. For example, a dog near the image sensor and a person far from it may occupy roughly the same area in the image, so that the person and the dog cannot be distinguished.

The present invention has been made in view of the above circumstances, and its object is to provide an image-based human body detection device that can accurately determine, with a small amount of processing, whether a region extracted from an image corresponds to a person.

The invention of claim 1 comprises: an imaging means that images a predetermined field of view; a moving region extraction means that uses edge images, which are binary images obtained by extracting edges from each of the images captured by the imaging means at different times, and performs a logical operation that combines three or more time-series edge images to remove the background, thereby extracting the region corresponding to an object that has moved in the edge image at the time of interest; and a region analysis means that obtains the frequency distribution of the direction codes of the pixels on the edges in the region extracted by the moving region extraction means, and evaluates whether that region corresponds to a person by using the similarity between the obtained frequency distribution and reference data, which is a frequency distribution of the direction codes of pixels on edges obtained in advance for an edge image of a person.

With this configuration, edge images are used when extracting the region corresponding to a person from the images captured by the imaging means. Because an edge image is a binary image, the region corresponding to a moved object can be extracted merely by performing a simple logical operation on three or more edge images. In addition, a frequency distribution of direction codes is created for the pixels on the edges of the extracted region, and whether the region is a person is judged from the similarity between the frequency distribution serving as reference data and the frequency distribution obtained from the captured images. This reduces the amount of information used for judging similarity and hence the processing load, and using the frequency distribution of direction codes makes it possible to distinguish people from non-people accurately.
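This passage does not state which logical operation is applied to the three edge images. Purely as an assumed illustration, the sketch below keeps an edge pixel of the current image only when it is absent from both the preceding and the following edge image, so that stationary background edges cancel out. The function name and the choice of operation are hypothetical and are not taken from the source.

```python
def moving_edges(prev_edges, cur_edges, next_edges):
    """Assumed three-frame operation: cur AND NOT (prev OR next).

    Each argument is a binary edge image (list of rows of 0/1 values).
    Edges present at the same pixel in a neighboring frame are treated
    as background and removed.
    """
    return [
        [c & (1 - (p | n)) for p, c, n in zip(row_p, row_c, row_n)]
        for row_p, row_c, row_n in zip(prev_edges, cur_edges, next_edges)
    ]
```

Because only bitwise operations on binary pixels are involved, the cost per pixel is small, which is the property the configuration relies on.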

In the invention of claim 2, based on the invention of claim 1, the frequency distribution is normalized by the total number of pixels on the edges concerned, the sum of squares of the differences in frequency for each direction code is used as the evaluation value of the similarity, and the region analysis means judges the region extracted by the moving region extraction means to be a region corresponding to a person when the evaluation value is at or below a specified threshold.

With this configuration, because the frequency distributions are normalized, the similarity between the frequency distribution obtained from the captured images and the frequency distribution prepared in advance as reference data can be evaluated easily, and similarity can be judged with a computation as simple as taking the sum of squares of the per-direction-code frequency differences as the evaluation value.
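The evaluation of claim 2 can be sketched as follows, assuming eight direction codes numbered 0 to 7 (the embodiment assigns codes to eight directions). The function names and the threshold are hypothetical.

```python
NUM_CODES = 8  # assumed: direction codes 0..7, one per 45-degree direction


def normalized_histogram(codes):
    """Frequency distribution of direction codes, normalized by the pixel count."""
    if not codes:
        raise ValueError("no edge pixels in region")
    counts = [0] * NUM_CODES
    for code in codes:
        counts[code] += 1
    total = len(codes)
    return [n / total for n in counts]


def evaluation_value(hist, reference):
    """Sum of squared per-code differences; smaller means more similar."""
    return sum((h - r) ** 2 for h, r in zip(hist, reference))


def is_person(hist, reference, threshold):
    """Region is judged to be a person when the evaluation value is <= threshold."""
    return evaluation_value(hist, reference) <= threshold
```

Normalizing by the number of edge pixels makes the comparison independent of how large the region appears in the image.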

In the invention of claim 3, based on the invention of claim 1 or claim 2, the region analysis means has a normal range, defined by an upper limit and a lower limit, set for the frequency of each direction code of the frequency distribution obtained for the region extracted by the moving region extraction means, and regards a region whose frequency distribution contains a direction code whose frequency falls outside the normal range as a disturbance other than a person.

With this configuration, disturbances are judged by setting an upper limit and a lower limit on the frequency of each direction code, so disturbances can be removed easily. Similarity is then evaluated only for regions that are likely to correspond to a person, which shortens the time required for computing similarity.
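The pre-filter of claim 3 amounts to a range check on each bin of the frequency distribution. A minimal sketch, with a hypothetical function name and caller-supplied per-code limits:

```python
def within_normal_range(hist, lower, upper):
    """True when every direction-code frequency lies inside its normal range.

    hist, lower and upper are sequences of equal length, one entry per
    direction code. A region failing this check is treated as a disturbance
    and is not passed on to the similarity evaluation.
    """
    return all(lo <= h <= hi for h, lo, hi in zip(hist, lower, upper))
```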

In the invention of claim 4, based on any of the inventions of claims 1 to 3, the region analysis means obtains, for each region extracted by the moving region extraction means from a plurality of time-series edge images, a frequency distribution of the direction codes of the pixels on the edges normalized by the total number of pixels on those edges; it then evaluates the similarity of the frequency distributions between each pair of edge images adjacent in the time series, thereby associating regions that correspond to the same object across different edge images, and evaluates the associated regions using their similarity to the reference data.

With this configuration, the similarity of the frequency distributions is evaluated between edge images adjacent in the time series and regions corresponding to the same object are associated, so correct association is possible even when several objects are adjacent in an edge image. By comparing with the reference data only those regions for which an association was found, it can be judged correctly and efficiently whether a region represents a person.
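The association of regions between adjacent edge images can be sketched as below. The source does not say how candidate pairs are chosen, so the greedy one-to-one matching by smallest sum of squared differences and the `max_distance` cutoff are assumptions for illustration only, as are the function names.

```python
def ssd(a, b):
    """Sum of squared differences between two normalized frequency distributions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_regions(prev_hists, cur_hists, max_distance):
    """Greedy one-to-one association (assumed strategy).

    Returns a dict mapping each matched index in cur_hists to an index in
    prev_hists. Pairs are accepted in order of increasing distance, and
    pairs whose distance exceeds max_distance are never accepted.
    """
    pairs = sorted(
        (ssd(p, c), i, j)
        for i, p in enumerate(prev_hists)
        for j, c in enumerate(cur_hists)
    )
    matches = {}
    used_prev = set()
    for distance, i, j in pairs:
        if distance > max_distance:
            break
        if i in used_prev or j in matches:
            continue
        matches[j] = i
        used_prev.add(i)
    return matches
```

Only the regions that appear as keys of the returned mapping would then be compared with the reference data.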

In the invention of claim 5, based on any of the inventions of claims 1 to 4, a monitoring area setting means is added that divides the field of view of the imaging means into a plurality of areas at a ratio corresponding to the size that the image of a person present in the field of view occupies at each part of the imaging surface, has a function of designating each area as either a valid area or an invalid area, and uses the valid areas as the monitoring area in which the presence or absence of a person is detected.

With this configuration, the field of view of the imaging means is divided into a plurality of areas, and each area can be designated as a valid area in which people are monitored or an invalid area in which they are not. By designating as invalid an area where disturbances are known to occur, for example, the influence of disturbances can be reduced and the accuracy of detecting people is improved.

In the invention of claim 6, based on any of the inventions of claims 1 to 4, an area setting mode for setting the monitoring area in which the presence or absence of a person is detected can be selected. In the area setting mode, the imaging means receives only light of a specific wavelength from a light source that is moved within the field of view along the boundary of the monitoring area. A monitoring area setting means is added that takes one of the inside and the outside of the closed region, obtained by connecting in time order the positions at which the density value is greatest in the plurality of images obtained in time series from the imaging means in the area setting mode, as the valid area and the other as the invalid area, and uses the valid area as the monitoring area in which the presence or absence of a person is detected.

With this configuration, parts of the field of view of the imaging means where the presence or absence of a person need not be detected, or where disturbances are likely to occur, can be excluded from monitoring and designated as invalid areas. By designating as invalid an area where disturbances are known to occur, for example, the influence of disturbances can be reduced and the accuracy of detecting people is improved.

In the invention of claim 7, based on any of the inventions of claims 1 to 6, there are provided an image memory that temporarily stores each image captured by the imaging device, and a storage memory that, when the region analysis means extracts a region corresponding to a person, cuts out that region from the image stored in the image memory and saves it.

With this configuration, when a region corresponding to a person is extracted, that region is cut out of the image and saved in the storage memory, so useless images containing no region corresponding to a person are not saved. Moreover, because only the region corresponding to a person is cut out of the images that are needed, the amount of data to be saved can be greatly reduced, and detailed images can be kept for the images that are needed.

In the invention of claim 8, based on any of the inventions of claims 1 to 6, there are provided an image memory that temporarily stores each image captured by the imaging device, and an image transmission unit that, when the region analysis means extracts a region corresponding to a person, cuts out that region from the image stored in the image memory and transfers it to another device.

With this configuration, when a region corresponding to a person is extracted, that region is cut out of the image and transferred to another device via the image transmission unit, so useless images containing no region corresponding to a person are not transferred and traffic on the transmission path is reduced. Moreover, because only the region corresponding to a person is cut out of the images that are needed, the amount of data to be transferred can be greatly reduced, and detailed images can be transferred for the images that are needed.

In the invention of claim 9, based on any of the inventions of claims 1 to 8, there are added a detection signal output means that outputs a detection signal when there is a region that the region analysis means has evaluated as corresponding to a person, and an image output means that, when the detection signal is output from the detection signal output means, starts tracking the region that the region analysis means evaluated as corresponding to a person and causes a partially enlarged image, in which that region is magnified more than the other regions, to be displayed on the screen of an image display means.

With this configuration, when a region corresponding to a person is detected in the field of view of the imaging means, it is judged that there is an intruder, and that region is tracked and displayed on the screen of the image display means magnified more than the other regions. When the device is used to monitor the field of view of the imaging means, as with a surveillance camera, the features of the intruder can be seen easily on the screen, and the burden on the person watching the screen of the image display means is reduced.

In the invention of claim 10, based on the invention of claim 9, the image output means causes the partially enlarged image to be displayed on the screen of the image display means at a magnification matched to the size of the screen, and causes a whole image, which is the entire field of view of the imaging means, to be displayed on a part of the screen of the image display means.

With this configuration, the features of the intruder can be seen easily on the screen in the partially enlarged image, and because the whole image is displayed on a part of the screen, the location of the intruder within the field of view of the imaging means can be known at the same time.

In the invention of claim 11, based on the invention of claim 9, the image output means gradually increases the magnification of the partially enlarged image with the passage of time, starting from the region that the region analysis means first evaluated as corresponding to a person, from a state in which the whole image, which is the entire field of view of the imaging means, is displayed on the screen of the image display means to a state in which the partially enlarged image is displayed at a magnification matched to the size of the screen of the image display means.

With this configuration, the image displayed on the screen of the image display means is not switched abruptly from the whole image to a partially enlarged image of fixed magnification. Instead, the magnification of the partially enlarged image is increased with the passage of time, starting from the place on the screen where the person was first detected, so that the partially enlarged image zooms in over time from that place. Compared with switching abruptly to a partially enlarged image of fixed magnification, this makes it easier to grasp where the intruder is.
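One way to realize such a gradual zoom is to interpolate the displayed window from the full field of view to the final window around the detected region. The linear interpolation, the `(left, top, right, bottom)` window layout, and the function name below are assumptions for illustration; the source only states that the magnification increases gradually over time.

```python
def zoom_window(full, target, elapsed, duration):
    """Window (left, top, right, bottom) to display at a given elapsed time.

    full is the whole field of view, target the final enlarged window.
    The window moves linearly from full to target over duration and then
    stays at target.
    """
    s = min(max(elapsed / duration, 0.0), 1.0)
    return tuple(f + (t - f) * s for f, t in zip(full, target))
```

The magnification at any instant is the width of `full` divided by the width of the returned window, so it rises smoothly from 1 to its final value.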

In the invention of claim 12, based on the invention of claim 9, when there are a plurality of regions that the region analysis means has evaluated as corresponding to a person, the image output means displays the partially enlarged images corresponding to the respective regions on the screen of the image display means, switching between them at fixed time intervals.

With this configuration, when there are a plurality of regions corresponding to people, the partially enlarged images of the respective regions are displayed in turn at fixed time intervals, which makes it easy to check the features of each of several intruders.

In the invention of claim 13, based on the invention of claim 9, when there are a plurality of regions that the region analysis means has evaluated as corresponding to a person, the image output means divides the screen of the image display means into as many sections as there are regions and displays the partially enlarged image corresponding to each region in its own section.

With this configuration, the behavior of several intruders can be viewed together on one screen, so the features and behavior of several people can be monitored at once, and it becomes easy to notice whether any intruder is behaving suspiciously.

According to the configuration of the present invention, edge images are used when extracting the region corresponding to a person from the images captured by the imaging means, and because an edge image is a binary image, there is the advantage that the region corresponding to a person can be extracted with simple logical operations alone. In addition, a frequency distribution of direction codes is created for the pixels on the edges of the extracted region, and whether the region is a person is judged from the similarity between the frequency distribution serving as reference data and the frequency distribution obtained from the captured images. There is therefore the advantage that the amount of information used for judging similarity, and hence the processing load, can be reduced, and that using the frequency distribution of direction codes makes it possible to distinguish people from non-people accurately.

(Embodiment 1)
As shown in FIG. 1, the present embodiment comprises an imaging means 1 that images a desired field of view, a moving region extraction means 2 that extracts a region corresponding to an object that has moved by using a plurality of images captured by the imaging means 1 at different times, and a region analysis means 3 that evaluates whether the region extracted by the moving region extraction means 2 corresponds to a person. The moving region extraction means 2 and the region analysis means 3 are realized by having a computer execute an appropriate program.

The imaging means 1 comprises a camera 11 that outputs images captured at predetermined time intervals, and an A/D converter 12 that converts the analog information of the images captured by the camera 11 into digital information. A solid-state image sensor such as a CCD image sensor or a CMOS image sensor is used as the camera 11. When a CMOS image sensor with a digital output function is used for the camera 11, the A/D converter 12 is unnecessary. Color images could be used as the images captured by the camera 11, but this embodiment uses monochrome grayscale images. The time interval at which the imaging means 1 captures images may be set as appropriate within a range that allows the presence or absence of a moving object to be judged from the time-series images obtained at that interval. Since the purpose is not to obtain smooth moving images, it is not necessary to output images at 30 frames per second.

The moving region extraction means 2 comprises an image memory 21 that temporarily stores the grayscale images output from the imaging means 1 and the images obtained by applying the processing described below to the grayscale images. In this embodiment, a differential processing unit 22 processes each grayscale image to obtain differential values and direction codes, and a differential image, in which the pixel value of each pixel is its differential value, and a direction code image, in which the pixel value of each pixel is its direction code, are stored in the image memory 21 together with the grayscale image.

Various methods of obtaining the differential value have been proposed. Basically, for the pixels neighboring the pixel of interest (the 8-neighborhood is widely used), the value obtained by dividing the density difference in the vertical direction of the image by the density difference in the horizontal direction is used as the differential value. However, the purpose of generating a differential image from the grayscale image is to extract candidates for object contours, exploiting the fact that the differential value becomes large near the boundary between an object and the background because their density values differ. In this embodiment, therefore, weighted differentiation using a Sobel filter is performed to emphasize the contours.
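The weighted differentiation with a Sobel filter can be sketched as follows. The 3x3 Sobel kernels are standard; using |gx| + |gy| as the edge strength and leaving border pixels at zero are assumptions of this sketch rather than details given in the source, and the function names are hypothetical.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # horizontal density difference
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # vertical density difference


def sobel_at(img, y, x):
    """Return (gx, gy) at an interior pixel of a grayscale image (list of rows)."""
    gx = gy = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            v = img[y + dy][x + dx]
            gx += SOBEL_X[dy + 1][dx + 1] * v
            gy += SOBEL_Y[dy + 1][dx + 1] * v
    return gx, gy


def edge_strength(img):
    """Image of |gx| + |gy| (assumed edge-strength measure); borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx, gy = sobel_at(img, y, x)
            out[y][x] = abs(gx) + abs(gy)
    return out
```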

The direction code is a value that associates the differential value with the direction of the density change; integer codes are assigned to eight directions in 45-degree steps (here, the direction code is associated with the ordinary differential value obtained from the 8-neighborhood pixels). The direction code of each pixel is set to represent the direction orthogonal to the direction in which the density change is largest in the image. Consequently, the direction indicated by a pixel's direction code roughly coincides with the direction in which the contour extends (the three adjacent pixels lying within ±45 degrees of that direction are likely to be pixels on the object's contour).
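The Sobel differentiation and direction coding described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the use of the gradient magnitude as the differential value, the numbering of the codes (code k stands for k × 45 degrees), and the function names are all assumptions made for the example.

```python
import math

# 3x3 Sobel kernels for horizontal (x) and vertical (y) density change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_at(img, y, x):
    """Return (gx, gy) at an interior pixel of a 2-D list of gray levels."""
    gx = gy = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            v = img[y + dy][x + dx]
            gx += SOBEL_X[dy + 1][dx + 1] * v
            gy += SOBEL_Y[dy + 1][dx + 1] * v
    return gx, gy

def differential_value(gx, gy):
    """Gradient magnitude, used here as the differential value (assumption)."""
    return math.hypot(gx, gy)

def direction_code(gx, gy):
    """Quantize the direction orthogonal to the gradient into codes 0..7.

    Code k stands for k * 45 degrees; this numbering is hypothetical.
    """
    angle = math.degrees(math.atan2(gy, gx)) + 90.0  # orthogonal to max change
    return int(round(angle / 45.0)) % 8
```

For a vertical step edge such as `[[0, 0, 10]] * 3`, `sobel_at(img, 1, 1)` gives a purely horizontal gradient, and the resulting direction code points along the edge rather than across it.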

Because the differential image obtained by the differential processing unit 22 emphasizes high-contrast parts, binarizing it with a suitable threshold extracts candidates for the contours of the objects it contains. The differential processing unit 22 thins the extracted candidate regions to a width of one pixel to obtain edge candidates. Since edge candidates may be broken, the unit traces pixels along the edge candidates using the direction codes, links those candidates that can be regarded as object contours, and stores the resulting edge image in the image memory 21. The image memory 21 also serves as the work area for computing the edge image.

In this embodiment, the logic synthesis unit 23 extracts the edges of a moving object from three or five edge images. The three-image case is described here with reference to FIG. 2. Suppose that, as in FIGS. 2(a) to 2(c), three edge images E(T−ΔT), E(T), and E(T+ΔT) captured at times T−ΔT, T, and T+ΔT are supplied to the logic synthesis unit 23. In the illustrated example, each edge image contains a moving object Ob.

The logic synthesis unit 23 first computes the difference between each pair of edge images adjacent in the time series, that is, between E(T−ΔT) and E(T) and between E(T) and E(T+ΔT). Being a difference of edge images, each result is called a "difference edge image" below. Because an edge image is a binary image whose pixel values differ between edge and non-edge parts, the unit obtains the difference of a pair simply by taking the exclusive OR of the two pixels at each position. In each of the two difference edge images obtained from the illustrated edge images, the moving object Ob appears twice.

To extract the moving object Ob contained in the edge image E(T) at time T, the logic synthesis unit 23 then takes the logical AND of the two difference edge images pixel by pixel and outputs the result as a candidate image such as that in FIG. 2(d). Since the background has been almost entirely removed from both difference edge images, their logical AND yields their common part: the edge image E(T) with the background removed. Apart from the moving object Ob, this candidate image is expected to contain only noise.
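The three-frame logic above (exclusive OR of adjacent pairs, then logical AND of the two results) can be sketched as follows. The representation of a binary edge image as a list of rows of 0/1 integers and the function names are assumptions for illustration.

```python
def xor_images(a, b):
    """Pixel-wise exclusive OR of two binary images (difference edge image)."""
    return [[p ^ q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def and_images(a, b):
    """Pixel-wise logical AND of two binary images."""
    return [[p & q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def candidate_from_three(e_prev, e_cur, e_next):
    """Candidate image for time T from E(T-dT), E(T), E(T+dT)."""
    d1 = xor_images(e_prev, e_cur)   # object appears at its T-dT and T positions
    d2 = xor_images(e_cur, e_next)   # object appears at its T and T+dT positions
    return and_images(d1, d2)        # only the position at time T survives
```

In a one-row example with a static edge in column 0 and an object moving through columns 1, 2, and 3, only column 2 (the object's position at time T) remains set in the candidate image.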

Although this embodiment shows an example with three edge images E(T−ΔT), E(T), and E(T+ΔT), a candidate image can also be generated from four or more edge images. For example, with five edge images E(T−2ΔT), E(T−ΔT), E(T), E(T+ΔT), and E(T+2ΔT), the logical AND is first taken for each of the pairs E(T−2ΔT) and E(T+2ΔT), and E(T−ΔT) and E(T+ΔT), producing two background edge images from which the moving object Ob has been removed. Each of these two edge images is then inverted and ANDed with the edge image E(T). The first result contains the moving object Ob in E(T) together with the background hidden by Ob in E(T−2ΔT) and E(T+2ΔT); the second contains the moving object Ob in E(T) together with the background hidden by Ob in E(T−ΔT) and E(T+ΔT). Taking the logical AND of these two results extracts their common part, an edge image (candidate image) containing the edges of the moving object Ob in E(T). Candidate images can also be generated by combining four or more edge images in various other ways.
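The five-frame variant can be sketched in the same style. Binary images are again assumed to be lists of rows of 0/1 integers, and the function and variable names are hypothetical.

```python
def candidate_from_five(e_m2, e_m1, e_0, e_p1, e_p2):
    """Candidate image for time T from E(T-2dT) .. E(T+2dT)."""
    out = []
    for r_m2, r_m1, r_0, r_p1, r_p2 in zip(e_m2, e_m1, e_0, e_p1, e_p2):
        row = []
        for a, b, c, d, e in zip(r_m2, r_m1, r_0, r_p1, r_p2):
            bg_outer = a & e              # background from E(T-2dT) AND E(T+2dT)
            bg_inner = b & d              # background from E(T-dT) AND E(T+dT)
            first = (1 - bg_outer) & c    # inverted background AND E(T)
            second = (1 - bg_inner) & c
            row.append(first & second)    # common part of the two results
        out.append(row)
    return out
```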

The candidate image is obtained by logical operations on binary edge images rather than by differencing grayscale images, and the moving object Ob in the edge image at a specific time is extracted from three or more edge images rather than from two. As a result, the same moving object Ob does not appear at two places in the candidate image, and only the changed region containing Ob is extracted. Because the candidate image output by the logic synthesis unit 23 contains noise as well as the moving object Ob, as noted above, each region of connected pixels (connected region) is labeled. If a circumscribed rectangle D1 is set for each connected region as in FIG. 2(e) and the label is attached to the rectangle D1, the amount of data is smaller than when a label is assigned to every pixel.
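One way to obtain a circumscribed rectangle per connected region is a breadth-first flood fill, sketched below. The text does not state the connectivity rule; 8-connectivity is assumed here, and the function name and the (top, left, bottom, right) tuple layout are illustrative choices.

```python
from collections import deque

def bounding_boxes(img):
    """Return (top, left, bottom, right) for each 8-connected region of 1-pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit the 8 neighbours, clipped to the image bounds.
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

The index of each rectangle in the returned list can serve as the region's label, so only four integers per region need to be stored instead of a label per pixel.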

The candidate image of FIG. 2(d) output by the logic synthesis unit 23 of the moving region extraction means 2 and the direction code image stored in the image memory 21 are supplied to a frequency distribution creation unit 31 in the region analysis means 3. The region analysis means 3 evaluates whether each region extracted by the moving region extraction means 2 corresponds to a person or to a non-human disturbance. First, for each labeled region of the candidate image, the frequency distribution creation unit 31 looks up the direction codes of the pixels on the edges in the direction code image and generates a frequency distribution of direction codes for that region. The distribution is normalized by the total number of edge pixels considered. Rather than using all eight direction codes, codes that lie along the same line but point in opposite directions are merged, and the distribution is built over four codes: 0 and 180 degrees, 45 and 225 degrees, 90 and 270 degrees, and 135 and 315 degrees. FIG. 3 shows an example of a frequency distribution generated by the frequency distribution creation unit 31.
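The merging of eight direction codes into four bins and the normalization by the number of edge pixels can be sketched as below. The sketch assumes that code k (0 to 7) stands for k × 45 degrees, so that codes k and k + 4 are opposite directions; that numbering and the function name are not given in the text.

```python
def direction_histogram(codes):
    """Normalized 4-bin distribution from the 8-direction codes of one region.

    codes: iterable of integers 0..7, one per edge pixel in the region.
    """
    counts = [0, 0, 0, 0]
    for c in codes:
        counts[c % 4] += 1      # k and k + 4 point in opposite directions
    total = sum(counts)
    if total == 0:
        return [0.0] * 4        # empty region: no edge pixels to normalize by
    return [n / total for n in counts]
```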

In the region analysis means 3, the frequency distribution of each region generated by the frequency distribution creation unit 31 is input to a disturbance removal unit 32, which judges from the shape of the distribution whether the region is a disturbance. The disturbance removal unit 32 holds, for each direction code, a normal range defined by an upper and a lower limit on the frequency expected when a region corresponds to a person. If even one direction-code frequency of a region falls outside its normal range, the region is regarded as a non-human disturbance: a frequency outside the normal range means that the moving object in the region is skewed toward a particular direction and is therefore treated as noise rather than a person. This relies on an empirical rule. Edges belonging to a person contain more curved than straight parts and have complex shapes, so every direction code value occurs fairly often among the edge pixels, whereas the edges of noise caused by shadows or by flicker in the camera 11 often show a distribution biased toward a particular direction. In short, the disturbance removal unit 32 uses the frequency distribution of direction codes of the edge pixels in each region as a feature, judges whether the region corresponds to a person or to non-human noise, and removes regions judged to be noise without passing them to the distribution comparison processing unit 33 in the next stage.
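The normal-range test can be sketched as a per-bin bounds check. The text gives no numeric limits, so the ranges in the usage note are placeholders chosen only for illustration, and the function name is hypothetical.

```python
def is_disturbance(hist, normal_ranges):
    """True if any direction-code frequency lies outside its normal range.

    hist: normalized frequencies, one per direction code.
    normal_ranges: (lower, upper) pair per direction code.
    """
    return any(not (lo <= h <= hi) for h, (lo, hi) in zip(hist, normal_ranges))
```

With placeholder ranges of `(0.1, 0.5)` for every code, a balanced distribution such as `[0.25, 0.25, 0.25, 0.25]` passes, while a distribution dominated by one direction such as `[0.9, 0.05, 0.03, 0.02]` is rejected as a disturbance.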

Regions that the disturbance removal unit 32 has not judged to be noise are passed to the distribution comparison processing unit 33, which evaluates whether each region contains a person. The unit uses a reference data storage unit 34 in which a frequency distribution of edge direction codes for a person is registered in advance as reference data. It compares the frequency distribution of each remaining region with the reference distribution and evaluates their similarity by the following calculation.

Let H1i (i = 1, 2, 3, 4) be the frequency of each direction code in the distribution of a region that the disturbance removal unit 32 has not judged to be noise, and let H2i (i = 1, 2, 3, 4) be the frequency of each direction code in the distribution stored in the reference data storage unit 34. The similarity evaluation value e² is the sum over the four direction codes of the squared frequency differences, given by Equation 1:

$$e^2 = \sum_{i=1}^{4} \left(H1_i - H2_i\right)^2 \quad \text{(Equation 1)}$$

The evaluation value e² obtained from Equation 1 is compared with a suitably chosen threshold. If e² is at or below the threshold, the similarity is high and the region obtained from the candidate image is judged to correspond to a person; if e² exceeds the threshold, the similarity is low and the region is judged to be a non-human disturbance.
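The evaluation and threshold test can be sketched as follows. The description characterizes e² as the sum of the squared differences between the per-code frequencies (the squared distance between the two distributions), and the sketch follows that reading. The function names and the threshold value used in the usage note are assumptions.

```python
def evaluation_value(h1, h2):
    """e^2: sum of squared differences between two frequency distributions."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def is_person(h1, reference, threshold):
    """A region is treated as a person when e^2 does not exceed the threshold."""
    return evaluation_value(h1, reference) <= threshold
```

For instance, with a hypothetical reference of `[0.25, 0.25, 0.25, 0.25]` and a threshold of 0.01, the distribution `[0.3, 0.2, 0.25, 0.25]` gives an e² of about 0.005 and is accepted, whereas `[1, 0, 0, 0]` gives 0.75 and is rejected.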

As described above, the region of the moving object Ob is separated from the background by logical operations between frames of the time-series edge images. The frequency distribution of the direction codes of the edges in that region is then used as a feature, and the evaluation value e² against the reference data, which is the direction-code frequency distribution for a person, is evaluated to extract regions likely to be a person. Whether the moving object Ob is a person can thus be determined with relatively simple computation.

Compare this with a conventional configuration that uses the aspect ratio or area of the circumscribed rectangle. When several people form a single merged region, the conventional configuration may judge that the merged region does not correspond to a person. In the present embodiment, even if a region output by the disturbance removal unit 32 contains several people, the normalized frequency distribution of the direction codes of the edge pixels shows the same tendency as for one person, so a region can be evaluated as human or non-human regardless of how many people it contains.

Furthermore, this embodiment uses as the similarity evaluation value e² the sum of the squared differences between the frequencies of each direction code (that is, the square of the distance between the two distributions being compared). Compared with template matching, the reference data is smaller and the comparison requires less computation. In template matching, the correlation is high for an object that does not deform and whose shape matches the template, but for an object such as a person, whose shape in the image changes, it is difficult to match the template and obtain a high correlation. Template matching also needs either scaling, to match the size of the object in the image to the size of the template, or several templates of the same shape in different sizes. Because this embodiment uses the frequency distribution of direction codes, the distribution changes little even when the shape of the object in the image changes, and the correlation with the reference data is easy to evaluate.

Although FIG. 1 shows the image memory 21 in the moving region extraction means 2, its location is not particularly restricted. It may be provided in the region analysis means 3, in both the moving region extraction means 2 and the region analysis means 3, or separately on its own.

In the configuration of FIG. 1, the direction code is computed by the differential processing unit 22 in the moving region extraction means 2 rather than by the region analysis means 3, which actually uses it. The reason is that the direction code is derived from the differential value, so computing it while the differential processing unit 22 computes the differential value is more efficient. Nevertheless, the direction code may instead be computed in the region analysis means 3, separately from the differentiation used for edge extraction.

In the example above, the reference data storage unit 34 holds one set of reference data. However, the frequency distribution that should serve as reference data varies with the person's orientation relative to the camera 11 and with the person's position in its field of view. Several frequency distributions may therefore be stored as reference data in the reference data storage unit 34, and the similarity between the distribution of each region output by the disturbance removal unit 32 and each set of reference data may be computed. In that case, a region is regarded as corresponding to a person when the evaluation value e² for any one set of reference data is at or below the threshold.
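Matching against several reference distributions can be sketched as an "any match" test. The reference values in the usage note are invented for illustration only, and the function name is hypothetical.

```python
def matches_any_reference(hist, references, threshold):
    """True if e^2 against at least one reference distribution is <= threshold."""
    for ref in references:
        e2 = sum((a - b) ** 2 for a, b in zip(hist, ref))
        if e2 <= threshold:
            return True
    return False
```

Given two made-up references `[0.4, 0.2, 0.2, 0.2]` and `[0.1, 0.3, 0.3, 0.3]` and a threshold of 0.01, a region with distribution `[0.12, 0.3, 0.29, 0.29]` is accepted because it lies close to the second reference, even though it is far from the first.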

The example above uses the frequency distribution over four direction codes as the feature of each moving-object region obtained from the edge image. More simply, only the two direction codes for the horizontal and vertical directions of the image may be used. In that case, the ratio of the two direction-code frequencies is computed for each region, and the size of the ratio determines whether the region corresponds to a person. For example, if the frequencies of the two direction codes are H11 and H12, H11 is compared with k·H12 (k: a scaling constant), and the region is judged to correspond to a person when H11 < k·H12. When each region output by the disturbance removal unit 32 is evaluated in this way, the distribution comparison processing unit 33 and the reference data storage unit 34 are unnecessary.

(Embodiment 2)
This embodiment describes processing that associates labeled regions across the time-series edge images and removes regions that cannot be associated as sporadic disturbances (noise), such as changes in light. Most of the disturbances that cause problems in the field of view are incoming or changing light, and in the edge images the shape of a region produced by this kind of disturbance changes greatly in a short time compared with the shape of a region corresponding to a person. This embodiment uses that property to remove disturbances.

Suppose the time-series edge images E(1), E(2), E(3), and E(4) used by the logic synthesis unit 23 to extract the region of the moving object Ob are given as in FIG. 4. In this example, the edge images contain regions P11 to P14 corresponding to a person and regions N12 and N13 corresponding to a disturbance caused by light reflecting off a window or the like. The regions P11 to P14 need to be associated with one another, whereas the regions N12 and N13 need to be removed without being associated.

In this embodiment, the edge images E(1), E(2), E(3), and E(4) are therefore taken pairwise in time order (each pair of images adjacent in the time series), and the direction-code frequency distributions of their regions are compared. First, the frequency distribution of the direction codes of region P11 is computed, as are those of regions P12 and N12, and the similarities are evaluated in the same way as the evaluation value e² used for comparison with the reference data in Embodiment 1. The distances between region P11 in edge image E(1) and regions P12 and N12 in E(2) may also be computed and added as a constraint on association. That is, a limit is placed on the distance an associated moving object may travel between the adjacent edge images E(1) and E(2), and two regions are judged not to be associated when their distance falls outside that range. With this constraint, even if E(1) and E(2) each contain several regions, fewer combinations of regions need their distributions compared, which speeds up processing. This processing allows region P11 to be associated with region P12.
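The frame-to-frame association with an optional distance constraint can be sketched as below. This is an illustrative reading, not the patented procedure: the region representation (a dict with a `hist` list and a `center` point), the parameter names, the use of a representative point for the distance, and the choice of the most similar admissible region are all assumptions, and the sketch does not enforce a one-to-one pairing.

```python
import math

def e2(h1, h2):
    """Sum of squared differences between two frequency distributions."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def associate(prev_regions, cur_regions, max_e2, max_dist=None):
    """Link each region of the earlier image to its most similar later region.

    Returns {index in prev_regions: index in cur_regions}. Regions with no
    admissible partner are left out of the result.
    """
    links = {}
    for i, p in enumerate(prev_regions):
        best = None
        for j, c in enumerate(cur_regions):
            # Distance constraint: skip pairs that moved implausibly far.
            if max_dist is not None and math.dist(p["center"], c["center"]) > max_dist:
                continue
            score = e2(p["hist"], c["hist"])
            if score <= max_e2 and (best is None or score < best[0]):
                best = (score, j)
        if best is not None:
            links[i] = best[1]
    return links
```

Applying the distance test before the histogram comparison is what reduces the number of comparisons when each image contains several regions.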

Next, when the same processing is applied to edge images E(2) and E(3), regions N12 and N13 are not associated because their direction-code frequency distributions differ greatly, whereas regions P12 and P13 are associated. Applying the same processing to edge images E(3) and E(4) associates region P13 with region P14.

In this way, regions P11 to P14 are associated with one another across the edge images E(1), E(2), E(3), and E(4), while regions such as N12 and N13, which behave like sporadic noise, are unlikely to be associated and can therefore be removed as disturbances. In other words, regions belonging to the same object are associated by evaluating the similarity of their frequency distributions between edge images adjacent in the time series. When comparing an associated region with the reference data yields a region that can be judged to correspond to a person, that region is judged to correspond to a person. Associating regions between images thus makes it more likely that noise can be removed.

For example, the direction-code frequency distribution of a region whose shape is not constant, such as light, may resemble that of a region occupied by a person. Performing the association described above on the time-series edge images makes it possible to remove this kind of disturbance.

Furthermore, it is desirable that each region associated across the time-series edge images be checked against the reference data in the distribution comparison processing unit 33, that the number of edge images in which a region corresponding to a person is obtained consecutively be counted, and that the region be judged to correspond to a person when this count within a predetermined time is at or above a specified threshold. If the threshold is 3, for instance, then in the example of FIG. 4 the regions P11 to P14 corresponding to a person are present consecutively in all four edge images E(1), E(2), E(3), and E(4). The condition is therefore satisfied, and the regions P11 to P13 in the three edge images E(1), E(2), and E(3) can be judged to correspond to a person.

In the example of FIG. 5, among five edge images E(1) to E(5) there are regions N11 and N12 that are associated across the two edge images E(1) and E(2), and regions P11 to P15 that are associated across all five. Consider the first two edge images, E(1) and E(2). One conventional technique associates the regions whose representative points (such as centroids) are closest between adjacent edge images. In the illustrated example this would associate region P11 of edge image E(1) with whichever of regions P12 and N12 of edge image E(2) is nearer. Region P12 ought to be associated with region P11, yet the presence of noise means that region N12 may be associated instead. In the present embodiment, by contrast, region P11 is associated with region P12 or region N12 according to the similarity of the frequency distributions of their edge direction values, independently of distance. Regions with different edge characteristics are therefore not associated, and region P11 is correctly associated with region P12. Moreover, because the associated regions P11 and P12 are compared with the reference data to judge whether they are a person, noise is removed and the region corresponding to a person can be detected reliably.

Furthermore, as described above, it is desirable to confirm that a region corresponds to a person when, within the time in which four edge images are obtained (the predetermined time above), the number of edge images in which mutually associated regions that correspond to a person are obtained consecutively is three (the threshold above) or more. In the illustrated example, regions P11 to P15 are judged to correspond to a person, whereas regions N12 and N13 are obtained consecutively in only two edge images and are therefore removed as disturbances.
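The consecutive-frame condition can be sketched by measuring the longest run of frames in which an associated, person-like region was found. The boolean-list representation of a region's track and the function names are assumptions; the window of four frames and threshold of three come from the example in the text.

```python
def longest_run(detected):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for d in detected:
        run = run + 1 if d else 0
        best = max(best, run)
    return best

def confirmed_by_run(detected, window=4, threshold=3):
    """True if some window of `window` frames contains a run >= threshold."""
    return any(
        longest_run(detected[i:i + window]) >= threshold
        for i in range(max(1, len(detected) - window + 1))
    )
```

A track present in five consecutive frames satisfies the condition, while a track present in only two consecutive frames does not and would be discarded as a disturbance.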

When regions are tracked and associated across a plurality of edge images, the region corresponding to a person may fail to be associated because the person is hidden behind a pillar or a similar obstruction, or the person may be removed together with the background because the person does not move at all. The example in FIG. 6 shows a state in which no region corresponding to a person could be detected in two edge images, E(4) and E(7), out of seven edge images E(1) to E(7).

This case can be handled by the same processing as the case in which a region of non-constant shape, such as light, arises. That is, each region associated across the time-series edge images is collated with the reference data in the distribution comparison processing unit 33, the number of edge images in which a region corresponding to a person was obtained is counted, and the region is judged to correspond to a person when this number within the predetermined time is equal to or greater than a prescribed threshold.

For example, in FIG. 6, among the seven edge images E(1) to E(7), associated regions P11 to P13, P15, and P16 exist in edge images E(1) to E(3), E(5), and E(6), but no region to be associated exists in the two edge images E(4) and E(7). However, if a region is judged to correspond to a person when, within the time in which five edge images are obtained (the predetermined time), the number of edge images containing mutually associated regions that correspond to a person is two (the threshold) or more, then regions P11 to P13 can be judged to correspond to a person, and so can regions P15 and P16. Because a region is judged to correspond to a person when it is found, using the reference data, in two or more of five edge images, the presence of a person can be detected even if three of the five edge images yield no region corresponding to a person in the similarity judgment against the reference data. Other configurations and operations are the same as in Embodiment 1.
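The two confirmation rules described for FIG. 5 (three or more consecutive frames within four) and FIG. 6 (two or more frames within five, not necessarily consecutive) might be sketched as below. The sliding-window formulation and the function names are assumptions made for illustration; each flag states whether the tracked region was judged to be a person in that edge image.

```python
from typing import Sequence


def longest_run(flags: Sequence[bool]) -> int:
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def confirmed_consecutive(flags: Sequence[bool],
                          window: int = 4, threshold: int = 3) -> bool:
    """FIG. 5 rule: some window holds >= threshold consecutive person frames."""
    starts = range(max(1, len(flags) - window + 1))
    return any(longest_run(flags[i:i + window]) >= threshold for i in starts)


def confirmed_by_count(flags: Sequence[bool],
                       window: int = 5, threshold: int = 2) -> bool:
    """FIG. 6 rule: some window holds >= threshold person frames, gaps allowed."""
    starts = range(max(1, len(flags) - window + 1))
    return any(sum(flags[i:i + window]) >= threshold for i in starts)


# FIG. 5: the person is seen in E(1)..E(5); the noise only in two frames.
person_fig5 = [True, True, True, True, True]
noise_fig5 = [True, True, False, False, False]

# FIG. 6: the person is missed in E(4) and E(7).
person_fig6 = [True, True, True, False, True, True, False]
```

With these inputs the noise track of FIG. 5 is rejected by the consecutive rule, while the interrupted person track of FIG. 6 is still accepted by the count rule.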

(Embodiment 3)
In the present embodiment, conditions relating to the camera 11 are set in addition to the processing described above, which makes it easier to extract the region corresponding to a person.

Assume that, as shown in FIG. 7, the camera 11 is installed at height h with the depression angle of its optical axis equal to θ, and let the angle of view (viewing angle) of the camera 11 be φ. In an orthogonal coordinate system whose origin is the point on the floor surface F directly below the center of the optical system of the camera 11, the center of the optical system is at (0, h), and the limit positions L1 and L2 of the field of view on the floor surface F are L1 (h/tan(θ + φ/2), 0) and L2 (h/tan(θ − φ/2), 0), respectively. If the floor surface F between the limit positions L1 and L2 is divided into four equal sections and the angle that each section subtends at the camera 11 is obtained, the angle is large for sections near the camera 11 and becomes smaller with distance from the camera 11. That is, even if the size of an imaged object does not change, its apparent size in the image is larger the closer it is to the camera.

The four sections on the floor surface F shown in FIG. 7 have equal length, h{1/tan(θ − φ/2) − 1/tan(θ + φ/2)}/4. Let (a, 0) be the other end point of the section that has the limit position L1 as one end point, and let (b, 0) be the other end point of the section that has the limit position L2 as one end point. The lengths Sa and Sb of the arcs subtended by these two sections then satisfy the following relationship.
Sa : Sb = {(θ + φ/2) − tan⁻¹(h/a)} : {tan⁻¹(h/b) − (θ − φ/2)}
Here the camera 11 is assumed to look down on the field of view from above, so that θ + φ/2 ≤ 90° holds. The ratio of Sa to Sb is the ratio of the lengths of the arcs subtended by the sections, but it can be approximated by the ratio of the lengths obtained by projecting the sections of the floor surface F onto the imaging surface of the camera 11 (FIG. 7 is drawn in this approximated state), so the field of view of the camera 11 is divided according to the ratio of the sections on the imaging surface obtained in this way. In other words, the length of each floor section projected onto the imaging surface corresponds to the size that the image of a person standing on the floor surface F occupies on the imaging surface, so the field of view of the camera 11 is divided into sections according to the apparent size of a person on the imaging surface. For example, the field of view of the camera 11 is divided into four in the vertical direction, in the ratio of the sections obtained as described above, so that the regions formed in the lower part of the field of view are larger than those formed in the upper part. The vertical length of each band after division is also applied in the horizontal direction to form square regions. FIG. 8(a) shows an example of the rectangular regions D2 formed by dividing the field of view in this way. In the illustrated example, regions of constant width are set in the horizontal direction regardless of the distance from the camera 11; if the same division method as in the vertical direction is also applied horizontally, the regions in the horizontal center become wider than those at the two horizontal ends.
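As a rough numerical illustration of this division, the sketch below computes the limit positions L1 and L2, the angle subtended by each of the equal floor sections, and band heights proportional to those angles. It follows the arc-length approximation described above rather than an exact perspective projection, and the function names and the example values (h = 3 m, θ = 45°, φ = 40°, 480 rows) are assumptions.

```python
import math
from typing import List, Tuple


def floor_limits(h: float, theta: float, phi: float) -> Tuple[float, float]:
    """Floor positions L1 (near) and L2 (far) of the field of view, in units of h."""
    near = h / math.tan(theta + phi / 2)
    far = h / math.tan(theta - phi / 2)
    return near, far


def section_angles(h: float, theta: float, phi: float, n: int = 4) -> List[float]:
    """Angle subtended at the camera by each of n equal floor sections.

    Sections are listed from the one nearest the camera to the farthest.
    """
    near, far = floor_limits(h, theta, phi)
    edges = [near + (far - near) * i / n for i in range(n + 1)]
    # Depression angle from the camera at (0, h) to each section boundary.
    depression = [math.atan2(h, x) for x in edges]
    return [depression[i] - depression[i + 1] for i in range(n)]


def band_heights(h: float, theta: float, phi: float,
                 image_height: float, n: int = 4) -> List[float]:
    """Split the image height in proportion to the subtended angles."""
    angles = section_angles(h, theta, phi, n)
    total = sum(angles)  # telescopes to the angle of view phi
    return [image_height * a / total for a in angles]


# Example values (assumed): camera 3 m high, 45 deg depression, 40 deg view angle.
h, theta, phi = 3.0, math.radians(45), math.radians(40)
angles = section_angles(h, theta, phi)
bands = band_heights(h, theta, phi, image_height=480)
```

The first entry of `bands` corresponds to the floor section nearest the camera, which appears at the bottom of the image and receives the tallest band.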

In the present embodiment, the field of view is divided into a plurality of regions D2 as described above, and the moving region extraction means 2 allows monitoring to be enabled or disabled for each region D2. In FIG. 8(b), hatched regions D2 are invalid regions in which monitoring is not performed, and unhatched regions D2 are valid regions. To designate the regions D2, a storage table is provided in the moving region extraction means 2; the table defines the pixel range of each region D2, so that whether to perform the processing from the moving region extraction means 2 onward can be selected for each region D2. To specify whether each region D2 is monitored, that is, whether it is valid or invalid, a region D2 is designated either by the code assigned to it or by displaying the regions on a screen and pointing at the region D2 with a pointing device, and the designated region D2 is then set to invalid by a switch operation. The means realized by a program having this function of designating regions D2 as valid or invalid is called the monitoring region setting means.

The regions D2 may be determined geometrically by the simple method described above, or they may be divided rigorously so that their dimensions in real space are equal, by running a simulation that takes into account factors such as the aberration of the optical system.

Determining the regions D2 geometrically as described above requires the depression angle of the camera 11. An angle scale may therefore be provided on the tilt mechanism used to adjust the orientation of the camera 11, and the value read from the scale when the orientation is adjusted may be entered manually as the depression angle. To automate entry of the depression angle, an angle sensor such as a rotary encoder may be placed in the tilt mechanism and its output supplied as the depression angle.

The height of the camera 11 is either entered manually using numeric keys or a thumbwheel switch, or supplied automatically by using a tape-measure-type contact distance sensor or an optical non-contact distance sensor.

In the present embodiment, the monitoring region setting means divides the field of view of the camera 11 into a plurality of regions D2 whose dimensions on the floor surface F are equal and specifies whether each region D2 is monitored. If only the regions D2 where disturbance is likely, such as a window, are made invalid, noise can be removed without affecting the regions corresponding to a person, and the region corresponding to a person can be detected accurately.

If reference data are prepared for each depression angle of the camera 11, the reference data appropriate to the current depression angle can be used, which improves the accuracy of detecting the region corresponding to a person. Other configurations and operations are the same as in Embodiment 1.

(Embodiment 4)
In Embodiment 3 the field of view of the camera 11 is divided into a plurality of rectangular regions D2. In the present embodiment, the field of view is instead separated, according to the objects present in it, into a region in which people are monitored (valid region) and a region in which they are not (invalid region). In other words, a monitoring region of arbitrary shape is defined within the field of view of the camera 11.

As described in Embodiment 1, the moving region extraction means 2 and the region analysis means 3 are realized by a computer, and the function of defining the monitoring region is likewise realized by a program executed on the computer. The human body detection device of the present embodiment therefore has a monitoring mode, in which moving objects (people) are monitored, and a region setting mode, in which the monitoring region is defined. Switching between the two modes and instructing the start of setting in the region setting mode are performed with an input means such as a switch.

When the computer has a display device and a pointing device such as a mouse can be used, the monitoring region is set on the screen of the display device. Suppose that, as in FIG. 9(a), the field of view imaged by the camera 11 contains a part where disturbance is likely, such as a window W, or a part where people need not be detected. The display device shows the image Ib (a grayscale or color image) captured by the camera 11, as in FIG. 9(a), so the user sets the valid region D3 and the invalid region D4 while checking positions in the captured image, as in FIG. 9(b). That is, the image Ib captured by the camera 11 is displayed on the display device, and the user sets the boundary line Lb between the valid region D3 and the invalid region D4 with the pointing device. The boundary line Lb may be approximated by a polyline through multiple points T1 to T9; in the example of FIG. 9(b), the inside of the boundary line Lb is the valid region D3 and the outside, where the window W lies, is the invalid region D4. The means realized by a program having this function of extracting the valid region D3 as the monitoring region is called the monitoring region setting means.

As described above, the monitoring region setting means designates, in the image Ib, the valid region D3 to be monitored and the invalid region D4 not to be monitored. Regions within the field of view of the camera 11 where disturbance is likely can therefore be excluded as the invalid region D4, and as a result the region corresponding to a person can be detected accurately.
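One way to apply a polyline boundary such as Lb is to test whether a representative point of each candidate region lies inside the polygon formed by its vertices. The sketch below uses the standard ray-casting test; the use of a centroid as the representative point, the function name, and the coordinates are assumptions, not details taken from this description.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def inside(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies inside the closed polygon.

    The polygon is given by its vertices in order (e.g. T1..T9); the last
    vertex is implicitly joined to the first.
    """
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


# Hypothetical boundary Lb (a simple rectangle in pixel coordinates).
lb = [(10, 10), (90, 10), (90, 60), (10, 60)]

# Hypothetical region centroids: one inside D3, one in D4 near the window.
centroids = [(50, 30), (5, 5)]
monitored = [c for c in centroids if inside(c, lb)]
```

Regions whose representative point falls outside the boundary would simply be skipped by the later analysis stages.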

Instead of designating the boundary line Lb between the valid region D3 and the invalid region D4 on the screen of the display device in the region setting mode, the monitoring region setting means may separate the valid region D3 from the invalid region D4 by having a person actually walk along the boundary line Lb within the field of view of the camera 11. This method requires tracking the positions through which the person moves in the field of view of the camera 11, so the person carries a light source and the position of the light source is tracked. The light source emits light containing a specific wavelength so that it can be distinguished from other ambient light, and the camera 11 uses a filter to image only light of that wavelength selectively.

As stated above, a filter that transmits only light of the specific wavelength is fitted to the camera 11 so that the camera 11 is sensitive only to light of that wavelength. The filter may be fitted to the camera 11 by hand, but if the camera 11 is provided with a mechanism for attaching and detaching the filter so that fitting is automated, the filter can be attached and detached without working at height. When near-infrared light is used as the light of the specific wavelength, a CCD or CMOS image sensor sensitive to near-infrared light is used as the image sensor of the camera 11, and a filter that blocks visible light and transmits near-infrared light is fitted to the camera 11.

The light source carried by the person moving along the boundary line Lb may be a dedicated light source that emits near-infrared light, or a remote-control transmitter that uses near-infrared light as its transmission medium may be substituted. The remote-control transmitter used should have a blinking period shorter than the time interval between frames of the images captured by the camera 11.

The procedure for setting the boundary line Lb between the valid region D3 and the invalid region D4 can be summarized as follows. First, the device is switched to the region setting mode; the filter is then either fitted to the camera 11 by hand or fitted automatically as a result of switching to the region setting mode. After the filter is fitted, the computer is instructed when to start and end the work of setting the boundary line Lb. When a computer with a keyboard or pointing device is used, these instructions may be given with the keyboard or pointing device; when a dedicated device is built around a microcomputer, push-button switches or the like for instructing the start and end of the setting work may be provided.

When the setting work starts, the camera 11 images the field of view at predetermined time intervals and each captured image Ib is stored in the image memory 21. Alternatively, the camera 11 may capture images at the same time interval as in the monitoring mode while the light source is lit at each desired position, with storage of the image Ib linked to lighting of the light source so that only the images Ib captured while the light source is lit are stored in the image memory 21.

Meanwhile, the person carrying the light source moves along the boundary line Lb between the valid region D3 and the invalid region D4. If the image Ib captured by the camera 11 is displayed where the person carrying the light source can see it, so that the position can be checked, a more appropriate boundary line Lb can be set. In particular, when the light source is held in the hand, it shifts forward, backward, left, and right relative to the person's feet, and its height above the floor also varies, so even if the person walks along the line assumed to be the boundary line Lb, a deviation from the boundary line Lb intended within the field of view of the camera 11 may arise. Adjusting the position of the light source while checking the image Ib therefore makes it possible to set the desired boundary line Lb.

When the end of the work is instructed after all the images Ib in which the light source lies on the boundary line Lb (here, a plurality of images at positions corresponding to the points T1 to T9 shown in FIG. 9(b)) have been stored, the position at which the density value (luminance value) is maximum is extracted from each stored image Ib. Because the luminance is considered to be highest at the position of the light source in each image Ib, this extracts the position of the light source in each image Ib. To avoid extracting noise other than the light source, a suitable image filter may be used in addition to finding the position of maximum density value. Connecting the maximum-density positions extracted from the images Ib in time order, and then joining the positions obtained from the first and last images Ib of the time series, forms a closed region, which can be designated as either the valid region D3 or the invalid region D4 (the valid region D3 in the example). The valid region D3 set in this way can be fine-tuned using the image displayed on the display device.
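The extraction of boundary vertices from the stored images could look roughly like the sketch below, which takes the brightest pixel of each frame in time order. Images are represented as plain nested lists of luminance values; the helper names and the synthetic frames are assumptions, and no noise filter is included.

```python
from typing import List, Tuple

Image = List[List[int]]


def brightest_pixel(image: Image) -> Tuple[int, int]:
    """Return (x, y) of the pixel with the maximum luminance value."""
    best_xy, best_val = (0, 0), image[0][0]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > best_val:
                best_xy, best_val = (x, y), value
    return best_xy


def boundary_from_frames(frames: List[Image]) -> List[Tuple[int, int]]:
    """Vertices of the boundary, one per stored frame, in time order.

    Joining the last vertex back to the first closes the region.
    """
    return [brightest_pixel(frame) for frame in frames]


def make_frame(width: int, height: int, spot: Tuple[int, int],
               level: int = 255) -> Image:
    """Synthetic test frame: dim background with one bright light-source pixel."""
    image = [[10] * width for _ in range(height)]
    x, y = spot
    image[y][x] = level
    return image


# Hypothetical light-source positions visited in order.
spots = [(1, 1), (6, 1), (6, 4), (1, 4)]
frames = [make_frame(8, 6, s) for s in spots]
vertices = boundary_from_frames(frames)
```

In practice a single brightest pixel is sensitive to noise, which is why the description suggests combining it with a suitable image filter.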

In the above example only one closed region is formed within the field of view of the camera 11, but a plurality of closed regions may be set. In that case it suffices to allow the end of each closed region to be instructed as it is set, and the end of region setting to be instructed after all closed regions have been set. Other configurations and operations are the same as in Embodiment 1.

(Embodiment 5)
The present embodiment illustrates a human body detection device whose detection accuracy is improved by using, in addition to the images captured by the camera 11, a heat-ray-type human presence sensor that detects heat rays radiated from a person.

As shown in FIG. 10, a heat sensing means 4 is provided that includes a pyroelectric infrared sensor (hereinafter abbreviated to "pyroelectric sensor") 41 as the human presence sensor. The heat sensing means 4 is configured to generate a pulse signal of fixed duration when the pyroelectric sensor 41 detects a thermal change. The pyroelectric sensor 41 is a differential sensor that outputs a voltage signal corresponding to the amount of change in the received heat rays; setting an appropriate threshold on this voltage signal allows the heat sensing means 4 to output the pulse signal. The range over which the pyroelectric sensor 41 receives heat rays (the light receiving region) is set by combining the sensor with optical elements such as lenses and mirrors. The optical elements also introduce sensitivity unevenness within the light receiving region, so that movement of a heat source (such as a person) within the region changes the amount of heat rays received by the pyroelectric sensor 41.

The light receiving region of the pyroelectric sensor 41 is made to coincide approximately with the field of view (or monitoring region) of the camera 11. When changes such as those in FIG. 11(b) occur within the field of view of the camera 11 (the light receiving region of the pyroelectric sensor 41), the heat sensing means 4 outputs pulse signals such as those in FIG. 11(a). In FIG. 11, state (イ) shows a case in which no person is in the field of view and the heat sensing means 4 outputs a single pulse signal because of a change in the amount of heat rays caused near the window by a sudden change in sunlight, swaying of the curtain R, or the like. State (ロ) in FIG. 11 shows a person M having appeared in the field of view, and states (ハ) to (ホ) show the person moving or making small movements (such as moving the hands or feet). The pulse signal of FIG. 11(a) occurs repeatedly while the person M is in the field of view, as in states (ロ) to (ホ). In other words, a change in the amount of heat rays due to disturbance tends to occur as a single isolated event, whereas changes due to detection of a person tend to occur intermittently and repeatedly.

A thermal change analysis means 5 that monitors the pulse signals output from the heat sensing means 4 is therefore provided. A time-series thermal change signal analysis unit 51 in the thermal change analysis means 5 counts the number of pulse signals generated in each fixed detection period Td1 (see FIG. 11(a)) and judges that a person is present in the field of view when that number is equal to or greater than a prescribed threshold. That is, the time-series thermal change signal analysis unit 51 serves as the analysis means that determines the presence or absence of a person from the output of the human presence sensor formed by the pyroelectric sensor 41 in the heat sensing means 4.

Furthermore, when the thermal change analysis means 5 judges that a person is present in the field of view because the number of pulse signals is equal to or greater than the prescribed threshold, it checks for pulse signals during a further fixed confirmation period Td2 following the judgment time TS (the detection period Td1 and the confirmation period Td2 may be equal). If one or more pulse signals are confirmed in the confirmation period Td2, the judgment that a person is present in the field of view is maintained; if no pulse signal is detected in the confirmation period Td2, the judgment is cancelled.
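The pulse-counting logic with a detection period Td1 and a confirmation period Td2 can be sketched as follows. Several details are assumptions made for illustration: detection periods are taken as fixed, non-overlapping windows starting at time 0, the confirmation period is repeated for as long as pulses keep arriving, and presence is held until the end of the first confirmation period that contains no pulse.

```python
from typing import List, Sequence, Tuple


def presence_intervals(pulses: Sequence[float], td1: float, td2: float,
                       threshold: int, t_end: float) -> List[Tuple[float, float]]:
    """Return (start, end) intervals during which a person is judged present.

    pulses    : pulse timestamps from the heat sensing means
    td1       : detection period Td1
    td2       : confirmation period Td2
    threshold : minimum pulse count within Td1 to judge presence
    t_end     : end of the observation
    """
    pulses = sorted(pulses)
    intervals = []
    t = 0.0
    while t + td1 <= t_end:
        count = sum(1 for p in pulses if t <= p < t + td1)
        if count < threshold:
            t += td1
            continue
        start = end = t + td1  # judgment time TS
        # Maintain the judgment while each confirmation period has a pulse.
        while end + td2 <= t_end and any(end <= p < end + td2 for p in pulses):
            end += td2
        # Presence is held through the confirmation period that found no pulse.
        end = min(end + td2, t_end)
        intervals.append((start, end))
        t = end
    return intervals


# A single disturbance pulse versus repeated pulses from a person (made-up times).
disturbance = presence_intervals([3], td1=10, td2=10, threshold=3, t_end=60)
person = presence_intervals([12, 15, 18, 24, 33, 55],
                            td1=10, td2=10, threshold=3, t_end=60)
```

With these inputs the isolated pulse never reaches the threshold, while the repeated pulses produce one presence interval that ends after the first empty confirmation period.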

The judgment of the thermal change analysis means 5 is input, together with the judgment of the region analysis means 3, to the integrated calculation unit 6, which decides from the two judgments whether a person is present in the field of view (or monitoring region). The simplest method is for the integrated calculation unit 6 to take the logical AND of the two judgments. If the integrated calculation unit 6 outputs the result that a person is present in the field of view (or monitoring region) only when both the region analysis means 3 and the thermal change analysis means 5 judge that a person is present, the result is more reliable than when presence is judged by one of them alone. That is, when the integrated calculation unit 6 judges that a person is present in the field of view (or monitoring region), the position of the person in the particular image can be determined from the output of the region analysis means 3. When intrusion monitoring is performed using the images captured by the camera 11, the intrusion of a person into the field of view (or monitoring region) can thus be detected reliably and false alarms can be prevented.

The monitoring of people using the imaging means 1 and the monitoring using the heat sensing means 4 need not both run at all times: only one may run continuously, and the other's detection operation may be started when the possible presence of some moving object is detected. For example, the heat sensing means 4 may be kept operating at all times, with imaging by the imaging means 1 started as soon as the heat sensing means 4 generates even one pulse signal; or the imaging means 1 may be kept operating at all times, with detection of people by the heat sensing means 4 started when the number of pixels that can be regarded as a moving object in the candidate image obtained from the moving region extraction means 2 reaches a prescribed threshold or more. Operating only one of the imaging means 1 and the heat sensing means 4 in normal times reduces power consumption compared with operating both continuously. The other means is started before the output of the first has led to a judgment that a person may be present because, if the device waited for that judgment, the other means would be delayed in making its own judgment.

In the example above, the integrated calculation unit 6 takes the logical product (AND) of the outputs of the region analysis means 3 and the thermal change analysis means 5, but it may instead use the logical sum (OR). That is, if either the region analysis means 3 or the thermal change analysis means 5 judges that a person may be present, the result that a person is present is output. Adopting this configuration for the integrated calculation unit 6 reduces the possibility of missed detections when a person enters the field of view (or monitoring area). For example, when images captured by the camera 11 are recorded on a recording device such as a video recorder for intrusion monitoring, every image taken while the output of either the imaging means 1 or the heat sensing means 4 indicates that a person may be present can be recorded, so images for all periods in which a person may be in the field of view are recorded without running the recording device continuously. In other words, all images in which a person may be present in the field of view of the camera 11 are recorded, while the required storage capacity is lower than when the recording device runs continuously.
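The trade-off between the two combination rules can be illustrated with a minimal Python sketch. The names `FusionMode` and `fuse` are hypothetical and do not appear in the specification; the sketch only assumes that each analysis means yields a boolean judgment.

```python
from enum import Enum


class FusionMode(Enum):
    AND = "and"  # both judgments required: fewer false alarms
    OR = "or"    # either judgment suffices: fewer missed detections


def fuse(image_says_person: bool, heat_says_person: bool, mode: FusionMode) -> bool:
    """Combine the judgments of the region analysis and thermal change analysis."""
    if mode is FusionMode.AND:
        return image_says_person and heat_says_person
    return image_says_person or heat_says_person
```

When only one of the two judgments is positive, the AND rule reports no person and the OR rule reports a person, which is why the OR rule suits recording triggers where nothing may be missed.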

Known techniques for saving images captured by the camera 11 to a recording device include storing images at fixed intervals and storing images while a person is detected by a separately provided sensor. With the former, images are saved regardless of whether a person is present, so necessary images may be missed and useless images may be saved. Even with the latter, detection generally relies on a single type of sensor, so missed detections can occur and necessary images may not be obtained. In contrast, with the configuration of this embodiment, images are reliably saved during periods in which a person may be present, so all necessary images are saved. Other configurations and operations are the same as in Embodiment 1. Although a pyroelectric sensor 41 is used as the human sensor in this embodiment, a photoelectric sensor that emits and receives infrared light, an ultrasonic sensor that transmits and receives ultrasonic waves, or the like can also be used as the human sensor. In that case, the output of the human sensor is given to analysis means corresponding to the thermal change analysis means in order to judge the presence or absence of a person in the field of view (or monitoring area).

(Embodiment 6)
In this embodiment, as shown in FIG. 12, a storage memory 7 is added, separately from the image memory 21, to the configuration of Embodiment 1 shown in FIG. 1. The storage memory 7 is provided to record and retain images obtained over a predetermined period. FIG. 12 also shows the path by which grayscale images from the A/D converter 12 provided in the imaging means 1 are recorded in the image memory 21, and the path by which images are transferred from the image memory 21 to the storage memory 7.

In general, when images are recorded and retained for this kind of purpose, the full field of view of each image (grayscale or color) captured by the camera 11 is saved, so a recording device with a large storage capacity, such as a video recorder, is required.

This embodiment focuses on the fact that, for purposes such as intrusion monitoring, it suffices to obtain only the part of the image that corresponds to the moving object (person). Only the region in which the region analysis means 3 has detected the possible presence of a person is cut out of the image captured by the camera 11 and stored in the storage memory 7, and images stored in the image memory 21 can be transferred to the storage memory 7. The cut-out region may be the region regarded as the moving object Ob, as in FIG. 2(d), but if the circumscribed rectangle D1 shown in FIG. 2(e) is used instead, the region can be specified simply by giving the coordinates of two diagonally opposite corners of D1, which simplifies processing. It is desirable to associate a label such as the capture date and time with each image stored in the storage memory 7.
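The cut-out by two diagonal corners can be sketched as follows in Python. This is only an illustration under stated assumptions: the image is a list of pixel rows, coordinates are inclusive, and the names `StoredClip` and `crop_rectangle` are hypothetical rather than taken from the specification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StoredClip:
    captured_at: datetime        # label attached to the stored image
    top_left: tuple[int, int]    # (x, y) of the rectangle in the full image
    pixels: list[list[int]]      # cut-out grayscale values, row by row


def crop_rectangle(image, corner_a, corner_b, captured_at):
    """Cut out the rectangle given by two diagonally opposite corners (inclusive)."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    x0, x1 = sorted((xa, xb))
    y0, y1 = sorted((ya, yb))
    pixels = [row[x0:x1 + 1] for row in image[y0:y1 + 1]]
    return StoredClip(captured_at, (x0, y0), pixels)
```

Sorting the corner coordinates means the caller may pass either diagonal pair in either order, which matches the idea that two corner coordinates fully specify the region.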

With this configuration, the necessary images can be saved with far less storage capacity than in the conventional configuration, so a relatively small-capacity recording device such as the storage memory 7 suffices. Because the image memory 21 used by the moving region extraction means 2 is working memory, volatile memory may be used for it; because the storage memory 7 is for retention, non-volatile memory (such as flash memory) is used. If removable memory is used as the storage memory 7, its contents can be read out by another computer.

As described above, the frequency distribution of direction codes in the edge image is used to evaluate whether a moving object in the image is a person, so high resolution is not required of the image used to evaluate the presence or absence of a moving object. For intrusion monitoring, however, the images stored in the storage memory 7 need enough resolution for faces and the like to be identified. It is therefore desirable for the camera 11 to capture relatively high-resolution images and store them temporarily in the image memory 21, to perform the evaluation of whether a moving object is a person at low resolution in order to speed up processing, and, for regions recognized as likely to be a person, to read the high-resolution image from the image memory 21 and store it in the storage memory 7. This satisfies both fast processing and high resolution in the regions that matter, while limiting any increase in the storage capacity of the storage memory 7. Other configurations and functions are the same as in Embodiment 1.
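One way to relate a region found at low resolution to the high-resolution image held in memory is sketched below in Python. It assumes, purely for illustration, that the low-resolution image is obtained by an integer downsampling factor `scale` and that rectangles use inclusive pixel coordinates; the function name `to_high_res` is hypothetical.

```python
def to_high_res(rect, scale):
    """Map an inclusive rectangle found at low resolution to high-resolution pixels.

    rect is (x0, y0, x1, y1) in low-resolution coordinates; scale is the assumed
    integer downsampling factor between the two resolutions.
    """
    x0, y0, x1, y1 = rect
    # Each low-resolution pixel covers a scale-by-scale block of high-resolution pixels.
    return (x0 * scale, y0 * scale, (x1 + 1) * scale - 1, (y1 + 1) * scale - 1)
```

The returned rectangle covers every high-resolution pixel that contributed to the low-resolution region, so nothing inside the detected region is lost when the high-resolution cut-out is stored.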

(Embodiment 7)
In this embodiment, as shown in FIG. 13, the storage memory 7 in the configuration of Embodiment 6 is replaced by an image transmission unit 8 for transmitting the required images to another device over a transmission path. The images transmitted by the image transmission unit 8 are the same as those stored in the storage memory 7 in Embodiment 6: of the image captured by the camera 11, only the region in which the moving object is judged likely to be a person is cut out and transferred to the other device. Images are transferred at the moment a moving object is judged likely to be a person, which occurs at irregular times, so the capture date and time must be attached to each transfer.

As described above, in the configuration of this embodiment only the image of the region considered to be a person is transferred to the other device, so traffic on the transmission path is greatly reduced compared with continuously transferring the images captured by the camera 11. Moreover, the image of the region of interest can be transferred without lowering its resolution. As a result, images captured by the camera 11 can be transferred even to a device with a small data storage capacity, such as a wireless portable terminal (for example, a mobile telephone with image transmission and reception functions).

Besides the wireless portable terminal mentioned above, the other device may be a monitoring display device, a storage device that saves images, a computer with a communication function, or the like, and the image transmission unit 8 may be configured as appropriate for the communication specifications of the other device. For example, the image transferred from the image transmission unit 8 to the other device may be analog or digital, and the transmission path may be wired or wireless. The image transmission unit 8 may also thin out images according to the specifications of the other device and the transmission path. In particular, when moving images are transferred, dropping frames as appropriate for the communication speed of the transmission path and the image processing speed of the other device allows the images to be transferred without lowering their resolution.
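Frame dropping matched to the link speed could look like the following Python sketch. The specification does not give a rule for choosing which frames to drop; keeping every n-th frame, and the names `frame_step` and `thin`, are assumptions made for illustration.

```python
import math


def frame_step(frame_bytes, source_fps, link_bytes_per_s):
    """Smallest n such that sending every n-th frame fits the link capacity."""
    needed = frame_bytes * source_fps          # bytes per second at full frame rate
    return max(1, math.ceil(needed / link_bytes_per_s))


def thin(frames, step):
    """Keep every step-th frame; each kept frame retains its full resolution."""
    return frames[::step]
```

For example, 20,000-byte frames at 30 frames per second need 600,000 bytes per second, so a link carrying 150,000 bytes per second calls for keeping one frame in four.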

The configuration of this embodiment is particularly effective for intrusion monitoring. Rather than constantly monitoring the field of view (or monitoring area) and transferring images to the other device, images are transferred only during periods in which a person is judged to be present, and only the region in which a person may be present is transferred. Traffic on the transmission path to the other device is therefore greatly reduced, communication costs fall, and the buffer capacity needed for transmission and reception can be made smaller. In addition, because the image resolution is not lowered, an intruder's face and other features can be identified. Other configurations and functions are the same as in Embodiment 6.

(Embodiment 8)
Because each of the embodiments described above can extract a region in which a person may be present, the field of view of the imaging means 1 can be used as a monitoring area to watch for intruders. For this purpose, as shown in FIG. 14, image display means 40 consisting of a display device such as a CRT or a liquid crystal display is provided so that an observer can monitor the images captured by the imaging means 1. The image display means 40 displays an image based on the grayscale image stored in the image memory 21. What is displayed on the image display means 40 is controlled by image output means 41. Detection signal output means 42 is also provided, which outputs a detection signal when there is a region that the region analysis means 3 has evaluated as corresponding to a person. The image output means 41 reads from the image memory 21 the region that the region analysis means 3 has evaluated as corresponding to a person, and displays on the screen of the image display means 40 a partially enlarged image, enlarged at a magnification matched to the screen size of the image display means 40.

A magnification matched to the screen size of the image display means 40 means, for example, a magnification at which the height of the region corresponding to the person is about two thirds of the screen height. This is only a guideline; the magnification may be set so that no necessary part extends beyond the screen. The magnification also varies with the intruder's position in the field of view of the imaging means 1: if the intruder's posture does not change, the magnification is larger the farther the intruder is from the imaging means 1. However, when the range of distances from the imaging means 1 over which partially enlarged images are generated is restricted, simply enlarging the intruder achieves the aim of grasping the intruder's features, so the magnification may be fixed. In that case the magnification is either set as a specified value in the program or set by the observer through an operation unit (not shown). Depending on where the region corresponding to the person lies in the field of view of the imaging means 1, reduction rather than enlargement may be needed; even then, the ratio to the original size is called the magnification. Because the aspect ratio of the region corresponding to a person is not constant while that of the screen is fixed, only the height of the region is fitted to the screen size, and a region whose width follows from that height and the aspect ratio of the screen is extracted and used for the partially enlarged image.

In this embodiment, when the detection signal output means 42 outputs a detection signal, the image output means 41 starts tracking the region that the region analysis means 3 has evaluated as corresponding to a person and causes the image display means 40 to display a partially enlarged image of that region at a magnification matched to the screen size. When the region analysis means 3 no longer detects a region evaluated as corresponding to a person, the partially enlarged display is cancelled and the display returns to the overall image, that is, the entire field of view of the imaging means 1. In other words, the overall image X1 is normally displayed as in FIG. 15(a), and when a region Px corresponding to a person is detected, a partially enlarged image X2 containing the region Px is displayed on the image display means 40 as in FIG. 15(b).

The partially enlarged image is displayed at a magnification matched to the screen size of the image display means 40, and the region it corresponds to moves within the field of view of the imaging means 1 as the intruder moves, so it is desirable to show the observer which part of the field of view the partially enlarged image represents. To this end, either the image display means 40 is made up of two display devices, one showing the overall image (the entire field of view of the imaging means 1) and the other showing the partially enlarged image, or a single display device is used together with an operation unit (not shown) that the observer can operate, and the display is switched between the overall image and the partially enlarged image by operating that unit.

Using two display devices raises the cost, but the overall image and the partially enlarged image can be viewed simultaneously without any special operation, which makes monitoring easier. Sharing one display device between the overall image and the partially enlarged image requires an operation to switch images but reduces the cost. Moreover, because the partially enlarged image is displayed automatically on the image display means 40 when an intruder enters the field of view of the imaging means 1, the observer can check the intruder's features when the display switches to the partially enlarged image and then switch to the overall image to check the intruder's position.

To display both the partially enlarged image and the overall image on the image display means 40, the image output means 41 may be configured, as in FIG. 15(c), to display the overall image X1 in part of the screen showing the partially enlarged image X2. That is, the overall image X1 is superimposed within the screen of the partially enlarged image X2. With this configuration, a single display device suffices for the observer to check the intruder's features in the partially enlarged image X2 while at the same time seeing from the overall image X1 where in the field of view of the imaging means 1 the intruder is. Monitoring becomes easier, and the apparatus can be provided at low cost.

As described above, the image display means 40 normally displays the overall image, and when a region that the region analysis means 3 has evaluated as corresponding to a person is detected, the partially enlarged image is displayed at a magnification matched to the screen size. Immediately after the screen changes, therefore, the observer may be unable to tell where the intruder is. It is thus desirable for the image output means 41, when switching from the overall image to the partially enlarged image, to increase the magnification of the partially enlarged image X2 gradually over time, as shown in FIG. 16, until X2 is displayed at the magnification matched to the screen size. That is, when a region Px corresponding to a person appears in the overall image X1, the region Px is superimposed on the overall image X1, the area it occupies within X1 is gradually increased over time starting from the position where Px was first displayed, and finally the partially enlarged image X2 at the desired magnification fills the screen. Because the region Px expands from the position where it was first displayed rather than from the center of the screen, the partially enlarged image X2 appears as if zoomed in from the overall image X1, which makes it easy to grasp where the intruder is within the whole. Since the partially enlarged image X2 uses the image read from the image memory 21, the zoom is of course performed by digital signal processing.
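The gradual expansion from the first displayed position to the full screen can be sketched as a simple interpolation in Python. Linear interpolation over a fixed number of steps and the name `zoom_steps` are assumptions; the specification only states that the area is increased gradually over time.

```python
def zoom_steps(start_rect, screen_w, screen_h, steps):
    """On-screen rectangles (x, y, w, h) from the first displayed position to full screen.

    start_rect is where the region Px first appears on the overall image.
    The rectangle grows linearly until it covers the whole screen.
    """
    x0, y0, w0, h0 = start_rect
    frames = []
    for i in range(steps + 1):
        t = i / steps  # 0.0 at the start position, 1.0 at full screen
        frames.append((
            x0 * (1 - t),
            y0 * (1 - t),
            w0 + (screen_w - w0) * t,
            h0 + (screen_h - h0) * t,
        ))
    return frames
```

Because the top-left corner moves from the starting position toward the screen origin while the size grows, the overlay spreads out from where the person was first shown rather than from the screen center.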

When there are several regions in the field of view of the imaging means 1 evaluated as corresponding to a person, that is, when there are several intruders, several partially enlarged images are generated. In this case the image output means 41 displays all of them on a single display device by automatically switching among them on the screen of the image display means 40 at fixed intervals. The display order may follow the raster scan order of the display device: the coordinates of the upper-left corner of each region are searched in raster scan order, and the partially enlarged images corresponding to the regions are displayed in the order found.
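Ordering regions by the raster-scan position of their upper-left corners, and picking which one to show at a given moment, can be sketched in Python as follows. The names `raster_order` and `region_at` and the dwell-time parameter are hypothetical; regions are assumed to be `(x0, y0, x1, y1)` tuples.

```python
def raster_order(regions):
    """Sort regions by upper-left corner: top to bottom, then left to right."""
    return sorted(regions, key=lambda r: (r[1], r[0]))


def region_at(regions, elapsed_s, dwell_s):
    """Region whose enlarged image is shown after elapsed_s seconds of cycling."""
    ordered = raster_order(regions)
    return ordered[int(elapsed_s // dwell_s) % len(ordered)]
```

Sorting on `(y0, x0)` reproduces the order in which a raster scan would first reach each upper-left corner, and the modulo makes the display cycle back to the first region after the last.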

Automatically displaying the partially enlarged images in turn lets the observer check the features of each intruder and easily determine whether any of them is a suspicious person. Because the region corresponding to a partially enlarged image moves as the intruder moves, it is desirable to display the partially enlarged image for the intruder's position at the time of display, and to display the overall image along with it. When displaying several partially enlarged images, it is also possible to repeat the process of zooming in from the overall image to one partially enlarged image, returning to the overall image after a fixed time, and then zooming in to the partially enlarged image of another region. If such a display is hard to follow, the partially enlarged images can instead be switched in turn with the overall image shown in part of the screen.

Furthermore, instead of displaying the partially enlarged images in turn at fixed intervals, the screen of the image display means 40 may be divided, as in FIG. 17(a), into as many sections as there are regions Px1, Px2 corresponding to people, with the partially enlarged images X21, X22 for the regions Px1, Px2 displayed in the respective sections. That is, the screen is divided into partial screens, one per intruder present in the field of view of the imaging means 1, and the partially enlarged images X21, X22 are displayed on the respective partial screens. This configuration is effective when there are relatively few intruders; when there are many, it is preferable to display the partially enlarged images in turn. Alternatively, an upper limit may be placed on the number of screen divisions: up to that limit, the screen is divided into one partial screen per person, and beyond it, the screen is divided into the maximum number of sections and switched from one set of people to the next until everyone has been shown. Dividing the screen to show several people at once gives an overview of the intruders' actions on a single screen, so the features and actions of several intruders can be grasped at a glance.
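The capped screen division with switching can be sketched in Python as below. The side-by-side column layout is an assumption, since the arrangement in FIG. 17(a) is not described in the text, and the names `layout` and `tile_rects` are hypothetical.

```python
def layout(regions, max_tiles):
    """Return (tiles per screen, pages of regions) under an upper limit on divisions."""
    if not regions:
        return 0, []
    n_tiles = min(len(regions), max_tiles)
    pages = [regions[i:i + max_tiles] for i in range(0, len(regions), max_tiles)]
    return n_tiles, pages


def tile_rects(n_tiles, screen_w, screen_h):
    """Split the screen into n_tiles equal columns (assumed side-by-side layout)."""
    w = screen_w // n_tiles
    return [(i * w, 0, w, screen_h) for i in range(n_tiles)]
```

With five people and a limit of two divisions, the screen stays split in two and is switched through three pages, the last of which fills only one section.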

Depending on the purpose for which the apparatus is used, when several intruders are detected it may be enough to track only one of them rather than all. In that case, only the intruder first detected in the field of view of the imaging means 1 may be tracked and only that intruder's partially enlarged image displayed. When monitored images are stored for later use, recording only the period from the start to the end of the partially enlarged display on the image display means 40 achieves the purpose, so the storage capacity of the storage medium can be reduced.

FIG. 1 is a block diagram showing Embodiment 1.
FIG. 2 is an operation explanatory diagram of Embodiment 1.
FIG. 3 is an operation explanatory diagram of Embodiment 1.
FIG. 4 is an operation explanatory diagram of Embodiment 2.
FIG. 5 is an operation explanatory diagram of Embodiment 2.
FIG. 6 is an operation explanatory diagram of Embodiment 2.
FIG. 7 is an operation explanatory diagram of Embodiment 3.
FIG. 8 is an operation explanatory diagram of Embodiment 3.
FIG. 9 is an operation explanatory diagram of Embodiment 4.
FIG. 10 is a block diagram showing Embodiment 5.
FIG. 11 is an operation explanatory diagram of Embodiment 5.
FIG. 12 is a block diagram showing Embodiment 6.
FIG. 13 is a block diagram showing Embodiment 7.
FIG. 14 is a block diagram showing Embodiment 8.
FIG. 15 is an operation explanatory diagram of Embodiment 8.
FIG. 16 is an operation explanatory diagram of Embodiment 8.
FIG. 17 is an operation explanatory diagram of Embodiment 8.

Explanation of symbols

1 Imaging means
2 Moving region extraction means
3 Region analysis means
4 Heat sensing means
5 Thermal change analysis means
6 Integrated calculation unit
7 Storage memory
8 Image transmission unit
21 Image memory
40 Image display means
41 Image output means
42 Detection signal output means

Claims

1. A human body detection apparatus using images, comprising: imaging means for imaging a predetermined field of view; moving region extraction means which uses edge images, each being a binary image obtained by extracting edges from an image captured at a different time by the imaging means, and extracts a region corresponding to a moved object in the edge image at a time of interest by performing a logical operation that combines three or more time-series edge images so as to remove the background; and region analysis means which obtains, for the region extracted by the moving region extraction means, a frequency distribution of the direction codes of the pixels on the edges, and evaluates whether the region extracted by the moving region extraction means corresponds to a person by using the similarity between the obtained frequency distribution and reference data, the reference data being a frequency distribution of the direction codes of the pixels on the edges obtained in advance for edge images of people.

The human body detection device using images according to claim 1, wherein each frequency distribution is normalized by the total number of pixels on the edges of interest, the evaluation value of the similarity is the sum of squares of the frequency differences for the respective direction codes, and the region extracted by the moving region extraction means is judged to be a region corresponding to a person when the evaluation value is equal to or less than a predetermined threshold.
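The evaluation recited above can be sketched as follows. The number of direction codes (four), the counts, and the threshold value are hypothetical; only the normalization, the sum of squared differences, and the "equal to or less than the threshold" decision come from the claim.

```python
def normalize(counts):
    """Divide each direction-code count by the total number of edge pixels."""
    total = sum(counts)
    if total == 0:
        raise ValueError("region contains no edge pixels")
    return [c / total for c in counts]


def evaluation_value(counts, reference_counts):
    """Sum of squared frequency differences per direction code (smaller = more similar)."""
    h = normalize(counts)
    r = normalize(reference_counts)
    return sum((a - b) ** 2 for a, b in zip(h, r))


def is_person(counts, reference_counts, threshold):
    """Judge the region as a person when the evaluation value is <= threshold."""
    return evaluation_value(counts, reference_counts) <= threshold


# Hypothetical counts for four direction codes.
reference = [40, 10, 40, 10]   # reference data from edge images of persons
candidate = [35, 15, 38, 12]   # region with a similar edge-direction profile
flat = [25, 25, 25, 25]        # region with no dominant direction
THRESHOLD = 0.02               # assumed value, not taken from the source

print(is_person(candidate, reference, THRESHOLD))
print(is_person(flat, reference, THRESHOLD))
```

Normalizing by the number of edge pixels makes the comparison independent of the size of the extracted region.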

The human body detection device using images according to claim 1, wherein the region analysis means sets, for the frequency of each direction code of the frequency distribution obtained for the region extracted by the moving region extraction means, a normal range defined by an upper limit value and a lower limit value, and regards a region whose frequency distribution includes a direction code whose frequency deviates from the normal range as a disturbance other than a person.
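A minimal sketch of the normal-range check described above, assuming four direction codes and hypothetical limit values. The helper names are illustrative only.

```python
def out_of_range_codes(frequencies, lower_limits, upper_limits):
    """Return the indices of direction codes whose frequency leaves its normal range."""
    return [
        code
        for code, (f, lo, hi) in enumerate(zip(frequencies, lower_limits, upper_limits))
        if not lo <= f <= hi
    ]


def is_disturbance(frequencies, lower_limits, upper_limits):
    """A region is treated as a disturbance if any direction code is out of range."""
    return bool(out_of_range_codes(frequencies, lower_limits, upper_limits))


# Hypothetical limits per direction code (normalized frequencies).
lower = [0.30, 0.05, 0.30, 0.05]
upper = [0.50, 0.20, 0.50, 0.20]

person_like = [0.35, 0.15, 0.38, 0.12]
flat = [0.25, 0.25, 0.25, 0.25]

print(is_disturbance(person_like, lower, upper))
print(out_of_range_codes(flat, lower, upper))
```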

The human body detection device using images according to any one of claims 1 to 3, wherein the region analysis means obtains, for each region extracted by the moving region extraction means from a plurality of time-series edge images, the frequency distribution of the direction codes of the pixels on the edges, normalized by the total number of pixels on the edges; next associates regions corresponding to the same object between different edge images by evaluating the similarity of the frequency distributions between each pair of edge images adjacent in the time series; and then performs the evaluation using the similarity to the reference data.
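The association step above can be sketched as a matching problem between the regions of two adjacent edge images. The greedy best-first matching, the distance measure (sum of squared differences), and the `max_distance` cut-off are assumptions made for illustration; the claim only requires that similarity of the normalized frequency distributions be used.

```python
def sum_sq_diff(h1, h2):
    """Sum of squared differences between two normalized frequency distributions."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))


def associate_regions(prev_hists, cur_hists, max_distance):
    """Map each current-region index to the index of the matching previous region.

    Candidate pairs are visited from most to least similar; each region is used
    at most once, and pairs farther apart than max_distance are left unmatched.
    """
    pairs = sorted(
        (sum_sq_diff(p, c), i, j)
        for i, p in enumerate(prev_hists)
        for j, c in enumerate(cur_hists)
    )
    matches = {}
    used_prev = set()
    for distance, i, j in pairs:
        if distance > max_distance:
            break
        if i in used_prev or j in matches:
            continue
        matches[j] = i
        used_prev.add(i)
    return matches


# Hypothetical normalized distributions for two regions in each of two frames.
prev_hists = [[0.40, 0.10, 0.40, 0.10], [0.25, 0.25, 0.25, 0.25]]
cur_hists = [[0.26, 0.24, 0.25, 0.25], [0.38, 0.12, 0.40, 0.10]]

matches = associate_regions(prev_hists, cur_hists, max_distance=0.01)
print(matches)  # current region index -> previous region index
```

Once regions are linked across frames, the comparison with the reference data can be applied to the same object over several frames rather than to a single extraction.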

The human body detection device using images, further comprising monitoring area setting means having a function of dividing the field of view of the imaging means into a plurality of regions at a ratio that depends on the size of the image of a person present in the field of view at each part of the imaging surface, designating each region as either an effective region or an invalid region, and setting the effective regions as the monitoring area in which the presence or absence of a person is detected.

The human body detection device using images according to any one of claims 1 to 4, wherein a region setting mode for setting a monitoring area in which the presence or absence of a person is detected can be selected, the device further comprising monitoring area setting means that, in the region setting mode, while the imaging means images light of a specific wavelength from a light source moved along the boundary line of the monitoring area within the field of view, links the positions at which the density value is maximum in the plurality of images obtained in time series from the imaging means to obtain a closed region, sets one of the inside and the outside of the closed region as an effective region and the other as an invalid region, and sets the effective region as the monitoring area in which the presence or absence of a person is detected.
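The region setting mode above can be sketched in two steps: locate the maximum density value in each frame, then treat the linked positions as a closed polygon and test which pixels fall inside it. The ray-casting inside test, the frame size, and the helper names are assumptions for illustration; the source does not state how the inside of the closed region is determined.

```python
def brightest_pixel(image):
    """Return the (x, y) position of the maximum density value in a 2-D list."""
    _, x, y = max(
        ((value, x, y) for y, row in enumerate(image) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return x, y


def inside(point, polygon):
    """Ray-casting test: True if point lies strictly inside the closed polygon."""
    x, y = point
    result = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        if (y1 > y) != (y2 > y):  # this edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def make_frame(size, light_x, light_y):
    """Hypothetical frame: dark everywhere except the light source position."""
    frame = [[0] * size for _ in range(size)]
    frame[light_y][light_x] = 255
    return frame


# The light source is carried around a square boundary, one position per frame.
frames = [make_frame(7, x, y) for x, y in [(1, 1), (5, 1), (5, 5), (1, 5)]]
boundary = [brightest_pixel(f) for f in frames]

print(boundary)
print(inside((3, 3), boundary), inside((6, 3), boundary))
```

Whether the inside or the outside of the closed region becomes the effective region is a setting choice; the sketch only provides the inside test on which that choice would rest.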

The human body detection device using images according to any one of claims 1 to 6, further comprising: an image memory that temporarily stores each image captured by the imaging means; and a storage memory that, when the region analysis means extracts a region corresponding to a person, cuts that region out of the image stored in the image memory and saves it.

The human body detection device using images according to any one of claims 1 to 6, further comprising: an image memory that temporarily stores each image captured by the imaging means; and an image transmission unit that, when the region analysis means extracts a region corresponding to a person, cuts that region out of the image stored in the image memory and transfers it to another device.

The human body detection device using images according to any one of claims 1 to 8, further comprising: detection signal output means that outputs a detection signal when there is a region evaluated by the region analysis means as corresponding to a person; and image output means that, when the detection signal is output from the detection signal output means, starts tracking the region evaluated by the region analysis means as corresponding to a person and displays, on the screen of image display means, a partially enlarged image in which that region is enlarged relative to the other regions.

The human body detection device using images according to claim 9, wherein the image output means displays the partially enlarged image on the screen of the image display means at an enlargement ratio matched to the size of the screen, and displays the whole image, which covers the entire field of view of the imaging means, on a part of the screen of the image display means.

The human body detection device using images according to claim 9, wherein the image output means, starting from a state in which the whole image covering the entire field of view of the imaging means is displayed on the screen of the image display means, gradually increases the magnification of the partially enlarged image with the passage of time, taking as its starting point the region first evaluated by the region analysis means as corresponding to a person, until the partially enlarged image is displayed at an enlargement ratio matched to the size of the screen of the image display means.

The human body detection device using images according to claim 9, wherein, when there are a plurality of regions evaluated by the region analysis means as corresponding to a person, the image output means displays the partially enlarged images corresponding to the respective regions on the screen of the image display means, switching between them at regular intervals.

The human body detection device using images according to claim 9, wherein, when there are a plurality of regions evaluated by the region analysis means as corresponding to a person, the image output means divides the screen of the image display means into sections corresponding to the number of regions and displays the partially enlarged image corresponding to each region in its own section.
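One way to divide a screen into sections according to the number of detected regions is a near-square grid. This layout rule, the function name `split_screen`, and the screen size are assumptions; the claim states only that the number of sections corresponds to the number of regions.

```python
import math


def split_screen(width, height, region_count):
    """Return one (x, y, w, h) section per region, laid out on a near-square grid."""
    if region_count < 1:
        raise ValueError("at least one region is required")
    cols = math.ceil(math.sqrt(region_count))
    rows = math.ceil(region_count / cols)
    cell_w, cell_h = width // cols, height // rows
    return [
        ((k % cols) * cell_w, (k // cols) * cell_h, cell_w, cell_h)
        for k in range(region_count)
    ]


# Three detected regions on a hypothetical 640 x 480 screen.
sections = split_screen(640, 480, 3)
print(sections)
```

Each partially enlarged image would then be scaled to fit its own section.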