JPH11283036A

JPH11283036A - Object detector and object detection method

Info

Publication number: JPH11283036A
Application number: JP8373698A
Authority: JP
Inventors: Takuya Haketa; 卓哉羽毛田
Original assignee: Toshiba TEC Corp
Current assignee: Toshiba TEC Corp
Priority date: 1998-03-30
Filing date: 1998-03-30
Publication date: 1999-10-15

Abstract

PROBLEM TO BE SOLVED: To improve the detection rate by detecting an object to be identified while taking into account individual differences in the positions of its feature areas, variations in size, and variations in the photographing environment, and to achieve simple processing by eliminating the need to verify the positional relationship of the feature areas. SOLUTION: Images including a face are input from an image input part 1 and stored as gradation images, and the gradation images are differentiated by an image processing part 2 so that edge images are generated and stored. Then, in a position detection part 3, an area model in which plural judgement element obtaining areas are set corresponding to characteristic areas of a face image, such as the eyes, the nose and the mouth, is used for the gradation images and the edge images. The area model is fitted to the images while its position is successively specified, a feature amount is extracted from each judgement element obtaining area, and the face image is detected from the images based on the extracted feature amounts.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an object detection apparatus and an object detection method for detecting an object to be identified, such as a face or an article.

[0002]

[Prior Art] A known apparatus for detecting an object to be identified in an image that contains the object and a background is disclosed, for example, in Japanese Patent Application Laid-Open No. 7-129770. In an assembly or inspection process in a factory, for example, the object to be identified is photographed with a television camera, and the position of the object is detected by correlating the grayscale image of the photographed input image with that of a reference template image. Specifically, a local region of the same size as the template image is set in the input image as a search image, the correlation value between the template image and the search image is computed while this region is shifted step by step, and the position with the highest correlation value is identified as the position where the template exists.
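The sliding-window correlation search described for this prior art can be sketched as follows. This is only an illustration under stated assumptions: the cited publication is not quoted here, the use of a normalized (zero-mean) correlation is an assumption, and the function names `correlation` and `best_match` are hypothetical.

```python
from math import sqrt

def correlation(patch, template):
    """Normalized cross-correlation of two equally sized 2-D lists of gray values."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    # A perfectly flat patch or template has no defined correlation; treat it as 0.
    return num / den if den else 0.0

def best_match(image, template):
    """Shift a template-sized search region over the image; return (row, col, score)."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -2.0)  # any real correlation is >= -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = correlation(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best
```

Because the whole template is compared as a single rigid block, a face whose proportions differ from the template scores lower everywhere, which is the inflexibility the later paragraphs criticize.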

[0003] Techniques that use template matching to detect a face image in an input image are known from, for example, Japanese Patent Application Laid-Open Nos. 9-251534 and 9-44676. In No. 9-251534, to extract a face image region from the input image, a correlation value is computed while a pre-registered standard face image (template) is moved over the entire screen, and the region with the highest correlation value is extracted as the face region. In No. 9-44676, an image including a face image is scanned with a template image that represents the eyes as grayscale information, a correlation is computed between the grayscale information of the target region and that of the template image, and regions with high similarity are extracted as eye candidates. Similarly, nose candidates are extracted by scanning the image with a template image that represents the nose as grayscale information, and mouth candidates are extracted by scanning the image with a template image that represents the mouth as grayscale information. Once this extraction is complete, the face region is extracted: combinations of eye, nose, and mouth candidates are compared with and verified against a previously prepared positional relationship of the eyes, nose, and mouth, and the face image is extracted from the image.

[0004] Further, a technique that uses the color information of a face image is known from, for example, Japanese Patent Application Laid-Open No. 9-50528. In this technique, a skin-color region is extracted from the RGB values of the input image, a mosaic size is determined automatically for this region, the candidate region is converted into a mosaic and compared with a human-face dictionary to determine whether a human face is present, and the human face is cut out.

[0005]

[Problems to Be Solved by the Invention] The position and size of the nose and eyes vary from person to person. In addition, unlike factory automation equipment in a factory, the places where human face detection is applied often have complex backgrounds and large variations in ambient light. Moreover, because a person moves in front of the camera, even a slight shift in the person's front-to-back position changes the face size, and the tilt of the face also varies.

[0006] For these reasons, techniques that use template matching, such as those of Japanese Patent Application Laid-Open Nos. 7-129770 and 9-251534, use a person's face as-is as a template image of the whole face, so the template image lacks flexibility and the detection rate is not very high. That is, because individual differences and the like are not taken into account, one person may be detected while another is not.

[0007] In a method that extracts individual parts, as in Japanese Patent Application Laid-Open No. 9-44676, the procedure is complicated: the extraction method must be changed for each part, and the positional relationship of the candidate regions must be verified. Stable detection is therefore difficult. In particular, when the background is complex the number of part candidates may become very large, which makes it difficult to verify their positional relationship. Further, a method that narrows down candidate regions in advance by using color information to extract skin-color regions, as in Japanese Patent Application Laid-Open No. 9-50528, is easily affected by lighting conditions, which makes stable detection difficult.

[0008] Accordingly, the inventions of claims 1 to 7 provide an object detection apparatus that can detect an object to be identified while taking into account individual differences in the positions of its characteristic areas, variations in size, and variations in the photographing environment, thereby improving the detection rate, and that achieves simple processing because the positional relationship of the characteristic areas need not be verified. The inventions of claims 8 to 11 provide an object detection method with the same advantages: it detects the object to be identified while taking into account individual differences in the positions of its characteristic areas, variations in size, and variations in the photographing environment, thereby improving the detection rate, and it achieves simple processing because the positional relationship of the characteristic areas need not be verified.

[0009]

[Means for Solving the Problems] The invention of claim 1 is an object detection apparatus comprising: image input means for inputting an image; storage means storing an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of the image of the object to be detected; position designation means for sequentially designating positions at which the area model stored in the storage means is fitted to one or both of the input image input by the image input means and an image obtained by image processing of the input image; determination element acquisition means for acquiring a determination element from each determination element acquisition area of the area model each time the area model is fitted at a position designated by the position designation means; and determination means for determining, based on the determination elements acquired by the determination element acquisition means, whether the image is an image of the object to be identified, wherein the object to be identified is detected based on the determination result of the determination means.

[0010] The invention of claim 2 is an object detection apparatus comprising: image input means for inputting an image; storage means storing an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of the face image to be detected, such as the eyes, mouth, nose, and cheeks; position designation means for sequentially designating positions at which the area model stored in the storage means is fitted to one or both of the input image input by the image input means and an image obtained by image processing of the input image; determination element acquisition means for acquiring a determination element from each determination element acquisition area of the area model each time the area model is fitted at a position designated by the position designation means; and determination means for determining, based on the determination elements acquired by the determination element acquisition means, whether the image is a face image, wherein a face is detected based on the determination result of the determination means.

[0011] The invention of claim 3 is the object detection apparatus of claim 1 or 2, wherein the determination element acquisition means acquires a feature amount as the determination element from each determination element acquisition area of the area model. The invention of claim 4 is the object detection apparatus of claim 3, wherein the determination element acquisition means acquires the feature amount for at least one determination element acquisition area of the area model by using a template image. The invention of claim 5 is the object detection apparatus of claim 3 or 4, wherein, when one or more of the eye, mouth, and nose areas are set as determination element acquisition areas of the area model, the determination element acquisition means acquires the feature amount for at least one of these areas by using the gray values in the area or the differential values of the gray values.

[0012] The invention of claim 6 is the object detection apparatus of claim 3 or 5, wherein, when a cheek area is set as a determination element acquisition area of the area model, the determination element acquisition means acquires the feature amount for this area by using the variance of the gray values in the area. The invention of claim 7 is the object detection apparatus of claim 3 or 5, wherein, when a cheek area is set as a determination element acquisition area of the area model, the determination element acquisition means acquires the feature amount for this area by using a template image.

[0013] The invention of claim 8 is an object detection method in which an area model, in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of the image of the object to be detected, is fitted to one or both of an input image and an image obtained by image processing of the input image while its position is designated sequentially; each time the area model is fitted, a determination element is acquired from each determination element acquisition area of the area model; whether the image is an image of the object to be identified is determined based on the acquired determination elements; and the object to be identified is detected based on the determination result.

[0014] The invention of claim 9 is an object detection method in which an area model, in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of the face image to be detected, such as the eyes, mouth, and nose, is fitted to one or both of an input image and an image obtained by image processing of the input image while its position is designated sequentially; each time the area model is fitted, a determination element is acquired from each determination element acquisition area of the area model; whether the image is a face image is determined based on the acquired determination elements; and a face is detected based on the determination result.

[0015] The invention of claim 10 is the object detection method of claim 8 or 9, wherein a feature amount is acquired as the determination element from each determination element acquisition area of the area model. The invention of claim 11 is the object detection method of claim 10, wherein the feature amount is acquired for at least one determination element acquisition area of the area model by using a template image.

[0016]

[Embodiments of the Invention] Embodiments of the present invention will be described with reference to the drawings. (First Embodiment) This embodiment describes face detection as an example of detecting an object to be identified.

[0017] FIG. 1 is a block diagram showing the overall configuration of the object detection apparatus. The apparatus comprises an image input unit 1 serving as image input means for inputting an image including a person's face, an image processing unit 2 that applies predetermined processing to the image input by the image input unit 1, and a position detection unit 3 that detects the position of the face image from the image information input by the image input unit 1 and the image information processed by the image processing unit 2.

[0018] As shown in FIG. 2, the image input unit 1 comprises a CCD camera 11 that photographs a person's face and outputs digital grayscale image information including the face, an image input board 12 that captures the digital grayscale image information from the CCD camera 11, and an image memory 13 that stores the digital grayscale image information captured by the image input board 12. The input image may be a color image.

[0019] The image processing unit 2 reads the digital grayscale image information from the image memory 13, differentiates it to convert it into edge image information, and stores the result in the image memory 13. That is, the grayscale image shown in FIG. 3(a) is differentiated to generate the edge image shown in FIG. 3(b), which is stored in the image memory 13. A well-known filter such as a Laplacian filter or a Sobel filter is used for edge extraction.
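As an illustration of the differentiation step, the following sketch builds an edge image with a 3x3 Sobel filter, one of the two filters the text names. The function name `sobel_edges`, the |gx| + |gy| magnitude approximation, the clipping to 0..255, and the zero-valued border are assumptions made for this example, not details taken from the patent.

```python
def sobel_edges(gray):
    """Return an edge image (values 0..255) for a 2-D list of gray levels."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]  # border pixels are left at 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            # Cheap magnitude approximation, clipped to the 8-bit range.
            edges[y][x] = min(255, abs(gx) + abs(gy))
    return edges
```

Clipping keeps the edge image in the same 0 to 255 range as the grayscale image, so both can be stored in the same image memory format.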

[0020] Because the eyes, nose, mouth, and so on contain many edges, edges are effective as a feature amount. When a feature amount other than edges is used, other image processing means may be added. When only the image gray values are used as the feature amount, the image processing unit may be omitted. Furthermore, image processing may be performed for each local region to be matched rather than on the whole image.

[0021] As shown in FIG. 4, the position detection unit 3 comprises: a model generation unit 31 that generates an area model of the face; a model storage unit 32 that stores the area model generated by the model generation unit 31; a position designation unit 33 that designates the position at which the area model stored in the model storage unit 32 is fitted to the digital grayscale image and the edge image stored in the image memory 13; a feature amount extraction unit 34, serving as determination element acquisition means, that extracts the feature amounts of the plurality of determination element acquisition areas in the area model at the position designated by the position designation unit 33; and a determination unit 36, serving as determination means, that determines whether the image is a face image from the feature amounts of the determination element acquisition areas extracted by the feature amount extraction unit 34, using determination information stored in a determination storage unit 35.

[0022] As shown in FIG. 5, the model generation unit 31 generates an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of the face image such as the eyes, nose, mouth, and cheeks. In the area model of FIG. 5(a), the characteristic areas are the eyes, nose, mouth, and cheeks: determination element acquisition areas 41 and 42 are set for the eyes, area 43 for the nose, area 44 for the mouth, and area 45 for the remaining cheek region. In the area model of FIG. 5(b), the characteristic areas are the eyes, mouth, and cheeks: determination element acquisition areas 41 and 42 are set for the eyes, area 44 for the mouth, and area 45 for the cheeks.

[0023] In the area model of FIG. 5(c), the characteristic areas are the eyes and cheeks: determination element acquisition areas 41 and 42 are set for the eyes and area 45 for the cheeks. In the area model of FIG. 5(d), the characteristic areas are the eyes, nose, and cheeks: determination element acquisition areas 41 and 42 are set for the eyes, area 43 for the nose, and area 45 for the cheeks.

[0024] In the area model of FIG. 5(e), the characteristic areas are the eyes, nose, mouth, and cheeks: a single determination element acquisition area 46 covers both the left and right eyes, area 43 is set for the nose, area 44 for the mouth, and area 45 for the cheeks. In the area model of FIG. 5(f), the characteristic areas are the eyes, nose, mouth, cheeks, and hair: determination element acquisition areas 41 and 42 are set for the eyes, area 43 for the nose, area 44 for the mouth, area 45 for the cheeks, and area 47 for the hair. These area models are called window models.

[0025] The area model shown in FIG. 6 is called a mask model and represents each area with a mask. Values 1 and 2 mark the determination element acquisition areas for the eyes, value 3 the area for the nose, value 4 the area for the mouth, and value 5 the area for the cheeks. Portions with value 0 are excluded areas. Such a mask representation allows a more detailed area model to be generated.

[0026] Whichever area model is used, a single model can absorb some difference in face size and some individual variation in the positions of facial parts. To handle a wider range of face sizes, area models of different sizes may be used. If the required area models are created in advance and stored in the model storage unit 32, the model generation unit 31 can be omitted.

[0027] The model storage unit 32 stores face area models such as those shown in FIG. 5. For a window model, it stores the coordinates, width, height, and so on of each window representing a determination element acquisition area, with the upper left of the window as the origin. For a mask model, it stores the mask values shown in FIG. 6 as they are. Various other ways of storing the model exist, and storage is not limited to these.
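The two storage forms can be illustrated with a small sketch. The rectangle coordinates, the 12x12 model size, the dictionary layout, and the function name `window_to_mask` are all hypothetical values chosen for this example; only the label values 1 to 5 follow the mask model described for FIG. 6.

```python
# Hypothetical window model: each rectangle is (x, y, width, height), measured
# from the upper-left corner of the model window. Coordinates are made up.
WINDOW_MODEL = {
    "size": (12, 12),        # model width and height in pixels
    "regions": {
        1: (1, 2, 4, 3),     # right eye (area 41)
        2: (7, 2, 4, 3),     # left eye (area 42)
        3: (5, 5, 2, 3),     # nose (area 43)
        4: (3, 9, 6, 2),     # mouth (area 44)
    },
    "cheek_label": 5,        # every remaining pixel counts as cheek (area 45)
}

def window_to_mask(model):
    """Expand a window model into a mask model (2-D list of area labels)."""
    width, height = model["size"]
    mask = [[model["cheek_label"]] * width for _ in range(height)]
    for label, (x, y, w, h) in model["regions"].items():
        for row in range(y, y + h):
            for col in range(x, x + w):
                mask[row][col] = label
    return mask
```

The window form is compact, while the mask form can describe non-rectangular areas and excluded pixels (label 0), which this conversion does not produce.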

[0028] The area model stored in the model storage unit 32 is fitted to the digital grayscale image from the image input unit 1 and to the edge image produced by the image processing unit 2 to detect a face image. Fitting such an area model M makes it possible to detect the faces of various persons as shown in FIG. 7, including persons whose faces are slightly tilted and persons wearing glasses. In a computer program, each area in the area model can be specified as a parameter without the face model being stored in memory as such; this case is also regarded as the model being stored, as part of the program.

[0029] The position designation unit 33 and the feature amount extraction unit 34 designate the position at which the area model M is fitted to both the digital grayscale image and the edge image stored in the image memory 13, as shown in FIG. 8, and extract the feature amounts of each determination element acquisition area in the area model M from the two images at that position. That is, the position designation unit 33 basically designates the position of the area model sequentially over the whole of the two images, and the feature amount extraction unit 34 extracts the feature amounts of each determination element acquisition area in the area model M at each designated position. The most face-like feature amounts are extracted when the area model M is placed at position M'.
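The sequential position designation can be sketched as a plain raster scan. The function name `detect`, the `step` parameter, and the `score_at` callback (standing in for feature extraction followed by determination) are assumptions introduced for this illustration.

```python
def detect(image_h, image_w, model_h, model_w, score_at, step=1):
    """Fit the model at every position; return (score, top, left) of the best fit.

    score_at(top, left) is a caller-supplied function that returns how
    face-like the features are when the model's upper-left corner is there.
    """
    best = None
    for top in range(0, image_h - model_h + 1, step):
        for left in range(0, image_w - model_w + 1, step):
            score = score_at(top, left)
            if best is None or score > best[0]:
                best = (score, top, left)
    return best
```

For example, `detect(6, 8, 2, 2, lambda t, l: -((t - 2) ** 2 + (l - 3) ** 2))` returns `(0, 2, 3)`, the position where the toy score peaks, which plays the role of position M' in the text.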

[0030] When other feature amounts are used, the image processing unit 2 generates an image containing those feature amounts from the input image and stores it in the image memory 13, and the stored image is then used. If the input image is a color image, skin-colored regions may be extracted from the color input image and the position of the area model designated only within those regions.

[0031] In this way, the position of the area model can also be designated after the detection region has been narrowed by preprocessing. Skin color can also be used as a feature by using a color image. Here the area model is fitted by moving the whole model across the input image from one designated position to the next. The fitting is not limited to this, however: the determination element acquisition areas of the area model may be separated and fitted one at a time while the designated position is moved.

[0032] As the feature amounts extracted from both the digital grayscale image and the edge image at the position designated by the position designation unit 33, the feature amounts expressed by the following equations are used, for example. The face area model used is the model shown in FIG. 5(a). In the area model, the right-eye determination element acquisition area 41 is denoted RE, the left-eye determination element acquisition area 42 is denoted LE, the nose determination element acquisition area 43 is denoted N, the mouth determination element acquisition area 44 is denoted M, and the remaining cheek determination element acquisition area 45 is denoted C. Each pixel of both the digital grayscale image and the edge image in the image memory 13 takes a value from 0 to 255. Further, P(i) denotes the gray value of the digital grayscale image at position i, and E(i) denotes the gray value of the edge image at position i.

[0033] The sum of the low-luminance feature amounts of the right-eye determination element acquisition area RE is

(Equation 1)

[0034] The sum of the edge feature amounts of the right-eye determination element acquisition area RE is

(Equation 2)

[0035] The sum of the low-luminance feature amounts of the left-eye determination element acquisition area LE is

(Equation 3)

[0036] The sum of the edge feature amounts of the left-eye determination element acquisition area LE is

(Equation 4)

[0037] The sum of the low-luminance feature amounts of the nose determination element acquisition area N is

(Equation 5)

[0038] The sum of the edge feature amounts of the nose determination element acquisition area N is

(Equation 6)

[0039] The sum of the low-luminance feature amounts of the mouth determination element acquisition area M is

(Equation 7)

[0040] The sum of the edge feature amounts of the mouth determination element acquisition area M is

(Equation 8)

[0041] The average luminance value of the cheek determination element acquisition area C is

(Equation 9)

[0042] The luminance variance of the cheek determination element acquisition area C is

(Equation 10)

[0043] Regarding the luminance variance of the cheek determination element acquisition area C, the cheek region C that remains after the eyes, nose, and mouth are removed is characterized by a small luminance variance.
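The per-area feature amounts described above can be illustrated with a minimal Python sketch. The equations themselves are not reproduced in this text, so the forms used here for the low-luminance sum (the sum of 255 − P(i)) and the edge sum (the sum of E(i)) are assumptions; only the mean and variance follow directly from their names. The function name `region_features` and the representation of an area as a list of (row, column) positions are hypothetical.

```python
from statistics import fmean, pvariance


def region_features(gray, edge, region):
    """Compute feature amounts for one determination element acquisition area.

    gray, edge: 2-D lists of pixel values in 0..255 (P and E in the text).
    region: list of (row, col) positions i that belong to the area.
    """
    p = [gray[r][c] for r, c in region]
    e = [edge[r][c] for r, c in region]
    return {
        # ASSUMPTION: darker pixels contribute more to the low-luminance sum.
        "low_luminance_sum": sum(255 - v for v in p),
        # ASSUMPTION: the edge feature is the plain sum of edge-image values.
        "edge_sum": sum(e),
        # Average and (population) variance of the gray values, as for cheek area C.
        "mean": fmean(p),
        "variance": pvariance(p),
    }
```

A flat area such as the cheek would give a small `variance`, while eye, nose, and mouth areas would be expected to give larger low-luminance and edge sums.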

[0044] For the determination, each feature amount expressed by the above equations is extracted and stored in memory. FIG. 9 shows an example of feature amounts extracted in this way. The feature amounts of each determination element acquisition area vary from person to person; that is, there are individual differences. The data also reflect differences in the shooting environment, slight differences in face size, variations in face inclination, and the like. From the feature amounts of each area of the area model, taken at the position designated by the position designation unit 33 and extracted by the feature extraction unit 34, the determination unit 36 determines whether the designated position of the area model is a face.

[0045] The determination uses, for example, fuzzy membership functions. For example, letting μ_RE-d denote the correctness of a value as the low-luminance pixel amount of the right eye, a fuzzy membership function is defined as shown in FIG. 10. This function takes as input the low-luminance pixel amount RE-d of the right-eye determination element acquisition area RE in the area model at a designated position, and outputs a value from 0.0 to 1.0 according to that input.

[0046] In FIG. 10, when RE-d is measured in the range 1,200 to 10,000, μ_RE-d outputs 1.0. This range is determined with reference to the individual differences and variations in the feature amounts obtained from sample images, as in FIG. 9. For values outside this range, μ_RE-d is the output of the linear functions that form the left and right sides of the trapezoid. The closer μ_RE-d is to 1.0, the more plausible the value is as a right-eye low-luminance pixel amount.
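A trapezoidal membership function of this kind can be sketched as follows. The plateau from 1,200 to 10,000 is taken from the description of FIG. 10; the points where the sloping sides reach zero (600 and 14,000 here) are not given in the text and are purely hypothetical, as are the function names.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 at or outside a and d, 1 on [b, c],
    and linear on the left side (a..b) and right side (c..d)."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def mu_re_d(re_d):
    """Correctness of re_d as a right-eye low-luminance pixel amount.

    Plateau 1,200..10,000 follows the text; the feet 600 and 14,000
    are HYPOTHETICAL values chosen only for illustration.
    """
    return trapezoid(re_d, 600, 1200, 10000, 14000)
```

In practice the four corner points would be set from the feature amounts measured on the learning images.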

[0047] This trapezoidal membership function is determined by measuring facial feature amounts in advance from a variety of learning images, as shown in FIG. 9, and setting the function based on the measured values. Similarly, fuzzy membership functions are defined for RE-e, LE-d, LE-e, N-d, N-e, M-d, M-e, C-a, and C-v from the feature amounts of the sample images for each determination element acquisition area, as shown in FIG. 9, and the respective outputs μ_RE-e, μ_LE-d, μ_LE-e, μ_N-d, μ_N-e, μ_M-d, μ_M-e, μ_C-a, and μ_C-v are calculated.

[0048] Each membership function may be stored in the determination storage unit 35. Finally, it is determined whether the position is a face. The face-likeness F is given by, for example,

【数１１】 [Equation 11]

[0049] A function such as the one above is used. With this expression, the closer the output of each individual membership function is to 1.0, the closer F is to 1.0. However, with this expression F reaches 1.0 only when every output is 1.0.
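As an illustration of a combining function with the property just described, the sketch below multiplies the membership outputs. Equation 11 is not reproduced in this text, so the product form is an assumption; it is chosen because it equals 1.0 only when every output is 1.0 and because a product of membership outputs is the form given for F in the road-sign example of the third embodiment. The function name is hypothetical.

```python
from math import prod


def face_likeness(memberships):
    """Combine membership outputs (each in 0.0..1.0) into one score F.

    ASSUMPTION: F is the product of all outputs, so F == 1.0 only when
    every individual output is 1.0, and any output of 0.0 forces F to 0.0.
    """
    return prod(memberships)
```

With ten feature amounts, even uniformly good outputs of 0.9 give F of about 0.35, which is one reason the acceptance level for F would have to be chosen from learning data rather than fixed near 1.0.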

[0050] Various formulations of F are conceivable; F can also be obtained using fuzzy rules or as an algebraic sum. Taking the position where F is largest as the face position and drawing a circle at that position on the input image yields, for example, the image shown in FIG. 11. The above determination does not necessarily have to use fuzzy theory. When more reliable detection is required, a further verification of whether the detected position really is a face is performed.
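The algebraic-sum alternative and the selection of the position with the largest F can be sketched as below. The algebraic sum used here is the standard fuzzy operator a + b − ab applied across all outputs; whether the patent intends exactly this operator is an assumption, and the function names and the dictionary of scores keyed by position are hypothetical.

```python
from functools import reduce


def algebraic_sum(memberships):
    """Fuzzy algebraic sum: fold a + b - a*b over all membership outputs."""
    return reduce(lambda a, b: a + b - a * b, memberships, 0.0)


def best_position(scores):
    """Return the (x, y) position whose score F is largest.

    scores: dict mapping a designated model position to its F value.
    """
    return max(scores, key=scores.get)
```

Unlike a product, the algebraic sum stays high when only some of the outputs are high, so it tolerates one poorly matching area at the cost of more false detections.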

[0051] In this way, a face area model is created in which the face, the object to be identified, is represented as separate characteristic determination element acquisition areas such as the eyes, nose, mouth, and cheeks. Positions are designated at which this area model, with the positional relationship between its determination element acquisition areas preserved, is applied to both the input image and the image obtained by image processing of the input image; the feature amounts of each determination element acquisition area in the area model are measured at each position; and whether the position is a face is determined for the area model as a whole. Consequently, there is no need to verify the positional relationship between individual facial parts such as the eyes and nose, and simple processing is achieved. In addition, positional shifts and size differences of parts due to individual differences between faces are absorbed, as are changes in face size caused by the subject standing farther forward or back and slight inclinations of the face, so that simple and stable face detection is possible.

[0052] In this embodiment, the face determination is performed by applying the area model to both the input image and the image obtained by image processing of the input image and measuring the feature amounts of each determination element acquisition area. However, the invention is not limited to this; the face determination may instead be performed by applying the area model to only one of the input image and the processed image and measuring the feature amounts of each determination element acquisition area.

[0053] (Second Embodiment) This embodiment also concerns face detection and is basically the same as the first embodiment. The difference is that the cheek determination element acquisition area in the area model is replaced with a template image.

[0054] That is, as shown in FIG. 12(a), the cheek determination element acquisition area in the mask-represented face area model 51 is replaced with a template image 52. This template image has the configuration shown in FIG. 12(b). The area with value 1 is the right-eye determination element acquisition area 53, the area with value 2 is the left-eye determination element acquisition area 54, the area with value 3 is the nose determination element acquisition area 55, and the area with value 4 is the mouth determination element acquisition area 56. The template image 52 may be a grayscale image cut out from an actual face image. For example, because the cheek is nearly flat, a grayscale image with gray values such as 1, 1, 1, 1, 1, 1, ... is used. The similarity C-similar of the cheek region is obtained by template matching against such an image.

[0055] Let g be the template image of this cheek region, and let f be the image cut out from the input grayscale image at the designated cheek-region position with the same size as g. The similarity C-s is then

【数１２】 (Equation 12)

[0056] The similarity C-s takes a value between 0 and 1. As in the first embodiment, this similarity is obtained from various persons for learning, stored in memory as determination data, and used together with the feature amounts of the other areas as shown in FIG. 13. For the actual face determination, the similarity is treated as one of the feature amounts and the same processing as in the first embodiment is performed.
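One similarity measure that takes values between 0 and 1 for non-negative gray values is the normalized correlation of f and g, sketched below. Equation 12 is not reproduced in this text, so this particular measure is an assumption and not necessarily the one used in the patent; the function name and the flattening of both images into lists are hypothetical.

```python
from math import sqrt


def cheek_similarity(f, g):
    """Normalized correlation of two equally sized images given as flat lists.

    ASSUMPTION: similarity = sum(f*g) / (sqrt(sum(f^2)) * sqrt(sum(g^2))).
    For non-negative gray values the result lies in 0..1.
    """
    if len(f) != len(g):
        raise ValueError("f and g must have the same number of pixels")
    numerator = sum(a * b for a, b in zip(f, g))
    denominator = sqrt(sum(a * a for a in f)) * sqrt(sum(b * b for b in g))
    return numerator / denominator if denominator else 0.0
```

With a flat template such as 1, 1, 1, ..., this measure is close to 1 when the cut-out cheek region is itself nearly uniform and drops as the region becomes uneven, which matches the flat-cheek rationale given for the template.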

[0057] Thus, instead of template matching on the whole face, template matching on the cheek region of the area model can be combined with feature amounts for the other regions to achieve the same face detection as in the first embodiment. In this embodiment the similarity to the template is obtained for the cheek region, but the invention is not limited to this; a distance from the template image or the like may be used instead. Various other similarity determination methods exist, and any of them can be applied.

[0058] In each of the embodiments described above, positions at which the area model is applied are designated sequentially over the entire input image, and feature amounts are extracted to perform face detection. However, the invention is not limited to this. As shown in FIG. 14, when the background is known in advance, the known portion 57 may be removed from the whole image to generate a candidate region 58, and positions at which the area model is applied may be designated sequentially within the candidate region 58, with feature amounts extracted and face detection performed as in the embodiments above. This makes the feature extraction processing faster.
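Restricting the scan to a candidate region can be sketched as follows. The text does not say how the known background is represented; this sketch assumes a stored background image of the same size and treats pixels that differ from it by more than a tolerance as the candidate region. It also assumes that a model position is kept only when the whole model window lies inside the candidate region. The function names, the tolerance, and that window rule are all hypothetical.

```python
def candidate_mask(image, background, tol=10):
    """True where the input differs from the known background by more than tol."""
    return [
        [abs(p - b) > tol for p, b in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]


def candidate_positions(mask, height, width):
    """Yield top-left (row, col) positions whose height x width window
    lies entirely inside the candidate region (ASSUMED rule)."""
    rows, cols = len(mask), len(mask[0])
    for y in range(rows - height + 1):
        for x in range(cols - width + 1):
            if all(mask[y + dy][x + dx]
                   for dy in range(height) for dx in range(width)):
                yield (y, x)
```

Feature extraction and determination would then run only at the yielded positions, which is where the speed-up comes from.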

[0059] (Third Embodiment) This embodiment describes an example in which detection of the object to be identified is applied to an article other than a face. Specifically, an example of detecting a one-way road sign in a scene is described. A color image could be used to exploit color features, but here only a grayscale image is used. The target is a one-way sign in a scene image, positioned horizontally and facing the camera. As the area model, an area model 60 consisting of two areas 61 and 62, as shown in FIG. 15, is used.

[0060] The arrow area 61 is denoted A1 and the remaining area 62 is denoted A2. Next, the feature amounts to be used in each of areas A1 and A2 are decided. In general, in a one-way sign the luminance of area A1 is higher than that of the other area A2, and the luminance variance is very small in both areas. Therefore, the average luminance and the luminance variance are used as feature amounts for both areas and are denoted A1-a, A1-v, A2-a, and A2-v, respectively. These feature amounts are measured from sample images of various one-way road signs, giving measured values such as those shown in FIG. 16. From this learning result, the correctness for each of areas A1 and A2 is defined by, for example, the fuzzy membership function shown in FIG. 17. Here, the correctness of the average luminance of area A1 is defined as μ_A1-a. Similarly, the correctness of the luminance variance of area A1 is defined as μ_A1-v, the correctness of the average luminance of area A2 as μ_A2-a, and the correctness of the luminance variance of area A2 as μ_A2-v.

[0061] In actual detection, an image of a scene containing the sign 65, as in FIG. 18, is first input. The area model 60 is moved sequentially over the scene image; at each position the average luminance and luminance variance of areas A1 and A2 are measured, and the likelihood F that the sign 65 is at that location is calculated by a formula such as F = μ_A1-a × μ_A1-v × μ_A2-a × μ_A2-v. The sign 65 is detected at positions where F exceeds a certain threshold. In this way, an area model 60 is created in which the sign 65, the object to be identified, is represented as the characteristic determination element acquisition areas A1 and A2 of the sign image. The feature amounts of each determination element acquisition area in the area model are measured while designating positions at which this area model 60, with the positional relationship between its determination element acquisition areas preserved, is applied to the scene image, and whether the position is a sign is determined for the area model as a whole. Simple processing is therefore achieved, and the sign can be detected simply and stably.
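The scan described above can be sketched in Python. The product formula for F and the comparison against a threshold follow the text; everything else is an assumption made for illustration: areas A1 and A2 are given as lists of (row, column) offsets inside the model window, the membership functions are trapezoids with hypothetical corner points supplied by the caller, and the function names are invented here.

```python
from math import prod
from statistics import fmean, pvariance


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 at or outside a and d."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)


def likelihood(scene, y, x, area_a1, area_a2, mu):
    """F = mu_A1-a * mu_A1-v * mu_A2-a * mu_A2-v at model position (y, x)."""
    a1 = [scene[y + dy][x + dx] for dy, dx in area_a1]
    a2 = [scene[y + dy][x + dx] for dy, dx in area_a2]
    feats = {
        "A1-a": fmean(a1), "A1-v": pvariance(a1),  # arrow area
        "A2-a": fmean(a2), "A2-v": pvariance(a2),  # remaining area
    }
    return prod(mu[name](value) for name, value in feats.items())


def detect_sign(scene, height, width, area_a1, area_a2, mu, threshold):
    """Slide the height x width model over the scene; keep positions with F > threshold."""
    hits = []
    for y in range(len(scene) - height + 1):
        for x in range(len(scene[0]) - width + 1):
            f = likelihood(scene, y, x, area_a1, area_a2, mu)
            if f > threshold:
                hits.append((y, x, f))
    return hits
```

Here `mu` is a dictionary mapping the names A1-a, A1-v, A2-a, and A2-v to membership functions; in practice their corner points would come from measurements such as those of FIG. 16.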

【００６２】[0062]

[Effects of the Invention] According to the inventions of claims 1 to 7, it is possible to provide an object detection apparatus that detects the object to be identified while taking into account individual differences in the positions of its characteristic regions, variations in size, and variations in the imaging environment, thereby improving the detection rate, and that achieves simple processing because the positional relationship between the characteristic regions does not need to be verified.

[0063] Further, according to the inventions of claims 2 to 7, particularly for an image including a face, a face area model is created in which the face is represented as separate characteristic areas such as the eyes, nose, mouth, and cheeks. Positions at which this area model, with the positional relationship between its areas preserved, is applied to the image are designated sequentially, and determination elements such as feature amounts are obtained from each determination element acquisition area of the area model to determine whether the position is a face. There is therefore no need to verify the positional relationship between individual facial parts such as the eyes, nose, and mouth, and simple processing is possible. In addition, positional shifts and size differences of parts due to individual differences between faces are absorbed, as are changes in face size caused by the subject standing farther forward or back and slight inclinations of the face. An object detection apparatus capable of simple and stable face detection can thus be provided.

[0064] Further, according to the inventions of claims 8 to 11, it is possible to provide an object detection method that detects the object to be identified while taking into account individual differences in the positions of its characteristic regions, variations in size, and variations in the imaging environment, thereby improving the detection rate, and that achieves simple processing because the positional relationship between the characteristic regions does not need to be verified.

[0065] Further, according to the inventions of claims 9 to 11, particularly for an image including a face, a face area model is created in which the face is represented as separate characteristic areas such as the eyes, nose, mouth, and cheeks. Positions at which this area model, with the positional relationship between its areas preserved, is applied to the image are designated sequentially, and determination elements such as feature amounts are obtained from each determination element acquisition area of the area model to determine whether the position is a face. There is therefore no need to verify the positional relationship between individual facial parts such as the eyes, nose, and mouth, and simple processing is possible. In addition, positional shifts and size differences of parts due to individual differences between faces are absorbed, as are changes in face size caused by the subject standing farther forward or back and slight inclinations of the face. An object detection method capable of simple and stable face detection can thus be provided.

【図面の簡単な説明】[Brief description of the drawings]

FIG. 1 is a block diagram of the overall configuration of a first embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of the image input unit in the embodiment.

FIG. 3 is a diagram showing an input grayscale image and the edge image obtained by image processing in the embodiment.

FIG. 4 is a block diagram showing the configuration of the position detection unit in the embodiment.

FIG. 5 is a diagram showing examples of window-model-type area models usable in the embodiment.

FIG. 6 is a diagram showing examples of mask-model-type area models usable in the embodiment.

FIG. 7 is a diagram showing various examples of fitting an area model to a face in the embodiment.

FIG. 8 is a diagram for explaining the designation of positions at which the area model is applied and the feature extraction in the embodiment.

FIG. 9 is a diagram showing an example of the feature amounts of each area extracted from sample face images in the embodiment.

FIG. 10 is a diagram showing an example of the fuzzy membership function for the low-luminance pixel amount RE-d of the right-eye determination element acquisition area in the embodiment.

FIG. 11 is a diagram showing an example of a face position detection result in the embodiment.

FIG. 12 shows a second embodiment of the present invention and is a diagram showing the use of a template image for the cheek determination element acquisition area in a mask-represented face area model.

FIG. 13 is a diagram showing an example of the feature amounts of each area extracted from sample face images in the embodiment.

FIG. 14 is a diagram for explaining another example of the designation of positions at which the area model is applied and the feature extraction.

FIG. 15 is a diagram showing an example of the area model in a third embodiment of the present invention.

FIG. 16 is a diagram showing an example of the feature amounts of each area extracted from sample images in the embodiment.

FIG. 17 is a diagram showing an example of the fuzzy membership function for the average luminance A1-a of area A1 in the embodiment.

FIG. 18 is a diagram showing an example of a scene image in the embodiment.

[Explanation of symbols]

1 ... image input unit
2 ... image processing unit
3 ... position detection unit
13 ... image memory
32 ... model storage unit
33 ... position designation unit
34 ... feature extraction unit
36 ... determination unit

Claims

[Claims]

1. An object detection apparatus comprising: image input means for inputting an image; storage means for storing an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of an image of an object to be detected; position designation means for sequentially designating positions at which the area model stored in the storage means is applied to one or both of an input image input by the image input means and an image obtained by image processing of the input image; determination element obtaining means for obtaining a determination element from each determination element acquisition area of the area model each time the area model is applied at a position designated by the position designation means; and determination means for determining, based on the determination elements obtained by the determination element obtaining means, whether the image is an image of the object to be identified, wherein the object to be identified is detected based on the determination result of the determination means.

2. An image input means for inputting an image, and an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas such as eyes, mouth, nose, and cheek of a face image to be detected are stored. Storage means, and a position designation for sequentially designating a position at which the area model stored in the storage means is applied to one or both of an input image input by the image input means and an image obtained by performing image processing on the input image. Means, a determination element obtaining means for obtaining a determination element from each determination element obtaining area of the area model each time the area model is sequentially applied to the position designated by the position specification means, and An object detection apparatus, comprising: determination means for determining whether or not the image is a face image based on the determined determination element, and detecting a face based on the determination result of the determination means.

3. The object detecting apparatus according to claim 1, wherein the determination element obtaining unit obtains a feature amount as a determination element from each determination element obtaining area of the area model.

4. The object detection apparatus according to claim 3, wherein the determination element obtaining means obtains the feature amount of at least one determination element acquisition area of the area model using a template image.

5. The object detection apparatus according to claim 3, wherein, when at least one of an eye region, a mouth region, and a nose region is set as a determination element acquisition area of the area model, the determination element obtaining means obtains the feature amount for at least one of those regions using gray values in the region and/or differential values of the gray values.

6. The object detection apparatus according to claim 3 or 5, wherein, when a cheek region is set as a determination element acquisition area of the area model, the determination element obtaining means obtains the feature amount for that region using the variance of the gray values in the region.

7. The object detection apparatus according to claim 3 or 5, wherein, when a cheek region is set as a determination element acquisition area of the area model, the determination element obtaining means obtains the feature amount for that region using a template image.

8. An object detection method comprising: applying, to one or both of an input image and an image obtained by image processing of the input image, an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas of an image of an object to be detected, while sequentially designating positions; obtaining, each time the area model is applied, a determination element from each determination element acquisition area of the area model; determining, based on the obtained determination elements, whether the image is an image of the object to be identified; and detecting the object to be identified based on the determination result.

9. An object detection method comprising: applying, to one or both of an input image and an image obtained by image processing of the input image, an area model in which a plurality of determination element acquisition areas are set corresponding to characteristic areas such as the eyes, mouth, and nose of a face image to be detected, while sequentially designating positions; obtaining, each time the area model is applied, a determination element from each determination element acquisition area of the area model; determining, based on the obtained determination elements, whether the image is a face image; and detecting a face based on the determination result.

10. The object detection method according to claim 8, wherein a feature amount is acquired as a determination element from each determination element acquisition area of the area model.

11. The object detection method according to claim 10, wherein a feature amount is acquired for at least one determination element acquisition area of the area model using a template image.