JP2008152611A

JP2008152611A - Image recognition device, electronic equipment, image recognition method, and image recognition program

Info

Publication number: JP2008152611A
Application number: JP2006340864A
Authority: JP
Inventors: Michihiro Nagaishi; 道博長石
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2006-12-19
Filing date: 2006-12-19
Publication date: 2008-07-03

Abstract

PROBLEM TO BE SOLVED: To recognize a complicated object in an image quickly and precisely.

SOLUTION: The image recognition device and method include an image input part 10 that inputs an image in which a face, as an example of an object, has been photographed; a binarization-normalizing part 13 that binarizes each pixel of the image, simplifying the image so that a local directional contribution feature amount can be calculated; a directional contribution feature amount calculating part 14 that calculates the local directional contribution feature amount representing the features of the binarized image; and a recognition part 15 that recognizes the face photographed in the image on the basis of the local directional contribution feature amount.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an image recognition technique for recognizing an object shown in an image.

Various face recognition techniques for face images have been proposed. Some use face-specific color information such as skin color, or information about facial parts (eyes, nose, mouth, eyebrows, and so on) such as their contour lines, as feature amounts (see, for example, Patent Documents 1 and 2). Others extract feature amounts by frequency analysis of the face image (see, for example, Patent Document 3).
In recent years, to further improve recognition accuracy based on the feature amounts extracted from a face image, advanced learning methods such as neural networks, including support vector machines, and boosting have been added to the recognition process. This has raised the face recognition rate and made practical use possible in application fields that demand high recognition accuracy, such as security systems.

Meanwhile, character pattern recognition has been studied for a long time, and by the 1970s it had already been applied to reading postal codes. Because character patterns then had to be recognized on hardware with little computing power, methods were sought that combined a simple algorithm with a high recognition rate. One such method uses direction contribution feature amounts of character lines. It recognizes even handwritten characters with high accuracy and is also used for evaluating recognition performance and as a reference for other methods (see, for example, Patent Document 4 and Non-Patent Document 1).
Patent Document 1: JP 2005-141437 A
Patent Document 2: JP 2001-283224 A
Patent Document 3: JP 2004-272326 A
Patent Document 4: JP-A-57-164376
Non-Patent Document 1: Norihiro Hagita, Seiichiro Naito, Isao Masuda, "A Handwritten Kanji Recognition Method Based on Global and Local Directional Contribution Density Features", IEICE Transactions, IEICE, June 1983, Vol. J66-D, No. 6, pp. 722-729

The face recognition techniques described above achieve high accuracy, but their computation algorithms are complicated. Small electronic devices with limited processing capability, such as digital cameras and PDAs, can hardly run them, so a dedicated device is required. Character pattern recognition, on the other hand, offers algorithms that run on hardware with low processing capability, but it is difficult to use it to recognize a face, which contains many complex line segments.
The present invention has been made in view of these circumstances. Its object is to provide an image recognition apparatus, an electronic device, an image recognition method, and an image recognition program that can recognize a complex object captured in an image quickly and accurately.

[Mode 1] To achieve the above object, the image recognition apparatus of Mode 1 comprises: an image input unit that inputs object image data containing an object; a binarization unit that binarizes each pixel of the object image data input by the image input unit using a predetermined luminance threshold; a feature amount calculation unit that calculates a local direction contribution feature amount, which is the direction contribution feature amount of the contour lines contained in a predetermined region of the object image data binarized by the binarization unit; and a recognition unit that recognizes the object contained in the object image data on the basis of the local direction contribution feature amount calculated by the feature amount calculation unit.

With this configuration, binarization simplifies the object image data, so an algorithm based on direction contribution feature amounts can be applied even to object images containing many complex line segments. A complex object captured in an image can therefore be recognized quickly and accurately with simple processing.
Here, an object means the target of image recognition. It may be anything that can be such a target: a face, an animal, or a physical thing such as a car or a vase.
The direction contribution feature amount is a feature obtained by extracting the direction information of line segments. For example, it indicates how far target pixels continue from a given pixel in each of four directions: horizontal, 45 degrees diagonally right, vertical, and 45 degrees diagonally left. It allows geometric properties such as the direction and connectivity of line segments to be obtained quantitatively.
The image input unit may have any configuration as long as it inputs object image data. For example, it may input the object image data from an input device, acquire or receive it from an external device, or read it from a storage device or storage medium. Input therefore includes at least acquisition, reception, and reading.

[Mode 2] The image recognition apparatus of Mode 2 is the image recognition apparatus of Mode 1, wherein the luminance threshold is changed according to the distribution of a projection obtained by summing the effective pixels of the binarized object image data for each column and/or each row.
With this configuration, the complexity of the line segments in the object image is obtained as the distribution of a projection, that is, the per-column and/or per-row sums of effective pixels in the object image data. Changing the binarization luminance threshold according to that distribution allows a wide variety of object image data to be binarized with a luminance threshold suited to calculating the local direction contribution feature amount.
Here, an effective pixel is a pixel judged to be effective as a result of binarization. For example, if black is effective and white is not, the effective pixels are the black pixels. The concept naturally applies equally to binarization into any two colors other than black and white.

[Mode 3] The image recognition apparatus of Mode 3 is the image recognition apparatus of Mode 2, wherein the binarization unit calculates an entropy that quantifies the state of the projection distribution and changes the luminance threshold according to the calculated entropy. With this configuration, the complexity of the projection distribution is quantified as an entropy, and the luminance threshold for binarization can be set appropriately according to the entropy value.

[Mode 4] The image recognition apparatus of Mode 4 is the image recognition apparatus of any one of Modes 1 to 3, wherein the binarization unit applies a filtering process that smooths line segments to the object image data before or after binarization, and the feature amount calculation unit calculates the local direction contribution feature amount for the object image data that has been binarized and filtered by the binarization unit.
With this configuration, filtering smooths the line segments of the object image data. The local direction contribution feature amount can therefore be calculated stably even for an object image that was captured under poor conditions, cannot be simplified enough by binarization alone, and would otherwise make the calculation difficult.
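The text does not say which filter is used. As one possible illustration, assuming a 3 x 3 majority filter applied after binarization (the filter choice and the function name `majority_filter_3x3` are assumptions, not taken from the publication), the smoothing could be sketched like this:

```python
def majority_filter_3x3(img):
    """Smooth a binary image given as rows of 0/1 ints (1 = black pixel).

    Each output pixel becomes black when more than half of the pixels in
    its 3 x 3 neighbourhood (clipped at the image border) are black.
    Assumption: this is only one of many possible smoothing filters.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            black = 0
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += 1
                        black += img[yy][xx]
            out[y][x] = 1 if 2 * black > total else 0
    return out
```

A filter of this kind removes isolated black pixels and fills one-pixel holes in thick strokes, but it also erases lines that are only one pixel wide, so a practical system would have to choose the filter to suit the stroke width of its images.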

[Mode 5] The image recognition apparatus of Mode 5 is the image recognition apparatus of any one of Modes 1 to 4, wherein the object is a face; the apparatus has a storage unit that stores face recognition dictionary data containing the local direction contribution feature amount of each of a plurality of different faces; and the recognition unit recognizes the face contained in the object image data on the basis of the local direction contribution feature amount calculated by the feature amount calculation unit for that face and the local direction contribution feature amounts contained in the face recognition dictionary data stored in the storage unit. With this configuration, face recognition can be performed quickly and accurately with simple processing on a face image containing many complex line segments.
Here, a face means an object corresponding to the face of a person, an animal, or a robot or the like modeled on them.
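The text here does not specify how the calculated feature amounts are compared with the dictionary. A minimal sketch, assuming nearest-neighbor matching by Euclidean distance and a dictionary that maps a hypothetical label to a feature vector (both assumptions, as is the function name `recognize`):

```python
import math


def recognize(features, dictionary):
    """Return (label, distance) of the dictionary entry closest to features.

    features:   list of floats (e.g. local direction contribution features)
    dictionary: dict mapping a label to a reference feature list
    Assumption: Euclidean distance; the publication may use another measure.
    """
    best_label, best_dist = None, math.inf
    for label, reference in dictionary.items():
        if len(reference) != len(features):
            raise ValueError("feature vectors must have the same length")
        dist = math.dist(features, reference)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist
```

In practice a maximum acceptable distance would probably be added so that a face absent from the dictionary is rejected rather than matched to its nearest entry.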

[Mode 6] The electronic device of Mode 6 comprises the image recognition apparatus of any one of Modes 1 to 5.
With this configuration, an electronic device having an image recognition function can be provided.
Here, the electronic device may be any electronic device, such as a camera, a scanner, a projector, a television, or a printer.

[Mode 7] The image recognition method of Mode 7 comprises: an image input step of inputting object image data containing an object; a binarization step of binarizing each pixel of the object image data input in the image input step using a predetermined luminance threshold; a feature amount calculation step of calculating a local direction contribution feature amount, which is the direction contribution feature amount of the contour lines contained in a predetermined region of the object image data binarized in the binarization step; and a recognition step of recognizing the object contained in the object image data on the basis of the local direction contribution feature amount calculated in the feature amount calculation step.
This provides the same effects as the image recognition apparatus of Mode 1.

[Mode 8] The image recognition program of Mode 8 causes a computer to function as: image input means for inputting object image data containing an object; binarization means for binarizing each pixel of the object image data input by the image input means using a predetermined luminance threshold; feature amount calculation means for calculating a local direction contribution feature amount, which is the direction contribution feature amount of the contour lines contained in a predetermined region of the object image data binarized by the binarization means; and recognition means for recognizing the object contained in the object image data on the basis of the local direction contribution feature amount calculated by the feature amount calculation means.
With this configuration, when the computer reads the program and executes processing according to it, the same operations and effects as those of the image recognition apparatus of Mode 1 are obtained.

[Mode 9] The recording medium of Mode 9 is a computer-readable recording medium on which is recorded an image recognition program that causes a computer to function as: image input means for inputting object image data containing an object; binarization means for binarizing each pixel of the object image data input by the image input means using a predetermined luminance threshold; feature amount calculation means for calculating a local direction contribution feature amount, which is the direction contribution feature amount of the contour lines contained in a predetermined region of the object image data binarized by the binarization means; and recognition means for recognizing the object contained in the object image data on the basis of the local direction contribution feature amount calculated by the feature amount calculation means.
With this configuration, when the computer reads the program from the storage medium and executes processing according to it, the same operations and effects as those of the image recognition apparatus of Mode 1 are obtained.
Here, the storage medium may be any computer-readable storage medium, whether it is read electronically, magnetically, optically, or otherwise: a semiconductor storage medium such as a RAM or ROM; a magnetic storage medium such as an FD or HD; an optically read storage medium such as a CD, CDV, LD, or DVD; or a magnetically stored, optically read storage medium such as an MO.

Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 shows the configuration of an image recognition system 1 according to the present embodiment.
The image recognition system 1 includes a camera 2 that photographs the face of a person 9, an example of an object, and generates image data; a computer 3 serving as an image recognition apparatus that performs image recognition on the image data from the camera 2 and recognizes the person 9 in the captured image; and a display device 4 connected to the computer 3. The camera 2 is a digital camera that captures still images, and the image data of each captured still image is input to the computer 3. A video camera that captures moving images may be used as the camera 2 instead, in which case the individual frames (image data) of the captured video are input to the computer 3. The display device 4 displays the image data subject to recognition, the recognition results, and the like.

FIG. 2 is a block diagram showing the functional configuration of the computer 3.
In this figure, an image input unit 10 is the connection interface with the camera 2, through which the image data from the camera 2 is input. A face area extraction unit (object area extraction unit) 11 extracts a face area 40 (see FIG. 4), the image area in which the face appears, from the image data on the basis of skin color, which serves as the criterion for identifying the face area 40. A storage unit 12 stores various data, such as skin color reference color data 16, which is the skin color information used as the criterion for the face area 40, and face recognition dictionary data 17, as well as various programs, such as an image recognition program 18 that causes the computer 3 to function as an image recognition apparatus.
The computer 3 has a CPU as arithmetic execution means, a ROM as storage means, a RAM that functions as the work area of the CPU, a connection interface with external devices, and so on, and the image recognition program 18 is stored in the ROM. The functions of the units, beginning with the image input unit 10 described above, are realized when the CPU executes the image recognition program 18.

A binarization/normalization unit 13 binarizes each pixel of the face area 40 into a black pixel or a white pixel. It also normalizes the face area 40: it aligns the center of the face area 40 with the center of the mesh region 25 (see FIG. 3) used in calculating the local direction contribution feature amount described later, and enlarges or reduces the face area 40 so that it fits within the mesh region 25. A direction contribution feature amount calculation unit 14 calculates the local direction contribution feature amount, described later, that represents the features of the face area 40. A face recognition unit 15 recognizes the face in the face area 40 on the basis of the local direction contribution feature amount of the face area 40 and the face recognition dictionary data 17. These units are realized when the computer 3 executes the image recognition program 18, but each may of course be implemented as a dedicated hardware circuit instead.
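The normalization performed by the binarization/normalization unit 13 (centering the face area and scaling it to fit the mesh region) can be illustrated with a short sketch. The function name `normalize`, the square target size, the nearest-neighbor resampling, and the preservation of the aspect ratio are all assumptions made for the example; the text only states that the centers are aligned and that the face area is enlarged or reduced to fit.

```python
def normalize(img, size):
    """Scale a binary image (rows of 0/1) to fit a size x size grid, centred.

    Assumptions: nearest-neighbour resampling, aspect ratio preserved,
    and white (0) padding around the scaled image.
    """
    r, q = len(img), len(img[0])            # R rows, Q columns
    scale = size / max(q, r)                # one factor for both axes
    new_w = max(1, round(q * scale))
    new_h = max(1, round(r * scale))
    off_x = (size - new_w) // 2             # offsets that centre the result
    off_y = (size - new_h) // 2
    out = [[0] * size for _ in range(size)]
    for y in range(new_h):
        for x in range(new_w):
            src_y = min(r - 1, int(y / scale))
            src_x = min(q - 1, int(x / scale))
            out[off_y + y][off_x + x] = img[src_y][src_x]
    return out
```

For instance, a 4-pixel-wide, 2-pixel-high image placed into an 8 x 8 grid is doubled in size and occupies the four middle rows, with two white rows above and below.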

As described above, the present embodiment uses the local direction contribution feature amount for face recognition. This feature amount is described in Non-Patent Document 1 cited above, and is explained briefly here using character pattern recognition as an example.
The direction contribution feature amount is one of the features obtained by extracting the direction information of character lines. It indicates how far target pixels (for example, black pixels) continue from a given pixel in each of four directions: horizontal, 45 degrees diagonally right, vertical, and 45 degrees diagonally left. It thereby expresses geometric properties such as the direction and connectivity of character lines.
As shown in FIG. 3(A), the direction contribution dm (m = 1, 2, 3, 4) is expressed by the following equation (1), using the black-pixel run lengths li (i = 1, 2, ..., 8) in each direction, which are determined by extending tentacles in eight directions from a point P1 inside a character line.

In equation (1), d1, d2, d3, and d4 are the direction contribution components for the horizontal direction, the 45-degree right diagonal, the vertical direction, and the 45-degree left diagonal, respectively. Calculating the direction contribution d1 explicitly from equation (1) gives the following equation.
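Equation (1) and the expanded equation for d1 appear as images in the publication and are not reproduced in this text. As a hedged reconstruction, assuming the definition of direction contributivity commonly attributed to Non-Patent Document 1 (the normalization by the root of the summed squares is that assumption, not something stated in the surviving text), they would read as follows, where d_m and l_i correspond to dm and li:

```latex
d_m = \frac{l_m + l_{m+4}}{\sqrt{\sum_{j=1}^{4} \left( l_j + l_{j+4} \right)^2}} \quad (m = 1, 2, 3, 4) \tag{1}

d_1 = \frac{l_1 + l_5}{\sqrt{(l_1 + l_5)^2 + (l_2 + l_6)^2 + (l_3 + l_7)^2 + (l_4 + l_8)^2}}
```

Here l_m and l_{m+4} are the run lengths in the two opposite directions along the same line, so under this form the four components satisfy d_1^2 + d_2^2 + d_3^2 + d_4^2 = 1 whenever at least one run length is nonzero.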

In the example of FIG. 3(A), black pixels are connected over a relatively long run in the vertical direction at the point P1, so the vertical direction contribution d3 takes a large value. The direction contribution dm thus makes it possible to evaluate quantitatively both the direction of character lines and how they are connected.
Two kinds of direction contribution feature are known: the global direction contribution density (GDCD: Global Direction Contributivity Density) and the local direction contribution density (LDCD: Local Direction Contributivity Density). The global feature captures differences in the direction and connectivity of character lines across the character as a whole. The local feature reflects, in addition to the line density within a given local region, differences in the direction and connectivity of the character lines.

As shown in FIG. 3(B), the local direction contribution feature is obtained by dividing the character pattern into m × m coarse mesh regions 25, calculating the direction contribution of every black pixel in each mesh region 25, and averaging the results for each mesh region 25 separately for the four directions. Four feature amounts are thus obtained per mesh region 25, and m × m × 4 feature amounts for the whole character.
In this way, within the partition formed by dividing the character pattern into m × m mesh regions 25, which captures its global features, the local direction contribution feature derives the local features of the character pattern from the contribution of connected line segments. It can be regarded as a fusion of global and local features, and it therefore reflects the outline of the character pattern, the way the character line segments connect, and their detailed shape.
In the present embodiment, this local direction contribution feature is used as the direction contribution feature, and the direction contribution feature amount calculation unit 14 shown in FIG. 2 calculates the local direction contribution feature amount of the face area 40, treating the face area 40 as the local region.
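The calculation described above (run lengths in eight directions, four direction contributions per black pixel, then a per-mesh average) can be sketched as follows. This is an illustration under stated assumptions rather than the publication's implementation: the function names are hypothetical, the run length is assumed to exclude the starting pixel, runs are assumed to continue across mesh boundaries, and the normalization follows the commonly cited root-of-summed-squares form because the original equation is not reproduced in this text.

```python
import math

# Eight tentacle directions as (dx, dy) with y pointing down.
# Index i and index i + 4 are opposite directions on the same line:
# 0/4 horizontal, 1/5 right diagonal, 2/6 vertical, 3/7 left diagonal.
DIRS = [(1, 0), (1, -1), (0, -1), (-1, -1),
        (-1, 0), (-1, 1), (0, 1), (1, 1)]


def run_length(img, x, y, dx, dy):
    """Count consecutive black pixels next to (x, y) in direction (dx, dy)."""
    h, w = len(img), len(img[0])
    n = 0
    x, y = x + dx, y + dy
    while 0 <= x < w and 0 <= y < h and img[y][x] == 1:
        n += 1
        x, y = x + dx, y + dy
    return n


def direction_contribution(img, x, y):
    """Return [d1, d2, d3, d4] for the black pixel at (x, y)."""
    runs = [run_length(img, x, y, dx, dy) for dx, dy in DIRS]
    sums = [runs[m] + runs[m + 4] for m in range(4)]
    norm = math.sqrt(sum(s * s for s in sums))
    if norm == 0:                      # isolated pixel: no direction at all
        return [0.0] * 4
    return [s / norm for s in sums]


def ldcd(img, m):
    """Return m * m * 4 features: per-mesh averages of the contributions."""
    h, w = len(img), len(img[0])
    features = []
    for cy in range(m):
        for cx in range(m):
            y0, y1 = cy * h // m, (cy + 1) * h // m
            x0, x1 = cx * w // m, (cx + 1) * w // m
            acc = [0.0] * 4
            count = 0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    if img[y][x] == 1:
                        d = direction_contribution(img, x, y)
                        acc = [a + b for a, b in zip(acc, d)]
                        count += 1
            # A mesh region without black pixels contributes four zeros.
            features.extend(a / count if count else 0.0 for a in acc)
    return features
```

For an image containing a single vertical line, every black pixel has only vertical runs, so each mesh region that the line passes through yields the features 0, 0, 1, 0.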

To calculate the local direction contribution feature amount, however, the local region must be a binary image in which the line segments are extracted clearly. A photographed face image is generally a multi-valued image, such as a color or grayscale image, and contains many indistinct lines caused by wrinkles and the like, so it is not suited to calculating the local direction contribution feature amount as it is.
The binarization/normalization unit 13 therefore binarizes each pixel of the face area 40 extracted by the face area extraction unit 11 to the degree needed for the local direction contribution feature amount to be calculable.

Specifically, when the face area 40 extracted from the image data is a grayscale image, binarizing each pixel with a predetermined luminance threshold (for example, half the maximum luminance value) turns each pixel into a black pixel or a white pixel, as shown in FIG. 4(A). In this state, however, many indistinct or fine lines usually remain because of facial wrinkles, the cheeks, and so on.
Suppose the face area 40 consists of Q × R pixels arranged in a matrix, with the row direction as the X axis and the column direction as the Y axis. The X-axis projection dx(q) (1 ≤ q ≤ Q) is the distribution along the X axis of the number of black pixels summed in each column, and the Y-axis projection dy(r) (1 ≤ r ≤ R) is the distribution along the Y axis of the number of black pixels summed in each row.

As shown in FIG. 4(A), when the face area 40 contains many line segments, black pixels are spread over a wide part of the face area 40, so the X-axis projection dx(q) and the Y-axis projection dy(r) form gentle curves overall, with little difference between peaks and troughs.
To calculate the local direction contribution feature amount of the face area 40, on the other hand, the face area 40 must be simplified until, as shown in FIG. 4(B), the line segments it contains are essentially only the contour lines of facial parts such as the left and right eyebrows 41 and 42, the left and right eyes 43 and 44, the nose 45, and the mouth 46. In a face area 40 simplified in this way, the X-axis projection dx(q) and the Y-axis projection dy(r) form curves with large differences between peaks and troughs and several sharp peaks.

Quantifying the state (complexity) of the distributions of the X-axis projection dx(q) and the Y-axis projection dy(r) therefore makes it possible to determine whether binarization has simplified the line segments in the face area 40 enough for the local direction contribution feature amount to be calculated. The present embodiment uses entropy for this quantification. The projection entropy Ex of the X-axis projection dx(q) and the projection entropy Ey of the Y-axis projection dy(r) are obtained by equations (3) and (4), respectively, where Nx and Ny are the totals of the X-axis projection dx(q) and the Y-axis projection dy(r), respectively.

In the example shown in FIG. 4A above, the distributions of the X-axis projection dx(q) and the Y-axis projection dy(r) are smooth, so the projection entropies Ex and Ey obtained from Equations (3) and (4) are small. In the example shown in FIG. 4B, the distributions have many sharply shaped peaks, so the projection entropies Ex and Ey are large.
That is, the higher the luminance threshold used to binarize the image of the face area 40, the fewer black pixels the face area 40 contains and the simpler the image (line segments) becomes, so the projection entropies Ex and Ey increase. Therefore, by raising the binarization luminance threshold and binarizing the face area 40 until the projection entropies Ex and Ey exceed the projection entropy threshold Eth, which indicates that the image has been simplified enough for the local direction contribution feature quantity to be calculated, an image of the face area 40 suitable for that calculation is obtained. Such an image is one in which, as shown in FIG. 4B, the face area 40 contains roughly only the contour line segments of facial parts such as the left and right eyebrows 41 and 42, the left and right eyes 43 and 44, the nose 45, and the mouth 46. Simplifying the face area 40 any further deforms or erases the contour lines of the facial parts, so information needed for face recognition is lost.
The projection entropy threshold Eth depends on the size of the mesh region 25, the resolution of the image, and the like, and is therefore determined experimentally in advance. A projection entropy threshold Eth is set separately for the X-axis projection entropy Ex and the Y-axis projection entropy Ey.

Next, the operation of this embodiment will be described.
FIGS. 5 and 6 are flowcharts of the image recognition program 18 executed by the computer 3. As shown in FIG. 5, when image data showing a person 9 is input from the camera 2 to the image input unit 10 (step S1), the face area extraction unit 11 extracts a face area 40 from the image area of the image data (step S2). The face area 40 is extracted using region separation, which separates the area of each object shown in the image. More specifically, as shown in FIG. 7A, the image 20 to be recognized generally shows a background along with the person 9, and region separation separates the area of each object shown in the image 20, including the objects in the background.

For example, in an image 20 in which a signboard 21, a mountain 22, and the sky 23 appear behind the person 9, region separation divides the image area, as shown in FIG. 7B, into a separation area 30 for the person 9, a separation area 31 for the signboard 21, separation areas 32A to 32C for the mountain 22, and a separation area 33 for the sky 23.
The face area extraction unit 11 then compares the color information of the separation areas 30 to 33 with the skin color reference color data 16, identifies the area that contains a large amount of skin color, and thereby identifies the separation area 30 containing the person 9, as shown in FIG. 7C. Further, based on the skin color reference color data 16 and the contour lines contained in the separation area 30, the face area extraction unit 11 extracts the face area 40 from the separation area 30, as shown in FIG. 7D.

Once the face area 40 has been extracted in this way, the binarization/normalization unit 13 converts the image of the face area 40 to a grayscale image if it is a color image, and binarizes each pixel of the grayscale face area 40 with a predetermined luminance threshold (for example, half the maximum luminance) (step S3). As a result, a binarized image in which the face area 40 contains a relatively large number of line segments is obtained, as shown for example in FIG. 8A.
The binarization/normalization unit 13 then calculates the projection entropies Ex and Ey for the binarized face area 40 using Equations (3) and (4) above (step S4), and determines whether each of the projection entropies Ex and Ey is greater than or equal to the projection entropy threshold Eth set for it (step S5).

If the projection entropies Ex and Ey are smaller than the projection entropy threshold Eth (step S5: NO), the face area 40 has not yet been simplified enough for the local direction contribution feature quantity to be calculated. The binarization/normalization unit 13 therefore raises the luminance threshold by adding a predetermined value to the current luminance threshold (step S6), so that binarization reduces the number of black pixels (the number of line segments) in the face area 40.

Here, when the shadows in the face area 40 are strong and many wrinkles, spots, and the like appear in it, for example because the face was photographed under dark lighting, the projection entropies Ex and Ey of the face area 40 are unlikely to exceed the projection entropy threshold Eth even if binarization is performed with a sufficiently high luminance threshold. Moreover, binarizing with a luminance threshold high enough for the projection entropies Ex and Ey to exceed the projection entropy threshold Eth may remove more of the line segments of the facial parts from the face area 40 than necessary, so that recognition can no longer be performed correctly.

Therefore, after raising the luminance threshold in step S6, the binarization/normalization unit 13 determines whether the luminance threshold has exceeded a limit threshold beyond which the face area 40 would be simplified to the point of being unrecognizable (step S7). If the luminance threshold is smaller than the limit threshold (step S7: NO), binarizing with this luminance threshold will not oversimplify the face area 40, so the binarization/normalization unit 13 returns the processing to step S3 and performs binarization again. As a result, the image of the face area 40 is simplified compared with FIG. 8A, as shown for example in FIG. 8B.

On the other hand, if the luminance threshold is greater than or equal to the limit threshold (step S7: YES), binarization cannot be performed with this luminance threshold.
In that case, the binarization/normalization unit 13 applies filter processing that smooths line segments to the entire face area 40 to simplify the image and obtain an image from which the local direction contribution feature quantity can be calculated (step S8). For this filter processing, for example, a Gaussian filter whose standard deviation is about the diameter of a mesh region 25 can be used.
In this embodiment, the filter processing that smooths the line segments of the face area 40 is performed when the luminance threshold reaches or exceeds the limit threshold, but the filter processing may instead be performed in advance, before the binarization in step S3. In that case, it is desirable to use a smaller standard deviation for the Gaussian filter and so weaken the smoothing effect, so that necessary information contained in the face area 40 is not lost.
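The control flow of steps S3 to S8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment. The names `binarize`, `simplify_by_threshold`, `entropy_fn`, and `smooth_fn` are hypothetical: `entropy_fn` stands in for Equations (3) and (4), which are not reproduced here, and `smooth_fn` stands in for the smoothing filter of step S8. The polarity in `binarize` (a pixel counts as black when its value is at or above the threshold) is chosen only so that raising the threshold reduces the number of black pixels, as the description states; this passage does not specify the actual rule. Applying the filter to the most recent binarized image is likewise an assumption.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]


def binarize(gray: Image, threshold: int) -> Image:
    # Assumed polarity: 1 (black) when the value is at or above the threshold,
    # so that a higher threshold leaves fewer black pixels.
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


def simplify_by_threshold(
    gray: Image,
    entropy_fn: Callable[[Image], Tuple[float, float]],  # placeholder for Eq. (3), (4)
    eth_x: float,
    eth_y: float,
    start: int,
    step: int,
    limit: int,
    smooth_fn: Callable[[Image], Image],  # placeholder for the step S8 filter
) -> Tuple[Image, bool]:
    """Return (image, used_filter) following the S3 to S8 loop."""
    if step <= 0:
        raise ValueError("step must be positive")
    threshold = start
    while True:
        binary = binarize(gray, threshold)      # step S3
        ex, ey = entropy_fn(binary)             # step S4
        if ex >= eth_x and ey >= eth_y:         # step S5: YES
            return binary, False
        threshold += step                       # step S6
        if threshold >= limit:                  # step S7: YES
            return smooth_fn(binary), True      # step S8
```

Passing the entropy and filter functions as parameters keeps the loop itself independent of the particular entropy formula and filter, both of which the description leaves to Equations (3) and (4) and to the choice of Gaussian filter.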

When the projection entropies Ex and Ey are greater than or equal to the projection entropy threshold Eth (step S5: YES), or after the filter processing in step S8 has been executed, the image (line segments) of the face area 40 has been simplified enough for the local direction contribution feature quantity to be calculated. For example, as shown in FIG. 8C, roughly only facial parts such as the eyebrows, eyes, nose, and mouth remain in the face area 40, and these facial parts are represented by relatively few line segments.
After a face area 40 from which the local direction contribution feature quantity can be calculated has been obtained in this way, the binarization/normalization unit 13 removes noise from the face area 40 by removing points that do not belong to any facial part, that is, isolated points whose number of connected points is less than or equal to a predetermined number (step S9). It then performs normalization, which fits the face area 40 to the m × m mesh regions 25 used to obtain the local direction contribution feature quantity.
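One possible reading of the isolated point removal in step S9 is sketched below. It assumes that the "number of connected points" of a black pixel is the number of black pixels among its eight neighbors; the description does not define the measure, and a connected-component size would be another plausible interpretation. The function name `remove_isolated_points` and the parameter `max_neighbors` are hypothetical.

```python
from typing import List


def remove_isolated_points(binary: List[List[int]], max_neighbors: int = 0) -> List[List[int]]:
    """Clear black pixels (1) that have at most `max_neighbors` black 8-neighbors.

    Neighbor counts are taken from the input image, so removing one pixel
    does not affect the decision for another.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    result = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width and binary[ny][nx]:
                        neighbors += 1
            if neighbors <= max_neighbors:
                result[y][x] = 0
    return result
```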

Specifically, as shown in FIG. 6, the binarization/normalization unit 13 measures the center of gravity OG and the size (the number of pixels vertically and horizontally) of the face area 40 (step S10). Then, as shown in FIG. 9B, it aligns the center of gravity OG of the face area 40 with the center OM of the m × m mesh regions 25 and enlarges or reduces the face area 40 so that its major axis (the vertical direction in the example shown in FIG. 9A) fits within the m × m mesh regions 25 (step S11). This completes the normalization.
The face area 40 is enlarged or reduced using an affine transformation that keeps its aspect ratio constant. When the face in the face area 40 is tilted or is facing a direction other than the front, the image is also corrected, before the enlargement or reduction, to remove the tilt and make the face front-facing. The orientation of the face can be detected, for example, from the positions and balance of facial parts such as the eyes, nose, and mouth when the face area is extracted.
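The centroid alignment and uniform scaling of steps S10 and S11 can be illustrated with the following sketch. It is a simplification under stated assumptions: the face area is a binary image whose black pixels (1) define the centroid and extent, the whole m × m mesh grid is a square of `side` pixels, and pixels are forward-mapped with rounding. The name `normalize_to_grid` is hypothetical, and tilt and pose correction are omitted.

```python
from typing import List


def normalize_to_grid(binary: List[List[int]], side: int) -> List[List[int]]:
    """Map black pixels into a side x side grid, centroid at the grid center.

    The same scale factor is used on both axes (aspect ratio preserved), chosen
    so that the longer side of the black-pixel extent spans `side` pixels.
    """
    out = [[0] * side for _ in range(side)]
    points = [(x, y) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    if not points:
        return out
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    gx = sum(xs) / len(points)               # center of gravity (step S10)
    gy = sum(ys) / len(points)
    width = max(xs) - min(xs) + 1            # size in pixels (step S10)
    height = max(ys) - min(ys) + 1
    scale = side / max(width, height)        # uniform scale (step S11)
    center = (side - 1) / 2
    for x, y in points:
        nx = round(center + (x - gx) * scale)
        ny = round(center + (y - gy) * scale)
        if 0 <= nx < side and 0 <= ny < side:  # drop pixels that fall outside
            out[ny][nx] = 1
    return out
```

Forward mapping leaves gaps between pixels when the image is enlarged, so a practical implementation would more likely apply the inverse transform with interpolation; the sketch only shows how the centroid and scale factor are derived.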

The size of the mesh region 25 is set as appropriate according to the processing capability of the device that executes the image recognition program 18 and the target recognition accuracy, and is usually set to about 100 × 100 pixels or more. Increasing the number of pixels in one mesh region 25 improves accuracy, although it also increases the amount of computation. The mesh region 25 is preferably square but may be rectangular.
The removal of isolated points in step S9 may instead be performed after the normalization in step S11.

When the face area 40 has been binarized and normalized in this way, the direction contribution feature quantity calculation unit 14 calculates the local direction contribution feature quantity for each mesh region 25. Specifically, as shown in FIG. 6, the direction contribution feature quantity calculation unit 14 initializes the mesh number M (where 1 ≤ M ≤ m × m), which designates a mesh region 25, to 1 (step S12) and calculates the local direction contribution feature quantity for the M-th mesh region 25 (step S13). It then determines whether the current mesh number M is equal to m × m, that is, whether the last mesh region 25 has been reached (step S14). If the mesh number M is not equal to m × m (step S14: NO), it increments the mesh number M by 1 (step S15) and returns the processing to step S13 to calculate the local direction contribution feature quantity for the next mesh region 25.
If, on the other hand, the mesh number M is equal to m × m (step S14: YES), the local direction contribution feature quantities have been calculated for all the mesh regions 25. Based on these local direction contribution feature quantities and the face recognition dictionary data 17, the face recognition unit 15 then performs face recognition, determining whose face appears in the face area 40 or whether the face area 40 matches any of the faces registered in the face recognition dictionary data 17 (step S16).
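The per-mesh loop of steps S12 to S15 can be sketched as follows. The sketch assumes a square normalized image whose side is evenly divisible by m and a row-major mesh numbering starting at 1; neither is stated in this passage. The names `features_per_mesh` and `feature_fn` are hypothetical, and `feature_fn` is only a placeholder for the local direction contribution feature calculation, which is defined elsewhere in the description.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def features_per_mesh(
    image: List[List[int]],
    m: int,
    feature_fn: Callable[[List[List[int]]], T],  # placeholder for the step S13 calculation
) -> List[T]:
    """Apply feature_fn to each of the m x m mesh regions in mesh-number order."""
    side = len(image)
    if m <= 0 or side == 0 or side % m != 0 or any(len(row) != side for row in image):
        raise ValueError("image must be square with a side divisible by m")
    cell = side // m
    features: List[T] = []
    mesh_number = 1                                      # step S12
    while True:
        row_idx, col_idx = divmod(mesh_number - 1, m)    # assumed row-major numbering
        block = [
            row[col_idx * cell:(col_idx + 1) * cell]
            for row in image[row_idx * cell:(row_idx + 1) * cell]
        ]
        features.append(feature_fn(block))               # step S13
        if mesh_number == m * m:                         # step S14: YES
            return features
        mesh_number += 1                                 # step S15
```

For instance, passing a function that counts black pixels as `feature_fn` returns one count per mesh region; the resulting list would then be compared against the dictionary data in step S16.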

As described above, according to this embodiment, the binarization/normalization unit 13 binarizes the face area 40 while changing the luminance threshold based on the projection entropies Ex and Ey of the face area 40, simplifying the face area 40 until the local direction contribution feature quantity can be calculated. This makes it possible to apply an algorithm based on direction contribution features to face recognition even for a face area 40 that contains many complex line segments, so that a person's face can be recognized quickly and accurately with simple processing.
As a result, highly accurate face recognition can be achieved even on small devices with relatively low processing capability, such as digital cameras, PDAs, and mobile phones.

Also according to this embodiment, the binarization/normalization unit 13 applies filter processing that smooths line segments before or after binarizing the face area 40. The local direction contribution feature can therefore be calculated stably even for a face area 40 that was photographed under poor conditions and cannot be simplified sufficiently by binarization alone.

The embodiment described above merely illustrates one aspect of the present invention and can be modified and applied in any manner within the scope of the present invention.
For example, in the embodiment described above, each of the projection entropies Ex and Ey, which are the entropies of the vertical and horizontal projections of the face area 40, is compared with its respective projection entropy threshold Eth to determine whether simplification of the face area 40 is complete. However, a simplified determination that uses only one of the projection entropies Ex and Ey may be used instead.
In addition, although the embodiment described above illustrates a case where the computer 3 serving as the image recognition device recognizes a face as the object, the present invention is not limited to this and can be applied to the recognition of any object, in particular an object that contains many complex line segments.
Further, the image recognition device according to the present invention can be incorporated in any electronic equipment, such as a camera, scanner, projector, television, or printer.

FIG. 1 is a diagram showing the configuration of an image recognition system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the functional configuration of the computer.
FIG. 3 is a diagram for explaining the direction contribution.
FIG. 4 is a diagram for explaining simplification of the face area.
FIG. 5 is a flowchart of the image recognition processing.
FIG. 6 is a flowchart of the image recognition processing.
FIG. 7 is a diagram for explaining region separation.
FIG. 8 is a diagram showing an example of simplification of the face area by binarization.
FIG. 9 is a diagram showing normalization of the face area.

Explanation of symbols

1 ... image recognition system, 2 ... camera, 3 ... computer (image recognition device), 10 ... image input unit, 11 ... face area extraction unit, 12 ... storage unit, 13 ... binarization/normalization unit, 14 ... direction contribution feature quantity calculation unit, 15 ... face recognition unit, 17 ... face recognition dictionary data, 18 ... image recognition program, 25 ... mesh region, 40 ... face area, dx ... X-axis projection, dy ... Y-axis projection, Eth ... projection entropy threshold, Ex, Ey ... projection entropies.

Claims

An image input unit for inputting object image data including the object;
A binarization unit that binarizes each pixel constituting the object image data input by the image input unit using a predetermined luminance threshold;
A feature amount calculation unit that calculates a local direction contribution feature amount that is a direction contribution feature amount of a contour line included in a predetermined region of the object image data binarized by the binarization unit;
An image recognition apparatus comprising: a recognition unit that recognizes an object included in the object image data based on the local direction contribution feature amount calculated by the feature amount calculation unit.

The image recognition apparatus according to claim 1,
wherein the binarization unit changes the luminance threshold according to the distribution of a projection formed by accumulating effective pixels for each column and/or each row of the pixels constituting the binarized object image data.

The image recognition apparatus according to claim 2,
wherein the binarization unit calculates an entropy that quantifies the state of the distribution of the projection and changes the luminance threshold according to the calculated entropy.

In the image recognition device according to any one of claims 1 to 3,
The binarization unit performs a filter process for smoothing a line segment before or after binarization of the object image data,
The feature amount calculation unit calculates the local direction contribution feature amount for object image data that has been binarized and filtered by the binarization unit.

The image recognition device according to claim 1,
The object is a face;
A storage unit for storing face recognition dictionary data including the local direction contribution feature amount for each of a plurality of different faces;
The recognition unit includes a local direction contribution feature amount calculated by the feature amount calculation unit for a face included in the object image data, and a local direction included in face recognition dictionary data stored in the storage unit. An image recognition apparatus characterized by recognizing a face included in the object image data based on a contribution feature amount.

An electronic device comprising the image recognition device according to claim 1.

An image recognition method comprising:
an image input step of inputting object image data including an object;
a binarization step of binarizing each pixel constituting the object image data input in the image input step using a predetermined luminance threshold;
a feature amount calculation step of calculating a local direction contribution feature amount that is a direction contribution feature amount of a contour line included in a predetermined region of the object image data binarized in the binarization step; and
a recognition step of recognizing an object included in the object image data based on the local direction contribution feature amount calculated in the feature amount calculation step.

An image recognition program that causes a computer to function as:
image input means for inputting object image data including an object;
binarization means for binarizing each pixel constituting the object image data input by the image input means using a predetermined luminance threshold;
feature amount calculation means for calculating a local direction contribution feature amount that is a direction contribution feature amount of a contour line included in a predetermined region of the object image data binarized by the binarization means; and
recognition means for recognizing an object included in the object image data based on the local direction contribution feature amount calculated by the feature amount calculation means.