JP5653003B2

JP5653003B2 - Object identification device and object identification method

Info

Publication number: JP5653003B2
Application number: JP2009105662A
Authority: JP
Inventors: 佐藤 博 (Hiroshi Sato)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2009-04-23
Filing date: 2009-04-23
Publication date: 2015-01-14
Anticipated expiration: 2029-04-23
Also published as: JP2010257158A

Description

The present invention relates to an object identification device and an object identification method.

Face identification, which identifies an individual's face, is one example of a technique for determining that an object appearing as a subject in one image is the same as an object appearing as a subject in another image. In this specification, identification of an object means determining differences between individual objects (for example, distinguishing one person from another). Detection of an object, by contrast, means determining that something falls within a given category without distinguishing individuals (for example, detecting faces without telling persons apart).
One face identification technique is the method described in Non-Patent Document 1. This algorithm replaces the problem of identifying individuals by their faces with a two-class identification problem over a feature class called the difference face, which makes it possible to register faces and perform additional learning in real time.

For example, face identification using the well-known support vector machine (SVM) requires n SVM classifiers to identify the faces of n persons, each of which separates one registered person's face from all other faces. Registering a person's face requires training an SVM. SVM training needs a large amount of face data for the person to be registered, for the persons already registered, and for other persons, and it takes a very long time to compute, so the usual approach has been to perform the computation in advance.
According to the method of Non-Patent Document 1, however, additional learning can be made substantially unnecessary by replacing the problem of identifying individuals with the following two-class identification problem. The two classes are:
- intra-personal class: the class of variation features, such as illumination changes and changes in expression or orientation, between images of the same person
- extra-personal class: the class of variation features between images of different persons
Assuming that the distributions of these two classes are constant regardless of the particular individual, the classifier is built by reducing the problem of identifying individual faces to the problem of discriminating between the two classes. A large number of images are prepared in advance, and a classifier is trained to discriminate the class of variation features between images of the same person from the class of variation features between images of different persons. For a new registrant, only the face image (or the result of extracting the necessary features) needs to be stored. At identification time, a difference feature is extracted from the two images, and the classifier determines whether they show the same person or different persons. As a result, no training of an SVM or the like is needed when registering an individual's face, and registration can be performed in real time.
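The two-class formulation above can be sketched as follows. This is a minimal illustration, not the method of Non-Patent Document 1 itself: the absolute difference used as the difference feature, the linear scoring function, and the names `Gallery`, `weights`, and `bias` are all assumptions made for the example. What it shows is that registration only stores a feature vector, while one pre-trained classifier serves every registered person.

```python
from typing import Dict, List, Optional


def difference_feature(a: List[float], b: List[float]) -> List[float]:
    """Element-wise absolute difference between two face feature vectors."""
    return [abs(x - y) for x, y in zip(a, b)]


def same_person_score(diff: List[float], weights: List[float], bias: float) -> float:
    """Hypothetical pre-trained linear classifier; a score > 0 means intra-personal."""
    return sum(w * d for w, d in zip(weights, diff)) + bias


class Gallery:
    """Registration stores features only; no per-person training takes place."""

    def __init__(self, weights: List[float], bias: float) -> None:
        self.weights = weights
        self.bias = bias
        self.entries: Dict[str, List[float]] = {}

    def register(self, name: str, feature: List[float]) -> None:
        self.entries[name] = list(feature)

    def identify(self, probe: List[float]) -> Optional[str]:
        """Return the best-scoring registered name, or None if no score exceeds 0."""
        best_name, best_score = None, 0.0
        for name, feature in self.entries.items():
            diff = difference_feature(feature, probe)
            score = same_person_score(diff, self.weights, self.bias)
            if score > best_score:
                best_name, best_score = name, score
        return best_name
```

In this sketch, adding a person is a single `register` call, which corresponds to the real-time registration described above.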

In devices and methods that identify objects (more specifically, persons' faces) as described above, one factor that degrades identification performance is variation between the registration image and the authentication image. That is, variation in the object to be identified (the person's face), and more specifically variation due to illumination conditions, orientation and posture, occlusion by other objects, facial expression, and so on. When such variation becomes large, identification performance drops significantly.
To address this problem, Patent Document 1 performs pattern matching on each of a plurality of partial regions, removes outliers from the results, and integrates the matching degrees of the partial regions, thereby securing robustness against variation.
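As an illustration of the kind of integration attributed to Patent Document 1, the sketch below drops the lowest per-region matching scores and averages the rest. The trimming rule and the parameter `num_outliers` are assumptions for the example; the text above only states that outliers are removed and the matching degrees are integrated.

```python
from typing import List


def integrate_match_scores(scores: List[float], num_outliers: int = 1) -> float:
    """Discard the lowest `num_outliers` scores and return the mean of the rest.

    At least one score is always kept, so a short list never becomes empty.
    """
    if not scores:
        raise ValueError("at least one partial-region score is required")
    drop = min(num_outliers, len(scores) - 1)
    kept = sorted(scores)[drop:]
    return sum(kept) / len(kept)
```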

Patent Document 1: JP 2003-323622 A

Non-Patent Document 1: Baback Moghaddam, "Beyond Eigenfaces: Probabilistic Matching for Face Recognition" (M.I.T. Media Laboratory Perceptual Computing Section Technical Report No. 433); "Probabilistic Visual Learning for Object Representation" (IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, July 1997)

However, simply removing outliers from the similarities of the partial regions, or taking their weighted average or the like, is considered to leave room for improvement in performance. For example, when the partial regions are set incorrectly, and in particular when a partial region is set outside the range of the input image, such processing alone is unlikely to correct the error sufficiently. To maintain identification performance for objects that vary as much as human faces, and in environments with widely varying shooting conditions, it is considered effective to incorporate processing that can absorb, to some extent, errors in the setting of the partial regions.
Assuming application to digital cameras, web cameras, and the like, it is desirable that identification performance not deteriorate even when the shooting conditions and the variation of the object (size, orientation, expression, and so on) are large.

The present invention has been made in view of these problems, and its object is to improve robustness and identification performance against variations in illumination, size, orientation, and the like.

Accordingly, the present invention is an object identification device comprising: input means for inputting an input image that includes an object; acquisition means for acquiring a registered image that includes an object; first partial region setting means for performing an affine transformation on the input image and setting a partial region in the transformed image; first determination means for determining whether at least part of the partial region set in the affine-transformed image protrudes from the region of the input image before the affine transformation; first setting means for setting a feature vector from the partial region when the partial region is determined not to protrude from the region of the input image, and, when the partial region is determined to protrude from the region of the input image, setting random numbers for the part of the partial region that protrudes from the region of the input image and setting a feature vector from the region made up of the protruding part, to which the random numbers have been set, and the part that does not protrude from the input image; and identification means for calculating a correlation between a feature vector of the registered image and the feature vector of the partial region set by the first setting means and identifying the object based on the calculated result.
The present invention may also be embodied as an object identification method.
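The final step of the summary, correlating registered and input feature vectors and deciding from the result, might look like the sketch below. The cosine form of the correlation, the plain average over partial regions, and the `threshold` value are assumptions for illustration; the text above only requires that a correlation be computed and the object identified from it.

```python
import math
from typing import List, Sequence, Tuple


def normalized_correlation(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def identify(registered: List[Sequence[float]],
             probe: List[Sequence[float]],
             threshold: float = 0.9) -> Tuple[bool, float]:
    """Compare per-region feature vectors and return (is_same_object, mean_score)."""
    if not registered or len(registered) != len(probe):
        raise ValueError("both inputs need the same, non-zero number of regions")
    scores = [normalized_correlation(r, p) for r, p in zip(registered, probe)]
    mean_score = sum(scores) / len(scores)
    return mean_score >= threshold, mean_score
```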

According to the present invention, robustness and identification performance against variations in illumination, size, orientation, and the like can be improved.

A diagram showing an example of the hardware configuration of the object identification device.
A flowchart showing an example of the overall processing of the object identification device.
A diagram showing an example of the configuration of the object registration unit.
A diagram showing an example of the configuration of the object dictionary data generation unit.
A flowchart showing an example of the processing performed by the feature vector extraction unit.
A diagram showing an example of the object identification unit.
A flowchart showing an example of the identification processing.
A diagram showing an example of the configuration of the object identification data generation unit.
A diagram showing an example of the configuration of the object identification calculation unit.
A flowchart showing an example of the object identification calculation processing.
A flowchart showing an example of the learning processing for partial regions.
A diagram showing an example of the hardware configuration of the object identification device.
A diagram showing an example of the configuration of the object registration unit.
A flowchart showing an example of the processing in the feature vector extraction unit.
A diagram showing an example of the configuration of the object identification unit.
A schematic diagram of an object classifier built as a tree structure of weak classifiers.

Embodiments of the present invention are described below with reference to the drawings.

≪Embodiment 1≫
A first embodiment of the present invention is described in detail below with reference to the drawings.
FIG. 1 is a diagram (part 1) showing an example of the hardware configuration of the object identification device 100. As shown in FIG. 1, the object identification device 100 includes an imaging optical system 1, an imaging unit 2, an imaging control unit 3, an image recording unit 4, an object registration unit 5, and an object identification unit 6. The object identification device 100 also includes an external output unit 7 that outputs object identification results, and a bus 8 for control and data connections among the components.
The object registration unit 5 and the object identification unit 6 may each typically be a dedicated circuit (ASIC) or a processor (a reconfigurable processor, DSP, CPU, or the like). Alternatively, the object registration unit 5 and the object identification unit 6 may exist as programs executed within a single dedicated circuit or general-purpose circuit (a PC CPU).

The imaging optical system 1 consists of an optical lens with a zoom mechanism. The imaging optical system 1 may also include a drive mechanism for the pan and tilt axes.
A CCD or CMOS image sensor is typically used as the image sensor of the imaging unit 2, and a predetermined video signal (for example, a signal obtained by subsampling or block readout) is output as image data in response to a readout control signal from a sensor drive circuit (not shown).
The imaging control unit 3 controls the timing at which shooting actually takes place, based on instructions from the photographer (angle-of-view adjustment, shutter press, and so on) and on information from the object registration unit 5 or the object identification unit 6. The imaging control unit 3 may include a controller for automatic exposure (AE) and autofocus (AF).
The image recording unit 4 consists of a semiconductor memory or the like, holds the image data transferred from the imaging unit 2, and transfers the image data at a predetermined timing in response to requests from the object registration unit 5 and the object identification unit 6.

The object registration unit 5 extracts information on the object to be identified from the image data, and records and holds it. A more detailed configuration of the object registration unit 5 and the specifics of the processing it performs are described later.
The object identification unit 6 identifies objects in the image data based on the image data and the data acquired from the object registration unit 5. The specific configuration of the object identification unit 6 and the details of its processing are described later.
The external output unit 7 is typically a monitor such as a CRT or TFT liquid crystal display. It displays the image data acquired from the imaging unit 2 and the image recording unit 4, or superimposes the output of the object registration unit 5 and the object identification unit 6 on the image data. The external output unit 7 may also output the results of the object registration unit 5 and the object identification unit 6 as electronic data to an external memory or the like.
The connection bus 8 is a bus for control and data connections among the above components.

<Overall flow>
FIG. 2 is a flowchart showing an example of the overall processing of the object identification device 100. The actual processing by which the object identification device 100 identifies an object from an image is described with reference to FIG. 2. The following describes the case where the object to be identified is a person's face, but the present embodiment is not limited to this.
First, the object identification unit 6 acquires image data from the image recording unit 4 (image data input) (S00). Next, the object identification unit 6 performs human face detection on the acquired image data (S01). A known technique may be used to detect human faces in the image. The object identification unit 6 can use, for example, the techniques proposed in Japanese Patent No. 3078166 or Japanese Patent Laid-Open No. 2002-8032.
After the detection processing for the target object, a person's face, if a human face is present in the image (Yes in S02), the object identification unit 6 performs object identification processing, that is, identification of the individual. If no human face is present in the image (No in S02), the object identification unit 6 ends the processing shown in FIG. 2. The specifics of the object identification processing (S03) are described later.

From the result of the object identification processing, the object identification unit 6 determines whether any face corresponds to a registered person (S04). If the face detected in (S01) belongs to one of the registered persons (Yes in S04), the object identification unit 6 proceeds to (S07). If the detected face matches none of the registered persons (No in S04), the object identification unit 6 determines whether to register that person (S05). This may be set in advance, or the user may decide on the spot, for example through an external interface or a GUI. If registration is chosen (Yes in S05), the object identification unit 6 performs the object (person's face) registration processing described later (S06). If registration is not performed (No in S05), the object identification unit 6 continues processing as is. After the object registration processing in (S06), or when registration is not performed in (S05), the object identification unit 6 determines whether all detected objects have been processed (S07). If unprocessed objects remain (No in S07), the object identification unit 6 returns to (S03). When all detected objects have been processed (Yes in S07), the object identification unit 6 outputs the results of the series of object identification processing to the external output unit 7.
This is the overall processing flow of the object identification device 100 according to the present embodiment.
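The control flow of S00 to S07 can be summarized in a short sketch. The four callables passed in (`detect_faces`, `identify_face`, `should_register`, `register_face`) are hypothetical stand-ins for the detection, identification, registration decision, and registration steps; their names and signatures are assumptions, and only the branching order follows the flow described above.

```python
from typing import Any, Callable, List, Optional, Tuple


def run_identification(
    image: Any,
    detect_faces: Callable[[Any], List[Any]],
    identify_face: Callable[[Any], Optional[str]],
    should_register: Callable[[Any], bool],
    register_face: Callable[[Any], None],
) -> List[Tuple[Any, Optional[str]]]:
    """Process one input image (S00) and return (face, matched name or None) pairs."""
    results: List[Tuple[Any, Optional[str]]] = []
    faces = detect_faces(image)            # S01: face detection
    if not faces:                          # S02: No, nothing to identify
        return results
    for face in faces:                     # repeated until S07 answers Yes
        name = identify_face(face)         # S03: identify the individual
        if name is None:                   # S04: No, not a registered person
            if should_register(face):      # S05: register this person?
                register_face(face)        # S06: registration processing
        results.append((face, name))
    return results                         # passed on to the external output
```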

<Object registration unit>
The object registration processing is described next. FIG. 3 is a diagram showing an example of the configuration of the object registration unit 5. As shown in FIG. 3, the object registration unit 5 includes an object dictionary data generation unit 21, an object dictionary data holding unit 22, and an object dictionary data selection unit 23.
The object dictionary data generation unit 21 generates, from the image data acquired from the image recording unit 4, the object dictionary data needed to identify individual objects. For example, when discriminating the two-class problem of intra-class and extra-class as in Non-Patent Document 1, the object dictionary data generation unit 21 may typically use a person's face image as the dictionary data. The object dictionary data generation unit 21 may normalize the size, orientation (in-plane rotation), and so on of the image data of the object detected by the object detection processing before storing it in the object dictionary data holding unit 22.
When normalizing rotation, and in particular when performing an affine transformation that corrects in-plane rotation, the object dictionary data generation unit 21 should proceed as follows. In the affine transformation that corrects the tilt of the face, when a pixel of the transformed image would refer to a position outside the region of the source image, the object dictionary data generation unit 21 replaces the transformed value with a random number. In such cases a predetermined fixed value is usually set in the transformed image. A fixed value, however, can cause problems when a partial region protrudes from the target object. More specifically, if the part set to the fixed value falls within a partial region in both the object dictionary data image and the identification data image, the similarity between the two becomes large, which affects identification. To avoid this, the object dictionary data generation unit 21 should replace data in the affine-transformed image that has no corresponding point in the source image with random numbers.
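The random replacement described above can be sketched for the simplest affine case, a rotation about the image centre. This is an illustration under stated assumptions rather than the embodiment's implementation: the image is a list of rows of 8-bit luminance values, sampling is nearest-neighbour, and the function name and the returned mask are introduced only for the example.

```python
import math
import random
from typing import List, Optional, Tuple

Image = List[List[int]]


def rotate_with_random_fill(
    image: Image, angle_deg: float, rng: Optional[random.Random] = None
) -> Tuple[Image, List[List[bool]]]:
    """Rotate a grayscale image about its centre by inverse mapping.

    Output pixels whose source position lies outside the input image are
    filled with random values in 0..255 instead of a fixed constant.
    Returns the rotated image and a mask marking the randomly filled pixels.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[0] * width for _ in range(height)]
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse rotation: find the source pixel for this output pixel.
            sx = int(round(cos_a * dx + sin_a * dy + cx))
            sy = int(round(-sin_a * dx + cos_a * dy + cy))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
            else:
                out[y][x] = rng.randint(0, 255)  # no corresponding source point
                mask[y][x] = True
    return out, mask
```

Because the fill differs from image to image, two images that both contain filled corners no longer share an identical block of values there, which is the effect the paragraph above is after.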

The amount of dictionary data can also be reduced by holding only the data needed at identification time rather than the image data itself. When the identification computation takes the vector correlation (similarity) of partial regions of the object, the object dictionary data generation unit 21 need only cut out those partial regions in advance.
As described above, the object dictionary data generation unit 21 extracts the necessary information from the image as appropriate, applies a predetermined transformation described later, and stores the result in the object dictionary data holding unit 22 as feature vectors for identifying the object. The specifics of the processing performed by the object dictionary data generation unit 21 are described later.
The object dictionary data selection unit 23 reads the necessary object dictionary data from the object dictionary data holding unit 22 in response to a request from the object identification unit 6, described later, and transfers it to the object identification unit 6.

<Object dictionary data generation unit>
FIG. 4 is a block diagram showing an example of the configuration of the object dictionary data generation unit 21. As shown in FIG. 4, the object dictionary data generation unit 21 includes a partial region setting unit 31, a feature vector extraction unit 32, a feature vector conversion unit 33, and a feature vector conversion data holding unit 34.
The partial region setting unit 31 sets, for the image data, the positions and ranges from which the feature vector extraction unit 32 extracts feature vectors. The positions and ranges of the partial regions should be determined in advance using a machine learning method. For example, the partial region setting unit 31 may set a plurality of candidate partial regions and select among them using AdaBoost. How AdaBoost is applied to actually determine the partial regions is described in detail in the later description of the object identification unit. The partial region setting unit 31 fixes the number of partial regions in advance according to the processing time and the like. For example, the number may be determined by measuring how many regions give sufficient identification performance on training samples prepared in advance.

The feature vector extraction unit 32 extracts feature vectors from the object data for registration. When the object is a person's face, the feature vector extraction unit 32 typically extracts the data needed for identification from an image containing the face. As the data needed for identification, the feature vector extraction unit 32 extracts the luminance values of the partial regions set by the partial region setting unit 31 as feature vectors. Instead of taking luminance values directly, the feature vector extraction unit 32 may extract feature vectors from the result of some filter operation such as a Gabor filter. The processing performed by the feature vector extraction unit 32 is described in detail later.
The feature vector conversion unit 33 applies a predetermined transformation to the feature vectors extracted by the feature vector extraction unit 32. As the transformation, the feature vector conversion unit 33 performs, for example, dimensionality reduction by principal component analysis (PCA) or by independent component analysis (ICA). The feature vector conversion unit 33 may also perform dimensionality reduction by Fisher discriminant analysis (FDA).

When PCA is used to transform the feature vectors, parameters such as the number of bases (the dimension reduction count of the feature vector) and which bases to use come into play. Instead of the number of bases, the feature vector conversion unit 33 may use as an index the sum of the eigenvalues corresponding to the basis vectors, that is, the cumulative contribution ratio. The feature vector conversion unit 33 can also make these parameters differ from one partial region to another. Which parameters are actually set can be determined in advance by learning.
As described above, the feature vector conversion unit 33 stores the data obtained by transforming the feature vectors in the object dictionary data holding unit 22 as the object dictionary data output.
The feature vector conversion data holding unit 34 holds the data that the feature vector conversion unit 33 needs when transforming feature vectors. The data needed for the transformation is setting information such as the number of bases (dimension reduction count) mentioned above.
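The two ways of fixing the PCA parameter mentioned above, a basis count or a cumulative contribution ratio, are related as in the sketch below. It assumes the eigenvalues, the mean vector, and the orthonormal basis vectors were already obtained by PCA on training data elsewhere; the function names and the `target_ratio` parameter are illustrative, not taken from the source.

```python
from typing import List, Sequence


def bases_for_contribution(eigenvalues: Sequence[float], target_ratio: float) -> int:
    """Smallest number of leading bases whose eigenvalue sum reaches target_ratio."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target_ratio:
            return count
    return len(ordered)


def project(vector: Sequence[float], mean: Sequence[float],
            bases: Sequence[Sequence[float]]) -> List[float]:
    """Project a mean-centred feature vector onto the selected basis vectors."""
    centred = [v - m for v, m in zip(vector, mean)]
    return [sum(b * c for b, c in zip(basis, centred)) for basis in bases]
```

A per-region setting, as the text allows, would simply store a different basis count (or target ratio) for each partial region.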

FIG. 5 is a flowchart showing an example of the processing performed by the feature vector extraction unit 32, and the processing is described with reference to it. First, the feature vector extraction unit 32 acquires the partial region setting information from the partial region setting unit 31 (S10). Next, the feature vector extraction unit 32 acquires the image data of the object from the image recording unit 4 (S11). Using the partial region information acquired in (S10), the feature vector extraction unit 32 determines whether the partial region protrudes from the acquired image data (S13). Here, protrusion means that the partial region acquired in (S10) lies outside the valid range of the image acquired in (S11). For example, when the image data acquired from the image recording unit 4 is an object region cut out based on object detection information, a margin of fixed values is sometimes added to the image data for the sake of later processing. In that case, if a partial region is set near the boundary of the object, the fixed values are always picked up as part of the feature vector, which may adversely affect the correlation computation with the identification data at the later stage.
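The protrusion determination of (S13) amounts to a rectangle containment test. The sketch below assumes axis-aligned rectangles and a `valid` rectangle that describes the image area excluding any fixed-value margin; the `Rect` type and the function name are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def protrudes(region: Rect, valid: Rect) -> bool:
    """Return True if any part of `region` lies outside the `valid` image area."""
    return (
        region.left < valid.left
        or region.top < valid.top
        or region.left + region.width > valid.left + valid.width
        or region.top + region.height > valid.top + valid.height
    )
```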

はみ出しがなかった場合（Ｓ１３でＮｏの場合）、特徴ベクトル抽出部３２は、画像データから特徴ベクトルを取得する（Ｓ１４）。ここで、特徴ベクトルは、典型的には部分領域の画像データから取得した輝度データをベクトル化したものであってもよい。また、特徴ベクトルは、ＬＢＰ（ＬｏｃａｌＢｉｎａｒｙＰａｔｔｅｒｎ）変換等、所定の変換を施したものから、対応する部分領域を特徴ベクトル化するようにしてもよい。はみ出しがあった場合（Ｓ１３でＹｅｓの場合）、特徴ベクトル抽出部３２は、一旦、はみ出した領域以外から特徴ベクトルを取得する（Ｓ１５）。次に、特徴ベクトル抽出部３２は、はみ出した部分の特徴ベクトルの追加処理を行う（Ｓ１６）。（Ｓ１５）で取得した特徴ベクトルは、領域が狭いため、（Ｓ１４）で取得する特徴ベクトルに比べて次元が小さくなる。特徴ベクトル抽出部３２は、この少ない分を追加する際に、固定値ではなく、乱数を生成して、ランダムな値をはみ出し部分の特徴ベクトルとする。
乱数を特徴ベクトルとして設定することにより、上述のような、固定値が特徴ベクトルとして設定されることが回避され、後段の識別用データとの相関演算に与えるよくない影響を少なくすることができる。
以上が、特徴ベクトル抽出部で行われる一処理例の説明である。 When there is no protrusion (No in S13), the feature vector extraction unit 32 acquires a feature vector from the image data (S14). Here, the feature vector may typically be luminance data acquired from the image data of the partial region and arranged as a vector. Alternatively, a predetermined transformation such as the LBP (Local Binary Pattern) transformation may be applied first, and the corresponding partial region of the transformed data may be turned into the feature vector. If there is a protrusion (Yes in S13), the feature vector extraction unit 32 first acquires a feature vector from the part of the region that does not protrude (S15). Next, the feature vector extraction unit 32 adds the feature vector for the protruding portion (S16). Because the region used in (S15) is smaller, the feature vector acquired there has fewer dimensions than the one acquired in (S14). When filling in the missing dimensions, the feature vector extraction unit 32 generates random numbers rather than using a fixed value, and uses the random values as the feature vector of the protruding portion.
By setting a random number as a feature vector, it is possible to avoid setting a fixed value as a feature vector as described above, and to reduce unfavorable influence on correlation calculation with subsequent identification data.
The above is the description of one processing example performed by the feature vector extraction unit.
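
As a rough illustration of steps S13 to S16, the following Python sketch flattens a partial region into a feature vector and fills the protruding part with random values. The function name `extract_feature`, the list-of-rows image layout, and the 0 to 255 value range are assumptions made for this example, not details given in the description. Unlike the flow above, the sketch fills protruding positions in raster order instead of appending them afterwards.

```python
import random


def extract_feature(image, top, left, height, width, rng=None):
    """Flatten a rectangular partial region into a feature vector.

    `image` is a list of rows of luminance values (hypothetical layout).
    Positions of the region that fall outside the image are filled with
    random values instead of a fixed value.
    """
    if rng is None:
        rng = random.Random()
    rows, cols = len(image), len(image[0])
    feature = []
    for y in range(top, top + height):
        for x in range(left, left + width):
            if 0 <= y < rows and 0 <= x < cols:
                feature.append(float(image[y][x]))       # inside the valid range
            else:
                feature.append(rng.uniform(0.0, 255.0))  # protruding part
    return feature


image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]

inside = extract_feature(image, 0, 0, 2, 2)                        # no protrusion
protruding = extract_feature(image, 2, 2, 2, 2, random.Random(0))  # 3 of 4 pixels protrude
```

Both calls return a vector of the same length, so the later per-region similarity calculation does not need to know whether a protrusion occurred.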

＜オブジェクト識別部＞
オブジェクト識別処理について説明する。図６は、オブジェクト識別部６の一例を示す図である。図６に示すように、オブジェクト識別部６は、オブジェクト識別用データ生成部４１、オブジェクト辞書データ取得部４２、オブジェクト識別演算部４３、を含む。
オブジェクト識別用データ生成部４１は、画像記録部４から取得した画像データから、オブジェクトの識別に必要な情報の抽出を行う。
オブジェクト辞書データ取得部４２は、オブジェクト登録部５より、オブジェクトの識別に必要な辞書データを取得する。
オブジェクト識別演算部４３は、オブジェクト識別用データ生成部４１から取得した識別用データとオブジェクト辞書データ取得部４２から得た辞書データとから、オブジェクトの識別処理を行う。ここで行われる処理については、後で詳しく説明する。 <Object identification part>
The object identification process will be described. FIG. 6 is a diagram illustrating an example of the object identification unit 6. As shown in FIG. 6, the object identification unit 6 includes an object identification data generation unit 41, an object dictionary data acquisition unit 42, and an object identification calculation unit 43.
The object identification data generation unit 41 extracts information necessary for object identification from the image data acquired from the image recording unit 4.
The object dictionary data acquisition unit 42 acquires dictionary data necessary for object identification from the object registration unit 5.
The object identification calculation unit 43 performs an object identification process from the identification data acquired from the object identification data generation unit 41 and the dictionary data acquired from the object dictionary data acquisition unit 42. The processing performed here will be described in detail later.

図７は、オブジェクト識別部６で行われる識別処理の一例を示したフローチャートである。まず、オブジェクト識別部６は、オブジェクト登録部５からオブジェクト辞書データを取得する（Ｓ２０）。次に、オブジェクト識別部６は、画像記録部４よりオブジェクト画像データを取得する（Ｓ２１）。続いて、オブジェクト識別部６は、オブジェクト識別用データ生成処理を行う（Ｓ２２）。ここで行われる処理については、あとで詳しく説明する。次に、オブジェクト識別部６は、オブジェクト識別演算処理を行う（Ｓ２３）。オブジェクト識別演算処理の出力として、登録済みデータ（辞書データ）との一致をバイナリ（０ｏｒ１）で出力する場合と正規化した出力値（０〜１の実数値）で出力する場合とがある。更に登録オブジェクト（登録者）が複数（複数人）ある場合には、それぞれの登録オブジェクト（登録者）に対して、出力値を出力してもよいが、最も良く一致した登録データだけを出力してもよい。なお、オブジェクト識別演算処理のより具体的な内容についても、後で詳しく説明する。
以上が、オブジェクト識別部６における処理フロー例の説明である。 FIG. 7 is a flowchart illustrating an example of identification processing performed by the object identification unit 6. First, the object identification unit 6 acquires object dictionary data from the object registration unit 5 (S20). Next, the object identification unit 6 acquires object image data from the image recording unit 4 (S21). Subsequently, the object identification unit 6 performs an object identification data generation process (S22). The processing performed here will be described in detail later. Next, the object identification unit 6 performs an object identification calculation process (S23). The output of the object identification calculation process may be either a binary value (0 or 1) indicating whether the data matches registered data (dictionary data), or a normalized output value (a real value from 0 to 1). Furthermore, when there are a plurality of registered objects (registrants), an output value may be output for each registered object (registrant), or only the registered data that matches best may be output. More specific contents of the object identification calculation process will also be described in detail later.
The above is the description of the processing flow example in the object identification unit 6.

＜オブジェクト識別用データ生成部＞
図８は、オブジェクト識別用データ生成部４１の構成の一例を示した図である。図８に示すように、オブジェクト識別用データ生成部４１は、部分領域設定部５１、特徴ベクトル抽出部５２、特徴ベクトル変換部５３、特徴ベクトル変換用データ保持部５４、を含む。オブジェクト識別用データ生成部４１の構成及びそこで行われる処理は、オブジェクト辞書データ生成部２１でのそれとほぼ同じであるので、詳細は割愛する。 <Object identification data generator>
FIG. 8 is a diagram illustrating an example of the configuration of the object identification data generation unit 41. As shown in FIG. 8, the object identification data generation unit 41 includes a partial region setting unit 51, a feature vector extraction unit 52, a feature vector conversion unit 53, and a feature vector conversion data holding unit 54. The configuration of the object identification data generation unit 41 and the processing performed there are almost the same as those in the object dictionary data generation unit 21, and therefore the details are omitted.

＜オブジェクト識別演算処理＞
オブジェクト識別演算処理について説明する。ここでは、一例として、ｉｎｔｒａ−ｃｌａｓｓ，ｅｘｔｒａ−ｃｌａｓｓの２クラス問題を、ＳＶＭ識別器を用いて判定する場合について説明する。図９は、オブジェクト識別演算部４３の構成の一例を示す図である。オブジェクト識別演算部４３は、オブジェクト識別用データ取得部６１、オブジェクト辞書データ取得部６２、変動特徴抽出部６３、ＳＶＭ識別器６４、識別結果保持部６５、識別結果統合部６６、を含む。 <Object identification calculation processing>
The object identification calculation process will be described. Here, as an example, a case where a two-class problem of intra-class and extra-class is determined using an SVM classifier will be described. FIG. 9 is a diagram illustrating an example of the configuration of the object identification calculation unit 43. The object identification calculation unit 43 includes an object identification data acquisition unit 61, an object dictionary data acquisition unit 62, a variation feature extraction unit 63, an SVM classifier 64, an identification result holding unit 65, and an identification result integration unit 66.

図１０は、オブジェクト識別演算処理の一例を示したフローチャートである。以下この図を用いて説明する。
始めに、オブジェクト識別用データ取得部６１において、オブジェクト識別用データを取得する（Ｓ３０）。続いて、オブジェクト辞書データ取得部６２で、オブジェクトの辞書データを取得する（Ｓ３１）。次に、変動特徴抽出部６３において、（Ｓ３０）及び（Ｓ３１）で取得したオブジェクト識別用データとオブジェクト辞書データから、変動特徴抽出処理を行う（Ｓ３２）。ここで、変動特徴とは、典型的には２枚の画像から抽出される、同一オブジェクト間の変動、又は、異なるオブジェクト間の変動、の何れかに属する特徴のことである。変動特徴の定義は様々なものが考えられる。ここでは一例として、変動特徴抽出部６３は、辞書データと、識別用データとで、同じ領域に対応する特徴ベクトル間で類似度（相関値、内積）を計算し、その類似度を成分とするベクトルを変動特徴ベクトルとする。上記定義によれば、変動特徴ベクトルの次元数は、部分領域数と一致する。 FIG. 10 is a flowchart illustrating an example of the object identification calculation process. This will be described below with reference to this figure.
First, the object identification data acquisition unit 61 acquires object identification data (S30). Subsequently, the object dictionary data acquisition unit 62 acquires object dictionary data (S31). Next, the variation feature extraction unit 63 performs variation feature extraction on the object identification data and the object dictionary data acquired in (S30) and (S31) (S32). Here, a variation feature is a feature, typically extracted from two images, that belongs either to variation between images of the same object or to variation between images of different objects. Various definitions of the variation feature are possible. Here, as an example, the variation feature extraction unit 63 calculates the similarity (correlation value, inner product) between the feature vectors of the dictionary data and the identification data that correspond to the same region, and uses the vector whose components are these similarities as the variation feature vector. Under this definition, the number of dimensions of the variation feature vector equals the number of partial regions.
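
The definition above can be sketched in a few lines of Python. This is a minimal illustration only: the normalized inner product is one possible choice of similarity, and the function names and the toy vectors are assumptions, not part of the description.

```python
import math


def similarity(u, v):
    """Normalized inner product of two feature vectors (one choice of similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def variation_feature(dictionary_regions, query_regions):
    """One similarity per partial region, so the output has one dimension per region."""
    return [similarity(d, q) for d, q in zip(dictionary_regions, query_regions)]


# Three partial regions, each represented by a tiny 2-dimensional feature vector.
dictionary_regions = [[1.0, 0.0], [1.0, 2.0], [0.0, 3.0]]
query_regions = [[2.0, 0.0], [2.0, -1.0], [4.0, 0.0]]

vf = variation_feature(dictionary_regions, query_regions)
```

The resulting list `vf` is what would be passed to the classifier in step S33.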

変動特徴抽出部６３は、（Ｓ３２）で取得した取得した変動特徴ベクトルをサポートベクターマシン（ＳＶＭ）識別器６４に投入する（Ｓ３３）。ＳＶＭ識別器６４は、同一オブジェクト間の変動（ｉｎｔｒａ−ｃｌａｓｓ）と異なるオブジェクトとの間の変動（ｅｘｔｒａ−ｃｌａｓｓ）の２クラスを識別する識別器として予め訓練しておく。一般に、部分領域の数を増やすと、それだけ変動特徴ベクトルの次元数が増え、演算時間が増加する。このため、処理時間を優先した場合、カスケード接続型のＳＶＭ識別器が有効である。この場合、ＳＶＭ識別器は、特定の部分領域ごとに訓練されたもので構成される。変動特徴抽出部６３は、変動特徴ベクトルを部分領域ごとに分割し、対応するＳＶＭ識別器に投入する。このようにすることにより、演算時間を削減することができる。ＳＶＭ識別器を１つの部分領域だけに対応させて学習するのではなく、２つ以上の部分領域の組み合わせを、ＳＶＭ識別器の入力として学習させてもよい。 The variation feature extraction unit 63 inputs the variation feature vector acquired in (S32) into the support vector machine (SVM) discriminator 64 (S33). The SVM discriminator 64 is trained in advance as a discriminator that discriminates two classes of intra-class variation between the same object and extra-class variation between different objects. In general, when the number of partial regions is increased, the number of dimensions of the variation feature vector is increased accordingly, and the calculation time is increased. For this reason, when priority is given to processing time, a cascade connection type SVM discriminator is effective. In this case, the SVM discriminator is configured by training for each specific partial area. The variation feature extraction unit 63 divides the variation feature vector for each partial region and inputs it to the corresponding SVM classifier. By doing in this way, calculation time can be reduced. Rather than learning an SVM classifier corresponding to only one partial area, a combination of two or more partial areas may be learned as an input of the SVM classifier.

一方、識別精度を重視する場合、ＳＶＭ識別器を並列に演算し、演算結果について重み付け和をとるようにするとよい。この場合でも、サポートベクター数を削減するアルゴリズムを適用することで、ある程度演算時間を短縮することは可能である。サポートベクター数を削減する方法は、「Ｂｕｒｇｅｓ，Ｃ．Ｊ．Ｃ（１９９６）． "Ｓｉｍｐｌｉｆｉｅｄｓｕｐｐｏｒｔｖｅｃｔｏｒｄｅｃｉｓｉｏｎｒｕｌｅｓ．" ＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＭａｃｈｉｎｅＬｅａｒｎｉｎｇ（ｐｐ．７１−７７）．」に記載されているような公知の技術を用いることができる。 On the other hand, when identification accuracy is the priority, the SVM classifiers may be run in parallel and a weighted sum of their outputs taken. Even in this case, the calculation time can be reduced to some extent by applying an algorithm that reduces the number of support vectors. A known technique for reducing the number of support vectors can be used, such as the one described in Burges, C. J. C. (1996), "Simplified support vector decision rules", International Conference on Machine Learning (pp. 71-77).
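
The difference between the cascade arrangement and the parallel weighted-sum arrangement can be sketched as follows. Each "stage" here is a hypothetical linear scorer standing in for a trained SVM classifier; the tuple layout, weights, biases, and thresholds are invented for illustration.

```python
def stage_score(variation_feature, stage):
    """Score of one per-region classifier (a linear stand-in for a trained SVM)."""
    indices, weights, bias, _threshold = stage
    return sum(w * variation_feature[i] for i, w in zip(indices, weights)) + bias


def cascade_decision(variation_feature, stages):
    """Evaluate the stages in order and stop at the first rejection (saves time)."""
    for stage in stages:
        if stage_score(variation_feature, stage) < stage[3]:
            return False  # rejected early; later stages are never evaluated
    return True


def weighted_sum_decision(variation_feature, stages, stage_weights, threshold=0.0):
    """Evaluate every stage and combine the scores as a weighted sum."""
    total = sum(alpha * stage_score(variation_feature, stage)
                for stage, alpha in zip(stages, stage_weights))
    return total >= threshold


# Stage 1 looks at region 0 only; stage 2 looks at regions 1 and 2 together.
stages = [([0], [1.0], -0.5, 0.0),
          ([1, 2], [0.5, 0.5], -0.5, 0.0)]

vf_same = [0.9, 0.8, 0.7]   # high similarity in every region
vf_mixed = [0.2, 0.9, 0.9]  # low similarity in region 0 only

cascade_same = cascade_decision(vf_same, stages)
cascade_mixed = cascade_decision(vf_mixed, stages)
parallel_mixed = weighted_sum_decision(vf_mixed, stages, [0.5, 0.5])
```

In this toy setting the cascade rejects `vf_mixed` at the first stage, while the weighted sum still accepts it because the second stage compensates. This shows how the two arrangements can trade calculation time against the use of all available evidence.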

ＳＶＭ識別器６４は、（Ｓ３３）で算出された、辞書データとオブジェクト識別用データとの識別結果を識別結果保持部６５に保持する（Ｓ３４）。次に、例えばＳＶＭ識別器６４は、全ての辞書データについて、識別演算が終わったかどうか判定する（Ｓ３５）。まだ辞書データがある場合（Ｓ３５でＮｏの場合）には（Ｓ３１）に戻る。全ての辞書データについて識別演算が終わった場合（Ｓ３５でＹｅｓの場合）、識別結果統合部６６で識別結果統合処理を行う（Ｓ３６）。識別結果統合部６６は、例えば、最も単純には、ＳＶＭ識別器が回帰値を出力する識別器であった場合、最も値の高かった辞書データを、識別結果として出力するような処理を行う。また、識別結果統合部６６は、一致度の高かった上位オブジェクトの結果をリストとして出力してもよい。
以上が、オブジェクト識別演算処理の説明である。 The SVM classifier 64 stores the identification result for the dictionary data and the object identification data, calculated in (S33), in the identification result holding unit 65 (S34). Next, the SVM classifier 64, for example, determines whether the identification calculation has been completed for all dictionary data (S35). If dictionary data remains (No in S35), the process returns to (S31). When the identification calculation has been completed for all dictionary data (Yes in S35), the identification result integration unit 66 performs the identification result integration process (S36). In the simplest case, when the SVM classifier outputs a regression value, the identification result integration unit 66 outputs the dictionary data with the highest value as the identification result. The identification result integration unit 66 may also output a list of the top-ranked objects with the highest degrees of matching.
The above is the description of the object identification calculation process.
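
The integration step (S36) can be sketched as a simple ranking. The dictionary keys and score values below are made up for the example, and `integrate_results` is a hypothetical helper name.

```python
def integrate_results(scores, top_k=1):
    """Rank registered entries by classifier output and keep the best `top_k`.

    `scores` maps a registered-entry identifier to the value held for it
    in the identification result holding step (S34).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


scores = {"person_a": 0.31, "person_b": 0.87, "person_c": 0.55}

best = integrate_results(scores)            # single best match
top2 = integrate_results(scores, top_k=2)   # list of the top-ranked entries
```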

＜部分領域の学習方法＞
次に、部分領域の位置と範囲の学習に、ＡｄａＢｏｏｓｔを用いた場合の手順について、説明する。
図１１は、部分領域の学習処理の一例を示したフローチャートである。まず、オブジェクト識別装置１００は、学習データを取得する（Ｓ４０）。人物の顔を扱う場合は、学習データとして、個人の識別子を表すラベルのついた顔を含む画像を多数用意する。この際、１人あたりの画像数が十分用意されていることが望ましい。照明変動や、表情の変動に頑健な部分領域及び特徴ベクトルの変換方法を学習するためには、学習データに上記変動を十分含んだサンプルを用意することが重要である。ラベルつきの顔画像から、個人の顔の変動を表すデータと、他人間の顔の変動を表すデータと、の２種類を生成することができる。次に、オブジェクト識別装置１００は、弱仮説の選択処理を行う（Ｓ４１）。ここで弱仮説とは、典型的には、登録データと識別用データとの部分領域間の類似度を算出する処理を行う。オブジェクト識別装置１００は、部分領域の位置と範囲との組み合わせの数だけ、弱仮説を用意しておく。そして、オブジェクト識別装置１００は、（Ｓ４０）で取得した学習データに対して、ＡｄａＢｏｏｓｔの枠組みに沿って、もっとも性能のよい弱仮説、即ち、位置と範囲とが最適な部分領域を選択する（Ｓ４２）。性能評価を行うためのより具体的な手順は、オブジェクト識別演算部４３の説明で述べた、変動特徴抽出処理の例のようにするとよい。即ち、オブジェクト識別装置１００は、学習データに対して、特徴ベクトルの類似度（内積）を求め、変動特徴ベクトルを生成し、ＳＶＭ識別器に入力する。オブジェクト識別装置１００は、同一ラベルの人物間（画像は異なる）と、異なるラベルの人物間とで、それぞれ正しい識別結果になっているか判定し、学習データの重み付き誤り率を求める。 <Learning method of partial area>
Next, a procedure when AdaBoost is used for learning the position and range of the partial area will be described.
FIG. 11 is a flowchart illustrating an example of the learning process for partial regions. First, the object identification device 100 acquires learning data (S40). When dealing with human faces, a large number of images containing faces, each labeled with an identifier of the individual, are prepared as learning data. It is desirable to have a sufficient number of images per person. To learn partial regions and feature vector conversion methods that are robust against variations in illumination and facial expression, it is important that the learning data include samples that cover those variations sufficiently. From the labeled face images, two types of data can be generated: data representing variation within the face of one individual, and data representing variation between the faces of different people. Next, the object identification device 100 performs weak hypothesis selection (S41). Here, a weak hypothesis typically performs a process of calculating the similarity between partial regions of the registration data and the identification data. The object identification device 100 prepares as many weak hypotheses as there are combinations of partial region positions and ranges. Then, following the AdaBoost framework, the object identification device 100 selects, for the learning data acquired in (S40), the weak hypothesis with the best performance, that is, the partial region whose position and range are optimal (S42). A more specific procedure for the performance evaluation may follow the example of the variation feature extraction process given in the description of the object identification calculation unit 43. That is, the object identification device 100 obtains the similarity (inner product) of feature vectors for the learning data, generates a variation feature vector, and inputs it to the SVM classifier. 
The object identification device 100 determines whether a correct identification result is obtained between persons with the same label (images are different) and between persons with different labels, and obtains a weighted error rate of learning data.

もっとも性能のよい弱仮説を選択した場合、オブジェクト識別装置１００は、その弱仮説の学習データに関する識別結果を基に、学習データの重み付けを更新する（Ｓ４２）。次に、オブジェクト識別装置１００は、弱仮説数が所定数に達しているか判定する（Ｓ４３）。所定数に達している場合（Ｓ４３でＹｅｓの場合）、オブジェクト識別装置１００は、学習処理を終了する。所定数に達していない場合（Ｓ４３でＮｏの場合）、オブジェクト識別装置１００は、新たな弱仮説の選択を行う。
なお、重みつき誤り率の算出や、学習データの重み付けの更新方法等、ＡｄａＢｏｏｓｔによる学習の詳細な手順は、「Ｖｉｏｌａ＆Ｊｏｎｅｓ（２００１） ”ＲａｐｉｄＯｂｊｅｃｔＤｅｔｅｃｔｉｏｎｕｓｉｎｇａＢｏｏｓｔｅｄＣａｓｃａｄｅｏｆＳｉｍｐｌｅＦｅａｔｕｒｅｓ”，ＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．」等に記載されている方法を適宜採用すればよい。 When the weak hypothesis with the best performance has been selected, the object identification device 100 updates the weighting of the learning data based on the identification results of that weak hypothesis on the learning data (S42). Next, the object identification device 100 determines whether the number of weak hypotheses has reached a predetermined number (S43). If the predetermined number has been reached (Yes in S43), the object identification device 100 ends the learning process. If the predetermined number has not been reached (No in S43), the object identification device 100 selects a new weak hypothesis.
The detailed procedure of learning by AdaBoost, such as the calculation of the weighted error rate and the method of updating the weighting of the learning data, may follow a method such as the one described in Viola & Jones (2001), "Rapid Object Detection using a Boosted Cascade of Simple Features", Computer Vision and Pattern Recognition.
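
One round of the selection and re-weighting loop can be sketched as below. The update rule follows the form used in the cited Viola and Jones paper (multiply the weight of each correctly classified sample by beta = eps / (1 - eps)); the function names, the toy predictions, and the clamping of the error rate are assumptions for this example.

```python
import math


def weighted_error(predictions, labels, weights):
    """Sum of the weights of the misclassified learning samples."""
    return sum(w for p, y, w in zip(predictions, labels, weights) if p != y)


def select_and_update(candidate_predictions, labels, weights):
    """Pick the weak hypothesis with the lowest weighted error and re-weight the data."""
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize at the start of the round
    errors = [weighted_error(p, labels, weights) for p in candidate_predictions]
    best = min(range(len(errors)), key=errors.__getitem__)
    eps = min(max(errors[best], 1e-12), 1.0 - 1e-12)  # clamp to avoid division by zero
    beta = eps / (1.0 - eps)
    # Correctly classified samples are down-weighted; misclassified ones keep their weight.
    new_weights = [w * (beta if p == y else 1.0)
                   for p, y, w in zip(candidate_predictions[best], labels, weights)]
    alpha = math.log(1.0 / beta)  # voting weight of the selected weak hypothesis
    return best, alpha, new_weights


# Labels: 1 = pair of images of the same person, 0 = pair of different persons.
labels = [1, 1, 0, 0]
candidate_predictions = [
    [1, 1, 1, 0],  # weak hypothesis 0: one mistake (third pair)
    [1, 0, 1, 0],  # weak hypothesis 1: two mistakes
]
best, alpha, new_weights = select_and_update(candidate_predictions, labels, [0.25] * 4)
```

After the round, the misclassified third pair carries three times the weight of each correctly classified pair, so the next weak hypothesis is pushed toward getting it right.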

オブジェクト識別装置１００は、弱仮説を、複数の部分領域の組み合わせから構成するようにしてもよい。即ち、オブジェクト識別装置１００は、１つの弱仮説に含まれる部分領域数を一定（所定数、例えば５個、１０個等）にする。この場合、オブジェクト識別装置１００は、１弱仮説中の部分領域数を増やすと、組み合わせの数が指数関数的に増加するので、何らかの拘束条件をつけて学習するようにするとよい。より具体的には、オブジェクト識別装置１００は、部分領域の位置関係を参照して、互いに近い位置にある部分領域が含まれないようにするようにしてもよい。
また、オブジェクト識別装置１００は、複数部分領域の組み合わせを作る際に、遺伝的アルゴリズム（ＧＡ）等の最適化手法を適用するようにしてもよい。この場合、オブジェクト識別装置１００は、弱仮説の候補は、ＡｄａＢｏｏｓｔの手続きに入る前に予め全て用意されるのではなく、弱仮説を選択しならが、動的に候補を構築していく。即ち、オブジェクト識別装置１００は、予め一部用意された弱仮説の候補（例えば、ランダムに領域候補を組み合わせる等して生成しておく）から、性能のよいものを選択しておくようにする。そして、オブジェクト識別装置１００は、その性能のよいもの同士を、組み合わせながら、新しい弱仮説の候補を生成し、性能を評価していく。このようにすることにより、弱仮説の候補を効率的に絞り込むことができる。以上のようにして、学習時間の増加を抑えるようにするとよい。
以上が、部分領域の位置と範囲とを学習する手順の説明である。 The object identification device 100 may construct a weak hypothesis from a combination of a plurality of partial regions. That is, the object identification device 100 fixes the number of partial regions included in one weak hypothesis at a predetermined number (for example, five or ten). In this case, because the number of combinations grows exponentially as the number of partial regions per weak hypothesis increases, it is advisable to learn under some constraint. More specifically, the object identification device 100 may refer to the positional relationship of the partial regions and exclude combinations that contain partial regions located close to one another.
Further, the object identification device 100 may apply an optimization method such as a genetic algorithm (GA) when creating combinations of a plurality of partial regions. In this case, the weak hypothesis candidates are not all prepared in advance before entering the AdaBoost procedure; instead, the object identification device 100 builds candidates dynamically while selecting weak hypotheses. In other words, the object identification device 100 first selects the well-performing ones from a partial set of weak hypothesis candidates prepared in advance (for example, generated by randomly combining region candidates). The object identification device 100 then generates new weak hypothesis candidates by combining the well-performing ones with each other, and evaluates their performance. In this way, the weak hypothesis candidates can be narrowed down efficiently, which keeps the increase in learning time in check.
The above is the description of the procedure for learning the position and range of the partial region.
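
The positional constraint on combinations of partial regions can be illustrated with a brute-force sketch. It assumes each partial region is represented only by its center coordinates and that "close" means a Euclidean distance below a chosen threshold; both are assumptions for the example, and a real candidate set would be far too large to enumerate this way.

```python
import itertools
import math


def far_enough(centers, min_dist):
    """True if every pair of region centers is at least `min_dist` apart."""
    return all(math.dist(a, b) >= min_dist
               for a, b in itertools.combinations(centers, 2))


def constrained_combinations(centers, size, min_dist):
    """All combinations of `size` regions that contain no two nearby regions."""
    return [combo for combo in itertools.combinations(centers, size)
            if far_enough(combo, min_dist)]


centers = [(0, 0), (1, 0), (10, 0), (10, 10)]
combos = constrained_combinations(centers, size=2, min_dist=5.0)
```

Of the six possible pairs, only the pair `(0, 0)` and `(1, 0)` is dropped, because those two regions nearly coincide.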

≪実施形態２≫
実施形態２は実施形態１に対して、オブジェクト登録部とオブジェクト識別部との処理内容が異なる。
より具体的には、実施形態１では、オブジェクトの属性は考えなかったのに対し、実施形態２では、オブジェクトの属性を推定し、オブジェクトの属性に応じた部分領域の設定がなされる点が異なる。
以下、より具体的に説明する。なお、重複を避けるため、以下の説明において、前実施形態と同じ部分は、省略する。図１２は、オブジェクト識別装置１００のハードウェア構成の一例を示す図（その２）である。各部の基本的な機能は実施形態１と同一であるが、以下の点が異なる。即ち、オブジェクト識別装置１００に、オブジェクト辞書データ入力部１０９と、オブジェクト辞書データ書き換え部１１０と、が追加されている。
オブジェクト辞書データ入力部１０９は、オブジェクト辞書データを外部から入力するための処理を実行し、典型的には半導体メモリ等の外部記憶装置から、オブジェクト辞書データを読み取る。オブジェクトデータ書き換え部１１０は、オブジェクト辞書データ及びオブジェクト識別用データを所定の手順に従って書き換える。オブジェクトデータ書き換え部１１０は、典型的には、オブジェクトのデータが画像であった場合、コントラストの補正や、ノイズの除去、解像度の変更等を行う。
なお、説明の便宜上、識別する対象となるオブジェクトを、画像中の人物の顔としているが、本実施形態は、人物の顔以外のオブジェクトに適用可能である。 << Embodiment 2 >>
The second embodiment differs from the first embodiment in the processing contents of the object registration unit and the object identification unit.
More specifically, the attributes of the object are not considered in the first embodiment, whereas in the second embodiment the attributes of the object are estimated and partial regions are set according to those attributes.
A more specific description is given below. To avoid duplication, parts that are the same as in the previous embodiment are omitted from the following description. FIG. 12 is a diagram (part 2) illustrating an example of a hardware configuration of the object identification device 100. The basic function of each part is the same as that of the first embodiment, but the following points are different. That is, an object dictionary data input unit 109 and an object dictionary data rewrite unit 110 are added to the object identification device 100.
The object dictionary data input unit 109 executes processing for inputting object dictionary data from the outside, and typically reads the object dictionary data from an external storage device such as a semiconductor memory. The object data rewriting unit 110 rewrites the object dictionary data and the object identification data according to a predetermined procedure. The object data rewriting unit 110 typically performs contrast correction, noise removal, resolution change, and the like when the object data is an image.
For convenience of explanation, the object to be identified is the face of a person in the image, but this embodiment is applicable to objects other than the face of a person.

＜オブジェクト登録部＞
図１３は、オブジェクト登録部１０５の構成の一例を示す図である。オブジェクト登録部１０５は、オブジェクト辞書データ生成部１１１、オブジェクト辞書データ保持部１１２、オブジェクト辞書データ選択部１１３、オブジェクト属性推定部１１４、を含む。実施形態１とは、オブジェクト属性推定部１１４が追加されている点が異なる。
オブジェクト属性推定部１１４は、画像記録部１０４から入力された画像データから、オブジェクトの属性を推定する処理を行う。推定を行う具体的な属性は、オブジェクトの大きさ、姿勢・向き、照明条件等が含まれる。オブジェクトが人物の顔である場合、オブジェクト属性推定部１１４は、顔の器官位置を検出する。より具体的には、オブジェクト属性推定部１１４は、目、口、鼻等構成要素の端点を検出する。端点を検出するアルゴリズムは、例えば、特許３０７８１６６号公報に記載の畳み込み神経回路網を用いた方法等を用いることができる。オブジェクト属性推定部１１４は、端点として、左右の目、口の両端点、鼻、等個人の特徴を現すと考えられる部位を予め選択しておく。オブジェクト属性推定部１１４は、顔器官の端点の位置関係を、その属性として検出する。また、他の属性として、オブジェクト属性推定部１１４は、人物の年齢、性別、表情、等の属性を推定してもよい。これらの属性推定には公知の技術を用いることができる。例えば「特開２００３−２４２４８６号公報」のような方法を用いることで、人物の属性を推定することができる。 <Object registration part>
FIG. 13 is a diagram illustrating an example of the configuration of the object registration unit 105. The object registration unit 105 includes an object dictionary data generation unit 111, an object dictionary data holding unit 112, an object dictionary data selection unit 113, and an object attribute estimation unit 114. The difference from the first embodiment is that an object attribute estimation unit 114 is added.
The object attribute estimation unit 114 performs processing for estimating the object attribute from the image data input from the image recording unit 104. Specific attributes for estimation include the size, posture / orientation, lighting conditions, and the like of the object. When the object is a human face, the object attribute estimation unit 114 detects the organ position of the face. More specifically, the object attribute estimation unit 114 detects end points of components such as eyes, mouth, and nose. As an algorithm for detecting an end point, for example, a method using a convolutional neural network described in Japanese Patent No. 3078166 can be used. The object attribute estimation unit 114 selects in advance, as the end points, parts that are considered to show personal features such as left and right eyes, both end points of the mouth, and the nose. The object attribute estimation unit 114 detects the positional relationship between the end points of the facial organ as the attribute. As other attributes, the object attribute estimation unit 114 may estimate attributes such as a person's age, sex, and facial expression. A known technique can be used for the attribute estimation. For example, by using a method such as “Japanese Patent Laid-Open No. 2003-242486”, the attribute of a person can be estimated.

学習データを変えることによって、上記の方法で、人物だけでなく、一般の物体についても検出することができる。それによって、人物の顔にある顔器官以外の要素、例えば、メガネや、マスク、手等、オクリュージョンとなる物体を検出することもできる。上記のようなオクリュージョンがある人物の顔として、その属性に含めて考えることができる。
オブジェクト属性推定部１１４は、属性推定に、撮像パラメータの一例であるカメラパラメータを用いるようにしてもよい。例えば、オブジェクト属性推定部１１４は、撮像制御部１０３から制御用のＡＥ、ＡＦに関するパラメータを取得することによって、照明条件等の属性を精度良く推定することが可能になる。ここで、カメラパラメータのより具体的な例として、露出条件、ホワイトバランス、ピント、オブジェクトの大きさ等があげられる。例えば、オブジェクト属性推定部１１４は、露出条件及びホワイトバランスと、肌色成分領域に対応する色成分の対応表を予め作成し、ルックアップテーブルとして保持しておくことで、撮影条件に影響されないオブジェクトの色属性を推定することができる。 By changing the learning data, the above method can detect not only persons but also general objects. This makes it possible to detect elements on a person's face other than the facial organs, for example objects that cause occlusion such as glasses, masks, and hands. A face with such an occlusion can be handled by including the occlusion among its attributes.
The object attribute estimation unit 114 may use camera parameters, which are an example of imaging parameters, for attribute estimation. For example, by acquiring the AE and AF control parameters from the imaging control unit 103, the object attribute estimation unit 114 can accurately estimate attributes such as illumination conditions. More specific examples of camera parameters include exposure conditions, white balance, focus, and object size. For example, the object attribute estimation unit 114 can create in advance a correspondence table between the exposure conditions and white balance on the one hand and the color components corresponding to the skin-color region on the other, hold it as a lookup table, and thereby estimate a color attribute of the object that is not affected by the shooting conditions.

また、オブジェクト属性推定部１１４は、被写体であるオブジェクトまでの距離をＡＦ等の距離測定手段を用いることによって測定し、オブジェクトの大きさを推定することができる。より詳細にはオブジェクト属性推定部１１４は、以下の式に従ってオブジェクトの大きさを推定することができる。
ｓ＝｛ｆ／（ｄ − ｆ）｝・Ｓ
ここで、ｓは、オブジェクトの画像上での大きさ（ピクセル数）である。ｆは、焦点距離である。ｄは、装置からオブジェクトまでの距離である。Ｓは、オブジェクトの実際の大きさである。但し、（ｄ＞ｆ）であるとする。 Further, the object attribute estimation unit 114 can measure the distance to the object that is the subject by using a distance measuring means such as AF, and can estimate the size of the object. More specifically, the object attribute estimation unit 114 can estimate the size of the object according to the following equation.
s = (f / (d − f)) · S
Here, s is the size (number of pixels) of the object on the image. f is a focal length. d is the distance from the device to the object. S is the actual size of the object. However, it is assumed that (d> f).
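
The size relation can be checked numerically with a short sketch. It reads the formula as s = f / (d − f) · S, which is the reading consistent with the condition d > f; the function name and the example values (a 50 mm focal length, a subject 2.05 m away, and a 0.2 m object) are assumptions. The result is in the same length unit as S, so converting it to a pixel count would additionally need the sensor's pixel pitch, which the description does not give.

```python
def size_on_image(focal_length, distance, actual_size):
    """Size of the object on the image plane: s = f / (d - f) * S, valid for d > f."""
    if distance <= focal_length:
        raise ValueError("distance must be greater than focal length (d > f)")
    return focal_length / (distance - focal_length) * actual_size


# All lengths in meters: f = 0.05, d = 2.05, S = 0.2
s = size_on_image(0.05, 2.05, 0.2)
```

Solving the same relation for S gives the actual size from a measured image size, which is how an estimate independent of the shooting distance could be obtained.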

このように、オブジェクト属性推定部１１４は、撮影条件に影響されないオブジェクトの大きさを属性として推定することが可能になる。
オブジェクト属性推定部１１４で推定されたオブジェクトの属性情報は、オブジェクト辞書データ生成部１１１から出力されるオブジェクト辞書データと共に、オブジェクト辞書データ保持部１１２に格納される。
オブジェクト辞書データ生成部１１１での処理も前記実施形態と一部、異なる。図１４は、オブジェクト辞書データ生成部１１１における特徴ベクトル抽出部での処理の一例を示したフローチャートである。以下、これを用いて説明する。始めに、オブジェクト辞書データ生成部１１１は、オブジェクト属性推定部１１４からオブジェクトの属性情報を取得する（Ｓ１００）。取得するオブジェクトの属性情報は、典型的には、人物の顔の器官位置及びその端点である。次に、オブジェクト辞書データ生成部１１１は、画像記録部１０４から画像データを取得する（Ｓ１０１）。オブジェクト辞書データ生成部１１１は、（Ｓ１００）で取得したオブジェクト属性情報を用いて、オブジェクト画像データに対して、部分領域を設定する（Ｓ１０２）。 In this manner, the object attribute estimation unit 114 can estimate the size of an object that is not affected by the shooting conditions as an attribute.
The object attribute information estimated by the object attribute estimation unit 114 is stored in the object dictionary data holding unit 112 together with the object dictionary data output from the object dictionary data generation unit 111.
The processing in the object dictionary data generation unit 111 is also partly different from the above embodiment. FIG. 14 is a flowchart illustrating an example of processing in the feature vector extraction unit in the object dictionary data generation unit 111. Hereinafter, this will be described. First, the object dictionary data generation unit 111 acquires object attribute information from the object attribute estimation unit 114 (S100). The attribute information of the object to be acquired is typically the organ position of the person's face and its end point. Next, the object dictionary data generation unit 111 acquires image data from the image recording unit 104 (S101). The object dictionary data generation unit 111 sets a partial region for the object image data using the object attribute information acquired in (S100) (S102).

部分領域の設定は、より具体的には以下のようにする。まず、オブジェクト辞書データ生成部１１１は、属性情報として取得した、顔器官の複数端点のうち、所定の１点を基準点として設定する。更にオブジェクト辞書データ生成部１１１は、他の少なくとも２つ端点間の距離を測り、その２点間距離の所定倍の長さだけ、前記基準点から離れた所に部分領域を設定する。オブジェクト辞書データ生成部１１１は、基準点からの方向も予め定められた値を用いる。ここで、基準点となる端点、距離の基準となる２点、２点間距離の何倍かを決める定数、基準点からの方向等は、予め学習によって決めることができる。これらのパラメータの学習は、部分領域のパラメータの中に含めることによって、実施形態１で説明したＡｄａＢｏｏｓｔを用いた部分領域の選択方法によって実現することができる。部分領域が設定された後の処理は、実施形態１における特徴ベクトル抽出部の処理と同じであるので説明は割愛する。 More specifically, the partial area is set as follows. First, the object dictionary data generation unit 111 sets a predetermined one of the plurality of end points of the facial organ acquired as attribute information as a reference point. Further, the object dictionary data generation unit 111 measures a distance between at least two other end points, and sets a partial region at a distance from the reference point by a length that is a predetermined multiple of the distance between the two points. The object dictionary data generation unit 111 uses a predetermined value for the direction from the reference point. Here, the end point serving as the reference point, the two points serving as the reference for the distance, the constant for determining how many times the distance between the two points, the direction from the reference point, and the like can be determined in advance by learning. The learning of these parameters can be realized by the partial region selection method using AdaBoost described in the first embodiment by including them in the partial region parameters. Since the processing after the partial area is set is the same as the processing of the feature vector extraction unit in the first embodiment, the description is omitted.
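
The placement rule described above (a reference end point, a distance between two other end points, a multiplier, and a direction) can be sketched as follows. The landmark names, the angle convention, and the rectangle representation are hypothetical choices for this example; in the description these parameters are determined by learning.

```python
import math


def place_region(landmarks, ref_name, pair, scale, angle_deg, size):
    """Return (left, top, width, height) of a partial region placed relative to landmarks.

    The region center lies `scale` times the distance between the two `pair`
    landmarks away from the reference landmark, in direction `angle_deg`
    (image coordinates, so 90 degrees points downward).
    """
    rx, ry = landmarks[ref_name]
    base = math.dist(landmarks[pair[0]], landmarks[pair[1]])
    offset = scale * base
    cx = rx + offset * math.cos(math.radians(angle_deg))
    cy = ry + offset * math.sin(math.radians(angle_deg))
    width, height = size
    return (cx - width / 2, cy - height / 2, width, height)


landmarks = {"nose": (50.0, 60.0), "left_eye": (30.0, 40.0), "right_eye": (70.0, 40.0)}

# Region centered half an inter-eye distance below the nose.
region = place_region(landmarks, "nose", ("left_eye", "right_eye"),
                      scale=0.5, angle_deg=90.0, size=(20.0, 10.0))
```

Because the offset scales with the distance between two end points, the region follows the face as its apparent size changes, but a misdetected end point moves or rescales the region, which is why the protrusion handling matters here.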

上記のような部分領域の設定方法、即ち、顔器官の端点を基準とした方法をとる場合、はみ出し判定をして、はみ出した領域の特徴ベクトルを無相関データにすることは、特に効果的である。一般に、斜光や逆光等撮影条件が厳しい場合や、奥行き方向の回転が入った人物の顔について、その顔器官の端点を高精度に検出することは難しい。仮に端点が誤検出であった場合、部分領域の設定も誤ってしまい、結果的に部分領域が、画像の範囲外に設定されることは、十分あり得る。はみ出した領域の特徴ベクトルを固定値にすると、登録辞書データと識別用データの両方にはみ出しがあると、その部分の類似度が大きくなり、識別結果によくない影響を及ぼす。オブジェクト辞書データ生成部１１１及びオブジェクト識別用データ生成部１２１が、はみ出し領域に固定値ではない、オブジェクト辞書データとオブジェクト識別用データとで無相関になる値、より具体的には、乱数を設定する。このことによって、上記のような問題を回避できる。 When the partial region is set as described above, that is, relative to the end points of the facial organs, it is particularly effective to perform the protrusion determination and make the feature vector of the protruding region uncorrelated data. In general, it is difficult to detect the end points of facial organs with high accuracy under severe shooting conditions such as oblique light or backlight, or for a face rotated in the depth direction. If an end point is erroneously detected, the partial region is also set incorrectly, and it is quite possible that the partial region ends up outside the range of the image. If the feature vector of the protruding region is set to a fixed value and both the registered dictionary data and the identification data protrude, the similarity of that portion becomes large, which adversely affects the identification result. The object dictionary data generation unit 111 and the object identification data generation unit 121 therefore set, in the protruding region, not a fixed value but values that are uncorrelated between the object dictionary data and the object identification data, more specifically random numbers. This avoids the problem described above.
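
The effect described above can be reproduced with a small numerical experiment. The sketch assumes the similarity is a normalized inner product and that the random fill values are zero-centered; neither assumption comes from the description, and with a different similarity or value range the numbers would differ.

```python
import math
import random


def cosine(u, v):
    """Normalized inner product of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


n = 10000  # number of protruding feature dimensions (deliberately large)

# Fixed-value fill: dictionary data and identification data get identical values.
fixed_dictionary = [128.0] * n
fixed_query = [128.0] * n

# Random fill: each side draws its own independent random values.
rng_dictionary = random.Random(1)
rng_query = random.Random(2)
random_dictionary = [rng_dictionary.uniform(-1.0, 1.0) for _ in range(n)]
random_query = [rng_query.uniform(-1.0, 1.0) for _ in range(n)]

fixed_similarity = cosine(fixed_dictionary, fixed_query)     # spuriously maximal
random_similarity = cosine(random_dictionary, random_query)  # close to zero
```

With the fixed fill, two unrelated faces score a perfect match on the protruding part. With independent random fills, that part contributes almost nothing to the similarity.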

オブジェクト属性推定部１１４は、オブジェクトの属性として、オクリュージョン情報を用いることもできる。オブジェクト辞書データ生成部１１１は、部分領域に設定されるはずの範囲に、メガネや、マスク等他の物体があった場合、その物体の範囲を、無相関データ、より具体的には乱数によって置き換えてしまう。このようにすることによって、例えば、たまたまマスクをしていた他人同士の口領域の類似度が極端に大きくなってしまうような状況を避けることができる。
また、オブジェクト属性推定部１１４は、属性として、表情を用いることもできる。例えば、笑った顔の頬には、しわが出やすいが、これが識別によくない影響を与えることもあり得るので、オブジェクト辞書データ生成部１１１は、上記のように無相関データに置き換えてしまってもよい。表情と、無相関データに置き換える部分領域との関係は、予め学習サンプルによって決めた、ルックアップテーブルを作成し、オブジェクト辞書データ生成部１１１がこれを参照するようにすればよい。
同様に、人物の年齢を用いて、経年変化の現れやすい領域をオブジェクト辞書データ生成部１１１が無相関データにすることにより、登録時と認証時とで時間がたっている場合の識別性能を向上させることができる。
以上が、オブジェクト登録部の説明である。 The object attribute estimation unit 114 can also use occlusion information as an attribute of the object. When another object such as glasses or a mask lies within the range that should be set as a partial region, the object dictionary data generation unit 111 replaces the range covered by that object with uncorrelated data, more specifically with random numbers. This avoids situations such as the similarity between the mouth regions of two different people who both happen to be wearing masks becoming extremely large.
The object attribute estimation unit 114 can also use facial expression as an attribute. For example, wrinkles tend to appear on the cheeks of a smiling face and may adversely affect identification, so the object dictionary data generation unit 111 may replace such regions with uncorrelated data as described above. The relationship between a facial expression and the partial regions to be replaced with uncorrelated data may be determined in advance from learning samples and stored as a lookup table, which the object dictionary data generation unit 111 refers to.
Similarly, by using the person's age, the object dictionary data generation unit 111 can turn regions where age-related changes tend to appear into uncorrelated data, which improves identification performance when a long time has passed between registration and authentication.
The above is the description of the object registration unit.
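The attribute-driven replacement described above can be sketched as a table lookup followed by random filling. This is only an illustration: the table entries, region names, and the function `apply_attribute_masking` are hypothetical, whereas in the description the table is determined in advance from learning samples.

```python
import random

# Hypothetical lookup table: estimated attribute -> partial regions to neutralise.
REPLACEMENT_TABLE = {
    "smile": {"left_cheek", "right_cheek"},
    "mask": {"mouth"},
    "glasses": {"left_eye", "right_eye"},
}


def apply_attribute_masking(region_features, attributes, rng):
    """Return a copy of region_features in which every region listed in the
    table for any of the given attributes is replaced by random numbers."""
    to_replace = set()
    for attribute in attributes:
        to_replace |= REPLACEMENT_TABLE.get(attribute, set())
    return {
        name: [rng.random() for _ in vec] if name in to_replace else list(vec)
        for name, vec in region_features.items()
    }


features = {"mouth": [0.1, 0.2, 0.3], "left_eye": [0.4, 0.5], "nose": [0.6]}
masked = apply_attribute_masking(features, ["mask"], random.Random(0))
print(masked)
```

Registration and identification would each call this with their own random source, so that the replaced regions are uncorrelated between the dictionary data and the identification data.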

＜オブジェクト識別部＞
図１５は、オブジェクト識別部１０６の構成の一例を示す図である。オブジェクト識別部１０６は、オブジェクト識別用データ生成部１２１、オブジェクト識別演算部１２２、オブジェクト辞書データ取得部１２３、オブジェクト属性推定部１２４、を含む。実施形態１とは、オブジェクト属性推定部１２４が追加されている点が異なる。
オブジェクト属性推定部１２４の処理の内容は、オブジェクト登録部のオブジェクト属性推定部１１４と同じであるので、説明は割愛する。
オブジェクト識別用データ生成部１２１は、画像記録部１０４からの入力と共に、オブジェクト属性推定部１１４の出力を用いて、特徴ベクトル及びその変換処理を行う。この処理は、オブジェクト登録部の処理とほぼ同じになるので、説明は割愛する。
オブジェクト識別演算部１２２は、オブジェクト識別用データ生成部１２１及びオブジェクト辞書データ取得部１２３からの入力を基に、オブジェクトの識別処理を行う。オブジェクト識別演算部１２２で行われる処理のより具体的な内容については、あとで説明する。
オブジェクト辞書データ取得部１２３は、オブジェクト識別演算部１２２からのリクエストに基づいて、オブジェクト登録部１０５中のオブジェクト辞書データ保持部１１２より、オブジェクト辞書データを取得する。 <Object identification part>
FIG. 15 is a diagram illustrating an example of the configuration of the object identification unit 106. The object identification unit 106 includes an object identification data generation unit 121, an object identification calculation unit 122, an object dictionary data acquisition unit 123, and an object attribute estimation unit 124. This embodiment is different from the first embodiment in that an object attribute estimation unit 124 is added.
Since the content of the processing of the object attribute estimation unit 124 is the same as that of the object attribute estimation unit 114 of the object registration unit, description thereof is omitted.
The object identification data generation unit 121 uses the input from the image recording unit 104 together with the output of the object attribute estimation unit 114 to extract feature vectors and perform their conversion processing. Since this processing is almost the same as that of the object registration unit, its description is omitted.
The object identification calculation unit 122 performs object identification processing based on inputs from the object identification data generation unit 121 and the object dictionary data acquisition unit 123. More specific contents of the processing performed by the object identification calculation unit 122 will be described later.
The object dictionary data acquisition unit 123 acquires object dictionary data from the object dictionary data holding unit 112 in the object registration unit 105 based on a request from the object identification calculation unit 122.

＜オブジェクト識別演算処理＞
次に、オブジェクト識別演算処理の内容について説明する。
オブジェクト識別処理の全体的な処理は、実施形態１とほぼ同じである。
以下では、オブジェクト識別器として、多数の識別器（以下弱識別器と呼ぶ）をツリー状に構成したオブジェクト識別器を用いてオブジェクト識別処理を行う場合について説明する。典型的には弱識別器は、１つの部分領域に対応しているが、弱識別器を複数の部分領域に対応させてもよい。
図１６は、オブジェクト識別器を弱識別器のツリー構造で構成した場合の模式図である。図中の枠１つが１つの弱識別器を表している。以下、ツリー構造をなす各弱識別器のことをノード識別器と呼ぶことがある。識別時は、矢印の方向に沿って処理が行われる。即ち、上位にある弱識別器から処理を行って、処理が進むにつれ、下位の弱識別器で処理を行う。一般に、上位にある弱識別器は、変動に対するロバスト性が高いが、誤識別率は高い傾向にある。下位にある弱識別器ほど変動に対するロバスト性は低い一方で、変動範囲が一致したときの識別精度は高くなるように学習してある。ある特定の変動範囲（顔の奥行き方向や、表情変動、照明変動等）に特化した弱識別器系列を複数用意し、ツリー構造をとることで、全体としての対応変動範囲を確保している。図１６では、５系列の弱識別器系列がある場合について示している。また、図１６では、最終的に５つの弱識別器系列が１つのノード識別器に統合されている。この最終ノード識別器は、例えば５系列の累積スコアを比較して、最も高いスコアをもつ系列の識別結果を採用する等の処理を行ってもよい。また、１つの識別結果に統合して出力するのではなく、各系列の識別結果をベクトルとして出力するようにしてもよい。 <Object identification calculation processing>
Next, the contents of the object identification calculation process will be described.
The overall process of the object identification process is almost the same as in the first embodiment.
Hereinafter, a case will be described in which object identification processing is performed using an object classifier in which a large number of classifiers (hereinafter referred to as weak classifiers) are configured in a tree shape as object classifiers. Typically, the weak classifier corresponds to one partial area, but the weak classifier may correspond to a plurality of partial areas.
FIG. 16 is a schematic diagram of an object classifier configured as a tree structure of weak classifiers. Each frame in the figure represents one weak classifier. Hereinafter, each weak classifier forming the tree structure may be referred to as a node classifier. At the time of identification, processing proceeds in the direction of the arrows. That is, processing starts from the upper weak classifiers and moves to the lower weak classifiers as it proceeds. In general, the upper weak classifiers are highly robust to variations but tend to have a high misidentification rate. The lower a weak classifier is, the less robust it is to variations, but it is trained so that its identification accuracy is high when the variation range matches. A plurality of weak classifier series, each specialized for a specific variation range (face depth rotation, facial expression variation, illumination variation, etc.), are prepared and arranged in a tree structure, which secures the range of variations that can be handled as a whole. FIG. 16 shows a case with five weak classifier series. In FIG. 16, the five weak classifier series are finally integrated into one node classifier. This final node classifier may, for example, compare the cumulative scores of the five series and adopt the identification result of the series with the highest score. Alternatively, instead of integrating the results into a single identification result, the identification results of the respective series may be output as a vector.
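The integration performed by the final node classifier can be sketched as follows. This is a simplified illustration, not the configuration of FIG. 16: the `Series` class, the `final_node` function, the score convention (a positive cumulative score meaning the same individual), and the toy weak classifiers are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A weak classifier maps an input (here, per-region similarity values) to a score.
WeakClassifier = Callable[[Sequence[float]], float]


@dataclass
class Series:
    """One weak classifier series specialised for a variation range."""
    name: str
    classifiers: List[WeakClassifier]

    def cumulative_score(self, x: Sequence[float]) -> float:
        return sum(c(x) for c in self.classifiers)


def final_node(series_list, x, threshold=0.0):
    """Compare the cumulative scores of all series and adopt the result of
    the highest-scoring one; the per-series scores are also returned so that
    they can be output as a vector instead."""
    scores = [(s.name, s.cumulative_score(x)) for s in series_list]
    best_name, best_score = max(scores, key=lambda item: item[1])
    return best_name, best_score > threshold, scores


frontal = Series("frontal", [lambda x: x[0], lambda x: x[1]])
profile = Series("profile", [lambda x: -x[0], lambda x: 0.5])
best, same_person, all_scores = final_node([frontal, profile], [0.4, 0.3])
print(best, same_person, all_scores)
```

Returning the full score list alongside the single decision covers both output forms mentioned in the text, one integrated result or a vector of per-series results.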

各弱識別器は、ｉｎｔｒａ−ｃｌａｓｓ，ｅｘｔｒａ−ｃｌａｓｓの２クラス問題を判別する識別器であるが、分岐の基点にあるノード識別器は、どの弱識別器系列に進むか、分岐先を決める判定を行う。もちろん、２クラス判定を行いつつ、分岐先も決めるようにしてもよい。また、分岐先を決めず、全ての弱識別器系列で処理するようにしてもよい。また、各ノード識別器は、２クラス判定の他に、演算を打ち切るかどうか判定するようにしてもよい（打ち切り判定）。打ち切り判定は、各ノード識別器単体の判定でもよいが、他のノード識別器の出力値（判定スコア）を累積したものを閾値処理する等して判定してもよい。
以上が、オブジェクト識別演算処理の説明である。 Each weak classifier is a classifier that discriminates the two-class problem of intra-class versus extra-class, but a node classifier at the base point of a branch determines the branch destination, that is, which weak classifier series to proceed to. Of course, it may determine the branch destination while also performing the two-class determination. Alternatively, processing may be performed in all weak classifier series without determining a branch destination. In addition to the two-class determination, each node classifier may determine whether or not to abort the calculation (abort determination). The abort determination may be made by each node classifier alone, or may be made, for example, by applying a threshold to the accumulated output values (determination scores) of other node classifiers.
The above is the description of the object identification calculation process.
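The abort determination based on accumulated scores can be sketched as below. The function name `run_series_with_abort`, the threshold value, and the rule of aborting when the running total falls below the threshold are assumptions made for illustration; the description only states that the accumulated scores may be thresholded.

```python
def run_series_with_abort(classifiers, x, abort_threshold):
    """Evaluate weak classifiers in order, accumulating their scores.

    Returns (cumulative_score, number_evaluated, aborted). Processing stops
    as soon as the cumulative score falls below abort_threshold.
    """
    cumulative = 0.0
    evaluated = 0
    for classifier in classifiers:
        cumulative += classifier(x)
        evaluated += 1
        if cumulative < abort_threshold:
            return cumulative, evaluated, True
    return cumulative, evaluated, False


# Toy weak classifiers with constant scores, for illustration only.
weak = [lambda x: -1.0, lambda x: -2.0, lambda x: 5.0]
early = run_series_with_abort(weak, None, abort_threshold=-2.5)
full = run_series_with_abort(weak, None, abort_threshold=-10.0)
print(early, full)
```

Aborting early saves the cost of the lower node classifiers when the upper ones have already made the extra-class outcome very likely.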

以上、上述した各実施形態によれば、照明や大きさ、向き等の変動に対して、ロバスト性と識別性能を向上させることができる。
また、上述したように、オブジェクト識別装置１００は、部分領域の設定が、例えば、画像データの範囲外に設定された場合、登録オブジェクトの辞書データと、識別用データとで無相関になる値を設定する。このことにより、高精度な識別を行うことができる。
なお、上述したように、オブジェクト識別装置１００は、部分領域の属性が、オクリュージョンであった場合、登録オブジェクトの辞書データと、識別用データとで無相関になる値を設定するようにしてもよい。このようにすることによっても、高精度な識別を行うことができる。 As described above, according to each embodiment described above, robustness and identification performance can be improved with respect to variations in illumination, size, orientation, and the like.
Further, as described above, when a partial area is set, for example, outside the range of the image data, the object identification device 100 sets a value that is uncorrelated between the dictionary data of the registered object and the identification data. As a result, highly accurate identification can be performed.
As described above, when the attribute of a partial area is occlusion, the object identification device 100 may likewise set a value that is uncorrelated between the dictionary data of the registered object and the identification data. This also enables highly accurate identification.

以上、本発明の好ましい実施形態について詳述したが、本発明は係る特定の実施形態に限定されるものではなく、特許請求の範囲に記載された本発明の要旨の範囲内において、種々の変形・変更が可能である。 The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

１結像光学系、２撮像部、３撮像制御部、４画像季肋部、５オブジェクト登録部、６オブジェクト識別部、７外部出力部 1 imaging optical system, 2 imaging unit, 3 imaging control unit, 4 image recording unit, 5 object registration unit, 6 object identification unit, 7 external output unit

Claims

An input means for inputting an input image including an object;
An acquisition unit configured to acquire a registration image including the object,
First partial area setting means for performing affine transformation on the input image and setting a partial area in the converted image;
First determination means for determining whether or not at least a part of the partial area set in the image after the affine transformation protrudes from the area of the input image before the affine transformation;
First setting means for setting a feature vector from the partial area when it is determined that the partial area does not protrude from the area of the input image, and, when it is determined that the partial area protrudes from the area of the input image, setting a random number for the region of the partial area that protrudes from the area of the input image and setting a feature vector from a region including the protruding region in which the random number is set and the region that does not protrude from the input image;
An identification unit for calculating a correlation between the feature vector of the registered image and the feature vector of the partial area set by the first setting means, and identifying the object based on the calculated result. An object identification device characterized by comprising the above.

The object identification device according to claim 1, further comprising detection means for detecting a specific part of the object in the input image,
wherein the first setting means sets the partial area based on the position of the detected part.

Further comprising: second partial area setting means for performing affine transformation on the registered image and setting a partial area in the converted image,
Second determination means for determining whether or not at least a part of the partial area set in the image after the affine transformation protrudes from the area of the registered image before the affine transformation,
Second setting means for setting a feature vector from the partial area when it is determined that the partial area does not protrude from the area of the input image, and, when it is determined that the partial area of the registered image protrudes from the area of the registered image, setting a random number for the region of the partial area of the registered image that protrudes from the area of the registered image and setting a feature vector from a region including the protruding region in which the random number is set and the region that does not protrude from the input image,
wherein the identification unit calculates a correlation between the feature vectors of the corresponding partial areas of the registered image and the input image, and identifies the object based on the calculated result. Or the object identification device of 2.

A dictionary holding means for holding a partial region of the object of the registered image as a dictionary;
4. The object identification device according to claim 1, wherein the identification unit calculates a correlation between the dictionary and the partial area of the input image. 5.

The first setting unit sets a random number for the partial area of the input image when the partial area of the input image includes occlusion. The object identification device according to claim 1.

The object according to claim 3, wherein the second setting unit sets a random number for the partial area of the registered image when the partial area of the registered image includes occlusion. Identification device.

The first setting means sets a random number for the partial region of the input image when the partial region of the input image includes a region that is likely to change. The object identification device according to any one of 6.

The second setting means sets a random number for the partial area of the registered image when the partial area of the registered image includes a region where change is likely to appear. The object identification device described.

First feature vector extraction means for extracting a feature vector of the partial region of the input image;
Second feature vector extracting means for extracting a feature vector of the partial area of the registered image;
Further comprising
The object identification device according to claim 3, wherein the identification unit calculates the correlation between the partial area of the registered image and the partial area of the input image based on the feature vector of the partial area of the input image and the feature vector of the partial area of the registered image.

An input step for inputting an input image including an object;
An acquisition step of acquiring a registered image including the object;
A first partial area setting step of performing affine transformation on the input image and setting a partial area in the transformed image;
A first determination step of determining whether at least a part of the partial region set in the image after the affine transformation protrudes from the region of the input image before the affine transformation;
A first setting step of setting a feature vector from the partial area when it is determined that the partial area does not protrude from the area of the input image, and, when it is determined that the partial area protrudes from the area of the input image, setting a random number for the region of the partial area that protrudes from the area of the input image and setting a feature vector from a region including the protruding region in which the random number is set and the region that does not protrude from the input image;
An identification step of calculating a correlation between the feature vector of the registered image and the feature vector of the partial area set in the first setting step, and identifying the object based on the calculated result;
An object identification method comprising: