JP2012027617A - Pattern identification device, pattern identification method and program

Info

Publication number: JP2012027617A
Application number: JP2010164302A
Authority: JP
Inventors: Hiroshi Sato (佐藤 博)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-07-21
Filing date: 2010-07-21
Publication date: 2012-02-09

Abstract

PROBLEM TO BE SOLVED: To make it easy to check how a plurality of registered images stand in relation to one another. SOLUTION: A pattern identification device has registration means for registering images in a dictionary and confirmation means for confirming the dictionary images already registered in the dictionary. The confirmation means displays the relationships among the plurality of dictionary images. The relationship among the dictionary images is a positional relationship in the feature space used for pattern identification, or a positional relationship in a projection space onto which the feature quantities used for pattern identification are projected.

Description

The present invention relates to a pattern identification device, a pattern identification method, and a program, and more particularly to a technique suitable for use in identifying dictionary images.

One identification technique in pattern recognition, typically a technique for determining that an object that is the subject of one image is the same as an object that is the subject of another image, is face identification, which identifies an individual's face. One face identification method is described in Non-Patent Document 1. That algorithm enables face registration and additional learning in real time by replacing the problem of identifying individuals by face with a two-class identification problem over a feature class called the difference face.

For example, face identification using the well-known support vector machine (SVM) requires n SVM classifiers to identify the faces of n persons, each classifier distinguishing one registered person's face from all other faces. Registering a person's face also requires SVM training. SVM training needs a large amount of face data for the person to be registered, for the persons already registered, and for other persons, and it takes a great deal of computation time, so the usual approach was to carry out the training in advance.

According to the method described in Non-Patent Document 1, however, additional learning becomes substantially unnecessary because the problem of identifying individuals is replaced with the following two-class identification problem. The two classes are the class of variation features between images of the same person, such as variations in illumination, facial expression, and orientation (the intra-personal class), and the class of variation features between images of different persons (the extra-personal class). Assuming that the distributions of these two classes are constant regardless of the particular individual, the classifier is constructed by reducing the individual face identification problem to the problem of discriminating between the two classes. A large number of images are prepared in advance, and the classifier is trained to discriminate between the variation feature class for the same person and the variation feature class for different persons. For a new registrant, it is then sufficient to hold only the face image or the result of extracting the necessary features. At identification time, a difference feature is extracted from two images, and the classifier determines whether they show the same person or different persons. This makes training of an SVM or the like unnecessary when an individual's face is registered, so registration can be performed in real time.
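The two-class reformulation described above can be illustrated with a minimal sketch. It is not the probabilistic model of Non-Patent Document 1: as a stand-in, the classifier here simply compares the magnitude of the difference feature with the mean magnitude seen for each class during offline training. The function names, the toy two-dimensional features, and the nearest-mean rule are all assumptions made for illustration.

```python
import math


def difference(a, b):
    """Difference feature between two equal-length feature vectors."""
    return [x - y for x, y in zip(a, b)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def train(intra_pairs, extra_pairs):
    """Offline training: mean difference magnitude for each of the two classes."""
    intra_mean = sum(norm(difference(a, b)) for a, b in intra_pairs) / len(intra_pairs)
    extra_mean = sum(norm(difference(a, b)) for a, b in extra_pairs) / len(extra_pairs)
    return intra_mean, extra_mean


def same_person(model, a, b):
    """True if the difference feature is closer to the intra-personal class."""
    intra_mean, extra_mean = model
    d = norm(difference(a, b))
    return abs(d - intra_mean) <= abs(d - extra_mean)


# Trained once, in advance, on pairs that do not involve the registrants.
model = train(
    intra_pairs=[([0.0, 0.0], [0.1, 0.0]), ([1.0, 1.0], [1.0, 1.2])],
    extra_pairs=[([0.0, 0.0], [1.0, 1.0]), ([1.0, 1.0], [3.0, 1.0])],
)

# Registration only stores features; the model is not retrained.
dictionary = {"person_a": [5.0, 5.0]}
```

Because the model depends only on difference features, adding an entry to `dictionary` needs no training step, which is the property the text relies on for real-time registration.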

In a device of this kind, which identifies an object in an image and more specifically a person's face, one factor that degrades identification performance is variation between the registration pattern and the authentication pattern. That is, the object to be identified (the person's face) varies with illumination conditions, orientation and posture, occlusion by other objects, facial expression, and so on. When such variation between the registration pattern and the authentication pattern becomes large, identification performance drops significantly. As one solution to this problem, the technique described in Patent Document 1 displays on the screen, when an image for registration is acquired, a fitness level indicating how suitable the image is for registration, giving the user material on which to base a judgment.

Patent Document 1: JP 2009-48447 A
Patent Document 2: Japanese Patent No. 3078166
Patent Document 3: JP 2002-8032 A

Non-Patent Document 1: Baback Moghaddam, "Beyond Eigenfaces: Probabilistic Matching for Face Recognition" (M.I.T. Media Laboratory Perceptual Computing Section Technical Report No. 433); "Probabilistic Visual Learning for Object Representation" (IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, July 1997)

However, the user had no way of checking how the images actually registered affect identification performance, and could only acquire and register the images recommended by the device. Moreover, when the number of registered images is limited, the user was not given enough information to decide which registered image should be deleted.

In view of the problems described above, an object of the present invention is to make it easy to check how a plurality of registered images stand in relation to one another.

The pattern identification device of the present invention includes registration means for registering an image in a dictionary, and confirmation means for confirming dictionary images already registered by the registration means, wherein the confirmation means displays the relationships among a plurality of dictionary images.

According to the present invention, the user can easily check how a plurality of registered dictionary images stand in relation to one another, and can therefore select more appropriate images for registration.

FIG. 1 is a block diagram showing an example of the functional configuration of the pattern identification device according to the embodiment.
FIG. 2 is a flowchart showing an example of the overall processing procedure of the pattern identification device.
FIG. 3 is a block diagram showing a detailed example of the functional configuration of the input image identification unit.
FIG. 4 is a flowchart showing an example of the identification processing procedure performed by the input image identification unit.
FIG. 5 is a block diagram showing a detailed example of the functional configuration of the dictionary image registration unit.
FIG. 6 is a block diagram showing a detailed example of the functional configuration of the dictionary image confirmation unit.
FIG. 7 is a flowchart showing an example of the processing procedure performed by the dictionary image relationship calculation unit.
FIG. 8 is a schematic diagram showing an example of the display produced by the dictionary image display unit.
FIG. 9 is a block diagram showing a detailed example of the functional configuration of the dictionary image registration unit.
FIG. 10 is a block diagram showing a detailed example of the functional configuration of the dictionary image confirmation unit.
FIG. 11 is a flowchart showing an example of the processing procedure performed by the dictionary image group evaluation unit.

(First embodiment)
Hereinafter, a first embodiment of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of a pattern identification device 100 according to the present embodiment.
As shown in FIG. 1, the pattern identification device 100 includes an imaging optical system 1, an imaging unit 2, an imaging control unit 3, an image recording unit 4, a dictionary image registration unit 5, an input image identification unit 6, and a dictionary image confirmation unit 7. It further includes an external output unit 8 that outputs the identification result, an operation unit 9 with which the user performs various operations, and a connection bus 10 that provides control and data connections among the components. The dictionary image registration unit 5 and the input image identification unit 6 may each typically be a dedicated circuit (ASIC) or a processor (a reconfigurable processor, a DSP, or the like). Alternatively, they may exist as programs executed inside a single dedicated circuit or general-purpose circuit (a PC CPU).

The imaging optical system 1 consists of an optical lens with a zoom mechanism. It may also include a drive mechanism for the pan and tilt axes. The image sensor of the imaging unit 2 is typically a CCD or CMOS image sensor. In response to a readout control signal from a sensor drive circuit (not shown), the imaging unit 2 outputs a predetermined video signal (for example, a signal obtained by subsampling or block readout) as image data. The imaging control unit 3 controls the timing at which shooting actually takes place, based on instructions from the photographer (an angle-of-view adjustment instruction, a shutter press, and so on) and on information from the dictionary image registration unit 5 or the input image identification unit 6.

The image recording unit 4 consists of semiconductor memory or the like. It holds the image data transferred from the imaging unit 2 and transfers that image data at a predetermined timing in response to requests from the dictionary image registration unit 5 or the input image identification unit 6. The dictionary image registration unit 5 extracts, from the image data, information on the object to be identified, and records and holds it. The detailed configuration of the dictionary image registration unit 5 and the specific processing it performs are described later.

The input image identification unit 6 identifies an object in the image data based on the image data and on the data acquired from the dictionary image registration unit 5. Its specific configuration and the details of its processing are described later. The dictionary image confirmation unit 7 lets the user check the contents of the dictionary images. It can also be used to check the registration candidate images that the dictionary image registration unit 5 is about to register. The processing and display performed by the dictionary image confirmation unit 7 are described later.

The external output unit 8 is typically a monitor such as a CRT or a TFT liquid crystal display. It displays the image data acquired from the imaging unit 2 and the image recording unit 4, or superimposes on the image data the output of the dictionary image registration unit 5, the input image identification unit 6, and the dictionary image confirmation unit 7. The output of these three units may also be written as electronic data to an external memory or the like.

The operation unit 9 is used by the user to perform various operations, such as setting the mode for registering dictionary images and operating on the registered dictionary images. The connection bus 10 is a bus that provides control and data connections among the components described above.

In the following, identification of a pattern means determining differences between individual instances of a pattern (for example, differences between persons as individuals). Detection of a pattern, in contrast, means determining what falls within the same category without distinguishing individuals (for example, detecting faces without distinguishing between persons).

FIG. 2 is a flowchart showing an example of the overall processing procedure of the pattern identification device 100 according to the present embodiment. The actual processing by which the pattern identification device 100 identifies an input pattern is described below with reference to FIG. 2. The description assumes that the pattern to be identified is an object in an image, more specifically a person's face, but the present invention is of course not limited to this.

First, the input image identification unit 6 acquires image data from the image recording unit 4 (step S200). It then performs detection of the target pattern, specifically detection of human faces, on the acquired image data (step S201). A known technique may be used to detect human faces in an image, for example the techniques proposed in Patent Document 2 and Patent Document 3. Objects other than human faces can also be detected with the same techniques by using the desired object as training data.

Next, it is determined whether the target pattern exists in the input image (step S202). If the target pattern does not exist, the processing ends. If the target pattern does exist in the input image, it is determined whether the registration mode is set (step S203). The user can set the registration mode in advance by performing a predetermined operation on the operation unit 9. If the registration mode is set, the dictionary image registration unit 5 performs dictionary image registration processing (step S204). The specific contents of the dictionary image registration processing are described in detail later.

If the determination in step S203 finds that the registration mode is not set, the input image identification unit 6 performs identification processing on the input image (step S205). The specific contents of the input image identification processing are also described later. Finally, the result of the dictionary image registration processing or the input image identification processing is output to the external output unit 8 (step S206).
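The branching in steps S201 to S206 can be summarized as a short sketch. The function `process_frame` and its callable parameters are hypothetical names introduced here for illustration; they stand in for the detection, registration, identification, and output units, and the image is assumed to have been acquired already (step S200).

```python
def process_frame(image, detect, register, identify, output, registration_mode):
    """One pass through steps S201 to S206 for an already acquired image."""
    targets = detect(image)                # S201: detect the target pattern
    if not targets:                        # S202: no target pattern, so stop
        return None
    if registration_mode:                  # S203: mode set via the operation unit
        result = register(image, targets)  # S204: dictionary image registration
    else:
        result = identify(image, targets)  # S205: input image identification
    output(result)                         # S206: send result to the output unit
    return result
```

Passing the units in as callables keeps the control flow separate from any particular detector or classifier, which matches the text's statement that a known detection technique may be used.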

Next, the input image identification processing is described. FIG. 3 is a block diagram showing a detailed example of the functional configuration of the input image identification unit 6.
As shown in FIG. 3, the input image identification unit 6 includes an input image identification data generation unit 21, a dictionary image data acquisition unit 22, and an input image identification calculation unit 23.

The input image identification data generation unit 21 extracts, from the image data acquired from the image recording unit 4, the information needed to identify the target object. The dictionary image data acquisition unit 22 acquires from the dictionary image registration unit 5 the dictionary data needed to identify the input image. The input image identification calculation unit 23 performs identification processing on the input image using the identification data obtained from the input image identification data generation unit 21 and the dictionary data obtained from the dictionary image data acquisition unit 22.

FIG. 4 is a flowchart showing an example of the identification processing procedure performed by the input image identification unit 6.
First, dictionary data is acquired from the dictionary image registration unit 5 (step S400). The dictionary data is typically a normalized image obtained by cutting the target object out of an input image, correcting its tilt and the like, and enlarging or reducing it to a predetermined size. Alternatively, processing equivalent to the input image identification data generation processing described later may be applied in advance, and the extracted features may be used as the dictionary data.

Next, image data is acquired from the image recording unit 4 (step S401). Input image identification data generation processing is then performed (step S402). If the dictionary data is an image, the identification data generation processing is applied to the dictionary data as well as to the input image. In this processing, the target object (typically a human face) is first cut out of the input image data and normalized to a predetermined size and rotation, and features are then extracted. The simplest feature is the image luminance used as is, but features more robust to variation, such as the Local Binary Pattern or the increment sign computed from the luminance, may be used instead. A feature vector whose dimension has been reduced by principal component analysis (PCA) or independent component analysis (ICA) may also be used as a new feature.
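As an illustration of a luminance-derived feature that is more robust to variation, the following is a minimal sketch of a basic 8-neighbour Local Binary Pattern. The source does not specify which LBP variant, neighbourhood, or bit ordering is used, so those choices, along with the function names, are assumptions; the input is taken to be a normalized face crop given as a 2-D list of luminance values.

```python
def lbp_codes(img):
    """8-neighbour LBP code for every interior pixel of a 2-D luminance grid."""
    # Neighbours visited clockwise from the top-left; this ordering is an assumption.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(img), len(img[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(img):
    """256-bin histogram of LBP codes, usable as a feature vector."""
    hist = [0] * 256
    for row in lbp_codes(img):
        for code in row:
            hist[code] += 1
    return hist
```

Each code depends only on whether neighbours are brighter or darker than the centre pixel, so adding a constant offset to every pixel leaves the codes unchanged. This is one sense in which such a feature is less sensitive to illumination than raw luminance.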

Next, matching processing is performed between the dictionary data acquired in step S400 and the identification data acquired in step S401 (step S403). The matching processing typically computes the correlation between the feature vectors extracted from the dictionary data and from the input data. When there are multiple items of dictionary data, the input is matched against all of them exhaustively. The correlation may be a simple inner product or a correlation coefficient. The output of the matching processing may be a binary value (0 or 1) indicating whether the input matches the registered data (dictionary data), or a normalized output value (a real number from 0 to 1) given as a likelihood. When there are multiple registered patterns (registrants), a likelihood may be output for each registered pattern (registrant), or only the result for the best-matching registered pattern may be output. Alternatively, a likelihood may be output for the class to which a registered pattern belongs rather than for the registered pattern itself. In the case of persons, this means outputting a likelihood for each person ID (name) rather than a result for each registered face image.
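The exhaustive matching and the per-class output described above can be sketched as follows. The mapping of the correlation coefficient from [-1, 1] to a likelihood in [0, 1], the use of the maximum over a person's registered images as that person's likelihood, and the threshold of 0.8 are assumptions made for illustration; the source does not fix any of them.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def match(query, dictionary):
    """Match the query against every entry; return {person_id: likelihood in [0, 1]}.

    dictionary is a list of (person_id, feature_vector) pairs, and a person
    may have several registered images.
    """
    scores = {}
    for person_id, feature in dictionary:
        likelihood = (correlation(query, feature) + 1.0) / 2.0
        # Class-level output: keep the best score among that person's images.
        scores[person_id] = max(scores.get(person_id, 0.0), likelihood)
    return scores


def best_match(query, dictionary, threshold=0.8):
    """Return (person_id or None, likelihood) for the best-matching class."""
    scores = match(query, dictionary)
    if not scores:
        return None, 0.0
    person_id = max(scores, key=scores.get)
    likelihood = scores[person_id]
    return (person_id if likelihood >= threshold else None), likelihood
```

Comparing `best_match` against the threshold gives the binary form of the output, while `match` gives the per-class likelihoods.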

Next, the dictionary registration processing is described. FIG. 5 is a block diagram showing a detailed example of the functional configuration of the dictionary image registration unit 5.
As shown in FIG. 5, the dictionary image registration unit 5 includes a dictionary image data generation unit 31, a dictionary image data holding unit 32, a dictionary image data selection unit 33, and a registration candidate image confirmation unit 34.

The dictionary image data generation unit 31 generates, from the image data acquired from the image recording unit 4, the dictionary image data needed to identify individual instances of a pattern. For example, when discriminating the two-class problem of intra-class and extra-class as described in Non-Patent Document 1, a person's face image may typically be used as the dictionary image data. The face image data of a person detected by the face detection processing may also be stored in the dictionary image data holding unit 32 after its size, orientation (in-plane rotation), and so on have been normalized. At that time, the label information that the user set for the dictionary image through the operation unit 9 or the like is recorded together with it. The amount of dictionary data can also be reduced by holding only the data needed at identification time rather than the image data itself. When the identification calculation takes the vector correlation of partial regions of the pattern, it is sufficient to cut out only those partial regions in advance.

The dictionary image data generation unit 31 can also process an image that the user has designated via the dictionary image confirmation unit 7 described later. In this way, the necessary information is extracted from the image as appropriate, subjected to a predetermined conversion described later, and then stored in the dictionary image data holding unit 32 as a feature vector for identifying the object.

In response to a request from the input image identification unit 6 described above, the dictionary image data selection unit 33 reads the necessary dictionary image data from the dictionary image data holding unit 32 and transfers it to the input image identification unit 6. The registration candidate image confirmation unit 34 consists of a display device with which the user checks the dictionary image data input to the dictionary image data generation unit 31, and an interface for performing operations. An example of the processing performed by the registration candidate image confirmation unit 34 is given in the description of the dictionary image confirmation unit 7 below.

Next, the processing performed by the dictionary image confirmation unit 7 is described. FIG. 6 is a block diagram showing a detailed example of the functional configuration of the dictionary image confirmation unit 7.
As shown in FIG. 6, the dictionary image confirmation unit 7 includes a dictionary image relationship calculation unit 41, a dictionary image relationship display unit 42, and a dictionary image relationship operation unit 43.

The dictionary image relationship calculation unit 41 acquires a plurality of items of dictionary image data from the dictionary image data holding unit 32 and performs processing to compute the relationships among those dictionary images as numerical values. The contents of this processing are described in detail later. The dictionary image relationship display unit 42 is a display unit that shows the output of the dictionary image relationship calculation unit 41. The display method is also described in detail later. The dictionary image relationship operation unit 43 is an operation device with which the user performs various operations on the group of dictionary images based on what the dictionary image relationship display unit 42 shows.

Next, the processing performed by the dictionary image relationship calculation unit 41 is described. FIG. 7 is a flowchart showing an example of the processing procedure performed by the dictionary image relationship calculation unit 41.
First, dictionary image data is acquired from the dictionary image data holding unit 32 (step S700). Features are then obtained from the acquired dictionary image data (step S701). The features generated by the input image identification data generation unit 21 may be used as they are. Alternatively, to reduce the computational load, simpler features may be extracted, typically by using the luminance values directly as the feature vector.

Next, the extracted features are taken out as a feature vector (step S702). The dimension of the feature vector is then reduced (step S703). The dimension reduction may use the same method as the processing performed in the input image identification data generation unit 21. Typically, a technique such as principal component analysis (PCA) or independent component analysis (ICA) is used. The number of dimensions after reduction may be the same as that used in the input image identification data generation unit 21, or it may be different. It is then determined whether processing has been completed for all the dictionary image data (step S704). If it has, the process proceeds to step S705; if unprocessed data remains, the process returns to step S701.

Next, relative coordinate calculation processing is performed (step S705). This processing computes the relative relationships among the dictionary images using the dimension-reduced feature vectors of all the dictionary image data obtained in step S703. Specifically, it proceeds as follows. First, a correlation value is calculated for every pair of dictionary images, quantifying the similarity between each pair. The similarities among all the dictionary images are then output as a table to the dictionary image relationship display unit 42 described later. This allows the dictionary image relationship display unit 42 to visualize how each registered image relates to (how similar it is to) the other registered images.
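The similarity table of step S705 can be sketched as below. The normalized inner product is used as the correlation value, which is one of several possible choices. The helper `most_similar_pair` is a hypothetical addition, not something the source describes: it shows one way such a table could point the user at near-duplicate dictionary images.

```python
import math


def normalized_inner_product(a, b):
    """Inner product of two feature vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_table(features):
    """Symmetric table of similarities for every pair of dictionary images."""
    n = len(features)
    table = [[1.0] * n for _ in range(n)]  # an image is fully similar to itself
    for i in range(n):
        for j in range(i + 1, n):
            s = normalized_inner_product(features[i], features[j])
            table[i][j] = table[j][i] = s
    return table


def most_similar_pair(table):
    """Return (similarity, i, j) for the most similar pair, or None if fewer than two images."""
    n = len(table)
    pairs = [(table[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    return max(pairs) if pairs else None
```

Only the upper triangle is computed and then mirrored, since the similarity between images i and j is the same in both directions.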

Alternatively, instead of computing correlation values, the feature vectors can be reduced to two or three dimensions so that their positional relationship can be plotted. In this case, the number of dimensions after reduction in the feature vector dimension reduction of step S703 is set to two or three. When PCA is used, typically the two or three basis vectors with the highest contribution are selected, and the projection of the feature vectors onto the space spanned by those basis vectors is taken as the positional relationship of the dictionary images. Multidimensional scaling (MDS) may also be used. MDS reduces the dimensionality of the feature vectors so that similar items are placed close together and dissimilar items far apart, giving a display that is representative of the feature space.
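The PCA variant can be sketched as below, under stated assumptions: the principal axes are found by power iteration with deflation, chosen here only to keep the example free of external libraries (a real system would more likely call a linear algebra library), and the function names are hypothetical. MDS is not shown.

```python
from math import sqrt

def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def _top_eigenvector(m, iters=500):
    """Dominant eigenvector of a symmetric positive semi-definite matrix (power iteration)."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary non-zero start vector
    norm = sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = _matvec(m, v)
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in the matrix
            break
        v = [x / norm for x in w]
    return v

def pca_project(vectors, k=2):
    """Project feature vectors onto their k highest-variance principal axes."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    axes = []
    for _ in range(k):
        axis = _top_eigenvector(cov)
        eigval = sum(a * b for a, b in zip(axis, _matvec(cov, axis)))
        axes.append(axis)
        # Deflation: subtract the found component so the next pass finds the next axis.
        cov = [[cov[i][j] - eigval * axis[i] * axis[j] for j in range(dim)]
               for i in range(dim)]
    return [[sum(c[j] * a[j] for j in range(dim)) for a in axes] for c in centered]
```

Calling `pca_project(vectors, k=2)` returns one (x, y) pair per dictionary image, which can be plotted directly. Power iteration converges slowly when two eigenvalues are nearly equal, which is acceptable for a rough visualization but is another reason to prefer a library routine in practice.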

By reducing the feature vectors to one, two, or three dimensions in this way and outputting them to the dictionary image relationship display unit 42, the positional relationship of the dictionary images can be visualized. That is, the user can check the dictionary images in a space (the projection space) that is nearly the same as the feature space in which identification is actually performed.

Next, the processing performed by the dictionary image relationship display unit 42 is described. The dictionary image relationship display unit 42 visualizes and displays the output of the dictionary image relationship calculation unit 41, and typically consists of a monitor device such as an LCD.

FIG. 8 is a schematic diagram showing a display example produced by the dictionary image relationship display unit 42.
In FIG. 8, the dictionary images are mapped onto a two-dimensional graph. That is, in the dictionary image relationship display unit 42, the feature vectors of the dictionary images are reduced to two dimensions and the result is displayed on the graph. If reduced-size icons of the dictionary images are shown on the graph, the positional relationship between the dictionary images can be checked more intuitively. A user-defined label (for a human face, a personal ID) may be displayed below each icon.

Among dictionary images displayed through the above processing, images with the same label are expected to appear close together, and images with different labels are expected to appear far apart. If dictionary images with the same label are relatively far apart, it can be inferred that some condition has put them in a state unfavorable for identification. Conversely, dictionary images with different labels that lie close together are also not in a favorable state for identification. By displaying the relationships between dictionary images in this way, the user can intuitively understand the state of the registered dictionary images.
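The two unfavorable situations described above could also be flagged automatically from the projected coordinates. The following is a rough sketch only: this document describes a visual check by the user, so the function `flag_pairs` and its `near` and `far` thresholds are hypothetical additions with arbitrary values.

```python
from itertools import combinations
from math import dist

def flag_pairs(points, near=0.5, far=2.0):
    """points: {image_id: (label, (x, y))}. Thresholds are illustrative only."""
    scattered, confusable = [], []
    for (ia, (la, pa)), (ib, (lb, pb)) in combinations(sorted(points.items()), 2):
        d = dist(pa, pb)
        if la == lb and d > far:
            scattered.append((ia, ib))    # same label but far apart
        elif la != lb and d < near:
            confusable.append((ia, ib))   # different labels but close together
    return scattered, confusable
```

In practice the thresholds would have to be tuned to the scale of the projection space, for example relative to the typical spread of images within a single label.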

The dictionary image confirmation unit 7 and the registration candidate image confirmation unit 34 can also be linked. For example, an input image supplied to the dictionary image registration unit 5 is passed through the registration candidate image confirmation unit 34 to the dictionary image confirmation unit 7, where the dictionary image relationship calculation unit 41 processes it in the same way as the other dictionary image data. The result for the registration candidate image is displayed on the dictionary image relationship display unit 42, showing its relationship to the other dictionary images. This lets the user intuitively understand how the image about to be registered relates to the dictionary images already registered. The processing is even more effective when performed in real time: the user can photograph the subject to be registered while varying the angle of view and the lighting conditions, and immediately check how each shot relates to the group of dictionary images already registered.

As another example of linking the dictionary image confirmation unit 7 and the registration candidate image confirmation unit 34, advice may be given to the user based on the state of the registration candidate image. For example, the same processing as in the linking example above is performed, the registration candidate image is displayed in the dictionary image confirmation unit 7 together with the registered dictionary images, and information is displayed on how the registration candidate image would have to change for its display position to change. As described above, the state of the dictionary images can be checked from the positional relationship displayed on the dictionary image relationship display unit 42. That is, the desirable state is one in which dictionary images with the same label are displayed close together and dictionary images with different labels are displayed far apart. Advice is given so that the input image that is the registration candidate reaches this state relative to the other dictionary images. Specifically, the advice concerns how to change the orientation of the subject's face, the lighting conditions, the facial expression, and so on to obtain a better dictionary image.

The content of the advice can be determined by collecting a sufficient amount of typical subject data in advance and learning from it. That is, it can be realized by determining from sample data what trajectory a change in face orientation or lighting conditions traces in the display space of the dictionary image relationship display unit. In this way, the user can easily set shooting conditions that yield a favorable dictionary image, without trial and error.

(Second Embodiment)
In this embodiment, the configurations of the dictionary image registration unit and the dictionary image confirmation unit differ in part from those of the first embodiment. This embodiment is described in detail below. To avoid duplication, parts that are the same as in the first embodiment are omitted from the following description. The overall configuration of the pattern identification device according to this embodiment is the same as that of FIG. 1 described in the first embodiment, so its description is omitted.

FIG. 9 is a block diagram showing an example of the detailed functional configuration of the dictionary image registration unit 5.
As shown in FIG. 9b, the dictionary image registration unit 5 includes a dictionary image data generation unit 51, a dictionary image data holding unit 52, a dictionary image data selection unit 53, a registration candidate image confirmation unit 54, and a dictionary image deletion unit 55. It differs from the first embodiment in that the dictionary image deletion unit 55 is added and in part of the processing performed by the dictionary image data selection unit 53.

The dictionary image data selection unit 53 not only selects and transfers dictionary image data held by the dictionary image data holding unit 52 in response to a request from the input image identification unit 6, but can also select one or more dictionary images in response to a request from the user via the operation unit 9 or the like. The dictionary image deletion unit 55 deletes the dictionary images selected by the dictionary image data selection unit 53 from the dictionary image data holding unit 52.

FIG. 10 is a block diagram showing an example of the detailed functional configuration of the dictionary image confirmation unit 7.
As shown in FIG. 10, the dictionary image confirmation unit 7 includes a dictionary image relationship calculation unit 61, a dictionary image relationship display unit 62, a dictionary image relationship operation unit 63, a dictionary image group evaluation unit 64, and a dictionary image relationship input unit 65. It differs from the first embodiment in that the dictionary image group evaluation unit 64 and the dictionary image relationship input unit 65 are added.

The dictionary image group evaluation unit 64 evaluates the overall state of all registered dictionary images using the dictionary image data and the calculation results of the dictionary image relationship calculation unit 61. The processing performed by the dictionary image group evaluation unit 64 is described in detail later. The dictionary image relationship input unit 65 performs processing that allows the user to input relationships between the label information of a plurality of dictionary images. The processing performed by the dictionary image relationship input unit 65 is also described later.

Next, the processing of the dictionary image group evaluation unit 64 is described. FIG. 11 is a flowchart showing an example of the processing procedure performed by the dictionary image group evaluation unit 64.
First, all registered dictionary image data is acquired from the dictionary image data holding unit 52 (step S1100). Next, data representing the relationships between all dictionary image data is acquired from the dictionary image relationship calculation unit 61 (step S1101). Then, the attribute values of the dictionary image data acquired in step S1100 are acquired (step S1102). Here, the attribute values of dictionary image data are typically the lighting conditions and the size and orientation of the target object. When the target object is a human face, they may also include facial expression, gender, age, and the like. These attributes may be detected directly from the dictionary image data; attributes such as lighting conditions may instead be estimated from camera parameters recorded at the time of image capture.

Next, the attributes of the dictionary image data acquired in step S1102 are aggregated for each item of dictionary image label information (step S1103). That is, for dictionary image data assigned the same label information, the variation of the attributes is measured. Then, the dictionary image relationship data acquired in step S1101 and the attribute variation within same-label dictionary images obtained in step S1103 are evaluated together to calculate an evaluation value for the dictionary image group (step S1104).

The evaluation value can be calculated based on, for example, the number of dictionary image labels, the number of dictionary images belonging to the same label, and the variation of attributes. In general, the fewer the labels, the fewer the classes to be distinguished, so a higher evaluation value can be given to the dictionary image group. Likewise, the more dictionary image data belongs to the same label and the more varied its attributes, the better the identification performance, so the evaluation value of the dictionary image group can be made higher.
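The per-label aggregation of step S1103 and a count-based evaluation can be sketched as follows. This document names the criteria but gives no formula, so `toy_score` is a purely hypothetical combination that only reproduces the stated tendencies (fewer labels, more images per label, and more attribute variety all raise the score). The attribute names in the usage note are likewise invented.

```python
from collections import defaultdict

def aggregate_by_label(records):
    """records: iterable of (label, attributes dict). Returns per-label statistics."""
    stats = defaultdict(lambda: {"count": 0, "values": defaultdict(set)})
    for label, attrs in records:
        entry = stats[label]
        entry["count"] += 1
        for name, value in attrs.items():
            entry["values"][name].add(value)  # distinct values seen per attribute
    return stats

def toy_score(stats):
    """Illustrative score only; not a formula taken from the source document."""
    n_labels = len(stats)
    if n_labels == 0:
        return 0.0
    per_label = [
        s["count"] + sum(len(v) for v in s["values"].values())
        for s in stats.values()
    ]
    # Mean per-label richness, divided again by the label count as a penalty.
    return sum(per_label) / n_labels ** 2
```

For example, records such as `("alice", {"lighting": "indoor", "pose": "front"})` would be aggregated so that each label ends up with an image count and, per attribute, the set of distinct values observed.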

Instead of the number or attributes of the dictionary images, feature quantities can also be used for the evaluation. For example, the distributions of feature quantities belonging to the same label and of feature quantities belonging to different labels can be measured, and their degree of separation added to the evaluation value. The degree of separation of the feature vectors may be obtained using equation (1) below.

Here, the function d gives the distance between two vectors, typically their Euclidean distance. The degree of separation D is the average of the distances between dictionary images belonging to labels C1 and C2, computed as above. In general, the higher the degree of separation between dictionary images belonging to different labels, the higher the identification performance is expected to be. Accordingly, the degree of separation can be obtained for every combination of labels and averaged to give the evaluation value of the dictionary image group.
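Equation (1) itself is not reproduced in this text. Judging only from the description above, D for labels C1 and C2 is presumably the sum of d(x, y) over all x in C1 and y in C2, divided by the number of such pairs; the sketch below implements that reading, with Euclidean distance for d. The function names are hypothetical.

```python
from itertools import combinations
from math import dist

def separation(c1, c2):
    """Mean Euclidean distance between every vector in c1 and every vector in c2."""
    return sum(dist(x, y) for x in c1 for y in c2) / (len(c1) * len(c2))

def mean_separation(by_label):
    """Average the separation over every pair of distinct labels.

    by_label: {label: list of feature vectors}; needs at least two labels.
    """
    pairs = list(combinations(sorted(by_label), 2))
    return sum(separation(by_label[a], by_label[b]) for a, b in pairs) / len(pairs)
```

The value returned by `mean_separation` corresponds to the evaluation value obtained by averaging D over all label combinations.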

Alternatively, instead of the distance between feature quantities as in equation (1), the ratio between the variance among feature quantities belonging to the same label and the variance between groups of feature quantities belonging to different labels can be used. For example, a quantity such as that in equation (2) below is obtained.

In equation (2), the denominator of J represents the variance among feature vectors belonging to the same label, and the numerator represents the variance between feature vectors belonging to different labels. In general, identification performance is considered to improve when feature vectors belonging to different labels are far apart and feature vectors belonging to the same label are tightly grouped. That is, it is desirable for the variance between feature vectors belonging to different labels to be large and the variance of feature vectors belonging to the same label to be small. Therefore, the larger J is, the higher the evaluation value of the dictionary image group can be made.
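Equation (2) is likewise not reproduced in this text, so its exact form is unknown. The sketch below uses one common scalar form of such a ratio, assumed here for illustration: the size-weighted squared distance of each label mean from the overall mean (between-label variance) divided by the squared distance of each vector from its own label mean (within-label variance). The name `variance_ratio` is hypothetical.

```python
def _mean(vectors):
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def _sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def variance_ratio(by_label):
    """Between-label variance divided by within-label variance (assumed scalar form)."""
    all_vectors = [v for vs in by_label.values() for v in vs]
    total = len(all_vectors)
    grand_mean = _mean(all_vectors)
    between = within = 0.0
    for vectors in by_label.values():
        centre = _mean(vectors)
        between += len(vectors) * _sqdist(centre, grand_mean)
        within += sum(_sqdist(v, centre) for v in vectors)
    if within == 0.0:
        return float("inf")  # every label collapses to a single point
    return (between / total) / (within / total)
```

Moving the label groups further apart while keeping each group equally tight increases the returned value, matching the statement that a larger J indicates a better dictionary image group.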

By calculating the evaluation value of the dictionary image group as described above and displaying it on the dictionary image confirmation unit, the user can easily check the state of the dictionary images registered in the device.

Next, the dictionary image relationship input unit 65 is described. The dictionary image relationship input unit 65 provides a function that allows the user to input relationships between dictionary image labels. Here, a relationship between dictionary image labels is, for example, the relationship between persons when the target object is a human face and the label is a person's name. More specifically, blood relationships such as "parent and child" or "siblings" are input. Relationships other than blood relationships, such as "friend" or "colleague", may also be input.

The relationships between registered dictionary image labels input by the user as described above can be used in the evaluation by the dictionary image group evaluation unit 64. For example, when the evaluation computes the average degree of separation of dictionary images belonging to different labels, the relationships between the labels can be used as weights. In general, the degree of separation of feature vectors is expected to show different tendencies depending on whether or not the persons are blood relatives such as parent and child or siblings. That is, the feature vectors of persons who are parent and child or siblings are expected to show a lower degree of separation than those of unrelated persons, even though they carry different labels. If the degrees of separation are simply averaged, a dictionary image group that contains no blood-related labels receives a higher evaluation value than one that does. Likewise, when dictionary images are additionally registered, adding an unrelated person gives the dictionary image group a better evaluation value than adding a blood relative. To mitigate this effect, the weight of the feature vector degree of separation between different labels should be lowered when the labels are blood-related.
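The weighting described above can be sketched as a weighted mean of the per-pair degrees of separation. The function name, the relationship names, and the weight values are hypothetical; this document only states that blood-related label pairs should receive a lower weight.

```python
def weighted_mean_separation(separations, relations, weights, default=1.0):
    """Weighted mean of per-pair separations.

    separations: {(label_a, label_b): D}
    relations:   {(label_a, label_b): relationship name}, same key order as above
    weights:     {relationship name: weight}; pairs without an entry use `default`
    """
    num = den = 0.0
    for pair, d in separations.items():
        w = weights.get(relations.get(pair), default)
        num += w * d
        den += w
    return num / den if den else 0.0
```

For instance, with separations of 1.0 for a parent-child pair and 4.0 for each of two unrelated pairs, a weight of 0.5 on the parent-child pair raises the result from the plain mean of 3.0 to 3.4, so the related pair drags the evaluation value down less.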

(Other Embodiments)
The present invention can also be realized by the following processing: software (a program) that implements the functions of the embodiments described above is supplied to a system or device via a network or any of various storage media, and a computer (or a CPU, MPU, or the like) of that system or device reads out and executes the program.

5: dictionary image registration unit; 6: input image identification unit; 7: dictionary image confirmation unit

Claims

A pattern identification apparatus comprising:
registration means for registering images in a dictionary; and
confirmation means for confirming dictionary images already registered by the registration means,
wherein the confirmation means displays a relationship between a plurality of dictionary images.

The pattern identification apparatus according to claim 1, wherein the relationship between the dictionary images is a positional relationship on a feature space used for pattern identification.

The pattern identification apparatus according to claim 1, wherein the relationship between the dictionary images is a positional relationship on a projection space obtained by projecting a feature amount used for pattern identification.

A pattern identification method comprising:
a registration step of registering images in a dictionary; and
a confirmation step of confirming dictionary images already registered in the registration step,
wherein, in the confirmation step, a relationship between a plurality of dictionary images is displayed.

A program for causing a computer to execute each step of the pattern identification method according to claim 4.