JP2010282367A

JP2010282367A - Learning device and learning method

Info

Publication number: JP2010282367A
Application number: JP2009134305A
Authority: JP
Inventors: Hiroshi Torii; 寛鳥居
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2009-06-03
Filing date: 2009-06-03
Publication date: 2010-12-16

Abstract

PROBLEM TO BE SOLVED: To configure a discriminator that performs pattern identification of input data at high speed and with high accuracy. SOLUTION: A branch node determines, based on a parameter, the node to be activated next. A discrimination node discriminates, based on the parameter, whether the input data belongs to a second class. The problem is solved by providing: a multivariate analysis means for obtaining a direction vector by performing multivariate analysis on the feature vectors of learning data belonging to a first class; a division plane decision means for deciding a division plane that is perpendicular to the direction vector obtained by the multivariate analysis means and that divides the feature space of the learning data; and a parameter decision means for deciding the parameter of the branch node based on the division plane decided by the division plane decision means. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a learning apparatus and a learning method.

Pattern identification using linear discrimination has long been widely practiced. Non-Patent Document 1 describes several examples of linear discrimination.
In brief, linear discrimination represents input data as a feature vector in a multidimensional space and divides the feature space spanned by these feature vectors with a hyperplane. The input data is then identified according to which side of the hyperplane its feature vector lies on. Furthermore, if a plurality of hyperplanes are prepared, the feature vectors in the region enclosed by these hyperplanes can be identified as one class. Non-Patent Document 2 discloses such an example.
Such a discriminator is relatively fast, but it adopts a structure in which linear discriminations are combined by logical product (AND). Consequently, the set of feature vectors to be identified can be expressed only as one side of a hyperplane or as a convex polyhedron in the feature space. In other words, a set with concavities cannot be expressed.
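The limitation described above can be made concrete with a minimal sketch. The function names and the unit-square example are assumptions chosen for illustration and do not come from the cited documents; the sketch only shows that a logical product of linear discriminations accepts exactly the points inside a convex region.

```python
import numpy as np

def linear_discriminate(x, w, theta):
    """True if feature vector x lies on the positive side of the hyperplane w.x = theta."""
    return float(np.dot(w, x)) > theta

def in_convex_region(x, hyperplanes):
    """Logical product (AND) of several linear discriminations."""
    return all(linear_discriminate(x, w, theta) for w, theta in hyperplanes)

# Hypothetical example: the open unit square 0 < x < 1, 0 < y < 1 as four half-planes.
square = [
    (np.array([1.0, 0.0]), 0.0),    # x > 0
    (np.array([-1.0, 0.0]), -1.0),  # x < 1
    (np.array([0.0, 1.0]), 0.0),    # y > 0
    (np.array([0.0, -1.0]), -1.0),  # y < 1
]
```

Because every half-space is convex and an intersection of convex sets is convex, no choice of hyperplanes in this structure can carve out a region with a dent in it.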

Several methods have been proposed to overcome this problem. The most common is to combine the results of the individual linear discriminations by an operation other than logical product. One such method uses the piecewise discriminant function also described in Non-Patent Document 1. The idea is to cover the surface of the set with a plurality of polygons. Another method uses a decision tree. A typical decision tree is built by supervised learning. When a decision tree is constructed, it is customary to use the concept of the impurity of the branch destination nodes, which is the variation in the kinds of input data that reach a branch destination node. The branch condition is normally determined so that this impurity decreases. Non-Patent Document 3 proposes a method for constructing a decision tree without supervision. However, Non-Patent Document 3 also introduces the concept of impurity and determines the branch condition so that it decreases.
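As a rough illustration of the impurity idea mentioned above, the following sketch uses Gini impurity. This is one common measure and is an assumption here; the cited documents may define impurity differently. The function names are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of the class labels reaching a node (0.0 means a pure node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((k / total) ** 2 for k in counts.values())

def split_impurity(left, right):
    """Size-weighted impurity of two branch destination nodes (at least one must be non-empty)."""
    total = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / total
```

A conventional tree builder would compare candidate branch conditions by a score such as `split_impurity` and keep the one with the lowest value.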

Another way to represent a set with concavities is to increase the dimension of the feature space. For example, Non-Patent Document 2 introduces the concept of a strong classifier obtained by adding the results of a plurality of weak classifiers. However, increasing the dimension of the feature space does not solve the essential problem that only non-concave polyhedra can be expressed in the higher-dimensional feature space.
The method of constructing a decision tree without supervision has much in common with clustering. Clustering divides data whose classes are unknown into a plurality of sets. Non-Patent Document 4 proposes one clustering method that divides input data hierarchically. In this method, each resulting cluster is successively divided into two, and the two resulting clusters are formed so that each is as close to a Gaussian distribution as possible.

Non-Patent Document 1: Ishii, Ueda, Maeda, Murase (1998) "わかりやすいパターン認識" (Easy-to-Understand Pattern Recognition), Ohmsha.
Non-Patent Document 2: Viola & Jones (2001) "Rapid Object Detection using a Boosted Cascade of Simple Features", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 1, p. 551.
Non-Patent Document 3: Basak & Krishnapuram (2005) "Interpretable Hierarchical Clustering by Constructing an Unsupervised Decision Tree", IEEE Transactions on Knowledge and Data Engineering, Vol. 17(1), p. 121.
Non-Patent Document 4: Miasnikov, Rome & Haralick (2004) "A Hierarchical Projection Pursuit Clustering Algorithm", Pattern Recognition, Proceedings of the International Conference on, Vol. 1, p. 268.

As described above, the conventional techniques cannot configure a discriminator that identifies patterns in input data at high speed and with high accuracy.

The present invention has been made in view of this problem, and its object is to configure a discriminator that identifies patterns in input data at high speed and with high accuracy.

Accordingly, the present invention is a learning apparatus that determines the parameters of each node of a discriminator that has a tree structure in which a plurality of identification nodes and branch nodes are connected and that classifies input data into two classes, S and ~S, of a feature space. The branch node is a node that determines, based on the parameters, the node to be activated next. The identification node is a node that identifies, based on the parameters, whether the input data belongs to ~S. The learning apparatus comprises: multivariate analysis means for obtaining a direction vector by performing, based on learning data belonging to either S or ~S, multivariate analysis on the feature vectors of the learning data belonging to S; dividing plane determination means for determining a dividing plane that is perpendicular to the direction vector obtained by the multivariate analysis means and that divides the feature space of the learning data; and parameter determination means for determining the parameters of the branch node based on the dividing plane determined by the dividing plane determination means.
With this configuration, the parameters of the branch nodes can be determined appropriately, for example when constructing a decision tree for pattern identification, so that a discriminator that identifies patterns in input data at high speed and with high accuracy can be configured.
The present invention may also be embodied as a learning method.

According to the present invention, a discriminator that identifies patterns in input data at high speed and with high accuracy can be configured.

FIG. 1 is a block diagram showing an example of the hardware configuration of an information processing apparatus according to an embodiment.
FIG. 2 is a flowchart showing the flow of processing when a face is detected.
FIG. 3 is a data flowchart of FIG. 2.
FIG. 4 is a diagram showing the structure of the data representing the pattern identification parameters 211.
FIG. 5 is a diagram showing the data structure of a type T1 node.
FIG. 6 is a diagram showing the data structure of a type T2 node.
FIG. 7 is a flowchart showing the details of step S203 in FIG. 2.
FIG. 8 is a diagram depicting how input images are branched.
FIG. 9 is a flowchart showing an example of the general learning process for node N3.
FIG. 10 is a flowchart showing the details of step F01 in FIG. 9.
FIG. 11 is a flowchart (part 1) showing the details of step F03 in FIG. 9.
FIG. 12 is a flowchart (part 2) showing the details of step F03 in FIG. 9.
FIG. 13 is a flowchart (part 1) showing the details of step F0311 in FIG. 12.
FIG. 14 is a flowchart showing an example of the process of obtaining the projection vector q' that minimizes the objective function.
FIG. 15 is a flowchart (part 2) showing the details of step F0311 in FIG. 12.

Embodiments of the present invention are described below with reference to the drawings.

<Embodiment 1>
This embodiment is an example of an information processing apparatus that determines whether an input image contains a face. To keep the embodiment simple, the input image is assumed to be a grayscale image in which a face, if present, appears at a roughly fixed size near the center, as in a passport photograph. A face of any size at any position can be detected by additionally scanning the image or enlarging or reducing it. The luminance values are also assumed to be normalized. Normalization methods include subtracting the average luminance and dividing by the standard deviation of the luminance.
FIG. 1 is a block diagram showing an example of the hardware configuration of the information processing apparatus according to the embodiment. In FIG. 1, the CPU (central processing unit) 100 executes, according to a program, the pattern identification parameter learning method described in the embodiment. The program memory 101 stores the program executed by the CPU 100. The RAM 102 provides memory for temporarily storing various kinds of information while the CPU 100 executes the program. The hard disk 103 is a storage medium for storing image files, pattern identification parameters, and the like. The display 104 is a device that presents the processing results of this embodiment to the user. The bus 110 is a control bus and data bus that connects these units to the CPU 100.
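The luminance normalization assumed for the input image can be sketched as follows. Applying both methods named in the text together (mean subtraction followed by division by the standard deviation) is an assumption of this sketch, as are the function name and the small constant `eps`.

```python
import numpy as np

def normalize_luminance(image, eps=1e-8):
    """Subtract the mean luminance, then divide by the luminance standard deviation."""
    image = np.asarray(image, dtype=np.float64)
    centered = image - image.mean()
    std = centered.std()
    # eps (an assumption of this sketch) avoids division by zero on flat images.
    return centered / (std + eps)
```

After this step, images taken under different lighting have comparable rectangle sums, which is what the thresholds in the later nodes rely on.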

FIG. 2 is a flowchart showing the flow of processing when a face is detected.
First, in step S201, the CPU 100 reads an image from the hard disk 103 into the RAM 102. The image is held in the RAM 102 as a two-dimensional array. In the next step S202, the CPU 100 reads the pattern identification parameters, created by the learning method described later, from the hard disk 103 into the RAM 102. In step S203, the CPU 100 uses the pattern identification parameters read in step S202 to determine whether the image read in step S201 contains a face. In the next step S204, the CPU 100 displays the result on the display 104.

FIG. 3 shows FIG. 2 redrawn as a data flowchart. Reference numeral 205 denotes an image stored on the hard disk 103. In the image reading process 201, the image 205 on the hard disk is stored in the RAM 102 as the input image I. Reference numeral 209 denotes the pattern identification parameters stored on the hard disk 103. In the pattern identification parameter reading process 210, the pattern identification parameters 209 on the hard disk 103 are stored in the RAM 102 as the pattern identification parameters 211. In the detection process 203, the CPU 100 uses the input image I and the pattern identification parameters 211 to determine whether the input image I contains a face, and writes the answer to the RAM 102 as the detection result 207. In the detection result display process 204, the CPU 100 displays the contents of the detection result 207 on the display 104.

The contents of the pattern identification parameters 211 are now described briefly with reference to FIGS. 4, 5, and 6. The method of creating the pattern identification parameters 211 is described later. FIG. 4 is a diagram showing the structure of the data representing the pattern identification parameters 211. In FIG. 4, each square represents a node of the tree structure, and each arrow represents the order in which the nodes are processed. The pattern identification parameters 211 have a structure in which two kinds of nodes, denoted type T1 and type T2, are connected in a tree. A type T1 node is an identification node and is followed by exactly one node. A type T2 node is a branch node and is followed by a plurality of nodes. The node labeled N3 is also a type T2 node. This embodiment can be applied to many kinds of detectors (discriminators) regardless of the kind of type T1 node, but the example here uses a weak classifier as described in Non-Patent Document 2 for the type T1 nodes. Detectors using linear discriminant analysis (LDA), a support vector machine (SVM), or the like can also be used, as can a detector formed by connecting these.

The following description uses the terms parameter, branch destination A, branch destination B, set F⁺, and set G⁺, whose contents differ depending on the node of interest. Values obtained from them also differ from node to node. To avoid clutter, this embodiment omits the subscript that would indicate the node.

The weak classifier described in Non-Patent Document 2 corresponds to a type T1 node in FIG. 4. FIG. 5 is a diagram showing the data structure of a type T1 node. A plurality of such data items are stored in the RAM 102, normally with different values in each. The node type is stored first. Because this node is of type T1, a code representing T1 is stored as the node type. Rectangle information is stored next. The rectangle information begins with the number of rectangles n, followed by the coordinates (upper-left and lower-right points) of those n rectangles. These rectangles are collectively called a rectangle group. The abort parameters are stored next. Here, "abort" (explained later with reference to FIG. 7) means, in short, deciding at an early stage that the input image contains no face. The abort parameters begin with a threshold θ, followed by n signs, one per rectangle, that are used in the abort calculation. A sign here is +1 or -1. A pointer to the next node is stored last.
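The type T1 record described above might be modeled as in the following sketch. The class name, field names, and the (left, top, right, bottom) tuple layout are assumptions for illustration; the source specifies only which items are stored and in what order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A rectangle as (left, top, right, bottom) pixel coordinates; this layout is an assumption.
Rect = Tuple[int, int, int, int]

@dataclass
class T1Node:
    """Identification node (type T1), mirroring the items listed for FIG. 5."""
    rects: List[Rect]                   # rectangle group; n = len(rects)
    theta: float                        # abort threshold
    signs: List[int]                    # one sign (+1 or -1) per rectangle
    next_node: Optional[object] = None  # pointer to the next node
    node_type: str = "T1"               # code identifying the node type

    def __post_init__(self):
        # The record stores exactly n signs for n rectangles.
        if len(self.signs) != len(self.rects):
            raise ValueError("need exactly one sign per rectangle")
        if any(s not in (1, -1) for s in self.signs):
            raise ValueError("signs must be +1 or -1")
```

Storing n implicitly as `len(rects)` is a convenience of this sketch; a packed binary layout like the one in FIG. 5 would store the count explicitly.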

FIG. 6 is a diagram showing the data structure of a type T2 node. A plurality of such data items are also stored in the RAM 102, normally with different values in each. The node type is stored first. Because this node is of type T2, a code representing T2 is stored as the node type. Rectangle information is stored next: the number of rectangles n, followed by the coordinates (upper-left and lower-right points) of those n rectangles. The parameters for branch destination A come next. Like the abort parameters, they hold a threshold and the coefficients for the rectangles, and they additionally hold a pointer to branch destination node A. The parameters of another node are stored at the location this pointer indicates. A pointer to the other branch destination, node B, is stored last.
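A type T2 record could be sketched in the same spirit. The class, field, and method names are hypothetical; `select_branch` anticipates the rule given for steps D26 to D30, in which branch destination A is chosen when the inner product exceeds the threshold and branch destination B otherwise.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom); layout is an assumption

@dataclass
class T2Node:
    """Branch node (type T2), mirroring the items listed for FIG. 6."""
    rects: List[Rect]                  # rectangle information; n = len(rects)
    theta_a: float                     # threshold for branch destination A
    coeffs_a: List[float]              # one coefficient per rectangle
    branch_a: Optional[object] = None  # pointer to branch destination node A
    branch_b: Optional[object] = None  # pointer to branch destination node B
    node_type: str = "T2"              # code identifying the node type

    def select_branch(self, rect_sums):
        """Return branch A if the inner product exceeds theta_a, otherwise branch B."""
        c = sum(a * b for a, b in zip(self.coeffs_a, rect_sums))
        return self.branch_a if c > self.theta_a else self.branch_b
```

Here `rect_sums` stands for the luminance sums of the node's rectangles, computed elsewhere.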

Before the method of creating these parameters is explained, the method of detecting a face using them is described. In the overall flow of the detection process, the CPU 100 follows the nodes in FIG. 4 in order, starting from the root node (the node drawn at the top of the figure). If the node being processed is a type T1 node, the CPU 100 uses the node-specific parameters shown in FIG. 5 to determine whether the input image I contains a face. If it determines that there is most likely no face, the CPU 100 stops the processing there. Otherwise, the CPU 100 moves on to the next node. If the node being processed is a type T2 node, the CPU 100 uses the node-specific parameters shown in FIG. 6 to decide which node to process next. By following the nodes in this way, the CPU 100 decides at each type T1 node whether to abort or continue and selects a branch destination node at each type T2 node. A type T1 node, that is, an identification node, is a node that, in a discriminator classifying input data into the two classes S (first class) and ~S (second class) of the feature space, identifies based on the parameters whether the input data belongs to ~S. A type T2 node, that is, a branch node, is a node that determines, based on the parameters, the node to be activated next.

FIG. 7 is a flowchart showing the details of step S203 in FIG. 2. In the first step D10, the CPU 100 initializes the pointer variable p to point to the first node. In the next step D02, the CPU 100 checks the type of the node that p points to. If the node is of type T1, the CPU 100 proceeds to step D11. If it is of type T2, the CPU 100 proceeds to step D21.
In step D11, the CPU 100 initializes the variable c to 0. The CPU 100 then repeats the loop from step D12 to step D15 n times, once per rectangle, with i as the loop variable that identifies the rectangle. In step D13, the CPU 100 obtains the diagonal coordinates (x_iL, y_iT)-(x_iR, y_iB) of rectangle i from the node information in FIG. 5 and computes the sum of the luminance values inside that rectangle in the input image I. This sum is denoted b_i. As described in Non-Patent Document 2, b_i can be computed quickly using an integral image. In step D14, the CPU 100 adds the product of b_i and the sign a_i of rectangle i to the variable c. In summary, this loop computes the following sum:
\[ c = \sum_{i=1}^{n} a_i b_i \]
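The loop of steps D12 to D15 can be sketched with an integral image as follows. The function names, the (left, top, right, bottom) rectangle layout, and the convention that both corner points are inclusive are assumptions of this sketch.

```python
import numpy as np

def integral_image(image):
    """Cumulative sums padded with a zero row and column: ii[y, x] = sum of image[:y, :x]."""
    image = np.asarray(image, dtype=np.float64)
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1))
    ii[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, rect):
    """Luminance sum b_i inside rect = (x_left, y_top, x_right, y_bottom), corners inclusive."""
    xl, yt, xr, yb = rect
    return ii[yb + 1, xr + 1] - ii[yt, xr + 1] - ii[yb + 1, xl] + ii[yt, xl]

def signed_sum(ii, rects, signs):
    """c = sum over i of a_i * b_i, as accumulated in steps D12 to D15."""
    return sum(a * rect_sum(ii, r) for a, r in zip(signs, rects))
```

Each rectangle sum costs four array lookups regardless of the rectangle's size, which is why the integral image makes the per-node work cheap.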

In step D16, the CPU 100 determines whether this sum c exceeds the threshold θ in FIG. 5. If it does, the CPU 100 proceeds to step D17 and writes the value "false" to the detection result 207, which indicates that no face was detected. The processing of the tree shown in FIG. 4 is aborted at this point. If the CPU 100 determines in step D16 that the sum c does not exceed the threshold θ, it proceeds to step D18, where it checks whether all nodes have been processed. If they have, the CPU 100 writes the value "true" to the detection result 207 in step D19, which indicates that a face was detected. If not all nodes have been processed in step D18, the CPU 100 stores the pointer to the next node in the pointer variable p in step D05 and returns control to step D02.
If the node that p points to in step D02 is of type T2, the CPU 100 executes the processing from step D21. First, in step D21, the CPU 100 initializes the variable c to 0. Then, in the loop from step D22 to step D25, the CPU 100 computes the following inner product, where a_Ai is the coefficient of rectangle i in FIG. 6:
\[ c = \sum_{i=1}^{n} a_{Ai} b_i \]

In step D26, the CPU 100 checks whether the inner product c exceeds the threshold θ_A in FIG. 6. If it does, the CPU 100 proceeds to step D28, assigns the pointer to branch destination node A in FIG. 6 to the pointer variable p, and resumes processing from step D02. If the threshold is not exceeded in step D26, the CPU 100 proceeds to step D30, assigns the pointer to branch destination node B in FIG. 6 to the pointer variable p, and resumes processing from step D02. FIG. 8 illustrates this: it depicts how input images are branched. The circles and triangles in FIG. 8 are the feature vectors (b_i) of input images I. An input image that is a face is drawn as a circle (E00 or E01), and one that is not a face is drawn as a triangle (E10 or E11). E02 is the hyperplane on which c = θ_A. E03 is the vector a_A = (a_A1, a_A2, ..., a_An), which is the normal vector of the hyperplane E02. Under the branch condition above, the face images shown as black circles E01 and the non-face images shown as black triangles E11 are routed to branch destination A, while the face images shown as white circles E00 and the non-face images shown as white triangles E10 are routed to branch destination B. This processing moves through the nodes of the tree in FIG. 4. As FIG. 4 shows, type T2 nodes can also be placed in succession, which allows more complex branching. Alternatively, preparing a plurality of thresholds makes it possible to select one of three or more branch destinations.
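The whole traversal of FIG. 7 can be summarized in a short sketch. Representing nodes as dictionaries, the key names, and treating a missing next pointer as "all nodes processed" (step D18) are assumptions made to keep the example self-contained; rectangle sums are computed directly rather than with an integral image.

```python
import numpy as np

def rect_sum(image, rect):
    """Luminance sum inside rect = (x_left, y_top, x_right, y_bottom), corners inclusive."""
    xl, yt, xr, yb = rect
    return float(image[yt:yb + 1, xl:xr + 1].sum())

def detect(image, root):
    """Walk the tree from the root; return True (face) unless a T1 node aborts."""
    p = root
    while p is not None:
        c = sum(a * rect_sum(image, r) for a, r in zip(p["coeffs"], p["rects"]))
        if p["type"] == "T1":
            if c > p["theta"]:
                return False   # abort: judged to contain no face (step D17)
            p = p["next"]      # move on to the next node (step D05)
        else:                  # type T2: choose a branch destination (steps D28 and D30)
            p = p["branch_a"] if c > p["theta"] else p["branch_b"]
    return True                # every node on the path passed (step D19)
```

A small hypothetical tree shows the two outcomes:

```python
leaf_a = {"type": "T1", "rects": [(0, 0, 1, 1)], "coeffs": [1], "theta": 10.0, "next": None}
leaf_b = {"type": "T1", "rects": [(0, 0, 1, 1)], "coeffs": [1], "theta": 3.0, "next": None}
root = {"type": "T2", "rects": [(0, 0, 0, 0)], "coeffs": [1.0], "theta": 1.5,
        "branch_a": leaf_a, "branch_b": leaf_b}
dark, bright = np.ones((2, 2)), 2 * np.ones((2, 2))
results = (detect(dark, root), detect(bright, root))
```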

The learning procedure for obtaining the parameters of the type T1 node shown in FIG. 5 is as described in Non-Patent Document 2. It is easiest to think of the candidates for the rectangles in FIG. 5 as being given in advance, before learning. Let the set of these rectangles be R = {r_i | i = 1, ..., N_r}. The set R may of course be generated systematically or by random numbers.
The learning procedure of this embodiment for obtaining the parameters of the type T2 node shown in FIG. 6 is as follows. As a premise, a set F = {f_j | j = 1, ..., N_f} of face images f_j for learning and a set G = {g_j | j = 1, ..., N_g} of learning images g_j that contain no face are assumed to be available. It is further assumed that the tree structure of FIG. 4 is determined in advance and that memory for holding the parameters has been allocated in the RAM 102. As just one example, branch nodes can be arranged as in FIG. 4 so that a branch occurs at every second node until the number of branches reaches three. At this point the pointer values in FIGS. 5 and 6 are also fixed and can be stored. Now suppose that learning has been completed from the node labeled T1 in FIG. 4 up to the node immediately before the node labeled N3 (that is, the node labeled T2 here). When the detection process described above is applied, some of the learning images are rejected (aborted) as containing no face at the nodes before N3, or are routed to other branch destinations by type T2 nodes. For learning at node N3, the CPU 100 therefore uses the set F⁺ = {f_j⁺ | j = 1, ..., N_f⁺} of face images f_j⁺ and the set G⁺ = {g_j⁺ | j = 1, ..., N_g⁺} of non-face images g_j⁺ that have been neither rejected nor routed to other branch destinations up to that point.

FIG. 9 shows the general flow of learning for the node N3. First, in step F01, the CPU 100 represents the learning images F⁺ as a set of feature vectors. Next, in step F03, the CPU 100 uses the set of feature vectors to determine a hyperplane in the feature space that divides the feature space of the learning data (dividing plane determination), and writes it as the parameters of the node N3 (parameter determination).
Next, the details of each of these steps will be described.
FIG. 10 is a flowchart showing the details of step F01 in FIG. 9. The loop from step F0101 to step F0107 processes each face image f_j⁺ belonging to the learning images F⁺. The loop from step F0103 to step F0105 is repeated for each rectangle r_i belonging to the rectangle candidate set R. In step F0104, the CPU 100 assigns to the element b_ji^f of a two-dimensional array the sum of the luminance values of the pixels inside the rectangle r_i on the face image f_j⁺. Through the above processing, an N_r-dimensional feature vector is associated with each image in the learning image set F⁺. Each dimension of the feature space corresponds to the sum of luminance values within one rectangle. The set of feature vectors corresponding to the learning image set F⁺ is denoted B^{F+} = {b_j^f | b_j^f = (b_j1^f, b_j2^f, ..., b_jNr^f)}.
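The double loop of steps F0101 to F0107 can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: the rectangle layout `(left, top, width, height)` and the function names are assumptions introduced here, and a practical implementation would likely use an integral image instead of summing pixels directly.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]   # luminance values, indexed as image[y][x]
Rect = Tuple[int, int, int, int]    # (left, top, width, height): hypothetical layout


def rectangle_sum(image: Image, rect: Rect) -> float:
    """Sum of the luminance values of the pixels inside one rectangle (step F0104)."""
    left, top, width, height = rect
    return float(sum(
        image[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    ))


def feature_vectors(images: Sequence[Image], rects: Sequence[Rect]) -> List[List[float]]:
    """Return b[j][i], the luminance sum of rectangle r_i on image f_j+."""
    # Outer loop: images (F0101..F0107); inner loop: rectangles (F0103..F0105).
    return [[rectangle_sum(img, r) for r in rects] for img in images]


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    rects = [(0, 0, 2, 2), (1, 1, 2, 2)]
    print(feature_vectors([img], rects))  # prints [[12.0, 28.0]]
```

Each row of the result corresponds to one feature vector b_j^f, and its length equals the number of candidate rectangles N_r.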

In step F03 in FIG. 9, the CPU 100 determines a dividing hyperplane that is perpendicular to a direction vector derived from the vectors obtained in step F01 and that divides the feature space of the learning data (dividing plane determination). A dividing hyperplane is an example of a dividing plane. The flow of step F03 is shown in the flowchart of FIG. 11. FIG. 11 is a flowchart (part 1) showing the details of step F03 in FIG. 9.
First, in step F0301, the CPU 100 obtains the first principal component direction vector of the set B^{F+} of face feature vectors. The first principal component direction here is the direction in which the spread of the set B^{F+} is largest. The CPU 100 can obtain the principal component direction vector using a multivariate analysis technique such as singular value decomposition (SVD) or principal component analysis (PCA). Alternatively, the CPU 100 can obtain the direction vector using a multivariate analysis technique such as independent component analysis (ICA). The principal component direction vector obtained here is denoted d = (d_1, d_2, ..., d_Nr). Next, in step F0302, the CPU 100 reduces the dimension of the principal component direction vector d. More specifically, with n being a predetermined value, the CPU 100 takes out the n components of d with the largest absolute values and sets them as a_A = (a_A1, a_A2, ..., a_An). Because a larger n requires correspondingly more computation time, n must not be made too large. Each dimension of d corresponds to one rectangle in R. Using this correspondence, the CPU 100 can list the rectangles corresponding to the elements of a_A. Let this list be r_A = (r_A1, r_A2, ..., r_An). The CPU 100 writes each component of a_A into the corresponding area of the parameters for branch destination A in FIG. 6, and writes n and the coordinates of each rectangle in r_A as the rectangle information in FIG. 6. The dimension reduction method is not limited to the above. For example, the CPU 100 can take out the n components for which the product of the absolute value in the vector d and the area of the corresponding rectangle is largest. The CPU 100 may also omit dimension reduction altogether.
The principal component direction vector is an example of a direction vector.
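Steps F0301 and F0302 can be illustrated with the following Python sketch. It is only one possible realization under stated assumptions: the text names SVD or PCA without fixing an algorithm, so power iteration on the centered data is used here as a stand-in, and the function names, the iteration count, and the seeded random start are all hypothetical choices.

```python
import math
import random
from typing import List, Sequence, Tuple


def first_principal_direction(vectors: Sequence[Sequence[float]],
                              iterations: int = 200, seed: int = 0) -> List[float]:
    """Approximate the first principal component direction d by power iteration."""
    count, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / count for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in vectors]
    rng = random.Random(seed)
    d = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random start (assumption)
    for _ in range(iterations):
        # Apply the unnormalized covariance matrix X^T X to d without forming it.
        proj = [sum(x[i] * d[i] for i in range(dim)) for x in centered]
        nxt = [sum(p * x[i] for p, x in zip(proj, centered)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in nxt))
        if norm == 0.0:  # degenerate data with no spread; d is returned unchanged
            break
        d = [c / norm for c in nxt]
    return d


def reduce_dimension(d: Sequence[float], n: int) -> Tuple[List[int], List[float]]:
    """Keep the n components of d with the largest absolute value (step F0302).

    Returns the kept indices (which identify the rectangles r_A) and the
    corresponding components (a_A).
    """
    kept = sorted(range(len(d)), key=lambda i: abs(d[i]), reverse=True)[:n]
    kept.sort()
    return kept, [d[i] for i in kept]
```

The sign of the returned direction is arbitrary, as with any principal component; only the orientation of the dividing hyperplane's normal matters here.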

The remaining threshold θ_A is obtained in step F0303 in FIG. 11. One example of a method for obtaining the threshold θ_A is expressed by the following equation, using the centroid c of the set B^{F+} described above.

Here, b_jA^f is the vector obtained by extracting from b_j^f the components corresponding to r_A = (r_A1, r_A2, ..., r_An). The weights may simply be w_j = 1, or they may be AdaBoost weights as described in Non-Patent Document 2.
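The equation for θ_A is not reproduced in this text, so the following Python sketch shows only one plausible reading and should be treated as an assumption rather than the formula of the embodiment: the threshold is taken as the inner product of a_A with the weighted centroid of the reduced vectors b_jA^f, so that the dividing hyperplane passes through that centroid. The function names and the exact form of the weighting are hypothetical.

```python
from typing import List, Optional, Sequence


def weighted_centroid(vectors: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted mean of the vectors, component by component."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) / total
            for i in range(dim)]


def threshold(a: Sequence[float], vectors: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> float:
    """ASSUMED form: theta_A = a_A . (weighted centroid of the b_jA^f)."""
    if weights is None:
        weights = [1.0] * len(vectors)  # w_j = 1, as the text allows
    c = weighted_centroid(vectors, weights)
    return sum(ai * ci for ai, ci in zip(a, c))
```

With uniform weights this reduces to projecting the plain centroid onto a_A; passing AdaBoost-style weights shifts the threshold toward the heavily weighted samples.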

By learning the parameters of the nodes of type T1 and type T2 by the above method, the parameters of the tree structure shown in FIG. 4 can be prepared. Using these parameters, a face in an input image can be detected by processing with a relatively light computational load. In the present embodiment, PCA or the like is used to obtain the principal component direction, but a nonlinear method such as kernel PCA can of course also be used. It is also clear from the description so far that the discriminator is not limited to identifying faces and can handle other images such as persons, figures, and characters.

<Embodiment 2>
In Embodiment 1, dimension reduction is performed after principal component analysis or the like. Embodiment 2 shows an example in which independent component analysis (ICA) or the like is performed after dimension reduction.
In this embodiment, FIG. 12 is used instead of FIG. 11. FIG. 12 is a flowchart (part 2) showing the details of step F03 in FIG. 9.
In step F0311, the CPU 100 obtains the normal vector of the dividing hyperplane. Then, in step F0313, the CPU 100 obtains the threshold θ_A by the same procedure as step F0303 in FIG. 11.

FIG. 13 is a flowchart (part 1) showing the details of step F0311 in FIG. 12.
In step G01, the CPU 100 selects several combinations of rectangles from the rectangle set R and denotes the set of these combinations RC. The number m of rectangles in each combination may be constant, for example 2, or may vary. If it varies, the CPU 100 may choose m so that it increases monotonically with the number of nodes counted from the root node. For each combination r_C = (r_C1, r_C2, ..., r_Cm) in the combination set RC (where C_1 to C_m are indices representing rectangle numbers), the CPU 100 repeats the loop from step G02 to step G06.

Next, in step G03, for each feature vector b_j^f of the set B^{F+}, the CPU 100 generates a feature vector by extracting the components corresponding to the elements (rectangles) of r_C = (r_C1, r_C2, ..., r_Cm). For example, if r_C = (r_5, r_236, r_5468), the extracted feature vector is (b_j,5^f, b_j,236^f, b_j,5468^f). That is, it is the vector that lists the sums of the luminance values inside the rectangles r_5, r_236, and r_5468 on the learning image f_j⁺.

In step G04, the CPU 100 applies ICA to the set of these m-dimensional vectors and obtains at most m m-dimensional vectors q_k (k = 1, ..., m_q; m_q ≤ m). As the objective function when applying ICA, the following function J(q), for example, can be chosen. Here, v is a random variable that follows a normal distribution with mean 0 and variance 1.

Next, in step G05, the CPU 100 computes, as the evaluation value, the kurtosis of the set of inner product values of q_k and the feature vectors, with its sign inverted. More specifically, the CPU 100 obtains the following value (projection evaluation).
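The projection evaluation of step G05 can be sketched in Python as below. The exact expression used in the embodiment is not shown in this text, so the sketch assumes the standard (non-excess) kurtosis, the fourth central moment divided by the squared variance; the function names are hypothetical.

```python
from typing import Sequence


def negated_kurtosis(values: Sequence[float]) -> float:
    """Return -(m4 / m2^2), where m2 and m4 are central moments of the values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for values with zero variance")
    return -(m4 / (m2 * m2))


def evaluate_projection(q: Sequence[float],
                        vectors: Sequence[Sequence[float]]) -> float:
    """Evaluation value for a projection vector q: the negated kurtosis of the
    inner products of q with each feature vector."""
    inner = [sum(qi * bi for qi, bi in zip(q, b)) for b in vectors]
    return negated_kurtosis(inner)
```

Because the sign is inverted, a direction along which the projected data splits into two clusters (low kurtosis) scores higher than one along which the data is sharply peaked, which suits the choice of a direction for dividing the feature space.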

Alternatively, the CPU 100 may use a contrast function or the objective function as the evaluation value. After leaving the loop, in step G07 the CPU 100 selects the vector q_k with the largest evaluation value, and the rectangle combination r_C at that time, as a_A and r_A, respectively (optimization). The evaluation value computed for the vector q_k is an example of a statistic relating to the set of inner product values of a projection vector and the feature vectors of the learning data. In other words, the CPU 100 obtains the projection vector q that maximizes or minimizes this statistic and uses it as the direction vector.
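The loop of steps G02 to G07 can be sketched in Python as follows. This is an illustration under assumptions: rectangle indices are 0-based here, the evaluation value is the negated standard kurtosis, and the candidate directions of step G04 are supplied by a caller-provided function `find_directions`, a hypothetical stand-in for ICA, which this sketch does not implement.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


def negated_kurtosis(values: Sequence[float]) -> float:
    """-(m4 / m2^2) of the values (assumed form of the evaluation value)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return -(m4 / (m2 * m2))


def select_direction(
    vectors: Sequence[Sequence[float]],
    combinations: Sequence[Tuple[int, ...]],
    find_directions: Callable[[List[Vector]], List[Vector]],
) -> Optional[Tuple[float, Vector, Tuple[int, ...]]]:
    """Return (evaluation value, a_A, r_A) for the best-scoring direction."""
    best = None
    for combo in combinations:                            # loop G02..G06
        sub = [[b[i] for i in combo] for b in vectors]    # G03: extract components
        for q in find_directions(sub):                    # G04: candidate directions
            inner = [sum(qi * bi for qi, bi in zip(q, b)) for b in sub]
            score = negated_kurtosis(inner)               # G05: evaluation value
            if best is None or score > best[0]:
                best = (score, q, combo)
    return best                                           # G07: largest value wins


def axis_directions(sub: List[Vector]) -> List[Vector]:
    """Stand-in for ICA: simply proposes the coordinate axes."""
    m = len(sub[0])
    return [[1.0 if i == k else 0.0 for i in range(m)] for k in range(m)]
```

Because every combination is scored with the same evaluation function, the comparison selects both the direction and the subset of rectangles in one pass.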

By learning the parameters of the nodes of type T1 and type T2 by the above method, the parameters of the tree structure shown in FIG. 4 can be prepared. Using these parameters, a face in an input image can be detected by processing with a relatively light computational load. In this embodiment in particular, computing and comparing evaluation values with an evaluation function common to all combinations r_C not only yields the normal vector of the dividing hyperplane but also selects the components used in dimension reduction. Although ICA is used in this embodiment to obtain the normal vector of the hyperplane, other methods such as PCA or SVD can also be used.

<Embodiment 3>
Embodiment 2 showed a method that uses ICA to obtain a projection vector q whose kurtosis deviates further from that of a normal distribution. That method ends up finding not only projection vectors with small kurtosis but also projection vectors with large kurtosis. This embodiment shows a method that uses projection pursuit to directly obtain only projection vectors with small kurtosis. In this embodiment too, kurtosis is used as an example of an index representing the degree of deviation from a normal distribution. The configuration is almost the same as in Embodiment 2, but FIG. 15 is used in this embodiment instead of FIG. 13.

First, the projection vector q′ is expressed in a polar coordinate system.

Also, the feature vectors are translated according to the following equation.

The objective function is as follows.

FIG. 14 is a flowchart showing an example of the processing for obtaining the projection vector q′ that minimizes this objective function.
In step K01, the CPU 100 initializes θ_i (i = 1, ..., m) to predetermined values. For example, the CPU 100 can set θ_i = 0 (i = 1, ..., m). Alternatively, the CPU 100 can generate these values with random numbers. The CPU 100 also initializes to 0 a counter variable s used for the convergence condition.

The CPU 100 enters the iterative processing from step K02. First, in step K02, the CPU 100 generates θ⁺. More specifically, the CPU 100 first generates, using random numbers, a natural number u (1 ≤ u ≤ m−1) and a value Δ. Δ follows, for example, a normal distribution with mean 0. θ⁺ is θ with Δ added to its u-th component, as in the following equations, where θ_i (i = 1, ..., m) is the i-th component of θ.
θ_i⁺ = θ_i (i ≠ u)
θ_u⁺ = θ_u + Δ

Next, in step K03, the CPU 100 checks whether the objective function has increased or decreased. More specifically, the CPU 100 substitutes θ into (Equation 1) to obtain the projection vector q′ and calculates J(q′). The CPU 100 then uses θ⁺ instead of θ to obtain the projection vector q⁺′ and calculates J(q⁺′). If J(q′) ≤ J(q⁺′), that is, if the candidate does not improve the objective, the CPU 100 proceeds to step K06. Otherwise it proceeds to step K04.
In step K04, the CPU 100 resets the counter variable s to 0. In step K05, the CPU 100 substitutes θ⁺ for θ and repeats the loop from step K02. In step K06, the CPU 100 increments the counter variable s by one. In step K07, the CPU 100 compares s with a predetermined constant S, and if s < S still holds, repeats the loop from step K02. Conversely, if s ≥ S, the CPU 100 stops the minimization process. As a result, the loop is exited when the value of the objective function has failed to improve S times in a row. The projection vector q′ obtained from θ at this point is treated as the vector that minimizes the objective function.
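The search of steps K01 to K07 can be sketched in Python as follows. Several parts are assumptions because the corresponding equations are not reproduced in this text: the polar representation is taken to be standard hyperspherical coordinates with m−1 angles, the objective J is taken to be the standard kurtosis of the projected values, and the names `patience` (the constant S), `sigma`, and `seed` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def unit_vector(theta: Sequence[float]) -> List[float]:
    """Map m-1 angles to an m-dimensional unit vector (assumed polar form)."""
    q, s = [], 1.0
    for t in theta:
        q.append(s * math.cos(t))
        s *= math.sin(t)
    q.append(s)
    return q


def kurtosis(values: Sequence[float]) -> float:
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)  # fails if the projected values have zero variance


def objective(theta: Sequence[float], vectors: Sequence[Sequence[float]]) -> float:
    q = unit_vector(theta)
    return kurtosis([sum(qi * bi for qi, bi in zip(q, b)) for b in vectors])


def minimize_kurtosis(vectors: Sequence[Sequence[float]], patience: int = 200,
                      sigma: float = 0.3, seed: int = 0) -> Tuple[List[float], float]:
    """Random coordinate search over the angles; requires dimension m >= 2."""
    rng = random.Random(seed)
    m = len(vectors[0])
    theta = [0.0] * (m - 1)                     # K01: initialize the angles
    s = 0                                       # K01: counter for convergence
    current = objective(theta, vectors)
    while s < patience:                         # K07: stop after S failures in a row
        u = rng.randrange(m - 1)                # K02: pick one component at random
        candidate = list(theta)
        candidate[u] += rng.gauss(0.0, sigma)   # K02: perturb it by delta
        value = objective(candidate, vectors)   # K03: compare objective values
        if value < current:
            theta, current = candidate, value   # K05: accept the candidate
            s = 0                               # K04: reset the counter
        else:
            s += 1                              # K06: count the failure
    return unit_vector(theta), current
```

The current objective value is cached, so each iteration evaluates J only once for the candidate. The search is a simple local method and can stop in a local minimum; restarting from random initial angles is one common remedy.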

In step K03, the CPU 100 evaluates J(q′) at every iteration, so there is no need to compute the evaluation value (the value of J(q′)) again as in step G05 of FIG. 13 in Embodiment 2. For this reason, this embodiment follows FIG. 15 instead of FIG. 13. The details of step L04 are as shown in FIG. 14.

The method for obtaining the projection vector that minimizes the kurtosis has been described above. In this embodiment, this projection vector is used to determine the dividing hyperplane of a type T2 node.
Besides the optimization method described above, the projection vector may be obtained by another optimization method such as Newton's method.
The embodiments so far have used rectangle features as an example, but the present invention is not limited to them. Other feature quantities include the pixel values of the input image, feature quantities obtained by applying a Gabor filter to the input image, and so-called local features, which assign a vector to each pixel of the input image. Many such feature quantities have been defined in the field of pattern recognition. Although face identification has been used as the example so far, the present invention can be applied not only to images but also to other information such as audio information and questionnaire results.

As described above, according to each of the embodiments, the parameters of the branch nodes can be determined appropriately when a decision tree for pattern identification is constructed, so a discriminator that identifies patterns in input data at high speed and with high accuracy can be configured.

The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

100 CPU, 101 program memory, 102 RAM, 103 hard disk, 104 display

Claims

1. A learning device that determines a parameter of each node of a classifier that has a tree structure in which a plurality of identification nodes and branch nodes are connected and that classifies input data into two classes of a feature space, a first class and a second class, wherein
the branch node is a node that determines a node to be started next based on a parameter, and
the identification node is a node that identifies whether input data belongs to the second class based on a parameter,
the learning device comprising:
multivariate analysis means for performing a multivariate analysis on feature vectors of learning data belonging to the first class to obtain a direction vector;
dividing plane determining means for determining a dividing plane that is perpendicular to the direction vector obtained by the multivariate analysis means and that divides the feature space of the learning data; and
parameter determining means for determining a parameter of the branch node based on the dividing plane determined by the dividing plane determining means.

2. The learning device according to claim 1, wherein the dividing plane determining means determines, as the dividing plane, a hyperplane in the feature space that is perpendicular to the direction vector obtained by the multivariate analysis means and that divides the feature space of the learning data.

3. The learning device according to claim 1, wherein the multivariate analysis means comprises:
projection means for obtaining inner product values of a projection vector and the feature vectors of the learning data belonging to the first class;
projection evaluation means for obtaining a statistic relating to the set of inner product values; and
optimization means for obtaining the projection vector that maximizes or minimizes the statistic and setting it as the direction vector.

4. The learning device according to claim 3, wherein the projection evaluation means obtains a statistic representing the degree of deviation of the set of inner product values from a normal distribution, and
the optimization means obtains the projection vector that maximizes the statistic and sets it as the direction vector.

5. The learning device according to claim 3, wherein the projection evaluation means obtains the kurtosis of the set of inner product values as the statistic.

6. The learning device according to claim 1, further comprising dimension reduction means for reducing the elements of the direction vector obtained by the multivariate analysis means based on the magnitudes of their absolute values,
wherein the dividing plane determining means determines the dividing plane that is perpendicular to the vector whose dimension has been reduced by the dimension reduction means.

7. A learning method in a learning device that determines a parameter of each node of a classifier that has a tree structure in which a plurality of identification nodes and branch nodes are connected and that classifies input data into two classes of a feature space, a first class and a second class, wherein
the branch node is a node that determines a node to be started next based on a parameter, and
the identification node is a node that identifies whether input data belongs to the second class based on a parameter,
the learning method comprising, as steps performed by the learning device:
a multivariate analysis step of performing a multivariate analysis on feature vectors of learning data belonging to the first class to obtain a direction vector;
a dividing plane determining step of determining a dividing plane that is perpendicular to the direction vector obtained in the multivariate analysis step and that divides the feature space of the learning data; and
a parameter determining step of determining a parameter of the branch node based on the dividing plane determined in the dividing plane determining step.

8. The learning method according to claim 7, wherein, in the dividing plane determining step, a hyperplane in the feature space that is perpendicular to the direction vector obtained in the multivariate analysis step and that divides the feature space of the learning data is determined as the dividing plane.

9. The learning method according to claim 7, wherein the multivariate analysis step comprises:
a projection step of obtaining inner product values of a projection vector and the feature vectors of the learning data belonging to the first class;
a projection evaluation step of obtaining a statistic relating to the set of inner product values; and
an optimization step of obtaining the projection vector that maximizes or minimizes the statistic and setting it as the direction vector.

10. The learning method according to claim 9, wherein, in the projection evaluation step, a statistic representing the degree of deviation of the set of inner product values from a normal distribution is obtained, and
in the optimization step, the projection vector that maximizes the statistic is obtained and set as the direction vector.

11. The learning method according to claim 9, wherein, in the projection evaluation step, the kurtosis of the set of inner product values is obtained as the statistic.

12. The learning method according to claim 7, further comprising a dimension reduction step of reducing the elements of the direction vector obtained in the multivariate analysis step based on the magnitudes of their absolute values,
wherein, in the dividing plane determining step, the dividing plane that is perpendicular to the vector whose dimension has been reduced in the dimension reduction step is determined.