JP6659120B2

JP6659120B2 - Information processing apparatus, information processing method, and program

Info

Publication number: JP6659120B2
Application number: JP2015218444A
Authority: JP
Inventors: 裕一郎飯尾; 泰弘奥野
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-11-06
Filing date: 2015-11-06
Publication date: 2020-03-04
Anticipated expiration: 2035-11-06
Also published as: JP2017091083A

Description

The present invention relates to an information processing apparatus, an information processing method, and a program, and is particularly suitable for creating an ensemble classifier.

Recognition using an ensemble classifier is widely used in the field of object recognition based on machine learning. This approach improves the accuracy of supervised learning by combining a plurality of low-accuracy classifiers (weak classifiers). Methods that perform recognition with an ensemble classifier include Bagging and Randomized Trees.

In general, the performance of an ensemble classifier improves as the number of weak classifiers constituting it increases. However, even when the number of weak classifiers is the same, the performance of the ensemble classifier varies depending on which weak classifiers are combined. It is therefore desirable to create a higher-performance ensemble classifier by building it from an appropriate combination of weak classifiers.

One method of creating an ensemble classifier evaluates the created ensemble classifier on pre-labeled sample data and repeats the learning of the ensemble classifier until the evaluation result satisfies a predetermined condition. In this technique, labeled data is used as the sample data for evaluating the ensemble classifier. A label, also called teacher data or Ground Truth, is information indicating the correct identification result (the correct answer) for a classifier. For example, class information indicating the class to which the identification target in the sample data belongs is assigned as a label. As a technique of this kind, Patent Literature 1 discloses learning of an ensemble classifier using AdaBoost. In Patent Literature 1, a plurality of weak classifiers to be combined are first selected from the learned weak classifiers. Next, the ensemble classifier obtained by combining the selected weak classifiers is evaluated using labeled sample images as input. The selection of the weak classifiers constituting the ensemble classifier and the evaluation of the ensemble classifier are repeated until the identification accuracy of the ensemble classifier becomes equal to or greater than a threshold.

Patent Literature 1: Japanese Patent No. 5123759

Non-Patent Literature 1: Gall, J., "Hough forests for object detection, tracking, and action recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 11, pp. 2188-2202, 2011.

In Patent Literature 1, part of the data used for learning is used as sample data for evaluating the ensemble classifier. Data used for training a classifier should preferably contain the object to be identified by itself, in order to increase the accuracy of learning. Sample data, on the other hand, is used to confirm generalization performance on unknown data that is input to the classifier at identification time. For this reason, the sample data should preferably be data of the kind actually input to the classifier at identification time, in which a plurality of objects appear mixed together. Therefore, rather than reusing the learning data as sample data, the classifier can be evaluated more effectively by separately preparing, as sample data, a set of data of the same kind as the data used at identification time (for example, actually photographed image data). This can be expected to improve the performance of the ensemble classifier.

However, it is not always possible to assign labels to data of the same kind as the data used at identification time. To label such data, a human must, for example, determine for each piece of sample data whether an identification target is present and to which class the identification target belongs, and then set information indicating the result of that determination as the label of the sample data. When the total number of pieces of sample data is enormous, it is not easy to assign class information and the like to all of the sample data manually.

For example, in the Hough Forest used in Non-Patent Literature 1, a partial region of an image is used as input data. When the input data includes the identification target, a weak classifier outputs an offset to the position of the center of gravity of the object to be identified. When the input data does not include the identification target, the weak classifier determines that the input is a negative sample. In other words, assigning class information and the like to sample data for a Hough Forest requires the following. First, for every partial region of the sample data, it is determined whether the partial region includes the object to be identified. Then, if it does, the offset to the position of the center of gravity of the object is calculated. It is not realistic to perform this work manually for the entire set of sample data.

Therefore, when it is not easy to prepare labeled sample data, it is not easy to evaluate the performance of the created ensemble classifier with the method of Patent Literature 1.
The present invention has been made in view of this problem, and its object is to make it possible to select, from a plurality of weak classifiers, a combination of weak classifiers useful for identifying an identification target without using labeled sample data.

An information processing apparatus according to the present invention performs processing for selecting a plurality of weak classifiers that constitute an ensemble classifier, and comprises: input means for inputting candidates for the weak classifiers constituting the ensemble classifier; deriving means for, with respect to each of a plurality of combinations of at least two weak classifiers among the candidates input by the input means, giving the same data to the at least two weak classifiers, comparing information obtained when each of those weak classifiers performs identification processing on the data, and deriving, based on the result of the comparison, an index representing the diversity of identification results between the at least two weak classifiers; and processing means for selecting, for each candidate number of weak classifiers constituting the ensemble classifier, a combination of the plurality of weak classifiers based on the index derived by the deriving means.

According to the present invention, a combination of weak classifiers useful for identifying an identification target can be selected from a plurality of weak classifiers without using labeled sample data.

A diagram showing the configuration of the information processing apparatus.
A flowchart showing a first example of the weak classifier diversity index calculation processing.
A diagram showing a first example of the process of calculating the identification result variation degree.
A diagram showing a second example of the process of calculating the identification result variation degree.
A diagram showing a third example of the process of calculating the identification result variation degree.
A diagram showing a fourth example of the process of calculating the identification result variation degree.
A diagram showing a fifth example of the process of calculating the identification result variation degree.
A flowchart showing the processing of the weak classifier combination selection unit.
A diagram showing the inter-weak-classifier diversity index.
A diagram explaining the processing of the weak classifier combination selection unit.
A diagram showing the flow of identification by a decision tree.
A flowchart showing a second example of the weak classifier diversity index calculation processing.
A diagram showing the process of calculating the similar result variation degree.
A diagram showing the configuration of the ensemble classifier construction system.
A flowchart showing the processing of the ensemble classifier construction system.

Hereinafter, a first embodiment will be described with reference to the drawings.
(First embodiment)
First, the first embodiment will be described.
An ensemble classifier is a classifier that performs highly accurate identification by combining a plurality of weak classifiers. However, the size of the ensemble classifier grows in proportion to the number of weak classifiers. Therefore, when the execution environment limits the size of the ensemble classifier, the number of weak classifiers constituting it must be limited in advance when the ensemble classifier is created.

On the other hand, the diversity of the individual weak classifiers is important for building an ensemble classifier with high generalization performance. Therefore, when an ensemble classifier is created from a combination of weak classifiers, it is preferable to create a larger number of weak classifiers and to select from them a combination with high diversity, because this yields a higher-performance ensemble classifier.

The purpose of the present embodiment is, when creating an ensemble classifier composed of a predetermined number of weak classifiers, to select an appropriate combination of weak classifiers from a large set of weak classifiers prepared in advance, using sample data that has no labels such as class information. In the present embodiment, an index representing the diversity of identification results between weak classifiers is derived from the degree of variation in the identification results of the weak classifiers when such sample data is input. A high-performance ensemble classifier is then created by using this index to select the combination of weak classifiers that constitutes the ensemble classifier. The present embodiment is described for the case where the data to be handled is image data, and where a classifier identifies the class to which the identification target included in the input data belongs.

FIG. 1 is a diagram illustrating an example of the configuration of an information processing apparatus 100 according to the present embodiment. The information processing apparatus 100 includes an inter-weak-classifier diversity index calculation unit 110, a weak classifier combination selection unit 120, and an ensemble classifier generation unit 130. The information processing apparatus 100 is realized by, for example, a device including a CPU, a ROM, a RAM, an HDD, and various interfaces, or by dedicated hardware.

The inter-weak-classifier diversity index calculation unit 110 receives a weak classifier set 11 and a sample data group 12 as inputs, and calculates an index representing the diversity of identification results between weak classifiers. In the following description, this index is referred to as the inter-weak-classifier diversity index as needed.
The weak classifier combination selection unit 120 selects a predetermined number of weak classifiers based on the inter-weak-classifier diversity index calculated by the inter-weak-classifier diversity index calculation unit 110. The predetermined number is the number of weak classifiers constituting the ensemble classifier.
The ensemble classifier generation unit 130 generates an ensemble classifier using the weak classifiers selected by the weak classifier combination selection unit 120.
An example of the operation of the information processing apparatus 100 is described in detail below.

In the present embodiment, a weak classifier set 11 composed of a plurality of weak classifiers trained in advance and a sample data group 12 composed of a plurality of pieces of sample data are prepared as preliminary data.
The weak classifiers in the ensemble classifier of the present embodiment are classifiers that identify with higher accuracy than a classifier that outputs answers at random, and they are trained using learning data to which the class information of the identification target is attached. In the present embodiment, the weak classifiers are assumed to have been trained in advance. However, the information processing apparatus 100 may instead receive a learning data group and train the weak classifiers with it before the inter-weak-classifier diversity index calculation unit 110 executes its processing. The weak classifiers can be trained by a known method, for example one using AdaBoost, so a detailed description is omitted here.

Unlike the learning data, the sample data consists of images similar to those actually input to the classifier at identification time. Here, an image similar to those actually input to the classifier at identification time means an image to which no label (class information in the present embodiment) is attached.
In the present embodiment, candidates for the ensemble classifier (combinations of as many weak classifiers as the ensemble classifier is to contain) are evaluated using the sample data, and the final ensemble classifier is created based on the result. This makes it possible to create a higher-performance ensemble classifier. A large amount of sample data should preferably be prepared, in order to reduce the influence of a few unusual pieces of sample data and to increase the reliability of the evaluation of the classifiers.

Next, an example of the processing of the inter-weak-classifier diversity index calculation unit 110 will be described with reference to the flowchart of FIG. 2.
In step S201, the inter-weak-classifier diversity index calculation unit 110 selects one piece of unselected sample data from the sample data group 12.
Next, in step S202, the inter-weak-classifier diversity index calculation unit 110 inputs the sample data selected in step S201 to all weak classifiers belonging to the weak classifier set 11 prepared in advance. Each of those weak classifiers identifies the class to which the sample data belongs, and the inter-weak-classifier diversity index calculation unit 110 acquires the identification results. Through this processing, each weak classifier independently outputs its identification result for the selected sample data.

Next, in step S203, the inter-weak-classifier diversity index calculation unit 110 calculates the identification result variation degree of a weak classifier pair consisting of two weak classifiers, based on the identification results of the sample data obtained from each weak classifier in step S202. The identification result variation degree is an index representing how much the identification results of the two weak classifiers differ. FIGS. 3A to 3E show several examples of how the identification result variation degree can be calculated.

FIG. 3A is a schematic diagram illustrating an example of the process of calculating the identification result variation degree when each weak classifier outputs a single piece of class information as its identification result. In this case, the identification result variation degree can be expressed as a binary value. That is, as shown in FIG. 3A(a), when the identification results of weak classifier A and weak classifier B for input sample data i match, the inter-weak-classifier diversity index calculation unit 110 derives 0 (zero) as the identification result variation degree. On the other hand, as shown in FIG. 3A(b), when the identification results of weak classifier A and weak classifier B for input sample data i do not match, the inter-weak-classifier diversity index calculation unit 110 derives 1 as the identification result variation degree.
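The binary case of FIG. 3A can be sketched as follows. This is a minimal illustration only; the function name `binary_variation` and the use of strings as class labels are assumptions made for the example and do not appear in the specification.

```python
def binary_variation(label_a, label_b):
    """Identification result variation degree for single-class outputs.

    Returns 0 when the two weak classifiers output the same class
    and 1 when they output different classes.
    """
    return 0 if label_a == label_b else 1


# Hypothetical outputs of weak classifiers A and B for one sample.
print(binary_variation("P", "P"))  # classifiers agree
print(binary_variation("P", "Q"))  # classifiers disagree
```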

FIG. 3B is a schematic diagram illustrating an example of the process of calculating the identification result variation degree when each weak classifier outputs a plurality of pieces of class information as its identification result. In this case, each identification result is regarded as a set and each piece of class information as an element of the set. The inter-weak-classifier diversity index calculation unit 110 calculates an inter-set similarity, which is the similarity between the two sets, and derives the value obtained by subtracting it from 1 (= 1 - inter-set similarity) as the identification result variation degree. The inter-set similarity can be expressed using, for example, the Jaccard coefficient or the Dice coefficient. The Jaccard coefficient is the number of elements common to sets X and Y divided by the total number of elements contained in at least one of X and Y. The Dice coefficient is the number of elements common to sets X and Y divided by the average of the numbers of elements of X and Y. That is, the Jaccard coefficient J and the Dice coefficient D of sets X and Y are expressed by the following equations (1) and (2), respectively.

$$J(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} \qquad (1)$$

$$D(X, Y) = \frac{|X \cap Y|}{(|X| + |Y|)/2} = \frac{2\,|X \cap Y|}{|X| + |Y|} \qquad (2)$$

In the example shown in FIG. 3B, when sample data i is input, weak classifier A identifies the class to which sample data i belongs as one of class P, class Q, and class R. Weak classifier B identifies the class to which sample data i belongs as one of class P, class R, class S, and class T. Accordingly, the elements common to the identification results of weak classifier A and weak classifier B are class P and class R, while the elements contained in at least one of the two identification results are class P, class Q, class R, class S, and class T.

From the above, when the Jaccard coefficient J is used as the inter-set similarity, the inter-set similarity of weak classifier A and weak classifier B is 0.4 (= 2/5) according to equation (1). In this case, the inter-weak-classifier diversity index calculation unit 110 derives 0.6 (= 1 - 0.4) as the identification result variation degree of weak classifier A and weak classifier B. When the Dice coefficient D is used as the inter-set similarity, the inter-set similarity of weak classifier A and weak classifier B is 0.57 (= 2/3.5) according to equation (2). In this case, the inter-weak-classifier diversity index calculation unit 110 derives 0.43 (= 1 - 0.57) as the identification result variation degree of weak classifier A and weak classifier B.
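The set-based calculation of FIG. 3B can be reproduced with a short sketch. The function names (`jaccard`, `dice`, `set_variation`) are illustrative assumptions, as is the choice to treat two empty identification results as identical; the specification does not address the empty case.

```python
def jaccard(x, y):
    """Jaccard coefficient: shared elements / elements in either set."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 1.0  # assumption: two empty results count as identical
    return len(x & y) / len(union)


def dice(x, y):
    """Dice coefficient: shared elements / average size of the two sets."""
    x, y = set(x), set(y)
    total = len(x) + len(y)
    if total == 0:
        return 1.0  # assumption: two empty results count as identical
    return 2 * len(x & y) / total


def set_variation(x, y, similarity=jaccard):
    """Identification result variation degree = 1 - inter-set similarity."""
    return 1.0 - similarity(x, y)


# Identification results of weak classifiers A and B from the FIG. 3B example.
result_a = {"P", "Q", "R"}
result_b = {"P", "R", "S", "T"}
print(round(set_variation(result_a, result_b, jaccard), 2))  # 0.6
print(round(set_variation(result_a, result_b, dice), 2))     # 0.43
```

Both values agree with the figures stated in the text (0.6 with the Jaccard coefficient, 0.43 with the Dice coefficient).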

FIG. 3C is a schematic diagram illustrating an example of the process of calculating the identification result variation degree when each weak classifier outputs the appearance likelihood of each class as its identification result. In this case, the identification result variation degree can be defined as the difference H between the likelihood histograms whose elements are the classes. That is, the identification result variation degree is expressed by the following equation (3),

$$H = \frac{1}{2} \sum_{c} \left| h_A(c) - h_B(c) \right| \qquad (3)$$

where $h_A(c)$ and $h_B(c)$ are the likelihoods of class $c$ output by weak classifiers A and B, taken as 0 for a class absent from the output.

In the example shown in FIG. 3C, when sample data i is input, weak classifier A outputs an appearance frequency of 0.5 for class P, 0.25 for class Q, and 0.25 for class R. Weak classifier B outputs an appearance frequency of 0.2 for class P, 0.6 for class R, and 0.2 for class T. Accordingly, the absolute values of the likelihood differences for class P, class Q, class R, and class T are 0.3 (= 0.5 - 0.2), 0.25, 0.35 (= 0.6 - 0.25), and 0.2, respectively. From equation (3), the inter-weak-classifier diversity index calculation unit 110 therefore derives 0.55 (= (0.3 + 0.25 + 0.35 + 0.2) / 2) as the identification result variation degree of weak classifier A and weak classifier B.
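The histogram case of FIG. 3C can be checked with the sketch below, which halves the sum of the absolute per-class likelihood differences, as in the worked example above. The function name `histogram_variation` and the dictionary representation of a histogram are assumptions for illustration; a class missing from one output is treated as having likelihood 0.

```python
def histogram_variation(hist_a, hist_b):
    """Half the sum of absolute per-class likelihood differences.

    hist_a and hist_b map class names to likelihoods; a class that is
    absent from one histogram is treated as having likelihood 0.
    """
    classes = set(hist_a) | set(hist_b)
    total = sum(abs(hist_a.get(c, 0.0) - hist_b.get(c, 0.0)) for c in classes)
    return total / 2


# Likelihood histograms of weak classifiers A and B from the FIG. 3C example.
hist_a = {"P": 0.5, "Q": 0.25, "R": 0.25}
hist_b = {"P": 0.2, "R": 0.6, "T": 0.2}
print(round(histogram_variation(hist_a, hist_b), 2))  # 0.55
```

With this definition, two identical normalized histograms give 0 and two histograms with no class in common give 1.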

Returning to FIG. 2, in step S204 the inter-weak-classifier diversity index calculation unit 110 determines whether there is a weak classifier pair for which the identification result variation degree has not yet been calculated. If such a pair exists, the process returns to step S203, and the inter-weak-classifier diversity index calculation unit 110 calculates the identification result variation degree of that pair. By repeating steps S203 and S204, the identification result variation degree for the sample data selected in step S201 is calculated for every weak classifier pair in the weak classifier set 11.

Next, in step S205, the inter-weak-classifier diversity index calculation unit 110 determines whether unselected sample data remains in the sample data group 12. If unselected sample data remains, the process returns to step S201, and the inter-weak-classifier diversity index calculation unit 110 selects that sample data. By repeating the loop from step S201 to step S205, the identification result variation degree of each weak classifier pair is calculated for all the sample data in the sample data group 12.

Next, in step S206, the inter-weak-classifier diversity index calculation unit 110 calculates, for each weak classifier pair, the average of the identification result variation degrees obtained over all the sample data in the loop from step S201 to step S205. The inter-weak-classifier diversity index calculation unit 110 then sets this average as the diversity index of the weak classifier pair.
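The loop of steps S201 to S206 can be summarized in a short sketch. It is an illustration under stated assumptions rather than the implementation of the apparatus: the name `pairwise_diversity`, the representation of weak classifiers as Python callables, and the toy classifiers in the usage example are all hypothetical.

```python
from itertools import combinations


def pairwise_diversity(classifiers, samples, variation):
    """Average identification result variation degree for every classifier pair.

    classifiers: dict mapping a name to a callable(sample) -> result
    samples:     non-empty list of unlabeled sample data
    variation:   callable(result_a, result_b) -> variation degree
    """
    if not samples:
        raise ValueError("at least one sample is required")
    names = list(classifiers)
    pairs = list(combinations(names, 2))
    totals = {pair: 0.0 for pair in pairs}
    for sample in samples:                                    # S201, S205
        results = {n: classifiers[n](sample) for n in names}  # S202
        for a, b in pairs:                                    # S203, S204
            totals[(a, b)] += variation(results[a], results[b])
    # S206: average over all samples gives the diversity index of each pair.
    return {pair: total / len(samples) for pair, total in totals.items()}


# Toy weak classifiers over integers (hypothetical, for illustration only).
toy_classifiers = {
    "A": lambda x: "even" if x % 2 == 0 else "odd",
    "B": lambda x: "even" if x % 4 == 0 else "odd",
    "C": lambda x: "odd",
}
toy_samples = [0, 1, 2, 3]
index = pairwise_diversity(
    toy_classifiers, toy_samples, lambda ra, rb: 0 if ra == rb else 1
)
print(index)
```

No labels are consulted anywhere in the loop; only the outputs of the weak classifiers on the same sample are compared, which is the point of the method.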

Through the above processing, the inter-weak-classifier diversity index calculation unit 110 sets the index representing the diversity of identification results between weak classifiers. The inter-weak-classifier diversity index calculation unit 110 of the present embodiment only needs to calculate the diversity index of a weak classifier pair on the basis of the degree of variation in the identification results of the sample data. Accordingly, the method of calculating the diversity index of a weak classifier pair and the definition of the identification result variation degree are not limited to those described above.

ここで、弱識別器間多様性指標算出部１１０の処理の変形例を説明する。Hough Forestのように、画像の部分領域を入力として、識別対象の物体の重心への投票を行うことで、識別対象の物体を検出する方式における弱識別器多様性指標算出処理について、図３Ｄおよび図３Ｅに示す例を用いて説明する。図３Ｄ、Ｅは、各弱識別器が出力する識別結果が識別対象の物体の重心へのオフセットである場合の識別結果ばらつき度の算出過程の第１、第２の例を示す模式図である。ここでは、サンプル画像ｉが入力されたときに、弱識別器Ａは、識別結果として３つの識別対象の物体の重心へのオフセットを出力し、弱識別器Ｂは、識別結果として５つの識別対象の物体の重心へのオフセットを出力するものとする。 Here, a modified example of the processing of the inter-weak-classifier diversity index calculation unit 110 will be described. The diversity index calculation processing for a method that, like Hough Forest, takes a partial region of an image as input and detects the object to be identified by voting for the center of gravity of that object will be described using the examples shown in FIGS. 3D and 3E. FIGS. 3D and 3E are schematic diagrams showing first and second examples of the process of calculating the identification result variation degree when the identification result output by each weak classifier is an offset to the center of gravity of the object to be identified. Here, it is assumed that when sample image i is input, weak classifier A outputs three offsets to the center of gravity of the object to be identified as its identification result, and weak classifier B outputs five such offsets.

この事例における多様性指標算出処理では、図２に示すステップＳ２０１で選択されるデータはサンプルデータの部分領域になる。サンプルデータ群１２における全てのサンプルデータの全ての部分領域が、ステップＳ２０１からステップＳ２０５のループで評価されることになる。 In the diversity index calculation process in this case, the data selected in step S201 shown in FIG. 2 is a partial area of the sample data. All partial regions of all sample data in the sample data group 12 are evaluated in a loop from step S201 to step S205.

また、識別結果ばらつき度は、例えば、弱識別器ごとの識別結果であるオフセット位置のばらつきで表現できる。オフセット位置のばらつきを指標化する方法の一例としては、サンプルデータの全領域を同一サイズの領域に分割し、各領域への投票が行われる頻度に応じて、識別対象の物体の重心の領域の尤度を設定する方法がある。具体的には、まず、弱識別器間多様性指標算出部１１０は、入力された部分領域を含むサンプルデータの全領域を同一サイズの領域に分割する。図３Ｄおよび図３Ｅに示す例では、サンプルデータｉが２４の領域に均等に分割される。次に、弱識別器間多様性指標算出部１１０は、各弱識別器における識別結果のオフセット位置が含まれる領域を調べ、識別対象の物体の重心が位置する領域の尤度を算出する。以下の説明では、識別対象の物体の重心が位置する領域の尤度を、必要に応じて、投票先領域尤度と称する。 The identification result variation degree can be expressed, for example, by the variation in the offset positions that each weak classifier outputs as its identification result. One way to turn this variation into an index is to divide the entire area of the sample data into regions of equal size and to set, for each region, the likelihood that it contains the center of gravity of the object to be identified according to how often votes fall in that region. Specifically, the inter-weak-classifier diversity index calculation unit 110 first divides the entire area of the sample data that contains the input partial region into regions of equal size. In the examples shown in FIGS. 3D and 3E, sample data i is divided evenly into 24 regions. Next, the unit determines which region contains each offset position output by each weak classifier and calculates the likelihood of each region being the one where the center of gravity of the object to be identified is located. In the following description, this likelihood is referred to as the voting destination region likelihood as necessary.

投票先領域尤度は、例えば、識別結果であるオフセット位置の総数に対する、各領域に含まれるオフセット位置の割合で表すことができる。図３Ｄに示す例では、弱識別器Ａにおいて、領域４に１票、領域１８に１票、領域２３に１票で計３票投票されている。従って、領域４、領域１８、領域２３の投票先領域尤度は、それぞれ０．３３となる。一方、図３Ｅに示す例では、弱識別器Ｂにおいて、領域１０に１票、領域１４に１票、領域１８に２票、領域２３に１票で計５票投票されている。従って、領域１０、領域１４、領域２３の投票先領域尤度は、それぞれ０．２となり、領域１８の投票先領域尤度は、０．４となる。 The voting destination region likelihood can be expressed, for example, as the ratio of the number of offset positions that fall in each region to the total number of offset positions output as the identification result. In the example shown in FIG. 3D, weak classifier A casts one vote each in region 4, region 18, and region 23, for a total of three votes. The voting destination region likelihoods of regions 4, 18, and 23 are therefore approximately 0.33 each. In the example shown in FIG. 3E, weak classifier B casts one vote in region 10, one vote in region 14, two votes in region 18, and one vote in region 23, for a total of five votes. The voting destination region likelihoods of regions 10, 14, and 23 are therefore 0.2 each, and that of region 18 is 0.4.
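The likelihood calculation for FIGS. 3D and 3E can be reproduced with a few lines. The sketch below assumes that each vote has already been mapped to the index of the region it falls in (the grid layout of the 24 regions is not modeled), and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def vote_region_likelihood(voted_regions):
    """Map each voted region to (votes in that region) / (total votes)."""
    counts = Counter(voted_regions)
    total = len(voted_regions)
    return {region: Fraction(n, total) for region, n in counts.items()}


# Weak classifier A (FIG. 3D): one vote each in regions 4, 18, and 23.
likelihood_a = vote_region_likelihood([4, 18, 23])
# Weak classifier B (FIG. 3E): regions 10, 14, 18 (twice), and 23.
likelihood_b = vote_region_likelihood([10, 14, 18, 18, 23])
```

Exact fractions are used so that the three likelihoods of weak classifier A are each exactly 1/3 rather than a rounded 0.33.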

弱識別器間多様性指標算出部１１０は、以上のようにして各弱識別器の投票先領域尤度を算出する。そして、弱識別器間多様性指標算出部１１０は、投票先領域尤度に基づいて、例えば、図３Ｃを参照しながら説明した尤度ヒストグラムの差Ｈを算出することで弱識別器Ａと弱識別器Ｂの識別結果ばらつき度を算出する（（３）式を参照）。弱識別器間多様性指標算出部１１０は、全てのサンプルデータの全ての部分領域について識別結果ばらつき度を算出し、それらの平均値を弱識別器Ａと弱識別器Ｂの弱識別器間多様性指標とする。 The inter-weak-classifier diversity index calculation unit 110 calculates the voting destination region likelihoods of each weak classifier as described above. Then, on the basis of these likelihoods, the unit calculates the identification result variation degree of weak classifier A and weak classifier B by, for example, calculating the difference H between likelihood histograms described with reference to FIG. 3C (see equation (3)). The unit calculates the identification result variation degree for every partial region of every sample data item and uses the average of these values as the inter-weak-classifier diversity index of weak classifier A and weak classifier B.

尚、弱識別器による識別の結果、サンプルデータの部分領域が、識別対象の物体が存在しない領域である（Hough Forestでは、入力データがネガティブサンプルである）と判定される場合がある。識別対象の物体が存在しないと判定された領域は、弱識別器による識別処理の際に、識別対象の物体の重心の検出に寄与しない領域である。そのため、ステップＳ２０３における識別結果ばらつき度の算出において、入力されたサンプルデータの部分領域に、識別対象の物体が存在しないことが何れかの弱識別器で判定された場合、弱識別器間多様性指標算出部１１０は、次のようにする。即ち、弱識別器間多様性指標算出部１１０は、この部分領域における当該弱識別器を含む弱識別器ペアの識別結果ばらつき度を算出せず、識別器間の多様性指標の算出の集計（前述した例では、識別結果ばらつき度の平均値の算出対象）から除外する。このようにすることで、識別器間の多様性指標（弱識別器間多様性指標）の算出をより適切に行うことができる。 Note that, as a result of identification by a weak classifier, a partial region of the sample data may be determined to be a region in which the object to be identified does not exist (in Hough Forest, the input data is a negative sample). A region determined not to contain the object to be identified does not contribute to detecting the center of gravity of that object during identification by the weak classifier. Therefore, when any weak classifier determines, in the calculation of the identification result variation degree in step S203, that the object to be identified does not exist in the input partial region, the inter-weak-classifier diversity index calculation unit 110 proceeds as follows. The unit does not calculate the identification result variation degree in this partial region for any weak classifier pair that includes that weak classifier, and excludes the region from the aggregation used to calculate the diversity index (in the example described above, from the values that are averaged to obtain the mean identification result variation degree). In this way, the inter-weak-classifier diversity index can be calculated more appropriately.
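The exclusion rule can be sketched as a filtered average. In this illustration a partial region that a weak classifier judged to contain no target is represented by `None` (an assumed encoding), and `half_l1` is only a stand-in for the histogram difference H of equation (3), whose exact form is not reproduced here.

```python
def half_l1(hist_a, hist_b):
    """Stand-in histogram difference (assumed): half the L1 distance."""
    regions = set(hist_a) | set(hist_b)
    return sum(abs(hist_a.get(r, 0) - hist_b.get(r, 0)) for r in regions) / 2


def mean_variation_excluding_negatives(results_a, results_b, variation=half_l1):
    """Average the variation over partial regions, skipping every region that
    either weak classifier judged to contain no target (encoded as None)."""
    values = [
        variation(ra, rb)
        for ra, rb in zip(results_a, results_b)
        if ra is not None and rb is not None
    ]
    return sum(values) / len(values) if values else None


# Hypothetical likelihood histograms for three partial regions.
results_a = [{4: 0.5, 18: 0.5}, None, {1: 1.0}]
results_b = [{18: 1.0}, {2: 1.0}, {1: 1.0}]
index_ab = mean_variation_excluding_negatives(results_a, results_b)
```

The second partial region is dropped because weak classifier A reported no target there, so the average is taken over the remaining two regions only.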

次に、図４に示すフローチャートおよび図５Ａおよび図５Ｂに示す模式図を用いて、弱識別器組み合わせ選択部１２０の処理の一例を詳細に説明する。 Next, an example of the processing of the weak classifier combination selection unit 120 will be described in detail with reference to the flowchart shown in FIG. 4 and the schematic diagrams shown in FIGS. 5A and 5B.
弱識別器集合１１の要素数がＭであるとすると、弱識別器ペアのそれぞれに対し１つずつ弱識別器間多様性指標が算出されるので、弱識別器間多様性指標算出部１１０によって算出される弱識別器間多様性指標の総数は、以下の（４）式で表わされる。 Assuming that the weak classifier set 11 has M elements, one inter-weak-classifier diversity index is calculated for each weak classifier pair, so the total number of inter-weak-classifier diversity indices calculated by the inter-weak-classifier diversity index calculation unit 110 is expressed by the following equation (4).
Comb(M, 2) = M(M − 1)/2 … (4)

例えば、Ｍ＝５であるときには、１０個の弱識別器ペアの弱識別器間多様性指標が算出される。図５Ａは、Ｍ＝５のときの弱識別器間多様性指標の一例を示す図である。図５Ａにおいて、各行と各列の交点の数値が各弱識別器ペアの弱識別器間多様性指標を表す。例えば、ＩＤ２の弱識別器とＩＤが４の弱識別器の弱識別器間多様性指標は０．７４９となる。弱識別器組み合わせ選択部１２０は、弱識別器間多様性指標に基づきＭ個の弱識別器集合からＬ個の弱識別器を選択する。 For example, when M = 5, inter-weak-classifier diversity indices are calculated for 10 weak classifier pairs. FIG. 5A shows an example of the inter-weak-classifier diversity indices when M = 5. In FIG. 5A, the value at the intersection of each row and column is the inter-weak-classifier diversity index of the corresponding weak classifier pair. For example, the inter-weak-classifier diversity index of the weak classifier with ID 2 and the weak classifier with ID 4 is 0.749. The weak classifier combination selection unit 120 selects L weak classifiers from the set of M weak classifiers on the basis of the inter-weak-classifier diversity indices.

ステップＳ４０１では、弱識別器組み合わせ選択部１２０は、Ｍ個の弱識別器集合からＬ個を選択するＣｏｍｂ（Ｍ，Ｌ）通りの組み合わせパターンの中から、後述する総和値の算出を行っていない組み合わせパターンを１つ選択する。Ｌは、アンサンブル識別器を構成する弱識別器の数である。 In step S401, the weak classifier combination selection unit 120 selects, from the Comb(M, L) combination patterns for choosing L weak classifiers from the set of M weak classifiers, one combination pattern for which the sum value described later has not yet been calculated. L is the number of weak classifiers that constitute the ensemble classifier.
次に、ステップＳ４０２では、弱識別器組み合わせ選択部１２０は、ステップＳ４０１で選択されたＬ個の弱識別器の組み合わせパターンが持つＣｏｍｂ（Ｌ，２）個の弱識別器間多様性指標の総和値を算出する。 Next, in step S402, the weak classifier combination selection unit 120 calculates the sum of the Comb(L, 2) inter-weak-classifier diversity indices contained in the combination pattern of L weak classifiers selected in step S401.

次に、ステップＳ４０３では、弱識別器組み合わせ選択部１２０は、Ｃｏｍｂ（Ｌ，２）個の弱識別器間多様性指標の総和値の算出が行われていないＬ個の弱識別器パターンが存在するか否かを判定する。この判定の結果、Ｃｏｍｂ（Ｌ，２）個の弱識別器間多様性指標の総和値の算出が行われていないＬ個の弱識別器パターンが存在する場合には、ステップＳ４０１に戻る。そして、弱識別器組み合わせ選択部１２０は、Ｃｏｍｂ（Ｌ，２）個の弱識別器間多様性指標の総和値の算出が行われていない組み合わせパターンを選択する。ステップＳ４０１からステップＳ４０３のループを繰り返すことでＣｏｍｂ（Ｍ，Ｌ）通りの全ての組み合わせパターンに対して弱識別器間多様性指標の総和値が算出される。 Next, in step S403, the weak classifier combination selection unit 120 determines whether there remains a combination pattern of L weak classifiers for which the sum of the Comb(L, 2) inter-weak-classifier diversity indices has not been calculated. If such a pattern remains, the process returns to step S401, and the unit selects a combination pattern for which the sum has not yet been calculated. By repeating the loop from step S401 to step S403, the sum of the inter-weak-classifier diversity indices is calculated for all Comb(M, L) combination patterns.

そして、ステップＳ４０４では、弱識別器組み合わせ選択部１２０は、Ｌ個の組み合わせパターンの中から、ステップＳ４０２で算出した弱識別器間多様性指標の総和値が最大となる組み合わせを探索し、その弱識別器組み合わせパターンを選択する。 Then, in step S404, the weak classifier combination selection unit 120 searches the combination patterns of L weak classifiers for the one whose sum of inter-weak-classifier diversity indices calculated in step S402 is the largest, and selects that weak classifier combination pattern.
以上の処理によって、弱識別器間多様性指標に基づいて、アンサンブル識別器を構成する適切な弱識別器の組み合わせを選択することができる。 Through the above processing, an appropriate combination of weak classifiers for constituting the ensemble classifier can be selected on the basis of the inter-weak-classifier diversity indices.
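Steps S401 to S404 amount to an exhaustive search over all combinations. The following sketch shows one way to write it; the function name and the diversity values are hypothetical and are not taken from FIG. 5A.

```python
from itertools import combinations


def select_most_diverse(diversity, m, l):
    """Return (best_combination, best_sum) over all Comb(m, l) combinations.

    diversity[(a, b)] with a < b is the diversity index of the pair (a, b).
    """
    def total(combo):
        # Sum of the Comb(l, 2) pairwise indices inside this combination (S402).
        return sum(diversity[pair] for pair in combinations(combo, 2))

    # S401 to S403 enumerate every combination; S404 keeps the largest sum.
    best = max(combinations(range(m), l), key=total)
    return best, total(best)


# Hypothetical diversity indices for M = 4 weak classifiers.
diversity = {
    (0, 1): 0.2, (0, 2): 0.9, (0, 3): 0.5,
    (1, 2): 0.4, (1, 3): 0.1, (2, 3): 0.7,
}
best_combo, best_sum = select_most_diverse(diversity, m=4, l=3)
```

Because every combination is visited, the cost grows with Comb(M, L), so this direct form is only practical when M and L are small.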

図５Ｂは、Ｍ＝５、Ｌ＝３、即ち５個の弱識別器集合から３個の弱識別器の組み合わせを選択する場合の弱識別器組み合わせ選択部１２０の処理の一例を説明する図である。この場合、Ｃｏｍｂ（５，３）＝１０なので、組み合わせパターンは全部で１０通り存在する。弱識別器組み合わせ選択部１２０は、それらの全ての組み合わせパターンに対して、弱識別器間多様性指標の総和値を算出する。 FIG. 5B illustrates an example of the processing of the weak classifier combination selection unit 120 when M = 5 and L = 3, that is, when a combination of three weak classifiers is selected from a set of five weak classifiers. In this case, Comb(5, 3) = 10, so there are 10 combination patterns in total. The weak classifier combination selection unit 120 calculates the sum of the inter-weak-classifier diversity indices for every one of these combination patterns.

例えば、図５Ｂの一番左の表では、ＩＤ＝２、３、４の弱識別器の組み合わせパターンについての弱識別器間多様性指標の総和値を示す。ＩＤが２とＩＤが３の弱識別器ペアの多様性指標は０．６７５、ＩＤが２とＩＤが４の弱識別器ペアの多様性指標は０．７４９、ＩＤが３とＩＤが４の多様性指標は０．０３４である。従って、それらの総和値は１．４５８となる。弱識別器組み合わせ選択部１２０は、以上の処理を１０通りの組み合わせパターンについて繰り返すことにより、全ての組み合わせパターンのそれぞれについての弱識別器間多様性指標の総和値を個別に算出する。そして、弱識別器組み合わせ選択部１２０は、当該総和値が最大となる組み合わせパターンを選択する。図５Ｂに示す例では、ＩＤが０とＩＤが２とＩＤが３の弱識別器の組み合わせパターンについての弱識別器間多様性指標の総和値が２．３６４となり、全ての組み合わせパターンの中で最大となるため、その組み合わせパターンが選択される。 For example, the leftmost table in FIG. 5B shows the sum of the inter-weak-classifier diversity indices for the combination pattern of the weak classifiers with IDs 2, 3, and 4. The diversity index of the pair with IDs 2 and 3 is 0.675, that of the pair with IDs 2 and 4 is 0.749, and that of the pair with IDs 3 and 4 is 0.034, so their sum is 1.458. By repeating this processing for the 10 combination patterns, the weak classifier combination selection unit 120 calculates the sum for each of them individually, and then selects the combination pattern with the largest sum. In the example shown in FIG. 5B, the sum for the combination pattern of the weak classifiers with IDs 0, 2, and 3 is 2.364, the largest among all combination patterns, so that combination pattern is selected.

アンサンブル識別器生成部１３０は、弱識別器組み合わせ選択部１２０で選択されたＬ個の弱識別器を用いてアンサンブル識別器１３を生成する。尚、複数の弱識別器を用いてアンサンブル識別器１３を生成する方法は、公知の技術で実現できるので、ここでは、その詳細な説明を省略する。 The ensemble classifier generation unit 130 generates the ensemble classifier 13 using the L weak classifiers selected by the weak classifier combination selection unit 120. Note that a method of generating the ensemble classifier 13 using a plurality of weak classifiers can be realized by a known technique, and a detailed description thereof will be omitted here.

以上のように本実施形態では、サンプルデータを弱識別器に入力して当該サンプルデータにおける識別対象の識別結果を得て、弱識別器ペアにおける識別結果のばらつき度を算出する。このような弱識別器ペアにおける識別結果のばらつき度の算出を、とり得る弱識別器ペア、複数のサンプルデータのそれぞれについて行う。そして、当該複数のサンプルデータを用いた場合の、弱識別器ペアにおける識別結果のばらつき度から、当該弱識別器ペアにおける識別結果の多様性を表す弱識別器間多様性指標を算出する。そして、アンサンブル識別器を構成する所定の個数の弱識別器の組み合わせパターンに含まれる弱識別器ペアにおける弱識別器間多様性指標に基づいて、当該所定の個数の弱識別器の組み合わせを選択する。そして、選択した弱識別器を用いてアンサンブル識別器を作成する。従って、クラス情報を持たない実撮影データを利用して、アンサンブル識別器を評価することが可能となる。このため、クラス情報を持たない実撮影データを利用しても、識別対象の識別に有用な弱識別器の組み合わせを選択することができ、高性能なアンサンブル識別器の作成が可能となる。 As described above, in the present embodiment, sample data is input to the weak classifiers to obtain identification results for the identification target in the sample data, and the degree of variation of the identification results within a weak classifier pair is calculated. This calculation is performed for each possible weak classifier pair and for each of a plurality of sample data items. From the degrees of variation obtained with the plurality of sample data items, an inter-weak-classifier diversity index representing the diversity of the identification results of the weak classifier pair is calculated. A combination of a predetermined number of weak classifiers is then selected on the basis of the inter-weak-classifier diversity indices of the weak classifier pairs contained in each combination pattern of that number of weak classifiers, and an ensemble classifier is created using the selected weak classifiers. The ensemble classifier can therefore be evaluated using actually captured data that has no class information. As a result, even with actually captured data that has no class information, a combination of weak classifiers useful for identifying the identification target can be selected, and a high-performance ensemble classifier can be created.

尚、本実施形態では、弱識別器間多様性指標算出部１１０が、弱識別器ペアの識別結果を比較して弱識別器間多様性指標を算出し、弱識別器組み合わせ選択部１２０が、弱識別器間多様性指標の総和値に基づき弱識別器の組み合わせを導出する場合を説明した。しかし、必ずしもこのようにする必要はない。例えば、以下のようにしてもよい。即ち、弱識別器間多様性指標算出部１１０においてＬ個（Ｌ＞２）の弱識別器の識別結果を比較して、Ｌ個の弱識別器ペアの弱識別器間多様性指標を算出する。そして、弱識別器組み合わせ選択部１２０においてＬ個の弱識別器のペアの弱識別器間多様性指標が最大となる弱識別器の組み合わせを選択する。Ｌ個の弱識別器のペアの弱識別器間多様性指標の算出には、既存の手法を利用することができる。例えば、各弱識別器の識別結果を、集合の各要素とみなしてＬ個の集合の類似度を求めることで、弱識別器間多様性指標の算出を行える。 In the present embodiment, the case has been described in which the inter-weak-classifier diversity index calculation unit 110 compares the identification results of a weak classifier pair to calculate the inter-weak-classifier diversity index, and the weak classifier combination selection unit 120 derives the combination of weak classifiers on the basis of the sum of the inter-weak-classifier diversity indices. However, this is not the only possibility. For example, the inter-weak-classifier diversity index calculation unit 110 may compare the identification results of L (L > 2) weak classifiers and calculate a single inter-weak-classifier diversity index for that group of L weak classifiers. The weak classifier combination selection unit 120 then selects the combination of weak classifiers for which this index is largest. An existing method can be used to calculate the index for a group of L weak classifiers. For example, the index can be calculated by regarding the identification results of each weak classifier as the elements of a set and obtaining the similarity of the L sets.

また、本実施形態では、識別器における識別対象のデータが画像データである場合を例に挙げて説明した。しかしながら、識別器における識別対象のデータは、画像データに限定されない。例えば、製造実績データであってもよい。また、識別器が、入力されるデータにおける識別対象が属するクラスを識別する場合を例に挙げて説明した。しかしながら、識別器は、入力されるデータにおける識別対象を識別するものであれば、このようなものに限定されない。例えば、識別対象の有無を識別する識別器を採用してもよい。 Further, in the present embodiment, an example has been described in which the data to be identified in the classifier is image data. However, data to be identified in the classifier is not limited to image data. For example, production result data may be used. Also, the case where the classifier identifies the class to which the identification target belongs in the input data has been described as an example. However, the discriminator is not limited to such as long as it identifies a discrimination target in input data. For example, an identifier for identifying the presence or absence of an identification target may be employed.

（第２の実施形態） (Second embodiment)
次に、第２の実施形態について説明する。本実施形態では、入力されたサンプルデータに対して分類処理を行い、その分類処理の結果に基づいて、当該サンプルデータにおける識別対象の識別結果を出力する方式の弱識別器を用いたアンサンブル識別器を用いる場合を例に挙げて説明する。このように本実施形態と第１の実施形態とは、弱識別器による識別対象の識別方法が異なることによる処理が主として異なる。従って、本実施形態の説明において、第１の実施形態と同一の部分については、図１〜図５Ｂに付した符号と同一の符号を付す等して詳細な説明を省略する。具体的に、本実施形態における情報処理装置の構成は、第１の実施形態と同様の構成である。また、図１に示す情報処理装置１００における弱識別器組み合わせ選択部１２０およびアンサンブル識別器生成部１３０の処理についても第１の実施形態と同様である。従って、これらの詳細な説明を省略する。 Next, a second embodiment will be described. The present embodiment describes, as an example, the case of using an ensemble classifier built from weak classifiers that perform classification processing on input sample data and output an identification result for the identification target in the sample data on the basis of the result of that classification processing. The present embodiment thus differs from the first embodiment mainly in the processing that follows from the different method by which the weak classifiers identify the identification target. Therefore, in the description of the present embodiment, parts that are the same as in the first embodiment are given the same reference numerals as in FIGS. 1 to 5B, and their detailed description is omitted. Specifically, the configuration of the information processing apparatus in the present embodiment is the same as in the first embodiment. The processing of the weak classifier combination selection unit 120 and the ensemble classifier generation unit 130 in the information processing apparatus 100 shown in FIG. 1 is also the same as in the first embodiment, so their detailed description is omitted.

入力されたサンプルデータに対して分類処理を行い、その分類処理の結果に基づいて、当該サンプルデータにおける識別対象の識別結果を出力する方式の弱識別器の一例として、決定木がある。図６は、決定木による識別の流れの一例を示す模式図である。まず、分類処理では、データが入力された決定木は、当該決定木の各ノードに設定されている分岐関数に当該データを当てはめ、その結果に基づき、当該データをいずれかの末端ノード（リーフ）に分類する。図６に示す例では、決定木Ａにより、入力データｉはリーフａに分類されていることを示す。 A decision tree is one example of a weak classifier that performs classification processing on input sample data and outputs an identification result for the identification target in the sample data on the basis of the result of that classification processing. FIG. 6 is a schematic diagram showing an example of the flow of identification by a decision tree. First, in the classification processing, the decision tree to which data is input applies the data to the branch function set at each node of the tree and, on the basis of the results, classifies the data into one of the terminal nodes (leaves). The example in FIG. 6 shows that decision tree A classifies input data i into leaf a.

決定木の各リーフには、事前の学習処理によってクラス情報が格納されている。分類処理が行われた後、識別処理として、決定木は、入力されたデータの分類先であるリーフに格納されているクラス情報を、当該データにおける識別対象の識別結果として出力する。図６に示す例では、決定木Ａによる入力データｉの識別結果として、入力データｉの分類先であるリーフａに格納されているクラス情報｛クラスＰ、クラスＱ｝が出力される。尚、決定木自体は、公知の技術で実現することができるので、ここでは、その詳細な説明を省略する。 Class information is stored in each leaf of the decision tree by a learning process in advance. After the classification process is performed, as the identification process, the decision tree outputs the class information stored in the leaf to which the input data is classified as the identification result of the identification target in the data. In the example shown in FIG. 6, class information {class P, class Q} stored in leaf a as a classification destination of input data i is output as a result of identification of input data i by decision tree A. Since the decision tree itself can be realized by a known technique, a detailed description thereof is omitted here.
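The two stages described above, classification into a leaf followed by output of the class information stored there, can be sketched with a toy tree. All names, features, and thresholds below are hypothetical; only the result for leaf a, {class P, class Q}, follows the example of FIG. 6.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    leaf_id: str
    class_info: frozenset  # class labels stored at this leaf during training


@dataclass
class Node:
    goes_left: Callable[[dict], bool]  # branch function set at this node
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree, x):
    """Classification step: route x through the branch functions to a leaf."""
    node = tree
    while isinstance(node, Node):
        node = node.left if node.goes_left(x) else node.right
    return node


def identify(tree, x):
    """Identification step: output the class information stored at the leaf."""
    return classify(tree, x).class_info


# Hypothetical decision tree A; feature names and thresholds are invented.
tree_a = Node(
    goes_left=lambda x: x["f0"] < 0.5,
    left=Leaf("a", frozenset({"P", "Q"})),
    right=Node(
        goes_left=lambda x: x["f1"] < 0.2,
        left=Leaf("b", frozenset({"R"})),
        right=Leaf("c", frozenset({"P"})),
    ),
)
leaf = classify(tree_a, {"f0": 0.3, "f1": 0.9})
```

Keeping `classify` separate from `identify` mirrors the distinction between the classification result (the leaf) and the identification result (the stored class information).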

第１の実施形態においては、各弱識別器にサンプルデータを入力したときの識別結果に着目して弱識別器間多様性指標を算出する場合を例に挙げて説明した。これに対し、本実施形態では、各弱識別器にサンプルデータを入力したときの分類結果に着目して弱識別器間多様性指標を算出する。 The first embodiment described, as an example, the case of calculating the inter-weak-classifier diversity index by focusing on the identification results obtained when sample data is input to each weak classifier. In contrast, the present embodiment calculates the inter-weak-classifier diversity index by focusing on the classification results obtained when sample data is input to each weak classifier.

図７に示すフローチャートを参照しながら、弱識別器間多様性指標算出部１１０の処理の一例を詳細に説明する。 An example of the processing of the inter-weak-classifier diversity index calculation unit 110 will be described in detail with reference to the flowchart shown in FIG. 7.
ステップＳ７０１では、弱識別器間多様性指標算出部１１０は、サンプルデータ群１２に含まれる全てのサンプルデータを、弱識別器集合１１に含まれる全ての弱識別器に入力する。これにより、弱識別器集合１１に含まれる各弱識別器は、自身に入力されたサンプルデータを分類する。 In step S701, the inter-weak-classifier diversity index calculation unit 110 inputs all sample data included in the sample data group 12 to all weak classifiers included in the weak classifier set 11. Each weak classifier included in the weak classifier set 11 thereby classifies the sample data input to it.

次に、ステップＳ７０２では、弱識別器間多様性指標算出部１１０はサンプルデータ群１２から未選択のサンプルデータを１つ選択する。 Next, in step S702, the inter-weak-classifier diversity index calculation unit 110 selects one unselected sample data item from the sample data group 12.
次に、ステップＳ７０３では、弱識別器間多様性指標算出部１１０は、ステップＳ７０１で行った各弱識別器におけるサンプルデータの分類結果に基づき、分類結果ばらつき度が未算出である２つの弱識別器ペアの分類結果ばらつき度を算出する。ここで、分類結果ばらつき度とは、２つの弱識別器の分類結果が異なる度合いを表す指標である。２つの弱識別器間の分類結果の比較のため、ステップＳ７０１で行った全てのサンプルデータに対する分類結果を利用する。 Next, in step S703, the inter-weak-classifier diversity index calculation unit 110 calculates, on the basis of the classification results of the sample data obtained from each weak classifier in step S701, the classification result variation degree of a weak classifier pair (two weak classifiers) for which it has not yet been calculated. Here, the classification result variation degree is an index representing the degree to which the classification results of the two weak classifiers differ. To compare the classification results of the two weak classifiers, the classification results for all sample data obtained in step S701 are used.

弱識別器の分類結果であるリーフを集合、それぞれのリーフに分類された各サンプルデータを集合要素とみなす。弱識別器間多様性指標算出部１１０は、或るサンプルデータが２つの識別器へ入力されたときに、それぞれの弱識別器におけるそのサンプルデータが属する集合同士の類似度を集合間類似度として算出する。そして、弱識別器間多様性指標算出部１１０は、１から当該算出した集合間類似度を減算した値（＝１−集合間類似度）を分類結果ばらつき度として導出する。２つの集合間の類似度は、第１の実施形態において、弱識別器が複数のクラス情報を識別結果として出力する場合の識別結果ばらつき度を算出する場合と同様に、ジャッカード係数Ｊやダイス係数Ｄを用いて算出することができる。 Each leaf that is a classification result of a weak classifier is regarded as a set, and each sample data item classified into that leaf is regarded as an element of the set. When a given sample data item is input to two weak classifiers, the inter-weak-classifier diversity index calculation unit 110 calculates, as the inter-set similarity, the similarity between the sets to which that sample data item belongs in the respective weak classifiers. The unit then derives the value obtained by subtracting the calculated inter-set similarity from 1 (= 1 − inter-set similarity) as the classification result variation degree. As in the first embodiment when calculating the identification result variation degree for weak classifiers that output multiple pieces of class information as the identification result, the similarity between the two sets can be calculated using the Jaccard coefficient J or the Dice coefficient D.
ステップＳ７０４からステップＳ７０６の処理は、第１の実施形態におけるステップＳ２０４からステップＳ２０６の処理と同様であるため、詳細な説明を省略する。尚、ステップＳ７０６では、弱識別器間多様性指標算出部１１０は、識別結果ばらつき度を用いる代わりに、分類結果ばらつき度を用いて、各弱識別器ペアの弱識別器間多様性指標を設定（導出）する。 The processing from step S704 to step S706 is the same as the processing from step S204 to step S206 in the first embodiment, so a detailed description is omitted. Note that in step S706, the inter-weak-classifier diversity index calculation unit 110 sets (derives) the inter-weak-classifier diversity index of each weak classifier pair using the classification result variation degree instead of the identification result variation degree.

図８は、弱識別器Ａと弱識別器Ｂに対して、サンプルデータ１が入力されたときの分類結果ばらつき度を算出する過程の一例を示す模式図である。サンプルデータ１は、弱識別器Ａによって分類結果ａに分類され、また、弱識別器Ｂによって分類結果ｂに分類される。サンプルデータ１と同様に弱識別器Ａによって分類結果ａに分類された全てのサンプルデータを含む分類結果ａの集合と、弱識別器Ｂによって分類結果ｂに分類された全てのサンプルデータを含む分類結果ｂの集合とを比較する。この比較の結果に基づいて分類結果ばらつき度が算出される。 FIG. 8 is a schematic diagram illustrating an example of a process of calculating the classification result variation degree when the sample data 1 is input to the weak classifiers A and B. The sample data 1 is classified into the classification result a by the weak classifier A, and is classified into the classification result b by the weak classifier B. A set of classification results a including all the sample data classified by the weak discriminator A into the classification result a similarly to the sample data 1, and a classification including all the sample data classified by the weak discriminator B into the classification result b. Compare with the set of results b. The classification result variation degree is calculated based on the result of this comparison.

図８に示す事例では、分類結果ａの集合と分類結果ｂの集合との集合間類似度が算出される。分類結果ａの集合は、｛サンプルデータ１、サンプルデータ３、サンプルデータ４、サンプルデータ７｝を要素に持つ集合である。分類結果ｂの集合は、｛サンプルデータ１、サンプルデータ２、サンプルデータ３、サンプルデータ５、サンプルデータ８｝を要素に持つ集合である。この場合、分類結果ａの集合と分類結果ｂの集合との共通要素は｛サンプルデータ１、サンプルデータ３｝である。従って、ジャッカード係数Ｊを用いた場合の集合間類似度は、（１）式より、２／７、ダイス係数Ｄを用いた場合の集合間類似度は、（２）式より４／９（＝２×２／９）となる。これと同様に、弱識別器間多様性指標算出部１１０は、サンプルデータ群１２に含まれる全てのサンプルデータについて、それぞれのサンプルデータが弱識別器Ａによって分類された分類結果の集合と弱識別器Ｂによって分類された分類結果の集合とを比較する。そして、弱識別器間多様性指標算出部１１０は、算出した分類結果ばらつき度の平均をとることで弱識別器Ａと弱識別器Ｂの弱識別器間多様性指標を算出する。 In the case shown in FIG. 8, the inter-set similarity between the set for classification result a and the set for classification result b is calculated. The set for classification result a has the elements {sample data 1, sample data 3, sample data 4, sample data 7}. The set for classification result b has the elements {sample data 1, sample data 2, sample data 3, sample data 5, sample data 8}. The elements common to both sets are {sample data 1, sample data 3}. Therefore, the inter-set similarity is 2/7 when the Jaccard coefficient J is used (equation (1)) and 4/9 (= 2 × 2/9) when the Dice coefficient D is used (equation (2)). In the same way, for every sample data item included in the sample data group 12, the inter-weak-classifier diversity index calculation unit 110 compares the set of the classification result into which weak classifier A classified that item with the set of the classification result into which weak classifier B classified it. The unit then calculates the inter-weak-classifier diversity index of weak classifier A and weak classifier B by averaging the calculated classification result variation degrees.
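The calculation for FIG. 8 can be checked with a short sketch. Leaves a and b contain the sample IDs given above; the remaining leaves (named "x" and "y" here) and all function names are assumptions added only so that every sample has a leaf in both weak classifiers.

```python
from fractions import Fraction


def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    return Fraction(len(a & b), len(a | b))


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    return Fraction(2 * len(a & b), len(a) + len(b))


def leaf_sets(assignment):
    """Group sample IDs by the leaf they were classified into."""
    sets = {}
    for sample, leaf in assignment.items():
        sets.setdefault(leaf, set()).add(sample)
    return sets


def classification_diversity(assign_a, assign_b, similarity=jaccard):
    """Mean over samples of 1 - similarity(leaf set in A, leaf set in B)."""
    sets_a, sets_b = leaf_sets(assign_a), leaf_sets(assign_b)
    variations = [
        1 - similarity(sets_a[assign_a[s]], sets_b[assign_b[s]])
        for s in assign_a
    ]
    return sum(variations) / len(variations)


# Leaves "a" and "b" follow FIG. 8; leaves "x" and "y" are assumed.
assign_a = {1: "a", 3: "a", 4: "a", 7: "a", 2: "x", 5: "x", 6: "x", 8: "x"}
assign_b = {1: "b", 2: "b", 3: "b", 5: "b", 8: "b", 4: "y", 6: "y", 7: "y"}
sets_a, sets_b = leaf_sets(assign_a), leaf_sets(assign_b)
j = jaccard(sets_a["a"], sets_b["b"])  # Fraction(2, 7)
d = dice(sets_a["a"], sets_b["b"])     # Fraction(4, 9)
index_ab = classification_diversity(assign_a, assign_b)
```

For sample data 1 the classification result variation degree is 1 − 2/7 = 5/7 with the Jaccard coefficient; `classification_diversity` averages this quantity over all samples without consulting any class labels.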

以上のように本実施形態では、決定木のような、入力されるサンプルデータに対して分類処理を行い、その分類処理の結果に基づいて識別結果を出力する弱識別器を用いる場合でも、第１の実施形態で説明した効果と同様の効果を得ることができる。即ち、それぞれの弱識別器が学習したクラス情報を用いることなく、サンプルデータの分類処理の結果のみを使って弱識別器間多様性指標を算出することができる。 As described above, in the present embodiment, effects similar to those described in the first embodiment can be obtained even when using weak classifiers, such as decision trees, that perform classification processing on input sample data and output an identification result on the basis of the result of that classification processing. That is, the inter-weak-classifier diversity index can be calculated using only the results of the classification processing of the sample data, without using the class information learned by each weak classifier.

尚、弱識別器間多様性指標算出部１１０において、更に第１の実施形態で説明した識別結果ばらつき度に基づく弱識別器間多様性指標を算出してもよい。この場合、例えば、弱識別器組み合わせ選択部１２０は、識別結果ばらつき度に基づく弱識別器間多様性指標と分類結果ばらつき度に基づく弱識別器間多様性指標を組み合わせて、アンサンブル識別器を構成する弱識別器を選択してもよい。また、本実施形態でも、第１の実施形態で説明した変形例を採用することができる。 Note that the inter-weak-classifier diversity index calculation unit 110 may additionally calculate the inter-weak-classifier diversity index based on the identification result variation degree described in the first embodiment. In this case, for example, the weak classifier combination selection unit 120 may select the weak classifiers constituting the ensemble classifier by combining the inter-weak-classifier diversity index based on the identification result variation degree with the one based on the classification result variation degree. The modified examples described in the first embodiment can also be adopted in the present embodiment.

（第３の実施形態） (Third embodiment)
次に、第３の実施形態について説明する。本実施形態では、データの識別時に使用するメモリ量、データの識別時の処理速度、および識別結果の精度の少なくとも１つを設定し、その設定値に応じてアンサンブル識別器を構成する弱識別器の数を変更する。このように本実施形態と第１の実施形態とは、アンサンブル識別器を生成する際の構成および処理が主として異なる。従って、本実施形態の説明において、第１の実施形態と同一の部分については、図１〜図５に付した符号と同一の符号を付して詳細な説明を省略する。 Next, a third embodiment will be described. In the present embodiment, at least one of the amount of memory used when identifying data, the processing speed when identifying data, and the accuracy of the identification result is set, and the number of weak classifiers constituting the ensemble classifier is changed according to the set value. The present embodiment thus differs from the first embodiment mainly in the configuration and processing for generating the ensemble classifier. Therefore, in the description of the present embodiment, parts that are the same as in the first embodiment are given the same reference numerals as in FIGS. 1 to 5, and their detailed description is omitted.

In general, the performance of an ensemble classifier improves as the number of weak classifiers it uses increases, and eventually saturates. Therefore, when an ensemble classifier does not achieve the desired accuracy in identification processing, it is effective to perform the identification processing with an ensemble classifier composed of a larger number of weak classifiers. On the other hand, as the number of weak classifiers constituting the ensemble classifier increases, the size of the ensemble classifier and the time required for identification processing also increase. Therefore, when it is desired to reduce the load on execution memory or to shorten the processing time of identification processing, identification may be performed with an ensemble classifier composed of a small number of weak classifiers.

FIG. 9 is a diagram illustrating an example of the basic configuration of an ensemble classifier construction system. In the present embodiment, the ensemble classifier construction system includes an information processing apparatus 200 that calculates combinations of weak classifiers and an object identification device 300 that performs identification processing.
The information processing apparatus 200 includes an inter-weak-classifier diversity index calculation unit 210 and a weak classifier combination selection unit 220.

Like the inter-weak-classifier diversity index calculation unit 110 in the first embodiment, the inter-weak-classifier diversity index calculation unit 210 takes a weak classifier set 21 and a sample data group 22 as inputs and calculates an index representing the diversity between weak classifiers (the inter-weak-classifier diversity index).
Like the weak classifier combination selection unit 120 in the first embodiment, the weak classifier combination selection unit 220 selects a predetermined number of weak classifiers based on the inter-weak-classifier diversity index calculated by the inter-weak-classifier diversity index calculation unit 210. Unlike in the first embodiment, however, the weak classifier combination selection unit 220 selects a plurality of combinations of weak classifiers. That is, there are a plurality of candidates for the predetermined number. The weak classifier combination selection unit 220 outputs the combination of weak classifiers selected for each candidate number as a selected weak classifier combination list 23.

The object identification device 300 includes a target value setting unit 310, an ensemble classifier generation unit 320, and an identification unit 330.
The target value setting unit 310 sets a target value input by the user.
The ensemble classifier generation unit 320 generates an ensemble classifier based on the selected weak classifier combination list 23 output from the information processing apparatus 200 and the target value set by the target value setting unit 310.
The identification unit 330 identifies identification target data 31 using the ensemble classifier generated by the ensemble classifier generation unit 320, and outputs the result as an identification result 32.

An example of the processing of the ensemble classifier construction system (the information processing apparatus 200 and the object identification device 300) will be described with reference to the flowchart of FIG. 10.
In step S1001, the inter-weak-classifier diversity index calculation unit 210 calculates the inter-weak-classifier diversity index for every pair of weak classifiers in the weak classifier set 21. The specific processing is the same as that described with reference to the flowchart of FIG. 2 in the first embodiment or FIG. 7 in the second embodiment, and its detailed description is therefore omitted.
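As an illustration of computing an index for every pair of weak classifiers from unlabeled sample data, the following minimal sketch uses the fraction of samples on which two weak classifiers disagree. This disagreement rate is an assumption chosen for illustration, not the index defined in the first or second embodiment, and the name pairwise_diversity and the sample outputs are hypothetical.

```python
from itertools import combinations


def pairwise_diversity(outputs):
    """Return {(i, j): disagreement rate} for every pair of weak classifiers.

    outputs[i][k] is the output of weak classifier i for sample k.
    No ground-truth labels are needed; only the outputs are compared.
    The disagreement rate is an illustrative stand-in for the index.
    """
    n_samples = len(outputs[0])
    index = {}
    for i, j in combinations(range(len(outputs)), 2):
        disagreements = sum(a != b for a, b in zip(outputs[i], outputs[j]))
        index[(i, j)] = disagreements / n_samples
    return index


# Hypothetical outputs of three weak classifiers on four unlabeled samples.
outputs = [
    ["a", "b", "a", "c"],
    ["a", "b", "b", "c"],
    ["c", "a", "b", "a"],
]
diversity = pairwise_diversity(outputs)
```

A larger value here means the two weak classifiers behave more differently on the same data.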

Next, in step S1002, the weak classifier combination selection unit 220 sets a plurality of patterns for the number of weak classifiers to be selected. Letting N be the number of weak classifiers prepared in advance, the weak classifier combination selection unit 220 may specify every number from 2 to N−1, or, to reduce the amount of calculation, may specify numbers at a predetermined interval.
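The setting of candidate numbers in step S1002 can be sketched as follows. The function name candidate_counts and the step parameter are hypothetical names introduced for this sketch, which assumes the interval starts from 2.

```python
def candidate_counts(n_prepared, step=1):
    """Candidate ensemble sizes from 2 to n_prepared - 1, taken every `step`.

    step=1 enumerates every size; a larger step reduces the amount of
    calculation at the cost of a coarser set of candidates.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(2, n_prepared, step))


all_counts = candidate_counts(10)           # every size from 2 to 9
sparse_counts = candidate_counts(10, 3)     # every third size
```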

Next, in step S1003, the weak classifier combination selection unit 220 selects, for each number specified in step S1002, an appropriate combination of weak classifiers based on the inter-weak-classifier diversity index calculated in step S1001. The processing for calculating the combination of weak classifiers is the same as that described with reference to the flowchart of FIG. 4 in the first embodiment, and its detailed description is therefore omitted.

In step S1004, the target value setting unit 310 sets a target value for the identification processing using the ensemble classifier. Here, it is assumed that either the amount of memory (capacity) used for the identification processing or the processing speed required during the identification processing (the execution speed of the identification processing) is set as the target value. The target value setting unit 310 can, for example, set an amount of memory specified by the user. The target value setting unit 310 may also set the amount of memory to be used for the identification processing from the maximum amount of memory available at identification time. Likewise, the target value setting unit 310 may set a processing speed specified by the user, or may calculate the processing speed from a preset overall processing time. Note that steps S1001 to S1003 and step S1004 are independent of each other: they may be executed in parallel, or either may be executed before the other.

When the processing of steps S1001 to S1004 is complete, the processing proceeds to step S1005. In step S1005, the ensemble classifier generation unit 320 sets the number of weak classifiers that satisfies the target value set in step S1004.
The amount of memory used for the identification processing can be calculated from the sizes of the weak classifiers constituting the ensemble classifier. Accordingly, the ensemble classifier generation unit 320 can calculate the amount of memory that would be used when performing the identification processing with each combination of weak classifiers selected in step S1003, and select a number for which the calculated amount of memory satisfies the target value.
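The memory-based choice of the number can be sketched as below, under the assumptions that the memory used is the sum of the sizes of the member weak classifiers and that the largest number fitting the target is preferred. The names largest_count_within_memory, size_bytes, and the example values are hypothetical.

```python
def largest_count_within_memory(combinations_by_count, size_bytes, memory_budget):
    """Return the largest candidate count whose combination fits the budget.

    combinations_by_count: {count: tuple of weak classifier ids}, standing in
        for the selected weak classifier combination list.
    size_bytes: {weak classifier id: size in bytes}.
    Returns None when no candidate fits.
    """
    feasible = [
        count
        for count, members in combinations_by_count.items()
        if sum(size_bytes[m] for m in members) <= memory_budget
    ]
    return max(feasible, default=None)


# Hypothetical sizes and selected combinations for counts 2, 3, and 4.
size_bytes = {0: 100, 1: 300, 2: 200, 3: 400}
selected = {2: (0, 2), 3: (0, 1, 2), 4: (0, 1, 2, 3)}
chosen = largest_count_within_memory(selected, size_bytes, memory_budget=650)
```

Preferring the largest feasible number reflects the general tendency that more weak classifiers give better performance; other selection policies are equally possible.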

In the identification processing by the ensemble classifier, each weak classifier performs identification processing, and the final identification result is output by aggregating the individual identification results. The execution time of the identification processing by the ensemble classifier is therefore proportional to the number of weak classifiers. Accordingly, by measuring in advance the execution speed of the identification processing by a single weak classifier, the ensemble classifier generation unit 320 can back-calculate the number of weak classifiers that satisfies the required speed (target value).
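The back-calculation can be sketched as follows, assuming the target is expressed as a time budget per identification and that every weak classifier takes the same measured time. The function name max_count_for_time and the optional overhead_ms term for aggregation are assumptions of this sketch.

```python
import math


def max_count_for_time(time_budget_ms, per_classifier_ms, overhead_ms=0.0):
    """Largest number of weak classifiers whose total time fits the budget.

    Assumes total time = overhead_ms + count * per_classifier_ms, i.e. the
    execution time grows in proportion to the number of weak classifiers.
    """
    if per_classifier_ms <= 0:
        raise ValueError("per_classifier_ms must be positive")
    count = math.floor((time_budget_ms - overhead_ms) / per_classifier_ms)
    return max(0, count)


# Hypothetical measurement: one weak classifier takes 2.5 ms per datum.
count_no_overhead = max_count_for_time(20.0, 2.5)
count_with_overhead = max_count_for_time(20.0, 2.5, overhead_ms=1.0)
```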

Next, in step S1006, the ensemble classifier generation unit 320 generates an ensemble classifier using, from among the combinations of weak classifiers selected in step S1003, the weak classifiers of the combination whose number was set in step S1005.
Next, in step S1007, the identification unit 330 performs identification processing on the input identification target data 31 using the ensemble classifier generated in step S1006.
Next, in step S1008, the identification unit 330 determines whether the result of the identification processing performed in step S1007 (the identification result) satisfies the desired performance (amount of memory used or execution speed).

When the identification result includes, in addition to the identification result for the object to be identified, a log of the maximum memory usage and the execution speed, the determination in step S1008 can be made, for example, as follows. The user refers to the log and indicates whether the desired performance is satisfied, and the identification unit 330 determines, based on that indication, whether the identification result satisfies the desired performance.

If the determination in step S1008 shows that the identification result satisfies the desired performance, the identification unit 330 outputs the identification result 32, and the processing according to the flowchart of FIG. 10 ends.

On the other hand, if the identification result does not satisfy the desired performance, the processing returns to step S1005, and the ensemble classifier generation unit 320 sets a smaller number of weak classifiers.
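The loop through steps S1005 to S1008 can be sketched as follows. The callback run_and_check, which stands in for generating the ensemble classifier, running identification, and judging the performance, is hypothetical, as are the example values.

```python
def build_until_satisfied(combinations_by_count, initial_count, run_and_check):
    """Try the combination for initial_count, then fall back to smaller counts.

    combinations_by_count: {count: tuple of weak classifier ids}.
    run_and_check(members) -> (result, ok): builds and runs the ensemble for
        the given members and reports whether the desired performance is met.
    Returns (count, result) for the first satisfying count, or None.
    """
    counts = sorted(
        (c for c in combinations_by_count if c <= initial_count), reverse=True
    )
    for count in counts:
        result, ok = run_and_check(combinations_by_count[count])
        if ok:
            return count, result
    return None


# Hypothetical check: each weak classifier uses 200 units of memory at run
# time, and the desired performance is a measured usage of at most 500.
def run_and_check(members):
    used = 200 * len(members)
    return f"identified with {len(members)} weak classifiers", used <= 500


selected = {2: (0, 2), 3: (0, 1, 2), 4: (0, 1, 2, 3)}
outcome = build_until_satisfied(selected, initial_count=3, run_and_check=run_and_check)
```

Because the combinations for every candidate number are selected beforehand, falling back to a smaller number only requires a lookup, not a new selection.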

As described above, in the present embodiment, the number of weak classifiers that satisfies the set target value, namely the amount of memory used in the identification processing or the processing speed required in the identification processing, is derived, and the combination pattern consisting of the derived number of weak classifiers is selected. By thus dynamically setting the number of weak classifiers constituting the ensemble classifier according to the set target value, the following effects are obtained in addition to those described in the first embodiment. That is, a higher-performance ensemble classifier can be generated within the given conditions, and memory savings and faster identification processing can be achieved.
Note that the object identification device 300 of the present embodiment can also be applied to the second embodiment. The ensemble classifier construction system (the information processing apparatus 200 and the object identification device 300) may also be configured as a single device.

Each of the embodiments described above is merely a concrete example of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on the basis of these embodiments. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

(Other embodiments)
The present invention can also be realized by executing the following processing. First, software (a computer program) that realizes the functions of the above embodiments is supplied to a system or an apparatus via a network or various storage media. Then, a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the computer program.

11: weak classifier set, 12: sample data group, 13: ensemble classifier, 100: information processing apparatus, 110: inter-weak-classifier diversity index calculation unit, 120: weak classifier combination selection unit, 130: ensemble classifier generation unit

Claims

An information processing apparatus that performs processing for selecting a plurality of weak classifiers constituting an ensemble classifier, the apparatus comprising:
input means for inputting candidates for the weak classifiers constituting the ensemble classifier;
deriving means for giving the same data to at least two weak classifiers among the candidates input by the input means and deriving, for each of a plurality of combinations of the at least two weak classifiers, an index representing the diversity of identification results between the at least two weak classifiers, based on a result of comparing information obtained when each of the weak classifiers performs identification processing on the data; and
processing means for selecting, for each candidate for the number of the plurality of weak classifiers constituting the ensemble classifier, a combination of the plurality of weak classifiers based on the index derived by the deriving means and representing the diversity of identification results between the at least two weak classifiers.

The information processing apparatus according to claim 1, wherein the processing means further includes generating means for generating the ensemble classifier using the selected combination of weak classifiers.

The information processing apparatus according to claim 1, wherein the information obtained when performing the identification processing is a result of identification of an identification target by the weak classifier.

The information processing apparatus according to claim 1, wherein the weak classifier classifies input data and outputs an identification result for an identification target based on a result of the classification, and
the information obtained when performing the identification processing is the result of the classification by the weak classifier.

The information processing apparatus according to claim 1, wherein no label is assigned to the data.

The information processing apparatus according to any one of claims 1 to 5, wherein the deriving means includes:
means for giving the same data to at least two weak classifiers among the candidates input by the input means and obtaining, for each of a plurality of pieces of the data, the information obtained when each of the weak classifiers performs the identification processing on the data;
means for deriving, based on the information obtained when performing the identification processing, a degree of variation in that information among the at least two weak classifiers; and
means for deriving the index representing the diversity of identification results between the at least two weak classifiers based on the degree of variation in the information obtained when performing the identification processing.

The information processing apparatus according to claim 6, wherein, when the information obtained by giving the same data to at least two weak classifiers among the candidates input by the input means and performing the identification processing on the data in each of the weak classifiers does not contribute to identification of the identification target, the deriving means does not derive the degree of variation in the information obtained when performing the identification processing using that data.

The information processing apparatus according to any one of claims 1 to 7, wherein the number of the plurality of weak classifiers is a predetermined number.

An ensemble classifier construction system comprising:
the information processing apparatus according to claim 1;
target value setting means for setting a target value for identification processing by the ensemble classifier; and
generating means for deriving the number of weak classifiers that achieves the target value and generating the ensemble classifier using, from among the selected combinations of the plurality of weak classifiers, the combination consisting of the derived number of weak classifiers.

The ensemble classifier construction system according to claim 9, wherein the target value includes at least one of an amount of memory used when identification by the ensemble classifier is performed and a processing speed when identification by the ensemble classifier is performed.

An information processing method for performing processing for selecting a plurality of weak classifiers constituting an ensemble classifier, the method comprising:
an input step of inputting candidates for the weak classifiers constituting the ensemble classifier;
a deriving step of giving the same data to at least two weak classifiers among the candidates input in the input step and deriving, for each of a plurality of combinations of the at least two weak classifiers, an index representing the diversity of identification results between the at least two weak classifiers, based on a result of comparing information obtained when each of the weak classifiers performs identification processing on the data; and
a step of selecting, for each candidate for the number of the plurality of weak classifiers constituting the ensemble classifier, a combination of the plurality of weak classifiers based on the index derived in the deriving step and representing the diversity of identification results between the at least two weak classifiers.

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 8.