JP2012238075A

JP2012238075A - Feature selecting device, feature selecting method, and feature selecting program

Info

Publication number: JP2012238075A
Application number: JP2011105150A
Authority: JP
Inventors: Akira Suzuki; 章鈴木; Masashi Morimoto; 正志森本; Shunichi Yonemura; 俊一米村; Satoshi Shimada; 聡嶌田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-05-10
Filing date: 2011-05-10
Publication date: 2012-12-06

Abstract

PROBLEM TO BE SOLVED: To provide a feature selection device with improved generalization ability. SOLUTION: A feature selection device that selects features of samples using a genetic algorithm includes: means for calculating the distance of every learning sample from a hyperplane that is the decision boundary between categories; means for selecting one of the learning samples as a reference sample and setting two hyperplanes, one on each side of the decision boundary, each at the distance between the reference sample and the decision boundary; means for calculating penalty values for learning samples that protrude beyond the two hyperplanes, according to the length by which they protrude; means for calculating a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and taking the value of that measure as the evaluation value of the reference sample; and means for calculating this evaluation value with each learning sample in turn used as the reference sample, and outputting the minimum of these values as the evaluation value of the selected features.

Description

The present invention relates to feature selection, a technique for improving the classification accuracy of pattern recognition by selecting features, and in particular to feature selection that uses a genetic algorithm (hereinafter, GA).

A known conventional technique for GA-based feature selection uses a database consisting of many learning patterns and the category information of each pattern. The combination of features is varied, classification against the database and tallying of the classification rate are repeated, and the GA performs a search with the classification rate as the fitness (see, for example, Non-Patent Document 1).

Non-Patent Document 1: 浜本義彦, 古里眞理, 金山知余, 富田眞吾, "Feature selection method using a genetic algorithm" (in Japanese), IEICE Transactions (A), vol. J78-A, no. 10, pp. 1385-1389 (1995).

Conventional feature selection techniques could not improve generalization ability. Generalization ability in classification is the ability to classify not only the learning samples but also unknown samples at a high classification rate, and it is a very important property in practice. According to the idea behind the linear SVM, a classifier that emphasizes generalization ability, making the "margin" (the Euclidean distance between the decision boundary and the learning samples closest to it) as large as possible is effective for improving the generalization ability of a classifier. However, no feature selection technique based on this idea has existed until now.

The present invention has been made in view of these circumstances, and its object is to provide a feature selection device, a feature selection method, and a feature selection program that realize feature selection with improved generalization ability.

The present invention is a feature selection device that improves the classification accuracy of pattern recognition by selecting features of samples using a genetic algorithm, the device comprising: distance calculation means for calculating the distance of every learning sample from a hyperplane that is the decision boundary between categories; margin setting means for selecting one of the learning samples as a reference sample and setting two hyperplanes, one on each side of the decision boundary, each at the distance between the reference sample and the decision boundary; penalty calculation means for calculating, for learning samples that protrude beyond the two hyperplanes thus set, a penalty value according to the length by which they protrude; reference sample evaluation means for calculating a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and taking the value of that measure as the evaluation value of the reference sample; and evaluation means for using the reference sample evaluation means to calculate, for every learning sample, the evaluation value obtained when that sample is used as the reference sample, and outputting the minimum of these values as the evaluation value of the selected features.

The present invention is also a feature selection method for a feature selection device that comprises distance calculation means, margin setting means, penalty calculation means, reference sample evaluation means, and evaluation means in order to improve the classification accuracy of pattern recognition by selecting features of samples using a genetic algorithm, the method comprising: a distance calculation step in which the distance calculation means calculates the distance of every learning sample from a hyperplane that is the decision boundary between categories; a margin setting step in which the margin setting means selects one of the learning samples as a reference sample and sets two hyperplanes, one on each side of the decision boundary, each at the distance between the reference sample and the decision boundary; a penalty calculation step in which the penalty calculation means calculates, for learning samples that protrude beyond the two hyperplanes thus set, a penalty value according to the length by which they protrude; a reference sample evaluation step in which the reference sample evaluation means calculates a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and takes the value of that measure as the evaluation value of the reference sample; and an evaluation step in which the evaluation means uses the reference sample evaluation means to calculate, for every learning sample, the evaluation value obtained when that sample is used as the reference sample, and outputs the minimum of these values as the evaluation value of the selected features.

The present invention is also a feature selection program that causes a computer on a feature selection device, which improves the classification accuracy of pattern recognition by selecting features of samples using a genetic algorithm, to perform feature selection processing, the program causing the computer to execute: a distance calculation step of calculating the distance of every learning sample from a hyperplane that is the decision boundary between categories; a margin setting step of selecting one of the learning samples as a reference sample and setting two hyperplanes, one on each side of the decision boundary, each at the distance between the reference sample and the decision boundary; a penalty calculation step of calculating, for learning samples that protrude beyond the two hyperplanes thus set, a penalty value according to the length by which they protrude; a reference sample evaluation step of calculating a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and taking the value of that measure as the evaluation value of the reference sample; and an evaluation step of calculating, by the reference sample evaluation step, for every learning sample, the evaluation value obtained when that sample is used as the reference sample, and outputting the minimum of these values as the evaluation value of the selected features.

According to the present invention, feature selection using a genetic algorithm employs, as the fitness of a chromosome, an evaluation measure based on the soft margin of the linear SVM. As a result, feature selection with high generalization ability, comparable to that of the linear SVM, can be performed.

FIG. 1 is a block diagram showing the configuration of one embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the structure of a chromosome.
FIG. 3 is a block diagram showing the configuration of the chromosome evaluation means 8 shown in FIG. 1.
FIG. 4 is a schematic diagram showing the distribution of ψ in the selected feature space, where ψ is the set of learning samples and W_1, W_2 are the standard patterns.
FIG. 5 is a conceptual diagram of the positional relationship between the standard patterns, the decision boundary, and a sample in the feature space.
FIG. 6 is a flowchart showing the processing operation of the feature selection device 1 shown in FIG. 1.
FIG. 7 is an explanatory diagram showing the configuration of the all-generation chromosome set storage means 4 shown in FIG. 1.

A feature selection device according to one embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the embodiment. In this figure, reference numeral 1 denotes the feature selection device, which is implemented on a computer. Reference numeral 2 denotes overall control means that supervises the whole device and controls its operation. Reference numeral 3 denotes initial chromosome set creation means that creates the initial chromosome set. Reference numeral 4 denotes all-generation chromosome set storage means that stores the chromosome sets of all generations. Reference numeral 5 denotes selection probability value calculation means that calculates selection probability values. Reference numeral 6 denotes a generation number counter that counts generations. Reference numeral 7 denotes an individual generation execution unit. Reference numeral 71 denotes crossover execution means that performs crossover. Reference numeral 72 denotes mutation execution means that performs mutation. Reference numeral 73 denotes replication execution means that performs replication. Reference numeral 74 denotes chromosome set sorting means that sorts a chromosome set. Reference numeral 8 denotes chromosome evaluation means that evaluates chromosomes.

First, the classification method is described. Various classification methods exist; this embodiment uses minimum distance classification, in which the classification result is the category whose standard pattern has the smallest Euclidean distance to the input pattern. For a two-category problem, the decision boundary in minimum distance classification is the hyperplane that perpendicularly bisects the line segment connecting the two standard patterns. Each standard pattern is the mean of the learning samples of its category.
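The minimum distance classification described above can be sketched as follows. This is an illustrative sketch only: the function names and the toy data are assumptions introduced here, not part of the patent text.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def classify_min_distance(x, standard_patterns):
    """Return the category whose standard pattern is nearest to x (Euclidean)."""
    return min(standard_patterns,
               key=lambda k: math.dist(x, standard_patterns[k]))

# Hypothetical toy data: two categories in a 2-D feature space.
category_1 = [[0.0, 0.0], [2.0, 0.0]]
category_2 = [[4.0, 4.0], [6.0, 4.0]]

# Each standard pattern is the mean of the learning samples of its category.
patterns = {1: mean_vector(category_1), 2: mean_vector(category_2)}
label = classify_min_distance([1.0, 1.0], patterns)
```

Here `label` is the category of the nearest mean; with two categories the implied decision boundary is the perpendicular bisector of the segment joining the two means.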

Next, the structure of a chromosome is described with reference to FIG. 2, which shows that structure. Let L be the number of dimensions of the original features, and consider a vector that expresses which of the L dimensions are selected; this is called the feature selection vector λ. In this embodiment the feature selection vector is used directly as the chromosome in the GA. The number of features selected by λ is denoted K, and the feature selection vector is constructed as shown in FIG. 2. Each element of the vector indicates whether the corresponding feature is used: "1" means "used" and "0" means "not used". In FIG. 2, the dashed frame around each dimension marked "1" indicates where "used" applies. The values of the individual elements are denoted λ_1, λ_2, ..., λ_L.

Next, the configuration of the chromosome evaluation means 8 shown in FIG. 1 is described with reference to FIG. 3, a block diagram of that means. Let Ψ be the set of original features of the learning samples, N the number of learning samples, X_i (i = 1, 2, ..., N) the individual elements of Ψ, and y_i the category information of X_i, with category 1 represented by y_i = 1 and category 2 by y_i = −1. This y_i is called the teacher signal. The number of categories to be classified is 2, the categories are category 1 and category 2, and the number of dimensions of the original features is L. The learning sample set storage means 81 stores in advance the set Ψ of original features of the learning samples and all teacher signals.

The means 82 for creating standard patterns in the original feature space operates first when the feature selection device 1 starts up; the overall control means 2 begins operating only after this processing has completed. Means 82 creates the standard patterns in the original feature space as follows: using the teacher signals, it divides the elements X_1, X_2, ..., X_N into category 1 and category 2, calculates the mean of each group, and denotes the results M_1 and M_2, the standard patterns of categories 1 and 2 in the original feature space.

The sample-to-category distance calculation means 85 calculates, on the premise that λ is applied, the distance between a sample input from the means 84 for calculating the distance of a sample from the decision boundary and category k (k = 1, 2), as follows. First, the basic distance between vectors under λ is defined. Suppose two vectors X_A and X_B in the original feature space and a feature selection vector λ are given, with X_A = (X_A1, X_A2, ..., X_AL) and X_B = (X_B1, X_B2, ..., X_BL). From the elements X_Ai (1 ≤ i ≤ L) of X_A, only those whose corresponding λ_i equals 1 are extracted and collected into a K-dimensional vector ~X_A = (~X_A1, ~X_A2, ..., ~X_AK) (the tilde is written above X; the same applies below). A K-dimensional vector ~X_B = (~X_B1, ~X_B2, ..., ~X_BK) is created from X_B in the same way. Using these two vectors, the distance between X_A and X_B under λ is calculated by the function f_D of equation (1). Equation (1) states that f_D is the Euclidean distance calculated using only the features selected by the feature selection vector λ.
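Equation (1) is not reproduced in this text, but the description states that f_D is the Euclidean distance over the features selected by λ. A minimal sketch under that reading follows; the helper name `select_features` and the example values are assumptions made for illustration.

```python
import math

def select_features(x, lam):
    """Keep the elements of x whose selection bit in lam equals 1."""
    return [value for value, bit in zip(x, lam) if bit == 1]

def f_D(x_a, x_b, lam):
    """Euclidean distance between x_a and x_b over the features selected by lam."""
    return math.dist(select_features(x_a, lam), select_features(x_b, lam))

# Hypothetical example with L = 4 original features and K = 2 selected ones.
lam = [1, 0, 1, 0]
x_a = [3.0, 9.0, 0.0, 7.0]
x_b = [0.0, 1.0, 4.0, 2.0]
d = f_D(x_a, x_b, lam)  # only the first and third dimensions contribute
```

The unselected dimensions have no influence on `d`, which is what lets the GA compare chromosomes purely through the selected features.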

Next, a method is described for rescaling the coordinate system of the feature space so that the distance between the standard patterns of category 1 and category 2 always equals the constant value "1", regardless of how λ changes. First, let X be the original features of an arbitrary sample, and let ~X (the tilde is written above X) be the vector formed from the features of X selected by λ. Likewise, let ~M_k (k = 1, 2) (the tilde is written above M; the same applies below) be the vector formed from the features of the original standard pattern M_k (k = 1, 2) selected by λ. Using ~M_k (k = 1, 2), ~X is transformed by equation (2) into a K-dimensional vector x.

The same transformation as in equation (2) is applied to the standard pattern of category k (k = 1, 2) by equation (3).

W_k in equation (3) is the normalized standard pattern, which is also a K-dimensional vector. By equation (3), a new standard pattern is created every time λ changes. The sample-to-category distance calculation means 85 calculates the distance between an arbitrary sample and category k (k = 1, 2) as f_D(x, W_k) and outputs it.
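Equations (2) and (3) are not reproduced in this text. One transformation consistent with the stated goal (distance between the two standard patterns equal to 1 for every λ) is to divide the selected features by the distance between ~M_1 and ~M_2. The sketch below assumes that form; the actual equations may differ, for example by also shifting the origin.

```python
import math

def select_features(x, lam):
    """Keep the elements of x whose selection bit in lam equals 1."""
    return [value for value, bit in zip(x, lam) if bit == 1]

def normalize(x, m1, m2, lam):
    """Select features by lam, then divide by the distance between ~M_1 and ~M_2.

    Assumed form of equations (2) and (3); not taken from the source.
    """
    scale = math.dist(select_features(m1, lam), select_features(m2, lam))
    if scale == 0.0:
        raise ValueError("standard patterns coincide on the selected features")
    return [value / scale for value in select_features(x, lam)]

# Hypothetical original-space standard patterns and sample.
lam = [1, 0, 1]
m1 = [0.0, 5.0, 0.0]
m2 = [3.0, 9.0, 4.0]
w1 = normalize(m1, m1, m2, lam)  # normalized standard pattern W_1
w2 = normalize(m2, m1, m2, lam)  # normalized standard pattern W_2
x = normalize([6.0, 1.0, 8.0], m1, m2, lam)
```

After this rescaling `math.dist(w1, w2)` is 1 up to rounding, so margins computed for different chromosomes are on a comparable scale.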

Next, the operation by which the chromosome evaluation means 8 evaluates a chromosome λ is described with reference to FIG. 3. When a chromosome λ is input, the chromosome evaluation means 8 evaluates it by operating the means 84 for calculating the distance of a sample from the decision boundary and then the robustness evaluation means 86, and outputs a score. This score is called the fitness. The means 84 first creates the set obtained by transforming each element of the learning sample set Ψ by equation (2); this set is written {x_1, x_2, ..., x_N} and denoted ψ.

FIG. 4 is a schematic diagram of the distribution of ψ in the selected feature space, after rescaling so that the distance between the standard patterns of category 1 and category 2 is 1, where ψ is the set of learning samples and W_1, W_2 are the standard patterns. In FIG. 4, black circles represent category 1 samples and white circles represent category 2 samples. H_0 denotes the hyperplane that is the decision boundary between the categories.

Next, the means 84 for calculating the distance of a sample from the decision boundary calculates the Euclidean distance h(x_i) (i = 1, 2, ..., N) from the decision boundary H_0 to each x_i (i = 1, 2, ..., N). Specifically, it invokes the sample-to-category distance calculation means to obtain the distances f_D(x_i, W_1) and f_D(x_i, W_2) to categories 1 and 2, and uses these results to calculate h(x_i) by equation (4).

FIG. 5 is a conceptual diagram of the positional relationship between the standard patterns, the decision boundary, and a sample in the feature space. The value h(x_i) calculated by equation (4) is signed: h(x_i) > 0 when x_i is on the category 1 side of H_0, and h(x_i) < 0 when it is on the category 2 side. Whether the classification result is correct can be judged from the sign of the product with the teacher signal y_i: h(x_i)·y_i > 0 means a correct classification, and h(x_i)·y_i < 0 means an incorrect one.
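Equation (4) is not reproduced in this text. A standard way to obtain a signed distance to the perpendicular bisector of W_1 and W_2 from the two distances f_D(x, W_1) and f_D(x, W_2) is shown below; it matches the sign convention described above (positive on the category 1 side), but it is a reconstruction from geometry, not the source's formula.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def signed_distance(x, w1, w2):
    """Signed distance from x to the hyperplane bisecting the segment w1-w2.

    Positive when x is nearer to w1 (category 1 side). When the standard
    patterns are rescaled so that |w1 - w2| = 1, the denominator is 2.
    """
    return (squared_distance(x, w2) - squared_distance(x, w1)) / (
        2.0 * math.dist(w1, w2))

def is_correct(h, y):
    """A sample is classified correctly when h(x) * y > 0."""
    return h * y > 0

# Hypothetical normalized standard patterns; the boundary H_0 is the line x = 0.5.
w1, w2 = [0.0, 0.0], [1.0, 0.0]
h_a = signed_distance([0.2, 3.0], w1, w2)   # on the category 1 side
h_b = signed_distance([0.9, -1.0], w1, w2)  # on the category 2 side
```

The identity used is |x − W_2|² − |x − W_1|² = 2 (x − (W_1 + W_2)/2)·(W_1 − W_2), which is twice the projection of x onto the boundary normal, scaled by |W_1 − W_2|.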

Next, the robustness evaluation means 86 examines the sign of h(x_i)·y_i for each x_i. If h(x_i)·y_i > 0, it calculates the robustness evaluation measure S_i for x_i by steps (1) to (3) described below; if h(x_i)·y_i < 0, it sets S_i to a positive constant sufficiently larger than the range of values calculated by steps (1) to (3). After S_i has been calculated for all x_i, it finds the minimum of the set of S_i and outputs that value as the fitness of chromosome λ.

(1) Margin setting
Hyperplanes H_1 and H_2 are set on both sides of H_0, each at distance |h(x_i)| from it. x_i lies on either H_1 or H_2. The interval between H_1 and H_2, 2·|h(x_i)|, is calculated and denoted 2/ω_i. This x_i corresponds to a support vector of the linear SVM and is called a "support point" here. As in the linear SVM, 2/ω_i is the margin.

(2) Penalty calculation
Penalties are calculated for samples that protrude from the margin set in (1). H_1 divides the feature space into the region on the H_0 side and the region on the opposite side. A category 1 sample that lies in the former region is regarded as lowering the degree of separation of the distributions and is given a penalty proportional to its distance from H_1. The same is done for category 2 samples using H_2. Equation (5) formalizes this.

In equation (5), ξ_ij is the penalty given to the individual element x_j of ψ. Equation (5) means that when x_j protrudes from the margin (that is, when y_j·h(x_j) < 1/ω_i), the penalty is (1 − y_j·h(x_j)·ω_i), the ratio of the protruding length (1/ω_i − y_j·h(x_j)) to 1/ω_i, and that when x_j does not protrude, the penalty is 0. The overall penalty is then the sum Σ_{j=1}^{N} ξ_ij.
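The penalty rule described for equation (5) can be sketched as a hinge-style function. The equation itself is not reproduced in this text, so the sketch follows the verbal description: the penalty is (1 − y_j·h(x_j)·ω_i) when positive and 0 otherwise. The function name and the example values are assumptions.

```python
def penalties(h_values, labels, i):
    """Penalties xi_ij for every sample j when sample i is the support point.

    h_values[j] is the signed distance h(x_j); labels[j] is the teacher
    signal y_j (+1 or -1). Requires h_values[i] != 0.
    """
    omega_i = 1.0 / abs(h_values[i])  # the margin is 2 / omega_i
    return [max(0.0, 1.0 - y * h * omega_i)
            for h, y in zip(h_values, labels)]

# Hypothetical signed distances for four samples (two per category).
h_values = [0.5, 0.25, -0.5, -1.0]
labels = [1, 1, -1, -1]
xi = penalties(h_values, labels, 0)  # sample 0 as support point, omega = 2
total_penalty = sum(xi)
```

With sample 0 as support point, only the second sample lies inside the margin (0.25 < 0.5), so it alone receives a nonzero penalty.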

(3) Final calculation of S_i
Using the margin value 2/ω_i calculated in (1) and the penalty value Σ_{j=1}^{N} ξ_ij calculated in (2), S_i is calculated by equation (6).

In equation (6), the first term serves to enlarge the margin, and the second term is the penalty for samples that protrude from the margin. C is a constant that balances the first and second terms and is determined experimentally. Because a lower S_i is better, the minimum value is selected by equation (7).
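Equations (6) and (7) are not reproduced in this text. The sketch below assumes the usual linear SVM soft-margin form, S_i = ω_i²/2 + C·Σ_j ξ_ij, and takes the minimum over all support-point candidates as in the description of equation (7). The exact first term in the source may differ, and the constant `LARGE_SCORE` and the treatment of h(x_i)·y_i = 0 are assumptions.

```python
LARGE_SCORE = 1.0e9  # hypothetical "sufficiently large" constant for misclassified x_i

def robustness_score(h_values, labels, i, c):
    """Assumed form of S_i: omega_i**2 / 2 + C * sum_j xi_ij."""
    if h_values[i] * labels[i] <= 0:
        # Misclassified reference sample (the zero case is an assumption).
        return LARGE_SCORE
    omega_i = 1.0 / abs(h_values[i])
    total_penalty = sum(max(0.0, 1.0 - y * h * omega_i)
                        for h, y in zip(h_values, labels))
    return 0.5 * omega_i ** 2 + c * total_penalty

def chromosome_fitness(h_values, labels, c):
    """Minimum of S_i over all learning samples used as the reference sample."""
    return min(robustness_score(h_values, labels, i, c)
               for i in range(len(h_values)))

# Hypothetical signed distances and teacher signals, with C = 1.
h_values = [0.5, 0.25, -0.5, -1.0]
labels = [1, 1, -1, -1]
scores = [robustness_score(h_values, labels, i, 1.0) for i in range(4)]
best = chromosome_fitness(h_values, labels, 1.0)
```

In this toy case the widest margin (sample 3 as support point) wins even though it incurs the largest total penalty, which illustrates the trade-off that C controls.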

Next, the processing operation of the feature selection device 1 shown in FIG. 1 is described with reference to FIG. 6, a flowchart of that operation. The feature selection device 1 basically operates according to a genetic algorithm (GA). Let GN be the generation number in the GA. The overall control means 2 first sets GN = 1 and stores this value in the generation number counter 6 (step S1). The overall control means 2 then instructs the initial chromosome set creation means 3 to start operating.

Here, the configuration of the all-generation chromosome set storage means 4 is described. FIG. 7(a) shows a configuration example of the all-generation chromosome set storage means 4, in which GN_max chromosome sets are stored in individual chromosome set storage units. Each chromosome set storage unit in the all-generation chromosome set storage means 4 corresponds to a generation number GN, and the chromosome set storage unit for GN is denoted A(GN). As shown in FIG. 7(b), each chromosome set storage unit holds chromosome storage units, each of which consists of an area that stores the chromosome, an area that stores the fitness, and an area that stores the selection probability. A chromosome set storage unit can hold up to K_β chromosome storage units. All chromosome set storage units are empty when the feature selection device 1 starts up.

The initial chromosome set creation means 3 sets A(1) as follows. For each of the K chromosomes (individuals), each bit of the feature selection bit string is set to "0" with a fixed probability P_f0 and to "1" with probability 1 − P_f0. Next, each bit of the sample selection bit string of each individual is set to "0" with a fixed probability P_p0 and to "1" with probability 1 − P_p0, and all of these individuals are made elements of A(1). The classification rate P_α of the learning samples for each individual of A(1) is taken as its fitness, and the individuals of A(1) are sorted in descending order of fitness (step S2). The probability that the chromosome at each rank is selected is then calculated by the selection probability value calculation means 5 and written to the selection probability area of the all-generation chromosome set storage means 4.
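The random initialization of bit strings can be sketched as follows. The function names, the population size, and the seed are assumptions made for illustration; the sketch covers one bit string per individual and does not model the storage layout of A(1).

```python
import random

def random_bit_string(length, p_zero, rng):
    """Set each bit to 0 with probability p_zero and to 1 otherwise."""
    return [0 if rng.random() < p_zero else 1 for _ in range(length)]

def initial_population(size, length, p_zero, rng):
    """Create `size` independent random bit strings of the given length."""
    return [random_bit_string(length, p_zero, rng) for _ in range(size)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = initial_population(size=6, length=10, p_zero=0.5, rng=rng)
```

Passing an explicit `random.Random` instance keeps the sketch deterministic under a seed, which is convenient when comparing GA runs.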

Here, the operation of the selection probability value calculation means 5 is described. The selection probability value calculation means 5 calculates, by equation (8), the probability P_S(r) that the chromosome ranked r-th is selected.

In equation (8), Max is a function that outputs the larger of its two arguments.

Next, after the initial chromosome set creation means 3 has finished, operation moves to the individual generation execution unit 7. The overall control means 2 first adds 1 to the value of GN stored in the generation number counter 6 (step S3). The overall control means 2 then instructs the crossover execution means 71 to execute. The crossover execution means 71 randomly selects a pair of chromosome storage units from A(GN−1) according to the probability values written in their selection probability areas, copies the two chromosomes from them, performs crossover on the copied pair to generate two new chromosomes, sends these to the chromosome evaluation means 8 to have the fitness ψ of each calculated, and adds the chromosomes to A(GN) together with their fitness values.
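Selecting a parent pair according to stored selection probabilities can be sketched with weighted sampling. The sketch draws with replacement, which is an assumption: the text does not say whether the two parents must be distinct. The function name and data are also hypothetical.

```python
import random

def select_pair(population, probabilities, rng):
    """Draw two chromosomes according to their stored selection probabilities."""
    first, second = rng.choices(population, weights=probabilities, k=2)
    # Return copies so that later crossover does not modify the stored parents.
    return list(first), list(second)

rng = random.Random(0)
population = [[0, 0, 1], [1, 1, 0], [1, 0, 1]]
# Degenerate weights make the outcome independent of the random draw.
pair = select_pair(population, [0.0, 1.0, 0.0], rng)
```

The copies correspond to the replication of the two chromosomes before crossover described above.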

Repeating this process K_1/2 times generates K_1 individuals, which become elements of A(GN) (step S4). The crossover is a two-point crossover and is applied independently to the sample selection bit string and to the feature selection bit string of the chromosome. This is because simply applying crossover to the whole chromosome may cross only one of the two bit strings, so that simultaneous selection of both might not be achieved.
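Two-point crossover on a single bit string can be sketched as below. How the cut points are drawn is not specified in the text; here two distinct cut positions are sampled uniformly, which is an assumption, as are the function and variable names.

```python
import random

def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two random cut points of equal-length parents."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n + 1), 2))  # two distinct cut positions
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

rng = random.Random(0)
parent_a = [0] * 8
parent_b = [1] * 8
child_a, child_b = two_point_crossover(parent_a, parent_b, rng)
```

To follow the text, this function would be called once on the feature selection bit string and once, independently, on the sample selection bit string of the same pair.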

Next, the overall control means 2 instructs the mutation execution means 72 to execute. The mutation execution means 72 randomly selects one chromosome storage unit from A(GN-1) according to the probability values written in the selection probability field of the individual chromosome storage units, duplicates its chromosome, inverts bits of the duplicated chromosome's bit strings with a fixed probability, and adds the result as an element of A(GN) (step S5). Repeating this process K_2 times adds K_2 individuals to A(GN). The probabilities of inverting a bit of the feature selection bit string and of the sample selection bit string are P_fm and P_pm, respectively.
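The bit inversion in step S5 can be sketched as follows. This is a minimal Python example, assuming the same hypothetical `(sample_bits, feature_bits)` list layout for a chromosome; the function name `mutate` and the parameter names `p_pm` and `p_fm` (standing for P_pm and P_fm) are chosen for illustration only.

```python
import random


def mutate(chromosome, p_pm, p_fm, rng=random):
    """Return a mutated copy of a (sample_bits, feature_bits) chromosome.

    Each sample selection bit flips with probability p_pm and each
    feature selection bit with probability p_fm; the input is not modified.
    """
    sample_bits, feature_bits = chromosome
    new_sample = [b ^ 1 if rng.random() < p_pm else b for b in sample_bits]
    new_feature = [b ^ 1 if rng.random() < p_fm else b for b in feature_bits]
    return new_sample, new_feature
```

Keeping two separate probabilities lets the sample part and the feature part be perturbed at different rates.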

Next, the overall control means 2 instructs the replication execution means 73 to execute. The replication execution means 73 randomly selects K_3 chromosome storage units from A(GN-1) according to the probability values written in the selection probability field of the individual chromosome storage units, and adds each selected chromosome and its fitness to A(GN) (step S6). Here, the sum of K_1, K_2, and K_3 is set equal to k_β. Finally, the overall control means 2 instructs the chromosome set rearranging means 74 to execute. The chromosome set rearranging means 74 sorts the individuals of A(GN) in descending order of fitness (step S7). This completes the GA processing for one generation.
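Steps S6 and S7 might look like the following in Python. The dictionary layout of an individual (`chromosome`, `fitness`, `prob` keys) and the function names `replicate` and `sort_by_fitness` are assumptions for this sketch, as is sampling with replacement, which the text does not specify.

```python
import random


def replicate(prev_generation, k3, rng=random):
    """Copy k3 individuals of A(GN-1), drawn by selection probability.

    Each individual is a dict with 'chromosome', 'fitness' and 'prob'
    keys (hypothetical layout). Sampling with replacement is assumed.
    The copies are shallow: the chromosome object itself is shared.
    """
    weights = [ind["prob"] for ind in prev_generation]
    chosen = rng.choices(prev_generation, weights=weights, k=k3)
    return [dict(ind) for ind in chosen]


def sort_by_fitness(generation):
    """Order individuals by fitness, largest first (step S7)."""
    return sorted(generation, key=lambda ind: ind["fitness"], reverse=True)
```

Replicated individuals carry their stored fitness with them, so no call to the chromosome evaluation means is needed for this step.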

It is then determined whether GN = GN_max (step S8). If GN = GN_max, operation ends there, and the chromosome held in the first-ranked chromosome storage unit of A(GN_max) is taken out and output as the final result. If GN is not equal to GN_max, processing returns to step S3 and is repeated.
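The control flow of steps S3 to S8 can be summarized by the loop below. This is only a sketch: `run_ga` and `make_next_generation` are hypothetical names, individuals are represented as `(chromosome, fitness)` tuples for brevity, and the generation counter GN is assumed to start at 0.

```python
def run_ga(initial_generation, gn_max, make_next_generation):
    """Iterate generations until GN equals gn_max and return the top individual.

    initial_generation: A(0) as a list of (chromosome, fitness) tuples.
    make_next_generation: hypothetical callable standing in for steps S4-S6;
        it maps A(GN-1) to the unsorted members of A(GN).
    Assumption: the generation counter GN starts at 0.
    """
    if gn_max < 1:
        raise ValueError("gn_max must be at least 1")
    generations = [list(initial_generation)]  # all generations are kept
    gn = 0
    while True:
        gn += 1  # step S3
        members = make_next_generation(generations[gn - 1])  # steps S4-S6
        # Step S7: sort in descending order of fitness.
        generations.append(sorted(members, key=lambda ind: ind[1], reverse=True))
        if gn == gn_max:  # step S8
            return generations[gn][0]
```

The termination test sits at the end of the loop body, so at least one generation is always produced before the first-ranked individual is returned.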

As described above, in feature selection using a GA, an evaluation measure based on the soft margin of a linear SVM is used as the fitness of a chromosome, so that feature selection with high generalization ability can be performed, as with a linear SVM.

A program for realizing the functions of the processing units in FIG. 1 may be recorded on a computer-readable recording medium, and the feature selection processing may be performed by loading the program recorded on this recording medium into a computer system and executing it. The term "computer system" as used here includes an OS and hardware such as peripheral devices. A "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. The term "computer-readable recording medium" further includes media that hold a program for a certain period of time, such as the volatile memory (RAM) inside a computer system that serves as a server or client when the program is transmitted over a network such as the Internet or a communication line such as a telephone line.

The program may also be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by a transmission wave in the transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line. The program may realize only a part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

The invention is applicable to uses in which it is essential to improve the recognition accuracy of pattern recognition by selecting features using a genetic algorithm.

DESCRIPTION OF SYMBOLS: 1 ... feature selection device, 2 ... overall control means, 3 ... initial chromosome set creation means, 4 ... all-generation chromosome set storage means, 5 ... selection probability value calculation means, 6 ... generation number counter, 7 ... individual generation execution unit, 71 ... crossover execution means, 72 ... mutation execution means, 73 ... replication execution means, 74 ... chromosome set rearranging means, 8 ... chromosome evaluation means

Claims

A feature selection device that improves the recognition accuracy of pattern recognition by selecting features of samples using a genetic algorithm, the device comprising:
distance calculation means for calculating the distances of all learning samples from a hyperplane that is a decision boundary between categories;
margin setting means for selecting one of all the learning samples as a reference sample, and setting two hyperplanes, one on each side of the decision boundary, at positions separated from the decision boundary by the distance between the reference sample and the decision boundary;
penalty calculation means for calculating, for each learning sample protruding from the two set hyperplanes, a penalty value according to the length by which it protrudes from the two hyperplanes;
reference sample evaluation means for calculating a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and taking the value of the evaluation measure as the evaluation value of the reference sample; and
evaluation means for calculating, using the reference sample evaluation means, the evaluation value obtained when each of all the learning samples is used as the reference sample, and outputting the minimum of these values as the evaluation value of the selected features.

A feature selection method performed by a feature selection device that comprises distance calculation means, margin setting means, penalty calculation means, reference sample evaluation means, and evaluation means, in order to improve the recognition accuracy of pattern recognition by selecting features of samples using a genetic algorithm, the method comprising:
a distance calculation step in which the distance calculation means calculates the distances of all learning samples from a hyperplane that is a decision boundary between categories;
a margin setting step in which the margin setting means selects one of all the learning samples as a reference sample, and sets two hyperplanes, one on each side of the decision boundary, at positions separated from the decision boundary by the distance between the reference sample and the decision boundary;
a penalty calculation step in which the penalty calculation means calculates, for each learning sample protruding from the two set hyperplanes, a penalty value according to the length by which it protrudes from the two hyperplanes;
a reference sample evaluation step in which the reference sample evaluation means calculates a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and takes the value of the evaluation measure as the evaluation value of the reference sample; and
an evaluation step in which the evaluation means calculates, using the reference sample evaluation means, the evaluation value obtained when each of all the learning samples is used as the reference sample, and outputs the minimum of these values as the evaluation value of the selected features.

A feature selection program for causing a computer on a feature selection device, which improves the recognition accuracy of pattern recognition by selecting features of samples using a genetic algorithm, to execute:
a distance calculation step of calculating the distances of all learning samples from a hyperplane that is a decision boundary between categories;
a margin setting step of selecting one of all the learning samples as a reference sample, and setting two hyperplanes, one on each side of the decision boundary, at positions separated from the decision boundary by the distance between the reference sample and the decision boundary;
a penalty calculation step of calculating, for each learning sample protruding from the two set hyperplanes, a penalty value according to the length by which it protrudes from the two hyperplanes;
a reference sample evaluation step of calculating a robustness evaluation measure based on the interval between the two hyperplanes and the penalty values, and taking the value of the evaluation measure as the evaluation value of the reference sample; and
an evaluation step of calculating, by means of the reference sample evaluation step, the evaluation value obtained when each of all the learning samples is used as the reference sample, and outputting the minimum of these values as the evaluation value of the selected features.