JP2016071684A

JP2016071684A - Pattern recognition device, pattern learning device, pattern learning method, and pattern learning program

Info

Publication number: JP2016071684A
Application number: JP2014201180A
Authority: JP
Inventors: Atsushi Sato (佐藤 敦)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2014-09-30
Filing date: 2014-09-30
Publication date: 2016-05-09
Anticipated expiration: 2034-09-30
Also published as: JP6409463B2

Abstract

PROBLEM TO BE SOLVED: To improve pattern recognition accuracy by making an evaluation function reach a minimizing solution regardless of the form of the loss term, and by optimizing feature selection and feature transformation simultaneously. SOLUTION: A pattern learning device includes: an initial value input section that inputs initial values of the parameters of a discriminant function used for pattern recognition; a loss calculation section that calculates, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function for evaluating the discriminant function; a regularization calculation section that calculates a regularization term in the evaluation function; a parameter update section that updates the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and a parameter output section that outputs the parameters of the discriminant function updated by the parameter update section. The regularization calculation section calculates a regularization term defined by a ratio of norms using the feature transformation matrix of the discriminant function. SELECTED DRAWING: Figure 1

Description

The present invention relates to a pattern recognition device, a pattern learning device, a pattern learning method, and a pattern learning program.

In a pattern classifier used to have a computer recognize patterns such as speech or images, pattern learning is performed to optimize the parameters of the discriminant function in order to improve recognition speed and recognition accuracy. In particular, feature selection and feature transformation are learned as part of pattern learning.

Here, feature selection selects, from the d features obtained from an input pattern, a small number of features that are effective for discrimination; the exhaustive search method, the sequential forward selection method, and the sequential backward selection method are conventionally known (Non-Patent Document 1, p. 153). Feature transformation, on the other hand, is processing that transforms a d-dimensional feature space into a lower-dimensional space effective for discrimination; principal component analysis and discriminant analysis are conventionally known (Non-Patent Document 1, p. 95).

Methods based on machine learning, in which an evaluation function is set and the feature transformation parameters are updated, are also known. The method called “Lasso”, in which the L1 norm of the parameters is added to the evaluation function as a regularization term, yields a sparse feature transformation in which many parameter values are zero (Non-Patent Document 2). A method called “Group Lasso” has also been proposed, in which several parameters are grouped before Lasso is applied so that values become zero group by group (Non-Patent Document 3); with this method, feature selection and feature transformation can be optimized simultaneously. For example, Patent Document 1 discloses a technique for optimizing the parameters of a neural network by making them converge under an evaluation function to which an L1 norm is added as a regularization term.

Patent Document 1: Japanese Patent Laid-Open No. 08-202674

Non-Patent Document 1: Junichiro Toriwaki, Recognition Engineering: Pattern Recognition and Its Applications (認識工学―パターン認識とその応用―), Television Society Textbook Series 9, Corona Publishing (コロナ社), 1993.
Non-Patent Document 2: R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal Statist. Soc. B, Vol. 58, No. 1, pp. 267-288, 1996.
Non-Patent Document 3: M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. Royal Statist. Soc. B, Vol. 68, No. 1, pp. 49-67, 2006.

However, with the techniques described in the above documents, each element of the feature transformation matrix, which is one of the parameters of the discriminant function, approaches zero without limit every time the matrix is updated, so the minimizing solution of the evaluation function cannot be reached. For example, when the loss term for the discriminant function is defined so that it takes the same value even if the feature transformation matrix is multiplied by a constant, the feature transformation matrix cannot be determined stably, which limits the achievable improvement in pattern recognition accuracy.

An object of the present invention is to provide a technique that solves the above problem.

To achieve the above object, a pattern learning device according to the present invention comprises:
initial value input means for inputting initial values of the parameters of a discriminant function used for pattern recognition;
loss calculation means for calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
regularization calculation means for calculating a regularization term in the evaluation function;
parameter update means for updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
parameter output means for outputting the parameters of the discriminant function updated by the parameter update means,
wherein the regularization calculation means calculates a regularization term defined by a ratio of norms using the feature transformation matrix of the discriminant function.

To achieve the above object, a pattern recognition device according to the present invention is a pattern recognition device having the above pattern learning device, and comprises:
a recognition dictionary that stores the initial values of the parameters of the discriminant function and the updated parameters of the discriminant function output by the parameter output means;
parameter generation instruction means for causing the pattern learning device to generate the updated parameters of the discriminant function based on the initial values and the input vectors for learning; and
class identification means for performing class identification on an input vector of a recognition target by means of the discriminant function using the updated parameters.

To achieve the above object, a pattern learning method according to the present invention includes:
an initial value input step of inputting initial values of the parameters of a discriminant function used for pattern recognition;
a loss calculation step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter update step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function updated in the parameter update step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using the feature transformation matrix of the discriminant function is calculated.

To achieve the above object, a pattern learning program according to the present invention causes a computer to execute:
an initial value input step of inputting initial values of the parameters of a discriminant function used for pattern recognition;
a loss calculation step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter update step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function updated in the parameter update step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using the feature transformation matrix of the discriminant function is calculated.

According to the present invention, the evaluation function can be made to reach a minimizing solution regardless of the form of the loss term, and feature selection and feature transformation can be optimized simultaneously, which improves pattern recognition accuracy.

A block diagram showing the configuration of the pattern learning device according to the first embodiment of the present invention.
A block diagram showing the functional configuration of the pattern recognition device including the pattern learning unit according to the second embodiment of the present invention.
A diagram showing the configuration of the recognition dictionary according to the second embodiment of the present invention.
A block diagram showing the functional configuration of the pattern learning unit according to the second embodiment of the present invention.
A block diagram showing the configuration of the parameter update unit according to the second embodiment of the present invention.
A diagram showing the configuration of the parameter table in the parameter update unit according to the second embodiment of the present invention.
A diagram for explaining feature selection in the base technology.
A diagram for explaining feature transformation after feature selection in the base technology.
A diagram for explaining the feature transformation matrix that performs feature selection and feature transformation in the base technology.
A block diagram showing the functional configuration of the pattern learning unit in the base technology.
A block diagram showing the hardware configuration of the pattern recognition device including the pattern learning unit according to the second embodiment of the present invention.
A flowchart showing the processing procedure of the pattern recognition device according to the second embodiment of the present invention.
A flowchart showing the procedure of the pattern learning processing according to the second embodiment of the present invention.
A diagram showing an example of regularization based on the norm ratio according to the second embodiment of the present invention.
A block diagram showing a specific configuration of the pattern recognition device including the pattern learning unit according to the second embodiment of the present invention.
A block diagram showing a specific configuration of the pattern learning unit according to the second embodiment of the present invention.
A flowchart showing a specific processing procedure of the pattern recognition device including the pattern learning unit according to the second embodiment of the present invention.
A flowchart showing a specific procedure of the pattern learning processing according to the second embodiment of the present invention.
A diagram showing the configuration of the regularization calculation unit according to the third embodiment of the present invention.
A flowchart showing the procedure of the pattern learning processing according to the third embodiment of the present invention.
A diagram showing the configuration of the recognition dictionary according to the fourth embodiment of the present invention.
A flowchart showing the procedure of the pattern learning processing according to the fourth embodiment of the present invention.
A block diagram showing the functional configuration of the pattern recognition device including the pattern learning unit according to the fifth embodiment of the present invention.
A flowchart showing the processing procedure of the pattern recognition device according to the fifth embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail by way of example with reference to the drawings. However, the components described in the following embodiments are merely examples, and the technical scope of the present invention is not limited to them. Note that the “feature transformation matrix” used in this specification is a matrix that performs feature selection and feature transformation together to reduce the number of dimensions of the input vector representing the pattern to be recognized.
[First Embodiment]
A pattern learning device 100 as a first embodiment of the present invention is described with reference to FIG. 1. The pattern learning device 100 is a device that updates the parameters of a discriminant function used for pattern recognition.

As shown in FIG. 1, the pattern learning device 100 includes an initial value input unit 101, a loss calculation unit 102, a regularization calculation unit 103, a parameter update unit 104, and a parameter output unit 105. The initial value input unit 101 inputs initial values of the parameters of a discriminant function used for pattern recognition. The loss calculation unit 102 calculates, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function. The regularization calculation unit 103 calculates a regularization term in the evaluation function. The parameter update unit 104 updates the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases. The parameter output unit 105 outputs the parameters of the discriminant function updated by the parameter update unit 104. Here, the regularization calculation unit 103 calculates a regularization term defined by a ratio of norms 103a using the feature transformation matrix of the discriminant function.

According to this embodiment, the evaluation function can be made to reach a minimizing solution regardless of the form of the loss term, and feature selection and feature transformation can be optimized simultaneously, which improves pattern recognition accuracy.

[Second Embodiment]
Next, a pattern recognition device including a pattern learning unit according to the second embodiment of the present invention is described. The pattern learning unit according to this embodiment defines the regularization term of the evaluation function as a ratio of norms using the feature transformation matrix of the discriminant function, and thereby optimizes the feature transformation matrix; that is, it improves recognition accuracy and makes the feature transformation matrix sparse. In this embodiment, the regularization term of the evaluation function is defined as a ratio of norms using the columns of the feature transformation matrix of the discriminant function.

<<Base technology>>
To make the features of this embodiment clear, the base technology for pattern learning is briefly described. First, feature selection and feature transformation are described concretely.

(Feature selection)
FIG. 6A is a diagram for explaining feature selection 610 in the base technology. Feature selection 610 is processing that creates a q-dimensional (q < d) vector z by extracting some of the elements of a d-dimensional input vector x, and can be described by a q × d matrix S. In each row of the matrix S, exactly one element is “1” and all other elements are “0”.
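The selection matrix S described above can be illustrated with a short sketch. The helper names `selection_matrix` and `matvec`, the example vector, and the chosen indices are assumptions made for illustration only; they do not appear in the specification.

```python
def selection_matrix(selected, d):
    """Build a q x d matrix S whose i-th row has a single 1 at column selected[i]."""
    return [[1 if col == idx else 0 for col in range(d)] for idx in selected]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


x = [0.5, 1.2, -0.7, 3.0, 2.1]   # d = 5 input features (illustrative values)
selected = [0, 3]                # q = 2 features kept by feature selection
S = selection_matrix(selected, len(x))
z = matvec(S, x)                 # z contains only x[0] and x[3]
```

Each row of `S` picks out one element of `x`, so `z` is simply the selected sub-vector.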

(Feature transformation)
FIG. 6B is a diagram for explaining feature transformation 620 after feature selection in the base technology. Feature transformation 620 is processing that linearly transforms the q-dimensional vector z into a still lower p-dimensional space (p < q), and is described by a p × q matrix A.

(Feature transformation matrix)
FIG. 6C is a diagram for explaining a feature transformation matrix 630 that performs feature selection and feature transformation in the base technology. As shown in FIG. 6C, the matrix S and the matrix A of FIG. 6B can be combined into a single matrix (B = AS, a p × d matrix), denoted the feature transformation matrix B. As shown in black in FIG. 6C, all elements of some of its column vectors are “0”. This is due to the sparsity of the matrix S and means that the vector y is unaffected by the elements of the vector x corresponding to those column vectors; that is, those elements are not chosen by feature selection.
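As a minimal sketch of why combining A and S produces all-zero columns, the following example multiplies a small A by a small S and checks that the output y = Bx ignores the unselected inputs. The matrices, the vectors, and the helper names `matmul` and `matvec` are illustrative assumptions.

```python
def matmul(A, S):
    """Multiply matrix A (p x q) by matrix S (q x d); both are lists of rows."""
    cols = list(zip(*S))
    return [[sum(a * s for a, s in zip(row, col)) for col in cols] for row in A]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# S selects features 0, 3 and 4 out of d = 5 (q = 3).
S = [[1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1]]
# A maps the q = 3 selected features to p = 2 dimensions.
A = [[0.2, -1.0, 0.5],
     [1.5, 0.3, -0.4]]

B = matmul(A, S)  # p x d feature transformation matrix
zero_columns = [j for j in range(len(B[0])) if all(row[j] == 0 for row in B)]

# Changing only the unselected inputs (indices 1 and 2) leaves y = Bx unchanged.
y_original = matvec(B, [1.0, 2.0, 3.0, 4.0, 5.0])
y_changed = matvec(B, [1.0, -9.0, 7.0, 4.0, 5.0])
```

Here `zero_columns` lists exactly the input features that S did not select, which is the sense in which zero columns of B act as feature selection.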

In this case, if the matrix A is designed after the matrix S has been determined, recognition accuracy degrades whenever a feature important for recognition was not chosen by feature selection. It is therefore desirable to optimize the feature transformation matrix B directly. The base technology achieves this with “Group Lasso”: the feature transformation matrix B is updated so that the value of the evaluation function, which combines a loss term corresponding to recognition errors with a regularization term formed from the parameters of the feature transformation matrix B, becomes smaller.

Specifically, (Equation 1) is minimized.

Here, the first term is the loss term and the second term is the regularization term; N is the number of samples, x_n is an input vector, loss(x) is a quantity corresponding to how easily the vector x is misclassified (the loss), and λ > 0 is the regularization weight. ||θ||₁ is the L1 norm of the parameter θ and is defined by (Equation 2).
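The equations themselves are not reproduced in this text. A form consistent with the definitions above is sketched below as an assumption, not as the exact notation of the specification: the symbol E is introduced here for illustration, and whether the loss is summed or averaged over the N samples cannot be recovered from the surrounding text.

```latex
% Assumed form of (Equation 1): loss term plus weighted L1 regularization term
E(\theta) = \sum_{n=1}^{N} \mathrm{loss}(x_n) + \lambda \,\|\theta\|_1, \qquad \lambda > 0

% Assumed form of (Equation 2): L1 norm of the parameter vector
\|\theta\|_1 = \sum_{j} |\theta_j|
```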

When the parameters are updated so that the L1 norm becomes smaller, the values of some elements become “0”, and a sparse solution is obtained. In Group Lasso, a parameter θ_j is defined over several elements taken together, so that the element values in a group can be set to “0” together. For example, using the elements b_ij of the matrix B, the parameter θ_j is defined as in (Equation 3).

In this way, the magnitude of the j-th column vector of the matrix B can be made “0”, which acts as feature selection.
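The column-wise grouping can be sketched as follows, assuming the usual Group Lasso choice in which θ_j is the L2 norm of the j-th column of B. (Equation 3) is not reproduced in this text, so this definition, the function names, and the example matrix are assumptions.

```python
import math


def column_group_norms(B):
    """theta_j = L2 norm of the j-th column of B (assumed form of the grouping)."""
    return [math.sqrt(sum(b * b for b in col)) for col in zip(*B)]


def l1_norm(theta):
    """L1 norm: sum of absolute values."""
    return sum(abs(t) for t in theta)


B = [[3.0, 0.0, 0.0, 1.0],
     [4.0, 0.0, 0.0, 0.0]]

theta = column_group_norms(B)    # [5.0, 0.0, 0.0, 1.0]
regularizer = l1_norm(theta)     # 6.0
selected_features = [j for j, t in enumerate(theta) if t > 0]  # [0, 3]
```

A zero entry in `theta` means the whole column is zero, so the corresponding input feature is dropped.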

(Pattern learning)
FIG. 6D is a block diagram showing the functional configuration of the pattern learning unit 640 in the base technology.

The pattern learning unit 640 includes an initial value input unit 601, a loss calculation unit 602, a regularization calculation unit 603, an evaluation value calculation unit (addition unit) 604, a parameter update unit 605, and a parameter output unit 606.

The initial value input unit 601 inputs the initial value of the feature transformation matrix (and the reference vectors). The loss calculation unit 602 includes a selection unit 621, a discriminant function calculation unit 622, and a loss calculation unit 623, and calculates the loss term of the evaluation function. The selection unit 621 selects between the initial-value input of the feature transformation matrix and the input of the feature transformation matrix being updated. The discriminant function calculation unit 622 uses the feature transformation matrix and the reference vectors to determine the class of a learning input vector based on the reference vector at the minimum distance from it. The loss calculation unit 623 then calculates the value of the loss term, which accumulates whether each class determination was correct and the degree of error.
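The nearest-reference-vector decision made by the discriminant function calculation unit can be sketched as below. The exact distance is not given in this section; the sketch assumes the squared distance ||B(x − r)||² between the input x and a reference vector r held in the input space, and the function names and example values are illustrative.

```python
def project(B, v):
    """Apply the feature transformation matrix B to vector v."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]


def squared_distance(B, x, r):
    """Assumed distance: squared norm of B(x - r)."""
    diff = [xi - ri for xi, ri in zip(x, r)]
    return sum(c * c for c in project(B, diff))


def classify(B, x, references):
    """references: list of (class_label, reference_vector) pairs."""
    return min(references, key=lambda item: squared_distance(B, x, item[1]))[0]


B = [[1.0, 0.0, 0.5],
     [0.0, 0.0, 1.0]]            # column 1 is all zero: feature 1 is ignored
references = [("a", [0.0, 0.0, 0.0]), ("b", [2.0, 5.0, 2.0])]
x = [1.8, -3.0, 1.9]

label = classify(B, x, references)

# Under this assumed distance, scaling B rescales every distance by the same
# factor, so the decision does not change.
scaled_B = [[0.01 * b for b in row] for row in B]
label_scaled = classify(scaled_B, x, references)
```

With this choice of distance, any loss computed from the class decisions alone takes the same value for B and for a constant multiple of B.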

The regularization calculation unit 603 includes a selection unit 631 and an L1 norm calculation unit 635, and calculates the L1 norm using the column vectors of the feature transformation matrix. The selection unit 631 selects between the initial-value input of the feature transformation matrix and the input of the feature transformation matrix being updated. The L1 norm calculation unit 635 calculates, as the value of the regularization term, the L1 norm accumulated over the column vectors of the feature transformation matrix.

The evaluation value calculation unit (addition unit) 604 adds the value of the loss term and the value of the regularization term to calculate the value of the evaluation function. If the end condition is not satisfied, the parameter update unit 605 updates the feature transformation matrix so that the value of the evaluation function decreases, and the value of the evaluation function is calculated again. If the end condition is satisfied, the parameter update unit 605 outputs the optimized, sparse feature transformation matrix via the parameter output unit 606.

(Problem with the base technology)
However, in the base technology described above, each element of the feature transformation matrix, which is one of the parameters of the discriminant function, approaches zero without limit every time the matrix is updated, so the minimizing solution of the evaluation function cannot be reached. For example, when the loss term for the discriminant function is defined so that it takes the same value even if the feature transformation matrix is multiplied by a constant, the feature transformation matrix cannot be determined stably, which limits the achievable improvement in pattern recognition accuracy.

That is, saying that the loss term is invariant to multiplying the parameters by a constant means that the loss term calculated with the parameter θ equals the loss term calculated with θ', obtained by multiplying the parameter by k. The L1 norm in that case is ||θ'||₁ = ||kθ||₁ = k||θ||₁ (for k > 0), so the smaller k is, the smaller the value of the regularization term. Consequently, when the parameters in the base technology are updated to minimize the evaluation function value, which is the sum of the loss term and the regularization term, the regularization term can be reduced by a constant factor without changing the loss term. The parameters therefore keep shrinking without limit.
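The shrinking effect can be checked numerically with a short sketch. The parameter vector and the scale factors are arbitrary illustrative values.

```python
def l1_norm(theta):
    """L1 norm: sum of absolute values."""
    return sum(abs(t) for t in theta)


theta = [2.0, 0.0, 1.0, 5.0]
scales = [1.0, 0.5, 0.25]

# L1 norm of k * theta for each k: the regularization term shrinks in
# proportion to k.
values = [l1_norm([k * t for t in theta]) for k in scales]  # [8.0, 4.0, 2.0]
```

If the loss term is the same for every k, the evaluation function is always lower for the smaller k, so minimization keeps pushing the parameters toward zero without ever reaching a minimum.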

<<Solution in this embodiment>>
In this embodiment, the regularization term is defined by a ratio of norms. For example, when the L1 norm (Equation 4) and the L2 norm (Equation 5) are used, multiplying the parameter by a constant k scales both norms by the same factor, so their ratio is unchanged and the regularization term also becomes invariant to constant multiples of the parameter. The phenomenon in which the parameters keep shrinking without limit as the evaluation function is minimized therefore does not occur, and the update can proceed stably toward the minimizing solution.
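The invariance of a norm ratio can be illustrated as follows. The sketch assumes the ratio is taken as L1 norm divided by L2 norm, which is smaller for sparser vectors; the order of the ratio, the function names, and the example vectors are assumptions for illustration.

```python
import math


def l1_norm(theta):
    return sum(abs(t) for t in theta)


def l2_norm(theta):
    return math.sqrt(sum(t * t for t in theta))


def norm_ratio(theta):
    """Assumed regularizer: L1 norm / L2 norm (undefined for the zero vector)."""
    return l1_norm(theta) / l2_norm(theta)


theta = [3.0, 0.0, 4.0, 0.0]
ratio = norm_ratio(theta)                           # 7 / 5
ratio_scaled = norm_ratio([0.5 * t for t in theta])  # same value after scaling

dense = [1.0, 1.0, 1.0, 1.0]    # ratio 4 / 2 = 2.0
sparse = [2.0, 0.0, 0.0, 0.0]   # ratio 2 / 2 = 1.0
```

Because scaling cancels out, the ratio can only be reduced by changing how the magnitude is distributed across elements, which favors sparse solutions rather than uniformly small ones.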

<<Pattern recognition device having the pattern learning unit of this embodiment>>
FIG. 2 is a block diagram showing the functional configuration of a pattern recognition device 200 including a pattern learning unit 240 according to this embodiment. In this embodiment, the pattern recognition device 200 is described as an independent device, but it may instead be incorporated into an information processing device as a pattern recognition unit.

The pattern recognition device 200 includes a parameter initial value generation unit 210, a recognition dictionary 220, a class identification unit 230, and a pattern learning unit 240.

The parameter initial value generation unit 210 initializes the parameters based on input vectors for initial value generation; for the discriminant function of this embodiment, it generates initial values of the feature transformation matrix and the reference vectors. The input vectors for initial value generation are preferably typical input vectors corresponding to the target patterns to be classified by the pattern recognition device 200, but are not limited to these. The recognition dictionary 220 stores the discriminant function and evaluation function used by the pattern recognition device 200, as well as the initial parameters and updated parameters. Using the discriminant function, the class identification unit 230 assigns an input to the class containing the reference vector at the shortest distance, based on the learning input vectors during pattern learning and on the input vector of the pattern to be recognized during pattern recognition. During pattern learning, the class identification unit 230 notifies the pattern learning unit 240 of the class identification result and the loss. During pattern recognition, it outputs the class identification result to the outside.

The pattern learning unit 240 acquires the initial value of the feature transformation matrix, which is a parameter, from the recognition dictionary 220, and acquires the class identification result and the loss from the class identification unit 230. It then repeatedly changes the elements of the feature transformation matrix so that the evaluation function value decreases, and stores the feature transformation matrix obtained at convergence in the recognition dictionary 220.

In FIG. 2, the input vector of the recognition target pattern is generated outside the pattern recognition device 200. Alternatively, the input vector may be generated inside the pattern recognition device 200 by performing feature extraction, quantization, normalization, and the like on acquired pattern information (such as images or audio).

(Recognition dictionary)
FIG. 3 is a diagram showing the configuration of the recognition dictionary 220 according to the present embodiment. FIG. 3 shows only the parameters of the discriminant function; the discriminant function, the evaluation function, and the like are omitted.

The recognition dictionary 220 stores parameter initial values 301 comprising the feature transformation matrix and the reference vectors, a parameter update value 302 consisting of the feature transformation matrix optimized by the pattern learning unit 240, and an indication 303 of whether the end condition is satisfied. The end condition is based on, for example, the number of parameter updates or the amount of change in the evaluation function value caused by an update.

<<Functional configuration of pattern learning unit>>
FIG. 4 is a block diagram illustrating the functional configuration of the pattern learning unit 240 according to the present embodiment. The pattern learning unit 240 can also be manufactured on its own as a device or an IC chip and supplied to the market, in which case it may be called an independent pattern learning device. The pattern learning unit 240 operates in response to a parameter generation instruction from the pattern recognition device 200.

Referring to FIG. 4, the pattern learning unit 240 includes an initial value input unit 401, a loss calculation unit 402, a regularization calculation unit 403, an evaluation value calculation unit (addition unit) 404, a parameter update unit 405, and a parameter output unit 406.

The initial value input unit 401 inputs the initial value of the feature transformation matrix (and the reference vectors). The loss calculation unit 402 includes a selection unit 421, a discriminant function calculation unit 422, and a loss calculation unit 423, and calculates the loss term of the evaluation function. The selection unit 421 selects between the initial value of the feature transformation matrix and the feature transformation matrix being updated. The discriminant function calculation unit 422 uses the feature transformation matrix and the reference vectors to determine the identified class from the reference vector at the minimum distance from each learning input vector. The loss calculation unit 423 then calculates the value of the loss term, which accumulates whether each class determination was correct and the degree of any error.

The regularization calculation unit 403 includes a selection unit 431, an Lv norm calculation unit 432, an Lw norm calculation unit 433, and an Lv/Lw calculation unit 434, and calculates a regularization term using the column vectors of the feature transformation matrix. The selection unit 431 selects between the initial value of the feature transformation matrix and the feature transformation matrix being updated. The Lv norm calculation unit 432 calculates the Lv norm by raising the length (norm) of each column vector of the feature transformation matrix to the power v, summing the results, and raising the sum to the power (1/v). The Lw norm calculation unit 433 calculates the Lw norm in the same way, raising the column vector lengths to the power w, summing the results, and raising the sum to the power (1/w) (v and w are real numbers with v < w). The Lv/Lw calculation unit 434 calculates (Lv norm / Lw norm) as the value of the regularization term.
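The calculation performed by units 432 to 434 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the matrix is represented as a list of rows, the column length is taken to be the Euclidean length, and the function names `column_lengths`, `lp_norm`, and `norm_ratio` are hypothetical.

```python
import math


def column_lengths(B):
    """Euclidean length of each column of B (B is given as a list of rows)."""
    n_cols = len(B[0])
    return [math.sqrt(sum(row[j] ** 2 for row in B)) for j in range(n_cols)]


def lp_norm(values, p):
    """(sum of |value| ** p) ** (1 / p)."""
    return sum(abs(x) ** p for x in values) ** (1.0 / p)


def norm_ratio(B, v=1.0, w=2.0):
    """Lv norm divided by Lw norm of the column lengths of B, with v < w."""
    if not v < w:
        raise ValueError("expected v < w")
    lengths = column_lengths(B)
    return lp_norm(lengths, v) / lp_norm(lengths, w)


B_dense = [[1.0, 1.0], [1.0, 1.0]]   # both columns (input features) in use
B_sparse = [[1.0, 0.0], [1.0, 0.0]]  # second column driven to zero

print(norm_ratio(B_dense))   # larger ratio: no column is zero
print(norm_ratio(B_sparse))  # smallest possible ratio: one nonzero column
```

Multiplying every element of the matrix by the same nonzero constant leaves the ratio unchanged, which is the property the description relies on. The ratio is undefined for an all-zero matrix, a case this sketch does not handle.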

The evaluation function value calculation unit (addition unit) 404 adds the value of the loss term and the value of the regularization term to obtain the value of the evaluation function. If the end condition is not satisfied, the parameter update unit 405 updates the feature transformation matrix so that the value of the evaluation function decreases, and the value of the evaluation function is calculated again. If the end condition is satisfied, the parameter update unit 405 outputs the optimized, sparsified feature transformation matrix via the parameter output unit 406.

(Parameter update unit)
FIG. 5A is a block diagram illustrating the configuration of the parameter update unit 405 according to the present embodiment. FIG. 5A shows a configuration in which the end condition is that the change in the evaluation function value is smaller than a threshold, which is taken to mean convergence to the optimum value. The end condition is not limited to this, however; the number of updates may be used instead.

The parameter update unit 405 includes a feature transformation matrix update unit 501, an evaluation function value storage unit 502, an evaluation function change value calculation unit 503, and an end condition determination unit 504. The feature transformation matrix update unit 501 receives the initial value or the feature transformation matrix being learned, and updates the elements of the feature transformation matrix so that the value of the evaluation function decreases. The evaluation function value storage unit 502 stores the evaluation function value before the update. The evaluation function change value calculation unit 503 calculates the change (decrease) from the evaluation function value before the update to the evaluation function value after the update. The end condition determination unit 504 compares the change (decrease) with a threshold α. If the change (decrease) is smaller than α, it ends the parameter update and sends the updated feature transformation matrix to the parameter output unit 406. If the change (decrease) in the evaluation function value is equal to or greater than α, the updated feature transformation matrix is returned to the loss calculation unit 402 and the regularization calculation unit 403, and the pattern learning process continues.
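The control flow of units 501 to 504 can be illustrated with a short Python sketch. It is only a schematic reading of the description above: `evaluate` and `update` are hypothetical callables standing in for the evaluation function calculation and the matrix update, `max_updates` is an assumed safety cap that the description does not mention, and the toy usage minimizes a one-dimensional quadratic rather than a real evaluation function.

```python
def run_updates(B0, evaluate, update, alpha=1e-6, max_updates=1000):
    """Update B until the decrease in the evaluation value falls below alpha.

    evaluate(B) -> float and update(B) -> new B are hypothetical callables
    supplied by the caller; they are not defined in the source document.
    """
    B = B0
    previous = evaluate(B)             # value before the update (unit 502)
    for _ in range(max_updates):       # assumed cap on the number of updates
        B = update(B)                  # unit 501: try to lower the value
        current = evaluate(B)
        decrease = previous - current  # unit 503: change (decrease) value
        if decrease < alpha:           # unit 504: end condition satisfied
            break
        previous = current             # otherwise continue learning
    return B


# Toy usage: minimize (b - 3)^2 with a fixed-step gradient update.
result = run_updates(
    0.0,
    evaluate=lambda b: (b - 3.0) ** 2,
    update=lambda b: b - 0.1 * 2.0 * (b - 3.0),
)
print(result)
```

As in the description, the matrix returned on termination is the updated one, even when the last update produced only a small decrease.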

(Parameter table)
FIG. 5B is a diagram showing the configuration of the parameter table 510 in the parameter update unit 405 according to the present embodiment. The parameter table 510 is used by the parameter update unit 405 to hold data during learning.

The parameter table 510 stores a previous feature transformation matrix 511, an updated feature transformation matrix 512, a previous evaluation function calculated value 513, a new evaluation function calculated value 514, an evaluation function change value 515, a threshold α 516, and an end condition result 517.

<<Hardware configuration of pattern recognition device>>
FIG. 7 is a block diagram illustrating the hardware configuration of the pattern recognition device 200 including the pattern learning unit 240 according to the present embodiment. If only the elements of FIG. 7 related to the pattern learning unit 240 are selected, the configuration operates as a pattern learning device.

In FIG. 7, a CPU (Central Processing Unit) 710 is a processor for arithmetic control, and realizes the functional components of the pattern recognition device 200 of FIG. 2, or those of the pattern learning unit 240, by executing programs. A ROM (Read Only Memory) 720 stores fixed data such as initial data, and programs. A communication control unit 730 receives patterns to be recognized via a network and transmits recognition results. The communication control unit 730 is also used to acquire the discriminant function, the evaluation function, or programs. The CPU 710 is not limited to a single CPU; there may be a plurality of CPUs, and a GPU (Graphics Processing Unit) for image processing may be included. The communication control unit 730 preferably has a CPU independent of the CPU 710 and writes transmission/reception data to, or reads it from, an area of a RAM (Random Access Memory) 740. It is also desirable to provide a DMAC (Direct Memory Access Controller) that transfers data between the RAM 740 and a storage 750 (not shown). Furthermore, an input/output interface 760 preferably has a CPU independent of the CPU 710 and writes input/output data to, or reads it from, an area of the RAM 740. The CPU 710 therefore processes data upon recognizing that the data has been received in or transferred to the RAM 740. The CPU 710 also prepares processing results in the RAM 740 and leaves the subsequent transmission or transfer to the communication control unit 730, the DMAC, or the input/output interface 760.

The RAM 740 is a random access memory that the CPU 710 uses as a work area for temporary storage. An area for storing the data necessary to realize the present embodiment is reserved in the RAM 740. The learning input vectors 741 are the vectors used for learning by the pattern learning unit 240 of the pattern recognition device 200. The new evaluation function calculated value 514 is the evaluation function value calculated from the current parameters 744 being updated, and includes a loss value and a regularization value. The previous evaluation function calculated value 513 is the evaluation function value calculated from the parameters of the discriminant function before the update. The current parameters 744 are the parameters of the discriminant function being updated, and become the final optimum parameters when the end condition is satisfied. The evaluation function change value 745 is the change (decrease) from the previous evaluation function calculated value 513 to the new evaluation function calculated value 514. The threshold 746 is the value compared with the evaluation function change value 745 as the end condition. The end condition flag 747 indicates termination when the evaluation function change value 745 is smaller than the threshold 746, and continuation when it is equal to or greater than the threshold 746. When the end condition is a number of updates, the update count and the count used as the threshold are stored instead.

The storage 750 stores a database, various parameters, and the following data and programs necessary for realizing the present embodiment. The discriminant function 751 is the definition of the function for class identification used by the pattern recognition device 200, which includes the feature transformation matrix and the reference vectors. The evaluation function 752 is the definition of a function comprising the definition of the loss term and the definition of the regularization term. The initial value calculation algorithm 753 is an algorithm that generates the initial values of the feature transformation matrix and the reference vectors, which are the parameters of the discriminant function 751. The initial values 754 are the initial values of the feature transformation matrix and the reference vectors generated according to the initial value calculation algorithm 753. The update values 755 are the values of the feature transformation matrix and the reference vectors as updated by the processing of the pattern learning unit 240.

The storage 750 stores the following programs. The pattern recognition program 756 is a program that executes pattern recognition by the pattern recognition device 200. The pattern learning module 757 is a module that implements the processing of the pattern learning unit 240. The evaluation function calculation module 758 is a module that evaluates pattern recognition by the discriminant function using the evaluation function of the present embodiment. The parameter update module 759 is a module that updates the parameters, which in the present embodiment means the feature transformation matrix, according to the evaluation result of the evaluation function calculation module 758.

The input/output interface 760 handles input/output data exchanged with input/output devices. Connected to the input/output interface 760 are a pattern input unit 761, which inputs patterns for recognition or learning to the pattern recognition device 200, and a recognition result output unit 762, which outputs recognition results. A display unit, an operation unit, and the like may also be connected, but are omitted here.

Note that programs and data related to general-purpose functions and other realizable functions of the pattern recognition device 200 are not shown in the RAM 740 and the storage 750 of FIG. 7.

<<Processing procedure of pattern recognition device>>
FIG. 8 is a flowchart showing the processing procedure of the pattern recognition device 200 according to the present embodiment. This flowchart is executed by the CPU 710 of FIG. 7 using the RAM 740, and realizes the functional components of the pattern recognition device 200 of FIG. 2.

In step S801, the pattern recognition device 200 generates initial values of the feature transformation matrix and the reference vectors, which are the parameters of the discriminant function. For the feature transformation matrix, an initial value for which the evaluation function takes a sufficiently large value is obtained using the learning data and held in the recognition dictionary 220. In step S803, the pattern recognition device 200 performs pattern learning, in which the feature transformation matrix is updated using the learning input vectors so as to reduce the evaluation function, and stores the optimized feature transformation matrix in the recognition dictionary 220. In step S805, the pattern recognition device 200 executes pattern recognition (identification of the class containing the reference vector at the shortest distance) based on the discriminant function, using the optimized feature transformation matrix and the reference vectors stored in the recognition dictionary 220 as parameters.

In step S807, the pattern recognition device 200 determines whether any further target patterns remain to be recognized. If a target pattern remains, the pattern recognition of step S805 is repeated. When no target patterns remain, the process ends.

In FIG. 8, the pattern learning process is performed only once at the beginning, but it may be performed again partway through the pattern recognition process. In that case, it is performed at fixed time intervals, after every given number of recognition operations, or when a drop in the pattern recognition rate is detected, using recent input vectors of pattern recognition targets.

(Pattern learning process)
FIG. 9 is a flowchart showing the procedure of the pattern learning process (S803) according to the present embodiment. This flowchart is executed by the CPU 710 of FIG. 7 using the RAM 740, and realizes the functional components of the pattern learning unit 240 of FIG. 4.

In step S901, the pattern recognition device 200 acquires from the recognition dictionary 220 the initial values of the feature transformation matrix and the reference vectors, which are the parameters of the discriminant function. The initial value is set to one for which the value of the evaluation function is sufficiently large. In step S903, the pattern recognition device 200 acquires the learning input vectors used to optimize the parameters of the discriminant function for the pattern recognition target. The learning input vectors may be provided from outside, or may be stored in the recognition dictionary 220 in association with the pattern recognition target.

In step S905, the pattern recognition device 200 calculates the discriminant function using the learning input vectors, performs class identification, and accumulates the losses to obtain the value of the loss term of the evaluation function, which corresponds to recognition errors. In step S907, the pattern recognition device 200 calculates the value of the regularization term, which is defined by a ratio of norms over the column elements of the feature transformation matrix. In step S909, the pattern recognition device 200 obtains the value of the evaluation function as the sum of the loss term value and the regularization term value.

In step S911, the pattern recognition device 200 compares the initial or previous value of the evaluation function with the new value calculated in step S909. If the new value is smaller than the initial or previous value and, in this example, the decrease is at least the threshold α, the process proceeds to step S913. In step S913, the pattern recognition device 200 holds the new evaluation function value in the recognition dictionary 220 as the comparison reference, updates the feature transformation matrix so that the evaluation function value decreases further, and returns to step S905. If the new value is not smaller than the initial or previous value, or the decrease is less than the threshold α, the process proceeds to step S915. In step S915, the pattern recognition device 200 outputs the current, converged updated parameters to the recognition dictionary 220 and ends the processing of the pattern learning unit 240.

Note that the end condition in step S911 may instead be the number of updates, or some other end condition may be set.

<<Effect of regularization based on norm ratio>>
FIG. 10 is a diagram illustrating an example of regularization based on the norm ratio according to the present embodiment. The effect of defining the regularization term by a norm ratio in the present embodiment is described with reference to FIG. 10. As explained earlier, when the loss term of the discriminant function is defined so that it takes the same value even if the feature transformation matrix is multiplied by a constant, the regularization term defined by the norm ratio is likewise unchanged, so learning moves stably toward the minimizing solution. FIG. 10 is used to explain a further effect of the regularization term defined by the norm ratio. FIG. 10 shows an example in which the norm has two elements, but the same effect is obtained with more than two elements.

In FIG. 10, the contour lines of the norm ratio are drawn on the XY plane, and the Z axis indicates the value of the norm ratio. Because the norm ratio takes its minimum value of 1 on the X axis or the Y axis, minimizing the norm ratio drives either X or Y to 0 regardless of the starting point, as indicated by the circles and arrows in the figure.

When the norm ratio is ||θ||₁/||θ||₂, as in FIG. 10, it takes values in the following range. When X and Y are equal, or when the difference between X and Y is very small compared with their values, the norm ratio reaches its maximum of 2/2^(1/2) = 2^(1/2) ≈ 1.4. When either X or Y is 0 or close to 0, the norm ratio is 1/1^(1/2) = 1. In other words, the norm ratio lies in the range 1.4 ≥ norm ratio ≥ 1; as either X or Y approaches 0 the norm ratio decreases toward 1, and it reaches 1 when either X or Y becomes 0.
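The two end points of this range can be checked directly for the two-element case. The following short derivation assumes θ = (X, Y) with the usual definitions of the L1 and L2 norms; it is a worked check, not a formula taken from the publication.

```latex
\frac{\|\theta\|_1}{\|\theta\|_2} = \frac{|X| + |Y|}{\sqrt{X^2 + Y^2}}, \qquad
\left.\frac{\|\theta\|_1}{\|\theta\|_2}\right|_{X = Y \neq 0}
  = \frac{2|X|}{\sqrt{2}\,|X|} = \sqrt{2} \approx 1.41, \qquad
\left.\frac{\|\theta\|_1}{\|\theta\|_2}\right|_{Y = 0,\; X \neq 0}
  = \frac{|X|}{|X|} = 1 .
```

Both values are independent of the magnitude of X, which reflects the invariance of the ratio under multiplication of θ by a constant.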

Therefore, when the feature transformation matrix, one of the parameters of the discriminant function, is updated in the present embodiment, the norm ratio cannot decrease any further once all elements but one have become 0, so the elements are not all driven to 0 without limit. By contrast, when the norm ratio is ||θ||₂/||θ||₁, it lies in the range 1 ≥ norm ratio > 0.7 (≈ 2^(1/2)/2 = 1/2^(1/2)), and it rises toward 1 as either X or Y approaches 0. Thus, for element values to become 0 and the parameters to be sparsified, it is desirable that v < w, where the numerator of the norm ratio is the Lv norm and the denominator is the Lw norm. In the present embodiment, minimization of the evaluation function is the learning criterion, in which case sparsification is not possible unless v < w. Conversely, when maximization of the evaluation function is the learning criterion, sparsification is not possible unless v > w.

<<Specific configuration>>
Next, a specific configuration and operation of pattern recognition and pattern learning are described. Here, the d-dimensional input vectors used as learning data are written {x_n, t_n | n = 1, …, N}, and the d-dimensional reference vectors used by the pattern classifier are written {y_k | k = 1, …, K}. x_n is the n-th sample, t_n is the correct class of x_n, N is the number of samples, and K is the number of classes.

The discriminant function of class ω_k is defined by the following formula (Formula 7).

Here, B is a (p × d) feature transformation matrix, and the input x is determined to belong to the class for which d_k(x) (the squared distance between the input vector and the reference vector) is smallest.
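The decision rule described here can be sketched as follows. Because Formula 7 itself is not reproduced in this text, the sketch assumes the form d_k(x) = ||B(x − y_k)||², which is consistent with B being (p × d) and with x and y_k both being d-dimensional; the function names `squared_distance` and `classify` are hypothetical.

```python
def squared_distance(B, x, y):
    """||B (x - y)||^2 for a p x d matrix B (list of rows) and d-dim x, y."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    projected = [sum(b * d for b, d in zip(row, diff)) for row in B]
    return sum(v * v for v in projected)


def classify(B, x, references):
    """Index k of the reference vector y_k with the smallest d_k(x)."""
    distances = [squared_distance(B, x, y) for y in references]
    return min(range(len(references)), key=distances.__getitem__)


B = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]  # p = 2, d = 3: the third input feature has a zero column
references = [[0.0, 0.0, 0.0], [4.0, 4.0, 0.0]]  # one reference vector per class

print(classify(B, [1.0, 0.5, 9.0], references))
```

In this toy matrix the third column is zero, so the third input feature has no influence on the decision; this is the kind of feature selection that a sparsified feature transformation matrix provides.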

The evaluation function is defined by the following formula (Formula 8).

The first term is the loss term, the second term is the regularization term, and λ > 0 is the weight of the regularization term.
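Formula 8 is not reproduced in this text. A form consistent with the description (a loss accumulated over the learning samples plus a weighted norm-ratio regularization term) would be the following; the symbols L and ℓ and the summation over n are assumptions made here for illustration.

```latex
L(\theta) = \sum_{n=1}^{N} \ell(x_n; \theta)
          + \lambda \, \frac{\|\theta\|_1}{\|\theta\|_2},
\qquad \lambda > 0
```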

The loss for the input x_n is defined by the following formula (Formula 9).

Here, 1( ) is an indicator function that returns 1 if the expression in parentheses is true and 0 if it is false, f( ) is a monotonically increasing function, and r_kj( ) is a quantity, defined by the following formula (Formula 10-1), that expresses how easily a misclassification occurs. r_kj( ) is a quantity that does not change when the feature transformation matrix is multiplied by a constant, and may instead be, for example, (Formula 10-2) or (Formula 10-3).
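The invariance property stated for r_kj( ) can be illustrated numerically. Formulas 10-1 to 10-3 are not reproduced in this text, so the sketch below uses one well-known quantity with this property, the relative distance difference (d_k − d_j)/(d_k + d_j); whether any of the patent's formulas has exactly this form is an assumption. The squared distance is likewise assumed to be ||B(x − y)||², and all function names are hypothetical.

```python
def squared_distance(B, x, y):
    """Assumed distance ||B (x - y)||^2 for a matrix B given as a list of rows."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    projected = [sum(b * d for b, d in zip(row, diff)) for row in B]
    return sum(v * v for v in projected)


def relative_difference(B, x, y_k, y_j):
    """(d_k - d_j) / (d_k + d_j): positive when x is closer to class j than k."""
    d_k = squared_distance(B, x, y_k)
    d_j = squared_distance(B, x, y_j)
    return (d_k - d_j) / (d_k + d_j)


def scaled(B, c):
    """Return c * B."""
    return [[c * b for b in row] for row in B]


B = [[1.0, 2.0], [0.0, 1.0]]
x, y_k, y_j = [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]

r_original = relative_difference(B, x, y_k, y_j)
r_scaled = relative_difference(scaled(B, 5.0), x, y_k, y_j)
print(r_original, r_scaled)  # the two values agree up to rounding
```

Scaling B by c multiplies both squared distances by c², which cancels in the ratio. A loss built from such a quantity, combined with a norm-ratio regularizer, therefore cannot be lowered merely by shrinking or enlarging B.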

||θ||₁ and ||θ||₂ are the L1 norm and the L2 norm of the parameter θ, respectively, and are defined by the following formulas (Formula 11 and Formula 12) using the elements b_ij of the feature transformation matrix.
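Formulas 11 and 12 are not reproduced in this text. If the norms are taken over the lengths of the column vectors of B, as in the description of the regularization calculation unit 403 with v = 1 and w = 2, one plausible reading is the following; this is an assumption, and the index ranges (i over the p rows, j over the d columns) are chosen to match the (p × d) shape of B.

```latex
\|\theta\|_1 = \sum_{j=1}^{d} \Big( \sum_{i=1}^{p} b_{ij}^{2} \Big)^{1/2},
\qquad
\|\theta\|_2 = \Big( \sum_{j=1}^{d} \sum_{i=1}^{p} b_{ij}^{2} \Big)^{1/2}
```

Under this reading, driving a whole column of B to zero removes the corresponding input feature, so minimizing the ratio performs feature selection.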

First, as initialization, the mean of the input vectors of each class is set as the reference vector of that class. For the feature transformation matrix, p eigenvectors φ_i obtained by principal component analysis are selected in descending order of eigenvalue, and B = (φ₁, …, φ_p)^T is set.
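The reference-vector part of this initialization can be sketched as below. The function name `class_means` and the dictionary representation are hypothetical, and the principal component analysis used to initialize B is deliberately left out of the sketch.

```python
def class_means(samples, labels):
    """Mean input vector per class label, usable as initial reference vectors."""
    sums, counts = {}, {}
    for x, t in zip(samples, labels):
        acc = sums.setdefault(t, [0.0] * len(x))  # running sum for class t
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[t] = counts.get(t, 0) + 1
    return {t: [s / counts[t] for s in acc] for t, acc in sums.items()}


samples = [[0.0, 0.0], [2.0, 2.0], [10.0, 0.0]]
labels = [0, 0, 1]
print(class_means(samples, labels))
```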

Then, the value of the evaluation function given by (Formula 8) is calculated, and the feature transformation matrix is updated so that this value decreases. For example, with the steepest descent method, every element of the feature transformation matrix is updated according to the following formula (Formula 13).
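Formula 13 is not reproduced in this text. The generic steepest descent update for an element b_ij has the form below, where L(θ) denotes the evaluation function and ε is a step size; both symbols are assumptions introduced here, and the exact form used in the publication may differ.

```latex
b_{ij} \leftarrow b_{ij} - \varepsilon \, \frac{\partial L(\theta)}{\partial b_{ij}},
\qquad \varepsilon > 0
```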

After all elements of the feature transformation matrix have been updated, the evaluation function is recalculated, and this process is repeated. As the end condition, the number of iterations may be fixed in advance, or the process may end when the change in the evaluation function falls to or below a certain value.

The following briefly describes the configuration and operation when the above specific definitions are applied to the pattern recognition device 200 and the pattern learning unit 240. The reference numerals and step numbers are the same as those in the functional configuration diagrams of FIGS. 2 and 4 and the flowcharts of FIGS. 8 and 9, with the above specific definitions inserted.

（パターン認識装置の具体例）
図１１Ａは、本実施形態に係るパターン学習部２４０を含むパターン認識装置２００の具体的な構成を示すブロック図である。 (Specific example of pattern recognition device)
FIG. 11A is a block diagram illustrating a specific configuration of the pattern recognition apparatus 200 including the pattern learning unit 240 according to the present embodiment.

パラメータ初期値生成部２１０は、初期化として、クラスごとの入力ベクトルの平均を参照ベクトルｙ_kとして設定し、特徴変換行列については、主成分分析で得られる固有ベクトルφ_iを、固有値の大きい順にｐ個選んでＢ₀＝(φ₁,…,φ_p)^Tと設定する。 As initialization, the parameter initial value generation unit 210 sets the mean of the input vectors of each class as the reference vector y_k. For the feature transformation matrix, it selects p eigenvectors φ_i obtained by principal component analysis in descending order of eigenvalue and sets B₀ = (φ₁, …, φ_p)^T.

認識辞書２２０は、初期値として、Ｂ₀とｙ_kとを保持し、パターン学習により特徴変換行列Ｂ₀を更新して、最適なパターン認識が可能な値に収束した特徴変換行列Ｂ_zを格納する。 The recognition dictionary 220 holds B₀ and y_k as initial values. The feature transformation matrix B₀ is updated by pattern learning, and the dictionary stores the feature transformation matrix B_z that has converged to a value enabling optimum pattern recognition.

クラス識別部２３０は、識別関数（数式７）を用いて、パターン学習中は学習用入力ベクトルに基づいて、パターン認識中は認識対象パターンの入力ベクトルに基づいて、距離が最短の参照ベクトルを含むクラスへのクラス識別を行ない、パターン学習部２４０のクラス識別結果と損失とを通知する。一方、パターン認識中はクラス識別結果を外部に出力する。 Using the discriminant function (Equation 7), the class identification unit 230 assigns an input to the class containing the reference vector at the shortest distance, based on the learning input vectors during pattern learning and on the input vector of the recognition target pattern during pattern recognition. During pattern learning, it notifies the pattern learning unit 240 of the class identification result and the loss. During pattern recognition, it outputs the class identification result to the outside.
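Nearest-reference-vector classification can be sketched as below. Equation 7 is not reproduced in this text, so the sketch assumes the distance is the squared Euclidean length of B(x − y_k), with the reference vectors y_k given in the input space. The function names and the dictionary layout of the reference vectors are hypothetical.

```python
def transform(B, x):
    """Apply the feature transformation matrix B (list of rows) to vector x."""
    return [sum(b * v for b, v in zip(row, x)) for row in B]

def classify(B, refs, x):
    """Return (label, squared distance) of the closest reference vector.

    refs maps each class label to its reference vector y_k.
    """
    best = None
    for label, y in refs.items():
        diff = [a - b for a, b in zip(x, y)]
        z = transform(B, diff)
        distance = sum(t * t for t in z)
        if best is None or distance < best[1]:
            best = (label, distance)
    return best
```

With B = [[1, 0]], the second input element has no effect on the result, which illustrates how a zero column in B acts as feature selection.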

パターン学習部２４０は、認識辞書２２０からパラメータである特徴変換行列の初期値Ｂ₀を取得して、クラス識別部２３０からのクラス識別結果と損失とを取得する。そして、繰り返し特徴変換行列の要素を評価関数（数式８）の値が小さくなるように変更し（数式１３参照）、収束した時点における特徴変換行列を求めて認識辞書２２０に保存する。 The pattern learning unit 240 acquires the initial value B₀ of the feature transformation matrix, which is a parameter, from the recognition dictionary 220, and acquires the class identification result and the loss from the class identification unit 230. It then repeatedly changes the elements of the feature transformation matrix so that the value of the evaluation function (Equation 8) becomes smaller (see Equation 13), and stores the feature transformation matrix obtained at convergence in the recognition dictionary 220.

（パターン学習部の具体例）
図１１Ｂは、本実施形態に係るパターン学習部２４０の具体的な構成を示すブロック図である。 (Specific example of pattern learning unit)
FIG. 11B is a block diagram illustrating a specific configuration of the pattern learning unit 240 according to the present embodiment.

初期値入力部４０１は、特徴変換行列および参照ベクトルの初期値（Ｂ₀、ｙ_k）を入力する。損失計算部４０２において、選択部４２１は、特徴変換行列の初期値入力Ｂ₀と更新中の特徴変換行列の入力とを選択する。識別関数演算部４２２は、特徴変換行列および参照ベクトルを使用して、学習用の入力ベクトルから識別関数（数式７）を演算して最小距離の参照ベクトルに基づいて識別クラスを判別する。そして、損失算出部４２３は、識別クラスの判別の正否と間違いの程度を累積した、損失項（数式９参照）の値を算出する。 The initial value input unit 401 inputs the initial values (B₀, y_k) of the feature transformation matrix and the reference vectors. In the loss calculation unit 402, the selection unit 421 selects between the initial value input B₀ of the feature transformation matrix and the input of the feature transformation matrix being updated. The discriminant function calculation unit 422 evaluates the discriminant function (Equation 7) on the learning input vectors using the feature transformation matrix and the reference vectors, and determines the identified class from the reference vector at the minimum distance. The loss calculation unit 423 then calculates the value of the loss term (see Equation 9), which accumulates whether each class determination was correct and the degree of each error.

正則化計算部４０３において、選択部４３１は、特徴変換行列の初期値入力と更新中の特徴変換行列の入力とを選択する。Ｌ1ノルム算出部４３２は、（数式１１）に従ってＬ1ノルムを算出する。Ｌ2ノルム算出部４３３は、（数式１２）に従ってＬ2ノルムを算出する。Ｌ1／Ｌ2算出部４３４は、正則化項の値としてノルム比（数式６参照）を算出する。 In the regularization calculation unit 403, the selection unit 431 selects an initial value input of the feature transformation matrix and an input of the feature transformation matrix being updated. The L1 norm calculation unit 432 calculates the L1 norm according to (Formula 11). The L2 norm calculation unit 433 calculates the L2 norm according to (Equation 12). The L1 / L2 calculation unit 434 calculates a norm ratio (see Equation 6) as the value of the regularization term.
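The L1/L2 norm ratio computed by units 432 to 434 can be sketched as follows, assuming the norms are taken over the lengths of the column vectors of B (the column-wise Group Lasso form of this embodiment). The function name and the zero-matrix check are illustrative additions.

```python
import math

def column_norm_ratio(B):
    """L1 norm / L2 norm over the column-vector lengths of matrix B (list of rows)."""
    columns = list(zip(*B))
    lengths = [math.sqrt(sum(b * b for b in col)) for col in columns]
    l1 = sum(lengths)                            # sum of column lengths
    l2 = math.sqrt(sum(t * t for t in lengths))  # root of summed squared lengths
    if l2 == 0.0:
        raise ValueError("norm ratio is undefined for an all-zero matrix")
    return l1 / l2
```

The ratio equals 1 when only one column is non-zero and grows as more columns carry weight, so reducing it favors sparse columns. It is also unchanged when B is multiplied by a constant, which is the property that matters when the loss term is itself invariant to such scaling.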

評価関数値算出部（加算部）４０４は、損失項の値と正則化項の値とを加算して、評価関数（数式８）の値を算出する。パラメータ更新部４０５は、終了条件を満たさなければ、評価関数の値が減るように特徴変換行列を更新して（数式１３参照）、再度、評価関数の値を算出する。パラメータ更新部４０５は、終了条件を満たせば、パラメータ出力部４０６を経由して最適化されスパース化された特徴変換行列Ｂ_zを出力する。 The evaluation function value calculation unit (addition unit) 404 adds the value of the loss term and the value of the regularization term to calculate the value of the evaluation function (Formula 8). If the end condition is not satisfied, the parameter update unit 405 updates the feature transformation matrix so that the value of the evaluation function is reduced (see Equation 13), and calculates the value of the evaluation function again. If the end condition is satisfied, the parameter update unit 405 outputs the optimized and sparse feature transformation matrix B _z via the parameter output unit 406.

（パターン認識装置の処理手順）
図１２Ａは、本実施形態に係るパターン学習部２４０を含むパターン認識装置２００の具体的な処理手順を示すフローチャートである。 (Processing procedure of pattern recognition device)
FIG. 12A is a flowchart showing a specific processing procedure of the pattern recognition apparatus 200 including the pattern learning unit 240 according to the present embodiment.

パターン認識装置２００は、ステップＳ８０１において、特徴変換行列および参照ベクトルの初期値（Ｂ₀、ｙ_k）を生成して、認識辞書２２０に保持する。ステップＳ８０３において、学習用の入力ベクトルを用いて評価関数（数式８）を小さくするように特徴変換行列を更新するパターン学習を行ない、最適化した特徴変換行列Ｂ_zを認識辞書２２０に格納する。そして、ステップＳ８０５において、認識辞書２２０に格納された最適化された特徴変換行列Ｂ_zと参照ベクトルｙ_kとをパラメータとして用いて、識別関数（数式７）に基づきパターン認識（最短距離の参照ベクトルに基づくクラス識別）を実行する。 In step S801, the pattern recognition apparatus 200 generates the initial values (B₀, y_k) of the feature transformation matrix and the reference vectors, and holds them in the recognition dictionary 220. In step S803, pattern learning is performed in which the feature transformation matrix is updated using the learning input vectors so as to reduce the evaluation function (Equation 8), and the optimized feature transformation matrix B_z is stored in the recognition dictionary 220. In step S805, pattern recognition (class identification based on the reference vector at the shortest distance) is executed based on the discriminant function (Equation 7), using as parameters the optimized feature transformation matrix B_z and the reference vectors y_k stored in the recognition dictionary 220.

（パターン学習処理）
図１２Ｂは、本実施形態に係るパターン学習処理（Ｓ８０３）の具体的な手順を示すフローチャートである。 (Pattern learning process)
FIG. 12B is a flowchart showing a specific procedure of the pattern learning process (S803) according to the present embodiment.

パターン認識装置２００は、ステップＳ９０１において、特徴変換行列および参照ベクトルの初期値（Ｂ₀、ｙ_k）を認識辞書２２０から取得する。ステップＳ９０３において、学習用の入力ベクトル（ｘ₁、ｘ₂、…、ｘ_N）を取得する。ステップＳ９０５において、入力ベクトル（ｘ₁、ｘ₂、…、ｘ_N）を用いて識別関数（数式７）を計算してクラス識別処理をし、損失を累積して認識誤りに相当する評価関数の損失項（数式８の右辺第１項）の値を算出する。ステップＳ９０７において、特徴変換行列の列要素によるノルムの比で定義される正則化項（数式８の右辺第２項）の値を算出する。ステップＳ９０９において、損失項の値と正則化項の値との総和から評価関数（数式８）の値を求める。 In step S901, the pattern recognition apparatus 200 acquires the initial values (B₀, y_k) of the feature transformation matrix and the reference vectors from the recognition dictionary 220. In step S903, the learning input vectors (x₁, x₂, …, x_N) are acquired. In step S905, the discriminant function (Equation 7) is calculated using the input vectors (x₁, x₂, …, x_N) to perform class identification, and the losses are accumulated to calculate the value of the loss term of the evaluation function (the first term on the right side of Equation 8), which corresponds to the recognition error. In step S907, the value of the regularization term (the second term on the right side of Equation 8), defined by the ratio of norms of the column elements of the feature transformation matrix, is calculated. In step S909, the value of the evaluation function (Equation 8) is obtained as the sum of the loss term value and the regularization term value.

パターン認識装置２００は、ステップＳ９１１において、初期値あるいは以前の評価関数の値Ｌ_iと、ステップＳ９０９で算出した新たな評価関数の値Ｌ_i+1を比較する。初期値あるいは以前の評価関数の値Ｌ_iより新たな評価関数の値Ｌ_i+1が小さく、かつ、本例では閾値αよりも小さくなっていれば、ステップＳ９１３に進む。ステップＳ９１３において、新たな評価関数の値を認識辞書２２０に比較基準として保持すると共に、評価関数の値がさらに小さくなるように特徴変換行列を更新して（数式１３参照）、ステップＳ９０５に戻る。初期値あるいは以前の評価関数の値より新たな評価関数の値が小さくない、または、閾値αよりも小さくなっていなければ、ステップＳ９１５において、現在の収束した特徴変換行列Ｂzを認識辞書２２０に出力し、パターン学習部２４０の処理を終了する。 In step S911, the pattern recognition apparatus 200 compares the initial value or the previous value L_i of the evaluation function with the new value L_{i+1} calculated in step S909. If the new value L_{i+1} is smaller than the initial value or the previous value L_i and, in this example, is also smaller than the threshold α, the process proceeds to step S913. In step S913, the new value of the evaluation function is held in the recognition dictionary 220 as the comparison reference, the feature transformation matrix is updated so that the value of the evaluation function becomes still smaller (see Equation 13), and the process returns to step S905. If the new value is not smaller than the initial value or the previous value, or is not smaller than the threshold α, then in step S915 the current, converged feature transformation matrix B_z is output to the recognition dictionary 220, and the processing of the pattern learning unit 240 ends.

すなわち、本実施形態においては、特徴変換行列の列ベクトルをGroup Lassoによってスパース化するため、特徴選択と特徴変換行列を同時に最適化でき、認識精度がより向上する。特に、損失項が特徴変換行列の定数倍に対して不変であっても、正則化項をノルム比で定義することにより、評価関数を最小化する解に到達でき、特徴選択と特徴変換の同時最適化が行える。 That is, in this embodiment, the column vectors of the feature transformation matrix are sparsified by Group Lasso, so feature selection and the feature transformation matrix can be optimized simultaneously, and recognition accuracy is further improved. In particular, even if the loss term is invariant to a constant multiple of the feature transformation matrix, defining the regularization term as a norm ratio makes it possible to reach a solution that minimizes the evaluation function, so that feature selection and feature transformation can be optimized simultaneously.

本実施形態のパターン認識装置およびパターン学習部は、認識誤りに相当する量として計算される損失項と、辞書の要素値で定義されるノルム比で計算される正則化項の和が減るように特徴変換行列を更新するよう動作する。このような構成を採用し、特徴選択と特徴変換を同時最適化することにより、認識精度を改善することができる。 The pattern recognition apparatus and the pattern learning unit of the present embodiment reduce the sum of a loss term calculated as an amount corresponding to a recognition error and a regularization term calculated by a norm ratio defined by a dictionary element value. Operates to update the feature transformation matrix. By adopting such a configuration and simultaneously optimizing feature selection and feature conversion, recognition accuracy can be improved.

［第３実施形態］
次に、本発明の第３実施形態に係るパターン学習部を含むパターン認識装置について説明する。本実施形態に係るパターン学習部を含むパターン認識装置は、上記第２実施形態と比べると、特徴変換行列の列ベクトルおよび行ベクトルをGroup Lassoによってスパース化する点で異なる。すなわち、本実施形態の正則化項は、列ベクトルを要素とするノルム比と、行ベクトルを要素とするノルム比とを含む。その他の構成および動作は、第２実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。 [Third Embodiment]
Next, a pattern recognition apparatus including a pattern learning unit according to the third embodiment of the present invention will be described. The pattern recognition apparatus including the pattern learning unit according to the present embodiment differs from the second embodiment in that both the column vectors and the row vectors of the feature transformation matrix are sparsified by Group Lasso. That is, the regularization term of this embodiment includes a norm ratio whose elements are the column vectors and a norm ratio whose elements are the row vectors. Since other configurations and operations are the same as those of the second embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.

本実施形態においては、第２実施形態と同じ識別関数を用い、評価関数を次式（数式１４）で定義する。

第１項は損失項、第２項および第３項は正則化項であり、λ＞0およびη＞0は正則化項の重みである。第１項に含まれる入力ｘ_nに対する損失には、前出の（数式９）を用い、第２項には（数式１１）および（数式１２）を用いる。 In this embodiment, the same discriminant function as in the second embodiment is used, and the evaluation function is defined by the following formula (Formula 14).

The first term is a loss term, the second and third terms are regularization terms, and λ> 0 and η> 0 are the weights of the regularization terms. For the loss with respect to the input x _n included in the first term, (Equation 9) is used, and (Equation 11) and (Equation 12) are used for the second term.

第３項は、次式（数式１５および数式１６）のように定義する。

The third term is defined by the following equations (Equation 15 and Equation 16).
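The images for Equations 14 to 16 are not reproduced in this text. A hedged reconstruction, consistent with the three terms named above and with the expression λ(||θ||₁/||θ||₂) + η(||ξ||₁/||ξ||₂) computed in step S1407, is sketched below. The loss symbol ℓ (standing for the loss of Equation 9), the sample count N, and the index ranges p and d are notational assumptions.

```latex
% Equation 14 (assumed form): loss term plus column-wise and row-wise norm ratios
L(\theta) = \sum_{n=1}^{N} \ell(x_n; \theta)
          + \lambda \frac{\|\theta\|_1}{\|\theta\|_2}
          + \eta \frac{\|\xi\|_1}{\|\xi\|_2}

% Equations 15 and 16 (assumed form): norms over the row vectors of B
\|\xi\|_1 = \sum_{i=1}^{p} \sqrt{\sum_{j=1}^{d} b_{ij}^{2}},
\qquad
\|\xi\|_2 = \sqrt{\sum_{i=1}^{p} \sum_{j=1}^{d} b_{ij}^{2}}
```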

本実施形態においても、第２実施形態と同様に、初期化として、クラスごとの入力ベクトルの平均を参照ベクトルｙ_kとして設定し、特徴変換行列については、主成分分析で得られる固有ベクトルφ_iを、固有値の大きい順にｐ個選んでＢ=(φ₁,…,φ_p)^Tと設定する。しかし、本実施形態においては、（数式１４）に示した評価関数の値を計算し、それの値が減少するように特徴変換行列を最急降下法などで更新した後、評価関数を計算しなおす処理を繰り返す。終了条件は、事前に繰り返し回数を決めておいてもよいし、評価関数の変化がある値以下になった時点で処理を終了しても構わない。 Also in the present embodiment, as in the second embodiment, as an initialization, the average of the input vectors for each class is set as the reference vector y _k , and the eigenvector φ _i obtained by the principal component analysis is set for the feature transformation matrix. Then, select p in descending order of eigenvalues and set B = (φ ₁ ,..., Φ _p ) ^T. However, in this embodiment, the value of the evaluation function shown in (Formula 14) is calculated, and the feature conversion matrix is updated by the steepest descent method so that the value decreases, and then the evaluation function is recalculated. Repeat the process. As the end condition, the number of repetitions may be determined in advance, or the process may be ended when the evaluation function changes below a certain value.

《パターン学習部の正則化計算部》
図１３は、本実施形態に係る正則化計算部１３０３の構成を示す図である。なお、図１３において、図４の正則化計算部４０３と同様の機能構成部には同じ参照番号を付して、説明を省略する。 <Regularization calculation part of pattern learning part>
FIG. 13 is a diagram illustrating a configuration of the regularization calculation unit 1303 according to the present embodiment. In FIG. 13, the same functional components as those of the regularization calculation unit 403 in FIG. 4 are denoted by the same reference numbers, and description thereof is omitted.

行のＬvノルム算出部１３３２は、特徴変換行列の行ベクトルの長さ（ノルム）をｖ乗して累積した後に（１／ｖ）乗したＬvノルムを算出する。行のＬwノルム算出部１３３３は、特徴変換行列の行ベクトルの長さ（ノルム）をｗ乗して累積した後に（１／ｗ）乗したＬwノルムを算出する（v＜w）。行のＬv／Ｌw算出部１３３４は、第３項の正則化項の値として、行の（Ｌvノルム／Ｌwノルム）を算出する。正則化項生成部（加算部）１３３５は、列の（Ｌvノルム／Ｌwノルム）と行の（Ｌvノルム／Ｌwノルム）とを加算して、正則化項の値とする（数式１４参照）。なお、列の（Ｌvノルム／Ｌwノルム）と行の（Ｌvノルム／Ｌwノルム）とを、評価関数値算出部（加算部）４０４において損失項と加算してもよい。 The row Lv norm calculation unit 1332 calculates the Lv norm by raising the length (norm) of each row vector of the feature transformation matrix to the power v, accumulating the results, and raising the sum to the power (1/v). The row Lw norm calculation unit 1333 calculates the Lw norm by raising the length (norm) of each row vector to the power w, accumulating the results, and raising the sum to the power (1/w), where v < w. The row Lv/Lw calculation unit 1334 calculates the row (Lv norm / Lw norm) as the value of the third term, a regularization term. The regularization term generation unit (addition unit) 1335 adds the column (Lv norm / Lw norm) and the row (Lv norm / Lw norm) to obtain the value of the regularization term (see Equation 14). Alternatively, the column and row ratios may each be added to the loss term in the evaluation function value calculation unit (addition unit) 404.
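The generalized Lv/Lw computation over rows and columns can be sketched as below. Weighting the two ratios by λ and η follows the second and third terms described for Equation 14. The function names and the default exponents v = 1, w = 2 are illustrative assumptions.

```python
import math

def group_norm(groups, q):
    """(sum of length**q over the groups) ** (1/q), with Euclidean group lengths."""
    lengths = [math.sqrt(sum(b * b for b in g)) for g in groups]
    return sum(t ** q for t in lengths) ** (1.0 / q)

def regularization_value(B, lam, eta, v=1.0, w=2.0):
    """lam * column (Lv/Lw) ratio + eta * row (Lv/Lw) ratio for matrix B."""
    if not 0.0 < v < w:
        raise ValueError("the exponents must satisfy 0 < v < w")
    rows = B
    columns = list(zip(*B))
    column_ratio = group_norm(columns, v) / group_norm(columns, w)
    row_ratio = group_norm(rows, v) / group_norm(rows, w)
    return lam * column_ratio + eta * row_ratio
```

Lowering the row ratio pushes whole rows of B toward zero, which reduces the number of output dimensions, while the column ratio does the same for input features.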

（パターン学習処理）
図１４は、本実施形態に係るパターン学習処理（Ｓ１４０３）の手順を示すフローチャートである。本実施形態においては、図１４の手順により図９の手順を代替する。このフローチャートは、図７のＣＰＵ７１０がＲＡＭ７４０を使用しながら実行し、パターン学習部の機能構成部を実現する。なお、図１４において、図９と同様のステップには同じステップ番号を付して、説明を省略する。 (Pattern learning process)
FIG. 14 is a flowchart showing the procedure of the pattern learning process (S1403) according to the present embodiment. In the present embodiment, the procedure of FIG. 9 is replaced by the procedure of FIG. This flowchart is executed by the CPU 710 in FIG. 7 using the RAM 740, and realizes a functional configuration unit of the pattern learning unit. In FIG. 14, steps similar to those in FIG. 9 are denoted by the same step numbers, and description thereof is omitted.

パターン認識装置２００は、ステップＳ１４０７において、正則化項として、｛λ（||θ||₁/||θ||₂）＋η（||ξ||₁/||ξ||₂）｝を算出する。 In step S1407, the pattern recognition apparatus 200 calculates {λ(||θ||₁/||θ||₂) + η(||ξ||₁/||ξ||₂)} as the regularization term.

本実施形態によれば、特徴変換行列の列ベクトルだけでなく行ベクトルもGroup Lassoによってスパース化する。そのため、特徴選択と特徴変換行列の最適化だけでなく、変換後のベクトル次元数も最適化できるため、よりコンパクトな特徴変換行列を作ることができる。これにより、認識精度の向上ばかりでなく、認識処理の高速化も行える。 According to the present embodiment, not only the column vectors but also the row vectors of the feature transformation matrix are sparsified by Group Lasso. Therefore, in addition to feature selection and optimization of the feature transformation matrix, the number of dimensions of the transformed vector can also be optimized, so a more compact feature transformation matrix can be created. This not only improves recognition accuracy but also speeds up recognition processing.

［第４実施形態］
次に、本発明の第４実施形態に係るパターン学習部を含むパターン認識装置について説明する。本実施形態に係るパターン学習部を含むパターン認識装置は、上記第２実施形態および第３実施形態と比べると、本実施形態の評価関数を用いて特徴変換行列と共に参照ベクトルも最適化する点で異なる。その他の構成および動作は、第２実施形態または第３実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。 [Fourth Embodiment]
Next, a pattern recognition apparatus including a pattern learning unit according to the fourth embodiment of the present invention will be described. The pattern recognition apparatus including the pattern learning unit according to the present embodiment differs from the second and third embodiments in that the reference vectors are optimized together with the feature transformation matrix using the evaluation function of the present embodiment. Since other configurations and operations are the same as those of the second embodiment or the third embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.

（認識辞書）
図１５は、本実施形態に係る認識辞書１５２０の構成を示す図である。なお、図１５において、図３と同様の構成要素には同じ参照番号を付して、説明を省略する。また、図１５には、識別関数のパラメータのみを図示し、識別関数や評価関数などは省略する。 (Recognition dictionary)
FIG. 15 is a diagram showing the configuration of the recognition dictionary 1520 according to this embodiment. In FIG. 15, the same components as those in FIG. 3 are denoted by the same reference numerals, and description thereof is omitted. FIG. 15 shows only the parameters of the discriminant function, and omits the discriminant function and the evaluation function.

認識辞書１５２０には、パターン学習部２４０で最適化した特徴変換行列と参照ベクトルからなるパラメータ更新値１５０２を記憶する。 The recognition dictionary 1520 stores a parameter update value 1502 composed of a feature transformation matrix and a reference vector optimized by the pattern learning unit 240.

（パターン学習処理）
図１６は、本実施形態に係るパターン学習処理（Ｓ１６０３）の手順を示すフローチャートである。本実施形態においては、図１６の手順により図９の手順を代替する。このフローチャートは、図７のＣＰＵ７１０がＲＡＭ７４０を使用しながら実行し、パターン学習部の機能構成部を実現する。なお、図１６において、図９と同様のステップには同じステップ番号を付して、説明を省略する。 (Pattern learning process)
FIG. 16 is a flowchart showing the procedure of the pattern learning process (S1603) according to this embodiment. In the present embodiment, the procedure of FIG. 9 is substituted by the procedure of FIG. This flowchart is executed by the CPU 710 in FIG. 7 using the RAM 740, and realizes a functional configuration unit of the pattern learning unit. In FIG. 16, steps similar to those in FIG. 9 are denoted by the same step numbers and description thereof is omitted.

パターン認識装置２００は、ステップＳ９１１において、特徴変換行列Ｂの最適値に収束すると、ステップＳ１６１５において、特徴変換行列Ｂの最適値に収束した時点の評価関数値を記憶する。パターン認識装置２００は、ステップＳ１６１７において、参照ベクトルｙ_kを更新する。なお、参照ベクトルｙ_kの更新は、損失項を小さくする方向であることが望ましい。パターン認識装置２００は、ステップＳ１６１９において、参照ベクトルの最適値を取得したか否かを判定する。ステップＳ１６１９においては、例えば、参照ベクトルを更新した後に、特徴変換行列Ｂを最適値に収束した時点の評価関数の値の中で最小の値に収束した参照ベクトルを選択する。参照ベクトルの最適値を取得したなら、パターン認識装置２００は、ステップＳ１６２３において、その時に特徴変換行列Ｂの最適値と参照ベクトルｙ_kの最適値とを認識辞書２２０に格納する。 When the feature transformation matrix B converges to its optimum value in step S911, the pattern recognition apparatus 200 stores, in step S1615, the evaluation function value at the time of that convergence. In step S1617, the pattern recognition apparatus 200 updates the reference vectors y_k. The reference vectors y_k are preferably updated in a direction that reduces the loss term. In step S1619, the pattern recognition apparatus 200 determines whether the optimum values of the reference vectors have been acquired. In step S1619, for example, after the reference vectors are updated, the reference vectors giving the smallest evaluation function value, among the values recorded each time the feature transformation matrix B converged to its optimum, are selected. If the optimum values of the reference vectors have been acquired, the pattern recognition apparatus 200 stores, in step S1623, the optimum value of the feature transformation matrix B at that time and the optimum values of the reference vectors y_k in the recognition dictionary 220.

一方、参照ベクトルの最適値を取得していないなら、パターン認識装置２００は、ステップＳ１６２１において、特徴変換行列を初期化する。そして、パターン認識装置２００は、ステップＳ９０５に戻り、特徴変換行列Ｂの最適値を探す。なお、ステップＳ１６２１における特徴変換行列の初期化は必須ではなく、現在の特徴変換行列の値を用いてステップＳ９０５に戻ってもよい。 On the other hand, if the optimum values of the reference vectors have not been acquired, the pattern recognition apparatus 200 initializes the feature transformation matrix in step S1621. The pattern recognition apparatus 200 then returns to step S905 and searches for the optimum value of the feature transformation matrix B. Note that initializing the feature transformation matrix in step S1621 is not essential, and the process may return to step S905 using the current value of the feature transformation matrix.

本実施形態によれば、特徴変換行列の最適値に加えて、参照ベクトルの最適化が実現され、認識精度のより一層の向上が達成できる。 According to this embodiment, in addition to the optimum value of the feature transformation matrix, the optimization of the reference vector is realized, and the recognition accuracy can be further improved.

［第５実施形態］
次に、本発明の第５実施形態に係るパターン学習部を含むパターン認識装置について説明する。本実施形態に係るパターン学習部を含むパターン認識装置は、上記第２実施形態乃至第４実施形態と比べると、パターン認識装置においてパターン学習部とは別途に参照ベクトルの初期化において最適化する点で異なる。その他の構成および動作は、第２実施形態、第３実施形態または第４実施形態と同様であるため、同じ構成および動作については同じ符号を付してその詳しい説明を省略する。 [Fifth Embodiment]
Next, a pattern recognition apparatus including a pattern learning unit according to the fifth embodiment of the present invention will be described. The pattern recognition apparatus including the pattern learning unit according to the present embodiment differs from the second to fourth embodiments in that the reference vectors are optimized at initialization, separately from the pattern learning unit. Since other configurations and operations are the same as those of the second embodiment, the third embodiment, or the fourth embodiment, the same configurations and operations are denoted by the same reference numerals, and detailed description thereof is omitted.

《パターン認識装置の機能構成》
図１７は、本実施形態に係るパターン学習部２４０を含むパターン認識装置１７００の機能構成を示すブロック図である。なお、図１７において、図２と同様の機能構成部には同じ参照番号を付して、説明を省略する。 <Functional configuration of pattern recognition device>
FIG. 17 is a block diagram showing a functional configuration of a pattern recognition apparatus 1700 including a pattern learning unit 240 according to this embodiment. In FIG. 17, the same functional components as those in FIG. 2 are denoted by the same reference numerals, and description thereof is omitted.

参照ベクトル更新部１７５０は、パラメータ初期値生成部２１０において生成された参照ベクトルｙ_kを、最適な位置に更新する。かかる参照ベクトルｙ_kの更新方向は、初期値生成用の入力ベクトルや、学習用入力ベクトルを用いて、本実施形態の識別関数および評価関数に基づいて行なってもよいし、既存の参照ベクトルｙ_kの最適化によって実行してもよい。 The reference vector update unit 1750 updates the reference vectors y_k generated by the parameter initial value generation unit 210 to optimal positions. The update of the reference vectors y_k may be performed based on the discriminant function and the evaluation function of the present embodiment, using the input vectors for initial value generation or the learning input vectors, or may be performed by an existing optimization method for the reference vectors y_k.

本実施形態においては、特徴変換行列の最適化に使用する参照ベクトルを、クラスごとの入力ベクトルの平均を参照ベクトルｙ_kとして設定した後に、参照ベクトル更新部１７５０において参照ベクトルｙkを最適な位置に更新するので、認識精度のより一層の向上が達成できる。 In the present embodiment, the reference vectors y_k used for optimizing the feature transformation matrix are first set to the mean of the input vectors of each class and are then updated to optimal positions by the reference vector update unit 1750, so recognition accuracy can be further improved.

《パターン認識装置の処理手順》
図１８は、本実施形態に係るパターン認識装置１７００の処理手順を示すフローチャートである。このフローチャートは、図７のＣＰＵ７１０がＲＡＭ７４０を使用しながら実行し、図１７のパターン認識装置１７００の機能構成部を実現する。なお、図１８において、図８と同様のステップには同じステップ番号を付して、説明を省略する。 <Processing procedure of pattern recognition device>
FIG. 18 is a flowchart showing a processing procedure of the pattern recognition apparatus 1700 according to this embodiment. This flowchart is executed by the CPU 710 in FIG. 7 while using the RAM 740, and implements the functional components of the pattern recognition apparatus 1700 in FIG. 17. In FIG. 18, steps similar to those in FIG. 8 are denoted by the same step numbers and description thereof is omitted.

パターン認識装置１７００は、ステップＳ１８０２において、ステップＳ８０１で生成された参照ベクトルを最適な位置に更新する処理を行なって、認識辞書２２０に保持する。 In step S1802, the pattern recognition apparatus 1700 performs processing for updating the reference vector generated in step S801 to an optimal position, and stores the updated reference vector in the recognition dictionary 220.

本実施形態によれば、最適化された参照ベクトルに基づいて特徴変換行列の最適値を行なうので、認識精度のより一層の向上が達成できる。 According to the present embodiment, the feature transformation matrix is optimized based on the optimized reference vectors, so recognition accuracy can be further improved.

［他の実施形態］
本発明の活用例として、画像中に含まれる対象物を自動検出する検出装置や、検出装置をコンピュータに実現するためのプログラムなどの用途が挙げられる。 [Other Embodiments]
Examples of applications of the present invention include a detection device that automatically detects an object contained in an image, and a program for implementing the detection device on a computer.

なお、実施形態を参照して本発明を説明したが、本発明は上記実施形態に限定されるものではない。本発明の構成や詳細には、本発明のスコープ内で当業者が理解し得る様々な変更をすることができる。また、それぞれの実施形態に含まれる別々の特徴を如何様に組み合わせたシステムまたは装置も、本発明の範疇に含まれる。 In addition, although this invention was demonstrated with reference to embodiment, this invention is not limited to the said embodiment. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention. In addition, a system or an apparatus in which different features included in each embodiment are combined in any way is also included in the scope of the present invention.

また、本発明は、複数の機器から構成されるシステムに適用されてもよいし、単体の装置に適用されてもよい。さらに、本発明は、実施形態の機能を実現するパターン認識プログラムやパターン学習プログラムが、システムあるいは装置に直接あるいは遠隔から供給される場合にも適用可能である。したがって、本発明の機能をコンピュータで実現するために、コンピュータにインストールされるプログラム、あるいはそのプログラムを格納した媒体、そのプログラムをダウンロードさせるＷＷＷ(World Wide Web)サーバも、本発明の範疇に含まれる。特に、少なくとも、上述した実施形態に含まれる処理ステップをコンピュータに実行させるプログラムを格納した非一時的コンピュータ可読媒体（non-transitory computer readable medium）は本発明の範疇に含まれる。 In addition, the present invention may be applied to a system composed of a plurality of devices, or may be applied to a single device. Furthermore, the present invention is also applicable to a case where a pattern recognition program or a pattern learning program that realizes the functions of the embodiment is supplied directly or remotely to a system or apparatus. Therefore, in order to realize the functions of the present invention on a computer, a program installed in the computer, a medium storing the program, and a WWW (World Wide Web) server that downloads the program are also included in the scope of the present invention. . In particular, at least a non-transitory computer readable medium storing a program for causing a computer to execute the processing steps included in the above-described embodiments is included in the scope of the present invention.

［実施形態の他の表現］
上記の実施形態の一部または全部は、以下の付記のようにも記載されうるが、以下には限られない。
（付記１）
パターン認識に用いる識別関数のパラメータの初期値を入力する初期値入力手段と、
学習用の入力ベクトルに基づいて、前記識別関数を評価する評価関数における認識誤りに相当する損失項を計算する損失計算手段と、
前記評価関数における正則化項を計算する正則化計算手段と、
前記損失項と前記正則化項との総和が減少するように、前記識別関数のパラメータを更新するパラメータ更新手段と、
前記パラメータ更新手段による更新後の前記識別関数のパラメータを出力するパラメータ出力手段と、
を備え、
前記正則化計算手段は、前記識別関数の特徴変換行列を用いたノルムの比で定義される正則化項を計算する、パターン学習装置。
（付記２）
前記特徴変換行列は、入力ベクトルの要素を選択することにより次元を減らす特徴選択と、前記入力ベクトルを線形変換して次元を減らす特徴変換と、を行なう行列である、付記１に記載のパターン学習装置。
（付記３）
前記正則化項が、前記特徴変換行列の列ベクトルを用いたノルムの比で定義される付記１または２に記載のパターン学習装置。
（付記４）
前記正則化項が、前記特徴変換行列の行ベクトルを用いたノルムの比で定義される付記１乃至３のいずれか１項に記載のパターン学習装置。
（付記５）
分子をＬvノルムとし、分母をＬwノルムとする場合（v, wは実数）、wがvより大きいノルムの比を前記正則化項とする、付記１乃至４のいずれか１項に記載のパターン学習装置。
（付記６）
前記正則化項は、分子をＬ1ノルムとし、分母をＬ2ノルムとする、ノルムの比を用いる、付記５に記載のパターン学習装置。
（付記７）
前記識別関数は、前記パラメータの初期値として、前記特徴変換行列と、入力ベクトルのクラス識別に用いる参照ベクトルと、を有し、
前記パラメータ更新手段は、所定の参照ベクトルに基づいて、前記特徴変換行列を変更する、付記１乃至６のいずれか１項に記載のパターン学習装置。
（付記８）
前記識別関数は、前記パラメータの初期値として、前記特徴変換行列と、入力ベクトルのクラス識別に用いる参照ベクトルと、を有し、
前記パラメータ更新手段は、前記参照ベクトルと前記特徴変換行列とを変更する、付記１乃至６のいずれか１項に記載のパターン学習装置。
（付記９）
前記損失項は、前記特徴変換行列を定数倍しても同じ値をとるように、分子と分母とに前記特徴変換行列を含む間違いやすさを表わす量の関数として定義される、付記１乃至８のいずれか１項に記載のパターン学習装置。
（付記１０）
前記識別関数のパラメータの初期値を生成する初期値生成手段を、さらに備え、
前記初期値生成手段は、クラスごとの入力ベクトルの平均を参照ベクトルとして設定し、主成分分析で得られる固有ベクトルφ_iを固有値の大きい順にｐ個選んで特徴変換行列Ｂ=(φ₁,…,φ_p)^Tと設定する、付記１乃至９のいずれか１項に記載のパターン学習装置。
（付記１１）
付記１乃至１０のいずれか１項に記載のパターン学習装置を有するパターン認識装置であって、
前記識別関数のパラメータの初期値および前記パラメータ出力手段が出力した前記更新後の前記識別関数のパラメータを格納する認識辞書と、
前記初期値および前記学習用の入力ベクトルに基づいて、前記パターン学習装置に前記更新後の前記識別関数のパラメータを生成させるパラメータ生成指示手段と、
入力された認識対象の入力ベクトルに基づいて、前記更新後の前記識別関数のパラメータを用いた前記識別関数によりクラス識別を行なうクラス識別手段と、
を備えるパターン認識装置。
（付記１２）
前記パラメータ生成指示手段は、前記識別関数のパラメータとして前記特徴変換行列を更新させ、
前記識別関数のパラメータである参照ベクトルを更新させる参照ベクトル更新手段を、さらに備える付記１１に記載のパターン認識装置。
（付記１３）
前記識別関数のパラメータの初期値を生成する初期値生成手段を、さらに備え、
前記初期値生成手段は、クラスごとの入力ベクトルの平均を参照ベクトルとして設定し、主成分分析で得られる固有ベクトルφ_iを固有値の大きい順にｐ個選んで特徴変換行列Ｂ=(φ₁,…,φ_p)^Tと設定する、付記１１または１２に記載のパターン認識装置。
（付記１４）
パターン認識に用いる識別関数のパラメータの初期値を入力する初期値入力ステップと、
学習用の入力ベクトルに基づいて、前記識別関数を評価する評価関数における認識誤りに相当する損失項を計算する損失計算ステップと、
前記評価関数における正則化項を計算する正則化計算ステップと、
前記損失項と前記正則化項との総和が減少するように、前記識別関数のパラメータを更新するパラメータ更新ステップと、
前記パラメータ更新ステップにおいて更新後の前記識別関数のパラメータを出力するパラメータ出力ステップと、
を含み、
前記正則化計算ステップにおいては、前記識別関数の特徴変換行列を用いたノルムの比で定義される正則化項を計算する、パターン学習方法。
（付記１５）
前記識別関数のパラメータの初期値を生成する初期値生成ステップを、さらに含み、
前記初期値生成ステップにおいては、クラスごとの入力ベクトルの平均を参照ベクトルとして設定し、主成分分析で得られる固有ベクトルφ_iを固有値の大きい順にｐ個選んで特徴変換行列Ｂ=(φ₁,…,φ_p)^Tと設定する、付記１４に記載のパターン学習方法。
（付記１６）
付記１４または１５のパターン学習方法を含むパターン認識方法であって、
前記初期値および前記学習用の入力ベクトルに基づいて、前記パターン学習方法により前記更新後の前記識別関数のパラメータを生成させるパラメータ生成指示ステップと、
入力された認識対象の入力ベクトルに基づいて、前記更新後の前記識別関数のパラメータを用いた前記識別関数によりクラス識別を行なうクラス識別ステップと、
を含むパターン認識方法。
（付記１７）
前記パラメータ生成指示ステップにおいては、前記識別関数のパラメータとして前記特徴変換行列を更新させ、
前記識別関数のパラメータである参照ベクトルを更新させる参照ベクトル更新ステップを、さらに含む付記１６に記載のパターン認識方法。
（付記１８）
パターン認識に用いる識別関数のパラメータの初期値を入力する初期値入力ステップと、
学習用の入力ベクトルに基づいて、前記識別関数を評価する評価関数における認識誤りに相当する損失項を計算する損失計算ステップと、
前記評価関数における正則化項を計算する正則化計算ステップと、
前記損失項と前記正則化項との総和が減少するように、前記識別関数のパラメータを更新するパラメータ更新ステップと、
前記パラメータ更新ステップにおいて更新後の前記識別関数のパラメータを出力するパラメータ出力ステップと、
をコンピュータに実行させるパターン学習プログラムであって、
前記正則化計算ステップにおいては、前記識別関数の特徴変換行列を用いたノルムの比で定義される正則化項を計算する、パターン学習プログラム。
（付記１９）
前記識別関数のパラメータの初期値を生成する初期値生成ステップを、さらに含み、
前記初期値生成ステップにおいては、クラスごとの入力ベクトルの平均を参照ベクトルとして設定し、主成分分析で得られる固有ベクトルφ_iを固有値の大きい順にｐ個選んで特徴変換行列Ｂ=(φ₁,…,φ_p)^Tと設定する、付記１８に記載のパターン学習プログラム。
（付記２０）
付記１８または１９のパターン学習プログラムを含むパターン認識プログラムであって、
前記初期値および前記学習用の入力ベクトルに基づいて前記パターン学習プログラムを実行させ、前記更新後の前記識別関数のパラメータを生成させるパラメータ生成指示ステップと、
入力された認識対象の入力ベクトルに基づいて、前記更新後の前記識別関数のパラメータを用いた前記識別関数によりクラス識別を行なうクラス識別ステップと、
をコンピュータに実行させるパターン認識プログラム。
（付記２１）
前記パラメータ生成指示ステップにおいては、前記識別関数のパラメータとして前記特徴変換行列を更新させ、
前記識別関数のパラメータである参照ベクトルを更新させる参照ベクトル更新ステップを、さらにコンピュータに実行させる付記２０に記載のパターン認識プログラム。 [Other expressions of embodiment]
A part or all of the above-described embodiment can be described as in the following supplementary notes, but is not limited thereto.
(Appendix 1)
A pattern learning device comprising:
initial value input means for inputting an initial value of a parameter of a discriminant function used for pattern recognition;
loss calculation means for calculating, based on an input vector for learning, a loss term corresponding to a recognition error in an evaluation function for evaluating the discriminant function;
regularization calculation means for calculating a regularization term in the evaluation function;
parameter update means for updating the parameter of the discriminant function so that the sum of the loss term and the regularization term decreases; and
parameter output means for outputting the parameter of the discriminant function as updated by the parameter update means,
wherein the regularization calculation means calculates a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function.
(Appendix 2)
The pattern learning device according to appendix 1, wherein the feature transformation matrix is a matrix that performs both feature selection, which reduces the dimension by selecting elements of an input vector, and feature transformation, which reduces the dimension by linearly transforming the input vector.
(Appendix 3)
The pattern learning device according to appendix 1 or 2, wherein the regularization term is defined by a ratio of norms using the column vectors of the feature transformation matrix.
(Appendix 4)
The pattern learning device according to any one of appendices 1 to 3, wherein the regularization term is defined by a ratio of norms using the row vectors of the feature transformation matrix.
(Appendix 5)
The pattern learning device according to any one of appendices 1 to 4, wherein the regularization term is a ratio of norms whose numerator is an Lv norm and whose denominator is an Lw norm (v and w being real numbers), with w greater than v.
(Appendix 6)
The pattern learning device according to appendix 5, wherein the regularization term uses a norm ratio in which a numerator is an L1 norm and a denominator is an L2 norm.
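By way of illustration only, the L1/L2 norm ratio of appendices 3 and 6 can be sketched as follows. The exact formula is not given in this section, so the sketch assumes one possible reading: the L2 norm of each column of the feature transformation matrix B is collected into a vector, and the regularization term is the L1 norm of that vector divided by its L2 norm. The function name `norm_ratio_regularizer` and the small constant `eps` are hypothetical.

```python
import numpy as np

def norm_ratio_regularizer(B, eps=1e-12):
    """L1/L2 ratio of the column-norm vector of B (assumed form, hypothetical helper)."""
    col_norms = np.linalg.norm(B, axis=0)     # one L2 norm per column (input feature)
    l1 = col_norms.sum()                      # numerator: L1 norm of the column norms
    l2 = np.sqrt((col_norms ** 2).sum())      # denominator: L2 norm of the column norms
    return l1 / (l2 + eps)                    # eps only guards against an all-zero B

dense = np.ones((2, 4))                       # every input feature is used
sparse = np.array([[1.0, 0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0, 0.0]])     # only the first input feature is used

print(norm_ratio_regularizer(dense))          # about 2.0 (= sqrt(4), the largest value for 4 columns)
print(norm_ratio_regularizer(sparse))         # about 1.0 (the smallest possible value)
print(norm_ratio_regularizer(3.0 * dense))    # about 2.0 again: the ratio ignores the scale of B
```

Under this reading the term is small when few columns of B are non-zero, which corresponds to feature selection, and it does not change when B is multiplied by a constant.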
(Appendix 7)
The pattern learning device according to any one of appendices 1 to 6, wherein
the discriminant function includes, as initial values of the parameters, the feature transformation matrix and a reference vector used for class discrimination of an input vector, and
the parameter updating means changes the feature transformation matrix based on a predetermined reference vector.
(Appendix 8)
The pattern learning device according to any one of appendices 1 to 6, wherein
the discriminant function includes, as initial values of the parameters, the feature transformation matrix and a reference vector used for class discrimination of an input vector, and
the parameter updating means changes both the reference vector and the feature transformation matrix.
(Appendix 9)
The pattern learning device according to any one of the above appendices, wherein the loss term is defined as a function of a quantity representing an error probability, the quantity including the feature transformation matrix in both its numerator and its denominator so that the same value is obtained even if the feature transformation matrix is multiplied by a constant.
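The scale invariance described in appendix 9 can be checked numerically with a short sketch. The quantity used here is an assumption modeled on generalized learning vector quantization, not a formula taken from this section: with d(x, y) = ||B(x - y)||^2, the measure is (d_correct - d_rival) / (d_correct + d_rival). The names `sq_dist` and `misclassification_measure` are hypothetical.

```python
import numpy as np

def sq_dist(B, x, y):
    """Squared distance between x and reference vector y after transformation by B."""
    diff = B @ (x - y)
    return float(diff @ diff)

def misclassification_measure(B, x, y_correct, y_rival):
    """Assumed error measure: B appears in both the numerator and the denominator."""
    d_c = sq_dist(B, x, y_correct)
    d_r = sq_dist(B, x, y_rival)
    return (d_c - d_r) / (d_c + d_r)

rng = np.random.default_rng(0)
B = rng.normal(size=(2, 5))                  # hypothetical 2 x 5 feature transformation matrix
x, y1, y2 = rng.normal(size=(3, 5))          # one input vector and two reference vectors

m1 = misclassification_measure(B, x, y1, y2)
m2 = misclassification_measure(10.0 * B, x, y1, y2)
print(np.isclose(m1, m2))                    # the factor 10**2 cancels between numerator and denominator
```

Because multiplying B by a constant c multiplies every squared distance by c squared, the ratio, and therefore any loss defined as a function of it, is unchanged.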
(Appendix 10)
The pattern learning device according to any one of appendices 1 to 9, further comprising initial value generating means for generating the initial values of the parameters of the discriminant function,
wherein the initial value generating means sets the average of the input vectors of each class as a reference vector, selects p eigenvectors φ_i obtained by principal component analysis in descending order of eigenvalue, and sets the feature transformation matrix to B = (φ₁, …, φ_p)^T.
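The initial value generation of appendix 10 can be sketched as follows. The sketch assumes that the principal component analysis is performed on the covariance matrix of all learning input vectors, which this section does not state; the function name `initial_parameters` and the toy data are hypothetical.

```python
import numpy as np

def initial_parameters(X, labels, p):
    """Class means as reference vectors; top-p principal axes as rows of B."""
    classes = np.unique(labels)
    # Reference vector of each class = average of that class's input vectors.
    refs = np.stack([X[labels == c].mean(axis=0) for c in classes])
    # Principal component analysis on the covariance of all input vectors (assumption).
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]        # indices of the p largest eigenvalues
    B = eigvecs[:, order].T                      # B = (phi_1, ..., phi_p)^T, shape (p, n)
    return classes, refs, B

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))                     # 20 hypothetical 4-dimensional input vectors
labels = np.array([0] * 10 + [1] * 10)
classes, refs, B = initial_parameters(X, labels, p=2)
print(B.shape, refs.shape)                       # (2, 4) (2, 4)
```

The rows of B are orthonormal because `numpy.linalg.eigh` returns orthonormal eigenvectors for a symmetric matrix.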
(Appendix 11)
A pattern recognition device comprising:
the pattern learning device according to any one of appendices 1 to 10;
a recognition dictionary that stores the initial values of the parameters of the discriminant function and the updated parameters of the discriminant function output by the parameter output means;
parameter generation instructing means for causing the pattern learning device to generate the updated parameters of the discriminant function based on the initial values and the input vectors for learning; and
class identification means for performing class identification on an input vector to be recognized, by the discriminant function using the updated parameters of the discriminant function.
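As an illustration of the class identification means, the following sketch assigns an input vector to the class whose reference vector is nearest after transformation by B. This nearest-reference rule is an assumed form of the discriminant function; the function name `classify` and the toy values are hypothetical.

```python
import numpy as np

def classify(x, B, refs, classes):
    """Return the class whose reference vector is closest to x in the transformed space."""
    diffs = (refs - x) @ B.T                 # row k holds B (y_k - x)
    dists = (diffs ** 2).sum(axis=1)         # squared distance to each reference vector
    return classes[int(np.argmin(dists))]

classes = np.array(["a", "b"])
refs = np.array([[0.0, 5.0, 5.0],            # reference vector of class "a"
                 [10.0, -5.0, -5.0]])        # reference vector of class "b"
B = np.array([[1.0, 0.0, 0.0]])              # keeps only the first input feature
x = np.array([1.0, -5.0, -5.0])

print(classify(x, B, refs, classes))         # a
print(classify(x, np.eye(3), refs, classes)) # b
```

With the identity matrix the two uninformative features dominate the distance, whereas the matrix that selects only the first feature yields class "a"; this is the kind of effect that learning B is intended to achieve.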
(Appendix 12)
The pattern recognition device according to appendix 11, wherein the parameter generation instructing means causes the feature transformation matrix to be updated as a parameter of the discriminant function,
the pattern recognition device further comprising reference vector updating means for updating a reference vector that is a parameter of the discriminant function.
(Appendix 13)
The pattern recognition device according to appendix 11 or 12, further comprising initial value generating means for generating the initial values of the parameters of the discriminant function,
wherein the initial value generating means sets the average of the input vectors of each class as a reference vector, selects p eigenvectors φ_i obtained by principal component analysis in descending order of eigenvalue, and sets the feature transformation matrix to B = (φ₁, …, φ_p)^T.
(Appendix 14)
A pattern learning method comprising:
an initial value input step of inputting initial values of parameters of a discriminant function used for pattern recognition;
a loss calculating step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter updating step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function as updated in the parameter updating step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function is calculated.
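The steps of the learning method can be put together in a minimal sketch. Several choices here are assumptions rather than the method of this publication: the loss is a sigmoid of a distance-ratio error measure, the regularization term is the L1/L2 ratio of the column norms of B, the reference vectors are kept fixed, and the update uses a finite-difference gradient with step halving so that the evaluation value decreases at every accepted update. The names `evaluate`, `numerical_gradient`, `learn` and the weight `lam` are hypothetical.

```python
import numpy as np

def evaluate(B, X, labels, refs, lam=0.1, eps=1e-12):
    """Evaluation function: loss term plus lam times the regularization term."""
    diffs = (X[:, None, :] - refs[None, :, :]) @ B.T   # shape (N, K, p)
    d = (diffs ** 2).sum(axis=2)                       # squared distances, shape (N, K)
    idx = np.arange(len(X))
    d_correct = d[idx, labels]                         # distance to the correct class
    d_masked = d.copy()
    d_masked[idx, labels] = np.inf
    d_rival = d_masked.min(axis=1)                     # distance to the nearest wrong class
    mu = (d_correct - d_rival) / (d_correct + d_rival + eps)
    loss = np.mean(1.0 / (1.0 + np.exp(-mu)))          # smooth stand-in for recognition errors
    col = np.linalg.norm(B, axis=0)
    reg = col.sum() / (np.sqrt((col ** 2).sum()) + eps)
    return loss + lam * reg

def numerical_gradient(f, B, h=1e-5):
    """Central-difference gradient of f with respect to each element of B."""
    grad = np.zeros_like(B)
    for i in np.ndindex(B.shape):
        step = np.zeros_like(B)
        step[i] = h
        grad[i] = (f(B + step) - f(B - step)) / (2 * h)
    return grad

def learn(B, X, labels, refs, lam=0.1, iters=50, lr=1.0):
    """Update B so that the evaluation value decreases; reference vectors stay fixed."""
    f = lambda M: evaluate(M, X, labels, refs, lam)
    history = [f(B)]
    for _ in range(iters):
        g = numerical_gradient(f, B)
        step = lr
        while step > 1e-8 and f(B - step * g) >= history[-1]:
            step *= 0.5                                # shrink until the value decreases
        if step <= 1e-8:
            break                                      # no decreasing step found
        B = B - step * g
        history.append(f(B))
    return B, history

rng = np.random.default_rng(0)
labels = np.array([0] * 20 + [1] * 20)                 # class indices 0 .. K-1
X = rng.normal(size=(40, 4))
X[:, 0] += 3.0 * labels                                # only feature 0 separates the classes
refs = np.stack([X[labels == k].mean(axis=0) for k in (0, 1)])
B0 = rng.normal(size=(2, 4))
B, history = learn(B0, X, labels, refs)
print(history[0], history[-1])                         # evaluation value before and after learning
```

An analytical gradient would normally replace the finite differences; they are used here only to keep the sketch short and independent of the exact form of the loss.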
(Appendix 15)
The pattern learning method according to appendix 14, further comprising an initial value generating step of generating the initial values of the parameters of the discriminant function,
wherein, in the initial value generating step, the average of the input vectors of each class is set as a reference vector, p eigenvectors φ_i obtained by principal component analysis are selected in descending order of eigenvalue, and the feature transformation matrix is set to B = (φ₁, …, φ_p)^T.
(Appendix 16)
A pattern recognition method that includes the pattern learning method according to appendix 14 or 15, the pattern recognition method comprising:
a parameter generation instruction step of causing the updated parameters of the discriminant function to be generated by the pattern learning method, based on the initial values and the input vectors for learning; and
a class identification step of performing class identification on an input vector to be recognized, by the discriminant function using the updated parameters of the discriminant function.
(Appendix 17)
The pattern recognition method according to appendix 16, wherein, in the parameter generation instruction step, the feature transformation matrix is updated as a parameter of the discriminant function,
the pattern recognition method further comprising a reference vector updating step of updating a reference vector that is a parameter of the discriminant function.
(Appendix 18)
A pattern learning program that causes a computer to execute:
an initial value input step of inputting initial values of parameters of a discriminant function used for pattern recognition;
a loss calculating step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter updating step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function as updated in the parameter updating step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function is calculated.
(Appendix 19)
The pattern learning program according to appendix 18, further comprising an initial value generating step of generating the initial values of the parameters of the discriminant function,
wherein, in the initial value generating step, the average of the input vectors of each class is set as a reference vector, p eigenvectors φ_i obtained by principal component analysis are selected in descending order of eigenvalue, and the feature transformation matrix is set to B = (φ₁, …, φ_p)^T.
(Appendix 20)
A pattern recognition program that includes the pattern learning program according to appendix 18 or 19, the pattern recognition program causing a computer to execute:
a parameter generation instruction step of causing the pattern learning program to be executed based on the initial values and the input vectors for learning, thereby generating the updated parameters of the discriminant function; and
a class identification step of performing class identification on an input vector to be recognized, by the discriminant function using the updated parameters of the discriminant function.
(Appendix 21)
The pattern recognition program according to appendix 20, wherein, in the parameter generation instruction step, the feature transformation matrix is updated as a parameter of the discriminant function,
the pattern recognition program further causing the computer to execute a reference vector updating step of updating a reference vector that is a parameter of the discriminant function.

Claims

A pattern learning device comprising:
initial value input means for inputting initial values of parameters of a discriminant function used for pattern recognition;
loss calculating means for calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
regularization calculation means for calculating a regularization term in the evaluation function;
parameter updating means for updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
parameter output means for outputting the parameters of the discriminant function as updated by the parameter updating means,
wherein the regularization calculation means calculates a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function.

The pattern learning device according to claim 1, wherein the feature transformation matrix is a matrix that performs both feature selection, which reduces the dimension by selecting elements of an input vector, and feature transformation, which reduces the dimension by linearly transforming the input vector.

The pattern learning device according to claim 1, wherein the regularization term is defined by a norm ratio using a column vector of the feature transformation matrix.

The pattern learning device according to claim 1, wherein the regularization term is defined by a norm ratio using a row vector of the feature transformation matrix.

The pattern learning device according to claim 1, wherein the regularization term is a ratio of norms whose numerator is an Lv norm and whose denominator is an Lw norm (v and w being real numbers), with w greater than v.

The pattern learning device according to claim 1, wherein
the discriminant function includes, as initial values of the parameters, the feature transformation matrix and a reference vector used for class discrimination of an input vector, and
the parameter updating means changes the feature transformation matrix based on a predetermined reference vector.

The pattern learning device according to claim 1, wherein
the discriminant function includes, as initial values of the parameters, the feature transformation matrix and a reference vector used for class discrimination of an input vector, and
the parameter updating means changes both the reference vector and the feature transformation matrix.

A pattern recognition device comprising:
the pattern learning device according to claim 1;
a recognition dictionary that stores the initial values of the parameters of the discriminant function and the updated parameters of the discriminant function output by the parameter output means;
parameter generation instructing means for causing the pattern learning device to generate the updated parameters of the discriminant function based on the initial values and the input vectors for learning; and
class identification means for performing class identification on an input vector to be recognized, by the discriminant function using the updated parameters of the discriminant function.

A pattern learning method comprising:
an initial value input step of inputting initial values of parameters of a discriminant function used for pattern recognition;
a loss calculating step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter updating step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function as updated in the parameter updating step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function is calculated.

A pattern learning program that causes a computer to execute:
an initial value input step of inputting initial values of parameters of a discriminant function used for pattern recognition;
a loss calculating step of calculating, based on input vectors for learning, a loss term corresponding to recognition errors in an evaluation function that evaluates the discriminant function;
a regularization calculation step of calculating a regularization term in the evaluation function;
a parameter updating step of updating the parameters of the discriminant function so that the sum of the loss term and the regularization term decreases; and
a parameter output step of outputting the parameters of the discriminant function as updated in the parameter updating step,
wherein, in the regularization calculation step, a regularization term defined by a ratio of norms using a feature transformation matrix of the discriminant function is calculated.