JPH07311754A - Learning machine

Info

Publication number: JPH07311754A
Application number: JP6131439A
Authority: JP
Inventors: Kenji Fukumizu; 健次福水
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1994-05-19
Filing date: 1994-05-19
Publication date: 1995-11-28

Abstract

PURPOSE: To perform learning efficiently and accurately by generating optimal learning data that are effective for learning. CONSTITUTION: A learning data creation unit 4 generates input data x_ν according to a probabilistic rule and creates learning data (x_ν, y_ν) by using the actual response of an unknown system to that input data as teacher data y_ν. Because a conditional probability estimation unit 2 learns in two stages, initial learning and subsequent parameter updating, the learning data creation unit 4 correspondingly includes an initial learning data creation unit 4a, which creates initial learning data, and an update learning data creation unit 4b, which creates learning data for updating.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a learning machine that can be used to obtain a desired output from a given input, as in system identification, pattern recognition, and control problems.

[0002]

[Prior Art] Backpropagation (the error backpropagation algorithm) is a well-known learning mechanism for statistical learning machines such as neural networks. When the statistical learning machine is a three-layer perceptron (a layered network) consisting of an input layer 51, an intermediate layer 52, and an output layer 53, as shown in FIG. 2, backpropagation learns by minimizing the squared error, taking as parameters the weights of the connections between the units of the input layer 51 and the units of the intermediate layer 52 and between the units of the intermediate layer 52 and the units of the output layer 53. More specifically, consider an m-layer network. Let i^k_i be the total input to the i-th unit of layer k and o^k_i its output, let w^{k-1,k}_{i,j} be the weight of the connection from the i-th unit of layer k-1 to the j-th unit of layer k, and let f be the function giving the input-output relation of each unit. These variables are then related as follows.

[0003]

[Equation 1]
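
The equation image is not reproduced in this text. From the definitions in paragraph [0002], the relation is presumably the usual feed-forward rule of a layered network. The following is a reconstruction under that assumption, not the patent's original typesetting; the combined index notation w^{k-1,k}_{i,j} is a rendering chosen here.

```latex
i^{k}_{j} = \sum_{i} w^{k-1,k}_{i,j}\, o^{k-1}_{i},
\qquad
o^{k}_{j} = f\!\left(i^{k}_{j}\right)
```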

[0004] Taking the loss function r to be the squared error, the loss r for a given input-output pattern pair (x, y) is given by the following equation.

[0005]

[Equation 2]

[0006] Here, w denotes the collection of all connection weights of the network that defines the mapping. To learn the connection weights w (that is, to obtain the correction to w), it suffices to compute the gradient ∂r/∂w^{k-1,k}_{i,j} of the loss function r with respect to the weights. With ε denoting the parameter that determines the size of a single correction, the correction Δw^{k-1,k}_{i,j} to a connection weight is given by the following equations.

[0007]

[Equation 3]

[0008] The second and third equations of Equation 3 show that the signal d^k_j used to correct the weight w^{k-1,k}_{i,j} is computed recursively from k = m down to k = 2. The third equation also shows that d^k_j is computed by taking the error d^{k+1}_h between the ideal output and the actual output at the output layer as input and propagating it from the output layer toward the input layer, opposite to the direction of signal propagation, while forming sums weighted by w^{k,k+1}_{i,h}. This is the backpropagation (error backpropagation) learning algorithm.
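
The recursion described above can be sketched in code. This is a minimal illustration, not the patent's own formulation: it assumes a sigmoid for the unit function f, omits bias terms, and uses hypothetical names (`forward`, `backprop`, `sgd_step`). The list `weights[k][i][j]` plays the role of the weight from unit i of one layer to unit j of the next.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(weights, x):
    """Return the outputs of every layer; outputs[0] is the input x itself."""
    outputs = [list(x)]
    for w in weights:                      # w[i][j]: unit i -> unit j of the next layer
        prev = outputs[-1]
        outputs.append([sigmoid(sum(w[i][j] * prev[i] for i in range(len(prev))))
                        for j in range(len(w[0]))])
    return outputs

def loss(weights, x, y):
    """Squared error between the network output and the teacher output y."""
    out = forward(weights, x)[-1]
    return sum((o - t) ** 2 for o, t in zip(out, y))

def backprop(weights, x, y):
    """Gradient of the squared error with respect to every weight."""
    outputs = forward(weights, x)
    # Output-layer signal: 2 (o - y) f'(input); for the sigmoid, f' = o (1 - o).
    d = [2.0 * (o - t) * o * (1.0 - o) for o, t in zip(outputs[-1], y)]
    grads = [None] * len(weights)
    for k in range(len(weights) - 1, -1, -1):
        prev = outputs[k]
        grads[k] = [[prev[i] * d[j] for j in range(len(d))] for i in range(len(prev))]
        if k > 0:
            # Send the signal one layer back, weighted by the connections it crossed.
            d = [sum(weights[k][i][h] * d[h] for h in range(len(d)))
                 * prev[i] * (1.0 - prev[i]) for i in range(len(prev))]
    return grads

def sgd_step(weights, x, y, eps=0.1):
    """One correction: each weight moves by -eps times its gradient."""
    grads = backprop(weights, x, y)
    return [[[w[i][j] - eps * g[i][j] for j in range(len(w[0]))]
             for i in range(len(w))] for w, g in zip(weights, grads)]
```

Calling `sgd_step` repeatedly on input-output pairs performs the squared-error minimization; the step size `eps` corresponds to ε in the text.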

[0009]

[Problem to Be Solved by the Invention] In conventional learning machines such as the one described above, however, learning data are supplied only passively, and the machine is not designed to generate learning data that are effective for learning. For example, a learning machine using the backpropagation algorithm described above normally uses learning data (input-output pattern pairs (x, y)) generated passively from the true probability distribution. There is, however, no guarantee that learning from samples drawn from the true distribution is optimal. In practice, the backpropagation algorithm therefore had to be modified, for example as in the following equation, to reduce oscillation and speed up the convergence of learning.

[0010]

[Equation 4]

[0011] In Equation 4, α is a small positive constant and t is the number of corrections.
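
Equation 4 itself is not reproduced in this text. A common modification matching the description (a constant α and a correction count t) is the momentum rule, in which each correction adds α times the previous correction. The sketch below assumes that form and applies it to an illustrative quadratic loss; the loss, the starting point, and the values of `eps` and `alpha` are all hypothetical choices for demonstration.

```python
def grad(w):
    """Gradient of the illustrative loss r(w) = 0.5 * (w[0]**2 + 25 * w[1]**2)."""
    return [w[0], 25.0 * w[1]]

def loss(w):
    return 0.5 * (w[0] ** 2 + 25.0 * w[1] ** 2)

def train(eps, alpha, steps, w0=(1.0, 1.0)):
    """Gradient descent with a momentum term; alpha = 0 gives the plain rule."""
    w = list(w0)
    delta = [0.0, 0.0]                     # previous correction
    for _ in range(steps):
        g = grad(w)
        # Assumed form of the modified rule: new correction = -eps * gradient + alpha * previous correction
        delta = [-eps * gi + alpha * di for gi, di in zip(g, delta)]
        w = [wi + di for wi, di in zip(w, delta)]
    return w
```

On this toy problem, `train(0.03, 0.8, 50)` ends much closer to the minimum than `train(0.03, 0.0, 50)`, which illustrates why such a term was used to speed up convergence.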

[0012] As described above, conventional learning machines do not learn by generating optimal learning data that are effective for learning, and therefore cannot learn efficiently and accurately.

[0013] An object of the present invention is to provide a learning machine that generates optimal learning data effective for learning and can thereby learn efficiently and accurately.

[0014]

[Means for Solving the Problem and Operation] To achieve this object, the invention of claim 1 comprises: input means for receiving an input vector x from an input vector space X; conditional probability estimation means for learning a predetermined parameter θ from learning data consisting of pairs of learning input data and teacher data, the teacher data being the responses of the unknown system to that input data, thereby estimating the true conditional probability p(y|x) and hence the unknown system; output means for computing, using the parameter <θ> learned by the conditional probability estimation means, the output y for an unlearned input vector x supplied from the input means as a sample drawn according to the conditional probability p(y|x; <θ>); and learning data creation means for creating the learning data used by the conditional probability estimation means from input data generated according to a probabilistic rule and the responses of the unknown system to that input data. With this configuration, the learning data creation means can generate optimal learning data effective for learning, so that learning can be performed efficiently and accurately.

[0015] In the inventions of claims 2 to 6, the conditional probability estimation means comprises initial learning means for learning initial parameters and parameter updating means for updating the parameters with new learning data once the initial parameters have been determined. The learning data creation means comprises initial learning data creation means for creating the initial learning data used by the initial learning means of the conditional probability estimation means, and update learning data creation means for creating, after the initial learning has finished, the update learning data used by the parameter updating means. Optimal learning data effective for learning can thus be generated according to the operating conditions of the system.

[0016]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of one embodiment of a learning machine according to the present invention. The learning machine of this embodiment is a learning machine (statistical learning machine) that estimates an unknown system which, for an input vector x = (x_1, ..., x_L) from an L-dimensional input vector space X, generates an output vector y = (y_1, ..., y_M) in an M-dimensional output vector space Y according to a true conditional probability p(y|x). Referring to FIG. 1, the learning machine comprises: an input unit 1 that receives an input vector x from the L-dimensional input vector space X; a conditional probability estimation unit 2 that learns a predetermined parameter θ using N learning data {x_ν, y_ν | 1 ≤ ν ≤ N}, each a pair of learning input data x_ν and teacher data y_ν, the latter being the response of the unknown system to x_ν, and thereby estimates the unknown system, that is, the true conditional probability p(y|x); an output unit 3 that uses the parameter <θ> learned by the conditional probability estimation unit 2 to compute the output y for an unlearned input vector x supplied from the input unit 1 as a sample drawn according to the conditional probability p(y|x; <θ>); and a learning data creation unit 4 that creates the learning data {x_ν, y_ν} (ν = 1 to N) used by the conditional probability estimation unit 2.

[0017] The conditional probability estimation unit 2 includes an initial learning unit 2a, which performs initial learning of the parameter θ using initial learning data, and a parameter updating unit 2b, which updates θ using update learning data once the initial parameter θ has been determined. In this embodiment, the conditional probability estimation unit 2 thus learns in two stages, initial learning followed by parameter updating. Both the initial learning unit 2a and the parameter updating unit 2b obtain the estimate (learning result) <θ> of the parameter θ by, for example, the maximum likelihood method, which is well known in statistics, using a parametric family of probability density functions {p(y|x; θ)} with parameter θ, as in the following equation.

[0018]

[Equation 5]

[0019] That is, the parameter θ that maximizes S is taken as the estimate <θ>.
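
Equation 5 is not reproduced in this text. In maximum likelihood estimation, S is normally the log-likelihood of the learning data, and the sketch below takes that reading. The linear-Gaussian model y ~ N(θ·x, σ²), the noise level, and the function names are illustrative assumptions, not the model of the patent.

```python
import math
import random

def log_likelihood(theta, data, sigma=0.5):
    """S(theta) = sum over the data of log p(y | x; theta) for y ~ N(theta * x, sigma^2)."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (y - theta * x) ** 2 / (2.0 * sigma ** 2) for x, y in data)

def mle(data):
    """Closed-form maximizer of S(theta) for this particular model."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def make_data(n, theta_true=2.0, sigma=0.5, seed=0):
    """Toy learning data: inputs uniform on [-1, 1], responses from the assumed model."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, theta_true * x + rng.gauss(0.0, sigma)) for x in xs]
```

For models without a closed-form maximizer, such as a neural network, S would instead be maximized iteratively.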

[0020] The learning data creation unit 4 generates input data x_ν according to a probabilistic rule and creates learning data {x_ν, y_ν} by taking the responses of the true unknown system to those inputs as teacher data y_ν. Because the conditional probability estimation unit 2 learns in two stages, initial learning and subsequent parameter updating, the learning data creation unit 4 correspondingly includes an initial learning data creation unit 4a, which creates initial learning data, and an update learning data creation unit 4b, which creates update learning data.

[0021] When the distribution of the actual input data is known, the initial learning data creation unit 4a can create the initial learning data according to that distribution. For example, it can generate the input data x_ν of the initial learning data from an estimate of the distribution of the input data that arise when the learning machine is in use (in operation), and so create the initial learning data {x_ν, y_ν}. When the distribution of the actual input data is unknown, the initial learning data can instead be created according to a uniform or normal distribution fixed in advance on the basis of prior knowledge. For example, the input data x_ν can be generated according to a fixed probability distribution on the input space that the learning machine has prepared in advance, and the initial learning data {x_ν, y_ν} created from them.
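
As a rough illustration of this step, the sketch below draws inputs from a chosen distribution and pairs each with the response of the system being identified. The function names and the scalar input are assumptions for the example; `unknown_system` stands in for whatever real system returns the teacher data.

```python
import random

def make_initial_data(unknown_system, sampler, n):
    """Create n pairs (x, y): x drawn by `sampler`, y the system's response to x."""
    data = []
    for _ in range(n):
        x = sampler()
        data.append((x, unknown_system(x)))
    return data

def uniform_sampler(rng, low, high):
    """A fixed prior choice for when the operating input distribution is unknown."""
    return lambda: rng.uniform(low, high)

def normal_sampler(rng, mean, std):
    """Alternative prior choice, or a normal fit to observed operating inputs."""
    return lambda: rng.gauss(mean, std)
```

A call such as `make_initial_data(system, uniform_sampler(random.Random(0), -1.0, 1.0), 50)` would produce fifty initial learning pairs.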

[0022] The update learning data creation unit 4b creates the update learning data using a predetermined probability density function r(x; v). Specifically, the update learning data creation unit 4b holds a family of probability density functions {r(x; v)} on the input vector space X with parameter v. Using, for example, an estimate <q(x)> of the probability density function q(x) of the actual input data that arise when the learning machine is in use (in operation) and the estimate <θ> of the parameter θ obtained by the initial learning unit 2a of the conditional probability estimation unit 2, it computes as the optimal value the value of v that minimizes the estimate E(v) of the learning error incurred when the input data x_ν of the learning data are generated according to {r(x; v)}. It then generates the input data x_ν of the update learning data from the probability density function r(x; v) with the optimized parameter v and creates the update learning data {x_ν, y_ν}.

[0023] The estimate <q(x)> of the probability density function q(x) of the actual input data arising when the learning machine is in use (in operation) can be prepared in advance, for example, by generating N' input data under the same conditions as the actual operation of the learning machine and either estimating their distribution with a parametric model or representing it with delta functions as in the following equation.

[0024]

[Equation 6]
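
The equation image is not reproduced in this text. Given the description of an estimate built from N' observed inputs and delta functions, a natural reading is the empirical distribution shown below. This is a reconstruction under that assumption; the symbol x'_μ for the observed inputs is notation introduced here.

```latex
\langle q(x) \rangle = \frac{1}{N'} \sum_{\mu=1}^{N'} \delta\!\left(x - x'_{\mu}\right)
```

Under this form, an integral against the estimate reduces to an average over the N' observed inputs.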

[0025] The value of the parameter v that minimizes the learning error estimate E(v) is computed by a learning error minimization circuit 5 in the update learning data creation unit 4b. More specifically, the learning error minimization circuit 5 optimizes v so as to minimize the learning error estimate E(v) given by the following equation.

[0026]

[Equation 7]

[0027] Here <I> and <J(v)> are matrices, computed as follows.

[0028]

[Equation 8]
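
Equations 7 and 8 are not reproduced in this text. One reading consistent with the later derivation is that E(v) is proportional to the trace of <I> times the inverse of <J(v)>, where both matrices are Fisher information matrices, averaged over the operating inputs and over the training inputs respectively. The sketch below works under that assumption for a toy model y ~ N(θ1 + θ2·x, 1), whose per-input Fisher information is φ(x)φ(x)ᵀ with φ(x) = (1, x). The model and all names are illustrative.

```python
def fisher_matrix(xs):
    """Average of phi(x) phi(x)^T over the inputs xs, with phi(x) = (1, x)."""
    n = float(len(xs))
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return [[1.0, m1], [m1, m2]]

def trace_i_jinv(i_mat, j_mat):
    """Tr(I J^-1) for 2x2 matrices, using the explicit 2x2 inverse."""
    det = j_mat[0][0] * j_mat[1][1] - j_mat[0][1] * j_mat[1][0]
    if abs(det) < 1e-12:
        raise ValueError("J is singular: these training inputs do not identify theta")
    inv = [[j_mat[1][1] / det, -j_mat[0][1] / det],
           [-j_mat[1][0] / det, j_mat[0][0] / det]]
    # Trace of the product: sum over a, b of I[a][b] * Jinv[b][a]
    return sum(i_mat[a][b] * inv[b][a] for a in range(2) for b in range(2))

def learning_error_estimate(operating_inputs, training_inputs):
    """Assumed form of E: Tr(I J^-1), with I from operating and J from training inputs."""
    return trace_i_jinv(fisher_matrix(operating_inputs), fisher_matrix(training_inputs))
```

When the training inputs coincide with the operating inputs, the value equals the number of parameters (2 here); training inputs that make J larger give a smaller value, which is the effect the optimization over v exploits.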

[0029] In general it is often difficult to solve analytically for the v that gives the minimum, so v may be optimized by successive updates using, for example, the steepest descent method. The parameter obtained in this way is written <v>. The update learning data creation means generates a certain number of input data according to r(x; <v>) and creates, as update learning data, learning data consisting of those input data and the teacher data that are the outputs of the true unknown system for them. The parameter updating unit 2b uses these update learning data to relearn the parameter θ of p(y|x; θ), again by the maximum likelihood method.
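
The successive update of v can be sketched as plain steepest descent with a numerical derivative. The objective below is a hypothetical stand-in for E(v): it is the value of Tr(I J(v)⁻¹) for a toy model y ~ N(θ1 + θ2·x, 1) when r(x; v) has mean v and a fixed spread, with a and b denoting the first and second moments of the operating inputs. None of these choices comes from the patent.

```python
def learning_error(v, a, b, spread=1.0):
    """Toy E(v): Tr(I J(v)^-1) when r(x; v) has mean v and standard deviation `spread`.

    a and b are the mean and mean square of the operating inputs.
    """
    return 1.0 + ((v - a) ** 2 + b - a ** 2) / spread ** 2

def steepest_descent(objective, v0, step=0.1, iters=200, h=1e-5):
    """Minimize a scalar objective by repeated steps against its numerical gradient."""
    v = v0
    for _ in range(iters):
        grad = (objective(v + h) - objective(v - h)) / (2.0 * h)   # central difference
        v -= step * grad
    return v
```

For this toy objective the minimizer is v = a, so the training inputs end up centered on the operating inputs; with a vector-valued v the same loop would update each component along the negative gradient.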

[0030] The following explains why the scheme described above is effective in reducing the learning error.

[0031] Assume that the true conditional probability p(y|x) is contained in the chosen model {p(y|x; θ)}, so that p(y|x) = p(y|x; θ₀), where θ₀ is the true parameter. Let r(x) be the density function of the input distribution used to generate the learning data, and let <θ> be the parameter obtained by maximum likelihood estimation from N learning data created with it. The learning loss R is then measured as in the following equation.

[0032]

[Equation 9]

[0033] This way of measuring the loss is based on the Kullback-Leibler divergence and is widely used in statistics. A Taylor expansion of R around θ₀ gives the following equation.

[0034]

[Equation 10]

[0035] In the above equation, the first-derivative term (the second term) vanishes on integration over y. The first term does not depend on the estimate <θ>. The quality of learning can therefore be judged from the third term alone. Let E₀ be the expectation of this term taken over the realizations of the learning data; a simple calculation gives E₀ as in the following equation.

[0036]

[Equation 11]

[0037] As is well known in the field of statistical estimation, the following holds.

[0038]

[Equation 12]

[0039] Defining I_ab and J_ab as follows,

[0040]

[Equation 13]

[0041] the following approximation can be derived from Equation 12.

[0042]

[Equation 14]
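
The images for Equations 9 to 14 are not reproduced in this text. The following is a sketch of the standard asymptotic argument that these paragraphs appear to describe, not the patent's own typesetting. The definitions of I and J as Fisher information matrices averaged over q(x) and r(x), the hat notation for the estimate <θ>, and the factor 1/(2N) are assumptions based on the usual asymptotics of maximum likelihood estimation.

```latex
\begin{align}
I_{ab} &= \iint \frac{\partial \log p(y|x;\theta_0)}{\partial \theta_a}
                \frac{\partial \log p(y|x;\theta_0)}{\partial \theta_b}\,
                p(y|x;\theta_0)\, q(x)\, dy\, dx, \\
J_{ab} &= \iint \frac{\partial \log p(y|x;\theta_0)}{\partial \theta_a}
                \frac{\partial \log p(y|x;\theta_0)}{\partial \theta_b}\,
                p(y|x;\theta_0)\, r(x)\, dy\, dx, \\
\mathrm{E}\!\left[(\hat\theta - \theta_0)(\hat\theta - \theta_0)^{\mathsf T}\right]
       &\approx \frac{1}{N}\, J^{-1}, \\
E_0 &= \frac{1}{2}\,\mathrm{E}\!\left[(\hat\theta - \theta_0)^{\mathsf T} I\, (\hat\theta - \theta_0)\right]
     \approx \frac{1}{2N}\, \mathrm{Tr}\!\left[I J^{-1}\right].
\end{align}
```

Under this reading, the estimate depends on the training input density r only through J, which is why choosing r to make the trace small reduces the expected loss.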

[0043] The learning error estimate used by the update learning data creation unit 4b of the present invention is this value E₀ computed with the estimates of the respective matrices. Reducing this quantity therefore reduces the learning error.

[0044] Next, the operation of the learning machine configured as above is described. The learning machine of this embodiment is used to estimate an unknown system that can return a sample drawn according to p(y|x) for any input x supplied by the learning machine. In system identification problems, for example, the true system can often return an output for an arbitrary input. The learning machine of this embodiment can be used to estimate such systems.

[0045] Estimation (learning) of the unknown system is performed by the conditional probability estimation unit 2. In this embodiment, the learning data creation unit 4 is provided to generate learning data for the conditional probability estimation unit 2. To supply learning data to the conditional probability estimation unit 2, the learning data creation unit 4 generates input data according to a probabilistic rule and creates learning data using the responses of the true unknown system to those inputs as teacher data.

[0046] First, the initial learning data creation unit 4a of the learning data creation unit 4 creates initial learning data {x_ν, y_ν} according to a predetermined input data distribution. Once the initial learning data {x_ν, y_ν} have been created, the initial learning unit 2a of the conditional probability estimation unit 2 uses them to perform initial learning of the predetermined parameter θ and produces an initial learning result <θ>.

[0047] After initial learning has been performed in this way, the update learning data creation unit 4b of the learning data creation unit 4 uses an estimate <q(x)> of the probability density function q(x) of the actual input data arising during operation of the learning machine, together with the estimate <θ> of the parameter θ obtained by the initial learning unit 2a of the conditional probability estimation unit 2, to compute as the optimal value the value of the parameter v that minimizes the estimate E(v) of the learning error incurred when the input data x_ν of the learning data are generated according to the family of probability density functions {r(x; v)}. It then generates the input data x_ν of the update learning data from the probability density function r(x; v) with the optimized parameter v and creates the update learning data {x_ν, y_ν}. Once the update learning data {x_ν, y_ν} have been created, the parameter updating unit 2b of the conditional probability estimation unit 2 uses them to perform update learning of the parameter θ and produces an update learning result <θ>. Because the update learning data {x_ν, y_ν} are optimal learning data effective for learning, the parameter updating unit 2b can learn efficiently and accurately and quickly obtain a highly accurate learning result <θ> (one with a small learning error).

[0048] After the parameter θ has been learned in the conditional probability estimation unit 2 in this way and the update learning result <θ> has been produced, when an input vector x is supplied from the input unit 1, the output unit 3 can use the parameter (update learning result) <θ> learned by the conditional probability estimation unit 2 to compute the output y for the given input vector x as a sample drawn according to the conditional probability p(y|x; <θ>).
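
The output step amounts to drawing a sample from the learned conditional distribution. As a minimal sketch, assume the illustrative scalar model y ~ N(θ·x, σ²); the model, the noise level `sigma`, and the function name are hypothetical and not taken from the patent.

```python
import random

def sample_output(x, theta_hat, sigma, rng):
    """Draw y from the assumed learned model p(y | x; theta_hat) = N(theta_hat * x, sigma^2)."""
    return rng.gauss(theta_hat * x, sigma)
```

Repeated calls for the same x return different values scattered around `theta_hat * x`, which reflects that the output is a sample from the estimated distribution rather than a single deterministic prediction.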

[0049] As described above, when a problem in which the unknown system supplies teacher data for input data prepared by the learning machine, such as a system identification problem, is learned by a statistical learning machine such as a neural network, this embodiment makes it possible to design how the learning data should best be prepared, and thus to improve the estimation accuracy after learning.

[0050]

[Effects of the Invention] As described above, according to the invention of claim 1, the learning data used by the conditional probability estimation means are created from input data generated according to a probabilistic rule and the responses of the unknown system to those input data, so that optimal learning data effective for learning can be generated and learning can be performed efficiently and accurately.

[0051] According to the inventions of claims 2 to 6, the conditional probability estimation means comprises initial learning means for learning initial parameters and parameter updating means for updating the parameters with new learning data once the initial parameters have been determined, and the learning data creation means comprises initial learning data creation means for creating the initial learning data used by the initial learning means of the conditional probability estimation means, and update learning data creation means for creating, after the initial learning has finished, the update learning data used by the parameter updating means. Optimal learning data effective for learning can therefore be generated, and learning can be performed efficiently and accurately.

[Brief description of drawings]

[FIG. 1] A block diagram of one embodiment of a learning machine according to the present invention.

[FIG. 2] A diagram showing a three-layer perceptron as a statistical learning machine.

[Explanation of symbols]

1 Input unit
2 Conditional probability estimation unit
3 Output unit
4 Learning data creation unit
2a Initial learning unit
2b Parameter updating unit
4a Initial learning data creation unit
4b Update learning data creation unit
5 Learning error minimization circuit

Claims

[Claims]

1. A learning machine for estimating an unknown system that generates an output vector y according to a true conditional probability p(y|x) for an input vector x from an input vector space X, the learning machine comprising: input means for receiving an input vector x from the input vector space X; conditional probability estimation means for learning a predetermined parameter θ using learning data consisting of pairs of learning input data and teacher data, the teacher data being the responses of the unknown system to the learning input data, thereby estimating the true conditional probability p(y|x) and hence the unknown system; output means for computing, using the parameter <θ> learned by the conditional probability estimation means, the output y for an unlearned input vector x supplied from the input means as a sample drawn according to the conditional probability p(y|x; <θ>); and learning data creation means for creating the learning data used by the conditional probability estimation means from input data generated according to a probabilistic rule and the responses of the unknown system to the input data.

2. The learning machine according to claim 1, wherein the conditional probability estimation means comprises initial learning means for learning initial parameters and parameter updating means for updating the parameters with new learning data after the initial parameters have been determined, and wherein the learning data creation means comprises initial learning data creation means for creating the initial learning data used by the initial learning means of the conditional probability estimation means, and update learning data creation means for creating, after the initial learning is completed in the initial learning data creation means, the update learning data used by the parameter updating means.

3. The learning machine according to claim 2, wherein the initial learning data creation means generates the input data of the initial learning data from an estimate of the distribution of the inputs that arise when the learning machine is in use.

4. The learning machine according to claim 2, wherein the initial learning data creation means generates the input data of the initial learning data according to a fixed probability distribution on the input space prepared in advance by the learning machine.

5. The learning machine according to claim 2, wherein the update learning data creation means holds a parametric family of probability density functions {r(x; v)} on the input vector space; computes, as an optimal value, the value of the parameter v that reduces the estimate E(v) of the learning error incurred when the input data of the learning data are generated according to {r(x; v)}, using an estimate <q(x)> of the probability density function of the distribution of the input data arising during operation and the estimate <θ> of the parameter θ obtained by the initial learning means; and generates the input data of the update learning data from the probability density function r(x; v) with the optimized parameter v.

6. The learning machine according to claim 5, wherein the optimal parameter v is obtained by successive updates using the steepest descent method so as to minimize the estimate E(v) of the learning error.