JPH0736184B2

JPH0736184B2 - Learning machine

Info

Publication number: JPH0736184B2
Application number: JP1043730A
Authority: JP
Inventors: 茂生阪上; 敏行香田; 泰治メ木; 英行高木; 隼人戸川
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1989-02-23
Filing date: 1989-02-23
Publication date: 1995-04-19
Anticipated expiration: 2010-04-19
Also published as: JPH02222061A

Description

Detailed Description of the Invention

Field of Industrial Application

The present invention relates to a learning machine.

Description of the Related Art

A conventional learning machine is described, for example, in D. E. Rumelhart et al., "Learning representations by back-propagating errors", Nature, Vol. 323, No. 9 (1986).

FIG. 9 is a block diagram of this conventional learning machine. Reference numerals 51 and 52 denote input terminals; 53, 54, 55, 56, 57 and 58 denote variable weight multipliers; 59, 60 and 61 denote adders having a saturating input/output characteristic; 62 denotes an output terminal; 63 a teacher signal generating unit; 64 an error calculating unit; 65 a search direction determining unit; and 66 a weight changing unit. As FIG. 9 shows, the conventional learning machine connects adders having a saturating input/output characteristic in layers, with variable weight multipliers connected between the adders of the respective layers.

FIG. 10 shows the input/output characteristic of the adders 59, 60 and 61 in the conventional learning machine configured as above. As FIG. 10 shows, the characteristic saturates. That is, the input/output characteristic of an adder can be expressed as

    output[j] = func(Σ_i input[i])   ... (1)

where output[j] is the output signal of the j-th adder, input[i] is the i-th input signal fed to the j-th adder, and func() is a function having a saturation characteristic, for example the sigmoid function func(x) = 2/(1 + exp(−x)) − 1.

As FIG. 9 shows, each signal input to an adder is the output signal of an adder in the preceding stage multiplied by a weight. That is,

    input[i] = w[i,j] * x[i]   ... (2)

where x[i] is the output signal of the i-th adder in the preceding stage, and w[i,j] is the weight by which a variable weight multiplier multiplies that output signal when it is fed to the j-th adder.
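Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: it assumes equation (1) sums the weighted inputs before applying func, and the `forward` helper assumes a 2-2-1 layout (two inputs, two hidden adders, one output adder), which matches the six multipliers and three adders of FIG. 9 but is not spelled out in the text. All function names are hypothetical.

```python
import math


def func(x):
    """Saturating characteristic from the text: 2 / (1 + exp(-x)) - 1, range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def adder_output(weights, xs):
    """One adder: sum the weighted inputs, then apply func."""
    inputs = [w * x for w, x in zip(weights, xs)]  # equation (2): input[i] = w[i,j] * x[i]
    return func(sum(inputs))                       # assumed form of equation (1)


def forward(x, w_hidden, w_out):
    """Assumed 2-2-1 layout: two hidden adders feeding one output adder."""
    hidden = [adder_output(row, x) for row in w_hidden]
    return adder_output(w_out, hidden)
```

Calling `forward([1.0, -1.0], [[0.5, -0.3], [0.8, 0.2]], [1.0, -1.0])` returns a value strictly between -1 and 1. The characteristic used here is algebraically the same as tanh(x/2), which is why it saturates symmetrically at ±1.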

In the conventional learning machine, the teacher signal generating unit 63 generates, in accordance with the signals input from the input terminals 51 and 52, the desired output signal for those input signals as a teacher signal, and the error calculating unit 64 calculates an error E from the difference between the actual output signal at the output terminal 62 and the teacher signal. The error E is expressed as

    E = (1/2) Σ_p Σ_k (t[k] − z[k])² = E(w)   ... (3)

where z[k] is the output signal of the k-th adder in the output layer, t[k] is the teacher signal for z[k], Σ_p is the sum over the teacher-signal patterns, Σ_k is the sum over the adders of the output layer, and w is the vector representation of the weights of the variable weight multipliers.

The search direction determining unit 65 finds the direction in which to search for the minimum point of the error in the weight space, in which the weights are represented as a vector. The search direction is the steepest descent direction, given by

    g = −∂E/∂w   ... (4)

Based on the search direction thus found, the weight changing unit 66 computes the amount by which to change the weight of each of the variable weight multipliers 53, 54, 55, 56, 57 and 58, and changes the weights. The amount of change is obtained by the steepest descent method and the acceleration method:

    Δw = ε * g + α * Δw′   ... (5)

where Δw is the amount of change of w, ε is a positive constant called the learning parameter, α is a positive constant called the acceleration parameter, and Δw′ is the value of Δw in the previous weight change. By repeatedly computing the weight change in this way, the error is made progressively smaller; once the error is sufficiently small, the output signal is regarded as sufficiently close to the desired value and learning ends.
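The update rule of equation (5) can be illustrated with a short Python sketch. Here w denotes the weight vector and g the steepest descent direction (the negative gradient of the error); the quadratic `toy_error` surface, the parameter values and all function names are assumptions made for this example, not values taken from the patent.

```python
def momentum_step(w, g, prev_delta, eps, alpha):
    """Equation (5): delta = eps * g + alpha * prev_delta; returns (new_w, delta)."""
    delta = [eps * gi + alpha * di for gi, di in zip(g, prev_delta)]
    new_w = [wi + di for wi, di in zip(w, delta)]
    return new_w, delta


def toy_error(w):
    """Hypothetical error surface standing in for equation (3)."""
    return w[0] ** 2 + 10.0 * w[1] ** 2


def toy_descent_direction(w):
    """Steepest descent direction g = -dE/dw for toy_error."""
    return [-2.0 * w[0], -20.0 * w[1]]


def train(w, eps=0.02, alpha=0.5, tol=1e-8, max_iters=10_000):
    """Repeat the update until the error is small; eps and alpha stay fixed throughout."""
    prev_delta = [0.0] * len(w)
    for it in range(max_iters):
        if toy_error(w) < tol:
            return w, it
        g = toy_descent_direction(w)
        w, prev_delta = momentum_step(w, g, prev_delta, eps, alpha)
    return w, max_iters
```

The point of the sketch is that `eps` and `alpha` are hand-picked constants: a different error surface would need different values, which is the drawback the following section addresses.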

Problems to Be Solved by the Invention

In the configuration described above, however, the learning parameter ε and the acceleration parameter α are determined empirically. They are therefore not necessarily optimal, and learning takes a long time.

In view of this, an object of the present invention is to provide a learning machine that requires only a short learning time.

Means for Solving the Problems

The present invention is a learning machine comprising: a search direction determining unit that, in a virtual weight space whose coordinate axes are the individual weights (w₁, w₂, ...) multiplied by the variable weight multipliers and in which the weights are represented as a vector (w = (w₁, w₂, ...)), determines the direction in which to search for the minimum point of the error, based on the steepest descent method or the like; a learning parameter changing unit that, along that search direction, increases the learning parameter (ε) if the error for the weights obtained with ε set to ε₀ (w′ = w + ε₀ * g, where g is the search direction) is smaller than the error for the weights w, and decreases the learning parameter if that error is larger than the error for w, thereby finding the learning parameter and the error in the vicinity of the point where the error is minimal; a parabolic approximation unit that, in the vicinity of the minimum point of the error, approximates with a paraboloid the error surface representing how the error changes with the weights, and finds the error at its vertex; a weight changing unit that takes the learning parameter at the point where these errors are smallest as the optimum value of the learning parameter and changes the weight of each variable weight multiplier accordingly; a learning parameter initializing unit that reflects the optimum value of the learning parameter from the previous minimum point search in the initial value of the learning parameter for the current minimum point search; and an error minimizing circuit that uses the search direction determining unit, the learning parameter initializing unit, the learning parameter changing unit, the parabolic approximation unit and the weight changing unit repeatedly to make the error sufficiently small.

Operation

With the configuration described above, the search direction determining unit first determines, in the weight space, the direction in which to search for the point of minimum error. The learning parameter changing unit then finds learning parameter values near the minimum point of the error along that direction, and the parabolic approximation unit approximates the neighborhood of the minimum point with a paraboloid and finds the error at its vertex. The point with the smallest of these errors is taken as the minimum-error point along the search direction, and the weight changing unit changes the weight of each variable weight multiplier using the learning parameter at that point.

Next, the search direction determining unit finds a new search direction at that minimum-error point, and the learning parameter initializing unit sets the initial value of the learning parameter so that it reflects the optimum value from the previous minimum point search. The minimum point of the error along the current search direction is then found and the weights are changed in the same way as in the previous search, and the error minimizing circuit repeats the minimum point search until the error is sufficiently small.

Thus, once the search direction is determined, the optimum learning parameter along that direction is found automatically, and because the optimum value from the previous search is reflected in the initial value for the current search, learning proceeds by reducing the error while the optimum learning parameter is always found efficiently. The error therefore becomes sufficiently small, and learning can end, in a short learning time.

Embodiments

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram of a learning machine according to a first embodiment of the present invention. In FIG. 1, reference numerals 1 and 2 denote input terminals; 3, 4, 5, 6, 7 and 8 denote variable weight multipliers; 9, 10 and 11 denote adders having a saturating input/output characteristic; 12 denotes an output terminal; 13 an output signal calculating circuit; 14 a teacher signal generating unit; 15 an error calculating unit; 16 a search direction determining unit; 17 a learning parameter initializing unit; 18 a learning parameter changing unit; 19 a parabolic approximation unit; 20 a weight changing unit; and 21 an error minimizing circuit.

FIG. 2 shows, in a virtual weight space whose coordinate axes are the individual weights (w[i₁,j₁], w[i₂,j₂]) multiplied by the variable weight multipliers, the lines connecting points at which the error between the teacher signal and the output signal is equal as the weights are varied. If the value of the error as the weights vary is plotted along an axis perpendicular to the page of FIG. 2, the error forms a curved surface (the error surface) in that three-dimensional space. FIG. 2 is called a contour map of the error surface in the sense that it is a graph connecting points of the error surface at equal height above the page. In FIG. 2, w[i₁,j₁] and w[i₂,j₂] are the weights multiplied by any two of the variable weight multipliers 3, 4, 5, 6, 7 and 8. In the virtual weight space of FIG. 2, the weights are represented as a vector whose elements are the individual weight values, w = (w[i₁,j₁], w[i₂,j₂], ...). The goal of learning in the learning machine of this embodiment is to start from the initial values of the weights of the variable weight multipliers, represented by the starting point on the error surface in FIG. 2, and to repeat minimum point searches over the error surface in directions of decreasing error until the global minimum of the error is reached.

The variable weight multipliers 3, 4, 5, 6, 7 and 8 multiply their input signals by weights and output the results. The adders 9, 10 and 11 have the saturating input/output characteristic of equation (1). The output signal calculating circuit 13 obtains the output signal by multiplying and adding the input signals from the input terminals 1 and 2. The teacher signal generating unit 14 generates the desired output signal for the input signals as a teacher signal, and the error calculating unit 15 obtains the starting value of the error from the actual output signal at the terminal 12 and the teacher signal in accordance with equation (3). The search direction determining unit 16 determines the direction in which to search for the minimum point of the error in the weight space in which the weights of the variable weight multipliers are represented as a vector. The search direction is the steepest descent direction at the starting point, that is, the direction g given by equation (4).

FIG. 3 illustrates the operation of the error minimizing circuit in the first minimum point search of this embodiment. The starting point and P1 in FIG. 3 coincide with the starting point and P1 in FIG. 2; FIG. 3 shows the error curve in the cross-section of the error surface of FIG. 2 taken along the straight line through the starting point and P1. The learning parameter initializing unit 17 sets the initial value of the learning parameter to ε₀ in the first minimum point search and, in the second and subsequent searches, to the larger of ε₀ and the learning parameter value used in the previous weight change, and obtains the error for that initial value. Here ε₀ is a positive constant. In the first minimum point search shown in FIG. 3, the initial value of the learning parameter is ε₀.

When the error for the initial value of the learning parameter is smaller than the starting value of the error, the learning parameter changing unit 18 repeatedly doubles the value of the learning parameter ε and obtains the error, until the error begins to increase. That is, with

    ε_k = ε_(k−1) * 2   ... (6)

the error is obtained as

    E_k = E(w + ε_k * g)   ... (7)

In the first minimum point search shown in FIG. 3, E₀ < E_org, so the value of ε is doubled repeatedly; because E_org > E₀ > E₁ > E₂ < E₃, the learning parameter changing unit 18 computes errors up to E₃. When, on the other hand, the error for the initial value of the learning parameter is larger than the starting value of the error, the learning parameter changing unit 18 repeatedly halves the value of the learning parameter and obtains the error, until the error becomes smaller than the starting value of the error. That is, with

    ε_k = ε_(k−1) / 2   ... (8)

the error is obtained from equation (7).

The parabolic approximation unit 19 approximates the neighborhood of the minimum-error point with a paraboloid. In FIG. 3, in order to obtain errors at points where the weights change at equal intervals in the weight space, it first obtains the error

    E_2.5 = E(w + ε_2.5 * g)

at

    ε_2.5 = (ε₂ + ε₃) / 2

In FIG. 3, E_2.5 < E₂ < E₃ < E₁, so the error surface near the minimum-error point is approximated by the paraboloid passing through the three points E₂, E_2.5 and E₃, and the error at its vertex is obtained. That is,

    E_v = E(w + ε_v * g)   ... (9)

where ε_v is the value of the learning parameter at the vertex of that paraboloid.

The weight changing unit 20 changes the weight of each variable weight multiplier using the learning parameter corresponding to the smallest of the errors obtained as above. In FIG. 3, the learning parameter ε₃ corresponding to E₃, the smallest error among E₁, E₂, E_2.5, E₃ and E_v, is taken as the optimum value of the learning parameter along this search direction, and ε₃ is used to change the weight of each variable weight multiplier.
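The doubling, halving and parabolic steps of one minimum point search can be sketched as a one-dimensional search over the learning parameter. This is a hedged illustration rather than the patented circuit: `error_at(eps)` stands for E(w + ε * g), the rule for choosing which three equally spaced points feed the parabola is an assumption, and `max_steps` is a safety limit added for the example. All names are hypothetical.

```python
def parabola_vertex(x_mid, h, e_left, e_mid, e_right):
    """Vertex of the parabola through (x_mid-h, e_left), (x_mid, e_mid), (x_mid+h, e_right)."""
    curvature = e_left - 2.0 * e_mid + e_right
    if curvature <= 0.0:                 # not convex: no interior minimum to estimate
        return x_mid
    return x_mid + h * (e_left - e_right) / (2.0 * curvature)


def line_search(error_at, eps_init, e_start, max_steps=60):
    """Return (best_eps, best_error); e_start is the error at eps = 0."""
    tried = {}                           # eps -> error for every point evaluated

    def E(eps):
        if eps not in tried:
            tried[eps] = error_at(eps)
        return tried[eps]

    eps, prev = eps_init, 0.0
    if E(eps) < e_start:
        for _ in range(max_steps):       # equation (6): double while the error falls
            if E(2.0 * eps) >= E(eps):
                break
            prev, eps = eps, 2.0 * eps
    else:
        for _ in range(max_steps):       # equation (8): halve until below e_start
            if E(eps) < e_start:
                break
            eps /= 2.0

    if prev == 0.0:
        # 0, eps, 2*eps are already equally spaced (the FIG. 4 situation).
        centre, h = eps, eps
        left, mid, right = e_start, E(eps), E(2.0 * eps)
    else:
        # eps/2, eps, 2*eps are not equally spaced: add the midpoint (FIG. 3).
        h = eps / 2.0
        half = eps + h
        if E(half) < E(eps):
            centre, left, mid, right = half, E(eps), E(half), E(2.0 * eps)
        else:
            centre, left, mid, right = eps, E(prev), E(eps), E(half)

    eps_v = parabola_vertex(centre, h, left, mid, right)
    if eps_v > 0.0:
        E(eps_v)                         # equation (9): error at the vertex
    best = min(tried, key=tried.get)     # smallest error among all points tried
    return best, tried[best]
```

For an error curve that is exactly quadratic in ε, the vertex step lands on the true minimum; for example, with `error_at = lambda e: (e - 2.7) ** 2 + 1.0` the search started from 0.5, 2.0 or 8.0 returns a learning parameter close to 2.7 in each case.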

For the changed weights, the output signal calculating circuit 13 calculates the output signal, and the error calculating unit 15 calculates the error given by equation (3) from the difference between the teacher signal output by the teacher signal generating unit 14 and the output signal. As in the first minimum point search, the search direction determining unit 16 sets the direction of the search for the minimum point of the error in the weight space to the steepest descent direction at P1. FIG. 4 illustrates the operation of the error minimizing circuit in the second minimum point search of this embodiment. P1 and P2 in FIG. 4 coincide with P1 and P2 in FIG. 2; FIG. 4 shows the error curve in the cross-section of the error surface of FIG. 2 taken along the straight line through P1 and P2.

The learning parameter initializing unit 17 sets the initial value of the learning parameter to ε₀ in the first minimum point search and, in the second and subsequent searches, to the larger of ε₀ and the learning parameter value used in the previous weight change, and obtains the error for that initial value. That is, when the previous optimum value of the learning parameter is larger than ε₀, the previous optimum value becomes the initial value, and when it is smaller than ε₀, ε₀ becomes the initial value. Using the previous optimum value as the initial value for the current search allows the learning parameter best suited to the current search to be set efficiently. Falling back to ε₀ prevents the search from becoming unable to leave a local minimum of the error surface, which could happen if a small previous optimum value were used as the initial value for the current search. In FIG. 4, the previous optimum value ε₃ is larger than ε₀, so ε₃ is used as the initial value of the learning parameter.

In FIG. 4, the error E₃ for the initial value ε₃ is smaller than the starting value of the error E₀, so the learning parameter changing unit 18 repeatedly doubles ε and obtains the error until the error begins to increase. In FIG. 4, E₀ > E₃ < E₄, so errors are computed up to E₄. The parabolic approximation unit 19 approximates the neighborhood of the minimum-error point with a paraboloid. In FIG. 4, the weights in the weight space change at equal intervals at the learning parameter values 0, ε₃ and ε₄, so the error surface is approximated by a paraboloid from the errors at these points, and the error at its vertex is obtained from equation (9), where ε_v is the learning parameter value at that vertex. The weight changing unit 20 takes the learning parameter ε_v, which corresponds to E_v, the smallest of the errors E₃, E₄ and E_v obtained as above, as the optimum value of the learning parameter for the second search and uses it to change the weight of each variable weight multiplier.

Thereafter, the error minimizing circuit 21 uses the teacher signal generating unit 14, the error calculating unit 15, the search direction determining unit 16, the learning parameter initializing unit 17, the learning parameter changing unit 18, the parabolic approximation unit 19 and the weight changing unit 20 repeatedly to reduce the error. This is repeated until the error is sufficiently small, at which point learning ends.
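The expression for the vertex that accompanies equation (9) did not survive in this text. As a reconstruction from elementary algebra, not a quotation of the patent, the vertex of a parabola through three equally spaced points can be written as follows. The symbols x_0, h, E_a, E_b and E_c are introduced here for the derivation only.

```latex
% Parabola through (x_0 - h, E_a), (x_0, E_b), (x_0 + h, E_c):
\[
E(x) = E_b + \frac{E_c - E_a}{2h}\,(x - x_0)
           + \frac{E_a - 2E_b + E_c}{2h^2}\,(x - x_0)^2
\]
% Setting dE/dx = 0 gives the vertex (a minimum when E_a - 2E_b + E_c > 0):
\[
x_v = x_0 + \frac{h\,(E_a - E_c)}{2\,(E_a - 2E_b + E_c)}
\]
% FIG. 4 case: points 0, \varepsilon_3, \varepsilon_4 = 2\varepsilon_3,
% so x_0 = h = \varepsilon_3 and (E_a, E_b, E_c) = (E_0, E_3, E_4):
\[
\varepsilon_v = \varepsilon_3
              + \frac{\varepsilon_3\,(E_0 - E_4)}{2\,(E_0 - 2E_3 + E_4)}
\]
```

Substituting x = x_0 − h and x = x_0 + h into the first expression returns E_a and E_c, which confirms that the parabola passes through all three points. The denominator is positive whenever the middle error is below both outer errors, which is the situation E₀ > E₃ < E₄ described for FIG. 4.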

As described above, according to this embodiment, providing the learning parameter initializing unit 17, the learning parameter changing unit 18 and the parabolic approximation unit 19 allows the weights of the variable weight multipliers to be changed while the optimum value of the learning parameter along the minimum point search direction is always found efficiently. The error can therefore be minimized efficiently and the learning time shortened.
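The way the previous optimum is carried into the next search can be sketched as an outer loop. This is a simplified illustration under stated assumptions: the error surface is a hypothetical quadratic, `coarse_search` only doubles or halves the learning parameter (the parabolic step is omitted for brevity), and all names and default values are invented for the example.

```python
def initial_eps(prev_optimum, eps0):
    """eps0 on the first search; afterwards the larger of eps0 and the previous optimum."""
    if prev_optimum is None:
        return eps0
    return max(prev_optimum, eps0)


def toy_error(w):
    """Hypothetical error surface standing in for equation (3)."""
    return w[0] ** 2 + 10.0 * w[1] ** 2


def toy_descent_direction(w):
    """Steepest descent direction (negative gradient) of toy_error."""
    return [-2.0 * w[0], -20.0 * w[1]]


def coarse_search(error_at, eps, e_start, max_steps=60):
    """Simplified stand-in for the learning parameter changing unit."""
    err = error_at(eps)
    if err < e_start:
        for _ in range(max_steps):       # double while the error keeps falling
            nxt = error_at(2.0 * eps)
            if nxt >= err:
                break
            eps, err = 2.0 * eps, nxt
    else:
        for _ in range(max_steps):       # halve until the error beats the start
            if err < e_start:
                break
            eps /= 2.0
            err = error_at(eps)
    return eps


def minimize(w, eps0=0.01, tol=1e-8, max_searches=1000):
    """Repeat minimum point searches; returns (weights, initial eps of each search)."""
    prev_optimum = None
    history = []
    for _ in range(max_searches):
        e_start = toy_error(w)
        if e_start < tol:
            break
        g = toy_descent_direction(w)
        eps = initial_eps(prev_optimum, eps0)
        history.append(eps)
        prev_optimum = coarse_search(
            lambda e: toy_error([wi + e * gi for wi, gi in zip(w, g)]), eps, e_start)
        w = [wi + prev_optimum * gi for wi, gi in zip(w, g)]
    return w, history
```

The `history` list makes the initialization rule visible: its first entry is `eps0`, and no later entry falls below `eps0`, however small the previous optimum was.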

In this embodiment the search direction determining unit 16 sets the direction of the minimum point search to the steepest descent direction, but the conjugate gradient direction may be used instead. The conjugate gradient direction is given by

    d = g + β * d′   ... (12)

where g is the steepest descent direction, β is a scalar coefficient computed from g and g′, d′ is the conjugate gradient direction in the previous minimum point search, and g′ is the steepest descent direction in the previous minimum point search. In this case too, the first minimum point search direction is set to the steepest descent direction. Thus, in the first embodiment, the same effect is obtained whether the search direction determining unit 16 sets the direction of the minimum point search to the steepest descent direction or to the conjugate gradient direction.
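Equation (12) can be sketched as follows. The expression for β is not legible in this text, so the sketch assumes the common Fletcher-Reeves choice β = |g|² / |g′|²; that choice, like the function name, is an assumption of this example and may differ from the patent's own definition.

```python
def conjugate_direction(g, g_prev=None, d_prev=None):
    """Equation (12): d = g + beta * d_prev, with an assumed Fletcher-Reeves beta.

    g      : steepest descent direction of the current search
    g_prev : steepest descent direction of the previous search
    d_prev : conjugate gradient direction of the previous search
    """
    if g_prev is None or d_prev is None:
        return list(g)                   # first search: plain steepest descent
    denom = sum(x * x for x in g_prev)
    if denom == 0.0:
        return list(g)                   # previous direction vanished: restart from g
    beta = sum(x * x for x in g) / denom
    return [gi + beta * di for gi, di in zip(g, d_prev)]
```

On a two-weight quadratic error surface with an exact search along each direction, directions built this way reach the minimum in two searches, whereas plain steepest descent generally zigzags.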

FIG. 5 is a block diagram of a second embodiment of the present invention. In FIG. 5, reference numeral 22 denotes a steepest descent direction determining unit, 23 a learning parameter maximum value limiting unit, and 24 an error minimizing circuit. In this embodiment, the steepest descent direction determining unit 22 sets the minimum point search direction to the steepest descent direction given by equation (4). As in the first embodiment, the learning parameter initializing unit 17 sets the initial value of the learning parameter to ε₀ in the first minimum point search and, in the second and subsequent searches, to the larger of ε₀ and the optimum value of the learning parameter in the previous minimum point search, and obtains the error for that initial value. When the error for the initial value of the learning parameter is smaller than the starting value of the error, the learning parameter changing unit 18 repeatedly doubles ε and obtains the error until the error begins to increase; when the error for the initial value is larger than the starting value of the error, it repeatedly halves ε and obtains the error until the error becomes smaller than the starting value.

In this embodiment, however, if the error has not begun to increase even when the value of ε exceeds a suitable positive constant ε_max, the weight of each variable weight multiplier is changed using the learning parameter value at that time. If the error begins to increase while ε does not exceed ε_max, the parabolic approximation unit 19 and the weight changing unit 20 operate as in the first embodiment. This embodiment differs from the first embodiment in that, in the minimum point search, the search direction is fixed to the steepest descent direction and the learning parameter is varied only within the range in which ε does not exceed the positive constant ε_max.

FIG. 6 illustrates the effect of the second embodiment. With the first embodiment, in FIG. 6, the first minimum point search finds the minimum-error point along the search direction based on the steepest descent direction at the starting point, so the first search ends at P1′ and the second at P2′, and the search ends up trapped in a local minimum. With this embodiment, the learning parameter is varied only within the range in which ε does not exceed ε_max, so the first minimum point search ends at the point P1 corresponding to the maximum learning parameter value ε_max. The second search determines its direction from the steepest descent direction at P1 and ends at the point P2 corresponding to ε_max. Similarly, the third search ends at P3 and the fourth at P4, and the global minimum is eventually reached. Thus, in this embodiment, limiting the maximum value of the learning parameter prevents the search from falling into a local minimum.
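The effect of the ε_max limit on the doubling step can be sketched as below. The function covers only the doubling branch and assumes `eps_init` is positive; `error_at(eps)` again stands for the error at w + ε * g, and the name `capped_doubling` and the returned flag are inventions of this example.

```python
def capped_doubling(error_at, eps_init, eps_max):
    """Double eps while the error falls, but stop once eps exceeds eps_max.

    Returns (eps, hit_cap). hit_cap is True when the error had not turned
    upward by the time eps passed eps_max; the weights are then changed
    with that eps directly, without the parabolic step.
    """
    eps, err = eps_init, error_at(eps_init)
    while True:
        if eps > eps_max:
            return eps, True             # cap exceeded while the error was still falling
        nxt = error_at(2.0 * eps)
        if nxt >= err:
            return eps, False            # error turned upward within the cap
        eps, err = 2.0 * eps, nxt
```

With `error_at = lambda e: (e - 100.0) ** 2`, a start of 1.0 and a cap of 5.0, the loop stops at 8.0 with the flag set, long before the distant minimum at 100; this mirrors how the second embodiment takes short steps P1, P2, P3, P4 instead of jumping across the error surface.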

FIG. 7 is a block diagram of the third embodiment of the present invention. In FIG. 7, reference numeral 25 denotes a search direction determining unit, 26 a steepest descent direction determining unit, and 27 an error minimizing circuit. In this embodiment, the search direction determining unit 25 sets the minimum-point search direction of the error to the steepest descent direction given by equation (4) in the first search, and to the conjugate gradient direction given by equation (12) in the second and subsequent searches. As in the first and second embodiments, the learning parameter initializing unit 17 sets the initial value of the learning parameter to ε₀ in the first minimum-point search and, in the second and subsequent searches, to the larger of ε₀ and the optimum value of the learning parameter found in the previous minimum-point search, and obtains the error for that initial value. In the learning parameter changing unit 18, when the error for the initial value of the learning parameter is smaller than the initial value of the error, the operation of doubling ε and obtaining the error is repeated until the error turns to increase; when the error for the initial value of the learning parameter is larger than the initial value of the error, the operation of halving ε and obtaining the error is repeated until the error becomes smaller than the initial value of the error. In this embodiment, however, when the error does not become smaller than the initial value of the error even though ε has been made smaller than an appropriate positive constant ε_min, the minimum-point search direction is switched from the conjugate gradient direction to the steepest descent direction. The operations of the parabolic approximation unit 19 and the weight changing unit 20 are the same as in the first embodiment. The present embodiment differs from the first embodiment in that, in the second and subsequent minimum-point searches of the error, the search direction is set to the conjugate gradient direction and the learning parameter is varied only within the range where the value of ε does not become smaller than the positive constant ε_min.

FIG. 8 is an explanatory diagram of the effect of the third embodiment of the present invention. According to the first embodiment, in FIG. 8, the first minimum-point search finds the point of smallest error along the search direction given by the steepest descent direction at the starting point and ends at P1; the second minimum-point search takes the conjugate gradient direction at P1 as its search direction and finds the point of smallest error in that direction, so it ends at P2′. According to the present embodiment, because the learning parameter is varied only within the range where the value of ε does not become smaller than ε_min, if in the second minimum-point search the error does not become smaller than the initial value of the error even when ε is made smaller than ε_min, the search along the conjugate gradient direction is abandoned, the search direction is switched to the steepest descent direction, and the second minimum-point search ends at P2. As described above, this embodiment sets a lower limit on the value of the learning parameter for the conjugate gradient direction and switches the search direction to the steepest descent direction when the parameter falls below that limit, so that the global minimum can be reached efficiently.
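The fallback from the conjugate gradient direction to the steepest descent direction can be sketched as follows. This is an illustrative sketch under stated assumptions: equation (12) is not reproduced in this passage, so the Fletcher-Reeves formula is used as a stand-in for the conjugate gradient direction, and the names `conjugate_direction`, `halve_until_improved` and `choose_direction_and_step` are hypothetical. Only the halving branch is shown; the doubling branch and the parabolic approximation are omitted.

```python
import numpy as np

def conjugate_direction(grad, prev_grad, prev_dir):
    """Fletcher-Reeves direction (an assumed stand-in for equation (12))."""
    beta = np.dot(grad, grad) / np.dot(prev_grad, prev_grad)
    return -grad + beta * prev_dir

def halve_until_improved(error_along, eps, eps_min):
    """Halve eps until the error falls below its initial value.

    Returns None if eps drops below eps_min before the error improves.
    """
    e_start = error_along(0.0)
    while error_along(eps) >= e_start:
        eps *= 0.5
        if eps < eps_min:
            return None
    return eps

def choose_direction_and_step(error_fn, w, grad, prev_grad, prev_dir,
                              eps_init, eps_min):
    """Pick the search direction and a step that lowers the error."""
    steepest = -grad
    if prev_dir is None:
        direction = steepest            # first search: steepest descent
    else:
        direction = conjugate_direction(grad, prev_grad, prev_dir)
    eps = halve_until_improved(lambda t: error_fn(w + t * direction),
                               eps_init, eps_min)
    if eps is None and prev_dir is not None:
        # Conjugate gradient search failed above eps_min: switch direction.
        direction = steepest
        eps = halve_until_improved(lambda t: error_fn(w + t * direction),
                                   eps_init, eps_min)
    return direction, eps               # eps may still be None in this sketch
```

Here `eps_init` would be the larger of ε₀ and the optimum value from the previous search, as set by the learning parameter initializing unit 17.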

Effect of the Invention As described above, according to the present invention, the weights of the variable weight multipliers are changed while the optimum value of the learning parameter in the minimum-point search direction of the error is always obtained efficiently. The error can therefore be minimized efficiently and the time required for training the learning machine can be shortened, which is of great practical benefit.

[Brief description of drawings]

FIG. 1 is a block diagram of the learning machine of the first embodiment of the present invention; FIG. 2 is a contour diagram of the error surface in the weight space; FIG. 3 is an explanatory diagram of the operation of the error minimizing circuit in the first minimum-point search of the same embodiment; FIG. 4 is an explanatory diagram of the operation of the error minimizing circuit in the first minimum-point search of the same embodiment; FIG. 5 is a block diagram of the learning machine of the second embodiment of the present invention; FIG. 6 is an explanatory diagram of the effect of the same embodiment; FIG. 7 is a block diagram of the learning machine of the third embodiment of the present invention; FIG. 8 is an explanatory diagram of the effect of the same embodiment; FIG. 9 is a block diagram of the conventional learning machine; and FIG. 10 is an input/output characteristic diagram of the adder.

3, 4, 5, 6, 7, 8: variable weight multipliers; 9, 10, 11: adders; 14: teacher signal generating unit; 15: error calculating unit; 16, 25: search direction determining unit; 17: learning parameter initializing unit; 18: learning parameter changing unit; 19: parabolic approximation unit; 20: weight changing unit; 22, 26: steepest descent direction determining unit; 23: learning parameter maximum value limiting unit.

Continuation of the front page
(72) Inventor Hideyuki Takagi, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma City, Osaka Prefecture
(72) Inventor Hayato Togawa, 1-12-2 Mukogaoka, Bunkyo-ku, Tokyo
(56) References
1. IEICE Technical Report, vol. 88, No. 325, PRU88-93, "Backpropagation to minimize error and output fluctuation", Yoshimasa Kimura
2. Iwanami Course Information Science-19, "Optimization", Nishikawa, Sannomiya, Ibaraki

Claims

[Claims]

1. A learning machine comprising: an output signal calculation circuit in which multi-input, single-output adders having a saturated input/output characteristic are connected hierarchically, the adders of each layer being connected to the adders of the next layer via variable weight multipliers; a teacher signal generating unit that, each time an input signal is applied, gives a teacher signal as the desired value of the output signal of the adder in the output layer; an error calculating unit that calculates the error between the output signal and the teacher signal; a search direction determining unit that, in a virtual weight space in which each weight (w₁, w₂, ...) multiplied by the multipliers is taken as a coordinate axis and the weights are expressed as a vector (w = (w₁, w₂, ...)), determines the minimum-point search direction of the error (g = (g₁, g₂, ...)) on the basis of the steepest descent method; a learning parameter changing unit that, in the minimum-point search direction of the error, increases the learning parameter (ε) if the error for the weight obtained when the value of the learning parameter is set to ε₀ (w′ = w + ε₀ g) is smaller than the error for the weight w, decreases the learning parameter if the error for the weight w′ is larger than the error for the weight w, and thereby obtains the learning parameters and the errors in the neighborhood where the error is minimized; a parabolic approximation unit that, in the neighborhood of the minimum point of the error, approximates the error surface, which represents the change in error with respect to the change in weight, with a paraboloid and obtains the error at its vertex; a weight changing unit that takes the learning parameter at the point where those errors are smallest as the optimum value of the learning parameter and changes the weights of the variable weight multipliers; a learning parameter initializing unit that reflects the optimum value of the learning parameter in the previous minimum-point search of the error in the initial value of the learning parameter in the current minimum-point search of the error; and an error minimizing circuit that repeatedly uses the teacher signal generating unit, the error calculating unit, the search direction determining unit, the learning parameter initializing unit, the learning parameter changing unit, the parabolic approximation unit, and the weight changing unit to make the error sufficiently small.

2. The learning machine according to claim 1, further comprising: a steepest descent direction determining unit that sets the direction of the minimum-point search of the error to the steepest descent direction in the weight space in which the weights of the variable weight multipliers are expressed as a vector; and a learning parameter maximum value limiting unit that limits the maximum value of the learning parameter.

3. The learning machine according to claim 1, further comprising: a conjugate gradient direction determining unit that sets the direction of the minimum-point search of the error to the conjugate gradient direction in the weight space in which the weights of the variable weight multipliers are expressed as a vector; and a search direction changing unit that sets the steepest descent direction as the direction of the minimum-point search of the error when the error does not decrease even though the learning parameter has been made smaller than a certain value.