JP3114276B2 - Learning method of hierarchical neural network

Info

Publication number: JP3114276B2
Application number: JP03257511A
Authority: JP
Inventors: 雅理市川
Original assignee: Advantest Corp
Current assignee: Advantest Corp
Priority date: 1991-10-04
Filing date: 1991-10-04
Publication date: 2000-12-04
Anticipated expiration: 2015-12-04
Also published as: JPH05101209A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an improvement of the learning method called MRII (MADALINE Rule II) for hierarchical neural networks composed of units that use the signum function (sign function) as their output function.

[0002]

[Prior Art] Neural networks are used, for example, to classify (recognize) input signals. For example, when patterns in an input image are classified using a hierarchical neural network consisting of an input layer 11, an intermediate layer 12, and an output layer 13 as shown in FIG. 3, the number of units 14 in the input layer 11 is determined by the number of pixels of the image 15 input to the hierarchical neural network. Similarly, the number of units 16 in the output layer 13 is determined by the number of pixels of the output image, the number of classification categories, and the like. The number of units 17 in the intermediate layer 12 must be chosen appropriately according to the number and complexity of the patterns to be recognized, but no method for determining an appropriate number of units has been established.

[0003] FIG. 4 shows the neuron model used for the units 17 and 16 of the intermediate layer 12 and the output layer 13 of this hierarchical neural network. When a ±1 binary signal X (x_1, x_2, ..., x_n) is input, this neuron model multiplies the input signals by the connection weights, obtains the sum y, and outputs the binary signal q = SGN(y). The signum function SGN(y) used as the output function outputs +1 or −1 according to the sign of the real-valued y. x_0 = 1 is the threshold input.
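
The neuron model of FIG. 4 can be illustrated with a minimal Python sketch. The function names are hypothetical, and mapping y = 0 to +1 is an assumption, since the text does not say how a zero sum is treated.

```python
def sgn(y):
    """Signum output function: +1 or -1 according to the sign of y.

    Assumption: y == 0 is mapped to +1; the source does not specify this case.
    """
    return 1 if y >= 0 else -1


def unit_output(weights, x):
    """Return (y, q) for one unit.

    weights[0] is the threshold weight, paired with the constant input x0 = 1;
    weights[1:] pair with the binary inputs x1..xn (each +1 or -1).
    """
    inputs = [1] + list(x)
    y = sum(w * xi for w, xi in zip(weights, inputs))
    return y, sgn(y)


# Two binary inputs plus the threshold input.
y, q = unit_output([0.5, 1.0, -2.0], [1, -1])
print(y, q)
```

Treating the threshold as an ordinary weight on a constant input x_0 = 1 lets the same sum cover both the threshold and the signal inputs.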

[0004] The MRII method will be described with reference to FIG. 5 as a learning method for a hierarchical neural network that uses the signum function as the output function of each unit in FIG. 3, that is, as a method of determining the connection weights so that, for example, when an input image is input, an output is obtained at the output terminal corresponding to its pattern and the image can be classified. A suitable number of units 17 is prepared for the intermediate layer 12, for example as many as the number of signals prepared for learning, and the connection weights of all units 17 and 16 are initialized with random decimal values (S1). Next, the total error and the learning set presentation count are initialized to zero (S2), and one of the prepared learning sets (pairs of an input signal X used for learning and a teacher signal D) is presented to the neural network, that is, the input signal X is input to the neural network (S3). The output of the intermediate layer 12 is calculated for that input signal, and then the output of the output layer 13 is calculated to obtain the output signal Q (S4).

[0005] The error E between the output signal Q and the teacher signal D is obtained (S5), and the error E is added to the total error to give the new total error (S6). Next, the trial count is initialized to 0 (S7), and the intermediate layer unit whose internal state value y is the (trial count + 1)-th closest to zero is selected, that is, the intermediate layer unit whose internal state value y has the (trial count + 1)-th smallest absolute value (S8).

[0006] The sign of the binary output q of the selected intermediate layer unit is inverted to create a new intermediate layer output signal (hereinafter called the trial pattern) (S9). The trial pattern is input to the output layer 13, the output signal Q′ is computed (S10), and the error E′ between the output signal Q′ and the teacher signal D is obtained (S11). This error E′ is compared with the error E obtained in step S5 (S12). If E > E′, the connection weights of the selected intermediate layer unit are updated by the LMS algorithm so that the sign of the output of that unit is actually inverted (S13). That is, with the current connection weights W_k, the updated connection weights W_{k+1}, the learning coefficient α, and the teacher signal d (the binary output after sign inversion), the following is computed:

W_{k+1} = W_k + α ε X / |X|^2,  ε = d − X^T W_k

If E ≤ E′, the inverted sign in the trial pattern is restored and the connection weights are not updated (S14).
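
The LMS update of step S13 can be illustrated with a short Python sketch. The function name and the example values are hypothetical; the vector x is assumed to include the threshold input x_0 = 1.

```python
def lms_update(w, x, d, alpha):
    """Return W_{k+1} = W_k + alpha * eps * X / |X|^2 with eps = d - X^T W_k.

    w and x have equal length; x is assumed to include the threshold input x0 = 1.
    d is the teacher signal (the binary output after sign inversion).
    """
    y = sum(wi * xi for wi, xi in zip(w, x))      # X^T W_k
    eps = d - y                                   # eps = d - X^T W_k
    norm_sq = sum(xi * xi for xi in x)            # |X|^2
    return [wi + alpha * eps * xi / norm_sq for wi, xi in zip(w, x)]


x = [1, 1, -1, 1]
w = [0.25, 0.5, -0.25, 0.0]          # the current sum X^T W_k is +1.0
w_new = lms_update(w, x, d=-1, alpha=1.0)
print(sum(wi * xi for wi, xi in zip(w_new, x)))
```

Substituting the update into X^T W_{k+1} gives y + α(d − y), so with α = 1 the new sum equals d and the sign is inverted in a single step. A smaller α moves the sum only part of the way toward d.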

[0007] Next, the trial count is incremented by 1 to give the new trial count (S15), and it is checked whether the trial count equals the number of units in the intermediate layer 12 (S16); if not, the process returns to step S8. In this way, for all intermediate layer units, in order of how close their internal state value y is to zero, the connection weights are either updated or left unchanged. After that, the input signal X is input again and the output signal Q is obtained again (S17), and the output signal Q is compared with the teacher signal D (S18). If they do not match, the connection weights of the units 16 of the output layer 13 are updated by the LMS algorithm (S19); if they match, the connection weights of the output layer units are left as they are.

[0008] Next, the learning set presentation count is incremented by 1 to give the new learning set presentation count (S20), and it is checked whether the learning set presentation count equals the number of learning sets given in advance (S21). If not, the process returns to step S3 and the same is done for another learning set. When steps S3 to S21 have been executed (learned) for all learning sets in this way (when one cycle of learning is finished), it is checked whether the total error is zero (S22). If it is not zero, the process returns to step S2, and steps S2 to S22 are executed (learned) repeatedly over all learning sets until the total error becomes zero. When the total error becomes zero, learning ends.
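
The conventional procedure of paragraphs [0004] to [0008] can be sketched in Python as follows. This is only an illustration under several assumptions that the text leaves open: y = 0 is mapped to +1; the error is counted as the number of mismatching output units; an accepted sign flip stays in the trial pattern while E is kept from step S5; the output layer LMS update uses the binary intermediate layer outputs as its inputs and is applied to every output unit on a mismatch; α = 1; and a hypothetical cycle limit max_cycles guards against non-convergence. All names are hypothetical.

```python
def sgn(y):
    return 1 if y >= 0 else -1          # assumption: y == 0 maps to +1


def forward(rows, inputs):
    """Internal states and binary outputs of one layer (rows[u][0] is the threshold weight)."""
    ext = [1] + list(inputs)
    ys = [sum(w * v for w, v in zip(row, ext)) for row in rows]
    return ys, [sgn(y) for y in ys]


def err(q, d):
    return sum(1 for a, b in zip(q, d) if a != b)   # assumed error measure


def lms(row, inputs, target, alpha):
    """W_{k+1} = W_k + alpha * eps * X / |X|^2; |X|^2 = len(ext) for +/-1 signals."""
    ext = [1] + list(inputs)
    eps = target - sum(w * v for w, v in zip(row, ext))
    return [w + alpha * eps * v / len(ext) for w, v in zip(row, ext)]


def mrii_cycle(w_mid, w_out, learning_sets, alpha=1.0):
    """Run steps S3 to S21 once over all learning sets and return the total error."""
    total = 0
    for x, d in learning_sets:                                # S3
        y_mid, q_mid = forward(w_mid, x)                      # S4
        e = err(forward(w_out, q_mid)[1], d)                  # S5
        total += e                                            # S6
        pattern = list(q_mid)
        # S7 to S16: visit intermediate units in increasing order of |y|
        for m in sorted(range(len(w_mid)), key=lambda u: abs(y_mid[u])):
            pattern[m] = -pattern[m]                          # S9: trial pattern
            e_trial = err(forward(w_out, pattern)[1], d)      # S10, S11
            if e > e_trial:                                   # S12
                w_mid[m] = lms(w_mid[m], x, pattern[m], alpha)    # S13
            else:
                pattern[m] = -pattern[m]                      # S14: restore the sign
        q_mid = forward(w_mid, x)[1]                          # S17: present X again
        q = forward(w_out, q_mid)[1]
        if err(q, d) != 0:                                    # S18
            for n in range(len(w_out)):                       # S19
                w_out[n] = lms(w_out[n], q_mid, d[n], alpha)
    return total


def train(w_mid, w_out, learning_sets, max_cycles=100):
    """Repeat cycles until the total error is zero (S22); return the cycle count or None."""
    for cycle in range(max_cycles):
        if mrii_cycle(w_mid, w_out, learning_sets) == 0:
            return cycle + 1
    return None


# One input, one intermediate unit, one output unit; the target is the input itself.
w_mid = [[0.1, -0.2]]
w_out = [[0.1, 0.3]]
sets = [([1], [1]), ([-1], [-1])]
print(train(w_mid, w_out, sets))
```

Because the total error is accumulated before any update for a learning set, a cycle that reports zero total error leaves every connection weight unchanged.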

[0009] In the learning of FIG. 5, the output signal of the intermediate layer 12 is calculated as shown in FIG. 6: for each intermediate layer unit, the sum of the products of each input signal (pixel signal) and the corresponding connection weight is obtained as the internal state value, and the internal state value is substituted into the signum function to obtain the binarized intermediate layer output. The output signal of the output layer 13 is calculated as shown in FIG. 7: for each output layer unit, the sum of the products of each intermediate layer output and the corresponding connection weight is obtained as the internal state value, and that value is substituted into the signum function to obtain the binarized output signal of the output layer. The error between the output signal Q of the output layer and the teacher signal D is calculated as shown in FIG. 8.
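
The two layer computations and the error calculation can be illustrated with the following Python sketch. The text does not give the error formula of FIG. 8, so counting the output units that disagree with the teacher signal is an assumption, as are the function names, the weight layout, and the mapping of y = 0 to +1.

```python
def sgn(y):
    return 1 if y >= 0 else -1   # assumption: y == 0 maps to +1


def layer_forward(weights, inputs):
    """weights[u] is the weight list of unit u, with weights[u][0] the threshold weight.

    Returns (internal_states, binary_outputs) for every unit in the layer.
    """
    ext = [1] + list(inputs)                       # prepend x0 = 1
    states = [sum(w * v for w, v in zip(row, ext)) for row in weights]
    return states, [sgn(y) for y in states]


def error(outputs, teacher):
    """Assumed error measure: number of output units that disagree with the teacher."""
    return sum(1 for q, d in zip(outputs, teacher) if q != d)


def network_forward(w_mid, w_out, x):
    """Intermediate layer first, then the output layer fed with the binary outputs."""
    y_mid, q_mid = layer_forward(w_mid, x)
    y_out, q_out = layer_forward(w_out, q_mid)
    return y_mid, q_mid, y_out, q_out


w_mid = [[0.5, 1.0, 1.0], [-0.5, 1.0, -1.0]]   # two intermediate units, two inputs
w_out = [[0.25, 1.0, -1.0]]                    # one output unit
y_mid, q_mid, y_out, q_out = network_forward(w_mid, w_out, [-1, 1])
print(q_mid, q_out, error(q_out, [-1]))
```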

[0010] The trial pattern of steps S8 and S9 is generated as shown in FIG. 9. The units of the intermediate layer are arranged in ascending order of the absolute value of their internal state values, the intermediate layer unit with the (trial count + 1)-th smallest internal state value is found, the sign of that unit's output is inverted, and this together with the outputs of the other intermediate layer units forms the trial pattern. The connection weights of the intermediate layer unit are updated in step S13 as shown in FIG. 10. That is, the difference ε between the output of the selected intermediate layer unit and the internal state value of that unit is obtained, and the product of ε, the learning coefficient α, and each input pixel signal, divided by the number of input pixels, is added to the current connection weight for that pixel signal to give the updated connection weight. The connection weights of the output layer units are updated in step S19 as shown in FIG. 11. First the intermediate layer outputs are calculated, then the output Q(n) of each output layer unit is calculated and compared with the corresponding teacher signal D(n). If they do not match, for each output layer unit n the difference ε between its internal state value and the teacher signal D(n) is calculated, and the product of ε, α, and the internal state value of each intermediate layer unit, divided by the number M of intermediate layer units, is added to the connection weight with that intermediate layer unit to give the new connection weight with that intermediate layer unit. This is done for each output layer unit.
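
The trial pattern generation of FIG. 9 can be sketched in Python as follows. The function name is hypothetical, and the trial count is assumed to start at 0 as in step S7.

```python
def make_trial_pattern(states, outputs, trial):
    """Flip the output of the unit whose |y| is the (trial + 1)-th smallest.

    states  : internal state values y of the intermediate layer units
    outputs : their binary outputs (+1 or -1)
    trial   : trial count, starting at 0
    Returns (index_of_selected_unit, trial_pattern).
    """
    order = sorted(range(len(states)), key=lambda m: abs(states[m]))
    selected = order[trial]
    pattern = list(outputs)                # the other units keep their outputs
    pattern[selected] = -pattern[selected]
    return selected, pattern


states = [2.5, -0.5, 1.0]
outputs = [1, -1, 1]
print(make_trial_pattern(states, outputs, 0))   # → (1, [1, 1, 1])
print(make_trial_pattern(states, outputs, 1))   # → (2, [1, -1, -1])
```

Units whose internal state value lies close to zero are tried first because the smallest change of the connection weights is enough to invert their output.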

[0011]

[Problems to Be Solved by the Invention] The performance of a hierarchical neural network depends strongly on the structure of the network, such as the number of intermediate layers and the number of units. For example, if the intermediate layer has too many units, the vector space of the input signals is divided more than necessary, and the generalization ability of the hierarchical neural network decreases. However, because no method for finding the appropriate number of intermediate layer units has been established, the structure of a hierarchical neural network has had to be determined by trial and error.

[0012] When the number of intermediate layer units is determined by trial and error, it is common to use a redundant number of intermediate layer units. As a result, the structure of the hierarchical neural network becomes large, and learning time and the amount of computation increase.

[0013]

[Means for Solving the Problems] According to the present invention, in the MRII learning method for a hierarchical neural network, each time execution (learning) over all learning sets is completed once (at the end of one cycle), any intermediate layer unit for which E = E′ always held during that learning is deleted as a non-contributing intermediate layer unit that does not contribute to the operation of the network.

[0014]

[Operation] When a hierarchical neural network composed of the neuron model used in the method of this invention is used for pattern recognition, the intermediate layer units serve to divide the multidimensional vector space formed by the input patterns. One intermediate layer unit divides the multidimensional vector space into two, so when there are several intermediate layer units the multidimensional vector space is divided finely. The output layer units output the recognition result according to where in the divided multidimensional vector space the input pattern lies. If the multidimensional vector space is divided appropriately, the generalization ability of the hierarchical neural network is high and it can have excellent recognition ability. Conversely, generalization ability is low when there are more intermediate layer units than necessary, so that the multidimensional vector space is divided too finely, or when the number of units is appropriate but the division is not.

In the learning procedure above, the degree of contribution of an intermediate layer unit is judged by whether inverting the sign of its output signal affects the error of the output layer output. When the error decreases, updating the connection weights so that the sign of the selected intermediate layer unit is inverted corrects the division of the multidimensional vector space toward an appropriate one, and the intermediate layer unit after the update can be considered to contribute to the pattern recognition performed by the network. When the error increases, the selected intermediate layer unit can be considered likely to contribute to the pattern recognition performed by the network as it currently is. In contrast, an intermediate layer unit for which the error neither increases nor decreases even once during one cycle of learning can be considered not to contribute to the pattern recognition performed by the network, or to contribute very little. Because the learning of this invention does not update the connection weights when the error does not decrease, such an intermediate layer unit never has its division of the multidimensional vector space optimized, so it is reasonable to regard it as non-contributing and delete it.

[0015]

[Embodiment] FIG. 1 shows an embodiment of this invention; steps corresponding to those in FIG. 5 carry the same symbols. In this invention an intermediate layer unit table is prepared in which one bit is assigned to each unit, and a flag can be set by making that bit "1". Only the parts that differ from FIG. 5 are described. In step S2, every bit of the intermediate layer unit table is set to zero, clearing the flags. If E > E′ is determined in step S12, a flag is set in the intermediate layer unit table for the selected intermediate layer unit (S23) and the process moves to step S13; if E < E′, a flag is likewise set in the intermediate layer unit table for the selected intermediate layer unit (S24) and the process moves to step S14. If E = E′, no flag is set.
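
The flag handling of steps S12, S23, and S24 can be sketched in Python as follows. The function name and the use of a plain list of 0/1 values for the intermediate layer unit table are assumptions made for illustration.

```python
def record_trial(flags, unit, e, e_trial):
    """Set the flag of `unit` when the trial error E' differs from E.

    Returns True when the connection weights of the unit should be updated
    (E > E', step S13) and False when the inverted sign should be restored
    (E <= E', step S14).
    """
    if e != e_trial:          # S23 (E > E') or S24 (E < E'); no flag when E == E'
        flags[unit] = 1
    return e > e_trial


flags = [0, 0, 0]                      # cleared in step S2
print(record_trial(flags, 0, 2, 1))    # → True  (E > E')
print(record_trial(flags, 1, 2, 3))    # → False (E < E')
print(record_trial(flags, 2, 2, 2))    # → False (E = E', flag stays 0)
print(flags)                           # → [1, 1, 0]
```

A unit whose flag is still 0 at the end of a cycle never changed the output error when its sign was inverted during that cycle.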

[0016] When the learning set presentation count equals the number of learning sets in step S21 and learning has finished for all learning sets, that is, when one cycle of learning has finished, it is checked whether every bit of the intermediate layer unit table is 1 (S25). If not all bits are 1, the intermediate layer units whose bit is 0, that is, whose flag is not set, are looked up in the intermediate layer unit table and deleted as non-contributing intermediate layer units (S26), and the process moves to step S22. If every bit of the intermediate layer unit table is 1, the process moves directly to step S22.

[0017] The non-contributing intermediate layer units are deleted, for example, as shown in FIG. 2. The bit for one intermediate layer unit m is read from the intermediate layer unit table and checked to see whether it is 1 (S31). If it is not 1, that is, if the flag is not set, each connection weight W_m(i, j) between that intermediate layer unit m and each input signal (the superscript "mid" on W is omitted) is set to zero (S32), and then each connection weight W_n(m) between that intermediate layer unit m and each output layer unit (the superscript "out" on W is omitted) is set to zero (S33). Steps S31 to S33 are executed in this way for each intermediate layer unit, so that every unit whose bit is not 1, that is, whose flag is not set, is deleted as a non-contributing intermediate layer unit.
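
Deletion by zeroing, steps S31 to S33, can be sketched in Python as follows. The weight layout is an assumption: w_mid[m] holds the input-side weights of intermediate unit m, and w_out[n][m + 1] is the weight from unit m to output unit n, with index 0 of each row reserved for the threshold weight. The function name is hypothetical.

```python
def delete_non_contributing(flags, w_mid, w_out):
    """Zero the connection weights of every intermediate unit whose flag is not set.

    Assumed layout: w_mid[m] holds the input-side weights of unit m, and
    w_out[n][m + 1] is the weight from unit m to output unit n
    (index 0 of each row is the threshold weight).
    Returns the indices of the deleted units.
    """
    deleted = []
    for m, flag in enumerate(flags):
        if flag == 1:                         # S31: flag set, keep the unit
            continue
        w_mid[m] = [0.0] * len(w_mid[m])      # S32: input-side weights
        for row in w_out:                     # S33: output-side weights
            row[m + 1] = 0.0
        deleted.append(m)
    return deleted


flags = [1, 0]
w_mid = [[0.1, 0.2], [0.3, 0.4]]
w_out = [[0.5, 0.6, 0.7]]
print(delete_non_contributing(flags, w_mid, w_out))   # → [1]
print(w_mid, w_out)
```

The table sizes stay the same, so unit indices used elsewhere in the learning procedure remain valid after the deletion.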

[0018] In the above, the corresponding connection weights are set to zero in order to delete a non-contributing intermediate layer unit, but the portions of the connection weight memory whose weights would be set to zero may instead be removed by compaction. In that case, if the compaction is performed after every cycle in the middle of learning, the relationship between the intermediate layer and the output layer changes, so this relationship must be corrected in the learning algorithm. When units are deleted by setting the connection weights to zero as described above, the learning algorithm need not be modified partway through, which makes the processing simple.
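
The compaction alternative can be sketched in Python as follows. The text does not describe the layout of the connection weight memory, so this sketch assumes the same hypothetical layout as a list of rows per unit, with index 0 of each output row reserved for the threshold weight.

```python
def compact(flags, w_mid, w_out):
    """Return new (w_mid, w_out) with the entries of unflagged units removed.

    Assumed layout: w_mid[m] holds the input-side weights of unit m, and
    w_out[n][m + 1] is the weight from unit m to output unit n
    (index 0 of each row is the threshold weight).
    """
    keep = [m for m, flag in enumerate(flags) if flag == 1]
    new_mid = [list(w_mid[m]) for m in keep]
    new_out = [[row[0]] + [row[m + 1] for m in keep] for row in w_out]
    return new_mid, new_out


flags = [1, 0, 1]
w_mid = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
w_out = [[0.9, 1.0, 2.0, 3.0]]
print(compact(flags, w_mid, w_out))
```

After compaction the surviving units are renumbered (unit 2 above becomes unit 1), so any per-unit bookkeeping in the learning procedure has to be re-indexed, whereas zeroing leaves the indices untouched.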

[0019] The hierarchical neural network is not limited to application in a pattern recognition device; learning may be performed on an electronic computer, the learned connection weights copied to a ROM, and that ROM used in another device.

[0020]

[Effects of the Invention] As described above, according to this invention non-contributing intermediate layer units are deleted in the middle of learning, so the amount of computation during that learning is reduced and the learning time is shortened. In addition, a hierarchical neural network with an appropriate number of intermediate layer units is constructed, and a hierarchical neural network with high generalization ability can be obtained.

[Brief description of the drawings]

FIG. 1 is a flowchart showing an embodiment of this invention.

FIG. 2 is a flowchart showing a specific example of step S26 in FIG. 1, the deletion of non-contributing intermediate layer units.

FIG. 3 is a block diagram showing a hierarchical neural network.

FIG. 4 is a block diagram showing an example of a neuron model (unit).

FIG. 5 is a flowchart showing the conventional learning method.

FIG. 6 is a flowchart showing the calculation of the intermediate layer output.

FIG. 7 is a flowchart showing the calculation of the output layer output.

FIG. 8 is a flowchart showing the calculation of the error.

FIG. 9 is a flowchart showing the generation of a trial pattern.

FIG. 10 is a flowchart showing the update of the connection weights of an intermediate layer unit.

FIG. 11 is a flowchart showing the update of the connection weights of an output layer unit.

Continuation of front page

(56) References
Yutaka Matsunaga, Yoshiaki Nakade, and Kazuyuki Murase, "Error back-propagation learning algorithm that automatically reduces the number of intermediate layer elements of a hierarchical neural network," IEICE Technical Report, vol. 91, no. 25, pp. 9-14, 1991.
Masafumi Hagiwara, "Backpropagation with a selection function (reduction of the number of intermediate layer units and faster convergence)," IEICE Technical Report, vol. 89, no. 104, pp. 85-90, 1990.
B. Widrow and M. A. Lehr, "30 Years of Adaptive Neural Networks: Perceptron, Madaline, and Backpropagation," Proceedings of the IEEE, vol. 78, no. 9, pp. 1415-1442, 1990.

(58) Fields searched (Int. Cl.7, DB name): G06G 7/60, G06F 15/18

Claims

(57) [Claims]

1. A method for learning a hierarchical neural network composed of units that use a signum function as an output function, comprising:
a. assigning suitable decimal values to the connection weights of all units in the intermediate layer and the output layer;
b. inputting the input signal of a prepared learning set (a pair of an input signal and a teacher signal) to the neural network;
c. calculating the error E between the resulting output signal and the teacher signal;
d. selecting an intermediate layer unit in order of how close its internal state value is to zero, and inverting the sign of the binary output of the selected unit to produce a new intermediate layer output signal (called a trial pattern);
e. inputting the trial pattern to the output layer to obtain an output signal, and obtaining the error E′ between that output signal and the teacher signal;
f. comparing the error E′ with the error E and, when E > E′, updating the connection weights of the selected intermediate layer unit so that the sign of its binary output is actually inverted;
g. when E ≤ E′, restoring the inverted sign in the trial pattern;
h. repeating d to g for all intermediate layer units;
i. thereafter inputting the input signal again to obtain an output signal, and obtaining the error between that output signal and the teacher signal;
j. if that error is not zero, updating the connection weights of the output layer units;
k. executing b to j for each of the other learning sets;
l. thereafter checking whether the sum (total error) of the errors of i obtained for each learning set is zero; and
m. if it is not zero, repeating b to l, and ending when it is zero;
wherein, each time the execution for all learning sets in k is completed, any intermediate layer unit for which E = E′ always held during that execution is deleted.