JPH0713745A

JPH0713745A - Neural network type pattern learning method and pattern processor

Info

Publication number: JPH0713745A
Application number: JP5157603A
Authority: JP
Inventors: Shin Mizutani; 伸水谷; Noriyoshi Uchida; 典佳内田; Kazuhiko Shinosawa; 一彦篠沢; Noboru Sonehara; 曽根原　　登
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1993-06-28
Filing date: 1993-06-28
Publication date: 1995-01-17
Anticipated expiration: 2018-06-03
Also published as: JP3412700B2

Abstract

PURPOSE: To suppress the over-learning that arises as learning progresses, and to enable learning even where the learning pattern changes steeply, by calculating the differential value of the output pattern with respect to the input pattern, adding that differential value to the learning cost function according to its magnitude, and changing the learning cost function on the basis of the learning pattern distribution where the learning pattern changes steeply. CONSTITUTION: The processor comprises a pattern input circuit 1, a square error calculating circuit 2, a differential value calculating circuit 3 that calculates the differential value of the output pattern with respect to the input pattern, a coefficient calculating circuit 4, and a coupling coefficient correcting circuit 5. Over-learning is suppressed by calculating the differential value of the output pattern and adding it to the learning cost function according to its magnitude. Where the learning pattern changes sharply, the learning cost function is changed on the basis of the learning pattern distribution, so that learning remains possible even across a steep change of the learning pattern.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a neural network type pattern learning method and pattern processor that learn automatically when simply given the desired output pattern corresponding to each input pattern.

[0002]

[Prior Art] The conventional back-propagation learning method is carried out as follows in a multilayer neural network such as the one shown in FIG. 3. FIG. 3 shows a three-layer neural network; the circles represent nonlinear threshold units and the straight lines represent the weights between units.

[0003] Consider a multilayer neural network with m layers, and let the sum of the inputs to the i-th unit of layer k be [symbol not reproduced] [remaining variable definitions not reproduced]. These variables are related as follows.

[0004]

[Equation 1]

Here, consider the following cost function.

[0005]

[Equation 2]

where the sum over j is taken over the units in the final layer m. The weights are then changed as follows.

[0006]

[Equation 3]

[Equation 4]

Learning is performed by changing the weights in this way.
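The equations of the conventional method are not reproduced in this text. As an illustration only, the following minimal sketch shows textbook back-propagation on a three-layer network with a squared error cost, which is the kind of procedure this passage describes. The sigmoid units, the absence of bias terms, the learning rate `lr`, and all function names are assumptions made for the example, not details taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    """Return hidden and output activations of a three-layer network."""
    h = sigmoid(W1 @ x)   # hidden layer (assumed sigmoid units)
    o = sigmoid(W2 @ h)   # output layer
    return h, o

def squared_error(o, t):
    """Squared error cost summed over the output units."""
    return 0.5 * np.sum((o - t) ** 2)

def backprop_step(x, t, W1, W2, lr=0.5):
    """One gradient-descent update of both weight matrices."""
    h, o = forward(x, W1, W2)
    delta_out = (o - t) * o * (1.0 - o)              # dE/d(net input), output layer
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)   # error propagated to hidden layer
    W2_new = W2 - lr * np.outer(delta_out, h)
    W1_new = W1 - lr * np.outer(delta_hid, x)
    return W1_new, W2_new

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(3, 2))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(1, 3))   # hidden -> output weights
x, t = np.array([0.2, 0.8]), np.array([1.0])

e_before = squared_error(forward(x, W1, W2)[1], t)
for _ in range(100):
    W1, W2 = backprop_step(x, t, W1, W2)
e_after = squared_error(forward(x, W1, W2)[1], t)
```

Because this cost rewards nothing except passing through the training points, repeated updates keep pulling the network toward them, which is the behavior the next paragraph identifies as the source of over-learning.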

[0007]

[Problems to Be Solved by the Invention] When the learning patterns are distributed as shown in FIG. 4, for example, the cost function of the learning method described above treats learning as achieved once the network's function passes through the given learning pattern points. As learning progresses, however, an over-learning state arises, as shown in FIG. 5.

[0008] The present invention has been made in view of the above. Its object is to provide a neural network type pattern learning method and pattern processor that suppress the over-learning arising as learning progresses and that can learn even when the learning pattern changes steeply.

[0009]

[Means for Solving the Problems] To achieve the above object, the neural network type pattern learning method of the present invention is a learning method using a multilayer neural network composed of nonlinear threshold units, in which the differential value of the output pattern with respect to the input pattern is calculated, over-learning is suppressed by adding the differential value to the learning cost function according to the magnitude of the differential value, and, where the learning pattern changes steeply, the learning cost function is changed on the basis of the distribution of the learning pattern.

[0010] The neural network type pattern processor of the present invention is a pattern processor having a multilayer neural network composed of nonlinear threshold units, and comprises means for calculating the differential value of the output pattern with respect to the input pattern and means for calculating the proportion in which the differential value is added to the learning cost function according to the magnitude of the differential value.

[0011]

[Operation] In the neural network type pattern learning method of the present invention, the differential value of the output pattern with respect to the input pattern is calculated, over-learning is suppressed by adding the differential value to the learning cost function according to its magnitude, and the learning cost function is changed on the basis of the distribution of the learning pattern where the learning pattern changes steeply.

[0012] In the neural network type pattern processor of the present invention, the differential value of the output pattern with respect to the input pattern is calculated, and the proportion in which the differential value is added to the learning cost function is calculated according to its magnitude.

[0013]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0014] FIG. 1 is a block diagram showing the configuration of a neural network type pattern processor according to one embodiment of the present invention. The processor shown in the figure has a pattern input circuit 1 that inputs patterns, a square error calculation circuit 2 that calculates the squared error, a differential calculation circuit 3 that calculates the differential value of the output pattern with respect to the input pattern, a coefficient calculation circuit 4 that calculates coefficients, and a coupling coefficient correction circuit 5 that corrects the coupling coefficients.
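As a hedged illustration of the quantity the differential calculation circuit 3 is described as producing, the sketch below computes the derivative of each output with respect to each input for a small three-layer network by the chain rule and compares it with a finite-difference estimate. The sigmoid units, the weight shapes, and the function names are assumptions for the example; the patent text does not specify how the circuit evaluates the derivative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_and_jacobian(x, W1, W2):
    """Return the output o and the Jacobian do/dx of a three-layer sigmoid network."""
    h = sigmoid(W1 @ x)
    o = sigmoid(W2 @ h)
    # Chain rule: do/dx = diag(o(1-o)) . W2 . diag(h(1-h)) . W1
    left = (o * (1.0 - o))[:, None] * W2      # shape (n_out, n_hidden)
    right = (h * (1.0 - h))[:, None] * W1     # shape (n_hidden, n_in)
    return o, left @ right

def numerical_jacobian(x, W1, W2, eps=1e-6):
    """Central-difference estimate of do/dx, used only as a check."""
    J = np.zeros((W2.shape[0], x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        o_plus, _ = output_and_jacobian(x + dx, W1, W2)
        o_minus, _ = output_and_jacobian(x - dx, W1, W2)
        J[:, i] = (o_plus - o_minus) / (2.0 * eps)
    return J

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 2))
W2 = rng.normal(size=(3, 4))
x = np.array([0.3, -0.7])

o, J = output_and_jacobian(x, W1, W2)
J_num = numerical_jacobian(x, W1, W2)
```

A large entry in `J` marks an input direction along which the realized function changes quickly, which is the information the later cost function uses.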

[0015] Now consider a multilayer neural network with m layers, and let the sum of the inputs to the i-th unit of layer k be [symbol not reproduced] [remaining variable definitions not reproduced]. These variables are then related as follows.

[0016]

[Equation 5]

Here, consider the following squared error cost function as the cost function.

[0017]

[Equation 6]

where ε₂ is a constant and the sum over j is taken over the units in the final layer m. [text missing] The differential term is a penalty that smooths the function realized by the network and suppresses over-learning. With a cost function in which this penalty is always applied, however, the smoothing cannot be held back where the learning pattern changes steeply. The smoothness penalty is therefore multiplied by a factor (1 − ρ_j), which determines how much effect the penalty has. When ρ_j = 1, the learning pattern changes steeply at that point and the smoothness penalty is not included in the cost function. When ρ_j = 0, the smoothness penalty is included in the cost function.
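Equation 6 itself is not reproduced in this text, so the following is only a sketch of a cost of the kind the paragraph describes: a squared error plus a smoothness penalty on the output derivative, switched by the factor (1 − ρ_j). The exact form of the penalty (here the squared derivative summed over inputs), the indexing of ρ by output unit, the value of `eps2`, and the function name `gated_cost` are all assumptions.

```python
import numpy as np

def gated_cost(o, t, J, rho, eps2=0.1):
    """Squared error plus a smoothness penalty gated by (1 - rho_j).

    o, t : output and target vectors, shape (n_out,)
    J    : derivative of each output w.r.t. each input, shape (n_out, n_in)
    rho  : gate per output unit, 1 = penalty off, 0 = penalty on
    """
    sq_err = 0.5 * np.sum((o - t) ** 2)
    smooth = np.sum(J ** 2, axis=1)                 # assumed penalty per output unit j
    penalty = eps2 * np.sum((1.0 - rho) * smooth)
    return sq_err + penalty

o = np.array([0.7, 0.2])
t = np.array([1.0, 0.0])
J = np.array([[0.5, -0.5],
              [2.0, 1.0]])                          # second output changes steeply

cost_all = gated_cost(o, t, J, rho=np.zeros(2))            # penalty on everywhere
cost_gated = gated_cost(o, t, J, rho=np.array([0.0, 1.0])) # penalty off for the steep unit
cost_none = gated_cost(o, t, J, rho=np.ones(2))            # pure squared error
```

Setting ρ to 1 for the steep unit removes its large derivative from the cost, so the network is no longer pushed to flatten a change that the learning pattern actually contains.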

[0018] The weight change, which constitutes the learning, is as follows.

[0019]

[Equation 7]

[Equation 8]

[Equation 9]

ρ_j can be determined, for example, by the following method.

[0020] Consider a Hopfield type neural network, which is commonly used for optimization under given constraints, and let E' be its cost function.

[0021]

[Equation 10]

V(ρ_j) is the penalty incurred when ρ_j = 1 and is given as follows.

[0022]

[Equation 11]

where C₁, B₁, B₂, B₃, and B₄ are constants. The first and second terms are penalties for ρ_j becoming 1, and the third term makes ρ converge to 0 or 1. Here, G is the following function.

[0023]

[Equation 12]

where λ is a constant.

[0024] The following equation is now considered as the equation that determines ρ_j.

[0025]

[Equation 13]

where ε₃ is a constant.
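Equations 10 to 13 are not reproduced in this text. The sketch below is therefore a toy stand-in rather than the patent's network: it assumes ρ_j = G(u_j) with a sigmoid G of gain λ, an internal state u_j that follows the negative gradient of an energy at rate ε₃, and a simplified energy made of the gated smoothness term, a single cost `c1` for setting ρ_j = 1, and a term `b * rho * (1 - rho)` that favors 0 or 1. Every one of these forms and constants is an assumption chosen to show the mechanism.

```python
import numpy as np

def G(u, lam=1.0):
    """Assumed sigmoid mapping the internal state u to rho in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-lam * u))

def energy(rho, d, eps2=1.0, c1=1.0, b=0.5):
    """Toy energy: gated smoothness term + cost of rho = 1 + binarizing term."""
    return np.sum(eps2 * (1.0 - rho) * d + c1 * rho + b * rho * (1.0 - rho))

def energy_grad(rho, d, eps2=1.0, c1=1.0, b=0.5):
    """Partial derivative of the toy energy with respect to each rho_j."""
    return -eps2 * d + c1 + b * (1.0 - 2.0 * rho)

def relax(d, eps3=0.5, steps=100):
    """Euler integration of du_j/dt = -eps3 * dE'/drho_j (assumed dynamics)."""
    u = np.zeros_like(d)
    energies = [energy(G(u), d)]
    for _ in range(steps):
        u = u - eps3 * energy_grad(G(u), d)
        energies.append(energy(G(u), d))
    return G(u), np.array(energies)

# d holds the squared output derivative at two points: one gentle, one steep.
d = np.array([0.1, 5.0])
rho, energies = relax(d)
```

In this toy setting ρ relaxes toward 0 at the gentle point and toward 1 at the steep point, and the recorded energy never increases from one step to the next.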

[0026] The time derivative of E' is always 0 or less, as the following equation shows.

[0027]

[Equation 14]

When the time derivative of E' reaches 0, the following holds for every ρ_j, and ρ_j therefore always converges.

[0028]

dρ_j/dt = 0 (27)

Next, the pattern learning method is described with reference to the flowchart shown in FIG. 2. As described above, the differential calculation circuit 3 calculates the differential value of the output with respect to the input (step 110), and a separate Hopfield type neural network uses this differential value to calculate how strongly it should act in the cost function; that is, the coefficients are determined by the Hopfield type neural network (step 120). The multilayer neural network is then trained with this cost function (step 130). Repeating this procedure makes learning possible even where the learning pattern changes steeply, while suppressing over-learning.
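The loop of steps 110 to 130 can be sketched as follows for a network with one-dimensional input and output, as in FIGS. 4 and 5. This is a simplified illustration under several assumptions: the derivative is estimated by finite differences, the Hopfield type network of step 120 is replaced by a plain threshold on the squared derivative, the gradient in step 130 is computed numerically, and ρ is indexed per training point. The names `net`, `cost`, `train` and the values of `eps2`, `threshold`, and `lr` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net(params, x):
    """1-input, 1-output network with one hidden layer; x is a 1-D array."""
    n_hidden = (params.size - 1) // 3
    w1 = params[:n_hidden]
    b1 = params[n_hidden:2 * n_hidden]
    w2 = params[2 * n_hidden:3 * n_hidden]
    b2 = params[-1]
    h = sigmoid(np.outer(x, w1) + b1)     # shape (n_points, n_hidden)
    return sigmoid(h @ w2 + b2)           # shape (n_points,)

def output_derivative(params, x, eps=1e-4):
    """Finite-difference estimate of d(output)/d(input) at each point."""
    return (net(params, x + eps) - net(params, x - eps)) / (2.0 * eps)

def cost(params, x, t, rho, eps2):
    """Squared error plus smoothness penalty gated by (1 - rho)."""
    err = 0.5 * np.sum((net(params, x) - t) ** 2)
    pen = eps2 * np.sum((1.0 - rho) * output_derivative(params, x) ** 2)
    return err + pen

def numerical_grad(f, p, eps=1e-5):
    """Central-difference gradient of a scalar function f at p."""
    g = np.zeros_like(p)
    for k in range(p.size):
        dp = np.zeros_like(p)
        dp[k] = eps
        g[k] = (f(p + dp) - f(p - dp)) / (2.0 * eps)
    return g

def train(x, t, n_hidden=3, eps2=0.05, threshold=0.25, lr=0.2, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    params = rng.normal(scale=0.5, size=3 * n_hidden + 1)
    rho = np.zeros_like(x)
    for _ in range(iters):
        d = output_derivative(params, x) ** 2         # step 110: output derivative
        rho = (d > threshold).astype(float)           # step 120: threshold stand-in
        grad = numerical_grad(lambda p: cost(p, x, t, rho, eps2), params)
        params = params - lr * grad                   # step 130: train on gated cost
    return params, rho

x_train = np.linspace(0.0, 1.0, 8)
t_train = np.where(x_train > 0.5, 0.9, 0.1)           # targets with one steep change
params, rho = train(x_train, t_train)
```

Recomputing ρ on every pass is what lets the penalty act on gently varying regions while being released wherever the derivative exceeds the threshold.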

[0029]

[Effects of the Invention] As described above, according to the present invention, the differential value of the output pattern with respect to the input pattern is calculated, over-learning is suppressed by adding the differential value to the learning cost function according to its magnitude, and the learning cost function is changed on the basis of the distribution of the learning pattern where the learning pattern changes steeply. Over-learning can therefore be suppressed, learning is possible even where the learning pattern changes steeply, and processing such as pattern recognition and classification that could not be achieved with conventional methods becomes possible.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a neural network type pattern processor according to one embodiment of the present invention.

FIG. 2 is a flowchart showing the procedure of a neural network type pattern learning method according to one embodiment of the present invention.

FIG. 3 is a diagram showing a three-layer neural network as an example of a multilayer neural network.

FIG. 4 is a diagram showing the points of a learning pattern (input and output both one-dimensional) and the function realized by the neural network.

FIG. 5 is a diagram showing the points of a learning pattern (input and output both one-dimensional) and the function realized by an over-learned neural network.

[Explanation of symbols]

1 pattern input circuit
2 square error calculation circuit
3 differential calculation circuit
4 coefficient calculation circuit
5 coupling coefficient correction circuit

Continuation of the front page: (72) Inventor Noboru Sonehara, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

Claims

[Claims]

1. A neural network type pattern learning method using a multilayer neural network composed of nonlinear threshold units, characterized in that a differential value of an output pattern with respect to an input pattern is calculated, over-learning is suppressed by adding the differential value to a learning cost function according to the magnitude of the differential value, and the learning cost function is changed on the basis of the distribution of the learning pattern when the learning pattern changes steeply.

2. A neural network type pattern processor having a multilayer neural network composed of nonlinear threshold units, characterized by comprising means for calculating a differential value of an output pattern with respect to an input pattern, and means for calculating the proportion in which the differential value is added to a learning cost function according to the magnitude of the differential value.