JPH0373057A

JPH0373057A - Rationalizing method for data processor

Info

Publication number: JPH0373057A
Application number: JP1208672A
Authority: JP
Inventors: Ryohei Kumagai; 熊谷　良平; Sunao Takatori; 直高取; Makoto Yamamoto; 誠山本; Koji Matsumoto; 幸治松本
Original assignee: Ezel Inc
Current assignee: Ezel Inc
Priority date: 1989-08-12
Filing date: 1989-08-12
Publication date: 1991-03-28

Abstract

PURPOSE: To prevent the learning process from falling into a local minimum by forcibly changing the weights of a neuron that produced a significant output at a certain point in time, so that the weights are temporarily reduced to a minimum value and then increased back to their value at that point in time, and by adjusting the weights during the period of this change. CONSTITUTION: A plurality of neural layers NL are provided, each consisting of a plurality of neurons N arranged in parallel, and the layers are configured so that the output of one neural layer becomes the input of the neural layer in the next stage. The thresholds θ of the neurons within the same neural layer are made equal, and θ is made higher toward the later-stage neural layers. The gradient of this increase in threshold is made smooth. The weights of a neuron that produced a significant output are forcibly changed so that they are temporarily reduced to the minimum value and then increased to their value at that point in time. While constant input data is applied during this period of forced weight change, the weights are corrected based on the evaluation of the output. A fall into a local minimum can thus be prevented.

Description

[Detailed Description of the Invention]

[Industrial Application Field]

The present invention relates to a method for optimizing the weights in a data processing device provided with a plurality of neurons, each of which outputs data according to the result of comparing a threshold with the sum of the products obtained by multiplying input data by predetermined weights.
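The neuron model described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `neuron_output`, the binary 0/1 output, and the use of `>=` for the comparison are assumptions made for the example.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0.

    The binary output and the >= comparison are illustrative assumptions;
    the text only says the output depends on the comparison result.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Two active inputs with weight 0.5 each reach a threshold of 1.0.
fires = neuron_output([1, 0, 1], [0.5, 0.5, 0.5], 1.0)
# A single active input does not.
silent = neuron_output([1, 0, 0], [0.5, 0.5, 0.5], 1.0)
```

A neuron that returns 1 here corresponds to one that "fires" or "produces a significant output" in the description.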

[Conventional technology]

In the learning process of this type of data processing device, once a reasonably appropriate input-output correlation has arisen and has been reinforced, the device may thereafter be unable to escape from that state, making optimization impossible.

This is described as falling into a local minimum. For example, in the Boltzmann machine (David H. Ackley, Geoffrey E. Hinton, and Terrence J. Sejnowski: A learning algorithm for Boltzmann machines; Cognitive Science 9, 1985, 147-169), the energy is temporarily made to jump to a predetermined high level in order to escape from a local minimum of the energy equation (1) of the neural network and reach its global minimum. However, the optimal correlation between input and output does not necessarily coincide with the state of the global energy minimum. In addition, the depth of a local minimum is generally unknown, so there is no guarantee that raising the energy level by a predetermined amount will reliably escape the local minimum, nor even that the global energy minimum can be reached. Furthermore, a temporary jump in energy is in itself inconsistent with the functioning of biological systems.

[Problem to be solved by the invention]

The present invention was devised to solve the conventional problems described above, and its object is to provide a method for optimizing a data processing device that can reliably prevent a fall into a local minimum during the learning process.

[Means to solve the problem]

In the method for optimizing a data processing device according to the present invention, while a constant input is applied to the data processing device, the weights of a neuron that has produced a significant output at a certain point in time are forcibly changed so that they are temporarily reduced to a minimum value and then increased to their value at that point in time, and the weights are adjusted during the period of this forced change.

[Effect]

According to the method for optimizing a data processing device according to the present invention, a local minimum can be reliably escaped by a mechanism that conforms to the functioning of biological systems.

[Example]

The present invention will now be described with reference to the illustrated embodiments.

Here, the following standard coincidence degree TP is defined as an index of the effect of learning.

Here, n is the number of input events; i and j are indices of input events; and Pi,j is the degree of coincidence (number of matching bits) between the neural network output for the i-th input event and the neural network output for the j-th input event.

The standard coincidence degree TP is an index of the neural network's ability to discriminate between input events: the lower TP is, the higher the discrimination ability. In a neural network that has undergone no learning at all, TP is generally 100%, and TP then decreases monotonically with subsequent learning (Fig. 1). However, if the learning method is inappropriate, it can instead hinder the learning effect, and TP, once lowered, may rise again as shown in Fig. 2.
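The defining formula for TP does not survive in this text, so the following is only one possible reading, offered as a sketch rather than as the patent's definition. It assumes TP is the average number of matching output bits over all pairs of distinct input events, expressed as a percentage of the output width. The function names `matching_bits` and `coincidence_percent` are hypothetical.

```python
from itertools import combinations


def matching_bits(a, b):
    """Count the positions where two equal-length bit lists agree (P_ij)."""
    return sum(1 for x, y in zip(a, b) if x == y)


def coincidence_percent(outputs):
    """ASSUMED reading of TP: mean pairwise bit agreement, in percent.

    `outputs` holds one output bit list per input event. This averaging
    and normalisation is an assumption, not the patent's stated formula.
    """
    pairs = list(combinations(outputs, 2))
    if not pairs:
        raise ValueError("need at least two input events")
    width = len(outputs[0])
    total = sum(matching_bits(a, b) for a, b in pairs)
    return 100.0 * total / (len(pairs) * width)


# An untrained network that gives the same output for every event.
untrained = coincidence_percent([[1, 0, 1], [1, 0, 1], [1, 0, 1]])
```

Under this reading, a network that responds identically to every input event scores 100%, which matches the statement that an untrained network generally has TP = 100%, and a network whose outputs differ in every bit scores 0%.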

When a fall into a local minimum occurs as described above, TP saturates at a high value, as shown in Fig. 3.

This fall into a local minimum is unrelated to the energy equation (1). It can be regarded, for example, as the result of learning that affirms a confusion of several events, caused by a bias in the weights of an event's features such that a particular feature is overemphasized.

Schematically, as shown in Fig. 4, because the influence of a certain input bit I4 is strong, the distribution of firing neurons becomes biased, and this state continues to be reinforced. In the figure, neurons that have fired are shown as black circles.

To escape from this state, the involvement of neurons with the other input bits, that is, those other than I4, must be strengthened.

Here the inventors turned their attention to the absolute refractory period and the relative refractory period in biological neural networks.

These refractory periods are said to contribute to suppressing and controlling the propagation of excitation waves in biological neural networks (Shun-ichi Amari: Mathematics of Neural Networks: The Brain's Mode of Information Processing; Sangyo Tosho). The inventors, by contrast, regard the refractory periods as a factor that temporarily changes the balance among the influences of the individual features of the input data. They therefore apply the change shown in Fig. 5 to the weights of a neuron that has fired for a certain input, producing an effect equivalent to the absolute and relative refractory periods.

That is, each of the weights W1 to Wn of a neuron that fired in the association at a certain time t0 in the learning process is reduced to the minimum value Wmin (for example, 0) for a fixed period thereafter. This corresponds to the absolute refractory period, in which the threshold becomes infinite. Learning continues during this absolute refractory period and changes the weights of neurons. The neurons whose weights are changed at this stage are those that did not fire for that input before t0. After this period has elapsed, each of the weights W1 to Wn is gradually increased to its value at time t0.

This period of gradual increase corresponds to the relative refractory period, in which the threshold θ gradually decreases; its duration is denoted tr.
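The forced weight change over one cycle (hold at the minimum, then ramp back) can be sketched as a function of time. This is an illustration under stated assumptions: the name `forced_weight`, its parameters, and the linear shape of the ramp are all hypothetical, since the text only says the weight is increased "gradually".

```python
def forced_weight(t, t0, w_at_t0, abs_len, rel_len, w_min=0.0):
    """Forced value of one weight of a neuron that fired at time t0.

    abs_len: length of the hold at w_min (absolute refractory period).
    rel_len: length of the ramp back up (relative refractory period).
    The linear ramp is an assumption made for this sketch.
    """
    if t < t0:
        return w_at_t0          # before the cycle starts
    elapsed = t - t0
    if elapsed < abs_len:
        return w_min            # absolute refractory period
    if elapsed < abs_len + rel_len:
        frac = (elapsed - abs_len) / rel_len
        return w_min + frac * (w_at_t0 - w_min)   # relative refractory period
    return w_at_t0              # cycle finished, value at t0 restored


# One cycle starting at step 10: 5 steps at the minimum, 10 steps of ramp.
profile = [forced_weight(t, 10, 2.0, 5, 10) for t in (5, 12, 20, 25)]
```

In this sketch the weight is 2.0 before the cycle, 0.0 during the hold, halfway restored in the middle of the ramp, and back at 2.0 once the cycle is over.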

During the absolute refractory period, only the neurons that did not fire at time t0 have their weights strengthened, so learning proceeds on a biased neuron distribution. During the relative refractory period, the neurons that fired at time t0 gradually begin to fire again, and balanced corrective learning is gradually carried out over the whole neural network.
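The effect of the two periods on which neurons can fire can be sketched for a single layer. This is a hedged illustration: the function `layer_outputs`, the `scale` factor (0 during the absolute refractory period, rising toward 1 during the relative one), and the example weights are assumptions, and the learning rule that strengthens the other neurons is not modelled here.

```python
def layer_outputs(inputs, weight_rows, thresholds, fired_at_t0, scale):
    """Outputs of one layer while a forced weight change is in progress.

    Neurons flagged in fired_at_t0 have their weights multiplied by
    `scale`; all other neurons use their weights unchanged. With positive
    thresholds, scale = 0 silences the flagged neurons entirely.
    """
    outputs = []
    for row, theta, fired in zip(weight_rows, thresholds, fired_at_t0):
        factor = scale if fired else 1.0
        total = sum(x * w * factor for x, w in zip(inputs, row))
        outputs.append(1 if total >= theta else 0)
    return outputs


inputs = [1, 1]
thresholds = [1.0, 1.0]
fired_at_t0 = [True, False]          # only the first neuron fired at t0

before = layer_outputs(inputs, [[1.0, 1.0], [0.4, 0.4]], thresholds, fired_at_t0, 1.0)
absolute = layer_outputs(inputs, [[1.0, 1.0], [0.4, 0.4]], thresholds, fired_at_t0, 0.0)
# Suppose learning during the hold strengthened the second neuron (assumed values).
strengthened = layer_outputs(inputs, [[1.0, 1.0], [0.6, 0.6]], thresholds, fired_at_t0, 0.0)
restored = layer_outputs(inputs, [[1.0, 1.0], [0.6, 0.6]], thresholds, fired_at_t0, 1.0)
```

The sequence shows the intended outcome: the first neuron dominates before the cycle, falls silent during the hold while the second neuron is strengthened, and both fire once the first neuron's weights have been restored.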

As a result, escape from the so-called local minimum is achieved, and learning proceeds so as to bring the standard coincidence degree to its optimal value (the global minimum).

As shown in Fig. 6, applying the weight change cycle several times makes escape from a local minimum more reliable. However, too many weight change cycles hinder the convergence of learning, as does a weight change cycle applied in the final stage of learning.

In other words, the weight change cycle must be applied an appropriate number of times at a relatively early stage of learning. The duration of the absolute refractory period, the duration tr of the relative refractory period, and the balance between the two also need to be optimized.

Next, a data processing device in which the above optimization method is carried out effectively will be described.

In Fig. 7, the data processing device has a plurality of neural layers NL, each made up of a plurality of neurons N arranged in parallel, and the layers are configured so that the output of one neural layer becomes the input of the next neural layer.

In such a configuration, the topology of the neural network can be defined, and coordinates can be assigned to each neuron as shown in Fig. 7.

Here the X axis is taken in the direction from the input side toward the output side, and the Y axis is taken in the width direction of each neural layer.

The threshold distribution of the neurons in the initial state of the data processing device is set as shown in Fig. 8. That is, the thresholds θ of the neurons in the same neural layer are equal, and θ is set higher the later the neural layer. The gradient of this increase in threshold is smooth. In an ordinary learning process, the weight of a synapse that received a significant synaptic input for a correct input is increased; with the threshold distribution of Fig. 8, only the neurons near the input side fire at first, and the range of firing gradually spreads to the later-stage neurons.
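The initial threshold distribution of this first embodiment (equal within a layer, rising smoothly toward later layers) can be sketched as a grid indexed by layer (X) and position within the layer (Y). The function name, the linear increase, and the numeric defaults are assumptions for illustration; the text specifies only that the gradient is smooth.

```python
def layer_graded_thresholds(n_layers, neurons_per_layer,
                            theta_first=1.0, theta_step=0.5):
    """Initial thresholds theta[x][y]: constant in y, increasing with x.

    x indexes the neural layer (input side first), y the neuron within
    the layer. The linear step per layer is an illustrative assumption.
    """
    return [
        [theta_first + theta_step * x for _ in range(neurons_per_layer)]
        for x in range(n_layers)
    ]


grid = layer_graded_thresholds(3, 2)
```

With these values every neuron in the first layer starts at 1.0, the second layer at 1.5, and the third at 2.0, so the layers nearest the input are the easiest to fire at the start of learning.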

When the weight minimization process described above is introduced into this process, it has the effect of increasing the number of neurons in each neural layer that fire for a constant input. By the time the final neural layer begins to fire, many neurons have come to take part in data processing, and the network does not converge to a biased firing pattern. A fall into a local minimum can therefore be prevented.

Fig. 9 shows the initial threshold distribution of the second embodiment. In this distribution, the threshold of the neuron at the center of the data processing device is set high, and the threshold decreases with distance from this central neuron. In such a data processing device, a firing pattern arises in the early stage of learning that skirts the foot of the threshold peak. When the weight minimization described above is performed, however, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons gradually spreads toward the top of the threshold peak. As a result, many neurons come to take part in data processing by the end of learning, and the network does not converge to a biased firing pattern.
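A threshold distribution of the kind described for the second embodiment, highest at the central neuron and falling off with distance from it, might be generated as follows. This is a sketch under assumptions: the name `center_peak_thresholds`, the Euclidean distance on the (X, Y) grid, the linear fall-off, and the numeric defaults are all hypothetical.

```python
import math


def center_peak_thresholds(n_layers, neurons_per_layer,
                           theta_peak=2.0, theta_edge=1.0):
    """Initial thresholds that peak at the grid center.

    The threshold falls linearly with Euclidean distance from the center,
    reaching theta_edge at the corners. Both choices are assumptions.
    """
    cx = (n_layers - 1) / 2
    cy = (neurons_per_layer - 1) / 2
    d_max = math.hypot(cx, cy)
    grid = []
    for x in range(n_layers):
        row = []
        for y in range(neurons_per_layer):
            d = math.hypot(x - cx, y - cy)
            frac = d / d_max if d_max > 0 else 0.0
            row.append(theta_peak - (theta_peak - theta_edge) * frac)
        grid.append(row)
    return grid


peak_grid = center_peak_thresholds(3, 3)
```

On a 3 by 3 grid the central neuron gets the highest threshold, the corner neurons the lowest, and the edge neurons fall in between, so early firing would be expected around the rim.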

A fall into a local minimum can therefore be prevented.

Fig. 10 shows the initial threshold distribution of the third embodiment. Here the threshold is lowest for the neurons at the middle Y coordinate and rises with a smooth gradient toward the maximum and minimum Y coordinates. In such a data processing device, a firing pattern that runs through the region around the middle Y coordinate arises in the early stage of learning. When the weight minimization described above is performed, however, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons spreads toward both the maximum and the minimum Y coordinates. As a result, many neurons come to take part in data processing by the end of learning, and the network does not converge to a biased firing pattern. A fall into a local minimum can therefore be prevented.

Fig. 11 shows the fourth embodiment. Unlike the first embodiment shown in Fig. 8, in this embodiment the neurons of all neural layers have the same threshold at the same Y coordinate, while within a given neural layer the threshold θ is higher the larger the Y coordinate. The gradient of this increase in threshold is smooth. With the threshold distribution of Fig. 11, only the neurons with large Y coordinates fire at first, and the range of firing gradually spreads to the neurons with small Y coordinates.

When the weight minimization described above is performed, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons spreads even more smoothly.

As a result, many neurons come to take part in data processing by the end of learning, the network no longer converges to a biased firing pattern, and a fall into a local minimum is prevented.

Fig. 12 shows the fifth embodiment. In this embodiment, the neurons of all neural layers have the same threshold at the same Y coordinate, while within a given neural layer the threshold θ takes its maximum at a specific Y coordinate M (for example, the center) and decreases smoothly as the Y coordinate moves away from M. With the threshold distribution of Fig. 12, the neurons toward the maximum and minimum Y coordinates fire at first, and the range of firing gradually spreads to the neurons near the Y coordinate M. When the weight minimization described above is performed, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons spreads even more smoothly, and a fall into a local minimum is prevented.

Fig. 13 shows the sixth embodiment. In this embodiment, the threshold θ of a neuron is smaller the smaller its X and Y coordinates, and increases smoothly as the X and Y coordinates become larger. With this threshold distribution, the neurons in the region of small X and Y coordinates fire at first, and the range of firing gradually spreads to the neurons with larger X and Y coordinates. When the weight minimization described above is performed, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons spreads even more smoothly, and a fall into a local minimum is prevented.

Fig. 14 shows the seventh embodiment. As the initial threshold distribution, the threshold of the central neuron is set to the minimum value, and the threshold increases with distance from this central neuron. In this embodiment, a firing pattern arises from the central neuron in the early stage of learning. When the weight minimization described above is performed, neurons with relatively low thresholds exist adjacent to the firing pattern, so the distribution of firing neurons gradually spreads toward the surrounding neurons. As a result, many neurons come to take part in data processing by the end of learning, convergence to a biased firing pattern is avoided, and a fall into a local minimum is prevented.

Fig. 15 shows the eighth embodiment. As the initial threshold distribution, the threshold is, for example, higher toward the center with respect to the Y coordinate and lower toward the center with respect to the X coordinate. In other words, the threshold distribution has a saddle shape.
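A saddle-shaped distribution of this kind can be sketched with a quadratic surface that curves upward along X and downward along Y around the grid center. The function name `saddle_thresholds`, the quadratic form, and the numeric defaults are assumptions made for illustration, not values from the patent.

```python
def saddle_thresholds(n_layers, neurons_per_layer,
                      theta_center=1.5, curve_x=0.1, curve_y=0.1):
    """Saddle-shaped initial thresholds theta[x][y].

    Along X the threshold is lowest at the center; along Y it is highest
    at the center. The quadratic form is an illustrative assumption.
    """
    cx = (n_layers - 1) / 2
    cy = (neurons_per_layer - 1) / 2
    return [
        [
            theta_center + curve_x * (x - cx) ** 2 - curve_y * (y - cy) ** 2
            for y in range(neurons_per_layer)
        ]
        for x in range(n_layers)
    ]


saddle = saddle_thresholds(3, 3)
```

Moving away from the center along X raises the threshold, while moving away along Y lowers it, which is the saddle form the paragraph describes. The curvature values would need to be kept small enough that all thresholds remain positive for the grid size in use.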

This embodiment also provides the same effects as the embodiments described above.

[Effect of the invention]

As described above, the method for optimizing a data processing device according to the present invention has the excellent effect that a local minimum can be reliably escaped.

[Brief explanation of drawings]

Fig. 1 is a graph showing the change in the standard coincidence degree when appropriate learning is performed.
Fig. 2 is a graph showing the change in the standard coincidence degree when inappropriate learning is performed.
Fig. 3 is a graph showing the change in the standard coincidence degree when the learning process falls into a local minimum.
Fig. 4 is a conceptual diagram showing an example of the firing state of a neural network that has fallen into a local minimum.
Fig. 5 is a graph showing the weight change in one embodiment of the method of the present invention.
Fig. 6 is a graph showing the weight change in another embodiment.
Fig. 7 is a conceptual diagram showing an example of the neural network of the data processing device.
Fig. 8 is a graph showing the initial threshold distribution of the first embodiment.
Fig. 9 is a graph showing the initial threshold distribution of the second embodiment.
Fig. 10 is a graph showing the initial threshold distribution of the third embodiment.
Fig. 11 is a graph showing the initial threshold distribution of the fourth embodiment.
Fig. 12 is a graph showing the initial threshold distribution of the fifth embodiment.
Fig. 13 is a graph showing the initial threshold distribution of the sixth embodiment.
Fig. 14 is a graph showing the initial threshold distribution of the seventh embodiment.
Fig. 15 is a graph showing the initial threshold distribution of the eighth embodiment.

Claims

[Claims]

(1) A method for optimizing a data processing device that is provided with a plurality of neurons, each outputting data according to the result of comparing a threshold with the sum of the products of input data and predetermined weights, and in which the initial threshold distribution of the neurons is biased with a smooth gradient, the method comprising: forcibly changing the weights of a neuron that has produced a significant output at a certain point in time so that they are temporarily reduced to a minimum value and then increased to their value at that point in time; and, while applying constant input data during the period of this forced weight change, correcting the weights based on the evaluation of the output.