JP2798793B2

JP2798793B2 - Neural network device

Info

Publication number: JP2798793B2
Application number: JP2186499A
Authority: JP
Inventors: 能彬穐本; 秀雄田中; 宏美荻; 良夫泉井; 久雄田岡; 敏明坂口
Original assignee: Tokyo Electric Power Co Inc; Mitsubishi Electric Corp
Current assignee: Tokyo Electric Power Co Inc; Mitsubishi Electric Corp
Priority date: 1990-07-12
Filing date: 1990-07-12
Publication date: 1998-09-17
Anticipated expiration: 2013-09-17
Also published as: JPH0476662A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Industrial Application Field] The present invention relates to a neural network device that simulates nerve cells and the connections between them in order to perform, for example, memory, inference, pattern recognition, control, model estimation, and function approximation.

[Prior Art] FIG. 8 is an explanatory diagram showing the configuration of a conventional multilayer feedforward neural network device described, for example, in a journal article (David E. Rumelhart, Geoffrey E. Hinton & Ronald J. Williams, "Learning representations by back-propagating errors", Nature, Vol. 323, No. 9, pp. 535-536, October 1986). The figure shows the case of one input layer, one intermediate layer, and one output layer, in which the input layer has three neural elements, the intermediate layer has four, and the output layer has one. In the figure, (11) denotes elements that simulate nerve cells (hereinafter, neural elements), which form an input layer (11a), an intermediate layer (11b), and an output layer (11c). (12) denotes elements that connect the layers of neural elements (11) and simulate synapses (hereinafter, connection elements); the strength of each connection is called its connection weight.

In the neural network device configured in this way, the neural elements (11) are connected in layers. In terms of dynamics, as indicated by arrow A, an input signal entering at the input layer (11a) propagates through the intermediate layer (11b) to the output layer (11c).

Quantitatively, this is as follows. Let V^p_{1i} be the i-th value of the p-th learning data item at the input layer (11a), d_{kp} the k-th value of the p-th learning data item at the output layer (11c), u_{hj} and V_{hj} the internal state and the output value of the j-th neural element in the h-th layer, and W_{hji} the connection weight between the i-th neural element in the h-th layer and the j-th neural element in the (h+1)-th layer. In this example, h = 1 for the input layer (11a), h = 2 for the intermediate layer (11b), and h = 3 for the output layer (11c). The variables are then related as in Equations (1) and (2).

V_{hj} = g(u_{hj})   … (2)

Here, the function g(*) need only be differentiable and non-decreasing. One example is the function given by Equation (3), which is plotted in FIG. 9 with u on the horizontal axis and g(u) on the vertical axis.
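The forward propagation described above can be sketched as follows. Equations (1) and (3) are not reproduced in this text, so the sketch assumes the usual forms: the internal state is the weighted sum of the previous layer's outputs (no threshold term), and g is the logistic sigmoid. The names `sigmoid` and `forward` and the weight values are illustrative only.

```python
import math

def sigmoid(u):
    """Assumed form of g in Equation (3): differentiable and non-decreasing."""
    return 1.0 / (1.0 + math.exp(-u))

def forward(weights, inputs, g=sigmoid):
    """Propagate inputs layer by layer and return the outputs of every layer.

    weights[h][j][i] is the connection weight between element i of one layer
    and element j of the next layer.
    """
    outputs = [list(inputs)]  # the input layer passes its values through
    for layer in weights:
        prev = outputs[-1]
        # internal state u_j = sum_i W_ji * V_i (assumed Equation (1), no bias)
        states = [sum(w * v for w, v in zip(row, prev)) for row in layer]
        # output value V_j = g(u_j), Equation (2)
        outputs.append([g(u) for u in states])
    return outputs

# 3 input elements, 4 intermediate elements, 1 output element, as in FIG. 8
W1 = [[0.1, -0.2, 0.3] for _ in range(4)]
W2 = [[0.5, -0.5, 0.25, 0.1]]
layers = forward([W1, W2], [1.0, 0.5, -1.0])
print(layers[-1])
```

Any other differentiable, non-decreasing function can be passed as `g` without changing the propagation itself.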

Further, the connection weights W are determined successively by the learning rule of Equation (4). That is, they are determined successively by the steepest descent method applied to the squared error between the learning data d_{1p} at the output layer (the desired signal) and the value actually obtained by the neural network. Letting H (= 3) be the number of layers of the neural network, the squared error is given by Equation (4).

The successive update of the connection weights W can be carried out by Equation (5) when the momentum method is used, with α and β as suitably chosen parameters.

In actual use, when Equation (5) is discretized into a difference equation and its right-hand side is written out in more detail, learning proceeds by adjusting the connection weights W while the output error is propagated from the output side to the input side, as indicated by arrow B. The method is therefore called the back-propagation method, or backpropagation.
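A discretized momentum update can be illustrated with the minimal sketch below. Equation (5) is not reproduced in this text, so the sketch assumes the common form in which α scales the gradient of the error and β scales the previous weight change. The toy error function and the name `momentum_step` are illustrative and not taken from the patent.

```python
def momentum_step(w, grad, prev_delta, alpha=0.1, beta=0.5):
    """One discrete update: delta = -alpha * dE/dw + beta * (previous delta)."""
    delta = [-alpha * g + beta * d for g, d in zip(grad, prev_delta)]
    new_w = [wi + di for wi, di in zip(w, delta)]
    return new_w, delta

def grad_E(w):
    """Gradient of the toy error E(w) = 0.5*(w0 - 1)^2 + 0.5*(w1 + 2)^2."""
    return [w[0] - 1.0, w[1] + 2.0]

w, delta = [0.0, 0.0], [0.0, 0.0]
for _ in range(200):
    w, delta = momentum_step(w, grad_E(w), delta)
print(w)  # approaches the minimum of the toy error at (1, -2)
```

In a real network the gradient would come from propagating the output error backwards through the layers rather than from a closed-form expression.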

[Problem to be Solved by the Invention] Because the conventional neural network device is configured as described above, the iterative computation in the learning algorithm takes a very long time to converge. In addition, it has not been known in advance how many intermediate layers, and how many neural elements in each layer, are required.

The present invention was made to solve these problems. Its object is to obtain a neural network device in which the iterative computation for learning converges quickly, so that learning is fast, and in which the number of neural elements in the intermediate layer can be determined in advance.

[Means for Solving the Problems] In the neural network device according to the present invention, for learning data composed of pairs of an input value at irregular intervals and a corresponding output value, sampling data composed of pairs of an input value at regular intervals and a corresponding output value is assumed. A sampling function translated by the input value of the sampling data is stored in each neural element of the intermediate layer, and the output values of the sampling data are set as the connection weights between the intermediate layer and the output layer. The output values of the sampling data are adjusted by a learning equation so that the error between the output obtained when the input values of the learning data are given and the output values of the learning data becomes a local minimum.

[Operation] In the neural network device according to the present invention, sampling data sampled at regular intervals is assumed for learning data composed of pairs of input and output values obtained at irregular intervals. A sampling function translated by the input value of the sampling data is stored in each neural element of the intermediate layer, and the output values of the sampling data are set as the connection weights between the intermediate layer and the output layer. Learning is then performed by adjusting these connection weights, which are the output values of the sampling data, so that the output of the neural network device becomes the desired output value when the input values of the learning data are given to it. Because the iterative computation in learning converges quickly, fast learning becomes possible. Moreover, as in the sampling theorem, if the complexity of the information source that generated the learning data, that is, its highest frequency, is known, the required number of neural elements in the intermediate layer can be determined.

[Embodiment] FIG. 1 is an explanatory diagram showing the configuration of a neural network device according to one embodiment of the present invention. In the figure, (11) denotes neural elements, which form an input layer (11a), an intermediate layer (11b), and an output layer (11c). (12) denotes the connection weights between the neural elements (11). FIG. 2 shows an example of the input/output characteristic of a neural element (11), with the input value on the horizontal axis and the output value on the vertical axis, for the one-dimensional case with M = 21 samples.

The function of a conventional multilayer feedforward neural network device is explained with reference to FIG. 3. In the figure, (31) is the original function that generated the learning data, a parabola; the five black dots (32) are the learning data; and (33) is the interpolated curve reproduced by a conventional three-layer neural network. The horizontal axis is the input and the vertical axis is the output. Comparing the reproduced curve (33) with the original curve (31), values close to the outputs of the original curve (31) are generated at the learning data (32), and elsewhere the curve is reproduced by interpolation. Thus the conventional device essentially functions as an interpolator between given input-output pairs.

Since this is what the conventional neural network device does, the same function can be realized by using, for the neural elements (11), the characteristic shown in FIG. 2 in place of the so-called sigmoid function shown in FIG. 9.

To show this, the Fourier expansion is explained first. A periodic function can be expanded in a Fourier series. If the domain of the function y = f(x) is taken to be a hypercube with sides of length 1, the domain may be regarded as repeating with period 1 throughout the rest of the space, so the function as a whole is periodic. FIG. 4 shows the domain in the two-dimensional case. The function y = f(x) can therefore be Fourier-expanded as follows.

Here, Equation (7) gives the strength of the Fourier-series component of frequency k contained in the function f(x), and N denotes the number of input dimensions. K is the highest frequency of f(x); in general, restoring an unknown function requires taking frequencies up to infinity into account.

For the function f(x) to be Fourier-expandable, its domain must lie within the hypercube. In general, however, the actual learning data x′ to be handled by the neural network device takes arbitrary real values, which violates the hypercube requirement. The input data is therefore suitably normalized so that the resulting value x falls within the hypercube.

It is also customary to keep the output y of the neural network device within the hypercube, so inverse normalization is applied to obtain the actual output data y′. The operation of the whole system, including input and output normalization, is shown in FIG. 5: block (41) normalizes the actual learning data x′, which takes arbitrary real values; the neural network device of block (42) approximates the function; and block (43) inversely normalizes its output y to obtain the actual output data y′. The following normalization methods, for example, are conceivable.

A method using a sigmoid function; a method using an affine transformation.

When the domain of the actual variable is the entire set of real numbers, applying a sigmoid function as in the following equation maps it nonlinearly into the hypercube.

When the domain of the data is known in advance, as in the following expression, the region it indicates can be affine-transformed by Equation (10) and thereby pushed into the hypercube.
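The two normalization methods and their inverses can be sketched for a single coordinate as follows. The equations for the sigmoid mapping and Equation (10) are not reproduced in this text, so the sketch assumes the standard logistic function and a linear rescaling of a known interval [lo, hi] onto [0, 1]. All four function names are illustrative.

```python
import math

def sigmoid_normalize(x_raw):
    """Map any real value nonlinearly into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x_raw))

def sigmoid_denormalize(x):
    """Inverse of sigmoid_normalize (the logit), valid for 0 < x < 1."""
    return math.log(x / (1.0 - x))

def affine_normalize(x_raw, lo, hi):
    """Map the known range [lo, hi] linearly onto [0, 1]."""
    return (x_raw - lo) / (hi - lo)

def affine_denormalize(x, lo, hi):
    """Inverse of affine_normalize."""
    return lo + x * (hi - lo)

x = sigmoid_normalize(2.5)
print(x, sigmoid_denormalize(x))
print(affine_normalize(15.0, 10.0, 20.0), affine_denormalize(0.5, 10.0, 20.0))
```

For multidimensional data the same mapping would be applied to each coordinate separately, so that every coordinate lands inside the unit hypercube.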

Let f(x) be the function to be estimated, and transform it as follows using Equations (6) and (7).

Here, {*} approaches the delta function when the highest frequency K is sufficiently large. That is, the function f(x) can be interpreted as a superposition of delta functions located at x′ with height f(x′).

Since the learning data is obtained only at irregular intervals, the integral of Equation (15) cannot be evaluated accurately unless there is enough data to fill the domain. For simplicity, the learning data is therefore first assumed to be regular, and the result is then used to handle irregular learning data. Suppose the learning data happens to be obtained so that it divides the hypercube into 2K_i + 1 = M_i equal parts, and evaluate the integral of Equation (15) by discretization. First, let [equation]. Further, let the integral be evaluated at the representative points obtained by the division into M_i equal parts, [equation]. Then, taking K to be the highest frequency of the function f(x), we obtain [equation], where

M = 2K + 1   … (22)

To make the meaning of Equation (21) clearer, it is rewritten as follows.

Here, Φ(x) is a function that approaches the delta function as M_i increases; for one-dimensional input and output with M = 21, it is the curve shown in FIG. 2.
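The role of Φ(x) as a sampling function can be illustrated in one dimension. The defining equation of Φ is not reproduced in this text, so the sketch below assumes it is the truncated Fourier sum (1/M) Σ_{k=-K}^{K} exp(j2πkx), which equals sin(Mπx) / (M sin(πx)) and is consistent with the surrounding derivation. The names `phi` and `reconstruct` and the test function `f` are illustrative.

```python
import math

def phi(x, K):
    """Assumed sampling function (1/M) * sum_{k=-K..K} exp(j*2*pi*k*x), M = 2K + 1.

    Written with cosines so it stays real and avoids 0/0 at integer x.
    """
    M = 2 * K + 1
    total = 1.0 + 2.0 * sum(math.cos(2.0 * math.pi * k * x) for k in range(1, K + 1))
    return total / M

def reconstruct(x, samples, K):
    """f(x) = sum_m y_m * phi(x - m/M) for M equally spaced samples y_m."""
    M = 2 * K + 1
    return sum(y * phi(x - m / M, K) for m, y in enumerate(samples))

def f(x):
    """Test function whose highest frequency is 2."""
    return 1.0 + math.sin(2.0 * math.pi * x) + 0.5 * math.cos(4.0 * math.pi * x)

K = 2
M = 2 * K + 1
samples = [f(m / M) for m in range(M)]  # regular samples on the unit interval
print(abs(reconstruct(0.37, samples, K) - f(0.37)))  # reconstruction error
```

In network terms, each intermediate element holds one translated copy `phi(x - m/M, K)` and the sample values `y_m` act as the weights to the output element.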

From the above, y = f(x) is expanded in terms of the basis function Φ(x). In other words, this is a sampling theorem whose domain is restricted to the hypercube. Consequently, if the highest frequency K is known, reproducing the function requires only M items of learning data, with M given by Equation (22); any further learning data is unnecessary. This means that once the complexity of the function, that is, its highest frequency, is known, the required number of learning data items is known, and that with fewer items the function cannot be reproduced completely. FIG. 1 is Equation (27) expressed in the form of a three-layer neural network. The intermediate layer (11b) in this embodiment need only consist of P = 21 elements.

As described above, when 2K + 1 items of learning data are obtained regularly so as to divide the hypercube into equal parts, the function can be restored completely, just as in the sampling theorem. Here, however, the learning data is assumed to be obtained only irregularly. In that case, regular data (hereinafter, sampling data) is assumed, the error between the function restored from this data and the irregular learning data (hereinafter simply, learning data) is considered, and the sampling data is adjusted so as to bring this error to a local minimum.

Letting the sampling data be (x_m, y_m) and the learning data be (x_p, y_p), the error E is defined as follows.

Therefore, to minimize E, the sampling data may be adjusted, for example, by the steepest descent method as follows.

Alternatively, as in backpropagation, the momentum method can be used to take an acceleration term into account.

Note that Equation (33) is a learning equation written as a differential equation rather than a difference equation.
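The adjustment of the sampling data by steepest descent can be sketched in one dimension as follows. The error and update equations are not reproduced in this text, so the sketch assumes E = (1/2) Σ_p (y_p - f(x_p))², with f(x_p) = Σ_m y_m Φ(x_p - x_m) and Φ the truncated Fourier sum. The step size `eta`, the data values, and all function names are illustrative choices.

```python
import math

def phi(x, K):
    """Assumed sampling function for M = 2K + 1 regular samples."""
    M = 2 * K + 1
    return (1.0 + 2.0 * sum(math.cos(2.0 * math.pi * k * x) for k in range(1, K + 1))) / M

def model(x, samples, K):
    """Network output: sample outputs weighted by translated sampling functions."""
    M = 2 * K + 1
    return sum(y * phi(x - m / M, K) for m, y in enumerate(samples))

def error(data, samples, K):
    """Assumed error: half the sum of squared residuals over the learning data."""
    return 0.5 * sum((y - model(x, samples, K)) ** 2 for x, y in data)

def descent_step(data, samples, K, eta):
    """One steepest-descent step, using dE/dy_m = -sum_p r_p * phi(x_p - x_m)."""
    M = 2 * K + 1
    residuals = [(x, y - model(x, samples, K)) for x, y in data]
    return [y_m + eta * sum(r * phi(x - m / M, K) for x, r in residuals)
            for m, y_m in enumerate(samples)]

# irregularly spaced learning data on the unit interval (made-up values)
data = [(0.05, 0.1), (0.21, 0.42), (0.48, 0.96), (0.63, 0.74), (0.9, 0.2)]
K = 3
samples = [0.0] * (2 * K + 1)
eta = 1.0 / len(data)
errors = [error(data, samples, K)]
for _ in range(200):
    samples = descent_step(data, samples, K, eta)
    errors.append(error(data, samples, K))
print(errors[0], errors[-1])
```

Because the model is linear in the sample outputs, E is a quadratic function of them, which is one plausible reason the iteration converges quickly compared with adjusting weights behind a nonlinear sigmoid.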

Note that unless the assumed sampling data covers at least the highest frequency of the function that generated the learning data, a function f(x) that satisfies the learning data cannot in theory be constructed. Therefore, when the sampling data is insufficient, the assumed highest frequency must be raised to increase the number of sampling data items.

The overall operation is shown in the flowchart of FIG. 6. In step (51), the learning data pairs are prepared. In step (52), the highest frequency K and the initial values of the sampling data are set. In step (53), the sampling data is adjusted so as to minimize the error E on the learning data. In step (54), if the adjustment is insufficient, a higher highest frequency is assumed and the sampling data is adjusted again. In step (55), if the error is sufficiently small, processing ends.
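The control flow of steps (52) to (55) can be sketched as an outer loop. The callables `adjust` and `refine`, the schedule of K values, and the tolerance are hypothetical placeholders introduced for illustration; the patent does not specify their interfaces. The stub implementations exist only so that the loop can run.

```python
def train(adjust, refine, initial_samples, schedule, tolerance):
    """Outer loop over a non-empty schedule of assumed highest frequencies K.

    adjust(samples, K) -> (samples, error)   plays the role of step (53)
    refine(samples, K_new) -> samples        plays the role of step (54)
    """
    samples = initial_samples
    error = float("inf")
    for step, K in enumerate(schedule):
        if step > 0:
            samples = refine(samples, K)     # assume a higher frequency
        samples, error = adjust(samples, K)  # minimise E on the learning data
        if error <= tolerance:               # step (55): small enough, stop
            return samples, K, error
    return samples, schedule[-1], error

def stub_adjust(samples, K):
    """Stand-in for the real adjustment: pretend the error shrinks as 1/K."""
    return samples, 1.0 / K

def stub_refine(samples, K_new):
    """Stand-in for the real refinement: pad with zeros up to 2*K_new + 1 samples."""
    return samples + [0.0] * (2 * K_new + 1 - len(samples))

result_samples, result_K, result_error = train(
    stub_adjust, stub_refine, [0.0] * 3, [1, 3, 5, 7], tolerance=0.25)
print(result_K, result_error, len(result_samples))
```

In practice `adjust` would run a descent on the sampling data until it stops improving, and `refine` would estimate the enlarged sampling data from the old one.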

The initial values of the sampling data in step (52) may be estimated, for example, by linear interpolation from nearby y_p. As for increasing the sampling data when the highest frequency is raised in step (54), one possible method is to estimate the new sampling data from the old sampling data by linear interpolation.
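Estimating new sampling data from old sampling data by linear interpolation might look like the following one-dimensional sketch. It assumes the sampling points lie at m/M on the unit interval and that the function is periodic with period 1, as in the Fourier discussion; the patent itself only states that linear interpolation is used. The name `refine_samples` is illustrative.

```python
def refine_samples(old, M_new):
    """Linearly interpolate old sample outputs (grid m/M_old) onto the grid n/M_new.

    The function is treated as periodic with period 1, so the last old sample
    is interpolated towards the first one.
    """
    M_old = len(old)
    new = []
    for n in range(M_new):
        pos = n * M_old / M_new       # position in units of the old spacing
        i = int(pos)                  # index of the left neighbour
        t = pos - i                   # fractional distance to the right neighbour
        left = old[i % M_old]
        right = old[(i + 1) % M_old]  # wraps around because of periodicity
        new.append((1.0 - t) * left + t * right)
    return new

old_samples = [0.0, 3.0, 6.0]          # M = 3, that is, K = 1
print(refine_samples(old_samples, 7))  # M = 7, that is, K = 3
```

Because this estimate is only approximate, the error can rise temporarily right after the highest frequency is increased.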

FIG. 7 shows convergence results obtained by simulation. The function from which the learning data is generated (hereinafter, the original function) is given by Equation (34) and is mountain-shaped, as shown by line (61) in FIG. 7(c).

FIG. 7(a) plots the square root of the error E against the number of iterations, FIG. 7(b) plots the maximum absolute error against the number of iterations, and FIG. 7(c) plots the output against the input. In FIGS. 7(a) and 7(b), the error increases where the highest frequency is switched; this is because the estimate of the new sampling data obtained by interpolation is not sufficiently accurate. In FIG. 7(c), the black dots (62) are the learning data, the decaying trigonometric curves (63) are the sampling functions, the piecewise-straight line (61) connecting the learning data is the original function, and the winding line (64) is the reproduced function. Comparing line (61) with line (64) shows that the function is reproduced with good accuracy in every case.

The results of a speed comparison with the conventional method, backpropagation (BP), are given in Table 1. The conditions are as follows.

The original function is the mountain-shaped function shown in FIG. 7.

The computer used is a general-purpose workstation.

The BP network has 100 elements in its intermediate layer.

The irregular learning data is drawn at random from the original function.

The initial value of the highest frequency is 1, and the iterative computation is repeated while the frequency is increased as 1, 3, 5, 7, and so on.

Table 1 compares the computation times of the conventional example and the embodiment. In the table, "…" indicates that no simulation was performed.

As Table 1 shows, because the computation is iterative, learning takes longer as the amount of learning data increases and the original function must be reproduced more precisely. Since each simulation was run only once, the results are considerably affected by the initial value of the random numbers, and this influence is visible in the table. Conventional backpropagation depends on the parameter settings, but its learning time is very long, and it no longer converges once the number of data items reaches 10. In any case, compared with conventional backpropagation, the neural network device of this embodiment is found to be very fast.

Thus, in the above embodiment, the iterative computation for learning converges quickly, so fast learning is possible; furthermore, if the complexity of the information source that generated the learning data, that is, its highest frequency, is known, the required number of neural elements in the intermediate layer can be determined.

The number of layers and the number of neural elements constituting the neural network are not limited to those of the above embodiment and may be changed according to the field of application.

[Effects of the Invention] As described above, according to the present invention, in a neural network device that is composed of an input layer, an intermediate layer, and an output layer and that comprises a plurality of neural elements simulating biological nerve cells and connection weights connecting the layers of neural elements, sampling data composed of pairs of an input value at regular intervals and a corresponding output value is assumed for learning data composed of pairs of an input value at irregular intervals and a corresponding output value; a sampling function translated by the input value of the sampling data is stored in each neural element of the intermediate layer; the output values of the sampling data are set as the connection weights between the intermediate layer and the output layer; and the output values of the sampling data are adjusted by a learning equation so that the error between the output obtained when the input values of the learning data are given and the output values of the learning data becomes a local minimum. As a result, the iterative computation for learning converges quickly, so learning is fast, and, if the complexity of the information source that generated the learning data, that is, its highest frequency, is known, the required number of neural elements in the intermediate layer can be determined.

[Brief description of the drawings]

FIG. 1 is an explanatory diagram showing the configuration of one embodiment of the neural network device according to the present invention.
FIG. 2 is a characteristic diagram showing the input/output characteristic of a neural element according to this embodiment.
FIG. 3 is a graph of functions illustrating the operation of a conventional multilayer feedforward neural network device.
FIG. 4 is an explanatory diagram showing the domain of the function approximated by one embodiment of the present invention.
FIG. 5 is an operation diagram of the whole system, including normalization of the input variables and inverse normalization of the output variables.
FIG. 6 is a flowchart showing the overall operation according to one embodiment.
FIG. 7(a) is a graph showing the reduction of the error as learning converges.
FIG. 7(b) is a graph showing the maximum absolute error at the learning data points.
FIG. 7(c) is a graph showing the original function and the function reproduced by the device according to one embodiment.
FIG. 8 is an explanatory diagram showing the configuration of a conventional three-layer feedforward neural network device.
FIG. 9 is a characteristic diagram showing the input/output characteristic of a neural element according to the conventional device.
(11), (11a), (11b), (11c): neural elements; (12): connection weights.

Continued from the front page

(72) Inventor: Hiromi Ogi, System Research Laboratory, Tokyo Electric Power Co., Inc., 1-4-10 Irifune, Chuo-ku, Tokyo
(72) Inventor: Yoshio Izui, Industrial System Research Laboratory, Mitsubishi Electric Corporation, 8-1-1 Tsukaguchi-Honmachi, Amagasaki, Hyogo
(72) Inventor: Hisao Taoka, Industrial System Research Laboratory, Mitsubishi Electric Corporation, 8-1-1 Tsukaguchi-Honmachi, Amagasaki, Hyogo
(72) Inventor: Toshiaki Sakaguchi, Industrial System Research Laboratory, Mitsubishi Electric Corporation, 8-1-1 Tsukaguchi-Honmachi, Amagasaki, Hyogo
(56) References: M. Stinchcombe, H. White, "Universal Approximation Using Feedforward Networks with Non-Sigmoid Hidden Layer Activation Functions", IEEE International Joint Conference on Neural Networks, Vol. 1, pp. I-613 to I-617 (1989)
(58) Field searched (Int. Cl.6, DB name): G06F 15/18; JICST file (JOIS)

Claims

(57) [Claims]

A neural network device composed of an input layer, an intermediate layer, and an output layer and comprising a plurality of neural elements that simulate biological nerve cells and connection weights that connect the layers of the neural elements, wherein, for learning data composed of pairs of an input value at irregular intervals and a corresponding output value, sampling data composed of pairs of an input value at regular intervals and a corresponding output value is assumed; a sampling function translated by the input value of the sampling data is stored in each neural element of the intermediate layer; the output values of the sampling data are set as the connection weights connecting the intermediate layer and the output layer; and the output values of the sampling data are adjusted by a learning equation so that the error between the output obtained when the input values of the learning data are given and the output values of the learning data becomes a local minimum.