JPH06266692A

JPH06266692A - Neural network

Info

Publication number: JPH06266692A
Application number: JP5078657A
Authority: JP
Inventors: Masahiko Tateishi; 雅彦立石
Original assignee: NipponDenso Co Ltd
Current assignee: Denso Corp
Priority date: 1993-03-12
Filing date: 1993-03-12
Publication date: 1994-09-22

Abstract

PURPOSE: To provide a neural network that applies a transformation that makes learning easier. CONSTITUTION: An input layer 1 has n input units, and a TLT (threshold logic transform) layer 2 of N units is provided for each input unit. An output layer 4 outputs m data values. For n = 1 and N = 4, the input is multiplied by four, and the resulting value is distributed one unit at a time to give the value of each TLT unit. When the outputs of the TLT units are fed to the higher-level intermediate layer 3, with w_ji denoting the weight (coupling coefficient) to the j-th intermediate-layer unit, the transformation takes a shape that approximates the final output. For n = 1, w_ji × N is the slope of each segment of the polygonal line, so the approximating line can be read off directly from the learned weights w_ji, and a rough picture of the transformation can be obtained efficiently. In addition, each TLT-layer unit is responsible for one part of the input range, so the role played by each unit is also easy to understand.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to applications of neural networks, in particular to fields such as control and pattern recognition in which a neural network (NNW) is applied.

[0002]

[Prior Art] A neural network is a system whose settings and constants (coupling coefficients) can be changed by learning when it performs control, recognition, and similar tasks. It is well suited to automating empirical operations and shape recognition that are difficult to express as formulas. The knowledge-processing function that grasps the situation during signal processing and reflects it back into that processing is called learning. Through learning from only a limited set of data, a neural network acquires interpolation and prediction ability: based on the learned relationships, it can produce desirable judgments and outputs in regions it has not been trained on. For layered neural networks with an input layer, an intermediate layer, and an output layer, BP learning (backpropagation learning) is widely known. This is a learning algorithm that, when an output value is obtained for an input to a system, changes the coupling coefficients connecting the layers, starting from the output-layer side, so that the system produces the desired output for that input. In some cases this learning takes so much computation time that the intended control or recognition cannot be carried out efficiently, and several methods of making learning more efficient have therefore been proposed.

[0003] One means of making BP learning more efficient is to change the learning parameters (learning rate, inertia term) during learning, as disclosed in Japanese Patent Laid-Open No. 1-320565. This method aims to speed up learning by changing the learning parameters at every iteration of the learning computation, or once every several iterations.

[0004] Another means of making learning more efficient is preprocessing, in which the input data are converted in advance into a form that is easy to learn. Examples are a method that extracts features from the input data with membership functions from fuzzy theory (Japanese Patent Laid-Open No. 3-134704) and a method that transforms the input data with Chebyshev functions (Namatame and Ueda, Transactions of the Information Processing Society of Japan, Vol. 32, No. 12, pp. 1542-1550). The latter builds the preprocessing into the neural-network framework itself (Fig. 2(b), called a Chebyshev network), but functionally it has the same form as the former, because the processing takes place before the intermediate layer that does the learning (see Fig. 2(a)). For learning efficiency, however, what matters is the kind of processing applied in the front stage. In other words, the characteristics of the system change greatly depending on the formula used for the transformation, so the former and the latter are essentially different methods. Both methods seek efficiency by transforming the data so that they are easier to learn.

[0005]

[Problems to Be Solved by the Invention] However, in the former method, which extracts features with membership functions, feature quantities of the problem at hand, such as disturbances and process state variables, are transformed by membership functions, and which feature quantities to select and how strongly to transform them must be decided for each problem. The method therefore relies on trial and error and is not general-purpose.

[0006] The latter method, which uses a Chebyshev network, is a general-purpose transformation, but it uses the cos and arccos functions for the data transformation. This requires dedicated circuitry, and the computation takes time. Holding a lookup map in memory has been proposed as an alternative, but that requires a large amount of memory, so the method is difficult to use for general purposes.

[0007] The inventors therefore consider that the above problems arise because the use of complicated functions in the preprocessing hinders generality. The object of the present invention is to provide a neural network with more efficient learning by proposing, as preprocessing, a transformation that is general-purpose and whose processing function is easy to compute.

[0008]

[Means for Solving the Problems] To solve the above problems, the first invention is a multilayer neural network having at least three layers, namely an input layer, an intermediate layer, and an output layer, each layer containing at least one unit, characterized in that: at least one threshold logic transform layer is provided for the units of the p-th layer, between the p-th layer and the q-th layer (p and q are natural numbers with q = p + 1); the threshold logic transform layer converts the input data x of one unit of the p-th layer into N data values (N ≥ 2) according to Equation 1; the transformation function ψ is a threshold logic transformation defined for data z by Equation 2; and each of the N data values is output to the j-th unit of the q-th layer with a weight w_ji (i = 0, 1, 2, ..., N-1). In a related invention, the function f in Equation 2 is in one-to-one correspondence with the input data z on the interval 0 ≤ z ≤ 1. Alternatively, the function f in Equation 2 is given by Equation 3, Equation 4, or Equation 5. Further, the p-th layer is the input layer and the q-th layer is the intermediate layer.

[0009]

[Operation] The data x of one unit is converted into N data values by the threshold logic transform layer (hereinafter, TLT layer) according to Equations 1 and 2. Each of the N values is multiplied by its coupling coefficient w_ji and passed to the next layer. Taking f(z) = z as an example, the data received by the next layer amount to a polygonal-line or bar-graph approximation for a one-dimensional input and a multi-plane approximation for a two-dimensional input. During learning, the well-known BP learning is then carried out in the subsequent intermediate layers, changing the coupling coefficients so that the sum of squared errors is minimized. Unlike the conventional Chebyshev network, which approximates by determining the coupling coefficients of Chebyshev functions (equivalent to an N-th degree polynomial), the present method applies in the front stage an N-segment approximation using N linear pieces or N analytically tractable functions. As N increases, the resolution rises and the approximation becomes more accurate. In other words, the Chebyshev network synthesizes the desired output waveform by weighting, in the vertical direction, functions that are continuous over the interval, whereas the present invention approximates by finely partitioning the input value in the horizontal direction and weighting each part. This is an entirely new approach, different from the conventional one.

[0010]

[Effects of the Invention] The transformation according to the present invention applies only multiplication and addition to the input data, and even when a more complicated function is used the operation remains analytically computable. No lookup table is needed, so the neural network is easy to implement and can run on a small-scale microcomputer. Because the TLT layer performs a rough approximation and greatly reduces the deviation from the target value, the amount of learning required in the intermediate layers drops substantially; that is, learning efficiency rises substantially. The method can also handle any kind of input and is therefore highly general-purpose.

[0011]

[Embodiments] The present invention is described below on the basis of specific embodiments. Fig. 1 is a configuration diagram of a neural network showing one embodiment of the present invention. In Fig. 1, the input layer 1 consists of n input units (n-dimensional data, the input pattern), and to each input unit is connected a TLT layer (threshold logic transform layer) 2 that produces N data values according to Equations 1 and 2. The TLT layer 2 of this embodiment therefore has N × n units. This N is called the transformation constant. The TLT layer 2 is connected as a network to the conventional intermediate layer 3, whose result is sent to the output layer 4 and output as m data values (m-dimensional data, the output pattern). Depending on the system, not every input unit needs N TLT units; the number may be chosen separately for each input unit.

[0012] The operation of the TLT layer 2 is explained with the example of Fig. 3. Fig. 3 shows an example with a one-dimensional input, a one-dimensional output (not shown), f_i(z) = z, C_i = 0, and D_i = 1 in the transformation function ψ, and transformation constant N = 4. It shows the output value of each TLT unit when the input value x = 0.6 is applied. In this case Equation 1 becomes the following.

[Equation 6] $T_i(x) = \psi(4x - i) \quad (i = 0, 1, 2, 3)$

As is clear from Fig. 3, the input value 0.6 is multiplied by the transformation constant (a factor of 4) to give 2.4, and this value is distributed in steps of one unit to give the values of the TLT units. This is equivalent to increasing the resolution fourfold. Fig. 9 lists the outputs of the TLT units for four representative input values in the example of Fig. 3.
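
As a minimal sketch of this worked example, the following Python computes the N outputs of a TLT layer for one input. The piecewise form of ψ used here (C below 0, f(z) on [0, 1], D above 1) is an assumption inferred from the example values, since Equation 2 itself is not reproduced in this text. The names `psi` and `tlt_transform` are illustrative and do not come from the source.

```python
def psi(z, f=lambda z: z, c=0.0, d=1.0):
    """Assumed threshold logic function: c below 0, f(z) on [0, 1], d above 1."""
    if z < 0.0:
        return c
    if z > 1.0:
        return d
    return f(z)


def tlt_transform(x, n_units):
    """Map one input x in [0, 1] to n_units values T_i(x) = psi(N*x - i)."""
    return [psi(n_units * x - i) for i in range(n_units)]


# Example from the text: x = 0.6, N = 4, f(z) = z, C = 0, D = 1.
outputs = tlt_transform(0.6, 4)
print([round(t, 6) for t in outputs])  # [1.0, 1.0, 0.4, 0.0]
```

With f(z) = z the unit outputs always sum to N·x (here 2.4), which is one way to read the statement that the scaled input is distributed one unit at a time.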

[0013] When the values of the TLT units in Fig. 3 are passed as outputs to the higher-level intermediate layer 3, let w_ji denote the weight on the connection to the j-th intermediate-layer unit. The weighted input f sent to unit j is then a polygonal line passing through the N + 1 points given below.

[Equation 7] $(0,\ 0),\ (1/N,\ w_{j0}),\ (2/N,\ w_{j0} + w_{j1}),\ (3/N,\ w_{j0} + w_{j1} + w_{j2}),\ \ldots,\ ((N-1)/N,\ w_{j0} + w_{j1} + w_{j2} + \cdots + w_{j,N-2}),\ (1,\ \sum_{i=0}^{N-1} w_{ji})$

Fig. 5 shows how the output for the input in this example changes depending on whether a TLT layer is present. In Fig. 5(a), when a TLT layer is present (here N = 2, with unit weights p and q) and the input x varies from 0 to 1, the output is, for example, as shown in Fig. 5(b): a polygonal line made of two straight segments determined by p and q. The output can therefore be approximated more closely than without a TLT layer. Fig. 4 illustrates this for the example of Fig. 3, with the weights set to the values shown in Fig. 4. Fig. 4(a) is the composite of Figs. 4(b) to 4(e), which are the outputs T_i of the individual units multiplied by the weights w_ji, and this composite is the input value to unit j of the intermediate layer. The curve in Fig. 4(a) is the data value that would be expected to be sent to unit j. The curve shown is not final, however; it changes as learning proceeds. The difference between the actual straight-line output from unit j and this curve is the error. Thus the closer the approximation achieved in the TLT layer, the lighter the burden of error reduction in the subsequent intermediate layers, the smaller the amount of computation, and the more efficient the result. BP learning determines the weights (coupling coefficients) from the next intermediate layer to the output layer so as to compensate for the error in Fig. 4. Because BP learning also changes the coupling coefficients w_ji from the TLT layer to the intermediate-layer units, the graph of Fig. 4 changes as well.
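
The polygonal-line property of Equation 7 can be checked numerically. The sketch below assumes f(z) = z, C_i = 0, and D_i = 1, omits any bias term, and uses made-up weights; `weighted_input` is a hypothetical helper name, not one taken from the source.

```python
def tlt_transform(x, n_units):
    """T_i(x) = clip(N*x - i, 0, 1): the f(z) = z case with C_i = 0, D_i = 1."""
    return [min(1.0, max(0.0, n_units * x - i)) for i in range(n_units)]


def weighted_input(x, weights):
    """Weighted sum sent to intermediate-layer unit j (bias omitted)."""
    outputs = tlt_transform(x, len(weights))
    return sum(w * t for w, t in zip(weights, outputs))


weights = [0.8, -0.3, 0.5, 0.1]  # illustrative values for w_j0 .. w_j3
n = len(weights)

# At the knots x = k/N the weighted input equals the running sum of the weights.
for k in range(n + 1):
    print(k / n, round(weighted_input(k / n, weights), 6), round(sum(weights[:k]), 6))

# Inside segment i = 1 (0.25 <= x <= 0.5) the slope with respect to x is w_j1 * N.
slope = (weighted_input(0.40, weights) - weighted_input(0.30, weights)) / 0.10
print(round(slope, 6), round(weights[1] * n, 6))
```

Both printed columns agree at every knot, and the measured slope matches w_j1 × N, which is the relationship the text relies on when it reads the approximating line directly from the learned weights.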

[0014] Put another way, each weight w_ji corresponds to the slope of one segment of the polygonal line (strictly, w_ji × N is the slope), so each segment can be pictured directly from the learned weights w_ji. A rough picture of the desired transformation can therefore be obtained efficiently. In addition, each TLT-layer unit is responsible for one part of the input range, so the role of each unit is easy to understand. For example, if the output data are always undesirable or anomalous when the input is small, the cause may be a hardware fault, but because the small-input region is handled by the T_0 unit in the example of Fig. 4, one can respond immediately by checking the T_0 unit. Each TLT-layer unit thus has a clearly defined meaning.

[0015] For multidimensional inputs, whether the TLT layer ends up, as a result of learning, with a shape and coefficients that approximate the expected output depends on the characteristics of BP learning and has not yet been clarified analytically. The most plausible explanation is that, because BP learning works on an energy-minimization principle, it naturally settles into the shape that gives the closest approximation. As a general property of NNWs, the intermediate layer does not necessarily exhibit a feature-extraction function of approximating the output, and it can contain units that appear useless at first glance. Even such units are sending some signal, so they cannot be deleted. Nevertheless, in most of the simulation results at least one unit showed the approximation.

[0016] The above example used a straight-line approximation. When the transformation function f is a step function, the desired output is, in the one-dimensional case, approximated by a histogram. As N increases, the TLT layer in effect outputs a digitally sampled version of the function, so the TLT layer itself carries almost all of the output function. Naturally the meaning of the coupling coefficients w is quite different from that in the straight-line example: w comes to represent the output function value itself. Although convergence cannot be expected, a second TLT layer can then be placed in the next stage to refine the approximation further. This is effective when a finer approximation is wanted over part of the range. Another choice of transformation function is the sigmoid function commonly used in intermediate layers. In that case the constant a multiplied by N/4 gives the maximum slope of the graph, so a is preferably 3 to 10. With this function the computation may become cumbersome.
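
The three candidate functions named in the claims (identity, unit step, sigmoid) can be compared in a short sketch. The convention u(z) = 1 for z ≥ 0, the constants C_i = 0 and D_i = 1, and the helper name `unit_output` are assumptions made for illustration. The last lines check numerically that a sigmoid unit has maximum slope a × N / 4 with respect to x.

```python
import math


def identity(z):
    return z


def unit_step(z):
    # Assumed convention: u(z) = 1 for z >= 0, else 0.
    return 1.0 if z >= 0.0 else 0.0


def sigmoid(z, a=4.0):
    return 1.0 / (1.0 + math.exp(-a * (z - 0.5)))


def unit_output(x, i, n_units, f):
    """Output of TLT unit i, assuming psi is 0 below z = 0 and 1 above z = 1."""
    z = n_units * x - i
    if z < 0.0:
        return 0.0
    if z > 1.0:
        return 1.0
    return f(z)


# Central-difference slope of a sigmoid unit at its midpoint z = 0.5,
# which for unit i = 0 lies at x = 0.5 / N.
n_units, a, h = 4, 4.0, 1e-6
x_mid = 0.5 / n_units
f_sig = lambda z: sigmoid(z, a)
slope = (unit_output(x_mid + h, 0, n_units, f_sig)
         - unit_output(x_mid - h, 0, n_units, f_sig)) / (2 * h)
print(round(slope, 4), a * n_units / 4)
```

The factor 1/4 is the maximum derivative of the logistic function, and the factor N comes from the scaling z = N·x − i.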

[0017] In every one of these transformations, however, the output function is approximated by the transformation function on each subdivided interval, so the TLT layer is already performing error reduction. The burden on the intermediate layers and beyond is naturally lighter, and learning efficiency improves. When the output has high-frequency components or large fluctuations, increasing the number N of coupling coefficients improves the approximation accordingly and is more effective. However, the amount of computation in the TLT layer also grows with N, so an N appropriate to the target system and problem must be chosen.

[0018] Furthermore, the transformation function f_i(z) may be a different function for each i. The output function is still approximated, and using several functions makes it possible to distinguish characteristic regions of the input x. In that case, however, that many functions must be prepared, so the system tends to become more complex, and the weights are no longer simple and their meaning is harder to grasp. In addition, if it is clear that the output does not depend on a particular input, the TLT layer for that input unit need not be provided, and efficiency improves by the amount omitted.

[0019] To examine the effect of the TLT layer, simulations corresponding to pattern recognition were run on a neural network with a TLT layer (hereinafter, TLT-NNW) and on an ordinary neural network without one (hereinafter, N-NNW). The simulations used two- or three-dimensional inputs and a one-dimensional output, with the TLT layer using f_i(z) = z, C_i = 0, and D_i = 1. Learning pattern data were input, BP learning was run to change the coupling coefficients based on the error, and test patterns were then input to each trained NNW to examine the resulting outputs. Performance was compared for transformation constants N = 2, 3, 4, and 5 in terms of the number of learning iterations, the sum of squared errors on the test patterns, and the convergence rate. The TLT-NNW consists of five layers (input layer, TLT layer, first intermediate layer, second intermediate layer, and output layer), with adjacent layers fully connected. The numbers of units in the first and second intermediate layers vary by example problem. The N-NNW is the four-layer structure obtained by removing only the TLT layer from the TLT-NNW and fully connecting the input layer to the first intermediate layer.
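
The two architectures compared in the simulation can be sketched as forward passes. This is an illustration under stated assumptions, not the authors' simulator: sigmoid units, bias terms, the random initialization range, and all function names are assumptions, and training is not shown. The layer sizes (2 inputs, N = 4, intermediate layers of 6 and 6, 1 output) follow one of the example problems.

```python
import math
import random


def tlt_layer(inputs, n_units):
    """Expand each input x into N values clip(N*x - i, 0, 1) (the f(z) = z case)."""
    return [min(1.0, max(0.0, n_units * x - i)) for x in inputs for i in range(n_units)]


def dense(values, weights, biases):
    """Fully connected layer with sigmoid units (an assumed activation)."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, values)) + b)))
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def build(sizes, rng):
    """Create fully connected layers for consecutive pairs of layer sizes."""
    return [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(values, layers):
    for weights, biases in layers:
        values = dense(values, weights, biases)
    return values


rng = random.Random(0)
n_inputs, n_tlt, hidden1, hidden2, n_outputs = 2, 4, 6, 6, 1
tlt_nnw = build([n_inputs * n_tlt, hidden1, hidden2, n_outputs], rng)
n_nnw = build([n_inputs, hidden1, hidden2, n_outputs], rng)

x = [0.6, 0.25]
print(forward(tlt_layer(x, n_tlt), tlt_nnw))  # TLT-NNW: TLT layer feeds the first intermediate layer
print(forward(x, n_nnw))                      # N-NNW: inputs feed the first intermediate layer directly
```

The only structural difference is the width of the first weight matrix: N × n inputs for the TLT-NNW versus n for the N-NNW.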

[0020] Learning was carried out under the following conditions.
Learning coefficient: 0.3
Momentum: 0.9
Allowable error: ±0.05
Upper limit on learning iterations: 10,000
A run that reached the upper limit was regarded as not having converged. The simulations were run on the following four example problems.
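
To show how these settings fit together, here is a sketch of a gradient step with momentum and a stopping rule. The source does not spell out the update formula or the exact convergence test, so the standard momentum rule and the "every output within the allowable error" criterion used below are assumptions, as are the function names. The toy objective is only there to make the loop runnable.

```python
LEARNING_RATE = 0.3    # "learning coefficient" from the text
MOMENTUM = 0.9
TOLERANCE = 0.05       # allowable error
MAX_ITERATIONS = 10_000


def momentum_step(weights, grads, velocity):
    """One update: v <- momentum * v - lr * grad, then w <- w + v."""
    new_velocity = [MOMENTUM * v - LEARNING_RATE * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def converged(outputs, targets):
    """Assumed criterion: every output lies within the allowable error of its target."""
    return all(abs(o - t) <= TOLERANCE for o, t in zip(outputs, targets))


# Toy check on E(w) = 0.5 * (w - 1)^2, whose gradient is (w - 1).
w, v = [0.0], [0.0]
for iteration in range(1, MAX_ITERATIONS + 1):
    w, v = momentum_step(w, [w[0] - 1.0], v)
    if converged(w, [1.0]):
        break

# A run that used all MAX_ITERATIONS would be counted as not converged.
print(iteration, round(w[0], 4))
```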

[0021] 1. Simulation problems

Example A: sine curve

[Equation 8] $0.5 + 0.2 \sin(4\pi x)$

The data are the separation of regions C1 and C2 on the x-y plane whose boundary is given by Equation 8 (Fig. 6(a)). Points belonging to C1 and C2 are assigned output values 0 and 1, respectively.
Learning patterns: the white-circle coordinates shown in Fig. 11(b) and their output data
Test patterns: the points (x, y) = (0.05m, 0.05n), m, n = 0, 1, ..., 20; 441 patterns in total
Number of units: first intermediate layer 8, second intermediate layer 6

Example B: The white and black regions of the concentric circles centered at the point (0.5, 0.5) on the x-y plane (Fig. 7) are assigned output values 0 and 1, respectively.
Learning patterns: the points (x, y) = (0.2m, 0.2n), m, n = 0, 1, ..., 5; 36 patterns in total
Test patterns: the points (x, y) = (0.1p, 0.1q), p, q = 0, 1, ..., 10; 121 patterns in total
Number of units: first intermediate layer 6, second intermediate layer 6

Example C: Mapping of a function g(x, y) that takes the two-dimensional data x, y as input (Fig. 8).

[Equation 9] $g(x, y) = 0.5 + 0.5 \cos\left(2\pi \sqrt{2\left((x - 0.5)^2 + (y - 0.5)^2\right)}\right)$

Learning patterns: the points (x, y) = (0.2m, 0.2n), m, n = 0, 1, ..., 5; 36 patterns in total
Test patterns: the points (x, y) = (0.1p, 0.1q), p, q = 0, 1, ..., 10; 121 patterns in total
Number of units: first intermediate layer 6, second intermediate layer 6
(Note: assigning output value 0 where g(x, y) < 0.5 and output value 1 where g(x, y) ≥ 0.5 gives the same problem as Example B.)

Example D: Separation of data in the three-dimensional space of x, y, and z. Output value 0 is assigned to points inside a sphere of radius 0.5 centered at the origin, and 1 to points outside it.
Learning patterns: the points (x, y, z) = (p/7, q/7, r/7), p, q, r = 0, 1, ..., 7; 512 patterns in total
Test patterns: the points (x, y, z) = (p/9, q/9, r/9), p, q, r = 0, 1, ..., 9; 1000 patterns in total
Number of units: first intermediate layer 6, second intermediate layer 4
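
The grid-based pattern sets for Examples C and D, and the test grid of Example A, can be generated as in the sketch below. The helper names are illustrative. Treating points exactly on the sphere surface as outside is an assumption, although no grid point here falls on the surface. The learning patterns of Example A come from a figure and are not reconstructed.

```python
import math
from itertools import product


def g(x, y):
    """Equation 9: radial cosine centered at (0.5, 0.5)."""
    r2 = (x - 0.5) ** 2 + (y - 0.5) ** 2
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * math.sqrt(2.0 * r2))


def grid(steps, dims):
    """All points with coordinates k/steps, k = 0..steps, in `dims` dimensions."""
    axis = [k / steps for k in range(steps + 1)]
    return list(product(axis, repeat=dims))


def sphere_label(point, radius=0.5):
    """Example D: 0 inside the sphere centered at the origin, 1 otherwise."""
    return 0 if math.sqrt(sum(c * c for c in point)) < radius else 1


train_c = [(p, g(*p)) for p in grid(5, 2)]             # 36 learning patterns
test_c = [(p, g(*p)) for p in grid(10, 2)]             # 121 test patterns
train_d = [(p, sphere_label(p)) for p in grid(7, 3)]   # 512 learning patterns
test_d = [(p, sphere_label(p)) for p in grid(9, 3)]    # 1000 test patterns
test_a_points = grid(20, 2)                            # 441 test points for Example A

print(len(train_c), len(test_c), len(train_d), len(test_d), len(test_a_points))
```

The printed counts match the pattern totals stated for each example (36, 121, 512, 1000, and 441).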

[0022] 2. Simulation results

Figs. 10 and 11 compare the results with and without the TLT layer. Each entry is the average over three trials with different initial values of the neural network. Introducing the TLT layer greatly reduced the number of learning iterations, to as little as 1/20 in some examples. The sum of squared errors on the test patterns, which indicates the reliability of the learned result, increased for particular transformation constants in some examples, but taking the best result it decreased by about 40% or more. This means that the TLT layer is effective not only in reducing the number of learning iterations but also in improving generalization ability.

[0023] In addition to reducing the number of learning iterations and improving generalization, the TLT layer also markedly improved convergence. To examine convergence, the number of learning patterns for the above examples was increased as follows and the simulations were repeated under the same conditions. The results are shown in Fig. 12.
Example A, number of learning patterns: 62 → 122
Example B, number of learning patterns: 36 → 64
Example C, number of learning patterns: 36 → 64
Example D, number of learning patterns: 512 → 1000

[0024] The above simulations clearly demonstrate the effect of the TLT layer and show that the present invention is a highly general-purpose neural network that is easy to construct.

[Brief description of drawings]

[Fig. 1] Configuration diagram of a neural network showing one embodiment of the present invention.

[Fig. 2] Schematic diagram showing the relationship between preprocessing and the neural network.

[Fig. 3] Explanatory diagram of the operation of the threshold logic transformation.

[Fig. 4] Explanatory diagram showing the output for the example of Fig. 3.

[Fig. 5] Comparison with and without the TLT layer.

[Fig. 6] Explanatory diagram showing the regions of Example A.

[Fig. 7] Explanatory diagram showing the regions of Example B.

[Fig. 8] Explanatory diagram showing the regions of Example C.

[Fig. 9] List of TLT unit outputs for transformation constant N = 4 in the example of Fig. 3.

[Fig. 10] Comparison of the number of learning iterations in the simulation results.

[Fig. 11] Comparison of the sum of squared errors in the simulation results.

[Fig. 12] Comparison of convergence in the simulation results.

[Explanation of symbols]

1: input layer, 2: TLT layer, 3: intermediate layer, 4: output layer

Claims

[Claims]

1. A neural network having a multilayer structure with at least three layers, namely an input layer, an intermediate layer, and an output layer, each layer including at least one unit, wherein: at least one threshold logic transform layer is provided for the units of a p-th layer, between the p-th layer and a q-th layer (p and q being natural numbers with q = p + 1); the threshold logic transform layer converts input data x relating to one unit of the p-th layer into N data values (N ≥ 2) according to [Equation 1] $x \rightarrow [\psi_0(Nx),\ \psi_1(Nx - 1),\ \psi_2(Nx - 2),\ \ldots,\ \psi_{N-1}(Nx - (N-1))]$; the transformation function ψ is given by [Equation 2], where C_i and D_i are real constants, f_i is a function, and i = 0, 1, 2, ..., N-1; and each of the N data values is output to the j-th unit of the q-th layer with a weight w_ji.

2. The neural network according to claim 1, wherein the function f_i(z) (i = 0, 1, 2, ..., N-1) is in one-to-one correspondence with z on the interval 0 ≤ z ≤ 1.

3. The neural network according to claim 1, wherein the function f_i(z) (i = 0, 1, 2, ..., N-1) is any one of the following: $f_i(z) = z$; or $f_i(z) = u(z)$, where u is the unit step function; or $f_i(z) = 1 / (1 + \exp(-a(z - 0.5)))$, where exp is the exponential function and a is a natural-number constant (a ≥ 1).

4. The neural network according to claim 1, wherein the p-th layer is an input layer and the q-th layer is an intermediate layer.