JPH05342189A

JPH05342189A - Learning system for network type information processor

Info

Publication number: JPH05342189A
Application number: JP4175056A
Authority: JP
Inventors: Takeshi Nakamura; 健中村; Keiko Shiozawa; 恵子塩沢; Yoichi Ueishi; 陽一上石; Toshihide Fujimaki; 俊秀藤巻
Original assignee: AdIn Research Inc
Current assignee: AdIn Research Inc
Priority date: 1992-06-10
Filing date: 1992-06-10
Publication date: 1993-12-24
Anticipated expiration: 2017-12-09
Also published as: JP3354593B2

Abstract

PURPOSE: To provide a network-type inference system with a simplified configuration, higher processing speed, and improved learning efficiency. CONSTITUTION: The system comprises a learning data input part 12 that inputs learning data to a network-type information processor; an area judging part 13 that judges whether the area indicated by the input learning data is included in an existing recognition area; a filter function update part 14 that, when the area judging part 13 judges that the area is included in an existing recognition area, individually updates each filter function in the set of filter functions forming that recognition area; and a filter function generation part 15 that, when the area judging part 13 judges that the area is not included in any existing recognition area, generates a set of filter functions forming a new recognition area.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to computer-based information processing systems having a network structure, such as pattern recognition devices, control systems, diagnostic systems, and decision support systems, and more particularly to a learning system for such systems.

[0002]

[Prior Art] One conventional technique for realizing advanced information processing functions is the neural network, whose information processing structure is modeled on the nerve cells of the human brain (see, for example, Cohen and Feigenbaum, "Artificial Intelligence Handbook", Kyoritsu Shuppan, pp. 487-493; Amari et al., special feature "On Neural Networks", Journal of the Japanese Society for Artificial Intelligence, Vol. 4, pp. 118-176 (1989)). A nerve cell is a multi-input, single-output information processing element. Its input signal is represented by an n-dimensional vector $x = (x_1, x_2, x_3, \ldots, x_n)$ and its output signal by $v$. Once the input/output relationship that determines the output $v$ for an input $x$ is known, the characteristics of the nerve cell are known. If those characteristics are clarified and a model that performs similar information processing is built on a computer, information processing like that of human nerve cells becomes possible. Furthermore, if several such individual models (calculation units) are provided and connected so that they can exchange information in the way synapses do between nerve cells, it may be possible to build a neural circuit like a network of nerve cells. In nerve cells, the signals $x_i$ and $v$ are thought to be expressed as the frequency of the excitation pulses travelling along the nerve fibers (synapses), that is, as the firing frequency of the cell. In that case $x_i$ and $v$ take continuous analog values. Conventional neural network models often treat these analog values approximately as digital values; that is, they often map the non-excited and excited states of a nerve cell to the binary values 0 and 1 when handling $x_i$ and $v$.

[0003]

An example of a neural network model is described in more detail below. Suppose that the effect of an input signal $x_i$ reaching a node (calculation unit) through a directional link (edge) is determined by the weight $w_i$ attached to that link. For an input signal $x$ that is constant within a unit time, the output $v$ is

$$v_j = f(u_j) \tag{1}$$

$$u_j = \sum_i w_{ij} \cdot v_i - \theta_j \tag{2}$$

Here $v_i$ is the output of the calculation unit in the immediately preceding stage (for the first stage it is the input signal $x$), $v_j$ is the output of calculation unit $j$, $w_{ij}$ is the weight (for example, a real number) of the connection from calculation unit $i$ to calculation unit $j$, and $\theta_j$ is the threshold of calculation unit $j$. $u_j$ is called the total input to calculation unit $j$, and $f$ is called the input function or characteristic function of the calculation unit. Based on neurophysiological findings and mathematical tractability, the function $f(u)$ in equation (1) is usually a sigmoid, that is, an S-shaped, monotonically increasing function with a saturation characteristic, $1/(1 + e^{-x})$. Equation (2) is linear but equation (1) is nonlinear, so such a calculation unit is called a quasi-linear unit.
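As a minimal sketch of equations (1) and (2), the following Python function evaluates one quasi-linear unit. The function names and the numeric values are illustrative assumptions, not taken from the patent.

```python
import math

def sigmoid(u: float) -> float:
    """Saturating S-shaped function 1 / (1 + e^-u), used as f in equation (1)."""
    return 1.0 / (1.0 + math.exp(-u))

def unit_output(weights, inputs, theta):
    """Equation (2): u_j = sum_i w_ij * v_i - theta_j, then equation (1): v_j = f(u_j)."""
    u = sum(w * v for w, v in zip(weights, inputs)) - theta
    return sigmoid(u)

# Illustrative values only: the weighted sum equals the threshold, so u_j = 0.
print(unit_output([0.5, -0.25, 1.0], [1.0, 2.0, 0.5], theta=0.5))
```

With these values the weighted sum cancels the threshold, so the unit sits at the midpoint of the sigmoid.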

[0004]

For the quasi-linear model above to perform a desired information processing task, the capability of the calculation unit and the structure of the network must be considered together. First consider a network consisting only of an input layer and an output layer, that is, a single-stage structure. When one calculation unit is provided as the output for n inputs, that is, for a combination of n variables, its input/output characteristic describes a function of n variables. For example, if the input/output function $f$ is a threshold function (value 0 if the total input is negative, value 1 if it is positive), the unit cuts the n-dimensional space with a hyperplane of some inclination and outputs 1 when the n-dimensional input pattern lies on one side and 0 when it lies on the other. With three inputs, an inclined plane is placed in three-dimensional space and one side of it maps to 0. The offset (distance) of the plane from the origin is determined by the threshold of $f$. Because the calculation unit in such a single-stage network forms a sum that is linear in the input values, the information processing that one n-dimensional hyperplane can perform is, in the end, limited to linear separation.
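The hyperplane behaviour described above can be sketched as follows. The weights, the threshold, and the choice to output 0 when the total input is exactly zero are assumptions made for illustration; the text only specifies the negative and positive cases.

```python
def threshold_unit(weights, inputs, theta):
    """Return 1 if the input lies on the positive side of the hyperplane
    sum_i w_i * x_i = theta, otherwise 0 (the boundary itself is mapped to 0 here)."""
    u = sum(w * x for w, x in zip(weights, inputs)) - theta
    return 1 if u > 0 else 0

# Three inputs: the plane x1 + x2 + x3 = 1.5 splits three-dimensional space in two.
plane = [1.0, 1.0, 1.0]
print(threshold_unit(plane, [1.0, 1.0, 0.0], theta=1.5))  # above the plane
print(threshold_unit(plane, [1.0, 0.0, 0.0], theta=1.5))  # below the plane
```

Changing the weights tilts the plane, and changing `theta` moves it away from or toward the origin.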

[0005]

Nonlinear separation requires a network structure with multiple stages. If each n-input calculation unit defines one hyperplane in n-dimensional space, then m calculation units sharing the same inputs place m hyperplanes with different inclinations in that space. Deciding, for each of these hyperplanes, on which side the event to be separated lies produces a local bias in the space, and a nonlinear separation characteristic is obtained. Feeding all m outputs into a single m-input calculation unit gives two-stage information processing. A network that processes information in multiple stages in this way is called a hierarchical network. The first nodes are called input nodes; they basically perform no processing and simply pass information to the inputs of the nodes in the next stage (layer). The nodes in the final stage are called output nodes, and the nodes between the input and output nodes are called intermediate-layer or hidden-layer nodes. The intermediate-layer nodes and output nodes are calculation units.
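A small sketch of two-stage processing: two threshold units share the same two inputs, and a third unit combines their outputs. The exclusive-OR task and all weights and thresholds are illustrative assumptions chosen because no single hyperplane can separate that pattern.

```python
def step(u):
    """Threshold function: 1 for a positive total input, otherwise 0."""
    return 1 if u > 0 else 0

def xor_net(x1, x2):
    # Intermediate layer: two hyperplanes (lines) over the same inputs.
    h1 = step(x1 + x2 - 0.5)   # fires when at least one input is 1
    h2 = step(x1 + x2 - 1.5)   # fires only when both inputs are 1
    # Output unit: combines the two intermediate outputs.
    return step(h1 - h2 - 0.5)

print([xor_net(a, b) for a in (0, 1) for b in (0, 1)])
```

The output unit is active only in the band between the two lines, which is a region that one line alone cannot delimit.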

[0006]

In the prior art, input/output accuracy is basically determined by the number of intermediate-layer nodes, so securing accuracy could cause the amount of data description to grow explosively. With fewer intermediate-layer nodes, the output of the final-stage calculation unit becomes a coarse, stepwise representation, or fewer classes can be discriminated. There is, however, no established theory on how far the intermediate layer should be enlarged to be adequate.

[0007]

In addition, the prior art has no effective means of describing the initial state of the system that serves as the basis of learning, and in most cases learning has been started from an equilibrium state or a random state. Whether the problem is physical or one that simulates a human sensory ability, the relationship between an input signal and the information to be output is often understood as a region. The role of one intermediate-layer node of a neural network, however, is only to indicate that one side of a hyperplane placed in n-dimensional space is significant. Consequently, to define a region in n-dimensional space with sufficient precision, roughly $2^n$ to $3^n$ hyperplanes, that is, intermediate nodes, must be prepared and arranged so that together they enclose the region. A hyperplane is placed or moved by simultaneously changing the weights of the node's n input edges (directional links) and the threshold in the calculation unit's conversion formula. For a network with a single intermediate layer and a single output, the following are therefore required.

Between the input and the intermediate-layer nodes:

Number of intermediate nodes having a threshold: $N_2 = k^n$
Number of weighted edges: $e_{(1-2)} = n \times k^n$

Between the intermediate-layer nodes and the output-layer node:

Number of weighted edges: $e_{(2-3)} = k^n$
Number of output nodes having a threshold: $N_3 = 1$

From the above:

Total number of edges: $e = e_{(1-2)} + e_{(2-3)} = (n + 1) \times k^n$
Total number of nodes having a threshold: $N = N_2 + N_3 = k^n + 1$
Total number of adjustment points (weights plus thresholds): $(n + 2) \times k^n + 1$

Here $k$ is a coefficient expressing the fineness, with $2 < k < 3$ as a guide. With $k = 2$, 10 input signals, and 1 output, the total number of adjustment points is $(10 + 2) \times 2^{10} + 1 = 12{,}289$, far too many to initialize even roughly. For this reason an initial state with a deliberate direction cannot be specified, and adjustments such as steering the direction of learning or raising its convergence speed have been difficult. The difficulty of initialization can also be attributed to the structure of the neural network itself. As described above, intermediate-layer nodes form n-dimensional hyperplanes, but these hyperplanes, initially placed at random, change their inclination and their distance from the origin each time an input signal arrives, following the neural network's learning rule described later, so that the output signal approximates the desired pattern. There is no guarantee, however, that the hyperplanes will come to enclose a particular region evenly (forming an n-dimensional hypersphere if the region is spherical), because the relationships among the hyperplanes cannot be described.

[0008]

As described above, in a problem that requires a region to be defined with some precision, if there are also several results to be discriminated (equal to the number of output nodes), the numbers of intermediate-layer nodes and output nodes grow. The discussion so far assumed a framework that derives one output from n inputs. The coefficient $k$ is fixed by the precision required for discrimination or diagnosis, but when m outputs are required the intermediate layer grows roughly m-fold. By the same calculation as above, the total number of edges and the total number of nodes are

$$e = n \times m \times k^n + m \times k^n = (n + 1) \times m \times k^n$$

$$N = m \times k^n + m = m \times (k^n + 1)$$

As a trial, with $n = 10$ inputs, $m = 10$ outputs, and $k = 2$, these give $e = (10 + 1) \times 10 \times 2^{10} = 112{,}640$ and $N = 10 \times (2^{10} + 1) = 10{,}250$; counting the intermediate-node thresholds together with the edge weights gives $(10 + 2) \times 10 \times 2^{10} = 122{,}880$ adjustable values. In practice, providing more than 10,000 calculation units, computing their sums and threshold functions, and computing some 120,000 weights and thresholds is difficult to realize physically and is often a problem in terms of processing time as well. Processing time becomes an even greater problem when a learning function is provided that judges the quality of each inference result as it is output and updates the thresholds and weights stochastically.
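To see how quickly the multi-output formulas grow, the sketch below evaluates $e = (n + 1) \times m \times k^n$ and $N = m \times (k^n + 1)$ for a few input counts. The function name and the chosen values of n are illustrative assumptions.

```python
def multi_output_counts(n: int, m: int, k: int):
    """Total edges e and total threshold-bearing nodes N for n inputs and m outputs."""
    hidden_per_output = k ** n
    edges = (n + 1) * m * hidden_per_output
    nodes = m * (hidden_per_output + 1)
    return edges, nodes

for n in (2, 5, 10):
    print(n, multi_output_counts(n, m=10, k=2))
```

For n = 10, m = 10, k = 2 the formulas give 112,640 edges and 10,250 nodes; both counts double with every additional input.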

[0009]

For the reasons given above, it is difficult to initialize, from an analysis of the problem, parameters such as the weights of the directional links and the thresholds of the calculation units of a hierarchical neural network, in which calculation units are arranged in layers such as an input layer, an intermediate layer, and an output layer. Neural networks are therefore usually given a learning scheme: a mechanism that presents examples to the network, evaluates the inference error on those examples, and changes the parameters so that the network's inference approaches the intended behavior. Determining the parameters inductively in this way is called learning. Learning schemes for hierarchical neural networks fall broadly into two types: learning without a teacher signal (or simply a teacher) and learning with a teacher signal. As described later, the network-type inference method of the present invention can perform both types of learning.

[0010]

Many rules governing the learning process, so-called learning rules, have been proposed. As a representative example, the error back-propagation learning rule proposed by D. E. Rumelhart is described here (see, for example, Rumelhart and two others, "Learning Representations by Back-propagating Errors", Nature, Vol. 323-9, pp. 533-536). Learning in a three-layer neural network is explained on the basis of equations (1) and (2). A neural network with only input and output layers is severely limited because it can perform only simple linear separation, whereas in a hierarchical neural network of three or more layers the features linearly separated by the connections between layers 1 and 2 are further combined and linearly separated by the connections between layers 2 and 3. It is therefore said that any nonlinear input/output relationship can be described if enough calculation units (nodes) are provided in the intermediate layer (here the second layer). Although multilayer neural networks thus had greater descriptive power, they became usable only with the discovery of an effective learning rule, the back-propagation learning rule.

[0011]

Let $x_h$ be the signal input to the h-th calculation unit $U_{1h}$ of the input layer (first layer); let $U_{2i}$ be the i-th calculation unit of the second layer, $u_i$ its total input, and $v_i$ its output; let $w_{hi}$ be the conversion efficiency, that is, the weight, of the edge (directional link) between $U_{1h}$ and $U_{2i}$; and let $w_{ij}$ be the weight between $U_{2i}$ and the j-th calculation unit $U_{3j}$ of the output layer (third layer), $u_j$ the total input of $U_{3j}$, and $v_j$ its output. These quantities are related as follows, where the threshold $\theta$ of equation (2) is omitted to simplify the explanation.

$$v_j = f(u_j) = f\Big(\sum_i w_{ij} \cdot v_i\Big) \tag{3}$$

$$v_i = f(u_i) = f\Big(\sum_h w_{hi} \cdot x_h\Big) \tag{4}$$

For learning, a set of pairs $\{(X_p, V_p)\}$, $p = 1, 2, \ldots, N$, is prepared, each consisting of a pattern of the network input vector $X$ and an output pattern of the network in its initial state. When an input pattern $X$ is applied, let $v_j$ be the output of the neural network and $y_j$ the j-th component of the corresponding correct pattern $Y$. The total error $E$ can then be expressed as the squared magnitude of the error between the network output and the correct answer:

$$E = |e|^2 = \sum_p (y_p - v_p)^2 \tag{5}$$

where the sum runs over $p$. The goal of learning is achieved when the connections finally obtained bring the total error $E$ of the network over the training set $\{(X_p, V_p)\}$ to a minimum. To minimize $E$, the quantity $E_j$, one half of the per-node error component of $E$, is minimized:

$$E_j = (y_j - v_j)^2 / 2 \tag{6}$$

$$dE_j / dv_j = -(y_j - v_j) \tag{7}$$
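Equations (3), (4), and (6) can be sketched as a forward pass through a three-layer network followed by the per-node error. The nested-list weight layout (`w_hi[h][i]`, `w_ij[i][j]`), the function names, and the all-zero weights in the example are assumptions for illustration; thresholds are omitted as in the text.

```python
import math

def f(u):
    """Sigmoid characteristic function."""
    return 1.0 / (1.0 + math.exp(-u))

def forward(x, w_hi, w_ij):
    """Equation (4) for the second layer, then equation (3) for the output layer.
    w_hi[h][i]: weight from input unit h to second-layer unit i.
    w_ij[i][j]: weight from second-layer unit i to output unit j."""
    v_i = [f(sum(w_hi[h][i] * x[h] for h in range(len(x))))
           for i in range(len(w_hi[0]))]
    v_j = [f(sum(w_ij[i][j] * v_i[i] for i in range(len(v_i))))
           for j in range(len(w_ij[0]))]
    return v_i, v_j

def node_error(y_j, v_j):
    """Equation (6): E_j = (y_j - v_j)^2 / 2."""
    return (y_j - v_j) ** 2 / 2

v_i, v_j = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [[0.0], [0.0]])
print(v_i, v_j, node_error(1.0, v_j[0]))
```

With all weights at zero every unit outputs the sigmoid midpoint, which makes the example easy to verify by hand.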

[0012]

When the weight $w_{ij}$, the strength of the connection, changes slightly by $d\alpha$, the degree of influence $dv_j / d\alpha_j$ on the output $v_j$ follows from equation (3):

$$dv_j / d\alpha_j = v_i \cdot f'\Big(\sum_i w_{ij} \cdot v_i\Big) = f'(u_j) \cdot v_i \tag{8}$$

The effect $dE_j / d\alpha_j$ of a change $d\alpha_j$ in the connection weight on the squared error $E$ is therefore

$$dE_j / d\alpha_j = (dE_j / dv_j) \times (dv_j / d\alpha_j) = -(y_j - v_j) \cdot f'(u_j) \cdot v_i = -\delta_j \cdot v_i \tag{9}$$

where

$$\delta_j = f'(u_j) \cdot (y_j - v_j) \tag{10}$$

Since the effect of changing a weight of the neural network on the error can thus be obtained by calculation, the weights can conversely be changed in the direction that reduces an observed error.
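Equation (9) can be checked numerically for a single output unit by comparing $-\delta_j \cdot v_i$ with a finite-difference estimate of the derivative of $E_j$. The sigmoid choice for $f$ (whose derivative is $f(u)(1 - f(u))$), the function names, and the sample values are assumptions for illustration.

```python
import math

def f(u):
    return 1.0 / (1.0 + math.exp(-u))

def f_prime(u):
    s = f(u)
    return s * (1.0 - s)          # derivative of the sigmoid

def error_j(w, v, y):
    """E_j = (y_j - v_j)^2 / 2 with v_j = f(sum_i w_ij * v_i)."""
    return (y - f(sum(wi * vi for wi, vi in zip(w, v)))) ** 2 / 2

def analytic_grad(w, v, y, i):
    u = sum(wi * vi for wi, vi in zip(w, v))
    delta = f_prime(u) * (y - f(u))   # equation (10)
    return -delta * v[i]              # equation (9)

def numeric_grad(w, v, y, i, eps=1e-6):
    """Central finite difference of E_j with respect to w[i]."""
    wp, wm = list(w), list(w)
    wp[i] += eps
    wm[i] -= eps
    return (error_j(wp, v, y) - error_j(wm, v, y)) / (2 * eps)

w, v, y = [0.5, -0.3, 0.8], [0.2, 0.7, 1.0], 1.0
for i in range(len(w)):
    print(i, analytic_grad(w, v, y, i), numeric_grad(w, v, y, i))
```

The two columns should agree to several decimal places, which is a quick sanity check on the sign conventions in equations (7) to (10).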

[0013]

In commonly practiced back-propagation learning, when an input is given, the weight $w_{ij}$ of the preceding stage is corrected slightly in the direction approaching the correct answer, based on the node component of the error arising at the final output stage in equations (3) and (4), and the result becomes the new weight. By carrying the calculation of equation (10) backward from the output error toward the input, the change $\Delta\alpha$ in the weight $w_{ij}$ that reduces the error $E_j$ can be calculated:

$$\Delta\alpha_j = \eta \cdot \delta_j \cdot v_i \tag{11}$$

$$w_{ij} = w_{ij} + \Delta\alpha_j = w_{ij} + \eta \cdot \delta_j \cdot v_i \tag{12}$$

$$\delta_j = (y_j - v_j) \cdot f'(u_j) \quad \text{(same as equation (10))}$$

$$\delta_i = f'(u_i) \sum_j w_{ij} \cdot \delta_j \tag{13}$$

where the sum in equation (13) runs over $j$. Equation (12) corrects the connection weight $w_{ij}$ into calculation unit $U_j$ in proportion to the product of the generalized error $\delta_j$ of $U_j$ and the signal $v_i$ carried by the connection. The constant $\eta$ is generally a small positive number. When the error is positive, that is, when $U_j$ is insufficiently active, equation (12) acts to increase the weights of the connections from units that were sending positive signals to $U_j$; when the error is negative, the correction is in the opposite direction. $\delta_j$ in equation (10) is called the generalized error; it is the error of output unit $U_j$ at the final output stage multiplied by $f'$, which corresponds to the output sensitivity. In equation (13), by contrast, $\delta_i$ is computed as if the output-layer errors $\delta_j$ were inputs propagated backward, from the output layer toward the input layer, while being multiplied by $f'(u)$. Hence the name back-propagation learning rule. Equations (10) to (13) also apply to networks with more layers than the three-layer neural network shown above.
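One update step following equations (10) to (13) might be sketched as below for a three-layer network. The sigmoid $f$, the weight layout (`w_hi[h][i]`, `w_ij[i][j]`), the value of `eta`, and the sample numbers are assumptions for illustration; the input-side update uses $\delta_i$ with the same form as equation (12).

```python
import math

def f(u):
    return 1.0 / (1.0 + math.exp(-u))

def f_prime(u):
    s = f(u)
    return s * (1.0 - s)

def backprop_step(x, y, w_hi, w_ij, eta=0.1):
    """Return updated (w_hi, w_ij) after presenting one input x with target y."""
    n_h, n_i, n_j = len(x), len(w_hi[0]), len(w_ij[0])
    # Forward pass, equations (4) and (3).
    u_i = [sum(w_hi[h][i] * x[h] for h in range(n_h)) for i in range(n_i)]
    v_i = [f(u) for u in u_i]
    u_j = [sum(w_ij[i][j] * v_i[i] for i in range(n_i)) for j in range(n_j)]
    v_j = [f(u) for u in u_j]
    # Generalized errors: equation (10) at the output, equation (13) one layer back.
    delta_j = [(y[j] - v_j[j]) * f_prime(u_j[j]) for j in range(n_j)]
    delta_i = [f_prime(u_i[i]) * sum(w_ij[i][j] * delta_j[j] for j in range(n_j))
               for i in range(n_i)]
    # Weight corrections, equation (12) and its input-side counterpart.
    new_w_ij = [[w_ij[i][j] + eta * delta_j[j] * v_i[i] for j in range(n_j)]
                for i in range(n_i)]
    new_w_hi = [[w_hi[h][i] + eta * delta_i[i] * x[h] for i in range(n_i)]
                for h in range(n_h)]
    return new_w_hi, new_w_ij

w_hi = [[0.1, -0.2], [0.3, 0.4]]
w_ij = [[0.2], [-0.1]]
new_w_hi, new_w_ij = backprop_step([1.0, 0.5], [1.0], w_hi, w_ij)
print(new_w_hi, new_w_ij)
```

In this example the target exceeds the output, so the error is positive and both output-side weights grow, since both second-layer units send positive signals.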

[0014]

Improvements for more effective learning have also been made. For example, in the method known as simulated annealing, the weights are changed by large amounts at first and by smaller amounts once learning has progressed and the error has become small. However, rewriting equation (12) for the input-layer side and combining it with equation (13) gives

$$w_{hi} = w_{hi} + \Delta\alpha_i = w_{hi} + \eta \cdot \delta_i \cdot v_h = w_{hi} + \eta \cdot f'(u_i) \cdot \sum_j w_{ij} \cdot \delta_j \cdot v_h \tag{14}$$

so that each time one training signal is input, equations (10) and (12), or equation (14), must be evaluated for every edge. Moreover, because the threshold function $f(u)$ of a calculation unit is usually nonlinear, the computational load is greater than it appears. The amount of computation grows in proportion to the number of edges, while the more edges there are, the larger the training signal set, that is, the number of learning iterations, must be. The time required for learning grows exponentially as the target problem becomes harder.
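The idea of changing weights strongly at first and more gently as the error shrinks can be sketched as an error-dependent step size. This follows the paragraph's description only; it is not the stochastic acceptance rule usually associated with the name simulated annealing. The function name, the linear interpolation, and all constants are hypothetical.

```python
def step_size(error, eta_min=0.01, eta_max=0.5, error_ref=0.5):
    """Return a learning constant eta that shrinks from eta_max toward eta_min
    as the observed error falls below the reference level error_ref."""
    scale = min(1.0, error / error_ref)
    return eta_min + (eta_max - eta_min) * scale

for error in (1.0, 0.4, 0.1, 0.0):
    print(error, step_size(error))
```

The returned value would replace the fixed constant $\eta$ in equations (11), (12), and (14).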

[0015]

In the prior art it is also difficult to give learning convergence a direction, partly because the initial state is hard to set. Because the internal state is basically changed by back-propagation at every sample, learning could be impaired when the input data were presented in a biased way or in a way unsuited to the learning scheme in use. This could cause problems with learning speed and with the validity of the learning result.

[0016]

In the prior art, the processing inside the system being trained is basically described in the form of link weights, which have no concrete conceptual meaning, so the result of learning can be judged only from outside (that is, from input/output data). In other words, the internal processing description of a conventional system is poorly explainable, and it has been difficult to evaluate a learning result directly from the internal state (description).

[0017]

In the prior art, the processing inside the system being trained is basically described in the form of link weights, and the links between nodes are basically operated in a nearly fully connected state. It has therefore been difficult to convert one system into another, or to expand or reduce it: for example, to extract and use part of a learning result, to combine two learning results into one system, or to apply a learning result to a system with a different data description format (such as a production system). For the same reason, adding or deleting input/output nodes has also been difficult.

[0018]

The applicant of the present application previously filed a patent application for an invention intended to solve the above problems of the prior art (Japanese Patent Application No. Hei 3-310082, "Network-type information processing system"). The network-type information processing system of that invention (hereinafter the "prior invention") is one in which an input layer having a plurality of nodes and an output layer having a plurality of nodes are connected via directional links, each directional link has an information conversion function that converts the information passing through it, and each output-layer node has a function of performing a function operation on the information input via the directional links. As the information conversion function of the directional links, the system has a filter function operation unit that converts information according to a filter function with a selective characteristic, such as a band-pass or band-stop characteristic. With this arrangement, the filter function of one directional link corresponds to the hyperplanes realized by the multiple intermediate-layer nodes of a conventional neural network, and the region designated by the directional links connecting the input layer and the output layer corresponds to the region designated, in the prior art described above, by the multiple hyperplanes formed by the combined links and nodes of the input, intermediate, and output layers. Compared with a conventional neural network, the prior invention therefore needs less description to obtain the same accuracy, has fewer elements and a simpler configuration, and can shorten processing time.

【００１９】The learning method in the network type information processing system of the prior invention has means for obtaining a correct information processing result from the same input information, in parallel with the system itself processing that input information through the directional links and the calculation units. Using the correct result as a teacher signal, the learning means obtains the difference (error) between the teacher signal and the information processing result (the output of the calculation unit) by an evaluation function, calculates the magnitude and vector value of the difference, and corrects the filter functions of the directional links via a learning function. Here, the means for obtaining a correct information processing result is configured, for example, by preparing in advance teacher information that pairs input information with the corresponding correct output information, so that the correct output information is retrieved when the input information is given. The learning function is a learning rule such as, for example, a gain value that changes the shape of the membership function described later. The learning means does not perform error backpropagation learning at every reaction cycle. Instead, it has function modifying means that statistically processes the results and evaluations of a plurality of reaction cycles in the network type information processing system and corrects the parameters of the filter functions based on the result of this statistical processing. A reaction cycle is one cycle of the process of "receiving an input and producing an output".

【００２０】[0020]

【Problems to be Solved by the Invention】An object of the present invention is to further improve the network type information processing system of the prior invention, which solves the problems of conventional neural networks.

【００２１】1) In the rule generation method of the prior invention's network type information processing system, there was only one distribution function per dimension. The recognition rate in boundary areas could therefore be low. To raise the recognition rate, one could divide the effective area of the target space into a grid of cells and set a distribution function for each dimension in every cell, but this risks a combinatorial explosion in the number of divisions. An object of the present invention is to provide a learning system that improves the recognition rate in the boundary areas of a network type information processing system.

【００２２】2) In the network type information processing system of the prior invention, the evaluation data is used only to check whether learning has finished and is not incorporated into the learning system. An object of the present invention is to provide a learning system that makes effective use of the evaluation data, reliably creates rules in areas where no rule has yet been generated, and thereby improves learning efficiency.

【００２３】3) A processing system that performs inference, recognition, and the like often applies preprocessing to the input data, such as integrating multiple data items or differentiation, so that processing can be carried out efficiently. Conventionally, the content of this preprocessing has been created from analysis results obtained by statistical analysis methods such as principal component analysis and the least squares method, or created manually from empirical rules about the target system, its physical characteristics, and so on. With statistical processing, particularly nonlinear analysis, the number of steps the processing device must execute grows sharply as the types and amount of data increase, so processing takes a long time. With manual creation, an operator must investigate and describe the empirical rules and the characteristics of the target system, which takes a great deal of working time. Moreover, because the processing is described solely on the basis of the operator's knowledge and investigation, it risks lacking objectivity and completeness. An object of the present invention is to provide a learning system for a network type information processing device that can build a preprocessing system performing automatic nonlinear analysis, and thereby reduce the time and workload needed to create the preprocessing system. This object is achieved by having the network type information processing device perform learning without teacher data and carry out nonlinear analysis on a plurality of input data.

【００２４】[0024]

【Means for Solving the Problems and Operation】The present invention is a learning system for a network type information processing device, and it uses the network type information processing device itself for the learning. The network type information processing device has a plurality of input nodes (21 in FIG. 2), a plurality of output nodes (24 in FIG. 2), and directional links connecting the input nodes and the output nodes. Each directional link has filter function storage means (113 in FIG. 1) that stores a filter function, which is a nonlinear selective function (m₁₁ to m₃₂ in FIG. 2), and filter function operation means (112 in FIG. 1) that converts the information passing through the directional link by the filter function. Each output node has means (114, 115 in FIG. 1) for performing a function operation on the information input via the directional links. The learning system for this network type information processing device (11 in FIG. 1) has a basic configuration comprising: input means (12 in FIG. 1) for inputting learning data to the network type information processing device; area determination means (13 in FIG. 1) for determining whether the area indicated by the input learning data is included in an existing recognition area; filter function updating means (14 in FIG. 1) for individually updating each filter function in the set of filter functions forming the existing recognition area when the area determination means determines that the area is included in it; and filter function generating means (15 in FIG. 1) for generating a set of filter functions forming a new recognition area when the area determination means determines that the area is not included in any existing recognition area. An existing recognition area is an area delimited by an existing set of filter functions.

【００２５】In the above basic configuration, the area determination means determines whether the area indicated by the input data (its spatial coordinates in the multidimensional space) belongs to a recognition area delimited by an existing filter function set of the network type information processing device. When it does, the filter function updating means individually updates each filter function of the filter function set corresponding to that recognition area so as to suit the purpose of the network type information processing device. When it does not, the filter function generating means sets a new recognition area. In this way, recognition areas can be separated so that filter function sets in the same layer cut out multidimensional regions asymmetrically along each axis. Large and small recognition areas therefore arise naturally, and small rules are generated in the boundary areas, so the recognition rate increases. The RCE method, a conventional learning method for neural networks, performs learning by successively placing, in new multidimensional regions, virtual elements that have a radius of charge-like energy (Coulomb force); for each data input, it expands the radius of a nearby element if one exists and places a new element otherwise. The present invention differs from this. It is not a space division method that merely sets a sphere or hypersphere as each new area, as the RCE method does:
・The distribution can be set independently for each axis, so areas such as ellipsoids or cones tapered on one side can be formed (areas are not limited to spheres or hyperspheres as in the RCE method).
・During learning, the center value of an area of the present invention is in principle free and can move through the space (it is not fixed in principle, as it is in the RCE method).
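
The update-or-generate decision described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `AxisFilter` and `Rule` classes, the triangular per-axis matching function, the threshold of 0.5, the initial width of 1.0, and the update rule (widen only the side facing the input, then nudge the center toward it) are all hypothetical choices made here to show per-axis, asymmetric areas whose centers can move.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AxisFilter:
    """One filter function on one axis (assumed triangular shape)."""
    center: float
    left: float   # width toward smaller input values
    right: float  # width toward larger input values

    def match(self, s: float) -> float:
        """Matching degree in [0, 1]; widths are assumed to be positive."""
        width = self.left if s < self.center else self.right
        return max(0.0, 1.0 - abs(s - self.center) / width)

    def update(self, s: float, rate: float = 0.1) -> None:
        """Assumed update: widen the side facing s, then move the center."""
        dist = abs(s - self.center)
        if s < self.center:
            self.left += rate * dist
        else:
            self.right += rate * dist
        self.center += rate * (s - self.center)


@dataclass
class Rule:
    """A recognition area: one filter per input dimension plus an output label."""
    label: str
    filters: List[AxisFilter]

    def contains(self, x: Sequence[float], threshold: float) -> bool:
        return all(f.match(s) >= threshold for f, s in zip(self.filters, x))


def learn_one(rules: List[Rule], x: Sequence[float], label: str,
              threshold: float = 0.5, init_width: float = 1.0) -> Rule:
    """Update the first existing rule that contains x; otherwise create one."""
    for rule in rules:
        if rule.label == label and rule.contains(x, threshold):
            for f, s in zip(rule.filters, x):
                f.update(s)          # each filter is updated individually
            return rule
    new_rule = Rule(label, [AxisFilter(s, init_width, init_width) for s in x])
    rules.append(new_rule)
    return new_rule


rules: List[Rule] = []
learn_one(rules, [0.0, 0.0], "A")    # no rule yet: a new area is generated
learn_one(rules, [0.2, -0.3], "A")   # inside the area: its filters are updated
learn_one(rules, [5.0, 5.0], "A")    # outside every area: a second area is generated
```

Because each axis keeps separate left and right widths, the area that results after a few updates is generally asymmetric, which is the property the text contrasts with the spherical areas of the RCE method.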

【００２６】Except in the case of unsupervised learning described later, the data used for learning in the present invention consists of input data and a teacher signal for that data. A teacher signal generally has an affirmative meaning, "the result inferred from the input data is such-and-such", but the present invention can also use one with a negative meaning, "the result inferred from the input data is not such-and-such". In a system whose target is complex and not clearly defined, a teacher signal with a negative output is sometimes more convenient. To handle such cases, the present invention uses negative teacher signals and generates recognition areas that represent negation. The configuration in this case comprises: input means for inputting, to the network type information processing device, learning data that includes input data and a corresponding affirmative or negative teacher signal; area determination means for determining whether the area indicated by the input data is included in an existing recognition area representing affirmation or an existing recognition area representing negation; filter function updating means for individually updating each filter function in the set of filter functions forming that existing recognition area when the area determination means determines that the area is included in it; and filter function generating means that, when the area determination means determines that the area is included in neither kind of existing recognition area, generates a set of filter functions forming a new recognition area representing affirmation if the teacher signal is affirmative, or a new recognition area representing negation if the teacher signal is negative. For post-processing, it is more convenient if the outputs of the recognition areas are unified as affirmative ones. To that end, means may be provided for converting a negative output node into an affirmative output node by referring to affirmative output nodes that have areas adjacent to it.

【００２７】In another aspect of the present invention, filter functions are generated and updated on the basis of an evaluation against a target value. That is, the learning system for the network type information processing device comprises: input means for inputting input data and the corresponding teacher signal to the network type information processing device; evaluation means for obtaining the error between the teacher signal, as the target value, and the value that results from network information processing of the input data, and for determining from that error whether to update an existing area; filter function updating means for individually updating each filter function in the set of filter functions forming the existing recognition area when the evaluation means determines that the area should be updated; and filter function generating means for generating a set of filter functions forming a new recognition area when the evaluation means determines that it should not be updated. In this aspect, evaluation is performed as appropriate and filter functions are generated and updated accordingly, so the learning process is carried out more efficiently.

【００２８】In one aspect of the present invention, learning is performed without a teacher signal, which makes it possible to learn statistical analysis functions such as feature analysis and principal component analysis. In this configuration, the learning system for the network type information processing device comprises: input means for inputting learning data that contains no teacher signal to the network type information processing device; determination means for determining the output node whose output is the largest among the values obtained when the network type information processing device processes the input data; and filter function updating means for individually updating each filter function in the set of filter functions connected to the output node so determined. For the initial scattering of rules in unsupervised learning, it is sufficient that the maxima, minima, midpoints, and so on of the feature axes are distributed as uniformly as possible. Creating every combination would require 2^{an} rules (for n dimensions, with a = 1 when only the maximum and minimum are used), but the number may be set as appropriate.
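
The unsupervised step described in this paragraph (find the output node with the largest output, then update only its filter functions) can be sketched as follows. The patent text here does not give the update formula, so everything numeric is an assumption made for illustration: each node is a hypothetical list of `(center, width)` pairs, matching is triangular, a node's output is the plain average of its per-axis matching degrees, and the winner's centers move toward the input by a rate of 0.2. The helper `initial_centers` shows the a = 1 case of the 2^{an} count, where every combination of per-axis minimum and maximum becomes one initial rule.

```python
from itertools import product
from typing import List, Sequence, Tuple

Node = List[Tuple[float, float]]  # (center, width) per axis; hypothetical layout


def initial_centers(lo: Sequence[float], hi: Sequence[float]) -> List[List[float]]:
    """All min/max corner combinations: 2**n centers for n axes (a = 1)."""
    return [list(corner) for corner in product(*zip(lo, hi))]


def axis_match(s: float, center: float, width: float) -> float:
    """Assumed triangular matching degree in [0, 1]; width must be positive."""
    return max(0.0, 1.0 - abs(s - center) / width)


def node_output(node: Node, x: Sequence[float]) -> float:
    """Unweighted average of the per-axis matching degrees."""
    return sum(axis_match(s, c, w) for s, (c, w) in zip(x, node)) / len(x)


def unsupervised_step(nodes: List[Node], x: Sequence[float],
                      rate: float = 0.2) -> int:
    """Pick the node with the largest output and update only its filters."""
    winner = max(range(len(nodes)), key=lambda j: node_output(nodes[j], x))
    nodes[winner] = [(c + rate * (s - c), w)
                     for s, (c, w) in zip(x, nodes[winner])]
    return winner


nodes = [[(0.0, 2.0), (0.0, 2.0)], [(1.0, 2.0), (1.0, 2.0)]]
winner = unsupervised_step(nodes, [0.9, 0.8])  # the second node is closest
corners = initial_centers([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # 8 starting centers
```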

【００２９】In the basic configuration of the present invention, the area determination means can, in a specific embodiment, be configured to compare the matching degree obtained by the filter function operation means of the network type information processing device with a predetermined threshold value, and to determine from the result whether the input is included in an existing recognition area.

【００３０】In the learning system of the basic configuration, the method used to update a filter function can be selected according to the learning history. In this case, the system is provided with history storage means (171 in FIG. 1) that stores learning history information for each recognition area, and update method instruction means (18 in FIG. 1) that, on the basis of the stored history information, selects and indicates for each recognition area an appropriate filter function update method from among several kinds. The filter function update unit has a plurality of update methods and individually updates each filter function in the set of filter functions forming the existing recognition area by the method that the update method instruction means indicates. For example, when the number of history data is 1, the first update method sets the center value C of the membership function to the initial input data and sets the ambiguity A to 0. When the number of history data is at least 2 and at most M (a predetermined value), the second update method sets the center value C to the midpoint between the maximum and minimum of the history data and sets the ambiguity A on the basis of the difference between that maximum and minimum. When the number of history data is larger than the predetermined value M, a statistical method is used as the third update method.
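
The three-way selection by history count can be sketched as below. The first two branches follow the example in the text, with one assumption: the text only says that A is set "based on" the max-min difference, and half that difference is used here. The third branch is entirely an assumption, since the text says only that "a statistical method" is used; the mean and population standard deviation stand in for it. The function name and the default `m = 10` are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def center_and_ambiguity(history: Sequence[float], m: int = 10) -> Tuple[float, float]:
    """Return (C, A) for one filter function from its history data."""
    n = len(history)
    if n == 0:
        raise ValueError("history must contain at least one value")
    if n == 1:
        # First method: C is the initial input, A is 0.
        return history[0], 0.0
    if n <= m:
        # Second method: C is the midpoint of max and min;
        # A is half the max-min difference (assumed scaling).
        lo, hi = min(history), max(history)
        return (lo + hi) / 2.0, (hi - lo) / 2.0
    # Third method: a statistical estimate (assumed: mean and std. deviation).
    return statistics.fmean(history), statistics.pstdev(history)


c1, a1 = center_and_ambiguity([3.0])               # one sample
c2, a2 = center_and_ambiguity([1.0, 3.0, 2.0])     # between 2 and M samples
c3, a3 = center_and_ambiguity([1.0, 2.0, 3.0, 4.0], m=2)  # more than M samples
```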

【００３１】Yet another aspect of the present invention has a configuration for using a statistical technique as the update method. That is, frequency distribution measuring means 17 is provided to obtain a frequency distribution from the learning history information, and the filter function updating means changes the filter functions on the basis of the frequency distribution thus obtained.

【００３２】The learning history information is collected separately for each filter function corresponding to a directional link. In addition, only input data that the area determination means has determined to fall within an area is stored as history data for the filter function set that defines that area.

【００３３】According to another aspect, noise removal means may be provided. After learning for a certain period has finished, this means examines any filter function set that received extremely few teacher signals, determines whether it is noise on the basis of the contents of filter function sets having adjacent areas, and removes the rule when it is determined to be noise.

【００３４】[0034]

【Example】

(Network type information processing device to be trained) FIG. 2 is a diagram showing the schematic configuration of an embodiment of the network type information processing system (device) to which the learning method of the present invention is applied. In this system, as shown in FIG. 2, a plurality of nodes 21, 22, 23 in the input layer and a plurality of nodes 24, 25 in the output layer are connected by directional links to form a network. Formally, FIG. 2 has the same configuration as a conventional neural network, but it differs fundamentally in that the nodes are coupled through directional links that each have a filter function.

【００３５】FIG. 3 is a block diagram showing an example configuration of the part that performs the calculations in FIG. 2. The node shown in FIG. 3 consists of filter function operation units 31 to 34, which compute the filter functions that selectively pass the input information on each directional link; an adder 35, which computes a weighted average of the outputs of the filter function operation units 31 to 34; and a threshold function operation unit 36, which applies a threshold operation to the output of the adder 35. The relationship between inputs and situations (outputs) can be expressed by a set of rules (a matrix) as follows.

             Input 1    Input 2    Input 3
Situation 1  f₁₁<g₁₁>   f₂₁<g₂₁>   f₃₁<g₃₁>   …
Situation 2  f₁₂<g₁₂>   f₂₂<g₂₂>   f₃₂<g₃₂>   …
Situation 3  f₁₃<g₁₃>   f₂₃<g₂₃>   f₃₃<g₃₃>   …
…

Here f₁₁ to f₃₃ are membership functions, and g₁₁ to g₃₃ are the "weights" of the respective membership functions. Setting the weights is optional.

【００３６】In this embodiment, as shown in FIG. 4, the filter function is a membership function (fuzzy membership function) that has the shape of a trapezoid with unequal sides when plotted with the value of the input information given to the input node (the input value) on the horizontal axis and the matching degree on the vertical axis. The membership function is described by the center value c, which is the center of the inputs that give the maximum matching degree; the left variance v_l and right variance v_r, which express the permitted range of input values for which a matching degree v is obtained as widths to the left and right of the center value c; and the ambiguity a, which expresses the permitted range of input values for which the maximum matching degree is obtained as a width from the center value c. The filter function operation units 31 to 34 compute the membership function, that is, they convert an input value s into a matching degree v. The conversion from s to v can be written as follows.

\[
v =
\begin{cases}
0.0, & s \le c - v_l \ \text{or}\ c + v_r \le s \\
1.0, & c - a \le s \le c + a \\
\dfrac{(v_l - a) - (c - a - s)}{v_l - a}, & c - v_l \le s \le c - a \\
\dfrac{(v_r - a) - (s - c - a)}{v_r - a}, & c + a \le s \le c + v_r
\end{cases}
\tag{14}
\]
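
Equation (14) can be written directly as a small Python function, which makes the four cases easy to check numerically. This is a sketch for illustration: the function name is hypothetical, and it assumes that the denominators of the two sloped cases are the side widths minus the ambiguity (v_l − a and v_r − a), which is what makes the slopes reach 1.0 at c − a and c + a, and that a < v_l and a < v_r so those denominators are positive.

```python
def membership(s: float, c: float, a: float, v_l: float, v_r: float) -> float:
    """Trapezoidal matching degree of input s, following equation (14).

    c: center value, a: ambiguity (half-width of the flat top),
    v_l / v_r: left and right widths of the base measured from c.
    Assumes a < v_l and a < v_r.
    """
    if s <= c - v_l or s >= c + v_r:
        return 0.0                                    # outside the base
    if c - a <= s <= c + a:
        return 1.0                                    # on the flat top
    if s < c - a:
        return ((v_l - a) - (c - a - s)) / (v_l - a)  # rising left slope
    return ((v_r - a) - (s - c - a)) / (v_r - a)      # falling right slope


# Asymmetric example: the base reaches 3 to the left of c and 2 to the right.
left_mid = membership(3.0, c=5.0, a=1.0, v_l=3.0, v_r=2.0)   # halfway up the left slope
right_mid = membership(6.5, c=5.0, a=1.0, v_l=3.0, v_r=2.0)  # halfway down the right slope
```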

【００３７】The adder 35 performs a weighted averaging operation to combine the outputs calculated by the filter function operation units 31 to 34. In this embodiment, the final matching degree is obtained by combining the matching degrees v_{1j} to v_{ij} of the input information I_1 to I_i under the respective membership functions, each multiplied by its weight. Various methods of calculating the final matching degree are conceivable; this embodiment uses the weighted average shown in equation (15).

\[
V_j = \frac{\sum_{i=1}^{N_s} v_{ij}\, g_{ij}}{\sum_{i=1}^{N_s} g_{ij}}
\tag{15}
\]

where
V_j: matching degree of pattern (or rule) j
v_{ij}: matching degree given by each membership function (= f_{ij}(I_i), where I_i is the input value)
g_{ij}: weight of each membership function
N_s: number of input data
and Σ denotes the sum over i = 1 to N_s.
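
As a worked illustration of equation (15), the weighted average can be computed as below. The function name is hypothetical, and treating missing weights as all equal to 1.0 is an assumption that reflects the earlier remark that setting weights is optional.

```python
from typing import Optional, Sequence


def final_matching_degree(v: Sequence[float],
                          g: Optional[Sequence[float]] = None) -> float:
    """V_j = sum(v_ij * g_ij) / sum(g_ij) over the N_s inputs of rule j."""
    if g is None:
        g = [1.0] * len(v)        # assumed default: equal weights
    if len(g) != len(v):
        raise ValueError("v and g must have the same length")
    total_weight = sum(g)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(vi * gi for vi, gi in zip(v, g)) / total_weight


# Three inputs; the first membership function counts double.
V = final_matching_degree([1.0, 0.5, 0.0], [2.0, 1.0, 1.0])  # (2.0 + 0.5 + 0.0) / 4.0
```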

【００３８】(Embodiment of the learning system) Next, an embodiment of the present invention that trains the network type information processing device described above is explained. For the learning, sets of input data and a teacher signal representing the correct output for that input data are prepared in advance as learning data for the network type information processing device to be trained shown in FIG. 2. The input data is given to the network type information processing device 11 to be trained, an error is detected by an evaluation function that compares the resulting output with the teacher signal, and, with that error as the input to a learning function (rule), the filter functions and weights of the directional links are changed by the backpropagation learning technique. FIG. 1 is a block diagram of the functions for performing the learning of this embodiment. It comprises: the network type information processing device 11 to be trained; a learning data input unit 12 that inputs the learning data; an area determination unit 13 that determines, on the basis of the output of the filter function operation unit 112 of the network type information processing device 11, whether the input data falls within an existing rule (area) corresponding to the output node specified by the teacher signal; a filter function update unit 14 that individually updates each filter function in the set of filter functions forming the existing rule when the area determination unit 13 determines that the input data is included in that rule; a filter function generation unit 15 that generates a set of filter functions forming a new rule when the area determination unit 13 determines that the input data is not included in any existing rule; and an evaluation unit 16 that evaluates the learning. It further comprises an update method designation unit 18 for switching the filter function update to a method appropriate to the number of input data, and frequency distribution measuring means 17 for obtaining the frequency distribution data used to update the filter functions by a statistical method. The frequency distribution data is obtained by analyzing the contents of a history buffer 171.

【００３９】(Automatic generation of rules from teacher data) FIG. 5 shows the outline flow of automatic rule generation with teacher data. Data consisting of a fixed number of input data for learning and the corresponding teacher data is input from the data input unit 12 to the network type information processing device 11 (step S51). Automatic rule generation is performed on the basis of the input data (step S52); its procedure is described later. Once learning has progressed, noise removal is performed as necessary (step S53). The evaluation process described later is then performed (step S54). Whether the average error has fallen below a threshold value is determined (step S55); if it has, the process ends. If the average error is larger than the threshold value, the evaluation data set aside for reuse is input as learning data (step S56), and automatic rule generation is performed again (step S57).
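
The flow of FIG. 5 as described in this paragraph can be outlined with a short driver function. This is only a sketch of the control flow: `generate_rules`, `remove_noise`, and `evaluate` are hypothetical callbacks standing in for steps that the text describes elsewhere, and the function stops after step S57 because the text does not state what follows that step.

```python
from typing import Callable, Optional, Sequence


def run_learning(learning_data: Sequence,
                 evaluation_data: Sequence,
                 generate_rules: Callable[[Sequence], None],
                 evaluate: Callable[[Sequence], float],
                 error_threshold: float,
                 remove_noise: Optional[Callable[[], None]] = None) -> float:
    """Run steps S51 to S57 once and return the average error from S54."""
    generate_rules(learning_data)              # S51, S52: input data, generate rules
    if remove_noise is not None:
        remove_noise()                         # S53: optional noise removal
    average_error = evaluate(evaluation_data)  # S54: evaluation
    if average_error < error_threshold:        # S55: good enough, stop here
        return average_error
    generate_rules(evaluation_data)            # S56, S57: reuse evaluation data
    return average_error


# Usage with stub callbacks that only record the order of the calls.
calls = []
error = run_learning(
    learning_data=["train"],
    evaluation_data=["eval"],
    generate_rules=lambda data: calls.append(("generate", list(data))),
    evaluate=lambda data: 0.5,
    error_threshold=0.1,
    remove_noise=lambda: calls.append(("noise",)),
)
```

In the usage above the stub error of 0.5 exceeds the threshold of 0.1, so the evaluation data is fed back into rule generation, which is the point of incorporating evaluation data into learning.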

[0040] FIG. 6 shows the details of the automatic rule generation processing of steps S51 and S57 in the processing flow of FIG. 5. As shown in the figure, input data and teacher data are input to the network-type information processing device 11 (step S61). The pattern table, which stores the set of rules (patterns) held by the network-type information processing device, is searched, and all rules having the same output as the teacher data are selected (step S62). The pattern table holds the rule data in the format outlined in FIG. 7: for each rule it stores the rule name, the correct recognition result (region), the parameters of the filter function for each axis of the feature dimensions corresponding to the rule, and so on. FIG. 8 shows an example of rules and regions in the feature space. The input data for learning is then collated with one of the selected rules (step S63). That is, for each per-dimension filter function of the selected rule (pattern), the degree of match of the corresponding dimension of the input data is obtained. The area determination unit 13 determines whether the obtained degree of match is larger than a predetermined threshold value (step S64). If it is, the filter function of each dimension of that rule is changed (step S65). That is, as shown in FIG. 9, the rule whose degree of match is at or above the threshold in all dimensions and whose average degree of match is highest is changed so as to be expanded. The procedure for changing the filter function is described later with reference to the flowchart of FIG. 11. If the degree of match is smaller than the threshold, the filter function is not changed. It is then determined whether any of the rules selected in step S62 has not yet been collated with the input data (step S66). If rules remain, steps S63 to S66 are repeated. Once all selected rules have been collated, it is checked whether any rule exceeded the threshold (step S67). If none did, a new set of filter functions is created and registered in the pattern table as a new rule (step S68). That is, if every rule has at least one dimension in which a degree of match at or above the threshold is not obtained, the filter function generation unit 15 adds a new rule. FIG. 10 shows the generation of this new rule. The filter functions of the generated rule are then changed as described later (step S69), and the processing ends. If the determination in step S67 finds a rule that exceeded the threshold in all dimensions, the processing ends.
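The collation and generation steps of FIG. 6 might be sketched as below. This is a minimal illustration, not the implementation described in the specification: the `Rule` class, the triangular shape of the filter function, the default `threshold` of 0.5, and the initial `width` are all assumptions made for the example, and the filter update of steps S65 and S69 is reduced to recording the sample in a history list.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    output: str        # recognition result (region) of the rule
    centers: list      # per-dimension filter center (assumed parameterization)
    widths: list       # per-dimension half-width of the assumed triangular filter
    history: list = field(default_factory=list)

    def match_degrees(self, x):
        # Degree of match per dimension, using an assumed triangular filter.
        return [max(0.0, 1.0 - abs(xi - c) / w)
                for xi, c, w in zip(x, self.centers, self.widths)]


def learn_sample(pattern_table, x, teacher, threshold=0.5, width=1.0):
    """Process one (input, teacher) pair along the lines of steps S61 to S69."""
    # S62: select all rules whose output equals the teacher data.
    candidates = [r for r in pattern_table if r.output == teacher]
    best, best_avg = None, -1.0
    for rule in candidates:                        # S63 to S66: collate each rule
        degrees = rule.match_degrees(x)
        if all(d >= threshold for d in degrees):   # S64: all dimensions must pass
            avg = sum(degrees) / len(degrees)
            if avg > best_avg:                     # keep the highest average match
                best, best_avg = rule, avg
    if best is None:                               # S67, S68: register a new rule
        best = Rule(teacher, list(x), [width] * len(x))
        pattern_table.append(best)
    best.history.append(list(x))                   # S65 / S69: placeholder update
    return best
```

For example, feeding two nearby samples with the same teacher label reuses one rule, while a distant sample or a different label adds a new entry to the table.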

[0041] (Generation and Change of Filter Functions) The filter function of each dimension constituting a rule (represented by a membership function) is changed in steps S65 and S69 of the processing of FIG. 6 as shown in the flowchart of FIG. 11. Data input to the system of FIG. 2 is stored in a history buffer (step S111). A history buffer holds the data of each dimension input to one rule and has the capacity to store a history of a predetermined number M of inputs. A history buffer is provided for each rule, and each history buffer has a counter that counts the data input to it. The counter is incremented by 1 each time data is input (step S112). Next, it is determined whether the counter value is 1 (step S113); if it is, the filter function updating unit 14 updates the parameters of the filter function as follows (step S114). In other words, the update method is switched according to the number of history data. The symbols used below are:
N: number of history data (number of sampled data)
M: history buffer size
X_0: initial input data
X_1: minimum value of the history data
X_2: maximum value of the history data
R_1: minimum value of the normalized space
R_2: maximum value of the normalized space
C: center value
A: ambiguity
V: variance value

[0042] a) When N = 1:
C = X_0
A = 0
V = set as information at the time of the learning process
FIG. 12 shows an example of the filter function set in this way.

[0043] If the counter value is found to be 2 or more, it is determined whether the counter value exceeds the predetermined number M (step S115). If it does not exceed M, the filter function is updated by the following method (step S116).

[0044] b) When 2 < N < M:
C = (X_1 + X_2) / 2
A = (X_2 − X_1) / 2
V = A + 2A / N
FIG. 13 shows an example of the membership function set in this way. Cases a) and b) deal with a small number of data that carry no statistical significance to begin with, so the learning method has to rely on an unobjectionable intuitive approach rather than a statistical one. This is one example of a method that reflects the characteristics of fuzzy logic and can be judged to capture and evaluate data reliably.
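The small-sample cases a) and b) can be written out as a short function. This is a sketch under assumptions: the function name and the `initial_variance` argument (standing in for the value of V "set as information at the time of the learning process") are hypothetical, and N = 2 is treated as case b), following the flowchart description of a counter value of 2 or more.

```python
def small_sample_parameters(history, buffer_size, initial_variance):
    """Return (C, A, V) for cases a) and b), or None when case c) applies.

    history: values of one dimension stored in the history buffer.
    buffer_size: the history buffer size M.
    initial_variance: assumed stand-in for the V supplied at learning time.
    """
    n = len(history)
    if n == 0:
        raise ValueError("history must contain at least one value")
    if n == 1:
        # Case a): center on the single sample, with no ambiguity.
        return history[0], 0.0, initial_variance
    if n < buffer_size:
        # Case b): midpoint and half-range of the observed data.
        x1, x2 = min(history), max(history)
        a = (x2 - x1) / 2
        return (x1 + x2) / 2, a, a + 2 * a / n
    return None  # enough data: defer to the statistical update
```

Note that V = A + 2A / N shrinks toward A as more samples arrive, so the base of the membership function tightens around the observed range.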

[0045] If the determination in step S115 finds that the counter value exceeds the size of the history buffer, the filter function is changed by the statistical processing described below (step S117). For setting by statistical processing, the method disclosed in the aforementioned Japanese Patent Application No. 3-310082, "Network type information processing system", can be used.
c) Change by statistical processing
Data that is input during a given observation period and matches the pattern in question is tallied to form the population. FIG. 15 is an example graph of the data occurrence distribution, with the quantization level Z of the input data on the horizontal axis and the number of occurrences G of data at each level on the vertical axis. G_c is a cutoff level for removing noise components and the like from the input data. When the number of elements in the population reaches a fixed number, the membership function serving as the filter function is changed. The membership function is derived from the population by the following procedure.

[0046] FIG. 14 is a flowchart of the membership function parameter extraction processing. It is determined whether the cutoff level G_c has been set as an output data designation (step 141). If G_c is set, the distribution data is adjusted by subtracting the cutoff level at each quantization level Z_i (step 142); that is, Z_i − G_c is obtained and taken as the new Z_i. FIG. 16 shows the result of processing the input data distribution of FIG. 15 with the cutoff level G_c. This operation, which discards data at or below the cutoff level, is carried out until it has been completed for all quantization levels of Z. To that end, each time the calculation for a quantization level finishes, it is determined whether all quantization levels have been processed (step 143). When all have been processed, the flow proceeds to step 144. The flow also proceeds to step 144 when step 141 finds that no cutoff level is set.

[0047] The mean normalized coordinate value Z_m of the distribution data is obtained (step 144):
Z_m = Σ (Z_i × G_i) / Σ G_i
where Σ denotes the sum from i = 0 to i = n. Standard deviation values S_l and S_r are obtained independently on the minus side and the plus side of the mean normalized coordinate value Z_m (step 145). From the Z_m obtained in step 144 and the minus-side standard deviation S_l and plus-side standard deviation S_r obtained in step 145, the following normalized coordinate values are obtained (step 146). That is, centered on Z_m and as shown in FIG. 17, these are the normalized coordinate value Z_L1 at 1 times (*) the minus-side standard deviation S_l, Z_L2 at 3 times (*) S_l, Z_R1 at 1 times (*) the plus-side standard deviation S_r, and Z_R2 at 3 times (*) S_r. (*: these multipliers are changed depending on the conditions.)
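Steps 141 to 146 can be illustrated with the following sketch. Several points are assumptions rather than statements of the specification: the cutoff is applied to the occurrence counts G_i and clamped at zero, the one-sided standard deviation is taken as the frequency-weighted RMS deviation from Z_m over the levels on that side only, and the function and parameter names are hypothetical.

```python
import math


def side_statistics(levels, counts, cutoff=None, k_inner=1.0, k_outer=3.0):
    """Return (Z_m, S_l, S_r, Z_L1, Z_L2, Z_R1, Z_R2) for a histogram.

    levels: quantization levels Z_i; counts: occurrence counts G_i.
    k_inner, k_outer: the 1x and 3x multipliers, which may be changed.
    """
    if cutoff is not None:
        # Steps 141 to 143 (assumed reading): subtract the cutoff from each
        # count and discard whatever falls at or below it.
        counts = [max(0.0, g - cutoff) for g in counts]
    total = sum(counts)
    if total == 0:
        raise ValueError("no data left after applying the cutoff")
    # Step 144: frequency-weighted mean of the quantization levels.
    z_m = sum(z * g for z, g in zip(levels, counts)) / total

    def one_sided_std(select):
        # Step 145 (assumed definition): RMS deviation from Z_m on one side.
        pairs = [(z, g) for z, g in zip(levels, counts) if select(z)]
        weight = sum(g for _, g in pairs)
        if weight == 0:
            return 0.0
        return math.sqrt(sum(g * (z - z_m) ** 2 for z, g in pairs) / weight)

    s_l = one_sided_std(lambda z: z < z_m)
    s_r = one_sided_std(lambda z: z > z_m)
    # Step 146: coordinates at the inner and outer multiples of each side.
    return (z_m, s_l, s_r,
            z_m - k_inner * s_l, z_m - k_outer * s_l,
            z_m + k_inner * s_r, z_m + k_outer * s_r)
```

Computing the two sides separately lets a skewed distribution produce an asymmetric membership function, which a single standard deviation could not.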

[0048] The normalized center value C_s, normalized ambiguity V_as, and normalized variance values V_ls and V_rs shown in FIG. 18 are obtained by the following equations (step 147):
Normalized center value: C_s = (Z_L1 + Z_R1) / 2
Normalized ambiguity: V_as = (Z_R1 − Z_L1) / 2
Normalized left variance value: V_ls = C_s − Z_L2
Normalized right variance value: V_rs = Z_R2 − C_s
Next, C_s, V_as, V_ls, and V_rs are each denormalized to obtain the center value C, ambiguity V_a, left variance value V_l, and right variance value V_r (step 148). A new membership function (after one round of learning) is then generated from the original membership function before the change and the membership function produced by the processing flow of FIG. 14.
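The equations of step 147 translate directly into code. The sketch below covers only that step; the denormalization of step 148 is omitted because its mapping is not spelled out here, and the function name is a hypothetical one chosen for the example.

```python
def normalized_parameters(z_l1, z_l2, z_r1, z_r2):
    """Step 147: return (C_s, V_as, V_ls, V_rs) from the four coordinates."""
    c_s = (z_l1 + z_r1) / 2    # normalized center value
    v_as = (z_r1 - z_l1) / 2   # normalized ambiguity
    v_ls = c_s - z_l2          # normalized left variance value
    v_rs = z_r2 - c_s          # normalized right variance value
    return c_s, v_as, v_ls, v_rs
```

With Z_L1 = 1, Z_L2 = 0, Z_R1 = 4, Z_R2 = 8, for instance, the center is 2.5, the ambiguity 1.5, and the left and right variance values 2.5 and 5.5, giving a trapezoid with a longer right slope.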

[0049] FIG. 19 shows the method of changing the membership function (the learning function) in this embodiment. In the figure, the original (current) membership function is shown by the group of straight lines (thick line) connecting the points P_1, P_2, P_3, P_4; the membership function based on the data distribution obtained from sampling over a fixed period, as derived by FIG. 14, is shown by the lines (broken line) connecting the points P_1', P_2', P_3', P_4'; and the membership function newly generated from these two is shown by the lines (thin line) connecting the points P_1'', P_2'', P_3'', P_4''. Let the coordinates of each set of four points be P(s, v), P'(s', v'), and P''(s'', v''). Then
v'' = v' = v
s'' = (1.0 − g) × s + g × s'
where 0.0 ≤ g ≤ 1.0. The gain value g can be set independently for each of the four points.
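The learning function amounts to a per-point linear blend of the two trapezoids. A minimal sketch, assuming each membership function is given as a list of four (s, v) points and the gains as a list of four values (both representations are choices made for this example):

```python
def blend_membership(current, sampled, gains):
    """Blend two membership functions point by point.

    current, sampled: four (s, v) points each, for P and P'.
    gains: one gain g in [0.0, 1.0] per point.
    Returns the four points P'' of the updated membership function.
    """
    updated = []
    for (s, v), (s_new, _v_new), g in zip(current, sampled, gains):
        if not 0.0 <= g <= 1.0:
            raise ValueError("gain must lie in [0.0, 1.0]")
        # s'' = (1 - g) * s + g * s'; the grade v is left unchanged.
        updated.append(((1.0 - g) * s + g * s_new, v))
    return updated
```

A gain of 0 pins a point where it is, a gain of 1 moves it all the way to the sampled position, and intermediate values move it proportionally.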

[0050] Depending on the gain value set for each point, the same observed data leads to different changes in the membership function. FIG. 20 shows how the result varies with the choice of gains. FIG. 20(a) shows the current membership function (thin solid line) and the membership function based on the distribution obtained from sampling over a fixed period (broken line). FIG. 20(b) shows the membership function (thick solid line) generated when the gain value g of every point is 0.5. FIG. 20(c) shows the case where the gain value of point P_1 is g = 0 and that of the other points is g = 0.5. FIG. 20(d) shows the case where the ambiguity (the distance between P_2 and P_3) is left unchanged, the gain value in the direction that widens the base is g = 1.0, and the gain value in the direction that narrows the base is g = 0. The gain values can thus be set according to the characteristics of the sensor that is the input signal source, the intent of the learning, and so on, to steer the direction of learning.

[0051] As noted above, once learning has progressed, noise removal is performed as necessary, as follows.
a) Look at N (the number of history data) of the membership functions of each rule. This is simply the value of the counter provided for each history buffer.
b) When N is less than the boundary data count:
If at least one rule of the same recognition region is adjacent, the weight of the rule is not changed. Here, rules are adjacent when the bases (variance values) of their membership functions overlap in some dimension.
If no rule of the same recognition region is adjacent, the weight of the rule is lowered; that is, the weight of the membership function of each dimension is lowered.
FIG. 21 illustrates an example of noise removal. In the figure the points are teacher data, the circled areas around them are rules, and these rules together make up the regions used for recognition. Rule 1 and rule 2 in the figure are rules that received extremely few teacher signals. For rule 1, which is adjacent to a rule of the same recognition region, the weight is not changed; for rule 2, which has no adjacent rule of the same recognition region, the weight is lowered.
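The noise removal procedure can be sketched as follows. The dictionary layout of a rule, the representation of each base as a (low, high) interval, and the multiplicative `factor` used to lower a weight are assumptions for illustration; the specification does not state by how much the weight is lowered.

```python
def bases_overlap(a, b):
    """True when two (low, high) base intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def suppress_noise(rules, boundary_count, factor=0.5):
    """Lower the weight of sparsely trained rules with no same-region neighbour.

    Each rule is a dict with keys 'region', 'n' (history data count),
    'weight', and 'bases' (one (low, high) interval per dimension).
    """
    for rule in rules:
        if rule["n"] >= boundary_count:
            continue  # enough training data: leave the rule alone
        has_neighbour = any(
            other is not rule
            and other["region"] == rule["region"]
            and any(bases_overlap(p, q)
                    for p, q in zip(rule["bases"], other["bases"]))
            for other in rules
        )
        if not has_neighbour:
            rule["weight"] *= factor  # isolated rule: treat as probable noise
    return rules
```

In the situation of FIG. 21, a sparsely trained rule touching a larger rule of the same region keeps its weight, while an isolated one is attenuated.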

[0052] (Learning with Negative Teacher Signals) It is sometimes convenient to learn from negative teacher signals. In a plant system, for example, it may be impossible to give positive information about the situation represented by a group of sensor readings, such as "this is the occurrence of such-and-such a failure", while it is still possible to give negative information such as "at least this is not a fire". In such cases, teaching with negative information is useful.
a) When input data (sensor information) and a teacher signal representing a negated recognition result ("is not ...", for example not A) are input to the network-type information processing device for learning, a rule is generated in the same way as for a positive (that is, non-negated) teacher signal.
b) If, in the course of learning from teacher data, one or more rules with a normal (non-negated) recognition result (for example, not A → B or C or ...) become adjacent to a rule created from a negative teacher signal, the negative rule is judged to belong to the same recognition region as the adjacent normal rule.
FIG. 22 shows the processing flow for converting a rule generated from a negative teacher signal into a rule with a positive output. It is checked whether a rule from a normal signal is adjacent to the rule from the negative teacher signal (step S221); if so, the negative rule is converted to the same region as the adjacent normal rule (step S222). FIG. 23 shows an example of converting a rule from a negative teacher signal into a positive rule: if even one rule with a region other than A, such as B (or C ...), is adjacent to not A, not A is changed to B (or C ...). If other negative rules remain, steps S221 to S223 are repeated for them. When no negative rules remain, the processing ends.
(Example)
          Recognition result   X axis   Y axis   Z axis
Rule 1:   not A                MF1      MF2      MF3
  ↓
Rule 1:   B                    MF1      MF2      MF3
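The conversion of FIG. 22 might look like the sketch below. The rule representation (a dict with 'region', 'negated', and per-dimension 'bases' intervals) and the use of base overlap in some dimension as the adjacency test are assumptions carried over for illustration; the result can also depend on the order in which rules are visited.

```python
def convert_negative_rules(rules):
    """Relabel each 'not X' rule with the region of an adjacent positive rule.

    Each rule is a dict with keys 'region' (X), 'negated' (True for 'not X'),
    and 'bases' (one (low, high) interval per dimension).
    """
    for rule in rules:
        if not rule["negated"]:
            continue
        for other in rules:
            if other["negated"] or other["region"] == rule["region"]:
                continue  # need a positive rule for a region other than X
            adjacent = any(lo1 <= hi2 and lo2 <= hi1
                           for (lo1, hi1), (lo2, hi2)
                           in zip(rule["bases"], other["bases"]))
            if adjacent:                       # S221: a positive rule is adjacent
                rule["region"] = other["region"]
                rule["negated"] = False        # S222: take over its region
                break
    return rules
```

Negative rules with no positive neighbour are left as they are, so they can still be converted on a later pass once more rules exist.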

[0053] (Embodiment with Reuse of Evaluation Data for Learning) After a fixed amount of learning has finished, evaluation data is supplied to assess how far learning has progressed. Like the learning data, the evaluation data consists of sensor information (input data) and the correct result. The evaluation unit 16 obtains the difference (error) between the information processing result (inference result) for each input datum and the correct result supplied for evaluation, and judges pass or fail using an evaluation function. These pass/fail results are accumulated per rule, and once all the evaluation input data has been processed, the distribution of the accumulated results is evaluated. This makes it possible to judge whether the region division is appropriate. Learning ends when the division is judged appropriate. With this approach the evaluation results are accumulated in the feature space and the regions are divided at the end, so there is no risk of falling into a local minimum. The evaluation function therefore need not be linear and can be chosen freely. As a special case, when an evaluation datum matches no rule in the pattern table (applied rule = none, degree of match = 0), it is reused as teacher data. FIG. 24 shows the processing flow of evaluation and reuse of evaluation data.
a) The rules of the pattern table are applied to each evaluation datum (step S241).
b) Among the evaluation data that should yield a correct recognition result (degree of match = 1.0), all those with no applicable rule (degree of match = 0.0, error = 1.0) are collected (steps S242 to S245).
c) Supervised learning is performed with the evaluation data found in b) as teacher data (step S246).
d) After c) finishes, learning with the teacher data resumes.
FIG. 25 illustrates the reuse of evaluation data. If no rule corresponds to the evaluation datum at the position marked × in region A, learning with that datum as teacher data fills in the missing part of the recognition region.
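Steps a) and b) reduce to filtering the evaluation set for samples that no rule covers. In the sketch below, `match_degree` is a hypothetical callable standing in for applying the pattern table; its interface is an assumption made for the example.

```python
def select_for_reuse(evaluation_data, match_degree):
    """Return the evaluation samples that no rule covers.

    evaluation_data: list of (input, expected_result) pairs.
    match_degree: hypothetical callable returning the best degree of match
    in [0, 1] for an input and its expected result.
    """
    reuse = []
    for x, expected in evaluation_data:
        degree = match_degree(x, expected)   # S241: apply the pattern table
        if degree == 0.0:                    # S242 to S245: no applicable rule
            reuse.append((x, expected))
    return reuse                             # S246: feed back as teacher data
```

The returned pairs can then be passed through the same rule generation used for ordinary teacher data, which creates rules in the uncovered parts of the feature space.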

[0054] (Embodiment with Learning Without Teacher Data) The basic configuration of this embodiment is the same as that shown in FIG. 1. FIG. 26 is a block diagram of its functions; it differs from FIG. 1 in that a maximum output node determination unit 261 is provided in place of the area determination unit. In FIG. 26, parts with the same function as in FIG. 1 carry the same reference numerals. The internal state is modified (learned) by the following procedure, whose flow is shown in FIG. 27.
a) Suitably distributed membership functions (= band-pass functions) are defined on the links between the input and output nodes. (Depending on the method of combining degrees of match, weights on the links are also set.)
b) As in ordinary learning, an input data sequence is input to the network-type information processing device 11 (step S271). Unlike ordinary learning, however, no teacher data is input.
c) The computation (inference) is performed and outputs are obtained (step S272).
d) The output node with the largest output is selected (step S273).
e) As shown in FIG. 28, teaching is performed with the output value of that node taken as true (for example 1.0) and the output values of the other output nodes as false (for example 0.0) (step S274).
f) The filter functions are changed by the prescribed learning method described for the embodiment of FIG. 1 (step S275). Depending on the method of combining degrees of match, the weights on the links are also changed.
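Steps d) and e) build a teacher vector from the network's own outputs. A minimal sketch, assuming the outputs arrive as a plain list and that ties go to the first node with the largest value (the tie rule is not specified in the text):

```python
def self_teacher_signal(outputs, true_value=1.0, false_value=0.0):
    """Build a teacher vector from the network's own outputs (S273, S274)."""
    if not outputs:
        raise ValueError("outputs must not be empty")
    # S273: index of the output node with the largest value (first on ties).
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    # S274: the winner is taught as true, every other node as false.
    return [true_value if i == winner else false_value
            for i in range(len(outputs))]
```

The resulting vector is then used in place of externally supplied teacher data when the filter functions are updated in step f).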

[0055] In this embodiment the internal state is modified in a direction that emphasizes the output value for a particular input, so each output comes to describe some feature component of the input data. This is equivalent to nonlinear analysis in a broad sense. The network itself performs nonlinear analysis of its inputs through learning, and then uses the result of that analysis (= the network) to convert input data into feature quantities. It can replace the preprocessing stage conventionally implemented by statistical analysis and the like, giving systems such as those of FIG. 29 and FIG. 30. One mode of operation is as follows. First, a network for learning without teacher data is created and trained without a teacher. When a set amount of learning has finished (judged, for example, by a fixed number of data having been input or by the network showing a certain degree of convergence), this network is frozen (learning is stopped). It is then connected to the network that performs the actual output processing, and supervised learning is carried out. The network trained without teacher data and the ordinary supervised network can be connected in various ways, such as the form of FIG. 29, in which the direct input data is used alongside it, or the form of FIG. 30, in which the network trained without teacher data serves entirely as the preprocessing stage.

【００５６】[0056]

[Effects of the Invention]

1) According to the present invention, the learning process that generates and updates filter functions produces small rules in boundary regions, which raises the recognition rate.
2) According to the present invention, rules of large and small size arise naturally, so the recognition rate can be raised without the grid partitioning used conventionally.
3) In the present invention, evaluation data that falls in a region for which no rule has been generated is used again as a teacher signal, which ensures that a rule is created in that region. Evaluation data is thereby reused and learning is sped up.
4) According to the present invention, having the network-type information processing system learn without teacher data allows nonlinear analysis to be performed from multiple input data. A preprocessing stage can therefore be built automatically through unsupervised learning, reducing the time and workload needed to create it.

[Brief description of drawings]

【図１】本発明のネットワーク型情報処理システムの
学習システムの機能ブロック図FIG. 1 is a functional block diagram of a learning system of a network type information processing system according to the present invention.

【図２】ネットワーク型情報処理システムの構成を示
す図FIG. 2 is a diagram showing a configuration of a network type information processing system.

【図３】図２における演算を行う部分の構成例を示す
ブロック図FIG. 3 is a block diagram showing a configuration example of a portion for performing calculation in FIG.

【図４】ファジィメンバーシップ関数を説明するため
の図FIG. 4 is a diagram for explaining a fuzzy membership function.

【図５】教師信号による学習の処理の概略を示すフロ
ー図FIG. 5 is a flowchart showing an outline of learning processing by a teacher signal.

【図６】図５のフローにおけるルールの自動生成処理
の詳細を示すフロー図FIG. 6 is a flow chart showing details of automatic rule generation processing in the flow of FIG.

【図７】ルールの形式を説明するための図FIG. 7 is a diagram for explaining a rule format.

【図８】本実施例における認識領域を説明するための
図FIG. 8 is a diagram for explaining a recognition area in the present embodiment.

【図９】ルールの拡張を説明するための図FIG. 9 is a diagram for explaining expansion of rules.

【図１０】ルールの生成を説明するための図FIG. 10 is a diagram for explaining rule generation.

【図１１】フィルタ関数の変更処理を示すフロー図FIG. 11 is a flowchart showing a filter function changing process.

【図１２】履歴データ数が１であるときに設定される
フィルタ関数を示す図FIG. 12 is a diagram showing a filter function set when the number of history data is 1.

【図１３】履歴データ数が２以上でＭより小さいとき
に設定されるフィルタ関数を示す図FIG. 13 is a diagram showing a filter function set when the number of history data is 2 or more and less than M.

【図１４】メンバーシップ関数のパラメータ抽出処理
のフロー図FIG. 14 is a flowchart of parameter extraction processing of membership function.

【図１５】入力されたデータの量子化のレベルＺを横
軸とし、各レベルに対するデータの発生回数Ｇを縦軸に
とり、データの発生分布の例を示すグラフの例を示す図FIG. 15 is a diagram showing an example of a graph showing an example of a data generation distribution in which the horizontal axis represents the quantization level Z of input data and the vertical axis represents the number of times data is generated for each level.

【図１６】１図５のデータの発生分布をカットオフＧ
ｃにより足切りした後のデータ分布を示す図FIG. 16: Cutoff G of the generation distribution of the data in FIG.
The figure which shows the data distribution after cutting off by c

【図１７】各正規化座標値の算出を説明するための図FIG. 17 is a diagram for explaining calculation of each normalized coordinate value.

【図１８】図５のデータ分布から得られたメンバーシ
ップ関数を示す図FIG. 18 is a diagram showing a membership function obtained from the data distribution of FIG.

【図１９】本実施例においてメンバーシップ関数を変
更する基本的な方法（学習関数）を説明するための図FIG. 19 is a diagram for explaining a basic method (learning function) for changing the membership function in the present embodiment.

【図２０】学習関数のゲインの決めかたによりメンバ
ーシップ関数の変更結果がどのように変わるかを示すも
のであり、（ａ）は、現在のメンバーシップ関数（細い
実線）と一定期間のサンプリングから得られた分布に基
づくメンバーシップ関数（破線）を示し、（ｂ）は、各
点のゲイン値ｇを０．５としたときに生成されるメンバ
ーシップ関数（太い実線）を示し、（ｃ）は、点Ｐ１の
ゲイン値ｇ＝０、他の点のゲイン値ｇ＝０．５とした場
合を示し、（ｄ）は、曖昧度（Ｐ２とＰ３間の距離）を
変えず、底辺拡張方向のゲイン値ｇ＝１．０、かつ底辺
縮小方向のゲイン値ｇ＝０とした場合を示すFIG. 20 shows how the result of changing the membership function changes depending on how to determine the gain of the learning function. (A) shows the current membership function (thin solid line) and sampling for a certain period. Shows a membership function (broken line) based on the distribution obtained from, (b) shows a membership function (thick solid line) generated when the gain value g at each point is 0.5, and (c) ) Shows the case where the gain value of the point P1 is g = 0 and the gain value of the other point is g = 0.5, and (d) is the base expansion without changing the ambiguity (distance between P2 and P3). Direction gain value g = 1.0 and bottom side reduction direction gain value g = 0 are shown.

FIG. 21 is a diagram for explaining noise removal.

FIG. 22 is a flowchart of the process of converting a negative-teaching rule into a positive-teaching rule.

FIG. 23 is a diagram for explaining the conversion of a negative-teaching rule into a positive-teaching rule.

FIG. 24 is a flowchart of the evaluation process.

FIG. 25 is a diagram for explaining the reuse of evaluation data.

FIG. 26 is a functional block diagram of an embodiment that performs learning without teacher data.

FIG. 27 is a processing flowchart of learning without teacher data.

FIG. 28 is a diagram for explaining learning without teacher data.

FIG. 29 is a diagram showing an example of a network-type information processing device to which learning without teacher data is applied.

FIG. 30 is a diagram showing another example of a network-type information processing device to which learning without teacher data is applied.
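
The gain settings described for FIG. 20 can be illustrated with a short sketch. It assumes, purely for illustration, that the learning function moves each characteristic point of the membership function linearly from its current position toward the position obtained from sampling, scaled by the gain g (g = 0 keeps the point, g = 1 adopts the sampled position). The function names, the three-point layout, and the `outward` sign convention are hypothetical and are not taken from the specification.

```python
def update_point(current, sampled, gain):
    """Move one characteristic point toward its sampled position.

    Assumed linear rule: gain 0 keeps the current value, gain 1 adopts the sampled value.
    """
    return current + gain * (sampled - current)


def update_points(current, sampled, gains):
    """Apply an individual gain to each characteristic point."""
    return [update_point(c, s, g) for c, s, g in zip(current, sampled, gains)]


def directional_gain(current, sampled, outward, g_expand=1.0, g_shrink=0.0):
    """Choose a gain depending on whether the base would expand or shrink.

    outward is -1 for a left base end and +1 for a right base end (assumed convention).
    """
    moving_outward = (sampled - current) * outward > 0
    return g_expand if moving_outward else g_shrink


# Hypothetical x-coordinates: left base end, peak, right base end.
current = [2.0, 4.0, 6.0]
sampled = [1.0, 5.0, 5.0]

# Every point with g = 0.5: each point moves halfway to its sampled position.
halfway = update_points(current, sampled, [0.5, 0.5, 0.5])

# One point held fixed with g = 0, the others with g = 0.5.
pinned = update_points(current, sampled, [0.5, 0.0, 0.5])

# Direction-dependent gains for the two base ends: expand fully, never shrink.
base_gains = [
    directional_gain(current[0], sampled[0], outward=-1),
    directional_gain(current[2], sampled[2], outward=+1),
]
```

With these numbers the left base end would expand (gain 1.0) while the right base end, which would shrink, is left where it is (gain 0).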

[Explanation of symbols]

11 ... network-type information processing device, 12 ... learning data input unit, 13 ... area determination unit, 14 ... filter function updating unit, 15 ... filter function generating unit, 16 ... evaluation unit, 17 ... frequency distribution measuring unit, 171 ... history buffer, 18 ... update method designating unit.

Continuation of the front page
(72) Inventor: Yoichi Ueishi, c/o AdIn Research Inc., 4th floor, Ken Naka Building (兼仲ビル), 1-15-8 Jinnan, Shibuya-ku, Tokyo
(72) Inventor: Toshihide Fujimaki, c/o AdIn Research Inc. Sapporo Laboratory, 3rd floor, Hattori Building, 1 Kita 2-jo Nishi 2-chome, Chuo-ku, Sapporo, Hokkaido

Claims

[Claims]

1. A learning system for a network-type information processing device, the network-type information processing device comprising a plurality of input nodes, a plurality of output nodes, and directional links connecting the input nodes and the output nodes, each directional link having filter function storage means for storing a filter function, which is a non-linear selective function, and filter function operation means for converting information passing through the directional link by the filter function, and each output node having means for performing a function operation on information input via the directional links, the learning system comprising: input means for inputting data and a teacher signal for the data to the network-type information processing device; area determining means for determining whether or not an area indicated by the input data is included in an existing recognition area; filter function updating means for individually updating each filter function in the set of filter functions forming the existing recognition area when the area determining means determines that the area is included in the existing recognition area; and filter function generating means for generating a set of filter functions forming a new recognition area when the area determining means determines that the area is not included in any existing recognition area.

2. A learning system for a network-type information processing device, the network-type information processing device comprising a plurality of input nodes, a plurality of output nodes, and directional links connecting the input nodes and the output nodes, each directional link having filter function storage means for storing a filter function, which is a non-linear selective function, and filter function operation means for converting information passing through the directional link by the filter function, and each output node having means for performing a function operation on information input via the directional links, the learning system comprising: input means for inputting, to the network-type information processing device, learning data including input data and a corresponding affirmative or negative teacher signal; area determining means for determining whether or not an area indicated by the input learning data is included in an existing recognition area representing affirmation or an existing recognition area representing negation; filter function updating means for individually updating each filter function in the set of filter functions forming the existing recognition area when the area determining means determines that the area is included in an existing recognition area representing affirmation or an existing recognition area representing negation; and filter function generating means for, when the area determining means determines that the area belongs to neither an existing recognition area representing affirmation nor an existing recognition area representing negation, generating a set of filter functions forming a new recognition area representing affirmation if the teacher signal represents affirmation, and generating a set of filter functions forming a new recognition area representing negation if the teacher signal represents negation.

3. The learning system for a network-type information processing device according to claim 2, further comprising means for converting a negative output node into a positive output node by referring to a positive output node having an area adjacent to that of the negative output node.

4. A learning system for a network-type information processing device, the network-type information processing device comprising a plurality of input nodes, a plurality of output nodes, and directional links connecting the input nodes and the output nodes, each directional link having filter function storage means for storing a filter function, which is a non-linear selective function, and filter function operation means for converting information passing through the directional link by the filter function, and each output node having means for performing a function operation on information input via the directional links, the learning system comprising: input means for inputting input data and a teacher signal corresponding to the input data to the network-type information processing device; evaluation means for calculating an error between the teacher signal, taken as a target value, and the value obtained as a result of network information processing of the input data, and for determining, based on the error, whether or not to update an existing recognition area; filter function updating means for individually updating each filter function in the set of filter functions forming the existing recognition area when the evaluation means determines that the existing recognition area is to be updated; and filter function generating means for generating a set of filter functions forming a new recognition area when the evaluation means determines that the existing recognition area is not to be updated.

5. A learning system for a network-type information processing device, the network-type information processing device comprising a plurality of input nodes, a plurality of output nodes, and directional links connecting the input nodes and the output nodes, each directional link having filter function storage means for storing a filter function, which is a non-linear selective function, and filter function operation means for converting information passing through the directional link by the filter function, and each output node having means for performing a function operation on information input via the directional links, the learning system comprising: input means for inputting learning data that does not include a teacher signal to the network-type information processing device; determining means for determining the output node that gives the maximum value among the results obtained when the network-type information processing device performs network information processing on the input data; and filter function updating means for individually updating each filter function in the set of filter functions connected to the output node determined by the determining means.

6. The learning system for a network-type information processing device according to claim 1 or 5, wherein the area determining means compares the degree of matching obtained by the operation of the filter function operation means of the network-type information processing device with a predetermined threshold value, and determines the recognition area based on the result.

7. The learning system for a network-type information processing device according to claim 1, further comprising: history storage means for storing learning history information for each recognition area; and update method instructing means for, when it is determined that an existing recognition area is to be updated, selecting and instructing, based on the learning history information stored in the history storage means, an appropriate update method from among a plurality of types of update methods for the filter functions of each recognition area, wherein the filter function updating means individually updates each filter function in the set of filter functions forming the existing recognition area by using the update method selected by the update method instructing means.

8. The learning system for a network-type information processing device according to claim 1, further comprising frequency distribution measuring means for obtaining a frequency distribution from the history information stored in the history storage means, wherein the filter function updating means updates the filter functions by a statistical method using the frequency distribution obtained by the frequency distribution measuring means.

9. The learning system for a network-type information processing device according to claim 1, further comprising noise removing means for determining, after learning has been completed for a certain period, whether or not a set of filter functions having extremely few teacher signals is noise, based on the contents of sets of filter functions having adjacent areas, and for removing that set of filter functions (rule) when it is determined to be noise.

10. The learning system for a network-type information processing device according to claim 1, further comprising: means for deriving an error using evaluation data and checking for the end of learning; and evaluation data reuse means for using the evaluation data not only for deriving the error but also as learning data for an area in which no set of filter functions has been generated.
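
As a reading aid only, the update-or-generate decision of claim 1, the threshold comparison of claim 6, and the maximum-output selection of claim 5 can be sketched as follows. Everything concrete here is an assumption made for illustration and is not part of the claims: the triangular shape of the filter function, taking the minimum over links as the degree of matching, restricting the search to areas with the same teacher label, the linear center update, and all names and default values.

```python
from dataclasses import dataclass


@dataclass
class Filter:
    """Assumed triangular filter function stored on one directional link."""
    center: float
    half_width: float

    def __call__(self, x):
        return max(0.0, 1.0 - abs(x - self.center) / self.half_width)


@dataclass
class Region:
    """One recognition area: an output label plus one filter per input node."""
    label: str
    filters: list

    def matching(self, xs):
        # Assumed output-node operation: minimum of the filtered inputs.
        return min(f(x) for f, x in zip(self.filters, xs))


def _update(region, xs, gain):
    # Each filter function in the set is updated individually.
    for f, x in zip(region.filters, xs):
        f.center += gain * (x - f.center)


def learn(regions, xs, label, threshold=0.5, gain=0.5, init_half_width=1.0):
    """Learning with a teacher signal: update an existing area or generate a new one."""
    candidates = [r for r in regions if r.label == label]
    best = max(candidates, key=lambda r: r.matching(xs), default=None)
    if best is not None and best.matching(xs) >= threshold:
        _update(best, xs, gain)          # input falls inside an existing area
        return "updated"
    regions.append(Region(label, [Filter(x, init_half_width) for x in xs]))
    return "generated"                   # input lies outside every existing area


def learn_without_teacher(regions, xs, gain=0.5):
    """Learning without a teacher signal: update the area whose output is largest."""
    winner = max(regions, key=lambda r: r.matching(xs))
    _update(winner, xs, gain)
    return winner


regions = []
first = learn(regions, [0.0, 0.0], "A")    # no area yet, so one is generated
second = learn(regions, [0.2, 0.4], "A")   # matching 0.6 >= 0.5, so the area is updated
third = learn(regions, [3.0, 3.0], "A")    # matching 0.0, so a second area is generated
```

In this sketch the threshold controls how readily new recognition areas are created: a higher value makes existing areas harder to match, so more areas are generated.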