JP2732603B2

JP2732603B2 - Learning process monitoring device for network configuration data processing device

Info

Publication number: JP2732603B2
Application number: JP63216863A
Authority: JP
Inventors: 旭川村; 和雄浅川; 茂美長田; 信雄渡部; 隆木本
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1988-08-31
Filing date: 1988-08-31
Publication date: 1998-03-30
Anticipated expiration: 2013-03-30
Also published as: JPH0264846A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Summary]

The invention concerns a network configuration data processing device that builds a hierarchical network from basic units, each of which converts the product sum of a plurality of inputs and weights with a threshold function, and that processes an input pattern so as to obtain a desired output pattern. The object is to make it possible to display visually how learning is progressing while the weight values are being set by learning.

The device comprises weight learning means that supplies input signals to the basic units of the input layer, uses the corresponding output signals from the basic units of the output layer together with teacher signals indicating the values those output signals should take, calculates an error value representing the magnitude of the mismatch between the two signals, and sequentially updates the weights from their initial values according to a weight update amount calculated from the sum of the calculated error values, thereby obtaining weight values for which the sum of the error values falls within a predetermined allowable range.

The device further comprises an evaluation function distribution display unit, which receives information on the structure of the hierarchical network and on the learning patterns held in a learning pattern holding unit and makes it possible to display visually the distribution of the evaluation function, which represents the progress of learning, as a function of the weights of a small number of arbitrarily selected connections, and a weight search path display unit, which receives the connection weights as they change with the progress of learning by the weight learning means and plots them on the display image obtained by the evaluation function distribution display unit. The device is thus configured so that the quality of the setting of the coefficients that affect the progress of learning (ε and/or α) can be observed visually.

[Industrial applications]

The present invention relates to a learning process monitoring device for a network configuration data processing device.

With a conventional sequential processing computer (von Neumann computer), it is difficult to adjust the data processing function of the computer in response to changes in usage or environment. A parallel distributed processing method based on a hierarchical network has therefore been newly proposed as an adaptive data processing method. In particular, the processing method called the back-propagation method (D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation," PARALLEL DISTRIBUTED PROCESSING, Vol. 1, pp. 318-364, The MIT Press, 1986) has attracted attention because of its high practicality.

In the back-propagation method, a hierarchical network is built from nodes of a kind called basic units and from internal connections that carry weights. FIG. 8 shows the basic configuration of the basic unit 1. The basic unit 1 is a multi-input, single-output system. It comprises an accumulation processing unit 2, which multiplies each of a plurality of inputs by the weight of its internal connection and sums all the products, and a threshold processing unit 3, which applies nonlinear threshold processing to the accumulated value and outputs a single final output. A large number of basic units 1 of this configuration are connected hierarchically as shown in FIG. 9 to form a hierarchical network, which performs the data processing function of converting an input signal pattern into the corresponding output signal pattern.

In the back-propagation method, the weights of the internal connections in the hierarchical network are determined according to a predetermined learning algorithm so that the output signal for a selected input signal matches the teacher signal, which indicates the signal value the output should take. Once the weights have been determined by this processing, the hierarchical network outputs a plausible output signal even when an unanticipated input signal is supplied, and a "soft" parallel distributed data processing function is thereby realized.

To make a network configuration data processing device of this kind practical, the weight learning process must be completed in a shorter time. Because hierarchical networks must be made larger in order to realize complex data processing, this is one of the problems that must be solved.

[Conventional technology]

If the h layer is the preceding layer and the i layer is the succeeding layer, the operation performed by the accumulation processing unit 2 of the basic unit 1 is given by equation (1), and the operation performed by the threshold processing unit 3 is given by equation (2).

where
h: unit number in the h layer
i: unit number in the i layer
p: pattern number of the input signal
θ_i: threshold of the i-th unit in the i layer
W_ih: weight of the internal connection between the h and i layers
x_pi: product sum of the inputs from the units of the h layer to the i-th unit of the i layer, for the p-th pattern input signal
y_ph: output of the h layer for the p-th pattern input signal
y_pi: output of the i layer for the p-th pattern input signal

In the back-propagation method, the weight W_ih and the threshold θ_i are adjusted adaptively and automatically by error feedback. As is clear from equations (1) and (2), the weight W_ih and the threshold θ_i must be adjusted simultaneously, but this is difficult because the two adjustments interfere with each other.
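The computation of a single basic unit can be sketched as follows. This is a minimal illustration, not the patent's own formulation: it assumes that the accumulated value is the weighted sum minus the threshold (the form used for the two-input example in the Example section) and that the threshold function is the logistic sigmoid. The name `basic_unit` is hypothetical.

```python
import math

def basic_unit(inputs, weights, theta):
    """Return (x, y) for one basic unit with the given inputs, weights and threshold."""
    # Accumulation processing unit: product sum of inputs and weights, minus the threshold.
    x = sum(w * y for w, y in zip(weights, inputs)) - theta
    # Threshold processing unit: logistic sigmoid (an assumed S-shaped function).
    y = 1.0 / (1.0 + math.exp(-x))
    return x, y

# Two inputs whose weighted sum equals the threshold, so x is (numerically) zero.
x, y = basic_unit([1.0, 0.5], [0.4, -0.2], theta=0.3)
```

With these illustrative values the weighted sum cancels the threshold, so the unit sits on its decision boundary and its output lies midway between the two saturation levels.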

Next, the conventional weight learning process is described. The description uses a hierarchical network with an h layer - i layer - j layer structure, as shown in FIG. 10.

In the weight learning process, the error vector E_p, which is the sum of the squares of the errors between the teacher signals and the output signals from the output layer, is first calculated as the error of the hierarchical network. Here, the teacher signal is the signal that the output signal should take.

where
E_p: error vector for the p-th pattern input signal
E: sum of the error vectors for the input signals of all patterns (corresponds to the evaluation function)
d_pj: teacher signal for the j-th unit of the j layer, for the p-th pattern input signal

To find the relationship between the error vector and the output signal, equation (3) is partially differentiated with respect to y_pj. To find the relationship between the error vector E_p and the input to the j layer, the error vector E_p is then partially differentiated with respect to x_pj. Finally, to find the relationship between the error vector E_p and the weights between the i and j layers, the error vector E_p is partially differentiated with respect to W_ji, which gives a solution expressed as a sum of products.

Next, the change in the error vector E_p with respect to the output y_pi of the i layer is obtained. The change in the error vector with respect to a change in the total input x_pi to a unit of the i layer is then calculated, which gives a solution expressed as a sum of products. Finally, the relationship between a change in the weights between the h and i layers and the change in the error vector is obtained, which also gives a solution expressed as a sum of products.

From these, the relationship between the error vector for all input patterns and the weights between the i and j layers is obtained.

Likewise, the relationship between the error vector for all input patterns and the weights between the h and i layers is obtained.
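The equations of this derivation are not reproduced in the text. The block below is a reconstruction under stated assumptions rather than the patent's own wording: it assumes the usual squared-error form with a factor 1/2, the sign convention y − d, and a logistic threshold function, so that dy/dx = y(1 − y). Only the numbers (3), (11), and (12) are taken from the text; the intermediate steps are left unnumbered, and the shorthand δ is introduced here for readability.

```latex
\begin{align}
E_p &= \tfrac{1}{2}\sum_j \left(y_{pj}-d_{pj}\right)^2 \tag{3}\\
E &= \sum_p E_p \notag\\
\frac{\partial E_p}{\partial y_{pj}} &= y_{pj}-d_{pj} \notag\\
\frac{\partial E_p}{\partial x_{pj}} &= \left(y_{pj}-d_{pj}\right) y_{pj}\left(1-y_{pj}\right) \equiv \delta_{pj} \notag\\
\frac{\partial E_p}{\partial W_{ji}} &= \delta_{pj}\, y_{pi} \notag\\
\frac{\partial E_p}{\partial y_{pi}} &= \sum_j \delta_{pj}\, W_{ji} \notag\\
\frac{\partial E_p}{\partial x_{pi}} &= \Bigl(\sum_j \delta_{pj}\, W_{ji}\Bigr)\, y_{pi}\left(1-y_{pi}\right) \equiv \delta_{pi} \notag\\
\frac{\partial E_p}{\partial W_{ih}} &= \delta_{pi}\, y_{ph} \notag\\
\frac{\partial E}{\partial W_{ji}} &= \sum_p \delta_{pj}\, y_{pi} \tag{11}\\
\frac{\partial E}{\partial W_{ih}} &= \sum_p \delta_{pi}\, y_{ph} \tag{12}
\end{align}
```

Each step applies the chain rule once, moving from the output layer back toward the input layer, which is why the gradients for the h-i weights contain a sum over the units of the j layer.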

Equations (11) and (12) give the rate of change of the error vector with respect to a change in the weights between the layers. If the weights are changed so that this value is always negative, the sum E of the error vectors (the evaluation function) can be made to approach zero asymptotically by the well-known gradient method. In the conventional back-propagation method, the update amounts per iteration ΔW_ji and ΔW_ih are therefore set by equations (13) and (14), and the weight update is repeated so that the sum E of the error vectors converges to a minimum.

Here, ε is a coefficient corresponding to a learning control parameter.

An extended form of equations (13) and (14) is also used.

In that form, α is a coefficient corresponding to a control parameter, and t is the number of updates.
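The role of the two coefficients can be illustrated with a short sketch. The update equations themselves are not reproduced in the text, so the code assumes the standard gradient step with a momentum term, delta(t) = −ε·∂E/∂W + α·delta(t−1), as in the cited back-propagation reference. The function names and the toy evaluation function E(w) = w² are hypothetical.

```python
def update(weight, grad, prev_delta, eps, alpha):
    """One update step: delta(t) = -eps * dE/dW + alpha * delta(t-1)."""
    delta = -eps * grad + alpha * prev_delta
    return weight + delta, delta

def run(eps, alpha, steps=20, w0=1.0):
    """Minimise the toy evaluation function E(w) = w**2 and return the path of w."""
    w, delta, path = w0, 0.0, [w0]
    for _ in range(steps):
        grad = 2.0 * w  # dE/dw for E = w**2
        w, delta = update(w, grad, delta, eps, alpha)
        path.append(w)
    return path

slow = run(eps=0.01, alpha=0.0)  # small eps: steady but slow approach to the minimum
fast = run(eps=0.3, alpha=0.2)   # moderate eps with momentum: much faster approach
wild = run(eps=0.95, alpha=0.0)  # large eps: w overshoots and changes sign every step
```

Even on this one-dimensional example the trade-off is visible: a small ε moves steadily but slowly, while a large ε overshoots the minimum on every step.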

FIG. 11 shows a flowchart of the learning process.

[Problems to be solved by the invention]

The biggest problem with the back-propagation method is that many learning iterations are needed before convergence. The problem grows as the network structure is made larger.

If the coefficients ε and α corresponding to the control parameters are made sufficiently small, the sum E of the error vectors will almost certainly converge, but the number of learning iterations required for convergence increases. Conversely, if both parameters are made large in an attempt to reduce the number of iterations, the sum E of the error vectors may oscillate. FIG. 12 shows the learning result for a hierarchical network with 13 units in the input layer, 8 units in the intermediate layer, and 7 units in the output layer, trained with 62 input pattern signals and the corresponding teacher pattern signals. In FIG. 12, the horizontal axis is the number of learning iterations and the vertical axis is the sum of the error vectors at that point. The coefficients corresponding to the control parameters were set to ε = 0.3 and α = 0.2.

As FIG. 12 makes clear, although the result varies somewhat with the parameter settings, learning is repeated a considerable number of times before the optimum weights are determined. How the values of the coefficients ε and/or α are set greatly affects the number of learning iterations required to obtain the optimum weights.

For this reason, it is desirable to display visually how learning proceeds toward completion depending on how the values of the coefficients ε and/or α are chosen for each of the plurality of weights W_ih and W_ji, so that, for example, operation trainees can recognize whether the values set for ε and/or α are good or bad.

The object of the present invention is to make it possible, in a network configuration data processing device, to display visually how learning is progressing while the weight values are being set by learning.

[Means for solving the problem]

FIG. 1 is a diagram showing the principle configuration of the present invention.

In the figure, reference numeral 1 denotes a basic unit, the basic building block of the hierarchical network. It receives a plurality of inputs and the weights by which those inputs are to be multiplied, obtains their product sum, and converts the product sum with a threshold function to obtain a final output. 1-h denotes the plurality of basic units forming the input layer, 1-i the plurality of basic units forming one or more intermediate layers, and 1-j the one or more basic units forming the output layer. Connections are made between the basic units 1-h and the basic units 1-i, among the basic units 1-i, and between the basic units 1-i and the basic units 1-j, and the hierarchical network indicated by 10 is formed by the weights set for these connections.

Reference numeral 20 denotes a learning pattern holding unit that holds the learning patterns needed for the weight learning process. It comprises input signals (or an input signal holding area) 21, which holds a plurality of predetermined input signals, and teacher signals (or a teacher signal holding area) 22, which holds the teacher signals that the output signals for those input signals should match. Reference numeral 13 denotes output signal deriving means, which obtains the outputs of the units of each layer that result from supplying the input signals held in the input signal holding area 21 to the hierarchical network 10. Reference numeral 12 denotes the connection weights, which hold the W_ih and W_ji described above. Reference numeral 30 denotes weight learning means. Reference numeral 33 denotes error value calculating means, which calculates, from the output signal supplied by the output signal deriving means 13 and the held teacher signal 22, an error value representing the degree of mismatch between the two signals, and obtains the sum of the error values over all the supplied input signals.

The weight learning means 30 comprises control parameters (coefficients ε and α) 32 for calculating the weight update amount ΔW and an update rule 31 for updating the weights. Each time the number of learning iterations advances, it updates the connection weights 12 of the hierarchical network 10, starting from their initial values, and stores the result as the new connection weights 12. That is, the connection weights W_ih and W_ji are updated according to the update rule.

The evaluation function distribution display unit 50 performs an evaluation function distribution calculation process 51, a cutting plane calculation process 52, and an evaluation function distribution display process 53.

Once the structure of the hierarchical network 10 and the learning patterns are given, the value taken by the sum of the error vectors, that is, the evaluation function E, can be determined for any values of the individual connection weights W_ih and W_ji. In other words, the value of the evaluation function E can be calculated for many connection weights by what amounts to cut-and-try. In the monitoring device of the present invention, this calculation of the distribution of the evaluation function may be regarded as having already been completed as offline processing. Once the distribution of the evaluation function has been calculated, it is possible to tell which of the many connection weights strongly affect the value of the evaluation function E when they change. In the monitoring device of the present invention, a particular connection weight W_x is selectively extracted, and once the coefficients ε and/or α used in determining that weight W_x by learning have been chosen, the path by which the weight approaches its optimum value as learning progresses can be displayed visually.
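The offline tabulation of the evaluation function can be sketched as follows. This is an illustration under simplifying assumptions: a single two-input unit with its threshold fixed at 0 stands in for the hierarchical network, the threshold function is assumed to be the logistic sigmoid, and the error is the squared error with a factor 1/2. All names and the two learning patterns are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def evaluation(w1, w2, patterns, theta=0.0):
    """Evaluation function E: squared error summed over all learning patterns."""
    total = 0.0
    for (y1, y2), d in patterns:
        out = sigmoid(w1 * y1 + w2 * y2 - theta)
        total += 0.5 * (out - d) ** 2
    return total

def evaluation_grid(patterns, lo=-4.0, hi=4.0, n=9):
    """Tabulate E over an n-by-n grid; rows follow w2, columns follow w1."""
    step = (hi - lo) / (n - 1)
    axis = [lo + k * step for k in range(n)]
    return axis, [[evaluation(w1, w2, patterns) for w1 in axis] for w2 in axis]

# Two learning patterns: (input signal, teacher signal).
patterns = [((1.0, 0.0), 1.0), ((0.0, 1.0), 0.0)]
axis, grid = evaluation_grid(patterns)
```

A table of this kind only needs to be computed once per network structure and learning pattern set, which is why it can be prepared before learning starts and then reused as the background of the display.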

Accordingly, when attention is directed to a desired weight W_x in order to monitor it, the cutting plane calculation process 52 shown in the figure determines a plane suitable for visual display on a two-dimensional surface. The evaluation function distribution display process 53 then produces a figure showing how the value of the evaluation function varies with the value of the weight W_x on the plane so determined.

The weight search path display unit 60 receives the new connection weights that the weight learning means 30 generates each time the number of learning iterations advances, and calculates where those new weights lie on the figure that the evaluation function distribution display unit 50 is displaying. This calculation is performed in the weight search path calculation process 61 shown in the figure. Based on the calculated result, the weight search path display process 62 plots the position of the new connection weights on the figure.
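One way to locate a new weight vector on the displayed figure is sketched below. The text does not specify how the position is calculated, so this is an assumption: the display plane is taken to be spanned by two vectors, such as the optimum-weight vector W_c and a second vector W_x, and the weight vector is mapped to plane coordinates (a, b) by least-squares projection. The names `dot` and `plane_coords` and the sample weights are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def plane_coords(w, w_c, w_x):
    """Least-squares (a, b) such that a*w_c + b*w_x is the point of the plane closest to w."""
    cc, cx, xx = dot(w_c, w_c), dot(w_c, w_x), dot(w_x, w_x)
    wc, wx = dot(w, w_c), dot(w, w_x)
    det = cc * xx - cx * cx  # zero only when w_c and w_x are parallel
    if det == 0.0:
        raise ValueError("w_c and w_x must not be parallel")
    a = (wc * xx - wx * cx) / det
    b = (wx * cc - wc * cx) / det
    return a, b

w_c = [1.0, 2.0, -1.0]  # assumed optimum weights
w_x = [0.0, 1.0, 1.0]   # assumed second vector spanning the display plane
history = [[0.0, 0.0, 0.0], [0.5, 1.5, 0.0], [1.0, 2.0, -1.0]]  # weights after successive updates
path = [plane_coords(w, w_c, w_x) for w in history]
```

Plotting the successive (a, b) pairs over the evaluation function figure then traces the search path; a weight vector that lies off the plane is shown at its nearest point on the plane.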

These results are displayed on the display device 70. That is, when particular values are set for the coefficients ε and/or α, the display shows visually how the plotted position approaches the optimum weights as the number of learning iterations performed by the weight learning means 30 advances.

Of course, the present invention presupposes that the optimum weights are known in advance. Its purpose is to show the trainee how the weights approach the optimum depending on how the values of the coefficients ε and/or α are set.

[Operation]

Information on the structure of the hierarchical network 10 and the contents of the learning pattern holding unit 20 are supplied to the evaluation function distribution display unit 50, and the distribution of the evaluation function as seen on the desired plane is displayed, for example, as the contour lines of a two-dimensional figure. The coefficients ε and/or α for the corresponding weight W_x are then set as the control parameters 32, and initial values are set as the connection weights 12.

In this state, each time learning in the weight learning means 30 advances (or every suitable number of iterations), the weight search path display unit 60 plots the position of the newly generated weight W_x on the figure. Depending on how the values of the coefficients ε and/or α are chosen, undesired oscillation may occur, and this too can be seen visually.

[Example]

With the characteristics of the distribution of the evaluation function understood, the cutting plane calculation process 52 shown in FIG. 1 is performed so as to produce a display in which those characteristics appear clearly. The following two methods are considered for obtaining such a plane.

(A) Cutting plane selection method (part 1)

Suppose that optimum weights are to be obtained for three weights W_1, W_2, and W_3, and that the optimum weight W_c is given by the coordinates {W_c1, W_c2, W_c3}. The position vector of the optimum weights is then represented as (W_c) in FIG. 2. A plane S containing this position vector (W_c) is therefore considered. There are infinitely many planes containing the position vector (W_c), but if an arbitrary vector (W_x) is chosen and the plane containing the vectors (W_c) and (W_x) is taken as the cutting plane, any point W_p on the plane S is expressed as

W_p = aW_c + bW_x (where a and b are arbitrary real numbers).

When such a plane S is chosen, learning should preferably proceed toward the tip of the position vector (W_c) corresponding to the optimum weights, and the path by which actual learning advances toward that tip can be monitored efficiently.
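The parametrisation W_p = aW_c + bW_x can be used directly to sample the evaluation function on the plane S. The sketch below assumes a toy evaluation function (the squared distance to the optimum) in place of the network's E; the names `plane_point`, `sample_plane`, and `toy_E`, and all numeric values, are hypothetical.

```python
def plane_point(a, b, w_c, w_x):
    """Return W_p = a*W_c + b*W_x as a list of weight values."""
    return [a * c + b * x for c, x in zip(w_c, w_x)]

def sample_plane(evaluate, w_c, w_x, coeffs):
    """Evaluate E at every (a, b) pair on the plane S; rows follow b, columns follow a."""
    return [[evaluate(plane_point(a, b, w_c, w_x)) for a in coeffs] for b in coeffs]

w_c = [1.0, 2.0, -1.0]  # assumed optimum weights {W_c1, W_c2, W_c3}
w_x = [0.0, 1.0, 1.0]   # arbitrary second vector spanning the plane

def toy_E(w):
    """Toy stand-in for the evaluation function: squared distance to the optimum."""
    return sum((wi - ci) ** 2 for wi, ci in zip(w, w_c))

coeffs = [0.0, 0.5, 1.0, 1.5]
table = sample_plane(toy_E, w_c, w_x, coeffs)
```

In this parametrisation the optimum always sits at (a, b) = (1, 0) and the origin of weight space at (0, 0), so the display contains both the start region and the target of the search.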

(B) Cutting plane selection method (part 2)

Suppose that the coordinates {W_c1, W_c2, W_c3} are given for the optimum weights. The weight W_2, for example, is held at the fixed value W_c2 while the values of the weights W_1 and W_3 are varied, and the plane S' passing through the fixed value W_c2 is chosen as the cutting plane, as shown in FIG. 3. When the plane S' is chosen in this way, learning should preferably proceed toward the point on the plane S' at the tip of the position vector (W_c), and the path by which actual learning advances toward that tip can be monitored efficiently.
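The second selection method amounts to holding one weight at its optimum value and sweeping the other two. A minimal sketch for the three-weight case follows, again with a toy evaluation function standing in for the network's E; `slice_fixed`, `toy_E`, and the numeric values are hypothetical.

```python
def slice_fixed(evaluate, fixed_index, fixed_value, axis, n_weights=3):
    """Tabulate E with one of three weights held fixed and the other two swept over axis."""
    free = [k for k in range(n_weights) if k != fixed_index]
    rows = []
    for v in axis:          # second free weight (W_3 when W_2 is fixed)
        row = []
        for u in axis:      # first free weight (W_1 when W_2 is fixed)
            w = [0.0] * n_weights
            w[fixed_index] = fixed_value
            w[free[0]], w[free[1]] = u, v
            row.append(evaluate(w))
        rows.append(row)
    return rows

w_c = [1.0, 2.0, -1.0]  # assumed optimum weights {W_c1, W_c2, W_c3}

def toy_E(w):
    """Toy stand-in for the evaluation function: squared distance to the optimum."""
    return sum((wi - ci) ** 2 for wi, ci in zip(w, w_c))

axis = [-1.0, 0.0, 1.0]
table = slice_fixed(toy_E, fixed_index=1, fixed_value=w_c[1], axis=axis)
```

Because the fixed weight already has its optimum value, the optimum itself lies on the plane S', here at W_1 = 1 and W_3 = −1.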

This cutting method can be described in general terms as follows. In equation (1) above, x_pi is linear in y_ph. Considered in the space whose coordinate axes are x_pi and y_ph, equation (1) is therefore the equation of a hyperplane (with p and i fixed). Suppose that y_ph is two-dimensional and x_pi is one-dimensional, so that

x_pi = W_i1 y_p1 + W_i2 y_p2 − θ_i.

Then, as shown in FIG. 4, x_pi is a plane that contains the straight line

x_pi = W_i1 y_p1 + W_i2 y_p2 − θ_i = 0

and that intersects the coordinate axis (x_pi) at −θ_i.

The position of the straight line x_pi = 0 depends only on the ratio between W_ih and θ_i and is not affected by changes in the absolute value of x_pi. What depends on this absolute value is the slope of the hyperplane.
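The ratio property can be checked numerically. The sketch below uses the two-input form x_pi = W_i1 y_p1 + W_i2 y_p2 − θ_i given above and compares a set of weights with the same set multiplied by a common factor. The function names and the numeric values are illustrative assumptions.

```python
def net_input(w1, w2, theta, y1, y2):
    """x_pi = W_i1*y_p1 + W_i2*y_p2 - theta_i for a two-input unit."""
    return w1 * y1 + w2 * y2 - theta

def boundary_y2(w1, w2, theta, y1):
    """y_p2 on the line x_pi = 0 for a given y_p1 (requires w2 != 0)."""
    return (theta - w1 * y1) / w2

base = (1.0, 2.0, 0.5)                 # (W_i1, W_i2, theta_i), illustrative values
scaled = tuple(4.0 * v for v in base)  # same ratios, four times the magnitude

samples = (0.0, 0.3, 1.0)
line_base = [boundary_y2(*base, y1=y1) for y1 in samples]
line_scaled = [boundary_y2(*scaled, y1=y1) for y1 in samples]
x_base = net_input(*base, y1=1.0, y2=1.0)
x_scaled = net_input(*scaled, y1=1.0, y2=1.0)
```

The two boundary lines coincide, while the value of x_pi at a point off the line grows by the common factor, which corresponds to a steeper plane over the same line.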

Accordingly, FIG. 5 shows, in place of FIG. 4, the case where a step function is used as the threshold function between x_pi and y_pi. In FIG. 5 as well, the result depends only on the ratio between the weight W_ih for a given input and the threshold θ_i. If the threshold function is an S-shaped function obtained by blunting the step function, the positional relationship depends on the ratio between W_ih and θ_i, and the absolute value |y_pi| reflects the effect of the blunting.

Therefore, when the hierarchical network is used for discrimination, what determines the boundary line is the ratio between the weights, while their absolute values contribute to the fuzziness of the boundary. Consequently, if two weights are taken as two-dimensional coordinate axes and the value of the evaluation function E is plotted for each point on that two-dimensional plane, the resulting three-dimensional figure has the shape of a canyon or cone spreading radially from the origin of the coordinates.

FIG. 6 shows an example of the figure of the evaluation function E corresponding to the vectors W_c and W_x. It shows the value of the evaluation function E at the points given by aW_c + bW_x and forms a three-dimensional figure.

FIG. 7 shows the display figure obtained by taking the plane corresponding to cutting plane selection method (part 1) through the three-dimensional figure of FIG. 6 and representing the value of the evaluation function E by contour lines. The weights newly generated by learning are plotted on this display figure, and the progress of learning is monitored. With cutting plane selection method (part 1), one sees a longitudinal section of the "canyon or cone spreading radially from the origin of the coordinates" described above, and the search for the optimum weights lies almost entirely on that cutting plane. In the case shown in FIG. 7, learning should preferably advance along the floor of the canyon that extends toward the upper right of the figure.

For a figure displayed on the plane corresponding to cutting plane selection method (part 2), one sees a cross section of the canyon or cone. The newly generated weights are plotted on this display figure in the same way.

[Effects of the invention]

As described above, the present invention makes it possible to see the progress of learning visually, so that an operation trainee can observe whether the values of the coefficients ε and/or α that the trainee has supplied are good or bad.

[Brief description of the drawings]

FIG. 1 is a diagram showing the principle configuration of the present invention. FIGS. 2 and 3 are explanatory diagrams of the cutting plane selection methods. FIGS. 4 and 5 are explanatory diagrams of the calculation in a unit. FIG. 6 is a figure representing the evaluation function E. FIG. 7 is a display figure drawn on the plane corresponding to a cutting plane selection method. FIG. 8 is a configuration diagram of the basic unit. FIG. 9 is a configuration diagram of the hierarchical network. FIG. 10 is an explanatory diagram of the back-propagation method. FIG. 11 is a flowchart of the learning process. FIG. 12 is a diagram showing how the weights change with learning.

In the figures, 1 is a basic unit, 2 is an accumulation processing unit, 3 is a threshold processing unit, 10 is a hierarchical network, 20 is a learning pattern holding unit, 30 is weight learning means, 50 is an evaluation function distribution display unit, 60 is a weight search path display unit, and 70 is a display device.

Continuation of front page

(72) Inventor: Nobuo Watanabe, c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Takashi Kimoto, c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

(57) [Claims]

In a data processing device that takes as its basic element a basic unit (1) which receives one or more inputs from a preceding layer together with connection weights (12) to be multiplied with those inputs, obtains their sum of products, and converts the resulting sum-of-products value with a threshold function to obtain a final output; that has a plurality of the basic units (1-h) as an input layer, one or more intermediate layers each made up of a plurality of the basic units (1-i), and one or more of the basic units (1-j) as an output layer; and that forms internal connections between the input layer and the first intermediate layer, between the intermediate layers, and between the last intermediate layer and the output layer, and sets the weights for those internal connections so as to constitute a hierarchical network (10), a learning process monitoring device for the network configuration data processing device, comprising:

output signal deriving means (13) for supplying an input signal (21) to the plurality of basic units (1-h) of the input layer and thereby obtaining, from the basic units (1-j) of the output layer, an output signal corresponding to the input signal;

error value calculating means (33) for calculating an error value representing the magnitude of the mismatch between two signals, using the output signal of each layer unit and a teacher signal (22) that is held in a learning pattern holding unit (20) and indicates the value the output signal should take;

weight learning means (30) for sequentially updating the connection weights (12) from their initial values in accordance with a weight update amount calculated from the sum of the error values calculated by the error value calculating means (33), so as to obtain weight values for which the sum of the error values falls within a predetermined allowable range, the weight learning means (30) determining the weight update amount for the current update cycle while taking into account a data factor relating to the weight update amount determined in the previous update cycle;

an evaluation function distribution display unit (50) that receives information on the structure of the hierarchical network (10) and information on the learning patterns held in the learning pattern holding unit (20); that, for an evaluation function whose variable is the error value obtained from the input signal (21) and the teacher signal (22) when the connection weights (12) are changed, obtains the distribution of the evaluation function with the connection weights (12) as coordinate axes; and that specifies an arbitrary small number of connection weights having a large influence and projects the evaluation function onto a cutting plane whose coordinate axes are the specified connection weights, thereby obtaining a display image of the distribution of the evaluation function;

a weight search path display unit (60) that receives the changes in the connection weights (12) while the weight learning means (30) proceeds with learning, and plots those changes on the display image of the distribution of the evaluation function; and

a display device (70) that visually displays the display image of the evaluation function distribution obtained by the evaluation function distribution display unit (50) together with the plot obtained by the weight search path display unit (60);

characterized in that it is made possible to observe whether the values set for the coefficients (ε and/or α) that affect the changes in the connection weights (12) made by the weight learning means (30) are appropriate.