JP3389684B2

JP3389684B2 - Learning method of neural network

Info

Publication number: JP3389684B2
Application number: JP16047094A
Authority: JP
Inventors: 志津夫永島; 嘉宏松浦
Original assignee: Meidensha Corp
Current assignee: Meidensha Corp
Priority date: 1994-07-13
Filing date: 1994-07-13
Publication date: 2003-03-24
Anticipated expiration: 2018-03-24
Also published as: JPH0830573A

Description

[Detailed Description of the Invention]

[0001]
[Industrial Field of Application] The present invention relates to hierarchical neural networks, and in particular to an output layer configuration for pattern recognition and to learning by backpropagation.

[0002]
[Prior Art] A hierarchical neural network is used, for example, in a discrete word recognition system that converts speech data into a phoneme sequence and then uses the result to identify words by DTW (dynamic programming; the DP matching method). FIG. 4 shows an example of a conventional hierarchical neural network.

[0003] The units (neurons) are separated into three independent layers: an input layer S, an intermediate layer A, and an output layer R. The input layer S consists of many neuron models. It takes the numerical information of the input pattern to be identified as the input to each unit and converts it into neural signals.

[0004] The intermediate layer A consists of many neuron models and is connected to all units of the input layer S through connection weights W.

[0005] The output layer R consists of as many neuron models as there are categories to be identified, in one-to-one correspondence, and is connected to all units of the intermediate layer A through connection weights.

[0006] In this configuration, so that the output pattern gives the correct answer for the input pattern, pattern conversion processing is performed that changes each connection weight of the network until the output pattern obtained for a given input pattern matches the teacher signal (expected output).

[0007] This pattern conversion processing uses backpropagation learning: the error between the output pattern and the teacher signal is obtained, the error is propagated backward from the output layer toward the input layer, the weights between units are adjusted by an amount corresponding to this error, and the input pattern is presented again to obtain the error between the output pattern and the teacher signal.

[0008]
[Problems to Be Solved by the Invention] When a conventional neural network is applied to practical problems, it is often unreasonable to have a single unit represent a category, for example when the boundary between the data of different categories is ambiguous or when the features of the data within a category vary widely. As a result, the recognition rate falls as the recognition target becomes more complex, as in phoneme identification.

[0009] An object of the present invention is to provide a neural network learning method that improves the ability to recognize complex recognition targets.

[0010]
[Means for Solving the Problems] To solve the above problem, the present invention is a learning method for a hierarchical neural network that has an input layer, an intermediate layer, and an output layer, in which the output layer is provided with a plurality of units for each recognition category and the layers are connected through connection weights stored in a storage device. The network is trained by determining, according to the backpropagation rule, correction amounts for the connection weights from the connection weights and teacher data, and by rewriting the connection weights stored in the storage device. The method is characterized by comprising: a step of checking whether the teacher data to be set for a recognition category X is 1; a step of setting the teacher data given to the units of recognition category X to 0 if the teacher data is not 1 in that check; a step of checking, among the units of a recognition category whose teacher data is 1, whether each unit is the unit with the maximum output; a step of giving each unit that does not have the maximum output its own output as teacher data; and a step of setting the teacher data to 1 for the unit with the maximum output.

[0011]
[Operation] A multiplexed output layer configuration is used in which a plurality of units is provided for each category to be recognized. In backpropagation learning, the teacher data is given as follows. Every unit of a category that is not the correct answer for the input pattern is given teacher data 0. Among the units of the category that is the correct answer, only the unit with the maximum value is given the teacher data (1), and each of the other units is given its own output as teacher data. This enables learning that makes the boundaries between categories clear.

[0012]
[Embodiment] FIG. 1 is a hierarchical network configuration diagram showing one embodiment of the present invention. The network of this embodiment differs from a conventional hierarchical neural network in the output layer: in the conventional configuration each output unit corresponds one-to-one to a category to be recognized, whereas in this embodiment a plurality of units (three in the figure) is provided for each of the categories A, B, and C.

[0013] As in the conventional approach, the category to which the unit with the largest output value belongs is taken as the recognition result.

[0014] In this configuration the neural network is basically trained by backpropagation, but for the multiplexed output units the error calculation follows the algorithm shown in FIG. 2, so that the units readily differentiate in what they represent.

[0015] In FIG. 2, the broken-line block shows the teacher data modification processing for each of the categories A, B, and C. It is checked whether the teacher data to be set for a category X is 1 (step S1). If it is not 1, the teacher data given to the units of category X is set to 0 (step S2).

[0016] For a category whose teacher data is 1, each unit of the category is checked to see whether it is the unit with the maximum output (step S3). A unit that does not have the maximum output is given its own output as teacher data (step S4); its error is therefore zero, that is, no learning is performed for it. The unit with the maximum output in the category is given teacher data 1 (step S5).

[0017] This teacher data modification is performed for all of the categories A, B, and C. FIG. 3 shows an example of the modification, in which the output values of categories A, B, and C are as illustrated and the teacher data is 1 for category A and 0 for the other categories B and C.

[0018] Because the teacher data for categories B and C is 0, all units of these categories are given teacher data 0 through steps S1 and S2.

[0019] Because the teacher data for category A is 1, the two of its three units that do not have the maximum output are given their own outputs, 0.1 and 0.5, as teacher data (steps S3 and S4), and the unit with the maximum output is given teacher data 1 (steps S3 and S5).

[0020] Returning to FIG. 2, once the teacher data for every unit of every category has been modified, learning is performed with this teacher data by the ordinary backpropagation learning algorithm (step S6).

[0021] Thus, according to this embodiment, each category is represented by a plurality of units. During learning, all teacher data for the units of categories other than the category of the input is set to 0. In the correct category, the teacher data of the unit with the maximum value is set to 1, and the other units use their own output values as teacher data.

[0022] As a result, even for data that belongs to one category but lies near an ambiguous boundary with other categories, which has conventionally been difficult to identify, the multiplexed output units allow learning to proceed without strain, and good pattern identification can be expected.

[0023] As an experimental example based on this embodiment, a neural network was used to identify phonemes from speech data, and the phoneme identification results were used to identify words by DTW. Both phoneme identification ability and word identification ability improved.

[0024] The neural network used in this experiment had an output layer of 23 units (23 × 2 units for the multiplexed output). Table 1 below compares its recognition results with those of the conventional type.

[0025]
[Table 1]

[0026] The table also shows that the multiplexed output type according to the present invention gives the correct result for the phoneme data of category 14, which the conventional neural network misrecognizes.

[0027]
[Effects of the Invention] As described above, the present invention uses a multiplexed output layer configuration in which a plurality of units is provided for each category to be recognized. In backpropagation learning, every unit of a category that is not the correct answer for the input pattern is given teacher data 0; among the units of the category that is the correct answer, only the unit with the maximum value is given the teacher data (1), and each of the other units is given its own output as teacher data. This enables learning that makes the boundaries between categories clear, and gives good pattern identification even for complex identification targets.
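
The teacher data modification of FIG. 2 (steps S1 to S5) can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `modify_teacher_data`, the dictionary layout, and the tie-breaking rule (first unit wins) are assumptions. In the example values, only the outputs 0.1 and 0.5 of category A come from the description of FIG. 3; the maximum output 0.8 and all outputs of categories B and C are hypothetical.

```python
def modify_teacher_data(outputs, category_targets):
    """Return per-unit teacher data for a multiplexed output layer.

    outputs:          dict mapping category name -> list of unit outputs
    category_targets: dict mapping category name -> 0 or 1 (original teacher data)
    """
    modified = {}
    for category, unit_outputs in outputs.items():
        # Step S1: is the teacher data for this category 1?
        if category_targets[category] != 1:
            # Step S2: every unit of an incorrect category gets teacher data 0.
            modified[category] = [0.0] * len(unit_outputs)
            continue
        # Step S3: find the unit with the maximum output in this category.
        # Assumption: on a tie, the first such unit is chosen.
        winner = max(range(len(unit_outputs)), key=unit_outputs.__getitem__)
        # Step S5: the winner gets teacher data 1.
        # Step S4: every other unit gets its own output (zero error).
        modified[category] = [
            1.0 if i == winner else y for i, y in enumerate(unit_outputs)
        ]
    return modified


# Hypothetical outputs in the spirit of FIG. 3 (category A is the correct one).
example_outputs = {
    "A": [0.1, 0.8, 0.5],  # 0.1 and 0.5 are from the text; 0.8 is assumed
    "B": [0.3, 0.2, 0.6],  # assumed
    "C": [0.1, 0.4, 0.2],  # assumed
}
example_targets = {"A": 1, "B": 0, "C": 0}

print(modify_teacher_data(example_outputs, example_targets))
# {'A': [0.1, 1.0, 0.5], 'B': [0.0, 0.0, 0.0], 'C': [0.0, 0.0, 0.0]}
```

With this teacher data, only the winning unit of category A and the units of categories B and C produce a nonzero error for the subsequent backpropagation step (step S6).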

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a configuration diagram of a neural network showing one embodiment of the present invention.
FIG. 2 shows the error calculation algorithm of the embodiment.
FIG. 3 shows an example of how teacher data is given in the embodiment.
FIG. 4 is a configuration diagram of a conventional hierarchical neural network.
Explanation of reference symbols
S: input layer
A: intermediate layer
R: output layer
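
For the multiplexed output layer of FIG. 1, the recognition rule stated in the description (the category of the unit with the largest output is the result) can be sketched as below. The function name `recognize` and the flat layout of the output vector, in which the units of one category are stored contiguously, are assumptions made for illustration; the patent does not specify how units are ordered.

```python
def recognize(flat_outputs, units_per_category):
    """Return the 0-based index of the recognized category.

    flat_outputs:       outputs of all output-layer units, with the units of
                        each category stored next to each other (assumed layout)
    units_per_category: number of units provided per category
    """
    if units_per_category <= 0 or len(flat_outputs) % units_per_category != 0:
        raise ValueError("output count must be a multiple of units_per_category")
    # Index of the unit with the largest output over the whole output layer.
    winner = max(range(len(flat_outputs)), key=flat_outputs.__getitem__)
    # The category that owns this unit is the recognition result.
    return winner // units_per_category


# Three categories (A, B, C) with three units each; the values are hypothetical.
outputs = [0.1, 0.8, 0.5,  0.3, 0.2, 0.6,  0.1, 0.4, 0.2]
print(recognize(outputs, 3))  # 0, i.e. category A
```

The same function covers the 23 × 2 layout of the experiment by passing 46 outputs and `units_per_category=2`.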

Continuation of front page
(58) Fields searched (Int.Cl.⁷, DB name): G06N 1/00 - 7/08; G06G 7/60; G06K 9/62 - 9/72; G06T 1/40; G06T 7/00 - 7/60; G10L 15/00 - 17/00; JST file (JOIS); CSDB (Japan Patent Office); INSPEC (DIALOG)

Claims

(57) [Claims]
[Claim 1] A learning method for a hierarchical neural network having an input layer, an intermediate layer, and an output layer, the output layer being provided with a plurality of units for each recognition category and the layers being connected through connection weights stored in a storage device, in which the neural network is trained by determining, according to the backpropagation rule, correction amounts for the connection weights from the connection weights and teacher data and by rewriting the connection weights stored in the storage device, the method being characterized by comprising:
a step of checking whether the teacher data to be set for a recognition category X is 1;
a step of setting the teacher data given to the units of the recognition category X to 0 if the teacher data is not 1 in the check;
a step of checking, among the units of a recognition category whose teacher data is 1, whether each unit is the unit with the maximum output;
a step of giving each unit that does not have the maximum output its own output as teacher data; and
a step of setting the teacher data to 1 for the unit with the maximum output.
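
As a rough illustration of how the claimed steps fit into one weight update, the sketch below runs a single backpropagation step on a small three-layer network using the modified teacher data. Everything not stated in the patent is an assumption: sigmoid units, a squared-error criterion, the absence of bias terms, the learning rate `lr`, the contiguous unit layout, and all function names. The weight lists stand in for the connection weights held in the storage device and are rewritten in place.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_output):
    """Forward pass: input -> intermediate layer -> output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]
    return hidden, output


def modified_targets(output, correct_category, units_per_category):
    """Teacher data: 0 for wrong categories, 1 for the winner, own output otherwise."""
    targets = [0.0] * len(output)
    start = correct_category * units_per_category
    block = range(start, start + units_per_category)
    winner = max(block, key=output.__getitem__)
    for k in block:
        targets[k] = 1.0 if k == winner else output[k]
    return targets


def train_step(x, correct_category, w_hidden, w_output, units_per_category, lr=0.5):
    """One backpropagation update; rewrites w_hidden and w_output in place."""
    hidden, output = forward(x, w_hidden, w_output)
    targets = modified_targets(output, correct_category, units_per_category)
    # Output-layer deltas for squared error with sigmoid units (assumed).
    delta_out = [(t - y) * y * (1.0 - y) for t, y in zip(targets, output)]
    # Hidden-layer deltas, propagated back through the not-yet-updated weights.
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_output[k][j] for k in range(len(output)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_output):
        for j in range(len(row)):
            row[j] += lr * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += lr * delta_hid[j] * x[i]
    return output, targets


# Hypothetical network: 4 inputs, 5 intermediate units, 3 categories x 2 units.
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
w_output = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(6)]
output, targets = train_step([0.2, 0.7, 0.1, 0.9], 0, w_hidden, w_output, 2)
print(targets)
```

Because a non-winning unit of the correct category receives its own output as teacher data, its delta is exactly zero and the weights feeding it are left unchanged by this step, while the winning unit is pulled toward 1 and the units of the other categories toward 0.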