JPH04299443A

JPH04299443A - Neural network learning method

Info

Publication number: JPH04299443A
Application number: JP8969591A
Authority: JP
Inventors: Yukinori Kakazu; 嘉数　侑昇; Takao Yoneda; 米田　孝夫; Moriaki Sakakura; 坂倉　守昭
Original assignee: Toyoda Koki KK
Current assignee: Toyoda Koki KK
Priority date: 1991-03-27
Filing date: 1991-03-27
Publication date: 1992-10-22

Abstract

PURPOSE: To perform effective update learning of a neural network without storing past teacher data. CONSTITUTION: When a trained neural network is to be updated by further training, data equivalent to the past teacher data are generated as follows instead of storing the past teacher data. Because the input/output characteristics of the current neural network were formed by past training, the range of input data over which the network is used is sampled to generate an arbitrary number of input data D, and the output data H that the network produces for the data D are obtained. This yields sampled input/output characteristics of the network. The neural network is then trained using input data (D, A), which include the input data A to be newly learned, and teacher data (H, B), which include the teacher data B to be newly learned.

Description

[Detailed Description of the Invention]

[0001] [Industrial Field of Application] The present invention relates to a learning method for neural networks, and more specifically to a method for further training a neural network that has already been trained.

[0002] [Prior Art] A neural network is known as a circuit network that directly realizes, through the learning effect of its coupling coefficients, causal relationships that are difficult to analyze theoretically. That is, the coupling coefficients are learned in advance so that the optimal output is obtained for each of a plurality of discrete inputs, and the resulting network directly yields a reasonable output for an arbitrary input.

[0003] [Problem to be Solved by the Invention] However, it is difficult to train a neural network in advance so that it can handle every conceivable event. Rather, the usual practice is to put a trained network into actual use and, when a new event occurs for which an appropriate output is not obtained, to carry out training that includes that event.

[0004] When such additional training becomes necessary, however, the network cannot be trained using only the new input data and the corresponding teacher data, because the new training would erase what the network has already learned. The network must therefore be trained on data that include all of the input data used in earlier training and all of the corresponding teacher data.

[0005] Consequently, all input data and teacher data used in past training must be retained, and the required storage capacity becomes enormous as the neural network remains in use over a long period. In addition, because the volume of input data and teacher data used for training becomes enormous, the training time becomes long.

[0006] The present invention has been made to solve the above problems, and its object is to train a neural network efficiently with a small storage capacity.

[0007] [Means for Solving the Problem] The configuration of the invention for solving the above problems is a method for further training a trained neural network using new input data and the corresponding teacher data, comprising: a step of inputting a plurality of sets of arbitrary input data to the trained neural network to obtain a plurality of sets of output data corresponding to that input data; and a step of training the neural network using, as input data, the plurality of sets of data formed by adding the new input data to be learned to the plurality of sets of input data and, as teacher data, the plurality of sets of data formed by adding the teacher data to be newly learned to the plurality of sets of output data.

[0008] [Operation] When it becomes necessary to further train an already trained neural network, a plurality of arbitrary, discrete sets of input data that represent events as broadly as possible are prepared. The trained neural network is run on these sets of input data, and the corresponding sets of output data are obtained.

[0009] Next, the input data that now need to be learned are added to these sets of input data to prepare the sets of input data for training. Likewise, the teacher data corresponding to the input data that need to be learned are added to the output data of the trained neural network to prepare the sets of teacher data. The neural network is then trained on these sets of input data and teacher data.

[0010] [Effects of the Invention] Thus, in the present invention, the input data and teacher data used in past training are neither stored nor used directly. The input/output characteristics of the trained neural network were formed by the past training. By applying suitable discrete sets of input data to the neural network and obtaining the resulting sets of output data, the sampled input/output characteristics of the trained neural network can be obtained.
Training on these sampled input/output characteristics together with the input/output characteristics to be newly added makes effective update learning possible without erasing what was learned in the past.

[0011] [Embodiment] 1. Neural network

As shown in FIG. 1, the neural network 10 of this embodiment has a three-layer structure consisting of an input layer LI, an intermediate layer LM, and an output layer LO. The input layer LI has e input elements, the output layer LO has g output elements, and the intermediate layer LM has f output elements. A multilayer neural network is generally defined as a device that performs the following computation. The output $O_{ij}$ of the j-th element of the i-th layer is computed by the following equations, where $i \geq 2$.

[0012]
$$O_{ij} = f(I_{ij}) \tag{1}$$
$$I_{ij} = \sum_{k} W_{i-1\,k,\,ij}\, O_{i-1\,k} + V_{ij} \tag{2}$$
$$f(x) = \frac{1}{1 + \exp(-x)} \tag{3}$$

[0013] Here, $V_{ij}$ is the bias of the j-th computing element of the i-th layer, $W_{i-1\,k,\,ij}$ is the coupling coefficient between the k-th element of layer i-1 and the j-th element of layer i, and $O_{1j}$ is the output value of the j-th element of the first layer. Because the first layer outputs its input unchanged without performing any computation, $O_{1j}$ is also the input value of the j-th element of the input layer (first layer). Equations (2), (4), and (6) are not legible in this text; the forms shown are those implied by these definitions and by the sum-of-products description of steps 100 and 104.

[0014] Next, the concrete computation procedure of the three-layer neural network 10 shown in FIG. 1 is described with reference to FIG. 2. In step 100, the j-th element of the intermediate layer (second layer) receives the output values $O_{1k}$ of the elements of the input layer (first layer), that is, the input data of the first layer, and performs the following sum-of-products computation, which is equation (2) made concrete with the layer numbers and the number of elements e in the first layer.
$$I_{2j} = \sum_{k=1}^{e} W_{1k,\,2j}\, O_{1k} + V_{2j} \tag{4}$$

[0015] Next, in step 102, the output of each element of the intermediate layer (second layer) is computed by applying the sigmoid function to the sum-of-products value of equation (4). The output value of the j-th element of the second layer is given by the following equation.

[0016]
$$O_{2j} = f(I_{2j}) = \frac{1}{1 + \exp(-I_{2j})} \tag{5}$$
This output value $O_{2j}$ becomes an input value of each element of the output layer (third layer). Next, in step 104, the sum-of-products computation is performed on the input values of each element of the output layer (third layer).

[0017]
$$I_{3j} = \sum_{k=1}^{f} W_{2k,\,3j}\, O_{2k} + V_{3j} \tag{6}$$
Next, in step 106, the output value $O_{3j}$ of each element of the output layer is computed with the sigmoid function in the same way as in equation (5).

[0018]
$$O_{3j} = f(I_{3j}) = \frac{1}{1 + \exp(-I_{3j})} \tag{7}$$
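The forward computation of steps 100 to 106 can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the publication: the function names `sigmoid`, `layer_forward`, and `forward`, and the nested-list layout `weights[k][j]` for the coupling coefficient from element k of one layer to element j of the next, are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Equation (3): f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: sum of products plus bias, then the sigmoid.

    weights[k][j] (assumed layout) couples element k of the previous
    layer to element j of this layer; biases[j] is the bias V of element j.
    """
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(inputs[k] * weights[k][j] for k in range(len(inputs)))
        outputs.append(sigmoid(total))
    return outputs


def forward(x, w12, v2, w23, v3):
    """Three-layer forward pass; the input layer passes x through unchanged."""
    o2 = layer_forward(x, w12, v2)    # steps 100 and 102: intermediate layer
    o3 = layer_forward(o2, w23, v3)   # steps 104 and 106: output layer
    return o2, o3
```

With all coupling coefficients and biases set to zero, every sum of products is zero, so every element outputs f(0) = 0.5 regardless of the input.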
[0019] 2. Creation of the trained neural network (initial learning)

As initial learning, this neural network is trained by the procedure shown in FIG. 3. The coupling coefficients are learned by the well-known backpropagation method. The training is executed repeatedly over a large number of input data covering various events so that each output approaches its optimal teacher data.

[0020] In step 200 of FIG. 3, the learning signal of each element of the output layer is computed by the following equation.
$$Y_{3j} = (T_j - \delta_j) \cdot f'(I_{3j}) \tag{8}$$
Here, $T_j$ is the teacher data for the output $\delta_j$ and is supplied from outside, and $f'(x)$ is the derivative of the sigmoid function.

[0021] Next, in step 202, the learning signal Y of the intermediate layer is computed. Equation (9) is not legible in this text; the standard backpropagation form consistent with equation (8) is
$$Y_{2j} = f'(I_{2j}) \sum_{k=1}^{g} W_{2j,\,3k}\, Y_{3k} \tag{9}$$

[0022] Next, in step 204, each coupling coefficient of the output layer is corrected. The correction amount is given by the following equation.
$$\Delta\omega_{2i,\,3j}(t) = P \cdot Y_{3j} \cdot f(I_{2i}) + Q \cdot \Delta\omega_{2i,\,3j}(t-1) \tag{10}$$
Here, $\Delta\omega_{2i,\,3j}(t)$ is the change, at the t-th computation, of the coupling coefficient between the j-th element of the output layer and the i-th element of the intermediate layer, $\Delta\omega_{2i,\,3j}(t-1)$ is the previous correction amount of that coupling coefficient, and P and Q are proportionality constants. The corrected coupling coefficient is therefore obtained by

[0023]
$$W_{2i,\,3j} + \Delta\omega_{2i,\,3j}(t) \rightarrow W_{2i,\,3j} \tag{11}$$
Next, the process moves to step 206, where the coupling coefficients of each element of the intermediate layer are corrected. As for the output layer, the correction amount is given by the following equation.

[0024]
$$\Delta\omega_{1i,\,2j}(t) = P \cdot Y_{2j} \cdot f(I_{1i}) + Q \cdot \Delta\omega_{1i,\,2j}(t-1) \tag{12}$$
The corrected coupling coefficient is therefore obtained by
$$W_{1i,\,2j} + \Delta\omega_{1i,\,2j}(t) \rightarrow W_{1i,\,2j} \tag{13}$$

[0025] Next, in step 208, it is determined whether one round of learning has been completed for all of the input data to be learned. If learning has not been completed for all input data, the process moves to step 210, where the next input data and its corresponding teacher data are set as the learning target data. The process then returns to step 200, and learning is performed on that next input data. When one round of learning has been completed for all input data in this way, the determination in step 208 becomes YES and the process moves to step 212.

[0026] In step 212, whether the coupling coefficients have converged is determined by checking whether the correction amount Δω of the coupling coefficients has fallen to or below a predetermined value. If the coupling coefficients have not converged, the process moves to step 214, where the first input data and its corresponding teacher data are set as the learning target data in order to perform the second round of learning over all input data. The process then returns to step 200, and the learning computation described above is repeated until, in step 212, the correction amount Δω falls to or below the predetermined value and the coupling coefficients converge. The result is a neural network that has received its initial training on an initial, wide range of events.

[0027] 3. Update learning of the neural network

Next, the initially trained neural network is put to use on various actual input data. When, in the course of that use, a case arises in which an appropriate output is not obtained, additional update learning is performed. The present invention is characterized by the method of performing this additional update learning. In step 300 of FIG. 4, input data D1, ..., Dn are prepared by sampling discretely, arbitrarily, and broadly over the range in which the network has been used so far. These input data are defined as follows. The e values given to the e input elements are regarded as one set of data. An arbitrary p-th set of input data is denoted $D_p$, and the input data for the j-th input element belonging to that set is denoted $d_{pj}$. $D_p$ is a vector and $d_{pj}$ is a component of that vector. That is, $D_p$ is defined by the following equation.

[0028]
$$D_p = (d_{p1},\ d_{p2},\ \ldots,\ d_{p\,e-1},\ d_{pe}) \tag{14}$$
The n sets of input data are denoted D1, D2, ..., Dn-1, Dn. Hereinafter, the whole group of n sets of input data is referred to as input data group D.

[0029] Next, in step 302, the input data D1, ..., Dn are input to the neural network to obtain the corresponding output data H1, ..., Hn, as shown in FIG. 5(a). These output data are defined as follows. For the output layer LO, the output data produced by the g output elements are regarded as one set of data. An arbitrary p-th set of output data is denoted $H_p$, and the output data of the j-th output element belonging to that set is denoted $h_{pj}$. $H_p$ is a vector and $h_{pj}$ is a component of that vector. That is, $H_p$ is defined by the following equation.

[0030]
$$H_p = (h_{p1},\ h_{p2},\ \ldots,\ h_{p\,g-1},\ h_{pg}) \tag{15}$$
The n sets of output data are denoted H1, H2, ..., Hn-1, Hn. Hereinafter, the whole group of n sets of output data is referred to as output data group H.

[0031] Next, in step 304, as shown in FIG. 5(a), m sets of input data concerning the events that newly need to be taught, and the m sets of teacher data corresponding to those input data, are prepared. These input data are likewise represented by vectors $A_p$ with components $a_{pj}$. That is, $A_p$ is defined by the following equation.
$$A_p = (a_{p1},\ a_{p2},\ \ldots,\ a_{p\,e-1},\ a_{pe}) \tag{16}$$
With the number of sets of this input data denoted m, the m sets of input data concerning the new events that need to be taught are denoted A1, A2, ..., Am. Hereinafter, this group of m sets of input data is referred to as input data group A.

[0032] The set of teacher data corresponding to an arbitrary input data $A_p$ is likewise represented by a vector $B_p$ with components $b_{pj}$. That is, $B_p$ is defined by the following equation.
$$B_p = (b_{p1},\ b_{p2},\ \ldots,\ b_{p\,g-1},\ b_{pg}) \tag{17}$$
The m sets of teacher data corresponding to the m sets of input data A1, A2, ..., Am are denoted B1, B2, ..., Bm. Hereinafter, the m sets of teacher data are referred to as teacher data group B.

[0033] Next, in step 306, as shown in FIG. 5(b), the (n+m) sets of data (D1, ..., Dn, A1, ..., Am), formed by combining the input data group D obtained by sampling with the input data group A to be newly incorporated into the learning, are taken as the input data, and the (n+m) sets of data (H1, ..., Hn, B1, ..., Bm), formed by combining the output data group H with the teacher data group B to be newly incorporated into the learning, are taken as the teacher data. Learning is then performed on these input data and teacher data according to the flowchart of FIG. 3, in the same way as the initial learning described above.

[0034] First learning cycle. If learning proceeds in the order of input data group D and then input data group A, then in the first learning cycle the output data of the neural network for the input data (D1, ..., Dn) are (H1, ..., Hn). In equation (8), the difference between the teacher data and the output data (in component form, $T_j - \delta_j$) is therefore zero, so all learning signals are zero. Consequently, learning on data group D produces no change in the coupling coefficients. Next, learning on input data group A is performed. Here the output data of the neural network for the input data (A1, ..., Am) are not equal to the teacher data (B1, ..., Bm). That is, in equation (8), for an arbitrary input data $A_p$ the difference between the teacher data and the output data (in component form, $T_j - \delta_j$) is not zero. A learning signal is therefore generated from this difference, learning is performed, and the coupling coefficients are changed.

[0035] Second learning cycle. In the second learning cycle, the coupling coefficients of the neural network have been changed by the learning on input data group A, so in the learning on the input data group (D1, ..., Dn) the output data of the neural network for an arbitrary input data $D_p$ are no longer equal to the teacher data $H_p$. That is, in equation (8), the difference between the teacher data and the output data (in component form, $T_j - \delta_j$) is not zero, so a learning signal Y is generated and the coupling coefficients are changed. In this way, from the second learning cycle onward, learning is also performed on the input data group (D1, ..., Dn) created by sampling.

[0036] Many learning cycles. The above learning cycle is repeated many times.
As a result, the neural network is trained on the combined input data group (D1, ..., Dn, A1, ..., Am) and the combined teacher data group (H1, ..., Hn, B1, ..., Bm). A neural network can therefore be formed that handles both the events learned in the past (the events enveloped by the input data group D1, ..., Dn) and the newly learned events (the events enveloped by the input data group A1, ..., Am).

[0037] When further update learning of the neural network becomes necessary, it suffices to take the latest neural network obtained above as the target, create a data group D from it by sampling, add the data group A to be newly learned, and perform the update learning shown in FIG. 4. In this way, update learning can be performed again and again without storing the pairs of input data and teacher data used in past learning. The number m of input data to be added may be one.
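The update learning of FIG. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patentee's implementation. The names `Net`, `train_one`, and `update_learning` are hypothetical. Uniform random sampling over a range `[low, high]` is just one reading of "arbitrary, discrete, broad" sampling. A fixed number of cycles stands in for the convergence test of step 212. The biases are left unchanged because the text gives no update rule for them. The intermediate-layer learning signal uses the standard backpropagation form, and the input-layer factor in the correction is taken to be the raw input value.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Net:
    """Hypothetical three-layer network: e inputs, f intermediate, g outputs."""

    def __init__(self, e, f, g, rng):
        self.w12 = [[rng.uniform(-1, 1) for _ in range(f)] for _ in range(e)]
        self.v2 = [rng.uniform(-1, 1) for _ in range(f)]
        self.w23 = [[rng.uniform(-1, 1) for _ in range(g)] for _ in range(f)]
        self.v3 = [rng.uniform(-1, 1) for _ in range(g)]
        # Previous correction amounts, used by the Q term of (10) and (12).
        self.d12 = [[0.0] * f for _ in range(e)]
        self.d23 = [[0.0] * g for _ in range(f)]

    def forward(self, x):
        o2 = [sigmoid(self.v2[j] + sum(x[k] * self.w12[k][j] for k in range(len(x))))
              for j in range(len(self.v2))]
        o3 = [sigmoid(self.v3[j] + sum(o2[k] * self.w23[k][j] for k in range(len(o2))))
              for j in range(len(self.v3))]
        return o2, o3

    def train_one(self, x, t, P=0.5, Q=0.5):
        """One pass of steps 200 to 206 for a single (input, teacher) pair."""
        o2, o3 = self.forward(x)
        # Output-layer learning signal, equation (8); f'(I) = O * (1 - O).
        y3 = [(t[j] - o3[j]) * o3[j] * (1.0 - o3[j]) for j in range(len(o3))]
        # Intermediate-layer learning signal (assumed standard form).
        y2 = [o2[i] * (1.0 - o2[i]) * sum(self.w23[i][j] * y3[j] for j in range(len(y3)))
              for i in range(len(o2))]
        for i in range(len(o2)):          # equations (10) and (11)
            for j in range(len(y3)):
                self.d23[i][j] = P * y3[j] * o2[i] + Q * self.d23[i][j]
                self.w23[i][j] += self.d23[i][j]
        for i in range(len(x)):           # equations (12) and (13)
            for j in range(len(y2)):
                self.d12[i][j] = P * y2[j] * x[i] + Q * self.d12[i][j]
                self.w12[i][j] += self.d12[i][j]


def update_learning(net, new_inputs, new_targets, n_samples, low, high, cycles, rng):
    """Steps 300 to 306: rehearse sampled behavior while learning (A, B)."""
    e = len(net.w12)
    # Step 300: sample n input vectors D over the assumed usage range.
    D = [[rng.uniform(low, high) for _ in range(e)] for _ in range(n_samples)]
    # Step 302: record what the current network outputs for D.
    H = [net.forward(d)[1] for d in D]
    # Step 306: train on (D, A) as inputs and (H, B) as teacher data.
    inputs = D + list(new_inputs)
    targets = H + list(new_targets)
    for _ in range(cycles):
        for x, t in zip(inputs, targets):
            net.train_one(x, t)
    return D, H
```

A call such as `update_learning(net, A, B, n_samples=50, low=0.0, high=1.0, cycles=200, rng=random.Random(0))` would rehearse 50 sampled points alongside the new pairs. Note that when no new data are supplied, the sampled pairs reproduce the network's own outputs, so every learning signal is zero and the coupling coefficients stay unchanged, which mirrors the first-cycle behavior described in [0034].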

[Brief explanation of drawings]

FIG. 1 is a configuration diagram showing the configuration of a neural network according to a specific embodiment of the present invention.

FIG. 2 is a flowchart showing the computation procedure of the neural network according to the same embodiment.

FIG. 3 is a flowchart showing the learning procedure of the neural network according to the same embodiment.

FIG. 4 is a flowchart showing the update learning procedure of the neural network according to the same embodiment.

FIG. 5 is a block diagram showing the concept of update learning of the neural network according to the same embodiment.

[Explanation of symbols]

10 ... Neural network
LI ... Input layer
LM ... Intermediate layer
LO ... Output layer

Claims

[Claims]

[Claim 1] A neural network learning method for further training a trained neural network using new input data and corresponding teacher data, the method comprising: a step of inputting a plurality of sets of arbitrary input data to the trained neural network to obtain a plurality of sets of output data corresponding to that input data; and a step of training the neural network using, as input data, a plurality of sets of data formed by adding the new input data to be newly learned to the plurality of sets of input data and, as teacher data, a plurality of sets of data formed by adding the teacher data to be newly learned to the plurality of sets of output data.