JPH064506A

JPH064506A - Neural network learning method

Info

Publication number: JPH064506A
Application number: JP4165687A
Authority: JP
Inventors: Hisao Ogata; 日佐男緒方; Yutaka Sako; 裕酒匂; Masahiro Abe; 正博阿部
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1992-06-24
Filing date: 1992-06-24
Publication date: 1994-01-14

Abstract

PURPOSE: To learn a neural network at high speed by using only patterns with large errors in the early stage of learning, so that the error is reduced through movement of the separating plane. CONSTITUTION: The input signal pattern of the teacher signal data 10 is input to the input layer neurons (S1). The input signals are then propagated successively toward the neurons on the output layer side, and the outputs of the output layer neurons are finally obtained (S2). An output error is calculated from the output signal of the data 10 and the calculated outputs of the output layer neurons (S3). It is then determined whether the learning pattern currently presented satisfies the weight correction condition (S4). If it does, the steepest descent gradient for the input pattern is calculated with respect to the weight of each neuron (S5), and the current correction amount is calculated from the error gradient and the previous correction amount (S6). This operation is repeated until the output error falls within a fixed upper limit that serves as the criterion for ending learning.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method of learning network weights using teacher signals, in fields of processing in which pattern recognition, data compression, and the like are performed using a neural network.

[0002]

[Prior Art] A neural network is a simplified model of the neural circuitry of the human brain: a network in which nerve cells (neurons) are connected via synapses through which signals pass in only one direction. Signals are transmitted between neurons through these synapses, and various kinds of information processing become possible by appropriately adjusting the resistance of the synapses, that is, the weights. Each neuron receives the outputs of other neurons weighted by the synapses, applies a nonlinear response function to their sum, and outputs the result to other neurons.
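The neuron computation described above can be sketched in a few lines. This is a minimal illustration, not the patent's own formulation: the sigmoid is one common choice of nonlinear response function, and the `bias` term and the function names are assumptions introduced here.

```python
import math


def sigmoid(net):
    """A non-decreasing, continuous response function with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def neuron_output(inputs, weights, bias=0.0):
    """Weight each incoming signal, sum the results, and apply the response function.

    `bias` is an assumed extra term; the text only mentions the weighted sum.
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(net)
```

Adjusting `weights` changes what the neuron outputs for a given input, which is what the learning procedure discussed in the rest of the description does.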

[0003] Neural network structures fall broadly into two types, the interconnected type and the multilayer type. The former is suited to optimization problems and the latter to recognition problems such as pattern recognition. The present invention relates to multilayer neural networks.

[0004] When a multilayer neural network is used for pattern recognition or data compression, teacher signals consisting of paired input and output patterns must be prepared, and the network weights must be learned from them in advance so that the network produces correct outputs. This is usually done with an optimization technique called backpropagation. The technique is discussed in, for example, PDP Model, Sangyo Tosho (1989), pages 325 to 331.

[0005] This technique is characterized by using a non-decreasing, continuous function such as a sigmoid function as the nonlinear response function of each neuron, and by reducing the output error of the network through updating the weights according to the following equation.

[0006]

[Equation 2]

dw/dt = −∂E/∂w   (Equation 2)

w: weight
E: output error of the network
t: time

The processing procedure of this technique is described concretely below with reference to FIG. 2.
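Equation 2 is written in continuous time. In practice it is usually applied in discrete steps, w ← w − η·∂E/∂w. The sketch below assumes that simple discretization; the step size `eta`, the step count, and the function name are illustrative and do not come from the original text.

```python
def gradient_descent(grad, w0, eta=0.1, steps=100):
    """Follow dw/dt = -dE/dw in discrete steps of size eta.

    grad: function returning dE/dw at a given scalar w (supplied by the caller).
    """
    w = w0
    for _ in range(steps):
        w -= eta * grad(w)  # move against the error gradient
    return w
```

For example, with E(w) = (w − 3)² the caller would pass `lambda w: 2.0 * (w - 3.0)` as `grad`, and the returned weight moves toward the minimum of E.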

[0007] First, the data 10 in FIG. 2 is described. The learning data 10 consists of input pattern data and teacher pattern data, containing (number of input layer neurons Ni × number of learning patterns P) and (number of output layer neurons No × number of learning patterns P) data items, respectively. The input signal patterns and the teacher signal patterns correspond one to one. For example, input signal pattern 1 is input to the input layer, the error between the resulting output of the output layer and teacher signal pattern 1 is calculated, and that error is used for learning.

[0008] In step 20, an input signal pattern from the teacher signal data 10 is input to the input layer neurons.

[0009] In step 21, the input signal is propagated successively toward the neurons on the output layer side according to Equation 3, and the outputs of the output layer neurons are finally obtained.

[0010]

[Equation 3]

[0011] In step 22, the output error is calculated using Equation 4 from the output signal of the teacher signal data 10 and the outputs of the output layer neurons calculated in step 21.

[0012]

[Equation 4]

[0013] In step 23, the steepest descent gradient for input pattern q is obtained with respect to the weight of each neuron according to Equation 5.

[0014]

[Equation 5]

[0015] In step 24, the current correction amount is obtained from the error gradient and the previous correction amount using the following equation.

[0016]

[Equation 6]

[0017] η: learning coefficient; α: momentum coefficient. The second term on the right-hand side is called the momentum term and is added empirically to accelerate learning.
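The correction described for step 24, together with the weight update that follows it, can be sketched as below. The equation images are not reproduced in this text, so the form used here, Δw(n) = −η·∂E/∂w + α·Δw(n−1) followed by w ← w + Δw(n), is the standard momentum rule and is an assumption rather than a transcription of Equations 6 and 7. The function name and default coefficients are illustrative.

```python
def momentum_update(weights, gradients, prev_deltas, eta=0.5, alpha=0.9):
    """Return (new_weights, deltas) for one correction step with a momentum term.

    weights, gradients, prev_deltas: equal-length lists of floats.
    eta: learning coefficient; alpha: momentum coefficient.
    """
    # Current correction: gradient term plus a fraction of the previous correction.
    deltas = [-eta * g + alpha * d for g, d in zip(gradients, prev_deltas)]
    # Apply the correction to each weight.
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas
```

The returned `deltas` are what the caller would keep and pass back in as `prev_deltas` on the next step.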

[0018] In step 25, the weight of each neuron is corrected according to Equation 7.

[0019]

[Equation 7]

[0020] In step 26, the processing of steps 20 to 25 is performed for all learning patterns.

[0021] In step 27, step 26 is repeated until the output error falls within a fixed upper limit that serves as the criterion for ending learning. This learning end condition is satisfied when, for example, the output error of every output layer neuron falls within a fixed upper limit such as 0.1, and this holds for all learning patterns.
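The learning end condition of step 27 amounts to a check over every output neuron of every pattern. A minimal sketch follows, assuming the output error of a neuron is the absolute difference between teacher value and output (Equation 4 is not reproduced in this text); the function name is illustrative.

```python
def has_converged(outputs, targets, max_error=0.1):
    """True when every output neuron error is within max_error for every pattern.

    outputs, targets: lists of per-pattern lists, one value per output neuron.
    """
    return all(
        abs(t - o) <= max_error
        for out_vec, tgt_vec in zip(outputs, targets)
        for o, t in zip(out_vec, tgt_vec)
    )
```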

[0022] Backpropagation, however, has been criticized for the time it takes to learn. One technique for speeding up learning skips the calculation for patterns that already satisfy the convergence condition, leaving the weights uncorrected for those patterns. First, for each learning pattern, it is determined whether the end condition, for example 'Is the output error within 0.1?', is satisfied. For patterns that satisfy it, the calculations of steps 23 to 25 described above for correcting the weights are not performed. This processing procedure is described concretely with reference to FIG. 3.

[0023] In steps 30 to 32, the errors of the output layer neurons are calculated by the same procedure as steps 20 to 22 described above for backpropagation.

[0024] In step 33, it is determined whether the learning pattern currently presented satisfies the end condition, for example 'Is the output error within 0.1?'. If it does not, the process proceeds to step 34.

[0025] In steps 34 to 36, the calculations for weight correction and the weight correction itself are performed by the same procedure as steps 23 to 25 described above.

[0026] In step 37, the processing of steps 30 to 36 is performed for all learning patterns.

[0027] In step 38, step 37 is repeated until the convergence condition is satisfied for all learning patterns.

[0028] In this way, backpropagation learning with part of the calculation omitted can be performed.

[0029]

[Problem to Be Solved by the Invention] One problem with backpropagation learning is that learning becomes slow when all patterns are used from the beginning. This is because, when all patterns are used from the start, the learning effects of the individual patterns cancel one another out.

[0030] This problem is described in detail with reference to FIGS. 5 to 8. FIGS. 5 and 6 show a neuron with a two-dimensional input being trained to linearly separate patterns marked × and ○. FIG. 5 shows the state after learning has finished, and FIG. 6 shows the state before learning. The neuron is trained to output 0.9 when an input pattern marked × is presented and 0.1 when an input pattern marked ○ is presented. The output O of the neuron is given by the following equation.

[0031]

[Equation 8]

[0032] Here f is the sigmoid function and net is the total input to the neuron. In FIGS. 5 and 6, the straight line G is the line net = 0, hereinafter called the separating plane. FIGS. 7 and 8 are output characteristic diagrams of the neuron taken along the broken lines in FIGS. 5 and 6, respectively. The sigmoid function approaches 0 or 1 as the absolute value of net increases. In other words, the farther a point is from the separating plane, or the larger the magnitude |W| of the weight vector, the larger the absolute value of net and the closer the neuron output is to 1 or 0. Therefore, to move from the state of FIG. 6 to the state of FIG. 5 by learning, the separating plane G must be brought between the × and ○ patterns and, at the same time, |W| must be increased.

[0033] Next, learning when a single input pattern is presented is described. In FIG. 6, to present the input pattern a (○) and train the neuron to output 0.1 for it, the separating plane G must be pushed upward to move it away from input pattern a while |W| is increased to make net smaller. Conversely, to present the input pattern b (×) and train the neuron to output 0.9 for it, the separating plane G must be pushed downward to move it away from input pattern b while |W| is increased to make net larger. When patterns a and b are presented together during learning, the result is that the separating plane G hardly moves and only |W| grows. The same thing happens when learning uses all the learning patterns in the figure.

[0034] When |W| is increased and net becomes large, the derivative of the sigmoid function, which enters the steepest descent gradient as a factor as shown in Equation 5, becomes small for all patterns. That is, because |W| is large, net becomes large for every pattern and df/dnet becomes small. As a result, the weight correction amount ΔW calculated from the steepest descent gradient using Equation 6 also becomes small, and learning cannot be accelerated.
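The effect described in this paragraph can be checked numerically. The sketch below assumes the usual logistic sigmoid, whose derivative is f(net)·(1 − f(net)), and scales a fixed weight vector to imitate a growing |W|. The helper names and the `scale` parameter are illustrative, and no bias term is included.

```python
import math


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def sigmoid_derivative(net):
    """df/dnet for the logistic sigmoid; largest at net = 0, small for large |net|."""
    out = sigmoid(net)
    return out * (1.0 - out)


def derivative_at(x, weights, scale=1.0):
    """df/dnet for input x when the weight vector is multiplied by `scale`."""
    net = sum(scale * w * xi for w, xi in zip(weights, x))
    return sigmoid_derivative(net)
```

Calling `derivative_at` on the same input with a larger `scale` gives a smaller value whenever net is nonzero, which is why the gradient and the resulting ΔW shrink once |W| has grown.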

[0035] The same can happen in backpropagation learning with omitted calculations. When the convergence condition is strict, that is, when the upper limit on the output error is small, no learning pattern satisfies the convergence condition, so learning uses all patterns from the beginning.

[0036] An object of the present invention is to provide a method for achieving high-speed learning with backpropagation by preventing the slowdown caused by training on all patterns simultaneously from the beginning.

[0037]

[Means for Solving the Problem] The above object is achieved by a learning method in which the output error of the neural network is calculated for each learning pattern; a pattern is judged to require learning if it contains even one output neuron whose output error has an absolute value larger than the weight correction judgment error defined by Equation 1; and, when learning is performed, the weight correction amounts are calculated by the backpropagation method and the weights are corrected.

[0038]

[Operation] When learning uses all learning patterns from the beginning, mainly the magnitude |W| of the weights grows and the separating plane hardly moves. As a result, the derivative of the sigmoid function, which enters the steepest descent gradient as a factor, becomes small for all learning patterns. The weight correction amount ΔW calculated from the steepest descent gradient then also becomes small, and learning cannot be accelerated.

[0039] With the above means, learning in the early stage uses only patterns with large errors. The growth of |W| is therefore relatively suppressed, and the error is reduced through movement of the separating plane, so learning proceeds at high speed. This is described in detail with reference to FIG. 9.

[0040] FIG. 9 is an output characteristic diagram showing the state before learning begins, as in FIG. 6. In the figure, ● denotes ○ learning patterns with large errors. From the output characteristic of the neuron taken along the broken line, the output above the separating plane G is greater than 0.5 and approaches 1 with increasing distance from G. Below G, the characteristic is the opposite. Because the ○ patterns are trained to output 0.1, the ● learning patterns, which lie above G and far from it, have large errors.

[0041] Consider learning at first using only the ● learning patterns. Learning tries to bring the output for ● closer to 0.1. It therefore tries to raise the separating plane G and, at the same time, to reduce the magnitude |W| of the weights so as to lower the neuron output. Because |W| becomes smaller, the derivative df/dnet of the sigmoid function becomes larger, and ΔW given by Equations 5 and 6 therefore stays reasonably large. As a result, learning can be performed at high speed.

[0042]

[Embodiments] A first embodiment of the present invention is described below with reference to FIGS. 1 and 4. FIG. 1 is a flowchart of the overall processing of the embodiment, and FIG. 4 is an explanatory diagram showing details of the learning data 10 used for weight learning. Reference numeral 50 denotes the input data fed to the input layer neurons, comprising (number of input layer neurons Ni × number of learning patterns P) data items. Reference numeral 53 denotes the teacher data compared with the outputs of the output layer neurons to obtain the error, comprising (number of output layer neurons No × number of learning patterns P) data items. The input data 50 and the teacher data 53 are paired; for example, input signal pattern 51 corresponds to teacher signal pattern 54.

[0043] Next, the operation of this embodiment is described with reference to FIG. 1. Step 4 in the figure is the characteristic feature of the present invention.

[0044] In step 1, an input signal pattern from the teacher signal data 10 is input to the input layer neurons.

[0045] In step 2, the input signal is propagated successively toward the neurons on the output layer side according to Equation 3, and the outputs of the output layer neurons are finally obtained.

[0046] In step 3, the output error is calculated using the following equation from the output signal of the teacher signal data 10 and the outputs of the output layer neurons calculated in step 2.

[0047]

[Equation 9]

[0048] In step 4, it is determined whether the learning pattern currently presented satisfies the weight correction condition. Specifically, the output errors are compared with the judgment error defined by Equation 1, and it is determined whether any output layer neuron has an output error larger than that judgment error. If such a neuron exists, the weight correction condition is regarded as satisfied and the process proceeds to step 5.
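The judgment of step 4 can be sketched as a single predicate over the output neurons of one pattern. Equation 1 is not reproduced in this text, so the judgment error is simply passed in as `judge_error`; that parameter, the function name, and the use of the absolute difference as the output error are assumptions made for illustration.

```python
def needs_weight_correction(outputs, targets, judge_error):
    """True if at least one output neuron has an absolute error above judge_error.

    outputs, targets: lists with one value per output layer neuron for one pattern.
    judge_error: stand-in for the value defined by Equation 1 (not given here).
    """
    return any(abs(t - o) > judge_error for o, t in zip(outputs, targets))
```

Only patterns for which this returns True would go on to the gradient and correction steps; the others are skipped for the current pass.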

[0049] In step 5, the steepest descent gradient for input pattern q is obtained with respect to the weight of each neuron according to Equation 5.

[0050] In step 6, the current correction amount is obtained from the error gradient and the previous correction amount using Equation 6.

[0051] In step 7, the weight of each neuron is corrected according to Equation 7.

[0052] In step 8, the processing of steps 1 to 7 is performed for all learning patterns.

[0053] In step 9, step 8 is repeated until the output error falls within a fixed upper limit that serves as the criterion for ending learning. This convergence condition is satisfied when, for example, the output error of every output layer neuron falls within a fixed upper limit such as 0.1, and this holds for all learning patterns.
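Steps 1 to 9 can be put together in a small sketch for a single sigmoid neuron. Several things here are assumptions and not taken from the original: the squared-error form E = 0.5·(t − o)², the momentum rule, the bias term, and above all `judge_error`, a caller-supplied function of the epoch number that stands in for Equation 1, which is not reproduced in this text.

```python
import math


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def forward(x, weights, bias):
    # Steps 1-2: present the input pattern and compute the neuron output.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_selective(patterns, targets, weights, bias, judge_error,
                    end_error=0.1, eta=0.5, alpha=0.9, max_epochs=10000):
    """Train one sigmoid neuron, correcting weights only for patterns whose
    absolute error exceeds judge_error(epoch).

    Returns (weights, bias, epochs_used); epochs_used equals max_epochs when
    the end condition was not reached earlier.
    """
    weights = list(weights)
    prev = [0.0] * (len(weights) + 1)  # previous corrections: weights, then bias
    for epoch in range(1, max_epochs + 1):
        threshold = judge_error(epoch)  # hypothetical stand-in for Equation 1
        for x, t in zip(patterns, targets):  # step 8: all learning patterns
            out = forward(x, weights, bias)
            err = t - out  # step 3: output error
            if abs(err) <= threshold:  # step 4: no correction for this pattern
                continue
            delta = -err * out * (1.0 - out)  # step 5: dE/dnet (assumed E)
            grads = [delta * xi for xi in x] + [delta]
            # Step 6: current correction from gradient and previous correction.
            prev = [-eta * g + alpha * p for g, p in zip(grads, prev)]
            # Step 7: correct the weights and the bias.
            weights = [w + d for w, d in zip(weights, prev[:-1])]
            bias += prev[-1]
        # Step 9: learning end condition, separate from the judgment condition.
        if all(abs(t - forward(x, weights, bias)) <= end_error
               for x, t in zip(patterns, targets)):
            return weights, bias, epoch
    return weights, bias, max_epochs
```

Passing `lambda epoch: 0.05` as `judge_error` gives a constant threshold. If the threshold stays above `end_error`, patterns whose error lies between the two values are never corrected, so the loop may run to `max_epochs`.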

[0054] With this technique, learning in the early stage uses only patterns with large errors, so learning can be performed at high speed.

[0055] The weight correction condition described for step 4 is not limited to that of this embodiment; the judgment error given by Equation 10 may be used instead, and the weights corrected for learning patterns that exceed it.

[0056]

[Equation 10]

[0057] Next, a second embodiment of the present invention is described with reference to FIGS. 1, 4, and 10. FIG. 10 is a block diagram of the apparatus of this embodiment.

[0058] The operation of this apparatus is described below in correspondence with the processing flow of FIG. 1. Steps 1, 2, 3, and so on refer to the respective steps of FIG. 1.

[0059] In step 1, the neuron output calculation unit 63 reads an input signal pattern of the learning data 10 stored in the learning data storage unit 62 and inputs it to the input layer neurons.

[0060] In step 2, the neuron output calculation unit 63 obtains the outputs of the neurons in each layer according to Equation 3.

[0061] In step 3, the error calculation unit 66 calculates the output error using Equation 9 from the teacher signal of the learning data 10 stored in the learning data storage unit 62 and the outputs of the output layer neurons calculated in step 2.

[0062] In step 4, the weight correction control unit 68 determines whether the learning pattern currently presented satisfies the weight correction condition. Specifically, the weight correction control unit 68 compares the weight correction judgment error defined by Equation 1 with the output errors calculated in step 3 and determines whether any output layer neuron has an error larger than that judgment error. If such a neuron exists, the weight correction condition is regarded as satisfied, and the weight correction control unit 68 sends the steepest descent gradient calculation unit 67, the weight correction amount calculation unit 65, and the weight correction unit 61 an instruction signal to perform the calculations for correcting the weights. The process then proceeds to step 5.

[0063] In step 5, the steepest descent gradient calculation unit 67 obtains the steepest descent gradient for learning pattern q with respect to the weight of each neuron according to Equation 5.

[0064] In step 6, the weight correction amount calculation unit 65 reads the previous weight correction amount from the weight correction amount storage unit 64 and calculates the current correction amount using Equation 6.

[0065] In step 7, the weight correction unit 61 corrects the weight of each neuron stored in the network weight storage unit 60 according to Equation 7.

[0066] In step 8, the processing of steps 1 to 7 is performed for all learning patterns.

[0067] In step 9, step 8 is repeated until the output error falls within a fixed upper limit that serves as the criterion for ending learning. This convergence condition is satisfied when, for example, the output error of every output layer neuron falls within a fixed upper limit such as 0.1, and this holds for all learning patterns.

[0068] Next, a third embodiment of the present invention is described with reference to FIGS. 11 and 4. FIG. 11 is a flowchart of the overall processing of the embodiment. FIG. 4 is an explanatory diagram showing details of the learning data 10 used for weight learning.

[0069] The operation of this embodiment is described with reference to FIG. 11. Steps 75 and 74 in the figure are the characteristic features of the present invention.

[0070] In step 70, an input signal pattern from the teacher signal data 10 is input to the input layer neurons.

[0071] In step 71, the input signal is propagated successively toward the neurons on the output layer side according to Equation 3, and the outputs of the output layer neurons are finally obtained.

[0072] In step 72, the output error is calculated using Equation 9 from the output signal of the teacher signal data 10 and the outputs of the output layer neurons calculated in step 71.

[0073] In step 73, the processing of steps 70 to 72 is performed for all learning patterns.

[0074] As a result of steps 73 and 70 to 72, the errors of the output layer neurons are obtained for all learning patterns.

[0075] In step 75, the errors obtained in steps 73 and 70 to 72 are sorted in descending order. A predetermined number of learning patterns are then selected from the top in descending order of error and used as the learning patterns for correcting the weights.
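The selection of step 75 can be sketched as a sort over per-pattern errors. The text does not say how the errors of several output neurons are combined into one value per pattern, so the sketch assumes the caller already supplies a single error magnitude for each pattern (for instance the largest absolute output error); the function name is illustrative.

```python
def select_patterns_by_error(errors, count):
    """Return the indices of the `count` patterns with the largest errors.

    errors: one non-negative error magnitude per learning pattern (assumed
    to be precomputed by the caller).
    """
    # Sort pattern indices by error, largest first, then keep the top `count`.
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return order[:count]
```

The returned indices identify the patterns on which the weight correction steps would then be run for the current pass.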

[0076] In step 74, the processing of steps 76 to 81 is performed for the learning patterns selected in step 75 for correcting the weights, and the weights are corrected.

[0077] In step 76, an input signal pattern from the teacher signal data 10 is input to the input layer neurons.

[0078] In step 77, the input signal is propagated successively toward the neurons on the output layer side according to Equation 3, and the outputs of the output layer neurons are finally obtained.

[0079] In step 78, the output error is calculated using Equation 9 from the output signal of the teacher signal data 10 and the outputs of the output layer neurons calculated in step 77.

[0080] In step 79, the steepest descent gradient for input pattern q is obtained with respect to the weight of each neuron according to Equation 5.

[0081] In step 80, the current correction amount is obtained from the error gradient and the previous correction amount using Equation 6.

[0082] In step 81, the weight of each neuron is corrected according to Equation 7.

[0083] In step 74, the processing of steps 76 to 81 is performed for all of the learning patterns selected for correcting the weights.

[0084] In step 82, steps 73, 75, and 74 are repeated until the output error falls within a fixed upper limit that serves as the criterion for ending learning. This convergence condition is satisfied when, for example, the output error of every output layer neuron falls within a fixed upper limit such as 0.1, and this holds for all learning patterns.

[0085] With this technique, learning uses only patterns with large errors, so learning can be performed at high speed.

[0086] The condition for selecting the learning patterns for correcting the weights described for step 75 is not limited to that of this embodiment; learning patterns with large errors may instead be selected from the top in a number equal to a predetermined fixed proportion of the total number of learning patterns.

[0087]

[Effects of the Invention] According to the present invention, the weight correction condition causes learning in the early stage to use only patterns with large errors. Compared with learning that uses all patterns from the beginning, the learning effects are therefore less likely to cancel out. That is, the situation in which mainly the magnitude |W| of the weights grows while the separating plane hardly moves occurs less often. As a result, the growth of |W| is relatively suppressed, so the derivative of the sigmoid function, which enters the steepest descent gradient as a factor, does not become very small, and the steepest descent gradient keeps a reasonable magnitude. The weight correction amount ΔW therefore does not shrink rapidly, and learning proceeds at high speed.

[Brief description of drawings]

【図１】本発明の第１の実施例の処理全体を示すＰＡＤ
のフローチャート。FIG. 1 is a PAD showing the overall processing of the first embodiment of the present invention.
Flow chart.

【図２】バックプロパゲーション法の処理手順を示すフ
ローチャート。FIG. 2 is a flowchart showing a processing procedure of a back propagation method.

FIG. 3 is a flowchart of the processing procedure of a backpropagation method that omits the weight correction calculation for learning patterns that satisfy the learning convergence condition.

FIG. 4 is an explanatory diagram showing details of the learning data.

FIG. 5 is an explanatory diagram showing the state after a neuron has finished learning to separate the signals ○ and ×.

FIG. 6 is an explanatory diagram showing the state before a neuron starts learning to separate the signals ○ and ×.

FIG. 7 is an output characteristic diagram of the neuron, taken along the broken line in FIG. 5.

FIG. 8 is an output characteristic diagram of the neuron, taken along the broken line in FIG. 6.

FIG. 9 is an explanatory diagram showing the state before a neuron starts learning to separate the signals ○ and ×.

FIG. 10 is a block diagram showing the overall functional configuration of the second embodiment of the present invention.

FIG. 11 is a PAD flowchart showing the overall processing of the third embodiment of the present invention.

[Explanation of symbols]

3 ... Input of the input pattern to the neural network, 4 ... Calculation of the output of each neuron, 5 ... Calculation of the error of the output layer neurons, 6 ... Judgment of the weight correction condition, 7 ... Calculation of the steepest-descent gradient, 8 ... Calculation of the weight correction amount, 9 ... Correction of the weights, 10 ... Learning data.

Claims

[Claims]

1. A neural network learning method characterized in that, in the learning of a neural network, the learning patterns to be learned are controlled using a judgment condition different from the learning end judgment condition used to judge the end of learning.

2. A neural network learning method characterized by comprising, in the learning of a neural network using the backpropagation method: a first step of calculating an output error of the neural network for each learning pattern; a second step of determining, based on the output error, whether or not the pattern is to be learned; and a third step of, when learning is to be performed, calculating a weight correction amount using the backpropagation method and correcting the weights.

3. A neural network learning method characterized by comprising, in the learning of a neural network using the backpropagation method: a first step of calculating an output error of the neural network for each learning pattern; a second step of determining that learning is to be performed for a learning pattern that includes an output neuron whose output error has an absolute value larger than the defined weight correction judgment error; and a third step of, when learning is to be performed, calculating a weight correction amount using the backpropagation method and correcting the weights.

4. A neural network learning method characterized by comprising, in the learning of a multilayer neural network using the backpropagation method: a first step of calculating an output error of the neural network for each learning pattern; a second step of determining that learning is to be performed for a learning pattern that includes an output neuron whose output error has an absolute value larger than the weight correction judgment error defined by Equation 1; and a third step of, when learning is to be performed, calculating a weight correction amount using the backpropagation method and correcting the weights.

5. A neural network learning method characterized by comprising, in the learning of a multilayer neural network using the backpropagation method: a first step of calculating an output error of the neural network for each learning pattern; a second step of selecting, based on the output errors obtained in the first step, a predetermined number of learning patterns in descending order of error and using them as the learning patterns for weight correction; and a third step of calculating, for the learning patterns for weight correction, a weight correction amount using the backpropagation method and correcting the weights.
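The three steps recited in claim 3 can be illustrated with a deliberately simplified sketch. It assumes a single sigmoid neuron, for which backpropagation reduces to the delta rule, and a squared-error criterion. The names `train_epoch`, `judge_error` and `rate` are hypothetical and do not appear in the patent, and the momentum term mentioned in the abstract is omitted for brevity.

```python
import math


def sigmoid(u):
    """Standard logistic sigmoid (assumed neuron output function)."""
    return 1.0 / (1.0 + math.exp(-u))


def train_epoch(weights, patterns, judge_error, rate=0.5):
    """Run one pass over the learning patterns for a single sigmoid neuron.

    weights     -- list of weights, the last entry acting as the bias
                   (modified in place)
    patterns    -- list of (inputs, target) pairs
    judge_error -- hypothetical name for the weight correction judgment error
    rate        -- hypothetical learning rate
    Returns the number of patterns actually used for weight correction.
    """
    used = 0
    for inputs, target in patterns:
        extended = list(inputs) + [1.0]  # constant input for the bias weight
        # First step: forward pass and output error.
        output = sigmoid(sum(w * x for w, x in zip(weights, extended)))
        error = target - output
        # Second step: learn only if the error exceeds the judgment error.
        if abs(error) <= judge_error:
            continue
        # Third step: steepest-descent correction (delta rule for one neuron).
        delta = error * output * (1.0 - output)
        for i, x in enumerate(extended):
            weights[i] += rate * delta * x
        used += 1
    return used


# Usage: two patterns, starting from zero weights.
weights = [0.0, 0.0, 0.0]
patterns = [((0.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]
used = train_epoch(weights, patterns, judge_error=0.1)
```

In a multilayer network the second step would inspect every output neuron of the pattern, and the third step would propagate the error back through the hidden layers. The sketch shows only the control flow.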