JPH04230570A

JPH04230570A - System for learning neural network

Info

Publication number: JPH04230570A
Application number: JP3000180A
Authority: JP
Inventors: Kiyoshi Nakabayashi; 仲林　清; Mina Maruyama; 丸山　美奈
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1991-01-07
Filing date: 1991-01-07
Publication date: 1992-08-19

Abstract

PURPOSE: To improve classification accuracy while avoiding over-learning (learning that fails to converge or requires an enormous learning time) by training so that the error on the training samples decreases while the initially supplied classification function is maintained when a neural network is adjusted using training sample data. CONSTITUTION: The system consists of an error calculation means 11 that calculates the output error of the neural network for training sample data 14, a weight difference calculation means 12 that calculates the weight difference, that is, the deviation of the connection weights and bias values of the neural network from their initial set values, and a connection weight adjustment means 13 that adjusts the connection weights and bias values so as to reduce the sum of the output error and the weight difference. The training sample data 14 are repeatedly supplied to the neural network, and learning ends when the error falls below a fixed value.

Description

【発明の詳細な説明】【０００１】【産業上の利用分野】本発明はニュ−ラルネットワ−ク
の学習方式に係り、特に分類対象デ−タの分類を行う概
略の分類規則と学習サンプルデ−タを用いて分類精度の
高いニュ−ラルネットワ−クを構成するニュ−ラルネッ
トワ−クの学習方式に関する。【０００２】【従来の技術】従来よりデ−タの分類処理を目的に多層
構造型ニュ−ラルネットワ−クが用いられている。図３
は多層構造型ニュ−ラルネットワ−クの構成図を示す。同図において、入力層３０１は分類対象デ−タの特徴量
を入力する。出力層３０２は分類結果を出力する。中間
層３０３は入力層３０１と出力層３０２の間に位置し、
１層乃至、複数の層によって構成される。各層のユニッ
トの入力と、その前後の層の各々のユニットの出力は結
合されている。各結合の両端のユニットは前段の層のユ
ニットを始点ユニット、後段の層のユニットを終点ユニ
ットと呼ぶ。図３の層を例とすると、始点ユニットは入
力層３０１のユニット３０ａ，３０ｂ，３０ｃ，３０ｄ
であり、ユニット３０ａ，３０ｂ，３０ｃ，３０ｄにと
っての後段である終点ユニットはユニット３１ａ，３１
ｂ，３１ｃということになる。各々のユニットの出力は
以下の式に従って決定される。【０００３】　　　　　　ｏｊ　（ｋ）　　　＝　　ｔａｎｈ（　ａ
ｊ　（ｋ）　）　　　　　　　　　　　　　　　　　　
　　　　　（１）　【０００４】【数１】【０００５】ここでｏｊ　（ｋ）　はｋ層（ｋ≧１，ｋ
＝１が入力層）のｊ番目のユニットの出力値である。ｗ
ｉｊ（ｋ）　はｋ−１層のｉ番目のユニットからｋ層ｊ
番目のユニットへの結合荷重、Ｎ（ｋ−１）　はｋ−１
層のユニット総数である。但し、ｗ０ｊ（ｋ）　はｋ層
のｊ番目のユニットにバイアスを与えるための結合荷重
であり、（ｋ−１）層のｊ番目のユニットの出力値ｏ０
ｊ（ｋ−１）　は常に１とする。また、入力層３０１（
ｋ＝１）の各ユニットは入力された分類対象デ−タの特
徴量をそのまま次段に出力する。【０００６】このようなニュ−ラルネットワ−クにデ−
タの分類を行わせるためには、入力層のユニットに分類
対象デ−タの特徴量を与えた時に、そのデ−タの属する
分類カテゴリに対応する出力層ユニットのみが高い値を
出力し、他の出力層ユニットが低い値を出力するように
各ユニット間の結合荷重及びバイアス値を設定する必要
がある。【０００７】このために、従来の技術の第１の方法とし
て結合荷重及びバイアス値をランダムな値に初期設定し
ておき、分類結果が既知である学習サンプルデ−タの特
徴量を入力したときの実際の出力値と望ましい出力値の
誤差が減少するように、結合荷重及びバイアス値を微小
量ずつ繰り返し調整する逆誤差伝搬学習方式が知られて
いる。【０００８】図４は従来の第１の技術を説明するための
図を示す。学習手段４０２は学習前のニュ−ラルネット
ワ−ク４０１に学習サンプルデ−タの特徴量４０３を入
力したときの実際の出力値と望ましい出力値の誤差を算
出する誤差算出手段４１０と、誤差算出手段４１０で算
出された誤差が減少するように逆誤差伝搬学習方式を用
いて結合荷重及びバイアス値を微小量ずつ繰り返し調整
する結合荷重調整手段４１２からなる。【０００９】学習手段４０２には学習前のニュ−ラルネ
ットワ−ク４０１が設定されて、そのニュ−ラルネット
ワ−クに学習サンプルデ−タの特徴量と分類結果４０３
が入力される。これにより学習手段４０２は誤差算出手
段４１０と結合荷重調整手段４１２により学習を行い、
学習後のニュ−ラルネットワ−ク４０４として出力する
。【００１０】次に学習手段４０２の動作について詳しく
説明する。学習手段４０２の誤差算出手段４１０はニュ
−ラルネットワ−クの入力層に学習サンプルデ−タの特
徴量４０３を入力した時のニュ−ラルネットワ−クの出
力値【００１１】【数２】【００１２】と望ましい出力値の二乗誤差ｅｊ　＝ｏｊ
　（Ｋ）　−　　ｙｊ　　　　　　　（４）　を算出す
る。ここで、Ｋは出力層の層番号、即ちニュ−ラルネッ
トワ−クの全層数である。また、ｙｊ　は出力層ｊ番目
のユニットの当該学習サンプルデ−タ４０３に対する望
ましい出力値である正しい分類結果である。【００１３】次に荷重変化調整手段４１１は算出された
誤差が減少するように、ニュ−ラルネットワ−クの結合
荷重及びバイアス値を以下の式に従って微少量調整する
。結合荷重ないしバイアス値の調整量Δｗｉｊ（ｋ）　
は終点ユニットの誤差信号ｄｊ　（ｋ）　と始点ユニッ
トの出力値ｏｉ　（ｋ−１）　の積の形である以下の式
に従って算出される。【００１４】　　　　　　　　Δｗｉｊ（ｋ）　＝　　−ε１　ｄｊ
　（ｋ）　ｏｉ　（ｋ−１）　　　　　　　　　　　　
　（５）　ここでε１　は一回の繰り返しでの調整量の
大きさを決めるパラメ−タである。ｋ層が出力層（ｋ＝
Ｋ）の時、終点ユニットの誤差信号ｄｊ　（ｋ）　は以
下の式で算出される。【００１５】　　　　　　　　ｄｊ　（Ｋ）　＝ｅｊ　（１−ｏｊ　
（Ｋ）　）（１＋ｏｊ　（Ｋ）　）　　（６）　ここで
ｅｊ　は式（４）　に従って誤差算出手段４１０で算出
された出力誤差である。【００１６】ｋ層が中間層（１＜ｋ＜Ｋ）の時は誤差信
号ｄｊ　（ｋ）　は以下の式で求められる。【００１７】【数３】【００１８】以上の誤差算出手段４１０による誤差算出
と、結合荷重調整手段４１２による結合荷重及びバイア
ス値の調整を、学習サンプルデ−タ４０３を繰り返し与
えて実行し、誤差が一定値以下になったとき学習を終了
する。【００１９】次に従来の第２の方法について説明する。この方法は概ね正しいと考えられるデ−タ分類規則が既
知であるときに、このデ−タ分類規則と等価な分類機能
を有するように、ニュ−ラルネットワ−クの結合荷重及
びバイアス値を初期設定し、その後、学習サンプルデ−
タを用いて逆誤差伝搬学習方式によって結合荷重及びバ
イアス値を調整する方式である。【００２０】この方法は本発明者により特願平２−１５
５７０号のニュ−ラルネットの学習方式に記載されてい
る。このものは学習デ−タの特徴の有無によってデ−タ
の分類を行う既知のデ−タ分類規則を論理演算式に変換
し、論理演算式の乗法項、加法項をニュ−ラルネットの
各ユニットに割り付け、論理演算式と等価の分類機能を
有するようにニュ−ラルネットの構造及び結合荷重の設
定を行い、事例デ−タを用いて学習させる。また、入力
された学習デ−タの特徴量を定数値と比較してその大小
関係によって学習デ−タの分類を行う既知のデ−タ分類
規則を論理演算式に変換し、論理演算式の大小比較項、
乗法項、加法項をニュ−ラルネットの各ユニットを割り
付け論理演算式と等価の分類機能を有するようにニュ−
ラルネットの構造及び結合荷重の設定を行い、学習デ−
タを用いて学習させるものである。【００２１】図５は従来の第２の方法を説明するための
図を示す。同図において、図４と同一部分には同一符号
を付し、その説明を省略する。【００２２】論理演算変換手段５０２はデ−タ分類規則
５０１を論理演算に変換する。結合荷重設定手段５０３
は得られた論理演算式と等価な動作を行うようにニュ−
ラルネットワ−クの構造及び結合荷重を設定する。学習
手段４０２は、結合荷重設定手段５０３から得られるニ
ュ−ラルネットワ−クに対して、従来の第１の方法と同
様の逆誤差伝搬学習方式を用いて、学習サンプルデ−タ
の特徴量４０３を入力し、実際の出力値と望ましい出力
値の誤差が減少するように結合荷重及びバイアス値を微
小量ずつ調整する。【００２３】次に各手段の動作について説明する。ここ
で、既知であるデ−タ分類規則５０１として以下の規則
が与えられているとする。【００２４】　　ＩＦ（ｘ１　＞ａ１　）ａｎｄ（ｘ２　＞ａ２　）
ＴＨＥＮ　　ｙ　　　　　　　　　　　　　（８）　　
ＩＦ（ｘ３　＞ａ３　）ａｎｄ（ｘ４　＜ａ４　）ＴＨ
ＥＮ　　ｙ　　　　　　　　　　　　　（９）　（８）
の分類規則は「分類対象デ−タの特徴量ｘ１　が定数ａ
１　より大きく、且つ、特徴量ｘ２　が定数ａ２　より
大きければ分類対象デ−タはカテゴリｙに属する」こと
を意味しており、（９）の分類規則は「分類対象デ−タ
の特徴量ｘ３　が定数ａ３　より大きく、且つ特徴量ｘ
４　が定数ａ４　より小さければ分類対象デ−タはカテ
ゴリｙに属する」ことを意味している。【００２５】論理演算変換手段５０２は式（８）　，（
９）　から「（ｘ１　がａ１　より大きく、且つ、ｘ２
　がａ２　より大きい）または、（ｘ３　がａ３　より
大きく、且つ、ｘ４　がａ４　より小さい）ならばｙが
真」を意味する以下の論理演算式を生成する。但し、“
・”は論理積、“＋”は論理和、“¬”は論理否定を表
す。【００２６】　　ｙ＝（ｘ１　＞ａ１　）・（ｘ２　＞ａ２　）＋（
ｘ３　＞ａ３　）・¬（ｘ４　＞ａ４　）　　　　　　
　　　　　　　　　　　　　　　　　　　　　　　　　
　　　　　　　　　　　　　　　　　　　　　　　　　
　　　　　　　　（１０）次に結合荷重設定手段５０３
の動作について説明する。図６は従来の第２の方法における結合荷重設定手段５０
３を説明するための図を示す。初期結合荷重設定手段５
０３は論理演算変換手段５０２から与えられる論理演算
式６０１（式１０）に従ってニュ−ラルネットワ−ク６
０２の結合構成及び結合荷重を決定する。ニュ−ラルネ
ットワ−ク６０２の結合構成は同図６０２に示すように
、論理演算式６０１（式１０）の右辺に現れる変数毎に
一つの入力層ユニットを、変数と定数の比較項毎に一つ
の第１中間層ユニットを乗法項毎に一つの第２中間層ユ
ニットを割り当て、出力層で全乗法項の加法を実現する
ように行う。【００２７】結合荷重ｗの決定方法は前記した特願平２
−１８５５７０号のニュ−ラルネット学習方式に記載さ
れている。このものは論理演算式と等価の分類機能を有
するようにニュ−ラルネットの構造及び結合荷重の設定
を行うものである。【００２８】ここで結合荷重ｗの決定のための式を示す
。先ず、加法ユニット（出力層）の結合荷重ｗｉ　とバ
イアスｗｏ　の決定方法を説明する。ｎ−入力加法ユニ
ットを考えたとき、入力信号Ｉｉ　，（１≦ｉ≦ｎ）が
−１≦Ｉｉ　≦−ｄの場合は「偽」、ｄ≦Ｉｉ　≦１の
場合は「真」とする。出力信号ｏが−１≦ｏ≦−ｄ’の
場合は「偽」、ｄ’≦ｏ≦１の場合は「真」とする。以
上の条件で、加法機能を実現するには結合荷重ｗｉ　（
１≦ｉ≦ｎ）とバイアスｗ０　を以下のように設定する
。【００２９】　　ｗｉ　＝ｗ≧｛２ｔａｎｈ−１　（ｄ’）｝／｛ｄ
（ｎ＋１）−（ｎ−１）｝　　　　　（１１）　　　ｗ
０　＝｛ｗ（ｎ−１）（ｄ＋１）｝／２　　　　　　　
　　　　　　　　　　　　　　　　　　　　（１２）　
但し、入力が否定項である場合（ｙ＝ａ＋¬ｂのｂ）は
、ｗｉ　＝−ｗとする。また、ｄは、ｄ＞（ｎ−１）／
（ｎ＋１）という条件を満たす必要がある。【００３０】次に、乗法ユニット（第２中間層）の結合
荷重とバイアスの決定方法を説明する。ｎ，ｄ，ｄ’を
加法の場合と同様に定義する。乗法機能を実現するには
結合荷重ｗｉ　（１≦ｉ≦ｎ）とバイアスｗ０　を以下
のように設定する。【００３１】　　ｗｉ　＝ｗ≧｛２ｔａｎｈ−１　（ｄ’）｝／｛ｄ
（ｎ＋１）−（ｎ−１）｝　　　　　（１３）　　　ｗ
０　＝｛ｗ（１　−ｎ　）（ｄ＋１）｝／２　　　　　
　　　　　　　　　　　　　　　　　　　　　　（１４
）　但し、入力が否定項である場合（ｙ＝ａ・¬ｂのｂ
）は、ｗｉ　＝−ｗとする。また、ｄは、ｄ＞（ｎ−１
）／（ｎ＋１）という条件を満たす必要がある。【００３２】さらに、比較ユニット（第１中間層）にお
ける結合荷重とバイアスの決定方法を説明する。入力Ｉ
が定数Ａより大きいときに「真」を出力するユニットの
結合荷重ｗ１　とバイアス値ｗ０　の関係を以下のよう
に設定する。【００３３】　　　　　　　　　　　　　　　　ｗ０　＝−ｗ１　Ａ
　　　　　　　　　　　　　　　　　　　　　　　　　
　　　　　　　　　　（１５）　なお、ユニット及びユ
ニット間の結合として論理式に対応しない余分なものが
あってもよく、これらのユニット及びユニット間の結合
は上で決定した結合荷重よりも絶対値の十分小さいラン
ダムな値に設定される。【００３４】上記のように、結合荷重設定手段５０３で
得られたニュ−ラルネットワ−クに対して、学習手段４
０２は従来の第１の方法と同じ逆誤差伝搬学習方式を用
いて結合荷重及びバイアス値の調整を行う。【００３５】【発明が解決しようとする課題】しかるに、従来の第１
の方法は逆誤差伝搬学習方式により、学習サンプルデ−
タに対するニュ−ラルネットワ−クの実際の出力値と望
ましい出力値の誤差が減少するように学習を行っている
。このため、学習サンプルデ−タに対しては学習を繰り
返すことによって正しい分類結果を与えるニュ−ラルネ
ットワ−クを得ることができるが、学習サンプルデ−タ
以外の未知デ−タを入力した時に正しい分類結果が得ら
れる保証はない。特に、学習サンプルデ−タに偏りがあ
る場合や学習サンプルデ−タの個数が充分でない場合に
は、ニュ−ラルネットワ−クの分類機能が学習サンプル
デ−タのみの分類に特定されるという問題がある。【００３６】また、従来の第２の方法は既知である概ね
正しいと考えられる分類規則と等価な分類機能を有する
ようにニュ−ラルネットワ−クの結合荷重を初期設定し
ておき、これに対してさらに従来の第１の方法である逆
誤差伝搬学習方式により、学習サンプルデ−タに対する
学習を行っている。このため、従来の第１の方法に比べ
て未知デ−タに対する分類精度が向上することが期待さ
れるが、従来の第１の技術と同様に、学習サンプルデ−
タに偏りがある場合やノイズが含まれている場合には、
初期設定した結合荷重が逆誤差伝搬学習方式によって大
きく変化して、学習が収束しない或いは、膨大な学習時
間を必要とする等の過剰学習を生じ、分類精度の低下を
招く問題がある。【００３７】本発明は上記の点に鑑みなされたもので、
学習サンプルデ−タを用いてニュ−ラルネットワ−クを
調整する際に、初期に与えた分類機能を極力保ったまま
、学習サンプルに対しても誤差が減少するように学習が
行われ、過剰学習せずに分類精度を向上させることがで
きるニュ−ラルネットワ−クの学習方式を提供すること
を目的とする。【００３８】【課題を解決するための手段】図１は本発明の原理構成
図を示す。ニュ−ラルネットワ−クを分類結果が既知で
ある学習サンプルデ−タ１４を用いて学習させる学習手
段１０を有するニュ−ラルネットワ−クの学習方式にお
いて、学習手段１０はニュ−ラルネットワ−クの前記学
習サンプルデ−タ１４に対する出力誤差を算出する誤差
算出手段１１と、ニュ−ラルネットワ−クの結合荷重及
びバイアス値の初期設定値からのずれである荷重差分を
算出する荷重差分算出手段１２と、出力誤差と荷重差分
の和が減少するように結合荷重及びバイアス値を調整す
る結合荷重調整手段１３とを有する。【００３９】【作用】本発明は予め与えられた概ね正確であると考え
られる分類規則と等価の分類機能を有するように初期設
定されたニュ−ラルネットワ−クを学習サンプルデ−タ
に対する出力誤差が減少するように結合荷重を調整する
際に、学習サンプルデ−タに対する出力誤差と結合荷重
の初期設定値からのずれである荷重差分の和が減少する
ように結合荷重を調整する。即ち、初期に与えた分類機
能を極力保持したまま、学習サンプルに対する誤差が減
少するように学習を行う。【００４０】【実施例】図２は本発明の一実施例の構成図を示す。同
図中、図１、図５と同一部分には同一符号を付し、その
説明を省略する。同図中、荷重差分算出手段１２は結合
荷重の初期設定値との差分を算出する。他の構成につい
ては図５の従来の第２の方法と同様の構成となっている
。【００４１】以下に本発明の動作を説明する。既知のデ
−タ分類規則５０１は論理演算変換手段５０２によって
論理演算式（式１０）に変換され、結合荷重設定手段５
０３に送出される。結合荷重設定手段５０３は論理演算
式（式１０）に従ってニュ−ラルネットワ−クの構造及
び結合荷重を設定する。ここまでの動作は従来の第２の
方法と同様である。【００４２】次に学習手段１０の動作原理について説明
する。学習手段１０は結合荷重設定手段５０３によって
初期設定されたニュ−ラルネットワ−クについて、以下
の式で示される評価関数Ｅ’を減少させるように結合荷
重の値を変化させる。【００４３】【数４】【００４４】ここで右辺のＥは式（３）　に示した学習
サンプルデ−タの特徴量を入力した時のニュ−ラルネッ
トワ−クの出力値と望ましい出力値の二乗誤差である。【００４５】また、右辺の第２項は結合荷重の初期設定
値からのずれを表す項である。即ち学習手段１０はニュ
−ラルネットワ−クの出力値の二乗誤差と結合荷重の初
期設定値からのずれの和が減少するように結合荷重を調
整する。式（１６）のＷｉｊ（ｋ）　は結合荷重ｗｉｊ
（ｋ）　の初期設定値である。ｆ（ｘ，ｙ）はｘとｙの
差が増加する関数で、例えば、｜ｘ−ｙ｜、（ｘ−ｙ）
２　などである。ηｉｊ（ｋ）　は評価関数Ｅ’に対す
る各結合荷重の差分の寄与の大きさを決めるパラメ−タ
で、通常は全ての結合荷重ｗｉｊ（ｋ）　に対して一定
の値ηないし、層毎に一定の値η（ｋ）　とする。【００４６】上記の評価関数Ｅ’を減少させるための、
結合荷重ｗｉｊ（ｋ）　の変化量Δｗ’ｉｊ（ｋ）　は
以下の式で与えられる。【００４７】【数５】【００４８】ここでΔｗｉｊ（ｋ）　は式（５）　で算
出される通常の逆誤差伝搬学習方式における結合荷重の
調整量である。式（１７）、（１８）により、例えばｆ
（ｘ，ｙ）＝（ｘ−ｙ）２　としたときは、　　Δｗ’ｉｊ（ｋ）　＝Δｗｉｊ（ｋ）　−２εηｉ
ｊ（ｋ）　ｗｉｊ（ｋ）　（　ｗｉｊ（ｋ）　−Ｗｉｊ
（ｋ）　）　　　　　　　　　　　　　　　　　　　　
　　　　　　　　　　　　　　　　　　　　　　　　　
　　　　　　　　　　　　　　　　　（１９）となる。【００４９】学習手段１０は上記の動作を行う。以下ｆ
（ｘ，ｙ）＝（ｘ−ｙ）２　とした場合、即ち式（１９
）に従って結合荷重の調整を行う場合について説明する
。【００５０】誤差算出手段１１はニュ−ラルネットワ−
クの入力層に学習サンプルデ−タの特徴量１４を入力し
た時のニュ−ラルネットワ−クの出力値Ｅと望ましい出
力値の誤差ｅｊ　を通常の逆誤差伝搬学習の場合と同様
に式（３）　、式（４）　に従って算出する。【００５１】また、荷重差分算出手段１２は結合荷重の
初期設定値Ｗｉｊ（ｋ）　を記憶しておき、現在の結合
荷重の値ｗｉｊ（ｋ）　を用いて、式（１９）の右辺の
第２項である−２εηｉｊ（ｋ）　ｗｉｊ（ｋ）　（ｗ
ｉｊ（ｋ）　−Ｗｉｊ（ｋ）　）を求め、結合荷重のず
れを求める。【００５２】結合荷重調整手段１３は従来の逆誤差伝搬
学習方式と同様に式（１９）の右辺の第１項である結合
荷重の調整量のΔｗｉｊ（ｋ）　を算出し、荷重差分算
出手段１２で算出された右辺第２項のずれの値である−
２εηｉｊ（ｋ）　ｗｉｊ（ｋ）　（ｗｉｊ（ｋ）　−
Ｗｉｊ（ｋ）　）と加算し、結合荷重調整量Δｗ’ｉｊ
（ｋ）　を算出し、誤差算出手段１１の式（３）、式（
４）で算出された出力誤差ｅｊ　を用いて結合荷重の調
整を行う。【００５３】以上の誤差算出手段１１による誤差算出、
荷重差分算出手段１２による結合荷重の初期設定値から
のずれの算出、及び、結合荷重調整手段１３による結合
荷重の調整を学習サンプルデ−タ１４をニュ−ラルネッ
トワ−クに繰り返し与えて実行し、誤差が一定値以下に
なったとき、学習を終了する。【００５４】【発明の効果】上記により本発明によれば、予め与えら
れた概ね正確であると考えられる分類規則の等価の分類
機能を有するように初期設定されたニュ−ラルネットワ
−クを学習サンプルデ−タを用いて調整する際に、学習
サンプルデ−タに対する出力誤差と結合荷重の初期設定
値からのずれの和が減少するように結合荷重を調整する
ように構成したので、初期に与えた分類機能を極力保持
したまま、学習サンプルに対しても誤差が減少するよう
に学習を行うことができ、従来における過剰学習を回避
し、分類精度を向上することができる。

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Field of Industrial Application] The present invention relates to a learning method for a neural network, and in particular to a learning method that constructs a neural network with high classification accuracy from a rough classification rule for the data to be classified together with training sample data.

[0002] [Prior Art] Multilayer neural networks have conventionally been used for data classification. FIG. 3 shows the configuration of a multilayer neural network. In the figure, an input layer 301 receives the feature values of the data to be classified, and an output layer 302 outputs the classification result. An intermediate layer 303 lies between the input layer 301 and the output layer 302 and consists of one or more layers. The units of each layer are connected to the units of the layers immediately before and after it. Of the two units at the ends of a connection, the one in the earlier layer is called the start unit and the one in the later layer is called the end unit. In FIG. 3, for example, the units 30a, 30b, 30c, and 30d of the input layer 301 are start units, and the units 31a, 31b, and 31c in the layer that follows them are the corresponding end units. The output of each unit is determined by the following equations.

[0003]
$$o_j^{(k)} = \tanh\left(a_j^{(k)}\right) \tag{1}$$

[0004] [Formula 1]

[0005] Here $o_j^{(k)}$ is the output value of the j-th unit of layer k (k ≥ 1, with k = 1 the input layer), $w_{ij}^{(k)}$ is the connection weight from the i-th unit of layer k−1 to the j-th unit of layer k, and $N^{(k-1)}$ is the total number of units in layer k−1. $w_{0j}^{(k)}$ is the connection weight that gives a bias to the j-th unit of layer k, and the output value $o_0^{(k-1)}$ that feeds it from layer k−1 is always taken to be 1. Each unit of the input layer 301 (k = 1) passes the feature value of the input data to be classified unchanged to the next layer.

[0006]
For a neural network of this kind to classify data, the connection weights and bias values between the units must be set so that, when the feature values of the data to be classified are given to the input layer units, only the output layer unit corresponding to the category to which the data belongs outputs a high value and the other output layer units output low values.

[0007] As a first conventional method for this purpose, the error backpropagation learning method is known: the connection weights and bias values are initialized to random values and then repeatedly adjusted by small amounts so as to reduce the error between the actual output values and the desired output values obtained when the feature values of training sample data with known classification results are input.

[0008] FIG. 4 is a diagram for explaining the first conventional method. The learning means 402 consists of an error calculation means 410, which calculates the error between the actual and desired output values when the feature values 403 of the training sample data are input to the neural network 401 before learning, and a connection weight adjustment means 412, which uses error backpropagation to repeatedly adjust the connection weights and bias values by small amounts so that the error calculated by the error calculation means 410 decreases.

[0009] The neural network 401 before learning is set in the learning means 402, and the feature values and classification results 403 of the training sample data are input to that network. The learning means 402 then performs learning with the error calculation means 410 and the connection weight adjustment means 412 and outputs the result as the trained neural network 404.

[0010] The operation of the learning means 402 is now described in detail. The error calculation means 410 of the learning means 402 calculates the squared error between the output values of the neural network, obtained when the feature values 403 of the training sample data are input to its input layer, and the desired output values:

[0011] [Formula 2]

[0012] where the output error of the j-th output unit is

$$e_j = o_j^{(K)} - y_j \tag{4}$$

Here K is the layer number of the output layer, that is, the total number of layers in the neural network, and $y_j$ is the desired output value of the j-th output layer unit for the training sample data 403, that is, the correct classification result.

[0013] Next, the weight change adjustment means 411 adjusts the connection weights and bias values of the neural network by small amounts according to the following equations so that the calculated error decreases. The adjustment $\Delta w_{ij}^{(k)}$ of a connection weight or bias value is the product of the error signal $d_j^{(k)}$ of the end unit and the output value $o_i^{(k-1)}$ of the start unit:

[0014]
$$\Delta w_{ij}^{(k)} = -\varepsilon_1\, d_j^{(k)}\, o_i^{(k-1)} \tag{5}$$

Here $\varepsilon_1$ is a parameter that determines the size of the adjustment in one iteration. When layer k is the output layer (k = K), the error signal of the end unit is calculated as follows.

[0015]
$$d_j^{(K)} = e_j \left(1 - o_j^{(K)}\right)\left(1 + o_j^{(K)}\right) \tag{6}$$

Here $e_j$ is the output error calculated by the error calculation means 410 according to equation (4).

[0016] When layer k is an intermediate layer (1 < k < K), the error signal $d_j^{(k)}$ is obtained from the following equation.

[0017] [Formula 3]

[0018] The error calculation by the error calculation means 410 and the adjustment of the connection weights and bias values by the connection weight adjustment means 412 are executed repeatedly while the training sample data 403 are presented, and learning ends when the error falls to or below a fixed value.

[0019] The second conventional method is described next. When a data classification rule that is considered roughly correct is known, this method initializes the connection weights and bias values of the neural network so that the network has a classification function equivalent to that rule, and then adjusts the connection weights and bias values by error backpropagation using training sample data.

[0020] This method is described by the present inventors in Japanese Patent Application No. Hei 2-15570, "Neural Net Learning Method". In it, a known data classification rule that classifies data according to the presence or absence of features in the training data is converted into a logical expression; the product terms and sum terms of the expression are assigned to the units of a neural net; the structure and connection weights of the net are set so that it has a classification function equivalent to the logical expression; and the net is trained using example data. Likewise, a known data classification rule that classifies the training data by comparing its input feature values with constants is converted into a logical expression; the comparison terms, product terms, and sum terms of the expression are assigned to the units of a neural net; the structure and connection weights of the net are set so that it has a classification function equivalent to the logical expression; and the net is trained using training data.

[0021] FIG. 5 is a diagram for explaining the second conventional method. Parts that are the same as in FIG. 4 are given the same reference numerals, and their description is omitted.

[0022] The logical operation conversion means 502 converts the data classification rule 501 into a logical expression. The connection weight setting means 503 sets the structure and connection weights of the neural network so that it operates equivalently to the resulting logical expression. The learning means 402 takes the neural network obtained from the connection weight setting means 503, inputs the feature values 403 of the training sample data, and, using the same error backpropagation as in the first conventional method, adjusts the connection weights and bias values by small amounts so that the error between the actual and desired output values decreases.

[0023] The operation of each means is described next. Suppose the following rules are given as the known data classification rule 501.

[0024]
IF (x1 > a1) and (x2 > a2) THEN y    (8)
IF (x3 > a3) and (x4 < a4) THEN y    (9)

Rule (8) means "if the feature value x1 of the data to be classified is greater than the constant a1 and the feature value x2 is greater than the constant a2, the data belongs to category y". Rule (9) means "if the feature value x3 of the data to be classified is greater than the constant a3 and the feature value x4 is less than the constant a4, the data belongs to category y".

[0025] From (8) and (9), the logical operation conversion means 502 generates the following logical expression, which means "y is true if (x1 is greater than a1 and x2 is greater than a2) or (x3 is greater than a3 and x4 is less than a4)". Here "·" denotes logical AND, "+" denotes logical OR, and "¬" denotes logical negation.

[0026]
$$y = (x_1 > a_1)\cdot(x_2 > a_2) + (x_3 > a_3)\cdot\lnot(x_4 > a_4) \tag{10}$$

The operation of the connection weight setting means 503 is described next. FIG. 6 is a diagram for explaining the connection weight setting means 503 in the second conventional method. The connection weight setting means 503 determines the connection structure and connection weights of the neural network 602 according to the logical expression 601 (equation (10)) supplied by the logical operation conversion means 502. As shown at 602 in the figure, one input layer unit is assigned to each variable appearing on the right-hand side of the logical expression 601, one first-intermediate-layer unit to each comparison between a variable and a constant, and one second-intermediate-layer unit to each product term, and the output layer realizes the sum of all product terms.

[0027] The method of determining the connection weight w is described in the above-mentioned Japanese Patent Application No. Hei 2-185570, "Neural Net Learning Method", which sets the structure and connection weights of a neural net so that it has a classification function equivalent to a logical expression.

[0028] The equations for determining the connection weight w are as follows. First, the connection weights $w_i$ and bias $w_0$ of an addition unit (output layer) are determined. Consider an n-input addition unit. An input signal $I_i$ (1 ≤ i ≤ n) is "false" when $-1 \le I_i \le -d$ and "true" when $d \le I_i \le 1$. The output signal o is "false" when $-1 \le o \le -d'$ and "true" when $d' \le o \le 1$. Under these conditions, the addition function is realized by setting the connection weights $w_i$ (1 ≤ i ≤ n) and the bias $w_0$ as follows.

[0029]
$$w_i = w \ge \frac{2\tanh^{-1}(d')}{d(n+1) - (n-1)} \tag{11}$$
$$w_0 = \frac{w(n-1)(d+1)}{2} \tag{12}$$
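
Equations (11) and (12) can be checked numerically. The sketch below picks the smallest w allowed by (11) and evaluates the tanh unit at every combination of boundary inputs. The function names and the example values n = 3, d = 0.8, d' = 0.9 are illustrative assumptions, not part of the patent text.

```python
import math
from itertools import product

def or_unit_params(n, d, d_out):
    """Smallest weight w and the bias w0 for an n-input addition (OR) unit.

    Hypothetical helper: follows equations (11) and (12); d_out stands for d'.
    """
    if not d > (n - 1) / (n + 1):
        raise ValueError("d must exceed (n - 1) / (n + 1)")
    w = 2 * math.atanh(d_out) / (d * (n + 1) - (n - 1))  # equation (11), equality case
    w0 = w * (n - 1) * (d + 1) / 2                       # equation (12)
    return w, w0

def unit_output(inputs, w, w0):
    """Output of a tanh unit whose inputs all share the weight w."""
    return math.tanh(w0 + w * sum(inputs))

if __name__ == "__main__":
    n, d, d_out = 3, 0.8, 0.9
    w, w0 = or_unit_params(n, d, d_out)
    # Inputs in [d, 1] count as true, inputs in [-1, -d] as false.
    for inputs in product([-1.0, -d, d, 1.0], repeat=n):
        print(inputs, round(unit_output(inputs, w, w0), 4))
```

Because the unit's activation is monotone in the sum of its inputs, checking the boundary values -1, -d, d, and 1 covers the worst cases of both input ranges.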
When an input to the addition unit is a negated term (b in y = a + ¬b), $w_i = -w$ is used for that input. In addition, d must satisfy $d > (n-1)/(n+1)$.

[0030] Next, the method of determining the connection weights and bias of a multiplication unit (second intermediate layer) is described. n, d, and d' are defined as for the addition unit. The multiplication function is realized by setting the connection weights $w_i$ (1 ≤ i ≤ n) and the bias $w_0$ as follows.

[0031]
$$w_i = w \ge \frac{2\tanh^{-1}(d')}{d(n+1) - (n-1)} \tag{13}$$
$$w_0 = \frac{w(1-n)(d+1)}{2} \tag{14}$$

When an input is a negated term (b in y = a·¬b), $w_i = -w$ is used for that input. Again, d must satisfy $d > (n-1)/(n+1)$.

[0032] Finally, the method of determining the connection weight and bias of a comparison unit (first intermediate layer) is described. For a unit that outputs "true" when the input I is greater than the constant A, the connection weight $w_1$ and the bias value $w_0$ are related as follows.

[0033]
$$w_0 = -w_1 A \tag{15}$$

Units and connections that do not correspond to the logical expression may also be present. The connection weights of such units and connections are set to random values whose absolute values are sufficiently smaller than the connection weights determined above.

[0034] The learning means 402 then adjusts the connection weights and bias values of the neural network obtained in this way by the connection weight setting means 503, using the same error backpropagation as in the first conventional method.

[0035] [Problems to be Solved by the Invention] The first conventional method, however, uses error backpropagation to learn so that the error between the actual and desired output values of the neural network for the training sample data decreases. Repeated learning therefore yields a neural network that classifies the training sample data correctly, but there is no guarantee of correct classification results when unknown data other than the training sample data are input. In particular, when the training sample data are biased or too few, the classification function of the neural network becomes specialized to classifying only the training sample data.

[0036] In the second conventional method, the connection weights of the neural network are initialized to give a classification function equivalent to a known classification rule that is considered roughly correct, and the training sample data are then learned by error backpropagation as in the first conventional method. Classification accuracy on unknown data can therefore be expected to improve over the first conventional method. As in the first conventional method, however, when the training sample data are biased or contain noise, error backpropagation changes the initialized connection weights substantially. This causes over-learning, in which learning fails to converge or requires an enormous learning time, and classification accuracy falls.

[0037] The present invention has been made in view of the above points.
Its object is to provide a neural network learning method in which, when a neural network is adjusted using training sample data, learning reduces the error on the training samples while preserving the initially given classification function as far as possible, so that classification accuracy improves without over-learning.

[0038] [Means for Solving the Problems] FIG. 1 shows the principle of the present invention. In a neural network learning method having a learning means 10 that trains a neural network using training sample data 14 whose classification results are known, the learning means 10 has an error calculation means 11 that calculates the output error of the neural network for the training sample data 14, a weight difference calculation means 12 that calculates a weight difference, which is the deviation of the connection weights and bias values of the neural network from their initial set values, and a connection weight adjustment means 13 that adjusts the connection weights and bias values so that the sum of the output error and the weight difference decreases.

[0039] [Operation] In the present invention, when the connection weights of a neural network that has been initialized to have a classification function equivalent to a previously given, roughly accurate classification rule are adjusted to reduce the output error for the training sample data, they are adjusted so that the sum of the output error for the training sample data and the weight difference, which is the deviation of the connection weights from their initial set values, decreases. That is, learning reduces the error on the training samples while preserving the initially given classification function as far as possible.

[0040] [Embodiment] FIG. 2 shows the configuration of one embodiment of the present invention. Parts that are the same as in FIGS. 1 and 5 are given the same reference numerals, and their description is omitted. In the figure, the weight difference calculation means 12 calculates the difference between the connection weights and their initial set values. The rest of the configuration is the same as that of the second conventional method shown in FIG. 5.

[0041] The operation of the present invention is described below. The known data classification rule 501 is converted into a logical expression (equation (10)) by the logical operation conversion means 502 and sent to the connection weight setting means 503. The connection weight setting means 503 sets the structure and connection weights of the neural network according to the logical expression (equation (10)). The operation up to this point is the same as in the second conventional method.

[0042] The operating principle of the learning means 10 is described next. For the neural network initialized by the connection weight setting means 503, the learning means 10 changes the values of the connection weights so as to decrease the evaluation function E' given by the following equation.

[0043] [Formula 4]

[0044] Here E on the right-hand side is the squared error, given in equation (3), between the output values of the neural network and the desired output values when the feature values of the training sample data are input.

[0045] The second term on the right-hand side represents the deviation of the connection weights from their initial set values. That is, the learning means 10 adjusts the connection weights so that the sum of the squared error of the network's output values and the deviation of the connection weights from their initial set values decreases. In equation (16), $W_{ij}^{(k)}$ is the initial set value of the connection weight $w_{ij}^{(k)}$. f(x, y) is a function that increases as the difference between x and y increases, for example $|x - y|$ or $(x - y)^2$. $\eta_{ij}^{(k)}$ is a parameter that determines how much the difference of each connection weight contributes to the evaluation function E'; it is normally a constant η for all connection weights $w_{ij}^{(k)}$ or a constant $\eta^{(k)}$ for each layer.

[0046] The change $\Delta w'^{(k)}_{ij}$ in the connection weight $w_{ij}^{(k)}$ that decreases the evaluation function E' is given by the following equations.

[0047] [Formula 5]

[0048] Here $\Delta w_{ij}^{(k)}$ is the adjustment of the connection weight in ordinary error backpropagation, calculated by equation (5). From equations (17) and (18), when for example $f(x, y) = (x - y)^2$,

$$\Delta w'^{(k)}_{ij} = \Delta w_{ij}^{(k)} - 2\varepsilon\,\eta_{ij}^{(k)}\, w_{ij}^{(k)} \left(w_{ij}^{(k)} - W_{ij}^{(k)}\right) \tag{19}$$

[0049] The learning means 10 operates as described above. The following describes the case $f(x, y) = (x - y)^2$, that is, the case in which the connection weights are adjusted according to equation (19).

[0050] The error calculation means 11 calculates, according to equations (3) and (4) as in ordinary error backpropagation, the error E and the output errors $e_j$ between the output values of the neural network and the desired output values when the feature values 14 of the training sample data are input to the input layer of the neural network.

[0051] The weight difference calculation means 12 stores the initial set values $W_{ij}^{(k)}$ of the connection weights and, using the current values $w_{ij}^{(k)}$, computes the second term on the right-hand side of equation (19), $-2\varepsilon\,\eta_{ij}^{(k)}\, w_{ij}^{(k)} (w_{ij}^{(k)} - W_{ij}^{(k)})$, thereby obtaining the deviation of the connection weights.

[0052] As in conventional error backpropagation, the connection weight adjustment means 13 calculates the adjustment $\Delta w_{ij}^{(k)}$, the first term on the right-hand side of equation (19), adds it to the value of the second term calculated by the weight difference calculation means 12, $-2\varepsilon\,\eta_{ij}^{(k)}\, w_{ij}^{(k)} (w_{ij}^{(k)} - W_{ij}^{(k)})$, to obtain the connection weight adjustment $\Delta w'^{(k)}_{ij}$, and adjusts the connection weights using the output errors $e_j$ calculated by the error calculation means 11 according to equations (3) and (4).

[0053] The error calculation by the error calculation means 11, the calculation of the deviation of the connection weights from their initial set values by the weight difference calculation means 12, and the adjustment of the connection weights by the connection weight adjustment means 13 are executed repeatedly while the training sample data 14 are presented to the neural network, and learning ends when the error falls to or below a fixed value.

[0054] [Effects of the Invention] As described above, according to the present invention, when a neural network that has been initialized to have a classification function equivalent to a previously given, roughly accurate classification rule is adjusted using training sample data, the connection weights are adjusted so that the sum of the output error for the training sample data and the deviation of the connection weights from their initial set values decreases. Learning can therefore reduce the error on the training samples while preserving the initially given classification function as far as possible, which avoids the over-learning of the conventional methods and improves classification accuracy.
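
Equations (2), (3), (7), and (16) appear only as image placeholders in the text of the description. The block below is a hedged reconstruction from the surrounding definitions, not the patent's own typesetting. The summation limits, the factor 1/2 in E (chosen so that equations (5) and (6) equal $-\varepsilon_1\,\partial E/\partial w_{ij}^{(k)}$), the index l over the units of layer k+1, and the equation numbers (2), (3), and (7) are assumptions inferred from context.

```latex
\begin{align}
a_j^{(k)} &= \sum_{i=0}^{N^{(k-1)}} w_{ij}^{(k)}\, o_i^{(k-1)} \tag{2} \\
E &= \frac{1}{2} \sum_{j} e_j^{2} \tag{3} \\
d_j^{(k)} &= \bigl(1 - o_j^{(k)}\bigr)\bigl(1 + o_j^{(k)}\bigr) \sum_{l} w_{jl}^{(k+1)}\, d_l^{(k+1)} \tag{7} \\
E' &= E + \sum_{k} \sum_{i,j} \eta_{ij}^{(k)}\, f\bigl(w_{ij}^{(k)},\, W_{ij}^{(k)}\bigr) \tag{16}
\end{align}
```

With these forms, $\partial E/\partial a_j^{(K)} = e_j (1 - o_j^{(K)})(1 + o_j^{(K)})$, which agrees with the error signal of equation (6).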

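
The learning procedure of paragraphs [0042] to [0053] can be sketched in a few lines of pure Python: ordinary error backpropagation plus a pull of every weight toward its initial set value. The function names, the list-of-lists weight layout, and the values of `eps` and `eta` are illustrative assumptions. The pull term is the plain gradient step of $\eta (w - W)^2$, that is $-2\varepsilon\eta(w - W)$; equation (19) as printed additionally carries a factor $w_{ij}^{(k)}$, which this sketch does not include. The intermediate-layer error signal uses the standard backpropagation rule, assumed here because equation (7) is not reproduced in the text.

```python
import math

def forward(weights, x):
    """Return the outputs of every layer; weights[k][j] = [bias, w_1, ..., w_N]."""
    outputs = [list(x)]  # the input layer passes its features through unchanged
    for layer in weights:
        prev = [1.0] + outputs[-1]  # o_0 = 1 feeds the bias weight w_0j
        outputs.append([math.tanh(sum(w * o for w, o in zip(unit, prev)))
                        for unit in layer])
    return outputs

def train_step(weights, init_weights, x, y, eps=0.05, eta=0.1):
    """Update weights in place for one sample (x, y), pulling toward init_weights."""
    outputs = forward(weights, x)
    # Output-layer error signal, equations (4) and (6).
    d = [(o - t) * (1 - o) * (1 + o) for o, t in zip(outputs[-1], y)]
    deltas = [None] * len(weights)
    for k in range(len(weights) - 1, -1, -1):
        deltas[k] = d
        if k > 0:
            # Standard backpropagated error signal for the layer below (assumed form).
            d = [(1 - o) * (1 + o)
                 * sum(unit[i + 1] * dj for unit, dj in zip(weights[k], deltas[k]))
                 for i, o in enumerate(outputs[k])]
    for k, layer in enumerate(weights):
        prev = [1.0] + outputs[k]
        for j, unit in enumerate(layer):
            for i in range(len(unit)):
                backprop = -eps * deltas[k][j] * prev[i]  # equation (5)
                pull = -2 * eps * eta * (unit[i] - init_weights[k][j][i])
                unit[i] += backprop + pull

if __name__ == "__main__":
    import copy
    initial = [[[0.0, 2.0, 2.0]]]  # one tanh unit: bias 0, two weights of 2
    net = copy.deepcopy(initial)
    for _ in range(50):
        # A single sample that contradicts the initial behaviour of the unit.
        train_step(net, initial, [1.0, 1.0], [-1.0], eta=1.0)
    print(net)
```

Setting `eta=0.0` reduces `train_step` to ordinary backpropagation, which makes it easy to compare how far the weights drift from their initial values with and without the pull term.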
[Brief Description of the Drawings]

【図１】本発明の原理構成図である。FIG. 1 is a diagram showing the principle configuration of the present invention.

【図２】本発明の一実施例の構成図である。FIG. 2 is a configuration diagram of an embodiment of the present invention.

【図３】多層構造型ニュ−ラルネットワ−クの構成図である。FIG. 3 is a configuration diagram of a multilayer neural network.

【図４】従来の第１の方法を説明するための図である。FIG. 4 is a diagram for explaining a first conventional method.

【図５】従来の第２の方法を説明するための図である。FIG. 5 is a diagram for explaining a second conventional method.

【図６】従来の第２の方法における結合荷重設定手段を説明するための図である。FIG. 6 is a diagram for explaining the connection weight setting means in the second conventional method.

[Explanation of symbols]

10: learning means (学習手段)
11: error calculation means (誤差算出手段)
12: weight difference calculation means (荷重差分算出手段)
13: connection weight adjustment means (結合荷重調整手段)
14: training sample data (学習サンプルデ−タ)
501: known data classification rule (既知のデ−タ分類規則)
502: logical operation conversion means (論理演算変換手段)
503: connection weight setting means (結合荷重設定手段)

Claims

[Claims]

1. A neural network learning method comprising a learning means for training a neural network using training sample data whose classification results are known, wherein the learning means comprises: an error calculation means for calculating an output error of the neural network with respect to the training sample data; a weight difference calculation means for calculating a weight difference that is the deviation of the connection weights and bias values of the neural network from their initial set values; and a connection weight adjustment means for adjusting the connection weights and bias values so that the sum of the output error and the weight difference decreases.