JP3631443B2

JP3631443B2 - Neural network connection weight optimization apparatus, method and program

Info

Publication number: JP3631443B2
Application number: JP2001155478A
Authority: JP
Inventors: 佐藤　　誠
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2001-05-24
Filing date: 2001-05-24
Publication date: 2005-03-23
Anticipated expiration: 2021-05-24
Also published as: JP2002352215A

Description

【０００１】
【発明の属する技術分野】
本発明は、ニューラルネットの好ましさを決定するモデル評価関数に基づいてニューラルネットの結合重みを最適化するためのニューラルネット結合重み最適化装置及び方法に関する。
【０００２】
【従来の技術】
ニューラルネットは、高度な情報処理機能を人工的に実現するために開発されたコンピュータ技術である。すなわち、ニューラルネットは、コンピュータを使用して脳の神経回路などを模擬した人工知能の技術であり、情報を入力するための入力素子の集まりである入力層、この入力から答えがどのような値になるのかを出力する出力素子の集まりである出力層、およびそれらの層の中間にある中間素子の集まりである中間層から構成され、各層の各素子は、結合重みパラメータを持つ多くの結合によって連結しあっている。
【０００３】
ニューラルネットは、何らかのモデル評価関数に基づいて、高い評価関数値（評価関数によっては低い評価関数値）をとるような結合重みパラメータの最適化を行うことにより、高度な情報処理機能を獲得することができる。
【０００４】
ニューラルネットの結合重みパラメータの最適化方法としては、誤差逆伝播法（例えば、Rumelhart, D. E., G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature 323, pp. 533-536, 1986）などがあるが、すべてのパラメータに対して偏微分計算が可能であるなどモデル評価関数に多くの制約が必要になるという問題点がある。
【０００５】
また、一般的なモデル評価関数を対象とするパラメータ最適化法としては、遺伝的アルゴリズム（例えば、Goldberg, D. E., "Genetic Algorithms in Search, Optimization, and Machine Learning," Addison-Wesley Publishing Company Inc., 1989）などがあるが、ニューラルネットの結合重み値を単に集め実数ベクトルとするため、十分な最適化性能が得られないという問題点がある。
【０００６】
【発明が解決しようとする課題】
ニューラルネットの結合重みパラメータをモデル評価関数に対して最適化することにより高度な情報処理機能を獲得するための従来の技術として、誤差逆伝播法などのニューラルネットの結合重み最適化に特化した方法は、一般的なモデル評価関数に用いることはできないという問題があり、単にニューラルネットの結合重みを集めた実数ベクトルを用いた遺伝的アルゴリズムなどの方法は、一般的なモデル評価関数を用いることができるが、十分な最適化性能が得られないという問題があった。
【０００７】
本発明は、上記事情を考慮してなされたもので、一般的なモデル評価関数に基づくニューラルネットの結合重みの最適化として従来よりも優れた最適化を可能とするニューラルネット結合重み最適化装置及び方法を提供することを目的とする。
【０００８】
【課題を解決するための手段】
本発明は、複数のネットワーク素子及び０又は１つ以上のバイアス素子からなる所定の結合構造を有するニューラルネットの結合重みを最適化するニューラルネット結合重み最適化方法であって、１つ又は複数のニューラルネットと、当該ニューラルネットの結合重みを所定の評価関数に基づいて評価した評価結果とを対応付けて解候補ニューラルネット集合記憶手段に記憶する第１のステップと、前記解候補ニューラルネット集合記憶手段に記憶された前記ニューラルネットから選択した所定数のニューラルネットの各々について、当該ニューラルネットの全結合重みのうち同一のネットワーク素子への入力に係る結合重みごとに、ネットワーク素子の機能を考慮した所定の符号化関数変換を施して符号化し、各ネットワーク素子について得られた符号を連結して、符号化ニューラルネットを生成する第２のステップと、この第２のステップで生成された前記符号化ニューラルネットをもとにして、所定数の符号化ニューラルネットを新たに生成する第３のステップと、この第３のステップで生成された前記符号化ニューラルネットの各々について、当該符号化ニューラルネットの全符号のうち各々のネットワーク素子に対応する符号部分ごとに、ネットワーク素子の機能を考慮した所定の逆関数変換を施して、前記解候補ニューラルネット集合記憶手段に記憶されている前記ニューラルネットと同じ形式のニューラルネットを新たに生成する第４のステップと、この第４のステップで生成された前記ニューラルネットの各々について前記モデル評価関数に基づく評価を行って評価結果を求め、少なくとも、該評価結果に基づいて該ニューラルネット及びその評価結果を前記解候補ニューラルネット集合記憶手段に追加するか否かを判断し、追加すると判断された該ニューラルネットとその評価結果とを対応付けて前記解候補ニューラルネット集合記憶手段に記憶する第５のステップとを有することを特徴とする。
【０００９】
好ましくは、前記第５のステップでは、さらに、前記解候補ニューラルネット集合記憶手段に記憶されている既存の前記ニューラルネット及びその評価結果のうち所定の条件を満たすものを該解候補ニューラルネット集合記憶手段から削除するようにしてもよい。
【００１０】
好ましくは、前記第２のステップから前記第５のステップを、前記第５のステップの後において所定の終了条件が成立したと判断されるまで、繰り返し実行し、前記第５のステップの後において所定の終了条件が成立したと判断された場合には、前記解候補ニューラルネット集合記憶手段に記憶されている前記ニューラルネットのうち、その時点で最も良い前記評価結果を持つものを、最適化されたニューラルネットとして選択するようにしてもよい。
【００１１】
好ましくは、前記第２のステップでは、前記ニューラルネットの前記同一のネットワーク素子への入力に係る結合重みに対して前記所定の符号化関数変換を施すにあたっては、当該ネットワーク素子への入力に係る全結合重みのうちバイアス素子からの入力に係る結合重み以外の結合重みからなる結合重みベクトルのベクトル長を求め、該ベクトル長又は該ベクトル長を正数倍した値に基づいて当該ネットワーク素子への入力に係る全結合重みの各々に所定の演算を施して得た値と、該ベクトル長又は該ベクトル長を正数倍した値とを、符号として連結することにより、前記符号化ニューラルネットを生成するようにしてもよい。好ましくは、前記所定の演算は、前記結合重みを、前記ベクトル長又は前記ベクトル長を正数倍した値で除する演算であるようにしてもよい。
【００１２】
好ましくは、前記第４のステップでは、前記符号化ニューラルネットの全符号のうち一つの前記ネットワーク素子に対応する符号部分に対して前記所定の逆関数変換を施すにあたっては、該符号部分のうち前記ベクトル長又は前記ベクトル長を正数倍した値に対応する符号値に基づいて、該符号部分のうち前記結合重みの各々に所定の演算を施して得た値に対応する符号値に対してそれぞれ所定の演算を施すことによって、新ニューラルネットにおける該当する結合重みを求めるようにしてもよい。好ましくは、前記所定の演算は、前記符号部分のうち前記ベクトル長又は前記ベクトル長を正数倍した値に対応する符号値に対して所定の修正を施した修正値を求め、前記符号部分のうち前記結合重みの各々に所定の演算を施して得た値に対応する符号値に対してそれぞれ該修正値を乗ずる演算であるようにしてもよい。好ましくは、前記所定の修正は、前記値に対してその絶対値をとる修正、または前記値が予め定められた正数のしきい値未満である場合に該値を該予め定められた正数のしきい値とする修正であるようにしてもよい。
【００１３】
好ましくは、前記第３のステップでは、遺伝的アルゴリズムによって、前記符号化ニューラルネットを生成するようにしてもよい。
【００１４】
なお、装置に係る本発明は方法に係る発明としても成立し、方法に係る本発明は装置に係る発明としても成立する。
また、装置または方法に係る本発明は、コンピュータに当該発明に相当する手順を実行させるための（あるいはコンピュータを当該発明に相当する手段として機能させるための、あるいはコンピュータに当該発明に相当する機能を実現させるための）プログラムとしても成立し、該プログラムを記録したコンピュータ読取り可能な記録媒体としても成立する。
【００１５】
さて、ニューラルネットの機能は、各ネットワーク素子の機能を集めることで実現されており、各ネットワーク素子の機能に影響しているのは、その素子に入力される結合重みのみと考えることができる。本発明では、ニューラルネットの全結合重みを各ネットワーク素子に入力される結合重みの部分集合ごとにグループ化し、符号化関数変換を施すことで、符号化ニューラルネットを生成し、符号化されたニューラルネット集合から新たに生成された新符号化ニューラルネットを各ネットワーク素子に入力される結合重みの部分集合ごとに逆符号化関数変換を施すことで、新ニューラルネット集合を生成するので、一般的なモデル評価関数に基づくニューラルネットの結合重みの最適化として従来よりも優れた最適化が可能になる。また、各ネットワーク素子の機能ごとに最適化を行うことができる。また、本発明によれば、機能的には似ているが加重値は大きく異なるニューラルネット素子を、似た符号ベクトルに変換できるので、各ネットワーク素子の機能上の重複を避けた結合重みパラメータの最適化を行うことができ、高性能な最適化が実現できる。
【００１６】
【発明の実施の形態】
以下、図面を参照しながら発明の実施の形態を説明する。
【００１７】
図１に、本発明の一実施形態に係るニューラルネットの結合重みパラメータに関する最適化処理を行うニューラルネット結合重み最適化装置の構成例を示す。
【００１８】
図１に示されるように、本実施形態のニューラルネット結合重み最適化装置は、解候補ニューラルネット集合記憶部１０１、モデル評価関数記憶部１０２、ニューラルネット素子符号化部１０３、符号化ニューラルネット集合記憶部１０４、新符号化ニューラルネット生成部１０５、新符号化ニューラルネット集合記憶部１０６、ニューラルネット素子復号化部１０７、新ニューラルネット集合記憶部１０８、新ニューラルネット評価・選択部１０９を備えている。
【００１９】
このニューラルネット結合重み最適化装置は、ソフトウェアによって実現することができる（すなわち計算機上でプログラムを実行する形で実現することができる）。その際、そのソフトウェアの一部または全部の機能をチップ化あるいはボード化して該計算機に組み込んで実現することもできる。また、このニューラルネット結合重み最適化装置は、ソフトウェアによって実現する場合には、他のソフトウェアの一機能として組み込むようにすることも可能である。また、このニューラルネット結合重み最適化装置を専用のハードウェアとして構成することも可能である。
【００２０】
解候補ニューラルネット集合記憶部１０１、モデル評価関数記憶部１０２、符号化ニューラルネット集合記憶部１０４、新符号化ニューラルネット集合記憶部１０６、新ニューラルネット集合記憶部１０８は、いずれも、例えばハードディスクや光ディスクや半導体メモリなどの記憶装置によって構成される。なお、各記憶部は、別々の記憶装置によって構成されていてもよいし、それらの全部または一部が同一の記憶装置によって構成されていてもよい。
【００２１】
なお、図１では省略しているが、ニューラルネット結合重み最適化装置は、外部とデータをやり取りするための入出力装置を備えている。もちろん、ＧＵＩ（グラフィカル・ユーザ・インタフェース）を備えてもよいし、ネットワーク接続インタフェースを備えてもよい。
【００２２】
図２に、本ニューラルネット結合重み最適化装置の全体的な処理手順の一例を示す。
【００２３】
あらかじめ、モデル評価関数記憶部１０２には、ニューラルネットを評価するためのモデル評価関数を示すデータが格納されている。モデル評価関数には、所望の一般的なモデル評価関数を用いることができる。
【００２４】
また、解候補ニューラルネット集合記憶部１０１には、予め作成された初期的なニューラルネットを示すデータが１つ以上格納されている。この初期的なニューラルネットの生成方法は、従来の方法に従って構わない（例えば、ニューラルネットの構造はユーザが作成あるいは選択し、結合重みは所定のアルゴリズムでランダムに生成する）。なお、解候補ニューラルネット集合記憶部１０１に予め格納されているニューラルネットの各々については、ステップＳ４で新ニューラルネットに対して行う評価と同様の方法による評価がなされ、その評価結果（例えば、評価関数値）が各ニューラルネットに対応付けて記憶されているものとする。
【００２５】
まず、ニューラルネット素子符号化部１０３は、解候補ニューラルネット集合記憶部１０１からｎ個（ｎは予め定められたもしくは指定された１以上の整数）のニューラルネットを取り出し、取り出したｎ個のニューラルネットの各々について、当該ニューラルネットにおける同一素子に入力している結合重みを同一グループとして扱い所定の符号化関数変換を施すことによって、ｎ個の符号化ニューラルネットを生成し、それらを符号化ニューラルネット集合記憶部１０４に格納する（ステップＳ１）。
【００２６】
次に、新符号化ニューラルネット生成部１０５は、このステップＳ１で生成され符号化ニューラルネット集合記憶部１０４に格納された符号化ニューラルネットをもとにして、例えば遺伝的アルゴリズムなどの所定の方法に従って、新符号化ニューラルネットをｍ個（ｍは予め定められたもしくは指定された１つ以上の整数）生成し、それらを新符号化ニューラルネット集合記憶部１０６に格納する（ステップＳ２）。
【００２７】
この生成方法としては、種々の方法を用いることが可能である。例えば、ｂｌｘ法（例えば、Eshelman, L. J., and Schaffer, J. D., "Real-Coded Genetic Algorithms and Interval-Schemata," Foundations of Genetic Algorithms 2, pp. 187-202, 1993）として知られる方法を用いてもよい。
【００２８】
次に、ニューラルネット素子復号化部１０７は、このステップＳ２で生成され新符号化ニューラルネット集合記憶部１０６に格納されたｍ個の新符号化ニューラルネットを取り出し、取り出したｍ個のニューラルネットの各々について、当該ニューラルネットにおける同一素子に対応している符号を同一グループとして扱い所定の逆関数変換を行うことによって、ｍ個の新ニューラルネットを生成し、それらを新ニューラルネット集合記憶部１０８に格納する（ステップＳ３）。
【００２９】
次に、新ニューラルネット評価・選択部１０９は、このステップＳ３で新ニューラルネット集合記憶部１０８に格納されたｍ個の新ニューラルネットの各々について、モデル評価関数記憶部１０２に格納された情報をもとにして評価を行い、この評価結果に基づいて、解候補ニューラルネット集合記憶部１０１に格納されている解候補ニューラルネットの全部又は一部と新ニューラルネット集合記憶部１０８に格納されたｍ個の新ニューラルネットの中から（あるいは新ニューラルネット集合記憶部１０８に格納されたｍ個の新ニューラルネットの中から）、１つ以上のニューラルネットを選択し、選択された新ニューラルネットのデータについては、これを該評価結果とともに新たな解候補ニューラルネットとして解候補ニューラルネット集合記憶部１０１に記憶する（ステップＳ４）。なお、選択された新ニューラルネットを解候補ニューラルネット集合記憶部１０１に記憶するにあたって、解候補ニューラルネット集合記憶部１０１から所定の基準で選択した１つ以上の解候補ニューラルネットを削除するようにしてもよい。
【００３０】
この評価・選択方法としては、種々の方法を用いることが可能である。例えば、ｍｇｇ法（例えば、佐藤、小野、小林、「遺伝的アルゴリズムにおける世代交代モデルの提案と評価」、人工知能学会誌、ｖｏｌ．１２、Ｎｏ．５、ｐｐ．７３４−７４４．、１９９７．）として知られる方法を用いてもよい。
【００３１】
ここで、予め定められた所定の終了条件が満たされたか否かを判断し（ステップＳ５）、満たされなかった場合には、ステップＳ１に戻って、同様の処理を繰り返し行う。
【００３２】
なお、終了条件としては、例えば、繰り返し回数が予め定められた回数に達したこと、当該回と前回の評価結果（例えば、評価関数値）の差分が予め定められた値を下回ったこと、当該回と前回の評価結果（例えば、評価関数値）の変化率が予め定められた値を下回ったこと、これらの各条件の論理和あるいは論理積による組合せなど、種々のものが考えられる。
【００３３】
終了条件が満たされた場合には、処理ループを抜けて、解候補ニューラルネット集合記憶部１０１に記憶されているニューラルネットのうち、最も良い評価結果を持つものを選択する。この選択されたニューラルネットが、本最適化処理により得られたニューラルネットとなる。
【００３４】
ここで、図２の手順において遺伝的アルゴリズムを利用した場合に、１回のループ処理は、例えば、図３のようになる。すなわち、当該回のステップＳ１で解候補ニューラルネット集合記憶部１０１から例えば２個のニューラルネットをランダムに選択して取り出してそれぞれのニューラルネットａ，ｂを符号化ニューラルネットに変換した後、ステップＳ２でそれら符号化ニューラルネットに遺伝的アルゴリズムを適用して例えば４個の新符号化ニューラルネットを生成し、次いでステップＳ３でそれらを復号化して新ニューラルネットｃ〜ｆに逆変換し新ニューラルネット集合記憶部１０８に格納する。次に、ステップＳ４で、新ニューラルネットｃ〜ｆを評価し、次いで、例えばもとの解候補ニューラルネットａ，ｂと新ニューラルネットｃ〜ｆについてそれらの評価結果に基づいて最も良い２つを選択し（例えばニューラルネットｅ，ｆとする）、それらが元のニューラルネット（ａ，ｂ）でなければ、選択したニューラルネットｅ，ｆで元のニューラルネットａ，ｂを更新する。もちろん、これは一例であって、遺伝的アルゴリズムを利用した場合の１回のループ処理としては他にも種々の方法が可能であり、また、遺伝的アルゴリズム以外のアルゴリズムを利用した様々な方法が可能である。
【００３５】
さて、以下では、本実施形態のニューラルネット結合重み最適化装置におけるニューラルネット素子符号化及びニューラルネット素子復号化について具体例を用いつつ説明する。
【００３６】
図４は、図１のニューラルネット結合重み最適化装置のうち解候補ニューラルネット集合記憶部１０１の記憶内容とニューラルネット素子符号化部１０３の処理と符号化ニューラルネット集合記憶部１０４の記憶内容を例示したものである。
【００３７】
前述したように、解候補ニューラルネット集合記憶部１０１には、１つ以上の解候補ニューラルネット（２０１）が、それぞれ評価結果を対応付けられて、格納されている。なお、複数の解候補ニューラルネットが格納されている場合には、全ての解候補ニューラルネットは同じ構造を持つものとする。
【００３８】
図４に例示した解候補ニューラルネット２０１は、３つのネットワーク素子（入力素子）ｉ１〜ｉ３及び１つのバイアス素子から３つのネットワーク素子（中間素子）ｈ１〜ｈ３へそれぞれ結合し、３つのネットワーク素子（中間素子）ｈ１〜ｈ３及び１つのバイアス素子から２つのネットワーク素子（出力素子）ｏ１，ｕへそれぞれ結合する例である。
【００３９】
ニューラルネット素子符号化部１０３では、各々の解候補ニューラルネットについて、当該解候補ニューラルネットの全結合重みを、同一ネットワーク素子に入力している結合重みごとにグループ化し、符号化関数変換（２０３）を行うことにより、符号化ニューラルネットを生成する。例えば、図４の解候補ニューラルネット（２０１）においてネットワーク素子ｕに入力している結合重みのグループ（２０２）に対して、符号化関数変換（２０３）を行うことにより、符号化ニューラルネットのｕに相当する部分（２０４）を生成する。このような処理をすべてのネットワーク素子ｈ１，ｈ２，ｈ３，ｏ１，ｕに対して行って、それらを連結することにより、符号化ニューラルネットを生成する。生成された符号化ニューラルネット（２０５）は、符号化ニューラルネット集合記憶部１０４に格納される。
【００４０】
ここで、符号化関数変換についてより詳しく説明する。
【００４１】
まず、図５に、従来方法による符号化ニューラルネット生成方法（１２０３）を示す。従来方法によれば、全ネットワーク素子に入力される全結合重みをそのまま順番に並べることにより、符号化ニューラルネットを生成する。
【００４２】
例えば、ネットワーク素子ｕ（１２０２）へ入力する結合重み $W_{u,1}$〜$W_{u,b}$ をそのまま並べることにより、符号化されたニューラルネットのｕに相当する部分（１２０４）、
$\{W_{u,1},\ W_{u,2},\ W_{u,3},\ W_{u,b}\}$
を生成することになる。他のネットワーク素子ｈ１，ｈ２，ｈ３，ｏ１についても同様である。
【００４３】
そして、各ネットワーク素子ｈ１，ｈ２，ｈ３，ｏ１，ｕにそれぞれ相当する部分、
$\{W_{h1,1},\ W_{h1,2},\ W_{h1,3},\ W_{h1,b}\}$
$\{W_{h2,1},\ W_{h2,2},\ W_{h2,3},\ W_{h2,b}\}$
$\{W_{h3,1},\ W_{h3,2},\ W_{h3,3},\ W_{h3,b}\}$
$\{W_{o1,1},\ W_{o1,2},\ W_{o1,3},\ W_{o1,b}\}$
$\{W_{u,1},\ W_{u,2},\ W_{u,3},\ W_{u,b}\}$
を連結することにより、符号化ニューラルネット、
$\{W_{h1,1},\ W_{h1,2},\ W_{h1,3},\ W_{h1,b},\ W_{h2,1},\ \ldots,\ W_{o1,b},\ W_{u,1},\ W_{u,2},\ W_{u,3},\ W_{u,b}\}$
を生成する。
【００４４】
従って、例えば、図４の解候補ニューラルネット（２０１）において、ネットワーク素子ｉ１，ｉ２，ｉ３及びバイアス素子からそれぞれネットワーク素子ｈ１へ入力する結合重みは、｛０．１，２．３，３．０，−２．０｝であり、同様に、ネットワーク素子ｈ２へ入力する結合重みは、｛２．３，−１．３，１０．０，１．０｝、ネットワーク素子ｈ３へ入力する結合重みは、｛２．１，２．２，−３．３，−２．０｝であり、また、ネットワーク素子ｈ１，ｈ２，ｈ３及びバイアス素子からそれぞれネットワーク素子ｏ１へ入力する結合重みは、｛−５．１，２．４，−１．０，２．０｝であり、同様に、ネットワーク素子ｕへ入力する結合重みは、｛−０．１，２０．４，−０．８，１．０｝であるとすると、従来方法により得られる符号化ニューラルネットは、
｛０．１，２．３，３．０，−２．０，２．３，−１．３，１０．０，１．０，２．１，２．２，−３．３，−２．０，−５．１，２．４，−１．０，２．０，−０．１，２０．４，−０．８，１．０｝
となる。
【００４５】
次に、図６に、本実施形態における符号化関数変換（２０３）の一例を示す。本例では、各々のネットワーク素子について、当該ネットワーク素子に入力される結合重みの部分集合に含まれる重みのうちバイアス素子との結合重み以外の結合重みベクトルのベクトル長を計算し、結合重みの部分集合の全要素をベクトル長で除したものと当該ベクトル長とを符号として連結することにより、当該ネットワーク素子に対応する部分の符号化ニューラルネットを生成する。
【００４６】
例えば、ネットワーク素子ｕ（２０２）へ入力する結合重み $W_{u,1}$〜$W_{u,b}$ のうち $W_{u,1}$〜$W_{u,3}$ をベクトルとしたときのベクトル長 $L_u$、
$L_u = \left(W_{u,1}^2 + W_{u,2}^2 + W_{u,3}^2\right)^{1/2}$
を計算し、$W_{u,1}$〜$W_{u,b}$ をそれぞれ $L_u$ で割って、
$w_{u,1} = W_{u,1} / L_u$
$w_{u,2} = W_{u,2} / L_u$
$w_{u,3} = W_{u,3} / L_u$
$w_{u,b} = W_{u,b} / L_u$
を計算し、$w_{u,1}$、$w_{u,2}$、$w_{u,3}$、$w_{u,b}$、$L_u$ を並べることにより、符号化ニューラルネットのｕに相当する部分（２０４）、
$\{w_{u,1},\ w_{u,2},\ w_{u,3},\ w_{u,b},\ L_u\}$
を生成する。他のネットワーク素子ｈ１，ｈ２，ｈ３，ｏ１についても同様である。
【００４７】
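そして、各ネットワーク素子ｈ１，ｈ２，ｈ３，ｏ１，ｕにそれぞれ相当する部分、
$\{w_{h1,1},\ w_{h1,2},\ w_{h1,3},\ w_{h1,b},\ L_{h1}\}$
$\{w_{h2,1},\ w_{h2,2},\ w_{h2,3},\ w_{h2,b},\ L_{h2}\}$
$\{w_{h3,1},\ w_{h3,2},\ w_{h3,3},\ w_{h3,b},\ L_{h3}\}$
$\{w_{o1,1},\ w_{o1,2},\ w_{o1,3},\ w_{o1,b},\ L_{o1}\}$
$\{w_{u,1},\ w_{u,2},\ w_{u,3},\ w_{u,b},\ L_u\}$
を連結することにより、符号化ニューラルネット、
$\{w_{h1,1},\ w_{h1,2},\ w_{h1,3},\ w_{h1,b},\ L_{h1},\ w_{h2,1},\ \ldots,\ w_{o1,b},\ L_{o1},\ w_{u,1},\ w_{u,2},\ w_{u,3},\ w_{u,b},\ L_u\}$
を生成する。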
【００４８】
従って、例えば、図４の解候補ニューラルネット（２０１）において、上記にて例示した結合重みを持つものとすると、本例による符号化関数変換により得られる符号化ニューラルネットは、
｛０．０２６４４４２…，０．６０８２１８７…，０．７９３３２８８…，−０．５２８８８５８…，３．７８１５３４０…，
０．２２２３７０１…，−０．１２５６８７４…，０．９６６８２６９…，０．０９６６８２６…，１０．３４３１１３…，
０．４６７９３９３…，０．４９０２２２１…，−０．７３５３３３２…，−０．４４５６５６５…，４．４８７７６１１…，
−０．８９０９０６１…，０．４１９２４９９…，−０．１７４６８７４…，０．３４９３７４９…，５．７２４５０８７…，
−０．００４８９８１…，０．９９９２１９９…，−０．０３９１８５０…，０．０４８９８１３…，２０．４１５９２５…｝
となる。
【００４９】
このようにすることによって、機能的には似ているが加重値は大きく異なるニューラルネット素子を、似た符号ベクトルに変換できるので（例えば、｛１０，２０，３０，−４０｝および｛２０，４０，６０，−８０｝という加重値ベクトルを｛１，２，３，−４，１０｝および｛１，２，３，−４，２０｝というベクトルに変換できる）、各ネットワーク素子の機能上の重複を避けた結合重みパラメータの最適化を行うことができ、高性能な最適化が実現できるようになる。
【００５０】
次に、図７は、図１のニューラルネット結合重み最適化装置のうちニューラルネット素子復号化部１０７の処理を例示したものである。
【００５１】
前述したように、ニューラルネット素子復号化部１０７は、新符号化ニューラルネットにおける同一素子に対応している符号を同一グループとして扱い所定の逆関数変換を行うことによって新ニューラルネットを生成する。
【００５２】
ここで、逆関数変換についてより詳しく説明する。
【００５３】
図７の例は、新符号化ニューラルネットにおける各ネットワーク素子に相当する符号ベクトルの各々について、当該符号ベクトルのうち長さを表す変数の絶対値をそれ以外の変数に乗ずることにより、当該符号ベクトルに対応する新ニューラルネットの部分を生成する。
【００５４】
例えば、新符号化ニューラルネットのうち素子ｕに対応している符号（３０１）、
$\{w_{u,1}^{new},\ w_{u,2}^{new},\ w_{u,3}^{new},\ w_{u,b}^{new},\ L_u^{new}\}$
については、ベクトルの長さに対応する符号 $L_u^{new}$（$L_u^{new}$ は負の値である場合がある）の絶対値 $|L_u^{new}|$ を取り、$L_u^{new}$ 以外の各結合重みに対応する符号 $w_{u,1}^{new}$〜$w_{u,b}^{new}$ にそれぞれ $|L_u^{new}|$ を乗じて、
$W_{u,1}^{new} = w_{u,1}^{new} \cdot |L_u^{new}|$
$W_{u,2}^{new} = w_{u,2}^{new} \cdot |L_u^{new}|$
$W_{u,3}^{new} = w_{u,3}^{new} \cdot |L_u^{new}|$
$W_{u,b}^{new} = w_{u,b}^{new} \cdot |L_u^{new}|$
を計算し、新ニューラルネットの素子ｕに入力する結合重み（３０３）、
$\{W_{u,1}^{new},\ W_{u,2}^{new},\ W_{u,3}^{new},\ W_{u,b}^{new}\}$
を求める。新符号化ニューラルネットのうち素子ｈ１，ｈ２，ｈ３，ｏ１に対応している符号についてもそれぞれ同様である。
【００５５】
なお、逆関数変換には、種々のバリエーションが考えられる。
【００５６】
例えば、上記ではＬ^ｎｅｗの絶対値をとったが、その代わりに、Ｌ^ｎｅｗが予め定められたしきい値Ｌ_ｔｈ以上である場合には、そのＬ^ｎｅｗを用い、Ｌ^ｎｅｗが予め定められたしきい値Ｌ_ｔｈ未満である場合には、Ｌ^ｎｅｗ＝Ｌ_ｔｈとする方法もある。
【００５７】
また、上記では、新符号化ニューラルネット生成においてＬ^ｎｅｗが負の値になる場合があることを想定したが、その代わりに、図２のステップＳ２において、Ｌ^ｎｅｗが負の値にならないアルゴリズムを採用するか、またはＬ^ｎｅｗが負の値（あるいは０以下）になる場合があるアルゴリズムであってもＬ^ｎｅｗが負の値（あるいは０以下）になったときにはＬ^ｎｅｗを０以上（あるいは正の値）に修正するかまたは当該新符号化ニューラルネットを用いない（破棄する）ようにしてもよい。この場合には、Ｌ^ｎｅｗの絶対値をとるなどの処理が不要になる。
【００５８】
また、上記したものの他にも、符号化関数変換及び逆関数変換には、種々のバリエーションが考えられる。
【００５９】
例えば、上記した符号化関数変換例では、当該ネットワーク素子に入力される結合重みの部分集合に含まれる重み（$W_1$〜$W_n$ とする）のうちバイアス素子との結合重み以外の結合重み（$W_1$〜$W_t$ とする）のベクトル長 $L$（$= (W_1^2 + \cdots + W_t^2)^{1/2}$）を計算し、結合重みの部分集合の全要素（$W_1$〜$W_n$）をベクトル長 $L$ で除したものと当該ベクトル長とを符号として連結することにより、当該ネットワーク素子に対応する符号化ニューラルネットの部分 $(W_1/L,\ \ldots,\ W_n/L,\ L)$ を生成したが、その代わりに、ベクトル長 $L$ を $k$ 倍し（ただし、$k > 0$）、全要素 $W_1$〜$W_n$ を $L \cdot k$ で除したものと $L \cdot k$ とを符号として連結することにより、当該ネットワーク素子に対応する符号化ニューラルネットの部分 $(W_1/(L \cdot k),\ \ldots,\ W_n/(L \cdot k),\ L \cdot k)$ を生成する方法も可能である。このバリエーションに対応する逆関数変換は、$L \cdot k$ を $L$ として扱えば、これまで説明した例と同様である。
【００６０】
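また、例えば、ベクトル長 $L$ を $k$ 倍し（ただし、$k > 0$）、要素 $W_1$〜$W_t$ を $L$ で除したものと、要素 $W_{t+1}$〜$W_n$ を $L \cdot k$ で除したものと、$L$ および $L \cdot k$（もしくは、$L$ および $k$）とを符号として連結することにより、当該ネットワーク素子に対応する符号化ニューラルネットの部分 $(W_1/L,\ \ldots,\ W_t/L,\ W_{t+1}/(L \cdot k),\ \ldots,\ W_n/(L \cdot k),\ L,\ L \cdot k)$ を生成する方法（ベクトル長の部分は、$L,\ L \cdot k$ の代わりに $L,\ k$ でもよい）も可能である。このバリエーションに対応する逆関数変換は、$L$ を $W_1$〜$W_t$ に対する $L$、$L \cdot k$ を $W_{t+1}$〜$W_n$ に対する $L$ として扱えば、これまで説明した例と同様である。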
【００６１】
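また、例えば、ベクトル長 $L$ を $k_1$ 倍し（ただし、$k_1 > 0$）、さらにベクトル長 $L$ を $k_2$ 倍し（ただし、$k_2 > 0$）、要素 $W_1$〜$W_t$ を $L \cdot k_1$ で除したものと、要素 $W_{t+1}$〜$W_n$ を $L \cdot k_2$ で除したものと、$L \cdot k_1$ および $L \cdot k_2$（もしくは、$L$、$k_1$ および $k_2$）とを符号として連結することにより、当該ネットワーク素子に対応する符号化ニューラルネットの部分 $(W_1/(L \cdot k_1),\ \ldots,\ W_t/(L \cdot k_1),\ W_{t+1}/(L \cdot k_2),\ \ldots,\ W_n/(L \cdot k_2),\ L \cdot k_1,\ L \cdot k_2)$ を生成する方法（ベクトル長の部分は、$L \cdot k_1,\ L \cdot k_2$ の代わりに $L,\ k_1,\ k_2$ でもよい）も可能である。このバリエーションに対応する逆関数変換は、$L \cdot k_1$ を $W_1$〜$W_t$ に対する $L$、$L \cdot k_2$ を $W_{t+1}$〜$W_n$ に対する $L$ として扱えば、これまで説明した例と同様である。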
【００６２】
なお、ｋや、ｋ１及びｋ２は、固定にする方法と、ニューラルネット毎に決定する方法などがある。
【００６３】
もちろん、これらの他にも、符号化関数変換及び逆関数変換には、種々のバリエーションが考えられる。
【００６４】
以上説明してきたように、本実施形態によれば、一般的なモデル評価関数に対してニューラルネットの結合重みを最適化することが可能になる。また、各ネットワーク素子の機能ごとに最適化を行うことができる。
【００６５】
以下では、本実施形態のニューラルネットの結合重み最適化処理による作用・効果について次の例に従って説明する。
【００６６】
図８は、入力素子数１０、中間素子数５、出力素子数１のニューラルネットを例示しており、このネットワークの結合重みベクトルＷ（結合重みの個数＝６１）を最適化する。
【００６７】
図９は、次の最適化の基準となるモデル評価関数ｆに用いるデータ（入力及び教師信号）の一例を示している。
$f(W) = \sum_{i=1}^{N} \bigl(t(i) - O(in(i), W)\bigr)^2$
ここで、ｆ（Ｗ）：結合重みＷのネットワークのモデル評価関数値
Ｏ（ｉｎ（ｉ），Ｗ）：ｉｎ（ｉ）が入力されたときのネットワークの出力値である。また、総和を取る範囲は、ｉ＝１〜Ｎである。
なお、この例では、ｆ（Ｗ）が小さいネットワークほど、好ましいネットワークであると評価される。
なお、図４の例のように出力素子が複数ある場合には、例えば、各出力素子に対するモデル評価関数値ｆ（Ｗ）の総和をとればよい。
【００６８】
図１０と図１１に、図９のモデル評価関数を用い且つ新符号化ニューラルネット生成部１０５においてｂｌｘ法を用い且つ新ニューラルネット評価・選択部１０９においてｍｇｇ法を用いた場合における、本実施形態の図６及び図７の方法による最適化結果（Ｃ１，Ｃ３）と、図５の従来方法による最適化結果（Ｃ２，Ｃ４）とを示す。
【００６９】
図１０と図１１の横軸は、図２の手順のループ処理の反復回数であり、図１０の縦軸は、モデル評価関数値の変化を示し、図１１の縦軸は、図９のデータに関する分類精度の変化を示す。
【００７０】
図１０及び図１１から、本実施形態の最適化結果の方が優れていることが明らかである。
【００７１】
このように本実施形態によれば、ニューラルネット全結合重みを同一素子に入力しているグループごとに分け、それぞれのグループに符号化関数変換を行うことで、ニューラルネットの各素子ごとに独立に最適化を行うことができるので、従来手法と比較して効率的な結合重みの最適化を行うことができる。
【００７２】
なお、以上の各機能は、ソフトウェアとして実現可能である。
また、本実施形態は、コンピュータに所定の手段を実行させるための（あるいはコンピュータを所定の手段として機能させるための、あるいはコンピュータに所定の機能を実現させるための）プログラムとして実施することもでき、該プログラムを記録したコンピュータ読取り可能な記録媒体として実施することもできる。
【００７３】
なお、この発明の実施の形態で例示した構成は一例であって、それ以外の構成を排除する趣旨のものではなく、例示した構成の一部を他のもので置き換えたり、例示した構成の一部を省いたり、例示した構成に別の機能あるいは要素を付加したり、それらを組み合わせたりすることなどによって得られる別の構成も可能である。また、例示した構成と論理的に等価な別の構成、例示した構成と論理的に等価な部分を含む別の構成、例示した構成の要部と論理的に等価な別の構成なども可能である。また、例示した構成と同一もしくは類似の目的を達成する別の構成、例示した構成と同一もしくは類似の効果を奏する別の構成なども可能である。
また、この発明の実施の形態で例示した各種構成部分についての各種バリエーションは、適宜組み合わせて実施することが可能である。
また、この発明の実施の形態は、装置全体としての発明、装置内部の構成部分についての発明、またはそれらに対応する方法の発明等、種々の観点、段階、概念またはカテゴリに係る発明を包含・内在するものである。
従って、この発明の実施の形態に開示した内容からは、例示した構成に限定されることなく発明を抽出することができるものである。
【００７４】
本発明は、上述した実施の形態に限定されるものではなく、その技術的範囲において種々変形して実施することができる。
【００７５】
【発明の効果】
本発明によれば、一般的なモデル評価関数に基づくニューラルネットの結合重みの最適化として従来よりも優れた最適化が可能になる。
【図面の簡単な説明】
【図１】本発明の一実施形態に係るニューラルネット結合重み最適化装置の構成例を示す図
【図２】同実施形態に係るニューラルネット結合重み最適化装置の処理手順の一例を示すフローチャート
【図３】同実施形態に係るニューラルネット結合重み最適化装置の処理手順について説明するための図
【図４】同実施形態に係るニューラルネット結合重み最適化装置のニューラルネット素子符号化部の処理について説明するための図
【図５】従来方法における符号化関数変換について説明するための図
【図６】同実施形態に係るニューラルネット結合重み最適化装置における符号化関数変換について説明するための図
【図７】同実施形態に係るニューラルネット結合重み最適化装置における逆関数変換について説明するための図
【図８】解候補ニューラルネットの構造例を示す図
【図９】最適化の基準となるモデル評価関数の一例を示す図
【図１０】最適化結果の一例を従来方法と比較して説明するための図
【図１１】最適化結果の一例を従来方法と比較して説明するための図
【符号の説明】
１０１…解候補ニューラルネット集合記憶部
１０２…モデル評価関数記憶部
１０３…ニューラルネット素子符号化部
１０４…符号化ニューラルネット集合記憶部
１０５…新符号化ニューラルネット生成部
１０６…新符号化ニューラルネット集合記憶部
１０７…ニューラルネット素子復号化部
１０８…新ニューラルネット集合記憶部
１０９…新ニューラルネット評価・選択部

[0001]
[Technical field to which the invention belongs]
The present invention relates to a neural network connection weight optimization apparatus and method for optimizing the connection weights of a neural network based on a model evaluation function that determines how desirable the neural network is.
[0002]
[Prior art]
A neural network is a computer technology developed to artificially realize advanced information processing functions. That is, a neural network is an artificial intelligence technology that uses a computer to simulate, for example, the neural circuits of the brain. It consists of an input layer, which is a collection of input elements for inputting information; an output layer, which is a collection of output elements that output the value of the answer obtained from that input; and an intermediate layer, which is a collection of intermediate elements located between those layers. The elements of the layers are linked to one another by many connections, each of which has a connection weight parameter.
[0003]
A neural network can acquire advanced information processing functions by optimizing its connection weight parameters, based on some model evaluation function, so that the evaluation function takes a high value (or, depending on the evaluation function, a low value).
[0004]
Methods for optimizing the connection weight parameters of a neural network include the error back-propagation method (for example, Rumelhart, D. E., G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature 323, pp. 533-536, 1986). However, such methods place many constraints on the model evaluation function, for example that partial derivatives must be computable with respect to all parameters.
[0005]
Parameter optimization methods that target general model evaluation functions include genetic algorithms (for example, Goldberg, D. E., "Genetic Algorithms in Search, Optimization, and Machine Learning," Addison-Wesley Publishing Company Inc., 1989). However, because the connection weight values of the neural network are simply collected into a real-valued vector, sufficient optimization performance cannot be obtained.
[0006]
[Problems to be solved by the invention]
Among conventional techniques for acquiring advanced information processing functions by optimizing the connection weight parameters of a neural network with respect to a model evaluation function, methods specialized for neural network connection weight optimization, such as the error back-propagation method, cannot be used with general model evaluation functions. Methods such as genetic algorithms that use a real-valued vector formed by simply collecting the connection weights of the neural network can use general model evaluation functions, but do not provide sufficient optimization performance.
[0007]
The present invention has been made in consideration of the above circumstances, and its object is to provide a neural network connection weight optimization apparatus and method that enable better optimization than before of the connection weights of a neural network based on a general model evaluation function.
[0008]
[Means for Solving the Problems]
The present invention is a neural network connection weight optimization method for optimizing the connection weights of a neural network that has a predetermined connection structure composed of a plurality of network elements and zero or more bias elements, the method comprising: a first step of storing one or more neural networks in solution candidate neural network set storage means, each in association with an evaluation result obtained by evaluating the connection weights of that neural network based on a predetermined evaluation function; a second step of, for each of a predetermined number of neural networks selected from those stored in the solution candidate neural network set storage means, encoding the connection weights of the neural network group by group, each group consisting of the connection weights related to inputs to the same network element, by applying a predetermined encoding function transformation that takes the function of the network element into account, and concatenating the codes obtained for the network elements to generate an encoded neural network; a third step of newly generating a predetermined number of encoded neural networks based on the encoded neural networks generated in the second step; a fourth step of, for each of the encoded neural networks generated in the third step, applying to each code portion corresponding to a network element a predetermined inverse function transformation that takes the function of the network element into account, thereby newly generating a neural network of the same format as the neural networks stored in the solution candidate neural network set storage means; and a fifth step of evaluating each of the neural networks generated in the fourth step based on the model evaluation function to obtain an evaluation result, determining, at least on the basis of the evaluation result, whether to add the neural network and its evaluation result to the solution candidate neural network set storage means, and storing each neural network determined to be added in the solution candidate neural network set storage means in association with its evaluation result.
[0009]
Preferably, in the fifth step, those of the existing neural networks and evaluation results stored in the solution candidate neural network set storage means that satisfy a predetermined condition may additionally be deleted from the solution candidate neural network set storage means.
[0010]
Preferably, the second to fifth steps may be executed repeatedly until it is determined, after the fifth step, that a predetermined end condition is satisfied; when it is so determined, the neural network with the best evaluation result at that time among those stored in the solution candidate neural network set storage means may be selected as the optimized neural network.
[0011]
Preferably, in the second step, when the predetermined encoding function transformation is applied to the connection weights related to inputs to the same network element of the neural network, the vector length of the connection weight vector composed of all connection weights related to inputs to that network element, other than those related to inputs from bias elements, is obtained. The encoded neural network may then be generated by concatenating, as codes, the values obtained by applying a predetermined operation, based on the vector length or on the vector length multiplied by a positive number, to each of the connection weights related to inputs to that network element, together with the vector length or the vector length multiplied by a positive number. Preferably, the predetermined operation may be an operation that divides the connection weight by the vector length or by the vector length multiplied by a positive number.
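The per-element encoding described in this paragraph can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `encode_element`, the separate `bias_weight` argument, and the error raised for an all-zero weight vector are assumptions made for the example. The input values are the weights given for element h1 in paragraph [0044] of the description.

```python
import math

def encode_element(input_weights, bias_weight=None, k=1.0):
    """Encode the connection weights feeding one network element.

    input_weights: weights from non-bias elements (these define the length)
    bias_weight:   weight from the bias element, or None if there is none
    k:             positive multiplier applied to the vector length
    """
    if k <= 0:
        raise ValueError("k must be positive")
    length = math.sqrt(sum(w * w for w in input_weights))
    if length == 0.0:
        # Assumption: the description does not say how to handle this case.
        raise ValueError("all non-bias weights are zero; the code is undefined")
    scale = k * length
    weights = list(input_weights)
    if bias_weight is not None:
        weights.append(bias_weight)
    # Every weight (bias included) is divided by the scaled length,
    # and the scaled length itself is appended as the last code value.
    return [w / scale for w in weights] + [scale]

# Weights into element h1: from i1, i2, i3, then from the bias element.
code_h1 = encode_element([0.1, 2.3, 3.0], bias_weight=-2.0)
print(code_h1)
```

The last entry of `code_h1` is the vector length of the three non-bias weights, and the first four entries are the original weights divided by it, which matches the first row of the encoded example in paragraph [0048] to the digits shown there.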
[0012]
Preferably, in the fourth step, when the predetermined inverse function transformation is applied to the code portion corresponding to one network element among all the codes of the encoded neural network, the corresponding connection weights of the new neural network may be obtained by applying a predetermined operation, based on the code value in that code portion that corresponds to the vector length or to the vector length multiplied by a positive number, to each of the code values in that code portion that correspond to the values obtained by applying the predetermined operation to the connection weights. Preferably, the predetermined operation may be an operation that obtains a corrected value by applying a predetermined correction to the code value corresponding to the vector length or to the vector length multiplied by a positive number, and multiplies each of the code values corresponding to the connection weights by the corrected value. Preferably, the predetermined correction may be a correction that takes the absolute value of the value, or a correction that, when the value is less than a predetermined positive threshold, sets the value to that threshold.
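A matching sketch of the inverse transformation, assuming the code layout of one element is the scaled weights followed by a single length value. The function name `decode_element` and the `threshold` parameter are illustrative names; the two corrections (absolute value, or clamping to a positive threshold) are the ones named in this paragraph.

```python
def decode_element(code, threshold=None):
    """Recover the connection weights of one element from its code.

    code:      [scaled weights..., length]
    threshold: if None, the length is corrected by taking its absolute value;
               otherwise a length below this positive threshold is replaced
               by the threshold.
    """
    *scaled, length = code
    if threshold is None:
        corrected = abs(length)
    else:
        if threshold <= 0:
            raise ValueError("threshold must be positive")
        corrected = max(length, threshold)
    # Multiply each scaled weight by the corrected length.
    return [s * corrected for s in scaled]

weights_abs = decode_element([0.5, -0.5, -2.0])                 # |L| = 2.0
weights_thr = decode_element([0.5, -0.5, -2.0], threshold=0.1)  # L := 0.1
print(weights_abs, weights_thr)
```

The correction matters because a newly generated code may carry a length value that is zero or negative even when both parent codes had positive lengths.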
[0013]
Preferably, in the third step, the encoded neural network may be generated by a genetic algorithm.
[0014]
The present invention relating to the apparatus is also established as an invention relating to a method, and the present invention relating to a method is also established as an invention relating to an apparatus.
Further, the present invention relating to an apparatus or a method also holds as a program for causing a computer to execute a procedure corresponding to the invention (or for causing a computer to function as means corresponding to the invention, or for causing a computer to realize a function corresponding to the invention), and as a computer-readable recording medium on which the program is recorded.
[0015]
Now, the function of a neural network is realized by combining the functions of its network elements, and the function of each network element can be considered to be influenced only by the connection weights input to that element. In the present invention, all the connection weights of a neural network are grouped into subsets according to the network element they are input to, and an encoding function transformation is applied to each subset to generate an encoded neural network. New encoded neural networks generated from the set of encoded neural networks are then subjected to an inverse encoding function transformation for each subset of connection weights input to each network element, which generates a new neural network set. This enables better optimization than before of the connection weights of a neural network based on a general model evaluation function. Optimization can also be performed for the function of each network element. Furthermore, according to the present invention, neural network elements that are functionally similar but have very different weight values can be converted into similar code vectors, so the connection weight parameters can be optimized while avoiding functional duplication among network elements, and high-performance optimization can be realized.
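The claim that weight vectors differing only in scale map to similar code vectors follows directly from the division by the vector length. As a short worked derivation (the symbols $W_1, \ldots, W_t$ for the non-bias weights, $W_{t+1}, \ldots, W_n$ for the bias weights, and the scale factor $c$ are notation chosen for this illustration):

```latex
\begin{align*}
W'_j &= c\,W_j \quad (c > 0,\ j = 1, \ldots, n), \\
L'   &= \Bigl(\sum_{j=1}^{t} (c\,W_j)^2\Bigr)^{1/2} = c\,L, \\
\frac{W'_j}{L'} &= \frac{c\,W_j}{c\,L} = \frac{W_j}{L} \quad (j = 1, \ldots, n).
\end{align*}
```

The two codes are therefore $(W_1/L, \ldots, W_n/L, L)$ and $(W_1/L, \ldots, W_n/L, cL)$: they agree in every entry except the last, whereas the raw weight vectors differ in every entry.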
[0016]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the invention will be described with reference to the drawings.
[0017]
FIG. 1 shows a configuration example of a neural network connection weight optimizing device that performs an optimization process related to a connection weight parameter of a neural network according to an embodiment of the present invention.
[0018]
As shown in FIG. 1, the neural network connection weight optimization apparatus according to the present embodiment includes a solution candidate neural network set storage unit 101, a model evaluation function storage unit 102, a neural network element encoding unit 103, an encoded neural network set storage unit 104, a new encoded neural network generation unit 105, a new encoded neural network set storage unit 106, a neural network element decoding unit 107, a new neural network set storage unit 108, and a new neural network evaluation / selection unit 109.
[0019]
This neural network connection weight optimization device can be realized by software (that is, can be realized by executing a program on a computer). At that time, a part or all of the functions of the software can be realized as a chip or board and incorporated in the computer. In addition, when the neural network connection weight optimization apparatus is realized by software, it can be incorporated as a function of other software. It is also possible to configure this neural network connection weight optimization device as dedicated hardware.
[0020]
The solution candidate neural network set storage unit 101, the model evaluation function storage unit 102, the encoded neural network set storage unit 104, the new encoded neural network set storage unit 106, and the new neural network set storage unit 108 are each configured by a storage device such as a hard disk, an optical disk, or a semiconductor memory. The storage units may be configured by separate storage devices, or all or some of them may be configured by the same storage device.
[0021]
Although not shown in FIG. 1, the neural network connection weight optimization device includes an input / output device for exchanging data with the outside. Of course, a GUI (graphical user interface) may be provided, or a network connection interface may be provided.
[0022]
FIG. 2 shows an example of the overall processing procedure of the present neural network connection weight optimization apparatus.
[0023]
The model evaluation function storage unit 102 stores data indicating a model evaluation function for evaluating the neural network in advance. A desired general model evaluation function can be used as the model evaluation function.
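As one concrete instance of a model evaluation function, paragraph [0067] of the description uses the sum of squared differences between teacher signals and network outputs, where a smaller value is better. The sketch below evaluates that function for a small network with one intermediate layer. The logistic activation, the convention that the bias element outputs 1, the weight layout (bias weight last), and all function names are assumptions for illustration; the description does not fix them.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function (assumed activation).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def element_output(weights, inputs):
    # weights = [w_1, ..., w_t, w_bias]; the bias element is assumed to output 1.
    *ws, bias = weights
    return sigmoid(sum(w * v for w, v in zip(ws, inputs)) + bias)

def network_output(hidden, output, inputs):
    h = [element_output(ws, inputs) for ws in hidden]
    return [element_output(ws, h) for ws in output]

def evaluate(hidden, output, samples):
    # f(W) = sum over samples and output elements of (t - O)^2; lower is better.
    total = 0.0
    for inputs, targets in samples:
        outs = network_output(hidden, output, inputs)
        total += sum((t - o) ** 2 for t, o in zip(targets, outs))
    return total

hidden = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two intermediate elements
output = [[0.0, 0.0, 0.0]]                   # one output element
samples = [([0.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
f = evaluate(hidden, output, samples)
print(f)
```

Because the optimizer only ever calls the evaluation function and never differentiates it, any other scoring rule with the same call shape could be substituted here.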
[0024]
The solution candidate neural network set storage unit 101 stores data representing at least one initial neural network created in advance. The initial neural networks may be generated by a conventional method (for example, the structure of the neural network is created or selected by the user, and the connection weights are generated randomly by a predetermined algorithm). Each of the neural networks stored in advance in the solution candidate neural network set storage unit 101 is assumed to have been evaluated by the same method as the evaluation performed on the new neural networks in step S4, with the evaluation result (for example, the evaluation function value) stored in association with the neural network.
[0025]
First, the neural network element encoding unit 103 retrieves n neural networks (n is a predetermined or designated integer of 1 or more) from the solution candidate neural network set storage unit 101. For each of the n retrieved neural networks, it treats the connection weights input to the same element as one group and applies a predetermined encoding function transformation, thereby generating n encoded neural networks, which it stores in the encoded neural network set storage unit 104 (step S1).
[0026]
Next, based on the encoded neural networks generated in step S1 and stored in the encoded neural network set storage unit 104, the new encoded neural network generation unit 105 generates m new encoded neural networks (m is a predetermined or designated integer of 1 or more) according to a predetermined method such as a genetic algorithm, and stores them in the new encoded neural network set storage unit 106 (step S2).
[0027]
Various methods can be used as this generation method. For example, the method known as the BLX method (for example, Eshelman, L. J., and Schaffer, J. D., "Real-Coded Genetic Algorithms and Interval-Schemata," Foundations of Genetic Algorithms 2, pp. 187-202, 1993) may be used.
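For orientation, a blend crossover of the BLX-alpha kind can be sketched as below: each gene of the child is drawn uniformly from the interval spanned by the two parent genes, widened on both sides by `alpha` times its width. The function name, the default `alpha=0.5`, and the `rng` parameter are assumptions for this example and are not taken from the description.

```python
import random

def blx_alpha(parent_a, parent_b, alpha=0.5, rng=random):
    """Generate one child code vector from two parent code vectors."""
    child = []
    for a, b in zip(parent_a, parent_b):
        lo, hi = min(a, b), max(a, b)
        spread = alpha * (hi - lo)
        # Sample from the parents' interval extended by `spread` on each side.
        child.append(rng.uniform(lo - spread, hi + spread))
    return child

rng = random.Random(0)
child = blx_alpha([0.0, 1.0, -2.0], [1.0, 1.0, 2.0], alpha=0.5, rng=rng)
print(child)
```

Because the sampling interval extends beyond the parents, a child gene can fall outside the range of both parents. When this is applied to encoded neural networks, the length component of a child can therefore become negative, which is why the inverse transformation corrects it.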
[0028]
Next, the neural network element decoding unit 107 retrieves the m new encoded neural networks generated in step S2 and stored in the new encoded neural network set storage unit 106. For each of the m retrieved encoded neural networks, it treats the codes corresponding to the same element as one group and applies a predetermined inverse function transformation, thereby generating m new neural networks, which it stores in the new neural network set storage unit 108 (step S3).
[0029]
Next, the new neural network evaluation / selection unit 109 evaluates each of the m new neural networks stored in the new neural network set storage unit 108 in step S3, based on the model evaluation function stored in the model evaluation function storage unit 102. Based on the evaluation results, it selects one or more neural networks from among all or part of the solution candidate neural networks stored in the solution candidate neural network set storage unit 101 together with the m new neural networks stored in the new neural network set storage unit 108 (or from among the m new neural networks alone). The data of each selected new neural network is stored, together with its evaluation result, in the solution candidate neural network set storage unit 101 as a new solution candidate neural network (step S4). When the selected new neural networks are stored in the solution candidate neural network set storage unit 101, one or more solution candidate neural networks selected on the basis of a predetermined criterion may be deleted from the solution candidate neural network set storage unit 101.
[0030]
Various methods can be used for this evaluation and selection. For example, the MGG method (for example, Sato, Ono, Kobayashi, "Proposal and Evaluation of Generation Alternation Model in Genetic Algorithm," Japanese Society for Artificial Intelligence, Vol. 12, No. 5, pp. 734-744, 1997) may be used.
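To make the selection step more concrete, here is a minimal Python sketch in the spirit of MGG as it is commonly described: from a "family" consisting of the chosen parents and their children, the best individual and one individual picked by roulette selection survive. The function name mgg_select, the (net, score) pair layout, and the particular roulette weighting are assumptions for illustration and are not taken from the text.

```python
import random


def mgg_select(family, rng=random):
    """Pick two survivors from a family of (net, score) pairs.

    Lower scores are treated as better. The family must contain at least
    two entries. Returns [best, roulette_pick].
    """
    ranked = sorted(family, key=lambda pair: pair[1])
    best, rest = ranked[0], ranked[1:]
    # Assumed roulette weighting: the further below the worst score,
    # the larger the chance of being picked. The small constant keeps
    # every weight strictly positive.
    worst = max(score for _, score in rest)
    weights = [worst - score + 1e-12 for _, score in rest]
    other = rng.choices(rest, weights=weights, k=1)[0]
    return [best, other]
```

The two survivors would then replace the parents in the solution candidate neural network set storage unit 101, which corresponds to the optional deletion described for step S4.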
[0031]
Next, it is determined whether a predetermined end condition is satisfied (step S5). If it is not satisfied, the process returns to step S1 and the same processing is repeated.
[0032]
Various end conditions are conceivable: for example, the number of repetitions reaching a predetermined number, the difference between the current and previous evaluation results (for example, evaluation function values) falling below a predetermined value, the rate of change between the current and previous evaluation results (for example, evaluation function values) falling below a predetermined value, or a combination of these conditions by logical sum or logical product.
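One possible combination of these end conditions can be sketched in Python as follows. The function name should_stop, the threshold names, and their default values are hypothetical; the text only says that such thresholds are predetermined. The sketch stops when the iteration budget is used up, or when both the absolute difference and the rate of change between the last two evaluation values are below their thresholds.

```python
def should_stop(history, max_iters=1000, min_delta=1e-6, min_rate=1e-6):
    """Decide whether to leave the loop of step S5.

    history holds the best evaluation value after each iteration,
    most recent last. All thresholds are assumed example values.
    """
    if len(history) >= max_iters:          # repetition count reached
        return True
    if len(history) < 2:                   # nothing to compare yet
        return False
    prev, curr = history[-2], history[-1]
    delta = abs(curr - prev)               # difference from previous result
    rate = delta / abs(prev) if prev != 0 else float("inf")
    # Logical product of the two change-based conditions,
    # combined by logical sum with the iteration cap above.
    return delta < min_delta and rate < min_rate
```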
[0033]
When the end condition is satisfied, the processing loop is exited, and the neural network with the best evaluation result is selected from among those stored in the solution candidate neural network set storage unit 101. The selected neural network is the neural network obtained by this optimization processing.
[0034]
When a genetic algorithm is used in the procedure of FIG. 2, one pass of the loop is, for example, as shown in FIG. 3. In step S1, for example, two neural networks a and b are randomly selected and extracted from the solution candidate neural network set storage unit 101, and each is converted into an encoded neural network. In step S2, for example, four new encoded neural networks are generated by applying the genetic algorithm to these encoded neural networks. In step S3, they are decoded (inversely converted) into new neural networks c to f, which are stored in the new neural network set storage unit 108. In step S4, the new neural networks c to f are evaluated, and then, for example, the best two among the original solution candidate neural networks a and b and the new neural networks c to f are selected based on the evaluation results. If the selected two are not the original neural networks a and b (for example, if e and f are selected), the original neural networks a and b are replaced with the selected neural networks e and f. Of course, this is only an example: various other forms of the loop are possible when a genetic algorithm is used, and various methods using algorithms other than a genetic algorithm are also possible.
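The loop of FIG. 3 can be sketched in Python roughly as follows. This is a simplified illustration under several assumptions that are not in the text: a neural network is represented as a list of per-element weight lists with the bias weight last, the crossover is a BLX-style blend with an assumed alpha of 0.5, a fixed iteration count stands in for the end condition, and a zero vector length is replaced by 1.0 to avoid division by zero. All function names are hypothetical.

```python
import math
import random


def encode(net):
    """Step S1: per element, divide the weights by the length of the
    non-bias part and append that length as an extra code."""
    code = []
    for weights in net:
        length = math.sqrt(sum(w * w for w in weights[:-1])) or 1.0
        code.extend(w / length for w in weights)
        code.append(length)
    return code


def decode(code, fan_in):
    """Step S3: per element, multiply the normalized codes by |length|."""
    net, pos = [], 0
    for n in fan_in:                      # n = weights per element, bias included
        *unit, length = code[pos:pos + n + 1]
        net.append([w * abs(length) for w in unit])
        pos += n + 1
    return net


def crossover(code_a, code_b, rng, alpha=0.5):
    """Step S2: BLX-style blend of two encoded networks (assumed operator)."""
    child = []
    for x, y in zip(code_a, code_b):
        lo, hi = min(x, y), max(x, y)
        spread = alpha * (hi - lo)
        child.append(rng.uniform(lo - spread, hi + spread))
    return child


def optimize(population, evaluate, iterations, rng):
    """population is a list of [net, score] entries and is updated in place.
    Lower scores are better. Returns the best entry after the loop."""
    fan_in = [len(weights) for weights in population[0][0]]
    for _ in range(iterations):           # stands in for the check of step S5
        ia, ib = rng.sample(range(len(population)), 2)
        code_a = encode(population[ia][0])
        code_b = encode(population[ib][0])
        children = [decode(crossover(code_a, code_b, rng), fan_in)
                    for _ in range(4)]
        # Step S4: evaluate the children and keep the best two of the family.
        family = [population[ia], population[ib]]
        family += [[child, evaluate(child)] for child in children]
        family.sort(key=lambda pair: pair[1])
        population[ia], population[ib] = family[0], family[1]
    return min(population, key=lambda pair: pair[1])
```

Because the two parents compete with their own children and the best two of the family are kept, the best score in the population can never get worse from one pass to the next in this sketch.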
[0035]
Now, neural network element encoding and neural network element decoding in the neural network connection weight optimizing device of this embodiment will be described using specific examples.
[0036]
FIG. 4 shows an example of the stored contents of the solution candidate neural network set storage unit 101, the processing of the neural network element encoding unit 103, and the stored contents of the encoded neural network set storage unit 104 in the neural network connection weight optimization apparatus of FIG. 1.
[0037]
As described above, the solution candidate neural network set storage unit 101 stores one or more solution candidate neural networks (201), each associated with an evaluation result. When a plurality of solution candidate neural networks are stored, all solution candidate neural networks have the same structure.
[0038]
In the solution candidate neural network 201 illustrated in FIG. 4, three network elements (input elements) i1 to i3 and one bias element are each coupled to three network elements (intermediate elements) h1 to h3, and the intermediate elements h1 to h3 and one bias element are each coupled to two network elements (output elements) o1 and u.
[0039]
For each solution candidate neural network, the neural network element encoding unit 103 groups all connection weights of that network by the network element to which they are input, and applies the encoding function conversion (203) to each group to generate an encoded neural network. For example, applying the encoding function conversion (203) to the group of connection weights (202) input to the network element u in the solution candidate neural network (201) of FIG. 4 generates the portion (204) of the encoded neural network corresponding to u. This processing is performed for all the network elements h1, h2, h3, o1, and u, and the resulting portions are concatenated to generate the encoded neural network. The generated encoded neural network (205) is stored in the encoded neural network set storage unit 104.
[0040]
Here, the encoding function conversion will be described in more detail.
[0041]
First, FIG. 5 shows an encoded neural network generation method (1203) according to a conventional method. According to the conventional method, an encoded neural network is generated by arranging all connection weights input to all network elements in order.
[0042]
For example, arranging the connection weights W_{u,1} to W_{u,b} input to the network element u (1202) generates the portion of the encoded neural network corresponding to u (1204):
{W_{u,1}  W_{u,2}  W_{u,3}  W_{u,b}}
The same applies to the other network elements h1, h2, h3, and o1.
[0043]
Then, concatenating the portions corresponding to the respective network elements h1, h2, h3, o1, and u,
{W_{h1,1}  W_{h1,2}  W_{h1,3}  W_{h1,b}}
{W_{h2,1}  W_{h2,2}  W_{h2,3}  W_{h2,b}}
{W_{h3,1}  W_{h3,2}  W_{h3,3}  W_{h3,b}}
{W_{o1,1}  W_{o1,2}  W_{o1,3}  W_{o1,b}}
{W_{u,1}  W_{u,2}  W_{u,3}  W_{u,b}}
generates the encoded neural network
{W_{h1,1}  W_{h1,2}  W_{h1,3}  W_{h1,b}  W_{h2,1}  ...  W_{o1,b}  W_{u,1}  W_{u,2}  W_{u,3}  W_{u,b}}
[0044]
Therefore, suppose for example that, in the solution candidate neural network (201) of FIG. 4, the connection weights input from the network elements i1, i2, i3 and the bias element to the network element h1 are {0.1, 2.3, 3.0, −2.0}; similarly, the connection weights input to the network element h2 are {2.3, −1.3, 10.0, 1.0} and those input to the network element h3 are {2.1, 2.2, −3.3, −2.0}; the connection weights input from the network elements h1, h2, h3 and the bias element to the network element o1 are {−5.1, 2.4, −1.0, 2.0}; and similarly those input to the network element u are {−0.1, 20.4, −0.8, 1.0}. The encoded neural network obtained by the conventional method is then
{0.1, 2.3, 3.0, -2.0, 2.3, -1.3, 10.0, 1.0, 2.1, 2.2, -3.3, -2.0, -5.1, 2.4, -1.0, 2.0, -0.1, 20.4, -0.8, 1.0}
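The conventional encoding amounts to flattening the weights in element order, as the following small Python sketch shows with the example weights quoted in the text. The dictionary layout (element name mapped to a weight list with the bias weight last) and the name encode_flat are assumptions chosen for illustration.

```python
# Example connection weights of FIG. 4 as quoted in the text;
# the weight from the bias element is last in each list.
NET = {
    "h1": [0.1, 2.3, 3.0, -2.0],
    "h2": [2.3, -1.3, 10.0, 1.0],
    "h3": [2.1, 2.2, -3.3, -2.0],
    "o1": [-5.1, 2.4, -1.0, 2.0],
    "u": [-0.1, 20.4, -0.8, 1.0],
}
ORDER = ["h1", "h2", "h3", "o1", "u"]


def encode_flat(net, order):
    """Conventional encoding: list every connection weight in element order."""
    return [w for name in order for w in net[name]]


flat = encode_flat(NET, ORDER)
print(len(flat))  # 5 elements with 4 weights each
```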
[0045]
Next, FIG. 6 shows an example of the encoding function conversion (203) in the present embodiment. In this example, for each network element, the vector length of the connection weight vector consisting of the weights other than the weight from the bias element, among the connection weights input to that network element, is calculated. The portion of the encoded neural network corresponding to the network element is then generated by concatenating, as a code, all the connection weights in the group divided by the vector length, followed by the vector length itself.
[0046]
For example, for the connection weights W_{u,1} to W_{u,b} input to the network element u (202), the vector length L_u of the vector formed by W_{u,1} to W_{u,3} is obtained as
L_u = (W_{u,1}^2 + W_{u,2}^2 + W_{u,3}^2)^{1/2}
and each of W_{u,1} to W_{u,b} is divided by L_u:
w_{u,1} = W_{u,1} / L_u
w_{u,2} = W_{u,2} / L_u
w_{u,3} = W_{u,3} / L_u
w_{u,b} = W_{u,b} / L_u
Concatenating w_{u,1}, w_{u,2}, w_{u,3}, w_{u,b}, and L_u generates the portion of the encoded neural network corresponding to u (204):
{w_{u,1}  w_{u,2}  w_{u,3}  w_{u,b}  L_u}
The same applies to the other network elements h1, h2, h3, and o1.
[0047]
Then, concatenating the portions corresponding to the respective network elements h1, h2, h3, o1, and u,
{w_{h1,1}  w_{h1,2}  w_{h1,3}  w_{h1,b}  L_{h1}}
{w_{h2,1}  w_{h2,2}  w_{h2,3}  w_{h2,b}  L_{h2}}
{w_{h3,1}  w_{h3,2}  w_{h3,3}  w_{h3,b}  L_{h3}}
{w_{o1,1}  w_{o1,2}  w_{o1,3}  w_{o1,b}  L_{o1}}
{w_{u,1}  w_{u,2}  w_{u,3}  w_{u,b}  L_u}
generates the encoded neural network
{w_{h1,1}  w_{h1,2}  w_{h1,3}  w_{h1,b}  L_{h1}  w_{h2,1}  ...  w_{o1,b}  L_{o1}  w_{u,1}  w_{u,2}  w_{u,3}  w_{u,b}  L_u}
[0048]
Therefore, for example, if the solution candidate neural network (201) in FIG. 4 has the connection weights exemplified above, the encoded neural network obtained by the encoding function conversion according to this example is
{0.0264442 ..., 0.6082187 ..., 0.7933288 ..., -0.5288858 ..., 3.7815340 ...,
0.2223701 ..., -0.1256874 ..., 0.9668269 ..., 0.0966826 ..., 10.343113 ...,
0.4679393 ..., 0.4902221 ..., -0.7353332 ..., -0.4456565 ..., 4.4877611 ...,
-0.8909061 ..., 0.4192499 ..., -0.1746874 ..., 0.3493749 ..., 5.7245087 ...,
-0.0048981 ..., 0.9992199 ..., -0.0391850 ..., 0.0489813 ..., 20.4159251 ...}
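The encoding function conversion of this example can be checked numerically with a short Python sketch. The names encode_element and encode_net and the dictionary layout (bias weight last) are assumptions for illustration; the arithmetic follows the definition of the vector length and the division given in the text.

```python
import math

# Example connection weights of FIG. 4 as quoted in the text;
# the weight from the bias element is last in each list.
NET = {
    "h1": [0.1, 2.3, 3.0, -2.0],
    "h2": [2.3, -1.3, 10.0, 1.0],
    "h3": [2.1, 2.2, -3.3, -2.0],
    "o1": [-5.1, 2.4, -1.0, 2.0],
    "u": [-0.1, 20.4, -0.8, 1.0],
}


def encode_element(weights):
    """Return [w_1/L, ..., w_b/L, L], where L is the Euclidean length
    of the weights excluding the bias weight."""
    *inputs, bias = weights
    length = math.sqrt(sum(w * w for w in inputs))
    return [w / length for w in inputs] + [bias / length, length]


def encode_net(net, order):
    """Concatenate the per-element codes in the given element order."""
    code = []
    for name in order:
        code.extend(encode_element(net[name]))
    return code


code = encode_net(NET, ["h1", "h2", "h3", "o1", "u"])
print([round(c, 7) for c in code[:5]])  # codes for element h1
```

Each element contributes its four normalized weights plus one length, so the encoded neural network has 25 entries where the conventional encoding has 20.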
[0049]
In this way, network elements that are functionally similar but whose weights differ greatly in magnitude are converted into similar code vectors (for example, {10, 20, 30, −40} and {20, 40, 60, −80} can be converted into {1, 2, 3, −4, 10} and {1, 2, 3, −4, 20}). This makes it possible to optimize the connection weight parameters while avoiding redundancy with respect to the function of each network element, and thus to realize high-performance optimization.
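The point about functionally similar elements can be illustrated with a few lines of Python. With the plain division by the vector length described for FIG. 6, two weight sets that differ only by a positive scale factor share exactly the same normalized part and differ only in the final length code. The function name encode_element is an assumption; note also that the shared part is the unit-length direction, so the round numbers quoted in the text should be read as a schematic illustration.

```python
import math


def encode_element(weights):
    """Normalize by the length of the non-bias weights and append that length."""
    *inputs, bias = weights
    length = math.sqrt(sum(w * w for w in inputs))
    return [w / length for w in inputs] + [bias / length, length]


a = encode_element([10.0, 20.0, 30.0, -40.0])
b = encode_element([20.0, 40.0, 60.0, -80.0])

# Same direction codes, length codes in the ratio 1 : 2.
same_direction = all(abs(x - y) < 1e-12 for x, y in zip(a[:-1], b[:-1]))
print(same_direction, b[-1] / a[-1])
```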
[0050]
Next, FIG. 7 illustrates the processing of the neural network element decoding unit 107 in the neural network connection weight optimization apparatus of FIG. 1.
[0051]
As described above, the neural network element decoding unit 107 generates a new neural network by treating codes corresponding to the same element in the new encoded neural network as the same group and performing a predetermined inverse function conversion.
[0052]
Here, the inverse function conversion will be described in more detail.
[0053]
In the example of FIG. 7, for each code vector corresponding to a network element in the new encoded neural network, the portion of the new neural network corresponding to that element is generated by multiplying the other variables in the code vector by the absolute value of the variable representing the vector length.
[0054]
For example, for the code (301) corresponding to the element u in the new encoded neural network,
{w_{u,1}^{new}  w_{u,2}^{new}  w_{u,3}^{new}  w_{u,b}^{new}  L_u^{new}}
the absolute value |L_u^{new}| of the code L_u^{new} corresponding to the vector length is obtained (L_u^{new} may be negative), and each of the codes other than L_u^{new}, that is, w_{u,1}^{new} to w_{u,b}^{new} corresponding to the respective connection weights, is multiplied by |L_u^{new}|:
W_{u,1}^{new} = w_{u,1}^{new} · |L_u^{new}|
W_{u,2}^{new} = w_{u,2}^{new} · |L_u^{new}|
W_{u,3}^{new} = w_{u,3}^{new} · |L_u^{new}|
W_{u,b}^{new} = w_{u,b}^{new} · |L_u^{new}|
This yields the connection weights (303) input to the element u of the new neural network:
{W_{u,1}^{new}  W_{u,2}^{new}  W_{u,3}^{new}  W_{u,b}^{new}}
The same applies to the codes corresponding to the elements h1, h2, h3, and o1 in the new encoded neural network.
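The inverse function conversion for one element is a one-line scaling, sketched below in Python. The name decode_element and the code layout (normalized weights followed by the length code) are assumptions consistent with the encoding example; taking the absolute value makes a negative length code, which a crossover may produce, behave like its positive counterpart.

```python
def decode_element(code):
    """Recover connection weights from [w_1, ..., w_b, L] by multiplying
    every normalized weight by |L|."""
    *unit, length = code
    scale = abs(length)
    return [w * scale for w in unit]


restored = decode_element([0.6, 0.8, -0.5, 5.0])
mirrored = decode_element([0.6, 0.8, -0.5, -5.0])  # negative length code
print(restored, mirrored)
```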
[0055]
Various variations can be considered for the inverse function transformation.
[0056]
For example, in the above, the absolute value of L^{new} is taken; alternatively, L^{new} may be used as is if it is greater than or equal to a predetermined threshold L_th, and L^{new} = L_th may be set if it is less than L_th.
[0057]
Also, in the above, L^{new} is allowed to be negative; alternatively, an algorithm under which L^{new} never becomes negative may be adopted in step S2 of FIG. 2, or, even with an algorithm under which L^{new} may become negative (or 0 or less), L^{new} may be corrected to 0 or more (or to a positive value) when it becomes negative (or 0 or less), or the new encoded neural network concerned may be left unused (discarded). In these cases, processing such as taking the absolute value of L^{new} becomes unnecessary.
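The alternative treatments of the length code can be sketched in Python as follows. The function names, the mode strings, and the example threshold value are hypothetical; the text only states that the threshold L_th is predetermined.

```python
def correct_length(length, mode="abs", threshold=1e-3):
    """Return the length value actually used for decoding.

    mode "abs":       use |length| (the example of FIG. 7)
    mode "threshold": use length if it is at least threshold, else threshold
    """
    if mode == "abs":
        return abs(length)
    if mode == "threshold":
        return length if length >= threshold else threshold
    raise ValueError("unknown mode: " + mode)


def is_usable(length_codes):
    """Discard variant: a new encoded network is used only if every
    length code is positive."""
    return all(length > 0 for length in length_codes)
```

For example, correct_length(-2.0) gives 2.0, correct_length(-2.0, mode="threshold") gives the threshold, and is_usable([3.7, -0.2]) is False, so that candidate would be discarded.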
[0058]
In addition to the above, various variations can be considered for the encoding function conversion and the inverse function conversion.
[0059]
For example, in the encoding function conversion described above, among the weights (W_1 to W_n) included in the group of connection weights input to a network element, the vector length L (= (W_1^2 + ... + W_t^2)^{1/2}) of the weights other than those from the bias element (W_1 to W_t) is calculated, and all the elements (W_1 to W_n) divided by L are concatenated with L as a code, generating the portion of the encoded neural network corresponding to the network element (W_1/L, ..., W_n/L, L). Alternatively, the vector length L may be multiplied by k (where k > 0), and all the elements W_1 to W_n divided by L·k may be concatenated with L·k as a code, generating the portion of the encoded neural network corresponding to the network element (W_1/(L·k), ..., W_n/(L·k), (L·k)). The inverse function conversion corresponding to this variation is the same as in the example described so far if L·k is treated as L.
[0060]
Also, for example, the vector length L may be multiplied by k (where k > 0), the elements W_1 to W_t divided by L and the elements W_{t+1} to W_n divided by L·k, and L and L·k concatenated as codes, generating the portion of the encoded neural network corresponding to the network element (W_1/L, ..., W_t/L, W_{t+1}/(L·k), ..., W_n/(L·k), L, (L·k)) (the vector length portion may be L and k instead of L and (L·k)). The inverse function conversion corresponding to this variation is the same as in the example described so far if L is treated as L for W_1 to W_t and L·k is treated as L for W_{t+1} to W_n.
[0061]
Further, for example, the vector length L may be multiplied by k1 (where k1 > 0) and also by k2 (where k2 > 0), the elements W_1 to W_t divided by L·k1 and the elements W_{t+1} to W_n divided by L·k2, and L·k1 and L·k2 concatenated as codes, generating the portion of the encoded neural network corresponding to the network element (W_1/(L·k1), ..., W_t/(L·k1), W_{t+1}/(L·k2), ..., W_n/(L·k2), (L·k1), (L·k2)) (the vector length portion may be L, k1, and k2 instead of (L·k1) and (L·k2)). The inverse function conversion corresponding to this variation is the same as in the example described so far if L·k1 is treated as L for W_1 to W_t and L·k2 is treated as L for W_{t+1} to W_n.
[0062]
The values k, k1, and k2 may be fixed, or may be determined for each neural network.
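The variation with two scale factors can be sketched in Python as below; the single-factor variations are special cases (k1 = k2 = k, or k1 = 1 and k2 = k). The function names and the parameter n_bias, which gives the number of trailing bias weights W_{t+1} to W_n, are assumptions for illustration.

```python
import math


def encode_element_k(weights, k1=1.0, k2=1.0, n_bias=1):
    """Encode as (W_1/(L*k1), ..., W_t/(L*k1), W_{t+1}/(L*k2), ...,
    W_n/(L*k2), L*k1, L*k2), with L the length of the non-bias weights."""
    inputs, bias = weights[:-n_bias], weights[-n_bias:]
    length = math.sqrt(sum(w * w for w in inputs))
    s1, s2 = length * k1, length * k2
    return [w / s1 for w in inputs] + [b / s2 for b in bias] + [s1, s2]


def decode_element_k(code, n_bias=1):
    """Inverse conversion: scale the non-bias codes by |L*k1| and the
    bias codes by |L*k2|."""
    *body, s1, s2 = code
    inputs, bias = body[:-n_bias], body[-n_bias:]
    return [w * abs(s1) for w in inputs] + [b * abs(s2) for b in bias]


original = [-0.1, 20.4, -0.8, 1.0]          # weights of element u in the text
code = encode_element_k(original, k1=2.0, k2=0.5)
restored = decode_element_k(code)
```

Choosing k2 different from k1 changes the numeric range in which the bias code lives relative to the other codes, which may matter for crossover operators that sample within the range spanned by the parents.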
[0063]
Of course, in addition to these, various variations can be considered for the encoding function conversion and the inverse function conversion.
[0064]
As described above, according to the present embodiment, it is possible to optimize the connection weight of the neural network for a general model evaluation function. Further, optimization can be performed for each function of each network element.
[0065]
In the following, the operation and effect of the connection weight optimization processing of the neural network of this embodiment will be described according to the following example.
[0066]
FIG. 8 illustrates a neural network having 10 input elements, 5 intermediate elements, and 1 output element. In this example, the connection weight vector W of this network (number of connection weights = 61) is optimized.
[0067]
FIG. 9 shows an example of the data (inputs and teacher signals) used for the following model evaluation function f, which serves as the optimization criterion.
f(W) = Σ_{i=1}^{N} (t(i) − O(in(i), W))^2
Here,
f(W): model evaluation function value of the network with connection weights W
t(i): teacher signal of the i-th data item
in(i): input of the i-th data item
O(in(i), W): output value of the network when in(i) is input
The sum is taken over i = 1 to N.
In this example, a network with a smaller f(W) is evaluated as a preferable network.
When there are a plurality of output elements, as in the example of FIG. 4, the sum of the model evaluation function values f(W) over the output elements may be taken, for example.
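A minimal Python sketch of this evaluation function for the 10-5-1 network of FIG. 8 is given below. The text does not state the activation function of the elements, so the logistic sigmoid used here is an assumption, as are the function names and the layout in which each element holds its input weights followed by its bias weight.

```python
import math

N_IN, N_HID, N_OUT = 10, 5, 1

# Each hidden element has N_IN weights plus a bias weight, and each
# output element has N_HID weights plus a bias weight.
N_WEIGHTS = (N_IN + 1) * N_HID + (N_HID + 1) * N_OUT


def sigmoid(x):
    """Logistic activation (assumed; not specified in the text)."""
    return 1.0 / (1.0 + math.exp(-x))


def network_output(x, hidden, output):
    """O(in, W): hidden is a list of 5 weight lists of length 11,
    output is one weight list of length 6; the bias weight is last."""
    h = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
         for ws in hidden]
    return sigmoid(sum(w * v for w, v in zip(output[:-1], h)) + output[-1])


def evaluate(hidden, output, inputs, teachers):
    """f(W): sum of squared errors over the N data items; smaller is better."""
    return sum((t - network_output(x, hidden, output)) ** 2
               for x, t in zip(inputs, teachers))
```

The constant N_WEIGHTS reproduces the count of 61 connection weights given for FIG. 8: 55 weights into the intermediate elements and 6 into the output element.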
[0068]
FIGS. 10 and 11 show the optimization results (C1, C3) of the present embodiment of FIGS. 6 and 7 and the optimization results (C2, C4) of the conventional method of FIG. 5, in the case where the model evaluation function of FIG. 9 is used, the BLX method is used in the new encoded neural network generation unit 105, and the MGG method is used in the new neural network evaluation / selection unit 109.
[0069]
The horizontal axis of FIGS. 10 and 11 is the number of iterations of the loop of the procedure of FIG. 2; the vertical axis of FIG. 10 shows the change in the model evaluation function value, and the vertical axis of FIG. 11 shows the change in classification accuracy.
[0070]
From FIG. 10 and FIG. 11, it is clear that the optimization result of this embodiment is superior.
[0071]
As described above, according to the present embodiment, all the connection weights of the neural network are divided into groups according to the element to which they are input, and the encoding function conversion is applied to each group, so that each element of the neural network can be optimized independently. The connection weights can therefore be optimized more efficiently than with the conventional method.
[0072]
Each function described above can be realized as software.
The present embodiment can also be implemented as a program for causing a computer to execute predetermined means (or for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions), and as a computer-readable recording medium on which the program is recorded.
[0073]
Note that the configurations illustrated in the embodiment of the present invention are examples and are not intended to exclude other configurations. Other configurations obtained by replacing part of an illustrated configuration with something else, omitting part of an illustrated configuration, adding another function or element to an illustrated configuration, or combining these are also possible. Likewise possible are configurations that are logically equivalent to an illustrated configuration, configurations that include a portion logically equivalent to an illustrated configuration, and configurations that are logically equivalent to the main part of an illustrated configuration, as well as configurations that achieve the same or a similar purpose, or the same or a similar effect, as an illustrated configuration.
In addition, various variations of various components illustrated in the embodiment of the present invention can be implemented in appropriate combination.
Further, the embodiments of the present invention encompass inventions relating to various viewpoints, stages, concepts, or categories, such as an invention of the device as a whole, an invention of components inside the device, or an invention of a method corresponding thereto.
Therefore, an invention can be extracted from the contents disclosed in the embodiments of the present invention without being limited to the exemplified configurations.
[0074]
The present invention is not limited to the embodiment described above, and can be implemented with various modifications within the technical scope thereof.
[0075]
[Effect of the invention]
According to the present invention, the connection weights of a neural network can be optimized based on a general model evaluation function, with better results than conventional optimization.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration example of a neural network connection weight optimization apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart showing an example of a processing procedure of the neural network connection weight optimization device according to the embodiment.
FIG. 3 is a view for explaining the processing procedure of the neural network connection weight optimization device according to the embodiment.
FIG. 4 is a diagram for explaining processing of a neural network element encoding unit of the neural network connection weight optimization apparatus according to the embodiment.
FIG. 5 is a diagram for explaining encoding function conversion in a conventional method.
FIG. 6 is a diagram for explaining encoding function conversion in the neural network connection weight optimization device according to the embodiment.
FIG. 7 is a diagram for explaining inverse function conversion in the neural network connection weight optimization device according to the embodiment.
FIG. 8 is a diagram showing an example of the structure of a solution candidate neural network.
FIG. 9 is a diagram showing an example of a model evaluation function serving as a criterion for optimization.
FIG. 10 is a diagram for explaining an example of optimization results in comparison with a conventional method.
FIG. 11 is a diagram for explaining an example of optimization results in comparison with a conventional method.
[Explanation of symbols]
101 ... Solution candidate neural network set storage unit
102 ... Model evaluation function storage unit
103 ... Neural network element encoding unit
104 ... Encoded neural network set storage unit
105 ... New encoded neural network generation unit
106 ... New encoded neural network set storage unit
107 ... Neural network element decoding unit
108 ... New neural network set storage unit
109 ... New neural network evaluation / selection unit

Claims

A neural network connection weight optimization method for optimizing the connection weights of a neural network having a predetermined connection structure composed of a plurality of network elements and zero or one or more bias elements, the method comprising:
a first step of storing, in solution candidate neural network set storage means, one or more neural networks, each in association with an evaluation result obtained by evaluating the connection weights of that neural network based on a predetermined evaluation function;
a second step of, for each of a predetermined number of neural networks selected from the neural networks stored in the solution candidate neural network set storage means, encoding each group of connection weights relating to inputs to the same network element, among all the connection weights of the neural network, by applying a predetermined encoding function conversion that takes the function of the network element into consideration, and concatenating the codes obtained for the respective network elements to generate an encoded neural network;
a third step of newly generating a predetermined number of encoded neural networks based on the encoded neural networks generated in the second step;
a fourth step of, for each of the encoded neural networks generated in the third step, applying a predetermined inverse function conversion that takes the function of the network element into consideration to each code portion corresponding to a network element, among all the codes of the encoded neural network, to newly generate a neural network of the same format as the neural networks stored in the solution candidate neural network set storage means; and
a fifth step of evaluating each of the neural networks generated in the fourth step based on the model evaluation function to obtain an evaluation result, determining, at least based on the evaluation result, whether to add the neural network and its evaluation result to the solution candidate neural network set storage means, and storing each neural network determined to be added in the solution candidate neural network set storage means in association with its evaluation result.

The neural network connection weight optimization method according to claim 1, wherein, in the fifth step, an existing neural network stored in the solution candidate neural network set storage means that satisfies a predetermined condition is deleted, together with its evaluation result, from the solution candidate neural network set storage means.

The neural network connection weight optimization method according to claim 1, wherein, in the fifth step, the evaluation results of the neural networks generated in the fourth step are compared with the evaluation results of those neural networks, among the existing neural networks stored in the solution candidate neural network set storage means, that were selected in the second step; the neural networks having the better evaluation results are selected in the same number as the neural networks selected in the second step; and the portion of the neural networks and evaluation results stored in the solution candidate neural network set storage means that corresponds to the neural networks selected in the second step and their evaluation results is updated with the selected neural networks and their evaluation results.

The neural network connection weight optimization method according to any one of claims 1 to 3, wherein the second to fifth steps are repeatedly executed until it is determined, after the fifth step, that a predetermined end condition is satisfied, and
when it is determined, after the fifth step, that the predetermined end condition is satisfied, the neural network having the best evaluation result at that time among the neural networks stored in the solution candidate neural network set storage means is selected as the optimized neural network.

The neural network connection weight optimization method according to any one of claims 1 to 4, wherein, in the second step, when the predetermined encoding function conversion is performed on the connection weights relating to inputs to the same network element of the neural network, the vector length of a connection weight vector consisting of those connection weights, among all the connection weights relating to inputs to the network element, other than the connection weight relating to the input from the bias element is obtained, and the encoded neural network is generated by concatenating, as a code, the values obtained by performing a predetermined operation on each of the connection weights relating to inputs to the network element based on the vector length or a value obtained by multiplying the vector length by a positive number, together with the vector length or the value obtained by multiplying the vector length by a positive number.

6. The neural network connection weight optimization method according to claim 5, wherein the predetermined operation is an operation of dividing the connection weight by the vector length or a value obtained by multiplying the vector length by a positive number.

In the second step, when the predetermined encoding function conversion is performed on the connection weight related to the input to the same network element of the neural network, the total connection weight related to the input to the network element is calculated. Of these, the vector length of a coupling weight vector composed of coupling weights other than the coupling weight related to the input from the bias element is obtained, and the coupling weight related to the input from the bias element based on a value obtained by multiplying the vector length by a first positive number And a value obtained by performing a predetermined operation on each of the connection weights other than, and the vector length multiplied by a second positive multiple (however, the first positive number and the second positive number are different values and the first A value obtained by performing a predetermined operation on each of the coupling weights related to the input from the bias element based on a value obtained by at least one of a positive number of 1 and a second positive number being a value that is not 1), and the vector The length is the first positive 2. The encoded neural network is generated by concatenating, as a code, a set of a value obtained by specifying a value obtained by multiplying the multiplied value and a value obtained by multiplying the vector length by a second positive number as a code. 5. The neural network connection weight optimization method according to any one of items 4 to 4.

The predetermined calculation divides a coupling weight other than a coupling weight related to an input from the bias element by a value obtained by multiplying the vector length by a first positive number, and a coupling weight related to an input from the bias element. 7. The method for optimizing a neural network connection weight according to claim 5, wherein the vector length is divided by a value obtained by multiplying the vector length by a second positive number.

In the fourth step, in performing the predetermined inverse function transformation on a code portion corresponding to one network element of all codes of the coding neural network, the vector length or Based on a code value corresponding to a value obtained by multiplying the vector length by a positive number, a predetermined operation is performed for each code value corresponding to a value obtained by performing a predetermined operation on each of the coupling weights in the code portion. 7. The neural network connection weight optimization method according to claim 5, wherein the corresponding connection weight in the new neural network is obtained by applying

The predetermined calculation obtains a correction value obtained by performing a predetermined correction on a code value corresponding to the vector length or a value obtained by multiplying the vector length by a positive number in the code portion, and the combination in the code portion. The neural network connection weight optimization method according to claim 9, wherein the correction is performed by multiplying a code value corresponding to a value obtained by performing a predetermined calculation on each of the weights.

In the fourth step, when performing the predetermined inverse function transformation on the code portion corresponding to one of the network elements among all the codes of the coding neural network, the vector length of the code portion is set. The vector length specified based on a set of code values corresponding to a set of values that can specify a value obtained by multiplying a first positive number and a value obtained by multiplying the vector length by a second positive number is a first positive number. Based on a value corresponding to the multiplied value, a code value corresponding to a value obtained by performing a predetermined calculation on each of the coupling weights other than the coupling weight related to the input from the bias element in the code part. A code corresponding to a set of values that perform a predetermined calculation and that can specify a value obtained by multiplying the vector length by a first positive number and a value obtained by multiplying the vector length by a second positive number in the code portion. Identified based on value pairs Further, based on a value corresponding to a value obtained by multiplying the vector length by a second positive number, it corresponds to a value obtained by performing a predetermined calculation on each of the coupling weights related to the input from the bias element in the sign portion. 9. The neural network connection weight optimization method according to claim 7 or 8, wherein a corresponding connection weight in the new neural network is obtained by performing a predetermined operation on each of the code values.

The neural network connection weight optimization method according to claim 11, wherein the predetermined operation obtains a first correction value by applying a predetermined correction to the value corresponding to the identified vector length multiplied by the first positive number, and multiplies by the first correction value each code value, in the code portion, that corresponds to a value obtained by performing a predetermined calculation on each of the connection weights other than the connection weight related to the input from the bias element; and obtains a second correction value by applying a predetermined correction to the value corresponding to the identified vector length multiplied by the second positive number, and multiplies by the second correction value each code value, in the code portion, that corresponds to a value obtained by performing a predetermined calculation on each connection weight related to the input from the bias element.
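
The separate handling of bias and non-bias weights in this claim can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the code portion is laid out as [a, b, d_1, ..., d_n, d_bias], where a and b directly hold the values corresponding to the vector length multiplied by the first and second positive numbers, and it assumes the absolute value as the default correction. The function name decode_element_with_bias is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def decode_element_with_bias(
    code: Sequence[float],
    correct: Callable[[float], float] = abs,
) -> Tuple[List[float], float]:
    """Decode [a, b, d_1..d_n, d_bias] into (non-bias weights, bias weight).

    a stands for (vector length x first positive number) and
    b for (vector length x second positive number); this layout is assumed.
    """
    if len(code) < 3:
        raise ValueError("code portion needs two length codes and a bias code")
    first_correction = correct(code[0])   # scales the non-bias weights
    second_correction = correct(code[1])  # scales the bias weight only
    non_bias = [first_correction * d for d in code[2:-1]]
    bias = second_correction * code[-1]
    return non_bias, bias
```

Keeping two length codes lets a search operator rescale the bias weight independently of the other inputs to the same network element.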

The neural network connection weight optimization method according to claim 10 or 12, wherein the predetermined correction is a correction that takes the absolute value of the value, or a correction that, when the value is less than a predetermined positive threshold, replaces the value with the predetermined positive threshold.
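
Reading this claim as naming two alternative corrections (an absolute value, or a clamp to a predetermined positive threshold), both can be written as small Python functions. The function names and the default threshold of 1e-6 are assumptions chosen for illustration; the claim does not give a concrete threshold.

```python
def absolute_value_correction(value: float) -> float:
    """Correction that takes the absolute value of the length code."""
    return abs(value)


def threshold_correction(value: float, threshold: float = 1e-6) -> float:
    """Correction that raises values below a positive threshold to it."""
    if threshold <= 0.0:
        raise ValueError("threshold must be a positive number")
    return threshold if value < threshold else value
```

Either correction guarantees that the value multiplied into the direction codes is non-negative, which matters because a vector length produced by a search operator is not constrained to stay positive.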

The neural network connection weight optimization method according to claim 1, wherein in the third step, the encoded neural network is generated by a genetic algorithm.

A neural network connection weight optimization device for optimizing connection weights of a neural network having a predetermined connection structure comprising a plurality of network elements and zero or more bias elements, the device comprising:
solution candidate neural network set storage means for storing one or more neural networks in association with evaluation results obtained by evaluating the connection weights of each neural network based on a predetermined evaluation function;
means for generating an encoded neural network, for each of a predetermined number of neural networks selected from the neural networks stored in the solution candidate neural network set storage means, by encoding the connection weights of the neural network, grouped by the network element whose input they relate to, through a predetermined encoding function transformation that takes the function of the network element into account, and concatenating the codes obtained for the respective network elements;
means for newly generating a predetermined number of encoded neural networks based on the encoded neural networks generated by the preceding means;
means for newly generating, for each of the encoded neural networks generated by the preceding means, a neural network of the same format as the neural networks stored in the solution candidate neural network set storage means, by applying a predetermined inverse function transformation that takes the function of the network element into account to each code portion corresponding to a network element among all the codes of the encoded neural network; and
means for evaluating each of the neural networks generated by the preceding means based on the model evaluation function to obtain an evaluation result, determining, at least on the basis of the evaluation result, whether to add the neural network and its evaluation result to the solution candidate neural network set storage means, and storing each neural network determined to be added in the solution candidate neural network set storage means in association with its evaluation result.
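
The cooperation of the storage means and the four processing means can be sketched as one loop in Python. This is an illustration under stated assumptions, not the claimed device: the encoding is assumed to be [vector length, weights divided by length] per network element, new codes are assumed to come from uniform crossover with Gaussian mutation, a lower evaluation value is assumed to be better, and "adding" is simplified to replacing the worst stored candidate. All names (encode, decode, optimize) and parameters are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Element = List[float]    # incoming connection weights of one network element
Network = List[Element]  # one weight vector per network element


def encode(net: Network) -> List[float]:
    """Concatenate [length, w/length, ...] per element (assumed coding)."""
    code: List[float] = []
    for weights in net:
        # Assumption: a zero-length vector is coded with length 1.0.
        length = math.sqrt(sum(w * w for w in weights)) or 1.0
        code.append(length)
        code.extend(w / length for w in weights)
    return code


def decode(code: Sequence[float], sizes: Sequence[int]) -> Network:
    """Inverse transformation: |length code| times each direction code."""
    net: Network = []
    pos = 0
    for n in sizes:
        length = abs(code[pos])  # absolute-value correction (assumed)
        net.append([length * c for c in code[pos + 1:pos + 1 + n]])
        pos += n + 1
    return net


def optimize(pool: List[Tuple[Network, float]],
             evaluate: Callable[[Network], float],
             generations: int = 50,
             children: int = 4,
             sigma: float = 0.1,
             seed: int = 0) -> Tuple[Network, float]:
    """Run the encode / generate / decode / evaluate / store cycle.

    pool plays the role of the solution candidate set and must hold
    at least two (network, evaluation) pairs; it is updated in place.
    """
    rng = random.Random(seed)
    sizes = [len(element) for element in pool[0][0]]
    for _ in range(generations):
        # Select two stored networks and encode them.
        parents = [encode(net) for net, _ in rng.sample(pool, 2)]
        for _ in range(children):
            # Generate a new code: uniform crossover plus Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, sigma)
                     for pair in zip(*parents)]
            net = decode(child, sizes)
            score = evaluate(net)
            # Store the new network only if it beats the worst candidate.
            worst = max(range(len(pool)), key=lambda i: pool[i][1])
            if score < pool[worst][1]:
                pool[worst] = (net, score)
    return min(pool, key=lambda item: item[1])
```

Because a new network only ever replaces a worse one, the best evaluation value held in the candidate set never gets worse from one generation to the next.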

A program for causing a computer to function as a neural network connection weight optimization device that optimizes connection weights of a neural network having a predetermined connection structure comprising a plurality of network elements and zero or more bias elements, the program causing the computer to realize:
a function for storing, in solution candidate neural network set storage means, one or more neural networks in association with evaluation results obtained by evaluating the connection weights of each neural network based on a predetermined evaluation function;
a function for generating an encoded neural network, for each of a predetermined number of neural networks selected from the neural networks stored in the solution candidate neural network set storage means, by encoding the connection weights of the neural network, grouped by the network element whose input they relate to, through a predetermined encoding function transformation that takes the function of the network element into account, and concatenating the codes obtained for the respective network elements;
a function for newly generating a predetermined number of encoded neural networks based on the encoded neural networks generated by the preceding function;
a function for newly generating, for each of the encoded neural networks generated by the preceding function, a neural network of the same format as the neural networks stored in the solution candidate neural network set storage means, by applying a predetermined inverse function transformation that takes the function of the network element into account to each code portion corresponding to a network element among all the codes of the encoded neural network; and
a function for evaluating each of the neural networks generated by the preceding function based on the model evaluation function to obtain an evaluation result, determining, at least on the basis of the evaluation result, whether to add the neural network and its evaluation result to the solution candidate neural network set storage means, and storing each neural network determined to be added in the solution candidate neural network set storage means in association with its evaluation result.