JP2022161099A

JP2022161099A - Arithmetic apparatus, integrated circuit, machine learning apparatus, and discrimination apparatus

Info

Publication number: JP2022161099A
Application number: JP2021065650A
Authority: JP
Inventors: 富美男大庭; Fumio Oba
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-04-08
Filing date: 2021-04-08
Publication date: 2022-10-21

Abstract

To provide an arithmetic apparatus that can handle phenomena expressed by power exponents and can accurately derive the correlation between input and output in such phenomena. SOLUTION: An arithmetic apparatus uses a neural network structure including at least an input layer and an output layer to output an output value from the output layer for a plurality of input data (D0, D1, ..., DN) input to the input layer. The input layer has, as learning parameters, a plurality of power exponents (p0, p1, ..., pN) that are respectively associated with the plurality of input data and to which the plurality of input data are respectively raised. The output layer outputs an output value (y=f(YY0)) based on the product (YY0=D0^p0*D1^p1*...*DN^pN) of the power values (D0^p0, D1^p1, ..., DN^pN) obtained by raising the plurality of input data input to the input layer to the respective power exponents. SELECTED DRAWING: Figure 1

Description

The present invention relates to an arithmetic device, an integrated circuit, a machine learning device, and a discrimination device.

In recent years, machine learning has been applied in a variety of fields, and neural network structures in particular are widely used for both regression and classification problems. In such a neural network structure, each of the plurality of input data input to the input layer is multiplied by its own weighting coefficient, and an output value based on the sum of these products is output from the output layer (see, for example, Patent Document 1 and Patent Document 2).

Patent Document 1: JP-A-7-129535
Patent Document 2: JP-A-7-141315

Conventional neural network structures such as those described in Patent Document 1 and Patent Document 2 perform machine learning by adjusting the weighting coefficients, but the "power exponent" applied to each of the input data is fixed, for example at 1. Consequently, if the power exponents for the input data in the phenomenon being modeled are known in advance and are fixed at those values, the weighting coefficients can be expected to converge during machine learning to values that properly describe the phenomenon.

However, in natural, economic, social, and other phenomena, it is naturally conceivable that the power exponents for the plurality of input data are not known in advance, or that the output value is calculated from the product of power values obtained by raising the plurality of input data to power exponents. In such cases, even if a conventional neural network structure can approximate weighting coefficients that describe the phenomenon for one particular combination of input data, the error in the output data becomes large for other combinations of input data, so there is a structural problem in that it is difficult for machine learning to converge to appropriate weighting coefficients. In other words, when the phenomenon to be modeled includes a product of power values of the input data, a conventional neural network structure cannot accurately derive the correlation that holds between the input (input data) and the output (output value).

In view of the above problems, an object of the present invention is to provide an arithmetic device, an integrated circuit, a machine learning device, and a discrimination device that make it possible to handle phenomena expressed by power exponents and to accurately derive the correlation that holds between input and output in such phenomena.

To achieve the above object, an arithmetic device according to one aspect of the present invention is
an arithmetic device that uses a neural network structure including at least an input layer and an output layer to output an output value from the output layer for a plurality of input data (D0, D1, ..., DN) input to the input layer, wherein
the input layer
has, as learning parameters of the neural network structure, a plurality of power exponents (p0, p1, ..., pN) that are respectively associated with the plurality of input data and to which the plurality of input data are respectively raised, and
the output layer
outputs the output value (y=f(YY0)) based on the product (YY0=D0^p0*D1^p1*...*DN^pN) of a plurality of power values (D0^p0, D1^p1, ..., DN^pN) obtained by raising the plurality of input data input to the input layer to the plurality of power exponents, respectively.

According to the neural network structure used by the arithmetic device according to one aspect of the present invention, the input layer has, as learning parameters of the neural network structure, a plurality of power exponents to which the plurality of input data are respectively raised, and the output layer outputs an output value based on the product of the plurality of power values obtained by raising the plurality of input data input to the input layer to the respective power exponents. The arithmetic device can therefore handle phenomena expressed by power exponents and can accurately derive the correlation that holds between input and output in such phenomena.

Problems, configurations, and effects other than those described above will become clear from the description of the modes for carrying out the invention given below.

A diagram illustrating the neural network structure 100A used by the arithmetic device according to the first basic form of the present invention, and its basic principle.
A diagram illustrating the neural network structure 100B used by the arithmetic device according to the second basic form of the present invention, and its basic principle.
A diagram illustrating the neural network structure 100C used by the arithmetic device according to the third basic form of the present invention, and its basic principle.
A block diagram showing the configuration of the arithmetic device 1 that uses the neural network structures according to the first to third basic forms of the present invention.
A diagram showing the structure of the neural network according to the first embodiment of the present invention.
A diagram showing the structure of the additive neural network with added power exponents according to the first embodiment of the present invention.
A flowchart showing the method by which the neural network device according to the first embodiment of the present invention searches for the optimal solution for the power exponents.
A diagram showing the structure of the multilayer neural network according to the first embodiment of the present invention.
A diagram showing the configurations of the difference matrix and the product input matrix according to the second embodiment of the present invention.
A flowchart showing the method of searching for the optimal solution using the difference search method according to the second embodiment of the present invention.
A table listing the names of nine planets and two measured quantities (mean distance from the Sun, orbital period) according to Example 1 of the present invention.
An output plot according to Example 1 of the present invention with the coefficient of variation as the output value, plotted at coordinates (p0, p1) with the power exponent p0 of D0 on the horizontal axis and the power exponent p1 of D1 on the vertical axis.
A three-dimensional wireframe plot according to Example 1 of the present invention in which the output values of the coefficient of variation are converted to log values (common logarithm).
A table listing the nine planets and the values of YY/W according to Example 1 of the present invention.
A three-dimensional wireframe plot according to Example 1 of the present invention in which the output values of the coefficient of variation are converted to log values (common logarithm) after the data are changed so that the answer is YY/W=D0^(-5)*D1^(3).
A drawing of the ten triangles used for the discovery of Heron's formula according to Example 2 of the present invention.
A table listing the lengths of the three sides and the areas of the ten triangles according to Example 2 of the present invention.
A table listing the three-side calculation formulas that serve as product input elements according to Example 2 of the present invention.
A table listing the five-dimensional input data table input to the power search method according to Example 2 of the present invention.
A table listing the ten triangles (SN column) and the values of YY/W according to Example 2 of the present invention.
A table according to Example 2 of the present invention in which the areas S of the even-numbered triangles are multiplied by 1.0 and the areas of the odd-numbered triangles by 0.9, classifying them into group A and group B, respectively.
A graph of the neural network output value Z-Act in order of triangle number according to Example 2 of the present invention.
A graph of YY/W in order of triangle number according to Example 2 of the present invention.
A diagram of the CartPole inverted pendulum according to Example 4 of the present invention.
A table listing the outputs of the CartPole inverted pendulum according to Example 4 of the present invention.
A table listing the actions that can be taken from the state variables of the CartPole inverted pendulum according to Example 4 of the present invention.
A diagram showing the structure of the conventional neural network according to Example 4 of the present invention.
A table listing the rewards given at the end of the t-th episode according to Example 4 of the present invention.
A flowchart using the conventional policy gradient method according to Example 4 of the present invention.
A graph showing the transition of the number of steps obtained by implementing the conventional policy gradient method in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A table listing five examples of weighting parameters with which the pole stayed upright when the conventional policy gradient method was implemented in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A flowchart of the reinforcement learning algorithm controlled using the power search method according to Example 4 of the present invention.
A table in which the update amounts Δpn for updating the power exponents are set in an array of deviation N according to Example 4 of the present invention.
A graph showing the transition of the number of steps obtained by implementing the power search method in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A table listing five examples of power exponent values with which the pole stayed upright when the power search method was implemented in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A graph of YY/W in order of step No. (the chronological order in which the cart was pushed) according to Example 4 of the present invention.
A table summarizing the behavior of the cart when the value of threshold A is changed according to Example 4 of the present invention.
A table listing three examples of power exponent values with which the pole stayed upright when the input data were narrowed down to the angle and angular velocity of the pole and the power search method was implemented in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A picture of an applied operation in which a control formula that moves the cart left and right was implemented in the CartPole inverted pendulum simulation according to Example 4 of the present invention.
A truth table of the two-input exclusive OR (EXOR) according to Example 5 of the present invention.
A truth table of the three-input exclusive OR (EXOR) according to Example 5 of the present invention.
A table of discrimination learning results for the three-input exclusive OR (EXOR) using the additive neural network with added power exponents according to Example 5 of the present invention.
A table showing the relationship between binary and decimal numbers according to Example 5 of the present invention.
A table showing the results of searching, using the additive neural network with added power exponents, for a formula relating binary and decimal numbers according to Example 5 of the present invention.

The present invention is described below with reference to the drawings, divided into "basic forms", which show the basic principle of the invention, and "embodiments", which apply that basic principle to carry out the invention. The description schematically covers what is needed to explain how the object of the invention is achieved and focuses on what is needed to explain the relevant parts of the invention; parts whose description is omitted rely on known art.

(First basic form)
FIG. 1 is a diagram illustrating the neural network structure 100A used by the arithmetic device according to the first basic form of the present invention, and its basic principle.

The arithmetic device uses a neural network structure 100A including at least an input layer 110A and an output layer 120A to output an output value y from the output layer 120A for a plurality of input data Dn = (D0, D1, ..., DN) input to the input layer 110A.

The neural network structure 100A shown in FIG. 1 consists of an input layer 110A having N+1 neurons (nodes), where N is a natural number of 1 or more, and an output layer 120A having one neuron (node). The N+1 neurons of the input layer 110A are each connected to the one neuron of the output layer 120A by N+1 synapses (edges). Each synapse may be associated with one of N+1 weighting parameters wn = (w0, w1, w2, ..., wN); this basic form describes the case where every weight wn is 1.

The N+1 neurons of the input layer 110A are respectively associated with the N+1-dimensional input data Dn, and each receives its corresponding input datum. The input layer 110A also has, as learning parameters of the neural network structure 100A, N+1-dimensional power exponents pn = (p0, p1, ..., pN) to which the N+1-dimensional input data Dn are respectively raised. At least one of the N+1-dimensional input data Dn may be data represented by a complex number.

The output layer 120A outputs the output value y (= f(YY0)) based on the product YY0 (= D0^p0 * D1^p1 * ... * DN^pN) of the N+1-dimensional power values Dn^pn = (D0^p0, D1^p1, ..., DN^pN) obtained by raising the N+1-dimensional input data Dn input to the input layer 110A to the N+1-dimensional power exponents pn, respectively. The output layer 120A therefore outputs the output value y as shown in (Equation 1-1) and (Equation 1-2) below, where "*" denotes multiplication.

(Equation 1-1)
YY0 = D0^p0 * D1^p1 * ... * DN^pN
(Equation 1-2)
y = f(YY0)
The parameters in the above equations are as follows.
Dn (n = 0, 1, ..., N): input data
pn (n = 0, 1, ..., N): power exponent (learning parameter)
Dn^pn (n = 0, 1, ..., N): power value
YY0: product of the power values
y: output value
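As a minimal sketch of (Equation 1-1) and (Equation 1-2), the forward computation can be written in a few lines of Python. The function names `power_product` and `forward`, the sample values, and the choice of the identity for f are assumptions made for illustration; the source does not fix f at this point.

```python
import math
from typing import Callable, Sequence


def power_product(inputs: Sequence[float], exponents: Sequence[float]) -> float:
    """Return YY0 = D0^p0 * D1^p1 * ... * DN^pN (Equation 1-1)."""
    if len(inputs) != len(exponents):
        raise ValueError("inputs and exponents must have the same length")
    return math.prod(d ** p for d, p in zip(inputs, exponents))


def forward(
    inputs: Sequence[float],
    exponents: Sequence[float],
    f: Callable[[float], float] = lambda v: v,  # assumption: identity output function
) -> float:
    """Return y = f(YY0) (Equation 1-2)."""
    return f(power_product(inputs, exponents))


# Hypothetical sample: three inputs and three exponents.
D = [2.0, 3.0, 4.0]
p = [2.0, 1.0, -0.5]
y = forward(D, p)  # 2^2 * 3^1 * 4^-0.5 = 4 * 3 * 0.5 = 6.0
print(y)
```

Because every weight wn is 1 in this basic form, the exponents are the only parameters that shape the output.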

The N+1-dimensional power exponents pn, as learning parameters, are learned by using multiple sets of learning data, each set including N+1-dimensional input data Dn and teacher data T associated with that input data.

The N+1-dimensional power exponents pn are adjusted so as to reduce the difference (error) between the output value y, which is output from the output layer 120A when the N+1-dimensional input data Dn included in the learning data are input to the input layer 110A, and the teacher data T included in the learning data.

As described above, the arithmetic device determines that a predetermined learning termination condition has been satisfied, and ends learning of the learning parameters, when the series of steps for adjusting (searching for) the learning parameters using the learning data has been repeated a predetermined number of times or when the above difference has become smaller than a predetermined tolerance. This yields a trained neural network structure 100A having the N+1-dimensional power exponents pn as learning parameters. By inputting N+1-dimensional input data Dn whose output value is unknown to the input layer 110A of the trained neural network structure 100A, the arithmetic device outputs the output value y for that input data from the output layer 120A.
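The adjust-and-stop loop described above can be sketched as follows. This is an illustration only: it swaps in a brute-force scan over a grid of candidate exponents, which is not necessarily the search procedure the source uses, and it assumes f is the identity. The names `search_exponents`, `max_trials`, and `tolerance`, and the synthetic data generated from T = D0^2 * D1^-1, are hypothetical.

```python
import itertools
import math


def predict(inputs, exponents):
    """Product of power values, with f taken as the identity (an assumption)."""
    return math.prod(d ** p for d, p in zip(inputs, exponents))


def mean_squared_error(samples, exponents):
    """Mean squared difference between the output y and the teacher data T."""
    return sum((predict(d, exponents) - t) ** 2 for d, t in samples) / len(samples)


def search_exponents(samples, candidates, max_trials=10_000, tolerance=1e-12):
    """Scan candidate exponent tuples; stop after max_trials or once the
    error falls below tolerance (the two termination conditions)."""
    best_p, best_err = None, math.inf
    n_inputs = len(samples[0][0])
    for trial, p in enumerate(itertools.product(candidates, repeat=n_inputs)):
        if trial >= max_trials:
            break
        err = mean_squared_error(samples, p)
        if err < best_err:
            best_p, best_err = p, err
        if best_err < tolerance:
            break
    return best_p, best_err


# Hypothetical learning data generated from T = D0^2 * D1^-1.
inputs = [(1.0, 2.0), (2.0, 1.0), (3.0, 2.0), (4.0, 3.0), (2.0, 5.0)]
samples = [(d, d[0] ** 2 / d[1]) for d in inputs]
candidates = [k / 2 for k in range(-6, 7)]  # -3.0, -2.5, ..., 3.0

best_p, best_err = search_exponents(samples, candidates)
print(best_p, best_err)
```

A grid scan only finds exponents that lie on the grid, so a practical search would refine the grid or use another update rule; the point here is only the interplay of error evaluation and the two stopping conditions.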

The arithmetic device may apply predetermined preprocessing (normalization, standardization, one-hot encoding, etc.) to the input data before they are input to the input layer 110A, and may apply predetermined post-processing to the output data after they are output from the output layer 120A.

According to the neural network structure 100A used by the arithmetic device according to this basic form, the input layer 110A has, as learning parameters of the neural network structure 100A, a plurality of power exponents to which the plurality of input data are respectively raised, and the output layer 120A outputs an output value based on the product of the plurality of power values obtained by raising the plurality of input data input to the input layer 110A to the respective power exponents. The arithmetic device can therefore handle phenomena expressed by power exponents and can accurately derive the correlation that holds between input and output in such phenomena.

(Second basic form)
FIG. 2 is a diagram illustrating the neural network structure 100B used by the arithmetic device according to the second basic form of the present invention, and its basic principle.

Like the first basic form (FIG. 1), the neural network structure 100B (FIG. 2) according to the second basic form includes at least an input layer 110B and an output layer 120B, but it differs from the first basic form in that the input layer 110B performs a logarithm calculation and the output layer 120B performs an antilogarithm (inverse logarithm) calculation. The following description focuses on the characteristic features of the neural network structure 100B according to the second basic form.

As in the first basic form, the N+1 neurons of the input layer 110B are respectively associated with the N+1-dimensional input data Dn = (D0, D1, ..., DN), and each receives its corresponding input datum. The input layer 110B also has, as learning parameters of the neural network structure 100B, N+1-dimensional power exponents pn = (p0, p1, ..., pN) to which the N+1-dimensional input data Dn are respectively raised. The input layer 110B converts the N+1-dimensional input data Dn into logarithms dn = (d0, d1, ..., dN), multiplies each logarithm dn by the corresponding power exponent pn, and outputs the resulting N+1-dimensional multiplied values dn*pn = (d0*p0, d1*p1, ..., dN*pN) to the output layer 120B. At least one of the N+1-dimensional input data Dn may be data represented by a complex number.

The output layer 120B converts the sum (d0*p0 + d1*p1 + ... + dN*pN) of the N+1-dimensional multiplied values dn*pn into an antilogarithm (base^{d0*p0 + d1*p1 + ... + dN*pN}) and, treating that antilogarithm as the product of the N+1-dimensional power values, outputs the output value y (= f(YY0)). The output layer 120B therefore outputs the output value y as shown in (Equation 2-1) and (Equation 2-2) below.

(Equation 2-1)
YY0 = base^{d0*p0 + d1*p1 + ... + dN*pN}
(= D0^p0 * D1^p1 * ... * DN^pN)
(Equation 2-2)
y = f(YY0)
The parameters in the above equations are as follows.
base: a positive number other than 1
Dn = base^dn (n = 0, 1, ..., N): input data
pn (n = 0, 1, ..., N): power exponent (learning parameter)
Dn^pn (n = 0, 1, ..., N): power value
YY0: product of the power values
y: output value
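The equivalence stated in (Equation 2-1) can be checked numerically with a short sketch. It assumes positive real inputs (so that the real logarithm exists) and an identity f; the function names and sample values are hypothetical.

```python
import math


def forward_log_domain(inputs, exponents, base=math.e, f=lambda v: v):
    """Second basic form: logarithm in the input layer, antilogarithm in the output layer."""
    # Input layer: dn = log_base(Dn), then multiply each dn by its exponent pn.
    multiplied = [math.log(d, base) * p for d, p in zip(inputs, exponents)]
    # Output layer: sum the multiplied values and convert back with base^(sum).
    yy0 = base ** sum(multiplied)
    return f(yy0)


def forward_direct(inputs, exponents, f=lambda v: v):
    """First basic form: direct product of power values, for comparison."""
    return f(math.prod(d ** p for d, p in zip(inputs, exponents)))


# Hypothetical sample values; base 10 is an arbitrary choice (any positive base except 1 works).
D = [2.0, 3.0, 4.0]
p = [2.0, 1.0, -0.5]
y_log = forward_log_domain(D, p, base=10.0)
y_direct = forward_direct(D, p)
print(y_log, y_direct)  # the two values agree up to floating-point rounding
```

Working in the logarithmic domain turns the product of powers into a weighted sum, which is the same multiply-accumulate operation a conventional neuron performs.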

As in the first basic form, the N+1-dimensional power exponents pn, as learning parameters, are learned by using multiple sets of learning data, each set including N+1-dimensional input data Dn and teacher data T associated with that input data.

The N+1-dimensional power exponents pn are adjusted so as to reduce the difference (error) between the output value y, which is output from the output layer 120B when the N+1-dimensional input data Dn included in the learning data are input to the input layer 110B, and the teacher data T included in the learning data.

As described above, the arithmetic device determines that a predetermined learning termination condition has been satisfied, and ends learning of the learning parameters, when the series of steps for adjusting (searching for) the learning parameters using the learning data has been repeated a predetermined number of times or when the above difference has become smaller than a predetermined tolerance. This yields a trained neural network structure 100B having the N+1-dimensional power exponents pn as learning parameters. By inputting N+1-dimensional input data Dn whose output value is unknown to the input layer 110B of the trained neural network structure 100B, the arithmetic device outputs the output value y for that input data from the output layer 120B.
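One property of the logarithmic formulation, which the source does not state, is that when f is the identity and all inputs and teacher values are positive, log T = p0*log D0 + p1*log D1 + ... is linear in the exponents. Under those assumptions the exponents can be fitted by ordinary least squares instead of an iterative search. The sketch below does this for two inputs by solving the 2x2 normal equations; the function name `fit_two_exponents` and the synthetic data are hypothetical, and this is an alternative technique shown for illustration, not the procedure described in the source.

```python
import math


def fit_two_exponents(samples):
    """Least-squares fit of (p0, p1) in log T = p0*log D0 + p1*log D1."""
    a00 = a01 = a11 = b0 = b1 = 0.0
    for (d0, d1), t in samples:
        x0, x1, z = math.log(d0), math.log(d1), math.log(t)
        # Accumulate the entries of the normal equations A p = b.
        a00 += x0 * x0
        a01 += x0 * x1
        a11 += x1 * x1
        b0 += x0 * z
        b1 += x1 * z
    det = a00 * a11 - a01 * a01
    if abs(det) < 1e-12:
        raise ValueError("log-inputs are collinear; the exponents are not identifiable")
    # Cramer's rule for the symmetric 2x2 system.
    p0 = (b0 * a11 - a01 * b1) / det
    p1 = (a00 * b1 - a01 * b0) / det
    return p0, p1


# Hypothetical learning data generated from T = D0^2 * D1^-1.
inputs = [(1.0, 2.0), (2.0, 1.0), (3.0, 2.0), (4.0, 3.0), (2.0, 5.0)]
samples = [(d, d[0] ** 2 / d[1]) for d in inputs]
p0, p1 = fit_two_exponents(samples)
print(p0, p1)  # close to 2 and -1
```

The shortcut no longer applies once f is nonlinear or the data contain zero, negative, or complex values, which is where an iterative adjustment such as the one described above is needed.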

According to the neural network structure 100B used by the arithmetic device according to this basic form, the input layer 110B converts the plurality of input data into logarithms and outputs to the output layer 120B the plurality of multiplied values obtained by multiplying each logarithm by the corresponding power exponent, and the output layer 120B converts the sum of the multiplied values into an antilogarithm and outputs an output value based on that antilogarithm. The arithmetic device can therefore handle phenomena expressed by power exponents and can accurately derive the correlation that holds between input and output in such phenomena.

(Third basic form)
FIG. 3 is a diagram illustrating the neural network structure 100C used by the arithmetic device according to the third basic form of the present invention, and its basic principle.

Like the first basic form (FIG. 1), the neural network structure 100C (FIG. 3) according to the third basic form includes an input layer 110C and an output layer 120C, but it differs from the first basic form in that it further includes a hidden layer 130 between the input layer 110C and the output layer 120C. The following description focuses on the characteristic features of the neural network structure 100C according to the third basic form.

As in the first basic form, the N+1 neurons of the input layer 110C are respectively associated with the N+1-dimensional input data Dn = (D0, D1, ..., DN), and each receives its corresponding input datum. The input layer 110C also has, as learning parameters of the neural network structure 100C, N+1-dimensional power exponents pn = (p0, p1, ..., pN) to which the N+1-dimensional input data Dn are respectively raised. At least one of the N+1-dimensional input data Dn may be data represented by a complex number.

The hidden layer 130 has a first hidden node 131 and a second hidden node 132. The first hidden node 131 receives the N+1-dimensional input data Dn via N+1-dimensional weighting parameters wn = (w0, w1, ..., wN) serving as learning parameters, and outputs a target value YY1 defined by (Equation 3-1) below to the output layer 120C. The second hidden node 132 receives the N+1-dimensional input data Dn via the N+1-dimensional weighting parameters wn together with a bias parameter b serving as a learning parameter, and outputs an additive operation output BYA defined by (Equation 3-2) below to the output layer 120C.

The output layer 120C outputs the output value y (= f(YY1, BYA)) based on the target value YY1 and the additive operation output BYA.

(Equation 3-1)
YY1 = D0^p0 * D1^p1 * ... * DN^pN * W0 * W1 * ... * WN
(Equation 3-2)
BYA = B * base^{Σ[n=0→N](wn * pn * dn)}
The parameters in the above equations are as follows.
base: a positive number other than 1
Dn = base^dn (n = 0, 1, ..., N): input data
pn (p0, p1, ..., pN): power exponent
Dn^pn: power value
wn = log_base(Wn) (n = 0, 1, ..., N): weighting parameter (Wn = base^wn)
b = log_base(B): bias parameter (B = base^b)
YY1: target value
BYA: additive operation output
y: output value

The N+1-dimensional power exponents pn, the N+1-dimensional weighting parameters wn, and the bias parameter b, which are the learning parameters, are learned by using multiple sets of the input data Dn as learning data.

The N+1-dimensional power exponents pn, the N+1-dimensional weighting parameters wn, and the bias parameter b are adjusted so that, when N+1-dimensional input data Dn serving as learning data is input to the input layer 110C, the difference (|YY1 - BYA|) between the target value YY1 output from the first hidden node 131 and the additive operation output BYA output from the second hidden node 132 becomes small.
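The two hidden-node outputs and their difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `third_form_outputs` and `difference` are hypothetical, the input data are taken to be positive real numbers (the description also allows complex data, which this sketch does not cover), and base = 2 is an arbitrary choice.

```python
import numpy as np

def third_form_outputs(D, p, w, b, base=2.0):
    """Return (YY1, BYA) for one sample D of length N+1 (Equations 3-1 and 3-2)."""
    D = np.asarray(D, dtype=float)
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    d = np.log(D) / np.log(base)             # dn = log_base(Dn), i.e. Dn = base**dn
    W = base ** w                            # Wn = base**wn
    B = base ** b                            # B = base**b
    YY1 = np.prod(D ** p) * np.prod(W)       # Equation 3-1
    BYA = B * base ** np.sum(w * p * d)      # Equation 3-2
    return YY1, BYA

def difference(D, p, w, b, base=2.0):
    """The quantity |YY1 - BYA| that learning tries to make small."""
    YY1, BYA = third_form_outputs(D, p, w, b, base)
    return abs(YY1 - BYA)

# Hypothetical sample: 2 * 4 / 8 = 1, with all weights and the bias at zero.
YY1, BYA = third_form_outputs(D=[2.0, 4.0, 8.0], p=[1, 1, -1], w=[0.0, 0.0, 0.0], b=0.0)
gap = difference(D=[2.0, 4.0, 8.0], p=[1, 1, -1], w=[0.0, 0.0, 0.0], b=0.0)
```

In this hand-picked sample the power exponents (1, 1, -1) make the product of power values equal to 1, so both outputs coincide when the weights and bias are zero.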

As described above, the arithmetic device determines that a predetermined learning end condition is satisfied when the series of steps for adjusting (searching for) the learning parameters using the learning data has been repeated a predetermined number of times, or when the above difference has become smaller than a predetermined tolerance, and ends the learning of the learning parameters. This yields a trained neural network structure 100C having the N+1-dimensional power exponents pn, the N+1-dimensional weighting parameters wn, and the bias parameter b as learning parameters. By inputting N+1-dimensional input data Dn whose output value is unknown to the input layer 110C of the trained neural network structure 100C, the arithmetic device outputs the output value y for that input data from the output layer 120C.

According to the neural network structure 100C used by the arithmetic device of this basic form, the hidden layer 130 has a first hidden node, which receives the plurality of input data via the plurality of weighting parameters and outputs the target value defined by Equation 3-1 above to the output layer, and a second hidden node, which receives the plurality of input data via the plurality of weighting parameters together with the bias parameter and outputs the additive operation output defined by Equation 3-2 above to the output layer. The output layer 120C outputs the output value based on the target value and the additive operation output. The arithmetic device can therefore handle phenomena expressed by power exponents and can accurately derive the correlation that holds between input and output in such phenomena.

(Device configuration of the basic forms)
FIG. 4 is a block diagram showing the configuration of an arithmetic device 1 that uses a neural network structure according to the first to third basic forms of the present invention.

The arithmetic device 1 functions as a machine learning device 1A, which generates a learning model having one of the neural network structures 100A to 100C corresponding to the first to third basic forms, and as a discrimination device 1B, which uses the learning model generated by the machine learning device 1A to output a discrimination result AA for discrimination data BB to be discriminated. The machine learning device 1A is used in the learning phase, and the discrimination device 1B is used in the discrimination phase (inference phase).

The arithmetic device 1 includes, as its components, a discriminator learning unit 2, a learning parameter storage unit 3, a learning data storage unit 4, a learning data processing unit 5, a discrimination result processing unit 6, and a discrimination data acquisition unit 7.

The discriminator learning unit 2 includes a learning unit 20, which learns the learning parameters using a learning model having one of the neural network structures 100A to 100C, and a discrimination processing unit 21, which outputs a discrimination result for discrimination data using a learning model that reflects learning parameters that are being learned or have already been learned. The learning parameters in the first and second basic forms are the N+1-dimensional power exponents pn. The learning parameters in the third basic form are the N+1-dimensional power exponents pn, the N+1-dimensional weighting parameters wn, and the bias parameter b.

The learning parameter storage unit 3 stores the learning parameters as the result of the learning performed by the learning unit 20 in the learning phase. Initial values of the learning parameters are stored in the learning parameter storage unit 3 by an initialization process, and the learning parameters are updated successively as the learning unit 20 repeats the learning. The learning parameter storage unit 3 then holds the learning parameters as they stand when the learning unit 20 finishes learning, and these are read out by the discrimination processing unit 21 in the discrimination phase (inference phase).

The learning data storage unit 4 stores multiple sets of learning data, each including at least a plurality of input data. The learning data in the first and second basic forms include input data and teacher data associated with that input data. The learning data in the third basic form include only input data. The teacher data is, for example, data corresponding to a discrimination result; if, for example, the discrimination result represents normal as "0" and abnormal as "1", the teacher data is set to "0" or "1".

The learning unit 20 inputs the learning data stored in the learning data storage unit 4 to the learning model via the learning data processing unit 5 and learns the learning parameters so that, for example, a loss function is minimized. That is, the learning unit 20 receives the discrimination result output from the discrimination processing unit 21 and the learning data read from the learning data processing unit 5, performs learning using these data, and stores the learning parameters in the learning parameter storage unit 3.

In the learning phase, the discrimination processing unit 21 inputs the learning data acquired by the learning data processing unit 5 to a learning model that reflects the initial values or the learning parameters currently being learned, and outputs a discrimination result based on the output value of that learning model to the learning unit 20 and the discrimination result processing unit 6.

In the discrimination phase (inference phase), the discrimination processing unit 21 inputs the discrimination data acquired by the discrimination data acquisition unit 7 to a learning model that reflects the learned learning parameters, and outputs the output value of that learning model (for example, a feature) to the discrimination result processing unit 6.

In the learning phase, the learning data processing unit 5 reads the learning data from the learning data storage unit 4, applies predetermined preprocessing, and then sends the learning data to the learning unit 20 and the discrimination processing unit 21. It does so in response to requests from the discrimination result processing unit 6.

The discrimination result processing unit 6 receives the output value output from the discrimination processing unit 21 and outputs it as the discrimination result AA to a predetermined output device such as a display. In the learning phase, the discrimination result processing unit 6 also calculates a coefficient of variation, a discrimination rate, and the like from the discrimination result and, depending on the calculation result, requests the learning data processing unit 5 to send further learning data to the learning unit 20 and the discrimination processing unit 21.

In the discrimination phase (inference phase), the discrimination data acquisition unit 7 receives the discrimination data BB from a predetermined input device, applies predetermined preprocessing, and then sends the discrimination data BB to the discrimination processing unit 21.

The arithmetic device 1 having the above configuration is implemented by a general-purpose or dedicated computer. The machine learning device 1A and the discrimination device 1B may be implemented by separate computers. In that case, the machine learning device 1A need only include at least the learning data storage unit 4, the learning unit 20, and the learning parameter storage unit 3, and the discrimination device 1B need only include at least the discrimination data acquisition unit 7 and the discrimination processing unit 21.

Among the components of the arithmetic device 1, the learning parameter storage unit 3 and the learning data storage unit 4 may be implemented by a storage device such as a hard disk drive (HDD) or solid state drive (SSD) (built-in, external, network-attached, or the like), or by a USB memory or a storage medium (CD, DVD, BD) playable by a storage media playback device. The discriminator learning unit 2, the learning data processing unit 5, the discrimination result processing unit 6, and the discrimination data acquisition unit 7 are implemented, for example, by a computing unit having one or more processors (CPU, MPU, GPU, or the like).

(Program)
The arithmetic device 1 may function as the discriminator learning unit 2, the learning data processing unit 5, the discrimination result processing unit 6, and the discrimination data acquisition unit 7 by executing a program stored in any of various storage devices or storage media, or a program downloaded from outside via a network.

(Integrated circuit)
Any of the neural network structures 100A to 100C corresponding to the first to third basic forms may be implemented by an integrated circuit. In that case, the integrated circuit includes an input/output unit constituting the input layer and the output layer, a storage unit that stores the learning parameters, and a computation unit that performs the computation for outputting the output value from the output layer based on the plurality of input data input to the input layer and the learning parameters stored in the storage unit. The integrated circuit is implemented by, for example, an FPGA or an ASIC, and other hardware may also be used.

(First embodiment)
First, the basic structure of the neural network used in the present invention (hereinafter called an additive neural network) is described with reference to the drawings. FIG. 5 shows the basic structure of the additive neural network. The additive neural network consists of an input layer, a hidden layer, and an output layer, each of which has a plurality of nodes. By setting arbitrary weights between the nodes of the input layer and the hidden layer and between the nodes of the hidden layer and the output layer, and thereby adjusting the coupling between nodes, the additive neural network functions as a discriminator that can solve a variety of problems (classification problems or regression problems).

Here, the hidden-layer expressions YY (target value) and BYA (additive operation output) in FIG. 5 are described. Although FIG. 5 shows a four-dimensional input for convenience, the description assumes an N-dimensional input. YY and BYA of the hidden layer can be expressed by Equations 1, 2, and 3 below. Wn and B denote the base raised to the power of the first feature wn and of the second feature b, respectively (Wn = base^wn, B = base^b), and dn = (d0, d1, ..., dn, ..., d(N-1)) denotes the logarithms of the N-dimensional input data elements Dn = (D0, D1, ..., Dn, ..., D(N-1)). If the loss function L is taken to be the difference |YY - BYA| between YY (target value) and BYA (additive operation output), an additive neural network computation method can be provided that is characterized by extracting the values of the weighting parameters wn and the bias parameter b through a computation that minimizes the loss function L. Here, ^ denotes exponentiation and * denotes multiplication.
(Equation 1)
YY = D0 * D1 * ... * D(N-1) * W0 * W1 * ... * W(N-1)
(Equation 2)
BYA = (base)^(Σ[n=0→N-1](wn * dn) + b)
(Equation 3)
BYA = B * (base)^(Σ[n=0→N-1](wn * dn))
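As a numerical check, the following Python sketch evaluates the three equations for one sample. It assumes positive real inputs and assumes that the bias b enters the exponent exactly once, which is what makes Equation 2 agree with Equation 3 when B = base^b. The function name `additive_outputs` and the sample values are hypothetical.

```python
import numpy as np

def additive_outputs(D, w, b, base=2.0):
    """Evaluate YY (Eq. 1) and BYA in both forms (Eq. 2 and Eq. 3) for one sample."""
    D = np.asarray(D, dtype=float)
    w = np.asarray(w, dtype=float)
    d = np.log(D) / np.log(base)                    # dn = log_base(Dn)
    W = base ** w                                   # Wn = base**wn
    YY = np.prod(D) * np.prod(W)                    # Equation 1
    BYA_eq2 = base ** (np.sum(w * d) + b)           # Equation 2 (bias added once)
    BYA_eq3 = (base ** b) * base ** np.sum(w * d)   # Equation 3 with B = base**b
    return YY, BYA_eq2, BYA_eq3

# Hypothetical sample: d = (1, 2), so sum(wn * dn) = 1*1 + 0.5*2 = 2.
YY, BYA_eq2, BYA_eq3 = additive_outputs(D=[2.0, 4.0], w=[1.0, 0.5], b=1.0)
loss = abs(YY - BYA_eq2)                            # loss function L = |YY - BYA|
```

Here BYA evaluates to 2^(2 + 1) = 8 in both forms, while YY = 8 * 2 * 2^0.5, so the loss for these untrained values is clearly nonzero.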

Next, the first embodiment, which is an invention of a method for searching for and discovering a relational expression, is described. FIG. 6 shows the basic structure of the exponent-added additive neural network used in this invention. It differs from FIG. 5 described above in that one dimension is added to the input to make it N+1-dimensional, and in that power exponents Pn = (p0, p1, ..., pn, ..., p(N-1), pN) are newly provided as a third feature and connected to the input data elements. The computation method is described next.

The N+1-dimensional data is written as Dn = (D0, D1, ..., Dn, ..., D(N-1), DN), and raising each element to the corresponding power exponent Pn = (p0, p1, ..., pn, ..., p(N-1), pN) gives Dn^Pn = (D0^p0, D1^p1, ..., Dn^pn, ..., D(N-1)^p(N-1), DN^pN). Letting W = W0 * W1 * ... * WN, Equations 4, 5, and 6 follow from Equations 1 and 3 above. Because YY/W in Equation 5 is the product of the input data elements Dn each raised to the power exponent Pn, it is called the "product of power values". Furthermore, when the group to which the data belongs shares common features wn, the value of W, which is the product of their powers, is also common. Therefore, when YY/W (the product of power values) can be approximated by a constant, YY (the target value) can also be approximated by a constant. Searching for values at which YY can be approximated by a constant is thus equivalent to searching for the features wn and b and the power exponents Pn that minimize the loss |YY - BYA|, and the optimal relational expression can be obtained from the resulting power exponents.
(Equation 4)
YY = D0^p0 * D1^p1 * ... * DN^pN * W0 * W1 * ... * WN
(Equation 5)
YY/W = D0^p0 * D1^p1 * ... * DN^pN
(Equation 6)
BYA = B * (base)^(Σ[n=0→N](wn * pn * dn))

When the power exponents Pn = (p0, p1, ..., pn, ..., p(N-1), pN) are varied as search parameters and the loss |YY - BYA| for each set of exponents is used as the evaluation function indicating how close the output has come to the target, a problem arises: the magnitude of the loss changes greatly with the values of the exponents. As a countermeasure, the coefficient of variation, that is, the standard deviation normalized by the mean, is used as the evaluation function, so that for each set of exponents the size of the variation relative to its own mean is evaluated.
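The reason the coefficient of variation is comparable across exponent sets can be seen in a short Python sketch. It assumes the population standard deviation (NumPy's default); the description does not say whether a population or sample estimate is intended, and the function name `coefficient_of_variation` is hypothetical.

```python
import numpy as np

def coefficient_of_variation(values):
    """Standard deviation normalized by the mean (population std assumed)."""
    values = np.asarray(values, dtype=float)
    return np.std(values) / abs(np.mean(values))

# The same relative spread at two very different magnitudes.
cv_small = coefficient_of_variation([9.0, 10.0, 11.0])
cv_large = coefficient_of_variation([9000.0, 10000.0, 11000.0])

# A perfectly constant product of power values has zero relative spread.
cv_constant = coefficient_of_variation([5.0, 5.0, 5.0])
```

Both `cv_small` and `cv_large` come out equal even though the absolute spread differs by a factor of 1000, which is the scale independence that a raw loss such as |YY - BYA| lacks.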

Alternatively, by using the discrimination rate as the evaluation function, the problem can be solved as a classification problem with two or more groups.

Next, a method for deriving the optimal power exponents using the exponent-added additive neural network described above (hereinafter called the power search method) is described with reference to FIG. 4.

The discriminator learning unit 2 trains the neural network and performs discrimination using the trained neural network. For this purpose, the discriminator learning unit 2 includes a learning unit 20 and a discrimination processing unit 21.

The learning unit 20 trains the neural network so that the loss function is minimized. That is, on receiving the discrimination result output from the discrimination processing unit 21 and the learning data read from the learning data processing unit 5, the learning unit 20 performs learning using these data and stores the learning parameters in the learning parameter storage unit 3.

On receiving the weights, the bias, and the learning data from the learning parameter storage unit 3, the discrimination processing unit 21 sends the discrimination result obtained with them to the discrimination result processing unit 6.

On receiving the discrimination result output from the discrimination processing unit 21, the discrimination result processing unit 6 requests the learning data processing unit 5 to input learning data with the power exponents as parameters. The received discrimination results are, for example, sorted in ascending order of coefficient of variation or in descending order of discrimination rate and output to a predetermined output device such as a display external to the device.

The learning parameter storage unit 3 is a storage unit that stores the weights and biases between nodes in the neural network and the learning data from the learning data processing unit 5. At weight initialization, the learning parameter storage unit 3 stores the initial values of the weights and biases between all nodes of the neural network. It then stores the inter-node weights and biases obtained when the learning unit 20 trains the neural network with the learning data sent from the learning data processing unit 5, together with that learning data.

The learning data storage unit 4 is a storage unit that stores the learning data. The learning data is test data indicating state information and features for which normal and abnormal have been determined in advance. The discrimination data BB is the data to be discriminated; it is sent to the discrimination data acquisition unit 7, subjected to predetermined preprocessing, and then sent to the discrimination processing unit 21.

The learning data processing unit 5 reads the learning data from the learning data storage unit 4 and converts it into a predetermined learning data format with the power exponents as parameters. The converted learning data is sent to the learning unit 20 in response to requests from the discrimination result processing unit 6.

The discriminator learning unit 2, the learning data processing unit 5, the discrimination result processing unit 6, and the discrimination data acquisition unit 7 can be realized as concrete means in which hardware and software cooperate, for example by having a microcomputer execute a program that describes the processing specific to this embodiment.

Furthermore, the components of the exponent-added additive neural network device with a learning function shown in FIG. 4, namely the discriminator learning unit 2, the learning parameter storage unit 3, the learning data storage unit 4, the learning data processing unit 5, the discrimination result processing unit 6, and the discrimination data acquisition unit 7, can be combined into an integrated circuit, so that the device can be provided in a smaller size, at higher speed, with lower power consumption, and at lower cost.

Next, a method that uses the neural network device configuration of FIG. 4 to perform weight learning with the power exponents as parameters, calculate the coefficient of variation or the discrimination rate, and search for the optimal power exponents is described with reference to the flowchart of FIG. 7.

First, the learning data processing unit 5 converts the learning data in the learning data storage unit 4 into the input format of the learning unit 20 in the discriminator learning unit 2, which performs the neural network computation. The learning data in the learning data storage unit 4 consists of N-dimensional input data and one-dimensional output data. The learning data processing unit 5 joins the N-dimensional input data and the one-dimensional output data into N+1-dimensional data Dn = (D0, D1, ..., Dn, ..., D(N-1), DN) (step SP1).
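Step SP1 amounts to appending the output column to the input columns. A minimal Python sketch, assuming the learning data is held as NumPy arrays with one row per sample (the variable names and values are hypothetical):

```python
import numpy as np

# Hypothetical learning data: 3 samples of N = 2 inputs and 1 output each.
inputs = np.array([[1.0, 2.0],
                   [2.0, 3.0],
                   [4.0, 0.5]])          # shape (samples, N)
outputs = np.array([4.0, 18.0, 1.0])     # shape (samples,)

# Step SP1: join inputs and output into N+1-dimensional data (D0, ..., DN).
data = np.column_stack([inputs, outputs])  # shape (samples, N + 1)
```

Each row of `data` is then one N+1-dimensional sample whose last element DN is the original output value.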

Next, the search method for the power exponents Pn is set (step SP2). For example, an exhaustive search over integers with |pn| ≦ 5 is used. Alternatively, pn may take real values over a range with an arbitrary step size, provided the search stays within the limits of the computer's memory and computing power.

Next, the initial value of the power exponent search value Pn = (p0, p1, ..., pn, ..., p(N-1), pN) is set (step SP3). For example, in an exhaustive search (also called a brute-force search) over integers with |pn| ≦ 5, the initial search value is assigned search label No. 0 with power exponents P0 = (-5, -5, ..., -5).

Next, the search end condition for the power exponent search value Pn = (p0, p1, ..., pn, ..., p(N-1), pN) is set (step SP4). For example, if search label No. 0 is assigned to the power exponents P0 = (-5, -5, ..., -5), the next search label No. 1 to P1 = (-5, -5, ..., -4), and so on in sequence, the search end condition can be set to the search end value (5, 5, ..., 5). Alternatively, a predetermined number of searches, a search label, or a threshold may be set in advance as the search end condition.

Next, a search table of the data Dn and the power exponents Pn is created (step SP5). For example, a search table can be built by assigning search label No. 0 to the power exponents P0 = (-5, -5, ..., -5), the next search label No. 1 to P1 = (-5, -5, ..., -4), and so on in sequence up to the search end value (5, 5, ..., 5).

Next, the data Dn and the power exponents Pn are taken from the search table in search label order (step SP6).
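Steps SP2 to SP6 could be sketched in Python as follows for the integer exhaustive search with |pn| ≦ 5. The helper name `build_search_table` and the dictionary layout are assumptions made for illustration; the description only requires that labels run in sequence from (-5, ..., -5) to (5, ..., 5).

```python
from itertools import product

def build_search_table(num_dims, p_min=-5, p_max=5):
    """Map consecutive search labels to exponent tuples, last exponent varying fastest."""
    values = range(p_min, p_max + 1)
    return dict(enumerate(product(values, repeat=num_dims)))

# N + 1 = 3 dimensions gives 11**3 candidate exponent sets.
table = build_search_table(num_dims=3)
first = table[0]                  # search label No. 0
second = table[1]                 # search label No. 1
last = table[len(table) - 1]      # search end value

# Step SP6: take the exponent sets out in label order.
ordered = [table[label] for label in sorted(table)]
```

The table grows as 11^(N+1), which is why the description limits the search to what the computer's memory and computing power allow.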

Next, Dn^Pn is redefined as the input of the neural network (step SP7). Here, using the data Dn and the power exponents Pn received in step SP6, the learning data processing unit 5 forms Dn^Pn = (D0^p0, D1^p1, ..., Dn^pn, ..., D(N-1)^p(N-1), DN^pN), redefines Dn^Pn as Dn, and sets it as the input of the additive neural network.
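Step SP7 is an element-wise power applied to every sample. A minimal Python sketch relying on NumPy broadcasting, assuming positive real data (the function name `apply_exponents` and the values are hypothetical):

```python
import numpy as np

def apply_exponents(data, exponents):
    """Raise column n of every sample to the power exponents[n] (step SP7)."""
    return np.asarray(data, dtype=float) ** np.asarray(exponents, dtype=float)

data = [[2.0, 3.0, 4.0],
        [1.0, 2.0, 8.0]]                            # two samples, N + 1 = 3
powered = apply_exponents(data, [2, -1, 0.5])       # becomes the new network input Dn
```

The returned array has the same shape as the data, so it can be passed to the additive neural network in place of the original Dn.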

The above is the procedure for creating the input data Dn of the neural network, covering steps SP1 to SP7 in FIG. 7.

Next, the N+1-dimensional input data Dn = (D0^p0, D1^p1, ..., Dn^pn, ..., D(N-1)^p(N-1), DN^pN) created in step SP7 is sent to the learning unit 20 of the discriminator learning unit 2, where weight learning is performed through steps ST1 to ST8. The learning unit 20 is described in detail below.

First, the learning unit 20 initializes the weights and the bias, which are the features of the neural network (step ST1). Specifically, the initial values are set to 0.

As described above, the hidden-layer expressions YY (target value) and BYA (additive operation output) can be expressed by Equations 1, 2, and 3, and the learning unit 20 calculates the initial value of the loss |YY - BYA| given by the loss function L (step ST3).

Next, the learning unit 20 updates the bias (parameter b) slightly in the positive direction by a set amount (step ST4).

The learning unit 20 then calculates a correction amount (a suitable shift Δwn) for the weights (weighting parameters wn) such that the loss becomes smaller (step ST5).

The learning unit 20 then updates the weights from their previous values by the correction amount obtained in ST5 (step ST6).

The learning unit 20 repeats steps ST5 and ST6 in a loop a set number of times to update the weights (step ST7).

The learning unit 20 then checks whether the end condition for weight learning is satisfied (step ST8). The end condition is preferably the minimum reached immediately before the loss turns from decreasing to increasing. Alternatively, learning may end when the number of learning iterations reaches a set number.

When the end condition is satisfied, the learning unit 20 stores the extracted features that minimize the loss |YY - BYA| in the learning parameter storage unit 3 and sends them to the discrimination processing unit 21.
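The weight-learning loop of steps ST1 to ST8 might be sketched in Python as below. Several details are assumptions rather than the patent's method: the loss is taken as the mean of |YY - BYA| over all samples, the correction Δwn is computed by a finite-difference gradient step (the description does not state how it is obtained), and the names `loss`, `learn_weights`, `bias_step`, and `lr` are hypothetical.

```python
import numpy as np

def loss(D, w, b, base=2.0):
    """Mean of |YY - BYA| over all samples (Equations 1 and 3)."""
    d = np.log(D) / np.log(base)
    YY = np.prod(D, axis=1) * np.prod(base ** w)
    BYA = (base ** b) * base ** (d @ w)
    return float(np.mean(np.abs(YY - BYA)))

def learn_weights(D, base=2.0, bias_step=0.01, lr=1e-3,
                  inner_loops=10, max_rounds=200, eps=1e-6):
    D = np.asarray(D, dtype=float)
    w = np.zeros(D.shape[1])                      # ST1: weights start at 0
    b = 0.0                                       # ST1: bias starts at 0
    initial_loss = loss(D, w, b, base)            # ST3: initial loss
    best = (initial_loss, w.copy(), b)
    for _ in range(max_rounds):                   # ST8: cap on learning rounds
        b += bias_step                            # ST4: nudge bias upward
        for _ in range(inner_loops):              # ST7: repeat ST5 and ST6
            grad = np.array([
                (loss(D, w + eps * e, b, base) - loss(D, w - eps * e, b, base)) / (2 * eps)
                for e in np.eye(len(w))
            ])                                    # ST5: direction that lowers the loss
            w = w - lr * grad                     # ST6: apply the correction
        current = loss(D, w, b, base)
        if current >= best[0]:                    # ST8: loss stopped decreasing
            break
        best = (current, w.copy(), b)             # keep the minimum seen so far
    return initial_loss, best

# Hypothetical input data (already raised to the exponents in step SP7).
D = [[1.0, 2.0], [2.0, 2.0], [2.0, 4.0]]
initial_loss, (best_loss, best_w, best_b) = learn_weights(D)
```

Keeping the best parameters seen so far and stopping at the first increase mirrors the preferred end condition, the minimum immediately before the loss turns from decreasing to increasing.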

Next, the discrimination processing unit 21 sends the obtained features to the discrimination result processing unit 6.

Next, the discrimination result processing unit 6 calculates the coefficient of variation and the discrimination rate from the features and stores the results (step SP8).

Next, the discrimination result processing unit 6 updates the search label of the search table from its previous value (step SP9). For example, when an exhaustive search has been set, the search label is advanced by one. Alternatively, using a breadth-first search or a more heuristic search method, an algorithm may be incorporated that predicts, from the coefficients of variation or discrimination rates calculated in the steps so far, which labels are likely to reach a smaller coefficient of variation or a higher discrimination rate sooner than the previous search order would, and the search label may be updated to follow that more efficient order.

次に、ステップＳＰ９を通して探索ラベルを更新した後、は探索終了条件を満たしたか否かを確認する（ステップＳＰ１０）。終了条件を満たしていない場合、ステップＳＰ６に戻り、繰り返す。 Next, after updating the search label through step SP9, it is checked whether or not the search end condition is satisfied (step SP10). If the end condition is not satisfied, the process returns to step SP6 and repeats.

The power exponents Pn=(p0, p1, ..., pn, ..., p(N-1), pN) obtained in this way give the optimum relational expression. Specific (N+1)-dimensional data, the forms of the relational expressions, and the coefficient of variation and discrimination rate used as evaluation functions are described in detail later in (Example 1) and (Example 2).

The first embodiment above was described using an example with a single hidden layer, but it can also be applied to multiple hidden layers. FIG. 8 shows the basic structure of a multi-layer additive neural network with added power exponents. Here, a second hidden layer that receives the outputs of the first-stage hidden layer nodes n1 and n2 is inserted as an extension. It consists of two nodes: node n3 for the second-stage target value ZZ, to which two weights h0 and h1 are linked, and node n4 for the additive output BZA. The one-dimensional output Z-Act is obtained from them. Using a neural network with two hidden layers in this way can improve accuracy on more complex problems.

(Second embodiment)
Next, a second embodiment of the present invention will be described. The second embodiment is a learning method that first performs preprocessing in which sums and differences between input data elements are added to the input data elements for the power search method of the first embodiment, then inputs those elements to the power search method and performs the computation, thereby discovering relational expressions composed of addition, subtraction, multiplication, and division.

The first embodiment yields the optimum relational expression when the N-dimensional input data to the power search method have different units, or when sums and differences between the input data are not needed. However, some relational expressions do use sums and differences between input data. For example, Heron's formula (Equation 7), which gives the area S from the lengths (a, b, c) of the three sides of a triangle, is an equation that uses a product of sums and differences of the sides. To solve this kind of equation blindly, preprocessing adds the sums and differences of the three sides (the original data) to the input data for the power search method, an input table for the power search method is created, and power searches are performed in table order.
(Equation 7)
16=(a+b+c)*(-a+b+c)*(a-b+c)*(a+b-c)/(S^2)
(Equation 8)
16=D0^p0*D1^p1*D2^p2*D3^p3*D4^p4

Following Heron's formula (Equation 7), the input data built from sums and differences of the side lengths (a, b, c), the original data of the triangle, are D0=(a+b+c), D1=(-a+b+c), D2=(a-b+c), and D3=(a+b-c), and the answer data D4, which has a predetermined relationship with the combination of these four input data elements (D0, D1, D2, D3), is the area S. Forming the five-dimensional input data elements Dn=(D0, D1, D2, D3, D4) and performing a power search with the five-dimensional power exponents Pn=(p0, p1, p2, p3, p4), as in (Equation 8), yields the solution (p0, p1, p2, p3, p4)=(1, 1, 1, 1, -2).

次に、測定対象物の元データに和や差を施す前処理を行い、べき乗探索法への入力テーブルを作る方法について説明する。 Next, a description will be given of a method of performing preprocessing such as adding or subtracting the original data of the object to be measured and creating an input table for the power search method.

Let am=(a0, a1, ..., a(M-1)) be the original data, arranged with M rows (M dimensions) that share the same unit and can therefore be added and subtracted, and with one column per learning sample (SN columns). A difference element matrix Cm and a coefficient k by which the elements of the original data am are multiplied are also defined. The difference element matrix Cm is defined as the matrix of all combinations of coefficients k applied to the elements of the original data am, and is illustrated in FIG. 9. FIG. 9 shows an example of Cm when the coefficient k takes the integer values -1, 0, and 1 and M=3; it is a matrix of 27 rows and 3 columns.
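As a rough sketch, the difference element matrix Cm can be enumerated as the Cartesian product of the coefficient values. The function name `difference_element_matrix` is hypothetical, and the row order produced here is an assumption that may differ from the order used in FIG. 9.

```python
from itertools import product


def difference_element_matrix(k_values, m):
    """Enumerate every assignment of a coefficient k to m data elements."""
    return [list(row) for row in product(k_values, repeat=m)]


cm = difference_element_matrix((-1, 0, 1), 3)
print(len(cm), len(cm[0]))  # number of rows, number of columns
print(cm[0], cm[-1])        # first and last coefficient rows
```

With three coefficient values and M=3 this gives 3^3 = 27 rows, matching the 27-row, 3-column matrix described above.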

ここで係数ｋの値は、データ間の差はｋ＝－１、和は、ｋ＝１、不要な係数はｋ＝０として表せる。さらに、係数ｋ＝－２、－１，０，１，２のように順に整数を設定し、多様な整数倍に対応することができる。また、係数ｋ＝－１，－０．５，０，０．５，１のように実数を設定することもできる。 Here, the value of the coefficient k can be expressed as k=−1 for the difference between data, k=1 for the sum, and k=0 for unnecessary coefficients. Furthermore, integers can be set in order such as coefficient k=-2, -1, 0, 1, 2 to support various integer multiples. Real numbers such as coefficient k=-1, -0.5, 0, 0.5, 1 can also be set.

さらに係数ｋは虚数単位ｉを用いることができる。例えば円の方程式、１＝ｘ＾２＋ｙ＾２は、虚数単位ｉを用いた因数分解を利用し、１＝（ｘ＋ｉ＊ｙ）＊（ｘ－ｉ＊ｙ）に等しく、元データｘ、ｙから円の方程式を導くことができる。 Furthermore, the coefficient k can use the imaginary unit i. For example, the circle equation, 1=x^2+y^2, uses factorization with the imaginary unit i, equal to 1=(x+i*y)*(x−i*y), from the original data x,y We can derive the equation of the circle.
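The factorization can be checked numerically by treating i and -i as coefficients k applied to the original data (x, y). This is only an illustrative sketch; the variable names and the sample point are hypothetical.

```python
am = (0.6, 0.8)              # (x, y): a sample point on the unit circle
rows = [(1, 1j), (1, -1j)]   # coefficient rows using the imaginary unit

# Each row gives one product input element: x + i*y and x - i*y.
l0, l1 = (sum(k * a for k, a in zip(row, am)) for row in rows)

value = l0 * l1              # (x + i*y) * (x - i*y) = x^2 + y^2
print(value)
```

The product is real and equals x^2 + y^2 because (i*y)*(-i*y) = y^2, so a power search with exponents (1, 1) on these two elements recovers the circle equation.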

Next, a product input element matrix LnS, which supplies the elements of the input data Dn for the power search method, is defined. As shown in (Equation 9), LnS is the inner product of Cm and am. FIG. 9 also illustrates the product input element matrix LnS for the case where the coefficient k takes the integer values -1, 0, and 1 and M=3; it can be represented as a table of 27 rows and SN columns. The n-th row of LnS, which has SN columns, is defined as the product input element Ln.
(Equation 9)
LnS = Cm·am
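(Equation 9) can be sketched as an ordinary matrix product of the 27-row coefficient matrix with the M-row, SN-column data. The data values below are hypothetical, and the row order of `cm` is an assumption.

```python
from itertools import product

# Hypothetical original data am: M = 3 rows (a0, a1, a2), SN = 2 samples.
am = [[3, 13],   # a0 for each sample
      [4, 5],    # a1 for each sample
      [5, 12]]   # a2 for each sample

cm = list(product((-1, 0, 1), repeat=3))  # 27 rows x 3 columns

# LnS[n][s] = sum over m of Cm[n][m] * am[m][s]  ->  27 rows x SN columns
lns = [[sum(k * am[m][s] for m, k in enumerate(row))
        for s in range(len(am[0]))]
       for row in cm]

print(len(lns), len(lns[0]))
print(lns[cm.index((1, 1, -1))])  # a0 + a1 - a2 for each sample
```

Each row of `lns` is one product input element Ln evaluated on every sample.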

Next, if searching over every element of the product input element matrix LnS would include unnecessary product input elements Ln, a constraint is set and the LnS table is formed by omitting those elements. If there is no constraint, the product input element matrix LnS is used as the LnS table as is.

次に、ＬｎＳテーブルの積入力要素Ｌｎの中から、べき乗探索法へ入力する入力データ要素の個数ＮＹを設定する。例えば、（数８）の場合、入力データは（Ｄ０、Ｄ１、Ｄ２、Ｄ３、Ｄ４）の５次元要素の積でありＮＹ＝５である。 Next, the number NY of input data elements to be input to the power search method is set from among the product input elements Ln of the LnS table. For example, in the case of (Formula 8), the input data is the product of five-dimensional elements of (D0, D1, D2, D3, D4) and NY=5.

次に、ＬｎＳテーブルの行から、（ＮＹ－１）行を抽出し組み合わせた（ＮＹ－１）行（次元）のＤｎＬテーブルを作成する。 Next, a DnL table with (NY-1) rows (dimensions) is created by extracting and combining (NY-1) rows from the rows of the LnS table.

さらに、前記、ＤｎＬテーブルの最後尾に１次元の答えデータを連結させて、べき乗探索法へ入力するＮＹ行（次元）のＤｎＬテーブルにする。 Further, the end of the DnL table is linked with the one-dimensional answer data to form a DnL table of NY rows (dimensions) to be input to the power search method.

このＤｎＬテーブルの順に従い、ＮＹ行（次元）のデータを、べき乗探索法へ入力し最適解を導く方法を差分探索法と呼ぶ。以下、差分探索法を用いて最適解を探索する方法について、図１０のフローチャートに沿って説明する。 A method of inputting the data of NY rows (dimensions) to the power search method to derive the optimum solution according to the order of the DnL table is called the difference search method. A method of searching for the optimum solution using the difference search method will be described below with reference to the flowchart of FIG.

最初に、測定対象物の元データａｍに加減算が可能な要素があるかどうかをチェックする（ステップＳＳ１）。加減算が可能な要素があれば、加減算を行う要素を設定する。（ステップＳＳ２）。 First, it is checked whether or not the original data am of the object to be measured has an element that can be added or subtracted (step SS1). If there is an element that can be added or subtracted, set the element to be added or subtracted. (Step SS2).

次に、前記した係数ｋ、及び学習サンプル数ＳＮを設定し、差分要素マトリックスＣｍを生成する（ステップＳＳ３）。 Next, the coefficient k and the number of learning samples SN are set to generate the difference element matrix Cm (step SS3).

次に、入力データａｍの各要素間の和と差に制約条件がある場合、その制約条件を設定する（ステップＳＳ４）。例えば、上記ヘロンの公式において、辺の差分が正値であること、つまり（±ａ±ｂ±ｃ）＞０の条件のみを利用する場合、その条件を設定し、正値以外の値を省く。 Next, if there are constraints on the sum and difference between the elements of the input data am, the constraints are set (step SS4). For example, in the above Heron's formula, when using only the condition that the difference between the sides is a positive value, that is, (±a±b±c)>0, set that condition and omit values other than positive values. .
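A minimal sketch of this constraint, assuming that a row is kept only when its value is positive for every sample. The three triangles are hypothetical data chosen so that no side is consistently the longest, and the dictionary name `lns_table` is illustrative.

```python
from itertools import product

# Hypothetical side lengths (a, b, c) of three triangles.
triangles = [(3, 4, 5), (5, 3, 4), (4, 5, 3)]

lns_table = {}
for ks in product((-1, 0, 1), repeat=3):
    row = [sum(k * side for k, side in zip(ks, t)) for t in triangles]
    if all(v > 0 for v in row):  # constraint: positive for every sample
        lns_table[ks] = row

print(len(lns_table))
for ks, row in lns_table.items():
    print(ks, row)
```

For this data, 10 of the 27 rows survive: the three single sides, the three pairwise sums, the full sum, and the three expressions with exactly one negated side.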

次に、積入力要素マトリックスＬｎＳを（数９）式から計算し、ステップＳＳ４で設定した制約条件を満足するＬｎＳテーブルを作成する（ステップＳＳ５）。 Next, the product input element matrix LnS is calculated from the equation (9) to create an LnS table that satisfies the constraint conditions set in step SS4 (step SS5).

次に、べき乗探索法へ入力する入力データ要素の個数ＮＹを設定する。（ステップＳＳ６） Next, the number NY of input data elements to be input to the power search method is set. (Step SS6)

次に、ＬｎＳテーブルの行から、（ＮＹ－１）行を抽出し組み合わせた（ＮＹ－１）行（次元）のＤｎＬテーブルを作成する（ステップＳＳ７）。 Next, a DnL table of (NY-1) rows (dimensions) is created by extracting and combining (NY-1) rows from the rows of the LnS table (step SS7).

次に、前記ＤｎＬテーブルの最後尾の行に１次元の答えデータを連結する。（ステップＳＳ８）。 Next, one-dimensional answer data is linked to the last row of the DnL table. (Step SS8).

次に、ＤｎＬテーブルから先頭データＤｎ行を取得する（ステップＳＰ１）。その後のステップＳＰ２～ＳＰ１０、及びＳＴ１～ＳＴ８までは、べき乗探索法と同じであり説明を省略する。 Next, the top data Dn row is obtained from the DnL table (step SP1). The subsequent steps SP2 to SP10 and ST1 to ST8 are the same as in the exponential search method, and descriptions thereof are omitted.

次のステップＳＳ９でデータＤｎ行をＤｎＬテーブルの順番に従い次のデータに更新する。次のステップＳＳ１０で、データＤｎ行が最終データで無ければステップＳＰ２に戻り繰り返す。ＤｎＬテーブルの最終順が完了すると終了する。あるいは、変動係数、あるいは判別率に閾値を設けて途中終了させてもよい。 In the next step SS9, the data Dn row is updated to the next data according to the order of the DnL table. In the next step SS10, if the data Dn row is not the final data, the process returns to step SP2 and repeats. It ends when the final order of the DnL table is completed. Alternatively, a threshold value may be set for the coefficient of variation or discrimination rate, and the process may be terminated halfway.

(Example 1)
As a first example, the first embodiment is applied to the discovery of Kepler's third law. Kepler's third law states that "the square of the orbital period T of each planet is proportional to the cube of its average distance r from the sun," a power law grounded in physical law. FIG. 11 lists the names of nine planets and two measurements for each (average distance r [km] from the sun and orbital period T [day]). This example describes how the law is discovered by the power search method of the present invention using the two-dimensional input data elements D0=r/1E8 and D1=T/1E2. Because the two quantities have different units, the functional form of the law can be presumed at the outset to consist of multiplication and division only, without addition or subtraction.

A function formed by combining the two-dimensional input data elements (D0, D1) can be written as f(D0, D1)=1. Here f(D0, D1) on the left-hand side is the unknown function to be discovered, and the right-hand side, the answer data D2 having the predetermined relationship, is 1. The three-dimensional input data elements to the power search method are therefore (D0, D1, 1), and from (Equation 5), YY/W (the product of power values) is given by the function in (Equation 10).
(Equation 10)
YY/W=D0^p0*D1^p1

次に、図７のフローチャートに従い、べき乗探索法を使って、最適な関係式を導く方法を説明する。 Next, a method of deriving the optimum relational expression using the power search method will be described according to the flow chart of FIG.

First, a three-dimensional input table (D0, D1, 1) is created from FIG. 8 (step SP1). The power exponents are searched by brute force over integers with |pn|≤7. A search table is created with consecutively numbered search labels, starting from the initial value No. 0 with power exponents P0=(-7, -7), followed by No. 1 with P1=(-7, -6), and so on, with the search end value set to (7, 7). The coefficient of variation is used as the evaluation function (steps SP2 to SP5). After the initial search value is set, neural network computations are performed in search table order until the search ends (steps SP6 to SP10). In step SP7, the row computed as Dn^Pn=(D0^p0, D1^p1, 1) from the initial value is set as the input term. Next, the weights and bias of the neural network computation are initialized (step ST1).

The initial settings for feature extraction are as follows. To carry out the weight learning shown in the flowchart of FIG. 7, the number of bias updates and weight updates in the loops, the weight correction amount (Δwn), and the bias update amount are initialized to suitable values. In this example, the number of bias updates for hidden layer 1 was set to 50 and the number of weight updates to 10. For the bias update amount, YY (Equation 1) reduces to the data product term alone when the weight Wn=1 (wn=0), so the mean of that data product divided by 50 was taken as the step size for 50 increments, and 10% of that step size was used as the bias update amount. The weight correction amount (Δwn) was set to 0.1% of the loss. These initial settings may be made finer or coarser depending on the purpose. The base was set to 0.9. The input data d1 in this example takes values up to 915, so setting the base to 10, for example, would quickly exceed the upper limit of what the computer can calculate. Because the present invention allows a fractional value to be used as the base, this computational limit can be avoided.

Next, the initial values of the hidden layer expressions YY (Equation 1) and BYA (Equation 3) and the initial value of the loss |YY-BYA| are calculated (steps ST2 to ST3).

次に、ステップＳＴ４～ＳＴ８のパラメータ学習ループを通して、損失量｜ＹＹ－ＢＹＡ｜の最小値を計算し、パラメータ学習結果をステップＳＰ８に送り変動係数を計算し、結果を記憶する。 Next, through the parameter learning loop of steps ST4 to ST8, the minimum value of the loss amount |YY-BYA| is calculated, the parameter learning result is sent to step SP8 to calculate the coefficient of variation, and the result is stored.

ここで、本事例に用いる評価関数である変動係数について説明する。変動係数は（数７）のＹＹ／Ｗ（べき乗値の積）の標準偏差σをＹＹ／Ｗ（べき乗値の積）の平均値で割った値である。 Here, the coefficient of variation, which is the evaluation function used in this example, will be described. The coefficient of variation is a value obtained by dividing the standard deviation σ of YY/W (product of power values) in Equation 7 by the average value of YY/W (product of power values).

The next step, SP9, is configured when a heuristic search method based on the obtained coefficients of variation is introduced and the search order is to be changed from the initial search labels. In this example the search is brute force, so the labels are simply advanced in order.

In the next step, SP10, once the search end value (7, 7) of the search table has been processed, the power exponents Pn that minimize the coefficient of variation are output together with a list of power exponents Pn and their coefficients of variation, graphs, and the like, and the process ends. Example graphs are shown in FIGS. 12 and 13. If the search end value (7, 7) has not been reached, the process returns to step SP6 and repeats.

FIG. 12 shows the output of this example: the coefficient of variation plotted at coordinates (p0, p1), with the power exponent p0 of D0 on the horizontal axis and the power exponent p1 of D1 on the vertical axis. The coefficient of variation is small, between 0 and 0.0005, at the coordinates (-6, 4), (-3, 2), (0, 0), (3, -2), and (6, -4). For convenience, values smaller than 0.0001 are displayed as 0. FIG. 13 is a three-dimensional wireframe plot of the common logarithm of the values in FIG. 12. A sphere (●) is drawn in FIG. 13 to illustrate flowing down the slope of the wireframe to a minimum point. Substituting (p0, p1)=(-3, 2) into (Equation 10) gives YY/W=D0^(-3)*D1^(2)≈4 (constant), as shown in the table of FIG. 14. From this result, the optimum function f(D0, D1), initially written as f(D0, D1)=1, is obtained as f(D0, D1)=D0^(-3)*D1^(2)/4. In other words, the following law has been derived from the nine planet names and the two measurements (D0 from the average distance r from the sun, D1 from the orbital period T): "The square of the orbital period T of each planet is proportional to the cube of its average distance r from the sun."
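The brute-force search of this example can be sketched as follows. This is a simplified illustration under several assumptions: the planetary values are approximate textbook figures, not the data of FIG. 11; the weight and bias learning of steps ST1 to ST8 is skipped, so YY/W is evaluated directly; the coefficient of variation uses the population standard deviation; and the trivial exponent pair (0, 0) is excluded. The names `PLANETS`, `coefficient_of_variation`, and `power_search` are hypothetical.

```python
from itertools import product
from math import sqrt

# Approximate textbook values: (mean distance r [km], orbital period T [day]).
# These are NOT the values listed in FIG. 11.
PLANETS = {
    "Mercury": (57.9e6, 87.97),
    "Venus": (108.2e6, 224.70),
    "Earth": (149.6e6, 365.26),
    "Mars": (227.9e6, 686.98),
    "Jupiter": (778.3e6, 4332.6),
    "Saturn": (1427.0e6, 10759.0),
    "Uranus": (2871.0e6, 30685.0),
    "Neptune": (4498.0e6, 60190.0),
    "Pluto": (5906.0e6, 90560.0),
}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return sqrt(variance) / mean


def power_search(data, p_max=7):
    """Evaluate the coefficient of variation of D0^p0 * D1^p1 on a grid."""
    results = {}
    for pn in product(range(-p_max, p_max + 1), repeat=2):
        if pn == (0, 0):
            continue  # trivial: every value equals 1
        yy_over_w = [d0 ** pn[0] * d1 ** pn[1] for d0, d1 in data]
        results[pn] = coefficient_of_variation(yy_over_w)
    return results


data = [(r / 1e8, t / 1e2) for r, t in PLANETS.values()]  # D0, D1
results = power_search(data)
best = min(results, key=results.get)
constant = sum(d0 ** -3 * d1 ** 2 for d0, d1 in data) / len(data)
print(best, results[best], constant)
```

The minimum falls on the exponent pair (-3, 2) or its reciprocal (3, -2), and the mean of D0^(-3)*D1^(2) comes out close to 4, in line with the result described above.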

The above described how Kepler's third law is derived by a brute-force search using the coefficient of variation as the evaluation function. Plotting the coefficient of variation as in FIGS. 12 and 13 shows that there are multiple local minima that attain the minimum value and that they occur at regular intervals. In this example, a neighborhood search that moves in the direction of decreasing coefficient of variation can therefore quickly reach a local minimum that is also the minimum. However, many functions have multiple local minima at which the loss is not the minimum, which is one of the problems that cause vanishing gradients in neural networks. A conventional remedy in neural networks is to avoid local minima by deliberately computing less accurate gradients from coarse data sampled from the full data set (referred to as the mini-batch size). With that approach, however, the locations of the local minima are unknown, so trial and error remains necessary, such as changing the mini-batch size or introducing random numbers. In the present invention, an efficient neighborhood search can be designed using a graph whose axes are the power exponents.

As an example of a function with multiple local minima at which the loss is not the minimum, data were created in which Kepler's third law is changed to "the cube of the orbital period T of each planet is proportional to the fifth power of its average distance r from the sun," and a brute-force search for the relational expression was performed. The power exponents at the minimum are (-5, 3) and (5, -3), giving the expression YY/W=D0^(-5)*D1^(3). FIG. 15 shows the wireframe plot. The plot shows that local minima occur regularly at (-3, 2), (-2, 1), (3, -2), and (2, -1), between the minima at (-5, 3) and (5, -3). Therefore, when a neighborhood search that moves in the direction of a decreasing or increasing evaluation function is used, the search may, depending on the choice of initial search value, flow to the singular point (0, 0) or to a local minimum or maximum and fail to reach the correct answer. FIGS. 12, 13, and 15 show that, to avoid this, initial search values should be placed in multiple quadrants, and that the search time can be shortened by placing initial values near multiple extrema with their regularity taken into account.

このように、べき指数を座標軸にして評価関数を変動係数で表現するグラフを用いて、より速く正解に辿り着くためのヒューリステックな探索方法を構築することができる。 In this manner, a heuristic search method for reaching the correct answer more quickly can be constructed by using a graph expressing the evaluation function by the coefficient of variation with the exponent as the coordinate axis.

The solution of (Equation 10) can also be obtained quickly by fixing the feature parameters. The bias in (Equation 3) is fixed at B=0 (b=0) and the weight learning loop is not run; that is, with W=1 and BYA=1 computed from the initial value wn=0, the loss simplifies to |YY-BYA|=|YY-1|, which speeds up the computation. This technique is effective for searches in which the data can be judged to contain little noise (disturbance), particularly when only a relational expression in the power exponents is to be evaluated.

(Example 2)
As a second example, the second embodiment is applied to the discovery of Heron's formula. FIG. 16 shows ten different triangles numbered (1) to (10), and FIG. 17 is a table of their side lengths a, b, c and area S, given to one decimal place. Because the three side lengths share the same unit (cm), there is a concern that feeding the side lengths a, b, c and the area S directly into the power search method will not lead to the answer. As a solution, a method that also feeds sums and differences of the three side lengths into the power search method is described in detail with reference to the flowchart of FIG. 10.

最初に、測定対象物の元データａｍを３辺の長さ（ａ０，ａ１，ａ２）とし、これを加減算可能な３次元データａｍ＝（ａ０，ａ１，ａ２）として設定する（ステップＳＳ１～ＳＳ２）。 First, the original data am of the object to be measured is set to the lengths of three sides (a0, a1, a2), and this is set as three-dimensional data am=(a0, a1, a2) that can be added and subtracted (steps SS1 to SS2). ).

次に、元データａｍのサンプル数（ＳＮ）は三角形１０個であり、ＳＮ＝１０を設定する（ステップＳＳ３）。 Next, the number of samples (SN) of the original data am is 10 triangles, and SN=10 is set (step SS3).

Next, the coefficient k by which the elements of the original data am are multiplied is set. To use sums and differences of the three sides (a0, a1, a2), the coefficient k takes the values -1, 0, and 1. Generating the difference element matrix Cm from these automatically produces the 27-row, 3-column matrix of FIG. 9 described above (step SS3).

次に、入力データａｍの各要素間の和と差に、制約条件がある場合、その制約条件を設定する。３角形のように３辺の長さの加減算から構成される積入力要素は負値あるいは零を持たないと容易に推測できることから、辺の差分が正値である条件、（±ａ±ｂ±ｃ）＞０を設定する（ステップＳＳ４）。 Next, if there are constraints on the sum and difference between the elements of the input data am, the constraints are set. Since it can be easily assumed that a product input element composed of addition and subtraction of three side lengths, such as a triangle, does not have a negative value or zero, the condition that the side difference is a positive value is (±a±b± c) Set >0 (step SS4).

Next, the product input element matrix LnS is calculated from (Equation 9), and an LnS table that satisfies the constraint set in step SS4 is created (step SS5). FIG. 18 shows the resulting LnS table of 10 rows and 10 columns. Its 10 rows, the product input elements L0 to L9, are each defined by an expression in sums and differences of the three sides, and its 10 columns hold the values of those expressions for triangles (1) to (10).

次に、べき乗探索法へ入力する入力データ要素の個数ＮＹを設定する。３角形の面積を求める元データは、３辺（ａ０，ａ１，ａ２）であり、（数５）で表されるＹＹ／Ｗ（べき乗値の積）の式は、答えである面積Ｓを含めた４要素以上の積で構成されることから、ＮＹを、ＮＹ＝４、次にＮＹ＝５、さらにＮＹ＝６と最適解が得られるまで増加させてループを廻す。但し、べき乗探索回数は増大してしまうことから、コンピュータの性能及び計算時間制約の範囲内に上限を設定する。ここでは、便宜上ＮＹ＝５に固定した例を用いて説明する（ステップＳＳ６）。 Next, the number NY of input data elements to be input to the power search method is set. The original data for determining the area of the triangle is the three sides (a0, a1, a2), and the YY/W (product of power values) equation represented by (Formula 5) includes the area S, which is the answer. Since it consists of products of four or more elements, NY is increased to NY=4, then NY=5, and then NY=6 until the optimum solution is obtained, and the loop is run. However, since the number of exponentiation searches increases, an upper limit is set within the limits of computer performance and computation time constraints. Here, for the sake of convenience, an example in which NY is fixed to 5 will be described (step SS6).

次に、ＬｎＳテーブルの行から、（ＮＹ－１）の４行を抽出し組み合わせた４行１０列のＤｎＬテーブルを作成する（ステップＳＳ７）。 Next, a DnL table of 4 rows and 10 columns is created by extracting and combining 4 rows of (NY-1) from the rows of the LnS table (step SS7).

Next, the area S, the answer data for triangles (1) to (10) with 1 row and 10 columns, is appended to the end of the DnL table (step SS8). FIG. 19 shows the generated DnL table. The five-dimensional data (D0, D1, D2, D3, D4) input to the power search are thus organized as a table with 210 indexed entries (No. 0 to 209), in which (D0, D1, D2, D3) holds a combination of four elements drawn from the product input elements L0 to L9 of the LnS table and D4 holds the area S.
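The 210 entries correspond to the number of ways to choose 4 of the 10 product input elements. A minimal sketch, using row labels in place of the actual data and assuming the entries are listed in lexicographic order:

```python
from itertools import combinations

lns_rows = [f"L{i}" for i in range(10)]  # labels of the LnS table rows
ny = 5                                   # number of input data elements

# Each DnL entry: (NY - 1) rows of the LnS table plus the answer data S.
dnl_table = [rows + ("S",) for rows in combinations(lns_rows, ny - 1)]

print(len(dnl_table))
print(dnl_table[0])   # entry No. 0
print(dnl_table[-1])  # entry No. 209
```

Under this ordering, entry No. 0 is (L0, L1, L2, L3, S) and the final entry is (L6, L7, L8, L9, S).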

Next, the first five-dimensional input data row Dn is obtained from the DnL table (step SP1). Referring to FIG. 19, this is the first index of the DnL table, row No. 0, with (D0, D1, D2, D3, D4)=(L0, L1, L2, L3, S).

次に、べき指数Ｐｎの探索方法を設定する（ステップＳＰ２）。べき数値を｜ｐｎ｜≦４の整数とする総当たり探索とする。 Next, a search method for exponent Pn is set (step SP2). A brute-force search is performed in which the exponent is an integer of |pn|≤4.

Next, the initial value of the power exponents Pn is set (step SP3). For a brute-force search over integer exponents with |pn|≤4, the initial search value, search label No. 0, is the power exponents (-4, -4, -4, -4, -4).

Next, the search end condition for the power exponents Pn is set (step SP4). Positive exponents for D0, the first input data element, yield solutions that are reciprocals of those with negative exponents; they are redundant and therefore skipped, so the search end value is set to (-1, 4, 4, 4, 4).

Next, a search table of data rows Dn and power exponents Pn is created (step SP5). For example, search label No. 0 is assigned the power exponents P0=(-4, -4, -4, -4, -4), the next search label No. 1 is assigned P1=(-4, -4, -4, -4, -3), and so on consecutively up to the search end value (-1, 4, 4, 4, 4).
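The exponent part of this search table can be sketched as a Cartesian product in which the last exponent varies fastest. The variable names are hypothetical, and restricting p0 to the range -4 to -1 is an interpretation of the search end value (-1, 4, 4, 4, 4).

```python
from itertools import product

p_max, ny = 4, 5
first = range(-p_max, 0)          # p0: negative exponents only
rest = range(-p_max, p_max + 1)   # p1..p4: every integer with |p| <= 4

# The position in the list plays the role of the search label No.
search_table = list(product(first, *([rest] * (ny - 1))))

print(len(search_table))
print(search_table[0], search_table[1], search_table[-1])
```

This gives 4 * 9^4 = 26244 search labels for each data row Dn, which illustrates why the text caps NY according to available computing time.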

次に、データＤｎ行、べき指数Ｐｎを探索テーブルから、探索ラベル順に取り出す（ステップＳＰ６）。 Next, data Dn rows and exponents Pn are taken out from the search table in search label order (step SP6).

Next, Dn^Pn is redefined as the input of the neural network (step SP7). From the data row Dn and the power exponents Pn received in step SP6, Dn^Pn=(D0^p0, D1^p1, D2^p2, D3^p3, D4^p4) is computed, redefined as Dn, and set as the input of the additive neural network.

The subsequent steps ST1 to ST8 follow the same additive neural network computation procedure as in the first example, so their description is omitted. The base, however, was set to 0.99.

In the next step, SP9, the search advances to the next search label of the data Dn and power exponents Pn, since the search is brute-force.

In the next step, SP10, if the current entry is the search end value (-1, 4, 4, 4, 4) of the search table, the process proceeds to step SS9. Otherwise, the process returns to step SP6 and repeats.

In the next step, SS9, the data Dn is updated to the next data in the index order of the DnL table.

In the next step, SS10, the process ends if the data Dn is the final index of the DnL table, (L6, L7, L8, L9, S). Otherwise, the process returns to step SP2 and repeats.

After the final index of the five-dimensional DnL table has been processed, the combination of product input elements Lm (L0 to L9) that minimizes the coefficient of variation is found to be the five-dimensional input data Dn = (L0, L4, L7, L9, S), with power exponents Pn = (-1, -1, -1, -1, 2). FIG. 20 shows a table of the calculated values of YY/W (product of powers) for Pn = (-1, -1, -1, -1, 2). YY/W converges to a nearly constant value (1/16 = 0.0625). This output shows that Heron's formula has been derived.
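The constant 1/16 can be checked numerically. The sketch below assumes, as an illustration only, that the selected elements L0, L4, L7 and L9 correspond to the four factors of Heron's formula, a+b+c, -a+b+c, a-b+c and a+b-c; the text does not spell out this mapping, and the triangle side lengths are made-up test data.

```python
import math
import statistics

def heron_area(a, b, c):
    """Area of a triangle with sides a, b, c (Heron's formula)."""
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

def product_of_powers(values, exponents):
    """Compute D0^p0 * D1^p1 * ... for matching values and exponents."""
    result = 1.0
    for value, p in zip(values, exponents):
        result *= value ** p
    return result

triangles = [(3, 4, 5), (5, 5, 6), (7, 8, 9), (6, 7, 10)]  # illustrative data
exponents = (-1, -1, -1, -1, 2)

yy = []
for a, b, c in triangles:
    area = heron_area(a, b, c)
    # Assumed meaning of (L0, L4, L7, L9, S); see the lead-in.
    elements = (a + b + c, -a + b + c, a - b + c, a + b - c, area)
    yy.append(product_of_powers(elements, exponents))

# Coefficient of variation: standard deviation divided by the mean.
cv = statistics.pstdev(yy) / statistics.mean(yy)
print(yy, cv)
```

Every entry of `yy` comes out at about 0.0625 and the coefficient of variation is essentially zero, which is the condition the search uses to pick this exponent tuple.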

The preceding example used the coefficient of variation as the evaluation function of the power search method. In the present invention, the discrimination rate can also be used as the evaluation function. A method of deriving Heron's formula with the discrimination rate as the evaluation function is described below.

The triangle areas S are split into two groups for discrimination. For example, among the 10 samples, the area S is multiplied by 1.0 for even-numbered entries in the sample No. (SN) column and by 0.9 for odd-numbered entries, which yields two classes, group A and group B. The answer is therefore not the triangle area S but the discrimination result, group A or group B. FIG. 21 lists the discrimination results.

For example, consider an inspection process for triangular products in which the three sides are measured with a measuring instrument and the area is measured from an image, with the aim of rejecting items with an abnormal appearance, such as a reduced area due to a chipped corner. Normal items follow a logical rule and are judged non-defective against a predetermined threshold; all other items are judged defective.

The answer in this discrimination is a label, group A or group B, and it cannot be used in calculation unless it is converted to a number. In the present invention, when the answer is a label, the numerical value of the answer can be treated as a constant. Specifically, the triangle area S is added to the inputs and the answer is set to the constant 1.

In the flowchart of FIG. 10, the differences from the example that uses the coefficient of variation as the evaluation function are that the discrimination rate is selected when the search method is set (step SP2) and that the discrimination rate is calculated accordingly (step SP8). The other steps are the same, and their description is omitted.

The calculation method using the discrimination rate as the evaluation function, and the search method for the power exponents Pn, are described next.

For a five-dimensional input, YY/W (product of powers) in Equation 5 is given by Equation 11, where the last element D4 represents the area S.
(Equation 11)
YY/W = D0^p0 * D1^p1 * D2^p2 * D3^p3 * D4^p4

When YY/W (product of powers) in Equation 11 can be approximated by a constant, its value splits into two distributions, because the last element D4 holds the area S multiplied by 1.0 for even-numbered entries in the sample No. (SN) column and by 0.9 for odd-numbered entries: one constant for group A (multiplied by 1.0) and another for group B (multiplied by 0.9). Exploiting this, the five-dimensional input is fed to the additive neural network, and the one-dimensional output of the network is used to automatically compute the threshold that best separates group A from group B and to calculate the discrimination rate. Here, the one-dimensional output value Z-Act of the additive neural network with added power exponents and two hidden layers, shown in FIG. 8, was used to search for the power exponents Pn of the five-dimensional input data Dn that maximize the discrimination rate.
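The threshold selection and discrimination rate can be illustrated with a small sketch. It is an assumption-laden simplification: the patent computes the threshold on the network output Z-Act, whereas this sketch applies a midpoint search directly to ideal YY/W values, and the function name `best_threshold` is hypothetical. Because S enters with exponent 2, scaling S by 0.9 moves group B to 0.81/16.

```python
def best_threshold(values, labels):
    """Return (threshold, rate) that best separates labels "A" and "B".

    Candidate thresholds are midpoints between neighbouring distinct values;
    the rate is the fraction of samples on the correct side.
    """
    ordered = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])]
    best = (None, 0.0)
    for t in candidates:
        # Count samples that are correct when group A lies above the threshold.
        hits = sum((v > t) == (lab == "A") for v, lab in zip(values, labels))
        # Accept either orientation (A above or A below the threshold).
        rate = max(hits, len(values) - hits) / len(values)
        if rate > best[1]:
            best = (t, rate)
    return best

# Ideal YY/W values: 1/16 for group A, 0.81/16 for group B (S scaled by 0.9).
values = [0.0625 if n % 2 == 0 else 0.0625 * 0.9 ** 2 for n in range(10)]
labels = ["A" if n % 2 == 0 else "B" for n in range(10)]
threshold, rate = best_threshold(values, labels)
print(threshold, rate)
```

With noise-free data the two groups sit on two separate constants, so any threshold between them gives a discrimination rate of 100%.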

When the final index of the five-dimensional list DnL has been processed (step SS10), the combination of product input elements Ln (L0 to L9) that maximizes the discrimination rate is found to be the five-dimensional input data Dn = (L0, L4, L7, L9, S), giving a discrimination rate of 100% with power exponents Pn = (-1, -1, -1, -1, 2).

Next, the characteristics of the output graphs obtained with the discrimination rate as the evaluation function are described. The output values can be visualized in a form that is easy for a person to understand. FIG. 22 is a graph of the output value Z-Act of the additive neural network in order of triangle number, and FIG. 23 is a graph of YY/W (product of powers) in order of triangle number. The graphs show visually that the output value Z-Act is divided into group A and group B, and that YY/W forms two constant lines with no slope.

The Heron's formula data used in the example contain no noise other than the error from rounding the area S at the second decimal place (the data are clean). However, for many relationships that hold in data obtained from measured objects, solving requires dealing with unknown parameters or complicated functional forms, or the best relational expression has to be inferred from noisy data. In such cases, using the discrimination rate as the evaluation function is effective and can be applied in any field.

For example, a large set of medical checkup data can be divided into group A, healthy people, and group B, the small number of people with a certain disease, and the method can be used to investigate (search for) whether some optimal relational expression exists among the checkup items. Finding a highly accurate relational expression with the neural network of the present invention can contribute to the development of medical care that addresses the disease.

Furthermore, consider FIG. 23, the graph of YY/W (product of powers) in order of triangle number, where the areas S were deliberately multiplied by 1.0 for even-numbered entries of the SN column and by 0.9 for odd-numbered entries to form groups A and B. Between the region of group A (called band A) and the region of group B (called band B), a wide blank gray zone (called band C) forms that belongs to neither group. Actively exploiting this blank gray zone (band C) allows the method to be applied to system control.

(Example 3)
As a third example, the second embodiment is applied to the equation of a circle, 1 = x^2 + y^2, which represents the Fermat curve of degree 2. The equation 1 = x^2 + y^2 can be factored as 1 = (x + i*y) * (x - i*y). Accordingly, multiple numerical values of x and y on the right-hand side are prepared as the original data, the answer data is set to the constant 1, and the coefficients k = -i, -1, 0, 1, i are set in advance. An LnS table is then created automatically, consisting of combinations that include the sums and differences obtained by multiplying by ±1 and the imaginary unit i. Two-dimensional input data elements are extracted from the LnS table and combined to create a DnL table automatically, and preprocessing concatenates the one-dimensional answer data, 1, to form a three-dimensional DnL table. The entries are input to the neural network in order, and the equation of the circle is derived as the optimal relational expression.
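The factorization that this example relies on can be checked with Python's built-in complex numbers. The sketch only verifies the identity (x + i*y) * (x - i*y) = x^2 + y^2 = 1 for sample points on the unit circle; the sample angles and the name `circle_product` are illustrative assumptions, and the LnS and DnL tables themselves are not reproduced.

```python
import math

def circle_product(x, y):
    """Return (x + i*y) * (x - i*y), which equals x^2 + y^2."""
    return (x + 1j * y) * (x - 1j * y)

# Sample points on the unit circle (angles chosen arbitrarily for illustration).
points = [(math.cos(t), math.sin(t)) for t in (0.0, 0.7, 2.1, 4.0)]
products = [circle_product(x, y) for x, y in points]
print(products)
```

Each product is 1 up to rounding, with a vanishing imaginary part, so a product of the two complex elements against the constant answer 1 is exactly the kind of constant relation the search looks for.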

In this way, the neural network of the present invention can recognize circular or elliptical curves as equations and can be used to discriminate curved objects, which is harder than recognizing straight lines. For example, it can learn the characteristics of good and defective items from the appearance or non-destructive inspection data of the shafts and bearings of rotating machinery, find the relational expression and threshold, and discriminate deviations from design values, deformation, scratches, cracks, wear, and other defects.

(Example 4)
As a fourth example, a two-dimensional simulation of the CartPole inverted pendulum is used to derive a control equation that keeps the pole stably upright. In this example, four-dimensional input data is received in real time and an output is returned that determines whether to push the cart to the right or to the left. Reinforcement learning with the power search method is used to find, as quickly as possible, a control equation that keeps the pole on the cart from falling and stabilizes it.

A platform for evaluating the performance of algorithms on the CartPole inverted pendulum is provided by OpenAI Gym. The reinforcement learning algorithm using the power search method is implemented on this platform to search for a control equation that stabilizes the pole in the shortest time. The result is also compared with the policy gradient method, one of the conventional reinforcement learning methods that use neural networks.

As shown in FIG. 24, in the CartPole inverted pendulum a pole is connected to the top of a cart. When the pole is initially set upright at x = 0 on the horizontal axis, forces simulating gravity and fluctuation act on it and it starts to fall to the left or right. The simulation pushes the cart to the left or right with equal force to keep the pole from falling for a predetermined time, and it ends if the pole tilts beyond a certain angle within that time.

First, a method of keeping the pole upright for the predetermined time with the conventional policy gradient method, one of the algorithms for this task, is described. As illustrated in FIG. 24 and listed in the table of FIG. 25, each time the cart is pushed the CartPole simulation returns the current state as four state variables (d0, d1, d2, d3): the position and velocity of the cart and the angle and angular velocity of the pole. As shown in FIG. 26, two actions are available in any state: pushing the cart to the right or to the left with the same force.

The conventional neural network uses a simple single-layer structure with four inputs, as shown in FIG. 27, and learns and updates the weighting parameters wn = (w0, w1, w2, w3) and the bias b. If the bias is not used (b = 0), the output value x is given by Equation 12. The policy gradient method sets a reward function (Rt) and trains the network to maximize its value. The weighting parameters are updated as in Equation 13, using the learning rate η of the conventional network and a partial derivative.
(Equation 12)
x = d0*w0 + d1*w1 + d2*w2 + d3*w3
(Equation 13)
wn ← wn + η(∂Rt)/(∂wn)

The policy gradient method updates the parameters using a group of several episodes as one evaluation range. In this simulation, pushing the cart once is defined as one step, and one episode lasts until the pole falls (termination), so the number of steps in an episode is the number of pushes. If the pole does not fall within the predetermined time, the episode is cut off at a maximum of 200 steps. The maximum number of steps per episode is therefore 200, and the average number of steps over several episodes is the average number of steps for which the pole stayed upright. Here, the evaluation range is the past 100 episodes; the average number of steps over that range is recorded to monitor the progress of learning and is also used as a parameter for updating the reward function.

As shown in FIG. 28, the reward Rt for the t-th episode is -1 if the episode completes 200 steps without the pole falling, and (number of steps - 200) if the pole falls within 200 steps.
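The reward rule and the 100-episode evaluation window can be sketched as below. The names `episode_reward`, `MAX_STEPS`, `WINDOW` and `recent_steps` are hypothetical, and the episode lengths are made-up values used only to exercise the code.

```python
from collections import deque

MAX_STEPS = 200   # an episode is cut off after this many steps
WINDOW = 100      # number of past episodes in the evaluation range

def episode_reward(steps, max_steps=MAX_STEPS):
    """Reward Rt: -1 if the pole survives all steps, otherwise steps - max_steps."""
    if steps >= max_steps:
        return -1
    return steps - max_steps

recent_steps = deque(maxlen=WINDOW)   # keeps only the latest WINDOW episode lengths
for steps in [37, 120, 200, 200]:     # made-up episode lengths, for illustration only
    recent_steps.append(steps)

rewards = [episode_reward(s) for s in recent_steps]
mean_steps = sum(recent_steps) / len(recent_steps)
print(rewards, mean_steps)
```

The reward is always negative and approaches -1 as episodes get longer, so maximizing it is the same as keeping the pole upright for as many steps as possible.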

Learning of the weighting parameters wn starts from an initial value of 0 or some other chosen value. Depending on the initial value and on how the updates proceed, however, the target of 200 steps may never be reached no matter how long learning continues. As a remedy for the conventional policy gradient method, a technique has been proposed in which the weighting parameters w are initialized with random values and, during learning, moderate random values N are added to induce a degree of random behavior while the parameters are updated to maximize the reward; it is known as the ε-greedy algorithm. Specifically, based on Equation 13, ten random values N[i] whose spread has standard deviation σ are regenerated for the parameters wn every 10 episodes (every batch). As the episodes proceed through i = 0 to 9, the random value N[i] is added, the partial derivative of the reward ∂Rt/∂wn is also added, and the next action is selected randomly, as in Equation 14. FIG. 29 shows a flowchart of the conventional policy gradient method described above. As initial parameters, the learning rate η that varies the weights wn and the standard deviation σ of the spread are set to η = 0.2 and σ = 0.05.
(Equation 14)
wn ← wn + N[i] + η(∂Rt)/(∂wn)

FIG. 30 shows an example result of implementing the conventional policy gradient method described above in the CartPole inverted pendulum simulation. The horizontal axis is the number of episodes, and the vertical axis is the average number of steps, over the past 100 episodes, for which the pole stayed upright. The graph shows that the average reached 195 steps at 1500 episodes, at which point learning ended. The weighting parameters at that point were (w0, w1, w2, w3) = (-0.532, 0.610, 1.254, 1.421).

The table of FIG. 31 gives examples of weighting parameters that satisfy an average of at least 195 steps over the past 100 episodes. In addition to (w0, w1, w2, w3) = (-0.532, 0.610, 1.254, 1.421) above, many such parameter sets are found when the CartPole simulation is repeated; five are shown. When a program using any of the five parameter sets in FIG. 31 is run in the CartPole simulation, the pole stays upright for 200 steps or more from the start. The drawback, however, is that even with these five sets of weighting parameters obtained by the conventional policy gradient method in hand, it is extremely difficult to understand the principle by which the pole is kept upright.

The above described how the conventional policy gradient method derives a control equation that keeps the pole of the CartPole inverted pendulum from falling for a certain time. However, the resulting control equation is difficult to analyze, understand, and extend to applications. For example, it does not lead to a control method for steering the pole to the right or left from the upright state. The reinforcement learning algorithm using the power search method of the present invention allows the resulting relational expression to be analyzed and visualized in a form that people can understand, so that a control method for steering the pole to the right or left from the upright state can be grasped intuitively. In addition, only the state parameters (input data) needed for the intended control can be extracted, and unnecessary (surplus) state parameters can be removed.

Reinforcement learning using the power search method of the present invention is described next. The motion of the pole and cart of the CartPole inverted pendulum is the same as above. The reinforcement learning algorithm that keeps the pole from falling is described in detail with reference to the flowchart of FIG. 32.

To apply the power search method to the CartPole inverted pendulum, let (D0, D1, D2, D3) be the values obtained by raising the base to the four-dimensional state variables (d0, d1, d2, d3), and let the power exponents be Pn = (p0, p1, p2, p3). Let D4 be answer data that has a predetermined relationship to a combination of the four state variables. The expected value of D4 can be set to the constant 1, so the five-dimensional input elements can be written (D0, D1, D2, D3, 1). From Equation 5, YY/W (product of powers) is given by Equation 15. Simplifying with W = 1, the target value YY is given by Equation 16. Taking the logarithm of both sides of Equation 16 gives Equation 17. The right-hand side of Equation 17 equals Equation 12 with the weights wn replaced by the power exponents Pn, and the left-hand side satisfies log(YY) = 0 when the target value YY = 1. Setting log(YY) = x therefore gives Equation 12, as used in the conventional policy gradient method, with the weights wn replaced by the power exponents Pn, which is convenient for comparing the algorithms.
(Equation 15)
YY/W = D0^p0 * D1^p1 * D2^p2 * D3^p3
(Equation 16)
YY = D0^p0 * D1^p1 * D2^p2 * D3^p3
(Equation 17)
log(YY) = d0*p0 + d1*p1 + d2*p2 + d3*p3
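The step from the product form to the linear form can be verified numerically. With Dn = base^dn, the logarithm (to the same base) of the product of Dn^pn equals the sum of dn*pn. The state values below are made up for illustration, and the function names are hypothetical.

```python
import math

def product_form(d, p, base=10.0):
    """YY = prod((base ** dn) ** pn), i.e. the product of powers with W = 1."""
    yy = 1.0
    for dn, pn in zip(d, p):
        yy *= (base ** dn) ** pn
    return yy

def linear_form(d, p):
    """x = sum(dn * pn), the same shape as a single-layer network without bias."""
    return sum(dn * pn for dn, pn in zip(d, p))

d = (0.02, -0.15, 0.03, 0.2)   # made-up state variables (d0, d1, d2, d3)
p = (-1, 2, 3, 3)              # exponent tuple quoted in the text
yy = product_form(d, p)
x = linear_form(d, p)
print(math.log10(yy), x)
```

Both printed numbers agree, which is why the exponents Pn can be compared directly with the weights wn of the conventional single-layer network.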

First, initial settings are made (step SS1). For convenience of explanation, and following the conventional policy gradient method, the maximum number of steps per episode is set to 200, the number of episodes used for evaluating the average to 100, and the batch size of the array of deviations N that update the power exponents Pn to 10. In the conventional policy gradient method the four-dimensional deviations N were random values initialized to 0; in the power search method the deviations are instead update amounts Δpn for the exponents. In this example the update amounts Δpn are ±1, as listed in the table of FIG. 33. The ten deviations N[i] (i = 0 to 9), corresponding to the batch size of 10, are four-dimensional arrays (Δp0, Δp1, Δp2, Δp3) whose terms are set in turn to the integer values 1 and -1. For i = 8 and 9, however, 0 is set. In four dimensions eight update amounts Δpn suffice, so the entries for i = 8 and 9 are surplus; for ease of comparison with the conventional policy gradient method, these two are kept as surplus entries with Δpn = 0, which apply no update. Next, the initial values of the reward Rt and of Rta, the normalized reward, are set to 0.
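One plausible layout of the deviation array N[i] is sketched below. The table of FIG. 33 is not reproduced in the text, so the row order here (+1 and -1 on each exponent in turn, then two all-zero rows) is an assumption, and `build_deviation_table` is a hypothetical name.

```python
def build_deviation_table(dims=4, batch=10):
    """Return `batch` rows of exponent updates (dp0, ..., dp{dims-1}).

    Assumed layout: rows 0..2*dims-1 change one exponent by +1 or -1;
    any remaining rows are all zero and leave the exponents unchanged.
    """
    table = []
    for n in range(dims):
        for sign in (1, -1):
            row = [0] * dims
            row[n] = sign
            table.append(row)
    while len(table) < batch:
        table.append([0] * dims)   # surplus rows (i = 8 and 9 for dims = 4)
    return table

deviations = build_deviation_table()
for i, row in enumerate(deviations):
    print(i, row)
```

With four exponents this gives the eight non-zero updates mentioned in the text plus two surplus rows, so one batch of 10 episodes tries every single-exponent change once.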

Next, after the loop counter for the batch of 10 is initialized to i = 0 (step SS2), the power exponents Pn are updated. The update is given by Equation 18: the deviation N[i] and the partial derivative of the reward ∂Rt/∂Pn are added (step SS3).
(Equation 18)
Pn ← Pn + N[i] + η(∂Rt)/(∂Pn)

Next, the step count, which represents the number of CartPole actions, is initialized to step = 0, and the state variables (d0, d1, d2, d3) are reset to 0 to give the initial state (step SS4).

Next, the CartPole is released from the initial state, in which the pole stands upright (step SS5).

The cart is first pushed once to the left (step SS6).

Pushing the cart causes the CartPole to output the state variables (d0, d1, d2, d3), which are stored (step SS7).

The output value x of the neural network is calculated from Equation 17, with x = log(YY) (step SS8).

Next, based on the output value x, the cart is pushed to the right when x > 0 and to the left when x ≤ 0 (step SS9).

Pushing the cart causes the CartPole to output the state variables (d0, d1, d2, d3) and a signal indicating whether the pole has fallen and the episode has ended; these are stored (step SS10).

If the pole falls and the episode ends, the reward Rt = step - 200 is obtained and the loop counter for the batch of 10 is incremented by 1 (steps SS11→SK1→SS12). If the pole has not fallen and the episode reaches step = 200, the reward Rt = -1 is obtained and the loop counter is incremented by 1 (steps SS11→SK2→SK3→SS12). If the pole has not fallen and step < 200, the step count is incremented by 1 and the loop returns to step SS8 (steps SS11→SK2→SK4→SS8).

Next, the batch loop counter i is incremented by 1 and the rewards Rt for the past 10 episodes are stored. Then step, the number of steps for which the pole stayed upright in the episode, is stored for the past 100 episodes, and their average stepmean is calculated and stored (steps SS12 to SS13).

Next, it is checked whether the batch loop counter i has reached the batch size of 10 (step SS14). If not, the process returns to step SS4. If it has, the value of stepmean is checked, and the process ends if stepmean ≥ 195 (step SS15). If stepmean < 195, Rta is calculated by normalizing the past 10 rewards Rt and is stored (step SS16). The inner product of Rta and the deviations N that update the power exponents Pn is calculated and stored as the partial derivative ∂Rt/∂Pn, and the process returns to step SS2 (step SS17).
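Steps SS16 and SS17 can be sketched as follows. The text does not say how the rewards are normalized, so this sketch assumes the common choice of subtracting the mean and dividing by the standard deviation; the function names, the two-exponent deviation table and the reward values are all illustrative.

```python
import statistics

def normalize_rewards(rewards):
    """Rta: rewards shifted to zero mean and scaled to unit standard deviation.

    The normalization method is an assumption; the text only says "normalized".
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0] * len(rewards)   # all episodes equal: no preferred direction
    return [(r - mean) / std for r in rewards]

def estimate_gradient(rewards, deviations):
    """Inner product of Rta with the deviations N, one value per exponent."""
    rta = normalize_rewards(rewards)
    dims = len(deviations[0])
    return [sum(r * dev[n] for r, dev in zip(rta, deviations)) for n in range(dims)]

# Toy batch with two exponents: raising p0 helped, lowering it hurt, p1 had no effect.
deviations = [[1, 0], [-1, 0], [0, 1], [0, -1]]
rewards = [-50, -150, -100, -100]
gradient = estimate_gradient(rewards, deviations)
print(gradient)
```

A positive entry means episodes with that exponent raised earned better rewards than those with it lowered, so the update of Equation 18 moves the exponent in that direction.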

FIG. 34 shows an example result of implementing the power search algorithm described above in the CartPole inverted pendulum simulation. The graph shows that the average reached 195 steps at 110 episodes, at which point learning ended. The power exponents at that point were (p0, p1, p2, p3) = (-1, 2, 3, 3). Compared with FIG. 30 for the conventional policy gradient method, the search for a function that keeps the pole from falling finished in less than 1/10 the number of episodes, that is, in a much shorter time. The table of FIG. 35 gives examples of power exponents that satisfy an average of at least 195 steps over the past 100 episodes. In addition to (p0, p1, p2, p3) = (-1, 2, 3, 3) above, many such exponent sets are found when the CartPole simulation is repeated; five are shown.

The reason the pole remains stable without falling can be analyzed and visualized in a human-understandable form using the neural network of this patent. The power exponents (p0, p1, p2, p3) = (-1, 2, 3, 3) are used as an example.

The power exponents (p0, p1, p2, p3) = (-1, 2, 3, 3) are implemented in the CartPole inverted pendulum and the simulation is run. The four-dimensional state variables (d0, d1, d2, d3) for the first 200 steps of one episode are taken as input values, and the steps are labeled as answer data in two classes: group A for steps in which the cart was pushed to the right and group B for steps in which it was pushed to the left. Feeding these to the neural network of the present invention, in the same way as the method using the discrimination rate as the evaluation function described in the second example (Heron's formula), yields a graph with YY/W (product of powers) on the vertical axis, like the one described for FIG. 23; it is shown in FIG. 36. In FIG. 36 the horizontal axis is the time order in which the cart was pushed, that is, the step number, and the vertical axis is the value of YY/W. Steps in group A (cart pushed to the right) are plotted as filled circles (●) and steps in group B (cart pushed to the left) as diamonds. The value of YY/W is obtained as the output based on Equations 5 and 15 that maximizes the discrimination rate, using base 10 to convert the four-dimensional state variables (d0, d1, d2, d3) into the powers (D0, D1, D2, D3) and taking (D0, D1, D2, D3, 1) as the five-dimensional input elements to the neural network.

From the operation described above, the rule is to push the cart to the right when YY/W = D0^p0*D1^p1*D2^p2*D3^p3 > 1 and to push the cart to the left when YY/W = D0^p0*D1^p1*D2^p2*D3^p3 ≤ 1. Using the graph of FIG. 36, the following can be explained.
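The push rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample state values, and the encoding of the action as 1 (push right) and 0 (push left) are hypothetical and do not come from the patent text; only the base of 10, the exponents (-1, 2, 3, 3), and the comparison against 1 are taken from the description.

```python
BASE = 10.0  # base used in the text to convert d_n into D_n = BASE ** d_n


def power_product(state, exponents, base=BASE):
    """Return YY/W = D0^p0 * D1^p1 * ... with Dn = base ** dn."""
    product = 1.0
    for d, p in zip(state, exponents):
        product *= (base ** d) ** p
    return product


def choose_action(state, exponents, threshold=1.0):
    """Return 1 (push right) if YY/W exceeds the threshold, else 0 (push left).

    The 1/0 action encoding is an assumption made for this sketch.
    """
    return 1 if power_product(state, exponents) > threshold else 0


exponents = (-1, 2, 3, 3)            # exponents from the example in the text
state = (0.01, -0.02, 0.03, 0.04)    # hypothetical (d0, d1, d2, d3)
action = choose_action(state, exponents)
```

Because Dn = 10^dn, the product equals 10 raised to the sum of pn*dn, so comparing YY/W with 1 is the same as checking the sign of that weighted sum.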

In the graph of FIG. 36, the value of YY/W (the product of power values) on the vertical axis separates three regions: group A, in which the cart is reliably pushed to the right; group B, in which the cart is reliably pushed to the left; and group C, in which pushes to the right and to the left are mixed. The central value is YY/W = 1. If a threshold for judging the value of YY/W is introduced as a variable A, the left-right motion of the pole can be controlled through the threshold A. Specifically, when the threshold A is 1, the cart stays at the center and keeps the pole standing vertically. When the threshold A is greater than 1, the cart is pushed to the right more often in the initial state and the pole leans to the right. The next action pushes the cart to the left so that the pole does not fall, and the cart moves to the left. Conversely, when the threshold A is smaller than 1, the cart is pushed to the left more often in the initial state and the pole leans to the left. The next action pushes the cart to the right so that the pole does not fall, and the cart moves to the right. Furthermore, it can be seen intuitively that the moving speed of the cart can be controlled by the depth of the threshold, that is, how far it is set from YY/W = 1. As a specific example, FIG. 37 summarizes the expression for YY/W with the power exponents (p0, p1, p2, p3) = (-1, 2, 3, 3) and the behavior of the cart as the value of the threshold A is varied.

In addition, the power exponents p0 and p1 are both 0 in No. 4 and No. 5 of FIG. 35. This indicates that D0 and D1, the state variables for the position and velocity of the cart, are unnecessary for stable control in which the pole stays upright at the center. When the pole is stable at the center, the cart sits at the center with a position of almost 0 and a velocity of almost 0, so it is understandable that control is possible without them. Accordingly, D0 and D1 were removed, and reinforcement learning using the power search method described above was performed with only the two state variables D2 and D3, the angle and angular velocity of the pole. FIG. 38 shows three examples of power exponent values for which the pole does not fall and the average number of steps over the past 100 episodes is at least 195. Furthermore, FIG. 39 shows an application example of a control expression that uses the state parameters (D2, D3) and the power exponents (p2, p3) = (5, 3) to move the cart from the center position to the left, then to the right, and then to the left end, without letting the pole fall.
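The reason a zero exponent makes an input unnecessary can be checked numerically: any factor Dn^0 equals 1 and drops out of the product. The sketch below is illustrative only. The helper name and the sample state are hypothetical, and combining zeros for p0 and p1 with the exponents (5, 3) from the FIG. 39 example is an assumption made purely for the demonstration.

```python
BASE = 10.0  # base used in the text to convert d_n into D_n = BASE ** d_n


def power_product(state, exponents, base=BASE):
    """Return the product of (base ** dn) ** pn over all inputs."""
    product = 1.0
    for d, p in zip(state, exponents):
        product *= (base ** d) ** p
    return product


# Hypothetical state: cart position, cart velocity, pole angle, pole angular velocity.
full_state = (0.4, -1.2, 0.03, 0.02)

# Four-input rule with p0 = p1 = 0 versus a two-input rule on (d2, d3) only.
full = power_product(full_state, (0, 0, 5, 3))
reduced = power_product(full_state[2:], (5, 3))
```

Both calls give the same value, so the cart position and velocity inputs can be left out without changing the decision.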

In this way, this patent makes it possible to narrow down the input data needed to obtain an answer. In other words, removing unnecessary (surplus) input data reduces computation time and reduces the number of sensors and other components needed to acquire the input data.

As an application of this example, an inverted pendulum device can be assembled from an educational assembly kit of building blocks equipped with various sensors, motors, and microcomputers for communication and control. Learners can then study AI (Artificial Intelligence) through the experience of keeping the pole stationary without falling, or of steering it left and right. Depending on the teaching material, the relational expression may yield a formula or law, or be provided in a form close to one. This gives learners the excitement of possibly discovering something and motivates them.

The control method is learned, and a control expression that incorporates the product of power values is provided. A simple control expression is obtained, and both its derivation and the control method are easy to understand. In some cases, this leads to eliminating unnecessary input-data components (sensors, etc.) that contribute little to control, or to the discovery of new control methods.

When the obtained control expression is applied to a control device, its stability can be evaluated and optimized in real time. For example, the control state of the same device in a different environment is learned, and if operation has deteriorated, the control expression is updated to one that maintains a good control state. This so-called deviation correction is performed in real time, which enables automated feedback control in pursuit of higher stability.

Industrial robots are transported to various sites, assembled, and adjusted so that they operate under the intended conditions. When control parameters must be reset or a control expression needs correction, re-learning by reinforcement learning with the power search method of this patent derives a stable, optimal control expression more quickly, which makes it easy to re-implement the control parameters or the control expression. The same approach can be applied to the automatic control of automobiles and flying objects.

(Example 5)
In the first to fourth examples above, it was explained that appropriate laws, equations, and relational expressions (control expressions) can be derived using the power-exponent-added additive neural network. The present invention thus has excellent generalization ability, that is, the ability to give appropriate outputs for unlearned inputs as well as learned ones. When this ability is applied to a process, appropriate predictions can be made in a highly logical manner not only for the learned process but also for processes similar to it. Underlying this is an arithmetic capability that can easily learn the logical operators (AND, OR, NAND, NOR, EXOR), and that can easily learn numerical data of logical operations, such as converting base-n numbers to decimal, and present a general formula.

Because the exclusive OR (EXOR) logical operator is nonlinear, a conventional simple perceptron cannot separate its true and false outputs with a single straight line (threshold). Consequently, as shown in FIG. 40 for the two-input truth table, a major design change is required to a multilayer neural network structure that connects NAND, OR, and AND logical operators, each built from a simple perceptron. The learned discriminant output expression becomes an intricate parameter expression that is not easy to understand. In contrast, the power-exponent-added additive neural network can handle nonlinearity. The basic structure of either FIG. 6 or FIG. 8 can be applied as is, without modification, to derive a simple discriminant output expression that separates the true and false outputs with a single straight line.

For example, take the output data d3 of the exclusive OR (EXOR) in the three-input (d0, d1, d2) truth table shown in FIG. 41. When the data are converted to four-dimensional input values (D0, D1, D2, D3) with base 10 and discriminant learning of the output classes is performed with the power-exponent-added additive neural network, a discriminant with the power exponents (-1, 1, -1, 2) is derived, as shown in FIG. 42, and the classes are correctly separated by a single straight line (threshold) at 5. The two-input exclusive OR (EXOR) is solved so easily that its explanation is omitted.
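The separation claimed for the exponents (-1, 1, -1, 2) can be checked by enumerating a truth table. The sketch below is an illustration, not the learning procedure of the patent. It assumes that the three-input EXOR output d3 is the parity d0 XOR d1 XOR d2 (FIG. 41 is not reproduced here) and that the "5" in the text is the threshold value; all names are hypothetical.

```python
from itertools import product

BASE = 10                     # base used in the text, Dn = BASE ** dn
EXPONENTS = (-1, 1, -1, 2)    # exponents quoted in the text
THRESHOLD = 5                 # assumed to be the threshold value from the text


def discriminant(d):
    """Return D0^p0 * D1^p1 * D2^p2 * D3^p3 for d = (d0, d1, d2, d3)."""
    value = 1.0
    for dn, pn in zip(d, EXPONENTS):
        value *= (BASE ** dn) ** pn
    return value


rows = []
for d0, d1, d2 in product((0, 1), repeat=3):
    d3 = d0 ^ d1 ^ d2         # assumed definition of the 3-input EXOR output
    yy = discriminant((d0, d1, d2, d3))
    rows.append((d0, d1, d2, d3, yy, yy > THRESHOLD))

for row in rows:
    print(row)
```

Under these assumptions, every row with d3 = 1 gives a value of at least 10 and every row with d3 = 0 gives a value of at most 1, so a single threshold between 1 and 10 separates the two classes.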

Next, FIG. 43 shows a table of the relationship between binary and decimal numbers. FIG. 43 lists four-dimensional binary input data (d0, d1, d2, d3) and the corresponding decimal output value d4, from 0 to 9. When these are converted to five-dimensional input values (D0, D1, D2, D3, D4) with base 10, the basic structure of the power-exponent-added additive neural network of either FIG. 6 or FIG. 8 is applied as is, and a formula search is performed over power exponents from -10 to 10, an output expression with the power exponents (-8, -4, -2, -1, 1) and an output value of 1 is derived, as shown in FIG. 44. That is, D0^-8*D1^-4*D2^-2*D3^-1*D4 = 1, so the relational expression for the decimal output d4 is d4 = log10(D0^8*D1^4*D2^2*D3) = 2^3*d0 + 2^2*d1 + 2^1*d2 + 2^0*d3, which is exactly the formula (general formula) for converting a binary number to decimal. It follows that the decimal values 10 to 15, which were not learned but can be represented by four-dimensional binary data, are also predicted correctly.
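The relation derived above can be verified for all sixteen 4-bit inputs, including the unlearned values 10 to 15. The following sketch is illustrative; the function names are hypothetical, and d0 is assumed to be the most significant bit, as implied by the coefficient 2^3.

```python
import math

BASE = 10
EXPONENTS = (-8, -4, -2, -1, 1)   # exponents quoted in the text


def to_bits(n):
    """Return (d0, d1, d2, d3) for 0 <= n <= 15, with d0 the most significant bit."""
    return tuple((n >> shift) & 1 for shift in (3, 2, 1, 0))


def relation_holds(bits, d4):
    """Check D0^-8 * D1^-4 * D2^-2 * D3^-1 * D4 == 1 with Dn = BASE ** dn."""
    exponent_sum = sum(p * d for p, d in zip(EXPONENTS, (*bits, d4)))
    return BASE ** exponent_sum == 1


def decode(bits):
    """Return d4 = log10(D0^8 * D1^4 * D2^2 * D3)."""
    d0, d1, d2, d3 = bits
    value = (BASE ** d0) ** 8 * (BASE ** d1) ** 4 * (BASE ** d2) ** 2 * (BASE ** d3)
    return round(math.log10(value))


decoded = [decode(to_bits(n)) for n in range(16)]
```

Here `decoded` reproduces 0 through 15 in order, which matches the statement that the derived expression is the general binary-to-decimal formula rather than a fit to the ten learned rows.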

Thus, the power-exponent-added additive neural network is a widely applicable computation method that can derive relational expressions and discriminants without any change to its structure. Providing it as an integrated-circuit IC or a microcomputer and mounting it in discrimination devices and control devices makes it possible to achieve higher speed, smaller size, and lower power consumption in those devices.

(Other embodiments)
The present invention is not limited to the embodiments described above and can be implemented with various modifications without departing from the gist of the invention. All such modifications are included in the technical idea of the present invention.

DESCRIPTION OF SYMBOLS
1... arithmetic device, 1A... machine learning device, 1B... discrimination device,
2... classifier learning unit, 3... learning parameter storage unit, 4... learning data storage unit,
5... learning data processing unit, 6... discrimination result processing unit, 7... discrimination data acquisition unit,
20... learning unit, 21... discrimination processing unit,
100A to 100C... neural network structure,
110A to 110C... input layer, 120A to 120C... output layer,
130... hidden layer, 131... first hidden node, 132... second hidden node

Claims

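An arithmetic device that uses a neural network structure including at least an input layer and an output layer to output an output value from the output layer for a plurality of input data (D0, D1, ..., DN) input to the input layer, wherein
the input layer
has, as learning parameters, a plurality of power exponents (p0, p1, ..., pN) that are respectively associated with the plurality of input data and to which the plurality of input data are respectively raised, and
the output layer
outputs the output value (y = f(YY0)) based on a product (YY0 = D0^p0 * D1^p1 * ... * DN^pN) of a plurality of power values (D0^p0, D1^p1, ..., DN^pN) obtained by raising the plurality of input data input to the input layer to the plurality of power exponents, respectively.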

The arithmetic device according to claim 1, wherein
the neural network structure
further comprises a hidden layer between the input layer and the output layer,
the hidden layer has:
a first hidden node to which the plurality of input data are respectively input via a plurality of weighting parameters (w0, w1, ..., wN) as the learning parameters, and which outputs a target value (YY1) defined by the following formula [Number 1] to the output layer; and
a second hidden node to which the plurality of input data are respectively input via the plurality of weighting parameters and to which a bias parameter (b) as the learning parameter is input, and which outputs an additive operation output (BYA) defined by the following formula [Number 2] to the output layer, and
the output layer
outputs the output value (y = f(YY1, BYA)) based on the target value (YY1) and the additive operation output (BYA).
[Number 1]
YY1 = D0^p0 * D1^p1 * ... * DN^pN * W0 * W1 * ... * WN
[Number 2]
BYA = B * base^(Σ[n=0→N] (wn * pn * dn))
where
base is a positive number other than 1,
Dn = base^dn (n = 0, 1, ..., N),
Wn = base^wn (n = 0, 1, ..., N), and
B = base^b.
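As a reading aid, the two hidden-node outputs can be evaluated directly from formulas [Number 1] and [Number 2] as they are written in this claim. The sketch below is a minimal illustration, not an implementation of the claimed device. The function name, the default base of 10, and the sample values are assumptions.

```python
BASE = 10.0  # assumed base; the claim only requires a positive number other than 1


def hidden_outputs(d, p, w, b, base=BASE):
    """Return (YY1, BYA) for log-domain inputs d, exponents p, weights w, bias b."""
    D = [base ** dn for dn in d]      # Dn = base^dn
    W = [base ** wn for wn in w]      # Wn = base^wn
    B = base ** b                     # B = base^b

    # [Number 1]: YY1 = D0^p0 * ... * DN^pN * W0 * ... * WN
    yy1 = 1.0
    for Dn, pn in zip(D, p):
        yy1 *= Dn ** pn
    for Wn in W:
        yy1 *= Wn

    # [Number 2]: BYA = B * base^(sum over n of wn * pn * dn)
    bya = B * base ** sum(wn * pn * dn for wn, pn, dn in zip(w, p, d))
    return yy1, bya


# Hypothetical values chosen only to make the arithmetic easy to follow.
yy1, bya = hidden_outputs(d=(1, 2), p=(2, 1), w=(0, 0), b=0)
```

With these sample values YY1 is 10^(1*2 + 2*1) = 10^4, and BYA is 1 because all weights and the bias are zero in the log domain.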

The arithmetic device according to claim 2, wherein
the plurality of power exponents, the plurality of weighting parameters, and the bias parameter as the learning parameters
are parameters learned using a plurality of sets of the plurality of input data as learning data, and
are adjusted so that the difference (|YY1 - BYA|) between the target value (YY1) output from the first hidden node and the additive operation output (BYA) output from the second hidden node, when the plurality of input data serving as the learning data are input to the input layer, becomes small.

The arithmetic device according to claim 1, wherein
the input layer
converts the plurality of input data (D0, D1, ..., DN) into a plurality of logarithms (d0, d1, ..., dN) and outputs to the output layer a plurality of multiplied values (d0*p0, d1*p1, ..., dN*pN) obtained by multiplying the plurality of logarithms by the plurality of power exponents, respectively, and
the output layer
converts the sum (d0*p0 + d1*p1 + ... + dN*pN) of the plurality of multiplied values into an antilogarithm (base^(d0*p0 + d1*p1 + ... + dN*pN)), takes the antilogarithm as the product, and outputs the output value (y = f(YY0)).
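The equivalence this claim relies on, namely that a product of powers equals the antilogarithm of a weighted sum of logarithms, can be illustrated as follows. This is a sketch under assumptions: the function names, the base of 10, and the sample data are hypothetical, and the inputs are taken to be positive real numbers so that their logarithms are defined.

```python
import math


def product_direct(D, p):
    """Compute D0^p0 * D1^p1 * ... directly."""
    out = 1.0
    for Dn, pn in zip(D, p):
        out *= Dn ** pn
    return out


def product_via_logs(D, p, base=10.0):
    """Compute the same product as base^(d0*p0 + d1*p1 + ...), with dn = log_base(Dn)."""
    d = [math.log(Dn, base) for Dn in D]               # input layer: take logarithms
    weighted_sum = sum(dn * pn for dn, pn in zip(d, p))  # multiply by exponents and sum
    return base ** weighted_sum                        # output layer: antilogarithm


D = (2.0, 5.0, 0.5)   # hypothetical positive inputs
p = (3, -1, 2)        # hypothetical power exponents
direct = product_direct(D, p)
via_logs = product_via_logs(D, p)
```

Both routes give 2^3 * 5^-1 * 0.5^2 = 0.4 up to floating-point rounding; the log-domain route replaces exponentiation of each input with multiplications and one sum.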

The arithmetic device according to claim 1 or 4, wherein
the plurality of power exponents as the learning parameters
are parameters learned using a plurality of sets of learning data each including the plurality of input data and teacher data associated with the plurality of input data, and
are adjusted so that the difference between the output value output from the output layer when the plurality of input data included in the learning data are input to the input layer and the teacher data included in the learning data becomes small.
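One simple way to adjust the power exponents so that the output approaches the teacher data is an exhaustive search over a small set of integer exponents. The sketch below uses that stand-in technique for illustration; it is not presented as the learning procedure of the patent. The function name, the search range, the squared-error criterion, the identity output function f, and the synthetic data are all assumptions.

```python
from itertools import product


def fit_exponents(samples, search_range=range(-3, 4)):
    """Search integer exponents that minimize the squared error against teacher data.

    samples: list of (inputs, teacher) pairs, where inputs is a tuple (D0, D1, ...).
    The output function f is taken to be the identity in this sketch.
    """
    n_inputs = len(samples[0][0])
    best, best_err = None, float("inf")
    for exps in product(search_range, repeat=n_inputs):
        err = 0.0
        for inputs, teacher in samples:
            out = 1.0
            for Dn, pn in zip(inputs, exps):
                out *= Dn ** pn
            err += (out - teacher) ** 2
        if err < best_err:
            best, best_err = exps, err
    return best, best_err


# Synthetic learning data generated from teacher = D0^2 / D1 (hypothetical relation).
samples = [((D0, D1), D0 ** 2 / D1) for D0 in (1.0, 2.0, 3.0) for D1 in (1.0, 2.0, 4.0)]
best, err = fit_exponents(samples)
```

On this synthetic data the search recovers the exponents (2, -1), which is the relation used to generate the teacher values.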

The arithmetic device according to any one of claims 1 to 5, wherein
at least one of the plurality of input data
is data represented by a complex number.

An integrated circuit constituting the neural network structure used by the arithmetic device according to any one of claims 1 to 6, the integrated circuit comprising:
an input/output unit that constitutes the input layer and the output layer;
a storage unit that stores the learning parameters; and
a computation unit that performs computation for outputting the output value from the output layer based on the plurality of input data input to the input layer and the learning parameters stored in the storage unit.

A machine learning device that generates a learning model having the neural network structure used by the arithmetic device according to any one of claims 1 to 6, the machine learning device comprising:
a learning data storage unit that stores learning data including at least the plurality of input data;
a learning unit that learns the learning parameters by inputting the learning data stored in the learning data storage unit into the learning model; and
a learning parameter storage unit that stores the learning parameters obtained as a result of learning by the learning unit.

A discrimination device that outputs a discrimination result for discrimination data using the learning model generated by the machine learning device according to claim 8, the discrimination device comprising:
a discrimination data acquisition unit that acquires the discrimination data; and
a discrimination processing unit that inputs the discrimination data acquired by the discrimination data acquisition unit into the learning model and outputs the discrimination result based on the output value from the learning model.