JP4472506B2

JP4472506B2 - Information processing system, information processing method, and program

Info

Publication number: JP4472506B2
Application number: JP2004363742A
Authority: JP
Inventors: 重樹菅野; 天海金; 哲也尾形
Original assignee: Waseda University
Current assignee: Waseda University
Priority date: 2004-12-15
Filing date: 2004-12-15
Publication date: 2010-06-02
Anticipated expiration: 2024-12-15
Also published as: JP2006172141A; WO2006073025A1

Description

The present invention relates to an information processing system, an information processing method, and a program that use a network whose constituent elements are a plurality of nodes that perform information processing and links that connect these nodes and transmit information between them. It can be used, for example, for robot motion control, motion control of game characters on a display screen, and air-conditioning management.

In current machine control and information processing, including the development of intelligent robots, creating learners for autonomous control is a major challenge. The requirements for such a learner include (1) autonomous exploration of diverse outputs, (2) applicability to arbitrary tasks, (3) low computational cost, (4) learning that reuses existing knowledge, and (5) the ability to handle time series. No learner that satisfies all of these conditions has yet been achieved.

In general, a representative way of creating a learner for autonomous control is the following method, used in the field of reinforcement learning, which relies on a reinforcement signal. An input from the outside world is given to the learner, and the outside world returns a reinforcement signal (a reward if positive, a punishment if negative) to the learner as an evaluation of the output generated at that time, thereby improving the learner's behavior. Among the various learners created in this way, a learner created by a learning method called neurogenetic learning is known to satisfy conditions (1), (2), and (5) above simultaneously. A learner based on neurogenetic learning is built as a neural network modeled on neural circuits. The network is constructed from virtual genes, and these genes are selected according to the reinforcement signal, which drives the evolution of the network and strengthens its input/output processing performance.

There is also an autonomous evolution system that has a reconfigurable circuit, evaluates the fitness of this circuit to its environment, and changes and evolves the circuit configuration based on the evaluation result, thereby autonomously changing its hardware configuration in response to changes in the environment (see Patent Document 1).

Furthermore, there are signal processing devices that use a neural network learning method for optimizing the coupling coefficients between neuron units (see Patent Documents 2 and 3).

Japanese Patent Laid-Open No. 10-307805 (Claim 1, FIG. 1, Abstract)
Japanese Patent Laid-Open No. 5-73705 (Claim 1, FIG. 1, Abstract)
Japanese Patent Laid-Open No. 4-336656 (Claim 1, FIG. 1, Abstract)

However, the learner based on neurogenetic learning described above evaluates the network as a whole in order to drive its evolution. The evaluation therefore takes an enormous amount of time and the computational cost rises accordingly. In addition, it is unclear whether, when the environment or task changes, learning that reuses previous learning results as existing knowledge actually takes place.

In the autonomous evolution system described in Patent Document 1, the evolution method evaluates the entire network and then selects and generates entire networks. In other words, a change to the circuit configuration based on the evaluation result can be regarded as replacing the whole circuit configuration with a different one. Even if the change ends up affecting only part of the circuit configuration, it is not a change based on evaluating that part but a change based on evaluating the whole circuit configuration. The evaluation period is therefore long. In this respect the system differs from the present invention, in which, as described later, evaluation, generation, and selection are performed per constituent element of the network rather than on the whole network, so that the evaluation period is very short.

The signal processing devices using the neural network learning method described in Patent Documents 2 and 3 optimize the coupling coefficients between neuron units. With this kind of optimization, the person building the network usually fixes its structure in advance, based on a priori knowledge of the environment and task in which the network will be used, and the optimization is carried out within that fixed structure. That is, the coupling coefficients are optimized without changing the network structure. The resulting learner performs well in a specific environment and task but is difficult to use in arbitrary environments and tasks. In this respect these devices differ from the present invention, which does not merely optimize coupling coefficients within a fixed network structure but autonomously changes and optimizes the network structure itself.

An object of the present invention is to provide an information processing system, an information processing method, and a program capable of performing effective autonomous control in a short time.

The present invention is an information processing system using a network whose constituent elements are a plurality of nodes that perform information processing and links that connect these nodes and transmit information between them, the system comprising: network structure storage means for storing the structure of the network, including the coupling relationships between constituent elements; input/output state storage means for storing the input/output states of the constituent elements formed in the output generation process of the network; reinforcement signal generation means for generating a reinforcement signal to be given to the network as a reward or punishment according to the result of evaluating the state of a control target formed on the basis of the output of the network; learning means that gives the reinforcement signal generated by the reinforcement signal generation means to at least one constituent element and, in order to propagate the reinforcement signal from that constituent element to other constituent elements along the chained coupling relationships between constituent elements, successively generates, based on the reinforcement signal given to the propagation-source constituent element and according to the input/output state of the propagation-source and/or propagation-destination constituent element stored in the input/output state storage means, a reinforcement signal to be given to the propagation-destination constituent element as a reward or punishment, and that also changes the structure of the network by generating or deleting constituent elements, element by element, using the reinforcement signal given to each constituent element or its history, or the cumulative value of the reinforcement signal or its history, and stores the changed network structure in the network structure storage means; output generation means for generating the output of the network by referring to the network structure stored in the network structure storage means and using the network whose structure has been changed by the learning means; and reinforcement signal storage means for storing, for each constituent element, the reinforcement signal generated by the learning means or its history, or the cumulative value of the reinforcement signal or its history.

Here, the "control target" is, for example, a robot (which may be a physical robot or a virtual robot, such as one displayed on a display screen or by holography), a game character displayed on a display screen, or the environment of a space subject to air-conditioning management. The same applies to the inventions described below.

The "state of the control target" is, for example, the state of a robot (the result of its behavior) brought about by robot motion based on the output of the network; the state of a game character brought about by character motion based on the output of the network (in a fighting game, for example, the damage the character has received, the damage it has dealt to an enemy, or the outcome of the match); or the state of the environment of the target space (comfort, safety, and so on) brought about by air-conditioning management based on the output of the network. The same applies to the inventions described below.

The "input/output state storage means" does not necessarily have to store both the input and the output of every constituent element. For example, it may store only the output of each constituent element, so that the input and output of each element can be determined by referring to the network structure. The "input/output states of the constituent elements" stored in the input/output state storage means may include not only the current state (the latest step) but also past states (earlier steps). Accordingly, when the learning means generates a reinforcement signal "according to the input/output state of the constituent element", it may refer to past input/output states as well as the current one (either a single past point in time or a history covering several points in time). The same applies to the inventions described below.

The "reinforcement signal" in the phrase "successively ... based on the reinforcement signal given to the propagation-source constituent element" may include not only the current reinforcement signal (the latest step) but also past reinforcement signals (earlier steps). Accordingly, when the learning means generates the reinforcement signal to be given to the propagation-destination constituent element, it may refer not only to the current reinforcement signal given to the propagation-source element but also to past reinforcement signals (either a single past point in time or a history covering several points in time), and may generate the signal from the result of a calculation that uses them.

Generating or deleting constituent elements "using the reinforcement signal given to each constituent element or its history, or the cumulative value of the reinforcement signal or its history" covers, for example, the following cases: the value of the reinforcement signal or of its cumulative value is used as is to decide on generation or deletion; the decision uses a value obtained by applying some calculation to the history of reinforcement signals (for example, the simple sum, the simple average, a weighted sum, a weighted average, or the variance or standard deviation of the reinforcement signals, whether the calculation is linear or nonlinear); or the decision uses a value obtained by applying some calculation to the history of cumulative values (for example, the rate of change or the variance or standard deviation of the cumulative values, whether the calculation is linear or nonlinear). The same applies to the inventions described below.
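The history-based quantities listed above can be illustrated with a minimal Python sketch. The function names `history_statistics` and `cumulative_change_rates`, the dictionary keys, and the choice of population variance are assumptions made for illustration; the description does not prescribe a particular formula, and the weights are assumed to have a nonzero sum.

```python
from statistics import fmean, pstdev, pvariance


def history_statistics(signals, weights=None):
    """Summarize one element's reinforcement-signal history (hypothetical helper)."""
    if weights is None:
        weights = [1.0] * len(signals)
    weighted_sum = sum(w * s for w, s in zip(weights, signals))
    return {
        "sum": sum(signals),                              # simple sum
        "mean": fmean(signals),                           # simple average
        "weighted_sum": weighted_sum,                     # weighted sum
        "weighted_mean": weighted_sum / sum(weights),     # weighted average
        "variance": pvariance(signals),                   # population variance
        "std": pstdev(signals),                           # population standard deviation
    }


def cumulative_change_rates(cumulative_values):
    """Step-to-step change of the cumulative value (one possible 'rate of change')."""
    return [b - a for a, b in zip(cumulative_values, cumulative_values[1:])]
```

Any of these values could then be compared with a threshold to decide whether an element is generated or deleted; which statistic to use is a design choice left open by the text.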

The information processing performed by a "node" is normally a process that obtains one output from a plurality of inputs. A special node, such as a node located at the edge of the network, may instead be a dummy node that, for example, obtains one output from a single input or obtains one output with no input at all. The same applies to the inventions described below.

In the information processing system of the present invention, a reinforcement signal to be given to the network is generated according to the result of evaluating the state of the control target, and this reinforcement signal is then propagated from one constituent element of the network to another. The reinforcement signal being propagated, that is, the reinforcement signal given to the propagation-destination constituent element, is generated according to the input/output state of the propagation-source and/or propagation-destination constituent element. Using the reinforcement signal given individually to each constituent element in this way, or its history, or the cumulative value of the reinforcement signal or its history, the system decides for each constituent element whether to generate (add) or delete (cull) constituent elements, carries out that decision, and thereby changes the network structure autonomously.

Therefore, unlike the conventional learner based on neurogenetic learning described above, the network structure is changed not by evaluating the network as a whole but by evaluating it element by element and generating or deleting elements individually. The time required for evaluation is short, the network can be built autonomously on a short time scale, and the computational cost is reduced accordingly.

Moreover, unlike the neural network learning methods described in Patent Documents 2 and 3, in which the network structure is fixed according to the environment and task and the coupling coefficients between neuron units are optimized within that fixed structure, the present invention autonomously changes and optimizes the network structure itself. This avoids the restriction to a particular environment or task that comes from fixing the structure. As a result, even when the environment or task in which the network is used changes, it becomes more likely that learning can reuse previous learning results as existing knowledge. The above object is thereby achieved.

The information processing system described above may further include state evaluation signal acquisition means for acquiring a state evaluation signal, used to evaluate the state of the control target, from state detection means that detects the state of the control target or from the control target itself. In this configuration the reinforcement signal generation means evaluates the state of the control target based on the state evaluation signal acquired by the state evaluation signal acquisition means and generates the reinforcement signal according to the evaluation result.

Here, the "state detection means" is, for example, any of various sensors that detect position, velocity, acceleration, distance, rotation angle, angular velocity, angular acceleration, temperature, humidity, pressure, odor, light, sound, vibration, touch, and the like.

When the state of the control target is evaluated from the state evaluation signal acquired by the state evaluation signal acquisition means, the evaluation can be made without human judgment. This makes it possible to speed up the autonomous construction of the network and makes it easy to carry out learning that is consistent with the intended purpose.

The information processing system described above may further include evaluation result input receiving means for receiving a user's input of the result of evaluating the state of the control target. In this configuration the reinforcement signal generation means generates the reinforcement signal according to the evaluation result received by the evaluation result input receiving means.

With the evaluation result input receiving means, a reinforcement signal can be generated according to the user's evaluation and propagated from one constituent element to another. The autonomous construction of the network can thus be steered so that the control target is controlled in line with the user's intention.

The "user" may be one person or several people. When several users use or refer to the same control target, it is preferable to accept evaluation results from all of them (evaluations of different states, or of the same state, of the same control target). For example, when the control target is a search engine on a network, the system can receive each user's evaluation (whether the search returned what the user wanted) from a plurality of user terminal devices connected to the network and change the search engine's search algorithm and the like accordingly.

In the information processing system described above, the learning means preferably gives the reinforcement signal generated by the reinforcement signal generation means equally to all output nodes forming the output layer of the network. It preferably also treats a node as the propagation-source constituent element and an input-side link of that node as the propagation-destination constituent element, and generates the reinforcement signal to be given to the input-side link as a reward or punishment based on the reinforcement signal given to the source node and according to the degree to which that input-side link contributes to the node output, as determined by the input/output state of the source node.

When the reinforcement signal is propagated from a node to its input-side links in this way, the reinforcement signal given to the network can be propagated backward from the output nodes. Because the reinforcement signal given to each input-side link is generated according to that link's contribution to the node output, each link can be evaluated individually and appropriately, which makes it possible to generate or delete constituent elements one by one.
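A minimal Python sketch of this backward propagation follows. The description only says that the share depends on each link's contribution to the node output, so the concrete rule used here (share proportional to the magnitude of the value each link delivered to the node) is an assumption, as are the function names `distribute_to_output_nodes` and `propagate_to_input_links`. Giving every output node the same signal value is one reading of "equally"; splitting the signal evenly would be another.

```python
def distribute_to_output_nodes(signal, output_nodes):
    """Give the same reinforcement signal to every output node (assumed reading)."""
    return {node: signal for node in output_nodes}


def propagate_to_input_links(node_signal, link_inputs):
    """Split a node's reinforcement signal among its input-side links.

    link_inputs maps a link id to the value that link delivered to the node
    at the evaluated step. The contribution measure |value| / sum(|values|)
    is an illustrative assumption, not a formula taken from the description.
    """
    total = sum(abs(value) for value in link_inputs.values())
    if total == 0.0:
        # No link contributed to the output, so nothing is propagated.
        return {link: 0.0 for link in link_inputs}
    return {
        link: node_signal * abs(value) / total
        for link, value in link_inputs.items()
    }
```

For example, a punishment of -1.0 on a node whose two input links delivered 3.0 and -1.0 would be split as -0.75 and -0.25 under this assumed rule.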

In the configuration in which the reinforcement signal is propagated from a node to its input-side links, the learning means may also treat a node as the propagation-source constituent element and the input-side node connected to the input end of one of its input-side links as the propagation-destination constituent element. It then generates the reinforcement signal to be given to that input-side node as a reward or punishment based on the reinforcement signal given to the source node and according to the contribution of the input-side link to the node output, as determined by the input/output state of the source node.

When the reinforcement signal is also propagated from a node to the input-side nodes of its input-side links in this way, backward propagation from the node to its input-side nodes takes place together with backward propagation from the node to its input-side links, so the reinforcement signal can be propagated backward even more smoothly.

The reinforcement signal need not be propagated directly from the source node to the destination node as described above. It may instead be propagated via the link connecting the two nodes: the reinforcement signal is first stored in the link and then passed on to the destination node.

In the configuration in which the reinforcement signal is propagated from a node to its input-side links, the reinforcement signal storage means preferably stores, for each link, the history or the cumulative value of the reinforcement signals given to that link, and the learning means preferably deletes a link when the cumulative value of the reinforcement signals given to it falls below a threshold.

Here, the learning means may obtain the cumulative value of the reinforcement signal, which it needs in order to decide whether the value has fallen below the threshold, either by summing the history of reinforcement signals stored in the reinforcement signal storage means or by reading the cumulative value stored there. The same applies to the inventions described below.

When a link is deleted once the cumulative value of the reinforcement signals given to it falls below the threshold, links that are unlikely to help control the control target as intended, that is, links considered unnecessary, can be culled, and the network structure can be changed autonomously.
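The deletion rule can be sketched in Python as follows. The `Link` class, the function `apply_signals_and_prune`, and the choice to store only the cumulative value (rather than the full history) are assumptions for illustration; the threshold value itself is not given in the description.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A link with the running total of reinforcement signals it has received."""
    source: str
    target: str
    cumulative: float = 0.0


def apply_signals_and_prune(links, signals, threshold):
    """Accumulate this step's signals, then delete links below the threshold.

    links:   dict mapping link id -> Link (modified in place)
    signals: dict mapping link id -> reinforcement signal given at this step
    Returns the ids of the deleted links.
    """
    for link_id, signal in signals.items():
        if link_id in links:
            links[link_id].cumulative += signal
    removed = [lid for lid, link in links.items() if link.cumulative < threshold]
    for lid in removed:
        del links[lid]
    return removed
```

Calling the function once per learning step keeps each link's cumulative value current and removes a link as soon as repeated punishments push it under the threshold.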

In the configuration in which a link is deleted when the cumulative value of the reinforcement signals given to it falls below the threshold, the learning means preferably deletes a node when the number of its input-side links becomes one or less.

When a node is deleted once the number of its input-side links becomes one or less, nodes that are unlikely to help control the control target as intended, that is, nodes considered unnecessary, can be culled, and the network structure can be changed autonomously.
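A minimal Python sketch of this node-culling rule follows. The function `prune_nodes` is hypothetical. Two points are assumptions not stated in the description: nodes listed in `protected` (for example, input nodes at the edge of the network, which may legitimately have one or no inputs) are never deleted, and deleting a node also removes the links attached to it, with the check repeated until no further node qualifies.

```python
def prune_nodes(nodes, links, protected=frozenset()):
    """Delete nodes that have one or fewer input-side links.

    nodes:     iterable of node ids
    links:     iterable of (source, target) pairs between those nodes
    protected: node ids that are never deleted (assumption, e.g. input nodes)
    Returns (remaining_nodes, remaining_links, removed_node_ids).
    """
    nodes = set(nodes)
    links = list(links)
    removed = []
    while True:
        # Count the input-side links of every remaining node.
        in_degree = {node: 0 for node in nodes}
        for _, target in links:
            in_degree[target] += 1
        doomed = sorted(
            node for node in nodes
            if node not in protected and in_degree[node] <= 1
        )
        if not doomed:
            return nodes, links, removed
        removed.extend(doomed)
        nodes -= set(doomed)
        # Links attached to a deleted node are dropped as well (assumption).
        links = [(s, t) for s, t in links if s in nodes and t in nodes]
```

Because dropping a node's links can reduce the input count of downstream nodes, a real system would probably also protect output nodes; that choice is left to the caller here.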

In the configuration in which the reinforcement signal is propagated from a node to its input-side links, the input side of the propagation-source node is preferably provided, in addition to the destination input-side links, with a test link that does not contribute to the node output. The reinforcement signal storage means then also stores the history or the cumulative value of the reinforcement signals given to the test link, and the learning means registers the test link in the network structure storage means as an input-side link of the source node when the cumulative value of the reinforcement signals given to the test link exceeds a threshold.

With test links provided in this way, a test link that appears useful for controlling the control target as intended can be promoted to a real link that contributes to the node output and formally registered as an input-side link. Links are thus generated autonomously, and the network structure can be changed autonomously.
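The promotion step can be sketched in Python as follows. The function `promote_test_links` and the representation of links as `(source, target)` pairs are assumptions for illustration; the threshold value is likewise left to the caller.

```python
def promote_test_links(test_links, real_links, promote_threshold):
    """Promote test links whose cumulative signal exceeds the threshold.

    test_links: dict mapping (source, target) -> cumulative reinforcement signal
    real_links: set of (source, target) pairs that contribute to node outputs
    Both containers are modified in place; the promoted pairs are returned.
    """
    promoted = [
        key for key, cumulative in test_links.items()
        if cumulative > promote_threshold
    ]
    for key in promoted:
        real_links.add(key)    # registered as a real input-side link
        del test_links[key]    # no longer a candidate
    return promoted
```

Until it is promoted, a test link only accumulates reinforcement signals and has no effect on the node output, so a poor candidate cannot degrade the network's behavior.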

In the configuration with test links, the learning means preferably deletes a test link when the cumulative value of the reinforcement signals given to it falls below a threshold, generates a new test link connected to an arbitrary node, and registers the new test link in the network structure storage means.

When a test link whose cumulative reinforcement signal has fallen below the threshold is deleted and a new test link is generated in this way, test links that are suitable candidates for newly generated links (real links) can be kept available. This realizes appropriate and smooth link generation and allows the network structure to change autonomously.
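The promotion and replacement of test links described above can be sketched as follows. This is only an illustrative sketch: the names `Link`, `update_test_link`, `PROMOTE_THRESHOLD`, and `DELETE_THRESHOLD`, the numeric threshold values, and the use of two separate thresholds are assumptions made for the example and are not taken from the specification.

```python
import random
from dataclasses import dataclass

# Assumed values; the text only speaks of "a threshold".
PROMOTE_THRESHOLD = 5.0
DELETE_THRESHOLD = -5.0


@dataclass
class Link:
    source: int              # id of the node feeding this link
    target: int              # id of the node this link feeds
    is_test: bool = True     # test links do not contribute to the node output
    cumulative: float = 0.0  # cumulative reinforcement signal


def update_test_link(link, signal, node_ids, choose=random.choice):
    """Accumulate a reinforcement signal, then promote or replace the test link."""
    link.cumulative += signal
    if link.cumulative > PROMOTE_THRESHOLD:
        link.is_test = False  # promoted to a real input-side link
        return link
    if link.cumulative < DELETE_THRESHOLD:
        # Delete the link and create a fresh test link from an arbitrary node.
        return Link(source=choose(node_ids), target=link.target)
    return link
```

The function returns the link that should be registered after the update: the same object (possibly promoted) or a newly generated test link with a zero cumulative value.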

In the information processing system described above, it is desirable that a link be provided with an accompanying test node that does not contribute to the output of the network; that this test node be connected to the input-side node of the link by a first input-side test link, to the output-side node of the link by an output-side test link, and to an arbitrary node by a second input-side test link; and that the learning means, taking the link as the source constituent element and the test node as the destination constituent element, generate the reinforcement signal to be given to the destination test node as a reward or punishment, based on the reinforcement signal given to the source link and according to the states of the output of the source link and the output of the destination test node.

When a test node is provided in association with a link in this way, candidates for newly generated nodes (real nodes) can be kept available, which allows the network structure to change autonomously.

When a test node is provided in association with a link as described above, it is also desirable that the learning means, taking the test node as the source constituent element and the first and second input-side test links of the test node as the destination constituent elements, generate the reinforcement signals to be given to the first and second input-side test links as rewards or punishments, based on the reinforcement signal given to the source test node and according to the degree to which each of these test links contributes to the test node output, as determined by the input/output state of the source test node.

When reinforcement signals are propagated from the test node to the first and second input-side test links in this way, candidates for newly generated links (real links) can be kept available, which allows the network structure to change autonomously.

Furthermore, when reinforcement signals are propagated from the test node to the first and second input-side test links as described above, it is desirable that the reinforcement signal storage means also store, for each link, the history or the cumulative value of the reinforcement signals given to the first and second input-side test links, and that the learning means, when the cumulative value of the reinforcement signals given to the first or second input-side test link falls below a threshold, delete the input-side test link that fell below the threshold, generate a new input-side test link connected to an arbitrary node, and register it in the network structure storage means.

When an input-side test link whose cumulative reinforcement signal has fallen below the threshold is deleted and a new input-side test link is generated in this way, test links that are suitable candidates for newly generated links (real links) can be kept available. This realizes appropriate and smooth link generation and allows the network structure to change autonomously.

It is preferable to give the first input-side test link a sufficiently large reward when it is generated so that it is not deleted. In that case, in practice only the second input-side test link is subject to deletion.

When reinforcement signals are propagated from the test node to the first and second input-side test links as described above, it is also desirable that the reinforcement signal storage means store, for each link, the history or the cumulative value of the reinforcement signals given to the first and second input-side test links, and that the learning means, when the cumulative values of the reinforcement signals given to the first and second input-side test links both exceed a threshold, put the test node into practical use by promoting it to a real node that contributes to the output of the network and registering it in the network structure storage means.

When the test node is put into practical use once the cumulative values of the reinforcement signals given to the first and second input-side test links both exceed the threshold in this way, a new node (real node) can be generated (added), which allows the network structure to change autonomously.
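The life cycle of a test node and its two input-side test links can be sketched as follows. This is a rough sketch under stated assumptions: the names `CandidateLink`, `CandidateNode`, `make_test_node`, and `review_test_node`, the threshold values, and the size of the protective initial reward are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumed values; the text only speaks of "a threshold" and of
# "a sufficiently large reward" for the first input-side test link.
PROMOTE_THRESHOLD = 5.0
DELETE_THRESHOLD = -5.0
PROTECTIVE_REWARD = 1_000_000.0


@dataclass
class CandidateLink:
    source: int              # id of the node feeding this test link
    cumulative: float = 0.0  # cumulative reinforcement signal


@dataclass
class CandidateNode:
    first_input: CandidateLink   # from the input-side node of the host link
    second_input: CandidateLink  # from an arbitrary node
    is_real: bool = False        # True once promoted to a real node


def make_test_node(link_input_node, arbitrary_node):
    """Create a test node whose first input-side test link is protected."""
    return CandidateNode(
        first_input=CandidateLink(link_input_node, PROTECTIVE_REWARD),
        second_input=CandidateLink(arbitrary_node),
    )


def review_test_node(node, replacement_source):
    """Replace a failed input-side test link, or promote the test node."""
    for name in ("first_input", "second_input"):
        if getattr(node, name).cumulative < DELETE_THRESHOLD:
            setattr(node, name, CandidateLink(replacement_source))
            return "replaced " + name
    lowest = min(node.first_input.cumulative, node.second_input.cumulative)
    if lowest > PROMOTE_THRESHOLD:
        node.is_real = True
        return "promoted"
    return "unchanged"
```

Because the first input-side test link starts with a very large cumulative value, in this sketch only the second input-side test link decides whether the node is promoted or one of its links is replaced.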

In the information processing system described above, it is desirable that each node be configured to perform information processing using at least one logic circuit.

Here, examples of the "logic circuit" include an AND (logical product) circuit, an OR (logical sum) circuit, an XOR (exclusive OR) circuit, a NOT (negation) circuit, a NAND (NOT AND) circuit, a NOR (NOT OR) circuit, and an XNOR (exclusive NOR) circuit.

When the nodes are configured using logic circuits in this way, an information processing system capable of realizing the intended control can be built with a simple structure.

The following information processing method of the present invention is an information processing method realized by the information processing system of the present invention described above.

That is, the present invention is an information processing method using a network that includes, as constituent elements, a plurality of nodes that perform information processing and links that connect these nodes and transmit information between them. In this method, the structure of the network, including the connection relationships between constituent elements, is stored in network structure storage means, and the input/output states of the constituent elements formed in the output generation processing of the network are stored in input/output state storage means. Reinforcement signal generation means generates a reinforcement signal to be given to the network as a reward or punishment according to the result of evaluating the state of the control target formed on the basis of the output of the network. Learning means gives the reinforcement signal generated by the reinforcement signal generation means to at least one constituent element and, in order to propagate the reinforcement signal from that constituent element to other constituent elements along the chain of connection relationships between constituent elements, successively generates the reinforcement signal to be given to each destination constituent element as a reward or punishment, based on the reinforcement signal given to the source constituent element and according to the input/output states of the source and/or destination constituent elements stored in the input/output state storage means. The learning means stores the generated reinforcement signal of each constituent element, or its cumulative value, in reinforcement signal storage means for each constituent element; generates or deletes constituent elements, element by element, using the reinforcement signal given to each constituent element or its history, or the cumulative value of the reinforcement signal or its history, thereby changing the structure of the network; and stores the changed network structure in the network structure storage means. Output generation means refers to the network structure stored in the network structure storage means and generates the output of the network using the network whose structure has been changed by the learning means.

Here, "storing the generated reinforcement signal of each constituent element, or its cumulative value, in the reinforcement signal storage means for each constituent element" covers both overwriting the stored reinforcement signal or cumulative value and appending the new reinforcement signal or cumulative value while retaining past reinforcement signals or past cumulative values as a history.
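The two storage modes mentioned here, overwriting and appending with history, can be illustrated with a small sketch. The class name `ReinforcementStore` and its methods are hypothetical and are not part of the specification; the sketch stores cumulative values only.

```python
class ReinforcementStore:
    """Per-element store of cumulative reinforcement values.

    keep_history=False overwrites the stored cumulative value;
    keep_history=True appends each new cumulative value and keeps the past ones.
    """

    def __init__(self, keep_history=False):
        self.keep_history = keep_history
        self._records = {}  # element id -> list of cumulative values

    def add(self, element_id, signal):
        """Add a reinforcement signal and return the new cumulative value."""
        records = self._records.setdefault(element_id, [0.0])
        total = records[-1] + signal
        if self.keep_history:
            records.append(total)  # keep past cumulative values as a history
        else:
            records[-1] = total    # overwrite the stored cumulative value
        return total

    def cumulative(self, element_id):
        return self._records.get(element_id, [0.0])[-1]

    def history(self, element_id):
        return list(self._records.get(element_id, [0.0]))
```

Either mode yields the same current cumulative value; the history mode additionally allows decisions based on how that value developed over time.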

This information processing method of the present invention provides the same operations and effects as the information processing system of the present invention described above, and thereby achieves the stated object.

The present invention is also a program for causing a computer to function as an information processing system using a network that includes, as constituent elements, a plurality of nodes that perform information processing and links that connect these nodes and transmit information between them. The information processing system comprises: network structure storage means for storing the structure of the network, including the connection relationships between constituent elements; input/output state storage means for storing the input/output states of the constituent elements formed in the output generation processing of the network; reinforcement signal generation means for generating a reinforcement signal to be given to the network as a reward or punishment according to the result of evaluating the state of the control target formed on the basis of the output of the network; learning means; output generation means for referring to the network structure stored in the network structure storage means and generating the output of the network using the network whose structure has been changed by the learning means; and reinforcement signal storage means for storing, for each constituent element, the reinforcement signal of the constituent element generated by the learning means or its history, or the cumulative value of the reinforcement signal or its history. The learning means gives the reinforcement signal generated by the reinforcement signal generation means to at least one constituent element and, in order to propagate the reinforcement signal from that constituent element to other constituent elements along the chain of connection relationships between constituent elements, successively generates the reinforcement signal to be given to each destination constituent element as a reward or punishment, based on the reinforcement signal given to the source constituent element and according to the input/output states of the source and/or destination constituent elements stored in the input/output state storage means. The learning means also generates or deletes constituent elements, element by element, using the reinforcement signal given to each constituent element or its history, or the cumulative value of the reinforcement signal or its history, thereby changing the structure of the network, and stores the changed network structure in the network structure storage means.

The above program, or a part of it, can be recorded on a recording medium for storage and distribution, for example a magneto-optical disk (MO), a compact-disc read-only memory (CD-ROM), a CD recordable (CD-R), a CD rewritable (CD-RW), a digital versatile disc read-only memory (DVD-ROM), a DVD random access memory (DVD-RAM), a flexible disk (FD), magnetic tape, a hard disk, a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a flash memory, or a random access memory (RAM). It can also be transmitted over a transmission medium, for example a wired network such as a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), the Internet, an intranet, or an extranet, a wireless communication network, or a combination of these, and it can be carried on a carrier wave. Furthermore, the above program may be a part of another program, or may be recorded on a recording medium together with a separate program.

As described above, according to the present invention, a reinforcement signal to be given to the network is generated according to the result of evaluating the state of the control target, this reinforcement signal is propagated from one constituent element of the network to other constituent elements, and each constituent element is evaluated, generated, or deleted using the reinforcement signal given individually to it or its history, or the cumulative value of the reinforcement signal or its history, so that the structure of the network changes autonomously. Compared with the conventional approach of evaluating the network as a whole, this shortens the time required for evaluation and allows the network to be constructed autonomously in a low order of time.

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the overall configuration of the information processing system 10 of this embodiment. FIG. 2 shows the structure of the data used in processing by the information processing system 10. FIG. 3 shows the overall flow of motion control of the robot 30, FIG. 4 shows the flow of processing of the network 20, FIG. 5 shows the flow of learning processing for an intermediate OR node (real node), and FIG. 6 shows the flow of learning processing for a non-inverting link. FIG. 7 is an explanatory diagram of the learning processing for an intermediate OR node, FIG. 8 shows an example of how reinforcement signals are distributed when an intermediate OR node learns, FIG. 9 shows an example of how reinforcement signals are distributed when an intermediate AND node learns, and FIG. 10 is an explanatory diagram of the learning processing for a non-inverting link (real link). FIG. 11 shows the structure of the initialization processing, and FIG. 12 shows the structure of the deletion processing during learning. FIG. 13 is an explanatory diagram of output node initialization processing G4, FIG. 14 of intermediate OR node initialization processing G5, FIG. 15 of test intermediate OR node initialization processing G7, and FIGS. 16 to 18 of intermediate OR node deletion processing E1.

In FIG. 1, the information processing system 10 controls a control target (in this embodiment, a robot 30 as an example) using a network 20, and consists of one or more computers. The network 20 is an information processing network built inside the computer. It comprises a plurality of input nodes 21, a plurality of intermediate nodes 22, and a plurality of output nodes 23, which are arranged in an input layer, an intermediate layer, and an output layer respectively and each perform information processing individually, and links 24 that connect these nodes 21, 22, and 23 and transmit information between them.

Each of the nodes 21, 22, and 23 and the links 24 is a self-organizing network element (SONE) that serves as a building block of the learner. Self-organizing network elements are circuit elements that can construct the network 20 autonomously because each element is given a death condition, a function for generating new elements, a function for generating and propagating reinforcement signals, and so on.

In this embodiment, the control target is described, as an example, as a robot 30 of the type known as a Khepera robot. However, the control target of the information processing system of the present invention is not limited to the Khepera robot, nor to robots in general.

As shown in FIG. 1, the robot 30 comprises a right wheel 31 and a motor 32 that drives it, a left wheel 33 and a motor 34 that drives it, and infrared sensors 35, six of which are provided at the front in the direction of travel and two at the rear. The robot 30 moves forward while avoiding collisions with a wall 36. The eight infrared sensors 35 are provided to detect the distance D between the robot 30 and the wall 36.

Each node functions as an information processing device and, in this embodiment, is implemented as a logic circuit (an AND circuit or an OR circuit). There are six kinds of node in total: the input node 21, four kinds of intermediate node 22 (intermediate AND node, intermediate OR node, test intermediate AND node, and test intermediate OR node), and the output node 23. A node is basically a logic circuit that produces one output from a plurality of inputs, but the input node 21 is a dummy node that only produces an output. Although this embodiment uses AND circuits and OR circuits, other kinds of logic circuit such as XOR circuits may be used, or a plurality of logic circuits may be combined into one node.

The input nodes 21 are provided in correspondence with the eight infrared sensors 35. The sensor signal of one infrared sensor 35 is 16 bits, so eight sensors give 16 × 8 = 128 bits. With one input node 21 assigned to each bit, the number of input nodes 21 in this embodiment is 128.

The output nodes 23 are provided in correspondence with the two motors 32 and 34. One motor output signal (rotation speed) is 16 bits, so the left and right motors together give 16 × 2 = 32 bits. With one output node 23 assigned to each bit, the number of output nodes 23 in this embodiment is 32. In this embodiment all output nodes 23 are OR nodes, but AND nodes may be mixed in.
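The bit-level mapping between signals and nodes can be illustrated as follows. This is a sketch under assumptions the specification does not state: the values are treated as unsigned 16-bit integers, bits are ordered least significant first, and the right motor is placed before the left motor. The function names are hypothetical.

```python
BITS = 16  # width of one sensor signal or one motor output signal


def to_bits(value):
    """Split an unsigned 16-bit value into 16 bits, least significant first."""
    return [(value >> i) & 1 for i in range(BITS)]


def from_bits(bits):
    """Rebuild an unsigned integer from bits ordered least significant first."""
    return sum(bit << i for i, bit in enumerate(bits))


def encode_sensors(readings):
    """Map 8 sensor readings onto the values of 8 x 16 = 128 input nodes."""
    return [bit for reading in readings for bit in to_bits(reading)]


def decode_motors(output_bits):
    """Map the values of 2 x 16 = 32 output nodes onto two motor speeds."""
    right = from_bits(output_bits[:BITS])
    left = from_bits(output_bits[BITS:2 * BITS])
    return right, left
```

With eight readings, `encode_sensors` returns a list of 128 node values, and `decode_motors` consumes the 32 output node values produced by the network.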

The numbers of input nodes 21 and output nodes 23 are fixed, but the number of intermediate nodes 22 varies because the structure of the network 20 changes autonomously.

In this embodiment there are four kinds of link in total: the inverting link (a link whose output is the inverse of its input), the non-inverting link, the test inverting link, and the test non-inverting link.
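The node and link kinds described above can be modeled roughly as follows. This is a minimal sketch, not the embodiment's data structure: the class names, the `kind` strings, and the choice to simply skip test links when computing a node output are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    source: "Node"
    inverted: bool = False  # inverting link: output is the inverse of the input
    is_test: bool = False   # test links do not contribute to the node output

    def output(self):
        value = self.source.output()
        return (not value) if self.inverted else value


@dataclass
class Node:
    kind: str               # "INPUT", "AND", or "OR"
    value: bool = False     # only used by input (dummy) nodes
    inputs: list = field(default_factory=list)

    def output(self):
        if self.kind == "INPUT":
            return self.value  # an input node only produces an output
        signals = [link.output() for link in self.inputs if not link.is_test]
        # Note: an AND node with no real inputs evaluates to True here.
        return all(signals) if self.kind == "AND" else any(signals)
```

For example, an OR node fed by one true and one false input node outputs true, and an AND node fed by a true input through a non-inverting link and a false input through an inverting link also outputs true.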

In FIG. 1, the information processing system 10 comprises sensor signal acquisition means 41, state evaluation signal acquisition means 42, reinforcement signal generation means 43, motor output signal transmission means 44, network processing means 50, robot information storage means 60, network information storage means 61, node information storage means 62, and link information storage means 63.

The sensor signal acquisition means 41 acquires the sensor signals output from the eight infrared sensors 35 and writes them to the robot information storage means 60.

The state evaluation signal acquisition means 42 acquires state evaluation signals for evaluating the state (behavior result) of the robot 30, which is the control target. In this embodiment, the state evaluation signals are the sensor signals from the infrared sensors 35, which are acquired by the sensor signal acquisition means 41 and stored in the robot information storage means 60 (see FIG. 2), and the motor output signals (rotation speeds) that the motor output signal transmission means 44 reads from the robot information storage means 60 and transmits to the motors 32 and 34 of the robot 30. The infrared sensors 35 therefore function as state detection means that detect the state of the robot 30. In this embodiment the motor output signals are read from the robot information storage means 60, but because the motor output signals stored there are transmitted to the robot 30 unchanged, they can also be regarded as being acquired from the robot 30. Instead of the motor output signals sent to the motors 32 and 34 as control signals, actual motor output signals (actual rotation speeds) detected by state detection means may be used as state evaluation signals. When the robot 30 is not a physical robot but a virtual robot displayed on a display screen, the motor output signal as a control signal and the actual motor output signal (actual rotation speed) are identical. The state evaluation signal acquisition means 42 also acquires, as a state evaluation signal, the state index value A6 of one step earlier (see FIG. 2) stored in the robot information storage means 60.

The reinforcement signal generation means 43 evaluates, based on the state evaluation signals acquired by the state evaluation signal acquisition means 42, the state (behavior result) of the robot 30 formed on the basis of the output of the network 20, and generates a reinforcement signal to be given to the network 20 as a reward or punishment according to the evaluation result. In doing so, the reinforcement signal generation means 43 determines the relative distance D between the robot 30 and the wall 36 from the sensor signals of the infrared sensors 35, and gives a reward (positive reinforcement signal) when the robot 30 moves away from the wall 36 and a punishment (negative reinforcement signal) when it moves toward the wall 36. It also determines from the motor output signals whether the robot 30 is moving straight ahead, and gives a reward (positive reinforcement signal) when it is and a punishment (negative reinforcement signal) when it is not.

More specifically, when at least one of the sensor signals from the infrared sensors 35 exceeds a threshold (for example, zero), the robot 30 is near the wall 36. In that case the reinforcement signal generation means 43 sums the values of the eight sensor signals, multiplies the sum by −1 and, if necessary, by a further constant, and writes the result into the current state index value A5 of the robot information storage means 60 as the state index value indicating the current state of the robot 30. Alternatively, the same calculation may be applied when the sum of the sensor signals from the infrared sensors 35 exceeds a threshold (for example, zero). Either way, the closer the robot 30 is to the wall 36, the larger the magnitude of the negative value. The state index value of the robot 30 one step earlier (calculated in the same way, stored in the robot information storage means 60, and acquired by the state evaluation signal acquisition means 42) is then subtracted from the current state index value, and this difference between steps is used as the reinforcement signal given to the network 20. Consequently, the reinforcement signal is positive (a reward) if the robot 30 has moved away from the wall 36 and negative (a punishment) if it has moved closer. The current state index value is then stored in the robot information storage means 60 as the one-step-earlier state index value for use in the next step. When none of the sensor signals from the infrared sensors 35 exceeds the threshold (for example, zero), the robot 30 is not near the wall 36, so the reinforcement signal generation means 43 checks whether the left and right motors 32 and 34 have the same rotational speed. If they do, the robot 30 is judged to be traveling straight and a reinforcement signal of +1 (a reward) is given; if they do not, the robot 30 is judged not to be traveling straight and a reinforcement signal of −0.01 (a small punishment) is given.
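The calculation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function name `reinforcement_signal`, the `gain` parameter (standing in for the optional constant), and the choice to recompute the state index on every step, including steps away from the wall, are assumptions made for the example.

```python
from typing import Sequence, Tuple


def reinforcement_signal(
    sensors: Sequence[float],      # eight infrared sensor values (A1)
    motor_rpm: Tuple[float, float],  # left and right rotational speeds (A2)
    prev_index: float,             # state index one step earlier (A6)
    threshold: float = 0.0,
    gain: float = 1.0,             # hypothetical name for the optional constant
) -> Tuple[float, float]:
    """Return (reinforcement signal, state index to keep for the next step)."""
    # State index: -1 times the sensor sum, so nearer the wall is more negative.
    index = -gain * sum(sensors)
    if any(s > threshold for s in sensors):
        # Near the wall: reward moving away, punish moving closer.
        signal = index - prev_index
    elif motor_rpm[0] == motor_rpm[1]:
        # Away from the wall and traveling straight: reward.
        signal = 1.0
    else:
        # Away from the wall but turning: small punishment.
        signal = -0.01
    # Assumption: the index is stored every step, not only near the wall.
    return signal, index
```

For example, with a previous index of −5.0 and a current sensor sum of 2.0, the robot has moved away from the wall and the signal is +3.0.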

The motor output signal transmission means 44 transmits the motor output signals, which were written into the robot information storage means 60 on the basis of the output of the network 20, to the motors 32 and 34 of the robot 30.

The network processing means 50 performs processing using the network 20 and comprises a learning means 51, an input conversion means 52, an output generation means 53, and an output conversion means 54.

The learning means 51 applies the reinforcement signal generated by the reinforcement signal generation means 43 equally to all output nodes 23 and back-propagates it in sequence from the output layer to the intermediate layer and from the intermediate layer to the input layer. That is, it propagates the reinforcement signal to each link 24, each intermediate node 22, and each input node 21 along the chain of connections between constituent elements (between nodes and links, and between nodes). In doing so, the learning means 51 generates the reinforcement signal to be given to a destination constituent element as a reward or punishment from the reinforcement signal given to the source constituent element (node or link), according to the input/output state of the source and/or destination constituent element. The learning means 51 also uses the accumulated value of the reinforcement signals given to each constituent element (node or link) to create or delete constituent elements, element by element, thereby changing the structure of the network 20, and registers the changed structure in the network information storage means 61, the node information storage means 62, and the link information storage means 63 (see FIG. 2), which together function as the network structure storage means. Details of the learning process are described later.

The input conversion means 52 converts the sensor signals stored in the robot information storage means 60 into binary numbers and sets them as the outputs of the input nodes 21.

The output generation means 53 refers to the structure of the network 20 stored in the network information storage means 61, the node information storage means 62, and the link information storage means 63 (see FIG. 2), which function as the network structure storage means, and generates the output of the network 20 using the network 20 whose structure has been changed by the learning means 51. The output generation means 53 realizes, by executing a program, the function (output generation function) of the individual logic circuits that constitute each intermediate node 22 and each output node 23.

The output conversion means 54 converts the outputs (binary numbers) of the output nodes 23 into real numbers and writes them into the robot information storage means 60 as the motor output signals (rotational speeds).
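The conversions performed by the input conversion means 52 and the output conversion means 54 might look like the sketch below. The text does not state the encoding, so the unsigned fixed-point scheme, the 16-bit width, and the names `to_bits` and `from_bits` are all assumptions. The width was chosen only because it is consistent with the 128 input nodes for 8 sensor signals and the 32 output nodes for 2 motor signals given for this embodiment.

```python
from typing import List


def to_bits(value: float, lo: float, hi: float, width: int = 16) -> List[bool]:
    """Quantize a real value in [lo, hi] to `width` bits, most significant first."""
    span = (1 << width) - 1
    clipped = min(max(value, lo), hi)          # keep the value inside the range
    code = round((clipped - lo) / (hi - lo) * span)
    return [bool((code >> i) & 1) for i in reversed(range(width))]


def from_bits(bits: List[bool], lo: float, hi: float) -> float:
    """Inverse of to_bits: rebuild a real value from bits, most significant first."""
    code = 0
    for bit in bits:
        code = (code << 1) | int(bit)
    span = (1 << len(bits)) - 1
    return lo + (hi - lo) * code / span


# Hypothetical usage: eight sensor values become the outputs of 128 input nodes.
sensor_values = [0.0, 0.25, 0.5, 0.75, 1.0, 0.1, 0.2, 0.3]
input_node_outputs = [b for v in sensor_values for b in to_bits(v, 0.0, 1.0)]
```

The value range `[lo, hi]` is likewise a placeholder; a real system would use the ranges of its sensors and motors.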

In FIG. 2, the robot information storage means 60 stores: an input array A1 holding the sensor signals from the eight infrared sensors 35 acquired by the sensor signal acquisition means 41 (eight real numbers, A1(1) to A1(8), one per infrared sensor 35); an output array A2 holding the left and right motor output signals (rotational speeds) (two real numbers, A2(1) and A2(2), one per motor output signal); a network address A3, which is the address of the network information storage means 61; a reinforcement signal A4 (real number) generated by the reinforcement signal generation means 43 and given to the network 20; a state index value A5 (real number) indicating the current state of the robot 30; and a state index value A6 (real number) indicating the state of the robot 30 one step earlier. Here, "one step" means one full pass through the loop of steps S5 to S9 in FIG. 3, not each of the individual steps S5 to S9 that make up the loop.
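As an illustration of the record just described, the fields A1 to A6 could be held in a structure like the one below. The class name, field names, and the `finish_step` helper are hypothetical, and an object reference stands in for the network address A3.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RobotInfo:
    # A1(1)..A1(8): one real value per infrared sensor
    sensor_inputs: List[float] = field(default_factory=lambda: [0.0] * 8)
    # A2(1), A2(2): left and right motor rotational speeds
    motor_outputs: List[float] = field(default_factory=lambda: [0.0] * 2)
    network: Optional[object] = None   # A3: reference to the network record
    reinforcement: float = 0.0         # A4: signal given to the network
    state_index: float = 0.0           # A5: current state index
    prev_state_index: float = 0.0      # A6: state index one step earlier

    def finish_step(self) -> None:
        """Keep the current state index as the previous one for the next step."""
        self.prev_state_index = self.state_index
```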

Because the reinforcement signal A4 given to the network 20 is identical to the reinforcement signal B4 stored in the network information storage means 61 described below, the memory for A4 may be omitted if the reinforcement signal generation means 43 writes the generated reinforcement signal directly into B4 of the network information storage means 61 instead of into A4 of the robot information storage means 60. Similarly, the state index value indicating the current state of the robot 30 is first written into the current state index value A5 of the robot information storage means 60, and the reinforcement signal is then calculated from A5 and the one-step-earlier state index value A6 stored in the robot information storage means 60. However, as long as A6 is stored, the reinforcement signal can be calculated without writing to A5, so the memory for the current state index value A5 may also be omitted.

The network information storage means 61 stores: input node addresses B1, the addresses of the portions of the node information storage means 62 that hold the information of each input node 21 (a variable-length array B1(1), B1(2), …, B1(m), …, one entry per input node 21); intermediate node addresses B2, the addresses of the portions of the node information storage means 62 that hold the information of each intermediate node 22 (a variable-length array B2(1), B2(2), …, B2(n), …, one entry per intermediate node 22); output node addresses B3, the addresses of the portions of the node information storage means 62 that hold the information of each output node 23 (a variable-length array B3(1), B3(2), …, B3(j), …, one entry per output node 23); and a reinforcement signal B4 (real number) generated by the reinforcement signal generation means 43 and given to the network 20.

The node information storage means 62 individually stores the information of each node, there being a plurality of nodes of each of six types. For each node it stores: input-side link addresses C1, the addresses of the portions of the link information storage means 63 that hold the information of the node's input-side links (a variable-length array C1(1), C1(2), …, C1(k), …, one entry per input-side link); output-side link addresses C2, the addresses of the portions of the link information storage means 63 that hold the information of the node's output-side links (a variable-length array C2(1), C2(2), …, C2(h), …, one entry per output-side link); a network address C3, the address of the network information storage means 61; a test link address C4, the address of the portion of the link information storage means 63 that holds the information of the test link provided on the node's input side; an AND/OR node flag C5 identifying whether the node is an AND node or an OR node (1 bit: True (or 1) for an AND node, False (or 0) for an OR node); an input node flag C6 identifying whether the node is an input node 21 (1 bit: True (or 1) if it is, False (or 0) if it is not); an output node flag C7 identifying whether the node is an output node 23 (1 bit: True (or 1) if it is, False (or 0) if it is not); a test node flag C8 identifying whether the node is a test node (1 bit: True (or 1) if it is, False (or 0) if it is not); the output C9 of the node (1 bit: True (or 1) or False (or 0)); and the total C10 of the reinforcement signals given to the node (a real number; "total" here means the sum of the reinforcement signals propagated from each source constituent element, not a value accumulated over steps). The node information storage means 62 dynamically allocates and frees the memory for nodes as nodes are added and deleted.

In the node information storage means 62, when the node is a test node, the input-side link addresses C1 consist only of the first and second input-side test link addresses (C1(1) and C1(2)), the output-side link addresses C2 consist only of a single output-side test link address (C2(1)), and there is no test link address C4. A test link is a link that does not contribute to the output and has no associated test node. A real link, by contrast, is a link in actual use that contributes to the output and has an associated test node.

The link information storage means 63 individually stores the information of each link, there being a plurality of links of each of four types. For each link it stores: an input-side node address D1, the address of the portion of the node information storage means 62 that holds the information of the link's input-side node; an output-side node address D2, the address of the portion of the node information storage means 62 that holds the information of the link's output-side node; a network address D3, the address of the network information storage means 61; a test node address D4, the address of the portion of the node information storage means 62 that holds the information of the test node associated with the link; an inverted/non-inverted flag D5 identifying whether the link is an inverted link or a non-inverted link (1 bit: True (or 1) for an inverted link, False (or 0) for a non-inverted link); a test link flag D6 identifying whether the link is a test link (1 bit: True (or 1) if it is, False (or 0) if it is not); the output D7 of the link (1 bit: True (or 1) or False (or 0)); the cumulative value D8 of the reinforcement signals given to the link (a real number accumulated over a plurality of steps); and the reinforcement signal D9 given to the link (a real number, the value for one step). The link information storage means 63 dynamically allocates and frees the memory for links as links are added and deleted.
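The node and link records described above can be pictured with the following sketch. It is only an illustration under stated assumptions: object references replace the addresses (C1, C2, C4, D1, D2, D4), the network addresses C3 and D3 are omitted, and the class names, field names, and the `connect` helper (which handles real links only) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity; the records reference each other
class Node:
    is_and: bool = False            # C5: True for AND, False for OR
    is_input: bool = False          # C6
    is_output: bool = False         # C7
    is_test: bool = False           # C8
    output: bool = False            # C9
    reinforcement_sum: float = 0.0  # C10: sum over sources within one step
    in_links: List["Link"] = field(default_factory=list, repr=False)   # C1
    out_links: List["Link"] = field(default_factory=list, repr=False)  # C2
    test_link: Optional["Link"] = field(default=None, repr=False)      # C4


@dataclass(eq=False)
class Link:
    source: Node                    # D1: input-side node
    target: Node                    # D2: output-side node
    inverted: bool = False          # D5
    is_test: bool = False           # D6
    output: bool = False            # D7
    cumulative_reinforcement: float = 0.0  # D8: accumulated over steps
    reinforcement: float = 0.0      # D9: value for the current step
    test_node: Optional[Node] = field(default=None, repr=False)  # D4


def connect(source: Node, target: Node, inverted: bool = False) -> Link:
    """Create a real link and register it on both of its end nodes."""
    link = Link(source, target, inverted=inverted)
    source.out_links.append(link)
    target.in_links.append(link)
    return link
```

Python lists grow and shrink on demand, which mirrors the variable-length arrays and the dynamic allocation of memory mentioned in the text.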

The portion of the network information storage means 61 that stores B1 to B3, the portion of the node information storage means 62 that stores C1 to C8, and the portion of the link information storage means 63 that stores D1 to D6 together constitute the network structure storage means, which stores the structure of the network 20, including the connections between constituent elements.

Further, the portion of the node information storage means 62 that stores C9 and the portion of the link information storage means 63 that stores D7 together constitute the input/output state storage means, which stores the input/output states of the constituent elements produced by the output generation process of the network 20.

Finally, the portion of the network information storage means 61 that stores B4, the portion of the node information storage means 62 that stores C10, and the portion of the link information storage means 63 that stores D8 and D9 together constitute the reinforcement signal storage means, which stores, for each constituent element, the reinforcement signal generated for it by the learning means 51 or the accumulated value of that signal.

In the above, the sensor signal acquisition means 41, the state evaluation signal acquisition means 42, the reinforcement signal generation means 43, the motor output signal transmission means 44, and the network processing means 50 are realized by a central processing unit (CPU) provided inside the computer main body constituting the information processing system 10 (not only a personal computer but also higher-end machines) and by one or more programs that define the operating procedure of the CPU (for example, programs written in the C++ language).

The robot information storage means 60, the network information storage means 61, the node information storage means 62, and the link information storage means 63 are realized by, for example, main memory, cache memory, or local memory. Provided that access speed, storage capacity, and the like pose no problem, they may instead be realized using an external storage device such as a hard disk, MO, DVD-RAM, FD, or magnetic tape.

In this embodiment, the information processing system 10 performs autonomous control of the operation of the robot 30 as follows.

First, the overall flow of the operation control of the robot 30 by the information processing system 10 is described with reference to FIGS. 3 to 6.

In FIG. 3, the program that realizes the information processing system 10 is started, and operation control of the robot 30 begins (step S1).

Next, the network processing means 50 performs the necessary initialization (step S2). This initialization comprises: initializing the information stored in the robot information storage means 60 (robot initialization process G1); initializing the information stored in the network information storage means 61 (network initialization process G2); generating the required number of input nodes 21, 128 in this embodiment (input node initialization process G3); generating the required number of output nodes 23, 32 in this embodiment (output node initialization process G4); generating, as the input-side link of each output node 23, a real link connecting that output node 23 to a randomly chosen input node 21 (inverted link initialization process G9 or non-inverted link initialization process G10); generating a test link, provided on the input side of each output node 23, connecting that output node 23 to a randomly chosen input node 21 (test inverted link initialization process G11 or test non-inverted link initialization process G12); generating a test node associated with each generated real link, that is, with each inverted or non-inverted link generated by process G9 or G10 (test intermediate OR node initialization process G7 or test intermediate AND node initialization process G8); and generating the first and second input-side test links of each generated test node (test inverted link initialization process G11 or test non-inverted link initialization process G12). The processes G1 to G4 and G7 to G12 named here are those of FIG. 11, described later.

Then, the sensor signal acquisition means 41 acquires the sensor signals detected by the eight infrared sensors 35 and writes the eight acquired sensor signals into the input array A1(1) to A1(8) of the robot information storage means 60 (see FIG. 2) (step S3).

Next, the state evaluation signal acquisition means 42 acquires, as the state evaluation signals, the sensor signals from the eight infrared sensors 35 stored in the input array A1(1) to A1(8) of the robot information storage means 60, the motor output signals (rotational speeds) stored in the output array A2(1), A2(2) of the robot information storage means 60, and the one-step-earlier state index value A6 stored in the robot information storage means 60 (step S4).

Next, on the basis of the state evaluation signals acquired by the state evaluation signal acquisition means 42, the reinforcement signal generation means 43 evaluates the state (behavior result) of the robot 30, the control target, and generates a reinforcement signal to be given to the network 20 as a reward or a punishment according to that evaluation (step S4). In the first state evaluation signal acquisition described above, the output array A2(1), A2(2) does not yet contain motor output signals (rotational speeds) based on the output of a network 20 whose structure has been changed by learning, and the one-step-earlier state index value A6 does not contain a state index value resulting from a state evaluation in a previous step, so the first reinforcement signal generated is zero. The reinforcement signal generation means 43 writes the reinforcement signal generated in this way into the reinforcement signal A4 of the robot information storage means 60. For use in the state evaluation of the next step, it also writes the current state index value, obtained by evaluating the state (behavior result) of the robot 30 in the current step, into the one-step-earlier state index value A6 of the robot information storage means 60. Because the first reinforcement signal is zero, as noted above, the first learning process by the learning means 51 described below is effectively not performed, and the structure of the network 20 does not change.

Then, the network processing means 50 performs the processing of the network 20, namely the learning process and the output generation process (step S5).

In FIG. 4, in the learning process, the learning means 51 first reads the reinforcement signal A4 of the robot information storage means 60 and writes it into the reinforcement signal B4 of the network information storage means 61, so that the network 20 receives the reinforcement signal (step S501).

Next, the learning means 51 refers to the output node addresses B3 of the network information storage means 61 and stores the same value as the reinforcement signal B4 of the network information storage means 61 in the reinforcement signal total C10 of each portion of the node information storage means 62 that holds the information of an output node 23 corresponding to those addresses. The reinforcement signal is thereby delivered equally to all output nodes 23 (step S502).
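Steps S501 and S502 amount to two copies, which the following sketch makes explicit. Plain dictionaries keyed by the reference symbols (A4, B4, C10) stand in for the storage means, and the function name `deliver_reinforcement` is hypothetical.

```python
from typing import Dict, List


def deliver_reinforcement(
    robot_info: Dict[str, float],
    network_info: Dict[str, float],
    output_nodes: List[Dict[str, float]],
) -> None:
    """Copy the reinforcement signal from the robot record to every output node."""
    # Step S501: the network receives the signal (A4 is copied into B4).
    network_info["B4"] = robot_info["A4"]
    # Step S502: every output node gets the same value in its total C10.
    for node in output_nodes:
        node["C10"] = network_info["B4"]


robot_info = {"A4": 0.5}
network_info = {"B4": 0.0}
output_nodes = [{"C10": 0.0} for _ in range(32)]  # 32 output nodes in this embodiment
deliver_reinforcement(robot_info, network_info, output_nodes)
```

Each output node receives the full signal rather than a share of it, which is what "equally" means here.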

Next, the learning means 51 performs the learning process for each output node 23 corresponding to the output node addresses B3 of the network information storage means 61 (step S503). Details of the learning process for the output nodes 23 are described later.

The learning means 51 then performs the learning process for each intermediate node 22 corresponding to the intermediate node addresses B2 of the network information storage means 61 (step S504). Details of the learning process for the intermediate nodes 22 are described later with reference to FIG. 5, which shows the flow of the learning process for an intermediate OR node (a real node).

Then, the learning means 51 refers to the input-side link addresses C1 in the node information storage means 62 for each output node 23 corresponding to the output node addresses B3 of the network information storage means 61, and performs the learning process for each input-side link of each output node 23 corresponding to those addresses (step S505). Details of the learning process for the input-side links of the output nodes 23 are described later.

The learning means 51 further refers to the input-side link addresses C1 in the node information storage means 62 for each intermediate node 22 corresponding to the intermediate node addresses B2 of the network information storage means 61, and performs the learning process for each input-side link of each intermediate node 22 corresponding to those addresses (step S506). Details of the learning process for the input-side links of the intermediate nodes 22 are described later with reference to FIG. 6, which shows the flow of the learning process for a non-inverted link.

After the learning process (steps S501 to S506) has changed the structure as described above, the changed network 20 is used to generate a new output of the network 20. The structural change made by the learning process is driven by a reinforcement signal that was generated from an evaluation of the state of the robot 30, and that state resulted from the output of the network 20 before the change. The input/output states of the constituent elements used for the various decisions in the learning process (steps S501 to S506) must therefore be those obtained in the output generation process that produced the robot state on which the reinforcement signal is based. This requirement is met, because the input/output states used in the learning process are the ones still held in memory (the input/output state storage means of FIG. 2), that is, the states obtained in the output generation process of the network 20 before its structure was changed.

In the output generation process, the input conversion means 52 first refers to the input node addresses B1 of the network information storage means 61 to locate the portions of the node information storage means 62 that hold the information of each input node 21. It then converts each of the eight sensor signals stored in the input array A1(1) to A1(8) of the robot information storage means 60 into a binary number and sets the resulting values as the outputs C9 of the input nodes 21 in the node information storage means 62 (step S507).

Next, the output generation means 53 refers to the intermediate node addresses B2 of the network information storage means 61 to locate the portions of the node information storage means 62 that hold the information of each intermediate node 22, and calculates the output C9 of each intermediate node 22 according to the function of the logic circuit constituting it (step S508). Newly generated intermediate nodes 22 are appended to the end of the array of intermediate node addresses B2 of the network information storage means 61, and in the input/output chain they are placed on the side nearer the input layer of the network 20. To generate outputs in the direction from the input layer toward the output layer, the intermediate nodes 22 are therefore processed in the reverse of their order in the array of intermediate node addresses B2.
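The reason for the reverse order can be seen in a small sketch. Here each intermediate node is a hypothetical `(name, op, input_names)` tuple, links and their inversion are left out for brevity, and the function name `evaluate_intermediate` is an assumption. The newer node sits later in the list but nearer the input layer, so it has to be computed first.

```python
from typing import Dict, List, Tuple

NodeSpec = Tuple[str, str, List[str]]  # (name, "AND" or "OR", input names)


def evaluate_intermediate(nodes: List[NodeSpec], values: Dict[str, bool]) -> Dict[str, bool]:
    """Evaluate intermediate nodes in reverse list order (newest first)."""
    for name, op, inputs in reversed(nodes):
        bits = [values[i] for i in inputs]
        values[name] = all(bits) if op == "AND" else any(bits)
    return values


# n_old was created first; n_new was appended later and feeds n_old.
intermediate = [
    ("n_old", "OR", ["n_new"]),
    ("n_new", "AND", ["x0", "x1"]),
]
# Input node outputs plus stale intermediate outputs from the previous step.
values = {"x0": True, "x1": True, "n_old": False, "n_new": False}
result = evaluate_intermediate(intermediate, values)
```

Walking the list front to back instead would compute `n_old` from the stale value of `n_new` and leave it one step behind.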

Further, the output generation means 53 refers to the output node address B3 in the network information storage means 61 to locate the portion of the node information storage means 62 that stores the information of each output node 23, and calculates the output C9 of each output node 23 according to the function of the logic circuit that the node constitutes (step S509).

The node calculation performed in steps S508 and S509 is the same as ordinary logic circuit processing. For every input-side link identified by the input-side link addresses C1 of the node in the node information storage means 62, the link output D7 is read from the link information storage means 63, and these outputs D7 are used as the inputs of the node being calculated. The AND/OR node flag C5 of the node in the node information storage means 62 is then consulted to determine whether the node is an AND node or an OR node. An AND node is processed in the same way as an AND circuit and an OR node in the same way as an OR circuit, which yields the output C9 of the node.

For example, when the node being calculated is an intermediate OR node, a test intermediate OR node, or an output node 23 (in this embodiment, output nodes are OR nodes only), the output C9 of the node is first overwritten with False (or 0); then, if at least one of the outputs D7 of all input-side links identified by the input-side link addresses C1 is True (or 1), the output C9 of the node is overwritten with True (or 1). Conversely, when the node being calculated is an intermediate AND node or a test intermediate AND node, the output C9 of the node is first overwritten with True (or 1); then, if at least one of the outputs D7 of all input-side links identified by the input-side link addresses C1 is False (or 0), the output C9 of the node is overwritten with False (or 0).
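The overwrite-then-scan procedure above can be written out as a short sketch. The function name `eval_node` and its boolean parameters are assumptions introduced for illustration; the embodiment reads the flag C5 and the link outputs D7 from its storage means instead.

```python
from typing import Iterable

def eval_node(is_and: bool, link_outputs: Iterable[bool]) -> bool:
    """Compute a node output C9 from the outputs D7 of its input-side links."""
    if is_and:
        c9 = True                 # AND node: start from True
        for d7 in link_outputs:
            if not d7:            # any False input forces the output to False
                c9 = False
        return c9
    c9 = False                    # OR node: start from False
    for d7 in link_outputs:
        if d7:                    # any True input forces the output to True
            c9 = True
    return c9
```

The result is the same as Python's built-in `all` (AND node) and `any` (OR node); the explicit loops only mirror the order of operations in the text.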

The link calculation performed together with the node calculation of steps S508 and S509 is likewise the same as ordinary logic circuit processing. When the link being calculated is an inverting link or a test inverting link, the output C9 of the input-side node identified by the input-side node address D1 of the link in the link information storage means 63 is read from the node information storage means 62, inverted, and written to the output D7 of the link. When the link being calculated is a non-inverting link or a test non-inverting link, the output C9 of that input-side node is written to the output D7 of the link unchanged.
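A minimal sketch of the link rule, combined with an OR node that consumes two links, is shown below. The function name `link_output` and the sample values are hypothetical; only the invert-or-pass-through behaviour comes from the description.

```python
def link_output(inverting: bool, source_node_output: bool) -> bool:
    """Return the link output D7 given the input-side node's output C9."""
    return (not source_node_output) if inverting else source_node_output

# Two input-side nodes feed one OR node: the first through an inverting link,
# the second through a non-inverting link.
c9_inputs = [True, False]
d7 = [link_output(True, c9_inputs[0]), link_output(False, c9_inputs[1])]
or_output = any(d7)
```

Here both link outputs are False, so the OR node outputs False even though one of its source nodes is True.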

After that, the output conversion means 54 refers to the output node address B3 in the network information storage means 61 to locate the portion of the node information storage means 62 that stores the information of each output node 23, converts the output C9 (binary) of each output node 23 into a real number, and writes it to the output array A2 of the robot information storage means 60 as a motor output signal (rotation speed) (step S510).

In FIG. 3, after the processing by the network processing means 50, the motor output signal transmission means 44 transmits the motor output signals (rotation speeds), which were written to the output array A2 of the robot information storage means 60 based on the output of the network 20 (the output C9 of each output node 23), to the motors 32 and 34 of the robot 30. This drives the motors 32 and 34 and operates the robot 30 (step S6).

Next, the sensor signal acquisition means 41 acquires the sensor signals detected by the eight infrared sensors 35 and writes the eight acquired sensor signals to the input arrays A1(1) to A1(8) of the robot information storage means 60 (step S7).

Then, the state evaluation signal acquisition means 42 acquires, as state evaluation signals, the sensor signals from the eight infrared sensors 35 stored in the input arrays A1(1) to A1(8) of the robot information storage means 60, the motor output signals (rotation speeds) stored in the output arrays A2(1) and A2(2) of the robot information storage means 60, and the state index value A6 of one step earlier stored in the robot information storage means 60 (step S8). Unlike the first state evaluation signal acquisition (step S4), at this point the output arrays A2(1) and A2(2) hold motor output signals (rotation speeds) based on the output of the network 20 whose structure was changed by the learning process of step S5, and the state index value A6 of one step earlier holds the state index value obtained from the state evaluation in the previous step. The reinforcement signal that the reinforcement signal generation means 43 generates from the evaluation based on these state evaluation signals is therefore a meaningful reinforcement signal that reflects a proper state evaluation.

Next, based on the state evaluation signals acquired by the state evaluation signal acquisition means 42, the reinforcement signal generation means 43 evaluates the state (action result) of the robot 30, the controlled object, that resulted from the output of the network 20 whose structure was changed by the learning process of step S5, and generates, according to the evaluation, a reinforcement signal to be given to the network 20 as a reward or punishment (step S8). For example, when the robot 30 is moving straight ahead, a reinforcement signal of "+1" (reward) is generated; when the sum of the sensor signals is greater than a threshold (for example, 0), a reinforcement signal (reward or punishment) equal to the change in that sum (the difference from the sum in the previous step) multiplied by -1 and a constant is generated; otherwise, a reinforcement signal of, for example, "-0.01" (a small punishment) is generated. The reinforcement signal generation means 43 writes the reinforcement signal generated in this way to the reinforcement signal A4 of the robot information storage means 60. In addition, for the state evaluation in the next step, the reinforcement signal generation means 43 writes the current state index value, obtained by evaluating the state (action result) of the robot 30 in the current step, to the state index value A6 of one step earlier in the robot information storage means 60 and saves it.
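The example rule for generating the reinforcement signal can be sketched as below. Several points are assumptions made only for illustration: the function and parameter names, the value of the constant `gain`, and in particular the order in which the three conditions are tested, which the description does not specify (the order here simply follows the order in which they are listed).

```python
def reinforcement_signal(
    going_straight: bool,
    sensor_sum: float,
    prev_sensor_sum: float,
    threshold: float = 0.0,   # example threshold from the description
    gain: float = 1.0,        # the unspecified "constant"; value is assumed
) -> float:
    """Return a reward (positive) or punishment (negative) for one control step."""
    if going_straight:
        return 1.0                                        # reward for moving straight
    if sensor_sum > threshold:
        # Change in the sensor total, multiplied by -1 and a constant:
        # a rising total is punished, a falling total is rewarded.
        return -1.0 * gain * (sensor_sum - prev_sensor_sum)
    return -0.01                                          # small punishment otherwise
```

For instance, with `gain=0.5`, a sensor total that rises from 3.0 to 5.0 while the robot is not moving straight yields a signal of -1.0.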

After that, it is determined whether an instruction to end the motion control of the robot 30 has been issued (step S9). If no end instruction has been issued, processing returns to the processing of the network 20 in step S5, and steps S5 to S9 are repeated until an end instruction is issued. If an end instruction has been issued, the motion control of the robot 30 is terminated (step S10).

The following describes the flow of the learning process performed by the learning means 51 for the intermediate nodes 22 (intermediate OR nodes, intermediate AND nodes, test intermediate OR nodes, and test intermediate AND nodes) and the output nodes 23.

<Learning process for intermediate OR nodes>
FIG. 7 shows an example of an intermediate OR node (real node) 100 that is subject to learning. In this example, three input-side links 101, 102, 103 and an output-side link 104 are connected to the intermediate OR node 100, and a test link 105 is provided on the input side of the intermediate OR node 100. The input-side links 101, 102, 103 are connected to input-side nodes 106, 107, 108, respectively, the output-side link 104 is connected to an output-side node 109, and the test link 105 is randomly connected to an arbitrary node 110.

Here, the inputs to the intermediate OR node 100 via the input-side links 101, 102, 103 are denoted X(1), X(2), X(3). More generally, if there are N input-side links, the inputs are X(1) to X(N); that is, the input to the intermediate OR node 100 via the k-th input-side link is X(k) (k = 1 to N). The output of the intermediate OR node 100 is denoted Y. The number of True values among X(1) to X(N) is denoted NumT, and the reinforcement signal given to the intermediate OR node 100 is denoted R. Further, the reinforcement signals to be given to the input-side links 101, 102, 103 are denoted R1(1), R1(2), R1(3), and the reinforcement signals to be given to their input-side nodes 106, 107, 108 are denoted R2(1), R2(2), R2(3). More generally, the reinforcement signal to be given to the k-th input-side link of interest is R1(k) (k = 1 to N), and the reinforcement signal to be given to its input-side node is R2(k) (k = 1 to N).

In FIG. 5, the learning means 51 first calculates the reinforcement signals R1(1), R1(2), R1(3) to be given to the input-side links 101, 102, 103 (step S50401). The calculation is based on the reinforcement signal R given to the intermediate OR node 100 and on the input/output state of the intermediate OR node 100, so that the reinforcement signal is distributed (propagated) to each input-side link 101, 102, 103 according to that link's contribution to the output Y of the intermediate OR node 100. Together with this, the learning means 51 calculates the reinforcement signals R2(1), R2(2), R2(3) to be given to the input-side nodes 106, 107, 108 of the input-side links 101, 102, 103 (step S50402).

At this time, the inputs X(1), X(2), X(3) to the intermediate OR node 100 are obtained by referring to the input-side link addresses C1 of the intermediate OR node 100 in the node information storage means 62 and reading the output D7 of each input-side link 101, 102, 103 from the link information storage means 63. The output Y of the intermediate OR node 100 is obtained by reading the output C9 of the intermediate OR node 100 from the node information storage means 62. The reinforcement signal R given to the intermediate OR node 100 is obtained by reading the total reinforcement signal value C10 of the intermediate OR node 100 from the node information storage means 62.

The learning means 51 then calculates, according to the following rules, the reinforcement signal R1(k) (k = 1 to N) to be given to the one input-side link of interest among the N input-side links connected to the intermediate OR node, and the reinforcement signal R2(k) (k = 1 to N) to be given to the input-side node of that link. That is, it determines which of the following Cases 1 to 5 applies to the k-th (k = 1 to N) input-side link, and calculates the reinforcement signal R1(k) for each input-side link and the reinforcement signal R2(k) for the input-side node of each input-side link, one link at a time.

Case 1: When (Y = T) ∧ (X(k) = F), R1(k) = 0 and R2(k) = 0. In this case, the input X(k) via the k-th input-side link does not contribute to the output Y of the intermediate OR node, so the reinforcement signal is set to 0.

Case 2: When Y = F, R1(k) = R/N and R2(k) = R/N. In this case, because Y = F, every input X(k) (k = 1 to N) is F and all inputs contribute equally to the output Y, so the reinforcement signal is distributed equally.

Case 3: When (Y = T) ∧ (NumT = 1), R1(k) = R and R2(k) = R. In this case, the input via the input-side link of interest is X(k) = T and it is the only True input, so this link contributes greatly to the output Y and is given a reinforcement signal with a large absolute value.

Case 4: When (Y = T) ∧ (NumT ≠ 1) ∧ (R ≥ 0), R1(k) = -R × (NumT - 1)/N and R2(k) = 0. In this case, the input via the input-side link of interest is X(k) = T, but it is not the only True input. Even if this input were not True, the output would still be Y = T because of the inputs via the other input-side links, so the contribution of this link to the output Y is low. A relatively small punishment is therefore given as the reinforcement signal R1(k).

Case 5: When (Y = T) ∧ (NumT ≠ 1) ∧ (R ≤ 0), R1(k) = R × NumT/N and R2(k) = 0. As in Case 4, the input via the input-side link of interest is X(k) = T, but it is not the only True input. Even if this input were not True, the output would still be Y = T because of the inputs via the other input-side links, so the contribution of this link to the output Y is low. In addition, because a punishment has been given as the reinforcement signal R to the intermediate OR node from which the signal propagates, a larger punishment than in Case 4 is given as the reinforcement signal R1(k).
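Cases 1 to 5 can be collected into one function, sketched below. The function name `distribute_or` and the list-based interface are assumptions for illustration. Y and NumT are recomputed from the inputs rather than read from C9, and Case 2 is tested first so that Case 1 only applies when Y = T. When R = 0, Cases 4 and 5 both give 0, so the overlap of their conditions does not matter.

```python
from typing import List, Tuple

def distribute_or(x: List[bool], r: float) -> Tuple[List[float], List[float]]:
    """Distribute the node signal R of an OR node to its N input-side links.

    Returns (R1, R2): per-link signals and per-input-side-node signals.
    """
    n = len(x)
    y = any(x)          # output Y of the OR node
    num_t = sum(x)      # NumT: number of True inputs
    r1: List[float] = []
    r2: List[float] = []
    for xk in x:
        if not y:                                  # Case 2: Y = F
            r1.append(r / n)
            r2.append(r / n)
        elif not xk:                               # Case 1: Y = T, X(k) = F
            r1.append(0.0)
            r2.append(0.0)
        elif num_t == 1:                           # Case 3: the only True input
            r1.append(r)
            r2.append(r)
        elif r >= 0:                               # Case 4: shared credit, R >= 0
            r1.append(-r * (num_t - 1) / n)
            r2.append(0.0)
        else:                                      # Case 5: shared blame, R < 0
            r1.append(r * num_t / n)
            r2.append(0.0)
    return r1, r2
```

With three inputs of which two are True and R = 3.0, each True link falls under Case 4 and receives -3.0 × (2 - 1)/3 = -1.0, while the False link receives 0 under Case 1.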

FIG. 8 shows examples of the distribution of the reinforcement signal calculated according to the rules of Cases 1 to 5 above, for the case where the intermediate OR node from which the signal propagates is the intermediate OR node 100 of FIG. 7.

Further, the learning means 51 calculates the reinforcement signal RT to be given to the test link 105. Here, the learning means 51 calculates the reinforcement signal on the assumption that the test link 105 had existed as an input-side link of the intermediate OR node 100 (step S50403). First, when adding the test link 105 as an input-side link would not change the output Y, the input TX via the test link 105, that is, the output of the test link 105 (obtained by reading the output D7 of the test link 105 from the link information storage means 63), is added to the inputs X(k), and the reinforcement signal RT is calculated according to Cases 1 to 5 described above. Next, when adding the test link 105 as an input-side link would change the output Y, the input TX via the test link 105, that is, the output D7 of the test link 105, is added to the inputs X(k), the inverted value of the output C9 (the actual output) of the intermediate OR node 100 is assigned to Y, the value -C10, obtained by reversing the sign of the total reinforcement signal value C10 (the actual total) of the intermediate OR node 100, is assigned to R, and the reinforcement signal RT is calculated by applying the rules of Cases 1 to 5 described above.
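The hypothetical evaluation of a test link on an OR node can be sketched as follows. The names `or_link_signal` and `rt_for_test_link` are assumptions, and the helper restates Cases 1 to 5 for a single link so that the block stands alone. The hypothetical output Y is recomputed from the extended inputs, which for an OR node coincides with inverting the actual output whenever the test link would change it.

```python
from typing import List

def or_link_signal(x: List[bool], k: int, r: float) -> float:
    """R1(k) for an OR node with inputs x and node signal r (Cases 1 to 5)."""
    n, y, num_t = len(x), any(x), sum(x)
    if not y:
        return r / n                    # Case 2
    if not x[k]:
        return 0.0                      # Case 1
    if num_t == 1:
        return r                        # Case 3
    if r >= 0:
        return -r * (num_t - 1) / n     # Case 4
    return r * num_t / n                # Case 5

def rt_for_test_link(x: List[bool], tx: bool, r: float) -> float:
    """Signal RT for a test link with output tx, as if it were a real input."""
    y_actual = any(x)
    extended = x + [tx]                 # add TX to the inputs X(k)
    y_hypothetical = any(extended)
    # If the test link would have flipped Y, the node signal changes sign.
    r_eff = r if y_hypothetical == y_actual else -r
    return or_link_signal(extended, len(extended) - 1, r_eff)
```

For example, if all real inputs are False and the node received R = 2.0, a test link whose output is True would have flipped Y, so it is judged under Case 3 with R = -2.0 and receives RT = -2.0.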

The reinforcement signals calculated as described above, namely the reinforcement signals R1(1), R1(2), R1(3) to be given to the input-side links 101, 102, 103 and the reinforcement signal RT to be given to the test link 105, are each added to the cumulative reinforcement signal value D8 of the corresponding link in the link information storage means 63 to update the cumulative value, and are also written over the reinforcement signal D9 of that link. The reinforcement signals R2(1), R2(2), R2(3) to be given to the input-side nodes 106, 107, 108 of the input-side links 101, 102, 103 are added to the total reinforcement signal value C10 of the corresponding node in the node information storage means 62 (they are added because reinforcement signals also propagate to that node from other constituent elements) (step S50404).

Next, for each of the input-side links 101, 102, 103, the learning means 51 determines whether the cumulative reinforcement signal value D8 of the link in the link information storage means 63 is below a threshold (0 in this embodiment, as an example), and deletes the input-side link if it is (step S50405). In this case, the inverting link deletion process E5 or the non-inverting link deletion process E6 of FIG. 12, described later, is performed.

The learning means 51 also determines, for the test link 105, whether the cumulative reinforcement signal value D8 of the link in the link information storage means 63 is below a threshold (0 in this embodiment, as an example), and deletes the test link 105 if it is (step S50406). In this case, the test inverting link deletion process E7 or the test non-inverting link deletion process E8 of FIG. 12, described later, is performed. A new test link connected to an arbitrary node is then generated at random and registered in the test link address C4 of the intermediate OR node 100 in the node information storage means 62.

Further, the learning means 51 determines, for the test link 105, whether the cumulative reinforcement signal value D8 of the link in the link information storage means 63 is above the threshold. If it is, the test link 105 is promoted to a real link and put into actual use: a new real link is generated using the test link address C4 of the intermediate OR node 100 in the node information storage means 62, the address B2 of the intermediate OR node 100, and the network address C3, and is additionally registered in the input-side link addresses C1 of the intermediate OR node 100. When the inverting/non-inverting flag D5 of the test link 105 in the link information storage means 63 is True (meaning an inverting link), a new inverting link is generated; when it is False (meaning a non-inverting link), a new non-inverting link is generated. Together with this, the test link 105 is deleted. In this case, the test inverting link deletion process E7 or the test non-inverting link deletion process E8 of FIG. 12, described later, is performed. A new test link connected to an arbitrary node is then generated at random and registered in the test link address C4 of the intermediate OR node 100 in the node information storage means 62 (step S50407).
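The threshold decisions of steps S50405 to S50407 can be summarized in a small sketch. The function names, the string labels, and the dictionary of cumulative values are hypothetical; the threshold of 0 is the example value given for this embodiment, and the behaviour when the cumulative value equals the threshold exactly (nothing happens) follows from the "below" and "above" wording.

```python
from typing import Dict, List

THRESHOLD = 0.0  # example threshold used in this embodiment

def links_to_delete(cumulative_d8: Dict[str, float]) -> List[str]:
    """Real input-side links whose cumulative signal D8 fell below the threshold."""
    return [name for name, d8 in cumulative_d8.items() if d8 < THRESHOLD]

def decide_test_link(d8: float) -> str:
    """Decide what happens to a test link given its cumulative signal D8."""
    if d8 < THRESHOLD:
        return "delete_and_regenerate"     # step S50406
    if d8 > THRESHOLD:
        return "promote_and_regenerate"    # step S50407: becomes a real link
    return "keep"                          # exactly at the threshold

doomed = links_to_delete({"link101": 0.4, "link102": -0.2, "link103": 0.0})
```

In both non-trivial outcomes the old test link is removed and a new, randomly connected test link takes its place, so every node always carries exactly one candidate connection under evaluation.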

When the number of input-side links registered in the input-side link addresses C1 of the intermediate OR node 100 has become 1 or less, the learning means 51 deletes the intermediate OR node 100 (step S50408). In this case, the intermediate OR node deletion process E1 of FIG. 12, described later, is performed.

The learning means 51 then clears the total reinforcement signal value C10 of the intermediate OR node 100 in the node information storage means 62 to 0 (step S50409).

<Learning process for intermediate AND nodes>
The learning process for an intermediate AND node is substantially the same as the learning process for an intermediate OR node described above. First, the learning means 51 calculates the reinforcement signal R1(k) to be given to each input-side link. The calculation is based on the reinforcement signal R given to the intermediate AND node and on the input/output state of the intermediate AND node, so that the reinforcement signal is distributed (propagated) to each input-side link according to that link's contribution to the output Y of the intermediate AND node. Together with this, the learning means 51 calculates the reinforcement signal R2(k) to be given to the input-side node of each input-side link.

At this time, when the output D7 of each input-side link identified by the input-side link addresses C1 of the intermediate AND node and the output C9 of the intermediate AND node are assigned to the input X(k) and the output Y used in the reinforcement signal calculation for the intermediate OR node described above, their values are inverted before being assigned. This is because, by De Morgan's laws, an AND node whose inputs and output are all inverted behaves as an OR node.
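The De Morgan reduction can be illustrated with a short sketch in which the AND-node distribution simply calls the OR-node rules on inverted inputs. The function names are assumptions, and `distribute_or` restates Cases 1 to 5 in compact form so that the block stands alone; the inverted output Y is implied, because the OR of the inverted inputs equals the inverted AND of the original inputs.

```python
from typing import List, Tuple

def distribute_or(x: List[bool], r: float) -> Tuple[List[float], List[float]]:
    """Cases 1 to 5 for an OR node: returns (R1, R2) for each input-side link."""
    n, y, num_t = len(x), any(x), sum(x)
    r1, r2 = [], []
    for xk in x:
        if not y:
            a, b = r / n, r / n                    # Case 2
        elif not xk:
            a, b = 0.0, 0.0                        # Case 1
        elif num_t == 1:
            a, b = r, r                            # Case 3
        elif r >= 0:
            a, b = -r * (num_t - 1) / n, 0.0       # Case 4
        else:
            a, b = r * num_t / n, 0.0              # Case 5
        r1.append(a)
        r2.append(b)
    return r1, r2

def distribute_and(x: List[bool], r: float) -> Tuple[List[float], List[float]]:
    """AND node: invert every input, then reuse the OR-node rules unchanged."""
    return distribute_or([not xk for xk in x], r)

# De Morgan check used by the reduction: NOT(a AND b) == (NOT a) OR (NOT b).
de_morgan_holds = all(
    (not (a and b)) == ((not a) or (not b))
    for a in (False, True) for b in (False, True)
)
```

For an AND node whose only False input is the first of three, that input alone determines the output, so it receives the full signal R under Case 3 while the other two links receive 0.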

By inverting all inputs and the output according to De Morgan's laws in this way, the learning means 51 calculates, with the same rules as in the learning process for an intermediate OR node described above, the reinforcement signal R1(k) (k = 1 to N) to be given to the one input-side link of interest among the N input-side links connected to the intermediate AND node, and the reinforcement signal R2(k) (k = 1 to N) to be given to the input-side node of that link. That is, it determines which of Cases 1 to 5 described above applies to the k-th (k = 1 to N) input-side link, and calculates the reinforcement signal R1(k) for each input-side link and the reinforcement signal R2(k) for the input-side node of each input-side link, one link at a time.

FIG. 9 shows examples of the distribution of the reinforcement signal calculated according to the rules of Cases 1 to 5 described above, for the case where the intermediate AND node from which the signal propagates has three input-side links, like the intermediate OR node 100 of FIG. 7. FIG. 9 corresponds to FIG. 8 with the inputs and output X(1), X(2), X(3), Y inverted and the reinforcement signals R, R1(1), R1(2), R1(3), R2(1), R2(2), R2(3) left unchanged.

Further, the learning means 51 calculates the reinforcement signal RT to be given to the test link. Here, the learning means 51 calculates the reinforcement signal on the assumption that the test link had existed as an input-side link of the intermediate AND node. First, when adding the test link as an input-side link would not change the output Y, the input TX via the test link, that is, the output of the test link (obtained by reading the output D7 of the test link from the link information storage means 63), is inverted and added to the inputs X(k), the inverted value of the output C9 of the intermediate AND node is assigned to the output Y, and the reinforcement signal RT is calculated according to Cases 1 to 5 described above. Next, when adding the test link as an input-side link would change the output Y, the input TX via the test link, that is, the output D7 of the test link, is inverted and added to the inputs X(k), the value of the output C9 (the actual output) of the intermediate AND node is assigned to Y, the value -C10, obtained by reversing the sign of the total reinforcement signal value C10 (the actual total) of the intermediate AND node, is assigned to R, and the reinforcement signal RT is calculated by applying the rules of Cases 1 to 5 described above.

The reinforcement signals calculated as described above, namely the reinforcement signal R1(k) (k = 1 to N) to be given to each input-side link and the reinforcement signal RT to be given to the test link, are each added to the cumulative reinforcement signal value D8 of the corresponding link in the link information storage means 63 to update the cumulative value, and are also written over the reinforcement signal D9 of that link. The reinforcement signal R2(k) (k = 1 to N) to be given to the input-side node of each input-side link is added to the total reinforcement signal value C10 of the corresponding node in the node information storage means 62 (it is added because reinforcement signals also propagate to that node from other constituent elements).

Next, for each input-side link, the learning means 51 determines whether the cumulative reinforcement signal value D8 of the link in the link information storage means 63 is below a threshold (0 in this embodiment, as an example), and deletes the input-side link if it is. In this case, the inverting link deletion process E5 or the non-inverting link deletion process E6 of FIG. 12, described later, is performed.

また、学習手段５１は、テストリンクについて、リンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値（本実施形態では、一例として０とする。）を下回っているか否かを判断し、下回っている場合には、テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。そして、任意のノードに結合する新たなテストリンクをランダムに生成し、ノード情報記憶手段６２の中間ＡＮＤノードのテストリンクアドレスＣ４に登録する。 The learning means 51 also determines, for the test link, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 is below the threshold (0, as an example, in this embodiment), and deletes the test link if it is. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. A new test link coupled to an arbitrary node is then randomly generated and registered in the test link address C4 of the intermediate AND node in the node information storage means 62.

さらに、学習手段５１は、テストリンクについて、リンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値を上回っているか否かを判断し、上回っている場合には、テストリンクを実リンクに昇格させて実用化するため、ノード情報記憶手段６２の中間ＡＮＤノードのテストリンクアドレスＣ４と、中間ＡＮＤノードのアドレスＢ２と、ネットワークアドレスＣ３とを用いて、実リンクを新たに生成し、中間ＡＮＤノードの入力側リンクアドレスＣ１に追加登録する。この際、テストリンクについてのリンク情報記憶手段６３の反転・非反転フラグＤ５がＴｒｕｅ（反転リンクを意味する。）のときには、反転リンクを新たに生成し、Ｆａｌｓｅ（非反転リンクを意味する。）のときには、非反転リンクを新たに生成する。また、これと併せて、テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。そして、任意のノードに結合する新たなテストリンクをランダムに生成し、ノード情報記憶手段６２の中間ＡＮＤノードのテストリンクアドレスＣ４に登録する。 Further, the learning means 51 determines, for the test link, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 exceeds the threshold. If it does, the test link is promoted to a real link and put into practical use: a new real link is generated using the test link address C4 of the intermediate AND node in the node information storage means 62, the address B2 of the intermediate AND node, and the network address C3, and is additionally registered in the input-side link address C1 of the intermediate AND node. At this time, an inverted link is generated when the inversion/non-inversion flag D5 of the test link in the link information storage means 63 is True (meaning an inverted link), and a non-inverted link is generated when it is False (meaning a non-inverted link). The test link is deleted at the same time. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. A new test link coupled to an arbitrary node is then randomly generated and registered in the test link address C4 of the intermediate AND node in the node information storage means 62.

そして、学習手段５１は、中間ＡＮＤノードの入力側リンクアドレスＣ１に登録されている入力側リンクの数が、１以下になった場合には、中間ＡＮＤノードを削除する。この場合には、後述する図１２の中間ＡＮＤノードの削除処理Ｅ２を行う。 Then, when the number of input-side links registered in the input-side link address C1 of the intermediate AND node becomes 1 or less, the learning means 51 deletes the intermediate AND node. In this case, the intermediate AND node deletion process E2 of FIG. 12, described later, is performed.

それから、学習手段５１は、ノード情報記憶手段６２の中間ＡＮＤノードの強化信号の合計値Ｃ１０をクリアして０にする。 Then, the learning unit 51 clears the total value C10 of the reinforcement signal of the intermediate AND node in the node information storage unit 62 to zero.
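
The link bookkeeping described above for an intermediate AND node can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class and function names, the `candidates` list of node addresses, the random choice of the inversion flag for a new test link, and the assumption that a promoted link starts with a cumulative value of 0 are all hypothetical. Only the field roles (C1, C4, C10, D5, D8, D9) and the threshold of 0 come from the text.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

THRESHOLD = 0.0  # the embodiment uses 0 as an example threshold


@dataclass
class Link:
    source: int              # address of the input-side node (hypothetical field)
    inverted: bool           # D5: True means an inverted link
    cumulative: float = 0.0  # D8: cumulative reinforcement
    last: float = 0.0        # D9: most recent reinforcement


@dataclass
class AndNode:
    inputs: List[Link] = field(default_factory=list)  # C1
    test_link: Optional[Link] = None                  # C4
    total: float = 0.0                                # C10


def update_and_node(node, r1, rt, candidates, rng):
    """Apply one learning pass; return True if the node itself should be deleted."""
    # Accumulate R1(k) into D8 and overwrite D9 for every real input-side link.
    for link, r in zip(node.inputs, r1):
        link.cumulative += r
        link.last = r
    # Do the same for the test link with RT.
    test = node.test_link
    test.cumulative += rt
    test.last = rt

    # Real links whose D8 fell below the threshold are deleted.
    node.inputs = [l for l in node.inputs if l.cumulative >= THRESHOLD]

    # A test link above the threshold is promoted to a real link
    # (assumption: the new real link starts with D8 = 0).
    if test.cumulative > THRESHOLD:
        node.inputs.append(Link(test.source, test.inverted))
    # Whether promoted or deleted, the test link is replaced by a random new one.
    if test.cumulative != THRESHOLD:
        node.test_link = Link(rng.choice(candidates), rng.random() < 0.5)

    node.total = 0.0              # clear C10
    return len(node.inputs) <= 1  # an AND node with at most one input is removed
```

A test link whose cumulative value sits exactly at the threshold is neither deleted nor promoted in this sketch, which matches the strict "below" and "above" comparisons in the description.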

＜テスト中間ＯＲノードの学習処理＞ <Test intermediate OR node learning process>
テスト中間ＯＲノードの学習処理は、前述した中間ＯＲノードの学習処理（図７参照）を簡略化したものである。先ず、学習手段５１は、テスト中間ＯＲノードに対して付与された強化信号Ｒに基づき、テスト中間ＯＲノードの入出力状態に応じて、第１および第２の入力側テストリンク（後述する図１０の場合と同様）のテスト中間ＯＲノードの出力Ｙへの寄与度に従って第１および第２の入力側テストリンクに強化信号が分配（伝播）されるように、第１および第２の入力側テストリンクに対して付与する強化信号Ｒ１（１），Ｒ１（２）を算定する。但し、中間ＯＲノードの学習処理（図７参照）の場合とは異なり、テスト中間ＯＲノードの学習処理では、第１および第２の入力側テストリンクに対して付与する強化信号Ｒ１（１），Ｒ１（２）のみを算定し、第１および第２の入力側テストリンクの各入力側ノードに対して付与する強化信号Ｒ２（１），Ｒ２（２）は算定しない。 The learning process for a test intermediate OR node is a simplification of the learning process for an intermediate OR node described above (see FIG. 7). First, based on the reinforcement signal R given to the test intermediate OR node, and according to the input/output state of the test intermediate OR node, the learning means 51 calculates the reinforcement signals R1(1) and R1(2) to be given to the first and second input-side test links (the same as in FIG. 10, described later), so that the reinforcement signal is distributed (propagated) to the first and second input-side test links according to their contribution to the output Y of the test intermediate OR node. Unlike the learning process for an intermediate OR node (see FIG. 7), however, the learning process for a test intermediate OR node calculates only the reinforcement signals R1(1) and R1(2) to be given to the first and second input-side test links; the reinforcement signals R2(1) and R2(2) to be given to the input-side nodes of those test links are not calculated.

この際、学習手段５１は、前述した中間ＯＲノードの学習処理の場合と全く同じルールで、第１および第２の入力側テストリンクが、前述したケース１〜５のいずれに該当するかをそれぞれ判断し、強化信号Ｒ１（１），Ｒ１（２）を算定する。なお、テスト中間ＯＲノードには、ノード情報記憶手段６２のテストリンクアドレスＣ４に登録すべきテストリンクは無いので、このテストリンクアドレスＣ４に対応するテストリンクに対して付与する強化信号ＲＴの算定は行わない。 In doing so, the learning means 51 applies exactly the same rules as in the learning process for an intermediate OR node described above: it determines which of cases 1 to 5 described above each of the first and second input-side test links falls under, and calculates the reinforcement signals R1(1) and R1(2). Since a test intermediate OR node has no test link to be registered in the test link address C4 of the node information storage means 62, the reinforcement signal RT for a test link corresponding to the test link address C4 is not calculated.

そして、以上のようにして算定した第１および第２の入力側テストリンクに対して付与する強化信号Ｒ１（１），Ｒ１（２）を、リンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８に加算して累積値を更新するとともに、当該リンクの強化信号Ｄ９に上書きする。なお、第１および第２の入力側テストリンクの各入力側ノードに対して付与する強化信号Ｒ２（１），Ｒ２（２）は算定されないので、これらをノード情報記憶手段６２の当該ノードの強化信号の合計値Ｃ１０に加算する処理は行わない。 Then, the reinforcement signals R1 (1) and R1 (2) to be given to the first and second input-side test links calculated as described above are accumulated in the link reinforcement signal of the link in the link information storage means 63. The accumulated value is updated by adding to the value D8 and overwritten on the strengthening signal D9 of the link. Note that the reinforcement signals R2 (1) and R2 (2) to be given to the input side nodes of the first and second input side test links are not calculated, so these are strengthened for the relevant node of the node information storage means 62. The process of adding to the total signal value C10 is not performed.

続いて、学習手段５１は、第１、第２の入力側テストリンクについて、それぞれリンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値（本実施形態では、一例として０とする。）を下回っているか否かを判断し、下回っている場合には、その入力側テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。なお、第１の入力側テストリンクは、削除されないように十分に大きな正の値の強化信号を蓄えた状態にしておくので、結局、ここで削除されるのは、第２の入力側テストリンクとなる。また、このようにして第２の入力側テストリンクが削除された場合の他に、第２の入力側テストリンクの入力側ノード（実ノード）の削除に伴って第２の入力側テストリンクが削除された場合を含め、リンクの数が１になったとき（つまり、第１の入力側テストリンクだけになったとき）には、任意のノードに結合する新たな第２の入力側テストリンクをランダムに生成し、ノード情報記憶手段６２のテスト中間ＯＲノードの入力側テストリンクアドレスＣ１（Ｃ１（２）となる。）に登録する。 Subsequently, the learning means 51 determines, for each of the first and second input-side test links, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 is below the threshold (0, as an example, in this embodiment), and deletes the input-side test link if it is. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. Because the first input-side test link is kept holding a sufficiently large positive reinforcement value so that it is not deleted, the only link that can actually be deleted here is the second input-side test link. When the number of links becomes 1 (that is, when only the first input-side test link remains), whether because the second input-side test link was deleted in this way or because it was deleted along with the deletion of its input-side node (a real node), a new second input-side test link coupled to an arbitrary node is randomly generated and registered in the input-side test link address C1 (which becomes C1(2)) of the test intermediate OR node in the node information storage means 62.

さらに、学習手段５１は、ノード情報記憶手段６２のテスト中間ＯＲノードの第１の入力側テストリンクアドレスＣ１（配列１番目のＣ１（１）となる。）に対応する第１の入力側テストリンクの強化信号の累積値Ｄ８を、十分大きな正の値（例えば１０³⁰⁰等）に設定し、累積値Ｄ８に、十分大きな正の値が常に保持されるようにし、第１の入力側テストリンクが削除されないようにする。 Further, the learning means 51 sets the cumulative value D8 of the reinforcement signal of the first input-side test link, which corresponds to the first input-side test link address C1 (the first array element, C1(1)) of the test intermediate OR node in the node information storage means 62, to a sufficiently large positive value (for example, 10³⁰⁰), so that the cumulative value D8 always holds a sufficiently large positive value and the first input-side test link is never deleted.
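
The maintenance of the two input-side test links of a test node might look like the following sketch. The dictionary layout, the function name, and the `candidates` list are illustrative assumptions; the pinned value 10³⁰⁰, the threshold of 0, and the rule that a missing second link is regenerated at random come from the description. The sketch keeps the first link unconditionally, relying on the stated guarantee that its cumulative value is always large enough to survive.

```python
import random

PINNED = 1e300  # "sufficiently large positive value" given as an example in the text


def maintain_test_links(links, r1, candidates, rng, threshold=0.0):
    """links[0] is the first (pinned) test link, links[1:] the second one if present.

    Each link is a dict with 'source' (node address) and 'cumulative' (D8).
    """
    # Accumulate R1(1), R1(2) into D8.
    for link, r in zip(links, r1):
        link["cumulative"] += r
    # Only the second link can actually fall below the threshold.
    kept = [links[0]] + [l for l in links[1:] if l["cumulative"] >= threshold]
    # If only the first link remains, attach a new second link to a random node.
    if len(kept) == 1:
        kept.append({"source": rng.choice(candidates), "cumulative": 0.0})
    # Re-pin the first link so that it can never be deleted.
    kept[0]["cumulative"] = PINNED
    return kept
```

A double-precision float can hold 1e300, and adding any ordinary reinforcement value to it leaves it unchanged, which is presumably why such a value is convenient as a pin.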

＜テスト中間ＡＮＤノードの学習処理＞ <Test intermediate AND node learning process>
テスト中間ＡＮＤノードの学習処理は、前述した中間ＡＮＤノードの学習処理を簡略化したものである。先ず、学習手段５１は、テスト中間ＡＮＤノードに対して付与された強化信号Ｒに基づき、テスト中間ＡＮＤノードの入出力状態に応じて、第１および第２の入力側テストリンク（後述する図１０参照）のテスト中間ＡＮＤノードの出力Ｙへの寄与度に従って第１および第２の入力側テストリンクに強化信号が分配（伝播）されるように、第１および第２の入力側テストリンクに対して付与する強化信号Ｒ１（１），Ｒ１（２）を算定する。但し、中間ＡＮＤノードの学習処理の場合とは異なり、テスト中間ＡＮＤノードの学習処理では、第１および第２の入力側テストリンクに対して付与する強化信号Ｒ１（１），Ｒ１（２）のみを算定し、第１および第２の入力側テストリンクの各入力側ノードに対して付与する強化信号Ｒ２（１），Ｒ２（２）は算定しない。 The learning process for a test intermediate AND node is a simplification of the learning process for an intermediate AND node described above. First, based on the reinforcement signal R given to the test intermediate AND node, and according to the input/output state of the test intermediate AND node, the learning means 51 calculates the reinforcement signals R1(1) and R1(2) to be given to the first and second input-side test links (see FIG. 10, described later), so that the reinforcement signal is distributed (propagated) to the first and second input-side test links according to their contribution to the output Y of the test intermediate AND node. Unlike the learning process for an intermediate AND node, however, the learning process for a test intermediate AND node calculates only the reinforcement signals R1(1) and R1(2) to be given to the first and second input-side test links; the reinforcement signals R2(1) and R2(2) to be given to the input-side nodes of those test links are not calculated.

この際、学習手段５１は、前述した中間ＡＮＤノードの学習処理の場合と全く同じルールで、第１および第２の入力側テストリンクが、前述したケース１〜５のいずれに該当するかをそれぞれ判断し、強化信号Ｒ１（１），Ｒ１（２）を算定する。なお、テスト中間ＡＮＤノードには、ノード情報記憶手段６２のテストリンクアドレスＣ４に登録すべきテストリンクは無いので、このテストリンクアドレスＣ４に対応するテストリンクに対して付与する強化信号ＲＴの算定は行わない。 In doing so, the learning means 51 applies exactly the same rules as in the learning process for an intermediate AND node described above: it determines which of cases 1 to 5 described above each of the first and second input-side test links falls under, and calculates the reinforcement signals R1(1) and R1(2). Since a test intermediate AND node has no test link to be registered in the test link address C4 of the node information storage means 62, the reinforcement signal RT for a test link corresponding to the test link address C4 is not calculated.

続いて、学習手段５１は、第１、第２の入力側テストリンクについて、それぞれリンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値（本実施形態では、一例として０とする。）を下回っているか否かを判断し、下回っている場合には、その入力側テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。なお、第１の入力側テストリンクは、削除されないように十分に大きな正の値の強化信号を蓄えた状態にしておくので、結局、ここで削除されるのは、第２の入力側テストリンクとなる。また、このようにして第２の入力側テストリンクが削除された場合の他に、第２の入力側テストリンクの入力側ノード（実ノード）の削除に伴って第２の入力側テストリンクが削除された場合を含め、リンクの数が１になったとき（つまり、第１の入力側テストリンクだけになったとき）には、任意のノードに結合する新たな第２の入力側テストリンクをランダムに生成し、ノード情報記憶手段６２のテスト中間ＡＮＤノードの入力側テストリンクアドレスＣ１（Ｃ１（２）となる。）に登録する。 Subsequently, the learning means 51 determines, for each of the first and second input-side test links, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 is below the threshold (0, as an example, in this embodiment), and deletes the input-side test link if it is. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. Because the first input-side test link is kept holding a sufficiently large positive reinforcement value so that it is not deleted, the only link that can actually be deleted here is the second input-side test link. When the number of links becomes 1 (that is, when only the first input-side test link remains), whether because the second input-side test link was deleted in this way or because it was deleted along with the deletion of its input-side node (a real node), a new second input-side test link coupled to an arbitrary node is randomly generated and registered in the input-side test link address C1 (which becomes C1(2)) of the test intermediate AND node in the node information storage means 62.

さらに、学習手段５１は、ノード情報記憶手段６２のテスト中間ＡＮＤノードの第１の入力側テストリンクアドレスＣ１（配列１番目のＣ１（１）となる。）に対応する第１の入力側テストリンクの強化信号の累積値Ｄ８を、十分大きな正の値（例えば１０³⁰⁰等）に設定し、累積値Ｄ８に、十分大きな正の値が常に保持されるようにし、第１の入力側テストリンクが削除されないようにする。 Further, the learning means 51 sets the cumulative value D8 of the reinforcement signal of the first input-side test link, which corresponds to the first input-side test link address C1 (the first array element, C1(1)) of the test intermediate AND node in the node information storage means 62, to a sufficiently large positive value (for example, 10³⁰⁰), so that the cumulative value D8 always holds a sufficiently large positive value and the first input-side test link is never deleted.

＜出力ノードの学習処理＞ <Output node learning process>
出力ノードの学習処理は、前述した中間ＯＲノードの学習処理と略同様である。先ず、学習手段５１は、出力ノードに対して付与された強化信号Ｒに基づき、出力ノードの入出力状態に応じて、各入力側リンクの出力ノードの出力Ｙへの寄与度に従って各入力側リンクに強化信号が分配（伝播）されるように、各入力側リンクに対して付与する強化信号Ｒ１（ｋ）（ｋ＝１〜Ｎ）を算定する。また、これと併せて、各入力側リンクの入力側ノードに対して付与する強化信号Ｒ２（ｋ）（ｋ＝１〜Ｎ）を算定する。 The learning process for an output node is substantially the same as the learning process for an intermediate OR node described above. First, based on the reinforcement signal R given to the output node, and according to the input/output state of the output node, the learning means 51 calculates the reinforcement signal R1(k) (k = 1 to N) to be given to each input-side link, so that the reinforcement signal is distributed (propagated) to each input-side link according to its contribution to the output Y of the output node. It also calculates the reinforcement signal R2(k) (k = 1 to N) to be given to the input-side node of each input-side link.

この際、学習手段５１は、前述した中間ＯＲノードの学習処理の場合と全く同じルールで、出力ノードに結合されているＮ本の入力側リンクのうち着目する１本の入力側リンクに対して付与する強化信号Ｒ１（ｋ）（ｋ＝１〜Ｎ）、および着目する１本の入力側リンクの入力側ノードに対して付与する強化信号Ｒ２（ｋ）（ｋ＝１〜Ｎ）を算定する。すなわち、ｋ番目（ｋ＝１〜Ｎ）の入力側リンクが、前述したケース１〜５のいずれに該当するかを判断し、１本１本の入力側リンクについて強化信号Ｒ１（ｋ）を算定するとともに、１本１本の入力側リンクの入力側ノードについて強化信号Ｒ２（ｋ）を算定していく。 At this time, the learning means 51 performs the same rule as the above-described learning process of the intermediate OR node on one input side link of interest among the N input side links coupled to the output node. The reinforcement signal R1 (k) (k = 1 to N) to be given and the reinforcement signal R2 (k) (k = 1 to N) to be given to the input side node of one input side link of interest are calculated. . That is, it is determined whether the k-th (k = 1 to N) input side link corresponds to any of the cases 1 to 5 described above, and the reinforcement signal R1 (k) is calculated for each input side link. At the same time, the reinforcement signal R2 (k) is calculated for the input side node of each input side link.

さらに、学習手段５１は、出力ノードの入力側に結合されたテストリンク（出力ノードのテストリンクアドレスＣ４に対応するテストリンク）に対して付与する強化信号ＲＴを算定する。この際、学習手段５１は、テストリンクが、仮に出力ノードの入力側リンクとして存在していた場合を想定して強化信号を算定する。先ず、テストリンクが入力側リンクとして加わることにより出力Ｙが変化しない場合には、入力Ｘ（ｋ）にテストリンクによる入力ＴＸ、すなわちテストリンクの出力（リンク情報記憶手段６３のテストリンクの出力Ｄ７を読み込んで得られる。）を追加し、前述したケース１〜５の場合分けに従ってその強化信号ＲＴを算定する。次に、テストリンクが入力側リンクとして加わることにより出力Ｙが変化する場合には、入力Ｘ（ｋ）にテストリンクによる入力ＴＸ、すなわちテストリンクの出力Ｄ７を追加し、Ｙへ出力ノードの出力Ｃ９（実際の出力）を反転させた値を代入し、Ｒへ出力ノードの強化信号の合計値Ｃ１０（実際の強化信号の合計値）の符号を変えた−Ｃ１０を代入して、前述したケース１〜５のルールを適用することにより、その強化信号ＲＴを算定する。 Further, the learning means 51 calculates the reinforcement signal RT to be given to the test link coupled to the input side of the output node (the test link corresponding to the test link address C4 of the output node). In doing so, the learning means 51 calculates the reinforcement signal on the assumption that the test link already existed as an input-side link of the output node. First, when adding the test link as an input-side link would not change the output Y, the input TX from the test link, that is, the output of the test link (obtained by reading the output D7 of the test link from the link information storage means 63), is added to the inputs X(k), and the reinforcement signal RT is calculated according to cases 1 to 5 described above. Next, when adding the test link as an input-side link would change the output Y, the input TX from the test link, that is, the output D7 of the test link, is added to the inputs X(k), the inverted value of the output C9 of the output node (the actual output) is substituted for Y, and −C10, obtained by reversing the sign of the total value C10 of the reinforcement signals of the output node (the actual total), is substituted for R; the reinforcement signal RT is then calculated by applying the rules of cases 1 to 5 described above.

また、学習手段５１は、テストリンクについて、リンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値（本実施形態では、一例として０とする。）を下回っているか否かを判断し、下回っている場合には、テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。そして、任意のノードに結合する新たなテストリンクをランダムに生成し、ノード情報記憶手段６２の出力ノードのテストリンクアドレスＣ４に登録する。 The learning means 51 also determines, for the test link, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 is below the threshold (0, as an example, in this embodiment), and deletes the test link if it is. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. A new test link coupled to an arbitrary node is then randomly generated and registered in the test link address C4 of the output node in the node information storage means 62.

さらに、学習手段５１は、テストリンクについて、リンク情報記憶手段６３の当該リンクの強化信号の累積値Ｄ８が閾値を上回っているか否かを判断し、上回っている場合には、テストリンクを実リンクに昇格させて実用化するため、ノード情報記憶手段６２の出力ノードのテストリンクアドレスＣ４と、この出力ノードのアドレスＢ３と、ネットワークアドレスＣ３とを用いて、実リンクを新たに生成し、出力ノードの入力側リンクアドレスＣ１に追加登録する。この際、テストリンクについてのリンク情報記憶手段６３の反転・非反転フラグＤ５がＴｒｕｅ（反転リンクを意味する。）のときには、反転リンクを新たに生成し、Ｆａｌｓｅ（非反転リンクを意味する。）のときには、非反転リンクを新たに生成する。また、これと併せて、テストリンクを削除する。この場合には、後述する図１２のテスト反転リンクの削除処理Ｅ７またはテスト非反転リンクの削除処理Ｅ８を行う。そして、任意のノードに結合する新たなテストリンクをランダムに生成し、ノード情報記憶手段６２の出力ノードのテストリンクアドレスＣ４に登録する。 Further, the learning means 51 determines, for the test link, whether the cumulative value D8 of the reinforcement signal of that link in the link information storage means 63 exceeds the threshold. If it does, the test link is promoted to a real link and put into practical use: a new real link is generated using the test link address C4 of the output node in the node information storage means 62, the address B3 of this output node, and the network address C3, and is additionally registered in the input-side link address C1 of the output node. At this time, an inverted link is generated when the inversion/non-inversion flag D5 of the test link in the link information storage means 63 is True (meaning an inverted link), and a non-inverted link is generated when it is False (meaning a non-inverted link). The test link is deleted at the same time. In this case, the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed. A new test link coupled to an arbitrary node is then randomly generated and registered in the test link address C4 of the output node in the node information storage means 62.

そして、学習手段５１は、出力ノードの入力側リンクアドレスＣ１に登録されている入力側リンクの数が、０になった場合には、ネットワークアドレスＣ３でネットワーク情報記憶手段６１を参照し、入力ノードアドレスＢ１、中間ノードアドレスＢ２、出力ノードアドレスＢ３からランダムに選択したノードアドレスと、当該出力ノードのアドレスと、ネットワークアドレスＣ３とを用いて、反転リンクまたは非反転リンクのいずれかをランダムに選択して新たに実リンクを生成し、生成した実リンクのアドレスを当該出力ノードの入力側リンクアドレスＣ１に加える。この場合には、後述する図１１の反転リンクの初期化処理Ｇ９または非反転リンクの初期化処理Ｇ１０を行う。 Then, when the number of input-side links registered in the input-side link address C1 of the output node becomes 0, the learning means 51 refers to the network information storage means 61 using the network address C3 and generates a new real link, randomly chosen to be either an inverted link or a non-inverted link, using a node address randomly selected from the input node addresses B1, the intermediate node addresses B2, and the output node addresses B3, together with the address of the output node and the network address C3. The address of the generated real link is added to the input-side link address C1 of the output node. In this case, the inverted link initialization process G9 or the non-inverted link initialization process G10 of FIG. 11, described later, is performed.
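
The fallback for an output node that has lost all of its input-side links can be illustrated as below. The tuple representation `(source_address, inverted)` and the function name are hypothetical; the text only states that the source node is chosen at random from the input, intermediate, and output node addresses and that the link type is also chosen at random.

```python
import random


def ensure_input_link(input_links, node_addresses, rng):
    """Return the link list, adding one random real link if the list is empty.

    input_links: list of (source_address, inverted) tuples (C1 of the output node).
    node_addresses: pooled addresses from B1, B2 and B3.
    """
    if input_links:
        return input_links
    source = rng.choice(node_addresses)  # random input, intermediate or output node
    inverted = rng.random() < 0.5        # random inverted / non-inverted link
    return [(source, inverted)]
```

Unlike an intermediate AND node, which is deleted when it has one input or fewer, an output node is never removed in this scheme, so it is always given at least one input.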

それから、学習手段５１は、ノード情報記憶手段６２の出力ノードの強化信号の合計値Ｃ１０をクリアして０にする。 Then, the learning unit 51 clears the total value C10 of the reinforcement signal of the output node of the node information storage unit 62 to zero.

＜反転リンクの学習処理＞ <Inverted link learning process>
反転リンクの学習処理は、後述する非反転リンクの学習処理と等しいので、説明を省略する。 The learning process for an inverted link is the same as the learning process for a non-inverted link described later, so its description is omitted.

＜非反転リンクの学習処理＞ <Non-inverted link learning process>
図１０には、学習対象となる非反転リンク（実リンク）１２０の一例が示されている。非反転リンク１２０の入力側には、入力側ノード１２１が結合され、出力側には、出力側ノード１２２が結合されている。また、非反転リンク１２０に付随してテストノード１２３（図示の例では、テスト中間ＡＮＤノードであるが、テスト中間ＯＲノードでもよい。）が設けられている。このテストノード１２３の入力側には、第１および第２の入力側テストリンク１２４，１２５が結合され、出力側には、出力側テストリンク１２６が結合されている。但し、出力側テストリンク１２６は、本実施形態では、実質的な情報伝達を行わないので、二点鎖線で示されている。そして、第１の入力側テストリンク１２４は、非反転リンク１２０の入力側ノード１２１に結合され、第２の入力側テストリンク１２５は、任意のノード１２７にランダムに結合され、出力側テストリンク１２６は、非反転リンク１２０の出力側ノード１２２に結合されている。 FIG. 10 shows an example of a non-inverted link (real link) 120 to be learned. An input-side node 121 is coupled to the input side of the non-inverted link 120, and an output-side node 122 is coupled to its output side. A test node 123 (a test intermediate AND node in the illustrated example, although it may be a test intermediate OR node) is provided in association with the non-inverted link 120. First and second input-side test links 124 and 125 are coupled to the input side of the test node 123, and an output-side test link 126 is coupled to its output side. The output-side test link 126 does not substantially transmit information in this embodiment, so it is drawn as a two-dot chain line. The first input-side test link 124 is coupled to the input-side node 121 of the non-inverted link 120, the second input-side test link 125 is randomly coupled to an arbitrary node 127, and the output-side test link 126 is coupled to the output-side node 122 of the non-inverted link 120.

ここで、非反転リンク１２０の出力をＹとし、テストノード１２３の出力をＴＹとし、非反転リンク１２０に対して付与される強化信号をＲ１とし、非反転リンク１２０の入力側ノード１２１に対して付与される強化信号をＲ２とし、テストノード１２３に対して付与される強化信号をＲＴとする。 Here, the output of the non-inverted link 120 is set to Y, the output of the test node 123 is set to TY, the enhancement signal given to the non-inverted link 120 is set to R1, and the input side node 121 of the non-inverted link 120 is set to The enhancement signal to be given is R2, and the enhancement signal given to the test node 123 is RT.

図６において、先ず、学習手段５１は、伝播元の非反転リンク１２０に対して付与された強化信号Ｒ１に基づき、伝播元の非反転リンク１２０の出力Ｙおよび伝播先のテストノード１２３の出力ＴＹの状態に応じて、伝播先のテストノード１２３に対して付与する強化信号ＲＴを算定する（ステップＳ５０６０１）。 In FIG. 6, first, the learning means 51, based on the reinforcement signal R1 given to the non-inverted link 120 of the propagation source, the output Y of the non-inverted link 120 of the propagation source and the output TY of the test node 123 of the propagation destination. The reinforcement signal RT to be given to the propagation destination test node 123 is calculated in accordance with the state (step S50601).

この際、伝播元の非反転リンク１２０に対して付与された強化信号Ｒ１は、リンク情報記憶手段６３の非反転リンク１２０の強化信号Ｄ９を読み込んで得られる。また、伝播元の非反転リンク１２０の出力Ｙは、リンク情報記憶手段６３の非反転リンク１２０の出力Ｄ７を読み込んで得られる。さらに、伝播先のテストノード１２３の出力ＴＹは、リンク情報記憶手段６３の非反転リンク１２０のテストノードアドレスＤ４を参照し、そのテストノードアドレスＤ４に対応するテストノード１２３についてのノード情報記憶手段６２のノードの出力Ｃ９を読み込んで得られる。 At this time, the enhancement signal R1 given to the non-inverted link 120 of the propagation source is obtained by reading the enhanced signal D9 of the non-inverted link 120 of the link information storage unit 63. Further, the output Y of the non-inverted link 120 of the propagation source is obtained by reading the output D7 of the non-inverted link 120 of the link information storage means 63. Further, the output TY of the propagation destination test node 123 refers to the test node address D4 of the non-inverted link 120 of the link information storage means 63, and the node information storage means 62 for the test node 123 corresponding to the test node address D4. Obtained by reading the output C9 of the node.

そして、学習手段５１は、次のようなルールで、伝播先のテストノード１２３に対して付与する強化信号ＲＴを算定する。 Then, the learning means 51 calculates the reinforcement signal RT to be given to the propagation test node 123 according to the following rule.

ケース１：（Ｒ１＞０）∧（ＴＹ＝Ｙ）の場合には、ＲＴ＝０とする。この場合は、ＴＹ＝Ｙであるので、非反転リンク１２０が存在していれば、用が足りるため、テストノード１２３は、必要ないからである。 Case 1: When (R1 > 0) ∧ (TY = Y), RT = 0. This is because TY = Y, so the existing non-inverted link 120 is sufficient and the test node 123 is not needed.

ケース２：（Ｒ１＞０）∧（ＴＹ≠Ｙ）の場合には、テストノード１２３を削除し、新たにテストノードを生成し（生成するテストノードの第２の入力側テストリンクを任意のノードにランダムに結合する。）、リンク情報記憶手段６３の非反転リンク１２０のテストノードアドレスＤ４に登録する。この際、非反転リンク１２０の出力側ノードアドレスＤ２に対応する出力側ノード１２２のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）のときには、テスト中間ＯＲノードを生成し、Ｆａｌｓｅ（ＯＲノードを意味する。）のときには、テスト中間ＡＮＤノードを生成する。この場合は、Ｒ１＞０であり、非反転リンク１２０が良い働きをしているのに対し、ＴＹ≠Ｙであり、テストノード１２３が非反転リンク１２０と異なる出力を行っているため、テストノード１２３が悪い働きをすると考えられるからである。 Case 2: When (R1 > 0) ∧ (TY ≠ Y), the test node 123 is deleted, and a new test node is generated (the second input-side test link of the generated test node is randomly coupled to an arbitrary node) and registered in the test node address D4 of the non-inverted link 120 in the link information storage means 63. At this time, a test intermediate OR node is generated when the AND/OR node flag C5 of the output-side node 122 corresponding to the output-side node address D2 of the non-inverted link 120 is True (meaning an AND node), and a test intermediate AND node is generated when it is False (meaning an OR node). This is because R1 > 0 means the non-inverted link 120 is working well, whereas TY ≠ Y means the test node 123 produces a different output from the non-inverted link 120, so the test node 123 is considered to be working badly.

ケース３：（Ｒ１≦０）∧（ＴＹ＝Ｙ）の場合には、ＲＴ＝Ｒ１とする。この場合は、Ｒ１≦０であり、非反転リンク１２０が悪い働きをしているのに対し、ＴＹ＝Ｙであり、テストノード１２３も非反転リンク１２０と同じ出力を行っているので、テストノード１２３に対しても非反転リンク１２０の場合と同様に強化信号として罰を与える。 Case 3: When (R1 ≦ 0) ∧ (TY = Y), RT = R1. In this case, R1 ≦ 0 and the non-inverted link 120 works badly, whereas TY = Y and the test node 123 outputs the same output as the non-inverted link 120. Similarly to the non-inverted link 120, 123 is given a penalty as an enhancement signal.

ケース４：（Ｒ１≦０）∧（ＴＹ≠Ｙ）の場合には、ＲＴ＝−Ｒ１とする。この場合は、Ｒ１≦０であり、非反転リンク１２０が悪い働きをしているのに対し、ＴＹ≠Ｙであり、テストノード１２３は、非反転リンク１２０と異なる出力を行っているので、テストノード１２３に対しては、非反転リンク１２０の場合とは異なり、強化信号として報酬を与える。 Case 4: When (R1 ≦ 0) ∧ (TY ≠ Y), RT = −R1. In this case, R1 ≦ 0 and the non-inverted link 120 is working badly, whereas TY ≠ Y, and the test node 123 outputs different from the non-inverted link 120. Unlike the non-inverted link 120, the node 123 is rewarded as an enhancement signal.

そして、以上のようにして算定したテストノード１２３に対して付与する強化信号ＲＴを、ノード情報記憶手段６２の当該ノードの強化信号の合計値Ｃ１０に加算する（図６のステップＳ５０６０２）。 Then, the enhancement signal RT to be given to the test node 123 calculated as described above is added to the total value C10 of the enhancement signals of the node in the node information storage means 62 (step S50602 in FIG. 6).

続いて、学習手段５１は、ノード情報記憶手段６２のテストノード１２３のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）のときには、テストノード１２３について前述したテスト中間ＡＮＤノードの学習処理を行い、Ｆａｌｓｅ（ＯＲノードを意味する。）のときには、前述したテスト中間ＯＲノードの学習処理を行う（ステップＳ５０６０３）。 Subsequently, when the AND / OR node flag C5 of the test node 123 of the node information storage unit 62 is True (meaning an AND node), the learning unit 51 learns the test intermediate AND node described above for the test node 123. Processing is performed, and if it is False (which means an OR node), the above-described test intermediate OR node learning processing is performed (step S50603).

その後、学習手段５１は、テストノード１２３の第１および第２の入力側テストリンク１２４，１２５の双方について、リンク情報記憶手段６３のこれらのリンクの強化信号の累積値Ｄ８が閾値を上回っているか否かを判断し、いずれも閾値を上回っている場合には、テストノード１２３を実ノードに昇格させて実用化するため、テストノード１２３のアドレスＤ４と、学習対象となっている非反転リンク１２０のアドレスと、ネットワークアドレスＤ３とを用いて、実ノードを新たに生成し、ネットワークアドレスＤ３を参照してネットワーク情報記憶手段６１の中間ノードアドレスＢ２に追加登録する（ステップＳ５０６０４）。この際、ノード情報記憶手段６２のテストノード１２３のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）のときには、中間ＡＮＤノードを生成し、Ｆａｌｓｅ（ＯＲノードを意味する。）のときには、中間ＯＲノードを生成する。また、これと併せて、テストノード１２３を削除し、学習対象となっている非反転リンク１２０も削除する。 Thereafter, the learning means 51 determines whether the cumulative value D8 of the reinforcement signals of these links in the link information storage means 63 exceeds the threshold for both the first and second input side test links 124 and 125 of the test node 123. If both are higher than the threshold value, the test node 123 is promoted to a real node and put into practical use. Therefore, the address D4 of the test node 123 and the non-inverted link 120 to be learned are used. And the network address D3, a real node is newly generated, and is additionally registered in the intermediate node address B2 of the network information storage unit 61 with reference to the network address D3 (step S50604). At this time, when the AND / OR node flag C5 of the test node 123 of the node information storage means 62 is True (meaning an AND node), an intermediate AND node is generated and False (meaning an OR node) is generated. Sometimes an intermediate OR node is generated. At the same time, the test node 123 is deleted, and the non-inverted link 120 that is the learning target is also deleted.
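
The promotion step can be pictured as a small graph rewrite. The sketch below is an interpretation under stated assumptions: edges are plain `(source, destination)` pairs, `new_node` is a hypothetical address for the promoted node, and the wiring of the new real node (inputs from the two test-link sources, output to the original link's output-side node) is inferred from the FIG. 10 description rather than stated explicitly in this passage.

```python
def promote_test_node(edges, link, first_src, second_src,
                      d8_first, d8_second, new_node, threshold=0.0):
    """Replace a real link by a real node when both test links exceed the threshold.

    edges: set of (source, destination) pairs for the real links.
    link:  the real link being learned, as (source, destination).
    Returns (new_edges, promoted).
    """
    if not (d8_first > threshold and d8_second > threshold):
        return set(edges), False
    _, dst = link
    rewired = set(edges) - {link}         # the learned link is deleted
    rewired |= {(first_src, new_node),    # former first input-side test link
                (second_src, new_node),   # former second input-side test link
                (new_node, dst)}          # former output-side test link
    return rewired, True
```

Using the reference numerals of FIG. 10 as addresses, promoting the test node on link 121→122 with its second test link attached to node 127 would leave the edges 121→N, 127→N, and N→122, where N is the new intermediate node.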

＜テスト反転リンクの学習処理＞ <Test inverted link learning process>
テスト反転リンクは、学習を行わない。 A test inverted link does not learn.

＜テスト非反転リンクの学習処理＞ <Test non-inverted link learning process>
テスト非反転リンクは、学習を行わない。 A test non-inverted link does not learn.

図１１には、初期化の構成が示されている。図１１において、ロボット初期化処理Ｇ１、ネットワーク初期化処理Ｇ２、入力ノード初期化処理Ｇ３、および出力ノード初期化処理Ｇ４は、プログラムを立ち上げ、ロボット３０の動作制御を開始した直後にのみ行うが、その他のノードやリンクの初期化処理Ｇ５〜Ｇ１２は、ロボット３０の動作制御を開始した直後のみならず、その後の学習でノードやリンクが生成される都度に行う。また、初期化の方法は、ノードの種類やリンクの種類によって異なり、状況に応じて複数の初期化を使い分ける場合も存在する。さらに、初期化処理内で別の初期化処理を行う必要が生じる場合もあり、それぞれの初期化は関連しあっている。そして、これらの初期化の関係が図１１に示されている。図１１において、矢印の付け根の初期化を行うには、矢印の先端の初期化が必要である。図中の実線は、必ず使用し、点線は、使用する可能性があることを意味する。なお、図中の一点鎖線は、昇格によりテストノードやテストリンクから実ノードや実リンクになる場合を示している。 FIG. 11 shows an initialization configuration. In FIG. 11, the robot initialization process G1, the network initialization process G2, the input node initialization process G3, and the output node initialization process G4 are performed only immediately after starting the program and starting the operation control of the robot 30. The other node and link initialization processes G5 to G12 are performed not only immediately after the operation control of the robot 30 is started but also whenever a node or link is generated in the subsequent learning. Also, the initialization method differs depending on the type of node and the type of link, and there are cases where a plurality of initializations are used properly depending on the situation. Further, it may be necessary to perform another initialization process within the initialization process, and the respective initializations are related. These initialization relationships are shown in FIG. In FIG. 11, in order to initialize the root of the arrow, it is necessary to initialize the tip of the arrow. The solid line in the figure is always used, and the dotted line means that it may be used. Note that a one-dot chain line in the figure indicates a case where a test node or test link is changed to a real node or real link by promotion.

＜ロボット初期化処理Ｇ１＞ <Robot initialization process G1>
ロボット初期化処理Ｇ１では、ロボット情報記憶手段６０の入力配列Ａ１および出力配列Ａ２は、初期化の必要はない。ネットワークアドレスＡ３については、本実施形態では、一例として、入力ノード数１２８、出力ノード数３２で初期化し、得られたネットワークアドレスを登録する。Ａ４，Ａ５，Ａ６は、０とする。 In the robot initialization process G1, the input array A1 and the output array A2 of the robot information storage means 60 do not need to be initialized. For the network address A3, in this embodiment, as an example, a network is initialized with 128 input nodes and 32 output nodes, and the resulting network address is registered. A4, A5, and A6 are set to 0.

<Network initialization process G2>
The network initialization process G2 initializes the information stored in the network information storage means 61. The network 20 is initialized by specifying the number of input nodes 21 and the number of output nodes 23. For the input node address B1, the input node initialization process G3 is performed once per specified input node, passing the address of the network 20 being initialized, and the resulting addresses of the input nodes 21 are registered in order. Intermediate nodes 22 are registered in the intermediate node address B2 each time one is generated, so B2 does not need to be initialized. For the output node address B3, the output node initialization process G4 is likewise performed once per specified output node, passing the address of the network 20 being initialized, and the resulting addresses of the output nodes 23 are registered in order. The reinforcement signal B4 for the network 20 is set to 0.
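The order of operations in G2 can be illustrated as follows. This is a sketch under stated assumptions, not the embodiment's implementation: `Network`, `Node`, and `init_network` are hypothetical names, Python lists stand in for the address arrays B1 to B3, and the node constructors are simplified stand-ins for the processes G3 and G4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    network: "Network" = field(repr=False)  # C3: owning network
    is_input: bool = False                   # C6
    is_output: bool = False                  # C7


@dataclass(eq=False)
class Network:
    input_nodes: List[Node] = field(default_factory=list)         # B1
    intermediate_nodes: List[Node] = field(default_factory=list)  # B2: filled later
    output_nodes: List[Node] = field(default_factory=list)        # B3
    reinforcement: float = 0.0                                    # B4


def init_network(n_inputs: int, n_outputs: int) -> Network:
    """Sketch of G2: create and register input nodes, then output nodes."""
    net = Network()
    for _ in range(n_inputs):
        # Simplified stand-in for the input node initialization G3.
        net.input_nodes.append(Node(network=net, is_input=True))
    for _ in range(n_outputs):
        # Simplified stand-in for the output node initialization G4.
        net.output_nodes.append(Node(network=net, is_output=True))
    return net


net = init_network(128, 32)
print(len(net.input_nodes), len(net.intermediate_nodes), len(net.output_nodes))
```

The intermediate node list starts empty, matching the statement that B2 needs no initialization.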

<Input node initialization process G3>
An input node 21 is initialized by specifying the address of the network 20 to which it belongs (the network address C3 stored in the node information storage means 62). Because the input node 21 is a dummy node, the input-side link address C1 does not need to be initialized. Output-side links are registered in the output-side link address C2 each time a link coupled to the output side of the input node 21 is generated, so C2 does not need to be initialized either. The network address C3 is overwritten with the specified address of the network 20. Because the input node 21 is a dummy node, the test link address C4 and the AND/OR node flag C5 do not need to be initialized. Since the node is an input node 21, the input node flag C6 is set to True, and the output node flag C7 and the test node flag C8 are either left uninitialized or set to False. The node output C9 is set by the input conversion means 52 (see step S507 in FIG. 4) and therefore does not need to be initialized. The total reinforcement signal C10 is set to 0.
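The fields C1 to C10 and the few that G3 actually sets can be summarized in a short sketch. The class `Node`, its field names, and `init_input_node` are hypothetical; `None` is used here as an assumed way to represent "does not need to be initialized".

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical record mirroring fields C1 to C10 of storage means 62."""
    network: Any                                            # C3
    input_links: List[Any] = field(default_factory=list)   # C1
    output_links: List[Any] = field(default_factory=list)  # C2
    test_link: Optional[Any] = None                         # C4
    is_and: Optional[bool] = None                           # C5 (True = AND, False = OR)
    is_input: bool = False                                  # C6
    is_output: bool = False                                 # C7
    is_test: bool = False                                   # C8
    output: Optional[bool] = None                           # C9
    reinforcement_total: float = 0.0                        # C10


def init_input_node(network: Any) -> Node:
    """Sketch of G3: only C3, C6 and C10 are set; the rest keep defaults."""
    return Node(network=network, is_input=True, reinforcement_total=0.0)


node = init_input_node("network-20")  # placeholder network handle
print(node.is_input, node.is_output, node.is_test, node.output)
```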

<Output node initialization process G4>
An output node 23 is initialized by specifying the address of the network 20 to which it belongs (the network address C3 stored in the node information storage means 62). For the input-side link address C1, a node address is selected at random from the input node address B1 and the output node address B3 of the network information storage means 61 referenced by the specified network address C3. (The intermediate node address B2 holds no data at this point and is therefore not a candidate.) Using the selected node address, the address of the output node 23 being initialized, and the specified network address C3, a real link 141 coupled to the randomly selected node 140 is newly generated as shown in FIG. 13, with its type chosen at random between an inverted link and a non-inverted link (the inverted link initialization process G9 or the non-inverted link initialization process G10 in FIG. 11 is performed). The address of the generated real link 141 is added to the input-side link address C1. At this time, a test node 142 attached to the real link 141 is also newly generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 in FIG. 11 is performed), together with a first input-side test link 143 coupled to the node 140, an output-side test link 144 coupled to the output node 23 being initialized, and a second input-side test link 146 coupled to a randomly chosen node 145 (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed).

The output-side link address C2 does not need to be initialized. The network address C3 is overwritten with the specified address of the network 20.

For the test link address C4, a node address is selected at random from the input node address B1 and the output node address B3 of the network information storage means 61 referenced by the specified network address C3. (The intermediate node address B2 holds no data at this point and is therefore not a candidate.) Using the selected node address, the address of the output node 23 being initialized, and the specified network address C3, a test link 148 coupled to the randomly selected node 147 is newly generated as shown in FIG. 13, with its type chosen at random between a test inverted link and a test non-inverted link (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The address of the generated test link 148 is registered in the test link address C4.

In this embodiment the output node 23 is an OR node, so the AND/OR node flag C5 is set to False (meaning an OR node). Since the node is an output node 23, the input node flag C6 is set to False, the output node flag C7 to True, and the test node flag C8 to False. The node output C9 is set to False, and the total reinforcement signal C10 is set to 0.
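The random wiring performed in the output node initialization process G4 can be sketched as follows. All class and function names are hypothetical, and the sketch is deliberately partial: it creates one randomly typed real input link and one randomly typed test link for the new output node, but it omits the test node 142 that accompanies the real link and that node's own test links. A seeded `random.Random` is used only to make the example repeatable.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Link:
    src: "Node"      # D1: input-side node
    dst: "Node"      # D2: output-side node
    inverted: bool   # True for an inverted link
    is_test: bool    # True for a test link


@dataclass(eq=False)
class Node:
    is_and: bool = False                                    # C5 (False = OR)
    is_input: bool = False                                  # C6
    is_output: bool = False                                 # C7
    is_test: bool = False                                   # C8
    output: bool = False                                    # C9
    reinforcement_total: float = 0.0                        # C10
    input_links: List[Link] = field(default_factory=list)  # C1
    test_link: Optional[Link] = None                        # C4


def init_output_node(input_nodes: List[Node], output_nodes: List[Node],
                     rng: random.Random) -> Node:
    """Partial sketch of G4 (the real link's attached test node is omitted)."""
    node = Node(is_and=False, is_output=True)
    # Intermediate nodes do not exist yet, so only B1 and B3 are candidates.
    candidates = input_nodes + output_nodes
    src = rng.choice(candidates)
    node.input_links.append(Link(src, node, rng.random() < 0.5, is_test=False))
    test_src = rng.choice(candidates)
    node.test_link = Link(test_src, node, rng.random() < 0.5, is_test=True)
    return node


rng = random.Random(0)
inputs = [Node(is_input=True) for _ in range(4)]
outputs: List[Node] = []
for _ in range(2):
    # Register each output node only after it has been initialized.
    outputs.append(init_output_node(inputs, outputs, rng))
```

Because a node is registered only after its initialization returns, the first output node can draw its link sources from the input nodes alone.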

<Intermediate OR node initialization process G5>
The intermediate OR node initialization process G5 is performed by specifying, and referring to, the real link that is to be deleted (any link other than a test link). This is because the process is used when one real link is deleted from the network 20 and the test node attached to that real link (the test node corresponding to the test node address D4 of that real link in the link information storage means 63) is promoted to a real node.

For the input-side link address C1, as shown in FIG. 14, the process uses the addresses C1 (C1(1), C1(2)) of the first and second input-side test links 162 and 163 of the test intermediate OR node 161 attached to the real link 160 specified for deletion (the test node corresponding to the test node address D4 of the real link 160 in the link information storage means 63), together with the address of the intermediate OR node 180 being generated and initialized (the address of the memory area about to be allocated). If the first input-side test link 162 is a test inverted link, an inverted link (real link) is newly initialized and generated; if it is a test non-inverted link, a non-inverted link (real link) is newly initialized and generated (the inverted link initialization process G9 or the non-inverted link initialization process G10 in FIG. 11 is performed). To register the generated real link as the input-side link 181 of the intermediate OR node 180, the address of the input-side link 181 is registered in the input-side link address C1.
Similarly, if the second input-side test link 163 is a test inverted link, an inverted link (real link) is newly initialized and generated; if it is a test non-inverted link, a non-inverted link (real link) is newly initialized and generated (the inverted link initialization process G9 or the non-inverted link initialization process G10 in FIG. 11 is performed). To register this real link as the input-side link 182 of the intermediate OR node 180, the address of the input-side link 182 is registered in the input-side link address C1. In other words, the input-side link 181 takes the same inverted or non-inverted type as the first input-side test link 162, and the input-side link 182 takes the same type as the second input-side test link 163. The input-side node of the input-side link 181 is the node 164 that was coupled to the input side of the first input-side test link 162 (that is, the input-side node of the real link 160 being deleted), and the input-side node of the input-side link 182 is the node 165 that was coupled to the input side of the second input-side test link 163. Although not illustrated, each of the newly generated input-side links 181 and 182 is provided with its own attached test node (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 in FIG. 11 is performed).

Next, the cumulative reinforcement signal D8 of the first input-side link 181 of the intermediate OR node 180 (the input-side link corresponding to the input-side link address C1(1), which is stored at the head of the input-side link address array C1) is initialized by overwriting it with the cumulative reinforcement signal D8 of the real link 160 specified for deletion. As explained later for the test intermediate OR node initialization process G7 and the test intermediate AND node initialization process G8, the real link 160 being deleted and the first input-side test link 162 of the test intermediate OR node 161 have the same inverted or non-inverted type. The first input-side link 181 therefore has the same type as the real link 160, and it inherits the reinforcement signal of the deleted real link 160.

For the output-side link address C2, as shown in FIG. 14, a non-inverted link (real link) is newly initialized and generated (the non-inverted link initialization process G10 in FIG. 11 is performed) using the address of the intermediate OR node 180 being generated and initialized, the output-side node address D2 of the real link 160 specified for deletion, and the network address D3 of that real link 160. To register the generated real link as the output-side link 183 of the intermediate OR node 180, the address of the output-side link 183 is registered in the output-side link address C2. The output-side node of the output-side link 183 is the node 167 that was coupled to the output side of the output-side test link 166 of the test intermediate OR node 161 (that is, the output-side node of the real link 160 being deleted). Although not illustrated, the newly generated output-side link 183 is provided with its own attached test node (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 in FIG. 11 is performed).

The cumulative reinforcement signal D8 of the output-side link 183, the newly generated non-inverted link (real link), is also initialized by overwriting it with the cumulative reinforcement signal D8 of the real link 160 being deleted.

The network address C3 is overwritten with the network address D3 of the real link 160 specified for deletion.

For the test link address C4, one node address is selected at random from the input node address B1, the intermediate node address B2, and the output node address B3 of the network information storage means 61 referenced by the network address C3. Using the selected node address, the address of the intermediate OR node 180 being generated and initialized, and the network address D3, a test link 185 coupled to the randomly selected node 184 is newly generated as shown in FIG. 14, with its type chosen at random between a test inverted link and a test non-inverted link (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The address of the generated test link 185 is registered in the test link address C4.

Since the node being initialized is the intermediate OR node 180, the AND/OR node flag C5 is set to False (meaning an OR node), and the input node flag C6, the output node flag C7, and the test node flag C8 are all set to False. The node output C9 is set to False, and the total reinforcement signal C10 is set to 0.
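The promotion performed by G5 replaces one real link with a new intermediate OR node whose wiring copies that of the attached test node. The sketch below illustrates only that rewiring and the inheritance of the cumulative reinforcement signal D8. The names `Node`, `Link`, and `promote_to_or_node` are hypothetical, and the test link C4 of the new node and the test nodes attached to the new links are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    is_and: bool = False                                      # C5 (False = OR)
    is_test: bool = False                                     # C8
    input_links: List["Link"] = field(default_factory=list)  # C1
    output_links: List["Link"] = field(default_factory=list) # C2


@dataclass(eq=False)
class Link:
    src: Node                        # D1
    dst: Node                        # D2
    inverted: bool
    is_test: bool = False
    cumulative: float = 0.0          # D8: cumulative reinforcement signal
    test_node: Optional[Node] = None # D4


def promote_to_or_node(deleted: Link) -> Node:
    """Partial sketch of G5 for the real link that is about to be deleted."""
    test_node = deleted.test_node
    assert test_node is not None, "a real link always carries a test node"
    new_node = Node(is_and=False, is_test=False)
    # Each real input link copies the source and type of one test link.
    for test_link in test_node.input_links[:2]:
        new_node.input_links.append(
            Link(test_link.src, new_node, test_link.inverted))
    # The first input link and the output link inherit the deleted link's D8.
    new_node.input_links[0].cumulative = deleted.cumulative
    new_node.output_links.append(
        Link(new_node, deleted.dst, inverted=False, cumulative=deleted.cumulative))
    return new_node


a, b, c = Node(), Node(), Node()
real = Link(a, c, inverted=True, cumulative=2.5)
tn = Node(is_test=True)
tn.input_links = [Link(a, tn, True, is_test=True), Link(b, tn, False, is_test=True)]
real.test_node = tn
new_node = promote_to_or_node(real)
```

In the example, the first new input link is inverted and starts from node `a`, just like the deleted link, which is why it can take over that link's accumulated reinforcement.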

<Intermediate AND node initialization process G6>
The intermediate AND node initialization process G6 is substantially the same as the intermediate OR node initialization process G5 described above. That is, G6 is performed by specifying, and referring to, the real link that is to be deleted (any link other than a test link). This is because the process is used when one real link is deleted from the network 20 and the test node attached to that real link (the test node corresponding to the test node address D4 of that real link in the link information storage means 63) is promoted to a real node.

For the input-side link address C1, the process uses the addresses C1 (C1(1), C1(2)) of the first and second input-side test links of the test intermediate AND node attached to the real link specified for deletion (the test node corresponding to the test node address D4 of that real link in the link information storage means 63), together with the address of the intermediate AND node being generated and initialized (the address of the memory area about to be allocated). If the first input-side test link is a test inverted link, an inverted link (real link) is newly initialized and generated; if it is a test non-inverted link, a non-inverted link (real link) is newly initialized and generated (the inverted link initialization process G9 or the non-inverted link initialization process G10 in FIG. 11 is performed). To register the generated real link as an input-side link of the intermediate AND node, its address is registered in the input-side link address C1.
Similarly, if the second input-side test link is a test inverted link, an inverted link (real link) is newly initialized and generated; if it is a test non-inverted link, a non-inverted link (real link) is newly initialized and generated (the inverted link initialization process G9 or the non-inverted link initialization process G10 in FIG. 11 is performed). To register this real link as an input-side link of the intermediate AND node, its address is registered in the input-side link address C1.

Next, the cumulative reinforcement signal D8 of the first input-side link of the intermediate AND node (the input-side link corresponding to the input-side link address C1(1), which is stored at the head of the input-side link address array C1) is initialized by overwriting it with the cumulative reinforcement signal D8 of the real link specified for deletion.

For the output-side link address C2, a non-inverted link (real link) is newly initialized and generated (the non-inverted link initialization process G10 in FIG. 11 is performed) using the address of the intermediate AND node being generated and initialized, the output-side node address D2 of the real link specified for deletion, and the network address D3 of that real link. To register the generated real link as the output-side link of the intermediate AND node, its address is registered in the output-side link address C2. The cumulative reinforcement signal D8 of this output-side link, the newly generated non-inverted link (real link), is initialized by overwriting it with the cumulative reinforcement signal D8 of the real link being deleted.

The network address C3 is overwritten with the network address D3 of the real link specified for deletion.

For the test link address C4, one node address is selected at random from the input node address B1, the intermediate node address B2, and the output node address B3 of the network information storage means 61 referenced by the network address C3. Using the selected node address, the address of the intermediate AND node being generated and initialized, and the network address D3, a test link coupled to the randomly selected node is newly generated, with its type chosen at random between a test inverted link and a test non-inverted link (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The address of the generated test link is registered in the test link address C4.

Since the node being initialized is an intermediate AND node, the AND/OR node flag C5 is set to True (meaning an AND node), and the input node flag C6, the output node flag C7, and the test node flag C8 are all set to False. The node output C9 is set to False, and the total reinforcement signal C10 is set to 0.

<Test intermediate OR node initialization process G7>
The test intermediate OR node initialization process G7 is performed by specifying a real link and the network address D3 of that real link. This is because a test intermediate OR node is always attached to exactly one real link (it is registered in the test node address D4 of that real link).

For the input-side test link address C1, as shown in FIG. 15, a test link is newly initialized and generated using the input-side node address D1 of the specified real link 200, the address of the test intermediate OR node 201 being generated, and the network address D3 of the specified real link 200. If the specified real link 200 is an inverted link, a test inverted link is generated; if it is a non-inverted link, a test non-inverted link is generated (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The generated link becomes the first input-side test link 202, and its address is registered as the first input-side test link address C1(1). The cumulative reinforcement signal D8 of the first input-side test link 202 is then overwritten with a sufficiently large positive value (for example, 10^300). This prevents the first input-side test link 202 from being deleted.

In addition, for the input-side test link address C1, one node address is selected at random from the input node address B1, the intermediate node address B2, and the output node address B3 of the network information storage means 61 referenced by the network address D3 of the specified real link 200. Using the selected node address, the address of the test intermediate OR node 201 being generated and initialized, and the network address D3 of the specified real link 200, a second input-side test link 204 coupled to the randomly selected node 203 is newly generated as shown in FIG. 15, with its type chosen at random between a test inverted link and a test non-inverted link (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The address of the generated second input-side test link 204 is registered as the second input-side test link address C1(2).

The output-side test link address C2 does not need to be initialized, because the output-side test link 205 of the test intermediate OR node 201 neither accumulates reinforcement signals nor transmits information. Accordingly, in FIG. 15 the output-side test link 205 is drawn with a one-dot chain line.

The network address C3 is overwritten with the network address D3 of the specified real link 200. A test node has no test link to register in the test link address C4, so C4 does not need to be initialized.

Since the node being initialized is a test intermediate OR node, the AND/OR node flag C5 is set to False (meaning an OR node), the input node flag C6 and the output node flag C7 are set to False, and the test node flag C8 is set to True. The node output C9 is set to False, and the total reinforcement signal C10 is set to 0.
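The two input-side test links created by G7 can be sketched as follows. The names are hypothetical, the constant `PROTECTED` is an assumed stand-in for the "sufficiently large positive value" (the text gives 10^300 as an example, which fits in a double-precision float), and the output-side test link is not modeled because it carries neither reinforcement nor information.

```python
import random
from dataclasses import dataclass, field
from typing import List

PROTECTED = 1e300  # example value from the text; shields the link from deletion


@dataclass(eq=False)
class Node:
    is_and: bool = False                                      # C5 (False = OR)
    is_test: bool = False                                     # C8
    output: bool = False                                      # C9
    reinforcement_total: float = 0.0                          # C10
    input_links: List["Link"] = field(default_factory=list)  # C1


@dataclass(eq=False)
class Link:
    src: Node                # D1
    dst: Node                # D2
    inverted: bool
    is_test: bool = False
    cumulative: float = 0.0  # D8


def init_test_or_node(real_link: Link, all_nodes: List[Node],
                      rng: random.Random) -> Node:
    """Sketch of G7 for the real link that the test node is attached to."""
    test_node = Node(is_and=False, is_test=True)
    # First test link: same source and same type as the real link.
    first = Link(real_link.src, test_node, real_link.inverted,
                 is_test=True, cumulative=PROTECTED)
    # Second test link: random source node and random type.
    second = Link(rng.choice(all_nodes), test_node, rng.random() < 0.5,
                  is_test=True)
    test_node.input_links = [first, second]
    return test_node


rng = random.Random(1)
nodes = [Node() for _ in range(5)]
real = Link(nodes[0], nodes[4], inverted=True)
test_node = init_test_or_node(real, nodes, rng)
```

The first test link therefore always mirrors the real link, while the second explores one additional, randomly chosen input.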

<Test intermediate AND node initialization process G8>
The test intermediate AND node initialization process G8 is substantially the same as the test intermediate OR node initialization process G7 described above. That is, G8 is performed by specifying a real link and the network address D3 of that real link. This is because a test intermediate AND node is always attached to exactly one real link (it is registered in the test node address D4 of that real link).

For the input-side test link address C1, a test link is newly initialized and generated using the input-side node address D1 of the specified real link, the address of the test intermediate AND node being generated, and the network address D3 of the specified real link. If the specified real link is an inverted link, a test inverted link is generated; if it is a non-inverted link, a test non-inverted link is generated (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The generated link becomes the first input-side test link, and its address is registered as the first input-side test link address C1(1). The cumulative reinforcement signal D8 of the first input-side test link is then overwritten with a sufficiently large positive value (for example, 10^300). This prevents the first input-side test link from being deleted.

In addition, for the input-side test link address C1, one node address is selected at random from the input node address B1, the intermediate node address B2, and the output node address B3 of the network information storage means 61 referenced by the network address D3 of the specified real link. Using the selected node address, the address of the test intermediate AND node being generated and initialized, and the network address D3 of the specified real link, a second input-side test link coupled to the randomly selected node is newly generated, with its type chosen at random between a test inverted link and a test non-inverted link (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 in FIG. 11 is performed). The address of the generated second input-side test link is registered as the second input-side test link address C1(2).

The output-side test link address C2 does not need to be initialized, because the output-side test link of a test intermediate AND node neither accumulates reinforcement signals nor transmits information.

The network address C3 is overwritten with the network address D3 of the specified real link. A test node has no test link to register in the test link address C4, so C4 does not need to be initialized.

Since the node being initialized is a test intermediate AND node, the AND/OR node flag C5 is set to True (meaning an AND node), the input node flag C6 and the output node flag C7 are set to False, and the test node flag C8 is set to True. The node output C9 is set to True, and the total reinforcement signal C10 is set to 0.

<Inverted link initialization process G9>
The inverted link initialization process G9 covers two cases. In one, a test inverted link is promoted. In the other, an inverted link is generated directly, with no test inverted link as its origin. The latter occurs either when a link is generated from an output node 23 toward another node immediately after the program is launched and operation control of the robot 30 begins, or when a real link that was coupled to an output node 23 has been deleted and a replacement link is generated in its place.

＜反転リンク初期化処理Ｇ９：テスト反転リンクを用いた初期化処理＞ <Inverted link initialization process G9: initialization using a test inverted link>
テスト反転リンクを用いる場合には、元になるテスト反転リンクと、出力側ノードアドレスＤ２を指定して初期化を行う。生成される反転リンクは、昇格によるものであるため、生成される反転リンクの出力側ノードは、元になるテスト反転リンクの出力側ノードと同じノードとなる。 When a test inverted link is used, initialization is performed by designating the originating test inverted link and the output-side node address D2. Because the generated inverted link results from a promotion, its output-side node is the same node as the output-side node of the originating test inverted link.

入力側ノードアドレスＤ１については、元になるテスト反転リンクの入力側ノードアドレスＤ１を登録する。出力側ノードアドレスＤ２については、指定された出力側ノードアドレスを登録する。ネットワークアドレスＤ３については、元になるテスト反転リンクのネットワークアドレスＤ３を登録する。 For the input side node address D1, the input side node address D1 of the original test inversion link is registered. For the output side node address D2, the designated output side node address is registered. For the network address D3, the network address D3 of the original test reverse link is registered.

テストノードアドレスＤ４については、この生成される反転リンクと、ネットワークアドレスＤ３とを指定し、指定された出力側ノードアドレスＤ２に対応する出力側ノード（生成される反転リンクの出力側ノード）のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）であればテスト中間ＯＲノードを、Ｆａｌｓｅ（ＯＲノードを意味する。）であればテスト中間ＡＮＤノードを、新たに初期化して生成し（図１１のテスト中間ＯＲノード初期化処理Ｇ７またはテスト中間ＡＮＤノード初期化処理Ｇ８を行う。）、生成されたテストノードをテストノードアドレスＤ４へ登録する。つまり、生成される反転リンクの出力側ノードと、その反転リンクに付随するテストノードとのＡＮＤ・ＯＲを逆にする。 For the test node address D4, the inverted link being generated and the network address D3 are designated. If the AND/OR node flag C5 of the output-side node corresponding to the designated output-side node address D2 (the output-side node of the generated inverted link) is True (meaning an AND node), a test intermediate OR node is newly initialized and generated; if it is False (meaning an OR node), a test intermediate AND node is newly initialized and generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 of FIG. 11 is performed). The generated test node is registered in the test node address D4. In other words, the test node attached to the inverted link is given the opposite AND/OR type from the output-side node of that link.

反転・非反転フラグＤ５は、Ｔｒｕｅ（反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｆａｌｓｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、指定された元になるテスト反転リンクの強化信号の累積値Ｄ８で上書きし、強化信号Ｄ９は、０とする。 The inversion / non-inversion flag D5 is set to True (meaning an inverted link), and the test link flag D6 is set to False. Further, the link output D7 is set to False, the cumulative value D8 of the enhancement signal is overwritten with the cumulative value D8 of the specified test inversion link, and the enhancement signal D9 is set to zero.

＜反転リンク初期化処理Ｇ９：テスト反転リンクを用いない直接の初期化処理＞ <Inverted link initialization process G9: direct initialization without a test inverted link>
テスト反転リンクを用いない直接の初期化処理は、入力側ノードアドレスＤ１と、出力側ノードアドレスＤ２と、ネットワークアドレスＤ３とを指定して行う。Ｄ１〜Ｄ３には、指定されたアドレスを登録する。この場合の初期化処理で生成される反転リンク（実リンク）は、出力ノード２３からしか出ていかないので、生成される反転リンクの出力側ノードは、出力ノード２３となる。一方、生成される反転リンクの入力側ノードは、ランダムに決定される。 The direct initialization process without a test inverted link is performed by designating the input-side node address D1, the output-side node address D2, and the network address D3. The designated addresses are registered in D1 to D3. Because the inverted link (real link) generated in this case can only extend from the output node 23, the output-side node of the generated inverted link is the output node 23. The input-side node of the generated inverted link, in contrast, is determined at random.

テストノードアドレスＤ４を初期化する前に、Ｄ５〜Ｄ９の初期化を行う。反転・非反転フラグＤ５は、Ｔｒｕｅ（反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｆａｌｓｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、０とし、強化信号Ｄ９は、０とする。 Before the test node address D4 is initialized, D5 to D9 are initialized. The inversion / non-inversion flag D5 is set to True (meaning an inverted link), and the test link flag D6 is set to False. The link output D7 is set to False, the enhancement signal accumulated value D8 is set to 0, and the enhancement signal D9 is set to 0.

テストノードアドレスＤ４については、この生成される反転リンクと、ネットワークアドレスＤ３とを指定し、指定された出力側ノードアドレスＤ２に対応する出力側ノード（生成される反転リンクの出力側ノード）のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）であればテスト中間ＯＲノードを、Ｆａｌｓｅ（ＯＲノードを意味する。）であればテスト中間ＡＮＤノードを、新たに初期化して生成し（図１１のテスト中間ＯＲノード初期化処理Ｇ７またはテスト中間ＡＮＤノード初期化処理Ｇ８を行う。）、生成されたテストノードをテストノードアドレスＤ４へ登録する。つまり、生成される反転リンクの出力側ノードと、その反転リンクに付随するテストノードとのＡＮＤ・ＯＲを逆にする。なお、テストノードアドレスＤ４を後に初期化するのは、テストノードの初期化の際に、そのテストノードが付随する反転リンクの反転・非反転フラグＤ５が参照されるからである。 For the test node address D4, the inverted link being generated and the network address D3 are designated. If the AND/OR node flag C5 of the output-side node corresponding to the designated output-side node address D2 (the output-side node of the generated inverted link) is True (meaning an AND node), a test intermediate OR node is newly initialized and generated; if it is False (meaning an OR node), a test intermediate AND node is newly initialized and generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 of FIG. 11 is performed). The generated test node is registered in the test node address D4. In other words, the test node attached to the inverted link is given the opposite AND/OR type from the output-side node of that link. The test node address D4 is initialized last because initializing the test node refers to the inversion/non-inversion flag D5 of the inverted link to which the test node is attached.

そして、以上のテスト反転リンクを用いた初期化処理、およびテスト反転リンクを用いない直接の初期化処理の双方について、最後に、入力側ノードアドレスＤ１に対応する入力側ノードの出力側リンクアドレスＣ２と、出力側ノードアドレスＤ２に対応する出力側ノードの入力側リンクアドレスＣ１とへ、この生成される反転リンクのアドレスを登録し、初期化を終える。 Finally, in both the initialization process using a test inverted link and the direct initialization process without one, the address of the generated inverted link is registered in the output-side link address C2 of the input-side node corresponding to the input-side node address D1 and in the input-side link address C1 of the output-side node corresponding to the output-side node address D2, which completes the initialization.
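The two paths of the real-link initialization described above (promotion of a test link and direct generation) can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Node`, `Link`, `init_real_link`, the dictionaries keyed by integer addresses, and the simplified creation of the attached test node (without its own input-side test links) are all hypothetical and not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    is_and: bool                                            # C5
    is_test: bool = False                                   # C8
    input_links: List[int] = field(default_factory=list)    # C1
    output_links: List[int] = field(default_factory=list)   # C2


@dataclass
class Link:
    src: int                         # D1: input-side node address
    dst: int                         # D2: output-side node address
    network: int                     # D3
    test_node: Optional[int] = None  # D4
    inverted: bool = True            # D5
    is_test: bool = False            # D6
    output: bool = False             # D7
    cumulative: float = 0.0          # D8
    signal: float = 0.0              # D9


def init_real_link(nodes: Dict[int, Node], links: Dict[int, Link], dst: int,
                   inverted: bool = True, promoted_from: Optional[int] = None,
                   src: Optional[int] = None, network: Optional[int] = None) -> int:
    """Create a real link, either by promoting a test link or directly."""
    if promoted_from is not None:
        base = links[promoted_from]
        # Promotion: D1, D3 and the cumulative value D8 come from the test link.
        src, network, cumulative = base.src, base.network, base.cumulative
    else:
        if src is None or network is None:
            raise ValueError("direct initialization needs src and network")
        cumulative = 0.0
    link_id = max(links, default=-1) + 1
    link = Link(src, dst, network, inverted=inverted, cumulative=cumulative)
    links[link_id] = link
    # D4 is set after D5: the attached test node gets the opposite AND/OR type.
    test_id = max(nodes, default=-1) + 1
    nodes[test_id] = Node(is_and=not nodes[dst].is_and, is_test=True)
    link.test_node = test_id
    # Final step: register the link on both end nodes (C2 of src, C1 of dst).
    nodes[src].output_links.append(link_id)
    nodes[dst].input_links.append(link_id)
    return link_id


nodes = {0: Node(is_and=False), 1: Node(is_and=True)}
links = {0: Link(src=0, dst=1, network=0, is_test=True, cumulative=2.5)}
promoted = init_real_link(nodes, links, dst=1, promoted_from=0)
direct = init_real_link(nodes, links, dst=1, src=0, network=0, inverted=False)
print(links[promoted], links[direct])
```

The `inverted` argument is the only thing that distinguishes an inverted link from a non-inverted one in this sketch; the promoted test link itself is left in place because its removal is not part of the steps described here.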

＜非反転リンク初期化処理Ｇ１０＞ <Non-inverted link initialization process G10>
非反転リンク初期化処理Ｇ１０は、前述した反転リンク初期化処理Ｇ９と略同様である。すなわち、非反転リンク初期化処理Ｇ１０には、次の２通りの場合がある。１つは、テスト非反転リンクを昇格する場合であり、もう１つは、元になるテスト非反転リンク無しに、直接、非反転リンクを生成する場合である。後者の場合は、プログラムを立ち上げ、ロボット３０の動作制御を開始した直後に、出力ノード２３から他のノードに向けて生成される場合と、出力ノード２３に結合されていた実リンクが一旦削除されたときにそれに代えて生成される場合とがある。 The non-inverted link initialization process G10 is substantially the same as the inverted link initialization process G9 described above. That is, it covers two cases. In one, a test non-inverted link is promoted; in the other, a non-inverted link is generated directly, without an originating test non-inverted link. The latter case occurs either immediately after the program is started and operation control of the robot 30 begins, when a link is generated from the output node 23 toward another node, or when a real link that was coupled to the output node 23 has been deleted and a link is generated to replace it.

＜非反転リンク初期化処理Ｇ１０：テスト非反転リンクを用いた初期化処理＞ <Non-inverted link initialization process G10: initialization using a test non-inverted link>
テスト非反転リンクを用いる場合には、元になるテスト非反転リンクと、出力側ノードアドレスＤ２を指定して初期化を行う。生成される非反転リンクは、昇格によるものであるため、生成される非反転リンクの出力側ノードは、元になるテスト非反転リンクの出力側ノードと同じノードとなる。 When a test non-inverted link is used, initialization is performed by designating the originating test non-inverted link and the output-side node address D2. Because the generated non-inverted link results from a promotion, its output-side node is the same node as the output-side node of the originating test non-inverted link.

入力側ノードアドレスＤ１については、元になるテスト非反転リンクの入力側ノードアドレスＤ１を登録する。出力側ノードアドレスＤ２については、指定された出力側ノードアドレスを登録する。ネットワークアドレスＤ３については、元になるテスト非反転リンクのネットワークアドレスＤ３を登録する。 For the input side node address D1, the input side node address D1 of the original test non-inversion link is registered. For the output side node address D2, the designated output side node address is registered. For the network address D3, the network address D3 of the original test non-inversion link is registered.

テストノードアドレスＤ４については、この生成される非反転リンクと、ネットワークアドレスＤ３とを指定し、指定された出力側ノードアドレスＤ２に対応する出力側ノード（生成される非反転リンクの出力側ノード）のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）であればテスト中間ＯＲノードを、Ｆａｌｓｅ（ＯＲノードを意味する。）であればテスト中間ＡＮＤノードを、新たに初期化して生成し（図１１のテスト中間ＯＲノード初期化処理Ｇ７またはテスト中間ＡＮＤノード初期化処理Ｇ８を行う。）、生成されたテストノードをテストノードアドレスＤ４へ登録する。つまり、生成される非反転リンクの出力側ノードと、その非反転リンクに付随するテストノードとのＡＮＤ・ＯＲを逆にする。 For the test node address D4, the non-inverted link being generated and the network address D3 are designated. If the AND/OR node flag C5 of the output-side node corresponding to the designated output-side node address D2 (the output-side node of the generated non-inverted link) is True (meaning an AND node), a test intermediate OR node is newly initialized and generated; if it is False (meaning an OR node), a test intermediate AND node is newly initialized and generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 of FIG. 11 is performed). The generated test node is registered in the test node address D4. In other words, the test node attached to the non-inverted link is given the opposite AND/OR type from the output-side node of that link.

反転・非反転フラグＤ５は、Ｆａｌｓｅ（非反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｆａｌｓｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、指定された元になるテスト非反転リンクの強化信号の累積値Ｄ８で上書きし、強化信号Ｄ９は、０とする。 The inversion / non-inversion flag D5 is set to False (meaning a non-inversion link), and the test link flag D6 is set to False. Further, the link output D7 is set to False, the cumulative value D8 of the enhancement signal is overwritten with the cumulative value D8 of the enhancement signal of the designated test non-inversion link, and the enhancement signal D9 is set to zero.

＜非反転リンク初期化処理Ｇ１０：テスト非反転リンクを用いない直接の初期化処理＞ <Non-inverted link initialization process G10: direct initialization without a test non-inverted link>
テスト非反転リンクを用いない直接の初期化処理は、入力側ノードアドレスＤ１と、出力側ノードアドレスＤ２と、ネットワークアドレスＤ３とを指定して行う。Ｄ１〜Ｄ３には、指定されたアドレスを登録する。この場合の初期化処理で生成される非反転リンク（実リンク）は、出力ノード２３からしか出ていかないので、生成される非反転リンクの出力側ノードは、出力ノード２３となる。一方、生成される非反転リンクの入力側ノードは、ランダムに決定される。 The direct initialization process without a test non-inverted link is performed by designating the input-side node address D1, the output-side node address D2, and the network address D3. The designated addresses are registered in D1 to D3. Because the non-inverted link (real link) generated in this case can only extend from the output node 23, the output-side node of the generated non-inverted link is the output node 23. The input-side node of the generated non-inverted link, in contrast, is determined at random.

テストノードアドレスＤ４を初期化する前に、Ｄ５〜Ｄ９の初期化を行う。反転・非反転フラグＤ５は、Ｆａｌｓｅ（非反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｆａｌｓｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、０とし、強化信号Ｄ９は、０とする。 Before the test node address D4 is initialized, D5 to D9 are initialized. The inversion / non-inversion flag D5 is set to False (meaning a non-inversion link), and the test link flag D6 is set to False. The link output D7 is set to False, the enhancement signal accumulated value D8 is set to 0, and the enhancement signal D9 is set to 0.

テストノードアドレスＤ４については、この生成される非反転リンクと、ネットワークアドレスＤ３とを指定し、指定された出力側ノードアドレスＤ２に対応する出力側ノード（生成される非反転リンクの出力側ノード）のＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）であればテスト中間ＯＲノードを、Ｆａｌｓｅ（ＯＲノードを意味する。）であればテスト中間ＡＮＤノードを、新たに初期化して生成し（図１１のテスト中間ＯＲノード初期化処理Ｇ７またはテスト中間ＡＮＤノード初期化処理Ｇ８を行う。）、生成されたテストノードをテストノードアドレスＤ４へ登録する。つまり、生成される非反転リンクの出力側ノードと、その非反転リンクに付随するテストノードとのＡＮＤ・ＯＲを逆にする。なお、テストノードアドレスＤ４を後に初期化するのは、テストノードの初期化の際に、そのテストノードが付随する非反転リンクの反転・非反転フラグＤ５が参照されるからである。 For the test node address D4, the non-inverted link being generated and the network address D3 are designated. If the AND/OR node flag C5 of the output-side node corresponding to the designated output-side node address D2 (the output-side node of the generated non-inverted link) is True (meaning an AND node), a test intermediate OR node is newly initialized and generated; if it is False (meaning an OR node), a test intermediate AND node is newly initialized and generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 of FIG. 11 is performed). The generated test node is registered in the test node address D4. In other words, the test node attached to the non-inverted link is given the opposite AND/OR type from the output-side node of that link. The test node address D4 is initialized last because initializing the test node refers to the inversion/non-inversion flag D5 of the non-inverted link to which the test node is attached.

そして、以上のテスト非反転リンクを用いた初期化処理、およびテスト非反転リンクを用いない直接の初期化処理の双方について、最後に、入力側ノードアドレスＤ１に対応する入力側ノードの出力側リンクアドレスＣ２と、出力側ノードアドレスＤ２に対応する出力側ノードの入力側リンクアドレスＣ１とへ、この生成される非反転リンクのアドレスを登録し、初期化を終える。 Finally, in both the initialization process using a test non-inverted link and the direct initialization process without one, the address of the generated non-inverted link is registered in the output-side link address C2 of the input-side node corresponding to the input-side node address D1 and in the input-side link address C1 of the output-side node corresponding to the output-side node address D2, which completes the initialization.

＜テスト反転リンク初期化処理Ｇ１１＞ <Test inverted link initialization process G11>
テスト反転リンク初期化処理Ｇ１１は、入力側ノードアドレスＤ１と、出力側ノードアドレスＤ２と、ネットワークアドレスＤ３とを指定して行う。入力側ノードアドレスＤ１については、指定された入力側ノードアドレスを登録する。出力側ノードアドレスＤ２については、指定された出力側ノードアドレスを登録する。ネットワークアドレスＤ３については、指定されたネットワークアドレスを登録する。 The test inverted link initialization process G11 is performed by designating the input-side node address D1, the output-side node address D2, and the network address D3. The designated input-side node address is registered in D1, the designated output-side node address in D2, and the designated network address in D3.

テストリンクには、これに付随するテストノードは設けないので、テストノードアドレスＤ４は、初期化の必要はない。反転・非反転フラグＤ５は、Ｔｒｕｅ（反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｔｒｕｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、０とし、強化信号Ｄ９は、０とする。 Because no accompanying test node is provided for a test link, the test node address D4 does not need to be initialized. The inversion/non-inversion flag D5 is set to True (meaning an inverted link), and the test link flag D6 is set to True. The link output D7 is set to False, the cumulative value D8 of the reinforcement signal is set to 0, and the reinforcement signal D9 is set to 0.

そして、最後に、入力側ノードアドレスＤ１に対応する入力側ノードの出力側リンクアドレスＣ２へ、この生成されるテスト反転リンクのアドレスを登録し、初期化を終える。 Finally, the address of the generated test inversion link is registered in the output side link address C2 of the input side node corresponding to the input side node address D1, and the initialization is completed.

＜テスト非反転リンク初期化処理Ｇ１２＞ <Test non-inverted link initialization process G12>
テスト非反転リンク初期化処理Ｇ１２は、前述したテスト反転リンク初期化処理Ｇ１１と略同様である。すなわち、テスト非反転リンク初期化処理Ｇ１２は、入力側ノードアドレスＤ１と、出力側ノードアドレスＤ２と、ネットワークアドレスＤ３とを指定して行う。入力側ノードアドレスＤ１については、指定された入力側ノードアドレスを登録する。出力側ノードアドレスＤ２については、指定された出力側ノードアドレスを登録する。ネットワークアドレスＤ３については、指定されたネットワークアドレスを登録する。 The test non-inverted link initialization process G12 is substantially the same as the test inverted link initialization process G11 described above. That is, it is performed by designating the input-side node address D1, the output-side node address D2, and the network address D3. The designated input-side node address is registered in D1, the designated output-side node address in D2, and the designated network address in D3.

テストリンクには、これに付随するテストノードは設けないので、テストノードアドレスＤ４は、初期化の必要はない。反転・非反転フラグＤ５は、Ｆａｌｓｅ（非反転リンクを意味する。）とし、テストリンクフラグＤ６は、Ｔｒｕｅとする。また、リンクの出力Ｄ７は、Ｆａｌｓｅとし、強化信号の累積値Ｄ８は、０とし、強化信号Ｄ９は、０とする。 Because no accompanying test node is provided for a test link, the test node address D4 does not need to be initialized. The inversion/non-inversion flag D5 is set to False (meaning a non-inverted link), and the test link flag D6 is set to True. The link output D7 is set to False, the cumulative value D8 of the reinforcement signal is set to 0, and the reinforcement signal D9 is set to 0.

そして、最後に、入力側ノードアドレスＤ１に対応する入力側ノードの出力側リンクアドレスＣ２へ、この生成されるテスト非反転リンクのアドレスを登録し、初期化を終える。 Finally, the address of the generated test non-inversion link is registered in the output side link address C2 of the input side node corresponding to the input side node address D1, and the initialization is completed.
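The test link initialization processes G11 and G12 differ from the real-link case in two ways: no test node is attached (D4 stays unset), and the new link is registered only in the output-side link address C2 of its input-side node. A minimal sketch, assuming hypothetical names (`Node`, `LinkRecord`, `init_test_link`) and integer addresses that are not part of the specification:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    output_links: List[int] = field(default_factory=list)  # C2


@dataclass
class LinkRecord:
    src: int                         # D1
    dst: int                         # D2
    network: int                     # D3
    test_node: Optional[int] = None  # D4: never set for a test link
    inverted: bool = True            # D5
    is_test: bool = True             # D6
    output: bool = False             # D7
    cumulative: float = 0.0          # D8
    signal: float = 0.0              # D9


def init_test_link(nodes: Dict[int, Node], links: Dict[int, LinkRecord],
                   src: int, dst: int, network: int, inverted: bool) -> int:
    """G11 when inverted is True, G12 when it is False."""
    link_id = max(links, default=-1) + 1
    links[link_id] = LinkRecord(src=src, dst=dst, network=network, inverted=inverted)
    # Only the input-side node records the link; the output side is not touched here.
    nodes[src].output_links.append(link_id)
    return link_id


nodes = {0: Node(), 1: Node()}
links: Dict[int, LinkRecord] = {}
g11 = init_test_link(nodes, links, src=0, dst=1, network=0, inverted=True)
g12 = init_test_link(nodes, links, src=0, dst=1, network=0, inverted=False)
print(links[g11], links[g12])
```

Registering the link on the output side (in C1 of a test node or C4 of a real node) is left to the caller, which matches how the surrounding processes describe storing the returned address themselves.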

図１２には、学習時の削除処理の構成が示されている。図１２において、前述した図１１のロボット初期化処理Ｇ１、ネットワーク初期化処理Ｇ２、入力ノード初期化処理Ｇ３、および出力ノード初期化処理Ｇ４に対応する終了処理は、ロボット３０の動作制御用のプログラムを終了させる直前にのみ行うが、これらの終了処理については、ネットワーク２０の構造変化に直接結びつくものではないので、説明は省略する。その他のノードやリンクの終了処理は、ノードやリンクが削除される都度に行われるので、それぞれ学習時の削除処理Ｅ１〜Ｅ８として説明を行う。削除の方法は、ノードの種類やリンクの種類によって異なる。図１２において、矢印の付け根の削除処理を行うには、矢印の先端の削除処理が必要である。図中の実線は、必ず使用し、点線は、使用する可能性があることを意味する。 FIG. 12 shows the configuration of the deletion processes performed during learning. The termination processes corresponding to the robot initialization process G1, the network initialization process G2, the input node initialization process G3, and the output node initialization process G4 of FIG. 11 described above are performed only immediately before the program for controlling the operation of the robot 30 is terminated. Because these termination processes are not directly tied to structural changes of the network 20, their description is omitted. The termination processes for the other nodes and links are performed each time a node or link is deleted, and are described below as the deletion processes E1 to E8 performed during learning. The deletion method differs depending on the type of node or link. In FIG. 12, performing the deletion process at the tail of an arrow requires the deletion process at the head of that arrow. A solid line in the figure means that the process is always used, and a dotted line means that it may be used.
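The arrow convention of FIG. 12 (a process at the tail of an arrow relies on the process at its head) amounts to a dependency graph between the deletion processes. The sketch below is an assumption-laden illustration: the `USES` table is read from the textual descriptions of E1 to E8 in this section rather than from the figure itself, it does not distinguish solid from dotted arrows, and the names `USES` and `required` are hypothetical.

```python
from typing import Dict, Set

# Which deletion processes each process may invoke, as read from the text
# of this section (not from FIG. 12 itself).
USES: Dict[str, Set[str]] = {
    "E1": {"E3", "E4", "E5", "E6", "E7", "E8"},
    "E2": {"E3", "E4", "E5", "E6", "E7", "E8"},
    "E3": {"E7", "E8"},
    "E4": {"E7", "E8"},
    "E5": {"E3", "E4"},
    "E6": {"E3", "E4"},
    "E7": set(),
    "E8": set(),
}


def required(process: str) -> Set[str]:
    """Return every process reachable from `process` by following the arrows."""
    seen: Set[str] = set()
    stack = [process]
    while stack:
        for dep in USES[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


print(sorted(required("E5")))  # ['E3', 'E4', 'E7', 'E8']
```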

＜中間ＯＲノード削除処理Ｅ１＞ <Intermediate OR node deletion process E1>
中間ＯＲノード削除処理Ｅ１では、先ず、削除対象の中間ＯＲノードのテストリンクアドレスＣ４に対応するテストリンクについてのメモリを開放する。すなわち、リンク情報記憶手段６３の当該テストリンクの情報を、後述するテスト反転リンク削除処理Ｅ７またはテスト非反転リンク削除処理Ｅ８に従って開放し、テストリンクを削除する。次に、削除対象の中間ＯＲノードのネットワークアドレスＣ３を参照し、ネットワーク情報記憶手段６１の中間ノードアドレスＢ２から、削除対象の中間ＯＲノードのアドレスを検索し、削除する。その後、条件によって場合分けし、それぞれ異なる次のような３通りの処理（１）、（２）、（３）のうちのいずれかの処理を行う。 In the intermediate OR node deletion process E1, the memory for the test link corresponding to the test link address C4 of the intermediate OR node to be deleted is released first. That is, the information on that test link in the link information storage means 63 is released in accordance with the test inverted link deletion process E7 or the test non-inverted link deletion process E8 described later, and the test link is deleted. Next, the network address C3 of the intermediate OR node to be deleted is referenced, and the address of that node is searched for in the intermediate node address B2 of the network information storage means 61 and deleted. The process then branches by condition into one of the following three processes (1), (2), and (3).

（１）図１６に示すように、削除対象の中間ＯＲノード２２０の入力側リンクアドレスＣ１に対応する入力側リンクが１個であり（これを入力側リンク２２１とする。）、この入力側リンク２２１の入力側ノードアドレスＤ１に対応する入力側ノード２２２が、削除対象の中間ＯＲノード２２０自身でない場合には、中間ＯＲノード２２０の各出力側リンクアドレスＣ２に対応する出力側リンク（図１６では、一例として３つの出力側リンク２２３，２２４，２２５とする。）のそれぞれについて、次のような３通りの処理（１−Ａ）、（１−Ｂ）、（１−Ｃ）のうちのいずれかの処理を行う。 (1) As shown in FIG. 16, when there is exactly one input-side link corresponding to the input-side link address C1 of the intermediate OR node 220 to be deleted (referred to as the input-side link 221), and the input-side node 222 corresponding to the input-side node address D1 of this input-side link 221 is not the intermediate OR node 220 itself, one of the following three processes (1-A), (1-B), and (1-C) is performed for each output-side link corresponding to an output-side link address C2 of the intermediate OR node 220 (in FIG. 16, three output-side links 223, 224, and 225 as an example).

（１−Ａ）出力側リンク２２３のテストリンクフラグＤ６がＴｒｕｅ（テストリンクを意味する。）で、かつ、その出力側リンク２２３の出力側ノードアドレスＤ２に対応する出力側ノード２２６のテストノードフラグＣ８がＦａｌｓｅ（実ノードを意味する。）の場合には、出力側リンク２２３、すなわち出力側ノード２２６に結合されているテストリンク（出力側ノード２２６のテストリンクアドレスＣ４に対応するテストリンク）を削除し（後述する図１２のテスト反転リンク削除処理Ｅ７またはテスト非反転リンク削除処理Ｅ８を行う。）、ランダムに選択されたノード２４０に結合するテストリンク２４１をランダムに生成し（前述した図１１、図１２のテスト反転リンク初期化処理Ｇ１１またはテスト非反転リンク初期化処理Ｇ１２を行う。）、そのテストリンク２４１のアドレスを出力側ノード２２６のテストリンクアドレスＣ４に登録する。 (1-A) When the test link flag D6 of the output-side link 223 is True (meaning a test link) and the test node flag C8 of the output-side node 226 corresponding to the output-side node address D2 of that output-side link 223 is False (meaning a real node), the output-side link 223, that is, the test link coupled to the output-side node 226 (the test link corresponding to the test link address C4 of the output-side node 226), is deleted (the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed). A test link 241 coupled to a randomly selected node 240 is then generated at random (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 of FIGS. 11 and 12 described above is performed), and the address of the test link 241 is registered in the test link address C4 of the output-side node 226.

（１−Ｂ）出力側リンク２２４のテストリンクフラグＤ６がＴｒｕｅ（テストリンクを意味する。）で、かつ、その出力側リンク２２４の出力側ノードアドレスＤ２に対応する出力側ノード２２７のテストノードフラグＣ８がＴｒｕｅ（テストノードを意味する。）の場合には、出力側リンク２２４を削除する。 (1-B) When the test link flag D6 of the output-side link 224 is True (meaning a test link) and the test node flag C8 of the output-side node 227 corresponding to the output-side node address D2 of that output-side link 224 is True (meaning a test node), the output-side link 224 is deleted.

（１−Ｃ）出力側リンク２２５のテストリンクフラグＤ６がＦａｌｓｅ（実リンクを意味する。）の場合には、その出力側リンク２２５の入力側ノードアドレスＤ１を、削除対象の中間ＯＲノード２２０の入力側リンクアドレスＣ１に対応する入力側リンク２２１（１つしかない入力側リンク）の入力側ノードアドレスＤ１で上書きする。つまり、設定変更前の出力側リンク２２５の出力側ノード２２８と、入力側リンク２２１の入力側ノード２２２とを、新たな設定とされた出力側リンク２２５で連結する。また、設定変更前の出力側リンク２２５に付随していたテストノード（設定変更前の出力側リンク２２５のテストノードアドレスＤ４に対応するテストノード）を削除し（後述する図１２のテスト中間ＯＲノード削除処理Ｅ３またはテスト中間ＡＮＤノード削除処理Ｅ４を行う。）、新たなテストノード２２９を生成し（前述した図１１のテスト中間ＯＲノード初期化処理Ｇ７またはテスト中間ＡＮＤノード初期化処理Ｇ８を行う。）、生成したテストノード２２９のアドレスを、設定変更後の出力側リンク２２５のテストノードアドレスＤ４に登録する。その後、入力側リンク２２１の入力側ノード２２２の出力側リンクアドレスＣ２に、削除対象の中間ＯＲノード２２０の出力側リンクアドレスＣ２（出力側リンク２２５のアドレス）を追加し、入力側リンク２２１を削除する（後述する図１２の反転リンク削除処理Ｅ５または非反転リンク削除処理Ｅ６を行う。）。 (1-C) When the test link flag D6 of the output-side link 225 is False (meaning a real link), the input-side node address D1 of the output-side link 225 is overwritten with the input-side node address D1 of the input-side link 221 (the only input-side link) corresponding to the input-side link address C1 of the intermediate OR node 220 to be deleted. In other words, the output-side node 228 of the output-side link 225 and the input-side node 222 of the input-side link 221 are connected by the reconfigured output-side link 225. The test node that was attached to the output-side link 225 before the change (the test node corresponding to the test node address D4 of the output-side link 225 before the change) is deleted (the test intermediate OR node deletion process E3 or the test intermediate AND node deletion process E4 of FIG. 12, described later, is performed), a new test node 229 is generated (the test intermediate OR node initialization process G7 or the test intermediate AND node initialization process G8 of FIG. 11 described above is performed), and the address of the generated test node 229 is registered in the test node address D4 of the reconfigured output-side link 225. Thereafter, the output-side link address C2 of the intermediate OR node 220 to be deleted (the address of the output-side link 225) is added to the output-side link address C2 of the input-side node 222 of the input-side link 221, and the input-side link 221 is deleted (the inverted link deletion process E5 or the non-inverted link deletion process E6 of FIG. 12, described later, is performed).

（２）図１７に示すように、削除対象の中間ＯＲノード２６０の入力側リンクアドレスＣ１に対応する入力側リンクが１個であり（入力側リンク２６５とする。）、この入力側リンク２６５の入力側ノードアドレスＤ１に対応する入力側ノードが、削除対象の中間ＯＲノード２６０自身である場合には、中間ＯＲノード２６０の各出力側リンクアドレスＣ２に対応する出力側リンク（図１７では、一例として２つの出力側リンク２６１，２６２とする。）のそれぞれについて、次のような２通りの処理（２−Ａ）、（２−Ｂ）のうちのいずれかの処理を行う。 (2) As shown in FIG. 17, there is one input side link corresponding to the input side link address C1 of the intermediate OR node 260 to be deleted (referred to as input side link 265). When the input side node corresponding to the input side node address D1 is the intermediate OR node 260 itself to be deleted, the output side link corresponding to each output side link address C2 of the intermediate OR node 260 (in FIG. 17, an example) For each of the two output-side links 261 and 262), one of the following two processes (2-A) and (2-B) is performed.

（２−Ａ）出力側リンク２６１のテストリンクフラグＤ６がＴｒｕｅ（テストリンクを意味する。）で、かつ、その出力側リンク２６１の出力側ノードアドレスＤ２に対応する出力側ノード２６３のテストノードフラグＣ８がＦａｌｓｅ（実ノードを意味する。）の場合には、出力側リンク２６１、すなわち出力側ノード２６３に結合されているテストリンク（出力側ノード２６３のテストリンクアドレスＣ４に対応するテストリンク）を削除し（後述する図１２のテスト反転リンク削除処理Ｅ７またはテスト非反転リンク削除処理Ｅ８を行う。）、ランダムに選択されたノード２８０に結合するテストリンク２８１をランダムに生成し（前述した図１１、図１２のテスト反転リンク初期化処理Ｇ１１またはテスト非反転リンク初期化処理Ｇ１２を行う。）、そのテストリンク２８１のアドレスを出力側ノード２６３のテストリンクアドレスＣ４に登録する。 (2-A) When the test link flag D6 of the output-side link 261 is True (meaning a test link) and the test node flag C8 of the output-side node 263 corresponding to the output-side node address D2 of that output-side link 261 is False (meaning a real node), the output-side link 261, that is, the test link coupled to the output-side node 263 (the test link corresponding to the test link address C4 of the output-side node 263), is deleted (the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed). A test link 281 coupled to a randomly selected node 280 is then generated at random (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 of FIGS. 11 and 12 described above is performed), and the address of the test link 281 is registered in the test link address C4 of the output-side node 263.

（２−Ｂ）出力側リンク２６２およびこの出力側リンク２６２の出力側ノード２６４の条件が、上記（２−Ａ）以外の場合には、出力側リンク２６２を削除する。 (2-B) When the conditions of the output side link 262 and the output side node 264 of the output side link 262 are other than the above (2-A), the output side link 262 is deleted.

（３）図１８に示すように、削除対象の中間ＯＲノード３００の入力側リンクアドレスＣ１に対応する入力側リンクが０個の場合には、中間ＯＲノード３００の各出力側リンクアドレスＣ２に対応する出力側リンク（図１８では、一例として２つの出力側リンク３０１，３０２とする。）のそれぞれについて、次のような２通りの処理（３−Ａ）、（３−Ｂ）のうちのいずれかの処理を行う。 (3) As shown in FIG. 18, when there is no input-side link corresponding to the input-side link address C1 of the intermediate OR node 300 to be deleted, one of the following two processes (3-A) and (3-B) is performed for each output-side link corresponding to an output-side link address C2 of the intermediate OR node 300 (in FIG. 18, two output-side links 301 and 302 as an example).

（３−Ａ）出力側リンク３０１のテストリンクフラグＤ６がＴｒｕｅ（テストリンクを意味する。）で、かつ、その出力側リンク３０１の出力側ノードアドレスＤ２に対応する出力側ノード３０３のテストノードフラグＣ８がＦａｌｓｅ（実ノードを意味する。）の場合には、出力側リンク３０１、すなわち出力側ノード３０３に結合されているテストリンク（出力側ノード３０３のテストリンクアドレスＣ４に対応するテストリンク）を削除し（後述する図１２のテスト反転リンク削除処理Ｅ７またはテスト非反転リンク削除処理Ｅ８を行う。）、ランダムに選択されたノード３２０に結合するテストリンク３２１をランダムに生成し（前述した図１１、図１２のテスト反転リンク初期化処理Ｇ１１またはテスト非反転リンク初期化処理Ｇ１２を行う。）、そのテストリンク３２１のアドレスを出力側ノード３０３のテストリンクアドレスＣ４に登録する。 (3-A) When the test link flag D6 of the output-side link 301 is True (meaning a test link) and the test node flag C8 of the output-side node 303 corresponding to the output-side node address D2 of that output-side link 301 is False (meaning a real node), the output-side link 301, that is, the test link coupled to the output-side node 303 (the test link corresponding to the test link address C4 of the output-side node 303), is deleted (the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed). A test link 321 coupled to a randomly selected node 320 is then generated at random (the test inverted link initialization process G11 or the test non-inverted link initialization process G12 of FIGS. 11 and 12 described above is performed), and the address of the test link 321 is registered in the test link address C4 of the output-side node 303.

（３−Ｂ）出力側リンク３０２およびこの出力側リンク３０２の出力側ノード３０４の条件が、上記（３−Ａ）以外の場合には、出力側リンク３０２を削除する。 (3-B) When the conditions of the output side link 302 and the output side node 304 of the output side link 302 are other than the above (3-A), the output side link 302 is deleted.

そして、以上の（１）〜（３）の処理が終了した後、削除対象の中間ＯＲノードの入力側リンクが存在すれば、それを削除し（後述する図１２の反転リンク削除処理Ｅ５または非反転リンク削除処理Ｅ６を行う。）、さらに、削除対象の中間ＯＲノードのＣ１〜Ｃ１０のメモリを開放し、中間ＯＲノードを削除する。 After the above processes (1) to (3) are completed, any remaining input-side link of the intermediate OR node to be deleted is deleted (the inverted link deletion process E5 or the non-inverted link deletion process E6 of FIG. 12, described later, is performed). The memory for C1 to C10 of the intermediate OR node to be deleted is then released, and the intermediate OR node is deleted.
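The case analysis of the intermediate OR node deletion process E1 can be sketched as a function that selects case (1), (2), or (3) and assigns a sub-case to each output-side link, together with the rewiring step of (1-C). This is a minimal sketch: the names `Node`, `Link`, `classify`, and `bypass` are hypothetical, the replacement of test nodes and test links is omitted, and raising an error for two or more input-side links is an assumption, since the text above describes only zero or one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    is_test: bool = False                                   # C8
    input_links: List[int] = field(default_factory=list)    # C1
    output_links: List[int] = field(default_factory=list)   # C2


@dataclass
class Link:
    src: int               # D1
    dst: int               # D2
    is_test: bool = False  # D6


def classify(node_id: int, nodes: Dict[int, Node],
             links: Dict[int, Link]) -> Tuple[int, List[Tuple[int, str]]]:
    """Pick case (1), (2) or (3) and a sub-case label for each output-side link."""
    node = nodes[node_id]
    if len(node.input_links) == 0:
        case = 3
    elif len(node.input_links) == 1:
        # Case (2) is a self-loop: the only input link starts at the node itself.
        case = 2 if links[node.input_links[0]].src == node_id else 1
    else:
        raise ValueError("only zero or one input-side link is described")
    plan = []
    for lid in node.output_links:
        link = links[lid]
        if link.is_test and not nodes[link.dst].is_test:
            sub = "A"  # test link into a real node: replace it with a random one
        elif case == 1 and not link.is_test:
            sub = "C"  # real link: reconnect it past the deleted node
        else:
            sub = "B"  # otherwise the link is simply deleted
        plan.append((lid, f"{case}-{sub}"))
    return case, plan


def bypass(node_id: int, out_link_id: int, nodes: Dict[int, Node],
           links: Dict[int, Link]) -> None:
    """(1-C): overwrite D1 of the output link and add it to the upstream C2."""
    in_link = links[nodes[node_id].input_links[0]]
    links[out_link_id].src = in_link.src
    nodes[in_link.src].output_links.append(out_link_id)


nodes = {0: Node(output_links=[10]),
         1: Node(input_links=[10], output_links=[11, 12, 13]),
         2: Node(), 3: Node(is_test=True), 4: Node(input_links=[13])}
links = {10: Link(0, 1), 11: Link(1, 2, is_test=True),
         12: Link(1, 3, is_test=True), 13: Link(1, 4)}
case, plan = classify(1, nodes, links)
bypass(1, 13, nodes, links)
print(case, plan, links[13])
```

In the example, node 1 plays the role of the node to be deleted, and link 13 ends up connecting node 0 directly to node 4.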

＜中間ＡＮＤノード削除処理Ｅ２＞ <Intermediate AND node deletion process E2>
中間ＡＮＤノード削除処理Ｅ２は、前述した中間ＯＲノード削除処理Ｅ１と略等しく、中間ＯＲノード削除処理Ｅ１の説明において、中間ＯＲノードを中間ＡＮＤノードと読み替えるだけなので、説明を省略する。 The intermediate AND node deletion process E2 is substantially the same as the intermediate OR node deletion process E1 described above. The description of E1 applies with "intermediate OR node" read as "intermediate AND node", so a separate description is omitted.

＜テスト中間ＯＲノード削除処理Ｅ３＞ <Test intermediate OR node deletion process E3>
削除対象のテスト中間ＯＲノードの第１および第２の入力側テストリンクアドレスＣ１に対応する第１および第２の入力側テストリンクを削除する（後述する図１２のテスト反転リンク削除処理Ｅ７またはテスト非反転リンク削除処理Ｅ８を行う。）。その後、削除対象のテスト中間ＯＲノードのＣ１〜Ｃ１０のメモリを開放し、テスト中間ＯＲノードを削除する。 The first and second input-side test links corresponding to the first and second input-side test link addresses C1 of the test intermediate OR node to be deleted are deleted (the test inverted link deletion process E7 or the test non-inverted link deletion process E8 of FIG. 12, described later, is performed). The memory for C1 to C10 of the test intermediate OR node to be deleted is then released, and the test intermediate OR node is deleted.

＜テスト中間ＡＮＤノード削除処理Ｅ４＞ <Test intermediate AND node deletion process E4>
テスト中間ＡＮＤノード削除処理Ｅ４は、前述したテスト中間ＯＲノード削除処理Ｅ３と略等しく、テスト中間ＯＲノード削除処理Ｅ３の説明において、テスト中間ＯＲノードをテスト中間ＡＮＤノードと読み替えるだけなので、説明を省略する。 The test intermediate AND node deletion process E4 is substantially the same as the test intermediate OR node deletion process E3 described above. The description of E3 applies with "test intermediate OR node" read as "test intermediate AND node", so a separate description is omitted.

＜反転リンク削除処理Ｅ５＞ <Inverted link deletion process E5>
反転リンク削除処理Ｅ５では、削除対象の反転リンクの入力側ノードアドレスＤ１に対応する入力側ノードの出力側リンクアドレスＣ２から、この削除対象の反転リンクのアドレスを検索して削除し、同様に、削除対象の反転リンクの出力側ノードアドレスＤ２に対応する出力側ノードの入力側リンクアドレスＣ１から、この削除対象の反転リンクのアドレスを検索して削除する。 In the inverted link deletion process E5, the address of the inverted link to be deleted is searched for in, and removed from, the output-side link address C2 of the input-side node corresponding to its input-side node address D1. Likewise, the address of the inverted link is searched for in, and removed from, the input-side link address C1 of the output-side node corresponding to its output-side node address D2.

また、削除対象の反転リンクに付随するテストノード（削除対象の反転リンクのテストノードアドレスＤ４に対応するテストノード）を削除する。この際、このテストノードのＡＮＤ・ＯＲノードフラグＣ５が、Ｔｒｕｅ（ＡＮＤノードを意味する。）ならば、前述したテスト中間ＡＮＤノード削除処理Ｅ４を行い、Ｆａｌｓｅ（ＯＲノードを意味する。）ならば、前述したテスト中間ＯＲノード削除処理Ｅ３を行う。 Also, the test node associated with the reverse link to be deleted (the test node corresponding to the test node address D4 of the reverse link to be deleted) is deleted. At this time, if the AND / OR node flag C5 of this test node is True (meaning an AND node), the above-described test intermediate AND node deletion processing E4 is performed, and if it is False (meaning an OR node). Then, the above-described test intermediate OR node deletion processing E3 is performed.

Finally, the memory areas D1 to D9 of the inverted link to be deleted are released, which completes the deletion of the inverted link.
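The steps of E5 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` and `Link` classes, the dictionary-based address space, and the `delete_test_node` stand-in are assumptions introduced here, and only the fields C1, C2, C5, D1, D2, and D4 named in the text are modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    input_links: List[int] = field(default_factory=list)   # C1: input-side link addresses
    output_links: List[int] = field(default_factory=list)  # C2: output-side link addresses
    is_and: bool = False                                    # C5: True = AND node, False = OR node


@dataclass
class Link:
    src: int                         # D1: input-side node address
    dst: int                         # D2: output-side node address
    test_node: Optional[int] = None  # D4: address of the attached test node


def delete_test_node(nodes: Dict[int, Node], node_id: int) -> str:
    """Hypothetical stand-in for E3/E4: pick the routine from C5, then free the node."""
    routine = "E4" if nodes[node_id].is_and else "E3"
    del nodes[node_id]
    return routine


def delete_link(nodes: Dict[int, Node], links: Dict[int, Link], link_id: int) -> Optional[str]:
    """Sketch of E5: unregister the link at both ends, delete its test node, free it."""
    link = links[link_id]
    nodes[link.src].output_links.remove(link_id)  # search C2 of the input-side node
    nodes[link.dst].input_links.remove(link_id)   # search C1 of the output-side node
    routine = None
    if link.test_node is not None:
        routine = delete_test_node(nodes, link.test_node)
    del links[link_id]                            # release the link's memory (D1 to D9)
    return routine
```

The stand-in only frees the test node; the actual E3/E4 processing also deletes the test node's own input-side test links, which is left out here for brevity.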

<Non-inverted link deletion processing E6>
The non-inverted link deletion processing E6 is the same as the inverted link deletion processing E5 described above, so its description is omitted.

<Test inverted link deletion processing E7>
In the test inverted link deletion processing E7, the address of the test inverted link to be deleted is searched for in, and removed from, the output-side link addresses C2 of the input-side node identified by the link's input-side node address D1.

Next, the test node flag C8 of the output-side node identified by the link's output-side node address D2 is examined. If it is True (a test node), the address of the test inverted link to be deleted is located in, and removed from, the input-side test link addresses C1 of that output-side node (test node). If it is False (a real node), the test link address C4 of that output-side node (real node) is deleted.

Finally, the memory areas D1 to D9 of the test inverted link to be deleted are released, which completes the deletion of the test inverted link.
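The branch on the test node flag C8 is the part of E7 that differs from E5, and a short sketch may make it easier to follow. The class and function names below are hypothetical, and addresses are modelled as dictionary keys purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    input_links: List[int] = field(default_factory=list)   # C1 (input-side test links on a test node)
    output_links: List[int] = field(default_factory=list)  # C2: output-side link addresses
    test_link: Optional[int] = None                        # C4: test link address (real nodes)
    is_test: bool = False                                  # C8: True = test node, False = real node


@dataclass
class TLink:
    src: int  # D1: input-side node address
    dst: int  # D2: output-side node address


def delete_test_link(nodes: Dict[int, Node], links: Dict[int, TLink], link_id: int) -> None:
    """Sketch of E7: unregister a test link, then free it."""
    link = links[link_id]
    nodes[link.src].output_links.remove(link_id)  # remove from C2 of the input-side node
    dst = nodes[link.dst]
    if dst.is_test:
        dst.input_links.remove(link_id)           # test node: remove from C1
    else:
        dst.test_link = None                      # real node: clear C4
    del links[link_id]                            # release the link's memory (D1 to D9)
```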

<Test non-inverted link deletion processing E8>
The test non-inverted link deletion processing E8 is the same as the test inverted link deletion processing E7 described above, so its description is omitted.

This embodiment provides the following effects. Because the information processing system 10 includes the reinforcement signal generation means 43, it can generate the reinforcement signal to be given to the network 20 according to the result of evaluating the state of the robot 30, which is the controlled object.

Because the information processing system 10 also includes the learning means 51, the reinforcement signal generated by the reinforcement signal generation means 43 can be propagated from one constituent element of the network 20 to another. In doing so, the learning means 51 generates the reinforcement signal to be propagated, that is, the reinforcement signal given to the propagation-destination constituent element, separately for each constituent element according to the input/output states of the propagation-source and/or propagation-destination constituent elements. Using the cumulative value of the reinforcement signals given individually to each constituent element, the learning means 51 can therefore decide, element by element, whether to generate (add) or delete (cull) that element, carry out that processing, and so change the structure of the network 20 autonomously.

Consequently, unlike the conventional learner based on neurogenetic learning described above, the information processing system 10 does not evaluate the network 20 as a whole when changing its structure. It evaluates individual constituent elements (that is, individual nodes and links) and generates or deletes them one at a time. This shortens the time needed for evaluation, allows the network 20 to be built autonomously with a low time order, and reduces the computational cost accordingly.

Moreover, the neural network learning methods described in Patent Documents 2 and 3 mentioned above fix the structure of the network in advance according to its usage environment and task, and then optimize the coupling coefficients between neuron units within that fixed structure. The information processing system 10 instead changes and optimizes the structure of the network 20 itself autonomously, and so avoids being restricted to a particular environment or task by a predetermined structure. Even when the usage environment or task of the network 20 changes, learning can therefore reuse earlier learning results as existing knowledge.

Furthermore, the information processing system 10 includes the state evaluation signal acquisition means 42 and evaluates the state of the robot 30, the controlled object, on the basis of the state evaluation signal acquired by that means. The state of the robot 30 can therefore be evaluated without human judgment. This increases the speed at which the network 20 is built autonomously and makes it easy to carry out learning that stays consistent with the objective.

On the basis of the reinforcement signal given to a propagation-source node, the learning means 51 generates the reinforcement signal to be given to each propagation-destination input-side link according to that link's degree of contribution to the node output, which is determined by the input/output state of the propagation-source node (see FIGS. 8 and 9). The reinforcement signal given to the network 20 can therefore be propagated backward from the output nodes 23, each link can be given an appropriate individual evaluation, and each constituent element can be generated or deleted appropriately.
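To make the idea of contribution-based distribution concrete, the sketch below splits a node's reinforcement signal among its input-side links. The rule used here is an assumption chosen for illustration and is not the distribution of FIGS. 8 and 9: an input is treated as contributing when its value equals the node output (for an OR node outputting 1, the inputs at 1; for an AND node outputting 0, the inputs at 0; otherwise all inputs), and the signal is shared equally among contributing inputs. The function name `distribute` is likewise hypothetical.

```python
from typing import List


def distribute(is_and: bool, inputs: List[int], signal: float) -> List[float]:
    """Split a node's reinforcement signal among its input-side links.

    Assumed rule (not taken from the patent figures): links whose value
    matches the node output share the signal equally; the rest get zero.
    """
    output = int(all(inputs)) if is_and else int(any(inputs))
    decisive = [value == output for value in inputs]
    count = sum(decisive)
    return [signal / count if d else 0.0 for d in decisive]
```

For example, an OR node with inputs `[1, 0, 1]` and a reward of `1.0` credits the first and third links with `0.5` each under this assumed rule, while an AND node with inputs `[1, 0]` and a punishment of `-2.0` assigns the whole punishment to the second link.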

In addition to this back-propagation of the reinforcement signal from a node to its input-side links, the learning means 51 also back-propagates the reinforcement signal from a node to the input-side nodes of those links, which makes the back-propagation of the reinforcement signal even smoother.

Furthermore, the learning means 51 deletes a link when the cumulative value of the reinforcement signals given to that link falls below a threshold. Links that are considered not to help in controlling the robot 30 as intended, that is, links judged unnecessary, can thus be culled appropriately, and the structure of the network 20 can be changed autonomously.

The learning means 51 also deletes a node when the number of its input-side links drops to one or fewer. Nodes that are considered not to help in controlling the robot 30 as intended, that is, nodes judged unnecessary, can thus be culled appropriately, and the structure of the network 20 can be changed autonomously.
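The two culling rules above (links by cumulative reinforcement, nodes by remaining input-side links) can be sketched as a single pass. The function name, the dictionary layout, and the threshold value used in the example are assumptions for illustration; the sketch only identifies what would be deleted and does not perform the deletion processing itself.

```python
from typing import Dict, Hashable, List, Set, Tuple


def find_culled(cumulative: Dict[int, float],
                intermediate_inputs: Dict[Hashable, List[int]],
                threshold: float) -> Tuple[Set[int], Set[Hashable]]:
    """Return (links to delete, intermediate nodes to delete).

    cumulative maps a link id to its accumulated reinforcement signal;
    intermediate_inputs maps an intermediate node id to its input-side link ids.
    """
    # Rule 1: a link is culled when its cumulative value falls below the threshold.
    dead_links = {link for link, total in cumulative.items() if total < threshold}
    # Rule 2: a node is culled when it is left with one input-side link or none.
    dead_nodes = {
        node for node, links in intermediate_inputs.items()
        if len([l for l in links if l not in dead_links]) <= 1
    }
    return dead_links, dead_nodes
```

With `cumulative = {1: -5.0, 2: 3.0, 3: 1.0, 4: 2.0, 5: -0.5}`, a threshold of `-1.0`, and nodes `"a"` (links 1, 2) and `"b"` (links 3, 4, 5), link 1 is culled and node `"a"`, left with a single input-side link, is culled with it.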

In the information processing system 10, each node is also provided with a test link. When the test link is considered to help in controlling the robot 30 as intended, it can be promoted to a real link that contributes to the node output and formally registered as an input-side link. Links can thus be generated autonomously, and the structure of the network 20 can be changed autonomously.

Furthermore, when the cumulative value of the reinforcement signals given to a test link falls below a threshold, the learning means 51 deletes that test link and generates a new test link coupled to an arbitrary node. A test link that is a suitable candidate for a newly generated link (real link) is therefore always available. Links can thus be generated appropriately and smoothly, and the structure of the network 20 can be changed autonomously.
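The life cycle of a test link described in the two preceding paragraphs can be sketched as a single decision step. The function name, the two threshold parameters, and the use of `random.Random` to pick the "arbitrary node" are assumptions made for this illustration; what happens to the test-link slot after a promotion is not covered by this passage, so the sketch simply returns `None` for it.

```python
import random
from typing import List, Optional, Tuple


def step_test_link(real_inputs: List[int], test_src: int, score: float,
                   promote_at: float, discard_at: float,
                   candidates: List[int],
                   rng: random.Random) -> Tuple[str, Optional[int], float]:
    """Return (action, source node of the test link afterwards, its cumulative score)."""
    if score > promote_at:
        # The candidate proved useful: register it as a real input-side link.
        real_inputs.append(test_src)
        return "promoted", None, 0.0
    if score < discard_at:
        # Discard the stale candidate and attach a fresh one to an arbitrary node.
        return "replaced", rng.choice(candidates), 0.0
    return "kept", test_src, score
```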

In the information processing system 10, each real link is also provided with a test node attached to it, so a candidate for a newly generated node (real node) is always available. Nodes can thus be generated autonomously, and the structure of the network 20 can be changed autonomously.

Because the learning means 51 propagates the reinforcement signal from the test node to its first and second input-side test links, this too keeps candidates for newly generated links (real links) available, and the structure of the network 20 can be changed autonomously.

Furthermore, when the cumulative value of the reinforcement signals given to the first or second input-side test link falls below a threshold, the learning means 51 deletes the input-side test link that fell below the threshold and generates a new input-side test link. A test link that is a suitable candidate for a newly generated link (real link) is therefore always available. Links can thus be generated appropriately and smoothly, and the structure of the network 20 can be changed autonomously.

When the cumulative values of the reinforcement signals given to the first and second input-side test links both exceed a threshold, the learning means 51 puts the test node into actual use. A new node (real node) can thus be generated (added), and the structure of the network 20 can be changed autonomously.
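The test-node rules in the two preceding paragraphs combine into one decision: promote the node when both input-side test links score above the threshold, otherwise replace whichever link has fallen below it. A minimal sketch follows; the function name and the separate `promote_at` and `discard_at` parameters are assumptions for illustration.

```python
from typing import List, Tuple


def decide_test_node(score1: float, score2: float,
                     promote_at: float, discard_at: float) -> Tuple[str, List[int]]:
    """Return (action, indices of the input-side test links to replace)."""
    if score1 > promote_at and score2 > promote_at:
        # Both candidate inputs proved useful: the test node becomes a real node.
        return "promote_node", []
    stale = [i for i, s in ((1, score1), (2, score2)) if s < discard_at]
    if stale:
        # Only the link(s) below the threshold are deleted and regenerated.
        return "replace_links", stale
    return "keep", []
```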

Finally, because each node in the information processing system 10 is built from logic circuits, an information processing system capable of the intended control can be constructed with a simple structure.

To confirm the effects of the present invention, the following experiments were conducted.

Ten small circuits of about 2 or 3 bits were prepared as circuits performing the target I/O behavior. Generation experiments with no dependence on history were run for all ten circuits, and generation experiments involving about one step of history were run for some of them.

In the initial state, a single OR node was placed in the output layer of the network and random inputs were applied. Circuits were generated by giving a reward as the reinforcement signal when the target I/O behavior was achieved and a punishment when it was not.

FIG. 19 shows the result of this experiment. With a 3-bit XOR circuit as the target circuit, FIG. 19 plots the correct-answer rate as a moving average over 100 steps, that is, the proportion of the most recent 100 steps (100 outputs) in which the correct answer was output.
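The metric plotted in FIG. 19 can be computed as below. This is a sketch of the moving average only, not of the experiment; the function name is hypothetical, and the handling of the first steps (averaging over however many steps exist so far) is an assumption, since the text does not say how the curve starts.

```python
from collections import deque
from typing import Deque, Iterable, List


def moving_accuracy(correct_flags: Iterable[bool], window: int = 100) -> List[float]:
    """Correct-answer rate over the most recent `window` steps, one value per step."""
    recent: Deque[bool] = deque(maxlen=window)  # oldest step drops out automatically
    rates: List[float] = []
    for flag in correct_flags:
        recent.append(flag)
        # Before `window` steps have elapsed, average over the steps seen so far.
        rates.append(sum(recent) / len(recent))
    return rates
```

For a 3-bit XOR target, each flag would be `output == (a ^ b ^ c)` for the random input bits `a`, `b`, `c` applied at that step.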

The experiment was run 10 times for each target circuit, and the target circuit was constructed correctly in every run. The correct circuit was also confirmed to be constructed when noise was added to the correct/incorrect judgment, and circuits that reached the correct answer were confirmed to be structurally stable.

In a separate experiment, the network first learned a 2-bit XOR circuit as in the experiment above, after which the problem was changed and it learned a 3-bit XOR circuit. Examining whether what was learned in the two cases is related confirmed that the network acquires a new structure that makes use of the earlier learning result.

FIG. 20 shows the result of this experiment. In the structure of the 3-bit XOR circuit generated in the experiment, the parts that reuse the structure of the 2-bit XOR circuit, which was given as known knowledge at the start of the experiment, are drawn with thick lines. Whether a structure has actually been reused can be checked by tracing the history of structure generation. In FIG. 20, the part of the 3-bit XOR circuit around node A differs from the structure of the 2-bit XOR circuit. This is because a link in that part was replaced by node A, and structural reuse is still achieved.

Beyond these two experiments, experiments with the Khepera robot simulator also demonstrated real-time learning at 64 ms per step with about 10,000 nodes and, with a backup function added, application to a maze problem with delayed reward. These results clearly demonstrate the effects of the present invention.

The present invention is not limited to the embodiment described above, and modifications within a scope in which the object of the present invention can be achieved are included in the present invention.

In the embodiment described above the controlled object is the robot 30, but it is not limited to this and may be, for example, a game character. In a fighting game, for instance, the position relative to the opponent's character and the type of move the opponent's character is performing may be used as inputs to the network, and the behavior of the player's own character, such as the type of move it performs and the direction in which it moves, may be determined and controlled by the outputs of the network.

In the embodiment the reinforcement signal is distributed (propagated) as shown in FIGS. 8 and 9, but the distribution method is not limited to this. It is sufficient that, on the basis of the reinforcement signal given to a propagation-source constituent element, a reinforcement signal for the propagation-destination constituent element is generated according to the input/output states of the propagation-source and/or propagation-destination constituent elements, so that the reinforcement signal propagates from one constituent element to another.

In the embodiment the network 20 used in the information processing system 10 is implemented mainly in software, but it is not limited to this and may be implemented at least partly with hardware circuits.

In the embodiment the nodes are built from logic circuits using AND and OR circuits, but when nodes, as constituent elements of the network, are built from logic circuits, other logic circuits such as XOR circuits may also be used.

As described above, the information processing system, information processing method, and program of the present invention can be used for I/O learning in general, and are suitable, for example, for robot motion control, motion control of game characters on a display screen, and air-conditioning management.

FIG. 1 is an overall configuration diagram of an information processing system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the structure of the data used in processing by the information processing system of the embodiment.
FIG. 3 is a flowchart showing the overall flow of robot motion control by the information processing system of the embodiment.
FIG. 4 is a flowchart showing the flow of network processing by the information processing system of the embodiment.
FIG. 5 is a flowchart showing the flow of learning processing for an intermediate OR node (real node) by the information processing system of the embodiment.
FIG. 6 is a flowchart showing the flow of learning processing for a non-inverted link by the information processing system of the embodiment.
FIG. 7 is an explanatory diagram of learning processing for an intermediate OR node by the information processing system of the embodiment.
FIG. 8 is a diagram showing an example of reinforcement signal distribution during learning of an intermediate OR node by the information processing system of the embodiment.
FIG. 9 is a diagram showing an example of reinforcement signal distribution during learning of an intermediate AND node by the information processing system of the embodiment.
FIG. 10 is an explanatory diagram of learning processing for a non-inverted link (real link) by the information processing system of the embodiment.
FIG. 11 is an explanatory diagram of the organization of initialization by the information processing system of the embodiment.
FIG. 12 is an explanatory diagram of the organization of deletion processing during learning by the information processing system of the embodiment.
FIG. 13 is an explanatory diagram of output node initialization processing by the information processing system of the embodiment.
FIG. 14 is an explanatory diagram of intermediate OR node initialization processing by the information processing system of the embodiment.
FIG. 15 is an explanatory diagram of test intermediate OR node initialization processing by the information processing system of the embodiment.
FIG. 16 is an explanatory diagram of intermediate OR node deletion processing by the information processing system of the embodiment.
FIG. 17 is another explanatory diagram of intermediate OR node deletion processing by the information processing system of the embodiment.
FIG. 18 is yet another explanatory diagram of intermediate OR node deletion processing by the information processing system of the embodiment.
FIG. 19 is a diagram showing the result of an experiment confirming the effects of the present invention.
FIG. 20 is a diagram showing the result of another experiment confirming the effects of the present invention.

Explanation of symbols

10 Information processing system
20 Network
21 Input node (constituent element)
22 Intermediate node (constituent element)
23 Output node (constituent element)
24 Link (constituent element)
42 State evaluation signal acquisition means
43 Reinforcement signal generation means
51 Learning means
53 Output generation means
61 Network information storage means, functioning as network structure storage means and reinforcement signal storage means
62 Node information storage means, functioning as network structure storage means, input/output state storage means, and reinforcement signal storage means
63 Link information storage means, functioning as network structure storage means, input/output state storage means, and reinforcement signal storage means
105, 148, 185 Test link
123, 142, 161, 201, 229 Test node
124, 143, 162, 202 First input-side test link
125, 146, 163, 204 Second input-side test link

Claims

An information processing system using a network that includes, as constituent elements, a plurality of nodes that perform information processing and links that connect these nodes and transmit information between them, the system comprising:
network structure storage means for storing the structure of the network, including the coupling relationships between the constituent elements;
input/output state storage means for storing the input/output states of the constituent elements produced by output generation processing of the network;
reinforcement signal generation means for generating a reinforcement signal to be given to the network as a reward or punishment according to the result of evaluating the state of a controlled object, the state being produced on the basis of the output of the network;
learning means for giving the reinforcement signal generated by the reinforcement signal generation means to at least one constituent element; propagating the reinforcement signal in order from the constituent element to which it was given to other constituent elements, following the chained coupling relationships between the constituent elements; in doing so, generating for each constituent element, on the basis of the reinforcement signal given to a propagation-source constituent element and according to the input/output states of the propagation-source and/or propagation-destination constituent elements stored in the input/output state storage means, a reinforcement signal to be given to the propagation-destination constituent element as a reward or punishment; changing the structure of the network by generating or deleting constituent elements using the reinforcement signal given to each constituent element or its history, or the cumulative value of the reinforcement signal or its history; and storing the changed structure of the network in the network structure storage means;
output generation means for generating the output of the network, using the network whose structure has been changed by the learning means, with reference to the structure of the network stored in the network structure storage means; and
reinforcement signal storage means for storing, for each constituent element, the reinforcement signal of that constituent element generated by the learning means or its history, or the cumulative value of the reinforcement signal or its history.

The information processing system according to claim 1, further comprising
state detection means for detecting the state of the controlled object, or state evaluation signal acquisition means for acquiring, from the controlled object itself, a state evaluation signal for evaluating the state of the controlled object,
wherein the reinforcement signal generation means evaluates the state of the controlled object on the basis of the state evaluation signal acquired by the state evaluation signal acquisition means and generates the reinforcement signal according to the evaluation result.

The information processing system according to claim 1, further comprising
evaluation result input receiving means for receiving a user's input of the result of evaluating the state of the controlled object,
wherein the reinforcement signal generation means generates the reinforcement signal according to the evaluation result received by the evaluation result input receiving means.

The information processing system according to any one of claims 1 to 3,
wherein the learning means gives the reinforcement signal generated by the reinforcement signal generation means equally to all output nodes constituting the output layer of the network and, taking a node as the propagation-source constituent element and an input-side link of that node as the propagation-destination constituent element, generates, on the basis of the reinforcement signal given to the propagation-source node, a reinforcement signal to be given as a reward or punishment to the propagation-destination input-side link according to that link's degree of contribution to the node output, which is determined by the input/output state of the propagation-source node.

The information processing system according to claim 4,
wherein the learning means, taking a node as the propagation-source constituent element and the input-side node coupled to the input side of an input-side link of that node as the propagation-destination constituent element, generates, on the basis of the reinforcement signal given to the propagation-source node, a reinforcement signal to be given as a reward or punishment to the propagation-destination input-side node according to the input-side link's degree of contribution to the node output, which is determined by the input/output state of the propagation-source node.

The information processing system according to claim 4 or 5,
wherein the reinforcement signal storage means stores, for each link, the history of the reinforcement signals given to the link or the cumulative value of those reinforcement signals, and
the learning means deletes the link when the cumulative value of the reinforcement signals given to the link falls below a threshold.

The information processing system according to claim 6,
wherein the learning means deletes a node when the number of input-side links of the node becomes one or fewer.

The information processing system according to claim 4,
wherein, in addition to the propagation-destination input-side link, a test link that does not contribute to the node output is provided on the input side of the propagation-source node,
the reinforcement signal storage means stores the history of the reinforcement signals given to the test link or the cumulative value of those reinforcement signals, and
the learning means registers the test link in the network structure storage means as an input-side link of the propagation-source node when the cumulative value of the reinforcement signals given to the test link exceeds a threshold.

The information processing system according to claim 8,
wherein, when the cumulative value of the reinforcement signals given to the test link falls below a threshold, the learning means deletes the test link, generates a new test link coupled to an arbitrary node, and registers the new test link in the network structure storage means.

The information processing system according to any one of claims 1 to 3,
wherein the link is provided with a test node that is attached to the link and does not contribute to the output of the network, the test node being connected to the input-side node of the link by a first input-side test link, to the output-side node of the link by an output-side test link, and to an arbitrary node by a second input-side test link, and
the learning means, taking the link as the propagation-source constituent element and the test node as the propagation-destination constituent element, generates, on the basis of the reinforcement signal given to the propagation-source link, a reinforcement signal to be given as a reward or punishment to the propagation-destination test node according to the output of the propagation-source link and the output state of the propagation-destination test node.

The information processing system according to claim 10,
wherein the learning means, taking the test node as the propagation-source constituent element and the first and second input-side test links of the test node as the propagation-destination constituent elements, generates, on the basis of the reinforcement signal given to the propagation-source test node, reinforcement signals to be given as rewards or punishments to the propagation-destination first and second input-side test links according to those links' degrees of contribution to the test node output, which are determined by the input/output state of the propagation-source test node.

The information processing system according to claim 11,
The reinforcement signal storage means is configured to store, for each link, the history or the cumulative value of the reinforcement signal given to the first and second input-side test links of the propagation destination,
The learning means is configured to, when the cumulative value of the reinforcement signal given to the first or second input-side test link of the propagation destination falls below a threshold value, delete that input-side test link, generate a new input-side test link coupled to an arbitrary node, and register the new input-side test link in the network structure storage means.

The information processing system according to claim 11,
The reinforcement signal storage means is configured to store, for each link, the history or the cumulative value of the reinforcement signal given to the first and second input-side test links of the propagation destination,
The learning means is configured to, when the cumulative value of the reinforcement signal given to the first and second input-side test links of the propagation destination exceeds a threshold value, put the test node into practical use by promoting it to a real node that contributes to the output of the network, and register the promoted node in the network structure storage means.
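
The promotion step can be sketched as follows. The dictionary layout (`test_nodes`, `real_nodes`, `input_link_cumulative`) and the function name `promote_if_ready` are hypothetical, and the sketch assumes that the cumulative values of both input-side test links must each exceed the threshold; the claim wording would also admit other readings, such as a combined total.

```python
def promote_if_ready(network, node_id, threshold):
    """Promote a test node to a real node once both input-side test links have proven useful.

    `network` is an assumed stand-in for the network structure storage means:
    {"test_nodes": {id: {"input_link_cumulative": [c1, c2]}}, "real_nodes": {}}
    """
    record = network["test_nodes"][node_id]
    if all(c > threshold for c in record["input_link_cumulative"]):
        # Move the node from the test set to the set that contributes to the output.
        network["real_nodes"][node_id] = network["test_nodes"].pop(node_id)
        return True
    return False
```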

In the information processing system according to any one of claims 1 to 13,
The node is configured to perform information processing using at least one logic circuit.

An information processing method using a network including a plurality of nodes that perform information processing and a link that links these nodes and transmits information between the nodes as constituent elements,
Storing the structure of the network including the coupling relationship between the constituent elements in a network structure storage means;
The input/output states of the constituent elements formed by the output generation processing of the network are stored in input/output state storage means,
Reinforcement signal generation means performs a process of generating a reinforcement signal to be given as a reward or punishment to the network according to the result of evaluating the state of the controlled object formed on the basis of the output of the network,
Learning means performs a process of: applying the reinforcement signal generated by the reinforcement signal generation means to at least one of the constituent elements; sequentially generating, in order to propagate the reinforcement signal from the constituent element to which it was applied to other constituent elements along the chain of coupling relationships between the constituent elements, a reinforcement signal to be given as a reward or punishment to each propagation-destination constituent element, based on the reinforcement signal given to the propagation-source constituent element and according to the input/output states of the propagation-source and/or propagation-destination constituent elements stored in the input/output state storage means; storing the generated reinforcement signal of each constituent element, or its cumulative value, in reinforcement signal storage means for each constituent element; changing the structure of the network by generating or deleting constituent elements, for each constituent element, using the reinforcement signal given to that constituent element or its history, or the cumulative value of the reinforcement signal or its history; and storing the changed structure of the network in the network structure storage means,
Output generation means refers to the structure of the network stored in the network structure storage means and performs a process of generating the output of the network using the network whose structure has been changed by the learning means.
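
One learning step of the method above can be sketched in Python under strong simplifying assumptions. The attenuation factor `decay`, the gating rule (an element receives a signal only if it was active in the stored input/output state), and the names `propagate` and `learning_step` are all hypothetical. Only deletion is shown; generation of new constituent elements is omitted.

```python
from collections import deque


def propagate(structure, io_state, start, reward, decay=0.5):
    """Propagate a reinforcement signal from `start` along the chain of couplings.

    structure: element -> list of elements it passes the signal on to (assumed layout).
    io_state:  element -> True if the element was active during output generation.
    """
    signals = {start: reward}
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for dst in structure.get(src, []):
            if dst in signals:
                continue  # each element receives at most one signal per step
            # Assumed rule: attenuate for active elements, give nothing to inactive ones.
            signals[dst] = signals[src] * decay if io_state.get(dst, False) else 0.0
            queue.append(dst)
    return signals


def learning_step(structure, io_state, cumulative, start, reward, prune_below):
    """Accumulate propagated signals, delete weak elements, and return the new structure."""
    for elem, sig in propagate(structure, io_state, start, reward).items():
        cumulative[elem] = cumulative.get(elem, 0.0) + sig
    doomed = {e for e, c in cumulative.items() if c < prune_below and e != start}
    for elem in doomed:
        del cumulative[elem]
    # The returned dict plays the role of the updated network structure storage.
    return {
        e: [d for d in dsts if d not in doomed]
        for e, dsts in structure.items()
        if e not in doomed
    }
```

Output generation would then run on the returned structure, which corresponds to the final step of the method.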

A program for causing a computer to function as an information processing system using a network including a plurality of nodes that perform information processing and a link that links these nodes and transmits information between the nodes as constituent elements, the information processing system comprising:
Network structure storage means for storing a structure of the network including a coupling relationship between the constituent elements;
Input/output state storage means for storing input/output states of the constituent elements formed by the output generation processing of the network;
Reinforcement signal generation means for generating a reinforcement signal to be given as a reward or punishment to the network according to the result of evaluating the state of the controlled object formed on the basis of the output of the network;
Learning means for applying the reinforcement signal generated by the reinforcement signal generation means to at least one of the constituent elements; for sequentially generating, in order to propagate the reinforcement signal from the constituent element to which it was applied to other constituent elements along the chain of coupling relationships between the constituent elements, a reinforcement signal to be given as a reward or punishment to each propagation-destination constituent element, based on the reinforcement signal given to the propagation-source constituent element and according to the input/output states of the propagation-source and/or propagation-destination constituent elements stored in the input/output state storage means; for changing the structure of the network by generating or deleting constituent elements, for each constituent element, using the reinforcement signal given to that constituent element or its history, or the cumulative value of the reinforcement signal or its history; and for storing the changed structure of the network in the network structure storage means;
Output generating means for generating the output of the network using the network whose structure is changed by the learning means with reference to the structure of the network stored in the network structure storing means;
Reinforcement signal storage means for storing, for each constituent element, the reinforcement signal of the constituent element generated by the learning means or its history, or the cumulative value of the reinforcement signal or its history.