JP2523150B2 - Robot control system

Info

Publication number: JP2523150B2
Application number: JP63007889A
Authority: JP
Inventors: 隆木本; 和雄浅川; 茂美長田; 信雄渡部; 旭川村; 英樹吉沢; 実関口
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1988-01-18
Filing date: 1988-01-18
Publication date: 1996-08-07
Anticipated expiration: 2011-08-07
Also published as: JPH01183703A

Description

[Detailed description of the invention]

[Summary]

The present invention relates to a learning scheme for setting weights in a robot control system in which a robot that acts on the basis of information from a plurality of sensors is controlled by a weighted hierarchical network. Its object is to improve the practicality of robot control systems that use a hierarchical network. A robot learning simulator having the same configuration as the hierarchical network control robot is built on a computer, together with motion imitation means that simulates the behavior of that simulator. The robot learning simulator is used to obtain the weight values of the hierarchical network required by the hierarchical network control robot, the validity of the obtained weight values is verified by means of the motion imitation means, and the verified weight values are set as the weights of the hierarchical network of the hierarchical network control robot.

[Industrial applications]

The present invention relates to a learning scheme in a robot control system in which a robot that acts on the basis of information from a plurality of sensors is controlled by a weighted hierarchical network.

In recent years, with advances in artificial intelligence (AI), factory automation (FA), and office automation (OA), demand has grown for intelligent "soft" systems that are easy for people to use and can coexist with them. To meet this expectation, systems applying expert systems and robots equipped with pattern recognition functions have begun to appear, but they are still far from being "soft" systems. There is therefore a need for a "soft" robot control system that adaptively controls a robot's behavior in response to changes in usage and environment, and also for techniques that make such a control system more practical.

[Conventional technology]

Conventional robot control systems are implemented on sequential-processing (von Neumann) computers. Accordingly, even for a robot that acts on the basis of information from a plurality of sensors, a human first works out which sensor provides what kind of information and then describes rigorously, in the form of a program, what output pattern should be generated and what action should be taken for each input pattern arriving from the sensors.

[Problems to be solved by the invention]

Consequently, the robot merely acts according to this program. It naturally cannot cope with situations that the program does not describe, and it cannot act appropriately if the characteristics of a sensor change even slightly or if even one of its many sensors fails. Furthermore, when the robot has a large group of sensors, writing the program itself becomes extremely difficult or impossible.

To solve these problems, the present applicant has proposed a "soft" robot control system based on a hierarchical network structure (filing date: December 28, 1987; title of the invention: "Robot control system"). As explained in detail in this specification, a robot control system based on this hierarchical network structure requires that the weight values of the hierarchical network be learned and verified.

However, learning and verifying the weights on the actual robot has several drawbacks: the development period becomes long because the work is done on a robot with a poor development environment; the programs and hardware needed for learning and for verifying the results must be mounted on the robot; and the results are affected by variations in the characteristics of components such as the robot's sensors and motors. In addition, the control processor that performs the computation on the robot is often slow, so the weights of the robot's hierarchical network cannot be learned quickly. For these reasons, learning on the actual robot is often not the best approach.

The present invention has been made in view of these circumstances. Its object is to improve the practicality of robot control systems having a hierarchical network structure by providing a robot learning scheme that can learn and verify the weight values of the hierarchical network quickly and accurately.

[Means for solving problems]

FIG. 1 is a diagram explaining the principle of the present invention.

In the figure, 1 is sensor means comprising a plurality of sensors that take in information on the external world surrounding the robot; 2 is action pattern generation means that generates action pattern signals to define the robot's action pattern; 3 is robot control means, composed of a weighted hierarchical network, that generates a control signal corresponding to the detection pattern of the sensor means 1 in accordance with this hierarchical network and thereby controls the action pattern generation means 2; and 5 is basic control operation storage means that stores the control signal information that the robot control means 3 should generate for one or more preselected specific detection patterns of the sensor means 1. The sensor means 1, the action pattern generation means 2, the robot control means 3, and the basic control operation storage means 5 together constitute the hierarchical network control robot 10, which performs the actual control of the robot. 20 is a robot learning simulator built on a computer. It consists of a pseudo hierarchical network control robot 10′, which has the same configuration as the hierarchical network control robot 10, and motion imitation means 7, which simulates the behavior of the pseudo hierarchical network control robot 10′.

[Operation]

In the present invention, the robot learning simulator 20 uses the pseudo hierarchical network control robot 10′ to learn the weights of the hierarchical network so that the basic control operations held in the basic control operation storage means 5 are realized. The obtained weights are verified by the motion imitation means 7, and the verified weights are transferred to the hierarchical network control robot 10 to set its weights. If verification by actually operating the hierarchical network control robot 10 shows that one or more further basic control operations need to be learned, those basic control operations are transferred from the hierarchical network control robot 10 to the robot learning simulator 20, where they are either learned together with the basic control operations already held or learned additionally. The resulting weights are then transferred to the hierarchical network control robot 10 to set its weights.

Thus, according to the present invention, the learning of basic control operations for setting the weights, which requires a great deal of processing time, can be performed by the robot learning simulator 20. By using a high-speed dedicated circuit for the robot control procedure section of the robot learning simulator 20, the weights can be learned at high speed without adding hardware to the hierarchical network control robot 10. In addition, the learned weights can be verified without operating the actual robot, so development efficiency can be expected to improve.

[Embodiment]

The present invention is described in detail below with reference to embodiments.

FIG. 2 is a detailed configuration diagram of an embodiment of the robot control means 3 described with reference to FIG. 1. As the figure shows, the robot control means 3 of the present invention is composed of a hierarchical network made up of a kind of node called a unit and weighted internal connections. This configuration is basically the same as the processing scheme known as the back-propagation method, which has been proposed as a data processing scheme that gives computers an adaptive capability (D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation," Parallel Distributed Processing, Vol. 1, pp. 318-364, The MIT Press, 1986).

An outline of the back-propagation method follows.

In the back-propagation method, a hierarchical network is built from a kind of node called a unit and weighted internal connections. FIG. 3 shows the internal configuration of a unit, which is called the basic unit 30. The basic unit 30 is a multiple-input, single-output system. It has an accumulation section 31, which multiplies each input by the weight of its internal connection and sums all the products, and a threshold processing section 32, which applies threshold processing to that sum and produces a single output. In other words, data processing in the basic unit 30 consists of a sum-of-products operation on the inputs and weights followed by threshold processing. The back-propagation method uses the sigmoid function of equation (2) as the threshold function.

The operations performed in the basic unit are expressed as follows.

$$x_{pi} = \sum_{h} y_{ph} W_{ih} \qquad (1)$$

$$y_{pi} = \frac{1}{1 + \exp(-x_{pi} + \theta_i)} \qquad (2)$$

where
h: unit number in layer h
i: unit number in layer i
p: input pattern number
θ_i: threshold of the i-th unit of layer i
W_ih: weight of the internal connection between layers h and i
x_pi: sum of products of the inputs from the units of layer h to the i-th unit of layer i
y_ph: output of layer h for the p-th pattern input
y_pi: output of layer i for the p-th pattern input

In the back-propagation method, basic units 30 built on this sum-of-products operation and threshold processing are arranged in a layered network structure with error feedback, as shown in FIG. 2. Using an algorithm that adaptively and automatically adjusts the weights and thresholds through error feedback, the network is made to learn the desired data processing (the association between input patterns and output patterns), and adaptive data processing is thereby performed.
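As a minimal sketch of the basic unit described by equations (1) and (2), the sum-of-products step and the sigmoid threshold step can be written in a few lines of Python. The function name `basic_unit` and the numeric values are illustrative assumptions, not taken from the patent.

```python
import math

def basic_unit(inputs, weights, theta):
    """Sketch of basic unit 30: weighted sum followed by sigmoid threshold processing."""
    # Accumulation section 31: x_pi = sum over h of y_ph * W_ih, as in equation (1)
    x = sum(y * w for y, w in zip(inputs, weights))
    # Threshold processing section 32: y_pi = 1 / (1 + exp(-x_pi + theta_i)), as in equation (2)
    return 1.0 / (1.0 + math.exp(-x + theta))

# Hypothetical values chosen so that the weighted sum equals the threshold.
print(basic_unit([1.0, 0.0, 1.0], [0.5, -0.25, 0.25], theta=0.75))  # prints 0.5
```

When the weighted sum equals the threshold, the exponent is zero and the unit outputs exactly 0.5, which is the midpoint of the sigmoid.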

Equations (1) and (2) show that the back-propagation method must adjust the weights and the thresholds at the same time, yet setting the weights and setting the thresholds interfere with each other, which makes this a difficult task. In the conventional back-propagation method a threshold is given to each unit, but when the number of units is large, giving every unit its own threshold becomes a complicated and tedious task, so automatic threshold setting has long been desired. In the robot control means 3 of the present invention, therefore, as shown in FIG. 2, the input layer contains, in addition to as many input units 3-h as there are input signals, a threshold input unit 3′-h for threshold adjustment, to which "1" is always input.

Providing the threshold input unit 3′-h allows the threshold of each unit 3-i in the intermediate layer to be set by the same processing that sets the weights W_ih for the intermediate layer, for the following reason. In each unit 3-i of the intermediate layer, the value y_ph′ from the one extra unit 3′-h provided in the input layer is always "1", so from equation (1) the sum x′_pi that includes this unit is

$$x'_{pi} = \sum_{h} y_{ph} W_{ih} + y_{ph'} W_{ih'}$$

$$\therefore\; x'_{pi} = x_{pi} + W_{ih'}$$

(where h′ is the number of the threshold input unit in the input layer). Setting θ_i = −W_ih′ and substituting into equation (2) gives

$$y_{pi} = \frac{1}{1 + \exp(-x_{pi} - W_{ih'})}$$

which corresponds to effectively replacing the threshold θ_i of equation (2) with −W_ih′. This removes the burden of setting the threshold of every unit in the intermediate layer one by one.
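The equivalence argued above can be checked numerically. The following sketch compares a unit with an explicit threshold θ_i against a unit that has no threshold but receives one extra input fixed at 1 with weight W_ih′ = −θ_i. Function names and values are assumptions made for illustration.

```python
import math

def unit_with_threshold(inputs, weights, theta):
    # Explicit threshold: y = 1 / (1 + exp(-x + theta))
    x = sum(y * w for y, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-x + theta))

def unit_with_threshold_input(inputs, weights, w_threshold):
    # No explicit threshold: an extra input fixed at 1 (the threshold input
    # unit) carries the weight w_threshold, which is learned like any other weight.
    ext_inputs = list(inputs) + [1.0]
    ext_weights = list(weights) + [w_threshold]
    x = sum(y * w for y, w in zip(ext_inputs, ext_weights))
    return 1.0 / (1.0 + math.exp(-x))

inputs, weights, theta = [1.0, 0.0, 1.0], [0.5, -0.25, 0.25], 0.5
a = unit_with_threshold(inputs, weights, theta)
b = unit_with_threshold_input(inputs, weights, -theta)  # W_ih' = -theta_i
print(abs(a - b) < 1e-12)  # prints True
```

Because the threshold becomes an ordinary weight, the same weight-update rule adjusts it and no separate threshold-learning step is needed.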

Next, the information stored in the basic control operation storage means 5 described with reference to FIG. 1 is explained using the specific example of FIG. 4. For example, when the sensor outputs are binary and there are three input units 3-h as shown in FIG. 2, there are eight possible input patterns to the robot control means 3, as FIG. 4 shows. If the robot has two action patterns, for example moving and stopped, the robot's action pattern, "1" or "0", is determined for each of these eight input patterns. Since the action the robot should inherently take for each detection pattern of the sensor means 1 is known in this way, the basic control operation storage means 5 is configured to hold a preselected subset of this information. When the number of patterns is small, as with the eight input patterns of FIG. 4, all of them can be stored.

To make the example of FIG. 4 easier to understand, a specific example of the robot's action pattern, shown in FIG. 5, is described. FIG. 5 considers a robot that has the three optical sensors shown in FIG. 5(a) and one motor for the action "turn right". Suppose we want this robot to follow a light source that moves around it, as shown in FIG. 5(a). The possible sensor input patterns are the combinations listed in the input pattern column of FIG. 4. For example, input pattern "0" in FIG. 4 is the state in which all three optical sensors are OFF, input pattern "4" is the state in which one optical sensor is ON and the other two are OFF, and input pattern "6" is the state in which two optical sensors are ON and the remaining one is OFF. To make the robot follow the rotating light source, it is sufficient that the network, through learning and training, comes to produce the output patterns (actions) listed in the teacher signal column of FIG. 4 in response to these input patterns. For input pattern "0", all three optical sensors are OFF, that is, no light is arriving from the light source, so the output pattern is "0", meaning "turn the motor OFF". For input pattern "6", two optical sensors are ON and the third is OFF, that is, the robot has just captured the light source (FIG. 5(a)), so the output again means "turn the motor OFF". For input pattern "4", one optical sensor is ON and the other two are OFF, that is, the light source the robot had captured has moved further (FIG. 5(b)), so the output means "turn the motor ON (turn right)". If teacher signals such as those in FIG. 4 are realized, the robot's action pattern described with FIG. 5 is controlled as desired.
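The stored information can be pictured as a small lookup table from input pattern number to teacher signal. The sketch below enumerates the eight binary patterns of three sensors and records only the three teacher signals that the description states explicitly (patterns 0, 4, and 6). The most-significant-bit-first ordering of the sensor bits and all names are assumptions made for illustration.

```python
from itertools import product

NUM_SENSORS = 3  # three binary optical sensors, as in the FIG. 4 example

# Pair each pattern number with its sensor bits (assumed most significant bit first).
patterns = [
    (int("".join(map(str, bits)), 2), bits)
    for bits in product((0, 1), repeat=NUM_SENSORS)
]

# Basic control operation storage: pattern number -> teacher signal
# (0 = motor OFF, 1 = motor ON / turn right). Only the entries stated in the text.
storage = {0: 0, 4: 1, 6: 0}

for number, bits in patterns:
    teacher = storage.get(number, "not stated")
    print(number, bits, teacher)
```

The remaining five teacher signals appear only in FIG. 4, so the table leaves them unfilled rather than guessing them.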

To achieve this control, the present invention determines the weights W_ih and W_ji of the internal connections of the hierarchical network of the robot control means 3 by learning from the information stored in the basic control operation storage means 5. That is, the operation mode of the robot control means 3 is set to the learning mode, the stored information is read out of the basic control operation storage means 5 in sequence, and the values of the weights W_ih and W_ji are determined automatically by the learning algorithm of the back-propagation method.

Next, the learning procedure for determining the values of the weights W_ih and W_ji is described. The description uses the more general case shown in FIG. 6, in which the output layer of the robot control means 3 has a plurality of output units 3-j.

First, the initial values of the internal connection weights are determined, using random numbers no greater than 1. Random numbers are used because, if the weight values are all the same or are symmetric with respect to the units, the back-propagation method produces no change in the weights and learning stops progressing. Next, the input patterns stored in the basic control operation storage means 5 and the desired output patterns paired with them are read out in sequence. For example, if there are eight stored items as in FIG. 4, all of them are read out in sequence.
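A minimal sketch of the random initialization step follows. The text only says that random numbers no greater than 1 are used; drawing uniformly from [-1, 1], the layer sizes, and the fixed seed are assumptions made here for illustration.

```python
import random

def init_weights(n_out, n_in, rng, limit=1.0):
    """Return an n_out x n_in weight matrix with entries drawn uniformly from [-limit, limit]."""
    return [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)            # fixed seed so the sketch is reproducible
W_ih = init_weights(2, 4, rng)    # 2 intermediate units, 3 sensor inputs + 1 threshold input
W_ji = init_weights(1, 2, rng)    # 1 output unit fed by 2 intermediate units

# Distinct starting values break the symmetry between units.
flat = [w for row in W_ih + W_ji for w in row]
print(len(set(flat)) > 1)
```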

Next, the control parameters of the back-propagation method are entered. Here the control parameters are the learning rate, a coefficient on the amount by which the weights change in each iteration of the incremental learning, and the learning speed, which suppresses oscillation at convergence. The internal connection weights are then learned incrementally according to the back-propagation algorithm so that the output values of the output layer units 3-j match the desired outputs. Learning ends when the error between the output values of the output layer units and the desired output values falls below a predetermined value for every input pattern.

The specific algorithm for updating the weights of the hierarchical network is as follows. By analogy with equations (1) and (2), the following equations are obtained:

$$x_{pi} = \sum_{h} y_{ph} W_{ih} \qquad (3)$$

$$y_{pi} = \frac{1}{1 + \exp(-x_{pi} + \theta_i)} \qquad (4)$$

$$x_{pj} = \sum_{i} y_{pi} W_{ji} \qquad (5)$$

$$y_{pj} = \frac{1}{1 + \exp(-x_{pj} + \theta_j)} \qquad (6)$$

where
y_ph: output of the h-th unit of layer h (here the input layer) for the p-th pattern input
y_pi: output of the i-th unit of layer i (here the intermediate layer) for the p-th pattern input
y_pj: output of the j-th unit of layer j (here the output layer) for the p-th pattern input
x_pi: total input to the i-th unit of layer i for the p-th pattern input
x_pj: total input to the j-th unit of layer j for the p-th pattern input
W_ih: weight between the h-th unit of layer h and the i-th unit of layer i
W_ji: weight between the i-th unit of layer i and the j-th unit of layer j

From these values, the sum of squared errors E_p between the teacher signal vector and the output vector of the network is computed as the error of the network:

$$E_p = \frac{1}{2} \sum_{j} (y_{pj} - d_{pj})^2 \qquad (7)$$

$$E = \sum_{p} E_p$$

where
E_p: error for the p-th pattern input
E: total of the errors over all pattern inputs
d_pj: teacher signal for the j-th unit of layer j for the p-th pattern input

To find the relationship between the error and the output of the output layer, equation (7) is partially differentiated with respect to y_pj:

$$\frac{\partial E_p}{\partial y_{pj}} = y_{pj} - d_{pj} \qquad (8)$$

To find the relationship between the error and the input to layer j, E_p is partially differentiated with respect to x_pj:

$$\frac{\partial E_p}{\partial x_{pj}} = \frac{\partial E_p}{\partial y_{pj}} \frac{\partial y_{pj}}{\partial x_{pj}} = (y_{pj} - d_{pj})\, y_{pj} (1 - y_{pj}) \qquad (9)$$

In this embodiment, one input unit 3′-h is provided in the input layer and the value "1" is always input to it, which realizes automatic adjustment of the thresholds θ of the other units. To find the relationship between the error and the weights between layers i and j, E_p is partially differentiated with respect to W_ji, which gives a solution expressed as a product of the terms above:

$$\frac{\partial E_p}{\partial W_{ji}} = \frac{\partial E_p}{\partial x_{pj}} \frac{\partial x_{pj}}{\partial W_{ji}} = (y_{pj} - d_{pj})\, y_{pj} (1 - y_{pj})\, y_{pi} \qquad (10)$$

Next, the change in the error E_p with respect to the output y_pi of layer i is

$$\frac{\partial E_p}{\partial y_{pi}} = \sum_{j} \frac{\partial E_p}{\partial x_{pj}} \frac{\partial x_{pj}}{\partial y_{pi}} = \sum_{j} (y_{pj} - d_{pj})\, y_{pj} (1 - y_{pj})\, W_{ji} \qquad (11)$$

The change in E_p with respect to a change in the total input x_pi to a unit of layer i is then a sum of products:

$$\frac{\partial E_p}{\partial x_{pi}} = \frac{\partial E_p}{\partial y_{pi}} \frac{\partial y_{pi}}{\partial x_{pi}} = \Big( \sum_{j} (y_{pj} - d_{pj})\, y_{pj} (1 - y_{pj})\, W_{ji} \Big)\, y_{pi} (1 - y_{pi}) \qquad (12)$$

and the change in E_p with respect to a change in the weights between layers h and i is likewise a sum of products:

$$\frac{\partial E_p}{\partial W_{ih}} = \frac{\partial E_p}{\partial x_{pi}} \frac{\partial x_{pi}}{\partial W_{ih}} = \Big( \sum_{j} (y_{pj} - d_{pj})\, y_{pj} (1 - y_{pj})\, W_{ji} \Big)\, y_{pi} (1 - y_{pi})\, y_{ph} \qquad (13)$$

From these, the relationship between the error for all input patterns and the weights between layers i and j is

$$\frac{\partial E}{\partial W_{ji}} = \sum_{p} \frac{\partial E_p}{\partial W_{ji}} \qquad (14)$$

and the relationship between the error for all input patterns and the weights between layers h and i is

$$\frac{\partial E}{\partial W_{ih}} = \sum_{p} \frac{\partial E_p}{\partial W_{ih}} \qquad (15)$$
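The chain of partial derivatives described above can be sketched in plain Python for a network with one intermediate layer. This is an illustrative sketch under stated assumptions, not the patent's implementation: the error is taken as E_p = 1/2 Σ_j (y_pj − d_pj)², the intermediate-layer thresholds are carried by a constant-1 input unit, the output-layer thresholds are fixed at zero for brevity, and all function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(pattern, W_ih, W_ji):
    """Forward pass; W_ih[i][h] and W_ji[j][i] are the connection weights."""
    y_h = list(pattern) + [1.0]  # threshold input unit always outputs 1
    y_i = [sigmoid(sum(w * y for w, y in zip(row, y_h))) for row in W_ih]
    y_j = [sigmoid(sum(w * y for w, y in zip(row, y_i))) for row in W_ji]
    return y_h, y_i, y_j

def pattern_error(pattern, target, W_ih, W_ji):
    """E_p = 1/2 * sum_j (y_pj - d_pj)^2 (assumed form of the squared error)."""
    _, _, y_j = forward(pattern, W_ih, W_ji)
    return 0.5 * sum((y - d) ** 2 for y, d in zip(y_j, target))

def pattern_gradients(pattern, target, W_ih, W_ji):
    """Return (dE_p/dW_ji, dE_p/dW_ih) for one pattern."""
    y_h, y_i, y_j = forward(pattern, W_ih, W_ji)
    # dE_p/dx_pj = (y_pj - d_pj) * y_pj * (1 - y_pj)
    dE_dxj = [(y - d) * y * (1.0 - y) for y, d in zip(y_j, target)]
    # dE_p/dW_ji = dE_p/dx_pj * y_pi
    g_ji = [[dj * yi for yi in y_i] for dj in dE_dxj]
    # dE_p/dx_pi = (sum_j dE_p/dx_pj * W_ji) * y_pi * (1 - y_pi)
    dE_dxi = [
        sum(dj * W_ji[j][i] for j, dj in enumerate(dE_dxj)) * yi * (1.0 - yi)
        for i, yi in enumerate(y_i)
    ]
    # dE_p/dW_ih = dE_p/dx_pi * y_ph
    g_ih = [[di * yh for yh in y_h] for di in dE_dxi]
    return g_ji, g_ih

def total_gradients(samples, W_ih, W_ji):
    """Sum the per-pattern gradients over all (pattern, target) pairs."""
    G_ji = [[0.0] * len(row) for row in W_ji]
    G_ih = [[0.0] * len(row) for row in W_ih]
    for pattern, target in samples:
        g_ji, g_ih = pattern_gradients(pattern, target, W_ih, W_ji)
        for j, row in enumerate(g_ji):
            for i, g in enumerate(row):
                G_ji[j][i] += g
        for i, row in enumerate(g_ih):
            for h, g in enumerate(row):
                G_ih[i][h] += g
    return G_ji, G_ih
```

A convenient way to gain confidence in such a derivation is to compare each analytic gradient entry with a central finite difference of `pattern_error`; the two should agree to several decimal places.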

Equations (14) and (15) give the rate of change of the error with respect to a change in the weights between the layers. If the weights are changed so that the resulting change in the error is always negative, the well-known gradient method makes E, the sum of squared errors, approach 0 asymptotically. In this embodiment the amounts of change per update, ΔW_ji and ΔW_ih, are therefore set as follows, and the operation is repeated according to the gradient method so that E converges to zero:

$$\Delta W_{ji} = -\varepsilon \frac{\partial E}{\partial W_{ji}} \qquad (16)$$

$$\Delta W_{ih} = -\varepsilon \frac{\partial E}{\partial W_{ih}} \qquad (17)$$

where ε is the learning rate (it plays the same role as the step coefficient of the gradient method).

Further, to suppress oscillation at convergence in the gradient method, the present invention applies the learning speed to equations (16) and (17) and sets ΔW_ji and ΔW_ih as follows:

$$\Delta W_{ji}(t) = -\varepsilon \frac{\partial E}{\partial W_{ji}} + \alpha\, \Delta W_{ji}(t-1) \qquad (18)$$

$$\Delta W_{ih}(t) = -\varepsilon \frac{\partial E}{\partial W_{ih}} + \alpha\, \Delta W_{ih}(t-1) \qquad (19)$$

where α is the learning speed constant and t is the iteration number. The weights W_ih and W_ji are stored in storage devices corresponding to the units 3-i and 3-j respectively, and their contents are corrected by feedback during the learning described above. The weights W_ih and W_ji finally obtained as the learning result are used in the actual processing described next. The basic control operation storage means 5 may serve as this storage device, or a separate one may be provided.
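The weight update with a learning rate ε and a learning speed constant α can be sketched for a single weight as follows. The update form used here, Δw(t) = −ε · gradient + α · Δw(t−1), is the usual momentum-style rule and is an assumption about the exact form of the patent's equations; the function name and the toy error function are also hypothetical.

```python
def momentum_update(weight, grad, prev_delta, epsilon, alpha):
    """Return (new_weight, delta) with delta = -epsilon * grad + alpha * prev_delta."""
    delta = -epsilon * grad + alpha * prev_delta
    return weight + delta, delta

# Toy example (assumed for illustration): minimise E(w) = w**2 / 2, whose gradient is w.
w, delta = 1.0, 0.0
for t in range(50):
    w, delta = momentum_update(w, grad=w, prev_delta=delta, epsilon=0.1, alpha=0.5)
print(abs(w) < 1e-3)  # prints True
```

Carrying part of the previous change into the current one smooths successive updates, which is the stated purpose of suppressing oscillation near convergence.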

Next, the control of the robot carried out using the internal connection weights W_ih and W_ji determined in this way is described.

When learning of the hierarchical network is complete, the operation mode of the robot control means 3 is switched to the processing mode, that is, the mode in which the robot's actions are actually controlled. In this mode, the input pattern formed by the detection signals taken in by the sensor means 1, which comprises a plurality of sensors, is input to the input layer of the hierarchical network of the robot control means 3. The robot control means 3 applies the sum-of-products operation and threshold processing to this input pattern, using the weights and thresholds determined in the learning mode, to obtain an output pattern, and sends that output pattern to the action pattern generation means 2 as a control signal. On receiving the control signal, the action pattern generation means 2 generates the corresponding action pattern signal, which defines the robot's action pattern. By repeating this processing, the robot can act appropriately in response to the detection patterns of the plurality of sensors.
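The processing mode amounts to a forward pass followed by a decision on the motor. The sketch below uses hand-picked weights (not learned values and not values from the patent) that reproduce the three input/output pairs stated in the description; the 0.5 decision level, the bit ordering, and all names are assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def control_signal(sensor_bits, W_ih, W_ji):
    """Forward pass of the network: sensor pattern in, output pattern out."""
    y_h = list(sensor_bits) + [1.0]  # threshold input unit always outputs 1
    y_i = [sigmoid(sum(w * y for w, y in zip(row, y_h))) for row in W_ih]
    return [sigmoid(sum(w * y for w, y in zip(row, y_i))) for row in W_ji]

def motor_command(sensor_bits, W_ih, W_ji, on_level=0.5):
    """Stand-in for action pattern generation: 1 = motor ON (turn right), 0 = motor OFF."""
    return 1 if control_signal(sensor_bits, W_ih, W_ji)[0] > on_level else 0

# Hand-picked illustrative weights: the first intermediate unit responds when the
# first sensor is ON and the second is OFF; the second unit stays at a constant 0.5.
W_ih = [[10.0, -10.0, 0.0, -5.0], [0.0, 0.0, 0.0, 0.0]]
W_ji = [[10.0, -10.0]]

for bits in [(0, 0, 0), (1, 0, 0), (1, 1, 0)]:
    print(bits, motor_command(bits, W_ih, W_ji))  # prints 0, then 1, then 0
```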

This processing is now examined concretely using the example of FIG. 5. Suppose the robot and the light source are in the state of FIG. 5(a): two of the robot's optical sensors are ON, the third is OFF, and the motor is OFF. If the light source then moves to the right as in FIG. 5(b), one optical sensor is ON and the other two are OFF. When this pattern is input to the network of FIG. 2, the network outputs "1", meaning "turn the motor ON (turn right)", and the robot turns right. The turn continues until two of the three optical sensors are ON. In this case, when two optical sensors are ON and one is OFF as in FIG. 5(c), the network outputs "0", meaning "turn the motor OFF", and the robot stops. In this way the robot feeds the sensor input patterns it takes in from moment to moment into the network of FIG. 2 and controls the motor according to the output pattern the network produces for each input pattern, so that it can follow the movement of the light source. This example describes the light source moving to the right, but the robot can follow in the same way when the light source moves to the left, as in FIG. 5(d).

The robot is thus controlled so as to take appropriate actions. However, when the number of sensors or control signal lines is large, the basic control operation storage means 5 cannot hold every input pattern together with its paired desired output pattern, so only basic pairs are stored. As a result, when output patterns are computed with the weights and thresholds obtained by learning from this basic data, the desired robot behavior is sometimes not obtained. In such a case, the present invention switches the operation mode to the learning mode, takes the network's connection weights at that point as initial values, stores in the basic control operation storage means 5 the pair consisting of the input pattern at the time the desired behavior was not obtained and the desired output pattern, and has the robot control means 3 automatically fine-tune the connection weights so that these input/output pattern pairs are additionally associated. This improves the control performance of the robot, so that optimal control of the robot is ultimately achieved.
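
The fine-tuning step described above can be sketched as ordinary back propagation that starts from the current weights instead of fresh ones. This is a minimal sketch under assumptions: the sigmoid unit, the squared-error measure, the learning rate, the epoch count, the function names, and the example weights and patterns are all hypothetical and chosen only to make the idea concrete.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, w_ji):
    xe = list(x) + [1.0]  # constant 1 stands in for the thresholds
    hid = [sigmoid(sum(w * v for w, v in zip(row, xe))) for row in w_ih]
    out = [sigmoid(sum(w * v for w, v in zip(row, hid))) for row in w_ji]
    return xe, hid, out

def total_error(pairs, w_ih, w_ji):
    return sum(0.5 * (t - o) ** 2
               for x, target in pairs
               for t, o in zip(target, forward(x, w_ih, w_ji)[2]))

def train(pairs, w_ih, w_ji, lr=0.5, epochs=500):
    """Plain back propagation that updates the given weights in place."""
    for _ in range(epochs):
        for x, target in pairs:
            xe, hid, out = forward(x, w_ih, w_ji)
            d_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
            d_hid = [h * (1 - h) * sum(d * w_ji[j][i] for j, d in enumerate(d_out))
                     for i, h in enumerate(hid)]
            for j, d in enumerate(d_out):
                for i, h in enumerate(hid):
                    w_ji[j][i] += lr * d * h
            for i, d in enumerate(d_hid):
                for k, v in enumerate(xe):
                    w_ih[i][k] += lr * d * v
    return w_ih, w_ji

def additional_learning(stored_pairs, failed_input, desired_output, w_ih, w_ji):
    # Store the failing case, then continue learning from the current weights.
    stored_pairs.append((failed_input, desired_output))
    return train(stored_pairs, w_ih, w_ji)

# Hypothetical current weights and stored data (2 inputs, 2 hidden, 1 output).
w_ih = [[0.5, -0.5, 0.1], [-0.3, 0.8, -0.2]]
w_ji = [[0.7, -0.4]]
stored = [([0, 1], [0])]
before = total_error(stored + [([1, 0], [1])], w_ih, w_ji)
additional_learning(stored, [1, 0], [1], w_ih, w_ji)
after = total_error(stored, w_ih, w_ji)
```

Comparing `before` and `after` shows how far the error over the enlarged data set falls when learning resumes from the existing weights.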

FIG. 7 shows networks obtained by having a network learn a kind of parity check processing by the back propagation method (automatic adjustment of weights and thresholds). Net1 in FIG. 7(A) and Net2 in FIG. 7(B) have the same odd-parity-checking characteristic: for three inputs that each take "0" or "1", the output layer outputs "1" when the number of "1" inputs is odd. In FIG. 7, the number beside each internal connection line is its weight, and the number inside each circle representing a unit is its threshold. Comparing Net1 and Net2 shows that their weights and thresholds are entirely different. This difference arises because the network has redundancy. As FIG. 7 shows, the network can also be configured so that the output of each basic unit 30 takes only the states "1" and "0".
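
That two networks with entirely different weights and thresholds can realize the same odd-parity function is easy to check with binary threshold units. The two parameter sets below, `NET_A` and `NET_B`, are hypothetical values constructed by hand for illustration; they are not the values shown in FIG. 7, and the dictionary layout is likewise an assumption.

```python
from itertools import product

def step(net, threshold):
    # Binary unit: outputs 1 only when the weighted sum exceeds the threshold.
    return 1 if net - threshold > 0 else 0

def run(net_def, inputs):
    hidden = [step(sum(w * x for w, x in zip(ws, inputs)), th)
              for ws, th in net_def["hidden"]]
    ws, th = net_def["output"]
    return step(sum(w * h for w, h in zip(ws, hidden)), th)

# Each entry is (weights, threshold). Hidden unit k fires when at least k inputs are 1.
NET_A = {"hidden": [([1, 1, 1], 0.5), ([1, 1, 1], 1.5), ([1, 1, 1], 2.5)],
         "output": ([1, -1, 1], 0.5)}
NET_B = {"hidden": [([2, 2, 2], 1.0), ([2, 2, 2], 3.0), ([2, 2, 2], 5.0)],
         "output": ([3, -3, 3], 1.0)}

for bits in product([0, 1], repeat=3):
    odd = sum(bits) % 2
    assert run(NET_A, bits) == odd and run(NET_B, bits) == odd
```

Both parameter sets pass the check on all eight input patterns, which is the kind of redundancy the comparison of Net1 and Net2 points to.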

Next, redundancy in the number of intermediate-layer units of the hierarchical network is described. FIG. 8 shows a network that realizes the exclusive-OR function, as an example in which the number of intermediate-layer units has no redundancy. For exclusive OR, the input patterns cannot be discriminated by a single straight line (linear discrimination); at least two straight lines are required, as shown in FIG. 9(B). This discrimination therefore requires a minimum of two intermediate-layer units, as shown in FIG. 9(A). In the present invention, however, a threshold input unit 3'-h that always carries "1" as its input signal is provided in the input layer in order to adjust the thresholds of the intermediate-layer and output-layer units automatically, so in the network of FIG. 8 the number of intermediate-layer units has no redundancy relative to the number of discriminations to be made. In this case the two intermediate-layer units 3-i must perform automatic adjustment that also covers the threshold of the output-layer unit 3-j, and the number of learning iterations needed to adapt to the exclusive-OR characteristic increases. FIG. 10, by contrast, shows an example of a network with redundancy. When the number of intermediate-layer units is made redundant as shown, the surplus unit can devote itself to adjusting the threshold of the output-layer unit 3-j, so the number of learning iterations is greatly reduced compared with the case without redundant units.
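
The statement that a single straight line cannot discriminate exclusive OR can be checked with a short argument. The symbols $w_1$, $w_2$ (weights) and $\theta$ (threshold) of a single threshold unit are introduced here only for illustration.

```latex
\begin{align*}
&\text{Assume one unit outputs 1 exactly when } w_1 x_1 + w_2 x_2 > \theta. \\
&(x_1, x_2) = (0,0),\ \text{output } 0:\quad 0 \le \theta \\
&(x_1, x_2) = (1,0),\ \text{output } 1:\quad w_1 > \theta \\
&(x_1, x_2) = (0,1),\ \text{output } 1:\quad w_2 > \theta \\
&(x_1, x_2) = (1,1),\ \text{output } 0:\quad w_1 + w_2 \le \theta \\
&\text{Adding the two middle lines gives } w_1 + w_2 > 2\theta \ge \theta,
 \text{ which contradicts the last line.}
\end{align*}
```

No choice of $w_1$, $w_2$, and $\theta$ satisfies all four conditions, so at least two lines, and hence at least two intermediate-layer units, are needed.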

Next, a concrete example of calculating the internal connection weights is described.

FIGS. 11 and 13 show the internal connection weights between the units at the end of learning the odd parity check function. The weights in FIG. 11 and FIG. 13 differ, but the function is the same. The difference is due to different learning rate and learning speed parameters and to the redundancy in the number of intermediate-layer units. FIGS. 12(A), (B), and (C) each show the input and output when an arbitrary pattern is input after the learning shown in FIG. 11, and FIGS. 14(A), (B), and (C) each show the input and output when an arbitrary pattern is input after the learning shown in FIG. 13. Because the robot control means 3 performs its calculations with analog values, the output values are not exactly 1 or 0, but it is clearly easy to decide between 1 and 0 from the output values in FIGS. 12 and 14, for example by applying a threshold.

FIG. 15 shows the result of having exactly the same network that learned the odd parity check function learn the even parity check function, and gives the weight of each internal connection. FIGS. 16(A), (B), and (C) each show the input and output when an arbitrary pattern is input after the learning shown in FIG. 15.

Next, an embodiment of the present invention for controlling the action patterns of a robot that tracks a moving object or flees from it is described. This robot is controlled so that, while avoiding obstacles, it either tracks or flees from a moving object that is a source of light and of ultrasonic waves. As drive sources for tracking or fleeing, it is equipped with two propulsion motors for moving forward and backward and two steering motors for turning left and right, together with a piezoelectric buzzer that corresponds to the human voice. In this embodiment the motors and the piezoelectric buzzer are controlled in a binary ON/OFF mode. Because the robot has five drive sources, the output layer of the hierarchical network of the robot control means 3 provided to control it has five units, and the robot has 2⁵ possible action patterns.

FIG. 17 shows the arrangement of the sensors with which the robot is equipped for tracking and fleeing. As shown, the robot 100 has on its head two head light sensors 101R and 101L, corresponding to the two eyes, and two head ultrasonic sensors 102R and 102L, corresponding to the two ears. On its body it has two body light sensors 103R and 103L, corresponding to the two eyes; one body ultrasonic sensor (or transmitter) 104, corresponding to an ear; and three touch sensors 105F, 105R, and 105L, arranged at central angles of 120 degrees to sense contact with obstacles. The robot 100 is further equipped with two limit switches 106R and 106L, which detect that the head and body, turned by the robot's own steering motors, have reached the limit rotation angle, for example 180°. The head light sensors 101R and 101L and the body light sensors 103R and 103L detect the light emitted by the target moving object, and the head ultrasonic sensors 102R and 102L and the body ultrasonic sensor 104 detect the ultrasonic waves it emits. Two sets of sensors corresponding to human eyes are provided, namely the head light sensors 101R and 101L and the body light sensors 103R and 103L, because the moving object emits light of different wavelengths during tracking and during fleeing, so two kinds of light sensors tuned to those wavelengths are needed. The head and body of the robot rotate as one unit relative to the robot's base, and the robot's wheels are configured to turn with this rotation so that they also face the front of the head.

Since the robot thus has 12 sensors, the input layer of the hierarchical network of the robot control means 3 provided to control it has 12 units (13 if the threshold unit is included). In this embodiment the sensors produce binary outputs, so there are 2¹² possible detection patterns.

The action the robot should take to track the moving object is inherently determined for each of these 2¹² detection patterns, so, as described above, a selection of this information is stored in advance in the basic control operation storage means 5. FIG. 18 shows an example in which the drive patterns that the robot's drive sources should take for 36 sensor detection patterns are stored in the basic control operation storage means 5. In the figure, "HEAD,EAR" denotes the head ultrasonic sensors 102R and 102L, "HEAD,EYE" the head light sensors 101R and 101L, "BODY,E" the body ultrasonic sensor 104, "BODY,EYE" the body light sensors 103R and 103L, "BODY,TOUCH" the touch sensors 105F, 105R, and 105L, and "BODY,LIM" the limit switches 106R and 106L; "RIGHT" denotes the right-turn steering motor, "LEFT" the left-turn steering motor, "FWD" the forward propulsion motor, "BWD" the reverse propulsion motor, and "BUZZ" the piezoelectric buzzer.

To aid understanding of the stored information in FIG. 18, several entries are explained concretely. The sensor detection pattern in row 5 of FIG. 18 indicates that the head light sensor 101L, corresponding to the left eye of the head, is ON. The moving object is then to the front left, so, as the teacher signal shows, turning ON steering motor L and propulsion motor F makes the robot advance while turning left and thus track the object. The pattern in row 8 indicates that the head light sensor 101L is ON and that the touch sensor 105R on the right side of the robot is ON. Here the right side of the robot is touching an obstacle but the moving object is to the front left, so, as the teacher signal shows, steering motor L and propulsion motor F are likewise turned ON and the robot advances while turning left. The pattern in row 15 indicates that the head light sensor 101L and the head light sensor 101R, corresponding to the right eye of the head, are both ON. The moving object is then directly ahead, so, as the teacher signal shows, turning ON propulsion motor F makes the robot advance and track the object.
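
The three rows just described can be written as input/teacher pairs for learning. The sketch below is illustrative only: the identifier spellings, the order of the 12 input units, and the helper names `encode` and `teacher_pair` are assumptions, and the drive order is taken as RIGHT, LEFT, FWD, BWD, BUZZ.

```python
# Input-unit order is an assumption; the labels follow those used in FIG. 18.
SENSORS = ["HEAD_EAR_R", "HEAD_EAR_L", "HEAD_EYE_R", "HEAD_EYE_L",
           "BODY_E", "BODY_EYE_R", "BODY_EYE_L",
           "BODY_TOUCH_F", "BODY_TOUCH_R", "BODY_TOUCH_L",
           "BODY_LIM_R", "BODY_LIM_L"]
DRIVES = ["RIGHT", "LEFT", "FWD", "BWD", "BUZZ"]

def encode(active, names):
    """Return a 0/1 vector with a 1 for every name that is ON."""
    unknown = set(active) - set(names)
    if unknown:
        raise ValueError(f"unknown names: {sorted(unknown)}")
    return [1 if n in active else 0 for n in names]

def teacher_pair(sensors_on, drives_on):
    return encode(sensors_on, SENSORS), encode(drives_on, DRIVES)

# Rows 5, 8, and 15 of FIG. 18 as described in the text.
ROWS = {
    5: teacher_pair({"HEAD_EYE_L"}, {"LEFT", "FWD"}),
    8: teacher_pair({"HEAD_EYE_L", "BODY_TOUCH_R"}, {"LEFT", "FWD"}),
    15: teacher_pair({"HEAD_EYE_L", "HEAD_EYE_R"}, {"FWD"}),
}

# 12 binary sensors give 2**12 possible patterns; only a few are stored.
POSSIBLE_PATTERNS = 2 ** len(SENSORS)
```

Only 36 of the 4096 possible patterns are stored in the example of FIG. 18, which is why the network's ability to generalize matters.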

Automatically determining the internal connection weights W_ih and W_ji of the hierarchical network of the robot control means 3 by the back propagation learning algorithm, using the information stored in the basic control operation storage means 5, is basically no different from the robot of the embodiment described with FIG. 5. FIG. 19 shows the weights W_ih and W_ji obtained by running this learning algorithm. The calculation used five units in the intermediate layer of the hierarchical network of the robot control means 3 and the information shown in FIG. 18 as the contents of the basic control operation storage means 5. As with the robot of the FIG. 5 embodiment, the robot is then controlled to track the moving object using the internal connection weights W_ih and W_ji determined in this way.

Next, the content of this robot control is described concretely with reference to the explanatory diagram of FIG. 20.

Suppose that the robot and the moving object, which has a light source and an ultrasonic source, are in the state of FIG. 20(a), that is, the head light sensor 101L corresponding to the left eye of the head is ON. When this sensor input signal is applied to the hierarchical network whose weights have been determined, the network generates the output vector (0,1,1,0,0), which represents the control "turn ON steering motor L and propulsion motor F (turn left and advance)". The robot therefore advances while turning left. If the moving object then moves and the robot follows it as shown in FIG. 20(b), the robot's sensor input changes to the state in which the head light sensor 101L and the touch sensor 105R are ON, that is, the right side of the robot has struck an obstacle while the moving object is visible to the front left. When this sensor signal is applied to the network, it again generates the output vector (0,1,1,0,0), representing "turn ON steering motor L and propulsion motor F (turn left and advance)", and the robot advances while turning left. If the moving object moves again and the robot follows it as shown in FIG. 20(c), the head light sensors 101R and 101L are both ON, that is, the robot has the moving object directly in front of it. When this sensor input signal is applied to the network, it generates the output vector (0,0,1,0,0), representing "turn ON the propulsion motor (advance)", and the robot advances toward the moving object as shown in FIG. 20(d). In this way the robot feeds the sensor input signal patterns it acquires from moment to moment into the hierarchical network of the robot control means 3 and controls the motors according to the appropriate output pattern that the network produces for each input pattern, and can thereby track the moving object.
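
The moment-to-moment cycle of sensing, network evaluation, and motor control can be sketched as a simple loop. Everything named here is hypothetical: `control_loop`, `decode`, and the lookup table `FIG20` that stands in for the trained network are illustrative, and the output-unit order RIGHT, LEFT, FWD, BWD, BUZZ is an assumption chosen to be consistent with the output vectors quoted above.

```python
DRIVES = ["RIGHT", "LEFT", "FWD", "BWD", "BUZZ"]  # assumed output-unit order

def decode(output_vector):
    """List the drive sources that an output vector switches ON."""
    return [name for name, bit in zip(DRIVES, output_vector) if bit]

def control_loop(sensor_patterns, network, actuate):
    """Feed each sensed pattern to the network and drive the robot with the result."""
    for pattern in sensor_patterns:
        actuate(decode(network(pattern)))

# Stand-in for the trained network: a lookup over the three situations of FIG. 20,
# keyed by the reference numerals of the sensors that are ON.
FIG20 = {
    frozenset({"101L"}): (0, 1, 1, 0, 0),
    frozenset({"101L", "105R"}): (0, 1, 1, 0, 0),
    frozenset({"101L", "101R"}): (0, 0, 1, 0, 0),
}

log = []
control_loop(
    [frozenset({"101L"}), frozenset({"101L", "105R"}), frozenset({"101L", "101R"})],
    network=FIG20.__getitem__,
    actuate=log.append,
)
print(log)
```

In a real controller the lookup would be replaced by the forward pass of the hierarchical network, which also produces outputs for patterns that were never stored.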

For convenience, the explanation of FIG. 20 used the same control information (rows 5, 8, and 15) as is stored in the basic control operation storage means 5 shown in FIG. 18. However, even when the sensors detect a pattern that is not stored in the basic control operation storage means 5, the learned weights of the hierarchical network produce an output pattern for tracking the moving object, so the robot can track it by controlling the motors according to that output pattern. If the output of the hierarchical network does fail to track the moving object, the weights of the hierarchical network are changed by learning, as with the robot of the FIG. 5 embodiment.

In the present invention, even with the same hierarchical network structure in the robot control means 3, storing control information with different control content in the basic control operation storage means 5 yields weights W_ih and W_ji according to that information, and the different control can then be executed. FIG. 21 is an example of the basic control operations stored in the basic control operation storage means 5 so that the robot flees from the moving object. For this fleeing control the body light sensors 103R and 103L are used as the two eyes. FIG. 22 shows the weights W_ih and W_ji calculated from the stored information of FIG. 21. The structure of the hierarchical network is exactly the same as for tracking. With these weights, control of a robot that flees from the moving object can be executed as is.

Next, an embodiment of the present invention for obtaining the internal connection weights W_ih and W_ji of the hierarchical network is described. These weights are basically obtained by a control processor or dedicated control hardware mounted on the robot. As noted above, however, the arithmetic of the back propagation method used to obtain W_ih and W_ji is fairly complex, so obtaining the weights requires a considerable computation time. Moreover, improving the adequacy of the weights requires increasing the number of basic control operations stored in the basic control operation storage means 5 or the number of intermediate-layer units in the hierarchical network, which lengthens the computation time further. As a result, the robot control method based on a hierarchical network structure risks not being sufficiently practical.

The present invention therefore provides a support system in which a robot learning simulator is built on a high-speed computer. The simulator consists of the robot control method with the hierarchical network structure and a motion mimicking means that simulates the behavior of a robot acting under this control method. The weights of the hierarchical network are obtained by learning on the robot learning simulator, their validity is verified there, and the resulting weights are transferred to the processor of the actual robot.

FIG. 23 shows the system configuration of an embodiment of this support system. Elements identical to those in FIG. 1 bear the same reference symbols. Reference numeral 1' denotes pseudo sensor means, constructed on the computer to correspond to the sensor means 1; 2' denotes pseudo action pattern generating means, constructed on the computer to correspond to the action pattern generating means 2; 3' denotes pseudo robot control means, constructed on the computer to correspond to the robot control means 3; and 5' denotes pseudo basic control operation storage means, constructed on the computer to correspond to the basic control operation storage means 5. The pseudo sensor means 1', pseudo action pattern generating means 2', pseudo robot control means 3', and pseudo basic control operation storage means 5' together constitute the pseudo hierarchical network control robot 10' described with FIG. 1. Reference numeral 6 denotes operation mode determining means, which determines the processing performed by the pseudo hierarchical network control robot 10' by setting the operation mode of the pseudo robot control means 3' to the learning mode, the additional learning mode, or the processing mode; by setting the transfer of basic control operation information between the pseudo basic control operation storage means 5' and the basic control operation storage means 5 to send, receive, or no transfer; and by setting the transfer of weight information between the pseudo robot control means 3' and the robot control means 3 to send, receive, or no transfer. Reference numeral 7 denotes the motion mimicking means also described with FIG. 1, which simulates the behavior of the pseudo hierarchical network control robot 10' on the basis of the action pattern defined by the action pattern signal from the pseudo action pattern generating means 2' and generates the sensor signals that the pseudo sensor means 1' will detect at the next step. Reference numeral 21 denotes a basic control operation information transfer line, used to transfer basic control operation information between the robot learning simulator 20 and the hierarchical network control robot 10, and 22 denotes a weight information transfer line, used to transfer the weight information of the hierarchical network between the robot learning simulator 20 and the hierarchical network control robot 10.

Next, the operation of the robot learning method of the present invention configured in this way is described in detail with reference to the flowcharts of FIGS. 24, 25, and 26.

第24図に示すフローチャートは最も基本的な動作のフ
ローチャートであって，階層ネットワークの内部結合の
重みの学習と検証をすべてロボット学習シミュレータ20
側で実行する動作のフローチャートである。このフロー
チャートでは，最初に動作モード決定手段６の設定によ
り，基本制御動作格納手段５の基本制御動作情報を疑似
基本制御動作格納手段５′にと転送する。続いて，動作
モード決定手段６により学習モードにセットして，ロボ
ット学習シミュレータ20の高速計算機がこの疑似基本制
御動作格納手段５′の基本制御動作情報を用いて前述し
たバックプロパゲーション法により疑似ロボット制御手
段３′の階層ネットワークの重みの値を求める。続い
て。動作モード決定手段６により処理モードにセットし
て，動作模倣手段７が予め想定されている環境下で疑似
階層ネットワーク制御ロボット10′の行動をシミュレー
トしていくことで，求められた重みの値を検証する。具
体的には，動作模倣手段７は，疑似センサ手段１′から
疑似ロボット制御手段３′に入力するところの疑似的な
センサ検出パターンを発生し，そして疑似ロボット制御
手段３′の出力により発生するところの疑似行動パター
ン発生手段２′の疑似的な行動パターン信号から疑似階
層ネットワーク制御ロボット10′の移動位置をシミュレ
ートして，その移動位置での次の疑似的なセンサ検出パ
ターンを発生していくということを繰り返していくこと
で，求められた重みの値で疑似階層ネットワーク制御ロ
ボット10′が適格な行動をとることができるのかを検証
していくのである。この検証により求められた重みの値
が十分適格なものであると判断されるときには，動作モ
ード決定手段６の設定により，求められた重みを階層ネ
ットワーク制御ロボット10にと転送してセットする処理
を行うとともに，いまだ不十分なものであれば，用いる
疑似基本制御動作格納手段５′の基本制御動作情報を変
えたり追加したりして，十分適格な重みが求まるように
と学習するのである。The flowchart shown in FIG. 24 covers the most basic operation, in which both the learning and the verification of the weights of the internal connections of the hierarchical network are performed entirely on the robot learning simulator 20 side. In this flowchart, the operation mode determining means 6 is first set so that the basic control operation information in the basic control operation storage means 5 is transferred to the pseudo basic control operation storage means 5'. The operation mode determining means 6 then sets the learning mode, and the high-speed computer of the robot learning simulator 20 uses the basic control operation information in the pseudo basic control operation storage means 5' to obtain, by the back propagation method, the weight values of the hierarchical network of the pseudo robot control means 3'. Next, the operation mode determining means 6 sets the processing mode, and the motion mimicking means 7 verifies the obtained weight values by simulating the behavior of the pseudo hierarchical network control robot 10' in an environment assumed in advance. Specifically, the motion mimicking means 7 generates the pseudo sensor detection pattern that is input from the pseudo sensor means 1' to the pseudo robot control means 3', simulates the position to which the pseudo hierarchical network control robot 10' moves from the pseudo action pattern signal that the pseudo action pattern generating means 2' produces in response to the output of the pseudo robot control means 3', and then generates the next pseudo sensor detection pattern at that position. By repeating this process, it is verified whether the pseudo hierarchical network control robot 10' can act appropriately with the obtained weight values. When the verification shows that the weight values are adequate, the operation mode determining means 6 is set so that the obtained weights are transferred to the hierarchical network control robot 10 and set there. If they are still inadequate, the basic control operation information used in the pseudo basic control operation storage means 5' is changed or extended, and learning is repeated until adequate weights are obtained.
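The learn, verify, and transfer cycle of FIG. 24 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the implementation described in the patent: the one-dimensional track, the two-sensor encoding, the network size, and every name (`HierarchicalNetwork`, `mimic_behavior`, `learn_verify_transfer`) are hypothetical. When verification fails, the sketch simply trains further, which stands in for changing or extending the basic control operation information.

```python
import copy
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class HierarchicalNetwork:
    """Input layer -> one intermediate layer -> one output unit (assumed sizes)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row: input weights of one unit plus a trailing threshold weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb)))
              for row in self.w_hidden] + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return xb, hb, y

    def train(self, patterns, epochs=2000, lr=0.5):
        """Plain back propagation on the squared error, one pattern at a time."""
        for _ in range(epochs):
            for x, target in patterns:
                xb, hb, y = self.forward(x)
                d_out = (y - target) * y * (1.0 - y)
                d_hid = [d_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                         for j in range(len(self.w_hidden))]
                self.w_out = [w - lr * d_out * v for w, v in zip(self.w_out, hb)]
                for j, row in enumerate(self.w_hidden):
                    self.w_hidden[j] = [w - lr * d_hid[j] * v
                                        for w, v in zip(row, xb)]


def mimic_behavior(net, start, target, max_steps=50):
    """Motion mimicking: closed-loop simulation on a hypothetical 1-D track."""
    pos = start
    for _ in range(max_steps):
        if pos == target:
            return True
        # Pseudo sensor detection pattern: (target on left, target on right).
        sensors = (1.0, 0.0) if target < pos else (0.0, 1.0)
        y = net.forward(sensors)[2]
        pos += 1 if y > 0.5 else -1  # pseudo action pattern: step right or left
    return pos == target


def learn_verify_transfer(real_net, basic_ops, scenarios, max_rounds=5):
    pseudo_net = copy.deepcopy(real_net)  # pseudo robot with the same configuration
    for round_no in range(1, max_rounds + 1):
        pseudo_net.train(basic_ops)  # learning mode
        if all(mimic_behavior(pseudo_net, s, t) for s, t in scenarios):  # verification
            real_net.w_hidden = copy.deepcopy(pseudo_net.w_hidden)  # transfer weights
            real_net.w_out = list(pseudo_net.w_out)
            return round_no
    return None  # weights never passed verification


# Basic control operations: sensor pattern -> desired output (0 = left, 1 = right).
BASIC_OPS = [((1.0, 0.0), 0.0), ((0.0, 1.0), 1.0)]
SCENARIOS = [(0, 5), (7, 2), (3, 3)]  # (start, target) pairs used for verification

robot_net = HierarchicalNetwork(n_in=2, n_hidden=2)
rounds = learn_verify_transfer(robot_net, BASIC_OPS, SCENARIOS)
```

In this sketch the real robot's network is touched only once, when verified weights are copied into it, which mirrors the point that learning and verification both run on the simulator side.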

The flowchart shown in FIG. 25 covers the processing performed when the hierarchical network control robot 10, holding the weight values obtained by executing the flowchart of FIG. 24, behaves inappropriately while acting in the real world. As this flowchart shows, when an inappropriate action occurs, additional basic control operation information for eliciting a more appropriate action is either collected automatically by the robot or collected by an observation device or an observer. The collected additional basic control operation information is transferred to the robot learning simulator 20, which performs additional learning of the weight values to obtain new weight values and then verifies them. As described above, this additional learning takes the weight values obtained so far as initial values and executes the back propagation method so that the newly added basic control operation information can be realized, thereby obtaining new weight values. Instead of additional learning, it is also possible to execute the back propagation method on all of the basic control operation information, including the newly added information, to obtain new weight values.
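The two options described for FIG. 25 can be contrasted in a short sketch. It rests on assumptions not stated in the patent: additional learning is modelled as training on the new information only, starting from the current weights, while the alternative retrains on all information from a fresh random initialisation. The three-sensor patterns and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, seed=0):
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, out


def forward(weights, x):
    hidden, out = weights
    xb = list(x) + [1.0]  # trailing 1.0 feeds the threshold weight
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in hidden] + [1.0]
    return xb, hb, sigmoid(sum(w * v for w, v in zip(out, hb)))


def backprop(weights, patterns, epochs=3000, lr=0.5):
    """Return weights trained from `weights`; the argument itself is not modified."""
    hidden = [row[:] for row in weights[0]]
    out = weights[1][:]
    for _ in range(epochs):
        for x, target in patterns:
            xb, hb, y = forward((hidden, out), x)
            d_out = (y - target) * y * (1.0 - y)
            d_hid = [d_out * out[j] * hb[j] * (1.0 - hb[j])
                     for j in range(len(hidden))]
            out = [w - lr * d_out * v for w, v in zip(out, hb)]
            hidden = [[w - lr * d_hid[j] * v for w, v in zip(row, xb)]
                      for j, row in enumerate(hidden)]
    return hidden, out


def realizes(weights, patterns):
    """True if the thresholded output matches the desired output for every pattern."""
    return all((forward(weights, x)[2] > 0.5) == (t > 0.5) for x, t in patterns)


# Hypothetical basic control operation information: sensor pattern -> desired output.
OLD_OPS = [((1.0, 0.0, 0.0), 0.0), ((0.0, 1.0, 0.0), 1.0)]
NEW_OPS = [((0.0, 0.0, 1.0), 1.0)]  # collected after an inappropriate action

current = backprop(init_weights(3, 2), OLD_OPS)

# Option 1: additional learning, with the current weights as initial values.
incremental = backprop(current, NEW_OPS)

# Option 2: back propagation over all information (assumed fresh initial weights).
from_scratch = backprop(init_weights(3, 2, seed=1), OLD_OPS + NEW_OPS)
```

Training on the new information alone may disturb behaviors learned earlier, so in either case the resulting weights would still go through the verification step before being set in the robot.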

Whereas in the flowchart of FIG. 25 the additional learning is executed by the robot learning simulator 20, the flowchart shown in FIG. 26 discloses a method in which the hierarchical network control robot 10 itself executes the additional learning using its own robot control means 3.

As described above, the present invention is characterized in that it is configured so that the robot learning simulator 20 executes all or part of the learning of the weight values required by the hierarchical network control robot 10.

[Effect of the invention]

As described above, according to the present invention, the learning of the basic control operations for setting the weights, which requires a great deal of processing time, can be performed by a robot learning simulator configured on a high-speed computer. Moreover, by using a high-speed dedicated circuit for the robot control procedure section of the robot learning simulator, high-speed weight learning becomes possible without adding hardware to the hierarchical network control robot. In addition, because the learned weights can be verified without operating the actual robot, development efficiency is expected to improve.

This makes it possible to make the "soft" robot control method with a hierarchical network structure, proposed by the present applicant, more practical.

[Brief description of drawings]

FIG. 1 is a diagram explaining the principle of the present invention.
FIG. 2 is a configuration diagram of an embodiment of the robot control means.
FIG. 3 is a configuration diagram of the basic unit.
FIG. 4 is an example of the information stored in the basic control operation storage means.
FIG. 5 is a concrete example of the action patterns of the robot.
FIG. 6 is a configuration diagram of another embodiment of the robot control means.
FIG. 7 is an example of a network having an odd parity check function.
FIG. 8 is an exclusive-OR network whose intermediate layer has no redundancy.
FIG. 9 is a diagram explaining the exclusive OR and the role of the intermediate layer units.
FIG. 10 is an exclusive-OR network whose intermediate layer has redundancy.
FIG. 11 is an explanatory diagram of the internal connection weights (I) showing a learning result.
FIG. 12 is a processing example according to the weights of FIG. 11.
FIG. 13 is an explanatory diagram of the internal connection weights (II) showing a learning result.
FIG. 14 is a processing example according to the weights of FIG. 13.
FIG. 15 is an explanatory diagram of the internal connection weights (III) showing a learning result.
FIG. 16 is a processing example according to the weights of FIG. 15.
FIG. 17 is an explanatory diagram of the sensors with which the mobile robot is equipped.
FIG. 18 is an example (I) of the information stored in the basic control operation storage means of the mobile robot.
FIG. 19 is an explanatory diagram of the internal connection weights (I) showing a learning result of the mobile robot.
FIG. 20 is an explanatory diagram of tracking a moving object.
FIG. 21 is an example (II) of the information stored in the basic control operation storage means of the mobile robot.
FIG. 22 is an explanatory diagram of the internal connection weights (II) showing a learning result of the mobile robot.
FIG. 23 is a system configuration diagram of the robot learning simulator.
FIG. 24 is a flowchart (I) for learning the weights.
FIG. 25 is a flowchart (II) for learning the weights.
FIG. 26 is a flowchart (III) for learning the weights.

In the figures, 1 is the sensor means, 2 is the action pattern generating means, 3 is the robot control means, 3'-b is a threshold input unit, 5 is the basic control operation storage means, 6 is the operation mode determining means, 7 is the motion mimicking means, 10 is the hierarchical network control robot, 10' is the pseudo hierarchical network control robot, 20 is the robot learning simulator, 30 is the basic unit, 31 is the accumulator, 32 is the threshold processor, 101 is a head light sensor, 102 is a head ultrasonic sensor, 103 is a body light sensor, and 105 is a touch sensor.

Continuation of the front page

(72) Inventor: Nobuo Watanabe (渡部信雄), c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Akira Kawamura (川村旭), c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Hideki Yoshizawa (吉沢英樹), c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Minoru Sekiguchi (関口実), c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

(56) References:
JP-A-1-173202 (JP, A)
JP-A-1-177603 (JP, A)
FUJITSU, Vol. 39, No. 3 (10.06.1988), pp. 175-184
NIKKEI ELECTRONICS, No. 427 (10.08.1987), pp. 115-127
Neural Networks, Vol. 1 (1988), pp. 251-265

Claims

(57) [Claims]

1. A robot control method for a hierarchical network control robot (10) comprising: sensor means (1) composed of a plurality of sensors for taking in information on the external environment surrounding the robot; action pattern generating means (2) for generating an action pattern signal that defines an action pattern of the robot; robot control means (3) for sending a control signal to the action pattern generating means (2) in accordance with a detection pattern of the sensor means (1); and basic control operation storage means (5) for storing information on the control signals that the robot control means (3) is to send for one or more specific, preselected detection patterns of the sensor means (1); wherein the robot control means (3) takes as its unit a basic unit (30) having an accumulator (31) that receives one or more inputs and the weights by which those inputs are to be multiplied and obtains a sum of products, and a threshold processor (32) that converts the output of the accumulator (31) with a threshold function to obtain a final output; a plurality of the units (3-h) connected to the sensor means (1) form an input layer, one or more intermediate layers are provided, each composed of a plurality of the units (3-i), and one or more of the units (3-j) connected to the action pattern generating means (2) form an output layer; a hierarchical network is constituted by forming internal connections between the input layer and the first intermediate layer, between successive intermediate layers, and between the last intermediate layer and the output layer; the robot control means (3) sets the weight values of its own hierarchical network by performing learning control so that the information stored in the basic control operation storage means (5) is realized; and the output value of the output layer of the hierarchical network in which these weight values are set serves as the control signal to the action pattern generating means (2); the method being characterized in that a robot learning simulator (20) is provided, consisting of a pseudo hierarchical network control robot (10') configured on a computer with the same configuration as the hierarchical network control robot (10), and motion mimicking means (7) for simulating, on the computer, the behavior of the pseudo hierarchical network control robot (10'); the robot learning simulator (20) uses the pseudo hierarchical network control robot (10') to obtain the weight values required by the hierarchical network of the hierarchical network control robot (10) and verifies the validity of the obtained weight values using the motion mimicking means (7); and the weight values whose validity has been verified are set as the weight values of the hierarchical network of the hierarchical network control robot (10).