JPH07168605A

JPH07168605A - System identifying device

Info

Publication number: JPH07168605A
Application number: JP6191374A
Authority: JP
Inventors: Shinya Hosoki; 信也細木; Yoshiharu Maeda; 芳晴前田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1993-09-10
Filing date: 1994-08-15
Publication date: 1995-07-04

Abstract

PURPOSE: To enable a system identification device, which identifies the operating parameters of a target system, to identify both linear and nonlinear systems, and to allow the device to be implemented in hardware by using a neural network. CONSTITUTION: The system identification device comprises a data processing means 101, which calculates and outputs an output signal corresponding to an input signal according to a data processing function whose settings can be changed and which outputs the identification result of the target system, and a setting change data output means 102, which outputs, on the basis of the input/output relation of the target system, setting change data for changing the data processing function of the data processing means 101.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a system identification device for representing the characteristics of a system as a mathematical model with sufficient accuracy in the field of learning control of robot manipulators, various plants, and the like. In particular, it relates to a system identification device that is effective when mathematical modeling of the target system is difficult or when the characteristics of the target system change over time.

【０００２】[0002]

[Prior Art] When a conventional control method is applied to a system to be controlled, the characteristics of that system must be represented as a mathematical model with sufficient accuracy. For this purpose, system identification algorithms have been developed for both linear and nonlinear systems.

[0003] Meanwhile, research on recognition and robot control that exploits the learning ability of neural networks has been active in recent years. The neural network approach is expected to be able to represent the target mathematical model through repeated learning even when the system model is inaccurate or unknown.

[0004] However, a control scheme that uses a neural network normally requires training data in the form of input/output pairs. That is, the neural network is trained using the known output data corresponding to a given input as a teacher signal. Consequently, when a neural network is used as a learning controller in fields such as control, its application is currently limited, either because a signal that can serve as the teacher cannot be obtained or because the method of generating it is unknown.

[0005] The inverse kinematics problem of a multi-degree-of-freedom manipulator is used below as a concrete example. The joints are numbered from the base of the manipulator as joints 1, 2, 3, ..., n, and the joint displacement angle of the i-th joint is denoted q_i. Collecting these, the joint displacement vector is

【０００６】[0006]

【数１】 [Equation 1]

[0007] Here t denotes the transpose, and the same applies hereafter. The vector representing the position of the manipulator's hand is

【０００８】[0008]

【数２】 [Equation 2]

[0009] The relationship between them is nonlinear and can in general be written as follows.

【００１０】[0010]

【数３】 [Equation 3]
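The equation images for Equations 1 to 3 are not reproduced in this text. From the definitions in paragraphs [0005] to [0009] they presumably take the following form; the hand-position dimension m and the function name f are assumptions made here, not symbols confirmed by the source.

```latex
\begin{align*}
\mathbf{q} &= (q_1, q_2, \ldots, q_n)^{t} \\
\mathbf{x} &= (x_1, x_2, \ldots, x_m)^{t} \\
\mathbf{x} &= f(\mathbf{q}) \tag{1}
\end{align*}
```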

[0011]-[0015] Given the joint displacement (hereinafter, vector q), the hand coordinates (hereinafter, vector x) can be obtained comparatively easily from equation (1). When work is actually performed, however, the instruction is given in hand position coordinates. It is therefore necessary to solve equation (1) in reverse in order to obtain the vector q corresponding to the specified vector x. Formally, this can be written as

【００１６】[0016]

【数４】 [Equation 4]

[0017] Solving equation (2) is known as the inverse kinematics problem. A joint displacement angle vector q corresponding to a given hand position vector x does not necessarily exist, and even when it exists it is not necessarily unique.
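The non-existence and non-uniqueness noted here can be seen in a small numerical sketch. It assumes a planar two-link arm with unit link lengths, which is an illustration chosen here and not a configuration taken from the patent; the function names are likewise hypothetical. Equation (2) is read as the formal inverse of equation (1).

```python
import math


def forward_kinematics(q1, q2, l1=1.0, l2=1.0):
    """Hand position x = f(q) of a planar two-link arm (link lengths assumed)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Return every joint vector (q1, q2) with f(q) = (x, y); may be empty."""
    # Law of cosines gives cos(q2) from the distance of the target.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        return []  # target out of reach: no solution exists
    solutions = []
    s2_abs = math.sqrt(1.0 - c2 * c2)
    # The two signs of sin(q2) are the elbow-up and elbow-down postures
    # (they coincide when the arm is fully stretched or folded).
    for s2 in (s2_abs, -s2_abs):
        q2 = math.atan2(s2, c2)
        q1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
        solutions.append((q1, q2))
    return solutions


reachable = inverse_kinematics(1.0, 1.0)    # two distinct joint vectors
unreachable = inverse_kinematics(3.0, 0.0)  # empty list
```

For the reachable target (1, 1) the sketch returns two different joint vectors that place the hand at the same point, and for the target (3, 0), which lies beyond the arm's reach, it returns no solution.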

[0018] Thus, when the hand position of a robot manipulator is given, the joint displacement angles that realize it must be found by solving equation (2). Although this is feasible for a robot manipulator with a simple structure, it is practically impossible for one with a complex structure and many joints, because complicated analytical processing is required. Consequently, for a robot manipulator with a complex structure, the set of teacher signals needed to achieve the control objective cannot be obtained, and an adaptive controller built from, for example, an adaptive data processing device cannot be constructed.

[0019]-[0021] When a target hand position (hereinafter, vector x_d) is given to a robot manipulator, realizing this target position, that is, making the actual hand position vector x coincide with the vector x_d, requires a controller that takes the deviation between x_d and x as input and supplies the manipulator with a joint angle displacement vector q that brings x into agreement with x_d. If this controller is built from, for example, a neural network, the network must be given teacher signals that conform to the solution of the inverse kinematics problem.

[0022] Conventional examples are described below in which neural networks are applied to solving the inverse kinematics problem in this way, in other words to identifying a system and obtaining its operating parameters, and to achieving the system's actual control objective.

[0023] First, as an application to identifying the operating parameters of a system, there is an example that uses a Hopfield network (reference: ISCIE, May 22-24, 1991). However, this example cannot be used to identify nonlinear systems, and it does not address application to control.

[0024] Next, applications to control are described with reference to FIGS. 12 to 17. In these figures the adaptive controller is realized by a neural network, and the learning processor that supplies a control input error signal so that the adaptive controller learns its data conversion function is drawn as a straight line passing obliquely through the adaptive controller. N denotes an adaptive data processing device provided to carry out the learning of the adaptive controller's data conversion function.

[0025]-[0027] In the prior art shown in FIG. 12 (reference: D. Psaltis, A. Sideris and A. A. Yamamura: A Multilayered Neural Network, IEEE Control Systems Magazine, Vol. 8, No. 2, pp. 17-21 (1988); hereinafter Document 1), an adaptive controller 401 receives the control target vector x_d as input and outputs a control manipulated variable vector q, which is applied to a controlled object 403. An adaptive data processing device N 402 is provided, which receives the control state vector x output from the controlled object 403. The data conversion functions of the adaptive controller 401 and of the adaptive data processing device N 402 are trained so that the vector q output by the adaptive controller 401 and the vector output by the adaptive data processing device N 402 (hereinafter, vector q′) coincide. In this way the control manipulated variable vector q that more faithfully realizes the control specified by x_d is obtained, and an adaptive controller 401 that operates while applying this vector to the controlled object 403 is constructed.

[0028]-[0030] In the prior art shown in FIG. 13 (Document 1), as shown in (a), a random control manipulated variable vector q is applied to a controlled object 501 to actually drive it, and the resulting control state vector x of the controlled object 501 is obtained and input to an adaptive controller 502. The data conversion function of the adaptive controller 502 is then trained so that the difference (hereinafter, vector e) between its output vector q′ and the vector q applied to the controlled object 501 becomes small. When the controlled object 501 is actually controlled, as shown in (b), the control target vector x_d is input to the adaptive controller 502 with its trained data conversion function, and the control manipulated variable vector q that it outputs is applied to the controlled object 501, so that the control state of the controlled object 501 is driven to the target.

[0031]-[0033] In the prior art shown in FIG. 14 (Document 1), an adaptive controller 601 receives the control target vector x_d as input and outputs a control manipulated variable vector q, which is applied to a controlled object 602, and a calculator 603 that computes the inverse Jacobian of the controlled object is provided. The calculator 603 computes the control manipulated variable difference (hereinafter, vector Δq), which varies with the difference between x_d and the control state vector x of the controlled object 602. The data conversion function of the adaptive controller 601 is trained so that Δq becomes small. In this way the control manipulated variable vector q that more faithfully realizes the control specified by x_d is obtained, and an adaptive controller 601 that operates while applying this vector to the controlled object 602 is constructed.
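The role of the inverse Jacobian in this scheme can be illustrated numerically. The sketch below is not the learning scheme of FIG. 14 itself: it only shows how a difference Δq can be computed from x_d - x through the inverse Jacobian and how Δq shrinks as x approaches x_d. The planar two-link arm, its unit link lengths, the starting posture and all function names are assumptions made for illustration, and the analytic Jacobian used here presupposes knowledge of the controlled object.

```python
import math

L1, L2 = 1.0, 1.0  # assumed link lengths of a planar two-link arm


def hand_position(q):
    """x = f(q): hand position for joint angles q = (q1, q2)."""
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))


def jacobian(q):
    """Analytic Jacobian J(q) of hand_position, as a tuple of two rows."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def control_difference(q, x_d):
    """delta_q = J(q)^-1 (x_d - x), computed with the closed-form 2x2 inverse."""
    (a, b), (c, d) = jacobian(q)
    det = a * d - b * c  # equals L1*L2*sin(q2); zero at a singular posture
    x = hand_position(q)
    ex, ey = x_d[0] - x[0], x_d[1] - x[1]
    return ((d * ex - b * ey) / det, (-c * ex + a * ey) / det)


x_d = (1.0, 1.0)   # target hand position (assumed)
q = (0.3, 1.2)     # starting joint angles (assumed)
for _ in range(20):
    dq = control_difference(q, x_d)
    q = (q[0] + dq[0], q[1] + dq[1])  # apply the correction
x_final = hand_position(q)
```

Because the Jacobian relates only small displacements, such a correction is reliable near the target and less so when x_d - x is large.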

[0034] The prior art shown in FIG. 15 (Document 1) is a control scheme that combines FIG. 13 and FIG. 14. A second adaptive controller 702 receives the control target vector x_d as input and outputs a control manipulated variable vector q, which is applied to a controlled object 703, and a calculator 704 that computes the inverse Jacobian of the controlled object 703 is provided. The calculator 704 computes the control manipulated variable difference Δq, which varies with the difference between x_d and the control state vector x of the controlled object 703, and the data conversion function of the second adaptive controller 702 is trained so that this vector Δq becomes small. In addition, the data conversion function of a first adaptive controller 701 is trained so that its output vector q′ becomes equal to the vector q applied to the controlled object 703. For this purpose a subtractor 705 computes the difference Δq′ between the vectors q and q′ and supplies it to the first adaptive controller 701. Because the Jacobian is inherently a relationship between infinitesimal quantities, the scheme of FIG. 14 is not necessarily appropriate when the difference between x_d and x is large; in this scheme, the scheme of FIG. 13 is therefore used to reduce the difference between x_d and x.

[0035] In the prior art shown in FIG. 16 (reference: M. Kawato, Y. Uno, M. Isobe and R. Suzuki: A hierarchical neural network model for voluntary movement with application to robotics, IEEE Control Systems Magazine 8, 8-16 (1988)), a feedback controller 802 with a fixed gain K is provided, whose input is the difference between the control target vector x_d and the control state vector x of a controlled object 803. The sum of the control manipulated variable vector q output by an adaptive controller 801, whose input is x_d, and the error quantity output by the feedback controller 802 is applied to the controlled object 803. The data conversion function of the adaptive controller 801 is trained so that the error quantity output by the feedback controller 802 becomes small, thereby constructing an adaptive controller 801 that outputs the control manipulated variable vector q that more faithfully realizes the control specified by x_d. In this case the feedback controller 802 generates the error quantity on the basis of the inverse Jacobian of the controlled object 803.

[0036]-[0038] In the prior art shown in FIG. 17 (reference: M. Jordan, in ref. 4; ref. 4: Neural Networks for Control, ed. W. Thomas Miller, III et al. (1990)), an adaptive data processing device N 902 that takes as input the control manipulated variable vector q output by an adaptive controller 901 is provided, and it is trained to become a forward system with the same input/output characteristics as a controlled object 903. After sufficient training, the error between the control state output by the adaptive data processing device N 902 (hereinafter, vector x′) and the control target vector x_d is back-propagated with the internal state values of the data conversion function of the adaptive data processing device N 902 held fixed, which yields an input error. The adaptive controller 901 uses this input error as a learning signal to train its data conversion function, thereby constructing an adaptive controller 901 that outputs the control manipulated variable vector q that realizes x_d.

【００３９】[0039]

[Problems to Be Solved by the Invention] However, in the prior art shown in FIG. 12, training the data conversion function of the adaptive controller 401 requires the data conversion function of the adaptive data processing device N 402 to be trained beforehand, so learning is offline. In addition, if the output signal of the adaptive controller 401 and the output signal of the adaptive data processing device N 402 happen to coincide before training is complete, training of the data conversion function of the adaptive controller 401 can no longer proceed.

[0040] The prior art shown in FIG. 13 also uses offline learning. Moreover, to make the data conversion function of the adaptive controller that of the correct inverse system, a very large number of random control manipulated variable vectors q must be applied to the controlled object 501 during training. Furthermore, if a parameter of the controlled object changes, such as the link length of a robot arm, learning must be redone from the start.

[0041] In the prior art shown in FIGS. 14 and 15, the calculators 603 and 704 must compute the inverse Jacobian in real time, so learning is offline. In addition, providing the calculators 603 and 704 requires knowledge of the inverse Jacobian of the controlled objects 602 and 703.

[0042] The prior art shown in FIG. 16 has the advantage that online learning can be realized. However, training the data conversion function of the adaptive controller 801 requires the fixed gain K of the feedback controller 802 to be set appropriately, and setting this fixed gain K requires knowledge of the inverse Jacobian of the controlled object 803.

[0043] In the prior art shown in FIG. 17, training the data conversion function of the adaptive controller 901 requires the data conversion function of the adaptive data processing device N 902 to be trained beforehand, so learning is offline.

[0044] To summarize the problems of the prior art described above: first, the prior art that performs system identification with a neural network cannot identify nonlinear systems. Second, with methods that derive a mathematical model of the system, it is difficult to model the system accurately when the controlled system has nonlinearity and redundant degrees of freedom, and the solution of the inverse dynamics in that mathematical model cannot be determined uniquely.

[0045] When neural networks are applied to adaptive control of robots and the like, many conventional examples use an offline learning scheme, and with such a scheme the network must be retrained whenever the characteristics of the controlled object change.

[0046] Furthermore, even in a scheme capable of online learning, namely the control scheme described with reference to FIG. 16, a fixed gain is used in the feedback controller 802, and the control target cannot be achieved unless the value of this gain is designed appropriately. In systems with strong nonlinearity in particular, setting the fixed gain becomes increasingly difficult.

[0047] A first object of the present invention is to make it possible to identify the system parameters of nonlinear as well as linear systems and, by realizing the identification device with a neural network, to allow it to be implemented in hardware. A second object of the present invention is to use the output of this system identification device to create teacher signals for a neural network used within a control system, and thereby to construct a highly adaptive adaptive controller.

【００４８】[0048]

[Means for Solving the Problems] FIG. 1 is a block diagram of the principle of the present invention. It shows an example of the basic configuration of an identification device, intended for systems and the like, that is equipped with a data processing means which outputs the identification result of a controlled object or the identification result of system parameters.

[0049] In FIG. 1, the data processing means 101 calculates and outputs an output signal corresponding to an input signal according to a data processing function whose settings can be changed; it is, for example, a Hopfield neural network.

[0050] The setting change data output means 102 supplies the data processing means 101 with setting change data for changing the settings of its data processing function, on the basis of the input/output relationship of the target system. It is constituted by, for example, a correlator.

【００５１】[0051]

[Operation] In the present invention, the Jacobian or the inverse Jacobian of, for example, a robot manipulator is output from the data processing means 101, for example a Hopfield neural network, as the identification result of the target system.

[0052] A correlator constituting the setting change data output means 102 is connected to this Hopfield neural network. On the basis of the relationship between the input and the output of the target system, the correlator outputs the external input to each neuron of the Hopfield neural network and the coupling coefficients between the neurons.

[0053] As the input/output relationship of the target system, for example the relationship between a small displacement of the joint angles of a robot manipulator and the corresponding small displacement of the robot's hand position is given, and the external inputs and coupling coefficients output by the correlator in accordance with that relationship are supplied to the Hopfield neural network. Two operations are repeated until the output of the network becomes constant: feeding a small input and the corresponding small output to the correlator, and passing the resulting external inputs and coupling coefficients from the correlator to the network. Once the output has become constant, the Hopfield neural network outputs the Jacobian or the inverse Jacobian.

[0054] Furthermore, in the present invention a hierarchical neural network is provided in parallel with the correlator and the Hopfield neural network. The input signal to the target system is given as the input signal to this hierarchical neural network, and the hierarchical neural network is trained so that its output coincides with the output of the Hopfield neural network. This makes it possible to construct a system identification device that can handle nonlinear systems and that can also cope with changes in the characteristics of the system.

[0055] Furthermore, for example in the adaptive controller described with reference to FIG. 16, a hierarchical neural network can be used in place of the fixed feedback gain K, with the teacher signal for this hierarchical neural network created by the system identification device of this embodiment. This makes it possible to construct a highly adaptive controller.

【００５６】[0056]

[Embodiment] [0056]-[0059] As one embodiment of the system identification device of the present invention, an example applied to the inverse kinematics problem of a robot arm is described. In the following description, vector x is the robot hand position and vector q is the robot joint angle vector. Vector Δx is a small displacement of the robot hand position and vector Δq is a small displacement of the joint angles, and these small displacements are assumed to be measurable. Denoting the Jacobian with respect to the vector q by J, the vectors Δx and Δq are related by the following equation (3).

【００６０】[0060]

【数５】 [Equation 5]
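The image for Equation 5 is not reproduced in this text. From the definitions of Δx, Δq and the Jacobian J given in the preceding paragraph, equation (3) presumably reads as follows; this reconstruction is an assumption. The dependence of J on q is what makes the overall relationship between x and q nonlinear.

```latex
\begin{equation*}
\Delta\mathbf{x} = \mathbf{J}(\mathbf{q})\,\Delta\mathbf{q},
\qquad
J_{ij}(\mathbf{q}) = \frac{\partial x_i}{\partial q_j}
\tag{3}
\end{equation*}
```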

[0061]-[0063] This relationship is nonlinear. To treat it as a linear system, a particular value of the joint angles (hereinafter, vector q₀) is considered, and the Jacobian is identified for this joint angle vector q₀. That is, J(q₀) is the system parameter to be identified. Now consider the neural network shown in FIG. 2. It is a Hopfield network in which the computing elements, that is, the neurons, are all mutually connected. In FIG. 2 the computing elements are realized by computing units 200₁ to 200₄. The number N of computing elements is 4 here. The input to each computing element of the network is denoted I_i (i = 1, ..., N), the output from each computing element is denoted V_i (i = 1, ..., N), and the coupling coefficient between computing elements is denoted T_ij (i, j = 1, ..., N).

[0064] In FIG. 2, the total input u_i to a computing element is

【００６５】[0065]

【数６】 [Equation 6]

[0066] and the output V_i of the computing element can be expressed as

【００６７】[0067]

【数７】 [Equation 7]

[0068] The dynamics of this network are as follows. First, the energy of the circuit is

【００６９】[0069]

【数８】 [Equation 8]

[0070] If V_k changes by ΔV_k, the change in energy is

【００７１】[0071]

【数９】 [Equation 9]

【００７２】Here, for example, δ_ki is the Kronecker delta, whose value is 1 when k = i and 0 when k ≠ i. From this relationship, when T_kk ≥ 0 (k = 1, ..., N), the second term of equation (7) is zero or negative whatever ΔV_k is. The first term, too, with the change

【００７３】[0073]

【数１０】 [Equation 10]

【００７４】is always zero or negative. That is, as described above, the relationship between the output V_i of each operation element and its internal potential u_i is as follows.

【００７５】[0075]

【数１１】 [Equation 11]

【００７６】Here, the update amount Δu_k of u_k is defined according to the value of u_k as follows.

【００７７】[0077]

【数１２】 [Equation 12]

【００７８】As a result, the relation ΔV_k ∼ Δu_k holds, so the first term is again zero or negative and the energy decreases. The identification device described below makes use of this property of the neural network.
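
The energy argument above can be checked numerically with a minimal sketch. Because the equations of this passage survive only as image placeholders, the sketch assumes the standard Hopfield energy E = −(1/2) Σ T_ij V_i V_j − Σ I_i V_i with a symmetric T, and a linear output (V = u); the function names and the numerical values are hypothetical and are not taken from the patent.

```python
def energy(T, I, V):
    """Assumed standard Hopfield energy: -1/2 * sum T_ij V_i V_j - sum I_i V_i."""
    n = len(V)
    quad = sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))
    lin = sum(I[i] * V[i] for i in range(n))
    return -0.5 * quad - lin


def input_sum(T, I, V, k):
    """Total input u_k = sum_j T_kj V_j + I_k to operation element k."""
    return sum(T[k][j] * V[j] for j in range(len(V))) + I[k]


def update_unit(T, I, V, k, step=0.1):
    """Change V_k in the direction of u_k, so that u_k * dV_k >= 0."""
    new_V = list(V)
    new_V[k] += step * input_sum(T, I, V, k)
    return new_V


# Hypothetical symmetric couplings with T_kk >= 0, for N = 4 as in FIG. 2.
T = [[0.5, -0.2, 0.1, 0.0],
     [-0.2, 0.3, 0.0, 0.4],
     [0.1, 0.0, 0.2, -0.3],
     [0.0, 0.4, -0.3, 0.1]]
I = [0.2, -0.1, 0.05, 0.3]
V = [0.1, -0.4, 0.3, 0.2]

energies = [energy(T, I, V)]
for sweep in range(3):
    for k in range(4):
        V = update_unit(T, I, V, k)
        energies.append(energy(T, I, V))
```

Under these assumptions each single-element update changes the energy by −u_k ΔV_k − (1/2) T_kk ΔV_k², so every entry of `energies` is no larger than the one before it.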

【００７９】Next, an identification device that outputs the Jacobian will be described. Let the vector Δx be the actual displacement of the arm tip, and let external character 11 (hereinafter, vector Δx′

【００８０】[0080]

【外１１】 [External character 11]

【００８１】) be the displacement calculated using the output V_k of the network 17. That is, V_k is made to correspond to the Jacobian: V_k = J_ij(q₀). Consider the following evaluation function.

【００８２】[0082]

【数１３】 [Equation 13]

【００８３】The above expression can be embedded in the energy expression of the network 17. For a robot with two arm links moving in a two-dimensional plane, the embedding

【００８４】[0084]

【数１４】 [Equation 14]

【００８５】[0085]

【数１５】 [Equation 15]

【００８６】is achieved by making the settings above. FIG. 3 is a block diagram of an identification device that identifies an operating parameter of a system, here the Jacobian. In the figure, the identification device comprises a correlator 302, which receives the minute input displacement vector Δq and the minute output displacement vector Δx of the target system 301 and outputs the matrix T of coupling coefficients in the Hopfield network 303 together with the external input vector for each neuron in the network 303, external character 12 (hereinafter, vector I),

【００８７】[0087]

【外１２】 [External character 12]

【００８８】and the Hopfield network 303, which receives the input from the correlator 302 and outputs the Jacobian as the outputs V of its internal neurons.

【００８９】FIG. 4 is a block diagram showing the detailed configuration of the correlator 302. The correlator 302 includes a T setting unit 304 that outputs the coefficient matrix T and an I setting unit 305 that outputs the external input vector I.

【００９０】The identification device operates in the following order.
1. A minute input is given to the system, and the corresponding displacement of the output is observed. Noise may be used as one example of the minute input.

【００９１】2. Using the observation result, the correlator obtains the coefficient matrix T and the input vector I of the network 303 and sets them in the network 303.
3. Steps 1 and 2 are repeated until the output V of the network 303 becomes constant.

【００９２】4. The output V that has become constant is the Jacobian of the robot arm.
Of the operations of the identification device described above, the operation of the correlator will now be described in more detail. As described above, a minute input to the system, that is, to the controlled object (for example, the joint angle displacement vector Δq), and the corresponding minute output (for example, the hand displacement vector Δx) are observed using suitable sensors or the like.

【００９３】The correlator 302 of FIG. 4 performs multiplication of vector elements. When the Jacobian is to be obtained, the T setting unit 304 of FIG. 4, which sets the T matrix giving the coupling coefficients, calculates the simultaneous correlation between the joint angle displacements and assigns the results as elements of the T matrix. When the inverse Jacobian is to be obtained, it calculates the simultaneous correlation between the hand displacements and assigns the results as elements of the T matrix.

【００９４】The unit that sets the input vector I to the network 303, that is, the I setting unit 305 of FIG. 4, calculates by multiplication the simultaneous correlation between the vector elements of the minute input and the minute output, that is, of the vector Δq and the vector Δx, and assigns each product as an element of the input vector I.
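
The two setting units can be sketched as follows for the two-joint case. This is a minimal sketch, not the patent's circuit: the function name `correlate`, the neuron ordering (J_11, J_12, J_21, J_22), and the factor −2 on the T entries are assumptions. The sign is chosen so that a standard Hopfield energy reproduces the squared error between Δx and J·Δq; the text itself states only that T is built from products of the Δq elements and that I = 2(Δx · Δq^t).

```python
def correlate(dq, dx):
    """Return (T, I) built from one observation of dq and dx (hypothetical helper)."""
    n = len(dq)                 # number of joints (2 in the text)
    m = len(dx)                 # number of hand coordinates (2 in the text)
    size = m * n                # one neuron per Jacobian element J_ij
    T = [[0.0] * size for _ in range(size)]
    I = [0.0] * size
    for i in range(m):
        for j in range(n):
            a = i * n + j                       # neuron holding J_ij
            I[a] = 2.0 * dx[i] * dq[j]          # I setting unit: dx-dq products
            for k in range(n):
                b = i * n + k                   # neuron holding J_ik (same row)
                T[a][b] = -2.0 * dq[j] * dq[k]  # T setting unit: dq-dq products
    return T, I


T_example, I_example = correlate([0.1, -0.2], [0.05, 0.03])
```

Neurons belonging to different rows of J are left uncoupled, which gives the block-diagonal structure (Δq · Δq^t blocks and zero blocks) described for the 4×4 matrix T.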

【００９５】Here, the correlation of vectors means the simultaneous correlation between the minute input vector Δq(n) and the minute output vector Δx(n) at the same time n, that is, their product at time n. In general, the correlation between mutually different times n and m is called a correlation function, and is defined in a form such as <Δq_i(n)·Δq_i(m)>, where <...> denotes some kind of time average.
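
The distinction drawn above can be illustrated with a short sketch. Both function names are hypothetical, and a plain arithmetic mean over the available samples stands in for the unspecified time average.

```python
def simultaneous_correlation(dq_n, dx_n):
    """Products dx_i(n) * dq_j(n) formed at a single time n."""
    return [[xi * qj for qj in dq_n] for xi in dx_n]


def correlation_function(series, i, lag):
    """Time average of s_i(n) * s_i(n + lag) over all valid n."""
    count = len(series) - lag
    total = sum(series[n][i] * series[n + lag][i] for n in range(count))
    return total / count


same_time = simultaneous_correlation([1.0, 2.0], [3.0, 4.0])
lagged = correlation_function([[1.0], [2.0], [3.0]], 0, 1)
```

The first function uses only the sample pair observed at one instant, which is all the correlator of FIG. 4 requires; the second averages over a record of samples.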

【００９６】In this embodiment, the way in which the coefficient matrix T and the input vector I are set constitutes the definition of the correlation. Consider the expansion of the expression for the evaluation function E given above. In this example the Jacobian J is a second-order (2×2) square matrix, and its elements become the elements of the output vector of the Hopfield neural network 303, external character 13 (hereinafter, output vector V).

【００９７】[0097]

【外１３】 [External character 13]

【００９８】That is, the output vector V is given by the following equation.

【００９９】[0099]

【数１６】 [Equation 16]

【０１００】Accordingly, the function of the T setting unit 304, which sets the coefficient matrix T, is to obtain the coupling coefficients T_ij and use them as the weighting coefficients of the network. Specifically, with external character 14 (hereinafter,

【０１０１】[0101]

【外１４】 [External character 14]

【０１０２】vector Δq^t) = (Δq₁, Δq₂), the coefficient matrix T is obtained by the following equation.

【０１０３】[0103]

【数１７】 [Equation 17]

【０１０４】Here, vector Δq · vector Δq^t is a second-order (2×2) square matrix and 0 is likewise the second-order zero matrix, so T is a fourth-order (4×4) square matrix. The vector I can be defined in the same way, as vector I = 2(vector Δx · vector Δq^t).

【０１０５】Of the operations of the identification device described above, operation 3, that is, how long steps 1 and 2 are repeated until the output of the network 303 becomes constant, will now be described in detail.

【０１０６】As described above, the input to each neuron constituting the Hopfield neural network is given by

【０１０７】[0107]

【数１８】 [Equation 18]

【０１０８】and this value represents the internal potential (internal state) of the neuron. The value of the internal potential u at time n + 1 is given by the following equation.

【０１０９】[0109]

【数１９】 [Equation 19]

【０１１０】That is, the update amount Δu(n) of the internal potential u at time n is given by

【０１１１】[0111]

【数２０】 [Equation 20]

【０１１２】and this update amount is proportional to the sum of the inputs. The internal potential is updated according to this update amount, and the output V = g(u) is obtained in the equilibrium state in which the internal potential u no longer changes. The time required for the output of the network 303 to become constant therefore corresponds to the time at which the update amount of the internal potential becomes zero.
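
Steps 1 to 4 of the identification procedure, together with the stopping rule just described, can be sketched as a loop. This is a sketch under stated assumptions rather than the patent's implementation: the output function g is taken to be linear (V = u), the proportionality constant `eta`, the tolerance, and the uniform random test inputs are hypothetical choices, the sign of T follows a standard Hopfield energy, and the target system is replaced by a hypothetical linear plant `linear_plant` with a known Jacobian `J_TRUE`.

```python
import random


def correlate(dq, dx):
    """Correlator: build T (4x4) and I (4) from one observation (assumed signs)."""
    T = [[0.0] * 4 for _ in range(4)]
    I = [0.0] * 4
    for i in range(2):
        for j in range(2):
            a = 2 * i + j
            I[a] = 2.0 * dx[i] * dq[j]
            for k in range(2):
                T[a][2 * i + k] = -2.0 * dq[j] * dq[k]
    return T, I


def identify_jacobian(observe, eta=20.0, tol=1e-10, max_steps=5000, seed=0):
    rng = random.Random(seed)
    u = [0.0] * 4                      # internal potentials, one per J_ij
    quiet = 0                          # consecutive steps with a negligible update
    for _ in range(max_steps):
        # Step 1: apply a minute input and observe the output displacement.
        dq = [rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)]
        dx = observe(dq)
        # Step 2: the correlator sets T and I in the network.
        T, I = correlate(dq, dx)
        V = u[:]                       # V = g(u) with g assumed linear
        # The update amount is proportional to the input sum T V + I.
        du = [eta * (sum(T[a][b] * V[b] for b in range(4)) + I[a])
              for a in range(4)]
        u = [u[a] + du[a] for a in range(4)]
        # Step 3: stop once the update amount has effectively become zero.
        quiet = quiet + 1 if max(abs(d) for d in du) < tol else 0
        if quiet >= 5:
            break
    # Step 4: the settled output is read out as the Jacobian.
    return [[u[0], u[1]], [u[2], u[3]]]


J_TRUE = [[-0.3, -0.2], [0.8, 0.4]]    # hypothetical plant Jacobian


def linear_plant(dq):
    return [sum(J_TRUE[i][j] * dq[j] for j in range(2)) for i in range(2)]


J_est = identify_jacobian(linear_plant)
```

With these choices each update equals 2·eta·(Δx_i − (JΔq)_i)·Δq_j, so the update amount vanishes exactly when the network output reproduces the observed displacement.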

【０１１３】As described above, the change in the energy of the network 303 is given by

【０１１４】[0114]

【数２１】 [Equation 21]

【０１１５】If the relationship between the internal potential u of a neuron and its output V is given by

【０１１６】[0116]

【数２２】 [Equation 22]

【０１１７】then the function g is monotonically increasing, Δu and ΔV change in the same direction, and the energy decreases. In the present invention, however, the coefficient matrix T and the input vector I are set anew each time, so the energy of the network 303 cannot necessarily be said to decrease with time; nevertheless, at each instant at which the internal potential u is updated, the direction in which u changes coincides with the direction of decreasing energy. In an ordinary Hopfield neural network the coupling coefficients and the input vector are held constant, so the energy in general always decreases with time.

【０１１８】Next, FIG. 5 shows the configuration of an identification device that can also identify nonlinear systems. In this device, the device of FIG. 3, which identifies a linear system, is added in parallel with a learning device 306 (a hierarchical neural network) capable of learning a nonlinear mapping. The identification device of FIG. 3 has no memory capability, because the coefficients input to the network 303 are updated one after another by the correlator 302. The device of FIG. 5 does have memory capability, because the learning device 306 is added in parallel with that identification device.

【０１１９】In addition, the stored contents are corrected by the identification device when the system changes. In the identification device of FIG. 3, the Jacobian could be calculated at a specific state vector q₀. To extend this to nonlinear identification, the dependence of the Jacobian on the state vector q must be taken into account. For this purpose, a neural network device 306 having a learning function is used. The neural network device 306 learns by the following procedure.

【０１２０】1. The internal state vector q is input to the learning device 306, and the corresponding output of the learning device 306, external character 15 (hereinafter, vector V′(q)), is obtained.

【０１２１】[0121]

【外１５】 [External character 15]

【０１２２】2. A minute displacement vector Δq about the state vector q is added to it, the corresponding hand displacement vector Δx is measured, and both are supplied to the correlator 302.
3. The coefficient matrix T and the input vector I of the identification device are obtained from the correlator 302.

【０１２３】4. Steps 2 and 3 are repeated until the output V of the network 303 converges.
5. Once it has converged, the learning device 306 is made to learn the vector V′(q) corresponding to the state vector q. The teacher signal for this learning is the output of the subtractor 307.

【０１２４】6. Steps 1 to 5 are repeated while changing the state vector q.
In this way, the learning device 306 stores pairs consisting of a state vector q and the corresponding output vector V′(q). The vector V′(q) is the Jacobian J(q).
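
Steps 1 to 6 can be sketched for a planar two-link arm as follows. Several substitutions are made for illustration and should be read as assumptions: a dictionary keyed by q stands in for the hierarchical neural network of the learning device 306, the local identification is written directly as the update that the Hopfield network performs when its output function is taken to be linear, and the link lengths, step size, and sample states are hypothetical values.

```python
import math
import random

L1 = L2 = 0.5   # hypothetical link lengths in metres


def hand_position(q):
    """Forward kinematics of a planar two-link arm."""
    x = L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1])
    y = L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])
    return [x, y]


def identify_at(q, rng, eta=2.0e5, steps=3000, amp=1.0e-3):
    """Steps 2 to 4: identify the local Jacobian from minute displacements."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    x0 = hand_position(q)
    for _ in range(steps):
        dq = [rng.uniform(-amp, amp), rng.uniform(-amp, amp)]
        x1 = hand_position([q[0] + dq[0], q[1] + dq[1]])
        dx = [x1[0] - x0[0], x1[1] - x0[1]]
        for i in range(2):
            err = dx[i] - (J[i][0] * dq[0] + J[i][1] * dq[1])
            for j in range(2):
                J[i][j] += 2.0 * eta * err * dq[j]
    return J


def train_learning_device(states, rate=1.0, seed=0):
    """Steps 1, 5 and 6: store V'(q) for each visited state q."""
    rng = random.Random(seed)
    memory = {}
    for q in states:
        key = tuple(q)
        v_prime = memory.get(key, [[0.0, 0.0], [0.0, 0.0]])   # step 1
        v = identify_at(q, rng)                               # steps 2 to 4
        # Step 5: the teacher signal is the difference V - V'(q).
        memory[key] = [[v_prime[i][j] + rate * (v[i][j] - v_prime[i][j])
                        for j in range(2)] for i in range(2)]
    return memory


states = [(0.3, 0.5), (1.0, -0.4), (-0.7, 1.2)]
memory = train_learning_device(states)
```

After training, `memory[q]` holds the locally identified Jacobian for each visited state, which plays the role of the stored pair (q, V′(q)) described above; a real learning device would also generalize between visited states, which a lookup table does not.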

【０１２５】The identification device shown in FIG. 5 can also learn the inverse Jacobian. In this case, the evaluation function E of equation (21) below is used, with V_k = J⁻¹_ij(q₀).

【０１２６】[0126]

【数２３】 [Equation 23]

【０１２７】By embedding the above expression in the energy expression of the network, the matrix T of coupling coefficients in the Hopfield network 303 and the external input vector I for each neuron in the network are obtained as follows.

【０１２８】[0128]

【数２４】 [Equation 24]

【０１２９】[0129]

【数２５】 [Equation 25]

【０１３０】The correlator 302 generates the matrix T of coupling coefficients and the external input vector I and inputs them to the Hopfield network 303. The Hopfield network 303 then outputs V⁻¹, which is the inverse Jacobian. The method of learning the inverse Jacobian is the same as the method of learning the Jacobian described above, except that the matrix T of coupling coefficients and the external input vector I are different, so a detailed description is omitted.

【０１３１】FIG. 6 is a block diagram of a first adaptive learning control device that uses the system identification device of FIG. 5 described above. In FIG. 6, the system identification device consisting of the correlator 302, the network 303, the learning device 306, and the subtractor 307 of FIG. 5 is shown as block 310, and the system identification result of this identification device 310, here the Jacobian output, is used by the adaptive learning control device.

【０１３２】The control device comprises a controlled object (Co) 311; a feedforward control mechanism 312 (constituted, for example, by a neural network Np); a subtractor 313 for obtaining the result of subtracting the output vector x of the controlled object (Co) 311 from the externally input target value vector x_d, that is, the control deviation e_x; and a multiplier 314 that multiplies the output of the identification device 310 by the control deviation e_x and supplies the product to the neural network Np 312 as a teacher signal.

【０１３３】In FIG. 6, the weight update amount T for the neural network Np 312, which corresponds to the feedforward controller, is given by the following equation.

【０１３４】[0134]

【数２６】 [Equation 26]

【０１３５】Equation (24) above is obtained by considering how to change the weights w₁ so that the evaluation function of equation (11) is minimized. If the steepest descent method is used as the method of change, the weight update amount T is obtained as follows.

【０１３６】[0136]

【数２７】 [Equation 27]

【０１３７】Here, μ is a positive constant. Details of the method of change in equation (25) are described later. Note that the symbol T here is different from the coefficient matrix described with reference to FIG. 2 and elsewhere. The learning processing method of the first adaptive learning control device of FIG. 6 will now be described.

【０１３８】The control deviation vector Δx (= e_x) is obtained as the difference between the target value vector x_d and the output vector x. In the preceding description the vector Δx denoted a minute displacement, but since that minute displacement corresponds to the output of the controlled object, the same symbol is used here. The same applies to the vector Δq.

【０１３９】The neural network Np 312 receives, as a teacher signal, the product (= T) of the Jacobian J(q) output from the system identification device 310 and the control deviation vector Δx, and learns online so as to output the joint angle control operation amount vector q corresponding to the target value vector x_d. The vector q used in obtaining the Jacobian J(q) is the same as the vector q input to the controlled object (Co) 311.
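
The online learning of the first control device can be sketched as a steepest descent update. The sketch rests on several assumptions that are not taken from the patent: the network Np is replaced by a single linear layer q = W·x_d, the controlled object is a hypothetical linear plant x = A·q whose Jacobian A is taken as already supplied by the identification device, the teacher signal is formed as the transposed Jacobian times the control deviation so that the dimensions agree, and the values of `mu` and of the step count are arbitrary.

```python
import random

A = [[1.0, 0.2], [-0.3, 0.8]]   # hypothetical plant; also its Jacobian dx/dq


def plant(q):
    return [A[i][0] * q[0] + A[i][1] * q[1] for i in range(2)]


def train_controller(mu=0.2, steps=5000, seed=0):
    rng = random.Random(seed)
    W = [[0.0, 0.0], [0.0, 0.0]]          # weights of the linear stand-in for Np
    for _ in range(steps):
        x_d = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        q = [W[i][0] * x_d[0] + W[i][1] * x_d[1] for i in range(2)]
        x = plant(q)
        e = [x_d[0] - x[0], x_d[1] - x[1]]                 # control deviation e_x
        # Teacher signal: Jacobian (transposed) times the control deviation.
        teacher = [A[0][i] * e[0] + A[1][i] * e[1] for i in range(2)]
        for i in range(2):
            for j in range(2):
                W[i][j] += mu * teacher[i] * x_d[j]        # steepest descent step
    return W


W = train_controller()
# Product A·W, which approaches the identity once W approximates the inverse of A.
AW = [[sum(A[i][k] * W[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
```

In this linear setting the learned W plays the role of the inverse characteristic "vector x_d → vector q" of the controlled object.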

【０１４０】When learning has progressed sufficiently, the network Np 312 acquires the data conversion function "vector x_d → vector q", which is the inverse of the input/output characteristic of the controlled object 311. Even if the parameters of the controlled object (Co) 311 change, the data conversion function of the network Np 312 changes adaptively through relearning by the Hopfield network 303 and by the learning device 306, a hierarchical network, in the system identification device 310.

【０１４１】FIG. 7 is a block diagram of a second adaptive learning control device that uses the system identification device of FIG. 5 described above. In FIG. 7, the system identification device consisting of the correlator 302, the network 303, the learning device 306, and the subtractor 307 of FIG. 5 is shown as block 320, and the system identification result of this identification device 320, here the Jacobian output, is used by the adaptive learning control device.

【０１４２】The control device comprises a controlled object (Co) 321; a feedforward control mechanism 323 (constituted, for example, by a neural network Np); a hierarchical neural network Na 322, for example, which replaces the feedback controller 502 of FIG. 5; a subtractor 324 for obtaining the result of subtracting the output vector x of the controlled object (Co) 321 from the externally input target value vector x_d, that is, the control deviation e_x; a multiplier 325 that multiplies the output of the identification device 320 by the control deviation e_x and supplies the product to the neural network Na 322 as a teacher signal; and an adder 326 that adds the outputs of the two networks Na 322 and Np 323.

【０１４３】In FIG. 7, the weight update amount T for the hierarchical neural network Na 322, which corresponds to the feedback controller, is given by equation (24), as in the first adaptive learning control device.

【０１４４】The learning processing method of the second adaptive control device of FIG. 7 will now be described. Whereas the conventional example described with reference to FIG. 5 has a feedback control device with a fixed feedback gain, this device has the hierarchical neural network Na 322 in its place. The network Na 322 learns using the teacher signal generated by the multiplier 325 from the output of the identification device 320 of the present invention.

【０１４５】The network Na 322 receives the control deviation vector Δx (= target value vector x_d − output vector x) as its input and learns so as to output the difference vector Δq of the control operation amount (joint angle) corresponding to this control deviation vector Δx. During learning, it receives as a teacher signal the product (= T) of the Jacobian J(q) output from the system identification device 320 and the control deviation vector Δx.

【０１４６】The network Np 323, which constitutes the feedforward control mechanism, uses the deviation vector Δq (= C) output by the network Na 322 as a teacher signal and learns online so as to reduce this deviation vector Δq. The sum of the outputs of the two networks Na 322 and Np 323 is supplied to the controlled object (Co) 321 as the control operation amount.

【０１４７】When learning has progressed sufficiently, the network Np 323 acquires the data conversion function "vector x_d → vector q′", which is the inverse of the input/output characteristic of the controlled object 321, and this function always changes adaptively in accordance with the output of the system identification device 320 and the data conversion function of the network Na 322. Compared with the first adaptive control device, the second adaptive control device has the advantage that, because of the network Na 322, it can adapt quickly even when the parameters of the controlled object (Co) 321 change.

【０１４８】FIG. 8 is a block diagram of a third adaptive learning control device that uses the system identification device of FIG. 5 described above. In FIG. 8, the system identification device consisting of the correlator 302, the network 303, the learning device 306, and the subtractor 307 of FIG. 5 is shown as block 330, and the system identification result of this identification device 330, here the Jacobian output, is used by the adaptive learning control device.

【０１４９】The third adaptive learning control device differs from the second in that the joint displacement vector q is input to Na 332 and the link length L is input to Np 333.

【０１５０】Each term of external character 16 in equation (25) is described below.

【０１５１】[0151]

【外１６】 [External character 16]

【０１５２】(First term) This is the error in the hand position and is observed by a sensor or the like.
(Second term) This part relates only to the input and output of the system and is the partial derivative of the system output x with respect to the state q (the Jacobian). It is given approximately by

【０１５３】[0153]

【数２８】 [Equation 28]

【０１５４】That is, this part can be obtained by measuring the change in the input to the system and the change in its output, without actually computing the partial derivative of a mathematical model.
(Third term) This part depends only on the neural network and is the partial derivative of the internal state q with respect to the weights w₁. It is given by the input to the output neuron. That is, if φ is the transition function of the neural network, the output of the multilayer neural network is given by

【０１５５】[0155]

【数２９】 [Equation 29]

【０１５６】If q_i0 is the initial value of the joint angle, the joint angle at the n-th iteration is obtained as

【０１５７】[0157]

【数３０】 [Equation 30]

【０１５８】so the partial derivative in the third term is as follows.

【０１５９】[0159]

【数３１】 [Equation 31]

【０１６０】Here, φ′(u_i) denotes the derivative of the transition function. If the transition function is assumed to be linear, this factor is a constant (taken to be 1), so the third term ultimately depends only on V_j. Since V_j is the output of the second layer (the layer preceding the output layer), it depends on how the input is represented in the second layer (in a table-type neural network it is the output of a second-layer neuron, which is "1" or "0", so it contributes to learning only when it is "1").
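
The second term described above, the Jacobian obtained from measured input and output changes rather than from a differentiated model, can be illustrated with a short sketch. The function names, the link lengths, the displacement size `delta`, and the sample joint angles are hypothetical, and a planar two-link arm stands in for the controlled object.

```python
import math


def hand_position(q, l1=0.5, l2=0.5):
    """Forward kinematics of a planar two-link arm (stand-in for the real system)."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return [x, y]


def measured_jacobian(q, delta=1.0e-6):
    """Approximate dx_i/dq_j by the ratio of measured output and input changes."""
    x0 = hand_position(q)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        q_shift = list(q)
        q_shift[j] += delta                 # minute change of one joint angle
        x1 = hand_position(q_shift)
        for i in range(2):
            J[i][j] = (x1[i] - x0[i]) / delta
    return J


J_measured = measured_jacobian([0.4, 0.9])
```

Only evaluations of the system itself are used here; in the device the same information is accumulated by the correlator and the Hopfield network instead of by explicit division.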

【０１６１】By updating the weights in this way (learning), Na 332 operates so that x → x_d. As a result, after learning, a mapping Δx → Δq corresponding to the joint displacement vector q, that is, the inverse Jacobian, is formed in Na 332. Moreover, this is done online.

【０１６２】The target hand position x_d is input to Np 333. By additionally inputting the link length L and the like, an output that responds to a change can be produced immediately even when the link length changes (provided that it is a link length that has already been learned). In FIG. 8, letting the output of Np 333 be q′, Np 333 learns using q, the internal state of the system, as the teacher. If the evaluation function is J₂ = (q − q′)²/2, the update amount is given as follows.
【０１６３】[0163]

【数３２】 [Equation 32]

【０１６４】As a result, the inverse transformation x_d → q of the input/output of the controlled system is formed inside Np 333. The learning method of equation (31), with J₂ as the evaluation function, can also be interpreted as follows. Since q = Δq + q′, minimizing J₂ is the same as minimizing Δq with q regarded as a virtual target value. In other words, it may be described as a learning method in which Np 333 learns so as to minimize Δq, the output of Na 332.

【０１６５】In the configuration of FIG. 8, Np 333 is added to Na 332, which operates adaptively using information from sensors such as vision and sensation (joint angle) as its main source of information. Np 333 learns on the basis of the operation of Na 332 and acquires from Na 332 the appropriate operation of the controlled system for the imposed target, and thus functions as a feedforward adaptive controller with learning capability.

【０１６６】If the link length L is not supplied to Np 333 as an input, Np 333 cannot produce the correct output for a change in L until its correction for that change has been completed.

【０１６７】FIG. 9 is a block diagram of a fourth adaptive learning control device that uses the identification result of the system identification device of FIG. 5. In FIG. 9, the system identification device consisting of the correlator 302, the network 303, the learning device 306, and the subtractor 307 of FIG. 5 is shown as block 340, and the system identification result of this identification device, here the inverse Jacobian output, is used in the adaptive learning control device.

【０１６８】The control device comprises a controlled object (Co) 341; a feedforward control mechanism 342 (constituted, for example, by a neural network Np); a subtractor 343 for obtaining the result of subtracting the output vector x of the controlled object 341 from the target value vector x_d, that is, the control deviation e_x; a multiplier 344 that multiplies the output of the identification device 340 by the control deviation e_x and supplies the product to the neural network Np 342 as a teacher signal; and an adder 345 that adds the product from the multiplier 344 and the output from the network Np 342.

【０１６９】The neural network Np 342 learns so as to reduce the teacher signal C generated from the output of the system identification device 340. The teacher signal C is given by the following equation.

[0170]

[Equation 33]
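The equation image itself is not reproduced in this text. Reading the surrounding description, in which the multiplier 344 forms the product of the identification output (here the inverse Jacobian) and the control deviation, a form consistent with the text would be the one below. This is an inference from the neighboring paragraphs, not the typeset Equation 33; the symbol J^{-1} for the output of block 340 and the sign convention of e_x are assumptions.

```latex
e_x = x_d - x, \qquad
C = J^{-1}\, e_x = \Delta q, \qquad
q = q' + \Delta q
```

Under this reading, C shrinks toward zero as the output q' of Np 342 alone drives the controlled object to the target, which matches the statement that Np 342 learns so as to reduce C.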

[0171] Once learning has progressed sufficiently, the network Np 342 acquires the data conversion function "vector x_d → vector q′", which is the inverse of the input/output characteristic of the controlled object 341. Compared with the second adaptive learning control device, this configuration has the advantage that the deviation vector Δq is obtained directly. Adding Δq to the output q′ of the network Np 342 gives the joint-angle control operation amount vector q, and feeding q directly to the controlled object (Co) enables more adaptive operation.
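The signal flow of this fourth controller can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the controlled object is taken to be a planar two-link arm with 0.5 m links, the analytic inverse Jacobian stands in for the output of identification block 340, the neural network Np 342 is reduced to a single stored output `q_ff` for one target, and the learning gain `eta` and both joint configurations are hypothetical values.

```python
import math

L1, L2 = 0.5, 0.5  # link lengths in metres (assumed two-link planar arm)

def forward(q):
    """Hand position of the arm for joint angles q (stand-in for Co 341)."""
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))

def inverse_jacobian(q):
    """Analytic inverse Jacobian (stand-in for the output of block 340)."""
    q1, q2 = q
    a = -L1 * math.sin(q1) - L2 * math.sin(q1 + q2)
    b = -L2 * math.sin(q1 + q2)
    c = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    d = L2 * math.cos(q1 + q2)
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def control_step(x_d, q_ff):
    """One pass through subtractor 343, multiplier 344 and adder 345."""
    x = forward(q_ff)                           # hand position with only q' applied
    e_x = (x_d[0] - x[0], x_d[1] - x[1])        # control deviation
    m = inverse_jacobian(q_ff)
    c = (m[0][0] * e_x[0] + m[0][1] * e_x[1],   # teacher signal C = delta q
         m[1][0] * e_x[0] + m[1][1] * e_x[1])
    q = (q_ff[0] + c[0], q_ff[1] + c[1])        # control operation amount q = q' + delta q
    return q, c

def error_norm(x_d, q):
    x = forward(q)
    return math.hypot(x_d[0] - x[0], x_d[1] - x[1])

x_d = forward((0.6, 0.9))   # hypothetical target hand position
q_ff = (0.3, 1.2)           # hypothetical untrained feedforward output q'

err_ff = error_norm(x_d, q_ff)
q, _ = control_step(x_d, q_ff)
err_corrected = error_norm(x_d, q)

# Crude stand-in for on-line learning: move q' so that C shrinks.
eta = 0.5
for _ in range(40):
    _, c = control_step(x_d, q_ff)
    q_ff = (q_ff[0] + eta * c[0], q_ff[1] + eta * c[1])
err_learned = error_norm(x_d, q_ff)
```

In this sketch the corrected command q lands closer to the target than q′ alone, and after the learning loop the stored q′ reaches the target by itself, so the teacher signal C falls toward zero.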

[0172] In FIG. 10, the two links are each 0.5 m long, and the value of the Jacobian J during system identification is set as follows. For iterations 1 to 100 it is

[0173]

[Equation 34]

[0174] and for iterations 101 to 200 it is

[0175]

[Equation 35]

[0176] FIG. 10 shows the simulated value of the Jacobian J under these settings. It is clear from FIG. 10 that the identified value of the Jacobian J converges to the correct value after about 20 iterations.
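An experiment of this kind can be imitated with a short Python sketch. Several things here are assumptions rather than the patent's method: the Hopfield network dynamics are replaced by a plain normalized gradient step on the squared residual, the two "true" Jacobians are computed from hypothetical joint configurations because Equations 34 and 35 are not reproduced in this text, and the gain, noise amplitude, and seed are arbitrary.

```python
import math
import random

L1, L2 = 0.5, 0.5  # link lengths in metres

def jacobian(q):
    """Jacobian of a planar two-link arm at joint angles q."""
    q1, q2 = q
    return [[-L1 * math.sin(q1) - L2 * math.sin(q1 + q2), -L2 * math.sin(q1 + q2)],
            [L1 * math.cos(q1) + L2 * math.cos(q1 + q2), L2 * math.cos(q1 + q2)]]

def identify(j_schedule, steps, gain=0.5, seed=0):
    """Estimate J from pairs (dq, dx) of minute input and output displacements."""
    rng = random.Random(seed)
    j_hat = [[0.0, 0.0], [0.0, 0.0]]
    errors, energies = [], []
    for k in range(steps):
        j_true = j_schedule(k)
        dq = [rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.01)]   # noise as minute input
        dx = [sum(j_true[i][j] * dq[j] for j in range(2)) for i in range(2)]
        resid = [dx[i] - sum(j_hat[i][j] * dq[j] for j in range(2)) for i in range(2)]
        energies.append(0.5 * sum(r * r for r in resid))     # includes the dx^2 term
        norm2 = sum(v * v for v in dq)
        if norm2 > 0.0:
            for i in range(2):
                for j in range(2):
                    j_hat[i][j] += gain * resid[i] * dq[j] / norm2
        errors.append(math.sqrt(sum((j_hat[i][j] - j_true[i][j]) ** 2
                                    for i in range(2) for j in range(2))))
    return j_hat, errors, energies

J_A = jacobian((0.3, 1.2))   # hypothetical setting for iterations 1 to 100
J_B = jacobian((0.6, 0.9))   # hypothetical setting for iterations 101 to 200
j_hat, errors, energies = identify(lambda k: J_A if k < 100 else J_B, 200)
```

`errors` holds the distance between the estimate and the current true Jacobian after each iteration. It falls quickly, jumps when the Jacobian is switched at iteration 101, and falls again, which is the qualitative behavior described for FIG. 10.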

[0177] FIG. 11 shows how the energy of the network 17 changes during the simulation of FIG. 10. When equations (12) and (13) were derived, the (vector Δx)² term was ignored in expanding equation (11); FIG. 11, however, shows the energy including that term. As noted earlier, the energy of an ordinary Hopfield network decreases with time because the term corresponding to (vector Δx)² is a constant.
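The role of the (vector Δx)² term can be made concrete with a short expansion. The block below assumes that equation (11), which is not reproduced in this section, is a squared-error energy of the form shown; the symbol \hat{J} for the current Jacobian estimate and the factor 1/2 are assumptions.

```latex
\begin{aligned}
E &= \tfrac{1}{2}\,\lVert \Delta x - \hat{J}\,\Delta q \rVert^{2} \\
  &= \tfrac{1}{2}\,\Delta x^{\top}\Delta x
     \;-\; \Delta x^{\top}\hat{J}\,\Delta q
     \;+\; \tfrac{1}{2}\,\Delta q^{\top}\hat{J}^{\top}\hat{J}\,\Delta q
\end{aligned}
```

Under this assumption the first term does not depend on \hat{J}, so dropping it leaves the minimizing estimate unchanged. With the term included, E is non-negative and equals zero exactly when \hat{J}\,\Delta q = \Delta x, which is consistent with an energy curve that settles at 0. Because Δx differs from sample to sample in this device, the term is not constant over time as it would be in an ordinary Hopfield network.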

[0178] FIG. 11 shows that, in line with FIG. 10, the energy converges within a short time of about 5 time units. At 100 time units, where the Jacobian value is changed, the energy rises temporarily but converges to 0 again within a short time, as at the start. When the inverse Jacobian is obtained as the identification result, it likewise converges to the correct value, as in the case of the Jacobian; a detailed description is omitted here.

[0179]

[Effects of the Invention] As described above, the present invention makes it possible to obtain the identification result of a target system using a neural network, so the identification device can be implemented in hardware. By additionally using a hierarchical neural network, the device can also be used to identify nonlinear systems.

[0180] Furthermore, using the identification device of the present invention to obtain the teacher signal for an adaptive control device allows the adaptive control device to learn on-line. No separate re-learning is needed even when the characteristics of the controlled object change, and efficiency of use improves compared with the conventional off-line learning method.

[0181] In addition, as described above, combining the invention with a hierarchical neural network makes it applicable to nonlinear systems. It is not limited to solving the inverse kinematics problem and can also be applied to systems that have dynamic characteristics.

Brief Description of the Drawings

FIG. 1 is a block diagram showing the principle of the present invention.

FIG. 2 is a diagram showing the configuration of a neural network used to identify the operation parameters of a system.

FIG. 3 is a block diagram showing the configuration of an embodiment of the system identification device.

FIG. 4 is a block diagram showing the detailed configuration of the correlator.

FIG. 5 is a block diagram showing the configuration of a system identification device that enables identification of a nonlinear system.

FIG. 6 is a block diagram showing the configuration of a first adaptive learning control device that uses the system identification result according to the present invention.

FIG. 7 is a block diagram showing the configuration of a second adaptive learning control device that uses the system identification result according to the present invention.

FIG. 8 is a block diagram showing the configuration of a third adaptive learning control device that uses the system identification result according to the present invention.

FIG. 9 is a block diagram showing the configuration of a fourth adaptive learning control device that uses the system identification result according to the present invention.

FIG. 10 is a diagram explaining the result of Jacobian identification.

FIG. 11 is a diagram showing the change in energy in the simulation of FIG. 10.

FIG. 12 is a block diagram showing the configuration of conventional example 1 of a control device using a neural network.

FIG. 13 is a block diagram showing the configuration of conventional example 2 of a control device using a neural network.

FIG. 14 is a block diagram showing the configuration of conventional example 3 of a control device using a neural network.

FIG. 15 is a block diagram showing the configuration of conventional example 4 of a control device using a neural network.

FIG. 16 is a block diagram showing the configuration of conventional example 5 of a control device using a neural network.

FIG. 17 is a block diagram showing the configuration of conventional example 6 of a control device using a neural network.

[Explanation of Reference Numerals]

101 Data processing means
102 Setting change data output means

Continuation of the front page: (51) Int. Cl.⁶ // G06G 7/60

Claims

[Scope of Claims]

1. A system identification device for outputting an identification result of an operation parameter of a control target system, comprising: setting change data output means for outputting setting change data for changing the setting of a data processing function based on an input/output relationship of the control target system; and data processing means for calculating and outputting the identification result using the setting change data output from the setting change data output means.

2. The system identification device according to claim 1, wherein the input/output relationship of the control target system is a correspondence between a minute input to the control target system and the minute output of the control target system in response to that minute input.

3. The system identification device according to claim 1, wherein the data processing means is constituted by a Hopfield neural network in which a plurality of neurons are mutually connected; the setting change data output means supplies to the neural network, as the setting change data, the external input signals to the neurons and the coupling coefficients between the neurons, changing them slightly at a time until the output of the neural network becomes constant; and the neural network outputs the constant output value as the identification result.

4. The system identification device according to claim 2, wherein noise is used as the minute input.

5. The system identification device according to claim 3, wherein the setting change data output means takes, as the input/output relationship of the control target system, the relationship between a minute displacement about a specific input value and the resulting minute displacement of the output of the control target system, and, while successively displacing the specific input value by minute amounts, changes the external input signal to each neuron of the neural network and the coupling coefficients between the neurons, thereby updating the internal potential of each neuron and changing the output of the neural network.

6. The system identification device according to claim 1, further comprising a learning device that receives the input signal to the control target system, wherein the learning device learns so that its own output matches the output of the data processing means and stores the identification result of the operation parameter of the control target system obtained through that learning.

7. The system identification device according to claim 6, wherein the learning device stores the identification result of the operation parameter for the non-linear control target system.

8. The system identification device, wherein the control target system is a multi-joint robot; the input/output relationship of the control target system is the relationship between a minute displacement of a joint angle of the multi-joint robot and a minute displacement of the hand position of the robot; and the data processing means outputs, as the identification result, a Jacobian, which is a matrix that converts minute displacements of the joint angles of the multi-joint robot into minute displacements of the hand position.

9. The system identification device as described, wherein the control target system is a multi-joint robot; the input/output relationship of the control target system is the relationship between a minute displacement of a joint angle of the multi-joint robot and a minute displacement of the hand position of the robot; and the data processing means outputs, as the identification result, an inverse Jacobian, which is a matrix that converts minute displacements of the hand position of the multi-joint robot into minute displacements of the joint angles.

10. An adaptive learning control device having a feedforward control mechanism for a control target system, comprising: setting change data output means for outputting setting change data for changing the setting of a data processing function based on an input/output relationship of the control target system; data processing means for calculating and outputting an identification result using the setting change data output from the setting change data output means; and a multiplier that supplies to the feedforward control mechanism the product of the output value of the data processing means and the control deviation between the target value and the actual output value of the control target system.

11. The adaptive learning control device according to claim 10, wherein the feedforward control mechanism comprises a neural network that receives the output of the multiplier as a teacher signal and performs on-line learning so that the output of the feedforward control mechanism becomes an appropriate amount.

12. The adaptive learning control device according to claim 10, wherein the output of the data processing means is a Jacobian calculated from a minute input to the control target system and the minute output of the control target system in response to that minute input.

13. An adaptive learning control device comprising a feedforward control mechanism for a control target system and a feedback control mechanism that receives the control deviation between a target value and the actual output value of the control target system and produces an output corresponding to the control deviation, the output of the feedback control mechanism being added to the output of the feedforward control mechanism and the sum being supplied to the control target system, wherein the feedback control mechanism comprises a neural network having a learning function, and the device further comprises: setting change data output means for outputting setting change data for changing the setting of a data processing function based on an input/output relationship of the control target system; data processing means for calculating and outputting an identification result using the setting change data output from the setting change data output means; and a multiplier that multiplies the output of the data processing means by the control deviation and supplies the product to the neural network as a teacher signal.

14. The adaptive learning control device according to claim 13, wherein the feedforward control mechanism changes its output to the control target system so that the control deviation gradually decreases through on-line learning based on the teacher signal.

15. The adaptive learning control device, wherein the feedforward control mechanism comprises a neural network that receives the output of the feedback control mechanism as a teacher signal and performs on-line learning so that the output of the feedforward control mechanism becomes an appropriate amount.

16. The adaptive learning control apparatus according to claim 15, wherein the feedback control mechanism further receives an addition result input to the control target system.

17. The adaptive learning control device according to claim 16, wherein the control target system is a multi-joint robot; the input/output relationship of the control target system is the relationship between a minute displacement of a joint angle of the multi-joint robot and a minute displacement of the hand position of the robot; and the feedforward control mechanism further receives the link length of the robot as an input.

18. The adaptive learning control device according to claim 13, wherein the output of the data processing means is a Jacobian calculated from a minute input to the control target system and the minute output of the control target system in response to that minute input.

19. An adaptive learning control device having a feedforward control mechanism for a control target system, comprising: setting change data output means for outputting setting change data for changing the setting of a data processing function based on an input/output relationship of the control target system; data processing means for calculating and outputting an identification result using the setting change data output from the setting change data output means; a multiplier that supplies to the feedforward control mechanism the product of the output of the data processing means and the control deviation between the target value and the actual output value of the control target system; and an adder that adds the output of the feedforward control mechanism and the output of the multiplier and supplies the sum to the control target system as a control operation amount.

20. The adaptive learning control device according to claim 19, wherein the feedforward control mechanism comprises a neural network and the multiplication result is given to the neural network as a teacher signal.

21. The adaptive learning control device according to claim 19, wherein the output of the data processing means is an inverse Jacobian calculated from a minute input to the control target system and the minute output of the control target system in response to that minute input.