JPH06289918A - Method for controlling learning of robot

Info

Publication number: JPH06289918A
Application number: JP5077204A
Authority: JP
Inventors: Yoshito Nanjo; 義人南條
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1993-04-02
Filing date: 1993-04-02
Publication date: 1994-10-18
Anticipated expiration: 2018-08-04
Also published as: JP3433465B2

Abstract

PURPOSE: To provide a learning control method for a robot that improves the learning efficiency of a robot having a learning function. CONSTITUTION: When a robot repeatedly executes a playback operation, the input value supplied to the drive element of the robot is corrected based on the error between the target trajectory of the playback operation and the actual trajectory of the robot during that operation. The method comprises a step p3 of calculating the velocity of the target trajectory; a step p4 of multiplying the velocity error between the target trajectory and the actual trajectory by a coefficient (learning gain) that decreases with the passage of time, and adding to the result the feedforward value supplied to the drive element during the previous playback operation; and a step p5 of setting the value obtained in step p4 as the feedforward value for the drive element in the next playback operation.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a learning control method for a robot, suitable for use in, for example, an industrial robot that repeatedly performs a playback operation.

[0002]

[Prior Art] As is well known, various techniques have been developed for making a robot repeatedly perform a playback operation. Among these, the learning control method described in the following document (a) is known.
(a) S. Arimoto: Learning Control Theory for Robotic Motion, International Journal of Adaptive Control and Signal Processing, 4-6, 546/564 (1990)

[0003] In the learning control method described in document (a), negative feedback loops for position error and velocity error are configured individually for each degree of freedom of the robot, and the robot is played back according to target joint trajectories prepared in advance for each degree of freedom. The velocity error between the target trajectory and the actual trajectory produced during the playback operation is then measured and multiplied by a constant (hereinafter, the learning gain). The product is added to the input value given to the drive element of the robot during the previous playback operation (hereinafter, the feedforward value), and the sum is given to the drive element as the new feedforward value in the next playback operation, thereby improving the accuracy of the playback operation.

[0004] According to this learning control method, the actual trajectory of the robot is corrected successively as the playback operation is repeated, and converges to the vicinity of the target trajectory prepared in advance. Dynamic effects seen in robots with multiple degrees of freedom, such as centrifugal force and Coriolis force, are thus eventually compensated, enabling high-speed, high-accuracy trajectory control. In theory, the larger the learning gain, the fewer playback operations are needed for the actual trajectory to converge to the vicinity of the target trajectory.

[0005]

[Problems to Be Solved by the Invention] However, in the conventional learning control method described above, if the learning gain by which the velocity error is multiplied is made too large, the actual trajectory oscillates as it approaches the end of the trajectory, owing to measurement errors arising in velocity detection at each joint and similar effects. This is reported, for example, in the following document (b).
(b) Nanjo et al.: Influence of the forgetting factor on robot trajectory control using learning, Proceedings of the 9th Annual Conference of the Robotics Society of Japan, No. 1, 391/392 (1991)

[0006] FIG. 3 shows an example of the actual trajectory after 20 playback operations when the conventional learning control method is applied to a robot with three degrees of freedom and the learning gain is made too large. As the figure shows, the actual trajectory A2, which nearly coincides with the target trajectory A1 from the start point S of the trajectory, oscillates as it approaches the end of the trajectory. In other words, the conventional learning control method has a practical limit on the size of the learning gain, so many playback operations are needed to make the actual trajectory converge to the vicinity of the target trajectory, and the learning efficiency is therefore poor.

[0007] The present invention was made against this background, and its object is to provide a learning control method for a robot that can improve the learning efficiency of a robot having a learning function.

[0008]

[Means for Solving the Problems] To solve the above problem, the present invention provides a learning control method that, when a robot repeatedly performs a playback operation, corrects the input value supplied to the drive unit of the robot based on the error between the target trajectory of the playback operation and the actual trajectory of the robot during that operation, the method comprising: a first step of measuring any one of the position error, velocity error, and acceleration error between the target trajectory and the actual trajectory; a second step of multiplying the error measured in the first step by a coefficient (learning gain) that decreases with the passage of time; a third step of adding the product from the second step to the input value (feedforward value) supplied to the drive unit during the previous playback operation; and a fourth step of supplying the sum from the third step to the drive unit as the input value (feedforward value) for the next playback operation.

[0009]

[Operation] According to the present invention, in the first step, any one of the position error, velocity error, and acceleration error between the target trajectory and the actual trajectory is measured; in the second step, the error measured in the first step is multiplied by a coefficient (learning gain) that decreases with the passage of time; in the third step, the product from the second step is added to the input value (feedforward value) supplied to the drive unit of the robot during the previous playback operation; and in the fourth step, the sum from the third step is supplied to the drive unit as the input value (feedforward value) for the next playback operation. When the feedforward value supplied to the drive unit of the robot is corrected in this way at every repeated playback operation, the oscillation of the actual trajectory that arises as the end of the trajectory approaches is suppressed.

[0010]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a control circuit used in one embodiment of the present invention. The figure shows the control system for only one degree of freedom of an electric-motor-driven robot; the other degrees of freedom have the same configuration and are omitted from the figure.

[0011] In FIG. 1, reference numeral 4 denotes an arithmetic processing unit that performs various digital operations and comprises a CPU (central processing unit, not shown), memories 1 to 3, and the like. The arithmetic processing unit 4 also includes input/output interface circuits such as a D/A (digital/analog) converter, which are omitted here for simplicity.

[0012] Each of the memories 1 to 3 has a storage capacity sufficient to hold at least as many data items as the time T required to traverse the target trajectory divided by the control period T_S. The trajectory data θ_ref(t) of the target trajectory is stored in advance in memory 2 as the target position θ_ref(T_S i) of the motor 9 at each control period T_S (where i is an integer). The contents and uses of memories 1 and 3 are described later.

[0013] The arithmetic processing unit 4 is connected, via a comparator 5, a multiplier 6, a comparator 7, and an amplifier 8, to a motor 9 that drives the robot. The motor 9 is connected to the comparator 7 via a rotational speed detector 11 and a multiplier 10, and to the comparator 5 via a rotational position detector 12.

[0014] The comparator 5 calculates the difference between the target trajectory data θ_ref(t), supplied from memory 2 every control period T_S, and the actual trajectory θ(t) measured by the position detector 12, and outputs the result. The multiplier 6 multiplies the output of the comparator 5 by a positive constant k₁ and outputs the result. The output of the multiplier 6 is supplied to the amplifier 8 via the comparator 7.

[0015] Meanwhile, the speed detector 11 measures the actual velocity dθ/dt of the motor 9 (hereinafter written θ^*(t)) and outputs the measurement. The multiplier 10 multiplies the actual velocity θ^*(t) measured by the speed detector 11 by a positive constant k₂ and outputs the result. The output of the multiplier 10 is sign-inverted and then supplied to the amplifier 8 via the comparator 7. The amplifier 8 amplifies the output of the comparator 7 and outputs it to the motor 9. This completes the feedback control system of the control circuit.
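In the notation of the surrounding paragraphs, the feedback path can be summarized in a single expression. This is only a sketch: the symbol c(t) for the output of comparator 7 is an assumption introduced here, and the feedforward value that the later steps supply to the amplifier is left out.

$$c(t) = k_1\bigl(\theta_{\mathrm{ref}}(t) - \theta(t)\bigr) - k_2\,\theta^{*}(t)$$

The first term corresponds to the output of multiplier 6 and the second to the sign-inverted output of multiplier 10; amplifier 8 amplifies c(t) and drives motor 9.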

[0016] With this configuration, the motor 9 is driven in the direction that reduces the error between its actual trajectory θ(t) and the target trajectory θ_ref(t) to zero. In practice, however, the positive constants k₁ and k₂ cannot be made extremely large if the stability of the feedback control system is to be ensured. In addition, when the robot operates at high speed, the centrifugal and Coriolis forces seen in robots with multiple degrees of freedom also have an effect, so the motor 9 cannot be made to follow the target trajectory θ_ref(t) perfectly. To compensate for errors arising from these causes, the feedforward control described below is performed in addition to the feedback control above.

[0017] Next, the feedforward control operation based on the operation of the CPU is described with reference to the flowchart shown in FIG. 2. First, in step p1, the data u_k stored in memory 1 is cleared to 0. Here, u_k is the feedforward value given to the drive element of the robot during the k-th playback operation. Since this is the first playback operation, the feedforward value u₁ is 0.

[0018] In step p2, the feedforward value u₁ in memory 1 is supplied to the amplifier 8 every control period T_S in the feedback control system shown in FIG. 1, and the first playback operation is performed. During this operation, the velocity θ^*(T_S i) over the entire actual trajectory, measured by the speed detector 11 every control period T_S, is stored in memory 3. When the first playback operation ends, the feedforward value in memory 1 is corrected by steps p3 to p5 below.

[0019] In step p3, the target velocity θ^*_ref(t) is calculated from the target trajectory θ_ref(t) stored in memory 2. Specifically, as shown in equation (1) below, the velocity θ^*_ref(T_S i) at time t = T_S i can be calculated by dividing the difference between the target positions at adjacent control periods by the control period T_S, where i is an integer indexing the control periods.

$$\theta^{*}_{\mathrm{ref}}(T_S i) = \frac{\theta_{\mathrm{ref}}\bigl(T_S (i+1)\bigr) - \theta_{\mathrm{ref}}(T_S i)}{T_S} \qquad (1)$$

[0020] In step p4, the difference between the velocity θ^*_ref(t) calculated in step p3 and the actual velocity θ^*(t) of the previous playback operation stored in memory 3 is calculated for each control period T_S, and the result is multiplied by the learning gain φ. The learning gain φ is chosen so that it takes smaller values as the end of the trajectory approaches, for example by the function expressed in equation (2) below, where φ₀ and λ are positive constants and e is the base of the natural logarithm.

[Equation 1]
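The equation itself is not reproduced in this text. A form consistent with the description (positive constants φ₀ and λ, the base e of the natural logarithm, and a value that shrinks toward the end of the trajectory) would be an exponential decay. The following is an assumption for illustration, not the confirmed equation (2):

$$\phi(t) = \phi_0\, e^{-\lambda t}$$

With this form the gain equals φ₀ at the start point of the trajectory and decreases monotonically as t increases.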

[0021] In step p5, the value multiplied by the learning gain φ is added to the feedforward value u₁(t) in memory 1, and the sum is stored in memory 1 as the new feedforward value u₂(t). In step p6, it is determined whether to continue the learning operation. That is, if the number of playback operations has not reached the preset number of learning trials, or if the actual trajectory has not reached the preset desired accuracy, the determination is "Yes" and processing returns to step p2. The learning operation associated with the second playback operation is then performed.

[0022] In the second learning operation, the feedforward value u₂(t) newly stored in memory 1 in step p5 is supplied to the amplifier 8. Thereafter, as long as the determination in step p6 is "Yes", steps p3 to p5 are repeated for each playback operation and the feedforward value in memory 1 is corrected each time. The feedforward value is calculated by equation (3) below.

[Equation 2]
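The equation itself is not reproduced in this text. Reading steps p4 and p5 together with equation (4) for the case a = 0, the update would take the following form; this is a reconstruction under that assumption, with k denoting the index of the playback operation and θ^*(t) the actual velocity measured during the k-th playback:

$$u_{k+1}(t) = \phi(t)\,\bigl(\theta^{*}_{\mathrm{ref}}(t) - \theta^{*}(t)\bigr) + u_k(t)$$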

[0023] When the number of playback operations reaches the preset number of learning trials, or when the actual trajectory reaches the desired accuracy, the determination in step p6 becomes "No" and the learning operation ends.

[0024] Thus, according to this embodiment, the learning gain by which the difference between the target trajectory and the actual trajectory is multiplied is made smaller as the end of the trajectory approaches, which suppresses the oscillation that used to arise toward the end of the trajectory. As a result, the learning gain can be set larger than in the conventional method, and the actual trajectory can be made to converge to the vicinity of the target trajectory with fewer repeated playback operations.

[0025] In this embodiment, the learning gain φ(t) in equation (3) is set by the function expressed in equation (2), but any other function may be used as long as its value decreases with the passage of time.

[0026] In this embodiment, the target velocity θ^*_ref(t) is calculated from θ_ref(t) stored in memory 2; alternatively, the target velocity θ^*_ref(t) may be stored in advance in a separate memory. Also, instead of storing the actual velocity θ^*(t) during the playback operation in memory 3, the feedforward value for each control period may be calculated directly by equation (3) and the calculated value u_{k+1}(t) stored directly in memory 1.

[0027] This embodiment uses a computer (the arithmetic processing unit 4 comprising a CPU and the like) as the control circuit, but this may be replaced with various other well-known control elements.

[0028] This embodiment has been described for a playback-type robot in which each joint is driven by an electric motor and a negative feedback control system for position error and velocity error is configured individually for each motor, but the present invention is also applicable to robots driven by hydraulic cylinders and to machines having various other servo mechanisms.

[0029] In this embodiment, only the velocity error between the target trajectory and the actual trajectory of the robot is used to calculate the feedforward value, but learning can also be performed by multiplying the position error or the acceleration error by a coefficient that decreases with time and adding the result to the feedforward value of the previous playback operation.

[0030] Furthermore, the present invention can be used together with the learning control method described in document (a). That is, to mitigate the influence of measurement errors and the like, document (a) proposes a learning control method in which the previous feedforward value u_k(t) is multiplied by a suitable positive constant (1-a), as shown in equation (4) below, and this can be combined with the present invention.

$$u_{k+1}(t) = \phi\,\bigl(\theta^{*}_{\mathrm{ref}}(t) - \theta^{*}(t)\bigr) + (1-a)\,u_k(t) \qquad (4)$$
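The update of equation (4) can be sketched per control period as follows. The function name `update_with_forgetting` and the representation of the gain as a list with one value per sample (so that a time-decreasing gain can be plugged in when the two methods are combined) are assumptions made for illustration.

```python
from typing import List, Sequence


def update_with_forgetting(
    u_prev: Sequence[float],
    v_ref: Sequence[float],
    v_act: Sequence[float],
    gains: Sequence[float],
    a: float,
) -> List[float]:
    """Per-sample update: gain * (v_ref - v_act) + (1 - a) * u_prev.

    gains[i] is the learning gain applied at sample i (hypothetical layout);
    a = 0 leaves the previous feedforward value undiminished.
    """
    return [
        phi * (r - v) + (1.0 - a) * u
        for u, r, v, phi in zip(u_prev, v_ref, v_act, gains)
    ]


print(update_with_forgetting([1.0, 2.0], [1.0, 1.0], [0.5, 1.0], [0.5, 0.25], 0.25))
# [1.0, 1.5]
```

The second sample has zero velocity error, so its feedforward value simply shrinks by the factor (1 - a) from 2.0 to 1.5.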

[0031]

[Effects of the Invention] As described above, according to the present invention, the oscillation of the actual trajectory that arises as the end of the trajectory approaches is suppressed, so the learning gain can be set larger than before and the actual trajectory can be made to converge to the vicinity of the target trajectory with fewer repeated playback operations. As a result, the learning efficiency of the robot can be improved.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a control circuit used in one embodiment of the present invention.

FIG. 2 is a flowchart showing the feedforward control operation based on the operation of the CPU in the same embodiment.

FIG. 3 is a diagram showing an example of the actual trajectory after 20 playback operations when the conventional learning control method is applied to a robot with three degrees of freedom and the learning gain is made too large.

[Explanation of symbols]

1 to 3: memory
4: arithmetic processing unit
5, 7: comparator
6, 10: multiplier
8: amplifier
9: motor
11: rotational speed detector
12: rotational position detector
A1: target trajectory
A2: actual trajectory

Claims

[Claims]

1. A learning control method that, when a robot repeatedly performs a playback operation, corrects an input value supplied to a drive unit of the robot based on an error between a target trajectory of the playback operation and an actual trajectory of the robot during the playback operation, the method comprising: a first step of measuring any one of a position error, a velocity error, and an acceleration error between the target trajectory and the actual trajectory; a second step of multiplying the error measured in the first step by a coefficient that decreases with the passage of time; a third step of adding the product from the second step to the input value supplied to the drive unit during the previous playback operation; and a fourth step of supplying the sum from the third step to the drive unit as the input value for the next playback operation.