JPH0546206A - Learning control system - Google Patents

Learning control system

Info

Publication number
JPH0546206A
JPH0546206A — application JP3223437A (JP 22343791 A)
Authority
JP
Japan
Prior art keywords
trial
time
equation
control input
deviation
Prior art date
Legal status
Granted
Application number
JP3223437A
Other languages
Japanese (ja)
Other versions
JP3039814B2 (en)
Inventor
Yuji Nakamura (中村 裕司)
Current Assignee
Yaskawa Electric Corp
Original Assignee
Yaskawa Electric Corp
Priority date
Filing date
Publication date
Application filed by Yaskawa Electric Corp filed Critical Yaskawa Electric Corp
Priority to JP3223437A priority Critical patent/JP3039814B2/en
Publication of JPH0546206A publication Critical patent/JPH0546206A/en
Application granted granted Critical
Publication of JP3039814B2 publication Critical patent/JP3039814B2/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Feedback Control In General (AREA)

Abstract

PURPOSE: To achieve highly precise tracking by correcting the control input so that the weighted sum of squares of the predicted future deviations is minimized, while also taking the future correction amounts into account. CONSTITUTION: A trial in which the output of the controlled object follows a target command repeating the same pattern is repeated, and the control input u_k(i) at time i of the k-th trial is given by expression I. In expression I, σ_k(i) is the correction from the previous control input u_{k-1}(i), σ_P(i) is the corrections input up to the present time, e_L(i) is the tracking deviation in the previous trial, w_h1 is a constant determined by information on the dynamic characteristics of the controlled object and by the weight matrix applied to the predicted future tracking deviations, H_F and H_P are constants determined by information on the dynamic characteristics of the controlled object, and W is the weight matrix applied to the predicted future tracking deviations.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a control system for machine tools, robots, and other equipment that perform repetitive operations.

[0002]

[Prior Art] As a design method for a learning control system with a repetitive target value, there is the method previously proposed by the present applicant in Japanese Patent Laid-Open No. 1-237701. That method repeats an operation against the same target value, predicts the future deviation from past deviations and information on the dynamic characteristics of the controlled object, and corrects the control input so that the weighted sum of squares of the predicted deviations, taken as an evaluation function, is minimized. The output ultimately coincides with the target value, so highly accurate tracking is achieved.
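As a toy illustration of this repeated-trial idea (a minimal sketch only: the first-order plant and the learning gain are invented for the example, and the update is a simple one-step-ahead correction, not the predictive law of the publication):

```python
# Toy iterative learning control: replay the same reference every trial and
# correct the next trial's input from the previous trial's deviation.
# The first-order plant (a = 0.5, b = 1.0) and the gain are invented for
# this sketch; this is NOT the patent's predictive update law.

def run_trial(u, a=0.5, b=1.0):
    """Simulate y(i+1) = a*y(i) + b*u(i) with y(0) = 0; return the output."""
    y = [0.0]
    for i in range(len(u) - 1):
        y.append(a * y[-1] + b * u[i])
    return y

def learn(reference, trials=20, gain=0.4):
    """Repeat trials, updating u_k(i) = u_{k-1}(i) + gain * e_{k-1}(i+1)."""
    u = [0.0] * len(reference)
    peak_errors = []
    for _ in range(trials):
        y = run_trial(u)
        e = [r - yi for r, yi in zip(reference, y)]
        peak_errors.append(max(abs(ei) for ei in e))
        # u(i) acts on y(i+1), so correct with the deviation one step ahead
        u = [ui + gain * e[i + 1] if i + 1 < len(e) else ui
             for i, ui in enumerate(u)]
    return peak_errors

ref = [0.0] + [1.0] * 19          # the same step reference, every trial
errs = learn(ref)
print(errs[0], errs[-1])          # the peak error shrinks trial by trial
```

Each trial replays the same reference; the deviation recorded in trial k−1 corrects the input for trial k, and the peak tracking error shrinks from one trial to the next.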

[0003]

[Problems to be Solved by the Invention] The above method determines the current correction by predicting the future deviation under the assumption that the correction amounts after the current time do not change from the value at the current time. In practice the corrections do change, so the predicted deviation is off and accuracy deteriorates. The object of the present invention is therefore to provide a control system that predicts the deviation more accurately.

[0004]

[Means for Solving the Problems] To solve the above problems, the inventions of claims 1, 2, and 3 predict the future deviation while also allowing for changes in the correction amount after the current time; each has the following features. In the invention of claim 1, a trial is repeated so that the output of the controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)
σ_k(i) = w_h1 H_F^T W [ e_L(i) − H_P σ_P(i) ]
where
σ_k(i): correction from the previous control input u_{k-1}(i)
σ_P(i): corrections input up to the current time
e_L(i): tracking deviation in the previous trial
w_h1: constant determined by information on the dynamic characteristics of the controlled object and by the weight matrix applied to the predicted future tracking deviations
H_F, H_P: constants determined by information on the dynamic characteristics of the controlled object
W: weight matrix applied to the predicted future tracking deviations.
In the invention of claim 2, a trial is likewise repeated so that the output of the controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)

[0005]

[Equation 4]

[0006] where
σ_k(i): correction from the previous control input u_{k-1}(i)
e_k(i): tracking deviation at time i of the k-th trial
p_m, f_n: constants determined by the sample values of the step response of the controlled object and by the weight matrix applied to the predicted future tracking deviations.
In the invention of claim 3, a trial is likewise repeated so that the output of the controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)

[0007]

[Equation 5]

[0008] where
σ_k(i): correction from the previous control input u_{k-1}(i)
e_k(i): tracking deviation at time i of the k-th trial
ΔH_n: difference values of the sampled step response of the controlled object.

[0009]

[Operation] By the above means, the future tracking deviation is predicted more accurately and the learning performance improves.

[0010]

[Embodiments] The present invention is also applicable when the target command repeats continuously at a fixed period; however, since the current value of the deviation is not used in determining the control input, it is equally possible to perform the trials intermittently and to compute the control input for the next trial off-line, as a batch, between trials. A specific embodiment for the latter case is described below with reference to the drawings. FIG. 1 shows an embodiment of the invention of claim 1. In the figure, 1 is a command generator that intermittently generates the same pattern, producing the series of target command values {r(j)} (j = i_0, i_0+1, …, i_n) for one trial, where i_0 and i_n are the start and end times of the trial. 2 is a subtractor that outputs the series of deviations {e_k(j)} (j = i_0, i_0+1, …, i_n) for the current trial. 3 is a memory storing the constant matrices w_h1, H_F, H_P, and W; 4 is a memory storing the corrections σ_k(j) (j = i_0, i_0+1, …, i_n) of the current trial; 5 is a memory storing the deviations e_{k-1}(j) (j = i_0, i_0+1, …, i_n) of the previous trial, into which the output of subtractor 2, i.e. the deviation e_k(j) (j = i_0, i_0+1, …, i_n), is written during the current trial. 6 is a computing unit that calculates the correction σ_k(i) at time i by
σ_k(i) = w_h1 H_F^T W [ e_L(i) − H_P σ_P(i) ]   (1a)
and then obtains and outputs the control input u_k(j) (j = i_0, i_0+1, …, i_n) of the current trial from
u_k(i) = u_{k-1}(i) + σ_k(i).
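The between-trials computation performed by computing unit 6 can be sketched as follows. This is a hypothetical miniature: `g` stands in for the precomputed row vector w_h1 H_F^T W, and `HP` for the matrix H_P; in practice both would be derived from the plant's step response and the weight matrix W.

```python
# Between-trials computation of FIG. 1, claim-1 form: for each time i,
#   sigma_k(i) = g . (e_L(i) - HP . sigma_P(i))     -- equation (1a)
#   u_k(i)     = u_{k-1}(i) + sigma_k(i)
# g and HP are placeholders invented for this sketch.

def next_trial_input(u_prev, e_prev, g, HP, M, Npast):
    """Compute the whole next-trial input sequence off-line."""
    n = len(u_prev)
    sigma = [0.0] * n
    u_next = [0.0] * n
    for i in range(n):
        # e_L(i): previous-trial deviations at times i+1 .. i+M (zero-padded)
        eL = [e_prev[i + m] if i + m < n else 0.0 for m in range(1, M + 1)]
        # sigma_P(i): corrections already decided at times i-1 .. i-Npast
        sP = [sigma[i - j] if i - j >= 0 else 0.0
              for j in range(1, Npast + 1)]
        resid = [eL[m] - sum(HP[m][j] * sP[j] for j in range(Npast))
                 for m in range(M)]
        sigma[i] = sum(g[m] * resid[m] for m in range(M))
        u_next[i] = u_prev[i] + sigma[i]
    return u_next

# Tiny demo with M = 1, one past correction, and HP = 0:
u1 = next_trial_input([0.0, 0.0, 0.0], [0.0, 1.0, 1.0],
                      g=[0.5], HP=[[0.0]], M=1, Npast=1)
print(u1)                         # [0.5, 0.5, 0.0]
```

In the embodiment this entire loop runs off-line between trials, which is why the current deviation value is never needed; σ_P(i) is assembled from corrections already computed earlier in the same sweep.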

[0011] 7 is a memory storing the control input for one trial: during the previous trial it holds the previous trial's input u_{k-1}(j) (j = i_0, i_0+1, …, i_n), and after that trial ends it stores the current trial's input u_k(j) (j = i_0, i_0+1, …, i_n) computed by computing unit 6, which is output during the current trial. 8 and 9 are samplers closing with sampling period T, and 10 is a hold circuit. 11 is the controlled object, with input u(t) and output y(t). FIG. 2 shows an embodiment of the invention of claim 2. In the figure, 23 is a memory storing the constants p_1, p_2, …, p_M, f_1, f_2, …, f_{N-1}, and 26 is a computing unit, which performs the computation

[0012]

[Equation 6]

By this computation, the correction σ_k(i) at time i is calculated, and the control input u_k(j) (j = i_0, i_0+1, …, i_n) for the current trial is obtained from u_k(i) = u_{k-1}(i) + σ_k(i). FIG. 3 shows an embodiment of the invention of claim 3. In the figure, 33 is a memory storing the difference values ΔH_d, ΔH_{d+1}, …, ΔH_N of the sampled step response of the controlled object, and 37 is a computing unit, which performs the computation

[0013]

[Equation 7]

[0014] By this computation, the correction σ_k(i) at time i is calculated, and the control input u_k(j) (j = i_0, i_0+1, …, i_n) for the current trial is obtained from u_k(i) = u_{k-1}(i) + σ_k(i). Equations (1a)–(1c) are now derived. The controlled object 11 is represented by the impulse-response model

[0015]

[Equation 8]

[0016] Here ΔH_n is the difference value of the sample values {H_1, H_2, …, H_N} of the unit step response of the controlled object 11, measured in advance (FIG. 4): ΔH_n = H_n − H_{n-1}. N is chosen so that the response has settled sufficiently, i.e., ΔH_n ≈ 0 for n > N, and ΔH_0 = 0. Further, consider the actual output y(i) and the model output of equation (2).
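The model of equation (2) can be sketched directly: the output is the convolution of the input with the step-response differences ΔH_n. The step-response samples H below are invented for the example.

```python
# Impulse-response model of equation (2): y(i) = sum_n dH_n * u(i-n), where
# dH_n = H_n - H_{n-1} are differences of the sampled unit-step response.
# The samples H (an invented stable plant) settle, so dH_n ~ 0 for large n.

H = [0.0, 0.5, 0.75, 0.875, 0.9375]               # H_0..H_4, with H_0 = 0
dH = [H[n] - H[n - 1] for n in range(1, len(H))]  # dH_1..dH_4

def model_output(u):
    """y(i) = sum_{n=1}^{N} dH_n * u(i-n)."""
    y = []
    for i in range(len(u)):
        acc = 0.0
        for n in range(1, len(dH) + 1):
            if i - n >= 0:
                acc += dH[n - 1] * u[i - n]
        y.append(acc)
    return y

# A unit step as input must reproduce the step-response samples themselves.
print(model_output([1.0] * 5))    # [0.0, 0.5, 0.75, 0.875, 0.9375]
```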

[0017]

[Equation 9]

[0018] The difference between the two, i.e., the estimation error, is denoted d(i).

[0019]

[Equation 10]

[0020] Now let the control input at time i of the k-th trial be given by
u_k(i) = u_{k-1}(i) + σ_k(i)   (4)
where k is the trial number and σ_k(i) is the correction from the previous control input u_{k-1}(i). The predicted future tracking deviation e_k* is obtained by the following procedure. At time i of the k-th trial, the output y_k(i) can be expressed as follows.

[0021]

[Equation 11]

[0022] Further, at time i of the (k−1)-th trial,

[0023]

[Equation 12]

[0024] Subtracting equation (6) from equation (5) yields the following equation.

[0025]

[Equation 13]

[0026] Here δ_k(i) is the change of the output y_k(i) from the output y_{k-1}(i) at the same time in the previous trial. The output change δ_k(i+m) at time i+m is expressed by the following equation.

[0027]

[Equation 14]

[0028] Now, in obtaining the predicted output changes δ_k*(i+m) (m = 1, 2, …, M) up to M steps ahead at time i, assume that the estimation error of the model (2) is unchanged between trials, i.e., d_k(i+m) = d_{k-1}(i+m). The predicted value δ_k*(i+m) is then, from (10),

[0029]

[Equation 15]

[0030] By the definition of δ_k(i), the tracking deviation e_k(i+m) at time i+m is expressed as
e_k(i+m) = e_{k-1}(i+m) − δ_k(i+m)   (12)
and its predicted value e_k*(i+m) is therefore given by
e_k*(i+m) = e_{k-1}(i+m) − δ_k*(i+m)   (13)
From (11) and (13), the predicted deviation e_k*(i+m) is finally given by the following equation.

[0031]

[Equation 16]

[0032] Rewriting,
e*(i) = e_L(i) − H_P σ_P(i) − H_F σ_F(i)   (15)
where
e*(i) = [e_k*(i+1), e_k*(i+2), …, e_k*(i+M)]^T
e_L(i) = [e_{k-1}(i+1), e_{k-1}(i+2), …, e_{k-1}(i+M)]^T
σ_P(i) = [σ_k(i−1), σ_k(i−2), …, σ_k(i−N+1)]^T
σ_F(i) = [σ_k(i), σ_k(i+1), …, σ_k(i+M−1)]^T
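The residual e_L(i) − H_P σ_P(i) appearing in (15), i.e. the previous trial's deviation minus the output change caused by corrections already applied, can be sketched numerically (ΔH_n and all values are invented; the estimation error d is assumed unchanged between trials, so it cancels out of the prediction):

```python
# Compute the residual [e_L - H_P sigma_P](i+m) of equation (15): the
# deviation predicted for trial k if no further corrections were applied.
# dH and the numeric values are invented for this sketch.

dH = [0.5, 0.25, 0.125]          # step-response differences dH_1..dH_3

def residual(e_prev, sigma, i, M):
    """Return [e*(i+1), ..., e*(i+M)] with future corrections taken as 0."""
    out = []
    for m in range(1, M + 1):
        # delta*(i+m): output change due to corrections up to time i only
        delta = sum(dH[n - 1] * sigma[i + m - n]
                    for n in range(1, len(dH) + 1)
                    if 0 <= i + m - n <= i)
        out.append(e_prev[i + m] - delta)       # equation (13)
    return out

e_prev = [0.0, 1.0, 1.0, 1.0]    # previous trial's deviation
sigma = [0.4, 0.0, 0.0, 0.0]     # only sigma(0) has been applied so far
print(residual(e_prev, sigma, 0, 3))
```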

[0033]

[Equation 17]

[0034] From the above, the predicted future tracking deviation e*(i) is expressed in terms of the tracking deviation e_L(i) of the previous trial, the corrections σ_P(i) input up to the present, and the corrections σ_F(i) after the current time that are still to be determined. As an index for making the predicted deviations up to M steps ahead small, consider the evaluation function
J = e*(i)^T W e*(i) = [e_L(i) − H_P σ_P(i) − H_F σ_F(i)]^T W [e_L(i) − H_P σ_P(i) − H_F σ_F(i)]   (16)

[0035]

[Equation 18]

[0036] and determine σ_F(i) so that this evaluation function J is minimized. Here w_m is the weight applied to the predicted tracking deviation e_k*(i+m) m steps ahead (an example is shown in FIG. 5), with w_m > 0 (m = 1, 2, …, M). The σ_F(i) minimizing the evaluation function (16) is given by weighted least-squares estimation as
σ_F(i) = [H_F^T W H_F]^{-1} H_F^T W [e_L(i) − H_P σ_P(i)]   (17)
Therefore the σ_k(i) to be determined now is
σ_k(i) = w_h1 H_F^T W [e_L(i) − H_P σ_P(i)]   (18)
where w_h1 is the first row of the matrix [H_F^T W H_F]^{-1}. Since w_h1 H_F^T W can be calculated before learning by measuring the step-response data {H_n} and choosing the weight matrix W appropriately, the correction σ_k(i) at time i is determined according to equation (1a). In the invention of claim 2, all future corrections from σ_k(i+2) onward are assumed equal to σ_k(i+1), so equation (15) becomes
e*(i) = e_L(i) − H_P σ_P(i) − H_F2 σ_F2(i)   (19)
σ_F2(i) = [σ_k(i), σ_k(i+1)]^T
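Equations (17)–(18) can be exercised in miniature. A sketch under invented assumptions (the ΔH sequence, diagonal weights w_m, and residual r stand in for real plant data): build H_F, form the weighted normal equations, and solve; only the first component, σ_k(i), is actually applied.

```python
# Weighted least-squares choice of the future corrections, equation (17):
# minimize (r - HF s)^T W (r - HF s). dH, the weights, and r are invented.

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r_: abs(aug[r_][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r_ in range(c + 1, n):
            f = aug[r_][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r_][k] -= f * aug[c][k]
    x = [0.0] * n
    for r_ in range(n - 1, -1, -1):
        x[r_] = (aug[r_][n] - sum(aug[r_][k] * x[k]
                                  for k in range(r_ + 1, n))) / aug[r_][r_]
    return x

dH = [0.5, 0.25, 0.125]          # step-response differences dH_1..dH_3
M = 3
# HF[m][j] = dH_{(m+1)-j}: effect of sigma(i+j) on the deviation at i+m+1
HF = [[dH[m - j] if 0 <= m - j < len(dH) else 0.0 for j in range(M)]
      for m in range(M)]
W = [1.0, 0.8, 0.6]              # diagonal weights w_1..w_M
r = [0.8, 0.9, 0.95]             # residual e_L(i) - H_P sigma_P(i)

# normal equations (HF^T W HF) s = HF^T W r, then solve for sigma_F
A = [[sum(W[m] * HF[m][a] * HF[m][b] for m in range(M)) for b in range(M)]
     for a in range(M)]
rhs = [sum(W[m] * HF[m][a] * r[m] for m in range(M)) for a in range(M)]
sigma_F = solve(A, rhs)
print(sigma_F[0])                # only sigma_F[0] = sigma_k(i) is applied
```

In this example H_F happens to be square and invertible, so the weights cancel out of the solution (the same collapse that equation (28) exploits in the dead-time case); the weighted form matters when H_F is not invertible.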

[0037]

[Equation 19]

[0038] The evaluation function J of (16) then becomes
J = e*(i)^T W e*(i) = [e_L(i) − H_P σ_P(i) − H_F2 σ_F2(i)]^T W [e_L(i) − H_P σ_P(i) − H_F2 σ_F2(i)]   (20)
The σ_F2(i) minimizing (20) is likewise given by weighted least-squares estimation as
σ_F2(i) = [H_F2^T W H_F2]^{-1} H_F2^T W [e_L(i) − H_P σ_P(i)]   (21)
where

[0039]

[Equation 20]

[0040] Therefore, from (21), the σ_k(i) to be determined now is

[0041]

[Equation 21]

[0042] Further, since
[c, −b] H_F2^T W = [cW_1ΔH_1, cW_2ΔH_2 − bW_2H_1, …, cW_MΔH_M − bW_MH_{M-1}]
it follows from (22) that

[0043]

[Equation 22]

[0044] Rewriting,

[0045]

[Equation 23]

[0046] where

[0047]

[Equation 24]

[0048] These constants can be calculated before learning by measuring the step-response data {H_n} and choosing the weight matrix W appropriately, so the correction σ_k(i) at time i is determined according to equation (1b). In the invention of claim 3, the controlled object has dead time, with H_1 = H_2 = … = H_{d-1} = 0 and H_d ≠ 0, and equation (15) is rewritten as
e*_d(i) = e_Ld(i) − H_Pd σ_Pd(i) − H_Fd σ_Fd(i)   (25)
where
e*_d(i) = [e_k*(i+d), e_k*(i+d+1), …, e_k*(i+M)]^T
e_Ld(i) = [e_{k-1}(i+d), e_{k-1}(i+d+1), …, e_{k-1}(i+M)]^T
σ_Pd(i) = [σ_k(i−1), σ_k(i−2), …, σ_k(i−N+d)]^T
σ_Fd(i) = [σ_k(i), σ_k(i+1), …, σ_k(i+M−d)]^T

[0049]

[Equation 25]

[0050] Here the σ_Fd(i) minimizing the evaluation function
J = e*_d(i)^T W e*_d(i) = [e_Ld(i) − H_Pd σ_Pd(i) − H_Fd σ_Fd(i)]^T W [e_Ld(i) − H_Pd σ_Pd(i) − H_Fd σ_Fd(i)]   (26)
is likewise given by weighted least-squares estimation as
σ_Fd(i) = [H_Fd^T W H_Fd]^{-1} H_Fd^T W [e_Ld(i) − H_Pd σ_Pd(i)]   (27)
Since
[H_Fd^T W H_Fd]^{-1} H_Fd^T W = H_Fd^{-1} W^{-1} H_Fd^{-T} H_Fd^T W = H_Fd^{-1}   (28)
and the first row of H_Fd^{-1} is [ΔH_d^{-1}, 0, 0, …, 0], the σ_k(i) to be determined now follows from (27) and (28) as

[0051]

[Equation 26]

[0052] Therefore the correction σ_k(i) at time i is determined according to equation (1c). It has thus been shown that the corrections σ_k(i) given by equations (1a)–(1c) minimize the evaluation functions J of (16), (20), and (26), respectively.
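The dead-time result just derived can be exercised numerically. A minimal sketch with an invented step response (d = 2, so ΔH_1 = 0): equation (1c) reduces the weighted least squares to a single division by ΔH_d.

```python
# Claim 3's dead-time form, equation (1c), in miniature: sigma_k(i) is the
# deviation predicted d steps ahead, minus the effect of past corrections,
# divided by dH_d. The numbers below are invented for the sketch.

d = 2
dH = [0.0, 0.5, 0.25]        # dH_1 = 0 (dead time), dH_2 = 0.5, dH_3 = 0.25

def corrections(e_prev, N):
    """sigma_k(i) = (e_{k-1}(i+d) - sum_j dH_{d+j} sigma_k(i-j)) / dH_d."""
    sigma = []
    for i in range(N):
        past = sum(dH[d + j - 1] * sigma[i - j]
                   for j in range(1, len(dH) - d + 1) if i - j >= 0)
        e_ahead = e_prev[i + d] if i + d < len(e_prev) else 0.0
        sigma.append((e_ahead - past) / dH[d - 1])
    return sigma

e_prev = [0.0, 0.0, 1.0, 1.0, 1.0]   # previous trial's deviation
sig = corrections(e_prev, 3)
print(sig)                            # [2.0, 1.0, 1.5]
```

Each σ_k(i) exactly cancels the deviation predicted at time i+d (e.g. ΔH_2 · 2.0 = 1.0 matches e_prev[2]), so with an exact model the deviation would vanish in one trial.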

[0053]

[Effects of the Invention] As described above, according to the present invention, in a learning control system that repeats an operation against a target value of the same pattern, the future deviation is predicted from past deviations and information on the dynamic characteristics of the controlled object, and the control input is corrected so that the weighted sum of squares of the predicted values is minimized. Since the future correction amounts are also taken into account, the output ultimately coincides with the target value and highly accurate tracking is achieved. Furthermore, because this correction computation does not require information such as the current deviation, it can be executed between trials.

[Brief Description of the Drawings]

[FIG. 1] A diagram showing a specific embodiment of the present invention.

[FIG. 2] A diagram showing a specific embodiment of the present invention.

[FIG. 3] A diagram showing a specific embodiment of the present invention.

[FIG. 4] A diagram explaining the operation of the present invention.

[FIG. 5] A diagram explaining the operation of the present invention.

[Explanation of Reference Numerals] 1: command generator; 2: subtractor; 3, 4, 5, 7: memories; 6: computing unit; 8, 9: samplers; 10: hold circuit; 11: controlled object; 23: memory; 26: computing unit; 33: memory; 37: computing unit.

Claims (3)

[Claims]

[Claim 1] A learning control system characterized in that a trial is repeated so that the output of a controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)
σ_k(i) = w_h1 H_F^T W [ e_L(i) − H_P σ_P(i) ]
(where σ_k(i) is the correction from the previous control input u_{k-1}(i); σ_P(i) is the corrections input up to the current time; e_L(i) is the tracking deviation in the previous trial; w_h1 is a constant determined by information on the dynamic characteristics of the controlled object and by the weight matrix applied to the predicted future tracking deviations; H_F and H_P are constants determined by information on the dynamic characteristics of the controlled object; and W is the weight matrix applied to the predicted future tracking deviations).

[Claim 2] A learning control system characterized in that a trial is repeated so that the output of a controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)   [Equation 1]
(where σ_k(i) is the correction from the previous control input u_{k-1}(i); e_k(i) is the tracking deviation at time i of the k-th trial; and, with
p_m = (c W_m ΔH_m − b W_m H_{m-1}) / (ac − b²), m = 1, 2, …, M   [Equation 2]
these constants are determined by the sample values of the step response of the controlled object and by the weight matrix applied to the predicted future tracking deviations).

[Claim 3] A learning control system characterized in that a trial is repeated so that the output of a controlled object follows a target command repeating the same pattern, and the control input u_k(i) at time i of the k-th trial is given by
u_k(i) = u_{k-1}(i) + σ_k(i)   [Equation 3]
(where σ_k(i) is the correction from the previous control input u_{k-1}(i); e_k(i) is the tracking deviation at time i of the k-th trial; and ΔH_n is the difference value of the sampled step response of the controlled object).
JP3223437A 1991-08-07 1991-08-07 Learning control method Expired - Fee Related JP3039814B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP3223437A JP3039814B2 (en) 1991-08-07 1991-08-07 Learning control method


Publications (2)

Publication Number Publication Date
JPH0546206A true JPH0546206A (en) 1993-02-26
JP3039814B2 JP3039814B2 (en) 2000-05-08

Family

ID=16798135

Family Applications (1)

Application Number Title Priority Date Filing Date
JP3223437A Expired - Fee Related JP3039814B2 (en) 1991-08-07 1991-08-07 Learning control method

Country Status (1)

Country Link
JP (1) JP3039814B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015179401A (en) * 2014-03-19 2015-10-08 カシオ計算機株式会社 Drive device, light source drive device, light source device, projection device and drive method


Also Published As

Publication number Publication date
JP3039814B2 (en) 2000-05-08

Similar Documents

Publication Publication Date Title
KR970003823B1 (en) Control system that best follows periodical setpoint value
JP3516232B2 (en) Method and apparatus for implementing feedback control that optimally and automatically rejects disturbances
KR101175290B1 (en) Adaptive command filtering for servomechanism control systems
EP0709754B1 (en) Prediction controller
JPH0695707A (en) Model forecast controller
KR100267362B1 (en) Preview control apparatus
JP3039814B2 (en) Learning control method
JP3036654B2 (en) Learning control method
JP3109605B2 (en) Learning control method
JP2541163B2 (en) A control method that optimally follows the periodic target value.
JP3039573B2 (en) Learning control method
CN113110045A (en) Model prediction control real-time optimization parallel computing method based on computational graph
JP3191836B2 (en) Learning control device
JP2921056B2 (en) Learning control device by correcting speed command
JPH10323070A (en) Motor controller
JP3597341B2 (en) Globally accelerated learning method for neural network model and its device
JPH0830979B2 (en) Control method that optimally follows the periodic target value
JP2876702B2 (en) Learning control method
JPH04369002A (en) Predictive learning control system based upon approximate step response
JPH05165504A (en) Learning control system by incremental value operation
JP3256950B2 (en) Optimal preview learning control device
JP3281312B2 (en) Adjustment method and adjustment device
JP2000035804A (en) Learning controlling method and learning controller
JP2541166B2 (en) Learning control method
JPH06119004A (en) Learning controller for system having wasteful time output detection

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090303

Year of fee payment: 9

LAPS Cancellation because of no payment of annual fees