JP2541166B2 - Learning control method

Info

Publication number: JP2541166B2
Application number: JP63295341A
Authority: JP
Inventors: 裕司中村
Original assignee: Yaskawa Electric Corp
Current assignee: Yaskawa Electric Corp
Priority date: 1988-11-22
Filing date: 1988-11-22
Publication date: 1996-10-09
Anticipated expiration: 2011-10-09
Also published as: JPH02140803A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a learning control method for machine tools, robots, and other equipment that perform repetitive operations.

[Conventional technology]

One method of designing a control system for a repetitive target value is the method proposed in Inoue et al., "High-accuracy control in repetitive operation of a proton synchrotron electromagnet power supply," Transactions of the Institute of Electrical Engineers of Japan C, Vol. 100, No. 7. A major feature of this method is that it uses the control input and the control deviation from one cycle earlier. This enables highly accurate tracking and has the further advantage of removing periodic disturbances.

This method is also applicable when the target value repeats the same pattern intermittently. In that case the control input $u(t)$ at time $t$ is

$$u(t) = u(t') + e(t')$$

where $t'$ is the time in the previous trial that corresponds to time $t$.
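As a minimal sketch of this previous-trial law, the following applies $u(t) = u(t') + e(t')$ sample by sample. The function name and the sample values are illustrative assumptions and do not come from the cited paper.

```python
def next_trial_input(u_prev, e_prev):
    """Return u(t) = u(t') + e(t') for every sample of a trial.

    u_prev and e_prev hold the control input and control deviation
    recorded at the corresponding times t' of the previous trial.
    """
    if len(u_prev) != len(e_prev):
        raise ValueError("input and deviation records must have equal length")
    return [u + e for u, e in zip(u_prev, e_prev)]


# Illustrative records from one previous trial (hypothetical values).
u_prev = [0.0, 0.5, 1.0]
e_prev = [0.25, -0.125, 0.0]
u_next = next_trial_input(u_prev, e_prev)
```

Note that the deviation is taken at the same time index as the input it corrects, which is the point the present invention later criticizes.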

As a predictive control method that minimizes the weighted sum of squares of predicted future control deviations, there is the method described in Japanese Patent Laid-Open No. 62-118405, previously filed by the present applicant.

This method gives the control input $u(i)$ at the current sampling time $i$ in terms of incremental control inputs. The incremental control input $m(i)$ at sampling time $i$ is determined by predicting the future control deviations from the sampled values of the indicial (step) response of the controlled object, the past incremental control inputs, the current output, and the future target values, and then choosing $m(i)$ so that the weighted sum of squares of those predicted deviations is minimized.

Because this method uses future target values, it gives better response characteristics than a control system that uses only the current target value. It also has the advantage that it can be implemented with simple arithmetic operations.

Further, Japanese Patent Laid-Open No. 62-118406 proposes, for a target value that repeats the same pattern, a predictive control method using the control deviation from one trial earlier. It is characterized in that the above incremental control input plus a constant multiple of the control deviation from one trial earlier is given as the new incremental control input.

[Problems to be Solved by the Invention]

However, both of the above design methods for repetitive target values, namely the repetitive control method and the method of Japanese Patent Laid-Open No. 62-118406, use the previous deviation $e_{k-1}(i)$ at the same time $i$ when determining the control input $u_k(i)$ at time $i$. Neither takes into account the operating lag caused by the dynamics of the controlled object or its dead time.

In addition, the method described in the above "High-accuracy control in repetitive operation of a proton synchrotron electromagnet power supply" requires the transfer functions involved to be known.

The object of the present invention is therefore to realize a learning control device with better convergence and stability than before, without the effort of obtaining the transfer function of the controlled object.

[Means and Actions for Solving the Problems]

To achieve this object, the learning control method of the first invention of the present application is used in a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently. The control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + \frac{1}{N}\left\{ e_{k-1}(i+C_1) + e_{k-1}(i+C_2) + \cdots + e_{k-1}(i+C_N) \right\}$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$C_1$ to $C_N$: $N$ suitable integers with $C_1 < C_2 < C_3 < \cdots < C_N$

When the target value is continuous and one period contains $S$ samples,

$$e_{k-1}(j) = e_k(j-S), \quad j > S.$$

The method is characterized in that the integers $C_1, C_2, \ldots, C_N$ are determined in one of two ways. In the first, the step response of the controlled object is measured, the number of samples $\Delta$ from the start of the step command to the time at which the difference of the step response is largest is obtained, and the integers are chosen so that the intermediate time $i + (C_N + C_1)/2$ between times $i + C_1$ and $i + C_N$ equals $i + \Delta$. In the second, the impulse response of the controlled object is measured, the time $\Delta$ at which the impulse response takes its maximum is obtained, and the integers are chosen so that the same intermediate time $i + (C_N + C_1)/2$ equals $i + \Delta$.

In this invention, the control input $u_k(i)$ is the weighted sum of the position deviation $e_k(i)$ and the correction amount $\sigma_k(i)$. The correction amount $\sigma_k(i)$ is the correction amount $\sigma_{k-1}(i)$ at time $i$ in the previous trial plus the average of several future values of the control deviation in the previous trial. In other words, when the previous control deviation is used, the deviation taken is the one that lies later by the lag of the controlled object, and several steps of it are averaged. This achieves learning control with excellent convergence and stability for controlled objects that include a dead-time element.

The learning control method of the second invention of the present application is likewise used in a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently. The control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + h_0 e_{k-1}(i) + h_1 e_{k-1}(i+1) + \cdots + h_N e_{k-1}(i+N)$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$h_0$ to $h_N$: differences of the sampled step response of the controlled object, or samples of its impulse response

When the target value is continuous and one period contains $S$ samples,

$$e_{k-1}(j) = e_k(j-S), \quad j > S.$$

The method is characterized in that the integer $N$ is determined in one of two ways. In the first, the step response of the controlled object is measured, the number of samples $\Delta$ from the start of the step command to the time at which the difference of the step response is largest is obtained, and $N$ is chosen so that the intermediate time $i + N/2$ between times $i$ and $i + N$ equals $i + \Delta$. In the second, the impulse response of the controlled object is measured, the time $\Delta$ at which the impulse response takes its maximum is obtained, and $N$ is chosen so that the intermediate time $i + N/2$ equals $i + \Delta$.

In this second invention, the correction amount $\sigma_k(i)$ is the correction amount $\sigma_{k-1}(i)$ at time $i$ in the previous trial plus the products of the step-response difference at each sampling time and the corresponding future value of the control deviation in the previous trial. The deviation that lies later by the lag of the controlled object is therefore taken into account, and the convergence and stability of the learning control system improve.

[Example]

The present invention is described concretely below.

Fig. 1 shows an embodiment corresponding to the first invention of the present application. Fig. 2 shows the flowchart of the control calculation for that embodiment, and Fig. 3 shows the step response of the controlled object. The controlled object here may already contain a conventional control loop.

In the figure, 1 is a command generator that repeatedly generates the same pattern $\{r(0), r(1), \ldots, r(i_{end})\}$. 2 is a subtractor, and 3, 4, and 5 are multipliers with gains $g_1$, $g_2$, and $1/3$, respectively. The gain of multiplier 5 is set to $1/3$ so that, with $N = 3$ in equation (1) described later, the average of three control deviations is taken. 6 is an adder, and 7 is a memory that stores the correction amounts $\{\sigma(0), \sigma(1), \ldots, \sigma(i_{end})\}$.

8 is an integrator, 9 and 10 are samplers that close with sampling period $T$, and 11 is a hold circuit. 12 is a low-pass filter, and 13 is the controlled object.

The control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is now given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + \frac{1}{N}\left\{ e_{k-1}(i+C_1) + e_{k-1}(i+C_2) + \cdots + e_{k-1}(i+C_N) \right\} \qquad (1)$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants

The first expression of equation (1) means that the control input $u_k(i)$ is the weighted sum of the position deviation $e_k(i)$ and the correction amount $\sigma_k(i)$. The second expression of equation (1) means that the correction amount $\sigma_k(i)$ at time $i$ in the current trial is the correction amount $\sigma_{k-1}(i)$ at time $i$ in the previous trial plus the average of several future values of the control deviation in the previous trial.

The effect of the input $u_k(i)$ at time $i$ appears in the later deviations $e_k(i+C_1)$ to $e_k(i+C_N)$, delayed by the lag of the controlled object. Correcting the control input as in equation (1) therefore improves the behavior of the learning control in the presence of dead time.
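The effect of using the delayed deviation can be sketched with a small simulation. Everything specific here is an assumption made for illustration and is not taken from the patent: the controlled object is a pure dead time of `d` samples, a single offset $C_1 = d$ is used (the case $N = 1$), trials are intermittent and start from rest, and deviations beyond the end of a trial are treated as zero.

```python
def run_trial(r, sigma, g1, g2, d):
    """Simulate one trial on an assumed pure-delay plant y(i) = u(i - d)."""
    n = len(r)
    u = [0.0] * n
    e = [0.0] * n
    for i in range(n):
        y = u[i - d] if i >= d else 0.0    # plant output, at rest before time 0
        e[i] = r[i] - y                    # control deviation e_k(i)
        u[i] = g1 * e[i] + g2 * sigma[i]   # u_k(i) = g1 e_k(i) + g2 sigma_k(i)
    return e


def learn(r, g1, g2, d, offsets, trials):
    """Repeat trials, updating sigma from the future deviations of the last trial."""
    n = len(r)
    sigma = [0.0] * n
    history = []                           # max |e| observed in each trial
    for _ in range(trials):
        e = run_trial(r, sigma, g1, g2, d)
        history.append(max(abs(x) for x in e))
        # sigma_k(i) = sigma_{k-1}(i) + average of e_{k-1}(i + C_j);
        # deviations past the end of the trial are taken as zero (assumption).
        sigma = [
            sigma[i]
            + sum(e[i + c] if i + c < n else 0.0 for c in offsets) / len(offsets)
            for i in range(n)
        ]
    return history


d = 3
r = [0.0] * d + [1.0] * 9                  # target is zero during the dead time
history = learn(r, g1=0.2, g2=0.5, d=d, offsets=[d], trials=30)
```

With these assumed gains the maximum deviation shrinks from trial to trial, because the deviation used to correct $\sigma_k(i)$ is the one that the input at time $i$ actually influences. Wider averaging windows ($N > 1$) are not exercised by this sketch.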

When the target value is continuous and one period contains $S$ samples,

$$e_{k-1}(j) = e_k(j-S), \quad j > S \qquad (2)$$

For example, with $S = 100$, $N = 5$, $\Delta = 3$ and $C_1 = 1$, $C_2 = 2$, $C_3 = 3$, $C_4 = 4$, $C_5 = 5$, the correction amount $\sigma_k(98)$ at time 98 in the $k$-th trial is

$$\sigma_k(98) = \sigma_{k-1}(98) + \frac{1}{5}\left\{ e_{k-1}(99) + e_{k-1}(100) + e_k(1) + e_k(2) + e_k(3) \right\}$$

Because the deviations used are those several steps later, the latter part of a trial needs deviations from the current trial rather than the previous one. The control deviation is therefore expressed using relation (2).
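The index bookkeeping in this example can be checked with a short sketch. It assumes that sample times within a period are numbered 1 to $S$, as the example itself suggests; the function name and the string labels for the trials are hypothetical.

```python
def sigma_update_terms(i, offsets, S):
    """List which deviations are averaged when updating sigma_k(i).

    Each entry is (trial, time). A time i + C that runs past the end of
    the period is mapped into the current trial by relation (2):
    e_{k-1}(j) = e_k(j - S) for j > S.
    """
    terms = []
    for c in offsets:
        j = i + c
        if j > S:
            terms.append(("k", j - S))     # wrapped into the current trial
        else:
            terms.append(("k-1", j))       # still inside the previous trial
    return terms


# The worked example: S = 100, offsets 1..5, time i = 98.
terms = sigma_update_terms(98, [1, 2, 3, 4, 5], 100)
```

The result names $e_{k-1}(99)$, $e_{k-1}(100)$, $e_k(1)$, $e_k(2)$, and $e_k(3)$, matching the expression for $\sigma_k(98)$.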

Taking the step response of Fig. 3 as an example, the largest difference of the step response of the controlled object is $h_4$, at time $4T$.

Accordingly, choosing $N = 3$ and $\{C_1, C_2, C_3\} = \{3, 4, 5\}$ so that the intermediate time between sampling times $i + C_1$ and $i + C_N$ is $i + 4$, equation (1) becomes

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + \frac{1}{3}\left\{ e_{k-1}(i+3) + e_{k-1}(i+4) + e_{k-1}(i+5) \right\} \qquad (3)$$

This is calculated according to the flowchart of Fig. 2.

At the time when the difference of the step response in Fig. 3 is largest, the impulse response of the same controlled object can be said to take its maximum. The integers $C_1, C_2, \ldots, C_N$ can therefore also be determined by finding the time $\Delta$ at which the impulse response takes its maximum and choosing them so that the intermediate time $i + (C_N + C_1)/2$ between times $i + C_1$ and $i + C_N$ equals $i + \Delta$.
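The selection rule for the offsets can be sketched as follows. The step-response samples are made-up values chosen so that the largest difference falls at sample 4, as in the Fig. 3 example; the assumption that the output is zero before the step command, the restriction to odd $N$, and both function names are illustrative choices rather than part of the patent.

```python
def delay_from_step_response(step):
    """Return the sample count Delta at which the step-response difference peaks.

    step[j] is the output j samples after the step command starts.
    The output before the command is assumed to be zero, so h_0 = step[0].
    """
    diffs = [step[0]] + [step[j] - step[j - 1] for j in range(1, len(step))]
    return max(range(len(diffs)), key=diffs.__getitem__)


def centered_offsets(delta, n):
    """Return n consecutive integers C_1..C_N whose midpoint is delta (n odd)."""
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of offsets")
    half = (n - 1) // 2
    if delta - half < 0:
        raise ValueError("window would reach before time i")
    return list(range(delta - half, delta + half + 1))


# Hypothetical sampled step response with its steepest rise at sample 4.
step = [0.0, 0.0, 0.125, 0.25, 0.5, 0.625, 0.6875, 0.6875]
delta = delay_from_step_response(step)
offsets = centered_offsets(delta, 3)
```

For an impulse response the same idea applies with `delta` taken directly as the index of the largest sample.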

Next, Fig. 4 shows an embodiment of the second invention of the present application, and Fig. 5 shows the flowchart of its control calculation.

In the figure, 21 is a command generator that generates the same pattern $\{r(0), r(1), \ldots, r(i_{end})\}$ once per trial. 22 is a subtractor, and 23, 24, and 25 are multipliers with gains $g_1$, $g_2$, and $h_0, h_1, \ldots, h_N$, respectively. 26 is an adder, and 27 is a memory that stores the correction amounts $\{\sigma(0), \sigma(1), \ldots, \sigma(i_{end})\}$ for one trial.

28 is an integrator, 29 and 30 are samplers, and 31 is a hold circuit. 32 is a low-pass filter, and 33 is the controlled object.

The controlled object here may already contain a conventional control loop.

Suppose that the differences of the sampled step response of controlled object 33 are $\{h_0, h_1, \ldots, h_N\}$, as shown in Fig. 6. Following the flowchart of Fig. 5, the control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is then given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + h_0 e_{k-1}(i) + h_1 e_{k-1}(i+1) + \cdots + h_N e_{k-1}(i+N) \qquad (4)$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$h_0$ to $h_N$: differences of the sampled step response of the controlled object, or samples of its impulse response

The first expression of equation (4) means that the control input $u_k(i)$ is the weighted sum of the position deviation $e_k(i)$ and the correction amount $\sigma_k(i)$. The second expression of equation (4) means that the correction amount $\sigma_k(i)$ at time $i$ in the current trial is the correction amount $\sigma_{k-1}(i)$ at time $i$ in the previous trial plus the sum of the step-response differences of the controlled object (see Fig. 6), each multiplied by the future value of the control deviation at the corresponding sampling time.

The effect of the input appears in the deviation that lies later by the lag of the controlled object, and the strength of that effect is proportional to the step-response difference. Correcting the control input as in equation (4) therefore improves the behavior of the learning control in the presence of dead time.
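The weighted update in the second expression of equation (4) can be sketched as a single function. The handling of deviations beyond the end of a trial (treated as zero, as for an intermittent target) and the numeric values are assumptions made for illustration.

```python
def weighted_sigma_update(sigma_prev, e_prev, h):
    """Return sigma_k from sigma_{k-1} and the previous trial's deviations.

    sigma_k(i) = sigma_{k-1}(i) + sum_j h[j] * e_{k-1}(i + j), where h[j]
    are step-response differences (or impulse-response samples).
    Deviations past the end of the trial are treated as zero (assumption).
    """
    n = len(sigma_prev)
    return [
        sigma_prev[i]
        + sum(h[j] * e_prev[i + j] for j in range(len(h)) if i + j < n)
        for i in range(n)
    ]


# Hypothetical values: the plant responds most strongly one sample later.
h = [0.0, 0.5, 0.25]
e_prev = [1.0, 2.0, 4.0, 8.0]
sigma_new = weighted_sigma_update([0.0, 0.0, 0.0, 0.0], e_prev, h)
```

Compared with the plain average of the first invention, each future deviation is weighted by how strongly the input at time $i$ affects the output at that later time.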

If a steady-state deviation appears, the output $u_k(i)$ of the learning controller can be passed through an integrator, and the result applied to the controlled object as the control input.
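A minimal sketch of this arrangement follows, assuming the integrator is a plain running sum of the controller outputs that starts from zero. The patent does not give the integrator's form or scaling, so this is only one possible reading.

```python
from itertools import accumulate


def integrate_outputs(controller_outputs):
    """Running sum of the learning controller outputs u_k(i).

    Each element of the result is the value applied to the controlled
    object at that sample (assumed unscaled, starting from zero).
    """
    return list(accumulate(controller_outputs))


# Hypothetical controller outputs over three samples.
plant_inputs = integrate_outputs([0.5, 0.25, -0.25])
```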

Once the control deviation has converged to the desired value or below after sufficient repetition, learning is complete. The control input $u_k(i)$ can then be given using the correction amounts $\sigma$ for one trial stored in the memory:

$$u_k(i) = g_1 e_k(i) + g_2 \sigma(i)$$

In this case the correction amount $\sigma(i)$ no longer needs to be calculated.
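After learning, the controller reduces to a table lookup plus proportional feedback. The sketch below assumes the stored correction amounts are available as a list indexed by sample time; the function names and values are hypothetical.

```python
def make_learned_controller(sigma, g1, g2):
    """Build a controller that uses a fixed, already-learned sigma table."""
    table = tuple(sigma)                   # frozen copy, never updated again

    def control_input(i, e_i):
        # u_k(i) = g1 * e_k(i) + g2 * sigma(i), with sigma read from memory
        return g1 * e_i + g2 * table[i]

    return control_input


# Hypothetical learned table and gains.
controller = make_learned_controller([0.0, 1.5, 2.0], g1=0.5, g2=2.0)
u1 = controller(1, 0.25)
```

Only the current deviation has to be measured at run time; the deviation history of earlier trials is no longer needed.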

[Effect of the invention]

As described above, when the previous control deviation is used, the present invention either takes the deviation that lies later by the lag of the controlled object and averages several steps of it, or uses the lag of the controlled object in the form of the step-response differences. Because the lag of the controlled object is taken into account in this way, a learning control method with better convergence and stability than before can be realized without the effort of obtaining the transfer function of the controlled object.

[Brief description of drawings]

Fig. 1 is a block diagram showing an embodiment of the first invention of the present application.
Fig. 2 is a flowchart of the control calculation in the first invention.
Fig. 3 is a characteristic diagram of the step response of the controlled object.
Fig. 4 is a block diagram showing an embodiment of the second invention of the present application.
Fig. 5 is a flowchart of the control calculation in the second invention.
Fig. 6 is a characteristic diagram of the step response of the controlled object.

1: command generator, 2: subtractor
3, 4, 5: multipliers, 6: adder
7: memory, 8: integrator
9, 10: samplers, 11: hold circuit
12: low-pass filter, 13: controlled object
21: command generator, 22: subtractor
23, 24, 25: multipliers, 26: adder
27: memory, 28: integrator
29, 30: samplers, 31: hold circuit
32: low-pass filter, 33: controlled object

Claims

(57) [Claims]

1. A learning control method for a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently, wherein the control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + \frac{1}{N}\left\{ e_{k-1}(i+C_1) + e_{k-1}(i+C_2) + \cdots + e_{k-1}(i+C_N) \right\}$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$C_1$ to $C_N$: $N$ suitable integers with $C_1 < C_2 < C_3 < \cdots < C_N$

provided that, when the target value is continuous and one period contains $S$ samples, $e_{k-1}(j) = e_k(j-S)$ for $j > S$; and wherein the step response of the controlled object is measured, the number of samples $\Delta$ from the start of the step command to the time at which the difference of the step response is largest is obtained, and the integers $C_1, C_2, \ldots, C_N$ are determined so that the intermediate time $i + (C_N + C_1)/2$ between times $i + C_1$ and $i + C_N$ equals $i + \Delta$.

2. A learning control method for a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently, wherein the control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + \frac{1}{N}\left\{ e_{k-1}(i+C_1) + e_{k-1}(i+C_2) + \cdots + e_{k-1}(i+C_N) \right\}$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$C_1$ to $C_N$: $N$ suitable integers with $C_1 < C_2 < C_3 < \cdots < C_N$

provided that, when the target value is continuous and one period contains $S$ samples, $e_{k-1}(j) = e_k(j-S)$ for $j > S$; and wherein the impulse response of the controlled object is measured, the time $\Delta$ at which the impulse response takes its maximum is obtained, and the integers $C_1, C_2, \ldots, C_N$ are determined so that the intermediate time $i + (C_N + C_1)/2$ between times $i + C_1$ and $i + C_N$ equals $i + \Delta$.

3. A learning control method for a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently, wherein the control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + h_0 e_{k-1}(i) + h_1 e_{k-1}(i+1) + \cdots + h_N e_{k-1}(i+N)$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$h_0$ to $h_N$: differences of the sampled step response of the controlled object, or samples of its impulse response

provided that, when the target value is continuous and one period contains $S$ samples, $e_{k-1}(j) = e_k(j-S)$ for $j > S$; and wherein the step response of the controlled object is measured, the number of samples $\Delta$ from the start of the step command to the time at which the difference of the step response is largest is obtained, and the integer $N$ is determined so that the intermediate time $i + N/2$ between times $i$ and $i + N$ equals $i + \Delta$.

4. A learning control method for a control system that applies a control input so that the output of a controlled object matches a target value repeating the same pattern continuously or intermittently, wherein the control input $u_k(i)$ at sampling time $i$ in the $k$-th trial is given by

$$u_k(i) = g_1 e_k(i) + g_2 \sigma_k(i)$$

$$\sigma_k(i) = \sigma_{k-1}(i) + h_0 e_{k-1}(i) + h_1 e_{k-1}(i+1) + \cdots + h_N e_{k-1}(i+N)$$

where

$e_k(i)$: control deviation at time $i$ in the $k$-th trial
$\sigma_k(i)$: correction amount at time $i$ in the $k$-th trial
$g_1, g_2$: suitable positive constants
$h_0$ to $h_N$: differences of the sampled step response of the controlled object, or samples of its impulse response

provided that, when the target value is continuous and one period contains $S$ samples, $e_{k-1}(j) = e_k(j-S)$ for $j > S$; and wherein the impulse response of the controlled object is measured, the time $\Delta$ at which the impulse response takes its maximum is obtained, and the integer $N$ is determined so that the intermediate time $i + N/2$ between times $i$ and $i + N$ equals $i + \Delta$.

5. The learning control method according to claim 1, wherein the output $u_k(i)$ of the learning controller is passed through an integrator and the result is applied to the controlled object as the control input.