JP3872457B2 - Learning adaptive controller

Info

Publication number: JP3872457B2
Application number: JP2003277758A
Authority: JP
Inventors: Jun Nakanishi (中西淳)
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2003-07-22
Filing date: 2003-07-22
Publication date: 2007-01-24
Anticipated expiration: 2023-07-22
Also published as: JP2005044135A

Description

The present invention relates to nonlinear adaptive control, and more particularly to a control apparatus that, when the nonlinear functions or parameters describing the dynamic characteristics of a system are unknown, approximates or identifies them by learning and performs asymptotically stable trajectory tracking.

Conventionally, for the design of trajectory tracking control systems for plants with nonlinear dynamics, such as robots, that contain unknown parameters or whose dynamic characteristics are unknown, model-based adaptive control systems (Non-Patent Documents 3 and 5, listed below) and learning adaptive control systems using nonlinear function approximators (Non-Patent Documents 1, 2, and 4) have been proposed. In addition, Non-Patent Document 6, listed below, proposes feedback error learning as a learning model of the internal model in the human cerebellum, and this has been applied to the control of robot arms.

These control systems generally consist of an element that approximates or identifies the nonlinearity of the system and an error feedback controller, and they approximate or identify the unknown dynamics by means of a parameter update rule. To realize a stable control system in such a learning adaptive control system, the designer must select the gains of the feedback controller.

Conventionally, however, asymptotic stability analysis based on Lyapunov stability theory has offered only a mathematically abstract condition: the asymptotic stability of the control system is guaranteed if the gains are chosen so that the error dynamics are strictly positive real (SPR). No concrete method for selecting the feedback gains has been given for the control of physical systems. Consequently, when a learning adaptive control system is actually designed for a physical system, the specific criteria for selecting feedback gains that guarantee asymptotic stability are unclear, and depending on how the feedback gains are chosen, learning may fail to converge and asymptotically stable trajectory tracking may not be achieved.

Non-Patent Document 1: Choi, J. Y. and Farrell, J. A., "Nonlinear adaptive control using networks of piecewise linear approximations," IEEE Transactions on Neural Networks, Vol. 11, No. 2, pp. 390-401, March 2000.
Non-Patent Document 2: Nakanishi, J., Farrell, J. A., and Schaal, S., "A locally weighted learning composite adaptive controller with structure adaptation," Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 882-889, Lausanne, Switzerland, 2002.
Non-Patent Document 3: Narendra, K. S. and Annaswamy, A. M., Stable Adaptive Systems, Prentice Hall, 1989.
Non-Patent Document 4: Sanner, R. and Slotine, J.-J. E., "Gaussian networks for direct adaptive control," IEEE Transactions on Neural Networks, Vol. 3, No. 6, pp. 837-863, November 1992.
Non-Patent Document 5: Slotine, J.-J. E. and Li, W., Applied Nonlinear Control, Prentice Hall, 1991.
Non-Patent Document 6: Kawato, M., Computational Theory of the Brain (脳の計算理論), Sangyo Tosho, 1996.

As described above, conventional learning adaptive control methods have the problem that the criteria for selecting feedback gains that achieve asymptotically stable trajectory tracking are not concrete for systems with unknown plant dynamics.
Accordingly, an object of the present invention is to provide a learning adaptive control apparatus whose parameters are set by:
1. deriving concrete conditions on the gains of the feedback controller that guarantee the asymptotic stability of the system (convergence of the trajectory tracking error); and
2. in particular, a method of selecting feedback gains that guarantees asymptotic stability for one-input, one-output nonlinear second-order dynamics in the trajectory-error-based learning adaptive control method and in feedback error learning.

A learning adaptive control apparatus according to a first aspect of the present invention is a learning adaptive control apparatus for a one-input, one-output nonlinear second-order system using state feedback. The nonlinear second-order system is represented by

[equation not reproduced]

The learning adaptive control apparatus includes a feedback controller that receives a trajectory tracking error vector e = [e, e⁽¹⁾, e⁽²⁾] between the output of the controlled object and a desired trajectory input, and outputs −Ke with a gain vector K = [K_P, K_D]. It also includes a function approximator that is linear in its parameters and is defined by

[equation not reproduced]

The function approximator uses a value e₁ obtained by applying to the trajectory tracking error vector e the operation

[equation not reproduced]

and uses an estimate vector ^θ of the parameter vector θ that is updated according to

[equation not reproduced]

It approximates f(x) by

[equation not reproduced]

and outputs −^f(x). The apparatus further includes calculating means for generating a control input u to the controlled object based on the output of the feedback controller, the desired acceleration x⁽²⁾ of the output x of the controlled object, and the output of the function approximator. The values of the gains K_P and K_D and of the elements Λ₁ and Λ₂ of the vector c are selected to satisfy the condition
Λ₁ − Λ₂K_D < 0, where Λ₂ > 0, K_P > 0, K_D > 0.

A learning adaptive control apparatus according to a second aspect of the present invention is a learning adaptive control apparatus for a one-input, one-output nonlinear second-order system using state feedback. The nonlinear second-order system is represented by

[equation not reproduced]

The learning adaptive control apparatus includes a feedback controller that receives a trajectory tracking error vector e = [e, e⁽¹⁾, e⁽²⁾] between the output of the controlled object and a desired trajectory input, and outputs −Ke with a gain vector K = [K_P, K_D]. It also includes a function approximator that uses a value e₁ obtained by performing the operation

[equation not reproduced]

and uses an estimate vector ^θ of the parameter vector θ that is updated according to

[equation not reproduced]

to approximate f(x) and output −^f(x). The apparatus further includes calculating means for generating a control input u to the controlled object based on the output of the feedback controller, the desired acceleration x⁽²⁾ of the output x of the controlled object, and the output of the function approximator. The values of the gains K_P and K_D are selected to satisfy the condition K_D² > K_P, where K_P > 0 and K_D > 0.

In the learning adaptive control methods for the above nonlinear systems, asymptotic stability can be guaranteed by selecting the feedback gains according to the very simple criteria proposed in the present invention. A learning adaptive control apparatus according to one embodiment of the present invention is described below.

The symbols "^" and "~" used in the following discussion should properly be written directly above the character that follows them, and they are written that way in the equations. In the text, however, owing to notational constraints, they are written immediately before the character. Likewise, vectors are written in block type in the equations but in ordinary type in the text.

In general, the system to be controlled is represented by

[equation not reproduced]

where the vector x ∈ Rⁿ is the state, the vector z ∈ R^p is the output, the vector u ∈ R^m is the control input, and f: Rⁿ → Rⁿ, g: Rⁿ → Rⁿ, and G: Rⁿ → R^(n×m) are unknown nonlinear functions. h: Rⁿ → R^p denotes the mapping from the state to the output.

In a conventional learning adaptive control system, an input-output linearizing adaptive feedback controller using a function approximator is first constructed for this system, and its error dynamics are derived. It is then known from Lyapunov theory that the condition the error dynamics must satisfy to guarantee asymptotic stability is strict positive realness. For a system with general higher-order dynamics, however, it is not easy to derive concrete criteria for selecting feedback controller gains that satisfy strict positive realness.

Accordingly, for the case where the system dynamics are second order, the present invention applies Theorem 3.2 of Reference 1, listed below, to derive concretely how the gains of the feedback controller should be chosen to guarantee asymptotic stability.

An embodiment of the present invention is described below. Consider a learning control system that, for a system with unknown nonlinearity as shown in FIG. 1, learns the dynamics of the system and achieves asymptotically stable tracking of a target trajectory x_d. The learning adaptive control system 20 shown in FIG. 1 includes plant dynamics 30 with unknown nonlinearity, which is the object of control, and an adaptive controller 32 according to the present embodiment for controlling the plant dynamics 30.

The adaptive controller 32 includes: a subtractor 40 that subtracts the desired trajectory from the output vector x of the plant dynamics 30 and outputs an error vector e; a feedback controller 42 with feedback gain vector K that acts on the error vector e output by the subtractor 40; a subtractor 44 that subtracts the output u_fb of the feedback controller 42 from a value u_ff, which is the desired feedforward command x_d⁽ⁿ⁾; a function approximator 46 that performs nonlinear function approximation on the output vector x of the plant dynamics 30 and outputs u_ad; and an adder 48 that adds the output of the subtractor 44 and the output of the function approximator 46 and supplies the sum to the plant dynamics 30. When n = 2, the feedforward command x_d⁽ⁿ⁾ is x_d⁽²⁾, which represents the acceleration of the desired trajectory.

To simplify the discussion, the present invention considers a one-input, one-output system in which g(x) = 1 is known.

The system of the present embodiment uses a function approximator 46 that is linear in its parameters with respect to the basis functions

[equation not reproduced]

Here, θ is the parameter vector defined by

[equation not reproduced]

and it is estimated by the adaptation law described below.

When f(x) is unknown, the control law for the system is given by

[equation not reproduced]

where the following definition is used:

[equation not reproduced]

^f is the estimate of f and is defined by

[equation not reproduced]

where the vector ^θ is the estimate of the vector θ. u_ad is the control input output from the function approximator 46 to cancel the nonlinear term, u_ff is the feedforward input, and u_fb is the output of the error feedback controller 42. The vector K is the feedback gain vector of the feedback controller 42 and is given by

[equation not reproduced]

The vector e is the trajectory tracking error vector.
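As an illustration only, the signal flow described above (feedforward command minus feedback term plus approximator output) can be sketched in Python for the second-order, single-input case with g(x) = 1. The composition u = u_ff − u_fb + u_ad, with u_fb = Ke and u_ad = −^f(x) = −^θᵀφ(x), is inferred from the block diagram description because the control-law equation itself is not reproduced in this text. The function name `control_input` and the `basis` callable are hypothetical.

```python
def control_input(x, x_dot, xd, xd_dot, xd_ddot, K, theta_hat, basis):
    """Sketch of u = u_ff - u_fb + u_ad for a second-order SISO plant (assumed form)."""
    # Tracking error e = x - x_d, as produced by subtractor 40.
    e = (x - xd, x_dot - xd_dot)
    # Feedforward command: desired acceleration x_d^(2).
    u_ff = xd_ddot
    # Feedback term K e with K = [K_P, K_D].
    u_fb = K[0] * e[0] + K[1] * e[1]
    # Parameter-linear estimate ^f(x) = ^theta^T phi(x); u_ad cancels it.
    phi = basis(x, x_dot)
    f_hat = sum(t * p for t, p in zip(theta_hat, phi))
    u_ad = -f_hat
    return u_ff - u_fb + u_ad
```

With a perfect estimate and zero tracking error, the feedback term vanishes and the input reduces to the feedforward command minus ^f(x).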

Furthermore, the following update rule is used for the unknown parameters:

[equation not reproduced]

where Γ is a positive definite learning coefficient matrix.

If the trajectory tracking error and the trajectory tracking error vector are defined, respectively, as

[equation not reproduced]

then the trajectory tracking error dynamics of the system can be expressed in state-space form as

[equation (8) not reproduced]

where

[equation not reproduced]

and ~θ is the parameter estimation error vector defined by

[equation not reproduced]

The feedback gain vector K must be chosen so that the matrix A is stable (the real parts of all its eigenvalues are negative).

The variable e₁ needed to update the unknown parameters of the function approximator is defined as the output of the error dynamics of equation (8):

[equation not reproduced]

To guarantee asymptotically stable trajectory tracking, however, it is known in adaptive control theory, from Lyapunov stability theory, that the vector c [equation not reproduced] must be chosen so that the realization [equation not reproduced] is minimal (controllable and observable) and the transfer function [equation not reproduced] is strictly positive real (see, for example, Non-Patent Document 1). Feedback error learning as proposed in Non-Patent Document 6 is equivalent to the case where c above is K.
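The adaptation step can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's equation, which is not reproduced in this text: it assumes the gradient-type law d^θ/dt = Γ φ(x) e₁ with e₁ = c e, a scalar learning rate `gamma` standing in for Γ = γI, and a forward-Euler discretization with step `dt`. All function and parameter names are hypothetical.

```python
def update_parameters(theta_hat, phi, e, c, gamma, dt):
    """One forward-Euler step of the assumed law d(theta_hat)/dt = gamma * phi * e1."""
    # Scalar error signal e1 = c e (output of the error dynamics).
    e1 = sum(ci * ei for ci, ei in zip(c, e))
    # Each parameter moves in proportion to its basis function value and e1.
    return [th + dt * gamma * p * e1 for th, p in zip(theta_hat, phi)]
```

Setting `c` equal to the feedback gain vector K gives the feedback error learning case mentioned above, in which e₁ coincides with the feedback term Ke.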

In summary, for general n, the conditions for achieving asymptotically stable trajectory tracking are that K is chosen so that A is a stable matrix and that c is chosen so that

[equation not reproduced]

is strictly positive real.
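Strict positive realness can be spot-checked numerically by sampling Re H(jω) on a frequency grid. The sketch below assumes that A is in companion form with last row −K and that b = [0, ..., 0, 1]ᵀ, so that H(s) = c(sI − A)⁻¹b = (c₁ + c₂s + ... + cₙsⁿ⁻¹)/(sⁿ + Kₙsⁿ⁻¹ + ... + K₁). This structure is an assumption, since the matrices are not reproduced in this text, and the function names are hypothetical. A finite grid can reveal a violation but cannot prove the property, and the stability of A has to be checked separately.

```python
def transfer_real_part(omega, K, c):
    """Re H(j*omega) for H(s) = (sum_i c[i] s^i) / (s^n + sum_i K[i] s^i), n = len(K)."""
    s = complex(0.0, omega)
    num = sum(ci * s ** i for i, ci in enumerate(c))
    den = s ** len(K) + sum(ki * s ** i for i, ki in enumerate(K))
    return (num / den).real


def positive_on_grid(K, c, low_exp=-3.0, high_exp=3.0, samples=601):
    """True if Re H(j*omega) > 0 at every frequency of a log-spaced grid."""
    step = (high_exp - low_exp) / (samples - 1)
    grid = [10.0 ** (low_exp + i * step) for i in range(samples)]
    return all(transfer_real_part(w, K, c) > 0.0 for w in grid)
```

A call such as `positive_on_grid([K_P, K_D], [lam1, lam2])` covers the second-order case; a `False` result means the sampled real part is non-positive somewhere, so the chosen c cannot make H(s) strictly positive real.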

However, the above conditions are mathematically abstract, and they are difficult to verify for a general high-dimensional system. Because physical systems are commonly described by second-order dynamics, the present invention derives concrete and simple conditions for the case n = 2. These conditions do not depend on the system dynamics themselves and can be chosen independently of them.

When n = 2,

[equation not reproduced]

Hereinafter, the gains of the feedback controller are written as the position gain K_P = K₁ and the velocity gain K_D = K₂. Equation (8) then becomes

[equation not reproduced]

Using Theorem 3.2 of Reference 1, the conditions that K and c must satisfy to guarantee asymptotic stability can be obtained concretely as follows. The conditions of Theorem 3.2 of Reference 1 are:

1. The real parts of all eigenvalues of A are negative: K_P and K_D are chosen so that K_P > 0 and K_D > 0.

2. [equation not reproduced]

3. [equation not reproduced]

Evaluating this gives

[equation not reproduced]
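For reference, the n = 2 quantities can be worked out explicitly under one assumption: that A is the companion matrix of s² + K_D s + K_P, with b = [0, 1]ᵀ and c = [Λ₁, Λ₂]. This form is consistent with the requirement K_P > 0 and K_D > 0 in condition 1, but it is not confirmed by the equations missing from this text.

```latex
\begin{aligned}
A &= \begin{bmatrix} 0 & 1 \\ -K_P & -K_D \end{bmatrix}, \quad
b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
c = \begin{bmatrix} \Lambda_1 & \Lambda_2 \end{bmatrix}, \\
H(s) &= c\,(sI - A)^{-1} b = \frac{\Lambda_2 s + \Lambda_1}{s^2 + K_D s + K_P}, \\
cb &= \Lambda_2, \qquad cAb = \Lambda_1 - \Lambda_2 K_D, \\
\operatorname{Re} H(j\omega) &= \frac{\Lambda_1 K_P + \omega^2\,(\Lambda_2 K_D - \Lambda_1)}
{(K_P - \omega^2)^2 + K_D^2\,\omega^2}.
\end{aligned}
```

Under this assumption, requiring cAb < 0 reproduces the inequality Λ₁ − Λ₂K_D < 0 stated in this document, and setting c = K (Λ₁ = K_P, Λ₂ = K_D) turns it into K_D² > K_P. The last line also gives Re H(j0) = Λ₁/K_P, so under this assumed form the low-frequency value is positive only when Λ₁ > 0.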

In summary, the stability conditions for the apparatus of the present embodiment are as follows.
- For the trajectory-tracking-error-based adaptation law, the feedback gain vector K and the vector c should be chosen so that
Λ₁ − Λ₂K_D < 0, where Λ₂ > 0, K_P > 0, K_D > 0   (12)
- For feedback error learning, the vector c equals the feedback gain vector K, so to guarantee asymptotic stability the feedback gains should be chosen so that
K_D² > K_P, where K_P > 0, K_D > 0   (13)
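Conditions (12) and (13) are simple enough to encode directly. The following sketch only transcribes the two inequalities as stated above. The function names are hypothetical, and the numeric values in the usage lines are arbitrary illustrations rather than values from the patent.

```python
def satisfies_condition_12(k_p, k_d, lam1, lam2):
    """Condition (12): lam1 - lam2 * k_d < 0 with lam2 > 0, k_p > 0, k_d > 0."""
    return lam2 > 0 and k_p > 0 and k_d > 0 and (lam1 - lam2 * k_d) < 0


def satisfies_condition_13(k_p, k_d):
    """Condition (13): k_d**2 > k_p with k_p > 0, k_d > 0 (feedback error learning)."""
    return k_p > 0 and k_d > 0 and k_d ** 2 > k_p


if __name__ == "__main__":
    # Arbitrary illustrative gains, not taken from the patent.
    print(satisfies_condition_12(4.0, 2.5, 1.0, 1.0))  # True: 1.0 - 1.0 * 2.5 < 0
    print(satisfies_condition_13(4.0, 1.5))            # False: 1.5**2 = 2.25 is not > 4.0
```

Because feedback error learning corresponds to c = K, `satisfies_condition_13(k_p, k_d)` agrees with `satisfies_condition_12(k_p, k_d, k_p, k_d)` for any gains.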

[Numerical simulation]
To confirm the effect of the present embodiment, numerical simulations were performed. The simulations consider the following physical system with n = 2:

[equation not reproduced]

Here m = 1 is assumed known, and d and k are unknown.

To realize the control system, the system dynamics are first rewritten as

[equation not reproduced]

and a linear function approximator

[equation not reproduced]

with basis functions

[equation not reproduced]

is used. Using the estimate ^f of f, the control input is

[equation not reproduced]

The simulations below show results for the system parameters d = k = 2, the learning coefficient Γ = 2.0I, and the target trajectory x_d(t) = sin(2πt). The initial values are x(0) = 0 and ^θ = 0. In the plots of the simulation results, the upper panel shows the trajectory x as a solid line and the target trajectory x_d as a broken line, the middle panel shows the trajectory tracking error e, and the lower panel shows the parameter estimate ^θ.
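A simulation of this kind can be sketched in a few lines of plain Python. The sketch rests on several assumptions that the text does not confirm because the equations are not reproduced: the plant is taken to be m ẍ + d ẋ + k x = u with m = 1, the basis is φ(x) = [x, ẋ]ᵀ so that θ = [−k, −d]ᵀ, the adaptation law is d^θ/dt = Γ φ e₁ with e₁ = Λ₁e + Λ₂ė, the initial velocity is zero, and integration uses forward Euler with a hypothetical step `dt`. The function name `simulate` and its arguments are also hypothetical.

```python
import math


def simulate(k_p, k_d, lam1, lam2, gamma=2.0, d=2.0, k=2.0, dt=1e-3, t_end=20.0):
    """Forward-Euler simulation of the assumed adaptive loop.

    Returns a list of (t, e, theta_hat_1, theta_hat_2) samples.
    """
    x, x_dot = 0.0, 0.0          # x(0) = 0; zero initial velocity is an assumption
    theta_hat = [0.0, 0.0]       # ^theta(0) = 0
    w = 2.0 * math.pi            # target trajectory x_d(t) = sin(2*pi*t)
    history = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        xd = math.sin(w * t)
        xd_dot = w * math.cos(w * t)
        xd_ddot = -w * w * math.sin(w * t)
        # Tracking errors and the scalar signal e1 = c e.
        e, e_dot = x - xd, x_dot - xd_dot
        e1 = lam1 * e + lam2 * e_dot
        # Control input: feedforward - feedback - estimated nonlinearity.
        phi = (x, x_dot)
        f_hat = theta_hat[0] * phi[0] + theta_hat[1] * phi[1]
        u = xd_ddot - (k_p * e + k_d * e_dot) - f_hat
        # Assumed plant with m = 1: x_ddot = -k x - d x_dot + u.
        x_ddot = -k * x - d * x_dot + u
        history.append((t, e, theta_hat[0], theta_hat[1]))
        # Euler updates of the parameter estimates and the plant state.
        theta_hat = [theta_hat[j] + dt * gamma * phi[j] * e1 for j in range(2)]
        x, x_dot = x + dt * x_dot, x_dot + dt * x_ddot
    return history
```

Calling `simulate(k_p, k_d, lam1, lam2)` with a gain set of interest returns the error and parameter histories, which can then be plotted in the three-panel layout described above. For feedback error learning, pass `lam1 = k_p` and `lam2 = k_d`.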

[Simulation results for tracking-error-based learning adaptive control]
FIG. 2 shows the simulation result for K_P = 5.0, K_D = 1.0, Λ₁ = 0.1, and Λ₂ = 1.0. Because these values satisfy condition (12), asymptotic stability is guaranteed. The result shows that the trajectory tracking error converges to zero and that the system parameters are estimated correctly. The parameter estimate ^θ converges to θ = [−k, −d]^T = [−2, −2]^T.

In contrast, FIG. 3 shows the simulation result for K_P = 5.0, K_D = 1.0, Λ₁ = 4.0, and Λ₂ = 1.0. These values do not satisfy condition (12), so asymptotic stability is not guaranteed, and the trajectory tracking error does not converge to zero. The parameter estimate ^θ does not converge either.

[Simulation results for feedback error learning]
FIG. 4 shows the simulation result for K_P = 5.0 and K_D = 3.0. Because these values satisfy condition (13), asymptotic stability is guaranteed. The result shows that the trajectory tracking error converges to zero and that the system parameters are estimated correctly.

In contrast, FIG. 5 shows the simulation result for K_P = 5.0 and K_D = 1.0. These values do not satisfy condition (13), so asymptotic stability is not guaranteed, and the trajectory tracking error does not converge to zero. The parameter estimate ^θ does not converge either.

The embodiment disclosed herein is merely an example, and the present invention is not limited to the embodiment described above. The scope of the present invention is indicated by each of the claims, read in light of the detailed description of the invention, and includes all modifications within the meaning and scope equivalent to the wording of the claims.

[Reference 1]
Tao, G. and Ioannou, P. A., "Necessary and sufficient conditions for strictly positive real matrices," IEE Proceedings G, Circuits, Devices and Systems, Vol. 137, No. 5, pp. 360-366, 1990.

FIG. 1 is a block diagram of a learning adaptive control apparatus according to one embodiment of the present invention.
FIG. 2 is a graph showing the result of stable trajectory-error-based learning adaptive control.
FIG. 3 is a graph showing the result of unstable trajectory-error-based learning adaptive control.
FIG. 4 is a graph showing the result of stable feedback error learning.
FIG. 5 is a graph showing the result of unstable feedback error learning.

Explanation of symbols

20: learning adaptive control system; 30: plant dynamics; 32: adaptive controller; 40: subtractor; 42: feedback controller; 44: subtractor; 46: function approximator; 48: adder

Claims

1. A learning adaptive control apparatus for a one-input, one-output nonlinear second-order system using state feedback, the nonlinear second-order system being represented by

[equation not reproduced]

the apparatus comprising:
a feedback controller that receives a trajectory tracking error vector e = [e, e⁽¹⁾, e⁽²⁾] between the output of a controlled object and a desired trajectory input, and outputs −Ke with a gain vector K = [K_P, K_D];
a function approximator that, using a value e₁ obtained by performing the operation

[equation not reproduced]

approximates f(x) by

[equation not reproduced]

and outputs −^f(x); and
calculating means for generating a control input u to the controlled object based on the output of the feedback controller, the desired acceleration x⁽²⁾ of the output x of the controlled object, and the output of the function approximator,
wherein the values of the gains K_P and K_D and of the elements Λ₁ and Λ₂ of the vector c are selected to satisfy the condition
Λ₁ − Λ₂K_D < 0, where Λ₂ > 0, K_P > 0, K_D > 0.

2. A learning adaptive control apparatus for a one-input, one-output nonlinear second-order system using state feedback, the nonlinear second-order system being represented by

[equation not reproduced]

the apparatus comprising:
a feedback controller that receives a trajectory tracking error vector e = [e, e⁽¹⁾, e⁽²⁾] between the output of a controlled object and a desired trajectory input, and outputs −Ke with a gain vector K = [K_P, K_D];
a function approximator that, using a value e₁ obtained by performing the operation

[equation not reproduced]

approximates f(x) by

[equation not reproduced]

and outputs −^f(x); and
calculating means for generating a control input u to the controlled object based on the output of the feedback controller, the desired acceleration x⁽²⁾ of the output x of the controlled object, and the output of the function approximator,
wherein the values of the gains K_P and K_D are selected to satisfy the condition
K_D² > K_P, where K_P > 0, K_D > 0.