JPH03182910A - Digital adaptive controller

Info

Publication number: JPH03182910A
Application number: JP1322043A
Authority: JP
Inventors: Takao Baba; 孝夫馬場
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1989-12-12
Filing date: 1989-12-12
Publication date: 1991-08-08
Anticipated expiration: 2014-04-12
Also published as: JP2881873B2

Abstract

PURPOSE: To attain highly precise positioning control by arranging an iterative learning unit and a multilayer neural circuit model type learning unit so as to constitute a two-stage learning unit. CONSTITUTION: A position detector 5 capable of measuring the shaft-end position θa of a mechanical load 4 with high precision is provided, and the position error ea relative to a command value θd is detected. Based on this error information, an iterative learning controller 8 first learns a highly precise positioning command sequence for each control sampling period. A multilayer neural circuit model 9 is then trained with this command sequence as a teacher signal so as to generate a dynamic model of error occurrence. Because learning control is executed in two stages, adaptive and highly precise positioning control, which is difficult to realize with a single learning unit, can be achieved.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

This invention relates to a digital adaptive control device that achieves highly precise positioning control in position control devices for machine tools, robots, and the like by compensating for disturbances caused by the external environment and for motion errors inherent in the machine.

[Conventional technology]

Known prior art similar to the present invention includes, for example, "Control of a Manipulator by an Inverse Dynamics Model Learned in a Multilayer Neural Network" by Setoyama et al. (IEICE Technical Report MBE87-135, 1987), presented at the technical group on medical electronics and bioengineering of the Institute of Electronics, Information and Communication Engineers. FIG. 2 shows the control device disclosed in this document. In the figure, (1) is a command value generator, (2) is a servo controller, (3) is a motor drive unit, (4) is a mechanical load, (9) is a multilayer neural circuit model, (13) and (18) are servo gains, (14) is an adder, (15) is a drive torque, and (17) is a teacher signal.

Next, the operation is described. The position command value θd generated by the command value generator (1) is supplied to the servo controller (2) at every control sampling period ΔT. The servo controller (2) is additionally provided with a first-order differentiator (11) and a second-order differentiator (12) acting on the command value θd, and with the multilayer neural circuit model (9). The distinguishing feature of this servo controller (2) is its learning function. While the multilayer neural circuit model (9) has not yet learned the dynamics of the servo system, the servo controller (2) operates as an ordinary feedback servo control system. That is, the position command value θd from the command value generator (1) is compared with the rotational position θm of the shaft end of the servo motor that drives the mechanical load (4), a servo calculation (proportional control only, in this case) is performed on the deviation em (= θd − θm), and the torque command value Tf (= Kf * em) (16) is determined.

The multilayer neural circuit model (9) is provided to emulate this feedback control system through learning. It is a three-layer perceptron-type neural circuit model whose inputs are the position command value θd, its first derivative (the velocity command value θ̇d), and its second derivative (the acceleration command value θ̈d). The torque command value T sent to the actuator that drives the mechanical load is the sum of the torque command value Ti of the feedforward servo system and the torque command value Tf of the feedback servo system. Because the model (9) is configured to use the feedback torque command value Tf (16) as its teacher signal, learning proceeds so that the model comes to output Tf in response to the command value input. The cited document uses the well-known backpropagation method as the learning algorithm of the multilayer neural circuit model.
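The signal flow of this prior-art controller can be sketched as follows. This is a minimal illustration, assuming scalar signals and a single proportional gain; the function names and the numeric values are hypothetical and do not come from the cited document.

```python
def feedback_torque(theta_d: float, theta_m: float, k_f: float) -> float:
    """Proportional feedback torque T_f = K_f * e_m, with e_m = theta_d - theta_m."""
    e_m = theta_d - theta_m
    return k_f * e_m


def total_torque(t_i: float, t_f: float) -> float:
    """Torque command to the actuator: feedforward part plus feedback part."""
    return t_i + t_f


# Hypothetical values: command 1.0, motor shaft at 0.75, gain 4.0,
# and a feedforward contribution of 0.5 from the neural circuit model.
t_f = feedback_torque(theta_d=1.0, theta_m=0.75, k_f=4.0)
t = total_torque(t_i=0.5, t_f=t_f)
```

Because the teacher signal is T_f itself, the best the model can do in this arrangement is reproduce the feedback controller, which is the limitation discussed in the next section.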

[Problem to be solved by the invention]

In the conventional control device configured as described above, learning proceeds with the feedback signal Tf (16) as the teacher signal. Because the teacher signal is merely the torque command value Tf of the feedback control system, the neural circuit model can do no more than substitute for the servo controller (2), even when learning is fully complete. A feedback control system generally contains delay elements, so positioning accuracy is guaranteed after a certain time, but transient dynamic characteristics cannot be improved. A substitute for the feedback control system therefore cannot improve these dynamics either, nor, naturally, can it compensate the position of the mechanical load shaft end.

Transient dynamics and mechanical errors are commonly improved by so-called feedforward control, in which the dynamic characteristics are observed in advance and the anticipated error is compensated ahead of time. In the above method, however, although the feedforward control system is built from the multilayer neural circuit model (9), the error ea (= θd − θa) at the mechanical load shaft end that arises from such dynamics cannot be reduced accurately.

Two remedies are conceivable. The first is to install a detector at the mechanical load shaft end and form a fully closed-loop feedback servo system. The second is to observe the position signal θa at the mechanical load shaft end and build a multilayer neural circuit model that uses the command error ea (= θd − θa) as its teacher signal. With the first method, the dynamics of the machine are included in the servo loop, so the gain cannot be raised and a fast response cannot be achieved. With the second method, the dynamics of the mechanical load (4) lie between the output T of the multilayer neural circuit model (9) and the observed position signal θa, so a teacher signal for learning cannot be derived from ea, and backpropagation, the general learning algorithm, cannot be applied.

This invention was made to solve the above problems. Its object is to obtain a digital adaptive control device capable of highly precise positioning control by forming a two-stage learning unit. The first stage is an iterative learning controller that uses the position error relative to the command value, detected by a position detector able to measure the shaft-end position of the mechanical load (4) with high precision. The second stage is a multilayer neural circuit model type learning unit that uses as its teacher signal the desired servo command value sequence produced by the first stage.

[Means to solve the problem]

The digital adaptive control device according to this invention is a control device that moves a controlled object to a target position commanded from the outside, and comprises: position error detection means for detecting the position error between the externally commanded target position and the actual position of the controlled object; an iterative learning unit that applies an iterative learning control law to the position error and calculates, at every control sampling, a position command correction value corresponding to the position command value; storage means for storing the position command correction values over the external command values; an error-propagation type layered neural circuit model that receives the position command value as input and uses the position command correction value as its teacher signal; and means for stopping the function of the iterative learning unit when learning of the neural circuit model is complete.

[Operation]

In the digital adaptive control device according to this invention, the iterative learning controller first learns a highly precise positioning command sequence for every control sampling period ΔT, and the multilayer neural circuit model is then trained with this command sequence as the teacher signal to generate a dynamic model of error occurrence. This reduces the required memory capacity and allows the teacher signal of the neural circuit model type learning unit to be handled effectively.

[Example]

An embodiment of this invention is described below with reference to the drawings. In FIG. 1, (1) is a command generator that generates the movement command value per unit sampling period ΔT from an operation program such as an NC program, (2) is a motor servo controller, (3) is a motor drive unit, (4) is a mechanical load coupled to the servo motor, (5) is a position detector that precisely measures the position of the mechanical load, (6) is an iterative learning controller, (7) is a storage device that holds the command correction sequence learned by the iterative learning controller, (9) is a multilayer neural circuit model that is trained, with the command correction sequence as the teacher signal, to output that sequence, (10) is a generalization determination mechanism that compares the command correction sequence output by the iterative learning controller with the output of the multilayer neural circuit model and determines whether learning of the model is complete, and (8) is an iterative learning control mechanism comprising the learning controller (6) and the storage device (7).

Next, the operation is described. In NC machine tools and similar equipment, the motion of the machine is generally expressed in a standard NC language.

This operation program is decoded by the NC device. The command generator (1) calculates the amount of movement per control cycle ΔT specific to the NC device and supplies it to the servo control system that controls the servo motor (3) connected to the mechanical load (4). The mechanical load (4) is driven by this servo system and moves as specified in the NC program.

In high-precision machining, what matters is the actual machining trajectory θa at the shaft end of the mechanical load (4). In an ordinary digital servo control device, θa deviates from the command value because of servo lag, the inertia of the mechanical load, and nonlinear disturbances such as friction. This deviation becomes a machining error, which is a serious problem for the control device of a machine tool intended for high-precision machining.

Such errors have generally been removed in two ways: servo lag is handled by making the machining speed extremely slow, and nonlinear disturbances of the kind mentioned above are handled by adding predetermined correction values to the command value.

Here, this error is obtained by providing a position detector (5), such as a linear scale or a laser length measuring instrument, that detects the shaft-end position θa of the mechanical load, and taking the difference from the command value θd issued by the command generator (1).

The iterative learning control mechanism (8) performs control to reduce the machining trajectory error ea (= θd − θa) obtained in this way. It contains an iterative learning control calculation unit (6) and a large-capacity storage device (7) that stores the time-series output of the calculation unit (6) for every control sampling period ΔT. A correction data sequence corresponding to the time-series command values of each operation command program is stored in this storage device (7).

FIG. 3(a) is a block diagram showing the learning operation of the iterative learning control mechanism (8) for each trial. Physically, it consists of a single set of the elements needed for one trial: the memory (7), the servo gain (18), the motor drive unit (3), the mechanical load (4), and the learning control calculation unit (6).

The idea behind the iterative learning control mechanism configured in this way is as follows. For a given trajectory command θd (for example, a circular arc command), the servo system is driven, the actual trajectory θa of the mechanical load end is detected, and the resulting error ea (= θd − θa) is used to estimate how much the command value θd should be corrected. The resulting error correction sequence m is stored in memory. This operation is repeated over many trials with the aim of finally driving the trajectory error ea to zero.

The error correction sequence m is calculated by the learning control calculation unit (6). The storage device (7) that holds m is organized as shown in FIG. 3(b): the error correction amount for the command value θd(j) (j = 1, ..., n) is held as $m^k(j)$ and is rewritten on every trial.

Consequently, once learning has progressed and the error ea has become zero, the storage device (7) holds the error correction sequence m(1), ..., m(n) corresponding to the command value sequence θd(1), ..., θd(n).
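The storage scheme just described, one correction value per command sample and overwritten after each trial, might look like the sketch below. The class and method names are illustrative assumptions and are not taken from FIG. 3(b).

```python
class CorrectionMemory:
    """Holds one correction value per command sample (illustrative names)."""

    def __init__(self, n: int) -> None:
        # m(1), ..., m(n) are kept at list indices 0, ..., n-1.
        self.m = [0.0] * n

    def rewrite(self, new_series: list[float]) -> None:
        """Overwrite the stored series with the result of the latest trial."""
        if len(new_series) != len(self.m):
            raise ValueError("series length must match the number of command samples")
        self.m[:] = new_series

    def correction_for(self, j: int) -> float:
        """Return m(j) for the 1-based command index j used in the text."""
        return self.m[j - 1]


memory = CorrectionMemory(3)
memory.rewrite([0.1, 0.2, 0.3])      # after trial k
memory.rewrite([0.0, 0.05, 0.1])     # after trial k+1, replacing the old values
```

Only the most recent series survives, but one such table is still needed per program and per operating condition, which is what drives the storage requirement discussed later.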

One set of this error correction sequence is determined for each program p. The final values obtained by the iterative learning control mechanism (8) are denoted $m_p^\infty(1), m_p^\infty(2), \ldots, m_p^\infty(n)$.

Next, the calculation of the error correction sequence by the iterative learning control mechanism is described in detail.

When the operation program $f_p(x)$ with program number p is executed, the command value sequence θd(1), θd(2), ..., θd(n) is generated, one value per control sampling period ΔT. At the same time, for this one trial of the operation program, the corresponding error correction sequence $m_p^k(1), m_p^k(2), \ldots, m_p^k(n)$ is generated, where k is the number of times the program has been executed.

If T is the total time required to execute the operation program $f_p$, then n is the integer given by

$$n = T / \Delta T \tag{1}$$
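As a small worked example of equation (1), the sketch below computes the number of command samples for a hypothetical program. The rounding step is an assumption added here to absorb floating-point error in the division; the duration and sampling period are made-up values.

```python
def sample_count(total_time_s: float, delta_t_s: float) -> int:
    """Equation (1): n = T / ΔT, rounded to the nearest integer."""
    return round(total_time_s / delta_t_s)


# Hypothetical program lasting 2 s with a 1 ms control sampling period.
n = sample_count(total_time_s=2.0, delta_t_s=0.001)
```

For these values n is 2000, so each stored correction sequence for this program would hold 2000 entries.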

The correction sequence $m_p^k$ is normally determined by the following expression:

$$m_p^k(i) = K_{LP}\, e_a^k(i) + K_{LD}\,\bigl(e_a^k(i) - e_a^k(i-1)\bigr) + K_{LI} \sum_{j=1}^{i} e_a^k(j) \tag{2}$$

Here $K_{LP}$ is the proportional gain of the iterative learning control mechanism, $K_{LD}$ is the differential gain, and $K_{LI}$ is the integral gain.
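Equation (2) can be evaluated over a whole trial as in the sketch below. Two points are assumptions made for illustration: the integral term is taken as the running sum of the error up to the current sample, and the error before the first sample is taken as zero. The function name and the sample values are hypothetical.

```python
from itertools import accumulate


def correction_series(e_a, k_lp, k_ld, k_li):
    """PID-type correction of equation (2) for one trial's error series e_a."""
    running_sum = list(accumulate(e_a))          # integral term, summed up to sample i
    m = []
    for i, e in enumerate(e_a):
        prev = e_a[i - 1] if i > 0 else 0.0      # assumption: error is 0 before the first sample
        m.append(k_lp * e + k_ld * (e - prev) + k_li * running_sum[i])
    return m


# Hypothetical error samples and gains.
m_k = correction_series([1.0, 2.0, 4.0], k_lp=2.0, k_ld=1.0, k_li=0.5)
```

Each entry combines the current error, its change since the previous sample, and its accumulated sum, weighted by the three learning gains.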

The next time the operation program $f_p(x)$ is executed, the previous error correction sequence $m_p^k$ is added to the program command value θd, and the result is supplied to the servo system as the new command value $u^{k+1}$:

$$u^{k+1}(i) = \theta_d(i) + m_p^k(i), \quad i = 1, \ldots, n \tag{3}$$

The trajectory error is therefore expected to decrease at the next execution of the operation program. In general, however, the servo system and the mechanical load have dynamics and the response lags the command, so the trajectory error does not necessarily decrease substantially in a single step. Nevertheless, if the above operation is repeated, the trajectory error keeps decreasing and eventually converges to below a certain value, at which point the error correction sequence $m_p^\infty$ is obtained. This is the learning operation of the iterative learning control mechanism (8).
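The trial-by-trial loop can be illustrated with a toy simulation. Everything about the plant is an assumption: a first-order lag stands in for the servo system and mechanical load, the command is a ramp, and only the proportional term of equation (2) is used so that the behaviour is easy to follow. None of these values come from the patent.

```python
def plant(u, alpha=0.5):
    """Hypothetical first-order lag standing in for servo system plus mechanical load."""
    theta_a, out = 0.0, []
    for u_i in u:
        theta_a += alpha * (u_i - theta_a)       # output moves part of the way to the command
        out.append(theta_a)
    return out


def run_trials(theta_d, k_lp, trials):
    """Repeat the program, applying equation (3) before each run and equation (2) after it."""
    m = [0.0] * len(theta_d)                     # no correction before the first trial
    peak_errors = []
    for _ in range(trials):
        u = [d + c for d, c in zip(theta_d, m)]  # equation (3): corrected command
        theta_a = plant(u)
        e_a = [d - a for d, a in zip(theta_d, theta_a)]
        peak_errors.append(max(abs(e) for e in e_a))
        m = [k_lp * e for e in e_a]              # equation (2), proportional term only
    return m, peak_errors


theta_d = [0.01 * (i + 1) for i in range(200)]   # ramp command, one value per sample
m_final, peak_errors = run_trials(theta_d, k_lp=0.5, trials=6)
```

With these particular settings the peak error in later trials is smaller than in the first trial but settles at a nonzero level rather than vanishing, which matches the text's statement that the error converges to below a certain value.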

Adding the iterative learning unit (6) to a conventional control system in this way makes it possible to reduce the trajectory error. However, a separate correction sequence $m_p^k$ must be stored for each operation program number and for each change in operating conditions such as machining speed, so a large-capacity storage device is needed in practice. In this invention, a multilayer neural circuit model (9) is therefore added to the iterative learning control mechanism (8). As is well known, a multilayer neural circuit model has a so-called generalization capability: through learning, it self-organizes the dynamics of a system from the relationship between its inputs and outputs. Here this property is used to self-organize the correction command generating system from the iterative learning data $m_p^\infty(1), \ldots, m_p^\infty(n)$ already obtained, thereby constructing a general-purpose feedforward control mechanism.

The multilayer neural circuit model used in this invention is a perceptron-type neural circuit model with a layered structure of three or more layers, in which neurons are arranged in layers. FIG. 4 shows a three-layer perceptron-type neural circuit model according to one embodiment of the invention. In the figure, (24a), (24b), ..., (25a), (25b), ..., (26a), (26b), ... are input-layer cells, (27a), (27b), ..., (28a), (28b), ..., (29a), (29b), ... are hidden-layer cells, and (30) is the output-layer cell. In this embodiment the command value θd, the velocity θ̇d, and the acceleration θ̈d are input to the neural circuit model through delay elements (21a), (21b), (22a), (22b), (23a), (23b). If the numbers of inputs for the position command value θd, the velocity command value θ̇d, and the acceleration command value θ̈d are N1, N2, and N3 respectively, the network receives

θd(i), θd(i−1), θd(i−2), ..., θd(i−N1+1),
θ̇d(i), θ̇d(i−1), θ̇d(i−2), ..., θ̇d(i−N2+1),
θ̈d(i), θ̈d(i−1), θ̈d(i−2), ..., θ̈d(i−N3+1).

The input-output characteristic of each cell in the neural circuit model is modeled by a sigmoid function. For an input x, the sigmoid function g(x) is

$$g(x) = \frac{2}{1 + \exp(-x)} - 1 \tag{4}$$

The input $x_i$ to neuron i is

$$x_i = \sum_j W_{ij}(k)\, a_j(k) \tag{5}$$

where $W_{ij}(k)$ is the coupling coefficient between neuron i and neuron j, and $a_j(k)$ is the output of neuron j.
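The sigmoid of equation (4) and the delayed command inputs can be sketched as follows. The zero padding for samples before the start of the program is an assumption made here, since the text does not say how the delay elements are initialized; the helper name `delayed_inputs` is hypothetical.

```python
import math


def g(x: float) -> float:
    """Sigmoid of equation (4); its output lies between -1 and 1."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def delayed_inputs(series, i, taps):
    """Return [s(i), s(i-1), ..., s(i-taps+1)] using a 0-based index i.

    Samples before the start of the series are taken as 0.0 (assumption).
    """
    return [series[i - d] if i - d >= 0 else 0.0 for d in range(taps)]


# Hypothetical position command samples and a delay line with N1 = 3 taps.
theta_d = [1.0, 2.0, 3.0, 4.0]
window = delayed_inputs(theta_d, i=2, taps=3)
```

Equation (4) is the same function as tanh(x/2), so its derivative can be written in terms of the output as (1 − g²)/2, which is the factor that appears in equations (6) and (8).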

The cells are connected by links with adjustable weight coefficients, and there are no connections within a layer. Connections are made sequentially, layer by layer: the outputs of the input-layer cells become the inputs of the hidden-layer cells, and the outputs of the hidden-layer cells become the inputs of the output layer.

Thus, in this embodiment, the command value θd from the command generator (1) is supplied sequentially to the input layer of the neural circuit model (9), the coupling calculations between the layers are carried out, and the command correction output is produced by the output layer.
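A layer-by-layer forward pass for a three-layer network with a single output cell might be written as below. Weights are plain nested lists, the function name `forward` and the numeric values are hypothetical, and the use of the sigmoid on the output cell is inferred from the factor (1 − a_o²)/2 in equation (6) rather than stated directly.

```python
import math


def g(x: float) -> float:
    """Sigmoid of equation (4)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def forward(inputs, w_hidden, w_out):
    """Compute hidden-layer outputs, then the single output, using equation (5) per cell.

    w_hidden[j][i] links input cell i to hidden cell j; w_out[j] links hidden cell j
    to the output cell.
    """
    hidden = [g(sum(w * a for w, a in zip(row, inputs))) for row in w_hidden]
    output = g(sum(w * a for w, a in zip(w_out, hidden)))
    return hidden, output


# Hypothetical input vector (position, velocity, acceleration commands) and weights.
hidden, correction = forward([0.2, 0.1, 0.0], [[0.5, -0.3, 0.1], [0.2, 0.4, -0.6]], [0.7, -0.2])
```

Because the output passes through the sigmoid, it stays strictly between -1 and 1, so under this reading the correction values used as teacher signals would need to be scaled into that range.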

Next, the error learning algorithm (backpropagation algorithm) of the multilayer neural circuit model (9) is described. Consider the output layer of the model (9) and let its output be $a_o(k)$. The desired output is $m^\infty(k)$, learned by the iterative learning unit (6), so the output error $\delta_o(k)$ of the multilayer neural circuit model (9) is

$$\delta_o(k) = \bigl(m^\infty(k) - a_o(k)\bigr)\,\frac{1 - a_o(k)^2}{2} \tag{6}$$

Using this error, the change $\Delta W_{hj}(k)$ in the coupling coefficient between hidden-layer neuron j and the output-layer neuron is

$$\Delta W_{hj}(k) = \tau\, \delta_o(k)\, a_{hj}(k) \tag{7}$$

where τ is a time constant that governs the speed of learning and $a_{hj}$ is the output of hidden-layer neuron j.

Next, consider the propagation of the error from the hidden layer to the input layer.

The error $\delta_{hj}(k)$ of hidden-layer neuron j is calculated as

$$\delta_{hj}(k) = \frac{1 - a_{hj}(k)^2}{2} \sum \delta_o(k)\, W_{hj}(k) \tag{8}$$

so the change $\Delta W_{ij}(k)$ in the coupling coefficient between hidden-layer neuron j and input-layer neuron i is

$$\Delta W_{ij}(k) = \tau\, \delta_{hj}(k)\, a_i(k) \tag{9}$$

where $a_i$ is the output of input-layer neuron i. As the coupling coefficients are changed successively in this way, learning of the multilayer neural circuit model (9) proceeds until the model can produce the same output as the teacher signal $m^\infty$.
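Equations (6) to (9) amount to one backpropagation update for a network with a single output cell. The sketch below applies that update repeatedly to one training pair. The initial weights, the input vector, the target, and the value of τ are all hypothetical, and the target is assumed to lie inside the sigmoid's output range.

```python
import math


def g(x: float) -> float:
    """Sigmoid of equation (4)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def forward(inputs, w_hidden, w_out):
    """Layer-by-layer pass; returns the hidden outputs and the single network output."""
    hidden = [g(sum(w * a for w, a in zip(row, inputs))) for row in w_hidden]
    return hidden, g(sum(w * a for w, a in zip(w_out, hidden)))


def train_step(inputs, target, w_hidden, w_out, tau):
    """Apply equations (6)-(9) once, in place; returns the output error before the update."""
    hidden, a_o = forward(inputs, w_hidden, w_out)
    delta_o = (target - a_o) * (1.0 - a_o ** 2) / 2.0                     # equation (6)
    # Equation (8), evaluated with the hidden-to-output weights before they change.
    delta_h = [(1.0 - a_h ** 2) / 2.0 * delta_o * w for a_h, w in zip(hidden, w_out)]
    for j, a_h in enumerate(hidden):
        w_out[j] += tau * delta_o * a_h                                   # equation (7)
    for j, d_h in enumerate(delta_h):
        for i, a_i in enumerate(inputs):
            w_hidden[j][i] += tau * d_h * a_i                             # equation (9)
    return target - a_o


# Hypothetical toy problem: one input vector, one teacher value.
w_hidden = [[0.1, -0.2], [0.3, 0.1]]
w_out = [0.2, -0.1]
errors = [train_step([1.0, 0.5], 0.5, w_hidden, w_out, tau=0.2) for _ in range(500)]
```

In the device the teacher value for each sample would be the stored correction from the iterative learning stage, and the update would sweep over all samples of all learned programs rather than a single pair.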

The generalization determination mechanism (10) continuously monitors the difference between the learning data m(k) from the iterative learning unit (6) and the output $a_o(k)$ of the multilayer neural circuit model (9). When this difference falls to or below a predetermined value, learning by the multilayer neural circuit model (9) is regarded as complete, and the operation of the iterative learning control mechanism (8) is stopped. From this point on, the data in the large-capacity storage device required for iterative learning control are no longer needed.
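The completion test can be sketched as a comparison of two series against a tolerance. The text only says that the difference must fall to or below a predetermined value; using the largest absolute difference over the series is an assumption made here, as are the function name and the sample values.

```python
def learning_complete(m, a_o, tolerance):
    """Return True when the network output matches the learned corrections closely enough.

    The largest absolute difference over the series is used as the measure (assumption).
    """
    return max(abs(t - y) for t, y in zip(m, a_o)) <= tolerance


# Hypothetical stored corrections and two candidate network outputs.
done = learning_complete([0.1, 0.2], [0.1, 0.25], tolerance=0.1)
not_done = learning_complete([0.1, 0.2], [0.1, 0.5], tolerance=0.1)
```

Once such a check passes for the learned programs, the stored correction tables are no longer consulted and the network output takes their place.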

The procedure by which the generalization determination mechanism (10) arrives at the judgment that learning of the multilayer neural circuit model (9) is complete is as follows. First, the switch of the generalization determination mechanism (10) is set to the A-C connection, and the iterative learning control mechanism (8) is operated to create the correction value sequence $m_p(x)$ for each program.

During this phase the iterative learning control mechanism (8) observes the actual error ea, performs the learning control calculation on it, and finally obtains the correction value sequences $m_1^\infty(x), m_2^\infty(x), \ldots, m_p^\infty(x)$ corresponding to a large number of NC command programs $f_1(x), \ldots, f_p(x)$. Here $m_i^\infty(x) = (m_i^\infty(1), m_i^\infty(2), \ldots, m_i^\infty(n))$.

Next, the switch of the generalization determination mechanism (10) is set free. In this state the multilayer neural circuit model (9) is trained with the correction value sequence data as the teacher signal so that it can output the correction value sequences $m^\infty(x)$ already learned. When this training is complete, the multilayer neural circuit model (9) is able to output the correction value sequence for a command value θd, so the generalization determination mechanism (10) sets the switch to the A-B connection, and the correction input for the command value is taken from the multilayer neural circuit model (9).
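The three switch positions described above form a simple one-way sequence, sketched below as a small state machine. The enum member names, the transition function, and its two boolean inputs are illustrative assumptions; the patent describes the switch only in terms of the A-C, free, and A-B states.

```python
from enum import Enum


class Switch(Enum):
    A_C = "correction taken from the iterative learning control mechanism (8)"
    FREE = "switch open while the neural circuit model (9) is trained"
    A_B = "correction taken from the neural circuit model (9)"


def next_switch(state: Switch, iterative_done: bool, network_done: bool) -> Switch:
    """Advance A-C -> free -> A-B as each learning stage finishes; otherwise stay put."""
    if state is Switch.A_C and iterative_done:
        return Switch.FREE
    if state is Switch.FREE and network_done:
        return Switch.A_B
    return state


state = Switch.A_C
state = next_switch(state, iterative_done=True, network_done=False)   # iterative stage finished
state = next_switch(state, iterative_done=True, network_done=True)    # network training finished
```

After the final transition the correction for any command sequence comes from the network alone.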

In this way, the generalization capability of the multilayer neural circuit model (9) allows an appropriate correction value sequence to be output even for NC programs that have not been learned or experienced. Storing a correction value sequence for each program, as the iterative learning control mechanism (8) requires, becomes unnecessary, and storage capacity is saved.

The above embodiment has been described for a high-precision NC machine tool, but the same effect can be obtained when the invention is applied to other mechatronic systems such as robots.

[Effect of the invention]

As described above, according to this invention, learning control is performed in two stages, which makes possible adaptive, highly precise positioning control that was difficult to achieve with either learning unit alone.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of a digital adaptive control device according to one embodiment of this invention. FIG. 2 is a block diagram showing the configuration of a conventional digital control device. FIG. 3(a) is a block diagram showing the control operation of the iterative learning control mechanism, and FIG. 3(b) is a detailed view of the storage device (7). FIG. 4 shows the configuration of the multilayer neural circuit model. In the figures, (5) is a position detector, (6) is a learning control calculation unit, (7) is a storage device, (8) is an iterative learning control mechanism, (9) is a multilayer neural circuit model, and (10) is a generalization determination mechanism. The same reference numerals denote the same or corresponding parts throughout the figures.

Claims

[Claims]

A digital adaptive control device for moving a controlled object to a target position commanded from the outside, characterized by comprising: position error detection means for detecting the position error between the externally commanded target position and the actual position of the controlled object; an iterative learning unit that applies an iterative learning control law to the position error and calculates, at every control sampling, a position command correction value corresponding to the position command value; storage means for storing the position command correction values over the external command values; an error-propagation type layered neural circuit model that receives the position command value as input and uses the position command correction value as its teacher signal; and means for stopping the function of the iterative learning unit upon completion of learning of the neural circuit model.