JPH04285293A

JPH04285293A - Neural type optimum gain auto-tuning method of small diameter tunnel robot

Info

Publication number: JPH04285293A
Application number: JP5162191A
Authority: JP
Inventors: Shinichi Aoshima; 伸一青島; Koki Takeda; 武田　幸喜; Tetsuo Yabuta; 藪田　哲郎
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1991-03-15
Filing date: 1991-03-15
Publication date: 1992-10-09
Anticipated expiration: 2012-05-21
Also published as: JP2612972B2

Abstract

PURPOSE: To obtain an optimum gain by using a direction control simulation to train a neural network whose inputs are the initial deviation and deviation angle and whose outputs are the deviation and deviation-angle feedback gains. CONSTITUTION: A neural network N is provided whose inputs are the deviation and deviation angle and whose outputs are the deviation and deviation-angle feedback gains Kp and Ka. An initial position deviation Y(0) and an initial pitching angle deviation are fed to the input layer I of the neural network N, and the feedback gains Kp and Ka are output from the output layer O of the network N. These gains are fed into a direction control simulation that uses a feedback control law, and the network is trained on the result. If, over a fixed number of learning iterations, the difference between the gain from the current iteration and the gain from the previous iteration stays at or below a predetermined set value, the gains are regarded as having converged and the gain values obtained at that time are used as the optimum gains.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a neural-type optimal gain auto-tuning method for the feedback direction control of a small-diameter tunnel robot, which corrects its direction by controlling the head angle at the robot tip while being pushed forward without soil removal.

[0002]

[Prior Art] The conventional technique for finding the optimal gain is described below. FIG. 1 shows the system configuration of the tunnel robot. The system consists of a tunnel robot main body 11 with a head angle correction function, buried pipes 12, a pipe-pushing device 13 that pushes in the buried pipes 12, a hydraulic device 14, and an operation panel 15. The buried pipes 12 are pushed in hydraulically, one at a time, by the pipe-pushing device 13. During this process, an operator 16 successively corrects the head angle so that the robot follows the planned line. Reference numeral 17 denotes the ground surface.

[0003] Next, the feedback direction control method of this robot and its simulator are described. The control method uses a feedback control law in which the next input head angle is obtained by multiplying the position deviation and the pitching angle deviation of the robot body by proportional gains. The head angle and pitching angle are defined in FIG. 2. The simulator consists of a dynamic model of direction correction [equation (1)] and the equations for calculating the robot's pitching angle and position [equations (2) and (3)]. A direction control simulation proceeds as follows. First, the head angle is obtained from the control law of equation (4). Next, that head angle is substituted into the dynamic model of equation (1) to calculate the amount of direction correction. Then equations (2) and (3) are used to calculate the pitching angle and position of the robot.

[0004] The dynamic model of this system is a stochastic model in which the direction correction angle is expressed as the sum of time-series terms of the head angle, time-series terms of the pitching angle change (which approximately represents the robot's attitude), and a probability-distributed term. The parameters a_n and b_n are estimated by the least squares method.

Simulator:

$$\Delta\theta_p(k) = a_1\,\Delta\theta_p(k-1) + \dots + a_n\,\Delta\theta_p(k-n) + b_0\,\theta_h(k) + b_1\,\theta_h(k-1) + \dots + b_n\,\theta_h(k-n) + e(k) \tag{1}$$

$$\theta_p(k) = \theta_p(k-1) + \Delta\theta_p(k) \tag{2}$$

$$Y(k) = Y(k-1) + L\,\sin\bigl(\theta_p(k)\bigr) \tag{3}$$

Control law:

$$\theta_h(k) = K_p\,\bigl(Y_d(k) - Y(k-1)\bigr) + K_a\,\bigl(\theta_d(k) - \theta_p(k-1)\bigr) \tag{4}$$

where Kp is the position deviation feedback gain and Ka is the angle deviation feedback gain.
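
The simulation loop of equations (1) to (4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the simulator used in the patent: the function name `simulate`, its parameter names, the zero initial history, and the restriction to a horizontal planned line (Yd = 0, θd = 0) are all choices made here, and the model coefficients and stroke length must be supplied by the caller because the source does not list them.

```python
import math
import random


def simulate(kp, ka, a, b, y0, theta0_deg, n_strokes, stroke_len_mm,
             noise_std_deg=0.0, seed=0):
    """Run the closed loop of equations (1)-(4) along a horizontal planned line.

    a = [a1..an], b = [b0..bn] are model coefficients (assumed inputs).
    Yd(k) = 0 and theta_d(k) = 0 are assumed for every stroke.
    Returns the positions Y(0)..Y(n_strokes) in mm.
    """
    rng = random.Random(seed)
    n = len(a)
    past_dtheta = [0.0] * n   # pitching angle changes for k-1..k-n, newest first
    past_head = [0.0] * n     # head angles for k-1..k-n, newest first
    y, theta = y0, theta0_deg
    positions = [y]
    for _ in range(n_strokes):
        # Equation (4): head angle from the previous position and pitching angle.
        head = kp * (0.0 - y) + ka * (0.0 - theta)
        # Equation (1): pitching angle change from the time-series model.
        dtheta = (sum(ai * d for ai, d in zip(a, past_dtheta))
                  + b[0] * head
                  + sum(bi * h for bi, h in zip(b[1:], past_head))
                  + rng.gauss(0.0, noise_std_deg))
        theta += dtheta                                      # equation (2)
        y += stroke_len_mm * math.sin(math.radians(theta))   # equation (3)
        past_dtheta = ([dtheta] + past_dtheta)[:n]
        past_head = ([head] + past_head)[:n]
        positions.append(y)
    return positions
```

Angles are handled in degrees and converted to radians only inside the sine of equation (3), which matches the deg/mm unit the text gives for Kp.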

[0005] FIG. 3 defines the parameters used in the simulator and the control law. The lower trajectory is the planned line and the upper trajectory is the robot's trajectory. At stroke k, the position of the planned line is Yd(k), the slope of the planned line is θd(k), the position of the robot is Y(k), the pitching angle of the robot is θp(k), the pitching angle change is Δθp(k), and the length of one stroke is L. In the dynamic model of equation (1), e(k) is the residual and n is the order of the model. FIG. 4 shows a block diagram of the direction control.

[0006] An examination of this direction control method using the above simulator showed that good direction control is achieved if the position deviation feedback gain and the angle deviation feedback gain are chosen well.

[0007] In the conventional method, the optimal feedback gains were then found as follows. As an example, data from the Okayama area, where the N value is 0 to 2, is used. The N value is a number representing the hardness and compaction of the soil; the larger the value, the harder the soil. FIG. 5 shows the integrated absolute deviation of the transient response as a function of Ka with Kp fixed at 0.01 (deg/mm). Here the residual e(k) is approximated by a normal distribution with a mean of 0 deg and a standard deviation of 0.13 deg, and the initial position and angle are 500 mm and 0 deg, respectively. The planned line is a horizontal line with an initial value of 0 mm. From this figure, the Ka that minimizes the integrated absolute deviation can be found; in this case Ka = 1.5. Next, in the same way, Kp is varied from 0.01 to 10 (deg/mm) and, for each value, the Ka that minimizes the integrated absolute deviation is found. These results give the characteristic of the minimum integrated absolute deviation versus Kp shown in FIG. 6. From this figure, the Kp that minimizes the integrated absolute deviation is 0.07 (deg/mm), and the corresponding Ka is found to be 7.8 (dimensionless) from the Ka characteristic at Kp = 0.07 (deg/mm). The optimal gains for Okayama are therefore Kp = 0.07 (deg/mm) and Ka = 7.8 (dimensionless). FIG. 7 shows the result of a direction control simulation using these optimal Ka and Kp. The initial position is 500 mm and the planned line is a horizontal line at 0 mm. As the figure shows, good control is achieved.
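
The two-stage sweep described above (find the best Ka for each Kp, then pick the Kp whose minimum is smallest) amounts to a nested grid search. The sketch below illustrates that idea only; `integral_abs_deviation`, `grid_search`, and the `cost` callback are hypothetical names, and the discrete sum over strokes stands in for the integrated absolute deviation. In practice `cost(kp, ka)` would run a direction control simulation and return its integrated absolute deviation.

```python
def integral_abs_deviation(positions, plan=0.0):
    """Discrete stand-in for the integrated absolute deviation from the planned line."""
    return sum(abs(y - plan) for y in positions)


def grid_search(cost, kp_values, ka_values):
    """Return (kp, ka, cost) with the smallest cost over the two gain grids."""
    best = None
    for kp in kp_values:
        # Inner sweep: the best Ka for this fixed Kp (one curve like FIG. 5).
        ka_best = min(ka_values, key=lambda ka: cost(kp, ka))
        c = cost(kp, ka_best)
        # Outer sweep: keep the Kp whose minimum is smallest (a curve like FIG. 6).
        if best is None or c < best[2]:
            best = (kp, ka_best, c)
    return best
```

Every grid point costs one full simulation, which is the effort the patent sets out to avoid.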

[0008]

[Problem to be Solved by the Invention] As described above, the optimal gain can be found with the conventional method, but because the gains are searched for by trial and error, the search takes a great deal of effort and time.

[0009] An object of the present invention is to provide a neural-type optimal gain auto-tuning method for a small-diameter tunnel robot that overcomes the great effort and time required by the conventional optimal gain search method.

[0010]

[Means for Solving the Problem] To solve the above problem, the present invention is characterized in that the optimal gain is obtained by using a direction control simulation to train a neural network whose inputs are the initial deviation and deviation angle and whose outputs are the deviation and deviation-angle feedback gains.

[0011]

[Operation] With the above means, the present invention auto-tunes the optimal gain through the learning ability of the neural network, so little effort is needed and the time required is far shorter than with the conventional method.

[0012]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings.

[0013] FIG. 8 shows the configuration of the neural network used to obtain the optimal gain. The network consists of three layers: an input layer, an intermediate layer, and an output layer. Each layer is a set of units, and every unit in one layer is connected to every unit in the adjacent layer. Values are passed through these connections from each unit in the preceding layer to the units in the following layer. Each connection has its own weight coefficient, and the output value of a preceding-layer unit multiplied by this weight is input to the following-layer unit. A following-layer unit calculates the sum of the values from all units in the preceding layer, applies a nonlinear transformation defined as an output function, and outputs the result to the next layer. The input layer has two units, which receive the initial position deviation and the initial pitching angle deviation. The number of units in the intermediate layer can be set to any value. The output layer has two units, which output the position deviation feedback gain Kp and the pitching angle deviation feedback gain Ka that are fed to the feedback control law.
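
A forward pass through such a fully connected three-layer network can be sketched as below. The text does not name the output function, so the sigmoid here is an assumption, as are the function names and the omission of bias terms. Since a sigmoid yields values in (0, 1), the outputs would also need to be scaled to a usable gain range, which the source does not specify.

```python
import math


def sigmoid(x):
    """Numerically safe logistic function (assumed output function)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def forward(inputs, w_hidden, w_out):
    """Propagate [initial deviation, initial deviation angle] to two raw outputs.

    w_hidden[j][i] weights input i into hidden unit j;
    w_out[m][j] weights hidden unit j into output unit m.
    """
    # Each unit sums the weighted values from the preceding layer,
    # then applies the nonlinear output function.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
```

For example, `forward([50.0, 0.0], w_hidden, w_out)` with three hidden units takes a 3x2 `w_hidden` and a 2x3 `w_out` and returns two values, which would correspond to Kp and Ka before scaling.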

[0014] FIG. 9 shows a flowchart of the optimal gain auto-tuning of the feedback gains Kp and Ka. The specific auto-tuning process is described below.

(1) As initial settings, arbitrary data are entered for the minimum deviation and the minimum deviation angle.

[0015] (2) The gains output from the output layer are fed into the simulator of the feedback control law and a simulation is run. The sum over all strokes of the positional deviation between the planned line and the robot body, and the corresponding sum of the pitching angle deviation, are calculated. These are called the total deviation error and the total deviation-angle error. In addition, the total deviation error and total deviation-angle error at the point during learning when the total deviation error was smallest are stored as the minimum total deviation and the minimum total deviation angle, respectively.

[0016] (3) The absolute value of the difference between the total deviation error and the minimum total deviation, and that between the total deviation-angle error and the minimum total deviation angle, are calculated and taken as the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error. Because these values are too large as they stand, they are divided by 50 and 5, respectively. The maximum value of each coupling coefficient correction error is set to 10.
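
As a small illustration of process (3), the magnitudes of the two correction errors could be computed as follows. The divisors 50 and 5 and the cap of 10 come from the text; the function name and the choice to apply the cap after the division are assumptions made for this sketch.

```python
def correction_magnitudes(total_dev, total_ang, min_total_dev, min_total_ang,
                          dev_scale=50.0, ang_scale=5.0, cap=10.0):
    """Unsigned coupling coefficient correction errors for process (3)."""
    # Absolute difference from the best values seen so far, scaled down.
    dev_err = abs(total_dev - min_total_dev) / dev_scale
    ang_err = abs(total_ang - min_total_ang) / ang_scale
    # Limit each correction error to the stated maximum (assumed to apply after scaling).
    return min(dev_err, cap), min(ang_err, cap)
```

With a total deviation error of 1200 against a stored minimum of 1000, and a total deviation-angle error of 30 against a minimum of 20, this gives 200 / 50 = 4 and 10 / 5 = 2.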

[0017] (4) Next, the small-diameter tunnel robot simulation is run four times with gains (Kp ± α, Ka ± β) obtained by adding or subtracting suitable values to or from the gains Kp and Ka output from the output layer. The signs of the addition or subtraction in the gains (Kp ± α, Ka ± β) that gave the smallest total deviation error are attached as the signs of the coupling coefficient correction errors found in process (3), and the signed values are taken anew as the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error.

[0018] Next, the error backpropagation learning rule is used to obtain the correction amounts for the coupling coefficients of the neural network from the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error just found, and the network produces its output with the changed coupling coefficients, that is, the gains Kp and Ka.

[0019] However, if the total deviation errors of all four simulations are larger than the total deviation error of the simulation that used the gains output from the output layer, the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error are both set to 0.
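
The sign selection of process (4), including the zero-correction exception, can be sketched as below. The names are hypothetical, and `cost(kp, ka)` is assumed to run one direction control simulation and return its total deviation error. How the signed errors then enter the backpropagation update is not detailed in the source and is not modelled here.

```python
from itertools import product


def signed_corrections(cost, kp, ka, alpha, beta, dev_err, ang_err):
    """Attach signs to the process (3) magnitudes using four perturbed runs."""
    base = cost(kp, ka)
    # Four simulations: every combination of Kp +/- alpha and Ka +/- beta.
    trials = [(cost(kp + sp * alpha, ka + sa * beta), sp, sa)
              for sp, sa in product((1, -1), repeat=2)]
    best_cost, sp, sa = min(trials)
    # If all four runs are worse than the unperturbed gains, apply no correction.
    if best_cost > base:
        return 0.0, 0.0
    return sp * dev_err, sa * ang_err
```

The four runs act as a coarse finite-difference probe of which direction in (Kp, Ka) lowers the total deviation error, so the simulator itself never needs to be differentiated.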

[0020] (5) If, over a certain fixed number of learning iterations, the difference between the gains obtained in the K-th and (K+1)-th iterations stays at or below a certain set value, the gains are regarded as having converged and the gain values at that time are taken as the optimal gains. Until then, processes (2) to (4) are repeated.
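
One possible reading of the convergence test in process (5) is sketched below: learning stops once every consecutive pair of gains within a trailing window differs by no more than a tolerance. The function name, the use of a single tolerance for both gains, and the window interpretation of "a certain fixed number of learning iterations" are assumptions.

```python
def has_converged(gain_history, tol, window):
    """True if the last `window` consecutive gain changes are all within tol.

    gain_history is a list of (kp, ka) tuples, one per learning iteration.
    """
    if len(gain_history) < window + 1:
        return False
    recent = gain_history[-(window + 1):]
    # Compare the gains of iteration K with those of iteration K+1.
    return all(abs(kp2 - kp1) <= tol and abs(ka2 - ka1) <= tol
               for (kp1, ka1), (kp2, ka2) in zip(recent, recent[1:]))
```

Because Kp and Ka differ in scale (0.07 deg/mm versus 7.8 in the Okayama example), separate tolerances per gain may be more appropriate in practice.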

[0021] However, if during the process the gains Kp and Ka become so large that the head angle saturates, the total deviation error can become constant, because the movable range of the head angle is ±1.5 degrees, and the gains are then hardly corrected. In that case, a deviation coupling coefficient correction error and a deviation-angle coupling coefficient correction error are given that reduce the gains Kp and Ka so that the head angle falls within the movable range of ±1.5 degrees. FIG. 10 shows a block diagram of the auto-tuning. The backpropagation learning rule is used as the learning rule of this neural network.

[0022] Next, an example of actual auto-tuning is shown. In the following simulations, the inputs to the neural network were an initial deviation of 50 mm and an initial deviation angle of 0 deg. FIG. 11 shows, for the simulation model of the Sayama area, how the gains Kp and Ka output from the output layer during learning, and the total deviation error calculated by simulation with those gains, change with the number of learning iterations. The figure shows that the gains Kp and Ka fluctuate considerably early in learning, while the total deviation error moves steadily downward. After about 600 learning iterations the gains are nearly stable and change little with further learning, so they can be said to have converged. FIGS. 12 to 17 show simulation results during learning for the Okayama area. These figures show that the further learning progresses, the better the control achieved by the gains that are output. FIG. 18 shows the result of a simulation with an initial position of 50 mm on the Okayama area model, using the weights after learning. It can be seen that, after learning, the network outputs gains that give optimal control.

[0023] FIG. 20 compares, for an initial deviation of 50 mm and an initial deviation angle of 0 deg, the optimal gains Kp and Ka and total deviation error found by varying each gain in small steps with the optimal gains Kp and Ka and total deviation error found by the neural network of the present invention. The result shows that the neural network obtains nearly optimal gains through learning. In the learning runs described so far, an initial deviation angle of 0 deg was used as input, but the network also learns well when the initial deviation angle is not 0. FIG. 19 shows the simulation result of the Okayama area model after learning, for an initial deviation of 500 mm and an initial deviation angle of 1 deg. As the figure shows, the robot is well controlled.

[0024]

[Effects of the Invention] As explained above, according to the present invention, the optimal gain is obtained automatically by using a direction control simulation to train a neural network whose inputs are the initial deviation and deviation angle and whose outputs are the deviation and deviation-angle feedback gains. Compared with the conventional method, this requires less effort and far less time.

[Brief explanation of the drawing]

FIG. 1 is a configuration diagram showing the system configuration of the tunnel robot.

FIG. 2 is an explanatory diagram of the definitions of the head angle and the pitching angle change.

FIG. 3 is an explanatory diagram of the definition of each parameter.

FIG. 4 is a block diagram of the direction control.

FIG. 5 is a characteristic diagram showing the integrated absolute deviation of the transient response versus Ka with Kp fixed at 0.01 (deg/mm).

FIG. 6 is a characteristic diagram showing the minimum integrated absolute deviation versus Kp.

FIG. 7 is a characteristic diagram showing the direction control simulation result when the optimal Ka and Kp are used.

FIG. 8 is a configuration diagram of the neural network.

FIG. 9 is a flowchart of the optimal gain auto-tuning.

FIG. 10 is a block diagram of the auto-tuning.

FIG. 11 is a characteristic diagram showing the relationship between the total deviation error and the number of learning iterations.

FIG. 12 is a characteristic diagram showing the simulation result during learning for the Okayama area after 0 learning iterations.

FIG. 13 is a characteristic diagram showing the simulation result during learning for the Okayama area after 10 learning iterations.

FIG. 14 is a characteristic diagram showing the simulation result during learning for the Okayama area after 20 learning iterations.

FIG. 15 is a characteristic diagram showing the simulation result during learning for the Okayama area after 30 learning iterations.

FIG. 16 is a characteristic diagram showing the simulation result during learning for the Okayama area after 40 learning iterations.

FIG. 17 is a characteristic diagram showing the simulation result during learning for the Okayama area after 50 learning iterations.

FIG. 18 is a characteristic diagram showing the simulation result for an initial position of 50 mm on the Okayama area model using the weights after learning.

FIG. 19 is a characteristic diagram showing the simulation result of the Okayama area model after learning, for an initial deviation of 500 mm and an initial deviation angle of 1 deg.

FIG. 20 is an explanatory diagram comparing the optimal gains Kp and Ka and the total deviation error obtained by the conventional method and by the method of the present invention.

[Explanation of symbols]

11: tunnel robot main body, 12: buried pipe, 13: pipe-pushing device, 14: hydraulic device, 15: operation panel.

Claims

[Claims]

Claim 1: A neural-type optimal gain auto-tuning method for a small-diameter tunnel robot, using a neural network that comprises an input layer, at least one intermediate layer, and an output layer, each layer being a set of units, in which the initial position deviation and the initial pitching angle deviation are input to the input layer, the position deviation feedback gain Kp and the pitching angle deviation feedback gain Ka are output from the output layer, the units of adjacent layers are all connected to one another, each connection has its own weight coefficient, the output value of a preceding-layer unit multiplied by this weight is input to the following-layer unit, and the following-layer unit calculates the sum of the values from all units of the preceding layer, applies a nonlinear transformation defined as an output function, and outputs the value to the next layer, the method being characterized by obtaining an optimal position deviation feedback gain Kp and an optimal pitching angle deviation feedback gain Ka according to the following processes.

Processes:

(1) As initial settings, arbitrary data are entered for the minimum deviation and the minimum deviation angle.

(2) The set initial position deviation and initial pitching angle deviation are input to the input layer of the neural network, the gains output from the output layer are fed into a direction control simulator that uses a feedback control law, a simulation is run, and the sum over all strokes of the positional deviation between the planned line and the robot body, and the corresponding sum of the pitching angle deviation, are calculated. These are called the total deviation error and the total deviation-angle error. In addition, the total deviation error and total deviation-angle error at the point during learning when the total deviation error was smallest are stored as the minimum total deviation and the minimum total deviation angle, respectively.

(3) The absolute value of the difference between the total deviation error and the minimum total deviation, and that between the total deviation-angle error and the minimum total deviation angle, are calculated and taken as the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error. If these values are large, they are divided by a certain constant.

(4) Next, the small-diameter tunnel robot simulation is run four times with gains (Kp ± α, Ka ± β) obtained by adding or subtracting suitable values to or from the gains Kp and Ka output from the output layer. The signs of the addition or subtraction in the gains (Kp ± α, Ka ± β) that gave the smallest total deviation error are attached as the signs of the coupling coefficient correction errors found in process (3), and the signed values are taken anew as the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error. Next, the error backpropagation learning rule is used to obtain the correction amounts for the coupling coefficients of the neural network from the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error just found, and the network produces its output with the changed coupling coefficients, that is, the gains Kp and Ka. However, if the total deviation errors of all four simulations are larger than the total deviation error of the simulation that used the gains output from the output layer, the deviation coupling coefficient correction error and the deviation-angle coupling coefficient correction error are both set to 0.

(5) If, over a certain fixed number of learning iterations, the difference between the gains obtained in the K-th and (K+1)-th iterations stays at or below a certain set value, the gains are regarded as having converged and the gain values at that time are taken as the optimal gains. Until then, processes (2) to (4) are repeated. However, if during the process the gains Kp and Ka become so large that the head angle saturates, the total deviation error can become constant, because the movable range of the head angle is ±1.5 degrees, and the gains are then hardly corrected. In that case, a deviation coupling coefficient correction error and a deviation-angle coupling coefficient correction error are given that reduce the gains Kp and Ka so that the head angle falls within the movable range of ±1.5 degrees.