JPH03105662A - Self-learning processing system for learning device - Google Patents

Self-learning processing system for learning device

Info

Publication number
JPH03105662A
JPH03105662A (application number JP1244409A)
Authority
JP
Japan
Prior art keywords
pattern
input
output
teacher
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1244409A
Other languages
Japanese (ja)
Inventor
Tamami Sugasaka
菅坂 玉美
Minoru Sekiguchi
実 関口
Kazushige Saga
一繁 佐賀
Shigemi Osada
茂美 長田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP1244409A priority Critical patent/JPH03105662A/en
Publication of JPH03105662A publication Critical patent/JPH03105662A/en
Pending legal-status Critical Current


Abstract

PURPOSE: To enable the learning device itself to generate teacher patterns, by providing an evaluating part that judges, on the basis of predetermined rules, whether the correspondence between an output pattern to which noise has been added and the input pattern at that time is purposeful. CONSTITUTION: An input part 1 generates an input pattern (a) from external information; a learning device 2 processes the input pattern (a), and noise from a noise part 3 is added to the result to produce an output pattern (b). An input/output pattern correspondence part 5 associates the input pattern (a) with the output pattern (b) to generate an input/output pattern (c); an evaluating part 6 evaluates the input/output pattern (c) and separates the patterns to be held in a teacher pattern table part 7 from unnecessary ones. At learning time, the learning device 2 learns the input/output patterns held in the teacher pattern table part 7 as teacher patterns (d). In this way it becomes unnecessary to prepare teacher patterns in advance.

Description

[Detailed Description of the Invention]

[Summary] The present invention concerns a self-learning processing method for a learning device, in a learning processing apparatus that has a learning device which processes input patterns on the basis of presented teacher patterns, arranged so that self-learning is performed. Its object is to make the learning device automatically extract purposeful input/output patterns. To this end, an evaluating part is provided that judges, on the basis of predetermined rules, whether the correspondence between an output pattern obtained by adding noise to the learning device's output and the input pattern at that time is purposeful, and the purposeful input/output patterns are held as teacher patterns.

[Industrial Application Field] The present invention relates to a self-learning processing method for a learning device in a learning processing apparatus having a learning device that processes input patterns on the basis of presented teacher patterns. In recent years, devices that learn from presented teacher patterns (neural networks and the like) have come to be applied to pattern recognition such as alphabet font recognition and image recognition, to adaptive filters, and to various kinds of robot control.

[Prior Art] In conventional learning devices that require teacher patterns, learning is performed using teacher patterns prepared in advance by humans.

[Problem to Be Solved by the Invention] In actual applications, however, there are patterns involving time series, patterns in which the teacher pattern itself changes, and patterns for unpredictable states, so it is difficult to decide the kind and amount of patterns to prepare as teacher patterns. Creating teacher patterns therefore takes a great deal of time. To put a learning device to practical use in an actual application, an algorithm is needed by which the learning device itself creates its teacher patterns. The object of the present invention is to make the learning device automatically extract purposeful input/output patterns.

[Means for Solving the Problem] Fig. 1 illustrates the principle of the present invention. In the figure, [1] is an input part that creates an input pattern (a) (generally multi-bit information) from stimuli detected in the external environment. [2] is a learning device that takes in a teacher pattern (d); it processes the input pattern (a) on the basis of that teacher pattern and produces output in accordance with it. [3] is a noise part that adds, so to speak, random noise to the output of the learning device [2], in the expectation that an output pattern more purposeful than the learning device's own output may be obtained. [4] is an output part that performs operations such as driving a motor (not shown). [5] is an input/output pattern correspondence part provided by the present invention; it associates the input pattern (a) with the corresponding output pattern (b) and outputs an input/output pattern (c). [6] is an evaluating part provided by the present invention; for each of the many input/output patterns (c), it holds the immediately preceding input/output pattern and the current one, evaluates from their comparison whether the current input/output pattern is purposeful for the learning device [2], and selectively extracts the purposeful input/output patterns. [7] is a teacher pattern table part; it temporarily stores the many input/output patterns (c) obtained by the input/output pattern correspondence part [5], allows reference by the evaluating part [6], retains the selectively extracted purposeful input/output patterns, and supplies them to the learning device [2] as teacher patterns (d) for the next try.

[Function] The input part [1] creates an input pattern (a) from external information. This input pattern (a) is sent to the learning device [2] and to the input/output pattern correspondence part [5]. The learning device [2] processes the input pattern (a); noise from the noise part [3] is, for example, added to the result, yielding the output pattern (b).
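The role of the noise part [3] — perturbing the learning device's result before it becomes the output pattern (b) — can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the list representation, the function name, and the amplitude are assumptions; only the choice between uniform random numbers and a Gaussian distribution comes from the embodiment described later.

```python
import random

def add_noise(outputs, amplitude=0.1, dist="uniform"):
    """Noise part [3]: perturb each output unit of the learning device
    before the result becomes output pattern (b)."""
    noisy = []
    for y in outputs:
        if dist == "uniform":
            n = random.uniform(-amplitude, amplitude)  # uniform random numbers
        else:
            n = random.gauss(0.0, amplitude)           # e.g. a Gaussian, also allowed
        noisy.append(y + n)
    return noisy

raw = [0.2, 0.8, 0.5]        # hypothetical learning-device output
pattern_b = add_noise(raw)   # output pattern (b) after noise
assert len(pattern_b) == len(raw)
assert all(abs(b - y) <= 0.1 for b, y in zip(pattern_b, raw))
```

Whether such a perturbed output is actually more purposeful than the raw output is then decided downstream by the evaluating part [6].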
The output part [4] creates the external output from the output pattern (b). Meanwhile, the input/output pattern correspondence part [5] associates the input pattern (a) with the output pattern (b) to create the input/output pattern (c). The evaluating part [6] evaluates the input/output pattern (c) and separates the patterns to be held in the teacher pattern table part [7] from unnecessary ones. At learning time, the learning device [2] learns the input/output patterns held in the teacher pattern table part [7] as teacher patterns (d).

[Embodiment] A concrete example follows. Fig. 2 shows a robot example: a robot with visual sensor inputs and a target are assumed, and the robot is made to learn the behavior of approaching the target. The robot has a self-learning capability using the self-learning processing method of the present invention. The robot has 11 visual sensor inputs; a visual sensor that captures the target outputs "1", and the others output "0". The visual sensor input in Fig. 2 is therefore "00000111000". This visual sensor input is sent to the input part [1] as external information. The input part [1] creates the input pattern (a) from the visual sensor input and sends it to the learning device [2] and the input/output pattern correspondence part [5]. The learning device [2] processes the input pattern (a); noise created by the noise part [3] is added to the learning device's result to give the final output pattern (b). The noise part [3] creates noise using uniform random numbers; of course, noise with a Gaussian or other distribution may also be used. The output part [4] computes the angle θ of the direction of travel (Fig. 2) from the output pattern (b) and sends it to the motor as the external output; as a result, the robot moves (Fig. 2). This single movement is one step. Meanwhile, the input/output pattern correspondence part [5] associates the output pattern (b) with the input pattern (a) to create the input/output pattern (c), which is temporarily stored in the teacher pattern table part [7]. The evaluating part [6] evaluates the input/output patterns according to the evaluation method described later, and the purposeful input/output patterns it selects are held in the teacher pattern table part [7].

Although not shown in Fig. 2, a wall is assumed around the robot to limit its range of movement, and a try ends when the robot reaches the wall. A maximum movement time during which the robot may move freely is also set, and a try likewise ends when this time is reached. That is, the above is repeated until one of the learning conditions is met: (i) the robot captures the target, (ii) it hits the surrounding wall, or (iii) the maximum movement time is reached. The period until one of these conditions is met is one try. After a try ends, learning for the next try is performed using the input/output patterns held in the teacher pattern table part [7] as teacher patterns (d). After learning, the robot and the target are, for example, placed at different positions and the above is repeated.

Fig. 3 is an explanatory diagram of the evaluation method. It shows the series of behavior trajectories of one try, the series of input/output patterns of one try, and their evaluation values. In Fig. 3, the past input is the previous visual sensor input, the current input is the current visual sensor input, the output is the final output pattern, and the evaluation value is the result produced by the evaluating part. The evaluating part [6] stores the immediately preceding input/output pattern (f) (Fig. 3) and, from it and the current input/output pattern (e) (Fig. 3), evaluates input/output patterns according to conditions (A) and (B) below. Evaluation uses ○ and × (called reinforcement signals); an input/output pattern that satisfies condition (A) or (B) is evaluated as ○, and the patterns evaluated as ○ are held in the teacher pattern table part [7]. The evaluation value is "1" if condition (A) is satisfied, "-1" if condition (B) is satisfied, and "0" if neither is satisfied.

(A) There is some input in the current input of the preceding input/output pattern ((f) in block A of Fig. 3), and the current input of the current input/output pattern ((e) in block A of Fig. 3) reacts more strongly than that of (f). For example, in block A of Fig. 3 the current input of input/output pattern A (f) contains one "1" while the current input of input/output pattern A (e) contains three "1"s; when the number of "1"s increases in this way, the target is captured more widely — that is, the robot has moved closer.

(B) There is some input in the past input of the preceding input/output pattern ((f) in block B of Fig. 3), no input at all in its current input, and some input in the current input of the current input/output pattern ((e) in block B of Fig. 3). For example, in block B of Fig. 3 the past input of input/output pattern B (f) contains one "1", the current input of B (f) is all "0", and the current input of B (e) contains one "1"; this means the target was lost sight of for a moment and then recovered. Evaluations of this kind are performed while the behavior trajectory shown in Fig. 3 is obtained.

Fig. 4 shows how, according to the present invention, the robot comes to catch the target through learning: the behavior trajectory in the unlearned state and the trajectories after learning of the first, second, and third tries. It can be seen from Fig. 4 that with repeated learning the robot approaches the target faster, and finally comes to move along a route close to the shortest path. Fig. 5 shows the trajectory of one try from the unlearned state, together with the input/output patterns at that time and their evaluation values. Although only ten kinds of teacher patterns were created in the first try, teacher patterns could nevertheless be obtained to a reasonable extent.

[Effects of the Invention] As explained above, according to the present invention the device creates teacher patterns as it runs and can learn using those teacher patterns. It is therefore unnecessary to prepare teacher patterns in advance, which saves a great deal of labor. A human need only monitor the operation of the device and need not know its detailed internal structure.
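The evaluation rules (A) and (B) above reduce to a small comparison between the preceding input/output pattern (f) and the current one (e). The sketch below uses an assumed encoding — each pattern as a (past input, current input) pair of 0/1 sensor bits, which Fig. 3 only shows graphically — and is not code from the patent.

```python
def evaluate(f, e):
    """Evaluating part [6]: compare the preceding I/O pattern f with the
    current pattern e. Each pattern is (past_input, current_input), lists
    of 0/1 visual-sensor bits (representation assumed).
    Returns 1 for condition (A), -1 for condition (B), else 0."""
    f_past, f_cur = f
    e_cur = e[1]
    # (A): the target was seen before and is now seen by more sensors
    if sum(f_cur) > 0 and sum(e_cur) > sum(f_cur):
        return 1
    # (B): the target was momentarily lost and has been reacquired
    if sum(f_past) > 0 and sum(f_cur) == 0 and sum(e_cur) > 0:
        return -1
    return 0

# Block A of Fig. 3: one "1" grows to three "1"s -> evaluation value 1
f = ([0] * 11, [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
e = (f[1],    [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0])
assert evaluate(f, e) == 1
```

Patterns scoring 1 or -1 correspond to the ○ reinforcement signal and would be kept in the teacher pattern table part [7]; a 0 pattern is discarded.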

Note that although patterns involving time series were learned in this embodiment, it goes without saying that patterns in which the teacher pattern itself changes, and patterns for unpredictable states, can also be learned. Of course, known patterns may also be prepared in advance. Moreover, although the embodiment above describes learning with 11 visual sensors, various other sensors may be attached as well.

[Brief explanation of drawings]

Fig. 1 is a diagram explaining the principle of the present invention, Fig. 2 shows a robot example, and Figs. 3 to 5 are explanatory diagrams. In the figures: [1]: input part; [2]: learning device; [3]: noise part; [4]: output part; [5]: input/output pattern correspondence part; [6]: evaluating part; [7]: teacher pattern table part; (a): input pattern; (b): output pattern; (c): input/output pattern; (d): teacher pattern.

Claims (1)

[Claims] A self-learning processing method for a learning device, characterized in that, in a learning processing apparatus comprising an input part [1] that creates an input pattern (a) from external information, a learning device [2] that processes the input pattern (a) on the basis of a presented teacher pattern, a noise part [3] that creates noise to be added to the processing result of the learning device [2], and an output part [4] that receives the output pattern (b) constituting the final result, there are provided an input/output pattern correspondence part [5] that associates the input pattern (a) with the output pattern (b) to create an input/output pattern (c), an evaluating part [6] that evaluates the input/output pattern (c), and a teacher pattern table part [7] that stores the input/output patterns sent from the evaluating part [6]; the correspondence of the input/output pattern (c) is evaluated by the evaluating part, input/output patterns are held in the teacher pattern table part [7] according to the evaluation criterion, and the held input/output patterns are supplied to the learning device [2] as teacher patterns (d).
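Read as an algorithm, the claim describes a closed loop: sense, process, perturb, pair up, evaluate, store, retrain. The sketch below wires the claimed parts together for one try of the Fig. 2 robot. Every concrete choice here is hypothetical — the random sensor stand-in, the use of condition (A) only, the noise amplitude, and all names — since the claim leaves these components abstract.

```python
import random

def sense():
    """Input part [1]: stand-in for the 11 visual-sensor bits of Fig. 2;
    here the 'target' lights up between zero and three random sensors."""
    bits = [0] * 11
    for i in random.sample(range(11), random.randint(0, 3)):
        bits[i] = 1
    return bits

def purposeful(prev, cur):
    """Evaluating part [6], condition (A) only for brevity:
    more sensors see the target now than in the preceding pattern."""
    return sum(prev[0]) > 0 and sum(cur[0]) > sum(prev[0])

def run_try(process, max_steps=50):
    """One try of the claimed loop: build patterns (c) from input (a) and
    the noise-added output (b), keep purposeful ones as teacher data (d).
    `process` plays the role of the learning device [2]; its form is assumed."""
    table = []                                # teacher pattern table part [7]
    prev = None
    for _ in range(max_steps):                # ends at max movement time (iii)
        a = sense()                           # input pattern (a)
        b = [y + random.uniform(-0.1, 0.1)    # noise part [3] -> pattern (b)
             for y in process(a)]
        c = (a, b)                            # correspondence part [5] -> (c)
        if prev is not None and purposeful(prev, c):
            table.append(c)                   # kept for the next try
        prev = c
    return table                              # teacher patterns (d)

# Trivial stand-in learning device: echo the input as the output.
teacher = run_try(lambda a: [float(x) for x in a])
assert all(len(a) == 11 and len(b) == 11 for a, b in teacher)
```

After a try, the patterns in `table` would be used to retrain the learning device, and the loop would restart from a new robot/target placement, as the description outlines.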
JP1244409A 1989-09-20 1989-09-20 Self-learning processing system for learning device Pending JPH03105662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1244409A JPH03105662A (en) 1989-09-20 1989-09-20 Self-learning processing system for learning device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1244409A JPH03105662A (en) 1989-09-20 1989-09-20 Self-learning processing system for learning device

Publications (1)

Publication Number Publication Date
JPH03105662A true JPH03105662A (en) 1991-05-02

Family

ID=17118236

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1244409A Pending JPH03105662A (en) 1989-09-20 1989-09-20 Self-learning processing system for learning device

Country Status (1)

Country Link
JP (1) JPH03105662A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0429494A (en) * 1990-05-23 1992-01-31 Matsushita Electric Ind Co Ltd Automatic adjusting device


Similar Documents

Publication Publication Date Title
Guély et al. Gradient descent method for optimizing various fuzzy rule bases
Chen et al. Deep reinforcement learning to acquire navigation skills for wheel-legged robots in complex environments
Hiraga et al. An acquisition of operator's rules for collision avoidance using fuzzy neural networks
Lin et al. Nonlinear system control using compensatory neuro-fuzzy networks
JPH03189856A (en) Learning system for external evaluation criterion
Gu et al. Path following with supervised deep reinforcement learning
JPH03105662A (en) Self-learning processing system for learning device
Zhou Neuro-fuzzy gait synthesis with reinforcement learning for a biped walking robot
Saridis et al. Hierarchically intelligent control of a bionic arm
CN111984000A (en) Method and device for automatically influencing an actuator
Sun et al. A recurrent fuzzy neural network based adaptive control and its application on robotic tracking control
Ho et al. A novel fuzzy inferencing methodology for simulated car racing
Algabri et al. Optimization of fuzzy logic controller using PSO for mobile robot navigation in an unknown environment
Xiao et al. A reinforcement learning approach for robot control in an unknown environment
Nimoto et al. Improvement of Agent Learning for a Card Game Based on Multi-channel ART Networks.
Raj et al. Optimized Fuzzy Controller for Cable Robot.
Hong et al. Formation control based on artificial intelligence for multi-agent coordination
Hamavand et al. Trajectory control of robotic manipulators by using a feedback-error-learning neural network
Katic et al. Decomposed connectionist architecture for fast and robust learning of robot dynamics
Song et al. Reinforcement learning and its application to force control of an industrial robot
ULUSOY et al. A neural network system with reinforcement learning to control a dynamic arm model
Hong et al. Cooperative algorithm and group behavior in multirobot
Son et al. Neuro-fuzzy control of a robot manipulator for a trajectory design
Stonier et al. Intelligent hierarchical control for obstacle-avoidance
ZHOU et al. Intelligent robotic control using reinforcement learning agents with fuzzy evaluative feedback